Optimization (Introduction)

2020-05-06


Page 1

Optimization (Introduction)

Page 2

Goal: Find the minimizer $\boldsymbol{x}^*$ that minimizes the objective (cost) function $f(\boldsymbol{x}): \mathbb{R}^n \to \mathbb{R}$

Optimization

Unconstrained Optimization

Page 3

Goal: Find the minimizer $\boldsymbol{x}^*$ that minimizes the objective (cost) function $f(\boldsymbol{x}): \mathbb{R}^n \to \mathbb{R}$

Optimization

Constrained Optimization

Page 4

Unconstrained Optimization

• What if we are looking for a maximizer $\boldsymbol{x}^*$?

$f(\boldsymbol{x}^*) = \max_{\boldsymbol{x}} f(\boldsymbol{x})$

This is equivalent to minimizing $-f(\boldsymbol{x})$.

Page 5

Calculus problem: maximize the rectangle area subject to perimeter constraint

$\max_{\boldsymbol{d} \in \mathbb{R}^2} \ \text{Area}(\boldsymbol{d})$ subject to a fixed $\text{Perimeter}(\boldsymbol{d})$

Page 6

[Figure: rectangle with side lengths $d_1$ and $d_2$]

$\text{Area} = d_1 d_2$

$\text{Perimeter} = 2(d_1 + d_2)$

Page 7

Page 8

What is the optimal solution? (1D)

(First-order) Necessary condition

(Second-order) Sufficient condition

$f(x^*) = \min_x f(x)$

Page 9

Types of optimization problems

$f(x^*) = \min_x f(x)$, where $f$ is nonlinear, continuous and smooth

• Gradient-free methods: evaluate $f(x)$

• Gradient (first-derivative) methods: evaluate $f(x)$, $f'(x)$

• Second-derivative methods: evaluate $f(x)$, $f'(x)$, $f''(x)$

Page 10

Does the solution exist? Is it a local or a global solution?

Page 11

Example (1D)

[Plot of $f(x)$ on the interval $[-6, 6]$]

Consider the function

$f(x) = \dfrac{x^4}{4} - \dfrac{x^3}{3} - 11x^2 + 40x$

Find the stationary points and check the sufficient condition.
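A short sketch (not from the original slides) of how one could carry out this check numerically with NumPy, using the polynomial coefficients of $f$ as written above:

```python
import numpy as np

# f(x) = x^4/4 - x^3/3 - 11x^2 + 40x, given by its coefficients (constant term first)
f = np.polynomial.Polynomial([0.0, 40.0, -11.0, -1.0/3.0, 0.25])
df = f.deriv()       # f'(x)  = x^3 - x^2 - 22x + 40
d2f = f.deriv(2)     # f''(x) = 3x^2 - 2x - 22

for x in np.sort(df.roots().real):                 # stationary points: f'(x) = 0
    kind = "local minimum" if d2f(x) > 0 else "not a minimum (f'' <= 0)"
    print(f"x* = {x:6.3f},  f''(x*) = {d2f(x):7.2f}  ->  {kind}")
```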

Page 12

What is the optimal solution? (ND)

$f(\boldsymbol{x}^*) = \min_{\boldsymbol{x}} f(\boldsymbol{x})$

(First-order) Necessary condition, in 1D: $f'(x) = 0$

(Second-order) Sufficient condition, in 1D: $f''(x) > 0$

Page 13

Taking derivatives…

Page 14

From linear algebra:

• A symmetric $n \times n$ matrix $\boldsymbol{H}$ is positive definite if $\boldsymbol{y}^T \boldsymbol{H} \boldsymbol{y} > 0$ for any $\boldsymbol{y} \neq \boldsymbol{0}$

• A symmetric $n \times n$ matrix $\boldsymbol{H}$ is positive semi-definite if $\boldsymbol{y}^T \boldsymbol{H} \boldsymbol{y} \geq 0$ for any $\boldsymbol{y} \neq \boldsymbol{0}$

• A symmetric $n \times n$ matrix $\boldsymbol{H}$ is negative definite if $\boldsymbol{y}^T \boldsymbol{H} \boldsymbol{y} < 0$ for any $\boldsymbol{y} \neq \boldsymbol{0}$

• A symmetric $n \times n$ matrix $\boldsymbol{H}$ is negative semi-definite if $\boldsymbol{y}^T \boldsymbol{H} \boldsymbol{y} \leq 0$ for any $\boldsymbol{y} \neq \boldsymbol{0}$

• A symmetric $n \times n$ matrix $\boldsymbol{H}$ that is neither positive semi-definite nor negative semi-definite is called indefinite

Page 15

$f(\boldsymbol{x}^*) = \min_{\boldsymbol{x}} f(\boldsymbol{x})$

First-order necessary condition: $\nabla f(\boldsymbol{x}) = \boldsymbol{0}$

Second-order sufficient condition: $\boldsymbol{H}(\boldsymbol{x})$ is positive definite

How can we find out if the Hessian is positive definite?
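The slides leave the question open here; as an illustration (a sketch, assuming a NumPy setting), a common numerical test looks at the eigenvalues of the symmetric Hessian, or simply attempts a Cholesky factorization, which succeeds exactly when the symmetric matrix is positive definite:

```python
import numpy as np

def is_positive_definite(H):
    """Eigenvalue test: a symmetric matrix is positive definite iff all eigenvalues are > 0."""
    return bool(np.all(np.linalg.eigvalsh(H) > 0))

def is_positive_definite_cholesky(H):
    """Cholesky test: the factorization exists iff the symmetric matrix is positive definite."""
    try:
        np.linalg.cholesky(H)
        return True
    except np.linalg.LinAlgError:
        return False

H = np.array([[2.0, 1.0],
              [1.0, 3.0]])          # an illustrative symmetric matrix (not from the slides)
print(is_positive_definite(H), is_positive_definite_cholesky(H))   # True True
```

When only a yes/no answer is needed, attempting the Cholesky factorization is typically cheaper than computing a full eigendecomposition.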

Page 16

Types of optimization problems

$f(\boldsymbol{x}^*) = \min_{\boldsymbol{x}} f(\boldsymbol{x})$, where $f$ is nonlinear, continuous and smooth

• Gradient-free methods: evaluate $f(\boldsymbol{x})$

• Gradient (first-derivative) methods: evaluate $f(\boldsymbol{x})$, $\nabla f(\boldsymbol{x})$

• Second-derivative methods: evaluate $f(\boldsymbol{x})$, $\nabla f(\boldsymbol{x})$, $\nabla^2 f(\boldsymbol{x})$

Page 17

Example (ND)

Consider the function $f(x_1, x_2) = 2x_1^3 + 4x_2^2 + 2x_2 - 24x_1$. Find the stationary point and check the sufficient condition.
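As a sketch (not part of the slides), the two conditions for this example can be checked symbolically with SymPy:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2', real=True)
f = 2*x1**3 + 4*x2**2 + 2*x2 - 24*x1

grad = [sp.diff(f, v) for v in (x1, x2)]            # first-order (necessary) condition
stationary_points = sp.solve(grad, [x1, x2], dict=True)

H = sp.hessian(f, (x1, x2))                         # second-order (sufficient) condition
for pt in stationary_points:
    print(pt, "Hessian positive definite:", H.subs(pt).is_positive_definite)
```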

Page 18

Optimization (1D Methods)

Page 19

Optimization in 1D: Golden Section Search

• Similar idea to the bisection method for root finding

• Needs to bracket the minimum inside an interval

• Requires the function to be unimodal

A function $f: \mathbb{R} \to \mathbb{R}$ is unimodal on an interval $[a, b]$ if:

• There is a unique $x^* \in [a, b]$ such that $f(x^*)$ is the minimum in $[a, b]$

• For any $x_1, x_2 \in [a, b]$ with $x_1 < x_2$:

  • $x_2 < x^* \implies f(x_1) > f(x_2)$

  • $x_1 > x^* \implies f(x_1) < f(x_2)$

Page 20

[Figure: two bracketing configurations for golden section search, showing the interval $[a, b]$, interior points $x_1, x_2, x_3, x_4$, and their function values $f_1, f_2, f_3, f_4$]

Page 21

Page 22

Page 23

Golden Section Search

Page 24

Golden Section Search

What happens to the length of the interval after one iteration?

$h_1 = \tau h_0$

Or in general: $h_{k+1} = \tau h_k$

Hence the interval gets reduced by $\tau$ (for the bisection method for solving nonlinear equations, $\tau = 0.5$).

For the recursion to reuse one of the interior points: $\tau h_1 = (1 - \tau) h_0$, i.e. $\tau \cdot \tau h_0 = (1 - \tau) h_0$

$\tau^2 = 1 - \tau \implies \tau = \dfrac{\sqrt{5} - 1}{2} \approx 0.618$

Page 25

Golden Section Search

• Derivative-free method!

• Slow convergence:

$\lim_{k \to \infty} \dfrac{|e_{k+1}|}{|e_k|} = 0.618, \quad r = 1 \ \text{(linear convergence)}$

• Only one function evaluation per iteration
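A minimal golden section search in Python, sketched from the bracketing idea above; the test function and bracket below are illustrative choices, and the reuse of one interior point per iteration is what gives the single new function evaluation noted above.

```python
import math

def golden_section_search(f, a, b, tol=1e-6):
    """Minimize a unimodal function f on [a, b] using golden section search."""
    tau = (math.sqrt(5) - 1) / 2              # ~0.618, the interval reduction factor
    x1, x2 = a + (1 - tau) * (b - a), a + tau * (b - a)
    f1, f2 = f(x1), f(x2)
    while b - a > tol:
        if f1 > f2:                           # minimum is in [x1, b]
            a, x1, f1 = x1, x2, f2
            x2 = a + tau * (b - a)
            f2 = f(x2)                        # the single new evaluation
        else:                                 # minimum is in [a, x2]
            b, x2, f2 = x2, x1, f1
            x1 = a + (1 - tau) * (b - a)
            f1 = f(x1)                        # the single new evaluation
    return 0.5 * (a + b)

# Illustrative use: the earlier 1D example function, bracketed on [2, 6] where it is unimodal
print(golden_section_search(lambda x: x**4/4 - x**3/3 - 11*x**2 + 40*x, 2.0, 6.0))  # ~4.0
```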

Page 26

Example

Page 27

Newton's Method

Using a Taylor expansion, we can approximate the function $f$ with a quadratic function about $x_0$:

$f(x) \approx f(x_0) + f'(x_0)(x - x_0) + \dfrac{1}{2} f''(x_0)(x - x_0)^2$

And we want to find the minimum of the quadratic function using the first-order necessary condition:

$f'(x_0) + f''(x_0)(x - x_0) = 0 \implies x = x_0 - \dfrac{f'(x_0)}{f''(x_0)}$

Page 28

Newton's Method

• Algorithm:

$x_0 = $ starting guess

$x_{k+1} = x_k - f'(x_k) / f''(x_k)$

• Convergence:

  • Typical quadratic convergence

  • Local convergence (the starting guess must be close to the solution)

  • May fail to converge, or converge to a maximum or point of inflection
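A sketch of this 1D iteration in Python; the derivative functions and starting guess below are illustrative choices, not taken from the slides.

```python
def newton_1d(df, d2f, x0, tol=1e-10, max_iter=50):
    """1D Newton's method for optimization: x_{k+1} = x_k - f'(x_k) / f''(x_k)."""
    x = x0
    for _ in range(max_iter):
        step = df(x) / d2f(x)
        x = x - step
        if abs(step) < tol:                    # stop when the update is tiny
            break
    return x

# Illustrative use: f(x) = x^4/4 - x^3/3 - 11x^2 + 40x from the earlier example
df  = lambda x: x**3 - x**2 - 22*x + 40        # f'(x)
d2f = lambda x: 3*x**2 - 2*x - 22              # f''(x)
print(newton_1d(df, d2f, x0=6.0))              # converges to the local minimum at x = 4
```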

Page 29

Newton’s Method (Graphical Representation)

Page 30

Example

Consider the function $f(x) = 4x^3 + 2x^2 + 5x + 40$.

If we use the initial guess $x_0 = 2$, what would be the value of $x$ after one iteration of Newton's method?
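One way to check the answer (a sketch; the derivatives below follow from the cubic above):

```python
# f(x) = 4x^3 + 2x^2 + 5x + 40
df  = lambda x: 12*x**2 + 4*x + 5     # f'(x)
d2f = lambda x: 24*x + 4              # f''(x)

x0 = 2.0
x1 = x0 - df(x0) / d2f(x0)            # one Newton step: 2 - 61/52
print(x1)                             # ~0.8269
```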

Page 31

Optimization (ND Methods)

Page 32

Optimization in ND: Steepest Descent Method

Given a function $f(\boldsymbol{x}): \mathbb{R}^n \to \mathbb{R}$ at a point $\boldsymbol{x}$, the function decreases fastest in the direction of steepest descent, $-\nabla f(\boldsymbol{x})$.

$f(x_1, x_2) = (x_1 - 1)^2 + (x_2 - 1)^2$

What is the steepest descent direction?

Page 33

Steepest Descent Method

$f(x_1, x_2) = (x_1 - 1)^2 + (x_2 - 1)^2$

Start with the initial guess:

$\boldsymbol{x}_0 = (3, 3)^T$

Check the update:

Page 34

Steepest Descent Method

$f(x_1, x_2) = (x_1 - 1)^2 + (x_2 - 1)^2$

Update the variable with: $\boldsymbol{x}_{k+1} = \boldsymbol{x}_k - \alpha_k \nabla f(\boldsymbol{x}_k)$

How far along the gradient should we go? What is the "best size" for $\alpha_k$?

Page 35

Page 36

Steepest Descent Method

Algorithm:

Initial guess: $\boldsymbol{x}_0$

Evaluate: $\boldsymbol{s}_k = -\nabla f(\boldsymbol{x}_k)$

Perform a line search to obtain $\alpha_k$ (for example, golden section search):

$\alpha_k = \operatorname*{argmin}_{\alpha} \ f(\boldsymbol{x}_k + \alpha\, \boldsymbol{s}_k)$

Update: $\boldsymbol{x}_{k+1} = \boldsymbol{x}_k + \alpha_k \boldsymbol{s}_k$
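A sketch of this loop in Python; for brevity the 1D line search is delegated to scipy.optimize.minimize_scalar, though the golden section routine sketched earlier would serve the same purpose. The example function and starting point follow the earlier slides.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def steepest_descent(f, grad, x0, tol=1e-8, max_iter=200):
    """Steepest descent with a 1D line search along s_k = -grad f(x_k)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        s = -grad(x)                                       # steepest descent direction
        if np.linalg.norm(s) < tol:
            break
        alpha = minimize_scalar(lambda a: f(x + a * s)).x  # alpha_k = argmin_a f(x_k + a s_k)
        x = x + alpha * s
    return x

# Illustrative use: f(x1, x2) = (x1 - 1)^2 + (x2 - 1)^2 with x0 = (3, 3)
f = lambda x: (x[0] - 1)**2 + (x[1] - 1)**2
grad = lambda x: np.array([2.0 * (x[0] - 1), 2.0 * (x[1] - 1)])
print(steepest_descent(f, grad, [3.0, 3.0]))               # converges to (1, 1)
```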

Page 37

Line Search

Page 38

Example

Consider minimizing the function

$f(x_1, x_2) = 10(x_1)^3 - x_2^2 + x_1 - 1$

Given the initial guess $x_1 = 2$, $x_2 = 2$, what is the direction of the first step of gradient descent?
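A quick numerical check (a sketch, assuming the function reads as written above): evaluate the negative gradient at the initial guess, since that is the direction of the first step.

```python
import numpy as np

# Gradient of f(x1, x2) = 10*x1**3 - x2**2 + x1 - 1
grad = lambda x: np.array([30.0 * x[0]**2 + 1.0, -2.0 * x[1]])

x0 = np.array([2.0, 2.0])
print(-grad(x0))      # direction of the first gradient descent step
```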

Page 39

Newton's Method

Using a Taylor expansion, we build the approximation:

$f(\boldsymbol{x} + \boldsymbol{s}) \approx f(\boldsymbol{x}) + \nabla f(\boldsymbol{x})^T \boldsymbol{s} + \dfrac{1}{2}\, \boldsymbol{s}^T \boldsymbol{H}_f(\boldsymbol{x})\, \boldsymbol{s}$

Page 40

Newton's Method

Algorithm:

Initial guess: $\boldsymbol{x}_0$

Solve: $\boldsymbol{H}_f(\boldsymbol{x}_k)\, \boldsymbol{s}_k = -\nabla f(\boldsymbol{x}_k)$

Update: $\boldsymbol{x}_{k+1} = \boldsymbol{x}_k + \boldsymbol{s}_k$

Note that the Hessian is related to the curvature and therefore contains the information about how large the step should be.
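A sketch of this ND iteration in Python, with the linear solve done by numpy.linalg.solve; the gradient and Hessian callables and the example below are illustrative (the example reuses the earlier ND function).

```python
import numpy as np

def newton_nd(grad, hess, x0, tol=1e-10, max_iter=50):
    """ND Newton's method: solve H_f(x_k) s_k = -grad f(x_k), then x_{k+1} = x_k + s_k."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        s = np.linalg.solve(hess(x), -grad(x))   # Newton step from the linear system
        x = x + s
        if np.linalg.norm(s) < tol:
            break
    return x

# Illustrative use: the earlier ND example f(x1, x2) = 2x1^3 + 4x2^2 + 2x2 - 24x1
grad = lambda x: np.array([6.0 * x[0]**2 - 24.0, 8.0 * x[1] + 2.0])
hess = lambda x: np.array([[12.0 * x[0], 0.0], [0.0, 8.0]])
print(newton_nd(grad, hess, [3.0, 0.0]))         # -> approximately (2, -0.25)
```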

Page 41

Try this out!

$f(x, y) = 0.5x^2 + 2.5y^2$

When using Newton's method to find the minimizer of this function, estimate how many iterations it would take to converge.

A) 1 B) 2-5 C) 5-10 D) More than 10 E) Depends on the initial guess
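As a quick numerical check (a sketch; the starting guess below is an arbitrary choice), take one Newton step on this quadratic and see where it lands:

```python
import numpy as np

# f(x, y) = 0.5*x**2 + 2.5*y**2: gradient and (constant) Hessian
grad = lambda z: np.array([1.0 * z[0], 5.0 * z[1]])
hess = np.array([[1.0, 0.0], [0.0, 5.0]])

z0 = np.array([10.0, -7.0])                      # arbitrary starting guess
s = np.linalg.solve(hess, -grad(z0))             # one Newton step
print(z0 + s)                                    # lands exactly at the minimizer (0, 0)
```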

Page 42

Newton's Method Summary

Algorithm:

Initial guess: $\boldsymbol{x}_0$

Solve: $\boldsymbol{H}_f(\boldsymbol{x}_k)\, \boldsymbol{s}_k = -\nabla f(\boldsymbol{x}_k)$

Update: $\boldsymbol{x}_{k+1} = \boldsymbol{x}_k + \boldsymbol{s}_k$

About the method…

• Typical quadratic convergence

• Needs second derivatives

• Local convergence (the starting guess must be close to the solution)

• Works poorly when the Hessian is nearly indefinite

• Cost per iteration: $O(n^3)$