Optimization (Introduction)


Page 1: Optimization (Introduction)

Optimization (Introduction)

Page 2: Optimization (Introduction)

Goal: Find the minimizer x* that minimizes the objective (cost) function f(x): ℝⁿ → ℝ

Optimization

Unconstrained Optimization

1D: f(x): ℝ → ℝ
ND: f(x): ℝⁿ → ℝ

f(x*) = min_x f(x), or equivalently x* = argmin_x f(x)

Page 3: Optimization (Introduction)

Goal: Find the minimizer x* that minimizes the objective (cost) function f(x): ℝⁿ → ℝ

Optimization

Constrained Optimization

f(x*) = min_x f(x)
s.t.  hᵢ(x) = 0  → equality constraints
      gᵢ(x) ≤ 0  → inequality constraints

Page 4: Optimization (Introduction)

Unconstrained Optimization

• What if we are looking for a maximizer x*?

f(x*) = max_x f(x)  ⟺  −f(x*) = min_x (−f(x))

Page 5: Optimization (Introduction)

Calculus problem: maximize the rectangle area subject to a perimeter constraint

max over d ∈ ℝ²: area(d), subject to the perimeter constraint

[Sketch: a rectangle with side lengths d₁ and d₂ and its perimeter constraint; the optimum is the square d₁ = d₂ = 10, with area A = 100.]

Page 6: Optimization (Introduction)

Area = d₁ d₂

Perimeter = 2(d₁ + d₂)

[Sketch: two rectangles with the same area, 100; the long thin one has a larger perimeter and violates the constraint.]

Page 7: Optimization (Introduction)

[Sketch: level sets of the area over (d₁, d₂) together with the perimeter constraint; the constrained maximizer is at d₁ = d₂.]

Page 8: Optimization (Introduction)

What is the optimal solution? (1D)

f(x*) = min_x f(x)

(First-order) Necessary condition: f′(x*) = 0 (yields the stationary points: candidates for minima and maxima)

(Second-order) Sufficient condition:
f″(x*) > 0 → x* is a minimum
f″(x*) < 0 → x* is a maximum

Page 9: Optimization (Introduction)

Types of optimization problems

f(x*) = min_x f(x),  f: nonlinear, continuous and smooth

• Gradient-free methods: evaluate f(x)
• Gradient (first-derivative) methods: evaluate f(x), f′(x)
• Second-derivative methods: evaluate f(x), f′(x), f″(x)

Page 10: Optimization (Introduction)

Does a solution exist? Is it a local or a global solution?

Page 11: Optimization (Introduction)

Example (1D)

Consider the function f(x) = x⁴/4 − x³/3 − 11x² + 40x. Find the stationary points and check the sufficient condition.

[Plot of f(x) for −6 ≤ x ≤ 6, with values ranging from about −200 to 100.]

min_x f(x)

• First-order necessary condition:
f′(x) = x³ − x² − 22x + 40
f′(x) = 0 ⇒ x³ − x² − 22x + 40 = 0
solutions ⇒ x = −5, 2, 4

• Second-order condition:
f″(x) = 3x² − 2x − 22
f″(−5) = 3(25) + 10 − 22 = 63 > 0 → x = −5 is a local minimum
f″(2) = 12 − 4 − 22 = −14 < 0 → x = 2 is a local maximum
f″(4) = 48 − 8 − 22 = 18 > 0 → x = 4 is a local minimum
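A quick numerical check of this example (a sketch using NumPy; `np.roots` takes the polynomial coefficients of f′ in descending order):

```python
import numpy as np

# f(x) = x**4/4 - x**3/3 - 11*x**2 + 40*x
stationary = np.roots([1, -1, -22, 40])   # roots of f'(x) = x^3 - x^2 - 22x + 40
fpp = lambda x: 3*x**2 - 2*x - 22         # f''(x) classifies each stationary point

for x in sorted(stationary.real):
    kind = "minimum" if fpp(x) > 0 else "maximum"
    print(f"x* = {x:+.4f}, f''(x*) = {fpp(x):+.1f} -> local {kind}")
```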

Page 12: Optimization (Introduction)

What is the optimal solution? (ND)

f(x*) = min_x f(x)

(First-order) Necessary condition — 1D: f′(x) = 0
ND: ∇f(x*) = 0 → gives a stationary solution x*

(Second-order) Sufficient condition — 1D: f″(x) > 0
ND: H(x*) is positive definite → x* is a minimizer

Page 13: Optimization (Introduction)

Taking derivatives…

f: ℝⁿ → ℝ,  f(x) = f(x₁, x₂, …, xₙ)

Gradient of f (n×1):
∇f(x) = [∂f/∂x₁, ∂f/∂x₂, …, ∂f/∂xₙ]ᵀ

Hessian of f (n×n):
H(x) = ∇²f(x), with entries Hᵢⱼ = ∂²f/∂xᵢ∂xⱼ (symmetric when the mixed second derivatives are continuous)

Page 14: Optimization (Introduction)

From linear algebra:

A symmetric n×n matrix A is positive definite if xᵀA x > 0 for any x ≠ 0

A symmetric n×n matrix A is positive semi-definite if xᵀA x ≥ 0 for any x ≠ 0

A symmetric n×n matrix A is negative definite if xᵀA x < 0 for any x ≠ 0

A symmetric n×n matrix A is negative semi-definite if xᵀA x ≤ 0 for any x ≠ 0

A symmetric n×n matrix A that is neither negative semi-definite nor positive semi-definite is called indefinite

Page 15: Optimization (Introduction)

f(x*) = min_x f(x)

First-order necessary condition: ∇f(x) = 0
Second-order sufficient condition: H(x) is positive definite
How can we find out if the Hessian is positive definite?

Let (λᵢ, yᵢ) be the eigenpairs of H. From H y = λ y it follows that yᵀH y = λ yᵀy = λ‖y‖², so:

• all λᵢ > 0 → yᵀH y > 0 for every y ≠ 0 → H is positive definite ⇒ x* is a minimizer
• all λᵢ < 0 → yᵀH y < 0 for every y ≠ 0 → H is negative definite ⇒ x* is a maximizer
• λᵢ of mixed signs → H is indefinite ⇒ x* is a saddle point
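A small helper that applies this eigenvalue test (a sketch; the function name and tolerance are my own, while `numpy.linalg.eigvalsh` is the standard eigenvalue routine for symmetric matrices):

```python
import numpy as np

def classify_stationary_point(H, tol=1e-12):
    """Classify a stationary point from the eigenvalues of its symmetric Hessian H."""
    lam = np.linalg.eigvalsh(H)               # eigenvalues, in ascending order
    if np.all(lam > tol):
        return "H positive definite -> minimizer"
    if np.all(lam < -tol):
        return "H negative definite -> maximizer"
    if lam[0] < -tol and lam[-1] > tol:
        return "H indefinite -> saddle point"
    return "H semi-definite -> test inconclusive"
```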

Page 16: Optimization (Introduction)

Types of optimization problems

f(x*) = min_x f(x),  f: nonlinear, continuous and smooth

• Gradient-free methods: evaluate f(x)
• Gradient (first-derivative) methods: evaluate f(x), ∇f(x)
• Second-derivative methods: evaluate f(x), ∇f(x), ∇²f(x)

Page 17: Optimization (Introduction)

Example (ND)

Consider the function f(x₁, x₂) = 2x₁³ + 4x₂² + 2x₂ − 24x₁. Find the stationary points and check the sufficient condition.

1) ∇f(x) = [6x₁² − 24, 8x₂ + 2]ᵀ = 0
   6x₁² = 24 → x₁² = 4 → x₁ = ±2
   8x₂ = −2 → x₂ = −0.25
   Stationary points: x* = (2, −0.25) and x** = (−2, −0.25)

2) Hessian: H(x) = [[12x₁, 0], [0, 8]]
   H(2, −0.25) = [[24, 0], [0, 8]] is positive definite → (2, −0.25) is a minimizer
   H(−2, −0.25) = [[−24, 0], [0, 8]] is indefinite → (−2, −0.25) is a saddle point
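Checking both stationary points with the eigenvalue test from the previous slides (a sketch; `hessian` simply encodes H(x) for this example):

```python
import numpy as np

def hessian(x1, x2):
    # H(x) for f(x1, x2) = 2*x1**3 + 4*x2**2 + 2*x2 - 24*x1
    return np.array([[12.0 * x1, 0.0],
                     [0.0,       8.0]])

for point in [(2.0, -0.25), (-2.0, -0.25)]:
    lam = np.linalg.eigvalsh(hessian(*point))   # eigenvalues, ascending
    print(point, lam)
# (2, -0.25):  [ 8. 24.] -> all positive: minimizer
# (-2, -0.25): [-24.  8.] -> mixed signs: saddle point
```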

Page 18: Optimization (Introduction)

Optimization (1D Methods)

Page 19: Optimization (Introduction)

Optimization in 1D: Golden Section Search

• Similar idea to the bisection method for root finding
• Needs to bracket the minimum inside an interval
• Requires the function to be unimodal

A function f: ℝ → ℝ is unimodal on an interval [a, b] if:

✓ There is a unique x* ∈ [a, b] such that f(x*) is the minimum in [a, b]

✓ For any x₁, x₂ ∈ [a, b] with x₁ < x₂:

§ x₂ < x* ⟹ f(x₁) > f(x₂)
§ x₁ > x* ⟹ f(x₁) < f(x₂)

Page 20: Optimization (Introduction)

[Figure: two bracketing diagrams for an interval [a, b] with interior points x₁, x₂ and function values f₁, f₂; depending on which value is smaller, the bracket shrinks to [a, x₂] or to [x₁, b].]

Page 21: Optimization (Introduction)

For the current interval of length hₖ = b − a, propose the two interior points

x₁ = a + (1 − τ) hₖ
x₂ = a + τ hₖ

and evaluate f₁ = f(x₁), f₂ = f(x₂):
• if f₁ > f₂, the minimum lies in [x₁, b]
• if f₁ < f₂, the minimum lies in [a, x₂]

Either way the interval shrinks by the factor τ every iteration: hₖ₊₁ = τ hₖ. Choosing τ so that one of the interior points can be reused in the next iteration gives τ ≈ 0.618.

Page 22: Optimization (Introduction)

One iteration in detail. Start with the interval [a, b], h₀ = b − a, τ ≈ 0.618:

x₁ = a + (1 − τ) h₀,  f₁ = f(x₁)
x₂ = a + τ h₀,  f₂ = f(x₂)

If f₁ < f₂: x* ∈ [a, x₂]
  b = x₂
  x₂ = x₁ → f₂ = f₁ (point reused)
  x₁ = a + (1 − τ) h₁ → f₁ = f(x₁) (the only new evaluation)

If f₁ > f₂: x* ∈ [x₁, b]
  a = x₁
  x₁ = x₂ → f₁ = f₂ (point reused)
  x₂ = a + τ h₁ → f₂ = f(x₂) (the only new evaluation)
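A runnable Python sketch of this procedure (the function name and tolerance are illustrative, not from the slides):

```python
import math

def golden_section(f, a, b, tol=1e-8):
    """Minimize a unimodal f on [a, b] by golden section search."""
    tau = (math.sqrt(5) - 1) / 2              # ~0.618
    h = b - a
    x1, x2 = a + (1 - tau) * h, a + tau * h
    f1, f2 = f(x1), f(x2)
    while h > tol:
        h = tau * h                           # interval shrinks by tau each iteration
        if f1 < f2:                           # minimum in [a, x2]
            b = x2
            x2, f2 = x1, f1                   # reuse the surviving point
            x1 = a + (1 - tau) * h
            f1 = f(x1)                        # the only new evaluation
        else:                                 # minimum in [x1, b]
            a = x1
            x1, f1 = x2, f2                   # reuse the surviving point
            x2 = a + tau * h
            f2 = f(x2)                        # the only new evaluation
    return (a + b) / 2

# Example: minimum of (x - 2)^2 bracketed by [-10, 10]
print(golden_section(lambda x: (x - 2)**2, -10, 10))   # ~2.0
```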

Page 23: Optimization (Introduction)

Golden Section Search

Page 24: Optimization (Introduction)

Golden Section Search

What happens to the length of the interval after one iteration?

h₁ = τ h₀

Or in general: hₖ₊₁ = τ hₖ

Hence the interval gets reduced by τ (for the bisection method to solve nonlinear equations, τ = 0.5).

For the recursion (so that the surviving point can be reused):
τ h₁ = (1 − τ) h₀ and h₁ = τ h₀

τ² = 1 − τ  ⟹  τ = (√5 − 1)/2 ≈ 0.618
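As a quick numerical check (a minimal sketch), τ is the positive root of τ² + τ − 1 = 0:

```python
import numpy as np

print(np.roots([1, 1, -1]))   # roots of t^2 + t - 1: about [-1.618, 0.618]
```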

Page 25: Optimization (Introduction)

Golden Section Search

• Derivative-free method!

• Slow convergence:

lim_{k→∞} |eₖ₊₁| / |eₖ| = 0.618,  r = 1 (linear convergence)

• Only one function evaluation per iteration (the other interior point is reused, since hₖ₊₁/hₖ = τ)

Page 26: Optimization (Introduction)

Example

a = −10, b = 10 → h₀ = 20. What is h₁?

h₁ = τ h₀ = 0.618 × 20 = 12.36

Page 27: Optimization (Introduction)

Newton's Method

Using a Taylor expansion, we can approximate the function f with a quadratic function about x₀:

f(x) ≈ f(x₀) + f′(x₀)(x − x₀) + ½ f″(x₀)(x − x₀)²

And we want to find the minimum of the quadratic function using the first-order necessary condition:

f′(x₀) + f″(x₀)(x − x₀) = 0  ⟹  h = x − x₀ = −f′(x₀)/f″(x₀)

i.e., we step to the stationary point of the quadratic model: xₖ₊₁ = xₖ + h.

Page 28: Optimization (Introduction)

Newton's Method

• Algorithm:
x₀ = starting guess
xₖ₊₁ = xₖ − f′(xₖ)/f″(xₖ)

• Convergence:
• Typical quadratic convergence
• Local convergence (start guess close to solution)
• May fail to converge, or converge to a maximum or point of inflection
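A minimal sketch of this iteration (illustrative names; the stopping test on the step size is my own choice, a detail the slide leaves open):

```python
def newton_1d(fp, fpp, x0, tol=1e-10, max_iter=50):
    """Minimize f by Newton's method, given f' (fp) and f'' (fpp)."""
    x = x0
    for _ in range(max_iter):
        h = -fp(x) / fpp(x)     # step to the stationary point of the quadratic model
        x += h
        if abs(h) < tol:        # converged: step is tiny
            break
    return x

# Example: f(x) = x^2 - 4x has f'(x) = 2x - 4, f''(x) = 2; minimizer at x = 2
print(newton_1d(lambda x: 2*x - 4, lambda x: 2.0, x0=10.0))   # 2.0
```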

Page 29: Optimization (Introduction)

Newton's Method (Graphical Representation)

[Figure: f(x) with successive quadratic approximations; the sequence of iterates x₀, x₁, x₂, x₃ obtained from the quadratic models approaches the minimizer.]

Page 30: Optimization (Introduction)

Example

Consider the function f(x) = 4x³ + 2x² + 5x + 40.

If we use the initial guess x₀ = 2, what would be the value of x after one iteration of Newton's method?

f′(x) = 12x² + 4x + 5
f″(x) = 24x + 4

h = −f′(x₀)/f″(x₀) = −(12·4 + 4·2 + 5)/(24·2 + 4) = −61/52

x₁ = x₀ + h = 2 − 61/52 ≈ 0.8269
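The same step checked numerically (a minimal sketch):

```python
fp  = lambda x: 12*x**2 + 4*x + 5   # f'(x)
fpp = lambda x: 24*x + 4            # f''(x)

x0 = 2.0
x1 = x0 - fp(x0) / fpp(x0)          # one Newton step: h = -f'/f''
print(x1)                           # 0.8269...
```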

Page 31: Optimization (Introduction)

Optimization (ND Methods)

Page 32: Optimization (Introduction)

Optimization in ND: Steepest Descent Method

Given a function f(x): ℝⁿ → ℝ at a point x, the function will decrease its value fastest in the direction of steepest descent: −∇f(x)

f(x₁, x₂) = (x₁ − 1)² + (x₂ − 1)²

What is the steepest descent direction?

[Sketch: contours of f with −∇f(x₀) pointing from x₀ toward the minimizer at (1, 1), perpendicular to the contour through x₀.]

Page 33: Optimization (Introduction)

Steepest Descent Method

f(x₁, x₂) = (x₁ − 1)² + (x₂ − 1)²

Start with the initial guess x₀ = [3, 3]ᵀ and check the update x₁ = x₀ − ∇f(x₀):

∇f(x) = [2(x₁ − 1), 2(x₂ − 1)]ᵀ, so ∇f(x₀) = [4, 4]ᵀ and x₁ = [3, 3]ᵀ − [4, 4]ᵀ = [−1, −1]ᵀ

The full gradient step overshoots the minimizer at (1, 1) and misses it.

Page 34: Optimization (Introduction)

Steepest Descent Method

f(x₁, x₂) = (x₁ − 1)² + (x₂ − 1)²

Update the variable with: xₖ₊₁ = xₖ − αₖ ∇f(xₖ)

How far along the gradient should we go? What is the "best size" for αₖ?

Here x₁ = x₀ − 0.5 ∇f(x₀) = [1, 1]ᵀ lands exactly on the minimizer, so α = 0.5 is best. In general, how can we get α?

Page 35: Optimization (Introduction)

Page 36: Optimization (Introduction)

Steepest Descent Method

Algorithm (a runnable sketch follows below):

Initial guess: x₀

Evaluate: sₖ = −∇f(xₖ)

Perform a line search to obtain αₖ (for example, Golden Section Search):

αₖ = argmin_α f(xₖ + α sₖ)

Update: xₖ₊₁ = xₖ + αₖ sₖ
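A compact Python sketch of this loop (illustrative names; the 1D line search here uses SciPy's `minimize_scalar` rather than a hand-written golden section search):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def steepest_descent(f, grad, x0, tol=1e-8, max_iter=500):
    """Minimize f: R^n -> R by steepest descent with a line search."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        s = -grad(x)                      # steepest descent direction
        if np.linalg.norm(s) < tol:       # gradient ~ 0: stationary point reached
            break
        alpha = minimize_scalar(lambda a: f(x + a * s)).x   # 1D line search
        x = x + alpha * s
    return x

# Example from the slides: f(x1, x2) = (x1 - 1)^2 + (x2 - 1)^2, x0 = [3, 3]
f    = lambda x: (x[0] - 1)**2 + (x[1] - 1)**2
grad = lambda x: np.array([2*(x[0] - 1), 2*(x[1] - 1)])
print(steepest_descent(f, grad, [3.0, 3.0]))   # ~[1, 1]
```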

Page 37: Optimization (Introduction)

Line Search

xₖ₊₁ = xₖ − αₖ ∇f(xₖ). We want to find αₖ such that f(xₖ₊₁) is minimized:

min_α f(xₖ − α ∇f(xₖ))

The first-order condition ∂f(xₖ₊₁)/∂α = 0 gives

∇f(xₖ₊₁) · ∇f(xₖ) = 0

so ∇f(xₖ₊₁) is orthogonal to ∇f(xₖ): with an exact line search, successive steps are perpendicular, producing the characteristic zigzag path and slow convergence.

Page 38: Optimization (Introduction)

Example

Consider minimizing the function

f(x₁, x₂) = 10(x₁² − x₂)² + (x₁ − 1)²

Given the initial guess x₁ = 2, x₂ = 2, what is the direction of the first step of gradient descent?

∇f(x) = [40x₁(x₁² − x₂) + 2(x₁ − 1), −20(x₁² − x₂)]ᵀ

∇f(2, 2) = [162, −40]ᵀ, so the first step moves along −∇f(2, 2) = [−162, 40]ᵀ.
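A finite-difference check of this hand-computed gradient (a sketch; `num_grad` is an illustrative helper, not from the slides):

```python
import numpy as np

f = lambda x: 10*(x[0]**2 - x[1])**2 + (x[0] - 1)**2

def num_grad(f, x, eps=1e-6):
    """Central finite-difference gradient, one coordinate at a time."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

x0 = np.array([2.0, 2.0])
print(num_grad(f, x0))   # ~[162, -40]; the first step direction is -grad = [-162, 40]
```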

Page 39: Optimization (Introduction)

Newton's Method

Using a Taylor expansion, we build the approximation

f(x + s) ≈ f(x) + ∇f(x)ᵀ s + ½ sᵀ H(x) s

(a quadratic approximation of the nonlinear f). The first-order condition ∂/∂s = 0, using that H is symmetric, gives

H(x) s = −∇f(x)

→ solve this linear system to find the Newton step s.

Page 40: Optimization (Introduction)

Newton's Method

Algorithm (a runnable sketch follows below):
Initial guess: x₀
Solve: H(xₖ) sₖ = −∇f(xₖ)
Update: xₖ₊₁ = xₖ + sₖ

Note that the Hessian is related to the curvature and therefore contains the information about how large the step should be.
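A minimal sketch of the ND iteration (illustrative names; the linear solve uses `numpy.linalg.solve`):

```python
import numpy as np

def newton_nd(grad, hess, x0, tol=1e-10, max_iter=50):
    """Minimize f: R^n -> R by Newton's method, given its gradient and Hessian."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        s = np.linalg.solve(hess(x), -grad(x))   # Newton step: H(x) s = -grad(x)
        x = x + s
        if np.linalg.norm(s) < tol:              # converged: step is tiny
            break
    return x

# Example from Page 17: f(x1, x2) = 2*x1**3 + 4*x2**2 + 2*x2 - 24*x1
grad = lambda x: np.array([6*x[0]**2 - 24, 8*x[1] + 2])
hess = lambda x: np.array([[12*x[0], 0.0], [0.0, 8.0]])
print(newton_nd(grad, hess, [3.0, 0.0]))         # ~[2, -0.25], the local minimizer
```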

Page 41: Optimization (Introduction)

Try this out!

f(x, y) = 0.5x² + 2.5y²

When using Newton's method to find the minimizer of this function, estimate the number of iterations it would take to converge.

A) 1  B) 2-5  C) 5-10  D) More than 10  E) Depends on the initial guess

Page 42: Optimization (Introduction)

Newton's Method Summary

Algorithm:
Initial guess: x₀
Solve: H(xₖ) sₖ = −∇f(xₖ)
Update: xₖ₊₁ = xₖ + sₖ

About the method…
• Typical quadratic convergence 🙂
• Needs second derivatives 🙁
• Local convergence (start guess close to solution)
• Works poorly when the Hessian is nearly indefinite
• Cost per iteration: O(n³) (a dense linear solve)