Curved Trajectories towards Local Minimum of a Function

…Taylor Series and Rotations

Al Jimenez
Mathematics Department
California Polytechnic State University
San Luis Obispo, CA 93407

Spring, 2008

Introduction and Notation

• The Problem: $\min_{x \in \mathbb{R}^n} f(x)$, with $f : \mathbb{R}^n \to \mathbb{R}$

• Derivatives: $f'(x)$, $f''(x)$, $f'''(x)$, $f^{(4)}(x)$, etc.

• A local min $x^*$ is a critical point: $f'(x^*) = 0$

• Necessary condition: $f''(x^*) \ge 0$

Typical Iterative Methods

• Sequence $x_1, x_2, \ldots, x_k, x_{k+1}$ is generated from $x_0$

• Such that $f(x_{k+1}) = f(x_k + p_k v_k) < f(x_k)$

• With $v_k$ a vector with the property $f'(x_k)\, v_k < 0$: a descent direction

• And $p_k > 0$ typically approximates the solution of $\min_{p} f(x_k + p\, v_k)$, called the line search or the scalar search

• Proven to converge for smooth functions
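
The iteration above can be sketched in a few lines. This is a minimal illustration, assuming a steepest-descent direction and a backtracking (Armijo) scalar search; the test function `quad`, the constant `1e-4`, and the halving factor are hypothetical choices, not taken from the slides.

```python
def backtracking_descent(f, grad, x0, max_iter=500, tol=1e-8):
    """Iterate x_{k+1} = x_k + p_k v_k with v_k = -f'(x_k) and a backtracking search for p_k."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            break
        v = [-gi for gi in g]                          # descent direction: f'(x) v < 0
        slope = sum(gi * vi for gi, vi in zip(g, v))   # directional derivative, negative
        p, fx = 1.0, f(x)
        # Halve p until the sufficient-decrease (Armijo) test holds.
        while f([xi + p * vi for xi, vi in zip(x, v)]) > fx + 1e-4 * p * slope:
            p *= 0.5
        x = [xi + p * vi for xi, vi in zip(x, v)]
    return x


# Hypothetical smooth test function with its minimum at (1, -2).
def quad(x):
    return (x[0] - 1.0) ** 2 + 4.0 * (x[1] + 2.0) ** 2


def quad_grad(x):
    return [2.0 * (x[0] - 1.0), 8.0 * (x[1] + 2.0)]


x_min = backtracking_descent(quad, quad_grad, [0.0, 0.0])
```

Each accepted step satisfies $f(x_{k+1}) < f(x_k)$, which is the monotone decrease the slide requires.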

Current Methods

• Selecting $v_k$ has huge effect on convergence rate:
  – Steepest Descent: 1st order, $v_k = -f'(x_k)$
  – Newton’s direction: 2nd order, $v_k = -f''(x_k)^{-1} f'(x_k)$, but may not be a descent direction when far from a min
  – Conjugate Directions uses $v_{k-1}, v_{k-2}, \ldots$
  – Quasi-Newton/Variable metric also uses $v_{k-1}, v_{k-2}, \ldots$
  – High order Tensor models fit prior iteration values
  – Number of derivatives available affects method

• The scalar search
  – Accuracy of scalar minimization
  – Quadratic models: “Trust Region”
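
The remark that Newton's direction may fail to be a descent direction far from a minimum can be checked on a small case. The sketch below assumes a hypothetical function $f(x, y) = x^4 - 2x^2 + y^2$, whose Hessian is indefinite near $x = 0$; the function and the evaluation point are illustrative choices only.

```python
def solve2(H, b):
    """Solve the 2x2 linear system H y = b by Cramer's rule."""
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    return [(b[0] * H[1][1] - H[0][1] * b[1]) / det,
            (H[0][0] * b[1] - H[1][0] * b[0]) / det]


def grad(x):
    """Gradient of the hypothetical f(x, y) = x^4 - 2 x^2 + y^2."""
    return [4.0 * x[0] ** 3 - 4.0 * x[0], 2.0 * x[1]]


def hess(x):
    return [[12.0 * x[0] ** 2 - 4.0, 0.0], [0.0, 2.0]]


def slope(g, v):
    """Directional derivative f'(x) v; negative means v is a descent direction."""
    return g[0] * v[0] + g[1] * v[1]


x = [0.3, 0.1]                       # far from the minima at (+1, 0) and (-1, 0)
g = grad(x)
v_sd = [-g[0], -g[1]]                # steepest descent
newton = solve2(hess(x), g)
v_newton = [-newton[0], -newton[1]]  # Newton's direction
```

At this point `slope(g, v_sd)` is negative while `slope(g, v_newton)` is positive, so the unmodified Newton direction points uphill.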

Infinite Series of Solution

• Matrix-vector products, but shown with exponents for connections with scalar Taylor series.

$$x^* = h(z_k) - h'(z_k)\, z_k + \frac{1}{2} h''(z_k)\, z_k^2 - \frac{1}{6} h'''(z_k)\, z_k^3 + \ldots$$

$$h'(z_k) = p\, f''(x_k)^{-1} z_k^{p-1}$$

$$h''(z_k) = f''(x_k)^{-1} \left( p(p-1)\, z_k^{p-2} - f'''(x_k)\, h'(z_k)\, h'(z_k) \right)$$

$$h'''(z_k) = f''(x_k)^{-1} \left( p(p-1)(p-2)\, z_k^{p-3} - 3 f'''(x_k)\, h'(z_k)\, h''(z_k) - f^{(4)}(x_k)\, h'(z_k)\, h'(z_k)\, h'(z_k) \right)$$
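
The derivative formulas on this slide follow from implicit differentiation. The derivation below is a sketch under an assumption the slide does not state explicitly: that $h$ is the inverse map defined by $f'(h(z)) = z^p$, with $h(z_k) = x_k$ and $x^* = h(0)$.

```latex
% Assumption (not stated on the slide): h is defined implicitly by
%   f'(h(z)) = z^p,  with  h(z_k) = x_k  and  x^* = h(0).
\begin{align*}
f'(h(z)) &= z^{p} \\
f''(h)\,h' &= p\,z^{p-1}
  &&\Rightarrow\quad h' = f''(h)^{-1}\,p\,z^{p-1} \\
f'''(h)\,h'h' + f''(h)\,h'' &= p(p-1)\,z^{p-2}
  &&\Rightarrow\quad h'' = f''(h)^{-1}\bigl(p(p-1)\,z^{p-2} - f'''(h)\,h'h'\bigr) \\
x^* = h(0) &= h(z_k) - h'(z_k)\,z_k + \tfrac{1}{2}\,h''(z_k)\,z_k^{2} - \ldots
  &&\text{(Taylor expansion of $h$ about $z_k$, evaluated at $z = 0$)}
\end{align*}
```

Differentiating once more gives the $h'''$ expression, with the factor 3 arising from the product rule applied to $f'''(h)\,h'h'$ and $f''(h)\,h''$.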

Infinite Series of Solution…

• Define:

$$f''(x_k)\, d_2 = f'(x_k)$$

$$f''(x_k)\, d_3 = \frac{1}{2} f'''(x_k)\, d_2 d_2$$

$$f''(x_k)\, d_4 = f'''(x_k)\, d_2 d_3 - \frac{1}{6} f^{(4)}(x_k)\, d_2 d_2 d_2$$

• Then:

$$x^* = x_k - p\, d_2 + \frac{1}{2} p(p-1)\, d_2 - p^2 d_3 - \frac{1}{6} p(p-1)(p-2)\, d_2 + p^2 (p-1)\, d_3 - p^3 d_4 + \ldots$$

• For p = 1: $x^* = x_k - d_2 - d_3 - d_4 - \ldots$
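
A scalar example makes the p = 1 series concrete. The sketch below assumes the hypothetical function $f(x) = e^x - 2x$ (minimizer $x^* = \ln 2$), chosen only because all of its higher derivatives are easy to write down; it uses the sign convention $f''(x_k)\, d_2 = f'(x_k)$, so the Newton step is $-d_2$.

```python
import math


def series_estimates(xk):
    """Partial sums x_k - d2, x_k - d2 - d3, x_k - d2 - d3 - d4 (the p = 1 case)
    for the hypothetical scalar function f(x) = exp(x) - 2 x."""
    f1 = math.exp(xk) - 2.0          # f'(x_k)
    f2 = f3 = f4 = math.exp(xk)      # f'' = f''' = f'''' = exp(x) for this f
    d2 = f1 / f2                     # f'' d2 = f'
    d3 = 0.5 * f3 * d2 * d2 / f2     # f'' d3 = (1/2) f''' d2 d2
    d4 = (f3 * d2 * d3 - f4 * d2 ** 3 / 6.0) / f2
    return [xk - d2, xk - d2 - d3, xk - d2 - d3 - d4]


x_star = math.log(2.0)               # exact minimizer, where f'(x) = 0
errors = [abs(x - x_star) for x in series_estimates(0.5)]
```

Starting from $x_k = 0.5$, the first partial sum is the Newton iterate, and each added term brings the estimate closer to $\ln 2$.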

Curved Trajectories Algorithm

• At kth iteration, estimate $\alpha$, then calculate:

$$f''(x_k)\, d_2 = f'(x_k)$$

$$f''(x_k)\, d_3 = \frac{1}{\alpha^2} \left[ f'(x_k - \alpha d_2) + (\alpha - 1)\, f'(x_k) \right]$$

$$f''(x_k)\, d_4 = \frac{1}{\alpha^3} \left[ f'(x_k - \alpha d_2 - \alpha^2 d_3) + (\alpha - 1)\, f'(x_k) \right]$$

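• Select order, modify $d_i$, and select $p_k$

2nd order: $x_{k+1} = x_k - p\, d_2$

3rd order: $x_{k+1} = x_k - \frac{3}{2} d_2\, p + \left( \frac{1}{2} d_2 - d_3 \right) p^2$

4th order: $x_{k+1} = x_k - \frac{11}{6} d_2\, p + \left( d_2 - 2 d_3 \right) p^2 - \left( \frac{1}{6} d_2 - d_3 + d_4 \right) p^3$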
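
One way to read this slide is that the higher-order terms are estimated from extra gradient evaluations instead of third and fourth derivatives. The sketch below works under that assumption: the scale factor `alpha`, its value, and the helper names are hypothetical, and the Rosenbrock function is used only because the deck uses it as its example.

```python
def rosen_grad(x):
    """Gradient of f(x, y) = 100 (y - x^2)^2 + (1 - x)^2."""
    return [-400.0 * x[0] * (x[1] - x[0] ** 2) - 2.0 * (1.0 - x[0]),
            200.0 * (x[1] - x[0] ** 2)]


def rosen_hess(x):
    return [[1200.0 * x[0] ** 2 - 400.0 * x[1] + 2.0, -400.0 * x[0]],
            [-400.0 * x[0], 200.0]]


def solve2(H, b):
    """Solve the 2x2 linear system H y = b by Cramer's rule."""
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    return [(b[0] * H[1][1] - H[0][1] * b[1]) / det,
            (H[0][0] * b[1] - H[1][0] * b[0]) / det]


def cta_terms(x, alpha=1e-3):
    """d2 from a Newton solve; d3 and d4 from gradient differences.
    `alpha` is an assumed small scale factor, not a value from the slides."""
    g, H = rosen_grad(x), rosen_hess(x)
    d2 = solve2(H, g)
    g3 = rosen_grad([x[i] - alpha * d2[i] for i in range(2)])
    d3 = solve2(H, [(g3[i] + (alpha - 1.0) * g[i]) / alpha ** 2 for i in range(2)])
    g4 = rosen_grad([x[i] - alpha * d2[i] - alpha ** 2 * d3[i] for i in range(2)])
    d4 = solve2(H, [(g4[i] + (alpha - 1.0) * g[i]) / alpha ** 3 for i in range(2)])
    return d2, d3, d4


def trajectory(x, d2, d3, d4, p, order=4):
    """Point on the 2nd-, 3rd-, or 4th-order trajectory at scalar p."""
    out = []
    for i in range(2):
        if order == 2:
            step = -d2[i] * p
        elif order == 3:
            step = -1.5 * d2[i] * p + (0.5 * d2[i] - d3[i]) * p ** 2
        else:
            step = (-11.0 / 6.0 * d2[i] * p + (d2[i] - 2.0 * d3[i]) * p ** 2
                    - (d2[i] / 6.0 - d3[i] + d4[i]) * p ** 3)
        out.append(x[i] + step)
    return out


d2, d3, d4 = cta_terms([-1.2, 1.0])
x1 = trajectory([-1.2, 1.0], d2, d3, d4, p=5.0)
```

With these choices the computed terms agree to about four digits with the values listed in the Rosenbrock example later in the deck; p = 5 is inferred from the reported iterate there, not stated in the slides.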

Challenges

• High order terms accurately approximated from the Gradient and the Hessian

• Scalar searches along polynomial curved trajectories

• Performance for large problems
  – Exploit Sparse Hessian
    • Store nonzeros only, no operations on zeros

• Far from solution:
  – Hessian not positive definite (solved)
    • Hessian modified and use CG step as last resort
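
The slides do not say how the Hessian is modified. As an illustration only, the sketch below uses a common textbook safeguard: add a multiple of the identity and retry a Cholesky factorization until it succeeds. The starting shift `tau0`, the growth factor, and the example matrix are hypothetical, and this is not claimed to be the modification used in CTA.

```python
import math


def cholesky(A):
    """Return lower-triangular L with A = L L^T, or None if A is not positive definite."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    return None          # non-positive pivot: factorization fails
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L


def shift_until_positive_definite(H, tau0=1e-3, growth=10.0):
    """Add tau * I to H, growing tau geometrically, until Cholesky succeeds."""
    n = len(H)
    tau = 0.0
    while True:
        shifted = [[H[i][j] + (tau if i == j else 0.0) for j in range(n)]
                   for i in range(n)]
        if cholesky(shifted) is not None:
            return shifted, tau
        tau = tau0 if tau == 0.0 else tau * growth


H_indefinite = [[-2.92, 0.0], [0.0, 2.0]]   # hypothetical indefinite Hessian
H_mod, tau = shift_until_positive_definite(H_indefinite)
```

A Newton-type solve with `H_mod` in place of the Hessian then yields a descent direction, since `H_mod` is positive definite.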


Hessian < 0 Changes

CUTEr Performance Profiles

[Figure: CPU-time Profile (127 problems < 500 variables). x-axis: Normalized CPU-time/problem (1 to 12); y-axis: Cumulative Distribution (30% to 100%); curves: CTA, CTAn, CTAnn, CG Descent, Lancelot, Tenmin, L-BFGS, L-BFGS-B]

CUTEr Performance Profiles

[Figure: CPU-time Profile (51 problems >= 500 variables). x-axis: Normalized CPU-time/problem (1 to 13); y-axis: Cumulative Distribution (0% to 100%); curves: CTA, CTAn, CG Descent, Lancelot, L-BFGS-B]
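
For readers unfamiliar with this kind of plot, the sketch below shows how such a profile is usually computed, assuming the figures follow the common performance-profile convention: each solver's CPU time on a problem is divided by the best time any solver achieved on that problem, and the curve is the fraction of problems within a given factor of the best. The solver names and timings are made up for illustration and are not the results shown above.

```python
def performance_profile(times, taus):
    """times: {solver: [cpu time per problem, or None on failure]}.
    Returns {solver: [fraction of problems solved within tau * best time]}.
    Assumes every problem is solved by at least one solver."""
    solvers = list(times)
    n = len(times[solvers[0]])
    best = [min(times[s][i] for s in solvers if times[s][i] is not None)
            for i in range(n)]
    profile = {}
    for s in solvers:
        ratios = [times[s][i] / best[i] if times[s][i] is not None else float("inf")
                  for i in range(n)]
        profile[s] = [sum(r <= tau for r in ratios) / n for tau in taus]
    return profile


# Made-up timings for two hypothetical solvers on four problems.
demo = {"A": [1.0, 2.0, 4.0, None], "B": [2.0, 1.0, 4.0, 3.0]}
prof = performance_profile(demo, taus=[1, 2, 3])
```

The value at a normalized time of 1 is the fraction of problems on which a solver was fastest (ties included); the right-hand limit is the fraction it solved at all.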

Current Research Pursuits

• Handle multiple functions: Pareto optimal points

• Handle Constraint Functions

• Explore the family of infinite series for combination of composition functions.

Rosenbrock Banana Function

$$f(x, y) = 100(y - x^2)^2 + (1 - x)^2, \qquad x_0 = [x \;\; y]^T = [-1.2 \;\; 1]^T, \qquad x^* = [1 \;\; 1]^T$$

At $x_0$:

$$x = \begin{bmatrix} -1.2 \\ 1.0 \end{bmatrix}, \quad f = 24.2000, \quad \text{Gradient} = \begin{bmatrix} -215.600 \\ -88.00 \end{bmatrix}, \quad \text{Hessian} = \begin{bmatrix} 1330.00 & 480.0 \\ 480.0 & 200 \end{bmatrix}$$

$$d_2 = \begin{bmatrix} -0.02472 \\ -0.3807 \end{bmatrix}, \quad d_3 = \begin{bmatrix} -0.02444 \\ 0.05805 \end{bmatrix}, \quad d_4 = \begin{bmatrix} -0.02420 \\ 0.05687 \end{bmatrix}$$

4th order:

$$x_{k+1} = \begin{bmatrix} -1.2 + p\,(0.04532 + p\,(0.02416 + 0.003879\, p)) \\ 1.0 + p\,(0.6979 + p\,(-0.4968 + 0.06462\, p)) \end{bmatrix}$$

• Algorithm selects $x_1 = [0.1156 \;\; 0.1479]^T$, $f = 2.59$
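
The numbers on this slide can be reproduced with a short script. This is a sketch under two assumptions that are inferred rather than stated in the slides: the fourth-derivative contribution to $d_4$ is left out (which is what matches the listed $d_4$ to four digits), and the selected iterate corresponds to $p = 5$. The sign convention is $f''(x_0)\, d_2 = f'(x_0)$, so the Newton step is $-d_2$.

```python
def solve2(H, b):
    """Solve the 2x2 linear system H y = b by Cramer's rule."""
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    return [(b[0] * H[1][1] - H[0][1] * b[1]) / det,
            (H[0][0] * b[1] - H[1][0] * b[0]) / det]


x, y = -1.2, 1.0
f0 = 100.0 * (y - x * x) ** 2 + (1.0 - x) ** 2
g = [-400.0 * x * (y - x * x) - 2.0 * (1.0 - x), 200.0 * (y - x * x)]
H = [[1200.0 * x * x - 400.0 * y + 2.0, -400.0 * x], [-400.0 * x, 200.0]]


def third(u, v):
    """f'''(x0) applied to vectors u and v.
    The only nonzero third partials are f_xxx = 2400 x and f_xxy = -400."""
    return [2400.0 * x * u[0] * v[0] - 400.0 * (u[0] * v[1] + u[1] * v[0]),
            -400.0 * u[0] * v[0]]


d2 = solve2(H, g)
d3 = solve2(H, [0.5 * t for t in third(d2, d2)])
d4 = solve2(H, third(d2, d3))   # assumption: fourth-derivative term omitted


def x_next(p):
    """4th-order trajectory point for scalar p."""
    x0 = [x, y]
    return [x0[i] - 11.0 / 6.0 * d2[i] * p + (d2[i] - 2.0 * d3[i]) * p ** 2
            - (d2[i] / 6.0 - d3[i] + d4[i]) * p ** 3 for i in range(2)]


x1 = x_next(5.0)                # p = 5 is inferred from the reported x_1
f1 = 100.0 * (x1[1] - x1[0] ** 2) ** 2 + (1.0 - x1[0]) ** 2
```

Under these assumptions `x1` lands near $[0.1156, 0.1479]$ and `f1` near 2.59, matching the slide.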

[Figure: contour plot with iterates x0, x1, x2, x3 and contour labels f = 24.2, f = 4, f = 0.5]

[Figure: 3D View]

[Figures: two views of $f(x, y) = 100(y - x^2)^2 + (1 - x)^2$]

[Figure: Trajectories from starting point]

[Figure: Rotations]

[Figure: Rotations 3D]

Rotations

• At point $x_k$ we have $x_{k+1} = x_k + R(\theta)\, h(p)$, where $h(p)$ is the trajectory and $R(\theta)$ is a rotation matrix.

• $h(0) = 0$ and $R(0) = I$, and for 2 coordinates, counterclockwise:

$$R(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

• At the kth step far from solution we want:

$$\min_{p,\, \theta} f(x_k + R(\theta)\, h(p))$$

But settle for $p_k$, $\theta_k$: $f(x_k + R(\theta_k)\, h(p_k)) < f(x_k)$
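
The "settle for" condition can be illustrated with a coarse search over $(p, \theta)$. The sketch below assumes a straight-line trajectory $h(p) = -p\, d_2$ built from the Newton term of the Rosenbrock example, and hypothetical grids for $p$ and $\theta$; a real implementation would use the curved trajectory and a proper scalar search.

```python
import math


def rotate(theta, h):
    """Apply the counterclockwise 2D rotation R(theta) to the vector h."""
    c, s = math.cos(theta), math.sin(theta)
    return [c * h[0] - s * h[1], s * h[0] + c * h[1]]


def rosenbrock(x):
    return 100.0 * (x[1] - x[0] ** 2) ** 2 + (1.0 - x[0]) ** 2


def h_line(p):
    """Hypothetical straight trajectory h(p) = -p d2, with d2 = [-0.02472, -0.3807]."""
    return [0.02472 * p, 0.3807 * p]


def grid_search(xk, ps, thetas):
    """Coarse search for (p, theta) minimizing f(x_k + R(theta) h(p))."""
    best = None
    for p in ps:
        for th in thetas:
            step = rotate(th, h_line(p))
            val = rosenbrock([xk[0] + step[0], xk[1] + step[1]])
            if best is None or val < best[0]:
                best = (val, p, th)
    return best


xk = [-1.2, 1.0]
ps = [0.25 * i for i in range(1, 9)]        # illustrative grid for p
thetas = [-0.3, -0.2, -0.1, 0.0, 0.1]       # illustrative grid for theta
unrotated = grid_search(xk, ps, [0.0])
rotated = grid_search(xk, ps, thetas)
```

Because the rotated grid contains $\theta = 0$, `rotated[0]` can never exceed `unrotated[0]`; any strict improvement is the gain attributable to rotation on this grid.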

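Rotations (continued)

• Gives

$$\frac{\partial f}{\partial \theta} = f'\, R'(\theta)\, h(p) = \begin{bmatrix} \frac{\partial f}{\partial x_1} & \frac{\partial f}{\partial x_2} \end{bmatrix} \begin{bmatrix} -\sin\theta & -\cos\theta \\ \cos\theta & -\sin\theta \end{bmatrix} \begin{bmatrix} h_1(p) \\ h_2(p) \end{bmatrix} = 0$$

$$\theta^* = \tan^{-1}\left( \frac{\frac{\partial f}{\partial x_2} h_1(p) - \frac{\partial f}{\partial x_1} h_2(p)}{\frac{\partial f}{\partial x_1} h_1(p) + \frac{\partial f}{\partial x_2} h_2(p)} \right), \qquad -\frac{\pi}{2} \le \theta^* \le \frac{\pi}{2}$$

• Trajectory angle with the gradient for $R(0) = I$:

$$\theta_G = \cos^{-1}\left( \frac{-f'\, h(p)}{\lVert f' \rVert\, \lVert h(p) \rVert} \right), \qquad 0 \le \theta_G \le \frac{\pi}{2}$$

• Observations:

$$\left. \frac{\partial f}{\partial \theta} \right|_{\theta = 0} = \frac{\partial f}{\partial x_2} h_1(p) - \frac{\partial f}{\partial x_1} h_2(p)$$

$$\left. \frac{\partial f}{\partial \theta} \right|_{\theta = \pi/2} = -\left( \frac{\partial f}{\partial x_1} h_1(p) + \frac{\partial f}{\partial x_2} h_2(p) \right)$$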
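
The two angles can be computed directly from a gradient and a trajectory point. The sketch below makes two assumptions that the slide leaves open: the gradient is frozen at a single point rather than re-evaluated along the rotation, and $\theta_G$ is measured against the negative gradient so that it falls in $[0, \pi/2]$ for a descent trajectory. The sample vectors are the gradient and Newton step from the Rosenbrock example.

```python
import math


def theta_star(g, h):
    """Angle that zeroes d/dtheta of g . R(theta) h for a fixed gradient g.
    Assumes the denominator g . h is nonzero."""
    num = g[1] * h[0] - g[0] * h[1]
    den = g[0] * h[0] + g[1] * h[1]
    return math.atan(num / den)


def theta_gradient(g, h):
    """Angle between h and the negative gradient (sign convention is an assumption)."""
    dot = -(g[0] * h[0] + g[1] * h[1])
    return math.acos(dot / (math.hypot(g[0], g[1]) * math.hypot(h[0], h[1])))


def dtheta(g, h, theta):
    """Derivative of g . R(theta) h with respect to theta."""
    c, s = math.cos(theta), math.sin(theta)
    return g[0] * (-s * h[0] - c * h[1]) + g[1] * (c * h[0] - s * h[1])


g = [-215.6, -88.0]        # gradient at x0 from the Rosenbrock example
h = [0.02472, 0.3807]      # h = -d2, the Newton step from the same example
ts, tg = theta_star(g, h), theta_gradient(g, h)
```

With a frozen gradient, rotating by `ts` aligns the step with the negative gradient, so `abs(ts)` coincides with `tg`; in practice the gradient changes along the rotation, which is presumably why only a fraction of these angles is used.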

[Figure: $f(x, y) = 100(y - x^2)^2 + (1 - x)^2$]

Rotation Challenges/Results

• Select effective $\theta_k$ without too much work
  – Using existing strategy to calculate $p_k$, then calculate a $\theta_k$ from $\theta^*$ and $\theta_G$. Then calculate a new $p_k$ again using rotated trajectory.
  – Good results with $\theta_k = 0.4 \min(\theta^*, \theta_G)$
  – $\theta_k > 40^\circ$ indicates elongated ellipse contours, and rotation seems unproductive in this case.
  – Effective when CTA series is convergent and iteration is not close to the minimum point.

• Functions of more than 2 variables later

[Figure: f (p, θ)]

[Figure: f (p, θ) for θ = 0, -0.1, -0.2, -0.3]

More than Two Coordinates

• Ignore coordinates with insignificant Newton correction magnitudes.

• Success achieved by adding the 3rd coordinate to the first two as follows:
  – Calculate the rotation by pairing the 3rd coordinate with each of the top 2 coordinates.
  – This results in a rotation matrix:

$$R = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\theta_3 & -\sin\theta_3 \\ 0 & \sin\theta_3 & \cos\theta_3 \end{bmatrix} \begin{bmatrix} \cos\theta_2 & 0 & -\sin\theta_2 \\ 0 & 1 & 0 \\ \sin\theta_2 & 0 & \cos\theta_2 \end{bmatrix} \begin{bmatrix} \cos\theta_1 & -\sin\theta_1 & 0 \\ \sin\theta_1 & \cos\theta_1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

  – Where the angles $\theta_1$, $\theta_2$, $\theta_3$ are each calculated between two coordinates as explained before.

• The 4th coordinate is added by pairing rotations with the first 3 coordinates, and so on.
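
The composed rotation can be built from three plane rotations. In the sketch below, the placement of the minus signs follows the 2D counterclockwise convention applied to each coordinate pair; that placement and the helper names are assumptions for illustration.

```python
import math


def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def rotation3(theta1, theta2, theta3):
    """R = R_23(theta3) R_13(theta2) R_12(theta1), one plane rotation per coordinate pair.
    Sign placement follows the 2D counterclockwise convention and is an assumption."""
    c1, s1 = math.cos(theta1), math.sin(theta1)
    c2, s2 = math.cos(theta2), math.sin(theta2)
    c3, s3 = math.cos(theta3), math.sin(theta3)
    R12 = [[c1, -s1, 0.0], [s1, c1, 0.0], [0.0, 0.0, 1.0]]   # coordinates 1 and 2
    R13 = [[c2, 0.0, -s2], [0.0, 1.0, 0.0], [s2, 0.0, c2]]   # coordinates 1 and 3
    R23 = [[1.0, 0.0, 0.0], [0.0, c3, -s3], [0.0, s3, c3]]   # coordinates 2 and 3
    return matmul(R23, matmul(R13, R12))


R = rotation3(0.3, -0.2, 0.5)
```

Whatever the angles, the product stays orthogonal, and it reduces to the identity when all three angles are zero, consistent with $R(0) = I$ in the two-coordinate case.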