Steepest Gradient Method and Conjugate Gradient Method
Hoonyoung Jeong
Department of Energy Resources Engineering, Seoul National University
Review
• What is the formula of ∇f?
• What is the physical meaning of ∇f?
• What is the formula of a Jacobian matrix?
• How is a Jacobian matrix utilized?
• What is the formula of a Hessian matrix?
• How is a Hessian matrix utilized?
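For reference, the standard definitions behind these review questions, for a scalar function f(x_1, ..., x_n) and a vector-valued function F: R^n -> R^m, are:

```latex
% gradient of a scalar function f (column vector of first partials)
\nabla f(\mathbf{x}) =
\left[ \frac{\partial f}{\partial x_1}, \; \cdots, \; \frac{\partial f}{\partial x_n} \right]^{T}
% Jacobian of a vector-valued F (m x n matrix of first partials)
\qquad J_{ij} = \frac{\partial F_i}{\partial x_j}
% Hessian of f (n x n matrix of second partials, symmetric for smooth f)
\qquad H_{ij} = \frac{\partial^2 f}{\partial x_i \, \partial x_j}
```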
Key Questions
• Explain the steepest gradient method intuitively
• Explain a step size
• Present stop criteria
• What is the disadvantage of the steepest gradient method?
• Explain the conjugate gradient method by comparing it with the steepest gradient method
Intuitive Solution to the Steepest Gradient Method
The hill-climbing example is mathematically expressed as:
f(x): elevation at location x = (x, y)
Find x maximizing f(x)
1) Find the steepest ascent direction by taking one step in each direction from the current location.
Steepest ascent direction = tangential direction at the current location = ∇f(x_k) = [∂f/∂x, ∂f/∂y]^T
= the direction locally maximizing f(x), i.e., the locally steepest ascent direction
2) Take steps along the steepest ascent direction: x_{k+1} = x_k + α_k ∇f(x_k)
k: iteration index
α_k: step size (how far to move in the steepest ascent direction)
3) Stop stepping along that direction once the elevation starts to decrease (you meet downhill)
4) Repeat 1)-3) until you reach a peak
[Figure: one step from x_k to x_{k+1}. Direction: ∇f(x_k); length: α_k ‖∇f(x_k)‖, so x_{k+1} − x_k = α_k ∇f(x_k)]
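Steps 1)-4) can be sketched in Python. The elevation function below (a paraboloid with its peak at (1, −2)) and the fixed step size are assumed for illustration; they are not taken from the slides.

```python
import numpy as np

def f(x):
    # Assumed toy "elevation": a single peak at (1, -2)
    return -(x[0] - 1.0)**2 - (x[1] + 2.0)**2

def grad_f(x):
    # Analytic gradient of f
    return np.array([-2.0 * (x[0] - 1.0), -2.0 * (x[1] + 2.0)])

def steepest_ascent(x0, alpha=0.1, tol=1e-8, max_iter=1000):
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        g = grad_f(x)                 # step 1: steepest ascent direction
        if np.linalg.norm(g) < tol:   # step 4: near a peak, the gradient vanishes
            break
        x = x + alpha * g             # step 2: x_{k+1} = x_k + alpha_k * grad f(x_k)
    return x

print(steepest_ascent([0.0, 0.0]))    # approaches the peak at (1, -2)
```

A constant step size works on this smooth bowl; on a real "hillside" the stop test of step 3) (back off when the elevation starts to decrease) is what keeps the iteration from overshooting.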
Example of Gradient Descent Method (1)
• x_{k+1} = x_k − α_k ∇f(x_k)
• f = x⁴
• ∇f = 4x³
• x_0 = 2
• α_k = 0.02 (constant)
• x_1 = x_0 − α_0 ∇f(x_0) = ?
• x_2 = x_1 − α_1 ∇f(x_1) = ?
• x_3 = x_2 − α_2 ∇f(x_2) = ?
• Repeat for α_k = 0.1 (constant)
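A minimal script to fill in the blanks; f = x⁴, x_0 = 2, and the two step sizes are from the slide, while the helper names are mine.

```python
def grad_f(x):
    # f(x) = x**4, so grad f(x) = 4 x**3
    return 4.0 * x**3

def iterate(x0, alpha, n):
    # Apply x_{k+1} = x_k - alpha * grad_f(x_k) n times, keeping all iterates
    xs = [x0]
    for _ in range(n):
        xs.append(xs[-1] - alpha * grad_f(xs[-1]))
    return xs

print(iterate(2.0, 0.02, 3))  # [2.0, 1.36, 1.15876352, 1.0342...]
print(iterate(2.0, 0.1, 3))   # [2.0, -1.2, -0.5088, ...]  overshoots past the minimum
```

With α_k = 0.02 the iterates march steadily toward the minimum at 0; with α_k = 0.1 the first step jumps clear across it (x_1 = −1.2), illustrating the stability trade-off discussed on the next slides.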
Disadvantage of Gradient Descent Method
• Slow near local minima or maxima: the 2-norm of the gradient vector becomes small near local minima or maxima, so the update α_k ∇f(x_k) shrinks even before the optimum is reached
Step Size
• Also called step length, or learning rate in machine learning
• What are the advantages of large and small step sizes? Large: fast. Small: stable.
• What are the disadvantages of large and small step sizes? Large: unstable. Small: slow.
• The step size should be appropriate, but the appropriate value differs from problem to problem
How to Determine a Step Size?
• Line search methods
  - Backtracking line search
  - Golden section search
  - Secant method
• No implementation needed, and no need to know the details. Why? MATLAB already has good line search methods.
Backtracking example: try a large step size α_0, then halve it repeatedly (α_1 = α_0/2, α_2 = α_1/2, ...) until the acceptance criterion is satisfied.
[Figure: x_k, ∇f(x_k), and the candidate x_{k+1} for successively halved step sizes]
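The halving scheme above can be sketched as a backtracking line search. The slides only show the halving; the acceptance test used here is the standard Armijo sufficient-decrease condition, so the constants `c` and `rho` are assumed defaults, not from the slides.

```python
import numpy as np

def backtracking(f, grad, x, alpha0=1.0, rho=0.5, c=1e-4, max_halvings=50):
    """Halve the step size until the Armijo sufficient-decrease condition holds.
    rho=0.5 reproduces the slide's alpha_{i+1} = alpha_i / 2; c is the assumed
    Armijo constant."""
    g = grad(x)
    d = -g                       # steepest descent direction (minimization)
    alpha = alpha0
    for _ in range(max_halvings):
        # accept alpha if it decreases f enough along d
        if f(x + alpha * d) <= f(x) + c * alpha * g.dot(d):
            return alpha
        alpha *= rho             # otherwise halve and retry
    return alpha

f = lambda x: float(x.dot(x))    # example objective: f(x) = ||x||^2
grad = lambda x: 2.0 * x
x = np.array([2.0, 0.0])
print(backtracking(f, grad, x))  # alpha0 = 1.0 overshoots; 0.5 is accepted
```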
Stop (Convergence) Criteria
• 2-norm of ∇f: ‖∇f(x_k)‖ < ε_∇f, e.g., 1e-5
• Relative change of f: |f(x_{k+1}) − f(x_k)| / |f(x_{k+1})| < ε_f, e.g., 1e-5
• Relative change of x: max_i |x_{k+1,i} − x_{k,i}| / |x_{k+1,i}| < ε_x for i = 1, …, N_x, e.g., 1e-10
• Maximum number of iterations
• Maximum number of f evaluations
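The first three criteria can be combined into a single check. A sketch, assuming the denominators are nonzero (guard them in real code); the tolerance defaults mirror the examples above.

```python
import numpy as np

def converged(g, f_new, f_old, x_new, x_old,
              eps_g=1e-5, eps_f=1e-5, eps_x=1e-10):
    # criterion 1: 2-norm of the gradient
    if np.linalg.norm(g) < eps_g:
        return True
    # criterion 2: relative change of f (assumes f_new != 0)
    if abs(f_new - f_old) / abs(f_new) < eps_f:
        return True
    # criterion 3: worst-case relative change of x (assumes x_new has no zeros)
    if np.max(np.abs(x_new - x_old) / np.abs(x_new)) < eps_x:
        return True
    return False
```

The iteration and function-evaluation caps are simple counters in the optimization loop itself, so they are not part of this per-step test.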
Conjugate Gradient Method
• A modification of the gradient descent method
• Adds a scaled direction to the search direction
• The scaled direction is computed from the previous search direction and the current and previous gradients
• More stable convergence because the previous search direction is considered (fewer oscillations)
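The slides describe the idea in words only. As one concrete instance, here is classic linear conjugate gradient for a quadratic f(x) = ½xᵀAx − bᵀx: each new search direction is the negative gradient plus a scaled copy of the previous direction, with the scaling built from the current and previous gradients. The example matrix and vector are assumed.

```python
import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-10, max_iter=None):
    """Minimize f(x) = 0.5 x^T A x - b^T x for symmetric positive definite A,
    i.e. solve A x = b."""
    x = np.asarray(x0, dtype=float)
    r = b - A @ x              # residual = negative gradient of f at x
    d = r.copy()               # first direction: plain steepest descent
    for _ in range(max_iter or len(b)):
        if np.linalg.norm(r) < tol:
            break
        Ad = A @ d
        alpha = r.dot(r) / d.dot(Ad)        # exact line search along d
        x = x + alpha * d
        r_new = r - alpha * Ad
        beta = r_new.dot(r_new) / r.dot(r)  # scaling from current/previous gradients
        d = r_new + beta * d                # new direction: -gradient + scaled old direction
        r = r_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(conjugate_gradient(A, b, np.zeros(2)))  # converges in at most 2 iterations
```

For general nonlinear f, the same structure is used with a line search and a beta formula such as Fletcher-Reeves; the quadratic case above is where the "conjugate" directions and the n-step convergence guarantee come from.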