Introduction to unconstrained optimization – direct search methods
Jussi Hakanen
Post-doctoral researcher [email protected]
spring 2014 TIES483 Nonlinear optimization
Structure of optimization methods
Typically:
– Constraint handling converts the problem to (a series of) unconstrained problems
– In unconstrained optimization, a search direction is determined at each iteration
– The best solution along the search direction is found with a line search
[Diagram: constraint handling method → unconstrained optimization → line search]
Group discussion
1. What kinds of optimality conditions exist for unconstrained optimization (𝑥 ∈ 𝑅𝑛)?
2. List methods for unconstrained optimization – what are their general ideas?
Discuss in small groups (3–4) for 15–20 minutes.
Each group has a secretary who writes down the group's answers.
At the end, we summarize what each group found.
Reminder: gradient and Hessian
Definition: If a function 𝑓: 𝑅𝑛 → 𝑅 is differentiable, then the gradient 𝛻𝑓(𝑥) consists of the partial derivatives 𝜕𝑓(𝑥)/𝜕𝑥𝑖, i.e.
𝛻𝑓(𝑥) = (𝜕𝑓(𝑥)/𝜕𝑥1, … , 𝜕𝑓(𝑥)/𝜕𝑥𝑛)𝑇
Definition: If 𝑓 is twice differentiable, then the matrix
𝐻(𝑥) =
⎡ 𝜕2𝑓(𝑥)/𝜕𝑥1𝜕𝑥1 ⋯ 𝜕2𝑓(𝑥)/𝜕𝑥1𝜕𝑥𝑛 ⎤
⎢ ⋮ ⋱ ⋮ ⎥
⎣ 𝜕2𝑓(𝑥)/𝜕𝑥𝑛𝜕𝑥1 ⋯ 𝜕2𝑓(𝑥)/𝜕𝑥𝑛𝜕𝑥𝑛 ⎦
is called the Hessian (matrix) of 𝑓 at 𝑥.
Result: If 𝑓 is twice continuously differentiable, then 𝜕2𝑓(𝑥)/𝜕𝑥𝑖𝜕𝑥𝑗 = 𝜕2𝑓(𝑥)/𝜕𝑥𝑗𝜕𝑥𝑖.
Reminder: Definite Matrices
Definition: A symmetric 𝑛 × 𝑛 matrix 𝐻 is positive semidefinite if ∀ 𝑥 ∈ ℝ𝑛
𝑥𝑇𝐻𝑥 ≥ 0.
Definition: A symmetric 𝑛 × 𝑛 matrix 𝐻 is positive definite if
𝑥𝑇𝐻𝑥 > 0 ∀ 0 ≠ 𝑥 ∈ ℝ𝑛
Note: If ≥ is replaced by ≤ (and > by <), then 𝐻 is negative semidefinite (negative definite). If 𝐻 is neither positive nor negative semidefinite, then it is indefinite.
Result: Let ∅ ≠ 𝑆 ⊂ 𝑅𝑛 be an open convex set and 𝑓: 𝑆 → 𝑅 twice differentiable in 𝑆. Then 𝑓 is convex if and only if 𝐻(𝑥∗) is positive semidefinite for all 𝑥∗ ∈ 𝑆.
Unconstrained problem
min 𝑓 𝑥 , 𝑠. 𝑡. 𝑥 ∈ 𝑅𝑛
Necessary conditions: Let 𝑓 be twice differentiable in 𝑥∗. If 𝑥∗ is a local minimizer, then
– 𝛻𝑓 𝑥∗ = 0 (that is, 𝑥∗ is a critical point of 𝑓) and
– 𝐻 𝑥∗ is positive semidefinite.
Sufficient conditions: Let 𝑓 be twice differentiable in 𝑥∗. If
– 𝛻𝑓 𝑥∗ = 0 and
– 𝐻(𝑥∗) is positive definite,
then 𝑥∗ is a strict local minimizer.
Result: Let 𝑓: 𝑅𝑛 → 𝑅 be twice differentiable in 𝑥∗. If 𝛻𝑓(𝑥∗) = 0 and 𝐻(𝑥∗) is indefinite, then 𝑥∗ is a saddle point.
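As an illustration, the conditions above can be checked numerically by inspecting the eigenvalues of the Hessian at a critical point. The function below is an assumed example, not from the slides:

```python
import numpy as np

# Hypothetical example: classify the critical point x* = (0, 0) of
# f(x, y) = x^2 - y^2.  grad f = (2x, -2y) vanishes at the origin,
# and the Hessian there is constant:
H = np.array([[2.0, 0.0],
              [0.0, -2.0]])

eigvals = np.linalg.eigvalsh(H)  # eigenvalues of a symmetric matrix

if np.all(eigvals > 0):
    kind = "strict local minimizer"      # H positive definite
elif np.all(eigvals < 0):
    kind = "strict local maximizer"      # H negative definite
elif np.any(eigvals > 0) and np.any(eigvals < 0):
    kind = "saddle point"                # H indefinite
else:
    kind = "inconclusive (semidefinite)"

print(kind)  # saddle point
```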
Unconstrained problem
Adapted from Prof. L.T. Biegler (Carnegie Mellon University)
Descent direction
Definition: Let 𝑓: 𝑅𝑛 → 𝑅. A vector 𝑑 ∈ 𝑅𝑛 is a descent
direction for 𝑓 in 𝑥∗ ∈ 𝑅𝑛 if ∃ 𝛿 > 0 s.t.
𝑓 𝑥∗ + 𝜆𝑑 < 𝑓(𝑥∗) ∀ 𝜆 ∈ (0, 𝛿].
Result: Let 𝑓: 𝑅𝑛 → 𝑅 be differentiable in 𝑥∗. If ∃𝑑 ∈ 𝑅𝑛 s.t. 𝛻𝑓(𝑥∗)𝑇𝑑 < 0, then 𝑑 is a descent direction for 𝑓 in 𝑥∗.
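A quick numerical check of this result, using an assumed example function (not from the slides):

```python
import numpy as np

# f(x) = x1^2 + 2*x2^2 with gradient (2*x1, 4*x2); at x = (1, 1) the
# gradient is (2, 4).  The negative gradient satisfies grad^T d < 0,
# so f must decrease along d for small enough steps.
def f(x):
    return x[0]**2 + 2*x[1]**2

x = np.array([1.0, 1.0])
grad = np.array([2.0, 4.0])   # grad f(x) evaluated by hand
d = -grad                     # grad^T d = -20 < 0, a descent direction

lam = 1e-3                    # a small step along d
print(grad @ d)               # -20.0
print(f(x + lam*d) < f(x))    # True
```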
Model algorithm for unconstrained
minimization
Let 𝑥ℎ be the current estimate for 𝑥∗
1) [Test for convergence.] If conditions are satisfied, stop. The solution is 𝑥ℎ.
2) [Compute a search direction.] Compute a non-zero vector 𝑑ℎ ∈ 𝑅𝑛 which is the search direction.
3) [Compute a step length.] Compute 𝛼ℎ > 0, the step length, for which it holds that 𝑓 𝑥ℎ + 𝛼ℎ𝑑ℎ < 𝑓(𝑥ℎ).
4) [Update the estimate for minimum.] Set 𝑥ℎ+1 = 𝑥ℎ + 𝛼ℎ𝑑ℎ, ℎ = ℎ + 1 and go to step 1.
From Gill et al., Practical Optimization, 1981, Academic Press
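The four steps can be sketched as follows. As an assumption (the slides leave steps 2 and 3 open), the sketch uses the negative gradient as the search direction and a simple backtracking rule for the step length:

```python
import numpy as np

def minimize(f, grad, x0, tol=1e-6, max_iter=1000):
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:    # 1) test for convergence
            break
        d = -g                         # 2) search direction (here: -grad)
        a = 1.0                        # 3) step length: backtrack until
        while f(x + a*d) >= f(x):      #    f(x_h + a_h*d_h) < f(x_h)
            a *= 0.5
        x = x + a*d                    # 4) update the estimate
    return x

# Assumed test problem with minimizer (1, -2)
f = lambda x: (x[0] - 1)**2 + 2*(x[1] + 2)**2
grad = lambda x: np.array([2*(x[0] - 1), 4*(x[1] + 2)])
x_star = minimize(f, grad, [0.0, 0.0])
print(x_star)   # close to (1, -2)
```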
On convergence
Iterative method: a sequence {𝑥ℎ} s.t. 𝑥ℎ → 𝑥∗ when ℎ → ∞
Definition: A method converges
– linearly if ∃𝛼 ∈ [0,1) and 𝑀 ≥ 0 s.t. ∀ℎ ≥ 𝑀
‖𝑥ℎ+1 − 𝑥∗‖ ≤ 𝛼‖𝑥ℎ − 𝑥∗‖,
– superlinearly if ∃𝑀 ≥ 0 and some sequence 𝛼ℎ → 0 s.t. ∀ℎ ≥ 𝑀
‖𝑥ℎ+1 − 𝑥∗‖ ≤ 𝛼ℎ‖𝑥ℎ − 𝑥∗‖,
– with degree 𝑝 if ∃𝛼 ≥ 0, 𝑝 > 0 and 𝑀 ≥ 0 s.t. ∀ℎ ≥ 𝑀
‖𝑥ℎ+1 − 𝑥∗‖ ≤ 𝛼‖𝑥ℎ − 𝑥∗‖𝑝.
If 𝑝 = 2 (𝑝 = 3), the convergence is quadratic (cubic).
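The difference between the rates can be seen numerically. The two iterations below (an assumed example, not from the slides) both converge to 𝑥∗ = √2:

```python
import math

x_star = math.sqrt(2.0)

# Newton's method for x^2 = 2: quadratic convergence (p = 2),
# the error is roughly squared at every step.
x = 1.0
newton_errors = []
for _ in range(5):
    x = 0.5*(x + 2.0/x)
    newton_errors.append(abs(x - x_star))

# Damped fixed-point iteration: linear convergence, the error
# shrinks by a roughly constant factor alpha < 1 per step.
y = 1.0
linear_errors = []
for _ in range(5):
    y = y - 0.1*(y*y - 2.0)
    linear_errors.append(abs(y - x_star))

print(newton_errors)   # drops below 1e-10 within four steps
print(linear_errors)   # still of order 1e-1 after five steps
```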
Summary of group discussion for
methods
1. Newton’s method – utilizes the tangent
2. Golden section method – for line search
3. Downhill simplex
4. Cyclic coordinate method – one coordinate at a time
5. Polytope search (Nelder–Mead) – idea based on geometry
6. Gradient descent (steepest descent) – based on gradient information
Direct search methods
Univariate search, coordinate descent, cyclic
coordinate search
Hooke and Jeeves
Powell’s method
Coordinate descent
[Figure from Miettinen: Nonlinear optimization, 2007 (in Finnish)]
𝑓(𝑥) = 2𝑥1² + 2𝑥1𝑥2 + 𝑥2² + 𝑥1 − 𝑥2
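Cyclic coordinate descent on this example, 𝑓(𝑥) = 2𝑥1² + 2𝑥1𝑥2 + 𝑥2² + 𝑥1 − 𝑥2, can be sketched as follows. The closed-form one-dimensional minimizers below were obtained by setting the corresponding partial derivative to zero (a hand derivation, used here in place of a numerical line search):

```python
# Minimize f(x) = 2*x1**2 + 2*x1*x2 + x2**2 + x1 - x2 one coordinate
# at a time, holding the other coordinate fixed.
def coordinate_descent(x1, x2, iters=100):
    for _ in range(iters):
        x1 = -(2.0*x2 + 1.0) / 4.0   # solves df/dx1 = 4*x1 + 2*x2 + 1 = 0
        x2 = (1.0 - 2.0*x1) / 2.0    # solves df/dx2 = 2*x1 + 2*x2 - 1 = 0
    return x1, x2

x1, x2 = coordinate_descent(0.0, 0.0)
print(x1, x2)   # converges to the minimizer (-1, 1.5)
```

On this quadratic, each full sweep halves the distance to the minimizer, i.e. the convergence is linear with 𝛼 = 1/2.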
Idea of pattern search
[Figure from Miettinen: Nonlinear optimization, 2007 (in Finnish)]
Hooke and Jeeves
[Figure from Miettinen: Nonlinear optimization, 2007 (in Finnish)]
𝑓(𝑥) = (𝑥1 − 2)⁴ + (𝑥1 − 2𝑥2)²
Hooke and Jeeves with fixed step length
[Figure from Miettinen: Nonlinear optimization, 2007 (in Finnish)]
𝑓(𝑥) = (𝑥1 − 2)⁴ + (𝑥1 − 2𝑥2)²
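A simplified Hooke-and-Jeeves sketch for this example, 𝑓(𝑥) = (𝑥1 − 2)⁴ + (𝑥1 − 2𝑥2)², with minimizer (2, 1): exploratory moves probe ±step along each coordinate axis, and a successful exploration triggers a pattern move. Details such as the step-halving rule are assumptions, not taken from the slides:

```python
def f(x):
    return (x[0] - 2.0)**4 + (x[0] - 2.0*x[1])**2

def explore(x, step):
    """Exploratory search: try +step and -step along each axis."""
    x = list(x)
    for i in range(len(x)):
        for delta in (step, -step):
            trial = list(x)
            trial[i] += delta
            if f(trial) < f(x):
                x = trial
                break
    return x

def hooke_jeeves(x0, step=1.0, tol=1e-6, max_iter=10000):
    base = list(x0)
    for _ in range(max_iter):
        x = explore(base, step)
        if f(x) < f(base):
            # pattern move: jump in the direction x - base, then explore
            pattern = [2.0*a - b for a, b in zip(x, base)]
            y = explore(pattern, step)
            base = y if f(y) < f(x) else x
        elif step > tol:
            step /= 2.0                # no improvement: shrink the step
        else:
            break
    return base

x_star = hooke_jeeves([0.0, 3.0])
print(x_star)   # approximately (2, 1)
```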
Powell’s method
The most efficient pattern search method.
It differs from Hooke and Jeeves in that, at each pattern search step, one of the coordinate directions is replaced by the previous pattern search direction.