Newton Method
Materials courtesy: B. Poczos, R. Tibshirani, C. Carmanis & S. Sanghavi
Outline
5
Newton Method for Finding a Root
Linear Approximation (1st order Taylor approx):
Goal:
Therefore,
6
Illustration of Newton’s method Goal: finding a root
In the next step we will linearize here in x
7
Example: Finding a Root
http://en.wikipedia.org/wiki/Newton%27s_method
8
Newton Method for Finding a Root This can be generalized to multivariate functions
Therefore,
[Pseudo inverse if there is no inverse]
9
10
Newton method for minimization
11
Newton method for minimization
unconstrained
twice differentiable
12
13
Motivation with Quadratic Approximation
unconstrained
twice differentiable
14
Motivation with Quadratic Approximation
15
Comparison with Gradient Descent
16
Comparison with Gradient Descent
Gradient descent uses a different quadratic approximation:
Newton method is obtained by minimizing over quadratic approximation:
17
Comparison with Gradient Descent
18
19
20
Pre-Conditioning for Gradient descent
Recall convergence rate for gradient descent:
Can we convert it into well-conditioned problem by changing coordinates?
Can get if A = [r2
f(x)]�1
21
Pre-Conditioning for Gradient descent
Can we convert it into well-conditioned problem by changing coordinates?
Can get if
Running gradient descent for g(y), gives best descent direction and convergence rate.
Equivalent to Newton step on f(x).
A = [r2f(x)]�1/2
22
Affine Invariance
[Proof: HW3]
23
Affine Invariant stopping criterion
Stopping criterion for gradient descent:
Stopping criterion for Newton method:
where is the
Newton decrement.
Not affine-invariant
24
Affine Invariant stopping criterion
25
26
Damped Newton’s Method
27
Backtracking line search
28
Convergence Rate
29
Local convergence for finding root
Quadratic convergence
30
Convergence analysis
31
Convergence analysis
32
Convergence analysis
Analysis can be improved e.g. for self-concordant functions
33
Comparison to first-order methods
34
Comparison to first-order methods
35
Example: Logistic regression
36
Example: Logistic regression