Slides Optimization

    Introduction and Unconstrained Optimization

Master QFin at WU Vienna, Lecture Optimization

Rüdiger Frey

[email protected], http://statmath.wu.ac.at/frey

    Spring 2014

    Admin

Dates of the lecture. 4.3., 6.3., 11.3., 18.3. (incl. mid-term), 25.3., 1.4., 8.4. (final exam at some point before Easter)

Examination. 10% home assignments, 30% Midterm Exam (unit 4), 60% Final Exam

Tutor. Giorgo Ottonello, 2nd year QFin. He will correct homework assignments.

    Useful references

R. Frey, Lecture Notes Optimization, available on learn@WU, 2014

Bertsekas, D., Nonlinear Programming, Athena Scientific Publishing, 1999

Griva, Nash, Sofer, Linear and Nonlinear Optimization, SIAM Publishing, 2009

    Overview

Introduction and Unconstrained Optimization: Introduction; Mathematical Background; Unconstrained Optimization: Theory

    Optimization problems

In its most general form, an optimization problem is

$$\text{minimize } f(x) \quad \text{subject to } x \in X \qquad (1)$$

Here the set of admissible points $X$ is a subset of $\mathbb{R}^n$, and the cost function $f$ is a function from $X$ to $\mathbb{R}$. Often the admissible points are further restricted by explicit inequality constraints.

Note that maximization problems can be addressed by replacing $f$ with $-f$, since $\sup_{x \in X} f(x) = -\inf_{x \in X} \{-f(x)\}$.
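As a concrete illustration (not part of the original slides), problem (1) with $X = \mathbb{R}^n$ can be handed to a numerical solver; the sketch below uses scipy.optimize.minimize on an arbitrarily chosen smooth cost function.

```python
# Minimal sketch: solving "minimize f(x), x in R^n" numerically with SciPy.
# The cost function below is an illustrative choice, not from the slides.
import numpy as np
from scipy.optimize import minimize

def f(x):
    # simple smooth cost: f(x) = (x1 - 1)^2 + 2*(x2 + 0.5)^2
    return (x[0] - 1.0) ** 2 + 2.0 * (x[1] + 0.5) ** 2

result = minimize(f, x0=np.array([0.0, 0.0]))  # unconstrained: X = R^2
print(result.x)       # approximately [1.0, -0.5]
print(result.fun)     # approximately 0.0
```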

    Types of optimization problems.

Continuous problems. Here $X$ is of continuous nature, such as $X = \mathbb{R}^n$ or sets of the form $X = \{x \in \mathbb{R}^n : g(x) \le 0\}$ for some $g: \mathbb{R}^n \to \mathbb{R}^n$. These problems are usually tackled with calculus or convex analysis.

Discrete problems. Here $X$ is a (usually large) finite set (as in network optimization).

Nonlinear programming. Here $f$ is nonlinear or the constraint set $X$ is specified by nonlinear equations.

Linear programming. Here $f$ and $g$ are linear, and (1) takes the form
$$\min\, c^t x \quad \text{such that} \quad Ax \le b$$
for $c, x \in \mathbb{R}^n$, a matrix $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$.
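A small numerical sketch of such a linear program, with made-up data for $c$, $A$ and $b$; note that scipy.optimize.linprog additionally imposes $x \ge 0$ through its default bounds.

```python
# Minimal sketch of the LP  min c'x  s.t.  A x <= b  (and x >= 0, linprog's default bounds).
# The data below is made up purely for illustration.
import numpy as np
from scipy.optimize import linprog

c = np.array([-1.0, -2.0])                  # cost vector (negative: pushes x upwards)
A = np.array([[1.0, 1.0], [1.0, 2.0]])      # inequality matrix
b = np.array([4.0, 6.0])

res = linprog(c, A_ub=A, b_ub=b)            # x >= 0 is linprog's default bound
print(res.x, res.fun)                       # an optimal vertex, objective value -6.0
```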

    Optimization problems in finance, economics and statistics.

    a) Portfolio optimization. We give two examples.

Maximization of expected utility:
$$\max_{\xi \in \mathbb{R}^n} E\big(u(V_0 + \xi^t (S_T - S_0))\big).$$
Here $V_0$ is the initial wealth of an investor; $S_0 = (S_0^1, \dots, S_0^n)$ is the initial asset price; $S_T(\omega) = (S_T^1(\omega), \dots, S_T^n(\omega))$ the terminal asset price; $\xi$ represents the portfolio strategy. The increasing and concave function $u: \mathbb{R} \to \mathbb{R}$ is the utility function that is used to model the risk aversion of the investor.

Markowitz problem. Here one looks for the minimal-variance portfolio under all portfolios with a given mean.
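A minimal sketch of the Markowitz problem with invented covariance, mean and target data, using SciPy's SLSQP solver; the numbers are purely illustrative and not from the lecture.

```python
# Minimal sketch of the Markowitz problem: minimize portfolio variance w' Sigma w
# over all portfolios w with a given target mean; all data below is invented.
import numpy as np
from scipy.optimize import minimize

mu = np.array([0.05, 0.08, 0.12])                 # expected returns (assumed)
Sigma = np.array([[0.04, 0.01, 0.00],             # covariance matrix (assumed)
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])
target = 0.08                                     # required portfolio mean

variance = lambda w: w @ Sigma @ w
constraints = [{"type": "eq", "fun": lambda w: w @ mu - target},   # mean constraint
               {"type": "eq", "fun": lambda w: w.sum() - 1.0}]     # fully invested
w0 = np.ones(3) / 3
res = minimize(variance, w0, method="SLSQP", constraints=constraints)
print(res.x, variance(res.x))                     # minimum-variance weights
```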

    Optimization problems ctd.

b) Calibration problems. Denote by $g_1(\theta), \dots, g_m(\theta)$ the model prices of $m$ financial instruments for a given parameter vector $\theta \in X \subseteq \mathbb{R}^n$, and by $g_1^*, \dots, g_m^*$ the observed market prices. Model calibration leads to the optimization problem
$$\min_{\theta \in X} \; \frac{1}{2} \sum_{i=1}^m \big(g_i(\theta) - g_i^*\big)^2.$$
If $g_i$ is linear in $\theta$ we have a standard regression problem; otherwise one speaks of a generalized regression problem (a small least-squares sketch follows after this list).

c) Maximum likelihood methods in statistics.

d) Financial mathematics. Duality results from convex analysis are crucial in financial mathematics (first fundamental theorem of asset pricing or superhedging duality).
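The least-squares sketch referred to in b): a toy calibration in which the "model price" function and all data are invented for illustration; scipy.optimize.least_squares minimizes exactly the sum of squared residuals above (including the factor 1/2).

```python
# Minimal sketch of the calibration problem in b): least-squares fit of a parameter
# vector theta so that toy "model prices" match observed prices. The model g and all
# data are invented for illustration only.
import numpy as np
from scipy.optimize import least_squares

maturities = np.array([0.5, 1.0, 2.0, 5.0])
observed = np.array([0.98, 0.95, 0.90, 0.78])        # hypothetical market prices

def model_prices(theta):
    a, b = theta                                     # toy two-parameter discount curve
    return np.exp(-(a + b * maturities) * maturities)

def residuals(theta):
    return model_prices(theta) - observed            # g_i(theta) - g_i*

fit = least_squares(residuals, x0=np.array([0.01, 0.0]))
print(fit.x)                                         # calibrated parameters
```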

    Overview

1. Unconstrained Optimization: necessary and sufficient optimality conditions; numerical methods

2. Lagrange multiplier theory and Karush-Kuhn-Tucker theory

3. Convex optimization and duality: convexity and separation; the dual problem; duality results and economic applications

    Differentiability and partial derivatives

Consider some $f: U \subseteq \mathbb{R}^n \to \mathbb{R}$, $(x_1, \dots, x_n)^t \mapsto f(x_1, \dots, x_n)$, where $U$ is an open subset of $\mathbb{R}^n$, and some $x \in U$.

Definition. (1) $f$ is called continuously differentiable on $U$ ($f \in C^1(U)$) if for all $x \in U$ all partial derivatives exist and if the partial derivatives are continuous functions of $x$.

(2) More generally, a function $f: U \subseteq \mathbb{R}^n \to \mathbb{R}^m$, $(x_1, \dots, x_n)^t \mapsto (f_1(x_1, \dots, x_n), \dots, f_m(x_1, \dots, x_n))^t$, is continuously differentiable on $U$ if all components $f_1, \dots, f_m$ belong to $C^1(U)$.

    Example: Quadratic form

Consider a symmetric $2 \times 2$ matrix $A$ and let
$$f(x_1, x_2) = x^t A x = a_{11} x_1^2 + 2 a_{12} x_1 x_2 + a_{22} x_2^2.$$
Then
$$\frac{\partial f(x)}{\partial x_1} = 2 a_{11} x_1 + 2 a_{12} x_2 = (2Ax)_1 \quad\text{and}\quad \frac{\partial f(x)}{\partial x_2} = (2Ax)_2.$$
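A quick numerical cross-check (illustration only) that the gradient of $x^t A x$ is $2Ax$, using central finite differences on a randomly drawn symmetric matrix.

```python
# Check numerically that the gradient of f(x) = x' A x is 2 A x,
# using central finite differences on a random symmetric 2x2 matrix.
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((2, 2))
A = (B + B.T) / 2                      # symmetric 2x2 matrix
f = lambda x: x @ A @ x

x = np.array([0.7, -1.3])
h = 1e-6
grad_fd = np.array([(f(x + h * e) - f(x - h * e)) / (2 * h) for e in np.eye(2)])
print(grad_fd)        # finite-difference gradient
print(2 * A @ x)      # analytic gradient 2 A x -- matches up to O(h^2)
```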

    Gradient and Jacobi matrix

Suppose that $f: U \to \mathbb{R}$ is in $C^1(U)$. Then the column vector
$$\nabla f(x) = \Big(\frac{\partial f}{\partial x_1}(x), \dots, \frac{\partial f}{\partial x_n}(x)\Big)^t$$
is the gradient of $f$.

For a $C^1$ function $g: U \to \mathbb{R}^m$ the Jacobi matrix is given by
$$J_g(x) = \begin{pmatrix} \frac{\partial g_1(x)}{\partial x_1} & \cdots & \frac{\partial g_1(x)}{\partial x_n} \\ \vdots & & \vdots \\ \frac{\partial g_m(x)}{\partial x_1} & \cdots & \frac{\partial g_m(x)}{\partial x_n} \end{pmatrix}.$$
Sometimes one also uses the gradient matrix
$$\nabla g(x) = J_g(x)^t = (\nabla g_1(x), \dots, \nabla g_m(x)).$$

    First order (Taylor) approximation

Consider some $C^1$ function $f: U \to \mathbb{R}$. Then for any $x, y \in U$
$$f(y) - f(x) = \nabla f(x)^t (y - x) + R(x, y - x) \qquad (2)$$
where it holds that $\lim_{z \to 0} \frac{R(x, z)}{\|z\|} = 0$.

Idea. The function $f$ can be approximated locally around $x$ by the affine mapping $y \mapsto f(x) + \nabla f(x)^t (y - x)$.

Similarly, we get for a $C^1$ function $g: U \to \mathbb{R}^m$ that
$$g(y) - g(x) = J_g(x)(y - x) + R(x, y - x) \quad\text{with}\quad \lim_{z \to 0} \frac{R(x, z)}{\|z\|} = 0.$$

    Chain rule

Theorem. Consider $C^1$ functions $f: \mathbb{R}^n \to \mathbb{R}^m$ and $g: \mathbb{R}^k \to \mathbb{R}^n$ and let $h := f \circ g$. Then $h$ is $C^1$ and it holds for the Jacobi matrix that $J_h(x) = J_f(g(x))\, J_g(x)$, i.e. the Jacobian of the composition is the product of the individual Jacobi matrices.

Example (Derivative along a vector). Consider a $C^1$ function $f: \mathbb{R}^n \to \mathbb{R}$. We want to consider $f$ along the straight line $\gamma(t) := x + t v$, for $t \in \mathbb{R}$, $x, v \in \mathbb{R}^n$. We have $J_\gamma(t) = v$, $J_f(x) = (\nabla f(x))^t$ and hence
$$\frac{d}{dt} f(\gamma(t)) = (\nabla f(x + t v))^t v, \quad\text{in particular}\quad \frac{d}{dt} f(\gamma(t))\Big|_{t=0} = (\nabla f(x))^t v.$$
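A small numerical illustration of the derivative along a vector: for an arbitrarily chosen quadratic $f$, the finite-difference derivative of $t \mapsto f(x + tv)$ at $t = 0$ agrees with $(\nabla f(x))^t v$.

```python
# Numerical illustration of d/dt f(x + t v) |_{t=0} = grad f(x)' v
# for an invented quadratic example f(x) = x' A x (gradient 2 A x).
import numpy as np

A = np.array([[2.0, 0.5], [0.5, 1.0]])
f = lambda x: x @ A @ x
x = np.array([1.0, -2.0])
v = np.array([0.3, 0.4])

h = 1e-6
numeric = (f(x + h * v) - f(x - h * v)) / (2 * h)   # derivative along v at t = 0
analytic = (2 * A @ x) @ v                          # (grad f(x))' v
print(numeric, analytic)                            # essentially equal
```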

    Second derivatives

Definition. Consider a $C^1$ function $f: U \subseteq \mathbb{R}^n \to \mathbb{R}$. Then the first order partial derivatives $\frac{\partial f(x)}{\partial x_i}$, $1 \le i \le n$, are themselves functions from $U$ to $\mathbb{R}$.

1. If all partial derivatives are $C^1$ functions, $f$ is called twice continuously differentiable on $U$ ($f \in C^2(U)$). Fix $i, j \in \{1, \dots, n\}$. Then one writes
$$\frac{\partial^2 f}{\partial x_i \partial x_j}(x) := \frac{\partial}{\partial x_j}\Big(\frac{\partial f}{\partial x_i}(x)\Big)$$
for the second partial derivative in direction $x_i$ and $x_j$.

2. For $f \in C^2(U)$ the matrix $H_f$ with $H_f^{ij}(x) = \frac{\partial^2 f}{\partial x_i \partial x_j}(x)$ is the Hessian matrix of $f$.

    Theorem of Young and Schwarz

Theorem. Consider $f \in C^2(U)$. Then the Hessian matrix is symmetric, that is
$$\frac{\partial^2 f}{\partial x_i \partial x_j}(x) = \frac{\partial^2 f}{\partial x_j \partial x_i}(x), \quad 1 \le i, j \le n.$$
It follows that the Hessian is a symmetric matrix, that is $H_f^{ij}(x) = H_f^{ji}(x)$, $1 \le i, j \le n$. In particular, the definiteness of $H_f$ can be checked using eigenvalues: $H_f$ is positive definite if all eigenvalues are strictly positive and positive semidefinite if all eigenvalues are non-negative.
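A minimal sketch of this eigenvalue check, with a made-up symmetric matrix standing in for some Hessian $H_f(x)$.

```python
# Checking definiteness of a symmetric Hessian via its eigenvalues.
# The matrix below is an invented stand-in for some H_f(x).
import numpy as np

H = np.array([[2.0, -1.0],
              [-1.0, 2.0]])

eigvals = np.linalg.eigvalsh(H)          # eigenvalues of a symmetric matrix
print(eigvals)                           # [1., 3.] here
if np.all(eigvals > 0):
    print("positive definite")
elif np.all(eigvals >= 0):
    print("positive semidefinite")
```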

    Example

(1) Consider $f(x_1, x_2) = x_1^3 x_2 + x_1^2 x_2^2 + x_1 + x_2^2$. Then we have
$$\frac{\partial^2 f}{\partial x_1^2} = 6 x_1 x_2 + 2 x_2^2, \quad \frac{\partial^2 f}{\partial x_2^2} = 2 x_1^2 + 2, \quad \frac{\partial^2 f}{\partial x_1 \partial x_2} = 3 x_1^2 + 4 x_1 x_2.$$

(2) Consider $f(x) = x^t A x$ for some symmetric matrix $A$. Then $H_f(x) = 2A$.
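The Hessian entries in example (1) can be cross-checked symbolically; the sketch below uses SymPy and is illustration only.

```python
# Symbolic check of the Hessian entries in example (1) using SymPy.
import sympy as sp

x1, x2 = sp.symbols("x1 x2")
f = x1**3 * x2 + x1**2 * x2**2 + x1 + x2**2

H = sp.hessian(f, (x1, x2))
print(H)   # Matrix([[6*x1*x2 + 2*x2**2, 3*x1**2 + 4*x1*x2],
           #         [3*x1**2 + 4*x1*x2, 2*x1**2 + 2]])
```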

    Second order Taylor expansion

Theorem. If $f \in C^2(U)$ and $x, y \in U$ the Taylor formula becomes
$$f(y) - f(x) = \nabla f(x)^t (y - x) + \frac{1}{2} (y - x)^t H_f(x) (y - x) + R_2(x, y - x)$$
where $\lim_{z \to 0} \frac{R_2(x, z)}{\|z\|^2} = 0$.

Idea. $f$ can be approximated locally around $x \in U$ by the quadratic function
$$y \mapsto f(x) + \nabla f(x)^t (y - x) + \frac{1}{2} (y - x)^t H_f(x) (y - x).$$
Locally, this is a better approximation than the first order Taylor approximation.
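A small numerical illustration, for an arbitrarily chosen smooth function, of how much smaller the second-order Taylor error is compared to the first-order one.

```python
# First- vs. second-order Taylor approximation of f(x1, x2) = exp(x1) * sin(x2)
# around a point; the test function and the points are invented for illustration.
import numpy as np

f = lambda x: np.exp(x[0]) * np.sin(x[1])
grad = lambda x: np.array([np.exp(x[0]) * np.sin(x[1]), np.exp(x[0]) * np.cos(x[1])])
hess = lambda x: np.array([[np.exp(x[0]) * np.sin(x[1]), np.exp(x[0]) * np.cos(x[1])],
                           [np.exp(x[0]) * np.cos(x[1]), -np.exp(x[0]) * np.sin(x[1])]])

x = np.array([0.2, 0.5])
y = x + np.array([0.05, -0.03])          # a nearby point
d = y - x

first  = f(x) + grad(x) @ d
second = first + 0.5 * d @ hess(x) @ d
print(abs(f(y) - first), abs(f(y) - second))   # second-order error is much smaller
```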

    Unconstrained optimization: the problem

In this section we consider problems of the form
$$\text{minimize } f(x) \quad \text{for } x \in X = \mathbb{R}^n. \qquad (3)$$
Moreover, we assume that $f$ is once or twice continuously differentiable.

Most results also hold in the case where $X$ is an open subset of $\mathbb{R}^n$.

    Local and global optima

Definition. Consider the optimization problem (3).

1. $x^*$ is called an (unconstrained) local minimum of $f$ if there is some $\varepsilon > 0$ such that $f(x^*) \le f(x)$ for all $x \in \mathbb{R}^n$ with $\|x - x^*\| < \varepsilon$.

2. $x^*$ is called a global minimum of $f$ if $f(x^*) \le f(x)$ for all $x \in \mathbb{R}^n$.

3. $x^*$ is said to be a strict local/global minimum if the inequality $f(x^*) \le f(x)$ is strict for $x \ne x^*$.

4. The value of the problem is $f^* := \inf\{f(x) : x \in \mathbb{R}^n\}$.

Remark. Local and global maxima are defined analogously.

    Necessary optimality conditions

Proposition. Suppose that $x^* \in U$ is a local minimum of $f$.

1. If $f$ is $C^1$ in $U$, then $\nabla f(x^*) = 0$ (First Order Necessary Condition or FONC).

2. If moreover $f \in C^2(U)$, then $H_f(x^*)$ is positive semi-definite (Second Order Necessary Condition or SONC).

Comments.

Any $x \in \mathbb{R}^n$ with $\nabla f(x) = 0$ is called a stationary point of $f$.

The proof is based on the Taylor formula.

Necessary conditions for a local maximum: $\nabla f(x^*) = 0$ and $H_f(x^*)$ negative semidefinite.

The necessary conditions are in general not sufficient: consider $f(x) = x^3$, $x^* = 0$.
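A minimal sketch of this workflow on an invented test function: solve $\nabla f = 0$ symbolically (FONC) and classify each stationary point via the eigenvalues of the Hessian.

```python
# Find stationary points (grad f = 0) and classify them via the Hessian,
# for a simple two-variable test function chosen only for illustration.
import sympy as sp

x1, x2 = sp.symbols("x1 x2", real=True)
f = x1**3 - 3*x1 + x2**2

grad = [sp.diff(f, v) for v in (x1, x2)]
stationary = sp.solve(grad, (x1, x2), dict=True)    # solve grad f = 0
H = sp.hessian(f, (x1, x2))

for pt in stationary:
    eigs = list(H.subs(pt).eigenvals())
    print(pt, eigs)   # (1, 0): eigenvalues 6, 2 -> local minimum; (-1, 0): -6, 2 -> saddle
```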

    Sufficient optimality conditions

Proposition (Sufficient conditions). Let $f: U \subseteq \mathbb{R}^n \to \mathbb{R}$ be $C^2$ on $U$. Suppose that $x^* \in U$ satisfies the conditions
$$\nabla f(x^*) = 0, \quad H_f(x^*) \text{ strictly positive definite.} \qquad (4)$$
Then $x^*$ is a local minimum.

Comments.

The sufficient conditions are not necessary: consider e.g. $f(x) = x^4$, $x^* = 0$. No global statements are possible.

    The case of convex functions

Definition (Convex sets and functions).

i) A set $X \subseteq \mathbb{R}^n$ is convex if for all $x_1, x_2 \in X$ and $\lambda \in [0, 1]$ the convex combination $\lambda x_1 + (1 - \lambda) x_2$ belongs to $X$.

ii) A function $f: X \subseteq \mathbb{R}^n \to \mathbb{R}$ ($X$ convex) is called convex if for all $x_1, x_2 \in X$, $\lambda \in [0, 1]$
$$f(\lambda x_1 + (1 - \lambda) x_2) \le \lambda f(x_1) + (1 - \lambda) f(x_2); \qquad (5)$$
$f$ is strictly convex if the inequality is strict for $\lambda \in (0, 1)$.

iii) $f: X \subseteq \mathbb{R}^n \to \mathbb{R}$ is concave if $-f$ is convex, that is if "$\ge$" holds in (5). Strict concavity is defined in the same way.

    Characterizations of Convexity

Lemma. Consider an open convex set $X \subseteq \mathbb{R}^n$. A $C^1$ function $f: X \to \mathbb{R}$ is convex on $X$ if and only if it holds for all $x, z \in X$ that
$$f(z) \ge f(x) + \nabla f(x)^t (z - x).$$
If $f$ is $C^2$, a necessary and sufficient condition for the convexity of $f$ on $X$ is that $H_f(x)$ is positive semi-definite for all $x \in X$.

Comments.

$f$ is concave on $U$ if and only if $H_f$ is negative semidefinite on $U$.

Note that we may decide convexity or concavity by finding the eigenvalues of $H_f(x)$.

    Example

Problem. Let $f(x_1, x_2) = 2 x_1 - x_2 - x_1^2 + 2 x_1 x_2 - x_2^2$. Is $f$ convex, concave or neither?

Solution. The symmetric matrix representing the quadratic part of $f$ is
$$A = \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}.$$
An easy computation gives for the Hessian that $H_f(x) = 2A$. Hence we need to check the definiteness of $A$.

Approach via eigenvalues. The characteristic polynomial of $A$ is $P(\lambda) = \lambda^2 + 2\lambda$; the equation $P(\lambda) = 0$ has solutions (eigenvalues) $-2$ and $0$. Hence $A$ is negative semidefinite and the function is concave.
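A one-line numerical cross-check of the eigenvalue computation above.

```python
# Numerical cross-check of the example: eigenvalues of the quadratic-part matrix A.
import numpy as np

A = np.array([[-1.0,  1.0],
              [ 1.0, -1.0]])
print(np.linalg.eigvalsh(A))   # [-2.  0.] -> negative semidefinite, so f is concave
```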

    Optimality conditions for convex functions

Proposition. Let $f: X \to \mathbb{R}$ be a convex function on some convex set $X \subseteq \mathbb{R}^n$. Then

1. A local minimum of $f$ over $X$ is also a global minimum. If $f$ is strictly convex, there exists at most one global minimum.

2. If $X$ is open, the condition $\nabla f(x^*) = 0$ is necessary and sufficient for $x^* \in X$ to be a global minimum of $f$.

    Example: Quadratic cost functions

Let
$$f(x) = \frac{1}{2}\, x^t Q x - b^t x, \quad x \in \mathbb{R}^n,$$
for a symmetric $n \times n$ matrix $Q$ and some $b \in \mathbb{R}^n$. Then we have
$$\nabla f(x) = Qx - b \quad\text{and}\quad H_f(x) = Q.$$

a) Local minima. If $x^*$ is a local minimum we must have $\nabla f(x^*) = Qx^* - b = 0$ and $H_f(x^*) = Q$ positive semi-definite; hence if $Q$ is not positive semi-definite, $f$ has no local minima.

b) If $Q$ is positive semi-definite, $f$ is convex. In that case we need not distinguish global and local minima, and $f$ has a global minimum if and only if there is some $x^*$ with $Qx^* = b$.

c) If $Q$ is positive definite, $Q^{-1}$ exists and the unique global minimum is attained at $x^* = Q^{-1} b$.

Existence results for a global minimum

Proposition (Weierstrass Theorem). Let $X \subseteq \mathbb{R}^n$ be non-empty and suppose that $f: X \to \mathbb{R}$ is continuous on $X$. Suppose moreover that one of the following three conditions holds:

(1) $X$ is compact (closed and bounded).

(2) $X$ is closed and $f$ is coercive, that is, for every sequence $(x_k)_{k \in \mathbb{N}} \subseteq X$ with $\|x_k\| \to \infty$ one has $\lim_{k \to \infty} f(x_k) = \infty$.

(3) There is some $\gamma \in \mathbb{R}$ such that the level set $\{x \in X : f(x) \le \gamma\}$ is non-empty and compact.

Then $f$ has at least one global minimum and the set of all global minima is compact.

Remark. The result also holds for lower semicontinuous functions. ($f$ is called lower semicontinuous if for all $x \in X$ and all $(x_k)_{k \in \mathbb{N}}$ with $x_k \to x$ one has $\liminf_{k \to \infty} f(x_k) \ge f(x)$.)
