38
erˆ ome Bolte Why KL ? Semi-algebraic sets and functions What is a semi-algebraic set ? What is KL ? How to use KL? Abstract descent methods IIlustration: splitting and others method Other methods A user’s guide to Lojasiewicz/KL inequalities erˆ ome Bolte Toulouse School of Economics, Universit´ e Toulouse I SLRA, Grenoble, 2015

A user's guide to ojasiewicz/KL inequalitiesJ er ome^ Bolte Why KL ? Semi-algebraic sets and functions What is a semi-algebraic set ? What is KL ? How to use KL? Abstract descent methods

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    A user’s guide to Lojasiewicz/KLinequalities

    Jérôme Bolte

    Toulouse School of Economics,

    Université Toulouse I

    SLRA, Grenoble, 2015

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Motivations behind KL

    f : Rn → R smooth ẋ(t) = −∇f (x(t))or xk+1 = xk − λk∇f (xk)

    KL answers a “counterintuitive” phenomenon of gradientsystems.

    Bounded curves/sequences of with f C∞ may oscillateindefinitely (for any choices of steps...) and generatepaths with INFINITE LENGTH

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Smooth counter-examples: Palis-De Melo, Absil etal.

    ẋ(t) = −∇f (x(t))Gradient: xk+1 = xk − λk∇f (xk)

    Idea: Spiraling bump yields spiraling curves

    f (r , θ) = e(1−r2)

    (1− r

    4

    r4 + (1− r2)4

    )sin

    (θ − 1

    1− r2

    ), r ≥ 1

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Counter-examples (continued)

    Oscillations in Optimization:

    Very bad predictionsarbitrarily bad complexity

    – Even worst behaviors for nonsmooth functions

    – Same awful behaviors for more complex meth-ods: e.g. Forward-Backward

    – Solutions to these bad behaviors

    Topological approach: to avoid oscillationsuse semi-algebraic (or definable) functions

    Ad hoc approach: study deformations of aclass of nonsmooth functions “sharpfunction”

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Definition : semi-algebraic sets and functions

    A semi-algebraic subset of Rn is a finite union of sets of theform

    {x ∈ Rn : fi (x) = 0, gj(x) < 0, i ∈ I} ,

    where I , J are finite and fi , gj : Rn → R are real polynomialfunctions.

    Stability by finite⋃,⋂, and complementation.

    A function or a mapping is semi-algebraic if its graph is asemi-algebraic set (same definition for real-extended function ormultivalued mappings)

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    How to recognize semi-algebraic sets / functions?

    Theorem (Tarski-Seidenberg)

    The image of a semi-algebraic set by a linear projection issemi-algebraic.

    Example: Let A be a semi-algebraic subset of Rnand f : Rn → Rp semi-algebraic.Then f (A) is semi-algebraic.

    Proof. Thenf (A) = {y ∈ Rp : ∃x ∈ A, y = f (x)}= Πx(graph f ∩(A× Rp)),where Πx(x , y) = y for all (x , y) ∈ Rn × Rp.

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    How to recognize semi-algebraic sets / functions?

    Theorem (Tarski-Seidenberg)

    The image of a semi-algebraic set by a linear projection issemi-algebraic.

    Example: Let A be a semi-algebraic subset of Rnand f : Rn → Rp semi-algebraic.Then f (A) is semi-algebraic.

    Proof. Thenf (A) = {y ∈ Rp : ∃x ∈ A, y = f (x)}= Πx(graph f ∩(A× Rp)),where Πx(x , y) = y for all (x , y) ∈ Rn × Rp.

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Checking semi-algebraicity is easy

    Typical illustration:

    The closure of a semi-algebraic set A is semi-algebraic.

    Proof. Observe that

    A =

    {y ∈ Rn : ∀� ∈]0,+∞[, ∃x ∈ A,

    n∑i=1

    (xi − yi )2 < �2}.

    Set

    B = {(x , �, y) ∈ Rn×]0,+∞[×Rn :n∑

    i=1

    (xi − yi )2 < �2}

    then

    A = Rn \ Πy ((Rn × Rn \ Π�(B)))where

    Π�(x , �, y) = (x , y) Πy (x , y) = x .

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Checking semi-algebraicity is easy

    Typical illustration:

    The closure of a semi-algebraic set A is semi-algebraic.

    Proof. Observe that

    A =

    {y ∈ Rn : ∀� ∈]0,+∞[, ∃x ∈ A,

    n∑i=1

    (xi − yi )2 < �2}.

    Set

    B = {(x , �, y) ∈ Rn×]0,+∞[×Rn :n∑

    i=1

    (xi − yi )2 < �2}

    then

    A = Rn \ Πy ((Rn × Rn \ Π�(B)))where

    Π�(x , �, y) = (x , y) Πy (x , y) = x .

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Other examples

    Other examples:Is the derivative semi-algebraic ?

    F : U ⊂ Rn → Rm, graph dF = {(x , L) ∈ U×Rn×m, L = df (x)}

    L = df (x)

    ⇔ f (x + h) = f (x) + Lh + o(‖h‖)⇔ ∀� > 0, ∃δ > 0,(‖h‖ < δ, x + h ∈ U)⇒ ‖f (x + h)− f (x)− Lh‖ < �‖h‖)

    Exercise: show that g(x) = max{f (x , y) : y ∈ K} issemi-algebraic whenever f , K semi-algebraic.

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Examples

    Other examples:

    min1

    2||Ax − b||2 + λ‖x‖p (p rational)

    min

    {1

    2||AX − B||2 + λrank (X ) : X matrix

    }min

    {1

    2||AB −M||2 : A ≥ 0,B ≥ 0

    }min

    {1

    2||AB −M||2 : A ≥ 0,B ≥ 0, ‖A‖0 ≤ r , ‖B‖ ≤ s

    }. . .

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Fréchet subdifferential

    f : Rn → R ∪ {+∞}.First-order formulation: Let x in dom f(Fréchet) subdifferential : p ∈ ∂̂f (x) iff

    f (u) ≥ f (x) + 〈p, u − x〉+ o(||u − x ||), ∀u ∈ Rn.

    0 1 2 3 4 5 6

    0

    1

    2

    3u → f (u)

    u → f (x) + 〈p.u − x〉

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Subdifferential

    Definition (Subdifferential)

    It is denoted by ∂f and defined through:x∗ ∈ ∂f (x) iff (xk , x∗k )→ (x , x∗) such that f (xk)→ f (x) and

    f (u) ≥ f (xk) + 〈x∗k , u − xk〉+ o(‖u − xk‖).

    Example: f (x) = ‖x‖1, ∂f (0) = B∞ = [−1, 1]nSet

    ‖∂f (x)‖− = min{‖x∗‖ : x∗ ∈ ∂f (x)}

    Properties (Critical point)

    Fermat’s rule if f has a minimizer at x then ∂f (x) 3 0.Conversely when 0 ∈ ∂f (x), the point x is called critical.

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    An elementary remedy to “gradient oscillation”:Sharpness

    A function f : Rn → R ∪ {+∞} is called sharp on the slice[r0 < f < r1] := {x ∈ Rn : r0 < f (x) < r1}, if there exists c > 0

    ‖∂f (x)‖− ≥ c > 0, ∀x ∈ [r0 < f < r1]

    Basic example f (x) = ‖x‖

    −5 0 5 −50

    50

    20

    Many works since 78, Rockafellar, Polyak, Ferris, Burke andmany others..

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Sharpness: example

    −4 −2 0 2 4 −5

    0

    50

    5

    Nonconvex illustration with a continuum of minimizers

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Finite convergence

    Why is sharpness interesting?

    Gradient curves reach “the valley” within a finitetime.

    Easy to see the phenomenon on proximal descentProx descent = formal setting for implicit gradient

    x+ = x − step. ∂f (x+)

    Prox operator:proxsf x = argmin{f (u) +

    12s ||u − x ||

    2 : u ∈ Rn}(Moreau)

    x+ = prox stepf (x)

    When f is sharp, convergence occurs in finite time !!

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Proof

    Assume f is sharp, then if xk+1 is non criticalf (xk+1) ≤ f (xk)− δ.

    Write (assume “step” is constant)

    f (xk+1) ≤ f (xk)− 12.step‖xk+1 − xk‖2

    xk+1 − xk ∈ step. ∂f (xk+1) thus ‖xk+1 − xk‖ ≥ c .step

    f (xk+1) ≤ f (xk)− c2.step2

    Set δ = 0.5c2.step.

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Measuring the default of sharpness

    0 is a critical value of f (true up to a translation).Set [0 < f < r0] := {x ∈ Rn : 0 < f (x) < r0} and assumethat there are no critical points in [0 < f < r0].

    f has the KL property on [0 < f < r0] if there exists afunction g : [0 < f < r0]→ R ∪ {+∞} whose sublevel sets arethose of f , ie [f ≤ r ]r∈(0,r0), and such that

    ||∂g(x)||− ≥ 1 for all x in [0 < f < r0].

    EX (Concentric circles) f (x) = ||x ||2 and g(x) = ||x ||

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Formal KL property

    Formally :Desingularizing functions on (0, r0) :ϕ ∈ C ([0, r0),R+), concave, ϕ ∈ C 1(0, r0), ϕ′ > 0 andϕ(0) = 0.

    Definition

    f has the KL property on [0 < f < r0] if there exists adesingularizing function ϕ such that

    ||∂(ϕ ◦ f )(x)||− ≥ 1, ∀x ∈ [0 < f < r0].

    Local version : replace [0 < f < r0] by the intersection of[0 < f < r0] with a closed ball.

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Key results

    Theorem [ Lojasiewicz (Hörmander) 1968; Kurdyka 98] Iff is analytic or smooth and semialgebraic, it satisfies the Lojasiewicz inequality around each point of Rn.

    Nonsmooth semialgebraic/subanalytic case 2006:[B-Daniilidis-Lewis] Take f : Rn → R ∪ {+∞} lowersemicontinuous and semialgebraic then f has KL propertyaround each point.

    Many many functions satisfy KL inequality (B-Daniilidis-Lewis-Shiota 2007, Attouch-B-Redont 2009...)

    More to come

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Descent methods at large

    Let f : Rn → R ∪ {+∞} be a proper lower semicontinuousfunction; a, b > 0.Let xk be a sequence in dom f such that

    Sufficient decrease condition

    f (xk+1) + a‖xk+1 − xk‖2 ≤ f (xk); ∀k ≥ 0

    Relative error condition For each k ∈ N, there existswk+1 ∈ ∂f (xk+1) such that

    ‖wk+1‖ ≤ b‖xk+1 − xk‖;

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    An abstract convergence theorem

    Theorem (Attouch-B-Svaiter 2012)

    Let f be a KL function and xk a descent sequence for f .If xk is bounded then it converges to a critical point of f .

    Corollary

    Let f be a coercive semi-algebraic function and xk a descentsequence for f .Then xk converges to a critical point of f .

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Rate of convergence

    Let x∞ the (unique) limit point of xk . Assume thatϕ(s) = cs1−θ with c > 0, θ ∈ [0, 1) (cover semi-algebraic case).

    Theorem (Attouch-B 2009)

    (i) If θ = 0, the sequence (xk)k∈N converges in a finite numberof steps,(ii) If θ ∈ (0, 12 ] then there exist c > 0 and Q ∈ [0, 1) such that

    ‖xk − x∞‖ ≤ c Qk ,

    (iii) If θ ∈ (12 , 1) then there exists c > 0 such that

    ‖xk − x∞‖ ≤ c k−1−θ2θ−1 .

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    (Nonconvex) forward-backward splitting algorithm

    Minimizing nonsmooth+smooth structure: f = g + h

    with

    {h ∈ C 1 and ∇h L− Lipschitz continuousg lsc bounded from below + prox is easily computable

    Forward-backward splitting (Lions-Mercier 79): Let γk be suchthat 0 < γ < γk < γ <

    1L

    xk+1 ∈ prox γk g (xk − γk∇h(xk))

    Theorem

    If the problem is coercive xk is a converging sequence

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Sparse solutions of under-determined systems

    min{λ‖x‖0 +1

    2‖Ax − b‖2}, (Blumensath − Davis 2009)

    Forward-backward splitting

    xk+1 ∈ prox γkλ‖·‖0(

    xk − γk(ATAxk − ATb))

    where 0 < γ < γk < γ <1

    ‖ATA‖F,

    Theorem (Attouch-B-Svaiter)

    The sequence xk converges to a critical point ofλ‖x‖0 + 12‖Ax − b‖

    2

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Low rank solutions of under-determined systems

    rankM= rank of M or a semi-algebraic (or definable)surrogate of the rank whose prox is easily computable.See Hiriart-Urruty talk.

    min

    {λrank (M) +

    1

    2‖AM − B‖2

    }, where A = linear operator

    Forward-backward splitting

    Mk+1 ∈ prox γkλrank(

    xk − γk(A TAMk −A TB))

    where 0 < γ < γk < γ <1

    ‖A TA‖F,

    Theorem

    The sequence Mk converges to a critical point ofλrankM + 12‖AM − B‖

    2

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Gradient projection

    C a semi-algebraic set, f a semi-algebraic function.

    xk+1 ∈ PC(

    xk − γk∇f (xk))

    Theorem

    The sequence xk converges to a critical point of the problemminC f

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Gradient projection (Example)

    12‖AM − B‖

    2 + iCs (x) with Cs the set of matrix of at mostrank s.

    PCs M is known in closed form “threhsold the inner diag-onal matrix in the SVD”

    Gradient-Projection method:

    Mk+1 ∈ PCs(

    Mk − γk(A TAMk −A TB))

    where γk ∈ (�, 1‖A TA‖F ). No oscillations, Mk converges to acritical point!!

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Another illustration: Averaged projections

    xk+1 ∈ (1− θ) xk + θ

    (1

    p

    p∑i=1

    PFi (xk)

    )+ �k ,

    where θ ∈ (0, 1) and ‖�k‖ ≤ M‖xk+1 − xk‖

    Theorem (Inexact averaged projection method)

    F1, . . . ,Fp be semi-algebraic with C2 boundaries or convex

    which satisfy⋂p

    i=1 Fi 6= ∅. If x0 is sufficiently close to⋂p

    i=1 Fi ,then xk converges to a feasible point x̄ , i.e. such that

    x̄ ∈p⋂

    i=1

    Fi .

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Alternating versions of prox algorithm

    F : Rm1 × . . .× Rmp → R ∪ {+∞} lsc semi-algebraicStructure of F :

    F (x1, . . . , xp) =

    p∑i=1

    fi (xi ) + Q(x1, . . . , xp)

    fi are proper lsc and Q is C1.

    Gauss-Seidel method/ Prox (Auslender 1993,Grippo-Sciandrone 2000)

    xk+11 ∈ argminu∈Rm1

    F (u, xk2 , . . . , xkp ) +

    1

    2µ1k‖u − x1k‖2

    . . .

    xk+1p ∈ argminu∈Rmp

    F (xk+11 , . . . , xk+1p−1 , u) +

    1

    2µpk‖u − xpk ‖

    2

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Alternating versions of prox algorithm

    F : Rm1 × . . .× Rmp → R ∪ {+∞} lsc semi-algebraicStructure of F :

    F (x1, . . . , xp) =

    p∑i=1

    fi (xi ) + Q(x1, . . . , xp)

    fi are proper lsc and Q is C1.

    Gauss-Seidel method/ Prox (Auslender 1993,Grippo-Sciandrone 2000)

    xk+11 ∈ argminu∈Rm1

    F (u, xk2 , . . . , xkp ) +

    1

    2µ1k‖u − x1k‖2

    . . .

    xk+1p ∈ argminu∈Rmp

    F (xk+11 , . . . , xk+1p−1 , u) +

    1

    2µpk‖u − xpk ‖

    2

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Convergence

    Theorem (Attouch-B.-Redont-Soubeyran 2010,B.-Combettes-Pesquet, 2010)

    Assume that F is a KL function (e.g. fi ,Q are semi-algebraic),then the sequence (xk1 , . . . , x

    kp ) satisfies either

    (xk1 , . . . , xkp )→ +∞

    (xk1 , . . . , xkp ) converges to a critical point of F

    Note that convergence automatically happens if either one ofthe fi or Q is coercive.

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Proximal alternating linearized method

    F (x) = Q(x1, x2) + f1(x1) + f2(x2),

    ∀x1, Q(x1, ·) is C 1 with L2(x1) > 0 Lipschitz continuousgradient (same for x2 with L1(x2))

    xk+11 ∈ prox 1L1(x

    k2)f1

    (xk1 −

    1

    L1(xk2 )∇x1Q(xk1 , xk2 )

    ).

    xk+12 ∈ prox 1L2(x

    k1)f1

    (xk1 −

    1

    L2(xk1 )∇x2Q(xk+11 , x

    k2 )

    ).

    Set xk+1 = (xk+11 , xk+12 ).

    Theorem (Convergence of PALM (B., Teboulle, Sabach))

    Any bounded sequence generated by PALM converges to acritical point of F .

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Sparse Nonnegative Matrix Factorization

    Consider in NMF some sparsity measure as

    ‖X‖0 = number of nonzero entries

    min

    {1

    2‖A− XY‖2F : X ≥ 0, ‖X‖0 ≤ r ,

    Y ≥ 0, ‖Y‖0 ≤ s}

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    PALM for Sparse NMF

    1. Take X 0 ∈ Rm×r+ and Y 0 ∈ R+r×n.2. Generate a sequence

    {(X k ,Y k

    )}k∈N:

    2.1. Take γ1 > 1, set ck = γ1‖Y k(Y k)T ‖F and compute

    Uk = X k − 1ck

    (X kY k − A

    ) (Y k)T

    ;

    X k+1 ∈ prox ‖·‖1ck(Uk)

    = Tα(P+(Uk)).

    2.2. Take γ2 > 1, set dk = γ2‖X k+1(X k+1

    )T ‖F and computeV k = Y k − 1

    dk

    (X k+1

    )T (X k+1Y k − A

    )Y k+1 ∈ prox R2dk

    (V k)

    = Tβ(P+(V k)).

    We thus get the desired convergence result by our generaltheorem

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Some complications: A SQP method

    We wish to solve

    min{f (x) : x ∈ Rn, fi (x) ≤ 0}

    which can be written

    min{f + iC} with C = [fi ≤ 0, ∀i ].

    We assume:

    ∇f is Lf Lipschitz∀i , ∇fi is Lfi Lipschitz continuous.

    Prox operator here are out of reach: too complex !!

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Moving balls methods

    Classical Sequential Quadratic Programming (SQP) idea,replace functions by some simple approximations.

    Moving balls method (Auslender-Shefi-Teboulle, Math. Prog.,2010)

    minx∈Rn

    f (xk) + f′(xk)(x − xk)+

    Lf2||x − xk ||2

    f1(xk) + f′1(xk)(x − xk)+

    Lf12||x − xk ||2 ≤ 0

    . . .

    fm(xk) + f′m(xk)(x − xk)+

    Lfm2||x − xk ||2 ≤ 0

    Bad surprise: xk is not a descent sequence for f + iC .

  • JérômeBolte

    Why KL ?

    Semi-algebraicsets andfunctions

    What is asemi-algebraicset ?

    What is KL ?

    How to useKL?

    Abstract descentmethods

    IIlustration:splitting andothers method

    Other methods

    Moving balls methods

    Introduce

    val(x) = miny∈Rn

    f (x) + f ′(x)(y − x)+Lf2||y − x ||2

    f1(x) + f′1(x)(y − x)+

    Lf12||y − x ||2 ≤ 0

    . . .

    fm(x) + f′m(x)(y − x)+

    Lfm2||y − x ||2 ≤ 0

    Good surprise: xk is a descent sequence for val.

    Theorem

    Assume Mangasarian-Fromovitz qualification condition.If xk is bounded it converges to a KKT point of the originalproblem.

    Semi-algebraic sets and functionsWhat is KL ?How to use KL?Other methods