Décompositions parcimonieuses: l’approche gloutonne (Sparse decompositions: the greedy approach)

Cédric Herzet, Inria Rennes - Télécom Bretagne


Sparse data abound in Nature

[Figures: image wavelet decomposition; voice spectrogram. Courtesy of G. Peyré.]

Exploiting a sparse prior may allow the reconstruction of unobserved data (data + sparse prior).

The sparse inversion problem

$y = Ax$, with $A \in \mathbb{R}^{m \times n}$,

where $y$ is the vector of measurements and $x$ is the unknown sparse vector.

Can we recover $x$ from $y$ by exploiting the sparsity of $x$?
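As a concrete illustration (not part of the original slides), the following numpy sketch sets up such a sparse inverse problem; the sizes m, n, k and the Gaussian dictionary are illustrative assumptions, and this toy problem (A, y, x) is reused in the later sketches.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 64, 256, 8                     # measurements, ambient dimension, sparsity (illustrative)

# Dictionary A with unit-norm columns (atoms), as assumed throughout the talk.
A = rng.standard_normal((m, n))
A /= np.linalg.norm(A, axis=0)

# k-sparse unknown vector x and noiseless measurements y = A x.
x = np.zeros(n)
support_true = rng.choice(n, size=k, replace=False)
x[support_true] = rng.standard_normal(k)
y = A @ x
```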

The target problem: find the best approximation in a union of subspaces


Let $\Sigma = \cup_i V_i \subset \mathbb{R}^m$.

Solve: $\min_{V \in \Sigma} \; \min_{z \in V} \; \|y - z\|_2^2$

[Figure: observation $y$ and subspaces $V_1$, $V_2$, $V_3$]

The inner minimization has a tractable solution


Solve: $\min_{z \in V_1} \|y - z\|_2^2$ $\;\equiv\;$ a least-squares problem

[Figure: observation $y$ and subspace $V_1$]

Complexity: $\mathcal{O}(m \dim(V_1)^3)$

The overall complexity scales linearly with the number of subspaces


Let $\Sigma = \cup_i V_i \subset \mathbb{R}^m$.

Solve: $\min_{V \in \Sigma} \; \min_{z \in V} \; \|y - z\|_2^2$

[Figure: observation $y$ and subspaces $V_1$, $V_2$, $V_3$]

Complexity: $\mathrm{card}(\Sigma) \times \mathcal{O}(m \dim(V_1)^3)$

The «sparse» paradigm involves a large number of subspaces


Solve: $\min_{V \in \Sigma_k} \; \min_{z \in V} \; \|y - z\|_2^2$

where $\Sigma_k = \cup_{S:\,\mathrm{card}(S) \le k} \mathrm{span}(A_S)$, $A \in \mathbb{R}^{m \times n}$.

$\mathrm{card}(\Sigma_k) = \binom{n}{k}$ !!!

The sparse approximation problem is NP-hard... see e.g., [Natarajan 95], [Foucart 10]

Different approaches to deal with the NP-hardness of the sparse problem


• Integer programming: [Bourguignon 16], [Miller 02, Chapter 3]

• Greedy approaches (today’s talk)

• (Convex) relaxation: [Gorodnitsky 97], [Chen 99], [Wipf 04], [Cemgil 07]

Greedy procedures construct the approximation subspace sequentially


A greedy procedure generates:

$V^{(1)} \to V^{(2)} \to \ldots \to V^{(l)}$, with $z^{(l)} \in V^{(l)}$.

In the sparse context: $S^{(1)} \to S^{(2)} \to \ldots \to S^{(l)}$, with $z^{(l)} \in \mathrm{span}\big(A_{S^{(l)}}\big)$.

MP, OMP and OLS


MP, OMP, OLS are forward greedy algorithms


The algorithms generate

$S^{(1)} \to S^{(2)} \to \ldots \to S^{(l)}$

with

$S^{(l+1)} = S^{(l)} \cup \{j\}$:

one atom of the dictionary is added at each iteration.

The approximation problem is easy to solve for unions of 1D subspaces


Consider: $\min_{V \in \Sigma_1} \; \min_{z \in V} \; \|y - z\|_2^2$, where $\Sigma_1 = \cup_{1 \le i \le n} \mathrm{span}(a_i)$.

We have:

$\min_{V \in \Sigma_1} \min_{z \in V} \|y - z\|_2^2 \;\equiv\; \min_{i \in [1,n]} \; \min_{z \in \mathrm{span}(a_i)} \|y - z\|_2^2 \;=\; \min_{i \in [1,n]} \left\{ \|y\|_2^2 - \langle a_i, y\rangle^2 \right\}$

(the inner minimum over $\mathrm{span}(a_i)$ is achieved for $z = \langle a_i, y\rangle a_i$)

The best approximation subspace maximizes $|\langle a_i, y\rangle|$.

MP strategy: first iteration


Evaluate: $j = \arg\max_{i \in [1,n]} |\langle a_i, y\rangle|$

Set: $S^{(1)} = \{j\}$, $\quad z^{(1)} = \langle a_j, y\rangle a_j$

MP strategy for the subsequent iterations: one-dimensional update of the current estimate


Let $r^{(l-1)} \triangleq y - z^{(l-1)}$ (the current approximation error).

We compute a 1D correction to $z^{(l-1)}$ as

$\min_{V \in \Sigma_1} \; \min_{z \in V} \; \big\|r^{(l-1)} - z\big\|_2^2$

As previously, we find:

$\min_{V \in \Sigma_1} \min_{z \in V} \big\|r^{(l-1)} - z\big\|_2^2 = \min_{i \in [1,n]} \left\{ \big\|r^{(l-1)}\big\|_2^2 - \big\langle a_i, r^{(l-1)}\big\rangle^2 \right\}$

MP strategy: iterations > 1


Evaluate: $j = \arg\max_{i \in [1,n]} \big|\big\langle a_i, r^{(l-1)}\big\rangle\big|$

Set: $S^{(l)} = S^{(l-1)} \cup \{j\}$, $\quad z^{(l)} = z^{(l-1)} + \big\langle a_j, r^{(l-1)}\big\rangle a_j$

$z^{(l)} \to y$ in both finite- and infinite-dimensional spaces [Jones 87], [Mallat 93].

Convergence can be slow: MP may select the same atom many times.
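A minimal numpy sketch of the MP recursion above (illustrative, not the speaker's code); it assumes unit-norm atoms and the toy problem (A, y) generated earlier, and the function name and iteration budget are arbitrary choices.

```python
import numpy as np

def mp(A, y, n_iter):
    """Matching Pursuit: rank-one updates of the approximation z."""
    z = np.zeros(A.shape[0])
    r = y.copy()                        # residual r = y - z
    support = []
    for _ in range(n_iter):
        corr = A.T @ r                  # <a_i, r> for every atom
        j = int(np.argmax(np.abs(corr)))
        support.append(j)               # the same atom may be selected several times
        z = z + corr[j] * A[:, j]       # 1D update along a_j
        r = y - z
    return z, support

# z_mp, S_mp = mp(A, y, n_iter=50)
```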

Suboptimality of MP’s approximation update


Evaluate: $j = \arg\max_{i \in [1,n]} \big|\big\langle a_i, r^{(l-1)}\big\rangle\big|$

Set: $S^{(l)} = S^{(l-1)} \cup \{j\}$, $\quad z^{(l)} = z^{(l-1)} + \big\langle a_j, r^{(l-1)}\big\rangle a_j$

We have $z^{(l)} \in \mathrm{span}\big(A_{S^{(l)}}\big)$... but we usually have

$z^{(l)} \ne \arg\min_{z \in \mathrm{span}(A_{S^{(l)}})} \|y - z\|_2^2$

OMP strategy


Evaluate: $j = \arg\max_{i \in [1,n]} \big|\big\langle a_i, r^{(l-1)}\big\rangle\big|$

Set: $S^{(l)} = S^{(l-1)} \cup \{j\}$, $\quad z^{(l)} = \arg\min_{z \in \mathrm{span}(A_{S^{(l)}})} \|y - z\|_2^2$

$z^{(l)} \to y$ in both finite- and infinite-dimensional spaces. OMP never selects the same atom twice!
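A corresponding numpy sketch of OMP (again illustrative): same selection rule as MP, but the approximation is the orthogonal projection of y onto the span of the selected atoms.

```python
import numpy as np

def omp(A, y, n_iter):
    """Orthogonal Matching Pursuit: least-squares fit on the selected support."""
    support = []
    z = np.zeros(A.shape[0])
    r = y.copy()
    for _ in range(n_iter):
        corr = A.T @ r
        j = int(np.argmax(np.abs(corr)))
        if j in support:                # cannot happen in exact arithmetic
            break
        support.append(j)
        A_S = A[:, support]
        coef, *_ = np.linalg.lstsq(A_S, y, rcond=None)
        z = A_S @ coef                  # orthogonal projection of y onto span(A_S)
        r = y - z
    return z, support

# z_omp, S_omp = omp(A, y, n_iter=k)
```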

OLS strategy


Evaluate: $j = \arg\min_{i \in [1,n]} \; \min_{z \in \mathrm{span}(A_{S^{(l-1)} \cup \{i\}})} \|y - z\|_2^2$

Set: $S^{(l)} = S^{(l-1)} \cup \{j\}$, $\quad z^{(l)} = \arg\min_{z \in \mathrm{span}(A_{S^{(l)}})} \|y - z\|_2^2$

$z^{(l)} \to y$ in both finite- and infinite-dimensional spaces. OLS never selects the same atom twice!
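An illustrative sketch of the OLS selection rule (not from the slides): each candidate atom is scored by the least-squares residual on the augmented support, which is exactly what makes OLS more expensive than OMP (one least-squares problem per candidate atom and per iteration).

```python
import numpy as np

def ols(A, y, n_iter):
    """Orthogonal Least Squares: select the atom giving the smallest LS residual."""
    support = []
    z = np.zeros(A.shape[0])
    for _ in range(n_iter):
        best_j, best_err = None, np.inf
        for i in range(A.shape[1]):
            if i in support:
                continue
            A_cand = A[:, support + [i]]
            coef, *_ = np.linalg.lstsq(A_cand, y, rcond=None)
            err = np.linalg.norm(y - A_cand @ coef) ** 2
            if err < best_err:
                best_j, best_err = i, err
        support.append(best_j)
        A_S = A[:, support]
        coef, *_ = np.linalg.lstsq(A_S, y, rcond=None)
        z = A_S @ coef
    return z, support

# z_ols, S_ols = ols(A, y, n_iter=k)
```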

Complexity


Let $A \in \mathbb{R}^{m \times n}$.

Complexity MP : $\mathcal{O}(mn)$

Complexity OMP : $\mathcal{O}(mn + k^3 + km)$

Complexity OLS : $\mathcal{O}((k^3 + km)n)$

Thus: Complexity MP $\le$ Complexity OMP $\le$ Complexity OLS

A first insight into the performance


Let $r^{(l)}$ be the residual at iteration $l$. Then,

$\big\|r^{(l-1)}\big\|_2 \;\ge\; \big\|r^{(l)}_{\mathrm{MP}}\big\|_2 \;\ge\; \big\|r^{(l)}_{\mathrm{OMP}}\big\|_2 \;\ge\; \big\|r^{(l)}_{\mathrm{OLS}}\big\|_2$
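A quick numerical illustration of this ordering (assuming the mp, omp and ols sketches above and the toy (A, y) are in scope). Note that the guarantee above compares one step taken from a common iterate; running the three algorithms independently usually exhibits the same trend.

```python
import numpy as np

# Residual norms after l iterations of each algorithm on the toy problem.
l = 5
for name, algo in [("MP", mp), ("OMP", omp), ("OLS", ols)]:
    z_hat, _ = algo(A, y, n_iter=l)
    print(name, np.linalg.norm(y - z_hat))
```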

MP, OMP and OLS in the literature


MP : [Mallat 93], «projection pursuit» [Friedman 81], [Huber 85], «pure greedy» [Temlyakov 08]

OMP : [Pati 93], [Zhang 93], [Davis 94], «orthogonal greedy algorithm» [Temlyakov 08]

OLS : [Chen 89], «forward selection» [Miller 02], «greedy algorithm» [Natarajan 95], «order recursive matching pursuit» [Cotter 99], «optimized orthogonal matching pursuit» [Rebollo-Neira 02], «pure orthogonal matching pursuit» [Foucart 11]

Other types of greedy algorithms


Multiple selections : «stagewise OMP» [Donoho 12]

Approximate update : «gradient pursuit» [Blumensath 08]

Backward recursion : «backward greedy» [Couvreur 00]

Forward-backward recursion : [Efroymson 60], [Broersen 86], «SMLR» [Kormylo 99], [Haugland 07], «Bayesian Pursuit» [Herzet 10, 12], «FoBa» [Zhang 11], SBR [Soussen 11], ...

Randomized selection : [Elad 09], [Divekar 10]

«Weak» greedy : [Temlyakov 00], [Gribonval 01]

Under which conditions can OMP solve the target approximation problem?


Is the solution of the greedy algorithm close to the optimal one?


$\exists\,?\; C$ such that $\big\|y - z^{(k)}\big\| \le C \min_{z \in \Sigma_k} \|y - z\| = C\,\mathrm{dist}(y, \Sigma_k)$

[Figure: observation $y$, subspaces $V_1$, $V_2$, $V_3$, and approximation $z^{(k)}$]

see e.g., [Temlyakov 08]

A simpler question is: can one identify the support of y?


[Figure: observation $y$ and subspaces $V_1$, $V_2$, $V_3$]

If $y \in \mathrm{span}(A_S)$, do we have $z^{(k)} = y$ and $S^{(k)} = S$?

A tight condition of success in k steps: the «exact recovery condition»


Let $\mathrm{card}(S) = k$. Then $S^{(k)} = S$ for any $y \in \mathrm{span}(A_S)$, $A_S$ full rank, iff

$\mathrm{ERC}(S) < 1$, where $\mathrm{ERC}(S) \triangleq \max_{i \notin S} \big\| A_S^+ a_i \big\|_1$.

Valid for OMP [Tropp 05], MP [Gribonval 06], OLS [Soussen 13].
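For small problems the ERC can be checked directly from its definition; an illustrative numpy sketch (the function name is arbitrary):

```python
import numpy as np

def erc(A, S):
    """Exact recovery coefficient: max_{i not in S} ||A_S^+ a_i||_1."""
    S = list(S)
    pinv_AS = np.linalg.pinv(A[:, S])              # A_S^+
    others = [i for i in range(A.shape[1]) if i not in S]
    return max(np.linalg.norm(pinv_AS @ A[:, i], ord=1) for i in others)

# k-step recovery of S is guaranteed (for every y in span(A_S)) iff erc(A, S) < 1.
```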

Checking the ERC for all supports of size k is a combinatorial task...


$S^{(k)} = S$ for any $y \in \mathrm{span}(A_S)$ with $A_S$ full rank and any $S$ with $\mathrm{card}(S) = k$, iff

$\max_{S:\,\mathrm{card}(S) = k} \mathrm{ERC}(S) < 1$

Conditions based on the mutual coherence are simple to evaluate


OMP succeeds in $k$ steps for all $S$, $\mathrm{card}(S) = k$, if

$\mu < \frac{1}{2k - 1}$, where $\mu = \max_{i \ne j} |\langle a_i, a_j\rangle|$.

• Valid for MP, OLS [Tropp 05], $\ell_1$ and $\ell_0$ minimizations [Gribonval 03]

• The condition is sharp [Cai 11]
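The mutual coherence is a single Gram-matrix computation; a minimal sketch (assuming unit-norm columns, as above):

```python
import numpy as np

def mutual_coherence(A):
    """mu = max_{i != j} |<a_i, a_j>| for a dictionary with unit-norm columns."""
    G = np.abs(A.T @ A)
    np.fill_diagonal(G, 0.0)
    return G.max()

mu = mutual_coherence(A)
k = 8                                   # sparsity level to certify (matches the toy problem above)
print("uniform k-step recovery certified:", mu < 1.0 / (2 * k - 1))
```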

Definition: the restricted isometry constant

The restricted isometry constant of order $s$ is the smallest constant $\delta_s$ such that

$(1 - \delta_s)\|x\|_2^2 \le \|Ax\|_2^2 \le (1 + \delta_s)\|x\|_2^2$

holds for all $s$-sparse vectors $x$.

For Gaussian random dictionaries $A \in \mathbb{R}^{m \times n}$, $\delta_s < \alpha$ with high probability if

$m \ge C \alpha^{-2} s \log\!\big(\tfrac{n}{s}\big)$,

see e.g., [Foucart 10, Theorem 9.2].
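Computing δ_s exactly requires an eigen-decomposition per support of size s, so it is only feasible for toy sizes; an illustrative sketch:

```python
import numpy as np
from itertools import combinations

def ric(A, s):
    """Exact restricted isometry constant of order s (combinatorial; toy sizes only)."""
    delta = 0.0
    for S in combinations(range(A.shape[1]), s):
        A_S = A[:, list(S)]
        eigvals = np.linalg.eigvalsh(A_S.T @ A_S)
        delta = max(delta, eigvals.max() - 1.0, 1.0 - eigvals.min())
    return delta

# delta_2 = ric(A[:, :30], 2)   # restrict to a few columns to keep it tractable
```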

Conditions based on RICs for the success of OMP in k steps

OMP succeeds in k steps for all S, card(S) = k if

$\delta_{k+1} < \frac{1}{3\sqrt{k}}$ [Davenport 10]

$\delta_{k+1} < \frac{1}{2\sqrt{k+1}}$ [Huang 11]

$\delta_{k+1} < \frac{1}{2\sqrt{k}}$ [Liu 12]

$\delta_{k+1} < \frac{1}{\sqrt{k}+1}$ [Maleh 11], [Mo 12], [Wang 12]

$\delta_{k+1} < \frac{\sqrt{4k+1}-1}{2k}$ [Chang 14]

$\delta_{k+1} < \frac{1}{\sqrt{k+1}}$ [Mo 15]

The condition $\delta_{k+1} < \frac{1}{\sqrt{k+1}}$ is sharp [Wen 13].

Goal of compressive sensing: recover x with as few measurements as possible

$y = Ax$, where $y$ denotes the measurements and $x$ the unknown sparse vector.

How many (random) measurements do we need to recover a $k$-sparse vector $x$ from $y$?

The coherence and RIC conditions require $m \sim k^2$ to be satisfied

Coherence: $\mu < \frac{1}{2k-1}$, together with $\mu \ge \sqrt{\frac{n-m}{m(n-1)}}$ [Foucart 13, Th 5.7] and $m \ll n$, implies $m > (2k-1)^2$.

RIC: $\delta_{k+1} < \frac{1}{\sqrt{k+1}}$, together with "$\delta_s < \alpha$ if $m \ge C\alpha^{-2} s \log(\tfrac{n}{s})$", implies $m > C' k^2$.

The recovery performance seems to scale linearly with the sparsity

[Figure: number of measurements vs. number of nonzero elements; simulated points with n = 512, A random, algo = OMP; the curve m = (2k-1)^2 is shown for reference. Linear behavior!]
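A sketch of how such a curve can be reproduced (illustrative; it reuses the omp sketch above, and the trial count, success threshold and problem sizes are arbitrary choices, not the speaker's exact protocol):

```python
import numpy as np

def min_measurements(n, k, n_trials=20, success_rate=0.9, rng=np.random.default_rng(1)):
    """Smallest m for which OMP recovers a random k-sparse support in k steps."""
    for m in range(k, n + 1):
        successes = 0
        for _ in range(n_trials):
            A = rng.standard_normal((m, n))
            A /= np.linalg.norm(A, axis=0)
            S = rng.choice(n, size=k, replace=False)
            x = np.zeros(n)
            x[S] = rng.standard_normal(k)
            _, S_hat = omp(A, A @ x, n_iter=k)
            successes += set(S_hat) == set(S)
        if successes / n_trials >= success_rate:
            return m
    return None

# points = [(k, min_measurements(128, k)) for k in range(1, 9)]   # smaller n than on the slide
```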

Open question

Is $m > Ck^2$ a necessary condition for the recovery of any support $S$, $\mathrm{card}(S) = k$, in $k$ steps?

• [Donoho 06]: Uniform recovery cannot be achieved with $m = \mathcal{O}(k)$

• [Rauhut 08]: Uniform recovery in $k$ steps is not possible with $m = \mathcal{O}(k^{3/2})$

• [Rauhut 08]: Conjecture: uniform recovery in $k$ steps requires $m = \mathcal{O}(k^2)$

Towards the average case analysis

Two main questions:

• What is the proportion of supports $S$ for which $\mathrm{ERC}(S) \ge 1$?

• If $\mathrm{ERC}(S) \ge 1$, what are the $y$ for which OMP fails?

Recovery guarantees for decaying signals

Hypothesis: $|x_1| \ge |x_2| \ge \ldots \ge |x_k|$

[Davenport 10]: If $\delta_{k+1} < \frac{1}{3}$ and $\frac{|x_i|}{|x_{i+1}|} \ge f(\delta_{k+1})$,

then OMP recovers any support $S$ in $k$ steps.

See also [Soussen 13], [Ehler 14], [Herzet 16]

Success of OMP in more than k steps


Strategy:

i) Run OMP until $r^{(l)} = 0$; let $\hat{S}$ denote the selected support.

ii) Solve $\hat{x}_{\hat{S}} = \arg\min_{x_{\hat{S}}} \big\| y - A_{\hat{S}} x_{\hat{S}} \big\|$

iii) Set $S_{\mathrm{final}} = \mathrm{supp}\big(\hat{x}_{\hat{S}}\big)$

Success: if $y = A_S x_S$ and $S \subseteq \hat{S}$, we have $\hat{x}_S = x_S$ and $\hat{x}_{\hat{S} \setminus S} = 0$.
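An illustrative sketch of this run-longer-then-debias strategy (it reuses the omp sketch above; the tolerance and function name are arbitrary):

```python
import numpy as np

def omp_debiased(A, y, max_iter, tol=1e-10):
    """Run OMP past k iterations, then prune the support by least squares (steps i-iii)."""
    _, S_hat = omp(A, y, n_iter=max_iter)                     # i) max_iter should exceed k
    coef, *_ = np.linalg.lstsq(A[:, S_hat], y, rcond=None)    # ii) least squares on S_hat
    keep = np.abs(coef) > tol                                 # iii) support of the LS solution
    S_final = [j for j, kept in zip(S_hat, keep) if kept]
    x_hat = np.zeros(A.shape[1])
    x_hat[S_final] = coef[keep]
    return x_hat, S_final
```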

Open question

Is there a polynomial-time tight condition ensuring the success of OMP in more than k iterations?

Conditions based on RIC

See also [Livshitz 14], [Wang 16] for more refined analyses

[Zhang 11] : $S \subset S^{(30k)}$ if $\delta_{31k} < 1/3$

[Foucart 11] : $S \subset S^{(12k)}$ if $\delta_{20k} \le 1/6$

[Foucart 13] : $S \subset S^{(24k)}$ if $\delta_{26k} < 1/6$

Thank you for your attention!


Recovery of a given support from random measurements

Let $A \sim \mathcal{N}\big(0, \tfrac{1}{\sqrt{m}}\big)$ and $\mathrm{card}(S) = k$. Then $S^{(k)} = S$ for any $y \in \mathrm{span}(A_S)$ and $A_S$ full rank, with probability exceeding

$1 - 2\exp\big\{ -\tfrac{m}{2Ck} \big\}$, if $m \ge 2Ck \log n$.

[Tropp 07], [Fletcher 09], [Lin 13]
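A small Monte Carlo sketch of this setting (illustrative; it reuses the omp sketch above, and the parameters and recovery check are assumptions):

```python
import numpy as np

def recovery_probability(n, k, m, n_trials=200, rng=np.random.default_rng(2)):
    """Empirical probability that OMP recovers a fixed support from Gaussian measurements."""
    S = list(range(k))                               # a fixed support of size k
    hits = 0
    for _ in range(n_trials):
        A = rng.standard_normal((m, n)) / np.sqrt(m)
        x = np.zeros(n)
        x[S] = rng.standard_normal(k)
        _, S_hat = omp(A, A @ x, n_iter=k)
        hits += set(S_hat) == set(S)
    return hits / n_trials

# print(recovery_probability(n=256, k=8, m=int(2 * 8 * np.log(256))))
```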
