Talk given at SampTA 2013. The corresponding paper is: Model Selection with Piecewise Regular Gauges (S. Vaiter, M. Golbabaee, J. Fadili, G. Peyré), Technical report, Preprint hal-00842603, 2013. http://hal.archives-ouvertes.fr/hal-00842603/
Model Selection with Piecewise Regular Gauges
Gabriel Peyré
www.numerical-tours.com
Joint work with: Samuel Vaiter, Charles Deledalle, Jalal Fadili, Joseph Salmon
Overview
• Inverse Problems
• Gauge Decomposition and Model Selection
• L2 Stability Performances
• Model Stability Performances
Inverse Problems
Recovering x0 ∈ ℝ^N from noisy observations y = Φ x0 + w ∈ ℝ^P.
Examples: inpainting, super-resolution, compressed sensing.
Estimators
Observations: y = Φ x0 + w ∈ ℝ^P.
Regularized inversion:
    x(y) ∈ argmin_{x ∈ ℝ^N} 1/2 ‖y − Φ x‖² + λ J(x)
(first term: data fidelity; second term: regularity)
Goal: performance analysis → criteria on (x0, ‖w‖, λ) to ensure:
• L2 error stability: ‖x(y) − x0‖ = O(‖w‖).
• Stability of the promoted subspace ("model").
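For J = ‖·‖₁ this estimator is the Lasso, and a standard way to compute it is proximal gradient descent (ISTA). The following is a minimal sketch, not part of the talk; the sizes, the name `ista`, and the parameter values are our illustrative choices.

```python
import numpy as np

def ista(y, Phi, lam, n_iter=500):
    """Minimize 0.5*||y - Phi x||^2 + lam*||x||_1 by proximal gradient (ISTA)."""
    L = np.linalg.norm(Phi, 2) ** 2                  # Lipschitz constant of the fidelity gradient
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        z = x - Phi.T @ (Phi @ x - y) / L            # gradient step on the data fidelity
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-thresholding: prox of (lam/L)*||.||_1
    return x

rng = np.random.default_rng(0)
Phi = rng.standard_normal((40, 100)) / np.sqrt(40)   # P = 40 measurements, N = 100
x0 = np.zeros(100)
x0[[5, 37, 60]] = [2.0, -1.5, 1.0]                   # sparse vector to recover
y = Phi @ x0 + 0.01 * rng.standard_normal(40)        # noisy observations
x_hat = ista(y, Phi, lam=0.02)
```

With few active coefficients and low noise, the estimate lands close to x0 and on the right support, illustrating the two stability notions above.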
Union of Linear Models for Data Processing
Union of models: T ∈ 𝒯, linear spaces.
• Synthesis sparsity: sparse coefficients x, image Ψ x.
• Structured sparsity: block-sparse coefficients x.
• Analysis sparsity: image x, sparse gradient D* x.
• Low-rank: multi-spectral imaging, x_{i,·} = Σ_{j=1}^r A_{i,j} S_{j,·}.
[Figure: the concept of hyperspectral imagery — measurements at many narrow contiguous wavelength bands yield a complete spectrum for each pixel, unlike the few wide, separated bands of multispectral imagers such as Landsat, SPOT, AVHRR.]
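The low-rank model for multi-spectral data can be sanity-checked in a few lines: mixing r source spectra always yields a data matrix of rank at most r. A toy sketch; the sizes and variable names are ours.

```python
import numpy as np

rng = np.random.default_rng(3)
r, n_bands, n_pixels = 3, 16, 500
A = rng.random((n_bands, r))    # mixing matrix: how each band weights the sources
S = rng.random((r, n_pixels))   # the r source spectra S_{j,.}
X = A @ S                       # x_{i,.} = sum_j A_{i,j} S_{j,.}  =>  rank(X) <= r
```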
Gauges for Union of Linear Models
Convex gauge: J : ℝ^N → ℝ⁺ with J(α x) = α J(x) for all α ∈ ℝ⁺.
Equivalently J(x) = γ_C(x) = inf {ρ > 0 : x ∈ ρ C}, where C = {x : J(x) ≤ 1} (assuming 0 ∈ C).
Union of linear models (T)_{T ∈ 𝒯} ⟷ piecewise regular ball C.
Examples:
• T = sparse vectors: J(x) = ‖x‖₁.
• T = block-sparse vectors: J(x) = |x₁| + ‖x_{2,3}‖.
• T = low-rank matrices: J(x) = ‖x‖_* (nuclear norm).
• T = anti-sparse vectors: J(x) = ‖x‖_∞.
Subdifferentials and Models
∂J(x) = {η ∈ ℝ^N : ∀ y, J(y) ≥ J(x) + ⟨η, y − x⟩}.
Definition: T_x = VectHull(∂J(x))^⊥ and e_x = Proj_{T_x}(∂J(x)).
Example: J(x) = ‖x‖₁, with I = supp(x) = {i : x_i ≠ 0}:
    ∂‖x‖₁ = {η : η_I = sign(x_I) and ∀ j ∉ I, |η_j| ≤ 1},
    T_x = {η : supp(η) ⊆ I},  e_x = sign(x).
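For J = ‖·‖₁ these objects are computable directly. The sketch below (our toy vector, our helper names) checks that projecting a subgradient onto T_x recovers e_x = sign(x).

```python
import numpy as np

x = np.array([0.0, 3.0, 0.0, -1.5])
I = np.flatnonzero(x)            # I = supp(x) = {i : x_i != 0}
e_x = np.sign(x)                 # e_x = sign(x), supported on I

def proj_T(z):
    """Orthogonal projection onto T_x = {z : supp(z) included in I}."""
    out = np.zeros_like(z)
    out[I] = z[I]
    return out

# One element of the subdifferential: eta_I = sign(x_I), |eta_j| <= 1 off I.
eta = np.array([0.2, 1.0, -0.7, -1.0])
```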
Examples
• ℓ¹ sparsity: J(x) = ‖x‖₁; e_x = sign(x), T_x = {z : supp(z) ⊆ supp(x)}.
• Structured sparsity: J(x) = Σ_b ‖x_b‖; e_x = (N(x_b))_{b ∈ B} with N(a) = a/‖a‖, T_x = {z : supp(z) ⊆ supp(x)}.
• Nuclear norm: J(x) = ‖x‖_*; with the SVD x = U Λ V*, e_x = U V*, T_x = {z : U⊥* z V⊥ = 0}.
• Anti-sparsity: J(x) = ‖x‖_∞; with I = {i : |x_i| = ‖x‖_∞}, e_x = |I|⁻¹ sign(x), T_x = {y : y_I ∝ sign(x_I)}.
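For the nuclear norm, the pair (e_x, T_x) comes straight out of an SVD. A sketch with a random rank-2 matrix; the setup and helper names are ours.

```python
import numpy as np

rng = np.random.default_rng(1)
# a rank-2 matrix x in R^{5x4}
x = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))
U, s, Vt = np.linalg.svd(x, full_matrices=True)
r = int(np.sum(s > 1e-10))                  # numerical rank
e_x = U[:, :r] @ Vt[:r, :]                  # e_x = U V^* (all nonzero singular values equal 1)
U_perp, V_perp = U[:, r:], Vt[r:, :].T      # orthogonal complements of the row/column spaces

def in_T(z):
    """z belongs to T_x iff U_perp^* z V_perp = 0."""
    return np.allclose(U_perp.T @ z @ V_perp, 0.0)
```

In particular x itself lies in its own model space T_x, while a generic matrix does not.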
Dual Certificate and L2 Stability
Noiseless recovery: min_{Φ x = Φ x0} J(x)   (P0)
Dual certificates: D = Im(Φ*) ∩ ∂J(x0).
Tight dual certificates: D̄ = Im(Φ*) ∩ ri(∂J(x0)).
Proposition: ∃ η ∈ D ⟺ x0 is a solution of (P0).
Theorem [Fadili et al. 2013]: if ∃ η ∈ D̄ and ker(Φ) ∩ T_{x0} = {0}, then for λ ∼ ‖w‖ one has ‖x⋆ − x0‖ = O(‖w‖).
→ The constants depend on N...
Related results: [Grasmair, Haltmeier, Scherzer 2010]: J = ‖·‖₁; [Grasmair 2012]: J(x⋆ − x0) = O(‖w‖).
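The proposition can be probed numerically for J = ‖·‖₁: solve (P0), i.e. basis pursuit, as a linear program and check that a sparse x0 is recovered exactly. A sketch using SciPy; the dimensions and support are our illustrative choices.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)
P, N = 30, 60
Phi = rng.standard_normal((P, N))
x0 = np.zeros(N)
x0[[3, 17, 42]] = [1.0, -2.0, 0.5]     # sparse vector to recover

# Split x = xp - xn with xp, xn >= 0, so that ||x||_1 = sum(xp) + sum(xn)
# and the constraint Phi x = Phi x0 stays linear.
res = linprog(c=np.ones(2 * N),
              A_eq=np.hstack([Phi, -Phi]),
              b_eq=Phi @ x0,
              bounds=(0, None))
x_star = res.x[:N] - res.x[N:]
```

With 30 Gaussian measurements of a 3-sparse vector, exact recovery is expected; this is the regime where a certificate η ∈ D exists.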
Minimal-norm Certificate
Notation: T = T_{x0}, e = e_{x0}. We assume ker(Φ) ∩ T = {0} and J piecewise regular.
η ∈ D ⟺ η = Φ* q, η_T = e and J°(η) = 1 (J° the polar gauge).
Minimal-norm pre-certificate: η0 = argmin_{η = Φ* q, η_T = e} ‖q‖.
Proposition: one has η0 = (Φ_T^+ Φ)* e.
Theorem [Vaiter et al. 2013]: if η0 ∈ D̄, ‖w‖ = O(ν_{x0}) and λ ∼ ‖w‖, then the unique solution x⋆ of P_λ(y) for y = Φ x0 + w satisfies T_{x⋆} = T_{x0} and ‖x⋆ − x0‖ = O(‖w‖).
Special cases: [Fuchs 2004]: J = ‖·‖₁; [Bach 2008]: J = ‖·‖_{1,2} and J = ‖·‖_*; [Vaiter et al. 2011]: J = ‖D* ·‖₁.
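For J = ‖·‖₁, T is the set of vectors supported on I = supp(x0) and Φ_T keeps the columns indexed by I, so the proposition gives η0 in closed form. A sketch checking that η0 interpolates sign(x0) on I and stays below 1 off the support; the dimensions and seed are our choices.

```python
import numpy as np

rng = np.random.default_rng(4)
P, N = 100, 200
Phi = rng.standard_normal((P, N)) / np.sqrt(P)
x0 = np.zeros(N)
x0[[10, 80, 150]] = [1.0, -1.0, 2.0]
I = np.flatnonzero(x0)

# eta0 = (Phi_T^+ Phi)^* e, with Phi_T = Phi[:, I] and e = sign(x0) on T.
Phi_T = Phi[:, I]
eta0 = Phi.T @ np.linalg.pinv(Phi_T).T @ np.sign(x0[I])
off_sup = np.max(np.abs(np.delete(eta0, I)))   # ||eta_{0,I^c}||_inf
```

When `off_sup < 1`, the pre-certificate is a tight certificate (η0 ∈ D̄), which is exactly the hypothesis of the theorem.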
Example: 1-D Sparse Deconvolution
J(x) = ‖x‖₁, Φ x = Σ_i x_i φ(· − Δ i), I = {j : x0(j) ≠ 0}.
Increasing Δ: reduces correlation; reduces resolution.
η0 ∈ D̄(x0) ⟺ ‖η0,I^c‖_∞ < 1 ⟺ support recovery.
[Figure: x0 and Φ x0; ‖η0,I^c‖_∞ plotted as a function of Δ.]
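The effect of the spacing can be reproduced numerically: build a Gaussian blur Φ, place two opposite-sign spikes Δ samples apart, and evaluate ‖η0,I^c‖_∞, which drops below 1 once the spikes are far enough apart. A sketch; the kernel width, signal length, and sign pattern are our choices, not the talk's.

```python
import numpy as np

def certificate_sup_norm(n, sigma, i1, delta):
    """||eta_{0,I^c}||_inf for two opposite-sign spikes at i1 and i1 + delta."""
    t = np.arange(n)
    Phi = np.exp(-(t[:, None] - t[None, :]) ** 2 / (2 * sigma ** 2))  # Gaussian blur
    Phi /= np.linalg.norm(Phi, axis=0)              # unit-norm columns
    I = [i1, i1 + delta]
    e = np.array([1.0, -1.0])                       # sign pattern of the spikes
    eta0 = Phi.T @ np.linalg.pinv(Phi[:, I]).T @ e  # eta0 = (Phi_T^+ Phi)^* e
    return np.max(np.abs(np.delete(eta0, I)))

far = certificate_sup_norm(80, 2.0, 25, 30)    # well-separated spikes
close = certificate_sup_norm(80, 2.0, 25, 4)   # nearly touching spikes
```

Here `far < 1 < close`: large spacing gives a valid tight certificate (support recovery), small spacing does not.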
Example: 1-D TV Denoising
J(x) = ‖∇x‖₁ with (∇x)_i = x_i − x_{i−1}, Φ = Id, I = {i : (∇x0)_i ≠ 0}.
η0 = div(α0) where ∀ j ∉ I, (Δ α0)_j = 0.
‖α0,I^c‖_∞ < 1 ⟹ support stability; ‖α0,I^c‖_∞ = 1 ⟹ ℓ² stability only.
[Figure: two signals x0, with α0 equal to ±1 at the jumps and interpolating in between.]
Conclusion
Gauges: encode linear models as singular points.
Tight dual certificates: enable L2 stability.
Piecewise smooth gauges: enable model recovery T_{x⋆} = T_{x0}.
Open problems:
– Approximate model recovery T_{x⋆} ≈ T_{x0}.
– Infinite dimensional problems (measures, TV, etc.).