Compressive Sensing
Gabriel Peyré, www.numerical-tours.com

Signal Processing Course : Compressed Sensing


Slides for a course on signal and image processing.


Page 2: Signal Processing Course : Compressed Sensing

Overview

• Shannon’s World

• Compressive Sensing Acquisition

• Compressive Sensing Recovery

• Theoretical Guarantees

• Fourier Domain Measurements

Page 3: Signal Processing Course : Compressed Sensing

Discretization

Sampling: f̃ ∈ L²([0,1]ᵈ) → f ∈ ℝᴺ

Idealization: acquisition device, f[n] ≈ f̃(n/N)

Page 4: Signal Processing Course : Compressed Sensing

Pointwise Sampling and Smoothness

Data acquisition: sensors

f̃ ∈ L² → f ∈ ℝᴺ,  f[i] = f̃(i/N)

Page 5: Signal Processing Course : Compressed Sensing

Pointwise Sampling and Smoothness

Data acquisition: sensors

f̃ ∈ L² → f ∈ ℝᴺ,  f[i] = f̃(i/N)

Shannon interpolation: if the spectrum of f̃ is supported in [−Nπ, Nπ], then
f̃(t) = Σᵢ f[i] h(Nt − i)  with  h(t) = sin(πt)/(πt)

Page 6: Signal Processing Course : Compressed Sensing

→ Natural images are not smooth.

Page 7: Signal Processing Course : Compressed Sensing

→ Natural images are not smooth.
→ But can be compressed efficiently (JPEG-2000 bitstream: 0,1,0,…).
→ Sample and compress simultaneously?
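As a quick numerical check of the interpolation formula, here is a minimal Python sketch (an illustration only: it assumes an exactly band-limited signal and truncates the sinc sum to the N available samples; np.sinc(t) = sin(πt)/(πt) is the kernel h):

```python
import numpy as np

N = 64
i = np.arange(N)
# A band-limited test signal, sampled as f[i] = f~(i/N).
f = np.cos(2 * np.pi * 3 * i / N) + 0.5 * np.sin(2 * np.pi * 5 * i / N)

# Shannon interpolation f~(t) = sum_i f[i] h(N t - i), truncated to N terms.
t = np.linspace(0, 1, 1024, endpoint=False)
f_interp = np.array([np.sum(f * np.sinc(N * tt - i)) for tt in t])
```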

Page 8: Signal Processing Course : Compressed Sensing

Sampling and Periodization

[Figure: panels (a)-(d).]

Page 9: Signal Processing Course : Compressed Sensing

Sampling and Periodization: Aliasing

[Figure: panels (a)-(d).]

Page 10: Signal Processing Course : Compressed Sensing

Overview

• Shannon’s World

• Compressive Sensing Acquisition

• Compressive Sensing Recovery

• Theoretical Guarantees

• Fourier Domain Measurements

Page 11: Signal Processing Course : Compressed Sensing

Single Pixel Camera (Rice)

Page 12: Signal Processing Course : Compressed Sensing

Single Pixel Camera (Rice)

P measures ≪ N micro-mirrors:  y[i] = ⟨f, φᵢ⟩

Page 13: Signal Processing Course : Compressed Sensing

Single Pixel Camera (Rice)

P measures ≪ N micro-mirrors:  y[i] = ⟨f, φᵢ⟩
[Reconstructions at P/N = 1, P/N = 0.16, P/N = 0.02.]
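The measurement model y[i] = ⟨f, φᵢ⟩ is easy to simulate. A toy sketch, assuming random ±1 mirror patterns as the φᵢ (the actual DMD patterns of the Rice camera may differ):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 32                                # image side, N = n*n "mirrors"
N, P = n * n, 164                     # P measures << N micro-mirrors
f = rng.random((n, n)).ravel()        # stand-in for the scene

# Each row of Phi is one mirror pattern phi_i, normalized by sqrt(P).
Phi = rng.choice([-1.0, 1.0], size=(P, N)) / np.sqrt(P)
y = Phi @ f                           # y[i] = <f, phi_i>
```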

Page 14: Signal Processing Course : Compressed Sensing

CS Hardware Model

CS is about designing hardware: input signals f̃ ∈ L²(ℝ²).
Physical hardware resolution limit: target resolution f ∈ ℝᴺ.

f̃ ∈ L²  →  (micro-mirrors array resolution)  →  f ∈ ℝᴺ  →  (CS hardware K)  →  y ∈ ℝᴾ


Page 16: Signal Processing Course : Compressed Sensing

CS Hardware Model

The whole acquisition pipeline is summarized by an operator K:
f ∈ ℝᴺ  →  (Operator K)  →  y = K f ∈ ℝᴾ

Page 17: Signal Processing Course : Compressed Sensing

Overview

• Shannon’s World

• Compressive Sensing Acquisition

• Compressive Sensing Recovery

• Theoretical Guarantees

• Fourier Domain Measurements

Page 18: Signal Processing Course : Compressed Sensing

Inversion and Sparsity

Need to solve y = K f.
→ More unknowns than equations: dim(ker K) = N − P is huge.

Page 19: Signal Processing Course : Compressed Sensing

Prior information: f is sparse in a basis {ψₘ}ₘ:
J_τ(f) = Card{ m : |⟨f, ψₘ⟩| > τ }  is small.

Page 20: Signal Processing Course : Compressed Sensing

Convex Relaxation: ℓ¹ Prior

Image with 2 pixels:
J₀(f) = #{ m : ⟨f, ψₘ⟩ ≠ 0 }
J₀(f) = 0 → null image;  J₀(f) = 1 → sparse image;  J₀(f) = 2 → non-sparse image.

Page 21: Signal Processing Course : Compressed Sensing

ℓq priors (convex for q ≥ 1):
J_q(f) = Σₘ |⟨f, ψₘ⟩|^q
[Unit balls shown for q = 0, 1/2, 1, 3/2, 2.]

Page 22: Signal Processing Course : Compressed Sensing

Sparse ℓ¹ prior:
J₁(f) = Σₘ |⟨f, ψₘ⟩|

Page 23: Signal Processing Course : Compressed Sensing

Sparse CS Recovery

f₀ ∈ ℝᴺ sparse in an ortho-basis Ψ:  x₀ ∈ ℝᴺ  →(Ψ)→  f₀ ∈ ℝᴺ

Page 24: Signal Processing Course : Compressed Sensing

(Discretized) sampling acquisition:
y = K f₀ + w = K Ψ (x₀) + w = Φ x₀ + w

Page 25: Signal Processing Course : Compressed Sensing

K drawn from the Gaussian matrix ensemble, K_{i,j} ~ N(0, P^{−1/2}) i.i.d.
⟹ Φ = K Ψ is also drawn from the Gaussian matrix ensemble.

Page 26: Signal Processing Course : Compressed Sensing

Sparse recovery:
min_{‖Φx − y‖ ≤ ε} ‖x‖₁  with ε = ‖w‖,   or   min_x ½ ‖Φx − y‖² + λ ‖x‖₁  with λ ∼ ‖w‖.
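The penalized form lends itself to simple first-order solvers. A minimal ISTA sketch (one standard algorithm for this objective, not necessarily the solver used in the course; step size 1/L with L = ‖Φ‖², soft-thresholding as the proximal step):

```python
import numpy as np

def ista(Phi, y, lam, n_iter=500):
    """Minimize 1/2 ||Phi x - y||^2 + lam ||x||_1 by proximal gradient."""
    L = np.linalg.norm(Phi, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        g = x - Phi.T @ (Phi @ x - y) / L    # gradient step
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0)  # soft threshold
    return x

# Gaussian ensemble Phi_{i,j} ~ N(0, P^{-1/2}) and a k-sparse x0.
rng = np.random.default_rng(0)
N, P, k = 400, 100, 10
Phi = rng.normal(0, 1 / np.sqrt(P), size=(P, N))
x0 = np.zeros(N)
x0[rng.choice(N, k, replace=False)] = rng.normal(size=k)
y = Phi @ x0 + 0.01 * rng.normal(size=P)
x_rec = ista(Phi, y, lam=0.02)
```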

Page 27: Signal Processing Course : Compressed Sensing

CS Simulation Example

Ψ = translation-invariant wavelet frame.
[Images: original f₀ and its CS reconstruction.]

Page 28: Signal Processing Course : Compressed Sensing

Overview

• Shannon’s World

• Compressive Sensing Acquisition

• Compressive Sensing Recovery

• Theoretical Guarantees

• Fourier Domain Measurements

Page 29: Signal Processing Course : Compressed Sensing

CS with RIP

Restricted Isometry Constants:
∀ x with ‖x‖₀ ≤ k:  (1 − δₖ) ‖x‖² ≤ ‖Φx‖² ≤ (1 + δₖ) ‖x‖²

ℓ¹ recovery:
x* ∈ argmin_{‖Φx − y‖ ≤ ε} ‖x‖₁  where  y = Φ x₀ + w,  ‖w‖ ≤ ε

Page 30: Signal Processing Course : Compressed Sensing

Theorem [Candès 2009]: If δ₂ₖ ≤ √2 − 1, then
‖x₀ − x*‖ ≤ (C₀ / √k) ‖x₀ − xₖ‖₁ + C₁ ε,
where xₖ is the best k-term approximation of x₀.

Page 31: Signal Processing Course : Compressed Sensing

Singular Values Distributions

Eigenvalues of Φ_I* Φ_I with |I| = k are essentially in [a, b],
a = (1 − √β)²  and  b = (1 + √β)²  where  β = k/P.

When k = βP → +∞, the eigenvalue distribution tends to
f_β(λ) = (1 / (2πβλ)) √((λ − a)₊ (b − λ)₊)    [Marcenko-Pastur]

Large deviation inequality [Ledoux].

[Figure: empirical eigenvalue histograms vs. the density f_β for P = 200 and k = 10, 30, 50.]
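The concentration of the spectrum of Φ_I*Φ_I is easy to reproduce numerically. A sketch with assumed parameters, comparing the extreme empirical eigenvalues over random supports with the Marcenko-Pastur edges:

```python
import numpy as np

rng = np.random.default_rng(0)
N, P, k = 4000, 200, 30
Phi = rng.normal(0, 1 / np.sqrt(P), size=(P, N))

eigs = []
for _ in range(200):                          # Monte-Carlo over supports |I| = k
    I = rng.choice(N, k, replace=False)
    eigs.extend(np.linalg.eigvalsh(Phi[:, I].T @ Phi[:, I]))

beta = k / P
print(min(eigs), max(eigs))                   # empirical spectrum edges
print((1 - np.sqrt(beta))**2, (1 + np.sqrt(beta))**2)  # MP edges a, b
```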

Page 32: Signal Processing Course : Compressed Sensing

RIP for Gaussian Matrices

Link with coherence: µ(Φ) = max_{i ≠ j} |⟨φᵢ, φⱼ⟩|,
δ₂ = µ(Φ),   δₖ ≤ (k − 1) µ(Φ).

Page 33: Signal Processing Course : Compressed Sensing

For Gaussian matrices:  µ(Φ) ∼ √(log(PN)/P).
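Coherence itself is a one-liner to estimate. A sketch for a column-normalized Gaussian matrix, compared against the √(log(PN)/P) scaling (the sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
N, P = 1000, 250
Phi = rng.normal(size=(P, N))
Phi /= np.linalg.norm(Phi, axis=0)     # unit-norm columns phi_i

G = np.abs(Phi.T @ Phi)                # |<phi_i, phi_j>| for all pairs
np.fill_diagonal(G, 0)                 # ignore i = j
print(G.max(), np.sqrt(np.log(P * N) / P))
```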

Page 34: Signal Processing Course : Compressed Sensing

Stronger result:
Theorem: If k ≤ C P / log(N/P), then δ₂ₖ ≤ √2 − 1 with high probability.

Page 35: Signal Processing Course : Compressed Sensing

Numerics with RIP

Stability constants of A:  (1 − δ₁(A)) ‖x‖² ≤ ‖Ax‖² ≤ (1 + δ₂(A)) ‖x‖²,
given by the smallest / largest eigenvalues of A*A.

Page 36: Signal Processing Course : Compressed Sensing

Upper/lower RIC:
δₖⁱ = max_{|I| = k} δᵢ(Φ_I),   δₖ = min(δₖ¹, δₖ²).

Monte-Carlo estimation: δ̂ₖ ≤ δₖ.
[Plot: δ̂ₖ¹ and δ̂ₖ² as functions of k, for N = 4000, P = 1000; the level √2 − 1 is marked.]
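A minimal version of this Monte-Carlo estimator (a sketch: sampling supports only explores part of the maximum over |I| = k, which is why it yields the lower bound δ̂ₖ ≤ δₖ mentioned above):

```python
import numpy as np

def ric_lower_bound(Phi, k, n_trials=500, rng=None):
    """Monte-Carlo lower bounds on the upper/lower RICs of Phi."""
    if rng is None:
        rng = np.random.default_rng(0)
    N = Phi.shape[1]
    d1 = d2 = 0.0
    for _ in range(n_trials):
        I = rng.choice(N, k, replace=False)
        s = np.linalg.svd(Phi[:, I], compute_uv=False) ** 2
        d1 = max(d1, 1 - s.min())    # (1 - d1) ||x||^2 <= ||Phi_I x||^2
        d2 = max(d2, s.max() - 1)    # ||Phi_I x||^2 <= (1 + d2) ||x||^2
    return d1, d2
```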

Page 37: Signal Processing Course : Compressed Sensing

Polytopes-based Guarantees

Noiseless recovery:  x* ∈ argmin_{Φx = y} ‖x‖₁    (P₀(y))

B_α = {x : ‖x‖₁ ≤ α} with α = ‖x₀‖₁; Φ maps it to the projected polytope Φ(B_α).
[Figure: Φ = (φᵢ)ᵢ ∈ ℝ^{2×3}; the vertices of B_α map to ±αφᵢ; y ↦ x*.]
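The program P₀(y) is a linear program in disguise. A minimal sketch using scipy, splitting x = u − v with u, v ≥ 0 so that ‖x‖₁ = Σ(u + v) (a generic LP formulation, not the course's solver):

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(Phi, y):
    """Solve min ||x||_1 subject to Phi x = y as a linear program."""
    P, N = Phi.shape
    c = np.ones(2 * N)                 # objective: sum(u) + sum(v) = ||x||_1
    A_eq = np.hstack([Phi, -Phi])      # constraint: Phi (u - v) = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None))
    u, v = res.x[:N], res.x[N:]
    return u - v
```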

Page 38: Signal Processing Course : Compressed Sensing

x₀ is a solution of P₀(Φx₀)  ⟺  Φx₀ ∈ ∂Φ(B_α),
i.e. the measurement Φx₀ lies on the boundary of the projected polytope.

Page 39: Signal Processing Course : Compressed Sensing

ℓ¹ Recovery in 2-D

Sign cones: Kₛ = { (λᵢ sᵢ)ᵢ ∈ ℝ³ : λᵢ ≥ 0 }  (2-D quadrants),  Cₛ = Φ Kₛ  (2-D cones).
[Figure: the quadrant K₍₀,₁,₁₎ and its image C₍₀,₁,₁₎; Φ = (φᵢ)ᵢ ∈ ℝ^{2×3}; y ↦ x*.]

Page 40: Signal Processing Course : Compressed Sensing

Polytope Noiseless Recovery

Counting faces of random polytopes [Donoho]:
All x₀ such that ‖x₀‖₀ ≤ C_all(P/N)·P are identifiable;  C_all(1/4) ≈ 0.065.
Most x₀ such that ‖x₀‖₀ ≤ C_most(P/N)·P are identifiable;  C_most(1/4) ≈ 0.25.

→ Sharp constants.
→ No noise robustness.

[Plot: phase-transition curves RIP / All / Most.]

Page 41: Signal Processing Course : Compressed Sensing

→ Computation of “pathological” signals [Dossal, Peyré, Fadili 2010].

Page 42: Signal Processing Course : Compressed Sensing

Overview

• Shannon’s World

• Compressive Sensing Acquisition

• Compressive Sensing Recovery

• Theoretical Guarantees

• Fourier Domain Measurements

Page 43: Signal Processing Course : Compressed Sensing

Tomography and Fourier Measures

Page 44: Signal Processing Course : Compressed Sensing

Tomography and Fourier Measures

Partial Fourier measurements:  K f = ( f̂[ω] )_{ω ∈ Ω},  with  f̂ = FFT2(f).

Fourier slice theorem:  p̂_θ(ρ) = f̂(ρ cos θ, ρ sin θ)
(the 1-D Fourier transform of a projection gives a 2-D Fourier slice)

Equivalent to acquiring K projections:  { p_{θₖ}(t) }_{t ∈ ℝ, 0 ≤ k < K}.
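In code, partial Fourier measurement is just a masked FFT2, with zero-filling as the adjoint. A sketch (the uniform random mask Ω is an assumption; tomography would sample along radial lines instead):

```python
import numpy as np

n = 128
rng = np.random.default_rng(0)
mask = rng.random((n, n)) < 0.2        # Omega: keep ~20% of frequencies

def K(f):
    """f -> measured Fourier coefficients (f_hat[w])_{w in Omega}."""
    return np.fft.fft2(f, norm="ortho")[mask]

def K_adj(y):
    """Adjoint: place the coefficients back and inverse transform."""
    F = np.zeros((n, n), dtype=complex)
    F[mask] = y
    return np.fft.ifft2(F, norm="ortho")
```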

Page 45: Signal Processing Course : Compressed Sensing

Regularized Inversion

Noisy measurements:  ∀ ω ∈ Ω,  y[ω] = f̂₀[ω] + w[ω].
Noise: w[ω] ~ N(0, σ²), white noise.

ℓ¹ regularization:
f* = argmin_f  ½ Σ_{ω ∈ Ω} |y[ω] − f̂[ω]|² + λ Σₘ |⟨f, ψₘ⟩|.

Page 46: Signal Processing Course : Compressed Sensing

MRI Imaging. From [Lustig et al.]

Page 47: Signal Processing Course : Compressed Sensing

MRI Reconstruction

Fourier sub-sampling pattern: randomization.
[Images: high resolution / low resolution / linear reconstruction / sparsity-based reconstruction. From [Lustig et al.]]

Page 48: Signal Processing Course : Compressed Sensing

Radar Interferometry: CARMA (USA)

Fourier sampling (Earth’s rotation) → linear reconstruction.

Page 49: Signal Processing Course : Compressed Sensing

Structured Measurements

Gaussian matrices: intractable for large N.
Random partial orthogonal matrix: {φ_ω}_ω an orthogonal basis,
K f = ( ⟨φ_ω, f⟩ )_{ω ∈ Ω}  where |Ω| = P is drawn uniformly at random.
→ Fast measurements (e.g. Fourier basis).

Page 50: Signal Processing Course : Compressed Sensing

Mutual incoherence:  µ = √N · max_{ω, m} |⟨φ_ω, ψₘ⟩| ∈ [1, √N].
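The Fourier/Dirac pair saturates the lower end of this range. A quick check that every entry of the orthonormal DFT matrix has modulus 1/√N, so µ = 1, the maximally incoherent case:

```python
import numpy as np

N = 64
F = np.fft.fft(np.eye(N), norm="ortho")   # orthonormal Fourier basis
# Inner products <phi_w, psi_m> with the Dirac basis are just the entries.
mu = np.sqrt(N) * np.abs(F).max()
print(mu)                                  # -> 1.0
```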

Page 51: Signal Processing Course : Compressed Sensing

Theorem [Rudelson, Vershynin 2006]: with high probability on Ω,
if  M ≤ C P / (µ² log⁴ N),  then  δ₂ₘ ≤ √2 − 1  for  Φ = K Ψ.

→ Not universal: requires incoherence.

Page 52: Signal Processing Course : Compressed Sensing

Conclusion

Sparsity: approximate signals with few atoms in a dictionary.

Page 53: Signal Processing Course : Compressed Sensing

Compressed sensing ideas:
→ Randomized sensors + sparse recovery.
→ Number of measurements ∼ signal complexity.
→ CS is about designing new hardware.

Page 54: Signal Processing Course : Compressed Sensing

The devil is in the constants:
→ Worst-case analysis is problematic.
→ Designing good signal models.

Page 55: Signal Processing Course : Compressed Sensing

Some Hot Topics

Dictionary learning. [Diagram: exemplar patches → learning → dictionary.]

[Embedded excerpt: Mairal et al., “Sparse Representation for Color Image Restoration” (learned color dictionaries, color denoising and demosaicing results); the page scans are not legible in this extraction.]

Page 56: Signal Processing Course : Compressed Sensing

Some Hot Topics

Dictionary learning. [Diagram: image f = Ψx ↔ coefficients x.]

Analysis vs. synthesis:
Jₛ(f) = min_{f = Ψx} ‖x‖₁    (synthesis prior)

[Embedded excerpt: Mairal et al., as above.]

Page 57: Signal Processing Course : Compressed Sensing

Some Hot Topics

Dictionary learning. [Diagram: image f = Ψx, coefficients x, correlations c = D*f; Ψ vs. D*.]

Analysis vs. synthesis:
Jₛ(f) = min_{f = Ψx} ‖x‖₁    (synthesis prior)
Jₐ(f) = ‖D* f‖₁    (analysis prior)

[Embedded excerpt: Mairal et al., as above.]
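For an orthogonal Ψ (taking D = Ψ), the two priors coincide, since the synthesis coefficients x = Ψ*f are unique. A toy check with a random orthogonal dictionary (an illustration; for a redundant D the two priors genuinely differ):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 32
Psi, _ = np.linalg.qr(rng.normal(size=(N, N)))  # random orthogonal basis
f = rng.normal(size=N)

J_a = np.abs(Psi.T @ f).sum()          # analysis prior ||D* f||_1, D = Psi
x = Psi.T @ f                          # unique x with f = Psi x
J_s = np.abs(x).sum()                  # synthesis prior
print(np.isclose(J_a, J_s))            # True: the priors agree here
```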

Page 58: Signal Processing Course : Compressed Sensing

Some Hot Topics

Dictionary learning; analysis vs. synthesis priors; atomic norms.

[Embedded excerpt: Chandrasekaran et al., “The Convex Geometry of Linear Inverse Problems”, Figure 1: unit balls of some atomic norms. (a) Atoms = unit-Euclidean-norm one-sparse vectors, atomic norm = ℓ¹ norm. (b) Atoms = 2×2 symmetric unit-Euclidean-norm rank-one matrices, atomic norm = nuclear norm. (c) Atoms = the vectors {−1,+1}², atomic norm = ℓ∞ norm. Under suitable conditions the convex hull conv(A) of an atomic set A is the unit ball of the atomic norm, which can be minimized under measurement constraints to recover simple models.]

[Embedded excerpt: Mairal et al., as above.]

To conclude, in order to address the demosaicing problem, weuse the modified K-SVD algorithm that deals with nonuniformnoise, as described in previous section, and add to it an adaptivedictionary that has been learned with low patch sparsity in orderto avoid over-fitting the mosaic pattern. The same technique canbe applied to generic color inpainting as demonstrated in thenext section.

V. EXPERIMENTAL RESULTS

We are now ready to present the color image denoising, in-painting, and demosaicing results that are obtained with the pro-posed framework.

A. Denoising Color Images

The state-of-the-art performance of the algorithm ongrayscale images has already been studied in [2]. We nowevaluate our extension for color images. We trained somedictionaries with different sizes of atoms 5 5 3, 6 6 3,7 7 3 and 8 8 3, on 200 000 patches taken from adatabase of 15 000 images with the patch-sparsity parameter

(six atoms in the representations). We used the databaseLabelMe [55] to build our image database. Then we trainedeach dictionary with 600 iterations. This provided us a set ofgeneric dictionaries that we used as initial dictionaries in ourdenoising algorithm. Comparing the results obtained with theglobal approach and the adaptive one permits us to see theimprovements in the learning process. We chose to evaluate
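The alternation behind such dictionary learning is simple to sketch: fix the dictionary and sparse-code the signals, then fix the codes and update the dictionary. Below is a minimal NumPy sketch on synthetic data, assuming a MOD-style least-squares dictionary update rather than the paper's K-SVD atom-by-atom update; all sizes, the sparsity k, and the data model are illustrative.

```python
# Toy dictionary learning: alternate OMP sparse coding with a MOD-style
# least-squares dictionary update (K-SVD would instead update atoms one by one).
import numpy as np

rng = np.random.default_rng(0)

def omp(D, y, k):
    """Greedy sparse coding: select k atoms by orthogonal matching pursuit."""
    residual, support = y.copy(), []
    for _ in range(k):
        corr = np.abs(D.T @ residual)
        corr[support] = 0.0                      # never reselect an atom
        support.append(int(np.argmax(corr)))
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x

n, m, k, n_sig = 20, 50, 3, 400                  # signal dim, atoms, sparsity, samples
D_true = rng.standard_normal((n, m))
D_true /= np.linalg.norm(D_true, axis=0)
X = np.zeros((m, n_sig))
for j in range(n_sig):                           # random k-sparse codes
    X[rng.choice(m, k, replace=False), j] = rng.standard_normal(k)
Y = D_true @ X + 0.01 * rng.standard_normal((n, n_sig))

D = rng.standard_normal((n, m))
D /= np.linalg.norm(D, axis=0)
for _ in range(20):                              # alternating minimization
    X_hat = np.stack([omp(D, Y[:, j], k) for j in range(n_sig)], axis=1)
    D = Y @ np.linalg.pinv(X_hat)                # least-squares dictionary update
    D /= np.maximum(np.linalg.norm(D, axis=0), 1e-12)

print("relative fit:", np.linalg.norm(Y - D @ X_hat) / np.linalg.norm(Y))
```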

Other sparse priors:

Image f = Ψx (synthesis: coefficients x in a dictionary Ψ); correlations c = D*f (analysis operator D*).

|x1| + |x2| (the ℓ¹ norm)    vs.    max(|x1|, |x2|) (the ℓ∞ norm)
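Each of these priors is a convex functional of the coefficients, and what recovery algorithms actually need is its proximal map. A minimal sketch evaluating the priors and two standard proximal maps; the grouping {x1}, {x2, x3} and all numerical values are illustrative.

```python
# Sparse priors on a coefficient vector, and their proximal maps.
import numpy as np

x = np.array([0.8, -0.1, 0.3])

l1    = np.sum(np.abs(x))                     # |x1| + |x2| + |x3|
linf  = np.max(np.abs(x))                     # max_i |x_i|
block = np.abs(x[0]) + np.hypot(x[1], x[2])   # mixed l1-l2 with groups {0}, {1,2}

def prox_l1(x, t):
    """Soft-thresholding: prox of t*||.||_1, sets small entries to zero."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def prox_block(x, t, groups):
    """Block soft-thresholding: prox of t*sum_g ||x_g||_2, zeroes whole groups."""
    out = x.copy()
    for g in groups:
        nrm = np.linalg.norm(x[g])
        out[g] = 0.0 if nrm <= t else (1.0 - t / nrm) * x[g]
    return out

print(l1, linf, block)
print(prox_l1(x, 0.2))                    # entrywise shrinkage
print(prox_block(x, 0.2, [[0], [1, 2]]))  # the group {1,2} shrinks jointly
```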

Page 59: Signal Processing Course : Compressed Sensing

(a) (b) (c)

Figure 1 (from Chandrasekaran, Recht, Parrilo, and Willsky, "The Convex Geometry of Linear Inverse Problems"): unit balls of some atomic norms. In each figure, the set of atoms is graphed in red and the unit ball of the associated atomic norm in blue. In (a), the atoms are the unit-Euclidean-norm one-sparse vectors, and the atomic norm is the ℓ¹ norm. In (b), the atoms are the 2×2 symmetric unit-Euclidean-norm rank-one matrices, and the atomic norm is the nuclear norm. In (c), the atoms are the vectors {−1,+1}², and the atomic norm is the ℓ∞ norm.

Under suitable conditions, the convex hull conv(A) of an atomic set A is the unit ball of a norm, the atomic norm induced by A; minimizing this norm subject to measurement constraints is a general convex heuristic for recovering simple models from linear measurements (for instance, the Birkhoff polytope of doubly stochastic matrices for sums of permutation matrices, or the spectral-norm ball for orthogonal matrices). Bounds on the number of generic measurements needed for exact recovery follow from Gaussian widths of tangent cones of the atomic-norm ball.
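For a finite atomic set, the atomic norm is computable by a small linear program, ||x||_A = min { Σ c_a : x = Σ c_a a, c ≥ 0 }, and for panel (b) the SVD already gives it. A sketch assuming SciPy's linprog, with illustrative test points; with A = {−1,+1}² the LP should reproduce the ℓ∞ norm of panel (c).

```python
# Atomic norm w.r.t. a finite symmetric atomic set, via linear programming.
import numpy as np
from scipy.optimize import linprog

atoms = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]], float).T   # A = {-1,+1}^2

def atomic_norm(x, atoms):
    """min sum(c) s.t. atoms @ c = x, c >= 0: the gauge of conv(A), A symmetric."""
    k = atoms.shape[1]
    res = linprog(c=np.ones(k), A_eq=atoms, b_eq=x, bounds=[(0, None)] * k)
    return res.fun

x = np.array([0.3, -0.9])
print(atomic_norm(x, atoms), np.max(np.abs(x)))   # both equal 0.9

M = np.array([[1.0, 0.4], [0.4, -0.2]])           # panel (b): rank-one atoms
print(np.linalg.svd(M, compute_uv=False).sum())   # nuclear norm = sum of sing. values
```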

Dictionary learning: the dictionary itself is learned from exemplar signals.

Analysis vs. synthesis:

Ja(f) = ||D* f||_1 (analysis prior: correlations of f with the dictionary D)

Js(f) = min_{f = Ψx} ||x||_1 (synthesis prior: coefficients of f in the dictionary Ψ)

|x1| + (x2^2 + x3^2)^{1/2} (mixed ℓ¹-ℓ² block norm, grouping x2 and x3)

Some Hot Topics
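The synthesis prior is the one most often minimized in practice. Here is a minimal ISTA sketch for min_x (1/2)||y − ΦΨx||² + λ||x||_1, with all operators, sizes, and λ illustrative; the analysis prior would instead penalize ||D*f||_1 through a different splitting.

```python
# ISTA for synthesis-l1 recovery: gradient step on the data fit,
# then soft-thresholding, i.e. the prox of the l1 penalty.
import numpy as np

rng = np.random.default_rng(1)
P, N, m = 30, 60, 80                      # measurements, signal dim, atoms
Phi = rng.standard_normal((P, N)) / np.sqrt(P)
Psi = rng.standard_normal((N, m))
Psi /= np.linalg.norm(Psi, axis=0)

x0 = np.zeros(m)
x0[rng.choice(m, 5, replace=False)] = rng.standard_normal(5)
y = Phi @ (Psi @ x0)                      # noiseless measurements of f = Psi x0

A = Phi @ Psi
lam = 0.05
step = 1.0 / np.linalg.norm(A, 2) ** 2    # 1/L, L = Lipschitz const. of the gradient
x = np.zeros(m)
for _ in range(500):
    x = x - step * A.T @ (A @ x - y)                         # gradient step
    x = np.sign(x) * np.maximum(np.abs(x) - step * lam, 0)   # soft-threshold

f_rec = Psi @ x                           # recovered signal f = Psi x
print(np.linalg.norm(f_rec - Psi @ x0) / np.linalg.norm(Psi @ x0))
```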


Page 60: Signal Processing Course : Compressed Sensing


Nuclear norm