

Random matrix theory in sparse recovery

Maryia Kabanava

RWTH Aachen University

CoSIP Winter Retreat 2016

Maryia Kabanava (RWTH Aachen) Random matrix theory in sparse recovery CoSIP 2016

Page 2: Random matrix theory in sparse recovery - TU Berlin

Compressed sensing

Goal: reconstruction of (high-dimensional) signals from a minimal amount of measured data

Key ingredients:

Exploit low complexity of signals (e.g. sparsity/compressibility)

Efficient algorithms (e.g. convex optimization)

Randomness (random matrices)


Signal recovery problem

Signal x ∈ R^d is unknown.

Given:

Linear measurement map M : R^d → R^m, m ≪ d.

Measurement vector y = Mx + w ∈ R^m, with noise ‖w‖_2 ≤ η.

Goal: recover x from y.

Idea: recovery is possible if x belongs to a set of low complexity.

Standard compressed sensing: sparsity (small number of nonzero coefficients)

Cosparsity: sparsity after transformation

Structured sparsity: e.g. block sparsity

Low rank matrix recovery

Low rank tensor recovery


Noiseless model

y = Mx ∈ R^m, M ∈ R^{m×d}: an under-determined linear system (m ≪ d), with supp x = S ⊂ {1, 2, . . . , d} and S^c its complement.

ℓ0-minimization (NP-hard):

min_{z ∈ R^d} ‖z‖_0 s.t. Mz = y

ℓ1-minimization (efficient minimization methods):

min_{z ∈ R^d} ‖z‖_1 s.t. Mz = y
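The ℓ1-problem above is a linear program. As an illustrative sketch (not part of the slides), it can be solved with SciPy's `linprog` by splitting z into positive and negative parts; `basis_pursuit` is a hypothetical helper name.

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(M, y):
    """Solve min ||z||_1 subject to Mz = y as an LP via the split z = z+ - z-."""
    m, d = M.shape
    c = np.ones(2 * d)              # objective: sum(z+) + sum(z-) = ||z||_1
    A_eq = np.hstack([M, -M])       # M (z+ - z-) = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=[(0, None)] * (2 * d))
    return res.x[:d] - res.x[d:]

# Small demo: an s-sparse vector recovered from m < d Gaussian measurements.
rng = np.random.default_rng(0)
d, m, s = 80, 40, 5
x = np.zeros(d)
x[rng.choice(d, size=s, replace=False)] = rng.standard_normal(s)
M = rng.standard_normal((m, d)) / np.sqrt(m)
x_hat = basis_pursuit(M, M @ x)
print(np.linalg.norm(x - x_hat))    # small: exact recovery up to solver tolerance
```

With these dimensions recovery typically succeeds; for s too large relative to m, the LP returns a different minimizer and the error is no longer small.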


Nonuniform vs. uniform recovery

Nonuniform recovery: a fixed sparse (compressible) vector is recovered with high probability using M. Sufficient conditions on M:

The descent cone of the ℓ1-norm at x intersects ker M trivially.

Construct an (approximate) dual certificate.

Uniform recovery: with high probability on M, every sparse (compressible) vector is recovered. Sufficient conditions on M:

Null space property.

Restricted isometry property.


Nonuniform recovery: descent cone

For fixed x ∈ R^d, we define the convex cone

T(x) = cone{z − x : z ∈ R^d, ‖z‖_1 ≤ ‖x‖_1}.

Theorem

Let M ∈ R^{m×d}. A vector x ∈ R^d is the unique minimizer of ‖z‖_1 subject to Mz = Mx if and only if ker M ∩ T(x) = {0}.

(Figure: the affine space x + ker M meets the descent cone x + T(x) only at x.)

Let S^{d−1} = {x ∈ R^d : ‖x‖_2 = 1} and set T := T(x) ∩ S^{d−1}. If

inf_{x ∈ T} ‖Mx‖_2 > 0, (1)

then ker M ∩ T = ∅ and ker M ∩ T(x) = {0}.


Uniform recovery: null space property (NSP)

M ∈ R^{m×d} is said to satisfy the stable NSP of order s with constant 0 < ρ < 1 if, for any S ⊂ [d] with |S| ≤ s, it holds that

‖v_S‖_1 < ρ‖v_{S^c}‖_1 for all v ∈ ker M \ {0}. (2)

Theorem

Let M ∈ R^{m×d} satisfy (2). Then, for any x ∈ R^d, the solution x̂ of

min_{z ∈ R^d} ‖z‖_1 subject to Mz = y,

with y = Mx, approximates x with ℓ1-error

‖x − x̂‖_1 ≤ (2(1 + ρ)/(1 − ρ)) σ_s(x)_1, (3)

where σ_s(x)_1 := inf{‖x − z‖_1 : z is s-sparse}.
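The best s-term approximation error σ_s(x)_1 in the bound above is easy to compute: the infimum is attained by keeping the s largest-magnitude entries of x. A minimal sketch (`sigma_s1` is a hypothetical helper name):

```python
import numpy as np

def sigma_s1(x, s):
    """Best s-term approximation error in l1: sum of all but the s largest |x_i|."""
    mags = np.sort(np.abs(x))            # ascending order of magnitudes
    return mags[:-s].sum() if s > 0 else mags.sum()

x = np.array([5.0, -3.0, 0.5, 0.2, -0.1])
print(sigma_s1(x, 2))                    # 0.5 + 0.2 + 0.1 = 0.8
```

In particular, if x is s-sparse then σ_s(x)_1 = 0, and the theorem guarantees exact recovery.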


Strategy to check NSP

Lemma

Let

T_{ρ,s} := {w ∈ R^d : ‖w_S‖_1 ≥ ρ‖w_{S^c}‖_1 for some S ⊂ [d], |S| ≤ s}

and set T := T_{ρ,s} ∩ S^{d−1}. If

inf_{w ∈ T} ‖Mw‖_2 > 0,

then for any v ∈ ker M \ {0} and any S ⊂ [d] with |S| ≤ s it holds that

‖v_S‖_1 < ρ‖v_{S^c}‖_1.


Uniform recovery: restricted isometry property (RIP)

Definition

The restricted isometry constant δ_s of a matrix M ∈ R^{m×d} is defined as the smallest δ_s such that

(1 − δ_s)‖x‖_2^2 ≤ ‖Mx‖_2^2 ≤ (1 + δ_s)‖x‖_2^2 (4)

for all s-sparse x ∈ R^d.

This requires that all s-column submatrices of M are well-conditioned. Equivalently,

δ_s = max_{|S| ≤ s} ‖M_S^T M_S − Id‖_{2→2}.

The RIP implies the stable NSP.

We say that M satisfies the restricted isometry property if δ_s is small for reasonably large s.


RIP implies recovery by ℓ1-minimization

(1 − δ_s)‖x‖_2^2 ≤ ‖Mx‖_2^2 ≤ (1 + δ_s)‖x‖_2^2 (5)

Theorem

Assume that the restricted isometry constant of M ∈ R^{m×d} satisfies

δ_{2s} < 1/√2 ≈ 0.7071.

Then ℓ1-minimization reconstructs every s-sparse vector x ∈ R^d from y = Mx.


Matrices satisfying recovery conditions

Open problem: give explicit matrices M ∈ R^{m×d} that satisfy the recovery conditions.

Goal: successful recovery with M ∈ R^{m×d} if

m ≥ C s ln^α(d),

for constants C and α.

Deterministic matrices are only known for m ≥ C s^2.

Way out: consider random matrices.


Gaussian random variables

A standard Gaussian random variable X ∼ N(0, 1) has probability density function

ψ(x) = (1/√(2π)) e^{−x^2/2}. (6)

1. The tail of X decays super-exponentially:

P(|X| > t) ≤ e^{−t^2/2}, t > 0. (7)

2. The absolute moments of X can be computed as

(E|X|^p)^{1/p} = √2 (Γ((1 + p)/2)/Γ(1/2))^{1/p} = O(√p), p ≥ 1.

3. The moment generating function of X equals

E exp(tX) = e^{t^2/2}, t ∈ R.


Subgaussian random variables

Lemma

Let X be a random variable with EX = 0. Then the following properties are equivalent.

1. Tails: there exist β, κ > 0 such that

P(|X| > t) ≤ βe^{−κt^2} for all t > 0. (8)

2. Moments: there exists C > 0 such that

(E|X|^p)^{1/p} ≤ C√p for all p ≥ 1. (9)

3. Moment generating function: there exists c > 0 such that

E exp(tX) ≤ e^{ct^2} for all t ∈ R. (10)

A random variable X with EX = 0 that satisfies one of the properties above is called subgaussian.


Subgaussian random variables: examples

1. Gaussian: X ∼ N(0, 1).

2. Bernoulli: P{X = −1} = P{X = 1} = 1/2.

3. Bounded: |X| ≤ M almost surely for some M.


Hoeffding-type inequality

Theorem

Let X1, . . . , XN be a sequence of independent subgaussian random variables with

E exp(tXi) ≤ e^{ct^2} for all t ∈ R and i ∈ {1, . . . , N}. (11)

For a ∈ R^N, the random variable Z := Σ_{i=1}^N a_i X_i is subgaussian, i.e.

E exp(tZ) ≤ exp(c‖a‖_2^2 t^2) for all t ∈ R, (12)

and

P(|Σ_{i=1}^N a_i X_i| ≥ t) ≤ 2 exp(−t^2/(4c‖a‖_2^2)) for all t > 0. (13)
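A quick Monte Carlo sanity check of (13), as an illustration rather than a proof: Rademacher (±1 Bernoulli) variables satisfy (11) with c = 1/2, since E exp(tX) = cosh t ≤ e^{t^2/2}.

```python
import numpy as np

# Compare the empirical tail of Z = sum_i a_i X_i (X_i Rademacher) with the
# Hoeffding-type bound 2 exp(-t^2 / (4 c ||a||_2^2)), using c = 1/2.
rng = np.random.default_rng(2)
N, trials = 50, 20000
a = rng.standard_normal(N)                 # fixed coefficient vector
X = rng.choice([-1.0, 1.0], size=(trials, N))
Z = X @ a
t = 3.0 * np.linalg.norm(a)                # threshold: 3 "standard deviations"
empirical = np.mean(np.abs(Z) >= t)
bound = 2 * np.exp(-t**2 / (4 * 0.5 * np.linalg.norm(a) ** 2))
print(empirical, bound)                    # empirical tail lies below the bound
```

The empirical tail (close to the Gaussian value ≈ 0.0027 by the CLT) sits comfortably below the bound 2e^{−4.5} ≈ 0.022, reflecting the constants lost in the Chernoff argument.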


Subexponential random variables

A random variable X with EX = 0 is called subexponential if there exist β, κ > 0 such that

P(|X| > t) ≤ βe^{−κt} for all t > 0. (14)

Theorem (Bernstein-type inequality)

Let X1, . . . , XN be a sequence of independent subexponential random variables with

P(|Xi| > t) ≤ βe^{−κt} for all t > 0 and i ∈ {1, . . . , N}. (15)

Then

P(|Σ_{i=1}^N X_i| ≥ t) ≤ 2 exp(−((κt)^2/2)/(2βN + κt)) for all t > 0. (16)


Random matrices

Definition

Let M ∈ R^{m×d} be a random matrix.

If the entries of M are independent Bernoulli variables (i.e. taking values ±1 with equal probability), then M is called a Bernoulli random matrix.

If the entries of M are independent standard Gaussian random variables, then M is called a Gaussian random matrix.

If the entries of M are independent subgaussian random variables,

P(|M_{jk}| ≥ t) ≤ βe^{−κt^2} for all t > 0,

then M is called a subgaussian random matrix.


RIP for subgaussian random matrices

Theorem

Let M ∈ R^{m×d} be a subgaussian random matrix. Then there exists C = C(β, κ) > 0 such that the restricted isometry constant of (1/√m) M satisfies δ_s ≤ δ with probability at least 1 − ε, provided

m ≥ Cδ^{−2}(s ln(ed/s) + ln(2ε^{−1})). (17)


Random matrices with subgaussian rows

Let Y ∈ R^d be a random vector.

If E|〈Y, x〉|^2 = ‖x‖_2^2 for all x ∈ R^d, then Y is called isotropic.

If, for all x ∈ R^d with ‖x‖_2 = 1, the random variable 〈Y, x〉 is subgaussian,

E exp(t〈Y, x〉) ≤ exp(ct^2) for all t ∈ R (with c independent of x),

then Y is called a subgaussian random vector.

Theorem

Let M ∈ R^{m×d} be random with independent, isotropic, subgaussian rows with the same parameter c. If

m ≥ Cδ^{−2}(s ln(ed/s) + ln(2ε^{−1})), (18)

then the restricted isometry constant of (1/√m) M satisfies δ_s ≤ δ with probability at least 1 − ε.


Ingredients of the proof: concentration inequality

Let M ∈ R^{m×d} be random with independent, isotropic, subgaussian rows. Then, for all x ∈ R^d and every t ∈ (0, 1),

P(|m^{−1}‖Mx‖_2^2 − ‖x‖_2^2| ≥ t‖x‖_2^2) ≤ 2 exp(−ct^2 m). (19)

Proof.

Let x ∈ R^d, ‖x‖_2 = 1. Denote the rows of M by Y1, . . . , Ym ∈ R^d and define

Z_i = |〈Y_i, x〉|^2 − ‖x‖_2^2, i = 1, . . . , m.

Then EZ_i = 0, the Z_i are subexponential, P(|Z_i| ≥ r) ≤ β exp(−κr), and

m^{−1}‖Mx‖_2^2 − ‖x‖_2^2 = m^{−1} Σ_{i=1}^m Z_i.

The Bernstein-type inequality (16), applied at threshold mt, gives

P(|m^{−1} Σ_{i=1}^m Z_i| ≥ t) ≤ 2 exp(−κ^2 m t^2/(4β + 2κt)).
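For Gaussian rows this concentration is easy to see numerically: Mx ∼ N(0, ‖x‖_2^2 Id_m) for unit x, so ‖Mx‖_2^2 is a χ^2 variable with m degrees of freedom. A small simulation (illustration only; the constant c is not tracked):

```python
import numpy as np

# For Gaussian M and unit x, ||Mx||_2^2 is chi-square with m degrees of
# freedom, so we sample it directly and estimate the failure probability
# P(|m^{-1}||Mx||_2^2 - 1| >= t) for growing m.
rng = np.random.default_rng(3)
t, trials = 0.3, 5000
failure = []
for m in (20, 80, 320):
    samples = rng.standard_normal((trials, m))
    dev = np.abs((samples ** 2).mean(axis=1) - 1.0)
    failure.append(np.mean(dev >= t))
print(failure)        # decreasing in m, roughly like exp(-c t^2 m)
```

The failure frequency drops sharply with m, consistent with the exponential rate in (19).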


Ingredients of the proof: covering argument

Let M ∈ R^{m×d} be random and suppose

P(|m^{−1}‖Mx‖_2^2 − ‖x‖_2^2| ≥ t‖x‖_2^2) ≤ 2 exp(−ct^2 m) for all x ∈ R^d.

Define M̃ = (1/√m) M. Then

P(|‖M̃x‖_2^2 − ‖x‖_2^2| ≥ t‖x‖_2^2) ≤ 2 exp(−ct^2 m) for all x ∈ R^d.

For S ⊂ {1, . . . , d}, |S| = s and δ, ε ∈ (0, 1), if

m ≥ Cδ^{−2}(7s + 2 ln(2ε^{−1})), (20)

then with probability at least 1 − ε

‖M̃_S^T M̃_S − Id‖_{2→2} < δ. (21)


Ingredients of the proof: union bound

Let M̃ ∈ R^{m×d} be random and suppose

P(|‖M̃x‖_2^2 − ‖x‖_2^2| ≥ t‖x‖_2^2) ≤ 2 exp(−ct^2 m) for all x ∈ R^d.

If, for δ, ε ∈ (0, 1),

m ≥ Cδ^{−2}[s(9 + 2 ln(d/s)) + 2 ln(2ε^{−1})], (22)

then with probability at least 1 − ε the restricted isometry constant δ_s of M̃ satisfies δ_s < δ.


Gaussian width

For T ⊂ R^d we define its Gaussian width by

ℓ(T) := E sup_{x ∈ T} 〈x, g〉, where g ∈ R^d is a standard Gaussian random vector. (23)

(Figure: the width of T in direction u.)

By rotation invariance, (23) can be written as

ℓ(T) = E‖g‖_2 · E sup_{x ∈ T} 〈x, u〉,

where u is uniformly distributed on S^{d−1}.

Examples:

ℓ(S^{d−1}) = E sup_{‖x‖_2 = 1} 〈x, g〉 = E‖g‖_2 ∼ √d.

D := conv{x ∈ S^{d−1} : |supp x| ≤ s}, ℓ(D) ∼ √(s ln(d/s)).
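Both examples can be estimated by Monte Carlo. For D, the supremum of 〈x, g〉 over the convex hull is attained at an s-sparse unit vector, so it equals the ℓ2-norm of the s largest-magnitude entries of g; the sketch below uses that fact.

```python
import numpy as np

# Monte Carlo estimates of the two Gaussian widths from this slide.
rng = np.random.default_rng(4)
d, s, trials = 400, 10, 2000
g = rng.standard_normal((trials, d))

# l(S^{d-1}) = E||g||_2 ~ sqrt(d).
width_sphere = np.linalg.norm(g, axis=1).mean()

# For D = conv{x in S^{d-1}: |supp x| <= s}, sup_x <x, g> is the l2-norm
# of the s largest-magnitude entries of g (the sup sits at a vertex).
width_D = np.sqrt(np.sort(g ** 2, axis=1)[:, -s:].sum(axis=1)).mean()

print(width_sphere / np.sqrt(d))            # close to 1
print(width_D, np.sqrt(s * np.log(d / s)))  # same order of magnitude
```

The estimate for ℓ(D) lands between √(s ln(d/s)) and the upper bound √(2s ln(ed/s)) + √s used later in the slides.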


Gordon’s escape through a mesh

ℓ(T) := E sup_{x ∈ T} 〈x, g〉, where g ∈ R^d is a standard Gaussian random vector.

E_m := E‖g‖_2 = √2 Γ((m + 1)/2)/Γ(m/2), where g ∈ R^m is a standard Gaussian random vector, and

m/√(m + 1) ≤ E_m ≤ √m.

Theorem

Let M ∈ R^{m×d} be Gaussian and T ⊂ S^{d−1}. Then, for t > 0, it holds that

P(inf_{x ∈ T} ‖Mx‖_2 > E_m − ℓ(T) − t) ≥ 1 − e^{−t^2/2}. (24)

The proof relies on the concentration of measure inequality for Lipschitz functions. The required number of measurements m is determined by

E_m ≥ m/√(m + 1) ≥ ℓ(T) + t, i.e. m ≳ ℓ(T)^2.


Estimates for Gaussian widths of T (x)

T(x) = cone{z − x : z ∈ R^d, ‖z‖_1 ≤ ‖x‖_1} (25)

N(x) := {z ∈ R^d : 〈z, w − x〉 ≤ 0 for all w s.t. ‖w‖_1 ≤ ‖x‖_1} (26)

ℓ(T(x) ∩ S^{d−1}) ≤ E min_{z ∈ N(x)} ‖g − z‖_2, where g ∈ R^d is a standard Gaussian random vector.

Let supp(x) = S. Then

N(x) = ∪_{t ≥ 0} {z ∈ R^d : z_i = t sgn(x_i), i ∈ S, |z_i| ≤ t, i ∈ S^c}

and

[ℓ(T(x) ∩ S^{d−1})]^2 ≤ 2s ln(ed/s).


Nonuniform recovery with Gaussian measurements

Theorem

Let x ∈ R^d be an s-sparse vector and let M ∈ R^{m×d} be a randomly drawn Gaussian matrix. If, for some ε ∈ (0, 1),

m^2/(m + 1) ≥ 2s (√(ln(ed/s)) + √(ln(ε^{−1})/s))^2, (27)

then with probability at least 1 − ε the vector x is the unique minimizer of ‖z‖_1 subject to Mz = Mx.
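Condition (27) is explicit enough to evaluate directly. A small helper (hypothetical name, direct search over m) returns the smallest number of Gaussian measurements it prescribes:

```python
import numpy as np

def min_measurements(s, d, eps):
    """Smallest m with m^2/(m+1) >= 2s(sqrt(ln(ed/s)) + sqrt(ln(1/eps)/s))^2."""
    rhs = 2 * s * (np.sqrt(np.log(np.e * d / s)) + np.sqrt(np.log(1 / eps) / s)) ** 2
    m = 1
    while m * m / (m + 1) < rhs:
        m += 1
    return m

# E.g. a 10-sparse vector in dimension 1000 with 99% success probability:
print(min_measurements(10, 1000, 0.01))   # on the order of 2 s ln(ed/s)
```

The dominant term is 2s ln(ed/s), so the measurement count grows only logarithmically in the ambient dimension d.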


Estimates for Gaussian widths of Tρ,s

T_{ρ,s} := {w ∈ R^d : ‖w_S‖_1 ≥ ρ‖w_{S^c}‖_1 for some S ⊂ [d], |S| = s} (28)

D := conv{x ∈ S^{d−1} : |supp(x)| ≤ s} (29)

T_{ρ,s} ∩ S^{d−1} ⊂ (1 + ρ^{−1})D

ℓ(D) ≤ √(2s ln(ed/s)) + √s

ℓ(T_{ρ,s} ∩ S^{d−1}) ≤ (1 + ρ^{−1})(√(2s ln(ed/s)) + √s)


Uniform recovery with Gaussian measurements

Theorem

Let M ∈ R^{m×d} be Gaussian, 0 < ρ < 1 and 0 < ε < 1. If

m^2/(m + 1) ≥ 2s(1 + ρ^{−1})^2 (√(ln(ed/s)) + 1/√2 + √(ln(ε^{−1})/(s(1 + ρ^{−1})^2)))^2,

then with probability at least 1 − ε, for every x ∈ R^d a minimizer x̂ of ‖z‖_1 subject to Mz = Mx approximates x with ℓ1-error

‖x − x̂‖_1 ≤ (2(1 + ρ)/(1 − ρ)) σ_s(x)_1.


Thank you for your attention!
