Compressed Sensing and All That
CS880 Final Project

Duzhe Wang, Department of Statistics, UW-Madison
December 11, 2017
Outline
- Compressed Sensing 101: Traditional Sampling; Compressive Sampling
- RIP and JL: Distributional JL ⇐⇒ RIP; JL is Tight in the Linear Case
- Compressed Sensing MRI: Sparsity of Medical Imaging; Mathematical Framework of CSMRI; Image Reconstruction
- Fast and Efficient Compressed Sensing: Sensing Matrices; Structurally Random Ensemble System; Theoretical Analysis
Pressure from Medical Imaging
- Nyquist-Shannon sampling theorem: no information loss if we sample at twice the signal bandwidth.
- How do we reduce the data acquisition time?
(Comic courtesy of Michael Lustig.)
Traditional Sampling
- The traditional sampling method: oversample, then remove redundancy to extract information.
- Compressed sensing principle: directly acquire the significant part of the signal by nonadaptive linear measurements.
Sparse Image Representation
- Signal transform: represent the signal in a basis where it is sparse.
- For example, the wavelet transform of an image.
Compressive Sampling
- Signal x is K-sparse in basis Ψ. For example, Ψ = I.
- Replace samples with linear projections y = Φx. Φ is called the sensing matrix.
- Random measurements Φ will work, provided Φ and Ψ are incoherent.
- The signal is local, the measurements are global: each measurement picks up a little information about each component.
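A minimal sketch of this measurement model (assuming Ψ = I so x is sparse in the standard basis; all sizes and names are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, K = 512, 128, 10          # ambient dimension, measurements, sparsity

# K-sparse signal in the standard basis (Psi = I)
x = np.zeros(N)
support = rng.choice(N, size=K, replace=False)
x[support] = rng.standard_normal(K)

# Random Gaussian sensing matrix, variance 1/M per entry
Phi = rng.standard_normal((M, N)) / np.sqrt(M)

# Nonadaptive linear measurements: each row of Phi "sees" all of x
y = Phi @ x
print(y.shape)                  # (128,) -- far fewer numbers than N = 512
```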
Sensing Matrices
- It's possible to fully reconstruct any sparse signal if the sensing matrix Φ satisfies the Restricted Isometry Property (RIP).
- A matrix Φ ∈ R^(M×N) is (ε, K)-RIP if for all x ≠ 0 with ||x||_0 ≤ K, we have

  (1 − ε)||x||_2^2 ≤ ||Φx||_2^2 ≤ (1 + ε)||x||_2^2
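Verifying RIP exactly is combinatorially hard (it quantifies over all K-sparse x), so the sketch below only estimates the distortion of a Gaussian Φ over randomly drawn K-sparse vectors; the observed maximum lower-bounds the true restricted isometry constant (sizes illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, K = 256, 100, 5

Phi = rng.standard_normal((M, N)) / np.sqrt(M)

worst = 0.0
for _ in range(2000):
    # draw a random K-sparse vector and measure the squared-norm distortion
    x = np.zeros(N)
    idx = rng.choice(N, size=K, replace=False)
    x[idx] = rng.standard_normal(K)
    ratio = np.linalg.norm(Phi @ x) ** 2 / np.linalg.norm(x) ** 2
    worst = max(worst, abs(ratio - 1.0))

print(f"largest observed |ratio - 1|: {worst:.3f}")  # lower bound on the RIP constant
```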
Theoretical Analysis
Let Ψ be an arbitrary fixed N × N orthonormal matrix, let ε, δ be scalars in (0, 1), let s be an integer in [N], and let M be an integer satisfying

  M ≥ 100 s ln(40N/(δε)) / ε^2.

Let Φ ∈ R^(M×N) be a matrix whose entries are independently distributed normally with zero mean and variance 1/M. Then with probability at least 1 − δ over the choice of Φ, the matrix ΦΨ is (ε, s)-RIP.

- The proof uses the distributional Johnson-Lindenstrauss lemma.
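Plugging illustrative numbers into the bound shows how conservative its constants are:

```python
import math

N, s = 10**6, 50          # illustrative ambient dimension and sparsity
eps, delta = 0.5, 0.01

M = 100 * s * math.log(40 * N / (delta * eps)) / eps**2
print(math.ceil(M))       # ~456,050 rows: the constants are loose
```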
CS Signal Recovery
- Mathematical notion of sparsity: B_q(R_q) = {θ ∈ R^N : Σ_{j=1}^N |θ_j|^q ≤ R_q}, q ∈ [0, 1]
- Reconstruct via L1 minimization (basis pursuit):

  minimize_x ||x||_1  subject to  y = Φx

- If x is K-sparse, solving the above basis pursuit problem recovers x exactly when M is sufficiently large. [Candes and Tao '04; Donoho '04]
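Basis pursuit is a linear program: introduce auxiliary variables t with −t ≤ x ≤ t and minimize Σ_i t_i subject to Φx = y. A self-contained sketch using scipy's linprog (all sizes illustrative):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)
N, M, K = 128, 60, 5

x_true = np.zeros(N)
x_true[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
y = Phi @ x_true

# Variables z = [x, t]; minimize sum(t) s.t. Phi x = y and -t <= x <= t.
c = np.concatenate([np.zeros(N), np.ones(N)])
A_eq = np.hstack([Phi, np.zeros((M, N))])
I = np.eye(N)
A_ub = np.vstack([np.hstack([I, -I]),     #  x - t <= 0
                  np.hstack([-I, -I])])   # -x - t <= 0
b_ub = np.zeros(2 * N)

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=y,
              bounds=[(None, None)] * N + [(0, None)] * N)
x_hat = res.x[:N]
print(np.max(np.abs(x_hat - x_true)))     # ~0: exact recovery (w.h.p. for these sizes)
```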
Theoretical Analysis
[Candes '08]
Let ε < 1/(1 + √2), let Φ be an (ε, 2s)-RIP matrix, let x be an arbitrary vector, and denote

  x_s ∈ argmin_{v: ||v||_0 ≤ s} ||x − v||_1.

Let

  x̂ ∈ argmin_{z: Φz = y} ||z||_1

be the reconstructed vector. Then

  ||x̂ − x||_2 ≤ 2(1 − ρ)^(−1) s^(−1/2) ||x − x_s||_1,

where ρ = √2 ε/(1 − ε).
RIP and Distributional JL
- RIP: a matrix Φ ∈ R^(M×N) is (ε, K)-RIP if for all x ≠ 0 with ||x||_0 ≤ K, we have

  (1 − ε)||x||_2^2 ≤ ||Φx||_2^2 ≤ (1 + ε)||x||_2^2

- Distributional JL: a random matrix Φ ∈ R^(M×N), Φ ∼ D, satisfies the (ε, δ)-distributional JL property if for any fixed x ∈ R^N, with probability greater than 1 − δ,

  (1 − ε)||x||_2^2 ≤ ||Φx||_2^2 ≤ (1 + ε)||x||_2^2
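The quantifiers differ: RIP holds uniformly over all K-sparse x for one fixed Φ, while distributional JL fixes a single arbitrary x and draws Φ at random. A quick empirical check of the distributional JL property for Gaussian matrices (sizes illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
N, M, eps, trials = 500, 50, 0.5, 2000

x = rng.standard_normal(N)            # one fixed, arbitrary (non-sparse) vector
nx2 = np.linalg.norm(x) ** 2

violations = 0
for _ in range(trials):
    Phi = rng.standard_normal((M, N)) / np.sqrt(M)   # fresh random matrix each trial
    r = np.linalg.norm(Phi @ x) ** 2 / nx2
    violations += not (1 - eps <= r <= 1 + eps)

print(violations / trials)            # empirical delta: tiny for these sizes
```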
Distributional JL ⇐⇒ RIP

- Distributional JL ⇒ RIP:
[Baraniuk-Davenport-DeVore-Wakin '08]
Suppose ε < 1 and M ≥ C_1(ε) K log(N/K). If Φ satisfies the (ε/2, δ)-distributional JL property with δ = e^(−Mε), then with probability at least 1 − e^(−Mε/2), Φ satisfies (ε, K)-RIP.
- RIP ⇒ Distributional JL:
[Krahmer-Ward '11]
Suppose Φ ∈ R^(M×N) satisfies (ε, 2K)-RIP, and let D_ε = diag(ε_1, ..., ε_N) be a diagonal matrix of i.i.d. Rademacher random variables. Then ΦD_ε satisfies the (3ε, 3e^(−cK))-distributional JL property.
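The Krahmer-Ward direction is constructive: randomize the column signs of any RIP matrix. A sketch of the construction, using a Gaussian matrix as a stand-in for a matrix known to satisfy RIP (certifying RIP directly is hard):

```python
import numpy as np

rng = np.random.default_rng(4)
M, N = 64, 256

# Stand-in for an (eps, 2K)-RIP matrix; Gaussian matrices of this
# shape satisfy RIP with high probability.
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
signs = rng.choice([-1.0, 1.0], size=N)      # diagonal of D_eps (Rademacher)

def jl_embed(x):
    # Phi @ D_eps @ x: flip signs first, then apply Phi
    return Phi @ (signs * x)

x = rng.standard_normal(N)
print(np.linalg.norm(jl_embed(x)) ** 2 / np.linalg.norm(x) ** 2)  # close to 1
```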
Is JL Tight?
- Distributional JL ⇒ JL: suppose 0 < ε < 1/2, let {x_1, ..., x_N} ⊂ R^n, and let m = 20 log N / ε^2. Then there exists a mapping f: R^n → R^m such that for all pairs (i, j),

  (1 − ε)||x_i − x_j||_2^2 ≤ ||f(x_i) − f(x_j)||_2^2 ≤ (1 + ε)||x_i − x_j||_2^2

  (see the sketch after this list)
- For any n > 1, 0 < ε < 1/2, and N > n^C for some constant C > 0, the JL lemma is optimal in the case where f is linear. [Larsen-Nelson '16]
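A sketch of the finite-point-set JL statement with a Gaussian linear map f(x) = Φx, using the slide's m = 20 log N / ε^2 (sizes illustrative):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(5)
n, Npts, eps = 2000, 50, 0.5
m = int(np.ceil(20 * np.log(Npts) / eps**2))    # m = 20 log N / eps^2

X = rng.standard_normal((Npts, n))              # N arbitrary points in R^n
Phi = rng.standard_normal((m, n)) / np.sqrt(m)  # linear map f(x) = Phi x
Y = X @ Phi.T

# worst-case distortion over all pairwise squared distances
worst = max(
    abs(np.linalg.norm(Y[i] - Y[j])**2 / np.linalg.norm(X[i] - X[j])**2 - 1)
    for i, j in combinations(range(Npts), 2)
)
print(f"m = {m}, worst distortion = {worst:.3f}")  # typically below eps
```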
MRI
- The signal measured by the MRI system is the Fourier transform of the magnitude of the magnetization (the image).
- The traditional MRI reconstruction method is the inverse Fourier transform.
- Number of measurements ∝ scanning time.
- Reduce samples to reduce scanning time.
Sparsity of Medical Imaging
(Comic courtesy of Michael Lustig.)
- In general, medical images are not sparse, but they have a sparse representation under a wavelet transform.
(This is an exercise from Michael Lustig's webpage; the wavelet transform code is from Wavelab, by David Donoho.)
Mathematical Framework of CSMRI
  θ̂ ∈ argmin_θ (1/2)||F_u(Ψθ) − y||_2^2 + λ||θ||_1   (1)

where Ψ is a sparsifying basis, θ is the transform coefficient vector, F_u is the undersampled Fourier transform, and y contains the samples acquired in k-space. Hence Ψθ̂ is the estimated image.
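A minimal proximal-gradient (ISTA) sketch for problem (1), assuming for brevity that Ψ = I (the image itself is sparse) and that F_u is a masked unitary 2D FFT; all function and variable names are illustrative:

```python
import numpy as np

def soft(u, t):
    """Soft-thresholding: the proximal operator of t * ||.||_1."""
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

def ista(mask, y, lam, iters=100):
    """Minimize 0.5*||F_u(theta) - y||_2^2 + lam*||theta||_1 with Psi = I
    and F_u = mask * (unitary 2-D FFT)."""
    theta = np.zeros(y.shape)                    # real image estimate
    for _ in range(iters):
        resid = mask * np.fft.fft2(theta, norm="ortho") - y
        grad = np.fft.ifft2(mask * resid, norm="ortho").real  # adjoint of F_u
        theta = soft(theta - grad, lam)          # step size 1: F_u is non-expansive
    return theta
```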
Compressed Sensing Reconstruction
- Compute the 2D Fourier transform of the original image and multiply it by a non-uniform random mask to get randomly sampled k-space data y.
- Run the projection onto convex sets (POCS) algorithm to solve (1).
- Result: original image (left) vs. compressed sensing reconstruction (right).
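The slides solve (1) with POCS; as a stand-in, the ISTA sketch above can run the same pipeline end to end on a synthetic sparse image (all numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
img = np.zeros((64, 64))
img[rng.integers(0, 64, 40), rng.integers(0, 64, 40)] = 1.0  # toy sparse "image"

mask = rng.random((64, 64)) < 0.33            # random 33% k-space sampling
y = mask * np.fft.fft2(img, norm="ortho")     # undersampled measurements

rec = ista(mask, y, lam=0.01, iters=300)
print(np.max(np.abs(rec - img)))              # small error if recovery succeeds
```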
Review of Sensing Matrices
Dense random matrices (Gaussian/Rademacher):
✓ Universality: incoherent with many sparsifying bases
✗ Huge memory and computational complexity: to process a 512 × 512 image with 64K measurements (25% of the original sampling rate), a Rademacher random matrix requires gigabytes of storage (see the check after this list)
✗ Not efficient in large-scale applications
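The storage claim can be checked by counting one bit per ±1 entry:

```python
n_pixels = 512 * 512            # N = 262,144
n_meas = n_pixels // 4          # 64K measurements (25% sampling)
entries = n_meas * n_pixels     # dense Rademacher matrix size
print(entries / 8 / 2**30, "GiB at 1 bit per entry")   # 2.0 GiB
```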
Review of Sensing Matrices
Fast transform-based sensing (e.g., a partial FFT):
✓ Fast and efficient implementation
✗ Non-universality: only works when the sparsifying basis is the identity matrix
Motivation
- Do-Gan-Nguyen-Tran '12 propose structurally random ensembles, an extension of the scrambled FFT.
- These sensing ensembles have practical features: memory efficiency, fast computation, hardware-friendly implementation, and streaming capability.
Structurally Random Ensemble System
- Global randomizer R ∈ R^(N×N): a uniform random permutation matrix.
- Local randomizer R ∈ R^(N×N): a diagonal random matrix with P(R_ii = ±1) = 1/2.
- Randomize the target signal by either flipping its sample signs or uniformly permuting its sample locations.
Structurally Random Ensemble System
- F ∈ R^(N×N) is an orthonormal matrix. In practice, it is selected among popular fast computable transforms such as the FFT.
- The purpose of the matrix F is to spread the information of the signal's samples over all measurements.
Structurally Random Ensemble System
- D ∈ R^(M×N) is a subsampling matrix. It selects a random subset of rows of the matrix FR.
- In matrix representation, D is a random subset of M rows of the N × N identity matrix.
- We need to multiply by the scaling coefficient √(N/M) to normalize the transform, so that the energy of the measurement vector is similar to the energy of the input signal vector.
Structurally Sensing Matrix
- Φ = √(N/M) · DFR
- A conventional CS reconstruction algorithm (basis pursuit) is then used to recover the transform coefficient vector α.
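Since R, F, and D all act in O(N log N) time and O(N) memory, Φ never needs to be formed explicitly. A sketch with a local randomizer (sign flips), a unitary FFT, and random row selection; sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(7)
N, M = 4096, 1024

signs = rng.choice([-1.0, 1.0], size=N)        # local randomizer R
rows = rng.choice(N, size=M, replace=False)    # subsampler D: M random rows

def sense(x):
    """y = sqrt(N/M) * D F R x, computed without ever storing Phi."""
    z = np.fft.fft(signs * x, norm="ortho")    # F R x (unitary FFT)
    return np.sqrt(N / M) * z[rows]            # subsample and rescale

x = rng.standard_normal(N)
y = sense(x)
print(y.shape, np.linalg.norm(y)**2 / np.linalg.norm(x)**2)  # energy ratio ~ 1
```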
Sparse Structurally Sensing Matrix
- Φ is not sparse.
- Fix: replace the fast transform by its block diagonal version.
- The result retains memory efficiency and fast computation.
- With the local pre-randomizer, it also has streaming capability.
Theoretical Analysis
Coherence [Do-Gan-Nguyen-Tran '12]
Assume that the maximum absolute entry of a structurally random matrix Φ ∈ R^(M×N) and of an orthogonal matrix Ψ ∈ R^(N×N) is not larger than 1/√(log N). Then with high probability, the coherence of Φ and Ψ is not larger than O(√((log N)/s)), where s is the average number of nonzero entries per row of Φ.

- The optimal coherence (achieved by Gaussian/Rademacher random matrices) is O(√((log N)/N)).
Theoretical Analysis
Reconstruction [Do-Gan-Nguyen-Tran '12]
Under the previous assumption, sampling a signal using a structurally random matrix guarantees exact reconstruction (by basis pursuit) with high probability, provided M ∼ (KN/s) log^2 N.

- The optimal number of measurements required by dense Gaussian/Rademacher random matrices is O(K log N).
Summary
- Sparse signals can be recovered from a small number of nonadaptive linear measurements.
- RIP is a sufficient condition for exact/approximate recovery; for various ensembles of random matrices, RIP holds with high probability.
- The JL lemma is tight for linear mappings.
- It's amazing when MRI meets compressed sensing.
- Structurally random sensing matrices offer memory efficiency, fast computation, and hardware friendliness.