Compressed Sensing and All That
CS880 Final Project

Duzhe Wang, Department of Statistics, UW-Madison
December 11, 2017
Outline
- Compressed Sensing 101: Traditional Sampling; Compressive Sampling
- RIP and JL: Distributional JL ⇐⇒ RIP; JL is Tight in the Linear Case
- Compressed Sensing MRI: Sparsity of Medical Imaging; Mathematical Framework of CSMRI; Image Reconstruction
- Fast and Efficient Compressed Sensing: Sensing Matrices; Structurally Random Ensemble System; Theoretical Analysis
Pressure from Medical Imaging
- Nyquist-Shannon sampling theorem: no information loss if we sample at twice the signal bandwidth.
- How do we reduce the data acquisition time?
(Comic courtesy of Michael Lustig.)
Traditional Sampling
- The traditional sampling method: oversample, then remove redundancy to extract information.
- Compressed sensing principle: directly acquire the significant part of the signal by nonadaptive linear measurements.
Sparse Image Representation
- Signal transform: represent the signal in a basis where it is sparse.
- For example, the wavelet transform of an image.
Compressive Sampling
- Signal x is K-sparse in basis Ψ. For example, Ψ = I.
- Replace samples with linear projections y = Φx. Φ is called the sensing matrix.
- Random measurements Φ will work, provided Φ and Ψ are incoherent.
- The signal is local, the measurements are global: each measurement picks up a little information about each component.
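A minimal sketch of this measurement model (assuming Ψ = I so x is sparse in the standard basis; all sizes and names are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, K = 512, 128, 10          # ambient dimension, measurements, sparsity

# K-sparse signal in the standard basis (Psi = I)
x = np.zeros(N)
support = rng.choice(N, size=K, replace=False)
x[support] = rng.standard_normal(K)

# Random Gaussian sensing matrix, variance 1/M per entry
Phi = rng.standard_normal((M, N)) / np.sqrt(M)

# Nonadaptive linear measurements: each row of Phi "sees" all of x
y = Phi @ x
print(y.shape)                  # (128,) -- far fewer numbers than N = 512
```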
Sensing Matrices
- It's possible to fully reconstruct any sparse signal if the sensing matrix Φ satisfies the Restricted Isometry Property (RIP).
- A matrix Φ ∈ R^(M×N) is (ε, K)-RIP if for all x ≠ 0 with ||x||_0 ≤ K, we have

  (1 − ε)||x||_2^2 ≤ ||Φx||_2^2 ≤ (1 + ε)||x||_2^2
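Verifying RIP exactly is combinatorially hard (it quantifies over all K-sparse x), so the sketch below only estimates the distortion of a Gaussian Φ over randomly drawn K-sparse vectors; the observed maximum lower-bounds the true restricted isometry constant (sizes illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, K = 256, 100, 5

Phi = rng.standard_normal((M, N)) / np.sqrt(M)

worst = 0.0
for _ in range(2000):
    # draw a random K-sparse vector and measure the squared-norm distortion
    x = np.zeros(N)
    idx = rng.choice(N, size=K, replace=False)
    x[idx] = rng.standard_normal(K)
    ratio = np.linalg.norm(Phi @ x) ** 2 / np.linalg.norm(x) ** 2
    worst = max(worst, abs(ratio - 1.0))

print(f"largest observed |ratio - 1|: {worst:.3f}")  # lower bound on the RIP constant
```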
Theoretical Analysis
Let Ψ be an arbitrary fixed N × N orthonormal matrix, let ε, δ be scalars in (0, 1), let s be an integer in [N], and let M be an integer satisfying

  M ≥ 100 s ln(40N/(δε)) / ε^2.

Let Φ ∈ R^(M×N) be a matrix whose entries are independently distributed normally with zero mean and variance 1/M. Then with probability at least 1 − δ over the choice of Φ, the matrix ΦΨ is (ε, s)-RIP.

- The proof uses the distributional Johnson-Lindenstrauss lemma.
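Plugging illustrative numbers into the bound shows how conservative its constants are:

```python
import math

N, s = 10**6, 50          # illustrative ambient dimension and sparsity
eps, delta = 0.5, 0.01

M = 100 * s * math.log(40 * N / (delta * eps)) / eps**2
print(math.ceil(M))       # ~456,050 rows: the constants are loose
```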
CS Signal Recovery
- Mathematical notion of sparsity: B_q(R_q) = {θ ∈ R^N : Σ_{j=1}^N |θ_j|^q ≤ R_q}, q ∈ [0, 1]
- Reconstruct via L1 minimization (basis pursuit):

  minimize_x ||x||_1  subject to  y = Φx

- If x is K-sparse, solving the above basis pursuit problem recovers x exactly when M is sufficiently large. [Candes and Tao '04; Donoho '04]
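Basis pursuit is a linear program: introduce auxiliary variables t with −t ≤ x ≤ t and minimize Σ_i t_i subject to Φx = y. A self-contained sketch using scipy's linprog (all sizes illustrative):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)
N, M, K = 128, 60, 5

x_true = np.zeros(N)
x_true[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
y = Phi @ x_true

# Variables z = [x, t]; minimize sum(t) s.t. Phi x = y and -t <= x <= t.
c = np.concatenate([np.zeros(N), np.ones(N)])
A_eq = np.hstack([Phi, np.zeros((M, N))])
I = np.eye(N)
A_ub = np.vstack([np.hstack([I, -I]),     #  x - t <= 0
                  np.hstack([-I, -I])])   # -x - t <= 0
b_ub = np.zeros(2 * N)

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=y,
              bounds=[(None, None)] * N + [(0, None)] * N)
x_hat = res.x[:N]
print(np.max(np.abs(x_hat - x_true)))     # ~0: exact recovery (w.h.p. for these sizes)
```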
Theoretical Analysis
[Candes '08]
Let ε < 1/(1 + √2), let Φ be an (ε, 2s)-RIP matrix, let x be an arbitrary vector, and denote

  x_s ∈ argmin_{v: ||v||_0 ≤ s} ||x − v||_1.

Let

  x̂ ∈ argmin_{z: Φz = y} ||z||_1

be the reconstructed vector. Then

  ||x̂ − x||_2 ≤ 2(1 − ρ)^(−1) s^(−1/2) ||x − x_s||_1,

where ρ = √2 ε/(1 − ε).
RIP and Distributional JL
- RIP: a matrix Φ ∈ R^(M×N) is (ε, K)-RIP if for all x ≠ 0 with ||x||_0 ≤ K, we have

  (1 − ε)||x||_2^2 ≤ ||Φx||_2^2 ≤ (1 + ε)||x||_2^2

- Distributional JL: a random matrix Φ ∈ R^(M×N), Φ ∼ D, satisfies the (ε, δ)-distributional JL property if for any fixed x ∈ R^N, with probability greater than 1 − δ,

  (1 − ε)||x||_2^2 ≤ ||Φx||_2^2 ≤ (1 + ε)||x||_2^2
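The quantifiers differ: RIP holds uniformly over all K-sparse x for one fixed Φ, while distributional JL fixes a single arbitrary x and draws Φ at random. A quick empirical check of the distributional JL property for Gaussian matrices (sizes illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
N, M, eps, trials = 500, 50, 0.5, 2000

x = rng.standard_normal(N)            # one fixed, arbitrary (non-sparse) vector
nx2 = np.linalg.norm(x) ** 2

violations = 0
for _ in range(trials):
    Phi = rng.standard_normal((M, N)) / np.sqrt(M)   # fresh random matrix each trial
    r = np.linalg.norm(Phi @ x) ** 2 / nx2
    violations += not (1 - eps <= r <= 1 + eps)

print(violations / trials)            # empirical delta: tiny for these sizes
```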
Distributional JL ⇐⇒ RIP

- Distributional JL ⇒ RIP:
[Baraniuk-Davenport-DeVore-Wakin '08]
Suppose ε < 1 and M ≥ C_1(ε) K log(N/K). If Φ satisfies the (ε/2, δ)-distributional JL property with δ = e^(−Mε), then with probability at least 1 − e^(−Mε/2), Φ satisfies (ε, K)-RIP.
- RIP ⇒ Distributional JL:
[Krahmer-Ward '11]
Suppose Φ ∈ R^(M×N) satisfies (ε, 2K)-RIP, and let D_ε = diag(ε_1, ..., ε_N) be a diagonal matrix of i.i.d. Rademacher random variables. Then ΦD_ε satisfies the (3ε, 3e^(−cK))-distributional JL property.
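The Krahmer-Ward direction is constructive: randomize the column signs of any RIP matrix. A sketch of the construction, using a Gaussian matrix as a stand-in for a matrix known to satisfy RIP (certifying RIP directly is hard):

```python
import numpy as np

rng = np.random.default_rng(4)
M, N = 64, 256

# Stand-in for an (eps, 2K)-RIP matrix; Gaussian matrices of this
# shape satisfy RIP with high probability.
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
signs = rng.choice([-1.0, 1.0], size=N)      # diagonal of D_eps (Rademacher)

def jl_embed(x):
    # Phi @ D_eps @ x: flip signs first, then apply Phi
    return Phi @ (signs * x)

x = rng.standard_normal(N)
print(np.linalg.norm(jl_embed(x)) ** 2 / np.linalg.norm(x) ** 2)  # close to 1
```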
Is JL Tight?
- Distributional JL ⇒ JL: suppose 0 < ε < 1/2, let {x_1, ..., x_N} ⊂ R^n, and let m = 20 log N / ε^2. Then there exists a mapping f: R^n → R^m such that for all pairs (i, j),

  (1 − ε)||x_i − x_j||_2^2 ≤ ||f(x_i) − f(x_j)||_2^2 ≤ (1 + ε)||x_i − x_j||_2^2

  (see the sketch after this list)
- For any n > 1, 0 < ε < 1/2, and N > n^C for some constant C > 0, the JL lemma is optimal in the case where f is linear. [Larsen-Nelson '16]
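A sketch of the finite-point-set JL statement with a Gaussian linear map f(x) = Φx, using the slide's m = 20 log N / ε^2 (sizes illustrative):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(5)
n, Npts, eps = 2000, 50, 0.5
m = int(np.ceil(20 * np.log(Npts) / eps**2))    # m = 20 log N / eps^2

X = rng.standard_normal((Npts, n))              # N arbitrary points in R^n
Phi = rng.standard_normal((m, n)) / np.sqrt(m)  # linear map f(x) = Phi x
Y = X @ Phi.T

# worst-case distortion over all pairwise squared distances
worst = max(
    abs(np.linalg.norm(Y[i] - Y[j])**2 / np.linalg.norm(X[i] - X[j])**2 - 1)
    for i, j in combinations(range(Npts), 2)
)
print(f"m = {m}, worst distortion = {worst:.3f}")  # typically below eps
```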
MRI
- The signal measured by the MRI system is the Fourier transform of the magnitude of the magnetization (the image).
- The traditional MRI reconstruction method is the inverse Fourier transform.
- Number of measurements ∝ scanning time.
- Reduce samples to reduce scanning time.
Sparsity of Medical Imaging
(Comic courtesy of Michael Lustig.)
- In general, medical images are not sparse, but they have a sparse representation under a wavelet transform.
(This is an exercise from Michael Lustig's webpage; the wavelet transform code is from Wavelab, by David Donoho.)
Mathematical Framework of CSMRI
  θ̂ ∈ argmin_θ (1/2)||F_u(Ψθ) − y||_2^2 + λ||θ||_1   (1)

where Ψ is a sparsifying basis, θ is the transform coefficient vector, F_u is the undersampled Fourier transform, and y contains the samples acquired in k-space. Hence Ψθ̂ is the estimated image.
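A minimal proximal-gradient (ISTA) sketch for problem (1), assuming for brevity that Ψ = I (the image itself is sparse) and that F_u is a masked unitary 2D FFT; all function and variable names are illustrative:

```python
import numpy as np

def soft(u, t):
    """Soft-thresholding: the proximal operator of t * ||.||_1."""
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

def ista(mask, y, lam, iters=100):
    """Minimize 0.5*||F_u(theta) - y||_2^2 + lam*||theta||_1 with Psi = I
    and F_u = mask * (unitary 2-D FFT)."""
    theta = np.zeros(y.shape)                    # real image estimate
    for _ in range(iters):
        resid = mask * np.fft.fft2(theta, norm="ortho") - y
        grad = np.fft.ifft2(mask * resid, norm="ortho").real  # adjoint of F_u
        theta = soft(theta - grad, lam)          # step size 1: F_u is non-expansive
    return theta
```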
Compressed Sensing Reconstruction
- Compute the 2D Fourier transform of the original image and multiply it by a non-uniform random mask to get randomly sampled k-space data y.
- Run the projection onto convex sets (POCS) algorithm to solve (1).
- Result: original image (left) vs. compressed sensing reconstruction (right).
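The slides solve (1) with POCS; as a stand-in, the ISTA sketch above can run the same pipeline end to end on a synthetic sparse image (all numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(6)
img = np.zeros((64, 64))
img[rng.integers(0, 64, 40), rng.integers(0, 64, 40)] = 1.0  # toy sparse "image"

mask = rng.random((64, 64)) < 0.33            # random 33% k-space sampling
y = mask * np.fft.fft2(img, norm="ortho")     # undersampled measurements

rec = ista(mask, y, lam=0.01, iters=300)
print(np.max(np.abs(rec - img)))              # small error if recovery succeeds
```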
Review of Sensing Matrices
Dense random matrices (Gaussian/Rademacher):
✓ Universality: incoherent with many sparsifying bases
✗ Huge memory and computational complexity: to process a 512 × 512 image with 64K measurements (25% of the original sampling rate), a Rademacher random matrix requires gigabytes of storage (see the check after this list)
✗ Not efficient in large-scale applications
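The storage claim can be checked by counting one bit per ±1 entry:

```python
n_pixels = 512 * 512            # N = 262,144
n_meas = n_pixels // 4          # 64K measurements (25% sampling)
entries = n_meas * n_pixels     # dense Rademacher matrix size
print(entries / 8 / 2**30, "GiB at 1 bit per entry")   # 2.0 GiB
```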
Review of Sensing Matrices
Fast transform-based sensing (e.g., a partial FFT):
✓ Fast and efficient implementation
✗ Non-universality: only works when the sparsifying basis is the identity matrix
Motivation
- Do-Gan-Nguyen-Tran '12 propose structurally random ensembles, an extension of the scrambled FFT.
- These sensing ensembles have practical features: memory efficiency, fast computation, hardware-friendly implementation, and streaming capability.
Structurally Random Ensemble System
- Global randomizer R ∈ R^(N×N): a uniform random permutation matrix.
- Local randomizer R ∈ R^(N×N): a diagonal random matrix with P(R_ii = ±1) = 1/2.
- Randomize the target signal by either flipping its sample signs or uniformly permuting its sample locations.
Structurally Random Ensemble System
- F ∈ R^(N×N) is an orthonormal matrix. In practice, it is selected among popular fast computable transforms such as the FFT.
- The purpose of the matrix F is to spread the information of the signal's samples over all measurements.
Structurally Random Ensemble System
- D ∈ R^(M×N) is a subsampling matrix. It selects a random subset of rows of the matrix FR.
- In matrix representation, D is a random subset of M rows of the N × N identity matrix.
- We need to multiply by the scaling coefficient √(N/M) to normalize the transform, so that the energy of the measurement vector is similar to the energy of the input signal vector.
Structurally Sensing Matrix
- Φ = √(N/M) · DFR
- A conventional CS reconstruction algorithm (basis pursuit) is then used to recover the transform coefficient vector α.
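Since R, F, and D all act in O(N log N) time and O(N) memory, Φ never needs to be formed explicitly. A sketch with a local randomizer (sign flips), a unitary FFT, and random row selection; sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(7)
N, M = 4096, 1024

signs = rng.choice([-1.0, 1.0], size=N)        # local randomizer R
rows = rng.choice(N, size=M, replace=False)    # subsampler D: M random rows

def sense(x):
    """y = sqrt(N/M) * D F R x, computed without ever storing Phi."""
    z = np.fft.fft(signs * x, norm="ortho")    # F R x (unitary FFT)
    return np.sqrt(N / M) * z[rows]            # subsample and rescale

x = rng.standard_normal(N)
y = sense(x)
print(y.shape, np.linalg.norm(y)**2 / np.linalg.norm(x)**2)  # energy ratio ~ 1
```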
Sparse Structurally Sensing Matrix
- Φ is not sparse.
- Fix: replace the fast transform by its block diagonal version.
- The result retains memory efficiency and fast computation.
- With the local pre-randomizer, it also has streaming capability.
Theoretical Analysis
Coherence [Do-Gan-Nguyen-Tran '12]
Assume that the maximum absolute entry of a structurally random matrix Φ ∈ R^(M×N) and of an orthogonal matrix Ψ ∈ R^(N×N) is not larger than 1/√(log N). Then with high probability, the coherence of Φ and Ψ is not larger than O(√((log N)/s)), where s is the average number of nonzero entries per row of Φ.

- The optimal coherence (achieved by Gaussian/Rademacher random matrices) is O(√((log N)/N)).
Theoretical Analysis
Reconstruction [Do-Gan-Nguyen-Tran '12]
Under the previous assumption, sampling a signal using a structurally random matrix guarantees exact reconstruction (by basis pursuit) with high probability, provided M ∼ (KN/s) log^2 N.

- The optimal number of measurements required by dense Gaussian/Rademacher random matrices is O(K log N).
Summary
- Sparse signals can be recovered from a small number of nonadaptive linear measurements.
- RIP is a sufficient condition for exact/approximate recovery; for various ensembles of random matrices, RIP holds with high probability.
- The JL lemma is tight for linear mappings.
- It's amazing when MRI meets compressed sensing.
- Structurally random sensing matrices offer memory efficiency, fast computation, and hardware friendliness.