
Page 1:

Dror Baron
ECE Department
Rice University
dsp.rice.edu/cs

Measurements and Bits: Compressed Sensing meets Information Theory

Page 2:

Sensing by Sampling
• Sample data at Nyquist rate
• Compress data using model (e.g., sparsity)
  – encode coefficient locations and values
• Lots of work to throw away >80% of the coefficients (toy sketch below)
• Most computation at sensor (asymmetrical)
• Brick wall to performance of modern acquisition systems

[Block diagram: sample → compress (sparse wavelet transform) → transmit/store → receive → decompress]
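A toy numpy illustration of this sample-then-compress pipeline; the DCT stands in for the wavelet transform to keep the sketch dependency-light, and the 20% keep-ratio mirrors the “>80% thrown away” point (both are assumptions, not the slide’s exact choices):

```python
import numpy as np
from scipy.fft import dct, idct

t = np.linspace(0, 1, 1024)                  # sample at the "Nyquist rate"
signal = np.sin(6 * np.pi * t) + 0.3 * np.sin(20 * np.pi * t)

coeffs = dct(signal, norm='ortho')           # transform to a compressible basis
keep = int(0.2 * len(coeffs))                # keep the 20% largest coefficients...
small = np.argsort(np.abs(coeffs))[:-keep]
coeffs[small] = 0                            # ...and throw away the other >80%
reconstructed = idct(coeffs, norm='ortho')   # "decompress" at the receiver
print(np.max(np.abs(signal - reconstructed)))  # tiny: the signal is compressible
```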

Page 3:

Sparsity / Compressibility

[Figure: image pixels → large wavelet coefficients; wideband signal samples → large Gabor coefficients]

• Many signals are sparse or compressible in some representation/basis (Fourier, wavelets, …)

Page 4:

Compressed Sensing

• Shannon/Nyquist sampling theorem
  – worst-case bound for any bandlimited signal
  – too pessimistic for some classes of signals
  – does not exploit signal sparsity/compressibility

• Seek direct sensing of compressible information

• Compressed Sensing (CS)
  – sparse signals can be recovered from a small number of nonadaptive (fixed) linear measurements [Candes et al.; Donoho; Kashin; Gluskin; Rice…]
  – based on new uncertainty principles beyond Heisenberg (“incoherency”)

Page 5:

Incoherent Bases (matrices)

• Spikes and sines (Fourier)

Page 6:

Incoherent Bases

• Spikes and “random noise”

Page 7:

Compressed Sensing via Random Projections
• Measure linear projections onto an incoherent basis where data is not sparse/compressible
  – random projections are universally incoherent
  – fewer measurements
  – no location information
• Reconstruct via optimization
• Highly asymmetrical (most computation at receiver)

[Block diagram: project → transmit/store → receive → reconstruct]

Page 8:

CS Encoding
• Replace samples by a more general encoder based on a few linear projections (inner products)
• Matrix-vector multiplication y = Φx; potentially analog (sketch below)

[Figure: M measurements y, measurement matrix Φ, sparse signal x with K nonzero entries]
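A minimal numpy sketch of this encoding step; the Gaussian choice of Φ and the sizes below are illustrative assumptions, not specified on the slide:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, K = 512, 128, 10                    # signal length, measurements, sparsity

# K-sparse signal: K randomly placed nonzero coefficients
x = np.zeros(N)
x[rng.choice(N, K, replace=False)] = rng.standard_normal(K)

# Random measurement matrix (i.i.d. Gaussian is one common universal choice)
Phi = rng.standard_normal((M, N)) / np.sqrt(M)

# Encoding is just a matrix-vector multiplication: M << N measurements
y = Phi @ x
```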

Page 9:

Universality via Random Projections
• Random projections
• Universally incoherent with any compressible/sparse signal class

[Figure: M measurements y = Φx, sparse signal x with K nonzero entries]

Page 10:

Reconstruction Before CS: minimum L2 norm
• Goal: given measurements y = Φx, find signal x
• Fewer rows than columns in the measurement matrix (M < N)
• Ill-posed: infinitely many solutions
• Classical solution: least squares (minimum L2 norm)

Page 11:

Reconstruction Before CS: minimum L2 norm
• Goal: given measurements y = Φx, find signal x
• Fewer rows than columns in the measurement matrix (M < N)
• Ill-posed: infinitely many solutions
• Classical solution: least squares (minimum L2 norm)
• Problem: small L2 norm doesn’t imply sparsity (numerical check below)
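A quick numerical check, continuing the numpy sketch above (the pseudoinverse is a standard route to the minimum-L2-norm solution; it is not an algorithm named on the slide):

```python
# Minimum-L2-norm solution: x_l2 = Phi^T (Phi Phi^T)^{-1} y = pinv(Phi) y
x_l2 = np.linalg.pinv(Phi) @ y

print(np.allclose(Phi @ x_l2, y))     # True: it explains the measurements...
print(np.sum(np.abs(x_l2) > 1e-6))    # ...but nearly all N entries are nonzero
```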

Page 12:

Ideal Solution: minimum L0 norm
• Ideal solution: exploit sparsity of x
• Of the infinitely many solutions, seek the sparsest one
• The L0 “norm” ‖x‖₀ counts the number of nonzero entries

Page 13:

Ideal Solution: minimum L0 norm
• Ideal solution: exploit sparsity of x
• Of the infinitely many solutions, seek the sparsest one
• If M ≤ K then w/ high probability this can’t be done
• If M ≥ K+1 then perfect reconstruction w/ high probability [Bresler et al.; Wakin et al.]
• But not robust, and combinatorial complexity (brute-force sketch below)
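The combinatorial cost is easy to see in code. A brute-force L0 decoder must try candidate supports one at a time; the sketch below (reusing the numpy setup above) checks all C(N, K) supports and is hopeless beyond tiny N, which is exactly the slide’s point:

```python
from itertools import combinations

def l0_reconstruct(Phi, y, K, tol=1e-9):
    """Brute-force L0 decoding: test every size-K support."""
    M, N = Phi.shape
    for support in combinations(range(N), K):
        idx = list(support)
        cols = Phi[:, idx]
        coef, *_ = np.linalg.lstsq(cols, y, rcond=None)
        if np.linalg.norm(cols @ coef - y) < tol:   # consistent with measurements?
            x_hat = np.zeros(N)
            x_hat[idx] = coef
            return x_hat
    return None
```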

Page 14:

The CS Revelation: minimum L1 norm
• Of the infinitely many solutions, seek the one with smallest L1 norm

Page 15:

The CS Revelation: minimum L1 norm
• Of the infinitely many solutions, seek the one with smallest L1 norm
• If M = O(K log(N/K)) then perfect reconstruction w/ high probability [Candes et al.; Donoho]
• Robust to measurement noise
• Solved via linear programming (sketch below)
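The L1 problem has the textbook linear-programming reformulation: introduce u with |x_i| ≤ u_i and minimize Σu_i. A sketch using scipy’s linprog, continuing the numpy setup above (the solver choice is ours, not the talk’s):

```python
import numpy as np
from scipy.optimize import linprog

def l1_reconstruct(Phi, y):
    """Basis pursuit, min ||x||_1 s.t. Phi x = y, as a linear program."""
    M, N = Phi.shape
    I = np.eye(N)
    c = np.concatenate([np.zeros(N), np.ones(N)])   # minimize sum(u)
    A_ub = np.block([[I, -I], [-I, -I]])            # x - u <= 0, -x - u <= 0
    b_ub = np.zeros(2 * N)
    A_eq = np.hstack([Phi, np.zeros((M, N))])       # Phi x = y
    bounds = [(None, None)] * N + [(0, None)] * N   # x free, u >= 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=y, bounds=bounds)
    return res.x[:N]

x_l1 = l1_reconstruct(Phi, y)
print(np.allclose(x_l1, x, atol=1e-4))              # K-sparse x recovered
```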

Page 16:

CS Hallmarks
• CS changes the rules of the data acquisition game
  – exploits a priori signal sparsity information (signal is compressible)
• Hardware: universality
  – same random projections / hardware for any compressible signal class
  – simplifies hardware and algorithm design
• Processing: information scalability
  – random projections ~ sufficient statistics
  – same random projections for a range of tasks: reconstruction > estimation > recognition > detection
  – far fewer measurements required to detect/recognize
• Next-generation data acquisition
  – new imaging devices and A/D converters [DARPA]
  – new reconstruction algorithms
  – new distributed source coding algorithms [Baron et al.]

Page 17:

Random Projections in Analog

Page 18:

Optical Computation of Random Projections

• CS encoder integrates sensing, compression, processing
• Example: new cameras and imaging algorithms

Page 19:

First Image Acquisition (M = 0.38N)

[Figure panels: ideal 64×64 image (4096 pixels); 400-wavelet approximation; image on DMD array; reconstruction from 1600 random measurements]

Page 20:

A/D Conversion Below Nyquist Rate

• Challenge:
  – wideband signals (radar, communications, …)
  – currently impossible to sample at Nyquist rate
• Proposed CS-based solution (sketch after the diagram below):
  – sample at “information rate”
  – simple hardware components
  – good reconstruction performance

[Block diagram: input signal → Modulator → Filter → Downsample]
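A minimal simulation of this measurement chain, assuming the modulator is a pseudorandom ±1 chipping sequence and the filter is integrate-and-dump (one common instantiation of such architectures; the slide does not pin down the components):

```python
import numpy as np

def random_demodulator(signal, M, rng):
    """Modulate by a random +/-1 sequence, integrate, and downsample to M values."""
    N = len(signal)
    chips = rng.choice([-1.0, 1.0], size=N)   # pseudorandom modulator
    mixed = signal * chips                    # smear tones across the spectrum
    block = N // M
    # integrate-and-dump filter + downsampler: sum consecutive length-`block` windows
    return mixed[:M * block].reshape(M, block).sum(axis=1)

rng = np.random.default_rng(3)
t = np.arange(4096)
wideband = np.cos(2 * np.pi * 0.31 * t) + 0.5 * np.cos(2 * np.pi * 0.07 * t)
y = random_demodulator(wideband, M=256, rng=rng)  # far below the Nyquist count
```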

Page 21:

Connections Between Compressed Sensing and Information Theory

Page 22:

Measurement Reduction via CS

• CS reconstruction via L1 minimization
  – if M = O(K log(N/K)) then perfect reconstruction w/ high probability [Candes et al.; Donoho]
  – linear programming
• Compressible signals (signal components decay)
  – also requires M = O(K log(N/K))
  – polynomial complexity (BPDN) [Candes et al.]
  – cannot reduce the order of K log(N/K) [Kashin, Gluskin]

Page 23:

Fundamental Goal: Minimize M

• Compressed sensing aims to minimize resource consumption due to measurements

• Donoho: “Why go to so much effort to acquire all the data when most of what we get will be thrown away?”

Page 24:

Fundamental Goal: Minimize M

• Compressed sensing aims to minimize resource consumption due to measurements

• Donoho: “Why go to so much effort to acquire all the data when most of what we get will be thrown away?”

• Recall K-sparse signals
  – only K+1 measurements for reconstruction
  – not robust, and combinatorial complexity

Page 25:

Rich Design Space

• What performance metric to use?
  – Determine support set of nonzero entries [Wainwright]
    this is a distortion metric, but why let tiny nonzero entries spoil the fun?
  – an L2-type distortion metric instead?

Page 26:

Rich Design Space

• What performance metric to use?
  – Determine support set of nonzero entries [Wainwright]
    this is a distortion metric, but why let tiny nonzero entries spoil the fun?
  – an L2-type distortion metric instead?

• What complexity class of reconstruction algorithms?
  – any algorithm?
  – polynomial complexity?
  – near-linear or better?

Page 27:

Rich Design Space

• What performance metric to use?
  – Determine support set of nonzero entries [Wainwright]
    this is a distortion metric, but why let tiny nonzero entries spoil the fun?
  – an L2-type distortion metric instead?

• What complexity class of reconstruction algorithms?
  – any algorithm?
  – polynomial complexity?
  – near-linear or better?

• How to account for imprecisions?
  – noise in measurements?
  – compressible signal model?

Page 28:

Lower Bound on Number of Measurements

Page 29:

Measurement Noise

• Measurement process is analog
• Analog systems add noise, non-linearities, etc.

• Assume Gaussian noise for ease of analysis

Page 30:

Setup
• Signal x is i.i.d.
• Additive white Gaussian noise z
• Noisy measurement process y = Φx + z

Page 31:

Setup
• Signal x is i.i.d.
• Additive white Gaussian noise z
• Noisy measurement process y = Φx + z

• Random projection of tiny coefficients (compressible signals) is similar to measurement noise

Page 32:

Measurement and Reconstruction Quality

• Measurement signal-to-noise ratio (SNR)

• Reconstruct using a decoder mapping x̂ = g(y)

• Reconstruction distortion metric, e.g., expected squared error

• Goal: minimize the CS measurement rate M/N

Page 33:

Measurement Channel

• Model process as measurement channel

• Capacity of measurement channel

• Measurements are bits!
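Written out, under the standard AWGN-channel reading of a noisy measurement (the slide leaves the formula to an image):

```latex
% Each noisy linear measurement acts like one use of an AWGN channel,
% so it can convey at most the AWGN capacity, in bits per measurement:
C = \tfrac{1}{2}\log_2\!\left(1 + \mathrm{SNR}\right)
```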

Page 34:

Lower Bound [Sarvotham et al.]

• Theorem: For a sparse signal with rate-distortion function R(D), the lower bound on measurement rate M/N, subject to measurement quality SNR and reconstruction distortion D, satisfies M/N ≥ R(D)/C, where C = ½ log₂(1 + SNR)

• Direct relationship to rate-distortion content

• Applies to any linear signal acquisition system

Page 35:

Lower Bound [Sarvotham et al.]

• Theorem: For a sparse signal with rate-distortion function R(D), the lower bound on measurement rate M/N, subject to measurement quality SNR and reconstruction distortion D, satisfies M/N ≥ R(D)/C, where C = ½ log₂(1 + SNR)

• Proof sketch (assembled below):
  – each measurement provides at most C bits
  – information content of the source: N·R(D) bits
  – source-channel separation for continuous-amplitude sources
  – minimal number of measurements follows
  – obtain measurement rate via normalization by N
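Assembled into one chain (a reconstruction of the slide’s missing formulas from the definitions above):

```latex
% M measurements convey at most M*C bits; describing the source to within
% distortion D takes N*R(D) bits; separation lets us compare the two:
M \cdot C \;\ge\; N \cdot R(D)
\quad\Longrightarrow\quad
\frac{M}{N} \;\ge\; \frac{R(D)}{C}
        = \frac{R(D)}{\tfrac{1}{2}\log_2(1 + \mathrm{SNR})}
```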

Page 36:

Example

• Spike process: spikes of uniform amplitude
• Rate-distortion function R(D)
• Lower bound on measurement rate

• Numbers:
  – signal of length 10⁷
  – 10³ spikes
  – SNR = 10 dB ⇒ …
  – SNR = −20 dB ⇒ …

• If the interesting portion of the signal has relatively small energy, then significantly more measurements are needed!

• Upper bound (achievable) in progress…

Page 37:

CS Reconstruction Meets Channel Coding

Page 38:

Why is Reconstruction Expensive?

[Figure: measurements y = Φx, sparse signal x with K nonzero entries]

• Culprit: dense, unstructured Φ

Page 39:

Fast CS Reconstruction

[Figure: measurements y = Φx, sparse signal x with K nonzero entries]

• LDPC-style measurement matrix Φ (sparse)
• Only 0/1 entries in Φ
• Each row of Φ contains L randomly placed 1’s
• Fast matrix multiplication ⇒ fast encoding, fast reconstruction (sketch below)
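A small scipy.sparse sketch of such a matrix; the row weight L and the sizes are illustrative assumptions:

```python
import numpy as np
from scipy.sparse import csr_matrix

def ldpc_like_matrix(M, N, L, rng):
    """Sparse 0/1 measurement matrix: each row has L randomly placed ones."""
    rows = np.repeat(np.arange(M), L)
    cols = np.concatenate([rng.choice(N, L, replace=False) for _ in range(M)])
    return csr_matrix((np.ones(M * L), (rows, cols)), shape=(M, N))

rng = np.random.default_rng(1)
N, M, K, L = 512, 128, 10, 10
x = np.zeros(N)
x[rng.choice(N, K, replace=False)] = 1.0   # K-sparse test signal
Phi = ldpc_like_matrix(M, N, L, rng)
y = Phi @ x                                # O(M*L) work instead of O(M*N)
```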

Page 40:

Ongoing Work: CS Using BP

• Considering noisy CS signals
• Application of belief propagation (BP)
  – BP over the real number field
  – sparsity is modeled as a prior in the graph

[Factor graph: states Q — coefficients X — measurements Y]

Page 41:

Promising Results

[Plot: L2 reconstruction error ‖x − x̂‖₂ vs. number of measurements M (500–3000), for row weights l = 5, 10, 15, 20, 25, with norm(x) and norm(x_n) reference levels; inset: signal x(j) vs. index j]

Page 42:

Theoretical Advantages of CS-BP

• Low complexity

• Provable reconstruction with noisy measurements using …

• Success of LDPC+BP in channel coding carried over to CS!

Page 43:

Distributed Compressed Sensing (DCS)

CS for distributed signal ensembles

Page 44:

Why Distributed?

• Networks of many sensor nodes
  – sensor, microprocessor for computation, wireless communication, networking, battery
  – can be spread over a large geographical area
• Must be energy efficient
  – minimize communication at the expense of computation
  – motivates distributed compression

Page 45:

Distributed Sensing

[Figure: sensors transmit raw data to a destination]

• Transmitting raw data is typically inefficient

Page 46:

Correlation
• Can we exploit intra-sensor and inter-sensor correlation to jointly compress?
• Ongoing challenge in information theory (distributed source coding)

[Figure: correlated sensors and a destination]

Page 47:

Collaborative Sensing
• Collaboration introduces
  – inter-sensor communication overhead
  – complexity at sensors

[Figure: sensors collaborate, then send compressed data to the destination]

Page 48:

Distributed Compressed Sensing
• Exploit intra- and inter-sensor correlations with
  – zero inter-sensor communication overhead
  – low complexity at sensors
• Distributed source coding via CS

[Figure: sensors send compressed data directly to the destination]

Page 49:

Model 1: Common + Innovations

Page 50:

Common + Innovations Model
• Motivation: measuring signals in a smooth field
  – “average” temperature value common at multiple locations
  – “innovations” driven by wind, rain, clouds, etc.
• Joint sparsity model: x1 = zC + z1 and x2 = zC + z2 (generative sketch below)
  – length-N sequences x1 and x2
  – zC is the length-N common component
  – z1, z2 are length-N innovation components
  – zC, z1, z2 have sparsity KC, K1, K2
• Measurements y1 = Φ1x1, y2 = Φ2x2
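A generative sketch of this joint sparsity model (sizes and sparsities are illustrative; the Gaussian amplitudes and matrices are our assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
N, KC, K1, K2 = 50, 5, 3, 3          # length and sparsities (illustrative)
M1, M2 = 20, 20                      # measurements per sensor

def sparse_vec(N, K, rng):
    v = np.zeros(N)
    v[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
    return v

zC = sparse_vec(N, KC, rng)          # common component (e.g., average field)
x1 = zC + sparse_vec(N, K1, rng)     # sensor 1 = common + innovation
x2 = zC + sparse_vec(N, K2, rng)     # sensor 2 = common + innovation

Phi1 = rng.standard_normal((M1, N))
Phi2 = rng.standard_normal((M2, N))
y1, y2 = Phi1 @ x1, Phi2 @ x2        # sensors encode with no collaboration
```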

Page 51:

Measurement Rate Region with Separate Reconstruction

[Diagram: Encoder f1 → Decoder g1 and Encoder f2 → Decoder g2 (separate encoding & separate reconstruction)]

Page 52:

Slepian-Wolf Theorem (Distributed lossless coding)

• Theorem [Slepian and Wolf 1973]:
  R1 ≥ H(X1|X2) (conditional entropy)
  R2 ≥ H(X2|X1) (conditional entropy)
  R1 + R2 ≥ H(X1, X2) (joint entropy)

[Figure: (R1, R2) rate region with corner points H(X1|X2) and H(X1) on the R1 axis, H(X2|X1) and H(X2) on the R2 axis; Slepian-Wolf joint reconstruction vs. separate encoding & separate reconstruction]

Page 53:

Measurement Rate Region with Joint Reconstruction

[Diagram: Encoder f1 and Encoder f2 feed a single Decoder g (separate encoding & joint reconstruction)]

• Inspired by Slepian-Wolf coding

Page 54:

Measurement Rate Region [Baron et al.]

[Figure: measurement rate region with converse and achievable bounds, simulation points, and the separate-reconstruction region]

Page 55:

Multiple Sensors

Page 56:

Model 2: Common Sparse Supports

Page 57:

Common Sparse Supports Model
• Example: many audio signals
  – sparse in the Fourier domain
  – same frequencies received by each node
  – different attenuations and delays (magnitudes and phases)

Page 58:

Common Sparse Supports Model
• Signals share sparse supports but have different coefficients
• Intuition: each measurement vector holds clues about the coefficient support set (greedy sketch below)
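A greedy joint-recovery sketch in the spirit of SOMP (which the deck names later, in the real-data comparison); the loop below is an illustrative rendering, not the authors’ exact algorithm:

```python
import numpy as np

def somp(Phis, ys, K):
    """Pick K support indices jointly: each index must help every sensor."""
    residuals = [y.astype(float) for y in ys]
    support = []
    for _ in range(K):
        # score each column by its total correlation with residuals across sensors
        scores = sum(np.abs(Phi.T @ r) for Phi, r in zip(Phis, residuals))
        scores[support] = -np.inf                  # never pick an index twice
        support.append(int(np.argmax(scores)))
        # per-sensor least squares on the shared support, then update residuals
        for j, (Phi, y) in enumerate(zip(Phis, ys)):
            coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
            residuals[j] = y - Phi[:, support] @ coef
    return sorted(support)
```

Pooling correlations across sensors is what makes the support estimate stable even when each sensor alone has too few measurements.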

Page 59:

Required Number of Measurements [Baron et al. 2005]

• Theorem: M = K measurements per sensor do not suffice to reconstruct the signal ensemble

• Theorem: as the number of sensors J increases, M = K+1 measurements per sensor suffice to reconstruct

• Joint reconstruction with reasonable computational complexity

Page 60:

Results for Common Sparse Supports (K = 5, N = 50)

[Figure: reconstruction performance, separate vs. joint]

Page 61:

Real Data Example

• Light levels in Intel Berkeley Lab
• 49 sensors, 1024 samples each
• Compare:
  – wavelet approximation: 100 terms per sensor
  – separate CS: 400 measurements per sensor
  – joint CS (SOMP): 400 measurements per sensor
• Correlated signal ensemble

Page 62:

Light Intensity at Node 19

Page 63:

Model 3: Non-Sparse Common Component

Page 64:

Non-Sparse Common Model

• Motivation: non-sparse video frame + sparse motion
• Length-N common component zC is non-sparse ⇒ each signal is incompressible
• Innovation sequences zj may share supports
• Intuition: each measurement vector contains clues about the common component zC

[Figure: zC is not sparse; the innovations are sparse]

Page 65:

Results for Non-Sparse Common Component (same supports, K = 5, N = 50)

[Figure: impact of zC vanishes as J → ∞]

Page 66:

Summary
• Compressed Sensing
  – “random projections”
  – process sparse signals using far fewer measurements
  – universality and information scalability
• Determination of measurement rates in CS
  – measurements are bits
  – lower bound on measurement rate: direct relationship to rate-distortion content
• Promising results with LDPC measurement matrices
• Distributed CS
  – new models for joint sparsity
  – analogy with Slepian-Wolf coding from information theory
  – compression of sources w/ intra- and inter-sensor correlation
• Much potential and much more to be done
• Compressed sensing meets information theory

dsp.rice.edu/cs

Page 67:

THE END

Page 68:

“With High Probability”