
Université catholique de Louvain

BRIDGING 1-BIT & HIGH RESOLUTION QUANTIZED COMPRESSED SENSING

Reconstruction Methods & Applications

kévin degraux

submitted in partial fulfilment of the requirements for the degree of Master in Electrical Engineering

supervisors:
Prof. Laurent Jacques
Prof. Christophe De Vleeschouwer

June 2013 – Louvain-la-Neuve


Kévin Degraux: Bridging 1-bit & High Resolution Quantized Compressed Sensing, Reconstruction Methods & Applications, © June 2013

supervisors:
Prof. Laurent Jacques
Prof. Christophe De Vleeschouwer


To my beloved family, for their support and invaluable presence.


ABSTRACT

In a nutshell, the compressed sensing (CS) [CRT06b, Don06] theory states that one can acquire almost any real-world signal at a "rate" (meaning a number of discrete measurements) inferior to the classical Nyquist rate and still recover it perfectly, provided the original signal was sufficiently sparse in a known basis or dictionary, or almost perfectly if the signal was compressible, even if the measurements were noisy.

Now that the reconstruction of compressively acquired signals has been studied under the assumption of additive Gaussian noise, research is focusing on the quantization of the compressive measurements, a mandatory step for digitizing those signals (see, e. g., [GFR08, JHF11]).

The goal of this master's thesis is to study the different aspects surrounding this quantization and to analyse and develop new reconstruction algorithms which take the induced non-linearity into account.

After a preliminary study of the state of the art, this document describes a Matlab benchmark framework designed to compare the performances of existing reconstruction algorithms under different conditions. Based on previous work [JLBB13, KJM+13], it has been possible to develop a new greedy algorithm (see publication [JDD13]) which bridges the extreme 1-bit [JLBB13] and high-resolution [BD09] quantization schemes. In order to test this main contribution, my work considers several types of quantization and sensing scenarios, namely, multiplication by Gaussian matrices and more realistic sensing schemes. In particular, it deals with matrices modeling the behaviour of Random Modulator Pre-Integration sensors for the compressive acquisition of 1-D signals.

Finally, this thesis briefly explores the possibility of adapting the non-linear convex optimization class of reconstruction algorithms to the quantization scenarios.


PUBLICATION

Some ideas and figures have appeared previously in the following publication:

[JDD13] Laurent Jacques, Kévin Degraux, Christophe De Vleeschouwer. Quantized Iterative Hard Thresholding: Bridging 1-bit and High-Resolution Quantized Compressed Sensing. Accepted in SampTA 2013 (10th International Conference on Sampling Theory and Applications), July 1st–5th, 2013, Bremen, Germany.

The reference [JDD13] given in the bibliography is an extended version of this original paper.


The only true wisdom is in knowing you know nothing.

— Socrates

It has long been an axiom of mine that the little things are infinitely the most important.

— Arthur Conan Doyle

ACKNOWLEDGEMENTS

First, I would like to thank Prof. Laurent Jacques, who has been extraordinarily available and has supported me from the beginning to the end of my thesis. He took the time and care to give me help and advice, and he allowed me to carry out my personal research and to achieve this much in my work. I consider myself very lucky to have had the opportunity to work with someone as friendly, passionate and talented as him.

Then, I would like to thank Prof. Christophe De Vleeschouwer, who gave me support and precious advice throughout my thesis, and for the opportunity he gave me. I am also thankful that my supervisors inspired me to work in the fascinating discipline of image and signal processing.

Furthermore, I would like to particularly thank Prof. David Bol for his interest in my work and his friendly support.

I also thank Loïc Neu for allowing me to work on part of his master's thesis subject.

Let me also thank Pawel J. Pankiewicz, a PhD student from Aalborg University, Denmark, for his interesting talk at the ISP seminar and the opportunity to discuss my research with him.

My thanks also go to Dr. Prasad Sudhakar and the doctoral and post-doctoral researchers of the ISP group for their warm welcome in the seminars and in the reading group about proximal algorithms.

Finally, I would like to thank Robin Lefrant for the help he gave me, and everyone that I have forgotten.


CONTENTS

i a breakthrough in sampling theory
1 introduction
  1.1 Context – a game of sensing
  1.2 Thesis plan – from incoherence to consistency
  1.3 Notations and conventions
2 state of the art
  2.1 Informative sensing
    2.1.1 Compression and best K-terms approximation
    2.1.2 Compressed sensing
  2.2 Reconstruction – guarantees and challenges
    2.2.1 Exact sparse solution and the restricted isometry property
    2.2.2 Basis pursuit
    2.2.3 Noisy CS and basis pursuit denoise
    2.2.4 Iterative hard thresholding
    2.2.5 Examples of sensing matrices
    2.2.6 Matlab benchmark framework
    2.2.7 Smile: CS works
  2.3 Gray, we need to quantize!
    2.3.1 Quantizers and entropy
    2.3.2 Quantized compressed sensing
    2.3.3 Binary compressed sensing
    2.3.4 BIHT and theoretical guarantees
ii a small step in quantized compressed sensing
3 the bridging algorithm
  3.1 Motivations
  3.2 Derivations and interpretations
    3.2.1 Inconsistency energy and QIHT algorithm
    3.2.2 Equivalence with BIHT when b = 1
    3.2.3 Convergence towards IHT when b → ∞
  3.3 Results and discussions
    3.3.1 Setting the µ parameter for QIHT
    3.3.2 Comparison of the performances of QIHT versus IHT and BPDN
    3.3.3 Entropy and classical compression
  3.4 RMPI QCS device – model and numerical validation
    3.4.1 Device presentation
    3.4.2 Model and idealization
    3.4.3 Results and discussions
    3.4.4 Not modeled non-idealities
iii everything comes to an end
4 further work and discussions
  4.1 Robustness against noise
  4.2 Quantized embeddings and formal proofs
  4.3 Minimizing inconsistency – exact optimization
    4.3.1 Motivations
    4.3.2 Proximal operators
    4.3.3 Quantized inverse problem
    4.3.4 Hints about a possible algorithm
  4.4 More applications
5 conclusions
  5.1 Objective and main results
  5.2 Summary
  5.3 Difficulties and solutions
  5.4 Further tracks
iv appendix
a explicit bounds for reconstruction
b distribution on the unit hyper-sphere
bibliography
Acronyms


Part I

A BREAKTHROUGH IN SAMPLING THEORY

Here we introduce the context, the purpose and the contents of this document. Then we describe the groundbreaking new way of acquiring signals around which this thesis was constructed: compressed sensing. The state of the art is described in three parts. First, the principles of compressed sensing are presented, then come the reconstruction methods and finally, quantization and the quantized compressed sensing framework are introduced.


1 INTRODUCTION

Winter is coming.

— Words of House Stark – A Song of Ice and Fire
George R. R. Martin [Mar96]

1.1 context – a game of sensing

From the most specialized fields in scientific applications (medical and space imaging, monitoring, analyses of all kinds, . . . ) to the more common and widespread fields of culture and entertainment (music, pictures, video, . . . ), not to mention the omnipresent telecommunication systems surrounding us, the ability to capture signals and manipulate them is of paramount importance.

The meaning of the word signal has to be understood here1 in its wide sense. A signal can be a function of time (e. g., an electric voltage signal, a sound, a temperature at a certain spot), space (e. g., an image, a temperature field at a given time), both (e. g., a video) or even more atypical variables like wavelength (e. g., hyper-spectral images). We will call these variables (time, spatial coordinates, wavelength, . . . ) the D dimensions of the ambient space, or simply the ambient dimensions of the signal. In general, we could consider any function R^D → R defined on a certain domain as a D-dimensional signal. Such signals can be continuous (e. g., the pressure intensity on the membrane of a microphone as a function of time) or discrete (e. g., the pixels of a digital picture)2.

Digital electronics and computer science allow us to treat, process and exploit an exponentially increasing amount of data. But before being able to use it, one needs a way to transform real-world signals into digital information. For decades, electronic devices have performed analog-to-digital conversion (ADC) using various techniques to sample the continuous signals and encode their values with bits.

1 as on my supervisor's blog Le petit chercheur illustré [Jac13].
2 We could even imagine a mixture of discrete and continuous dimensions: for example, the analog electric signal resulting from a grid of photo-diodes before sampling is continuous in time but discrete in space.

There are some drawbacks to the classical way of acquiring signals. First, in many cases the sampling rate has to be high so as to respect the Nyquist–Shannon sampling theorem, i. e., sampling at a frequency more than two times the bandwidth of the signal in the Fourier domain [Nyq28, Sha49]. Moreover, the encoding step, or quantization, has to be sufficiently precise to minimize the distortion in the digital signal (see Section 2.3). This often comes with a lot of complexity, at the expense of a high power consumption (e. g., videos in smartphones) or a slow acquisition (e. g., in computed tomography or magnetic resonance imaging).

On the other hand, once the signal has been imported into the digital world, we often want to compress it. Compression aims at eliminating redundancy in the signal in order to keep and store or transmit only useful information bits. According to information theory, the purpose of this step, called source coding, is to maximize the entropy per physical3 bit. This implies keeping the most important bits and throwing the rest (useless or insignificant) away. Compression thus requires additional work for the device, which means additional design complexity, power consumption and delay. The most common examples of compression standards are MP3 for music, JPEG for pictures and MPEG-4 for video.

This thesis subscribes to the field of compressed sensing, which was created in 2004 (though published in 2006) by Emmanuel Candès, Justin Romberg and Terence Tao [CRT06b] and by David Donoho [Don06] in order to overcome some of the aforementioned problems.

The idea of compressed sensing (CS), or compressive sampling, is to merge the steps of acquisition and compression (hence the name). Instead of sampling the signal itself, we take random projections of the signal with a special sensing device. By making these random projections (or measurements), we mix the information contained in the signal. Thanks to their random nature, every measurement taken is likely to be equally informative about the signal. From an information theory point of view, we have decreased their redundancy or increased their mutual information. One of the main results of [CRT06b] and [Don06] is that if the random projections are correctly built and respect a certain property, there is an overwhelming probability that the essence of the signal is captured with fewer measurements than the number of samples prescribed by Nyquist: we have compressed the signal.

3 The distinction is made with an information bit, which is an information measure of the uncertainty of a value. A physical bit is simply an object (signal, state, random variable) that can only take two values, most often identified by 0 or 1.

The question is now: how do we decompress the signal? How do we reconstruct it based on these random projections? We will see in Chapter 2 that we can use prior information on the natural structure of a particular class of signals to solve that problem. We can already mention that these priors are the same kind of structural priors that are used in classical compression: the signal sparsity (or sparseness) or compressibility in a certain basis or dictionary. The price to pay is that we can no longer recover the signal by simply interpolating the samples or even by solving a more general linear system: we have to solve a non-linear convex optimization program.
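To make this pipeline concrete, here is a minimal sketch (in Python rather than the Matlab used for the experiments of this thesis; dimensions, the random seed and the use of scipy's `linprog` are my own illustrative choices): a K-sparse vector is measured by a random Gaussian matrix, then recovered by casting the ℓ1-minimization (basis pursuit) program as a linear program.

```python
# Illustrative sketch, not thesis code: sense a K-sparse vector with
# M < N Gaussian measurements, then solve  min ||a||_1 s.t. Phi a = y
# as a linear program via the split a = u - v with u, v >= 0.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
N, M, K = 64, 32, 3                      # ambient dim, measurements, sparsity

alpha = np.zeros(N)                      # K-sparse ground truth
alpha[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
Phi = rng.standard_normal((M, N)) / np.sqrt(M)   # Gaussian sensing matrix
y = Phi @ alpha                          # the M random projections

# LP: minimize 1^T (u + v)  s.t.  [Phi, -Phi][u; v] = y,  u, v >= 0
c = np.ones(2 * N)
A_eq = np.hstack([Phi, -Phi])
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
alpha_hat = res.x[:N] - res.x[N:]

print(np.linalg.norm(alpha_hat - alpha))  # small when l1 recovery succeeds
```

With these (generous) dimensions, exact recovery is typical; the point of the sketch is only that a non-linear program, not a linear inverse, does the "decompression".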

In summary, we can state that the CS theory is a change in the paradigms of sensing with respect to classical data compression: we can build extremely simple, cheap, efficient, fast and low-power sensors, while we need a lot of computational power afterwards to reconstruct the signal.

After this major potential improvement on the sampling rate, research in compressed sensing is now focusing on the second mandatory step of digitalization: quantization. The measurements need to be quantized, i. e., coded in bits [GN98], in order to be stored or transmitted. In other words, the exact real values of the random projections must be mapped to a discrete set of 2^b values, b being the number of physical bits allocated to each measurement. This non-linear process introduces errors that have a non-negligible impact on the quality of the reconstructions, especially because quantization distortion is not well handled by usual reconstruction techniques [JHF11]. Some improvements can be made on these techniques to handle the effects of quantization, and this is precisely the main concern of my thesis.
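As a toy illustration of this mapping (my own sketch, not a quantizer from the thesis; the saturation range is an arbitrary assumption), a uniform b-bit quantizer rounds each measurement to the centre of one of 2^b equal bins:

```python
import numpy as np

def quantize(y, b, ymax=1.0):
    """Uniform b-bit quantizer on [-ymax, ymax): maps each value to
    the centre of one of 2**b equal bins (out-of-range values saturate)."""
    delta = 2 * ymax / 2**b                  # bin width
    idx = np.floor((y + ymax) / delta)       # bin index of each value
    idx = np.clip(idx, 0, 2**b - 1)          # saturate the extreme bins
    return -ymax + (idx + 0.5) * delta       # bin centre

rng = np.random.default_rng(1)
y = rng.uniform(-1, 1, 10000)                # stand-ins for measurements
for b in (1, 3, 8):
    err = quantize(y, b) - y
    print(b, err.std())                      # distortion shrinks with b
```

The b = 1 case keeps only the sign information of each measurement, which is exactly the extreme regime bridged by the algorithm of Chapter 3.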

1.2 thesis plan – from incoherence to consistency

The thesis is organized as follows.

In Chapter 2, a preliminary study of the state of the art is presented. There, we first introduce what, in the past few years, has become the discipline of compressed sensing [CRT06b, Don06]. In particular, we explain how the information content of a signal can be mixed so that every single measurement acquired by a CS-ready sensor equally brings its load of information. We will see the role played by randomness. In particular, it will be explained why and how we can sample by taking linear measurements at a rate way lower than that prescribed by the Nyquist criterion. The second section presents the reconstruction methods. We will see how, based on a priori hypotheses on the sparsity of the signal, we can build and solve a certain class of non-linear convex optimization problems [CRT06b, Don06]. These optimization problems become computationally intensive to solve as the size of the problem grows. We will see how it is possible to relax the problems by using greedy methods like the Iterative Hard Thresholding [BD09] algorithm. Finally, we will try to understand quantitatively and qualitatively the requirements of the sensing/reconstruction systems based on theoretical works from the literature [BDDW07, Can08] and on numerical experiments. The last section of Chapter 2 introduces a non-linear but necessary building block of every digital device: quantization. We will try to understand the different ways to digitize samples. In particular, we will see how quantization can be inserted in the framework of CS. We will explain the implications on what was previously described and try to provide a brief summary of what has already been done to handle it in the literature.

Chapter 3 is the main contribution of my work. It describes a greedy algorithm based on the IHT algorithm [BD09] and inspired by the Binary IHT [JLBB13] and by the signal de-clipping method based on IHT [KJM+13]. This greedy algorithm was called Quantized Iterative Hard Thresholding [JDD13]. It is designed to enforce consistency of the reconstructed signal with the quantized measurements by minimizing an inconsistency energy. We present the derivations of this energy and some interpretations. Then we compare it with two existing algorithms in the case of roughly quantized measurements (b from 1 to 5 bits) obtained by several quantization schemes, and discuss the results. We analyse the results from the perspective of information theory and compare with classical ADC with and without posterior compression. We also give some additional discussions about the parameters of the algorithm and some hints about a possible proof of convergence. Section 3.4 presents a device called RMPI (or multi-branch RD) studied in several papers [TLD+09, PAL13] and notably simulated by Loïc Neu, another master's student with whom I had the opportunity to collaborate. This device is modeled for several physical parameters and a few non-idealities. Results of the Matlab simulation of the model are discussed.

Chapter 4 presents possible further work on the topic, discusses the obtained results and their implications, and tries to give an insight into what could be done based on this work. Section 4.3, in particular, uses the inconsistency energy developed in Chapter 3 to define an optimization program. In order to solve this optimization program, we would like to use proximal methods [PB13, CP09, CP10], which are briefly introduced.

The conclusions are given in Chapter 5.


1.3 notations and conventions

This section presents some of the notations and conventions used in this document.

• Scalar quantities are noted with lower-case normal-font Greek or Latin letters, e. g., µ, a, x_i, . . .

• x, y, z, α and other lower-case bold Latin or Greek letters are (column) vectors. Their elements are denoted by (·)_i, e. g., (x)_i is the i-th entry of the vector x. It can also be noted x_i (since it is a scalar quantity) when there is no possible confusion. When a subscript denotes a particular vector from a certain set of several vectors, the subscript is indicated in bold font, as in x_k. In this case, the i-th entry of x_k is simply (x_k)_i.

• A, Φ and other upper-case bold Latin or Greek letters are matrices. The same principle is applied for subscripts. The first (normal-font) subscript denotes the row and the second the column: (A_k)_{i,j} is the entry at the i-th row and j-th column of the matrix A_k. It is also possible to use the appropriately indexed corresponding lower-case letter to indicate a (scalar) entry of a matrix, e. g., φ_{i,j} is the element (i, j) of the matrix Φ. We can also note a row or column (depending on the context) with a bold-font lower-case symbol, e. g., φ_i can denote the row i of Φ.

• A superscript in parentheses, as in x^(n+1), can be used to identify an iteration or, more generally, the index in a sequence. The superscript T denotes a matrix (complex conjugate) transpose, ∗ denotes a convex conjugate function and ⋆ generally denotes an objective of an optimization program.

• Capital (normal-font) letters (M, N, K, etc.) denote dimensions (or simply constant integers). The convention is generally that x ∈ R^N, y ∈ R^M and Φ ∈ R^{M×N}. Capital letters can also be used for random variables (e. g., Z_i, Φ_{i,k}, . . . ) or random vectors (e. g., X, Z, . . . ).

• Operators defined on scalar quantities are (unless specified otherwise) applied element-wise on vectors or matrices.

• The scalar product (or dot product or inner product) between two vectors is noted 〈x, y〉 = x^T y.

• The ℓ_p-norm of a vector u is defined as ‖u‖_p = (∑_i |u_i|^p)^(1/p) for p ≥ 1. The ℓ_2-norm is simply noted ‖u‖. We also define ‖u‖_∞ = max_i |u_i| and ‖u‖_0 = #supp(u), which is not a norm.

• A set of indexes is indicated by a calligraphic capital letter, e. g., I, and the restriction of a vector/matrix to the elements/columns of indexes I is noted (·)|_I or simply x_I (for a vector x). The set of all the indexes between 1 and N is noted [N]. Square brackets can also indicate a time index of a discrete sequence, e. g., x[n]. Note that we can have n ∈ [N], which simply means n ∈ {1, 2, . . . , N}.

• If #supp(x) ≤ K, where # means the set cardinality and supp(x) is the support of x, i. e., {i : x_i ≠ 0}, then we say that x is K-sparse.

• Σ_K^N, or simply Σ_K, is the set of all K-sparse vectors (from R^N).

• If #{i : |x_i| > ε‖x‖} ≤ K, we say that x is (K, ε)-compressible. The best K-terms approximation of x is x_K, i. e., (x_K)_i = x_i if |x_i| ≥ ε‖x‖ and (x_K)_i = 0 if |x_i| < ε‖x‖.

• MSE, or the Mean Squared Error, between two signals a and b ∈ R^N is defined as MSE(a, b) = (1/N) ∑_{i=1}^N (a_i − b_i)².

• PSNR, or Peak Signal-to-Noise Ratio, is defined in image processing as the square of the maximum possible value of a pixel p_max (e. g., 255 in 8-bit representation) divided by the mean squared error (MSE) between a reference image a and a tested image b, i. e., in decibels, PSNR_dB(a, b) = 20 log₁₀(p_max) − 10 log₁₀(MSE(a, b)).

• SNR, or Signal-to-Noise Ratio, is defined as SNR(a, b) = ‖a‖²/‖a − b‖², where a is the pure signal (without noise) and n = a − b is the noise. In decibels, the SNR is expressed as SNR_dB = 10 log₁₀(SNR).

• NSNR, or Normalized Signal-to-Noise Ratio, between two signals is defined as NSNR(a, b) = SNR(a/‖a‖, b/‖b‖) = ‖a/‖a‖ − b/‖b‖‖⁻². This normalized SNR measures the angular error between the vectors (i. e., it does not take their amplitudes into account). In decibels, the NSNR is expressed as NSNR_dB = 10 log₁₀(NSNR).

• N(µ, σ) is the normal (Gaussian) distribution with mean µ and standard deviation σ. U(a, b) is the uniform distribution between a and b. χ(N) is the Chi distribution with N degrees of freedom.
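The error measures above transcribe directly into code; the following sketch (a Python illustration of my own, while the thesis experiments themselves use Matlab) implements MSE, PSNR, SNR and NSNR exactly as just defined:

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two signals of length N."""
    return np.mean((a - b) ** 2)

def psnr_db(a, b, pmax=255.0):
    """PSNR in decibels: 20 log10(pmax) - 10 log10(MSE(a, b))."""
    return 20 * np.log10(pmax) - 10 * np.log10(mse(a, b))

def snr(a, b):
    """SNR = ||a||^2 / ||a - b||^2, with a the pure (noiseless) signal."""
    return np.sum(a ** 2) / np.sum((a - b) ** 2)

def snr_db(a, b):
    return 10 * np.log10(snr(a, b))

def nsnr_db(a, b):
    """NSNR in dB: SNR of the unit-normalized vectors, i.e. the
    angular error, invariant to the amplitudes of a and b."""
    an, bn = a / np.linalg.norm(a), b / np.linalg.norm(b)
    return -20 * np.log10(np.linalg.norm(an - bn))
```

For instance, with a = (3, 4) and b = (3, 0), MSE(a, b) = 8 and SNR(a, b) = 25/16; rescaling b leaves the NSNR unchanged.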

Page 21: Bridging 1-bit & High Resolution Quantized Compressed Sensing · In a nutshell, the compressed sensing (CS) [CRT06b,Don06] theory ... and MPEG-4 for video. This thesis subscribes

2 STATE OF THE ART

This is pure glass. You’re a damn artist!
This is art, Mr. White!

— Jesse Pinkman – Breaking Bad
Season 1 Episode 1: “Pilot”

2.1 informative sensing

2.1.1 Compression and best K−terms approximation

Data compression is, nowadays, a concept that is within the reach of anyone who is familiar with computers. In particular, we have all heard about compression standards like MP3, JPEG, MPEG and so on. But we are not necessarily familiar with the way they work in practice. Before talking about compressed sensing, a brief explanation is given about classical compression. This will allow comparisons and parallels between both paradigms.

In this work, for convenience, we will only consider signals that live on a finite domain. Mathematically, we consider that continuous signals are sampled at a rate allowing their perfect reconstruction, i. e., at more than two times their bandwidth. This does not correspond to reality, since one of the main interests of compressed sensing is the ability to sample below this rate. However, we can model reality with this assumption because of the equivalence (in terms of information) and the easy bijective transition between the (possibly approximated) continuous and the sampled signals. All the samples (no matter the number of ambient dimensions) can, afterwards, be gathered in a big vector of N elements. We will note the discrete equivalent of the input or target signal as:

x ∈ R^N. (1)

The number of elements in the vector is N; that is why we call it the dimension of the signal, even if the signal belongs to a D-dimensional ambient space. The most obvious example is a digital image: consider an original real-world (continuous and possibly filtered) image x(s, t). Its ambient dimensions are s (the width) and t (the height), so D = 2. Consider that this image can be sampled without loss on a grid of N = H×W pixels X ∈ R^{H×W}. The discrete equivalent of x(s, t) is then the vector x ∈ R^N where the real values associated with all the pixels of X are stacked together in a single column.

In order to compress such a signal, one wants to find a way to gather the information1 in a few coefficients. One way to do this is to express the signal in another (orthogonal) basis or in an over-complete dictionary Ψ. For simplicity, we will only talk about bases and consider them by default as orthogonal, but the results still hold for certain other bases and over-complete dictionaries. Some common examples (with discrete signals) are the Discrete Fourier Transform (DFT)2, its real counterpart the Discrete Cosine Transform (DCT), or the broad class of Discrete Wavelet Transforms (DWT). We will note

x = Ψα, (2)

where α is the vector of coefficients of the signal expressed in the basis. Here, note that since x lies in the direct domain (its ambient space) and α lies in the transform domain, Ψ is the inverse transform. The direct transform itself is noted Ψ^T and we have Ψ^T Ψ = Ψ Ψ^T = Id (since we only consider orthonormal bases). The main advantage of such a transform is that, if it is well chosen with respect to the structure of the original signal x, the energy of the transformed signal α is, as we wanted, (almost) entirely contained in only a small fraction of its elements. This brings us to the definitions of K-sparse and (K, ε)-compressible signals in a certain basis:

Definition 1. A signal x is K-sparse in a basis Ψ if the vector α defined as in Eq. 2 is K-sparse, i. e.,

#supp(α) ≤ K. (3)

K is then the maximum number of non-zero coefficients of α and is called the sparsity. We also note α ∈ Σ_K (or Σ_K^N if the dimension N is not obvious from the context), i. e., α belongs to the set of all K-sparse signals (in R^N).

1 We sometimes talk about energy, meaning the sum of squared amplitudes of the coefficients.
2 Note that the resulting signal will be, in that case, in C^N. For the sake of simplicity, we will limit our discussions to real signals. This is why we will avoid the use of the DFT. However, keep in mind that the discussions remain valid for complex signals (e. g., the complex envelope in telecommunications).


Definition 2. A signal x is (K, ε)-compressible in a basis Ψ if the vector α defined as in Eq. 2 is (K, ε)-compressible, i. e.,

#{i : |α_i| > ε‖α‖} ≤ K. (4)

In other words, K is the maximum number of coefficients of α that are bigger than ε‖α‖.

Note that a signal is said to be sparse when its sparsity K is sufficiently low compared to N. In the same way, a signal is said to be compressible3 if both ε‖α‖ and K are low compared respectively to max_i(α_i) and N. The simplest way to compress a signal, and it is basically what is done in most compression standards, is to keep its best K-terms approximation in an appropriate basis Ψ.

Definition 3. The best K−terms approximation xK of a signal x in a basis Ψ is defined as

xK = ΨαK (5)

with αK the best K−terms approximation of α (defined as in Eq. 2), i. e.,

(αK)i = αi if |αi| ≥ ε‖α‖,
(αK)i = 0 if |αi| < ε‖α‖, (6)

with ε defined such that exactly K terms of αK are non-zero (if α has at least K non-zero coefficients; otherwise αK = α). This is equivalent to

αK = argmin_{α′} ‖α − α′‖ s.t. ‖α′‖0 ≤ K. (7)

Note that xK is K−sparse and x is (K, ε)−compressible in Ψ by the definition of ε. If Ψ is an orthonormal basis, then we also have that

xK = Ψ argmin_{α′} ‖x − Ψα′‖ s.t. ‖α′‖0 ≤ K. (8)

In other words, in order to compress a signal (if we skip the details), we can compute its transform, throw the N − K smallest coefficients away and encode only the K greatest coefficients, e. g., by recording their positions in the signal and their amplitudes.
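This keep-only-the-K-largest-coefficients scheme is easy to sketch numerically. Below is a Python/NumPy illustration (the thesis' own experiments use Matlab); the function name best_k_term and the choice of the orthonormal DCT as the basis Ψ are assumptions made for this example only.

```python
import numpy as np
from scipy.fft import dct, idct  # orthonormal transform pair with norm="ortho"

def best_k_term(x, k):
    """Best k-term approximation of x in the DCT basis (Eqs. 5-7):
    transform, keep the k largest-magnitude coefficients, transform back."""
    alpha = dct(x, norm="ortho")            # alpha = Psi^T x
    alpha_k = np.zeros_like(alpha)
    keep = np.argsort(np.abs(alpha))[-k:]   # positions of the k largest |alpha_i|
    alpha_k[keep] = alpha[keep]             # the N - k smallest are thrown away
    return idct(alpha_k, norm="ortho")      # x_k = Psi alpha_k

# Sanity check on a signal that is exactly 5-sparse in the DCT basis:
rng = np.random.default_rng(0)
alpha_true = np.zeros(256)
alpha_true[rng.choice(256, 5, replace=False)] = rng.standard_normal(5)
x = idct(alpha_true, norm="ortho")
err = np.linalg.norm(x - best_k_term(x, 5))  # essentially zero: x_K = x here
```

Encoding then amounts to storing the k kept positions and their amplitudes, as described above.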

Figure 1 shows an example of a (K, ε)−compressible image (smile) and its best K−terms approximation in the 2-D Haar discrete wavelet

3 Other definitions are sometimes given. From [JV10], a vector α is termed compressible if its entries sorted in decreasing order of magnitude decay like a power law: |αk| ≤ c k^−b for some constants c > 0 and b > 1. It matches Definition 2 if ε‖α‖ = cK^−b. Another possibility is to define the set of compressible signals as C := {α : ‖α‖1/‖α‖2 ≤ √K}. We then have ΣK ⊊ C, but if |αk| ≤ c k^−b, then ‖α‖1/‖α‖2 ≅ (√(2b−1)/(b−1)) √N. This means that if b is such that √(2b−1)/(b−1) < √(K/N), then α ∈ C while supp(α) = [N].


[Figure panels: x = Ψα; α; xK = ΨαK; αK.]

Figure 1: An example of a (K, ε)−compressible signal in the Haar DWT and its K−sparse best K−terms approximation. Here, N = 4096, K = 512 and ε‖α‖ = 45.4 (while maxi(αi) > 10^4). The non-zero coefficients of the wavelet transforms are shown in absolute value and brightened for the illustration (a 0 corresponds to a black pixel).

transform. Note that the best K−terms approximation of xK is simply xK itself. Figure 2 illustrates the importance of the choice of the sparsity basis Ψ. It compares the DWT and the DCT with a common metric in image processing, the peak signal-to-noise ratio (PSNR) defined in Section 1.3.

The purpose is to see the progression of the quality of xK with respect to K, both qualitatively and quantitatively. We see that the image is more compressible in the DWT basis than in the DCT (the PSNR is higher for a fixed K). The image is even exactly K−sparse with K = 3630 in the DWT basis, while this is not the case in the DCT. A final remark is that even with N/K = 4, the quality of the image is already nearly perfect, especially with the DWT.


[Figure: PSNR curves for the Haar DWT and the DCT, from 10 dB up to 50 dB (“perfect”), for K ranging from 0 to N = 4096, with markers at K = 50, 512, 1024 and 3630.]

Figure 2: PSNR of the best K−terms approximation of the image smile with respect to K.


2.1.2 Compressed sensing

In compressed sensing, the goal is to capture a maximum of information about the signal in a minimum number of non-adaptive measurements. This means that we cannot adapt the sensing by selecting only the best K−terms. In a way, we want to compress the signal directly at the sensing stage, before knowing anything about it. The only available prior information is that the signal has a good chance of being sparse or compressible in a given basis.

The sensing device is assumed to provide linear measurements of the discrete equivalent of the signal, i. e.,

y = Φx (CS)

with the measurement vector y ∈ R^M and the so-called sensing matrix Φ ∈ R^{M×N}. If the sensing is done properly (we will see how hereinafter), the number of measurements M can be a lot smaller than N. However, since it is not as optimal as posterior compression, we can already state that M will need to be bigger than K to achieve the same quality4. In summary, the objective is to have

K < M ≪ N. (9)

A classical graphical “toy example” representation of this situation is

(a) Compressed sensing (b) Sparsity of x in Ψ

Figure 3: Toy example of a “typical” sensing scenario. The parameters are N = 16, M = 8 and x ∈ R^N is 3−sparse in Ψ ∈ R^{N×N}, which is the inverse DCT. The 3 non-zero entries of α ∈ R^N are generated randomly (exact 0s are highlighted in white). The matrix Φ ∈ R^{M×N} is Gaussian (see later in the text for explanation). The measurements obtained by the compressed sensing are contained in the vector y ∈ R^M.

shown in Figure 3. Here, N = 16, M = 8 and x is 3−sparse in Ψ,

4 We will see, however, in Section 2.3 that this has to be interpreted with caution if we consider that the K classically compressed coefficients, the M compressed sensing measurements and the N discrete-equivalent samples of the original signal are quantized with different resolutions.


which is the inverse DCT. One can already see that the usual methods for solving linear systems will not be useful here to recover x from y. Indeed, the set of M equations with N unknowns defined by the CS equation is largely under-determined. We will need non-linear methods to reconstruct x based on the knowledge of y and Φ, but also on the prior hypothesis that it is sparse (or compressible) in Ψ. Some of these methods are presented in the next section (2.2).
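The toy scenario of Figure 3 can be reproduced in a few lines. This is a Python/NumPy sketch under the same assumptions as the figure (N = 16, M = 8, x 3−sparse in the inverse DCT, Gaussian Φ); the variable names are ours, not the thesis'.

```python
import numpy as np
from scipy.fft import idct

rng = np.random.default_rng(1)
N, M, K = 16, 8, 3

# Psi: the inverse DCT as an explicit N x N orthonormal matrix
Psi = idct(np.eye(N), axis=0, norm="ortho")

alpha = np.zeros(N)                              # K-sparse coefficient vector
alpha[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
x = Psi @ alpha                                  # the signal, sparse in Psi

Phi = rng.standard_normal((M, N)) / np.sqrt(M)   # Gaussian sensing matrix
y = Phi @ x                                      # M = 8 linear measurements

# rank(Phi) = M < N: the linear system y = Phi x alone is under-determined
```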

As expressed by the CS equation, each measurement can be written as yi = ⟨φi, x⟩. We can emphasize the fact that, in practice, the measurements are done in an analog way (e. g., in optics, electronics, . . . ). Each measurement is, so to say, the result of the correlation of the signal with a particular waveform corresponding to φi. Some examples of CS devices that can achieve this are given in [Car13]. We can mention imagers like the Rice single-pixel camera [DDT+08] or the EPFL CMOS CS imager [JVB+09], but also very high frequency sampling hardware, magnetic resonance imaging (which was, according to E. Candès [Can11], one of the first motivations to develop the CS theory), seismic devices, ground-penetrating radar, biology (genetics), X-ray holography, X-ray astronomy cameras, radar and ECG.

What is important in the sensing process is that we do not want to waste measurements by capturing redundant information. Imagine a classical digital camera taking a picture of a night sky with a few bright stars. By the way, we can note that (assuming black is 0) this image is really sparse in Ψ = Id. As one pixel captures the dark color of the night sky, there is a big chance that the neighboring pixels will also capture that dark color, and most of the pixels will give the same information. This is redundancy, i. e., exactly what we do not want.

Indeed, in CS, we want to maximize the information that each measurement brings about the signal. If we reuse our example of the night sky, each CS measurement would provide a new hint about the location of the stars until we have enough clues and can deduce the position of each. Intuitively, in order to maximize the information brought, each measurement must be the result of a unique distinctive mixture of all (or a major part) of the sparse transform of the signal.

More formally, the sensing matrix Φ must be as incoherent as possible with the sparsity basis Ψ. Using our notations, the definition of coherence as given by [CW08] is the following:


Definition 4. The coherence between the sensing basis Φ and the representation5 basis Ψ is

µ(Φ,Ψ) = √N max_{1≤k,j≤N} |⟨φk,ψj⟩| (10)

Note that if both bases are orthonormal, µ(Φ,Ψ) ∈ [1, √N].

The coherence (still according to [CW08]) measures the largest correlation between any two elements (respectively rows and columns if they are matrices, or more generally elementary waveforms) of Φ and Ψ. We can mention some examples of incoherent bases in the continuous domain: a train of impulses (corresponding to the identity in the discrete equivalent) is maximally incoherent with the Fourier basis (µ = 1); the wavelet transform is incoherent with noiselets (µ = √2 for Haar), as are both the Dirac impulse train and Fourier. We can finally add an important example: random matrices Φ generated with independent and identically distributed (i.i.d.) entries, e. g., following a normal distribution N(0,σ), will be largely incoherent with any fixed representation matrix Ψ with high probability.

This last example means that the measurements obtained by projections on completely random vectors are very likely to bring non-redundant information about the signal and allow its reconstruction. In particular, the sensing device does not need to adapt to the signal at all. We could talk about a universal sensing matrix. The main drawback of such a matrix is that it leads to slow matrix-vector multiplications (and we will see that this implies a slow reconstruction). A lot of sensing strategies have been developed in the field. The strategies involving randomness in a way or another (random Fourier ensemble, random basis ensemble, random convolution, . . . ) often give comparable results with much less computation [JV10].

To summarize, whenever the sensing matrix (modeling the behavior of the sensor) is incoherent with the (arbitrary) sparsity basis, compressed sensing is an efficient way to capture information about the signal without the need to know it in advance. The signal is acquired at a rate M way lower than the Nyquist rate N, such that the ADC hardware can be much simpler, and no further compression is needed, so the DSP on the device is reduced to its minimum. This could mean a greater speed of acquisition, which can be crucial, e. g., in medical imaging (MRI) to reduce the risks for a patient, and a lower power consumption, which is more and more a concern of modern electronics for many reasons, such as the autonomy of portable devices or the ecological impact.

5 We talk about a representation basis instead of a sparsity basis because it applies to any transform, no matter if the resulting signal is sparse or not. This definition also applies to bases of signals in the continuous domain.


2.2 reconstruction – guarantees and challenges

The main idea of all the reconstruction methods is to suppose that x is sparse or compressible in Ψ and to find its representation α in this basis. Recovering x from α is then immediate from Eq. 2. In the following, for the sake of simplicity, we consider that x is K−sparse (or (K, ε)−compressible with small ε) in its direct domain. This is equivalent to considering Ψ = Id and α = x.

2.2.1 Exact sparse solution and the restricted isometry property

Let us state the problem. We want to find a sparse signal x knowing only y, Φ and the linear relation (CS) between them. If we can assume (and we will see how) that x is the sparsest signal (the one with the lowest K) that respects this relation, it will be the only solution of the following Exact Sparse (ES) problem6:

x*ES = argmin_u ‖u‖0 s.t. Φu = y, (ES)

with ‖u‖0 = #supp(u); that is, the solution x*ES is the sparsest signal u respecting Φu = y. We can prove that this problem gives the exact solution x*ES = x provided that the sensing matrix Φ respects the Restricted Isometry Property (RIP).

Definition 5. A matrix Φ ∈ R^{M×N} satisfies the RIP of order K < M, noted K−RIP or (K, δK)−RIP, if there is an isometry constant 0 ≤ δK < 1 such that for all K−sparse signals u ∈ Σ_K^N,

(1 − δK)‖u‖2^2 ≤ ‖Φu‖2^2 ≤ (1 + δK)‖u‖2^2 . (11)

If not explicitly defined, δK is the minimum value for which this property is respected by all u ∈ Σ_K^N.

6 Once and for all, if Ψ ≠ Id, the problem simply becomes

α*ES = argmin_u ‖u‖0 s.t. ΦΨu = y

and x*ES = Ψα*ES.


Theorem 1 (found in [JV10]). Assuming x is K−sparse and y is given by the CS equation, if Φ respects the RIP7 of order 2K, then x*ES = x, with x*ES given by ES.

Proof: The proof, as stated by [JV10], is very simple and gives a good insight into the usefulness of this property. Since x*ES is the solution of ES, ‖x*ES‖0 ≤ ‖x‖0 and ‖x − x*ES‖0 ≤ 2K. Since Φ is 2K−RIP, we have

(1 − δ2K)‖x − x*ES‖2^2 ≤ ‖Φx − Φx*ES‖2^2 = ‖y − y‖2^2 = 0 (12)

and since (1 − δ2K) > 0, we have x*ES = x.

It also means [CRT06b, CW08] that for any K ⊂ [N] such that #K = K, the restriction ΦK of Φ to any K of its columns has rank K, i. e., there is no K−sparse vector in the null space of Φ. It can also be noted that ΦK is nearly orthonormal.

Note that this property is linked with the incoherence of the sensing and representation bases. In particular, Theorem 1 of [CW08] tells us that if M ≥ C µ^2(Φ,Ψ) K log(N) for some constant C, then with overwhelming probability, x*ES = x (= x*BP). See [CR07] for details, and see the definition of x*BP below for the meaning of the second equality.

2.2.2 Basis pursuit

The problem with the ES minimization program is that it is hard to solve. Indeed, it is a combinatorial problem because of the pseudo `0−norm. We can, however, relax this problem by using the convex `1−norm instead. It is proven in [CRT06b, Can08] that under a slightly stronger assumption (see Theorem 2 below) the result will be equal to that of ES, and that it can be solved easily (e. g., with linear programming [CT05]) since it is now a convex problem. This first practical reconstruction method is called Basis Pursuit (BP) and was proposed by Chen, Donoho and Saunders in [CDS98].

x*BP = argmin_u ‖u‖1 s.t. Φu = y (BP)

The drawing in Figure 4, inspired by [JV10, Rom08], shows in R^2 why, intuitively, the results of ES and BP should match. x ∈ R^2 is 1−sparse. The constraint Φx = y defines a set of M hyperplanes; here, since M = 1, a single line. We see that the point where

7 Note that if Ψ ≠ Id and x is K−sparse in Ψ, the RIP must be satisfied by Θ = ΦΨ.


[Figure elements: x = x*ES = x*BP; x`2; the “`0−ball” (ΣK); the `1−ball; the `2−ball; the line Φu = y.]

Figure 4: Illustration of the BP optimality, inspired by [JV10, Rom08]. We see that ES and BP give the same result. Note how the use of the `2−norm (in gray) would have given a wrong result (x`2).

the smallest `1−ball intersects the line is also on the axis, i. e., on the “`0−ball” of “radius” K defined as

ΣK = {u | ‖u‖0 ≤ K}, (13)

i. e., the set of all K−sparse signals; hence the equivalence with ES.

In order to show that the `1 norm is the only convex norm that would give the same result as `0, the result of the minimization of the `2 norm, i. e.,

x*`2 = argmin_u ‖u‖2 s.t. Φu = y, (14)

known as the least squares solution, is also drawn (in gray).

Note that the limits of this 2-D interpretation can easily be seen, since the line also intersects the other axis, which also belongs to ΣK (so the solution of ES is not unique). In particular, note that M = K and that Φ, which is in this case a row vector, does not respect the RIP of order 2K 8. However, this illustrates the idea quite well.

2.2.3 Noisy CS and basis pursuit denoise

Now that we have a method to solve CS when x is exactly K−sparse, let us introduce a less ideal case. In practice, indeed, the measurements that we can obtain with a device are altered by many non-idealities. These can be modeled, in a simple first approximation (which

8 Take for example u orthogonal to Φ (which is possible because Σ2K = R^2); then ‖Φu‖2^2 = 0 and, with a sufficiently large ‖u‖2^2, the definition of δ2K is violated.


can be considered accurate in a lot of cases), by an additive Gaussian N(0,σn) noise n ∈ R^M on the measurements y. We call this framework noisy compressed sensing (NCS):

y = Φx + n (NCS)

Note that we do not say anything about the correlation between the elements of the noise vector n. Indeed, the noise could come from an additive white (i. e., uncorrelated) Gaussian noise (AWGN) n′ ∈ R^N on the original signal (or its discrete equivalent) x. Then, with n = Φn′, the resulting equivalent noise on y could appear slightly correlated. This is related to the phenomenon called noise folding: Φn′ actually amplifies n′ when M < N by “folding” it.

Theorem 1 does not guarantee anything in this framework. However, we can adapt BP in order to gain robustness against noise. The adapted program is called Basis Pursuit DeNoise and is given by

x*BPDN = argmin_u ‖u‖1 s.t. ‖Φu − y‖2 ≤ ε, (BPDN)

which is illustrated in Figure 5 in the same R^2 context as the previous figure. The idea is that we have to provide a bound on the amplitude of the noise, i. e., an ε such that ‖n‖2 ≤ ε. If we are able to do so, we ensure that the solution is within the tube defined by the constraint in BPDN. As the illustration suggests, the error ‖x − x*BPDN‖2 will be proportional to this bound.

[Figure elements: x; ΣK; the `1−ball; the tube ‖Φu − y‖2 ≤ ε of width ε; x*BPDN.]

Figure 5: Illustration of BPDN, inspired by [JV10].

In addition to the noise, we can consider that the signal x is compressible instead of exactly sparse, i. e., we will allow xK, the best K−terms approximation, to differ slightly from x. As was shown in the previous subsection, this corresponds much more to the reality,


especially when we target a low K. Even in this framework, there is a proof that BPDN will recover a bounded approximation of the solution. The theorem proven in [CRT06b, Can08] is the following.

Theorem 2 (from [CRT06b, Can08]). Assume that x is (K, ε)−compressible and y is given by the NCS equation with ‖n‖2 ≤ ε. If Φ respects the RIP of order 2K with an isometry constant δ2K < √2 − 1, then

‖x*BPDN − x‖2 ≤ C0 ‖x − xK‖1/√K + C1 ε, (15)

with x*BPDN given by BPDN. C0 and C1 (given explicitly in [Can08] and in Appendix A) are small constants that depend only on δ2K. For example, if δ2K = 0.2 then C0 < 4.2 and C1 < 8.5.

Remark that if ε = 0, i. e., there is no noise, the BPDN recovery program is exactly the same as BP and the reconstruction error is only bounded by the error of the best K−terms approximation multiplied by a small constant. In the noisy framework, i. e., when ε > 0, a second term, proportional to ε, appears.

The main issue with this type of method, which we could call exact minimization methods, is that they require the use of an exact solver, which can be very slow for large problems. Finding the unique solution of the convex optimization may indeed require exploring a large space. The complexity of the best solvers is not lower than O(N^3). This can be a real problem when the ambient dimension D of the signal to recover increases (which means an exponential increase of N). This is why researchers have also developed alternative greedy solutions. One of them is presented below.

2.2.4 Iterative hard thresholding

The so-called iterative hard thresholding (IHT) was proposed and studied for sparse recovery in [BD08], based on the ideas of [KR03, DDDM04], and then in [BD09] specifically for compressed sensing. It has multiple advantages. It is easy to interpret, its implementation is very simple and, as for BPDN, there are near-optimal theoretical guarantees on the quality of the reconstruction in the NCS framework, with slightly stronger assumptions on the sensing matrix; moreover, it converges in a number of iterations depending only on the noise. But the main advantage of this greedy algorithm is certainly its low memory requirement and low computational complexity, i. e., its speed. As we will see, it only requires one application of the sensing matrix (which can be the model of a fast transform algorithm) and one application of its adjoint per iteration.


The idea of IHT (and of other greedy algorithms such as matching pursuit [MZ93]) is to approximately minimize the following optimization problem (which is a variant of the LASSO9 formulation of ES):

x*IHT ≈ argmin_u (1/2)‖Φu − y‖^2 s.t. ‖u‖0 ≤ K. (16)

Like ES, finding the exact solution of (16) is a combinatorial problem. However, IHT relaxes the problem by searching for an approximate solution with the following iteration:

a(n+1) := x(n) + µΦT(y − Φx(n))
x(n+1) := HK(a(n+1)), (IHT)

with HK(x) the non-linear K-hard thresholding operator. It simply sets the N − K smallest elements of x to 0 (and keeps only the K biggest elements at a non-zero value). In words, the IHT iteration combines a gradient descent step on the quadratic objective of Eq. 16 (first line of IHT) with a hard thresholding operation HK. The step size µ is a parameter that controls the stability and the convergence speed. If µ is too large, the algorithm diverges. In particular, it is proven in [Blu12] that µ > 0 must satisfy µ‖Φ‖^2 < 1, with ‖Φ‖ := sup_{u:‖u‖=1} ‖Φu‖, to guarantee convergence. If it is too small, the algorithm converges too slowly. Usually it is fixed, but some works [BD10, Blu12] have been done to find a better (and adaptive) value of µ.

The algorithm must be initialized; here, we start with x(0) = 0. The stopping criterion is ‖x(n+1) − x(n)‖/‖x(n+1)‖ < tol, i. e., when the left-hand side goes below a certain tolerance, we consider that the algorithm has reached a fixed point and we stop. Remark that this criterion is independent of ‖x‖.

Note that here, we have to set an a priori sparsity K for x. This is a more restrictive framework than BP and BPDN. Indeed, such information is not always available in practice. However, there exist some variants of IHT that use a threshold on the value of the elements of a(n), or a parameter K(n) that starts from 1 and grows slowly with n.

9 Least absolute shrinkage and selection operator. This operator is very common in optimization and in particular in the field of machine learning. In the classical definition of LASSO, the “`0−norm” in the constraint is in fact replaced by the `1−norm, Φu by u and K by a real positive constant ν ∈ R+, i. e.,

x*LASSO = argmin_u (1/2)‖u − y‖^2 s.t. ‖u‖1 ≤ ν. (LASSO)

It then corresponds to the so-called soft thresholding (or shrinkage) operator Sλ(y) = (|y| − λ)+ sign(y), with λ the (fixed) multiplier of the equivalent Lagrangian formulation. Note that the LASSO problem with (1/2)‖Φu − y‖^2 as objective is related to BPDN.


As noted in [BD09], the complexity bottleneck of this algorithm is clearly the application of Φ and ΦT, the other operations being two vector additions and a partial ordering for the application of HK. In general, when Φ is full (e. g., the Gaussian sensing mentioned above), the complexity of one iteration is O(MN). However, instead of full sensing matrices we can sometimes use structured operators based on fast transforms like the FFT or FWT, which reduce the complexity to O(N logM) or even O(N).

IHT is guaranteed to converge to a local minimum of (16) under certain conditions on Φ. In particular, Theorem 2 and Corollary 3 from [BD09] give the following results.

Theorem 3 (Theorem 2 from [BD09]). In the NCS framework where x is (K, ε)−compressible, xK is its best K−terms approximation and ‖n‖2 = ‖Φx − y‖2 ≤ ε, if Φ respects the (3K, δ3K)−RIP with isometry constant δ3K < 1/√32, then, at iteration n, IHT will recover an approximation x(n) satisfying

‖x − x(n)‖2 ≤ 2^−n ‖xK‖2 + 6 ε̃K , (17)

where

ε̃K = ‖x − xK‖2 + (1/√K)‖x − xK‖1 + ε . (18)

Furthermore, after at most

n* = ⌈log2(‖xK‖2/ε̃K)⌉ (19)

iterations, IHT estimates x with accuracy

‖x − x(n*)‖2 ≤ 7 ε̃K (20)

Corollary 1 (Corollary 3 from [BD09]). Under the same assumptions as Theorem 3, except that x is exactly K−sparse, IHT will recover an approximation x(n) satisfying

‖x − x(n)‖2 ≤ 2^−n ‖x‖2 + 5ε . (21)

Furthermore, after at most

n* = ⌈log2(‖x‖2/ε)⌉ (22)

iterations, IHT estimates x with accuracy

‖x − x(n*)‖2 ≤ 6ε . (23)


As with BPDN, under some assumptions on the sensing scenario, the solution is guaranteed to tend to a close approximation of x, with a bounded error that depends on two terms: the best K−sparse approximation error ‖x − xK‖2 + (1/√K)‖x − xK‖1 and the amplitude of the sensing noise, bounded by ε. In addition, Lemma 6 from [BD09] could be used as a stopping criterion. It states that if δ3K < 1/√32, when we have found an approximation such that ‖y − Φx(n)‖2 ≤ ε(n), then ‖x − x(n)‖ ≤ 1.11 ε(n) + 2.41 ε̃K. We can therefore impose an arbitrary precision and stop the algorithm whenever the corresponding ε(n) is reached.

Let us finally mention that there exist a lot of other greedy algorithms and exact (convex or not) optimization formulations that are beyond the scope of this thesis. Some of them are given in [JV10]. Let us mention, in the optimization class: BPDN-TV [ROF92, CRT06a], BPDN+ [Zha08, KDXH09], LASSO [Tib96, VDBF08a], BPDQp [JHF09, JHF11], `q−BPDN [Cha07, CWB08], . . . And in the greedy class: MP [MZ93], OMP [PRK93], CoSaMP [NT09], SP [DM09], . . . IHT, which belongs to the latter, also has a lot of variants like NIHT [BD10], AIHT [Blu12], the signal-declipping method proposed by [KJM+13], BIHT [JLBB13], which is presented in the next section about quantization, and the last one, named QIHT, which we developed with my supervisors Prof. Laurent Jacques and Prof. Christophe De Vleeschouwer [JDD13] and is detailed in Chapter 3.

2.2.5 Examples of sensing matrices

According to [JV10], some of the criteria for finding a realistic and efficient sensing matrix are the existence of an analog model corresponding to this sensing, the availability of a fast implementation of the numerical sensing at the decoding stage, the storage of the sensing matrix in the sensor or/and in the decoder, the coherence between this sensing and the (sparsity) class of signals to be observed, . . . We give in Table 1 a few examples of usual sensing matrices for which there are proofs that their RIP constants hold with a certain high probability (more details can be found in [JV10]).

2.2.6 Matlab benchmark framework

In this subsection, we present the general framework setup that we used to test the different methods discussed. It was implemented in Matlab and designed to be versatile, i. e., capable of testing different methods under different conditions.


type | condition on M | universality
Random sub-Gaussian (Gaussian, Bernoulli, . . . ) | O(δK^−2 K ln(N/K)) | Yes
Random Fourier ensemble, Φ = SF | O(K ln^4 N), but O(K lnN) in practice (MRI) | No, Ψ = Id
Random basis ensemble, Φ = SU | O(µ(U,Ψ)^2 N K ln^4 N) | Depends on µ(U,Ψ)
Random convolution, Φ = SF^TΣF | O(K ln^5 N) | Yes
Other sensing strategies | . . . | . . .

Table 1: Examples of sensing matrices given in [JV10]. The column universality indicates if the condition on M is valid for every sparsity basis Ψ. S is the random selection matrix, F is the (fast) Fourier transform, U is any orthonormal basis and Σ is a complex diagonal matrix made of unit-amplitude diagonal elements with random phases (their generation respects a certain symmetry to guarantee a real measurement vector).
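As an example of a fast strategy from Table 1, a random convolution can be applied in O(N log N) with the FFT instead of storing a full M × N matrix. The sketch below (Python/NumPy) is a simplified stand-in: instead of drawing Σ as random unit-modulus phases, it convolves with a random real filter h, which also keeps the measurements real; it illustrates the structure rather than reproducing Table 1's operator exactly.

```python
import numpy as np

rng = np.random.default_rng(6)
N, M = 1024, 128

h = rng.standard_normal(N) / np.sqrt(N)   # random real filter (our stand-in for Sigma)
sel = rng.choice(N, M, replace=False)     # the random selection S: keep M entries

def sense(x):
    """Random-convolution sensing: circular convolution by h (via FFT), then subsample."""
    conv = np.fft.ifft(np.fft.fft(h) * np.fft.fft(x)).real
    return conv[sel]

x = rng.standard_normal(N)
y = sense(x)                              # M measurements, never forming Phi explicitly
```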

Evaluate the RIP

Before talking about the benchmark framework itself, let us present a method that has been proposed in order to empirically test the RIP of a given class of matrices. It is, in fact, a numerical Monte-Carlo estimation of the isometry constants δK, δ2K and δ3K of a matrix Φ ∈ R^{M×N}. The estimations are noted δ̂K, δ̂2K and δ̂3K. In the example shown, the method is applied to a Gaussian matrix, i. e., each entry is generated randomly from a normal distribution N(0, 1/√M). The setup is N = 1024, K = 16 and M varying from 4K to N. For each M, the matrix is randomly generated 400 times and each matrix is tested on 3 × 1000 test vectors u (1000 K−sparse, 1000 2K−sparse and 1000 3K−sparse). The maximum δ is then taken among the test vectors u. Finally, the mean and 4σ−interval are computed among the 400 matrices. The result is shown in Figure 6.

We provide no formal proof that the method is trustworthy, nor do we say quantitatively to what extent it is, but we give some intuitive justifications and possible tracks for future work in that direction.

In fact, it was proven in [TP12] that the computation of the isometry constant is (using their words) a strongly NP-hard problem.

• u is generated by selecting its support uniformly at random and generating the non-zero values with i.i.d. Gaussian N(0, 1). Because of the linearity, the amplitude of u has no influence on the δ found. This means that only the angle matters. The distribution of a Gaussian i.i.d. vector is isotropic, i. e., the distribution


of the angles of a Gaussian i.i.d. vector is uniform (see Appendix A for a definition of isotropy). This means that every possible direction for u has the same chance to be generated.

• Since the distribution of u is, so to say, uniform in the subspace of interest ΣK ∩ {u : ‖u‖ = 1}, we could make the assumption that the distribution of δ is well represented by a sufficiently large set. We must keep in mind that the real δ will always be higher, since it must hold for any signal. The important thing for that lower bound is to be as close as possible to the real value.

• There might be particular worst cases that give drastically higher δ, but if they happen with high probability, we will notice it. Otherwise, we suppose that we can neglect those very improbable cases since they will almost never happen in the sensing process.

• We could use the expressions “apparent” RIP and “apparent” isometry constant, since we only see the effect of the matrix on a tiny subset of all the possible sparse signals. Note that the size of ΣK increases dramatically fast with K (the selection of the K indices of the support among [N] alone counts C_N^K possibilities), which means that the portion of the space explored by the Monte-Carlo is a lot smaller when K increases (this is linked with the “curse of dimensionality” and in particular the “concentration of measure”).

• We could statistically estimate the confidence of this apparent isometry constant with hypothesis tests (like what is used, e.g., in machine learning to assess performance) by arguing that it holds for "a lot" of signals and matrix realizations, so it is reliable with a certain high probability.

• We see on Figure 6 that the constant seems to be the same for K, 2K and 3K. However, since ΣK ⊂ Σ2K ⊂ Σ3K, we have δK ≤ δ2K ≤ δ3K. In addition, Corollary 3.4 in [NT09] says that for an integer A, δAK ≤ A δ2K, which implies δ3K ≤ 3 δ2K. These relations suggest that the constants may not be equal in general.

Setup of the benchmark framework

In order to test the methods, we created an artificial benchmark in Matlab. It was constructed as follows. First, we generate a random set of S = 100 K-sparse (K = 16) samples xi of dimension N = 1024. They are sparse in the canonical basis, i.e., Ψ = Id. The support is selected uniformly at random and the 16 non-zero entries are generated following a normalized i.i.d.10 Gaussian distribution N(0, 1).

10 They are independent before normalization, but this is no longer true afterwards. See Appendix B for details about this distribution.
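For concreteness, this first step of the benchmark could be sketched as follows. The thesis experiments use Matlab; this is an equivalent pure-Python sketch (the function name sparse_sample is ours, and the normalization discussed in the footnote is ignored here):

```python
import random

def sparse_sample(N=1024, K=16, rng=random):
    """Draw a K-sparse vector of dimension N: support chosen uniformly
    at random, non-zero entries i.i.d. N(0, 1)."""
    x = [0.0] * N
    support = rng.sample(range(N), K)  # K distinct indices
    for i in support:
        x[i] = rng.gauss(0.0, 1.0)
    return x

x = sparse_sample(rng=random.Random(42))
nnz = sum(1 for v in x if v != 0.0)
print(nnz)  # 16
```

Drawing S = 100 such samples reproduces the test set described above.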


Figure 6: Numerical estimation of the isometry constants δK, δ2K and δ3K of a matrix Φ ∈ RM×N for which each entry is generated randomly from a normal distribution N(0, 1/√M). The setup is N = 1024, K = 16 and M varying from 4K to N. For each M, the matrix is randomly generated 400 times and each matrix is tested on 3 × 1000 test vectors u (1000 K-sparse, 1000 2K-sparse and 1000 3K-sparse). The maximum δ is taken among the test vectors u and then the mean and 4σ-confidence interval are computed among the 400 matrices. The green line labeled BPDN indicates the threshold below which δ2K must lie in order to satisfy Theorem 2. The red line labeled IHT is the threshold for δ3K in Theorem 3.
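The Monte-Carlo estimation of the "apparent" isometry constant described above can be sketched as follows (a pure-Python illustration with much smaller dimensions than in the figure; the function name estimate_delta is ours):

```python
import math
import random

def estimate_delta(Phi, K, trials, N, rng=random):
    """Apparent isometry constant: max over random unit-norm K-sparse
    test vectors u of | ||Phi u||^2 - ||u||^2 | = | ||Phi u||^2 - 1 |."""
    delta = 0.0
    for _ in range(trials):
        # unit-norm K-sparse test vector with Gaussian non-zero values
        support = rng.sample(range(N), K)
        vals = [rng.gauss(0.0, 1.0) for _ in support]
        norm = math.sqrt(sum(v * v for v in vals))
        u = dict(zip(support, (v / norm for v in vals)))
        # Phi u, touching only the support (Phi stored as a list of rows)
        y = [sum(row[i] * ui for i, ui in u.items()) for row in Phi]
        delta = max(delta, abs(sum(v * v for v in y) - 1.0))
    return delta

rng = random.Random(0)
N, M, K = 128, 96, 4
# entries N(0, 1/sqrt(M)), as in Figure 6
Phi = [[rng.gauss(0.0, 1.0 / math.sqrt(M)) for _ in range(N)] for _ in range(M)]
d = estimate_delta(Phi, K, trials=200, N=N, rng=rng)
print(0.0 < d < 1.0)
```

As argued above, this only yields a lower bound on the true δK, since the test vectors explore a tiny fraction of ΣK.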

Note that the expectation of ‖xi‖² is then K.

Then we define m ∈ {1, . . . , 16} such that M := mK and, for each m, we generate S = 100 sensing matrices Φm,i ∈ RM×N. Each entry is generated at random following a normalized i.i.d. Gaussian distribution N(0, 1). This matrix clearly does not strictly follow the RIP as defined in Definition 5. However, it can be shown that the results of Theorems 1, 2 and 3 still hold since there is a constant C such that the scaled matrix CΦm,i follows the RIP with high probability. In this case, the constant is C = 1/√M. We apply each matrix Φm,i to the sample xi for all i ∈ [S] to form the noiseless measurement vectors

y0m,i = Φm,i xi (24)

The final step before testing the reconstruction is to generate the noise. In order to do so, from a determined input SNR (see Section 1.3 for the definition), SNRin ∈ {0 dB, 10 dB, 20 dB, 30 dB, ∞}, we generate a random vector nSNRin m,i ∈ RM with each entry following the normal distribution N(0, σn). The variance is computed as

σn² = ‖y0m,i‖² / (SNRin M) (25)
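This noise-generation step can be sketched as follows (our own helper; we assume here that the input SNR is given in dB and converted to the linear ratio SNRin used in (25)):

```python
import math
import random

def add_noise(y0, snr_db, rng=random):
    """Add white Gaussian noise at a prescribed input SNR (in dB),
    with sigma_n^2 = ||y0||^2 / (SNR_in * M), SNR_in the linear ratio."""
    M = len(y0)
    if snr_db == math.inf:          # noiseless case: n = 0
        return list(y0)
    snr_lin = 10.0 ** (snr_db / 10.0)
    sigma_n = math.sqrt(sum(v * v for v in y0) / (snr_lin * M))
    return [v + rng.gauss(0.0, sigma_n) for v in y0]

rng = random.Random(1)
y0 = [rng.gauss(0.0, 1.0) for _ in range(1000)]
y = add_noise(y0, 20.0, rng)
err = sum((a - b) ** 2 for a, b in zip(y, y0))
sig = sum(v * v for v in y0)
# empirical SNR should be close to the requested 20 dB
print(abs(10 * math.log10(sig / err) - 20.0) < 1.0)
```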

Note that the noiseless CS case is treated as a noisy case of infinite SNR, i.e., n∞m,i = 0. For each m, we try to recover as many xi as possible based on the measurement vectors ym,i and the sensing matrices Φm,i, using either BPDN or IHT.

For BPDN, we use the function spg_bpdn from the spgl1 toolbox [vdBF07, vdBF08b], a BPDN11 "solver for large-scale sparse reconstruction". The ε parameter is set as ε² = σn² M, which is the expectation of the squared norm of the noise ‖nSNRin m,i‖². We assume that we know this average noise power.

For IHT, we implemented the iteration defined above (IHT). The parameters of IHT are defined as follows:

µ := (1/M) (1 − min(√(cK/M), η)), (26)

with c an arbitrary constant bigger than 1 (here c = 3) and η a threshold slightly smaller than 1 (in order to avoid µ ≤ 0, here η = 0.9). This heuristic definition of µ was found during the development of QIHT [JDD13] and is discussed in Chapter 3. The stopping criterion is ‖x(n+1) − x(n)‖ ‖x(n+1)‖−1 < 10−6, or the number of iterations exceeding 1000. The initialization is x(0) = 0.
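As an illustration of the iteration and of the heuristic step size (26), a minimal pure-Python sketch of IHT could look as follows (the thesis experiments use Matlab; the function names and the small test problem are ours):

```python
import math
import random

def hard_threshold(a, K):
    """H_K: keep the K largest-magnitude entries, zero out the rest."""
    keep = sorted(range(len(a)), key=lambda i: -abs(a[i]))[:K]
    out = [0.0] * len(a)
    for i in keep:
        out[i] = a[i]
    return out

def iht(Phi, y, K, c=3.0, eta=0.9, max_iter=1000, tol=1e-6):
    """Iterative hard thresholding with the heuristic step of (26):
    mu = (1/M) * (1 - min(sqrt(c*K/M), eta)), starting from x = 0."""
    M, N = len(Phi), len(Phi[0])
    mu = (1.0 / M) * (1.0 - min(math.sqrt(c * K / M), eta))
    x = [0.0] * N
    for _ in range(max_iter):
        # gradient step: a = x + mu * Phi^T (y - Phi x)
        r = [yi - sum(Phi[i][j] * x[j] for j in range(N)) for i, yi in enumerate(y)]
        a = [x[j] + mu * sum(Phi[i][j] * r[i] for i in range(M)) for j in range(N)]
        x_new = hard_threshold(a, K)
        diff = math.sqrt(sum((u - v) ** 2 for u, v in zip(x_new, x)))
        norm = math.sqrt(sum(v * v for v in x_new)) or 1.0
        x = x_new
        if diff / norm < tol:   # stopping criterion of the benchmark
            break
    return x

rng = random.Random(0)
N, M, K = 64, 48, 2
Phi = [[rng.gauss(0.0, 1.0) for _ in range(N)] for _ in range(M)]
x_true = [0.0] * N
x_true[5], x_true[40] = 1.0, -1.0
y = [sum(row[j] * x_true[j] for j in range(N)) for row in Phi]
x_hat = iht(Phi, y, K)
err = math.sqrt(sum((a - b) ** 2 for a, b in zip(x_hat, x_true)))
print(sum(1 for v in x_hat if v != 0.0) <= K)  # True: the iterate is K-sparse
```

Note that with M = 48 noiseless measurements of a 2-sparse signal this sketch typically recovers x exactly, consistent with the transition observed in Figure 7.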

In the sequel, where the results are presented, we deliberately drop the subscripts and superscripts in order to keep notations light. The reconstructed samples are simply noted x⋆BPDN and x⋆IHT, or simply x⋆ when the context applies to both.

Results

The performances of both algorithms are presented hereafter. They are measured in terms of SNR(x, x⋆) for each sample i, in each configuration (M, SNRin), and for both algorithms.

The first graph (Figure 7) compares BPDN and IHT in the noiseless CS case. We can clearly see an abrupt transition around M = 4K to 6K: the reconstruction seems almost impossible below this threshold while above, i.e., for any compression rate above M/N ≈ 10%,

11 It can also solve the LASSO formulation argminu (1/2)‖y − Φu‖2² + λ‖u‖1, not discussed here.


Figure 7: Comparison of BPDN and IHT in the noiseless CS framework. The dotted vertical line marks where δ̂2K < √2 − 1 (Theorem 2, BPDN); the arrow indicates the off-scale threshold δ̂3K < 1/√32 (Theorem 3, IHT).

it is perfect (except for extremely small numerical rounding errors). Furthermore, we can observe a rather unexpected result: the performances of IHT seem to be (if not exactly equivalent) slightly better than BPDN, and the transition occurs slightly earlier. This seems counter-intuitive because of the results of Theorems 2 and 3, but it highlights the fact that they give only sufficient conditions and do not say what happens under the threshold (particularly Theorem 3). The vertical line (dotted blue) is drawn based on the previous numerical computation of the isometry constants δ̂2K and δ̂3K. The line agrees with the conclusions of Theorem 2: the reconstruction with BPDN (BP in this case) is perfect as soon as δ2K < √2 − 1. The rightmost bound, corresponding to Theorem 3, is indicated by an arrow because, according to the estimation of δ3K, it should be drawn at M = 767, i.e., ≈ 50K. That bound obviously agrees with the theorem since after it, the reconstruction of IHT is perfect. However, we see the sub-optimality of that theorem. Indeed, the abrupt transition occurs already at 6K and the reconstruction is already perfect at 8K, which is much smaller. We must also emphasize the fact that, most probably, the estimations δ̂2K and δ̂3K are too optimistic, i.e., they are only lower bounds δ̂2K < δ2K and δ̂3K < δ3K, which means that in reality the lines (the blue BPDN line and the not-drawn "red" IHT line) should be placed farther to the right.

On the two next graphs (Figure 8), the results are shown for an input SNR (white noise on the measurements) from 0 to 30 dB. We can see that the transition between "impossible reconstruction" and "good reconstruction" is less abrupt than without noise. We clearly see that the performances of IHT are the best in this framework. This


Figure 8: Comparison of BPDN and IHT in the noisy compressed sensing (NCS) framework for several input SNR (0, 10, 20 and 30 dB). (a) BPDN; (b) IHT.

fact could seem strange because IHT is supposed to be sub-optimal due to its greedy nature. It may be explained by the fact that IHT uses an important piece of information that BPDN (in this formulation) cannot use, i.e., the sparsity K. The reason could also be linked with the optimistic nature of the estimation of ε, which is supposed to be an upper bound on ‖n‖ and not its expectation. In any case, in this setup, IHT is clearly advantageous.


Figure 9: Comparison of BPDN and IHT in terms of execution time (seconds per sample, logarithmic scale) for several input SNR; the IHT curves for all noise levels nearly coincide ("IHT (all)").

The last graph (Figure 9) does nothing but confirm the conclusion that IHT must be preferred (in this setup). Indeed, it presents the average time (in seconds per sample, over the 100 trials) that each reconstruction scenario took on a common laptop (Intel® Core™ 2 Duo CPU P7350 @ 2.00 GHz). IHT is faster than BPDN almost everywhere, and in every configuration in the noiseless and low-noise cases. Moreover, in high noise, it is in any case faster from M ≈ 150. Another advantage is that the speed of IHT is much more constant than that of BPDN when the noise power varies. The Matlab implementation of IHT exploits the fact that x(n) is extremely sparse (in any iteration). This way, the first matrix-vector product Φx(n) is extremely fast (O(KM)).
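The O(KM) trick mentioned above simply consists in touching only the K non-zero entries of x when forming Φx. A minimal sketch (our own helper name) is:

```python
def sparse_matvec(Phi, x_support):
    """Compute Phi @ x touching only the non-zero entries of x,
    i.e., O(KM) instead of O(NM). x_support maps index -> value."""
    M = len(Phi)
    return [sum(Phi[i][j] * v for j, v in x_support.items()) for i in range(M)]

Phi = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
# x = (0, 0, 10): only one non-zero entry is visited per row
print(sparse_matvec(Phi, {2: 10.0}))  # [30.0, 60.0]
```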

2.2.7 Smile: CS works

A last experiment has been conducted to qualitatively see the effect of compressed sensing on a real image. The same image as in Section 2.1, i.e., smile, has been used. We tested two different cases. First, the DWT transform of the sensed signal was truncated to its best K terms with K = 512 in order to produce an artificial K-sparse signal. The signal is sensed with an M × N Gaussian matrix (like in the benchmark framework described in the previous subsection) with M = 2704, i.e., ≈ 5.3K, and then reconstructed using the fact that it is exactly K = 512-sparse in the DWT basis, i.e., using Θ = ΦΨ instead of simply Φ in IHT (with Ψ the inverse DWT). The result shown on Figure 10 witnesses the fact that this number of measurements is sufficient to


Figure 10: Sensing and reconstruction with IHT of the best K-terms approximation (in the DWT) of the 64 × 64 px image smile. With N = 4096, K = 512 and M = 2704.

obtain an (almost) perfect reconstruction (SNR(x, x⋆) = 63.5 dB).

The second case consists in sensing the original signal which is, as noted in the previous section, a (512, ε)-compressible signal (with ε‖α‖ = 45.4 while maxi(αi) > 10⁴). The result of the reconstruction assuming K = 512 is shown on Figure 11. The reconstruction is not as perfect as previously because there is an error between xK and x. However, the reconstruction error (SNR(x, x⋆) = 19.9 dB) is coherent with the bound given in Theorem 3.


Figure 11: Sensing and reconstruction with IHT of the 64 × 64 px image smile, which is (512, ε)-compressible in the DWT. With N = 4096, K = 512 and M = 2704.

2.3 gray, we need to quantize!

As we saw in the previous section, reconstruction after compressed sensing works when we are able to provide M possibly slightly noisy measurements y, with M sufficiently big with respect to K and to (a power of) the logarithm of N/K. However, before even thinking about reconstruction, if we want to be able to store or transmit the measurements, we must digitize the information acquired in the analog sensing stage. One of the goals of compressed sensing is to enhance the efficiency of sensors in terms of speed, power consumption and precision (information rate). In sampling processes, like compressed sensing but also classical Nyquist-rate analog to digital conversion (ADC), the digitization, most often called quantization in the field of signal processing, is clearly the bottleneck of the device for these three efficiency parameters. Indeed, with today's technology, when designing a sensor, electronic designers have to make trade-offs between power and speed, speed and precision, or power and precision. As we have seen, compressed sensing can drastically decrease the acquisition rate by increasing the information content of each measurement. It means that the trade-offs can be much more relaxed than in classical ADC, e.g., achieve the same speed with much less power, achieve a more efficient bitrate with the same power, or be able to sense extremely high dimensional data that were out of reach with reasonable power. But this assumes that we master the effect of quantization, which was not yet addressed in this thesis. This is why, before presenting the reconstruction algorithm developed during my work, we have to study these effects and give a brief overview of what has been done in compressed sensing.

2.3.1 Quantizers and entropy

Let us first describe what quantization is and how it is done. The history, fundamentals and principles of quantization are described in detail in the reference article by R. Gray and D. Neuhoff [GN98]. There exist several types of quantization paradigms: scalar or vector quantization, uniform or non-uniform, variable or fixed rate coding, etc. In this thesis, we only consider (uniform and non-uniform) fixed-rate scalar quantization. It means that each measurement yi will be quantized independently with a quantizer Qb, defined as follows.

Definition 6. A b-bit quantizer Qb is a mapping Qb : R → {ω1, . . . , ω2^b} : a ↦ Qb(a) = ωj such that a ∈ Rj, with Rj = ]τj, τj+1], ∀j ∈ {1, 2, . . . , 2^b}. The τj ∈ R ∪ {±∞} are the 2^b + 1 quantization thresholds, the Rj = ]τj, τj+1] are the 2^b quantization bins and the ωj ∈ Rj are the 2^b quantization levels. If a can take any real value, the domain of the quantizer is the whole real line, such that −τ1 = τ2^b+1 = ∞. When a is outside ]τ2, τ2^b], we say that the quantizer is saturated.

Assuming that the ωj are coded with a fixed-length code, it is easy to see that, in order to represent each possible output value, we need b bits, hence the name b-bit quantizer. In electronics, most quantizers are designed to be uniform because they are easier to build. It means that τj+1 = τj + ∆, ∀j ∈ {2, . . . , 2^b − 1}, with a fixed quantization step ∆. It is then assumed that the ωj are centered in the bins (except in the saturated bins, where we generally assume that ω1 = τ2 − ∆/2 and ω2^b = τ2^b + ∆/2). It has been shown in [GN98] that for high resolution (i.e., high b) quantization, and provided that a ∈ [τ2 − ∆, τ2^b + ∆], the quantization error of such a uniform quantizer, eq = a − Qb(a), follows a uniform distribution U(−∆/2, ∆/2).
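A minimal sketch of such a b-bit uniform quantizer (our own helper, with the levels centered in the bins and the extreme levels clamped as above) could be:

```python
import math

def uniform_quantizer(a, b, delta):
    """b-bit uniform quantizer centered on 0: 2^b half-open bins
    ]tau_j, tau_{j+1}] of width delta, levels at bin centers; inputs
    beyond the saturation thresholds map to the extreme levels."""
    half = 2 ** (b - 1)
    j = math.ceil(a / delta)          # index of the bin containing a
    j = max(1 - half, min(half, j))   # saturation of the two end bins
    return (j - 0.5) * delta          # bin-center level

print(uniform_quantizer(0.30, 3, 0.25))  # 0.375 (bin ]0.25, 0.5])
print(uniform_quantizer(9.90, 3, 0.25))  # 0.875 (saturated top level)
```

In the non-saturated region the output error is indeed at most ∆/2 in magnitude.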

However, there exist other practical ways to fix the quantization thresholds and levels. For example, if the objective is to minimize the output distortion knowing the probability distribution of the input, we can use the Lloyd-Max optimization (see [GN98] for details). Based on probability distributions, we can also set the thresholds such that every bin Rj has the same probability pj of being chosen, i.e., maximize the entropy (H = −∑j pj log2 pj) per bit.

As a general rule for practical analog to digital converters, we can say that the smaller the number of bits b, the lower the power consumption or the faster the quantization, but the bigger the output error.

In this work, the input value z of the quantizer is assumed to follow a Gaussian distribution N(0, σz). Thanks to the central limit theorem, this assumption is almost always (at least approximately) verified in compressed sensing. Let us introduce, through a small example, the three types of quantizers mentioned above, namely

• uniform quantizers, which are the most common form of quantizers in electronic ADCs;

• Lloyd-Max quantizers, which are constructed with the intent to minimize the distortion of the quantized signal with respect to the original;

• maximum entropy quantizers, which are built in order to distribute the samples with equal probability among the quantization bins.

We try to show the differences between those quantizers. Figure 12 shows the relation between input and output of these three kinds of quantizers. Notice the typical "stairs" shape of these relations, which is induced by the fact that the output can only take a small finite number of possible values: the levels ωj. The number of bits allowed per quantized value is b = 3, which means that there are 2^b = 8 possible bins Rj, so 8 levels ωj (the values taken by the output) and 9 thresholds τj (with −τ1 = τ9 = ∞). Figure 12a shows the regular uniform quantizer. It has been scaled so that the probability of saturation (of a Gaussian input z) is small (4%). The resulting bin probabilities are also indicated above each positive bin (the situation is obviously symmetric for negative values). These probabilities give us the entropy of the quantizer (for this particular input): H(Qb(z)) = 2.59. We see that, since the distribution between the bins is not equiprobable, the entropy is lower than b = 3. It means that, from a statistical point of view, the output of that quantizer gives on average 2.59 bits of information per quantized value but uses 3 physical bits to describe them. It also means that we could use entropy coding on the output bit stream to reduce the number of physical bits12.
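The entropy computation used throughout this discussion is a one-liner; as a sketch (our own helper, fed with the rounded bin probabilities displayed in Figure 12a):

```python
import math

def entropy(probabilities):
    """H = -sum_j p_j log2(p_j), in bits; empty bins contribute 0."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# 8 equiprobable bins -> the maximum entropy b = 3 bits
print(entropy([1 / 8] * 8))  # 3.0

# the uniform quantizer of Figure 12a, symmetric bin probabilities
p = [.02, .07, .16, .25, .25, .16, .07, .02]
print(round(entropy(p), 2))  # 2.61, close to the text's 2.59
# (the small gap comes from the displayed probabilities being rounded)
```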

In contrast, the second quantizer is built in order to distribute the quantized values among all bins with equal probability. Since there are 8 bins, the probability of being quantized in each of them is

12 This would imply adding a DSP module after quantization, increasing the electronic constraints (power, area, speed. . . ).


approximately 13%. The levels ωj are set to the posterior expectation of the input value, i.e.,

ωj = E(z | Qb(z) ∈ Rj) = (∫_{τj}^{τj+1} z fZ(z) dz) / (∫_{τj}^{τj+1} fZ(z) dz), (27)

with fZ(z) the normal PDF. We note that since all the bin probabilities are equal, the entropy is maximum, i.e., H(Qb(z)) = b = 3.
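For the Gaussian input assumed here, both integrals in (27) have closed forms: since d/dz[−fZ(z)] = z fZ(z) for the standard normal, the numerator over a bin equals fZ(τj) − fZ(τj+1), and the denominator is a difference of CDF values. A sketch (our own helper names; a large finite threshold stands in for +∞):

```python
import math

def normal_pdf(t):
    return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)

def normal_cdf(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def level(tau_lo, tau_hi):
    """Posterior mean E(z | z in ]tau_lo, tau_hi]) for z ~ N(0, 1):
    Eq. (27) reduces to
    (pdf(tau_lo) - pdf(tau_hi)) / (cdf(tau_hi) - cdf(tau_lo))."""
    return (normal_pdf(tau_lo) - normal_pdf(tau_hi)) / \
           (normal_cdf(tau_hi) - normal_cdf(tau_lo))

# level of the whole positive half-line: E(z | z > 0) = sqrt(2/pi)
print(round(level(0.0, 50.0), 4))  # 0.7979
```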

The purpose of the third quantizer is to minimize the distortion of the output signal. This first figure does not show that optimality. However, we note that, though it is not maximum, the entropy H(Qb(z)) = 2.82 is quite high (higher than that of the uniform quantizer).

The next figure (Figure 13), on the other hand, shows that the Lloyd-Max quantizer is better than the others in terms of NSNR, which is a way of measuring the fidelity of the output signal to the input signal (i.e., in a way, an inverse of the distortion).13 The figure is obtained with a simple Monte-Carlo simulation with 100 test signals z ∈ RN with N = 1024. They are quantized with b = 1 to 5 bits and the NSNR between z and Qb(z) is averaged over the 100 trials.

2.3.2 Quantized compressed sensing

Quantization is fundamentally non-linear and necessarily "adds" noise to the measurements. Indeed, one way to model the error induced by the quantization Qb is to consider it as an additive noise, i.e.,

y = Qb(Φx) = Φx + nq. (QCS)

This new framework is called quantized compressed sensing (QCS). However, there is a big difference with the previously defined noisy compressed sensing (NCS): the statistical distribution of nq can no longer be considered Gaussian, particularly when the quantization is rough.

The fact that each measurement is encoded with a certain number of bits allows us to compare several quantization scenarios. When we allow a small number of bits per measurement, the information brought by each of them is small as well. But as we emphasized previously, a rough quantization is a lot cheaper than acquiring precise high resolution measurements. This is why we can afford to acquire more measurements when quantization is less precise. Now, in fact,

13 The description and advantages (over the SNR used in the previous section) of the NSNR are explained later in the text. Let us just mention that the higher the NSNR, the better the quantized signal (just like the SNR).


Figure 12: Input-output relation of the quantizers adapted to a normal distribution N(0, 1), with b = 3 bits per quantized quantity: (a) uniform with pclip = .02 and H(Qb(z)) = 2.59; (b) maximum entropy with pclip = .13 and H(Qb(z)) = 3.00; (c) Lloyd-Max with pclip = .04 and H(Qb(z)) = 2.82. The probabilities noted above the positive bins correspond to the probabilities of z being quantized in the corresponding bins. The probability pclip of being in the bin 2^b is half the probability of saturation, i.e., the probability of being in either the bin 1 or the bin 2^b. The Gaussian distribution is drawn in dashed gray with the partitioning thresholds.

it makes sense to acquire more (coarsely quantized) samples than the number of samples prescribed by Nyquist, i.e., if b is small, it might be advantageous to set M > N. Of course, because of the non-linear quantization step, we are still not able to reconstruct by simply solving a linear system. In order to compare quantization scenarios,


Figure 13: NSNR between z and Qb(z) as a function of b, for the uniform, maximum entropy and Lloyd-Max quantizers. It is the result of a simple Monte-Carlo experiment (average over 100 trials) on z ∈ RN with N = 1024 and each entry generated as N(0, 1).

we can compare the achieved reconstruction quality for a fixed total number of bits (i.e., B = bM). We can even go further and estimate the bit rate in terms of information theory, i.e., M multiplied by H, the entropy of the measurements as viewed by the sensor, by considering their probability distribution in each bin. This bit rate is the optimal rate that could be achieved using entropy coding on the bit stream of the quantized measurements. These considerations are taken into account in Chapter 3, where a greedy QCS reconstruction algorithm is presented.

A lot of work has been done recently in the framework of QCS. An ICASSP tutorial on that topic was recently given in Vancouver [BJ13]. Advances have been made particularly in high resolution QCS and in ultra-low resolution, i.e., the so-called 1-bit compressed sensing.

First, in high resolution QCS, we can cite [DM11], where nq has been considered as an additive Gaussian noise (i.e., using BPDN), and variants such as [LBDB11] where saturation is handled, [ZBC10] where pre-quantization noise is considered, [JHF09, JHF11, JHF12] in which an ℓp-norm constraint with p > 2 is used, [SG09] where the quantization procedure is optimized, and finally [GLP+13] where another type of quantization (Σ∆), not discussed here, is considered.

Second, a lot of references about 1-bit compressed sensing can be found in [Las12], for example the introductory paper [BB08], but also reconstruction methods in [JLBB13, PV11, PV13]. In the following subsection, we discuss this particular case of QCS and present the BIHT reconstruction method from [JLBB13].


2.3.3 Binary compressed sensing

As indicated by its name, 1-bit CS uses only b = 1 bit to quantize each measurement, which is equivalent to keeping only their sign14, i.e.,

y = Q1(Φx) = ω sign (Φx), (BCS)

where −ω1 = ω2 = ω and τ2 = 0. We see that, unfortunately, in this context of binary compressed sensing (BCS), we lose all information about the amplitude of x. Often, we simply consider ω = 1 since it does not bring information. For practical signals however, as long as the structure (the vector "direction" in RN) is kept, the amplitude information is generally not of great importance (e.g., in a picture it only changes the luminosity; in sounds, the volume; etc.). One common assumption is to suppose that x lies on the (N−1)-dimensional unit sphere of RN, i.e., ‖x‖ = 1.

The set of equations defined by BCS can be interpreted as a set of M linear constraints yi φiT x ≥ 0, ∀i ∈ [M], in addition to the fact that x ∈ ΣK. Each of these constraints can be viewed as a hyperplane cutting the space RN into 2 half-spaces and indicating in which half-space the objective x lies. A drawing of this situation in R3 is shown on Figure 14.
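A minimal sketch of the 1-bit sensing model and of this consistency condition (our own helper names, with the sign convention of the footnote):

```python
def one_bit_measure(Phi, x):
    """y = sign(Phi x), with the convention sign(a) = -1 for a <= 0."""
    return [1 if sum(p * v for p, v in zip(row, x)) > 0 else -1
            for row in Phi]

def consistent(Phi, u, y):
    """True iff u lies on the correct side of every hyperplane,
    i.e., y_i * (phi_i^T u) >= 0 for all i in [M]."""
    return all(yi * sum(p * v for p, v in zip(row, u)) >= 0
               for row, yi in zip(Phi, y))

Phi = [[1.0, 0.0], [1.0, 1.0], [0.0, -1.0]]
x = [2.0, 1.0]
y = one_bit_measure(Phi, x)
print(y)                      # [1, 1, -1]
print(consistent(Phi, x, y))  # True: x satisfies its own constraints
```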

The paper [JLBB13] provides guarantees for the reconstructions in 1-bit CS based on an interesting property of the mapping defined by u ↦ A(u) = sign (Φu) and linked with the RIP: the binary ε-stable embedding (or BεSE). The definition of this embedding uses two different ways of measuring distances in the signal space and in the quantized measurement space, namely an angular distance dS(x, s) = (1/π) arccos(xT s) (since we are not interested in the amplitude) and the normalized Hamming distance dH(y, z), i.e., the number of bits that differ when comparing two binary vectors y and z (divided by the total number M of bits). The guarantees basically state that when an algorithm is able to find a sparse signal x⋆ ∈ ΣK such that the Hamming distance between y and A(x⋆) is small, then the angular distance between x and x⋆ is small. This statement can be easily interpreted on Figure 14: if we are on the correct side of every plane (or of most of them), we are certain to be not too far from x. The guarantee holds (with angular error bounded

14 To be consistent with the definition of Qb, it is assumed that τ2 = 0 belongs to R1, so the sign function is defined here as

sign (a) = 1 if a > 0, −1 if a ≤ 0. (28)


Figure 14: Intuition for 1-bit compressed sensing. The gray sphere is the unit sphere in R3 on which x (represented by a red cross) is assumed to lie. The red lines represent the constraint planes (diag (y)Φx ≥ 0) sectioning the sphere. The green bands indicate, for each constraint, the half-space in which x lies, i.e., the information given by the measurements. Finally, the dashed blue lines represent the set of all K-sparse signals on the unit sphere (with K = 2). We see that the combination of 5 random constraints (i.e., only 5 bits!) with the prior information that K = 2 restricts the possibilities to a very small set of points (highlighted in light blue) which are in the neighborhood of x.

by dH ± ε) as long as A is a BεSE. This is the case with high probability, in particular when Φ is generated randomly with rows following an isotropic PDF in RN (e.g., Gaussian matrices like in the previous section) and the number of measurements exceeds a lower bound given in [JLBB13]. Other bounds on dS are given when the signal is compressible or when the sensing is affected by pre-quantization noise.
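The two distances used by the BεSE can be sketched directly from their definitions (our own helper names; the dot product is clamped to guard against floating-point rounding outside [−1, 1]):

```python
import math

def d_S(x, s):
    """Angular distance (1/pi) * arccos(x^T s) for unit-norm x, s."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(x, s))))
    return math.acos(dot) / math.pi

def d_H(y, z):
    """Normalized Hamming distance: fraction of differing bits."""
    return sum(1 for a, b in zip(y, z) if a != b) / len(y)

print(d_S([1.0, 0.0], [0.0, 1.0]))        # orthogonal vectors: 0.5
print(d_H([1, -1, 1, 1], [1, 1, 1, -1]))  # 2 of 4 bits differ: 0.5
```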

2.3.4 BIHT and theoretical guarantees

A greedy algorithm based on IHT was developed and studied in [JLBB13] in order to solve the BCS reconstruction problem. Like IHT, the so-called binary IHT (BIHT) is based on the idea of minimizing a penalty or "energy" term under the constraint that x ∈ ΣK. This penalty term enforces the consistency of the recovered candidate with the measurements, i.e., it is 0 when all the sign constraints are respected. When the candidate lies on the wrong side of the hyperplane i, i.e., when yi φiT x < 0, the scalar error |φiT x| is added to the penalty. A schematic representation of this one-sided penalty term for one measurement yi is shown on Figure 15.


Figure 15: Schematic representation of the one-sided penalty term JB(zi, yi) = |(yi zi)−| for one measurement yi, with zi = (Φu)i = φiT u.

The compact expression for this 1-bit consistency-enforcing penalty term is given by Lemma 5 of [JLBB13] as

EB(u) = ‖[y ⊙ (Φu)]−‖1, (29)

where ⊙ is the Hadamard (entrywise) product. The minimization problem that BIHT tries to solve is then

x⋆BIHT ≈ argminu EB(u) s.t. ‖u‖0 ≤ K, ‖u‖2 = 1. (30)

This energy is convex but not differentiable (e.g., when any xl, for l ∈ [N], is 0), but one can find a subgradient15 in order to apply a gradient descent combined with hard thresholding, like in IHT. This subgradient is given in [JLBB13] by

∇EB(u) = (1/2) ΦT (sign (Φu) − y). (32)

The algorithm is then simply given by the following iteration:

a(n+1) := x(n) + (µ/2) ΦT (y − sign (Φx(n)))
x(n+1) := HK(a(n+1)). (BIHT)
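A minimal pure-Python sketch of this iteration (our own helper names; µ is set to 1, anticipating the observation below that µ only rescales the iterates, and the final estimate is renormalized to the unit sphere):

```python
import math
import random

def hard_threshold(a, K):
    """H_K: keep the K largest-magnitude entries, zero out the rest."""
    keep = sorted(range(len(a)), key=lambda i: -abs(a[i]))[:K]
    out = [0.0] * len(a)
    for i in keep:
        out[i] = a[i]
    return out

def sign(a):
    return 1.0 if a > 0 else -1.0   # convention of Eq. (28)

def biht(Phi, y, K, n_iter=50):
    """BIHT: a = x + (1/2) Phi^T (y - sign(Phi x)); x = H_K(a)."""
    M, N = len(Phi), len(Phi[0])
    x = [0.0] * N
    for _ in range(n_iter):
        z = [sum(row[j] * x[j] for j in range(N)) for row in Phi]
        r = [yi - sign(zi) for yi, zi in zip(y, z)]
        a = [x[j] + 0.5 * sum(Phi[i][j] * r[i] for i in range(M))
             for j in range(N)]
        x = hard_threshold(a, K)
    norm = math.sqrt(sum(v * v for v in x)) or 1.0
    return [v / norm for v in x]   # project back on the unit sphere

rng = random.Random(3)
N, M, K = 16, 64, 2                # M > N is sensible in 1-bit CS
Phi = [[rng.gauss(0.0, 1.0) for _ in range(N)] for _ in range(M)]
x_true = [0.0] * N
x_true[1], x_true[7] = 0.8, -0.6   # unit-norm, 2-sparse
y = one_bit = [sign(sum(r[j] * x_true[j] for j in range(N))) for r in Phi]
x_hat = biht(Phi, y, K)
print(sum(1 for v in x_hat if v != 0.0) <= K)  # True: x_hat is K-sparse
```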

We can note that the result of the algorithm is totally independent of the value of µ, which can therefore be set to 1. Indeed, the only effect it has on the final result is to multiply its amplitude by µ. The proof of that statement is very simple. Assume that x(0) = 0. We find

a(1) := (µ/2) ΦT (y + 1)
x(1) := HK(a(1)). (33)

15 A subgradient of the convex function f(x) is an element of the subdifferential set ∂f(x) ⊂ RN defined by

∂f(x) = { u | f(v) ≥ f(x) + uT (v − x), ∀v ∈ dom f } (31)

Informally, it is the set of all the gradients of the tangent planes lying entirely under the graph of the function f. When f is 1-dimensional, it is the set of all the slopes of the tangent lines lying under the graph.


We can define

a′(1) := (1/µ) a(1) = (1/2) ΦT (y + 1)
x′(1) := (1/µ) x(1) = HK(a′(1)), (34)

i.e., what we would obtain in the first iteration if µ′ = 1. The second iteration is given by

a(2) := µ x′(1) + µ (1/2) ΦT (y − sign (Φx(1)))
     = µ (x′(1) + (1/2) ΦT (y − sign (Φx′(1))))
     = µ a′(2)
x(2) := HK(a(2)) = µ HK(a′(2)) = µ x′(2), (35)

where x′(2) is precisely what we would obtain at iteration 2 with µ′ = 1. The key is that sign (Φx(1)) = sign (Φx′(1)), which leads to the conclusion by generalizing to iteration n + 1, i.e.,

a^(n+1) := µ (x′^(n) + (1/2) Φ^T (y − sign(Φx′^(n)))) = µ a′^(n+1)
x^(n+1) := H_K(a^(n+1)) = µ H_K(a′^(n+1)) = µ x′^(n+1). (36)

The only effect of µ is to multiply x⋆_BIHT (as any other iterate x^(n)): it influences neither the convergence speed nor the stability, as was the case in IHT. In addition, since we only seek a unit-norm x⋆_BIHT, this µ has strictly no influence. Note also that the factor 1/2 which originally came from the subgradient of E_B is useless as well.
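To make the scale invariance concrete, here is a minimal numerical sketch in Python (the random test problem and NumPy's sign(0) = 0 convention are assumptions of this illustration, not part of the thesis):

```python
import numpy as np

def hard_threshold(a, K):
    """H_K: keep the K largest-magnitude entries of a, zero out the rest."""
    out = np.zeros_like(a)
    idx = np.argsort(np.abs(a))[-K:]
    out[idx] = a[idx]
    return out

def biht(y, Phi, K, mu=1.0, n_iter=50):
    """BIHT iteration: a = x + (mu/2) Phi^T (y - sign(Phi x)), then x = H_K(a)."""
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        x = hard_threshold(x + 0.5 * mu * Phi.T @ (y - np.sign(Phi @ x)), K)
    return x

rng = np.random.default_rng(0)
N, M, K = 128, 64, 4
Phi = rng.standard_normal((M, N))
x0 = np.zeros(N)
x0[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
y = np.sign(Phi @ x0)

x_a = biht(y, Phi, K, mu=1.0)
x_b = biht(y, Phi, K, mu=7.3)
# The two runs coincide up to the global factor mu: x_b == 7.3 * x_a
# (up to floating-point rounding), as shown by the proof above.
```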

In summary, BIHT tries to enforce consistency and, when it succeeds (exactly or for most of the constraints), the reconstruction error is guaranteed to be small. Since it is a particular case (b = 1) of QIHT, which is the main topic of the next chapter, we present Monte Carlo simulations further in the thesis.


Part II

A SMALL STEP IN QUANTIZED COMPRESSED SENSING

This part is the central part of the work. We present here the main contribution about dequantization in the field of compressed sensing reconstruction.


3 THE BRIDGING ALGORITHM

The worst thing about finally putting together a puzzle is finding there are missing pieces.

— Dexter, Season 1 Episode 2: “Crocodile”

3.1 motivations

As we saw in Chapter 2, quantized compressed sensing (QCS) is probably better suited to real cases of compressed sensing hardware. When the quantization can be done with sufficient precision, we can use classic approximations to consider the quantized signal as a pure one affected by noise with a known probability distribution (often approximated by a mere Gaussian noise). In other words, in high resolution compressed sensing, we can use the usual methods (possibly slightly adapted, see references in Chapter 2). Sometimes it can be advantageous (e.g., for hardware convenience or optimization) to quantize each measurement with only one bit. This framework, called BCS, has been studied over the past few years. In particular, general theoretical bounds have been developed in [JLBB13] and some high performance algorithms exist to reconstruct 1-bit compressively sensed signals (e.g., BIHT).

What has not been studied much yet are the intermediate (low) resolutions. What happens when we quantize the measurements with b = 2, 3, . . . bits? Can we provide a general framework which is valid in every quantization scenario? Such a framework can be interesting because in some cases we want to be able to tune the quantization scenario with respect to the allowed bit budget. Yang et al. proposed a Bayesian approach in [YXZ12]. In this chapter, we present the so-called quantized iterative hard thresholding (QIHT), which is rather based on the well known IHT algorithm. As we will see, it was developed using the ideas of BIHT and another variant of IHT not discussed here, a method for signal declipping developed by S. Kitic et al. in [KJM+13]. The interesting particularity of this algorithm is that it reduces to BIHT for 1-bit quantization and it converges to IHT at high resolutions.

3.2 derivations and interpretations

3.2.1 Inconsistency energy and QIHT algorithm

Remember that given a measurement vector y, IHT tries to minimize a certain energy with a simple gradient descent followed by the hard thresholding operator. The energy is the quadratic fidelity term

E(u) := (1/2) ‖y − Φu‖²₂. (37)

The gradient w.r.t. u of this energy is given by

∇E(u) = Φ^T (Φu − y). (38)

In the context of quantized measurements, we want to build an algorithm (QIHT) that will take the quantization process into account in order to reconstruct x from y = Q_b(Φx).

We want to find an associated inconsistency penalty energy E_b(u) that will appropriately use the information given by the quantizer. Once it is defined, the algorithm will, as IHT, compute a gradient descent on that energy followed by hard thresholding at each iteration in order to approximately solve the problem

x⋆_QIHT ≈ argmin_u E_b(u) s.t. ‖u‖_0 ≤ K. (39)

Such an energy can be defined as

E_b(u) := J_b(z = Φu, y) = Σ_{i=1}^{M} J(z_i, y_i) (40)

with φ_i the ith row of Φ and

J(ν, λ) =
  Σ_{j=h+1}^{k} w_j (τ_j − ν)   if h < k
  Σ_{j=k+1}^{h} w_j (ν − τ_j)   if h > k
  0                             if h = k,  (41)

with

• k ∈ {1, ..., 2^b} the index of the bin of λ, i.e., λ ∈ R_k;
• h ∈ {1, ..., 2^b} the index of the bin of ν, i.e., ν ∈ R_h;
• w_j = ω_j − ω_{j−1} ∀ j ∈ {2, ..., 2^b}.

This expression can be compactly written as

J(ν, λ) = Σ_{j=2}^{2^b} w_j |(sign(λ − τ_j)(ν − τ_j))_−|. (42)

In this last expression, the non-zero terms are determined by the τ_j for which sign(λ − τ_j) ≠ sign(ν − τ_j). This is the case only when τ_j lies between λ and ν. In words, this energy is the sum of the distances (weighted by w_j) between ν and any τ_j that lies between ν and λ. The more thresholds ν oversteps, the more quickly the penalty increases. Besides, when no threshold separates λ and ν, consistency is achieved, i.e., Q_b(λ) = Q_b(ν), and the energy is 0. We can represent the sum visually as on Figure 16.

Figure 16: Schematic representation of the sum of the terms (blue, green and orange) in J(ν, λ) (total in dashed red).
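As a concrete illustration, the penalty J can be evaluated numerically; the following Python sketch (the 2-bit quantizer grid is an arbitrary choice for the illustration, not the thesis' Lloyd-Max design) computes both the case-by-case form of Eq. 41 and the compact form of Eq. 42 and checks that they agree:

```python
import numpy as np

# Hypothetical 2-bit quantizer (an illustration, not the Lloyd-Max design):
# thresholds tau_1..tau_5 with tau_1 = -inf, tau_5 = +inf, levels omega_1..omega_4.
tau = np.array([-np.inf, -1.0, 0.0, 1.0, np.inf])
omega = np.array([-1.5, -0.5, 0.5, 1.5])
w = np.diff(omega)                     # w_j = omega_j - omega_{j-1}, j = 2..2^b

def bin_index(a):
    """1-based index of the bin (tau_j, tau_{j+1}] containing a."""
    return int(np.searchsorted(tau[1:-1], a, side="left")) + 1

def J_cases(nu, lam):
    """Case-by-case definition (Eq. 41)."""
    h, k = bin_index(nu), bin_index(lam)
    if h < k:
        return sum(w[j - 2] * (tau[j - 1] - nu) for j in range(h + 1, k + 1))
    if h > k:
        return sum(w[j - 2] * (nu - tau[j - 1]) for j in range(k + 1, h + 1))
    return 0.0

def J_compact(nu, lam):
    """Compact form (Eq. 42): sum_j w_j |(sign(lam - tau_j)(nu - tau_j))_-|."""
    return sum(w[j - 2] * max(0.0, -np.sign(lam - tau[j - 1]) * (nu - tau[j - 1]))
               for j in range(2, len(omega) + 1))

# Both forms agree on a grid of (nu, lam) pairs away from the thresholds.
for nu in (-2.0, -0.3, 0.7, 2.5):
    for lam in (-1.7, 0.2, 1.4):
        assert abs(J_cases(nu, lam) - J_compact(nu, lam)) < 1e-12
```

Only the thresholds lying strictly between ν and λ contribute, as described above.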

We also see that since sign(λ − τ_j) = sign(Q_b(λ) − τ_j) for all j ∈ {1, . . . , 2^b + 1}, we have J(ν, λ) = J(ν, Q_b(λ)). We can see on Figure 17 that J(ν, λ) is convex and piecewise linear but, as is proved later, tends toward the quadratic energy function (1/2)(ν − λ)² when b → ∞. Both already seem almost identical on Figure 17 with b = 3.

Figure 17: One term of the inconsistency energy of QIHT. J(ν, λ) is drawn in blue while the quadratic energy function (1/2)(ν − Q_b(λ))² is superimposed in dashed red for comparison.

Let us now derive the (sub)gradient of E_b(u) = Σ_i J(z_i(u), y_i) w.r.t. u (with z_i(u) = φ_i^T u):

∇_u J(z_i, y_i) =
  −Σ_{j=h+1}^{k} w_j φ_i   if h < k
  Σ_{j=k+1}^{h} w_j φ_i    if h > k
  0                        if h = k.  (43)

If we develop, we find

∇_u J(z_i, y_i) =
  (ω_h − ω_k) φ_i   if h < k ⇔ ω_h < ω_k
  (ω_h − ω_k) φ_i   if h > k ⇔ ω_h > ω_k
  0                 if h = k ⇔ ω_h = ω_k,  (44)

which allows us to merge the three cases into

∇_u J(z_i, y_i) = (ω_h − ω_k) φ_i = (Q_b(z_i) − y_i) φ_i (45)

and finally gives

∇E_b(u) = Φ^T (Q_b(z) − y) = Φ^T (Q_b(Φu) − y). (46)

This leads us to the following QIHT algorithm: we initialize with x^(0) = 0. Then we iterate as

a^(n+1) := x^(n) + µ Φ^T (y − Q_b(Φx^(n)))
x^(n+1) := H_K(a^(n+1)),
(QIHT)

until the stopping criterion ‖x^(n) − x^(n+1)‖/‖x^(n)‖ < tol is met or the maximum number of iterations is reached. The last x^(n+1) is considered to be the best reconstruction found, x⋆_QIHT. The value of the parameter µ is chosen as in Eq. 26 in Section 2.2 for IHT. In fact, it was the result of an extensive experimental search for the best value of this parameter. The results of these experiments are presented in the next section. The advantage of the expression found is that it fits the IHT condition µ < 1/(1 + δ_2K) mentioned in [Blu12] by assuming that the isometry constant δ_2K behaves like √(cK/M), which is a common assumption in CS. When M = O(K log(N/K)), Φ/√M satisfies the RIP and the definition tends to µ = c′/M with c′ a constant near 1.
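A minimal sketch of the (QIHT) loop in Python (the uniform mid-rise quantizer and the step µ = 1/M are simplifying assumptions of this illustration; the thesis experiments use a Lloyd-Max quantizer and the µ⋆ heuristic of the next section):

```python
import numpy as np

def quantize(z, b, lo=-4.0, hi=4.0):
    """Uniform b-bit quantizer on [lo, hi]: map each z_i to the center of its bin."""
    delta = (hi - lo) / 2 ** b
    j = np.clip(np.floor((z - lo) / delta), 0, 2 ** b - 1)
    return lo + (j + 0.5) * delta

def hard_threshold(a, K):
    """H_K: keep the K largest-magnitude entries of a."""
    out = np.zeros_like(a)
    idx = np.argsort(np.abs(a))[-K:]
    out[idx] = a[idx]
    return out

def qiht(y, Phi, K, b, mu, n_iter=300, tol=1e-4):
    """QIHT: a = x + mu * Phi^T (y - Q_b(Phi x)), then x = H_K(a)."""
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        x_new = hard_threshold(x + mu * Phi.T @ (y - quantize(Phi @ x, b)), K)
        if np.linalg.norm(x) > 0 and \
           np.linalg.norm(x - x_new) / np.linalg.norm(x) < tol:
            return x_new
        x = x_new
    return x

rng = np.random.default_rng(1)
N, M, K, b = 256, 128, 4, 3
Phi = rng.standard_normal((M, N))
x0 = np.zeros(N)
x0[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
y = quantize(Phi @ x0, b)              # quantized measurements
x_hat = qiht(y, Phi, K, b, mu=1.0 / M)
```

With b = 1 and a symmetric two-level quantizer, this loop reduces to BIHT, as shown in the next subsection.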

3.2.2 Equivalence with BIHT when b = 1

In this subsection, we show how and why BIHT is a particular case of QIHT. Let us first define the 1-bit quantizer Q_1 by

τ_1 = −∞, τ_2 = 0, τ_3 = ∞ (47)

and

ω_1 = −1/2, ω_2 = 1/2. (48)

We can directly see that the algorithms are equivalent since Q_1(a) = (1/2) sign(a) = (1/2) A(a), which implies the equality of the subgradients, ∇E_1 = ∇E_B. Moreover, we can show that the minimized energy is exactly the same. We have w_2 = 1 and

h + 1 = k = 2 if k > h
h = k + 1 = 2 if k < h, (49)

so,

J(z_i, y_i) =
  w_2 (τ_2 − z_i)   if h < k
  w_2 (z_i − τ_2)   if h > k
  0                 if h = k

            =
  −z_i   if Q_1(z_i) < 0 and y_i > 0
  z_i    if Q_1(z_i) > 0 and y_i < 0
  0      if Q_1(z_i) y_i > 0

            =
  |z_i|  if z_i y_i < 0
  0      if z_i y_i > 0,  (50)

hence,

E_1(u) = Σ_{i | y_i φ_i^T u < 0} |φ_i^T u|, (51)

which is by definition equal to ‖[y ⊙ (Φu)]_−‖₁ = E_B(u).

3.2.3 Convergence towards IHT when b→∞

If we have a number of bits as large as we want, we can define a quantizer such that any real number can be approached as closely as we want by its quantized value. We can define a limit quantizer such that ∀ a ∈ R:

Q_∞(a) := lim_{b→∞} Q_b(a) = a. (52)

This property is met by every quantizer that respects the following properties as b → ∞:

τ_2 → −∞, τ_{2^b} → ∞, Δτ → 0, (53)

with Δτ = max_{j∈{1,...,2^b}} (τ_{j+1} − τ_j). In particular, we can show that this is the case for the Lloyd-Max optimal quantizer for a standard Gaussian distribution. In fact, the third rule Δτ → 0 is sufficient as long as the quantizer never saturates (i.e., as long as a ∈ ]τ_2, τ_{2^b}]). For such a quantizer, with an infinite number of bits, we have y = Q_∞(Φx) = Φx and Q_∞(Φu) = Φu. The QIHT algorithm is then exactly the same as IHT since

∇E_∞(u) = Φ^T (Q_∞(Φu) − y) = Φ^T (Φu − y) = ∇E(u). (54)

It is also interesting to show that E_∞(u) tends towards E(u) = (1/2) ‖Φu − y‖²₂, which is precisely the energy that IHT tries to minimize. This is equivalent to showing that when b → ∞, then J(ν, λ) → (1/2)(ν − λ)².

When Δτ → 0, the space between two consecutive ω_j also tends to 0. We can replace the sum over j by an integral over ω and the w_j by infinitesimal elements dω. Moreover, on each interval, we can say that τ_j → ω. Finally, the integration boundaries are ω = ν to ω = λ if λ > ν and vice versa. We then have

J(ν, λ) =
  ∫_ν^λ (ω − ν) dω   if ν < λ
  ∫_λ^ν (ν − ω) dω   if ν > λ
  0                  if ν = λ.  (55)

The three cases can now be merged. We calculate the integral and find

J(ν, λ) = [ω²/2 − νω]_ν^λ = λ²/2 − νλ − ν²/2 + ν²
        = (1/2)(λ² + ν² − 2νλ) = (1/2)(ν − λ)² (56)

and then finally,

E_∞(u) = Σ_i J(φ_i^T u, y_i) = (1/2) ‖Φu − y‖²₂ = E(u). (57)

3.3 results and discussions

In this section we apply the Matlab benchmark developed in Section 2.2 to the QIHT algorithm in the QCS framework. Again, we set K = 16, N = 1024 and generate the non-zero entries of the x_0 signals¹ randomly from a normal distribution after having selected their K-length supports uniformly at random in [N].

1 As in Chapter 2, we omit the subscripts corresponding to the different samples and the different configurations.


3.3.1 Setting the µ parameter for QIHT

In this experiment, we additionally projected the samples x onto the ℓ₂ unit sphere in order to fix their amplitude to 1, i.e., x = x_0/‖x_0‖.

In this first experiment, we wanted to explore the influence of µ on the quality of the reconstruction in order to choose the best possible value. The quality is observed for 5 different quantization scenarios b ∈ {1, 2, 3, 4, 5} as a function of the total number of bits B = bM.

The quantization of the measurements is done using an optimal Lloyd-Max quantizer. As described in Section 2.3 and in [GN98], this quantizer is designed to minimize the distortion of the measurements. In order to be tuned, the design requires knowing the variance of the unquantized measurements z_i = φ_i^T x. We can show that

σ_z² = V(Z_i) = Σ_{k ∈ supp X} V(Φ_{i,k}) V(X_k) = K · (1/K) = 1 (58)

where Z_i is the element i of the random vector defined by Z = ΦX, with Φ_{i,k} ∼ N(0, 1) the entry (i, k) of Φ and X a K-sparse random vector uniformly distributed on the ℓ₂ unit sphere of R^N. The probability density function of the non-zero entries of X is computed in Appendix B. In particular, we show that the variance of these non-zero elements is exactly 1/K.
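Eq. 58 is easy to check with a quick Monte-Carlo sketch (Python; the dimensions match this experiment, with x projected onto the unit sphere as described above):

```python
import numpy as np

rng = np.random.default_rng(3)
K, N, M, trials = 16, 1024, 256, 200
zs = []
for _ in range(trials):
    x = np.zeros(N)
    x[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
    x /= np.linalg.norm(x)                  # project onto the l2 unit sphere
    Phi = rng.standard_normal((M, N))
    zs.append(Phi @ x)
z = np.concatenate(zs)
print(np.var(z))   # close to sigma_z^2 = 1
```

Indeed, since ‖x‖₂ = 1 and the entries of Φ are i.i.d. N(0, 1), each z_i is exactly N(0, 1) conditionally on x.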

So for each number b of bits per measurement, we set the total number of bits B ∈ {64, 128, ..., 1280}, then the number of measurements M = ⌊B/b⌋. Finally, we set µ = µ_0^p with µ_0 = 1/(MN) and p varying linearly from 0 to 1. Note that the case b = 1 is exactly equivalent to BIHT while increasing b makes the algorithm approach IHT. A new sensing matrix Φ is generated for each (b, B, µ) case. It is generated as in Chapter 2 with each entry following a Gaussian distribution N(0, 1). We repeat each case 20 times (with 20 different samples x sensed with the same matrix Φ). The tolerance for the stopping criterion is tol = 10⁻⁴ and we stop anyway after 1000 iterations.

In order to be fair and to allow comparison with the binary case (which, as we saw in the section about BIHT, suppresses any notion of amplitude), we project the estimation on the unit ℓ₂ ball before measuring the SNR. In a way, we measure the angle in R^N between the signals. This way of measuring the quality is called normalized SNR (NSNR) and is computed as NSNR_dB(x, x⋆) = 10 log₁₀ ‖x/‖x‖ − x⋆/‖x⋆‖‖⁻². We can remark that this metric allows us to skip the pre-normalization of x (and this is what we do in the next experiments) while still allowing comparison with 1-bit QIHT (i.e., BIHT).

Page 64: Bridging 1-bit & High Resolution Quantized Compressed Sensing · In a nutshell, the compressed sensing (CS) [CRT06b,Don06] theory ... and MPEG-4 for video. This thesis subscribes

52 the bridging algorithm

On Figure 18, you can see the color plots of the average NSNR in dB (the scale is indicated by the color bar in the bottom right).

Figure 18: QIHT with optimal Lloyd-Max quantizer. Average NSNR_dB(x, x⋆) as a function of B and µ for b ∈ {1, 2, 3, 4, 5}. The black line labelled µ⋆ corresponds to the setting µ⋆(M) = (1/M)(1 − min(√(cK/M), η)) with c = 3 and η = 0.7.


We can make several comments about what we can observe on these figures. We notice first that BIHT does not depend on the value of µ. As explained in Section 2.3, this is due to the fact that the only information used from the result of iteration n is the sign of Φx^(n), so every notion of scale is lost.

Another comment is that for low bit budgets (B approximately between 0 and 400) and as b increases, the algorithm seems to have difficulties converging to a good reconstruction whatever the value of µ. Though no guarantee has been found yet, we can reasonably imagine that this is linked to the RIP (or another embedding such as the BεSE defined for BIHT) of the sensing matrix.

We can see that the heuristic found for µ, i.e.,

µ⋆ = (1/M)(1 − min(√(cK/M), η)), (59)

in this case with c = 3 and η = 0.7, seems to be near optimal in the sense that it is the value of µ for which the SNR is (almost, and at least in that simulation) maximum for any fixed value of B.

3.3.2 Comparison of the performances of QIHT versus IHT and BPDN

In this subsection, we present the behaviour of IHT and BPDN in the context of quantized measurements (see QCS). The purpose is to compare their performances (measured in terms of NSNR) with those of QIHT. Again, we explore the combinations of B ∈ {64, 128, ..., 1280} and b ∈ {1, 2, 3, 4, 5} but this time, the averages (on the performance metrics) are made over 100 trials per B and a new matrix Φ is generated for each trial. The x signals were not normalized, as it wasn't necessary with the NSNR metric (see above). Their non-zero elements simply follow N(0, 1). The variance of z_i = φ_i^T x is then

σ_z² = V(Z_i) = Σ_{k ∈ supp X} V(Φ_{i,k}) V(X_k) = K. (60)

However, in this experiment, we use a numerical estimation of that variance to set the quantizer Q_b. For each parameter setting (B, b), we simply generate the 100 vectors z = Φx and then compute an unbiased² estimate of the variance of the 100M entries with the function var from Matlab. This allows us to easily change the sensing matrix without worrying about re-computing the theoretical variance of the measurements.

2 The estimator is given by (1/(N′ − 1)) Σ_{j=1}^{N′} (z_j − z̄)² with N′ = 100M, where z_j denotes the jth entry of the long vector resulting from the concatenation of all the z and z̄ is their average value, i.e., z̄ = (1/N′) Σ_{j=1}^{N′} z_j.

The setup of IHT is essentially the same as for QIHT with the parameter previously found, µ = µ⋆(M). The fundamental difference with QIHT is that the IHT algorithm itself does not use the fact that the measurements are quantized. The stopping criterion is also set to tol = 10⁻⁴.

BPDN does not require knowing K, the sparsity of x; instead, we must provide an estimation of the error ε. Since we want to compare the algorithms in their optimal conditions, we use an oracle estimate of ε:

ε⋆ = ‖z − y‖₂. (61)

In real-world problems, of course, we would not have access to this optimal value since we would not know z. Again, this algorithm does not use the quantizer itself.

Several plots are produced. On the first three, shown on Figure 19, you can compare the performances of QIHT, IHT and BPDN in terms of NSNR. Let us first comment the plot of the results of QIHT on Figure 19a. If we observe the average NSNR of the reconstruction obtained after at most 1000 iterations, we see that for low bit budgets (below approximately 300 bits) BIHT outperforms every other resolution and the higher the number b of bits per measurement, the lower the NSNR. On the contrary, after a critical threshold (between 300 and 400 bits), the trend is reversed and the higher the resolution b, the higher the NSNR. An exception can be observed for b = 2, which shows performances comparable with b = 1. This could be due to the fact that the Lloyd-Max quantizer provides minimum distortion only when it quantizes x in a classical ADC (see Section 2.3). Here it quantizes the measurements y and is probably not optimal with respect to the QIHT reconstruction error minimization. In the case b = 1, the choice of Lloyd-Max or any other centered quantizer (τ_2 = 0) gives the same result and is therefore necessarily the best possible. Finally, we observe that when b increases, the transition between a "bad" and a "good" reconstruction is more and more abrupt, as it was the case for IHT and BP in the noiseless CS case (see Chapter 2).

QIHT clearly outperforms both IHT (Figure 19b) and BPDN (Figure 19c), especially for a low total number of bits B and for a low number of bits per measurement b. When the resolution of the quantizer increases, the gain of QIHT over IHT decreases. This fact is coherent with the limit case analysis of the previous section (when b → ∞). We can also notice that BPDN and IHT asymptotically follow a known rule in source coding theory: the "6dB per bit" gain. For simple quantization of the Nyquist samples, it means that adding a bit to the quantizer (i.e., multiplying its precision by 2) results in increasing the SNR by approximately 6dB. QIHT, however, follows that rule only when the resolution is sufficiently high (we hardly see this behavior when b = 4 → 5).

In order to validate the idea that QIHT effectively minimizes the inconsistency energy given by Eqs. 40 and 42, we can observe the value of the energy E_b(x⋆_QIHT) after reconstruction by QIHT and compare it with E_b(x⋆_IHT) and E_b(x⋆_BPDN). We must however remark that the energy E_b(u) is a sum of M terms J((Φu)_i, y_i), corresponding to the M measurements. Instead of E_b(u), it seems therefore natural to plot the average value of J((Φu)_i, y_i), i.e., simply E_b(u)/M. This reconstruction quality metric can be seen as a QCS counterpart of the mean squared error (MSE) between the original measurements y and the re-synthesized ones Φu. Indeed, the MSE is equal to 2E(u)/M (the constant factor 2 is not important since it does not change the shape of the curves), where E(u) = (1/2)‖Φu − y‖²₂ is the quadratic energy minimized by IHT. The plots are shown on Figure 20.

We can see that the mean inconsistency is 2 to 4 orders of magnitude lower when using QIHT. This shows that QIHT is indeed minimizing problem 39. It also suggests that we could use that energy as a stopping criterion when it falls below a desired threshold. To verify that, a deeper analysis of the numerical convergence properties, which is beyond the scope of this thesis, must be done. In further work, we could also use the value of this energy to provide a bound on the reconstruction error (similarly to what is done for 1-bit and high-resolution CS).

It can be interesting to compare the convergence of QIHT and IHT in terms of the number of iterations needed to reach the fixed tolerance. We see on Figure 21 that QIHT takes on average more iterations than IHT and, when the number of measurements is too low, it "fails to converge", i.e., reaches 500 iterations very often. This phenomenon seems to depend on M rather than on B. Indeed, the curves that correspond to higher resolution b start decreasing only after reaching a higher B (than the lower-b curves). In other words, they all start decreasing after reaching a constant number M of measurements. This could be linked with the properties of the embedding formed by the sensing matrix-quantizer combination (like the RIP for CS or the BεSE for BCS).


[Three panels: (a) QIHT, (b) IHT, (c) BPDN; NSNR_dB versus B = bM, one curve per b ∈ {1, ..., 5}.]

Figure 19: NSNR_dB obtained with a Monte-Carlo benchmark to compare QIHT with IHT and BPDN. The measurements are obtained with an optimal (Lloyd-Max) quantizer with b bits per measurement. The number of quantized measurements provided to the algorithms is M = ⌊B/b⌋ with a total allowed number of bits B ∈ {64, 128, ..., 1280}.


[Three panels: (a) QIHT, (b) IHT, (c) BPDN; E_b(x⋆)/M versus B = bM, one curve per b ∈ {1, ..., 5}.]

Figure 20: Mean inconsistency energy E_b(u)/M as a function of the total number of bits B for the three algorithms.

[Two panels: (a) QIHT, (b) IHT; number of iterations (max 500) versus B = bM, one curve per b ∈ {1, ..., 5}.]

Figure 21: Average number of iterations in the Monte-Carlo benchmark to compare QIHT with IHT before achieving the stopping criterion. Both algorithms were stopped after 500 iterations or when ‖x^(n) − x^(n+1)‖/‖x^(n+1)‖ reached the tolerance of 10⁻⁴.

In terms of running time on a common laptop (Intel Core 2 Duo CPU P7350 @ 2.00GHz), QIHT was on average slower than IHT and BPDN. In this regard, we can visually correlate the timing curves on Figure 22a with the number of iterations on Figure 21a: the slow performance of QIHT is directly linked with the number of iterations. Note however that since M is proportional to B (for constant b), the complexity per iteration (in O(MN)) and therefore the time increases with B. Similarly, for fixed B, M is inversely proportional to b. This explains why the b = 1 curve indicates the slowest reconstruction time. Furthermore, we see that BPDN is faster than QIHT. This unexpected result is explained by the small dimensions of the problem; BPDN is, indeed, much slower for larger dimensions (O(N³)). Unsurprisingly however, BPDN was still slower than IHT, which outperformed both others in processing time.


[Three panels: (a) QIHT, (b) IHT, (c) BPDN; time (s/sample) versus B = bM, one curve per b ∈ {1, ..., 5}.]

Figure 22: Average running time per sample for each scenario.

We emphasize the fact that using QIHT may be advantageous in terms of quality but it may require more time to achieve this better reconstruction. However, as we can verify on Figure 23, the computation time per iteration is not much bigger than for IHT. Indeed, the only difference is the quantization step, which requires O(bM) ≪ O(MN) operations if it is implemented with binary search³. The number of operations required by quantization is then low with respect to the matrix multiplication that comes just after.
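The binary-search implementation mentioned above can be sketched with np.searchsorted, which performs one binary search per measurement, i.e., O(M log 2^b) = O(bM) operations in total (the threshold grid below is a placeholder for the illustration, not the Lloyd-Max design):

```python
import numpy as np

def quantize_bins(z, tau_inner):
    """Return the bin index of each measurement via binary search, O(bM) total.

    tau_inner holds the 2^b - 1 finite, sorted thresholds tau_2 .. tau_{2^b}.
    """
    return np.searchsorted(tau_inner, z, side="left")

# Hypothetical 3-bit quantizer thresholds (7 inner thresholds -> 8 bins).
tau_inner = np.linspace(-3.0, 3.0, 7)
z = np.array([-5.0, -0.2, 0.1, 4.2])
print(quantize_bins(z, tau_inner))   # prints [0 3 4 7]
```

A naive scan comparing every measurement with every threshold would instead cost O(2^b M), as footnote 3 points out.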

[Two panels: (a) QIHT, (b) IHT; average time per iteration (sec) versus B = bM.]

Figure 23: Average time per iteration in the Monte-Carlo benchmark to compare QIHT with IHT.

3 If, however, it is implemented by simply comparing every measurement with every threshold, the added complexity becomes prohibitive: when b increases by 1, the number of bins doubles, so a badly implemented quantization would require O(2^b M) operations. This number can become much larger than O(MN) at high resolutions.


3.3.3 Entropy and classical compression

In the previous experiment, we have considered the total number of bits in the output of the quantizer as the reference for our comparison. However, in some acquisition scenarios, we may have the opportunity to code the binary flow with some entropy coding. This is why we can also consider the bit budget (in terms of information theory) as the reference, i.e., the total number of bits if we are able to use an optimal code. In other words, the bit budget is the entropy of one measurement H(Y_i) (for the random variable Y_i) multiplied by the number of measurements M. We have seen in Chapter 2 that the entropy is computed as H(Y_i) = Σ_{j=1}^{2^b} −p_j log₂(p_j) with p_j = P(Y_i ∈ R_j), which can be estimated by p̂_j = M_j/(100M) with M_j the number of measurements (among the 100M generated) that are quantized in bin j. Figure 24 presents an example of measurement distribution with M = 256. They are quantized with a 3-bit (8 bins) Lloyd-Max quantizer adapted to the estimated distribution of the unquantized measurements z: a Gaussian N(0, σ_z) with standard deviation σ_z = 3.96. As predicted in Section 2.3, the Lloyd-Max quantizer does not distribute the measurements uniformly among the bins. It means that the entropy is not maximized and we could still compress using entropy coding. Indeed, the estimated entropy for this example is H(Y_i) = 2.81 < b = 3. The bit budget per measurement vector y for this example is then M · H(Y_i) = 718.6 < B = 768.
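The empirical entropy estimate described above can be sketched as follows (Python; the Gaussian samples and the uniform threshold grid are assumptions of this illustration, whereas the thesis uses a Lloyd-Max quantizer fitted to N(0, σ_z)):

```python
import numpy as np

def empirical_entropy_bits(bin_indices, n_bins):
    """H(Y) = sum_j -p_j log2(p_j), estimated from quantized samples."""
    counts = np.bincount(bin_indices, minlength=n_bins)
    p = counts / counts.sum()
    p = p[p > 0]                      # 0 * log(0) = 0 by convention
    return float(-np.sum(p * np.log2(p)))

rng = np.random.default_rng(4)
b, M = 3, 256
z = rng.normal(0.0, 3.96, size=100 * M)           # 100 vectors of M measurements
tau_inner = np.linspace(-8.0, 8.0, 2 ** b - 1)     # placeholder thresholds
y_bins = np.searchsorted(tau_inner, z)
H = empirical_entropy_bits(y_bins, 2 ** b)
print(H)   # strictly below the maximum of b = 3 bits
```

A non-uniform bin occupancy, as produced by any minimum-distortion quantizer on Gaussian data, necessarily yields H(Y_i) below b.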

Figure 24: Histogram of the 100M unquantized measurements z with M = 256, fitted normal distribution N(0, σ_z) with standard deviation σ_z = 3.96 and Lloyd-Max quantizer thresholds τ_j for j ∈ {2, . . . , 2^b − 1} and levels ω_j for j ∈ {1, . . . , 2^b} with b = 3. The total number of bits for this situation is B = bM = 768. The proportions of the measurements are indicated above each bin. The estimated entropy of one measurement is H(Y_i) = 2.81 < b = 3.

On Figure 25 the entropy of Y_i (still quantized with the Lloyd-Max quantizer) is plotted as a function of B for several values of b. Unsurprisingly, the entropy is constant, which simply means that the distribution of Z_i is always (approximately) N(0, σ_z) (with σ_z only dependent on K, see above) even when M changes. As remarked in Section 2.3, the entropy of Y_i when using this quantizer is always (except for the 1-bit quantizer, which is simply a threshold at 0) slightly below the maximum, i.e., b.

[Plot: H(Y_i) versus B = bM, one flat curve per b ∈ {1, ..., 5}.]

Figure 25: Entropy of a quantized measurement (with the Lloyd-Max optimized quantizer) as a function of B = bM for several values of b. In particular, we see that H(Y_i) does not depend on B.

The next figure (Figure 26) shows the QIHT curves of the MonteCarlo described above with two additional references. First, in thetop left corner, the dashed black line labeled “optimal compression”corresponds to the classical ADC followed by compression where weare able to select the best K−terms of the signal and only keep theirquantized values. Note that the dashed line therefore correspondsso to say to M = K (with the M “measurements” being in fact theK non zero terms of x) and the quantizer resolution b varying asBK in function of the allowed total number of bits B. It is the bestthat we could possibly do with the Lloyd-Max quantizer. We see thatthe NSNR of this optimal quantization-compression is growing veryquickly (much faster than QIHT). This strong slope is due to the verysmall value of K that requires small B to obtain high precision (e. g., toobtain a 32−bit precision, it requires only B = bK = 512 bits) 4. Lesssparse signals (or in general less compressible signals) would resultin less abrupt slope for optimal compression but also most probablyin lesser QIHT reconstruction quality (if any reconstruction at higherK is possible). We emphasize the fact that such compression must bedone after the quantization of the whole signal (soN samples must be

4 We neglect the fact that we must also know the position of the non zero coefficients,i. e., the support of x, if we want to be able to reconstruct it. If all the positionsare equiprobable, we need log2(N) = 10 additional bits per measurement in orderto keep this information. If these additional bits were taken into account, the linewould simply be shifted by K log2(N) = 160 to the right. If the probabilities are notequally distributed (as it is the case in image or sound compression), it means thatH(supp x) < log2(N) and we can use entropy coding for these indexes to reducetheir average “physical bits cost”. This is what is basically done for example in JPEGand MP3 compression standards.


quantized). Adding the fact that high-resolution quantization is slow and a major consumer of power, we see that everything comes at a price.


Figure 26: Comparison of the performance of QCS reconstructed by QIHT with optimal quantization-compression quality (simple ADC followed by compression, dashed black line) and with quantized uncompressed sensing (simple ADC without compression, long horizontal lines). Remark that the latter follows approximately the classical “+6 dB per bit” rule.

The second set of lines drawn on that figure, i. e., the 5 long flat lines, indicates the NSNR levels of an uncompressed quantized sensing (a simple ADC). To obtain these levels, we proceed as follows. We assume that x is in fact the (inaccessible) sparse transform of some signal x̃ in some orthonormal basis. The acquisition chip is assumed to be unable to do any kind of compression, i. e., the only thing it can do is acquire the N Nyquist samples of x̃ and quantize them. Here, we chose the DCT as orthonormal basis because it is simple, real, and spreads the energy of x among all the x̃ coefficients. The lines simply indicate the levels of NSNRdB(Qb(x̃), x̃) for b ∈ {1, . . . , 5}. The markers indicate the total number of bits required to code these uncompressed quantized signals. We see that, as expected, the performance of an uncompressed ADC is far below that of quantized compressed sensing reconstructed using QIHT.


3.4 rmpi qcs device – model and numerical validation

In this section, we present a more concrete implementation of quantized compressed sensing and reconstruction with QIHT. It is a compressed sensing device for one-dimensional signals. The model and numerical results, both introduced hereafter, are the fruit of interesting interactions with Loïc Neu, another master’s student, whose thesis was about the mixed analog-digital design and behavioral simulation (with Verilog AMS) of such a sensor. These interactions allowed us to test the QIHT algorithm on a realistic case.

3.4.1 Device presentation

The imagined device is called the random modulator pre-integration (RMPI), which is composed of several parallel branches of random demodulators (RD). The latter has already been studied by several researchers, e. g., in [KLW+06, LKD+07, TLD+09, PAL13]. It could be used to sample any one-dimensional bandlimited signal x(t). In the case studied by Loïc Neu, the device is designed to acquire, at low power, simple audio signals. For simplicity of this study, the audio signals are, in fact, supposed to have a rather poor frequency content, i. e., to be sparse in the Fourier domain (and even more specifically, as will be seen later, they are supposed to contain only a few non-phase-shifted pure tones). This assumption meets the discussions in Chapter 2 on the fact that a signal can be reconstructed with some accuracy provided that it is sufficiently compressible, but that the quality is easier to control if the signal is exactly K−sparse in a given basis. The audio signals that would actually be acquired with such a device would in fact be compressible in the Fourier domain or in a dictionary of wavelets (or a combination of both). This is why we can say that this study is a first, idealized approach to the scenario.

The electronics of such a device cause many non-idealities, which are briefly discussed in the last sub-section. For the simulations, we have considered only a very limited number of these non-idealities. The results obtained with simulated signals can therefore differ significantly from reality.

Figure 27 shows a block diagram of the studied RMPI device. In this section, the indexes n ∈ [N] and l ∈ [B] in square brackets (e. g., in x[n] and y[l]) indicate temporal indexes of discrete sequences (with different sampling frequencies), while the subscripts m ∈ [MB] of the outputs ym[l] indicate the index of the branch.

1. An analog signal x(t), whose bandwidth is limited to 1/2T , is captured and sent to MB different RD (random demodulator) branches.


2. In each branch, it is mixed (multiplied) with a pseudo-random square wave sm(t), whose edges fall on multiples of T , in order to spread the spectrum of xm(t) = sm(t)x(t) across the whole band (see [TLD+09]).

3. Then it is integrated over a period NB × T . This is the “accumulate” phase.

4. Then it is sampled and the integrator is reset to 0. This step, called “dump”, can possibly last one or more cycles T . Note that this “integrate and dump” approach has approximately the effect of a low-pass anti-aliasing filter on xm(t). MB measurements zm[l] are produced (one every T × (NB +Ndump)).

5. Finally, the sampled measurements are digitized using the quantizer Qb to produce the MB sequences of quantized outputs ym[l].

The MB measurements are concatenated into a large vector until B snapshots are taken and a time N × T = B × (NB +Ndump) × T has elapsed. A measurement vector thus contains M = B×MB measurements.


Figure 27: Block diagram of the simulated sensing device with continuous input


3.4.2 Model and idealization

3.4.2.1 Signal model

We will consider the discrete equivalent of the signal x(t) over a duration N · T , i. e., in the mathematical model of the circuit, it is assumed to be sampled, after a low-pass anti-aliasing filter, at the frequency 1/T (above the Nyquist frequency). We therefore manipulate a discrete signal of N samples x[n] = (1/T) ∫_{nT}^{(n+1)T} x(t) dt (with n ∈ [N]).

This signal is assumed K−sparse in the DCT basis, i. e., as we have seen in Chapter 2, x = Ψα where, with the explicit formula for the DCT transform,

αk = ∑_{n=1}^{N} x[n] cos[ (π/N) (n − 1/2) k ],  k ∈ [N], (62)

only K of the DCT coefficients αk are non-zero 5.

As before in the Matlab benchmark, the numerically tested signals are generated by selecting the K indexes that constitute the support of α uniformly at random and by assigning to each entry αk of that support a random value following a normal distribution N(0, 1). The inverse DCT Ψ is then applied to α in order to obtain x.
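This generation procedure can be sketched as follows (a Python/NumPy stand-in for the Matlab benchmark; we use SciPy’s orthonormal DCT-II, which matches (62) up to normalization):

```python
import numpy as np
from scipy.fft import idct

rng = np.random.default_rng(0)
N, K = 1024, 16

# Support of alpha chosen uniformly at random, values drawn from N(0, 1).
alpha = np.zeros(N)
support = rng.choice(N, size=K, replace=False)
alpha[support] = rng.standard_normal(K)

# x = Psi * alpha: inverse orthonormal DCT-II (scipy's idct with norm='ortho').
x = idct(alpha, norm="ortho")
```

Since the transform is orthonormal, the Euclidean norm of x equals that of α, which is convenient for NSNR computations.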

3.4.2.2 Block diagram and matrix representation

We can simplify the diagram of Figure 27 by assuming that the input signal is the discrete equivalent x[n]. The pseudo-random square wave sm(t) can therefore be replaced by a sequence sm[n] of pseudo-random ±1. Similarly, the integration is reduced to a sum of the last NB samples. Figure 28 shows such a simplified block diagram, simply called the discrete equivalent of the previous diagram.

This second manner of drawing the diagram allows us to model the behavior of the circuit by means of a sparse block-diagonal matrix. The non-zero entries of this sensing matrix Φ correspond to the sm[n] = ±1 sequences.

The fixed parameters are the number MB of branches, the sampling frequency (above Nyquist) 1/T , the period NB of the integrate and dump filter, and the number of cycles Ndump required to reset the integrator. By setting the number of snapshots B, we set the duration (counted in number of cycles) of a complete sampling, N = (NB + Ndump) × B, and the length of a measurement vector, M = MB × B. Note that B is, by construction, the number of blocks (of

5 The indexes of the DCT coefficients are noted as subscripts because they are not temporal.



Figure 28: Block diagram of the discrete equivalent of the simulated sensing device

size MB ×NB) of the sensing matrix Φ. Finally, we obtain numerically the measurement vector y by y = Qb(Φx) = Qb(ΦΨα), where Qb is the quantizer applied element-wise.

Figure 29 shows how to represent the sensing process in matrix form. Figure 30 represents one example of a realization of the pseudo-random matrix Φ with NB = MB = 32, B = 128 and Ndump = 1, which gives M = 4096 and N = 4224.



Figure 29: Matrix representation Qb(Φx) = y of the sensing scenario


Figure 30: An example of one sensing matrix realization in colored view (red = +1, green = 0, blue = −1). The rectangle box shows a zoomed view of a few blocks of pseudo-random ±1. Here NB = MB = 32, B = 128 and Ndump = 1, which gives M = 4096 and N = 4224.


3.4.3 Results and discussions

3.4.3.1 First experiment: which quantizer to choose?

The first benchmark test of that model is designed to compare the quality performance for different quantizer resolution levels. The quality is mainly evaluated via the NSNR as a function of the total number of bits allowed at the output of the device.

We use basically the same “Monte Carlo” methodology as in the previous section. We randomly generate 100 K−sparse signals x. The only difference is that they are sparse in the DCT, i. e., we randomly generate K values αk following a normal distribution N(0, 1) and select the support of α uniformly at random among [N], with N = 1024 and K = 16. Then we compute x by applying the inverse DCT transform to α, i. e., as in Chapter 2, x = Ψα.

The device is assumed to contain MB = 32 branches. We determine the number of blocks (or snapshots) B ∈ {1, 2, . . . , 512}. Note that this is equivalent to choosing the sampling frequency. These blocks divide the N samples into a total of B×MB = M ∈ {32, 64, ..., 16384} measurements. We assume, for simplicity of the experiment, that the dumping step is instantaneous (i. e., Ndump = 0). The quantizer is, again, a Lloyd-Max optimized quantizer. As in the previous section, we estimate numerically the variance of the measurements ym[l]. The number of allowed bits per measurement at the quantizer output is chosen in b ∈ {1, 2, 3, 4, 5}. For every couple (M,b), we generate the sensing matrix Φ as represented on Figure 29.

The QIHT algorithm will try to recover the sparse vector α, given Φ and the sparsity transform Ψ. The algorithm is identical, but we must use Θ = ΦΨ instead of just Φ. Mathematically, it is equivalent to apply the fast transform Ψ to α(n) before applying the sparse matrix Φ but, obviously, this way of computing the iteration is much faster. Conversely, when we have to apply ΘT , we can instead apply the sparse matrix ΦT and only then apply the adjoint of Ψ, i. e., the fast DCT transform ΨT . Furthermore, in the end, we have to transform the recovered α⋆QIHT to find x⋆QIHT = Ψα⋆QIHT.

More explicitly, the algorithm reads

α(n+1) = HK[α(n) + µΨTΦT(y − Qb(ΦΨα(n)))], (63)

with α(0) = 0 and µ = µ⋆(M), i. e., the heuristic defined in the previous section. The stopping criterion is ‖α(n+1) −α(n)‖/‖α(n+1)‖ < 10−4 or when the number of iterations exceeds n = 500.
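Iteration (63) can be sketched as follows (a hedged Python illustration with b = 1, i. e., Qb = sign, Ψ = Id so that Θ = Φ, a fixed step µ instead of the µ⋆(M) heuristic, and illustrative sizes):

```python
import numpy as np

rng = np.random.default_rng(2)
N, M, K = 128, 512, 4

# Small synthetic instance with Psi = Id and Q_1 = sign.
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
alpha_true = np.zeros(N)
alpha_true[rng.choice(N, size=K, replace=False)] = rng.standard_normal(K)
y = np.sign(Phi @ alpha_true)

def hard_threshold(v, K):
    """H_K: keep the K largest-magnitude entries of v, zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -K)[-K:]
    out[idx] = v[idx]
    return out

def qiht(y, Phi, K, mu, n_max=500, tol=1e-4):
    """Iteration (63): alpha <- H_K[alpha + mu * Phi^T (y - Q_b(Phi alpha))]."""
    alpha = np.zeros(Phi.shape[1])
    for _ in range(n_max):
        alpha_new = hard_threshold(
            alpha + mu * Phi.T @ (y - np.sign(Phi @ alpha)), K)
        if np.linalg.norm(alpha_new - alpha) < tol * np.linalg.norm(alpha_new):
            return alpha_new
        alpha = alpha_new
    return alpha

alpha_hat = qiht(y, Phi, K, mu=0.5)
```

In the real benchmark, the products by Ψ and ΨT would be the fast (inverse) DCT applied before Φ and after ΦT, as explained above.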


Figure 31: Average NSNR of QIHT for 5 different quantizers (b ∈ {1, . . . , 5}): (a) as a function of the number of measurements M; (b) as a function of the total number of bits B = bM.

The average NSNRdB is then computed. The result is shown on Figure 31. It is also interesting to observe the number of convergence failures6 on Figure 32. The number of failures is large when B is small. Except for the special case b = 1, the algorithm converges much better when B exceeds 103, which corresponds approximately to the abrupt transition observed on Figure 31b. We also see that when M is large, the NSNR becomes smaller and the number of failures increases slowly. We emphasize that since MB and N are constant, a large M means a large B and therefore a small NB. In other words, when M becomes large, the number NB of accumulated samples in each branch for each snapshot becomes ridiculously small. It means that the outputs of the MB branches will contain a lot of redundancy (which is what we want to avoid, see Chapter 2). Indeed, when NB is small, the number of binary combinations of ±1 is more and more reduced. In particular, when this number, which is in fact equal to 2^NB, becomes smaller than MB = 32, there will inevitably be some measurements that are exactly identical and we lose the advantage of adding measurements. In fact, the extreme case NB = 1 is basically equivalent to a simple ADC with 32 times the same information (the sign of the measurement ym does not bring any additional information since it is randomly determined by sm). Needless to say, this configuration would be extremely wasteful and totally useless.

As in the previous section, we can compare the performance of the device with the simple uncompressed sensing provided by a classical ADC. On Figure 33, the performances are actually shown in terms of NSNR as a function of the bit budget H = M×H(Ym[l]). As discussed previously, this bit budget is the total number of bits

6 Recall that a reconstruction is qualified as a failure when the maximum number of iterations is reached (in this case 500).



Figure 32: Number of failures (n = 500) of QIHT as a function of the total number of measurements (logarithmic scale) for 5 different quantizers

that would be required if we used entropy coding on the output bit stream. As a matter of fact, for a Lloyd-Max quantizer, it is smaller than B. In any case, the comparison with an uncompressed ADC is unquestionable: the compressed sensing scenario is (fortunately) far better than a simple coarse ADC.
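The entropy-based bit budget can be estimated empirically; the sketch below (Python, with a uniform quantizer standing in for the Lloyd-Max quantizer of the text) computes H(Y) from the histogram of the quantizer outputs and the budget H = M·H(Y) ≤ M·b = B:

```python
import numpy as np

rng = np.random.default_rng(3)

def empirical_entropy(symbols):
    """Empirical Shannon entropy (bits per symbol) of a discrete sequence."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

# b-bit uniform quantizer on Gaussian measurements (a stand-in for the
# Lloyd-Max quantizer of the text): 2^b bins, inner thresholds on [-3, 3].
b, M = 3, 100_000
z = rng.standard_normal(M)
edges = np.linspace(-3.0, 3.0, 2**b - 1)
y = np.digitize(z, edges)            # bin index in {0, ..., 2^b - 1}

H_per_meas = empirical_entropy(y)    # H(Y) < b: bins are not equiprobable
bit_budget = M * H_per_meas          # H = M * H(Y) <= M * b = B
```

Because the bins of a Gaussian-optimized quantizer are not equiprobable, H(Y) is strictly below b bits, which is exactly why the entropy-coded budget is smaller than B.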


Figure 33: Comparison of the performance of the device (light color curves) with a classical ADC (dark straight lines). The NSNR is shown as a function of the (information-theoretic) bit budget H = M×H(Ym[l]), i. e., the minimum number of bits that would be required if we used entropy coding on the output bit stream. The bit budget required for a classical ADC to encode N samples (without DCT compression but with entropy coding) is indicated by big markers on the lines.


3.4.3.2 Second experiment: influence of BER on BIHT reconstruction

This second experiment studies one particular setting of the device. This setting was given by Loïc Neu, who was working on the behavioral simulation of the mixed analog-digital electronic device with Verilog AMS. The parameters are the following:

• NB = 32: number of accumulated discrete-equivalent samples.

• Ndump = 1: number of cycles necessary to reset the accumulator.

• MB = 32: number of RMPI branches.

• B = 128: number of measurements taken per branch.

• K = 16: sparsity of x in the DCT (i. e., number of non-zero frequencies).

This gives N = 4224 and M = 4096 (i. e., exactly the example presented on Figure 30). We select a simple binary quantizer, b = 1, i. e., Q1(·) = sign(·). This is the simplest possible quantizer: it can be implemented to be cheap, extremely fast, compact and low-power. It can also easily be replicated MB times and placed at the end of each branch.

Another side advantage of such a binary scheme is that noise can be modeled as a binary error probability, i. e., a probability that the quantizer makes the wrong decision. In order to study the influence of the binary error rate (BER) p on the quality of reconstruction, we materialize this probability by a random variable ϑ such that

ϑ = { 1 with probability 1 − p; −1 with probability p. (64)

Mathematically, we only have to multiply the noiseless measurements by this random variable (drawn independently for each measurement), which gives

y = ϑQ1(Φx). (65)

In the Matlab benchmark, we test several logarithmically spaced BER values p from 10−6 (i. e., one bit error over a million measurements) to 10−1 (i. e., one error over ten measurements). We use the same methodology as in the first experiment, i. e., 100 samples per p value. The stopping criterion and µ are identical and the NSNR is also computed the same way. The result is displayed on Figure 34. We see that this sensing-reconstruction scheme is not robust at all to errors. Indeed, the NSNR falls dramatically when the BER exceeds 1 per 1000. This is easily explained when we look back in Chapter 2 at


Figure 14. If one of the constraint planes indicates the wrong half-space, the solution of the problem may simply not be found. In order to obtain a good reconstruction quality, we must therefore choose extremely reliable components or explore other reconstruction possibilities.


Figure 34: NSNR of the reconstruction with BIHT (blue) and BIHT-ℓ2 (dashed orange) as a function of the input bit error rate (BER).

Another algorithm has also been tested in order to compare the robustness against binary faults. This algorithm, called BIHT-ℓ2, is described in [JV10], i. e., the same paper as BIHT. It is a variant of BIHT that tries to approximately minimize the one-sided ℓ2−norm (instead of the one-sided ℓ1−norm minimized by BIHT, see Chapter 2). The algorithm is defined by the following iteration:

α(n+1) = HK[α(n) − µΨ∗Φ∗Dy(DyΦΨα(n))−], (BIHT-ℓ2)

with (·)− the element-wise negative part and Dy a matrix whose diagonal is filled with the entries of y. We refer the reader to [JV10] for further details. For this algorithm, we define µ = 1/MB and α(0) = Ψ∗Φ∗y/‖Ψ∗Φ∗y‖. Apart from that, the stopping criterion and the setup of the experiment stay identical7. As we can see on Figure 34, BIHT-ℓ2 gives only a slightly better result than BIHT when the BER exceeds 1%. For low BER, on the other hand, the BIHT-ℓ2 performance is quite poor.
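A single BIHT-ℓ2 step can be sketched as follows (hedged Python, again with Ψ = Id, illustrative sizes, and a step µ = 1/M of our choosing; the name biht_l2_step is ours):

```python
import numpy as np

rng = np.random.default_rng(5)
N, M, K = 128, 512, 4

Phi = rng.standard_normal((M, N)) / np.sqrt(M)   # Psi = Id here for brevity
alpha_true = np.zeros(N)
alpha_true[rng.choice(N, size=K, replace=False)] = rng.standard_normal(K)
y = np.sign(Phi @ alpha_true)

def hard_threshold(v, K):
    """H_K: keep the K largest-magnitude entries of v, zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -K)[-K:]
    out[idx] = v[idx]
    return out

def biht_l2_step(alpha, y, Phi, K, mu):
    """One BIHT-l2 iteration: the gradient of the one-sided l2 objective
    only involves the sign-violating terms, via the negative part (.)^-."""
    v = y * (Phi @ alpha)              # D_y Phi alpha, element-wise
    neg = np.minimum(v, 0.0)           # (D_y Phi alpha)^-
    return hard_threshold(alpha - mu * (Phi.T @ (y * neg)), K)

alpha = Phi.T @ y
alpha /= np.linalg.norm(alpha)         # alpha(0) = Phi* y / ||Phi* y||
for _ in range(100):
    alpha = biht_l2_step(alpha, y, Phi, K, mu=1.0 / M)
```

Only the constraints that are currently violated (negative entries of DyΦΨα) contribute to the update, which is what makes the objective “one-sided”.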

7 NB: We do not use the Hamming distance proposed in the article.


3.4.3.3 Proposed modification of the device against bit errors

We propose the following modification in order to make the device more robust against noise. This modification implies adding 1 or 2 comparators, or using a “3-bin” (log2(3) bits) quantizer. The approach follows the principle described hereafter. If a noisy measurement before its quantization is near 0, there is a significant probability that the value has the wrong polarity (the wrong sign) because of the noise. Conversely, when a measurement is, because of the noise, wrongly quantized, there is a high probability that the original noisy measurement was near the threshold, i. e., near 0. This is why we propose to define a minimum absolute value threshold τdiscard for the unquantized measurements zi. If |zi| > τdiscard then yi = sign(zi); otherwise, we simply throw zi away because its sign is not trustworthy. The value of τdiscard must be chosen with regard to the noise power (or variance) and according to a chosen BER. As one could infer from the NSNR versus B (Figure 31b, blue curve with ‘+’ markers) and NSNR versus BER (Figure 34, blue curve with ‘+’ markers) graphs, a reconstruction algorithm like BIHT will probably work better with fewer measurements that are 100% correct than with more measurements in total but among which 90% are correct and 10% are wrong.

A simulation was done in order to test the robustness of BIHT against the elimination of the measurements of smallest absolute value. The same framework as in the experiment about the BER was used, i. e., NB = 32; Ndump = 1; MB = 32; B = 128; K = 16. We propose to define the probability of discarding a measurement, pdiscard. After a numerical evaluation of the standard deviation σz of the random variable Zi corresponding to an unquantized measurement, we set the threshold τdiscard so that

P(|Zi| < τdiscard) = pdiscard. (66)

This is easily calculated since we assume that Zi follows N(0,σz). The probability pdiscard is varied logarithmically from 10−6 to 10−1 and then more finely from 10−1 to almost 1. Figure 35 shows the results. We can see that the results are promising. Indeed, when the proportion of discarded samples is not too high, the NSNR remains relatively good, e. g., when pdiscard = 0.01, we have NSNRdB ≈ 30dB. Even when most of the measurements are discarded (pdiscard > 0.1), the fact that we keep the most “significant” ones (in terms of absolute value) seems to slow down the fall of the NSNR, which stays above 15dB almost until pdiscard > 0.3. However, we cannot conclude about the real efficiency of the modified device if we only consider these graphs. Indeed, the BER experiment conducted in the previous subsection does not consider the amplitude of the measurements that are wrongly quantized. We corrupt indistinctly “big” and “small” measurements (in terms of absolute value). If the noise is approximately



Figure 35: NSNR obtained by BIHT when discarding samples below a threshold τdiscard fixed such that the probability of discarding is pdiscard. The left graph (a) is logarithmically plotted for low probability (from 10−6 to 0.1), the right graph (b) is a finer plot for high discarding probability (from 0.1 to 0.9, i. e., when discarding almost all the samples).

Gaussian, as explained above, a small measurement is more likely to “pass to the other side” of the threshold than a big one. The device modification is even based on that principle.

This is why we conducted another experiment. This time we model the errors as a noise whose power is determined by the probability of bit flips (i. e., the average BER). For a random measurement Z ∼ N(0,σz) (with associated PDF fZ(z)), a noise N ∼ N(0,σn) (with associated PDF fN(n)) and based on the BER, we have

BER = P((Z+N) > 0 |Z < 0) = P((Z+N) < 0 |Z > 0), (67)

i. e., the probability that a noisy measurement is positive (resp. negative) while the original measurement was negative (resp. positive). The value of σz is still numerically evaluated (as in the other experiments). We have to determine the standard deviation of the noise corresponding to a given BER. Actually, if we define Z+ as the variable Z conditioned on Z > 0, we have

fZ+(z) = 2fZ(z) if z > 0, and 0 otherwise. (68)

So we can define Y = Z+ +N, with PDF given by

fY(y) = ∫_{0}^{∞} 2fZ(z)fN(y − z)dz (69)

and finally,

P(Y < 0) = ∫_{−∞}^{0} fY(y)dy. (70)


We now have a way to compute the BER, given σz and σn, by numerical integration of the PDF. We simply use the Matlab function integrate. However, what we want is to invert the relation and find σn given the BER and σz. Again, we use a built-in Matlab function to do this: the non-linear equation solver fsolve. The computation is relatively slow (a few seconds) but gives a correct result.
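The same computation can be sketched in Python/SciPy (a hedged stand-in for the Matlab code; quad replaces the numerical integration and brentq replaces fsolve):

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq
from scipy.stats import norm

def ber_from_sigma_n(sigma_n, sigma_z):
    """Eqs. (68)-(70): BER = P(Z+ + N < 0) = int_0^inf 2 f_Z(z) P(N < -z) dz.
    The Gaussian pdf is negligible beyond 10 sigma_z, so we truncate there."""
    integrand = lambda z: 2.0 * norm.pdf(z, scale=sigma_z) * norm.cdf(-z / sigma_n)
    val, _ = quad(integrand, 0.0, 10.0 * sigma_z)
    return val

def sigma_n_from_ber(ber, sigma_z):
    """Invert BER(sigma_n) by bracketed root finding (our stand-in for fsolve)."""
    return brentq(lambda s: ber_from_sigma_n(s, sigma_z) - ber,
                  1e-6 * sigma_z, 100.0 * sigma_z)

sigma_z = 1.0
sigma_n = sigma_n_from_ber(0.01, sigma_z)
```

Since the BER is monotonically increasing in σn (more noise, more flips), a bracketed root finder is a natural choice here.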

Now that the noise power is fixed so as to imply binary errors with a known probability, we can fix the threshold τdiscard based on that probability. Remark that we will also discard a lot of measurements that were correctly quantized. If we want to have a chance to discard all the errors, we have to discard a lot more than the BER. This is why we set the probability of discarding measurements higher than the BER, e. g., in this case we set pdiscard = 3 BER. Figure 36 displays the result of this experiment with a BER varying from 10−4 to 10−1.


Figure 36: In dashed black, the reference NSNR of the reconstructions not affected by the noise (≈ 41dB). In blue, with ‘+’ markers, the curve affected by the errors. Finally, in red with ‘o’ markers, the curve where the measurements have been affected by the noise and the noisy measurements smaller than τdiscard have been discarded.

Surprisingly, discarding small samples with pdiscard = 3 BER seems to be inefficient. When the number of binary errors is around 1%, discarding small samples degrades the NSNR. This unexpected result can be interpreted as follows. According to the results of the simulation (not all shown here), among the discarded measurements, only 28% (on average) are effectively binary errors. This means that, on average, 72% were good measurements, discarded by “mistake”. Moreover, we discard on average 87% of the errors, which means that we leave 13% of the wrong measurements. In other words, we approximately divide the apparent BER by 10 at the expense of deleting a lot of useful measurements. In this case, the gain in BER is obviously lower than the


loss in number of measurements.

We also tried other values for pdiscard (1, 2, 4, 5 and 6 times the BER) with fewer samples (10 instead of 100), but they all gave similar or worse results. For pdiscard = BER, the “discard” curve was 1 or 2 dB above the “noisy” curve when BER = 0.1 (i. e., around 15dB instead of 14). But anyway, since the number of samples was low, this difference could not be considered statistically significant.

However, another observation can be made on this graph when we compare it to Figure 34. The NSNR degrades much more slowly with the BER when the errors are truly generated by pre-quantization white noise than when the errors are generated uniformly at random among the measurements. This can be explained by the fact that small-amplitude measurements are more easily corrupted than big ones. If you recall the interpretation of 1-bit CS illustrated on Figure 14, a small-amplitude measurement means that the objective is near the hyperplane defined by the constraint. It also means that if the reconstructed signal is on the wrong side of that hyperplane but near the frontier, the error will probably be small. So, to summarize, when the pre-quantization noise is Gaussian (or more generally when the noise PDF is concentrated around 0), a large proportion of the bit errors give only a small reconstruction deviation and the resulting reconstruction NSNR is higher than with uniformly random bit flips.

Another possible attempt to improve reconstruction, instead of discarding the small samples, would be to consider the modified device as a 3-bin quantizer: two extreme bins for the measurements that are “certain” (i. e., above τdiscard in absolute value) and one central bin for “uncertain” measurements. As noted previously, the number of bins is 3, so it would be a log2(3)−bit quantizer. A reconstruction with QIHT could improve (probably a little bit) the situation, since we keep an information that was thrown away in the previous case. However, doing this adds the issue of the measurements that are considered big because of the noise but were in fact below the threshold (and vice versa). With the same noise level, this would probably be less frequent than the “flips” around 0 considered previously, but it would happen anyway. The best would be to find a way to consider the “uncertain” measurements as such, i. e., use the information that the measurement was “probably small” without creating an inviolable constraint (like the half-space constraints in 1-bit CS). We should find a way to give more weight to measurements of bigger amplitude, i. e., less probable, i. e., more informative. These tracks are left as open questions for further research.


3.4.4 Non-modeled non-idealities

There are several non-idealities that have not been included in the model and could dramatically influence the performance of the device. Among others, some of the following are studied by Loïc Neu:

• The comparator (1−bit quantizer) can be affected by a natural random offset. This very common problem can be due to tiny variations in the manufacturing process of the circuit. This means that the threshold of the quantizer can be located at an unknown value slightly different from 0. We can imagine two solutions to tackle that problem. First, we could use the modification mentioned just above. If the minimum absolute value threshold is set above the absolute value of the random threshold, the effect of the offset is simply bypassed: ambiguous values are thrown away or considered uncertain (“somewhere” in the central bin). The second possibility would be to calibrate the device in order to know exactly where the threshold is. We could then use the exact model of the quantizer in QIHT and reconstruct the signal without trouble.

• The mixing imprecision. The pseudo-random square wave sm(t) may not be ideally mixed with the input signal. Indeed, several issues can affect the process. First, transistors have a nonlinear behavior and necessarily introduce distortions. Second, every part of the electronics is affected by parasitic capacitance. For example, the capacitance of the transmission lines between the clock generator and the inverter clock gate implies, among other causes, some delay in the switching of the square wave depending on the physical location of the inverter. This delay is called clock skew. Another random delay, called clock jitter, can be added to this. These delays are particularly important when dealing with high frequencies. Another parasitic capacitance, this time at the inputs of the inverter, causes the switching to be non-instantaneous. There will necessarily be a certain slew rate when the waveform goes from −1 to 1 or conversely. The time that one transition takes is called the aperture. All these effects will have an impact on the mixed waveform xm(t) and may eventually result in a higher binary error rate.

• The low-pass filter. The “accumulate and dump” strategy used in the device is also susceptible to be affected by electronic non-idealities. We could use the knowledge of some of them to build a linear model of the equivalent low-pass filter. The model could then be included in the sensing matrix Φ.

• The “input” noise. Electronics necessarily produce several types of noise (white noise, pepper noise, shot noise, . . . ) depending on the temperature, on the quality of the semiconductors and so on. The sum of all these internal noises and other external causes, such as electromagnetic radiation, is often modeled as an “equivalent” additive white Gaussian noise at the input or, as in the NCS framework, just before the quantization. If we want a complete study of the performance of the device, it is necessary to consider the influence of this noise on the reconstruction (e. g., as in Chapter 2 with BPDN and IHT).

Once modeled, some of them could be included in the reconstruction priors and decrease the number of “faulty” bits. The same philosophy is actually used when designing BPDN and QIHT to reach reconstructions robust to noise and quantization respectively: when the algorithm takes the non-ideal parts of the sensing into account in its model of consistency (`2 deviation ε for BPDN, quantized consistency Eb instead of a continuous quadratic penalty for QIHT), it has a better chance to stick to reality and recover the right sparse signal. From a mathematical point of view, we enlarge the search space by relaxing the constraints to what we think is certain (bounded noise, quantization consistency) and, by a “leap of faith”, trust the sparsity ball ΣK (or the minimum `1 ball in the case of BPDN) to intersect this larger search space at (or close to) the right spot. This is also mandatory to guarantee that the original (pure) signal remains a feasible point of the reconstruction constraints. A more precise (possibly stochastic) model of the sensing process would allow us to focus the optimization on more realistic priors and extend the search space in the right direction.


Part III

EVERYTHING COMES TO AN END

This part presents the discussions and conclusions about this work and the future work that can be done.


4 FURTHER WORK AND DISCUSSIONS

To raise new questions, new possibilities,
to regard old problems from a new angle,
requires creative imagination
and marks real advance in science.

— Albert Einstein

You can’t make a half sandwich.
If it’s not half of a whole sandwich,
it’s just a small sandwich.

— Dr. Sheldon Cooper – The Big Bang Theory
Season 3, Episode 7: “The Guitarist Amplification”

In this chapter, we discuss what lies beyond the scope of this thesis and could or will be studied in future work.

4.1 robustness against noise

As we saw in Chapter 3, BIHT meets some trouble when it comes to dealing with pre-quantization noise. An improvement has been proposed to eliminate the unreliable measurements, but it turned out to be inefficient as it eliminates too many good measurements and too few corrupted ones. We have also proposed another way to possibly overcome that issue while keeping the same hardware as proposed, i. e., a single-bit quantizer and an absolute value threshold τdiscard to determine if the quantized measurement is big enough to be trusted. If the measurement is below the threshold, we keep it but we mark it as uncertain. We then have 3 possible values for a quantized measurement:

• strongly positive;

• strongly negative;

• uncertain.

This means that it can be viewed as a log2(3)-bit quantizer that can be used in QIHT as usual. However, as remarked earlier, the problem of “wrong bin” assignment remains when the noise is strong. To avoid that, we propose another trick to adapt QIHT by considering each entry as quantized with a different quantizer. If the measurement is uncertain, we consider the 3-bin quantizer described above to try to enforce the corresponding re-synthesized measurement to be under the threshold. If the measurement is above τdiscard, the entry of the measurement vector is treated as if it were quantized with an ordinary binary quantizer with a threshold at 0. This way, we allow more flexibility to the big measurements. So far, neither of these improvements of the method has been implemented or tested. It would be interesting to obtain numerical validations of these hypotheses.
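A hedged sketch of this per-entry treatment, written as a hypothetical mixed inconsistency measure (the function and its exact form are our own illustration, not a tested part of this work):

```python
import numpy as np

def mixed_inconsistency(z, q, tau):
    """Hypothetical per-entry inconsistency for the mixed scheme above.
    q[i] in {-1, +1} marks a trusted sign; q[i] == 0 marks an uncertain
    (small) measurement. Trusted entries pay for a sign flip (as in the
    1-bit energy), uncertain ones pay for exceeding the threshold tau."""
    certain = q != 0
    # 1-bit-style penalty 2*||(q o z)_-||_1 on trusted entries
    j_certain = 2.0 * np.sum(np.maximum(0.0, -q[certain] * z[certain]))
    # push re-synthesized uncertain measurements below tau
    j_uncertain = np.sum(np.maximum(0.0, np.abs(z[~certain]) - tau))
    return j_certain + j_uncertain

z = np.array([0.5, -0.4, 0.3])   # re-synthesized measurements Phi u
q = np.array([1, 1, 0])          # third entry was marked "uncertain"
print(mixed_inconsistency(z, q, tau=0.1))  # 1.0
```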

On the same topic, if we want to obtain realistic guarantees about quantized compressed sensing in general (not only 1−bit CS), it appears necessary to study the influence of additive pre-quantization noise, i. e.,

y = Qb(Φx+n). (71)

The work in [ZBC10] deals with Gaussian noise on the measurements before quantization using an `1-penalized maximum likelihood reconstruction method, but very few other works have emerged on the topic. This is why it would be useful to consider in detail the implications of noise corruption on QCS. In particular, it would be interesting to test numerically the reaction of QIHT (with b > 1) to corrupted quantized measurements. Some of the ideas just presented for BIHT could then be adapted to higher resolutions.
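For such numerical tests, the sensing model of Eq. (71) can be simulated as follows. The uniform midrise quantizer below is only an illustrative stand-in for Qb(·), whose exact definition is given in Chapter 2; all dimensions and the noise level are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)

def uniform_quantize(z, b, alpha):
    """Uniform b-bit midrise quantizer on [-alpha, alpha].
    Illustration only; the thesis' Q_b may differ in detail."""
    delta = 2.0 * alpha / 2**b                    # bin width
    idx = np.clip(np.floor((z + alpha) / delta), 0, 2**b - 1)
    return -alpha + (idx + 0.5) * delta           # bin centers

# Synthetic instance of y = Q_b(Phi x + n), Eq. (71)
N, M, K, b = 256, 64, 8, 3
x = np.zeros(N)
x[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
n = 0.01 * rng.standard_normal(M)                 # pre-quantization noise
y = uniform_quantize(Phi @ x + n, b, alpha=3.0)
```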

4.2 quantized embeddings and formal proofs

After the bridge between the coarse 1-bit and the idealistic high-resolution greedy reconstructions provided by BIHT and IHT, a natural extension would be to study the possibility of developing a property bridging the RIP and the BεSE. Indeed, these two properties provide a simple way to guarantee a certain reconstruction quality when the problem of finding an exactly or approximately consistent sparse signal is solved. As we saw in Chapter 3, QIHT tries to enforce consistency between the quantized measurements and the reconstructed signal. We even noted that it seems to succeed very well in doing so (at least in the tested benchmark framework). It therefore seems justified to study the possibility of finding theoretical guarantees about QCS when consistency is achieved. Recall that the RIP, when applied to the difference of two K−sparse vectors (if we skip the details, see Definition 5), writes

(1− δ2K)‖u− x‖22 ≤ ‖Φu−Φx‖22 ≤ (1+ δ2K)‖u− x‖22 . (72)
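The two-sided concentration expressed by (72) can be observed numerically for a Gaussian sensing matrix. The experiment below is a hedged illustration with arbitrary dimensions, not a verification of the RIP itself (which would require a supremum over all sparse pairs):

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, K = 512, 200, 8

def sparse_vec(rng, N, K):
    """Random K-sparse vector (illustrative)."""
    v = np.zeros(N)
    v[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
    return v

# Gaussian matrix with variance 1/M so that E||Phi w||^2 = ||w||^2.
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
ratios = [
    np.linalg.norm(Phi @ (u - x))**2 / np.linalg.norm(u - x)**2
    for u, x in ((sparse_vec(rng, N, K), sparse_vec(rng, N, K))
                 for _ in range(200))
]
print(min(ratios), max(ratios))   # both close to 1, as (72) predicts
```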


In parallel, the operator A(.) = sign (Φ .) is a BεSE, as defined in [JV10], if (for a certain ε, and also skipping the details)

dS(u, x) − ε ≤ dH(A(u),A(x)) ≤ dS(u, x) + ε , (73)

with the angular distance dS(u, x) := (1/π) arccos 〈u, x〉 and dH the mean Hamming distance. The objective, in order to bridge these very useful properties, would be to find an expression bounding “some distance” between two quantized measurements by either a quadratic `2-norm (as in the RIP) or an angular distance (as in the BεSE). The bound would have to hold for a particular combination of quantizer and sensing matrix. In other words, the property would be specific to a pair (Qb(.) ; Φ) and hold for any pair (u; x) of K−sparse signals. An idea would be to use the piece-wise linear energy Jb(z,y) defined during the development of QIHT (see Chap. 3) to state a hypothetical “quantized stable embedding” property like

(1−β)dX(u, x) −α ≤ Jb(Φu,Φx) ≤ (1+β)dX(u, x) +α , (74)

for some constants β and/or α possibly depending on K, and for a certain distance dX(u, x) (in analogy with dS(u, x) and ‖u− x‖22). Note that Jb(Φu,Φx) depends (obviously) on Φ but also on the quantizer Qb(.) (see its definition in Eq. 40).
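The near-isometry (73) can itself be observed numerically: with many 1-bit Gaussian measurements, the mean Hamming distance between the sign patterns tracks the angular distance. The following is an illustrative sketch with arbitrary dimensions, not a proof:

```python
import numpy as np

rng = np.random.default_rng(2)
N, M = 128, 2000            # many 1-bit measurements -> small epsilon

Phi = rng.standard_normal((M, N))
A = lambda v: np.sign(Phi @ v)          # A(.) = sign(Phi .)

def d_S(u, x):
    """Angular distance (1/pi) arccos <u, x> for unit-norm u, x."""
    return np.arccos(np.clip(u @ x, -1.0, 1.0)) / np.pi

def d_H(a, b):
    """Mean Hamming distance between two sign vectors."""
    return np.mean(a != b)

u = rng.standard_normal(N); u /= np.linalg.norm(u)
x = rng.standard_normal(N); x /= np.linalg.norm(x)
print(d_S(u, x), d_H(A(u), A(x)))       # the two distances nearly agree
```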

There is obviously still a lot of hard work to do in that direction. Thanks to such a property, we might be able to find theoretical bounds on the reconstruction quality as a function of the inconsistency energy. In particular, this, combined with a proof of convergence of QIHT towards a consistent solution, would lend more credit to the greedy scheme developed in this thesis. More generally, every possible QCS reconstruction scheme, greedy or based on an exact optimization program, would be theoretically consolidated by such a property.

4.3 minimizing inconsistency – exact optimization

This section discusses the possibility of using the findings presented in Chapter 3 to develop a novel QCS reconstruction strategy based on an exact minimization problem. The goal is to find a BPDN-like optimization program that would minimize the `1−norm of the reconstructed signal while keeping control of the inconsistency energy developed in Chapter 3.

4.3.1 Motivations

The main goal is to solve QCS reconstruction by finding an exact solution to a well-posed optimization problem, rather than approximately solving a hard combinatorial problem as QIHT does (Eq. 39). The advantage of well-posed convex minimization programs is that there exists a multitude of algorithms that are guaranteed to converge toward the global minimum. Nonlinear convex optimization is, indeed, a large, advanced and active field of interest in (applied) mathematics because of the huge variety of applications that it covers. Doing so would also allow the use of hypothetical future results about the “quantized stable embedding” property imagined in the previous section. Indeed, an exact minimization of the inconsistency would then provide a reconstruction error that could be more or less easily bounded.

4.3.2 Proximal operators

One of the most recent concerns in convex optimization applied to inverse problems (like CS), and in particular in signal processing, is the idea of using proximal (or proximity) operators to define efficient parallelizable algorithms. In [CP09], P. Combettes and J.-C. Pesquet show how proximal splitting methods unify and extend well-known optimization algorithms and discuss applications in signal processing. An excellent introductory monograph, [PB13] by N. Parikh and S. Boyd, presents proximal operators with their properties, interpretations and the ways to compute them. They also give an overview of the main proximal algorithms and how (and why) they can be parallelized. In [CP10], A. Chambolle and T. Pock present a first-order primal-dual algorithm that is well suited to solving regularized problems like the LASSO form of BPDN. Some of the algorithms presented in the previous references were used in the UCL master's thesis of Thibault Lienart [Lie12] to solve inpainting problems, which are closely related to CS reconstruction.

In a nutshell, a proximal operator can be viewed as the generalization of the projection operator onto a convex set, or as an approximation of a gradient descent step. It is briefly defined as follows (see, e. g., [PB13] for details).

Definition 7. Let f : RN → R∪ {+∞} be a closed proper convex function. The proximal operator of the scaled function λf is the function

proxλf(v) = argminx ( f(x) + (1/(2λ))‖x− v‖22 ). (Prox)

We easily see that if iC is the indicator function of a convex set C, i. e.,

iC(x) = 0 if x ∈ C, +∞ if x /∈ C, (75)

then

proxiC(v) = argminx∈C ‖x− v‖2 = ΠC(v), (76)

the Euclidean projection on C. Under some assumptions, we also have that for a differentiable function f

proxλf(x) ≈ x− λ∇f(x). (77)

Let us emphasize that computing the proximal operator of the `1−norm gives the well-known LASSO problem, i. e., the shrinkage (or soft thresholding) operator:

proxλ‖.‖1(v) = argminx ( λ‖x‖1 + (1/2)‖x− v‖22 ) = Sλ(v). (78)
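As a quick sanity check of (78), the closed-form shrinkage operator can be compared with a brute-force minimization of the prox objective on a grid; the values of v and λ below are arbitrary:

```python
import numpy as np

def soft_threshold(v, lam):
    """S_lambda(v): the proximal operator of lam * ||.||_1 (Eq. 78)."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

# Grid minimization of lam*|x| + 0.5*(x - v)^2 matches the closed form.
v, lam = 1.3, 0.5
grid = np.linspace(-3.0, 3.0, 60001)
x_star = grid[np.argmin(lam * np.abs(grid) + 0.5 * (grid - v)**2)]
print(x_star, soft_threshold(np.array([v]), lam)[0])  # both ~ 0.8
```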

Among the useful properties of the proximal operator, let us give the two following.

First, when a matrix Φ is a tight frame, which means that ΦΦT = Id (but not necessarily ΦTΦ = Id, which would mean that the matrix is also orthogonal1), the proximal operator of a function f(x) = ϕ(Φx) is expressed in terms of proxλϕ(Φv) as

proxλf(v) = v+ΦT (proxλϕ(Φv) −Φv). (79)

Tight frame matrices are often encountered as sensing matrices. Among the examples given in Chapter 2, we can cite the class of matrices Φ = SU called the random basis ensemble: U is any orthogonal basis and S is a (random) selection operator, itself a tight frame, so that the product of the two is a tight frame.
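Property (79) can be illustrated on such a random basis ensemble Φ = SU with ϕ = λ‖·‖1. The check below, which compares the prox objective at the formula's output against random perturbations, is only a weak numerical sanity test (all dimensions and λ are arbitrary), not a proof:

```python
import numpy as np

rng = np.random.default_rng(4)
N, M, lam = 32, 12, 0.7

# Random basis ensemble Phi = S U: U orthogonal, S selects M rows.
U, _ = np.linalg.qr(rng.standard_normal((N, N)))
Phi = U[rng.choice(N, M, replace=False), :]
assert np.allclose(Phi @ Phi.T, np.eye(M))   # tight frame: Phi Phi^T = Id

soft = lambda z, t: np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def prox_f(v):
    """Eq. (79) with phi = lam*||.||_1, so f(x) = lam*||Phi x||_1."""
    z = Phi @ v
    return v + Phi.T @ (soft(z, lam) - z)

# The formula's output should beat random perturbations on the
# prox objective f(x) + 0.5*||x - v||^2.
v = rng.standard_normal(N)
obj = lambda x: lam * np.linalg.norm(Phi @ x, 1) + 0.5 * np.sum((x - v)**2)
x_hat = prox_f(v)
assert all(obj(x_hat) <= obj(x_hat + 0.1 * rng.standard_normal(N))
           for _ in range(50))
```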

The second property is called the Moreau decomposition. If we define

f∗(u) = supx (uTx− f(x)), (80)

the convex conjugate of a function f(x), then we can decompose any point v by

v = proxf(v) + proxf∗(v). (81)

In particular, this means that proxf∗(v) = v− proxf(v).
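The Moreau decomposition (81) is easy to verify numerically for f = ‖·‖1, whose conjugate is the indicator of the `∞ unit ball, so that proxf∗ is a clipping (projection) operator; a minimal sketch with arbitrary test points:

```python
import numpy as np

# Moreau decomposition (81) checked for f = ||.||_1 (lambda = 1):
# prox_f is soft thresholding and prox_{f*} is the projection onto the
# l_inf unit ball, since f* is the indicator of {u : ||u||_inf <= 1}.
rng = np.random.default_rng(3)
v = 2.0 * rng.standard_normal(10)

prox_f = np.sign(v) * np.maximum(np.abs(v) - 1.0, 0.0)  # S_1(v)
prox_fstar = np.clip(v, -1.0, 1.0)                       # Pi_C(v)

assert np.allclose(v, prox_f + prox_fstar)               # Eq. (81) holds
```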

4.3.3 Quantized inverse problem

The (noiseless) quantized compressed sensing problem can be reformulated as a convex optimization problem similar to basis pursuit. Let us recall the objective of the reconstruction. We want to find the sparsest signal that is consistent with the measurements, given the sensing matrix and the quantizer. In other words, we seek the minimum-sparsity signal u such that the inconsistency energy Jb(Φu,y) is zero:

argminu ‖u‖0 s.t. Jb(Φu,y) = 0. (82)

1 It would also work if ΦΦT = c Id for some constant c, but we omit this detail to keep things simple.

Note that this hard combinatorial problem is intimately linked to the problem that QIHT was trying to approximately solve,

argminu Jb(Φu,y) s.t. ‖u‖0 ≤ K. (83)

However, if we want to solve the problem efficiently (in polynomial time), we have to relax it to make it convex. As in BP, we replace the `0-“norm” by the `1−norm, which is convex:

argminu ‖u‖1 s.t. Jb(Φu,y) = 0. (84)

As in BPDN, we can also relax the constraint and allow some inconsistency, e. g., if the measurements are possibly affected by pre-quantization noise:

argminu ‖u‖1 s.t. Jb(Φu,y) ≤ εq, (85)

with a fixed εq > 0 that depends on the probability of corruption of the measurements. For a certain λq depending on εq, the problem is equivalent to the “LASSO” formulation

x? = argminu ‖u‖1 + λqJb(Φu,y), (QCS-LASSO)

for which, as we will see, it should be possible to implement a solver based on proximal operators (also called a proximal algorithm). However, the correspondence between λq and εq cannot be established a priori without solving the problem. If λq = 0, the inconsistency is not constrained and the solution will trivially be x? = 0. However, if λq is too big, the inconsistency is the only thing that will be minimized and the solution found may not be sparse at all. Recall that the inverse problem defined by QCS is largely under-determined and that there are many possible consistent solutions, among which we seek the sparsest. A classical result in convex optimization is that the solutions of (85) and (QCS-LASSO) follow the “same path” when λq or εq varies. To find the right value of λq, we can therefore start from a small value and increase it until Jb(Φx?,y), the inconsistency of x?, is below or equal to a chosen εq (which can be 0).

The particular case b = 1 leaves one issue: the ambiguity about the amplitude of the reconstructed signal. Indeed, if we do not impose anything on that amplitude, the `1−part of the objective will make the solution tend to 0 while consistency is preserved2. This implies that the only solution given by this binary version of the proposed optimization is x? = 0. To overcome that, we may add a constraint that forces the amplitude to be strictly bigger than 0, for example:

x? = argminu ‖u‖1 + λqJ1(Φu,y) s.t. yTΦu = ηq. (86)

Note that the advantage of this constraint is that it defines a convex set, so the problem remains convex. The value of ηq has no importance as long as it is strictly above 0 (it only determines the amplitude of the solution, which can anyway be normalized afterwards). This is why we can reformulate the problem as

x? = argminu ‖u‖1 + λqJ1(Φu,y) − µqyTΦu, (1bit-LASSO)

for a certain multiplier µq > 0 that depends on ηq, i. e., whose value has no importance as long as it is strictly positive.

4.3.4 Hints about a possible algorithm

In [CP10], the authors provide an efficient algorithm to solve, given a linear operator Φ : RN → RM (i. e., a matrix Φ ∈ RM×N), a function g(u) : RN → R and a function f(z) : RM → R, the generic problem

minu f(Φu) + g(u), (87)

provided we are able to compute the proximal operators of g(u) and of f∗(z), the convex conjugate of f(z).
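To fix ideas, here is a minimal sketch of a first-order primal-dual iteration of the kind described in [CP10] for problem (87), assuming the two proximal operators are available. The toy check instantiates it on a LASSO with Φ = Id, where the minimizer is known in closed form; all names, step sizes and dimensions are our own illustrative choices:

```python
import numpy as np

def chambolle_pock(Phi, prox_tau_g, prox_sigma_fstar, tau, sigma, iters=2000):
    """Generic primal-dual iteration for min_u f(Phi u) + g(u),
    given the prox of g and of the convex conjugate f* (sketch)."""
    M, N = Phi.shape
    u, z, u_bar = np.zeros(N), np.zeros(M), np.zeros(N)
    for _ in range(iters):
        z = prox_sigma_fstar(z + sigma * (Phi @ u_bar))  # dual step
        u_next = prox_tau_g(u - tau * (Phi.T @ z))       # primal step
        u_bar = 2 * u_next - u                           # over-relaxation
        u = u_next
    return u

# Toy check: LASSO with Phi = Id, f(z) = 0.5*||z - b||^2, g = lam*||.||_1.
# Then prox_{sigma f*}(w) = (w - sigma*b)/(1 + sigma) and the exact
# minimizer is the soft thresholding of b.
lam, tau, sigma = 0.3, 0.9, 0.9            # tau * sigma * ||Phi||^2 < 1
b = np.array([1.0, -0.2, 0.05, -2.0])
Phi = np.eye(4)
prox_g = lambda v: np.sign(v) * np.maximum(np.abs(v) - tau * lam, 0.0)
prox_fstar = lambda w: (w - sigma * b) / (1.0 + sigma)
u_star = chambolle_pock(Phi, prox_g, prox_fstar, tau, sigma)
expected = np.sign(b) * np.maximum(np.abs(b) - lam, 0.0)
```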

We can directly apply this method to the proposed optimization programs: QCS-LASSO for b > 1 and 1bit-LASSO for b = 1.

If b > 1, we naturally use f(z) = λqJb(z,y) and g(u) = ‖u‖1. We already know that proxλg(v) = Sλ(v). Thanks to the Moreau decomposition, the prox of f∗ can be obtained by computing the prox of f. This prox is not computed here, we leave it for future work, but the regularity of the function and its similarities with the `1 and `2 norms suggest that a closed-form solution could exist. If not, there exists a multitude of ways to compute it numerically (see for example [PB13]).

If b = 1, we choose f(z) = λqJ1(z,y) and g(u) = ‖u‖1 − µqyTΦu. Note the similarities between

f(z) = J1(z,y) = 2‖(y⊙ z)−‖1 = ‖z‖1 − yTz (88)

2 Note that although Q1(0) is a vector exclusively filled with ω1 (a negative value) (see the definitions of the quantizer in Chap. 2), the inconsistency of 0 with any possible binary quantized y is exactly 0, i. e., J1(0,y) = 0 ∀y.


and

g(u) = ‖u‖1 − µqyTΦu. (89)

The prox of both functions can be easily computed, but again the resolution is beyond the scope of this thesis and left for future work.
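The identity in (88) holds exactly for binary y ∈ {−1,+1}M and is easy to confirm numerically (an illustrative sketch; the vectors are arbitrary):

```python
import numpy as np

def J1(z, y):
    """One-bit inconsistency energy, written as in Eq. (88)."""
    return 2.0 * np.sum(np.maximum(0.0, -(y * z)))   # 2*||(y o z)_-||_1

rng = np.random.default_rng(5)
z = rng.standard_normal(20)
y = np.sign(rng.standard_normal(20))                 # entries in {-1, +1}

# Eq. (88): 2*||(y o z)_-||_1 = ||z||_1 - y^T z, since y_i = +-1
assert np.isclose(J1(z, y), np.linalg.norm(z, 1) - y @ z)
```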

Note that if the sensing matrix is a tight frame, then instead of the Chambolle-Pock algorithm we can use the Douglas-Rachford splitting described in [CP09] to solve the problem.

4.4 more applications

Now that the bases of compressed sensing have been well established theoretically and quantization has been introduced into the picture, new research interests about concrete, specific applications are emerging (see, e. g., [Car13]). The example presented in the last section of Chapter 3 shows that electronic designers are beginning to study where and how quantized compressed sensing could be efficiently used. Therefore, it appears more and more necessary to be able to model any physical sensing scheme in order to evaluate reconstruction effectiveness. In particular, it seems really important to develop tools to compare the performance of one device configuration against another, taking into account different sources of noise and modeling the non-idealities as precisely as possible. We have seen that it is not an easy task to evaluate the RIP or other similar properties, especially when they correspond to a particular “real world” sensing scenario. We must rely on precise mathematical models mainly based on probabilistic analysis of the properties of a random sensing scheme. It is not easy either to take all the parameters of a sensing device into account when evaluating the best quantizer to use (1-bit, coarse multi-bit, high resolution, . . . ) or the best sensing parameters (time and frequency constants, pseudo-randomness, sizes and compression rate, . . . ). In addition, if we want to evaluate the performance of the sensing device, we must carefully choose, among an increasing number of reconstruction methods, the one that is best suited to the application. A reconstruction algorithm can behave very differently depending on the sensing context. Finally, when the sparsity or compressibility basis of the target signal is not obvious, we have to discover the best way to represent the signal, for example by using adapted (or learned) overcomplete dictionaries.
In summary, there is no doubt that there is a lot of work to do if we want to build an efficient (and competitive) quantized compressed sensing device.


5 CONCLUSIONS

And kids, that’s how I met your mother. . .
Psych! It was just some chick.

— Ted Mosby – How I Met Your Mother
Season 6, Episode 24: “Challenge Accepted”

5.1 objective and main results

The main objective of this research was to find an efficient greedy algorithm for dequantizing compressed sensing. The work implied a detailed preliminary study of the state of the art and numerical validations of the discoveries, first in a general (Gaussian) framework and then in a particular concrete application.

Such an algorithm, the quantized iterative hard thresholding (QIHT), has been developed. It is based on the minimization of a quadratic-like piece-wise linear inconsistency energy under a K−sparsity constraint. It was shown to be equivalent to BIHT [JV10] when a binary quantizer is used and to tend toward IHT [BD09] when the number of bits tends to infinity. Except for those given in previous works for the two aforementioned extreme cases, no theoretical guarantees have been provided, but Chapter 4 suggests some research directions for that purpose. The algorithm was numerically tested in noiseless Gaussian sensing and relatively coarse quantization with a very sparse, low-dimensional problem (N = 1024 and K = 16) and compared to the performances of IHT and BPDN. In that framework, QIHT clearly outperformed IHT and BPDN in terms of reconstruction quality but was slower to converge than IHT. The comparison with the classical ADC bit budget requirement without compression leaves no doubt about the fact that, in the ideal noiseless case, Gaussian QCS coupled with QIHT reconstruction is much more efficient: we can use (cheap, fast and/or low-power) coarse quantizers and acquire far fewer samples for a better reconstruction quality. QIHT was also tested in a more realistic sensing scenario. A projected low-power electronic (audio) sensor was modeled by a matrix–quantizer pair and QIHT reconstruction was tested in the Matlab Monte-Carlo benchmark framework.


The results were, again, really good when compared to a classical ADC (before compression). Some tests were finally conducted in the 1-bit case (which has been chosen by the electronic designers) to test robustness against noise. BIHT in this context appeared to be sensitive to uniformly random bit errors, even for a quite small bit error rate (1 error per 1000). An attempted improvement gave rather unexpected results. First, we found out that simply discarding small-amplitude measurements is not beneficial to the reconstruction. Then, and finally, we observed that Gaussian pre-quantization noise leads, in fact, to better reconstruction performance than the same number of errors produced by uniformly random bit flips.

5.2 summary

The first chapter was an introduction to the goals of the thesis. First, the general context was set: the thesis is about compressed sensing, a rather new discipline in the large field of signal processing. The main innovation of compressed sensing is that it allows us to sample at a sub-Nyquist rate. The issues addressed were: how is quantization included and handled within the framework of compressed sensing, and how can we efficiently reconstruct compressively sensed and quantized signals? It was emphasized that besides sampling, this quantization step is mandatory in the process of digitizing “real world” signals. After this contextualization, the structure of the thesis and the notations and conventions used throughout the document were briefly presented.

The second chapter established the bases of this work by introducing the state of the art in compressed sensing, reconstruction and quantization. The first section discussed the concept of a K−sparse signal in a sparsity basis, or best K−term approximation in a compression basis, and how compressed sensing is fundamentally based on the hypothesis that the signal is compressible. The principle is to “capture the information” given by the few (K) non-zero entries of the sparse representation α of the original signal x ∈ RN, and distribute it equally among all the entries of the shorter measurement vector y ∈ RM such that K < M � N. The strength of compressed sensing is its ability to do that in an analog way, without having to analyse x and without knowing which coefficients are non-zero. We saw that this is possible when the sensing scheme is as incoherent as possible with the compression basis, and that randomness plays a very important role in that incoherence. The second section helped to understand how we can use the sufficient information brought by y to recover α and then x. We presented the ideal exact sparse (hard combinatorial) optimization problem (ES) and its guaranteed perfect reconstruction thanks to the RIP. Then we introduced the `1−norm relaxation, basis pursuit (BP), a regular convex optimization problem that can be solved with O(N3) complexity. The basis pursuit denoise program (BPDN) was also introduced so as to handle noise on the measurements when it is limited in amplitude. BPDN-reconstructed signals are guaranteed to be near the perfect solution assuming that the RIP holds for the sensing scheme. The iterative hard thresholding (IHT), a greedy reconstruction method (and therefore suboptimal, yet with good theoretical guarantees in NCS), was then presented. It has the advantage over BPDN of being faster (O(MN) in the worst-case scenario). Both algorithms were numerically compared in a Matlab Monte-Carlo experiment which clearly showed IHT to be the best. The third section presented quantization and its effects on compressed sensing. A brief description of the “1-bit per measurement” scenario was given and the BIHT algorithm was also described.

The third chapter presented the new concept of an inconsistency energy to minimize and applied it to create a new greedy algorithm based on IHT and named QIHT, for quantized iterative hard thresholding. The first section presented the motivations for developing a new greedy algorithm specifically for QCS. The second section gave the derivations that lead to the QIHT method and showed how it bridges BIHT and IHT. The third section presented the results of numerical Monte-Carlo experiments with QIHT: first to find a good parameter µ, then to compare it with BPDN and IHT, and finally to compare with the classical ADC with and without compression. The last section explained how QIHT was used to reconstruct the signals “acquired” with a model of the RMPI, a real compressed sensing device. After a presentation of the device and its idealization, Monte-Carlo experiments were presented, first to compare the performance of the device with a classical ADC at different quantizer resolutions, then to evaluate the robustness of the selected 1-bit case against bit errors. An attempted improvement was implemented and gave unexpected results.

Finally, the fourth chapter gave some insight into what could possibly be done in future work, based on the results of this thesis. In particular, a proposed exact minimization problem (similar to the LASSO formulation of BPDN but for quantized compressed sensing) and an algorithm to solve it were suggested.

5.3 difficulties and solutions

Many difficulties were encountered during this thesis. The first step was a preliminary study of the state of the art. This means that, in order to understand and before trying to go one step further, it was necessary to read and digest many carefully selected extracts from an abundant and sophisticated literature.

The step-by-step discovery of the Jb energy required creativity, intuition and a lot of fruitful meetings and discussions.

Numerical simulations and the establishment of the Matlab benchmark framework were not straightforward either. The final version of some of the testing scripts is quite far from the initial idea, and it required a lot of thinking and intuition (along with my supervisors) to choose the good parameters to test and a good methodology to do it efficiently. The number of degrees of freedom and the length of a single simulation did not always allow us to proceed by trial and error. For example, the choice of the scaling parameter µ was not an easy task. We were finally able to find a seemingly good heuristic thanks to an attempt to relate it to known results.

After a simulation, it was also challenging to find the right interpretation of the results. In particular, it was necessary to be able to bounce back in any situation: if the result is good and corresponds to the prediction, then we can explore further; if the result is bad, unexpected or really surprising, then we should find out why. Identifying what is important and what is not was sometimes not as easy as one would think.

Finally, this thesis was a great opportunity to develop communication skills. How do you explain your results simply and efficiently? How do you adapt your explanation to a particular audience? It was very interesting and instructive to try to answer these questions, not only in meetings and presentations with my supervisors, with other professors (David Bol) and students (Loïc Neu), and even with foreign visitors (Pawel Jerzy Pankiewicz), but also in this final written document and the upcoming final presentations.

5.4 further tracks

As mentioned in Chapter 4, there is still a lot of hard work to do. We still have to further explore the robustness of QIHT (and, more generally, of the QCS framework) against pre-quantization noise. It would also be a big step to find theoretical guarantees on the reconstruction quality in QCS based on a property such as the RIP. Other QCS dequantization techniques using the inconsistency energy can also be developed; in particular, the use of proximal operators to solve the problem proposed in Section 4.3 deserves special attention. There are also many possible applications that need to be studied in order to determine how and to what extent quantized compressed sensing can be useful, or simply whether it is a good idea to use QCS in a given context.

There is a whole new world of perspectives that still needs to be explored. Fortunately, thanks to the "Science 2.0 effect" (mentioned in [JV10]), communication between scientists around the world has never been faster or more efficient. Many international scientists from the signal processing community are working on these topics, and there is no doubt that most of the questions addressed in this thesis will find their answers soon enough.


Part IV

APPENDIX


A  EXPLICIT BOUNDS FOR RECONSTRUCTION

This appendix gives the explicit values of the constants in the theoretical bounds of Chapter 2. In Theorem 2, the bound is
\[
\|x^\star_{\mathrm{BPDN}} - x\|_2 \;\le\; C_0\, \frac{\|x - x_K\|_1}{\sqrt{K}} \;+\; C_1\, \varepsilon,
\]
with constants $C_0$ and $C_1$ given in [Can08] by
\[
\alpha = \frac{2\sqrt{1+\delta_{2K}}}{1-\delta_{2K}}; \qquad
\rho = \frac{\sqrt{2}\,\delta_{2K}}{1-\delta_{2K}}; \qquad
C_0 = \frac{2(1+\rho)}{1-\rho}; \qquad
C_1 = \frac{2\alpha}{1-\rho}.
\]
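These constants are easy to evaluate numerically; the short sketch below (plain Python, illustrative, not from the thesis) reproduces the behaviour plotted in Figures 37 and 38:

```python
import math

def bpdn_constants(delta_2k):
    """C0 and C1 from [Can08], valid for delta_2K < sqrt(2) - 1."""
    assert 0 <= delta_2k < math.sqrt(2) - 1
    alpha = 2 * math.sqrt(1 + delta_2k) / (1 - delta_2k)
    rho = math.sqrt(2) * delta_2k / (1 - delta_2k)
    C0 = 2 * (1 + rho) / (1 - rho)
    C1 = 2 * alpha / (1 - rho)
    return C0, C1

print(bpdn_constants(0.0))   # (2.0, 4.0): the best case delta_2K = 0
print(bpdn_constants(0.4))   # both constants blow up near sqrt(2) - 1
```

Both constants grow monotonically in $\delta_{2K}$ and diverge at the $\sqrt{2}-1$ threshold, which is the behaviour shown in the two figures below.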

Figures 37 and 38 show the values of $C_0$ and $C_1$ when $\delta_{2K}$ goes from $0$ to $\sqrt{2}-1 \approx 0.41$ (the bound is not guaranteed above that threshold).

Figure 37: Constant $C_0$ that controls the contribution of the best $K$-term approximation error (plotted against $\delta_{2K}$).


Figure 38: Constant $C_1$ that controls the contribution of measurement noise (plotted against $\delta_{2K}$).


B  DISTRIBUTION ON THE UNIT HYPER-SPHERE

This appendix provides a way to establish the probability density function of the elements of a random vector uniformly distributed on the unit sphere of $\mathbb{R}^N$. This distribution can also be seen as the distribution of the elements of a normalized random vector generated from a zero-mean i.i.d. isotropic (e.g., standard normal) distribution.

Let us define the random vector $X := (X_1, X_2, \dots, X_N)$ with independently and identically distributed elements following a standard normal distribution, i.e., $\forall i \in \{1,\dots,N\}$ and $\forall j \in \{1,\dots,N\}$, $j \neq i$, we have $X_i \perp\!\!\!\perp X_j$ and $X_i \sim \mathcal{N}(0,1)$, which can also be noted $X \sim \mathcal{N}^N(0,1)$. The probability density function (pdf) of such a variable is given by
\[
f_X(x) = \frac{1}{(2\pi)^{N/2}}\, e^{-\frac{1}{2}\left(x_1^2 + \dots + x_N^2\right)}.
\]

It is isotropic in the sense that it is invariant under rotation, i.e., if $\Psi \in \mathbb{R}^{N \times N}$ is a rotation of $\mathbb{R}^N$,
\[
f_X(\Psi x) = f_X(x),
\]
and in particular, there is a constant $c > 0$ such that for any $g \in \mathbb{R}^N$,
\[
\mathbb{E}\big(\langle g, X \rangle^2\big) = c\,\|g\|^2.
\]
For example, if $g$ is the $k$th canonical vector $e_k$, we have $\mathbb{E}(X_k^2) = c$ for all $k \in [N]$.

If we normalize the random vector $X$, i.e., divide it by its $\ell_2$-norm, the new variable
\[
\tilde{X} = \frac{X}{\|X\|} = \frac{X}{\sqrt{X_1^2 + \dots + X_N^2}}
\]
is then uniformly distributed over the unit sphere of $\mathbb{R}^N$. Note that the distribution is still isotropic.

First of all, it is easy to compute the variance of an entry of such a random vector. Indeed,
\[
\|\tilde{X}\|^2 = \mathbb{E}\big(\|\tilde{X}\|^2\big) = \sum_{k \in [N]} \mathbb{E}\big(\tilde{X}_k^2\big) = N\,\mathbb{V}(\tilde{X}_k).
\]


The first equality holds trivially because the variable $\|\tilde{X}\| = 1$ is fixed. It is obvious that the $\tilde{X}_k$ are not independent from each other: if, e.g., you fix $N-1$ values, the $N$th value is by definition automatically fixed; in other words, $\mathbb{P}\big(\tilde{X}_N \,\big|\, \tilde{X}_1, \dots, \tilde{X}_{N-1}\big) \neq \mathbb{P}(\tilde{X}_N)$. However, the expectation is always linear, which justifies the second equality. As said in the beginning, the distribution is isotropic, which implies that $\mathbb{E}(\tilde{X}_j^2) = \mathbb{E}(\tilde{X}_k^2)$ for all $k, j \in [N]$. Moreover, the expectation of $\tilde{X}_k$ is $0$ (since $\mathbb{E}(X_k) = 0$), so $\mathbb{E}(\tilde{X}_k^2) = \mathbb{V}(\tilde{X}_k)$, which justifies the third equality. Finally,
\[
\mathbb{V}(\tilde{X}_k) = \frac{\|\tilde{X}\|^2}{N} = \frac{1}{N}.
\]
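This $1/N$ variance is easy to confirm by simulation; a minimal Monte-Carlo sketch (NumPy; illustrative, not thesis code):

```python
import numpy as np

rng = np.random.default_rng(0)
N, trials = 8, 200_000
X = rng.standard_normal((trials, N))                # X ~ N^N(0, 1), one draw per row
Xt = X / np.linalg.norm(X, axis=1, keepdims=True)   # normalize: uniform on the sphere
var = Xt[:, 0].var()                                # empirical variance of one coordinate
print(var, 1 / N)                                   # the two values agree closely
```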

We now want to find the pdf of one element of $\tilde{X}$ (which is the same for any element).

Let us first define a new random vector in $\mathbb{R}^2$ as
\[
X := (X_1, \Xi),
\]
with $\Xi := \sqrt{X_2^2 + \dots + X_N^2}$. It is easy to see that
\[
\tilde{X}_1 = \frac{X_1}{\sqrt{X_1^2 + \Xi^2}}.
\]
Since $X_1$ is independent from $X_i$ for $i \neq 1$, $X_1$ is also independent from $\Xi$. Consequently, the pdf of this vector is given by
\[
f_X(x) = f_{X_1}(x_1) \times f_\Xi(\xi).
\]
We know that $X_1 \sim \mathcal{N}(0,1)$:
\[
f_{X_1}(x_1) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}x_1^2}.
\]
From probability theory we also know that $\Xi \sim \chi(N-1)$, the "chi" distribution with $N-1$ degrees of freedom:
\[
f_\Xi(\xi) =
\begin{cases}
C\,\xi^{N-2}\, e^{-\frac{1}{2}\xi^2} & \text{if } \xi > 0, \\
0 & \text{else,}
\end{cases}
\]
with constant $C = \frac{2^{\frac{3-N}{2}}}{\Gamma\!\left(\frac{N-1}{2}\right)}$.

Now let $g : \mathbb{R} \to \mathbb{R}$ be a measurable and bounded function. We can write
\[
\mathbb{E}\big[g(\tilde{X}_1)\big] = \int_0^\infty \int_{-\infty}^{\infty} g\!\left(\frac{x_1}{\sqrt{x_1^2 + \xi^2}}\right) f_{X_1}(x_1)\, f_\Xi(\xi) \,\mathrm{d}x_1 \,\mathrm{d}\xi.
\]


Let us apply the "polar coordinates" change of variables
\[
x_1 = r\cos\varphi, \qquad \xi = r\sin\varphi,
\]
with $\varphi \in [0,\pi]$ and $r \in [0,\infty)$. The Jacobian is $r$. The integral becomes
\begin{align*}
\mathbb{E}\big[g(\tilde{X}_1)\big]
&= \int_0^\pi \int_0^\infty g(\cos\varphi)\, f_{X_1}(r\cos\varphi)\, f_\Xi(r\sin\varphi)\, r \,\mathrm{d}r \,\mathrm{d}\varphi \\
&= \int_0^\pi \int_0^\infty g(\cos\varphi)\, \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}r^2\cos^2\varphi}\; C\,(r\sin\varphi)^{N-2}\, e^{-\frac{1}{2}r^2\sin^2\varphi}\; r \,\mathrm{d}r \,\mathrm{d}\varphi \\
&= \frac{C}{\sqrt{2\pi}} \int_0^\pi g(\cos\varphi)\,(\sin\varphi)^{N-2} \,\mathrm{d}\varphi \int_0^\infty r^{N-1}\, e^{-\frac{1}{2}r^2} \,\mathrm{d}r \\
&= C' \int_0^\pi g(\cos\varphi)\,(\sin\varphi)^{N-2} \,\mathrm{d}\varphi,
\end{align*}
with $C'$ a normalization constant obtained by integrating the second factor and given by $C' = \frac{\Gamma(N/2)}{\sqrt{\pi}\,\Gamma\!\left(\frac{N-1}{2}\right)}$. We apply the change of variable
\[
\cos\varphi = t, \qquad \sin\varphi = \sqrt{1-t^2}, \qquad \frac{\partial\varphi}{\partial t} = \frac{-1}{\sqrt{1-t^2}},
\]
to finally find
\[
\mathbb{E}\big[g(\tilde{X}_1)\big] = C' \int_{-1}^{1} g(t)\,(1-t^2)^{\frac{N-3}{2}} \,\mathrm{d}t.
\]

This final expression gives us the wanted probability density function:
\[
f_{\tilde{X}_1}(t) =
\begin{cases}
\frac{\Gamma(N/2)}{\sqrt{\pi}\,\Gamma\!\left(\frac{N-1}{2}\right)}\,(1-t^2)^{\frac{N-3}{2}} & \text{if } -1 < t < 1, \\
0 & \text{else.}
\end{cases}
\]

As we could have expected, this distribution is centred around 0. Note that the (well-known) case N = 3 gives the uniform distribution over [−1, 1].
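As a numerical sanity check on this density (an illustrative sketch, not part of the thesis): for $N = 3$ the exponent $(N-3)/2$ vanishes and the constant $\Gamma(3/2)/(\sqrt{\pi}\,\Gamma(1)) = 1/2$, giving exactly the uniform density on $[-1,1]$; for a general $N$ the density integrates to one and its moments match a Monte-Carlo simulation of a normalized Gaussian vector:

```python
import math
import numpy as np

def f_sphere_coord(t, N):
    """pdf of one coordinate of a point drawn uniformly on the unit sphere of R^N."""
    c = math.gamma(N / 2) / (math.sqrt(math.pi) * math.gamma((N - 1) / 2))
    return c * (1 - t ** 2) ** ((N - 3) / 2)

print(f_sphere_coord(0.3, 3))   # 0.5: uniform on [-1, 1] when N = 3

# midpoint rule: the density integrates to ~1 and E[t^4] matches simulation
N, n = 6, 200_000
w = 2 / n
t = -1 + (np.arange(n) + 0.5) * w            # strictly inside (-1, 1)
total = np.sum(f_sphere_coord(t, N)) * w
fourth = np.sum(t ** 4 * f_sphere_coord(t, N)) * w

rng = np.random.default_rng(0)
X = rng.standard_normal((200_000, N))
mc_fourth = ((X[:, 0] / np.linalg.norm(X, axis=1)) ** 4).mean()
print(total, fourth, mc_fourth)              # ~1, then two estimates of E[t^4]
```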

We can now compute the variance of such a random variable:

\[
\mathbb{V}(\tilde{X}_1) = \mathbb{E}\big(\tilde{X}_1^2\big) = C' \int_{-1}^{1} t^2\,(1-t^2)^{\frac{N-3}{2}} \,\mathrm{d}t.
\]

Assuming that $N$ is an integer greater than 1, one can verify (e.g., using the Matlab symbolic toolbox) that this expression is exactly equal to $\mathbb{V}(\tilde{X}_1) = \frac{1}{N}$, as predicted in the beginning.

A final remark is that when $\tilde{X}$ is $K$-sparse and its non-zero elements are uniformly distributed on the (intersection of $\Sigma_K$ with the) $\ell_2$ unit sphere, the whole reasoning holds with $N$ replaced by $K$. In particular, $\mathbb{V}(\tilde{X}_k) = \frac{1}{K}$ for all $k \in \mathrm{supp}(\tilde{X})$.
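The same simulation works for this K-sparse variant (illustrative sketch; the fixed support is this example's choice):

```python
import numpy as np

rng = np.random.default_rng(1)
N, K, trials = 50, 6, 100_000
support = rng.choice(N, K, replace=False)            # a fixed support of size K
X = np.zeros((trials, N))
X[:, support] = rng.standard_normal((trials, K))     # K-sparse Gaussian draws
Xt = X / np.linalg.norm(X, axis=1, keepdims=True)    # uniform on the K-dim sub-sphere
var = Xt[:, support[0]].var()                        # variance of one non-zero entry
print(var, 1 / K)                                    # matches 1/K, not 1/N
```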


BIBLIOGRAPHY

[BB08] P. T. Boufounos and R. G. Baraniuk. 1-bit compressive sensing. In Proc. Conf. Inform. Science and Systems (CISS), Princeton, NJ, March 19-21 2008.

[BD08] T. Blumensath and M. E. Davies. Iterative thresholding for sparse approximations. Journal of Fourier Analysis and Applications, 14(5-6):629–654, 2008.

[BD09] T. Blumensath and M. E. Davies. Iterative hard thresholding for compressed sensing. Applied and Computational Harmonic Analysis, 27(3):265–274, 2009.

[BD10] T. Blumensath and M. Davies. Normalized iterative hard thresholding: Guaranteed stability and performance. Selected Topics in Signal Processing, IEEE Journal of, 4(2):298–309, 2010.

[BDDW07] R. Baraniuk, M. Davenport, R. DeVore, and M. Wakin. A simple proof of the restricted isometry property for random matrices. Constr. Approx., 2008.

[BJ13] P. Boufounos and L. Jacques. Tutorial 9: Modern quantization strategies for compressive sensing and acquisition systems. ICASSP 2013, Vancouver Convention & Exhibition Centre, May 2013. IEEE Signal Processing Society.

[Blu12] T. Blumensath. Accelerated iterative hard thresholding. Signal Processing, 92(3):752–756, 2012.

[Can08] E. J. Candès. The restricted isometry property and its implications for compressed sensing. Comptes Rendus Mathematique, 346(9-10):589–592, 2008.

[Can11] E. Candès. Math matters, video interview in Vancouver on the occasion of ICIAM 2011, the 7th congress on industrial and applied mathematics. Accessible on the Internet page http://www.france-vancouver.info/math-matters, July 2011. Last consulted in May 2013.

[Car13] I. Carron. Compressive sensing hardware, a non-exhaustive list of technologies implementing compressive sensing. Internet page http://sites.google.com/site/igorcarron2/compressedsensinghardware, May 2013. Last consulted in May 2013.


[CDS98] S. S. Chen, D. L. Donoho, and M. A. Saunders. Atomic decomposition by basis pursuit. SIAM Journal on Scientific Computing, 20:33–61, 1998.

[Cha07] R. Chartrand. Exact reconstruction of sparse signals via nonconvex minimization. Signal Processing Letters, IEEE, 14(10):707–710, 2007.

[CP09] P. L. Combettes and J.-C. Pesquet. Proximal splitting methods in signal processing. ArXiv e-prints, December 2009.

[CP10] A. Chambolle and T. Pock. A first-order primal-dual algorithm for convex problems with applications to imaging. Journal of Mathematical Imaging and Vision, 40(1):120–145, 2010.

[CR07] E. Candès and J. Romberg. Sparsity and incoherence in compressive sampling. Inverse Problems, 23(3):969–985, June 2007.

[CRT06a] E. Candès, J. Romberg, and T. Tao. Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. Information Theory, IEEE Transactions on, 52(2):489–509, 2006.

[CRT06b] E. J. Candès, J. K. Romberg, and T. Tao. Stable signal recovery from incomplete and inaccurate measurements. Comm. Pure Appl. Math., 59(8):1207–1223, August 2006.

[CT05] E. Candès and T. Tao. Decoding by linear programming. Information Theory, IEEE Transactions on, 51(12):4203–4215, 2005.

[CW08] E. J. Candès and M. B. Wakin. An introduction to compressive sampling. Signal Processing Magazine, IEEE, 25(2):21–30, March 2008.

[CWB08] E. J. Candès, M. B. Wakin, and S. P. Boyd. Enhancing sparsity by reweighted $\ell_1$ minimization. Journal of Fourier Analysis and Applications, 14(5-6):877–905, 2008.

[DDDM04] I. Daubechies, M. Defrise, and C. De Mol. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Comm. Pure Appl. Math., 57(11):1413–1457, 2004.

[DDT+08] M. Duarte, M. Davenport, D. Takhar, J. Laska, T. Sun, K. Kelly, and R. Baraniuk. Single-pixel imaging via compressive sampling. Signal Processing Magazine, IEEE, 25(2):83–91, 2008.


[DM09] W. Dai and O. Milenkovic. Subspace pursuit for compressive sensing signal reconstruction. Information Theory, IEEE Transactions on, 55(5):2230–2249, 2009.

[DM11] W. Dai and O. Milenkovic. Information theoretical and algorithmic approaches to quantized compressive sensing. Communications, IEEE Transactions on, 59(7):1857–1866, 2011.

[Don06] D. Donoho. Compressed sensing. Information Theory, IEEE Transactions on, 52(4):1289–1306, 2006.

[GFR08] V. Goyal, A. Fletcher, and S. Rangan. Compressive sampling and lossy compression. Signal Processing Magazine, IEEE, 25(2):48–56, 2008.

[GLP+13] C. S. Güntürk, M. Lammers, A. Powell, R. Saab, and Ö. Yılmaz. Sobolev duals for random frames and ΣΔ quantization of compressed sensing measurements. Foundations of Computational Mathematics, 13(1):1–36, 2013.

[GN98] R. Gray and D. Neuhoff. Quantization. Information Theory, IEEE Transactions on, 44(6):2325–2383, 1998.

[Jac13] L. Jacques. Le petit chercheur illustré. Internet page http://yetaspblog.wordpress.com/, 2008-2013. Last consulted in May 2013.

[JDD13] L. Jacques, K. Degraux, and C. De Vleeschouwer. Quantized Iterative Hard Thresholding: Bridging 1-bit and High-Resolution Quantized Compressed Sensing. ArXiv e-prints, May 2013.

[JHF09] L. Jacques, D. Hammond, and M. J. Fadili. Dequantizing compressed sensing with non-Gaussian constraints. In Image Processing (ICIP), 2009 16th IEEE International Conference on, pages 1465–1468, 2009.

[JHF11] L. Jacques, D. Hammond, and M. J. Fadili. Dequantizing compressed sensing: When oversampling and non-Gaussian constraints combine. Information Theory, IEEE Transactions on, 57(1):559–571, 2011.

[JHF12] L. Jacques, D. K. Hammond, and M. J. Fadili. Stabilizing nonuniformly quantized compressed sensing with scalar companders. CoRR, abs/1206.6003, 2012.

[JLBB13] L. Jacques, J. Laska, P. Boufounos, and R. Baraniuk. Robust 1-bit compressive sensing via binary stable embeddings of sparse vectors. Information Theory, IEEE Transactions on, 59(4):2082–2102, 2013.


[JV10] L. Jacques and P. Vandergheynst. Compressed sensing: "When sparsity meets sampling". Preprint chapter in Optical and Digital Image Processing: Fundamentals and Applications, Wiley, New York, 2010.

[JVB+09] L. Jacques, P. Vandergheynst, A. Bibet, V. Majidzadeh, A. Schmid, and Y. Leblebici. CMOS compressed imaging by random convolution. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Taipei, Taiwan, 2009.

[KDXH09] M. A. Khajehnejad, A. G. Dimakis, W. Xu, and B. Hassibi. Sparse recovery of positive signals with minimal expansion. CoRR, abs/0902.4045, 2009.

[KJM+13] S. Kitic, L. Jacques, N. Madhu, M. P. Hopwood, A. Spriet, and C. De Vleeschouwer. Consistent iterative hard thresholding for signal declipping. CoRR, abs/1303.1023, 2013.

[KLW+06] S. Kirolos, J. Laska, M. Wakin, M. Duarte, D. Baron, T. Ragheb, Y. Massoud, and R. Baraniuk. Analog-to-information conversion via random demodulation. In Design, Applications, Integration and Software, 2006 IEEE Dallas/CAS Workshop on, pages 71–74, 2006.

[KR03] N. G. Kingsbury and T. Reeves. Iterative image coding with overcomplete complex wavelet transforms. In VCIP, pages 1253–1264, 2003.

[Las12] J. Laska. 1bitcompressivesensing on dsp.rice.edu. Internet page http://dsp.rice.edu/1bitCS/, July 2012. Last consulted in May 2013.

[LBDB11] J. N. Laska, P. T. Boufounos, M. A. Davenport, and R. G. Baraniuk. Democracy in action: Quantization, saturation, and compressive sensing. Applied and Computational Harmonic Analysis, 31(3):429–443, 2011.

[Lie12] T. Lienart. Inpainting on graphs as interpolation methods. Master's thesis, Université catholique de Louvain, Louvain-la-Neuve, September 2012.

[LKD+07] J. Laska, S. Kirolos, M. Duarte, T. Ragheb, R. Baraniuk, and Y. Massoud. Theory and implementation of an analog-to-information converter using random demodulation. In Circuits and Systems, 2007. ISCAS 2007. IEEE International Symposium on, pages 1959–1962, 2007.

[Mar96] G. R. R. Martin. A Game of Thrones, volume 1 of A Song of Ice and Fire. Bantam Spectra, a division of Random House, Inc., 1996.


[MZ93] S. Mallat and Z. Zhang. Matching pursuits with time-frequency dictionaries. Signal Processing, IEEE Transactions on, 41(12):3397–3415, 1993.

[NT09] D. Needell and J. A. Tropp. CoSaMP: Iterative signal recovery from incomplete and inaccurate samples. Applied and Computational Harmonic Analysis, 26(3):301–321, 2009.

[Nyq28] H. Nyquist. Certain topics in telegraph transmission theory. Proceedings of the IEEE, 90(2):280–305, 2002; original paper published in 1928.

[PAL13] P. J. Pankiewicz, T. Arildsen, and T. Larsen. Model-based calibration of filter imperfections in the random demodulator for compressive sensing. CoRR, abs/1303.6135, 2013.

[PB13] N. Parikh and S. Boyd. Proximal algorithms. Foundations and Trends in Optimization, volume 1, 2013. Near-final draft, updated April 30, 2013.

[PRK93] Y. C. Pati, R. Rezaiifar, and P. Krishnaprasad. Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition. In Signals, Systems and Computers, 1993. 1993 Conference Record of The Twenty-Seventh Asilomar Conference on, pages 40–44. IEEE, 1993.

[PV11] Y. Plan and R. Vershynin. One-bit compressed sensing by linear programming. CoRR, abs/1109.4299, 2011.

[PV13] Y. Plan and R. Vershynin. Robust 1-bit compressed sensing and sparse logistic regression: A convex programming approach. Information Theory, IEEE Transactions on, 59(1):482–494, 2013.

[ROF92] L. I. Rudin, S. Osher, and E. Fatemi. Nonlinear total variation based noise removal algorithms. Physica D: Nonlinear Phenomena, 60(1-4):259–268, November 1992.

[Rom08] J. Romberg. Imaging via compressive sampling. Signal Processing Magazine, IEEE, 25(2):14–20, 2008.

[SG09] J. Z. Sun and V. K. Goyal. Optimal quantization of random measurements in compressed sensing. In Information Theory, 2009. ISIT 2009. IEEE International Symposium on, pages 6–10. IEEE, 2009.

[Sha49] C. Shannon. Communication in the presence of noise. Proceedings of the IEEE, 86(2):447–457, 1998; original paper published in 1949.

[Tib96] R. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), pages 267–288, 1996.

[TLD+09] J. A. Tropp, J. N. Laska, M. F. Duarte, J. K. Romberg, and R. G. Baraniuk. Beyond Nyquist: Efficient sampling of sparse bandlimited signals. CoRR, abs/0902.0026, 2009.

[TP12] A. M. Tillmann and M. E. Pfetsch. The computational complexity of the restricted isometry property, the nullspace property, and related concepts in compressed sensing. ArXiv e-prints, May 2012.

[vdBF07] E. van den Berg and M. P. Friedlander. SPGL1: A solver for large-scale sparse reconstruction, June 2007. http://www.cs.ubc.ca/labs/scl/spgl1.

[vdBF08a] E. van den Berg and M. P. Friedlander. Probing the Pareto frontier for basis pursuit solutions. SIAM Journal on Scientific Computing, 31(2):890–912, 2008.

[vdBF08b] E. van den Berg and M. P. Friedlander. Probing the Pareto frontier for basis pursuit solutions. SIAM Journal on Scientific Computing, 31(2):890–912, 2008.

[YXZ12] Z. Yang, L. Xie, and C. Zhang. Unified framework and algorithm for quantized compressed sensing. CoRR, abs/1203.4870, 2012.

[ZBC10] A. Zymnis, S. Boyd, and E. Candès. Compressed sensing with quantized measurements. Signal Processing Letters, IEEE, 17(2):149–152, 2010.

[Zha08] Y. Zhang. On theory of compressive sensing via $\ell_1$-minimization: Simple derivations and extensions. TR08-11, CAAM, Rice University, 27, 2008.


ACRONYMS

ADC Analog to Digital Conversion/Converter

BIHT Binary Iterative Hard Thresholding

BP Basis Pursuit

BPDN Basis Pursuit DeNoise

CS Compressed Sensing

DCT Discrete Cosine Transform

DFT Discrete Fourier Transform

DSP Digital Signal Processing

DWT Discrete Wavelet Transforms

ES Exact Sparse recovery problem

FFT Fast Fourier Transform

FWT Fast Wavelet Transform

IHT Iterative Hard Thresholding

i.i.d. independent and identically distributed

JPEG Joint Photographic Experts Group (compression standards forimages)

LASSO Least Absolute Shrinkage and Selection Operator

MPEG Moving Picture Experts Group (compression standards forvideo and sound)

MP3 MPEG Layer-3 (compression standard for sound)

MSE Mean Squared Error

PDF Probability Density Function

PSNR Peak Signal to Noise Ratio

QIHT Quantized Iterative Hard Thresholding

RIP Restricted Isometry Property

s.t. Subject To or Such That



colophon

This document was typeset using the typographical look-and-feel classicthesis developed by André Miede.

Final Version as of March 30, 2016 (classicthesis version 4.0).