

3052 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 60, NO. 6, JUNE 2012

A New Encoder for Continuous-Time Gaussian Signals With Fixed Rate and Reconstruction Delay

Damián Marelli, Kaushik Mahata, and Minyue Fu, Fellow, IEEE

Abstract—In this paper, we propose a method for encoding continuous-time Gaussian signals subject to a usual data rate constraint and, more importantly, a reconstruction delay constraint. We first apply a Karhunen-Loève decomposition to reparameterize the continuous-time signal as a discrete sequence of vectors. We then study the optimal recursive quantization of this sequence of vectors. Since the optimal scheme turns out to have a very cumbersome design, we consider a simplified method, for which a numerical example suggests that the incurred performance loss is negligible. In this simplified method, we first build a state-space model for the vector sequence and then use Bayesian tracking to sequentially encode each vector. The tracking task is performed using particle filtering. Numerical experiments show that the proposed approach offers visible advantages over other available approaches, especially when the reconstruction delay is small.

Index Terms—Bayesian methods, continuous-time signals, particle filters, predictive coding, quantization, state-space methods, transform coding.

I. INTRODUCTION

Most digital data transmitted over communication channels originate from continuous-time signals, with the purpose of having the original continuous-time signals reconstructed. While many applications do not have a strict requirement on reconstruction delays, many other applications are of a real-time nature. With the rapid growth in high-speed communication networks, real-time applications such as smart electricity grids [1], [2] and networked control systems [3], [4] are made possible. In both cases, fast-changing continuous-time signals need to be digitally encoded and transmitted over a communication network, and the reconstruction of the signal needs to be made with minimal time delay to facilitate fast decision making. Such applications impose a new challenge on the encoding technology.

In principle, the encoding problem considered in this paper

could be roughly stated as follows: Given a maximum reconstruction delay (in seconds) and a fixed (average) bit rate (number of bits per second), we need to encode a continuous-time signal in a given class (to be specified later) so that, at any time, a reconstructed version of the signal is available up to the delay bound, and the reconstruction error is minimized in some sense (to be specified later). However, this problem is too general. In particular, notice that, since only the average bit rate is specified, this problem allows using variable-rate codes, which are not suitable for our intended real-time network applications. Hence, we consider a particular case of this problem instead. More precisely, we further assume that a fixed number of bits is transmitted at every transmission interval, with the delay being an integer multiple of that interval.

Manuscript received July 07, 2011; revised October 16, 2011 and January 18, 2012; accepted February 19, 2012. Date of publication March 06, 2012; date of current version May 11, 2012. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Olgica Milenkovic. D. Marelli and K. Mahata are with the School of Electrical Engineering and Computer Science, University of Newcastle, Callaghan, NSW 2308, Australia (e-mail: [email protected]; [email protected]). M. Fu is with the School of Electrical Engineering and Computer Science, University of Newcastle, Callaghan, NSW 2308, Australia. He is also with the Department of Control Science and Engineering, Zhejiang University, China (e-mail: [email protected]). Digital Object Identifier 10.1109/TSP.2012.2190064

From this problem statement, it follows that we only need to transmit data once every transmission interval.¹ However, we differentiate the data transmission rate from the sampling rate. More precisely, a rate much higher than the transmission rate can be chosen to sample the continuous-time signal, provided that we can encode the sampled signal within the given bit rate. With the current advances in digital electronics, very fast sampling devices are easily implementable. Hence, we can realistically assume that, at any time, the encoder knows the whole continuous-time signal up to that time. On the other hand, the constraint on the data transmission rate is often unavoidable, especially for large communication networks or wireless links. With the above thinking, our encoding problem can be restated as follows: Given that the transmitter knows the whole continuous-time signal up to the current time, and that the receiver knows the bits transmitted so far, which bits of digital information (i.e., update) need to be transmitted, so that the receiver can reconstruct the continuous-time signal, up to the current time minus the delay, with minimal distortion?

Traditionally, continuous-time signals are encoded by first sampling the signal and then quantizing the samples. Guided by the Nyquist-Shannon sampling theorem, the sampling frequency is typically chosen to be higher than, but close to, the minimum sampling frequency, which is twice the signal bandwidth [5]. This sample-and-quantize approach is popular because of its simplicity and is adequate if the purpose is to inform on the sampled signal. However, when the purpose is to reconstruct the original continuous-time signal, reconstruction time delay is inevitable with this approach. A natural way to reduce the time delay is to sample faster. But this results in a higher transmission rate (number of transmissions per second) and a higher data rate (number of bits per second).² Another drawback of the sample-and-quantize approach is that, in many cases, it may not be realistic to assume that the continuous-time signal is band-limited.

¹We assume that transmission delay and computational delay are negligible. But this assumption can be relaxed without adversely affecting our approach.

²The possibility of using entropy coding to reduce the high data rate resulting from a high sampling rate was studied in [6]. However, this introduces extra reconstruction delay.

1053-587X/$31.00 © 2012 IEEE

The drawbacks of the sample-and-quantize approach are

avoided by using transform coding [7]. This technique uses a linear transformation to obtain a vector of real coefficients which represents the signal over each time interval of a given length. Hence, the sequence of such vectors can be considered as an alternative representation of the continuous-time signal. The essential difference between this linear transformation and the sampling operation used in the sample-and-quantize approach is that each coefficient vector provides a complete representation of each signal segment, whereas samples from the infinite past and future are needed to represent each segment. Once the coefficient vector is computed, it needs to be encoded. In classical transform coding, this is typically done using scalar quantization on each vector entry. However, it is also possible to use vector quantization techniques (e.g., the generalized Lloyd algorithm or linear predictive vector quantization) [7].

In this paper we propose a coding method to address the

delay-constrained problem described above, under the assumption that the signal to be encoded is Gaussian. The proposed method uses transform coding, more precisely the Karhunen-Loève (KL) decomposition [8], to obtain a sequence of vectors, as described above. We then study the optimal recursive strategy for quantizing these vectors. Unfortunately, it turns out that the optimal quantizer has a very cumbersome design. However, a simplified version of it leads to an accessible design. In this simplified version, each vector is encoded using its joint probability density function (pdf) conditioned on the previously transmitted quantized data. This is done recursively, once per transmission interval, and this process involves updating the joint pdf and the quantization dictionary (code book). In our approach, these are done with the aid of a state-space model of the KL-decomposed sequence and a particle filter. To support our choice of this simplified scheme, we present a numerical example suggesting that the distortion increase resulting from this simplification is indeed negligible. In addition, we present simulation results showing that the proposed approach leads to a smaller reconstruction error, when compared with other available methods, especially for small reconstruction delays. While the complexity of our approach is significantly higher than that of other methods, it is affordable when the reconstruction delay is small. Hence, the proposed method is a valid alternative for coding under small reconstruction delays, provided that the extra computational complexity is affordable.

We have explained above our strategy for quantizing the sequence of vectors resulting from a KL decomposition. A technique related to this strategy is known as sequential quantization, in the context of vector quantization [9]. In this technique, the (scalar) components of a vector are sequentially quantized using their pdf conditioned on the previously quantized samples. In contrast, in our problem we need to quantize an infinite sequence of vectors, rather than a finite sequence of scalars. Hence, we cannot express the conditional pdfs analytically or using a training sequence, but we need to resort to numerical methods (particle filtering) for pdf tracking.

We note that delay-constrained encoding has been an active research topic in coding theory for a long time. First, it is well known that, if no limitation is imposed on the reconstruction delay, the theoretical minimum reconstruction distortion, for a given bit rate, is given by the distortion-rate function of the continuous-time signal to be encoded [8]. When time delay is constrained, the problem becomes more difficult. For the case of zero-delay reconstruction, it was shown in [10] that, for a discrete-time signal whose samples are statistically independent, the theoretical minimum distortion is achieved using scalar Lloyd-Max quantization. This means that optimal quantization is achieved by considering the knowledge of each sample independently, rather than jointly with its previous samples. For Markov sources, it was shown in [11] that, if the source is Markov of some finite order, the minimum distortion is achieved by forming each coded symbol using the most recent source symbols and the current state of the receiver (which is built using the past coded symbols). For the case of first-order Markov sources, this result was extended in [12] for the scenario where code symbols are transmitted through a noisy channel with noiseless feedback. The same problem, but without feedback, was studied in [13], concluding that the minimum distortion can be achieved considering the current source symbol, and the probability distribution (according to the encoder) of the decoder's state. A number of works study optimal coding structures where the performance measure is not simply given by the distortion. In this line, the authors of [14] studied the optimal coding of Markov sources, in the sense of minimizing a weighted sum of the distortion and the conditional entropy of the coded sequence. Also, optimal variable-rate coding, where the cost function is a weighted sum of distortion and rate, is studied in [15].

In [16] and [17], a variant of the zero-delay coding scheme

called causal coding is studied. In this scheme, the reproduction value of each output depends on the present and past outputs. However, there is no constraint on the reconstruction delay, and therefore, it permits the placement of an entropy encoder after the quantizer. Also, theoretical minimum bounds for zero- and limited-delay coding were studied in [18]–[20] and [17], for the individual sequence setting, where the signal to be coded is not assumed to be a random process but a deterministic bounded function. It is unfortunate that no simple expression is available for the distortion-rate function of a general stationary random process with correlated samples under zero or limited reconstruction delay.

The rest of the paper is organized as follows. In Section II

we describe the limited-delay coding problem. In Section III we study some properties of optimal limited-delay coding, and we propose a sub-optimal coding strategy which is suitable for practical implementation. In Section IV we derive a numerical algorithm for implementing the proposed coding strategy. In Section V we present numerical experiments comparing the performance of the proposed coding algorithm with that of other available methods, and we give concluding remarks in Section VI.

II. PROBLEM DESCRIPTION

Let the signal to be encoded be a continuous-time stationary Gaussian random process, with zero mean and known autocorrelation function. The problem to be addressed is how


Fig. 1. Comparison between UDC (top) and LDC (bottom) dictionaries.

to code the signal , assuming that bits are transmitted atevery , for , so that the distortion

(1)

is minimized.

In order to give some insight into the problem, we introduce the concepts of unlimited-delay coding (UDC) and limited-delay coding (LDC). Suppose that we want to code the signal on a given interval using a total bit budget proportional to its length (i.e., using a fixed number of bits per second). Using UDC, the coding is done by determining a dictionary of codeword signals

, which are chosen to minimize the distortion

(2)

when coding is done by choosing the codeword of the dictionary which is closest to the signal, i.e.,

For a given rate, the distortion is minimized when the coding interval tends to infinity, in which case it is given by the distortion-rate function of the source [8]. In this approach, the decoding side is only able to recover the signal with a delay equal to the coding interval. If, instead, the decoder needs to recover the signal with a smaller maximum delay, we use LDC. More precisely, we define the codewords so that the decoder can recover the signal on each successive subinterval once the corresponding data have been received. To illustrate this idea, we compare in Fig. 1 a UDC dictionary and an LDC dictionary.
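The dictionary idea above can be sketched as a plain nearest-codeword search. Everything in this sketch is an illustrative assumption: the dictionary entries are random stand-ins rather than codeword signals optimized for the source, and `R` and `dim` are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(6)

# With R bits for a block, the dictionary holds 2^R codewords; encoding
# transmits the index of the codeword nearest (in squared error) to the block.
R, dim = 6, 16
dictionary = rng.standard_normal((2**R, dim))   # stand-in codeword signals

x = rng.standard_normal(dim)                    # signal block to encode
idx = int(np.argmin(np.sum((dictionary - x)**2, axis=1)))  # transmitted index
x_hat = dictionary[idx]                         # decoder reconstruction
```

The decoder only needs the index and a copy of the dictionary, which is why the dictionary design (rather than the search) carries all of the coding performance.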

III. LIMITED-DELAY CODING

In this section, we assume that the autocorrelation of the signal, the maximum delay, and the number of bits per time interval are given. We then propose a suboptimal LDC strategy, in the sense of minimizing the distortion (1). For convenience in the presentation, we constrain the domain of the signal to a finite interval, and we minimize (2) instead of (1). However, the resulting coding strategy, which is stated in Section III-C, is independent of the length of this interval. Therefore, it is readily applicable in a practical scenario, where the domain of the signal is the whole real line.

A. Order Reduction Using the Karhunen-Loève Decomposition

Our coding problem can be restated as that of coding the sequence of signal segments, one per transmission interval. As explained in Section I, in practice the continuous-time signal is sampled using a very dense grid of points. Hence, the coding of the sequence of segments mentioned above turns into a very high dimensional problem. To avoid this, we resort to transform coding for reducing the dimension of the problem, as explained in [7, Sec. 12.6] in the context of vector quantization. More precisely, we use the KL decomposition [8].

Using the KL decomposition, we can expand the signal, on each interval, as follows:

(3)

where the basis functions are orthonormal, and the coefficients are uncorrelated random variables whose variances are given by the corresponding eigenvalues. Doing the same expansion with the reconstructed version of the signal we obtain

(4)

Let the distortion on each segment be the squared error between the segment and its reconstruction. Since the basis functions are orthonormal, it follows from Parseval's identity [21] that:

(5)

Suppose that we use a vector quantizer that acts only on the first few coefficients, and ignores the remaining ones. Then, the distortion is given by the sum of the quantization error on the retained coefficients and the truncation error contributed by the discarded ones. Also, as shown in Appendix B, the truncation error decreases monotonically as more coefficients are retained. Hence, if the number of retained coefficients is chosen large enough to guarantee that the truncation error is negligible in comparison with the quantization error, then, from a practical perspective, the quantization of the signal is equivalent to that of the resulting finite-dimensional vector process. We will address this problem in Section III-B below.
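The error decomposition described above is easy to check numerically. The sketch below is illustrative only: the eigenvalue profile `lam`, the quantizer step `q`, and the uniform scalar quantizer are assumptions standing in for the vector quantizer discussed in the text.

```python
import numpy as np

rng = np.random.default_rng(1)

# Distortion splits into the quantization error on the first n KL
# coefficients plus the truncation error from the discarded ones.
N, n = 8, 4
lam = 2.0 ** -np.arange(N)                 # assumed KL eigenvalues, decreasing
c = rng.standard_normal(N) * np.sqrt(lam)  # uncorrelated KL coefficients

q = 0.2
c_hat = np.zeros(N)
c_hat[:n] = q * np.round(c[:n] / q)        # quantize only the first n coeffs

quant_err = np.sum((c[:n] - c_hat[:n])**2)
trunc_err = np.sum(c[n:]**2)
total_err = np.sum((c - c_hat)**2)
assert np.isclose(total_err, quant_err + trunc_err)
```

Because the basis is orthonormal, this coefficient-domain decomposition equals the segment-domain distortion, which is what makes the dimension reduction lossless up to the (small) truncation term.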

Remark 1: The dimension of the coefficient vector is chosen such that the truncation error is negligible in comparison with the distortion. Also, this dimension needs to be kept small so that the coding complexity is not unnecessarily increased. Hence, finding its optimal value requires an iterative procedure.

Remark 2: In the presentation above we consider the signal and the basis functions to be continuous-time functions. This implies that the basis functions, as well as the coefficients, are difficult to compute in practice. However, recall that we assume that, in practice, the knowledge of the signal is given by its samples, obtained at a much higher sampling rate than the transmission rate. Under this assumption, the basis functions are obtained as the eigenvectors of the covariance matrix of the vector of samples in each interval, and the coefficients as inner products between these basis functions and the same vector of samples.
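Remark 2's recipe can be sketched directly: form the covariance matrix of the dense sample grid from an assumed autocorrelation, take its eigenvectors as the (sampled) KL basis, and obtain the coefficients as inner products. The autocorrelation shape, grid size, and retained dimension are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Sampled KL decomposition: eigenvectors of the M x M covariance matrix
# of the dense samples play the role of the basis functions.
M = 64
t = np.arange(M)
R = np.exp(-0.1 * np.abs(t[:, None] - t[None, :]))  # assumed autocorrelation

lam, V = np.linalg.eigh(R)            # eigh returns eigenvalues ascending
lam, V = lam[::-1], V[:, ::-1]        # sort descending: dominant modes first

x = np.linalg.cholesky(R) @ rng.standard_normal(M)  # one Gaussian segment
c = V.T @ x                           # KL coefficients (inner products)

# Fraction of the expected segment energy retained by the n dominant modes;
# a large fraction justifies truncating the expansion as in Remark 1.
n = 8
frac = np.sum(lam[:n]) / np.sum(lam)
```

The eigenvalues also give the coefficient variances directly, which is what the truncation-error argument in the text relies on.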

B. Recursive Quantization of the Vector Process

1) Characterization of Quantization Cells for Recursive LDC: A coding strategy, either UDC or LDC, of the (vector) samples induces a partition of the sample space into cells, each corresponding to a codeword of some quantization dictionary. In this section, we study the structure of these cells for a recursive LDC scheme [7, Sec. 14.1] (i.e., when each quantized value is computed from the current vector sample and the previously quantized values). The difference between the partitions induced by UDC and recursive LDC is illustrated in Fig. 2, using a two-dimensional example in which the two samples are scalars. Fig. 2 (top) shows the quantization cells and codewords for the UDC design. They are those of an optimal joint quantizer for the pair of samples. Notice that using this design, both samples need to be known in order to choose the appropriate codeword from the quantization dictionary. On the other hand, Fig. 2 (bottom) shows a recursive LDC design. Notice that in this case, the arrangement of quantization cells and dictionary permits choosing the appropriate codeword (from a marginal dictionary) as soon as the first sample becomes available. Furthermore, this choice determines the quantization cells and codewords for the second sample.

Extending the idea above to general numbers of samples and quantization levels,

it follows that the partitions induced by recursive LDC coding have the following structure:

• For interval 1: A partition of the sample space is done into cells, each corresponding to a quantized (vector) value of the first (vector) sample.

• For interval 2: For each cell chosen at interval 1, a partition of the sample space is done into cells, each corresponding to a quantized (vector) value of the second (vector) sample, given the first quantized value.

• The procedure continues so that, at each subsequent time interval, for each combination of previously chosen cells, a partition of the sample space is done into cells, each corresponding to a quantized (vector) value of the current (vector) sample, given all previously quantized values.
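The recursive partition structure above amounts to a tree of codebooks indexed by the history of transmitted indices. A minimal sketch follows, with randomly filled stand-in codebooks; a real design would compute each codebook from the conditional pdf given the history, and all names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)

# The codebook used at interval k is selected by the tuple of previously
# transmitted indices, so the decoder can replay the same selection path.
L = 4   # codewords per interval (2 bits)

codebooks = {(): rng.standard_normal(L)}   # interval 1: a single codebook

def encode(x, history, codebooks):
    """Pick the nearest codeword in the codebook selected by past indices."""
    if history not in codebooks:           # lazily create (stand-in design)
        codebooks[history] = rng.standard_normal(L)
    i = int(np.argmin(np.abs(codebooks[history] - x)))
    return i, history + (i,)

hist = ()
for x in rng.standard_normal(3):           # three intervals
    i, hist = encode(x, hist, codebooks)
```

The exponential growth of this tree in the number of intervals is what motivates computing the per-history codebooks on the fly, as the paper does, rather than storing them.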

Fig. 2. Quantization cells and dictionary for a Gaussian random vectorusing UDC (top) and recursive LDC (bottom).

2) Design of the Quantization Cells and Codewords: In order to design the optimal recursive LDC, we need to design, for each time interval and each combination of the indexes of the previously quantized values, the quantization cells and the corresponding codewords. Then, as shown in Appendix A, we have that

(6)

(7)

where

(8)

(9)


In view of (6), the distortion in a given time interval depends on the conditional probability of the current sample given the past quantization values. It is clear from (7) that, if the quantization cells for the interval are given, the optimal quantization codeword is the centroid of each cell under this conditional probability.³ Hence, only the quantization cells need to be designed.

From (7)–(9), we see that the distortion at a given time interval depends not only on the quantization cells for that interval (via the integration regions), but also on the choice of the cells at previous intervals. A natural question then is whether, at each interval, we can design the cells simply considering the pdf of the current sample conditioned on the past quantized values (we call this a greedy design), or instead, we need to consider the joint pdf of all samples, to jointly design the cells of all intervals (we call this a joint design). The main advantage of the greedy design is that, if it is optimal, we can devise a simple method for quantizing recursively. Unfortunately, it turns out that a greedy design is not optimal.⁴ This is shown by Example 1 below.

Example 1: Consider the LDC design depicted in

Fig. 2 (bottom), where the two quantized values are scalar discrete-time samples. Let both samples have a Gaussian distribution with zero mean and unit variance. We consider the greedy and joint designs described above. For the greedy design, the quantization cell boundaries for the first sample are computed using Lloyd's algorithm [7]. Then, for each value of the first quantized sample, the boundaries for the second sample are computed using the same algorithm, considering the pdf of the second sample conditioned on the previously coded value of the first. The quantization boundaries so obtained are shown in Fig. 3 (the figure only shows the boundaries on the positive axis, since those on the negative axis are symmetric). The figure also shows the distortions obtained for both samples. Notice that this design minimizes the distortion of the first sample, but not necessarily the total distortion. Minimizing the total distortion is the goal of the joint design, in which the quantization cell boundaries of both samples are jointly optimized. The optimization is carried out using a quasi-Newton (BFGS) procedure. In order to evaluate whether the optimization procedure gets stuck in a local minimum, we carry out ten parallel quasi-Newton searches, which are initialized using boundaries with randomly chosen locations. It turns out that all quasi-Newton searches yield the same result. The boundaries and distortions resulting from the joint design are shown in Fig. 4. We see that while the distortion of the first sample is higher than the one obtained with the greedy design, the total distortion of the joint design is smaller.
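The greedy design of Example 1 can be sketched with a sample-based Lloyd routine: design the first sample's quantizer, then one second-stage codebook per first-stage cell, using the samples falling in that cell as a stand-in for the conditional pdf. The correlation `rho`, the number of levels `L`, and the Monte Carlo design are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(3)

def lloyd_1d(samples, levels, iters=50):
    """Scalar Lloyd (k-means) quantizer design from training samples."""
    codes = np.quantile(samples, (np.arange(levels) + 0.5) / levels)
    for _ in range(iters):
        idx = np.argmin(np.abs(samples[:, None] - codes[None, :]), axis=1)
        for j in range(levels):
            if np.any(idx == j):
                codes[j] = samples[idx == j].mean()
    return np.sort(codes)

# Two jointly Gaussian samples x1, x2 with correlation rho: quantize x1
# with Lloyd, then design x2's quantizer per x1-cell (conditional pdf).
rho, L = 0.9, 4
x1 = rng.standard_normal(100_000)
x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.standard_normal(x1.size)

codes1 = lloyd_1d(x1, L)
i1 = np.argmin(np.abs(x1[:, None] - codes1[None, :]), axis=1)

d1 = np.mean((x1 - codes1[i1])**2)
d2 = 0.0
for j in range(L):                      # one x2 codebook per x1 cell
    sel = i1 == j
    codes2 = lloyd_1d(x2[sel], L)
    i2 = np.argmin(np.abs(x2[sel][:, None] - codes2[None, :]), axis=1)
    d2 += np.mean((x2[sel] - codes2[i2])**2) * np.mean(sel)
```

Conditioning shrinks the effective variance of the second sample, so `d2` comes out well below `d1`; as the text explains, this greedy scheme minimizes each stage's distortion but not necessarily the total.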

³This follows from the same argument used in the centroid optimality condition for vector quantization [7], by replacing the unconditional pdf by the conditional pdf.

⁴The theoretical possibility of this being the case was pointed out in [7, p. 524], in the context of recursive vector quantization.

Fig. 3. Quantization cell boundaries and distortion for the greedy encoder.

Fig. 4. Quantization cell boundaries and distortion for the joint encoder.

Example 1 indicates that the greedy method is not optimal. Hence, the quantization cells for all time intervals need to be jointly designed using the joint pdf. This can be a very cumbersome task. Luckily, numerical studies demonstrate that, when the samples have a jointly Gaussian distribution, the advantage obtained by the joint method is very small in comparison with the greedy method. This is illustrated in the following simple example.

Example 2: Consider the coding of samples taken from a discrete-time scalar Gaussian random process, generated by filtering discrete-time Gaussian white noise using a one-pole filter. We compare the distortion obtained using the greedy design with the one resulting from the joint design, as described in Example 1. Fig. 5 shows the distortion per sample resulting from both methods, for different values of the pole of the filter. (Notice that the value of the pole determines the level of correlation between consecutive samples. Notice also that, when samples are not correlated, the greedy and joint designs yield the same result.) The figure also shows the difference per sample, which measures the improvement offered by the joint design. We see that this improvement is indeed negligible for all pole values. Fig. 6 shows the same comparison for different numbers of quantization levels. (Notice that we do not constrain the plot to only those values which are powers of two.) Also in this case we see that the improvement given by the joint design is negligible.
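A sketch of Example 2's source model under assumed parameters: white Gaussian noise driven through a one-pole recursion, whose pole sets the correlation between consecutive samples.

```python
import numpy as np

rng = np.random.default_rng(4)

# One-pole (AR(1)) source: x[k] = a * x[k-1] + w[k], w ~ N(0, 1).
# The pole a is an illustrative choice; a = 0 gives uncorrelated samples,
# for which the greedy and joint designs coincide.
a, n = 0.8, 100_000
w = rng.standard_normal(n)
x = np.zeros(n)
for k in range(1, n):
    x[k] = a * x[k - 1] + w[k]

r1 = np.corrcoef(x[:-1], x[1:])[0, 1]   # lag-1 correlation, close to a
```

Sweeping `a` toward 1 increases the correlation and hence the potential (but, per Fig. 5, still negligible) advantage of the joint design.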


Fig. 5. Distortion comparison between greedy and joint encoders, for differentvalues of .

Fig. 6. Distortion comparison between greedy and joint encoders, for differentquantization levels.

In view of Example 2, and since we are only concerned about coding of Gaussian signals, we adopt the greedy approach to design the quantization cells for each time interval . Notice that, in this case, designing the cells is equivalent to designing the codewords . This equivalence follows from the nearest neighbor optimality condition for vector quantization [7], which states that the quantization cell of each is formed by those vectors which are closer to than to any other codeword. With this in mind, we state the proposed LDC strategy in Section III-C.

C. Resulting LDC Procedure

Following the analysis above, a suboptimal recursive LDC strategy is obtained by carrying out, at the -th time interval, the following four steps:

(S1) Use a KL decomposition to obtain the vector of coefficients , representing the signal in the interval ;
(S2) Compute the conditional pdf using the past quantized values ;
(S3) Use to compute the quantization codewords ;
(S4) Quantize by choosing as the codeword which is closest to .
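Step (S1) can be illustrated with a small, self-contained sketch. The code below is an assumption-laden toy (an AR(1) covariance stands in for the paper's process, and the interval length and number of coefficients are arbitrary): it computes a discrete KL basis by eigendecomposition of the segment covariance and projects one segment onto it.

```python
# Hedged sketch of step (S1): Karhunen-Loeve coefficients of one signal
# segment, computed from the (discretized) segment covariance matrix.
# Process, dimensions, and truncation level are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n, m = 64, 8                        # samples per interval, coefficients kept
a = 0.9
# Covariance of an AR(1) segment: Cov[i, j] = a^|i-j| / (1 - a^2).
idx = np.arange(n)
cov = a ** np.abs(idx[:, None] - idx[None, :]) / (1 - a ** 2)

evals, evecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
basis = evecs[:, ::-1][:, :m]       # top-m KL basis vectors (columns)

segment = np.linalg.cholesky(cov) @ rng.standard_normal(n)  # one realization
coeffs = basis.T @ segment          # the vector of KL coefficients
recon = basis @ coeffs              # truncated reconstruction
err = np.mean((segment - recon) ** 2) / np.mean(segment ** 2)
```

Because the basis is orthonormal, the reconstruction is an orthogonal projection, so the relative truncation error is always between 0 and 1 and shrinks as more coefficients are kept.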

The main burden in the strategy above lies in Steps (S2) and (S3). These are to be studied next.

Remark 3: The strategy described above redesigns the codewords at each time interval . A natural question is whether the decoder is able to reproduce these codewords for reconstructing . Notice that the codewords depend on conditional probabilities given the past quantized values . Since is available at the decoder, it is able to reproduce the codewords, making the reconstruction of possible.

Remark 4: As mentioned in Section I, our strategy for coding the sequence of vectors [i.e., steps (S2) to (S4)] is equivalent to a sequential quantization scheme [9], where, instead of the scalar components of a finite-dimensional vector, an infinite sequence of vectors is quantized. This prevents us from expressing the conditional pdf either analytically or using a training sequence. Instead, we use a Bayesian tracking procedure, which we describe in the next section.

IV. PROPOSED LDC ALGORITHM

In this section, we describe the numerical implementation of the steps (S2) and (S3) mentioned above.

A. State Space Realization for pdf Tracking

Step (S2) requires the computation of the conditional pdf at each , given the past quantized values . We introduce below a recursive scheme for doing so, using a state space realization of the vector process . Recall (3) and let . Then, it is easy to show that the autocorrelation of is given by

(10)

where denotes convolution and the superscript denotes the transpose, time-reversal operation (i.e., ). Now, since is Gaussian, using some spectral realization method [22] we can build a state space model

(11)

(12)

(13)

so that the autocorrelation of equals (10). In (11)–(13), denotes the state vector at time interval , denotes the quantizer at , and is a sequence of independent, -dimensional random vectors with distribution . Also, to satisfy the stationarity condition, we assume that the initial state has distribution , where is the solution of .

Using the state space model (11)–(13), we can recursively compute the conditional pdf using a Bayesian tracking procedure [23], [24]. More precisely, we first compute using the following recursive formulas:

(14)



(15)

where and

(16)

Then, is computed from as follows:

(17)
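The recursion (14)–(17) is an instance of the generic Bayesian tracking prediction-update cycle [23], [24]. In generic notation (with $x_k$ the state and $q_k$ the quantized output; these are stand-in symbols, not necessarily the paper's own), the cycle reads:

```latex
\begin{align}
p(x_k \mid q_{1:k-1}) &= \int p(x_k \mid x_{k-1})\, p(x_{k-1} \mid q_{1:k-1})\, \mathrm{d}x_{k-1},\\
p(x_k \mid q_{1:k}) &\propto p(q_k \mid x_k)\, p(x_k \mid q_{1:k-1}).
\end{align}
```

Here the likelihood of the quantized observation is, up to normalization, the indicator that the model output falls inside the quantization cell selected at time $k$. This non-Gaussian likelihood is what makes the recursion intractable in closed form and motivates the particle approximation of Section IV-B.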

B. Implementation Using Particle Filtering

The implementation of the recursive formulas (14)–(17) required for step (S2), and the design of the optimal codewords in (S3), are both numerically complex. We derive below a numerically tractable algorithm for approximately implementing these tasks using particle filtering [24].

For a given , let and . Fix , and suppose that the conditional distribution is known. We describe below an iterative algorithm which uses the (approximate) knowledge of to build an approximation of . Each iteration is formed by a number of steps which are detailed below. The initialization of the iterations is addressed subsequently.

Computing (17): The idea is to approximate the distribution by a sum of impulses (particles) located at some points , i.e.

(18)

The particle locations are obtained from random samples of the distribution . Now, from (18), (17) becomes

(19)

Computing (S3): In view of (19), the codewords , for the time interval , can be obtained by running the k-means algorithm [7] on the samples .

Computing (15): We have

(20)

Now, using (16) and the approximation (18), we have that could in principle be approximated by constraining the sum in (18) to only those impulses such that , i.e.

(21)

where equals the number of such impulses. However, notice that doing so reduces the number of particles from to . This would eventually leave the iterations with no particles. To prevent this, we obtain a new set of samples by drawing them randomly from the discrete distribution in (21). These samples would obviously have repetitions if . Doing so, we obtain

(22)

Computing (14): The last step consists in using the points to obtain points for building an approximation of as in (14). We have that

(23)

Hence, for each , we can obtain a point by simply adding to the point an extra component. This component is obtained by randomly sampling the distribution .

Initialization: For , we have that . Hence, we approximate it by the following sum of impulses

where the impulse locations are obtained as random samples of the distribution of .

In the explanation above, we have used particles to approximate joint conditional pdfs. Each particle is represented by



a vector in . Hence, there seems to be a dimension increase with . However, this is not the case. Notice that, due to the Markov property of the state space model (11)–(13), we only need to keep track of the last component of each , avoiding memory growth of the algorithm. With this consideration, the resulting algorithm is summarized below.

Algorithm 1

Start by drawing from the distribution of the initial state . Then, for each ,

1) Compute the codewords using the k-means algorithm and the points .
2) Choose , where (i.e., choose the codeword closest to ).
3) Keep only the set of points satisfying , i.e., such that is closer to than to any other codeword.
4) Obtain by making random choices (with possible repetitions) from the set , defined in Step 3.
5) For each , draw from the distribution .

Remark 5: Fig. 1 (right) shows a dictionary of continuous signals for LDC. Notice that recursively designing the codewords , for each time interval , using the k-means algorithm does not guarantee that the signals of the resulting dictionary are continuous on the boundaries of the intervals . To guarantee this, the codewords need to be chosen under the constraints , for all , where denotes the value of the reconstructed signal segment at the boundary . Similar constraints can be imposed to guarantee the continuity of the derivatives of at the interval boundaries.

Remark 6: As mentioned in Remark 3, the decoder needs to reproduce the codewords at time interval , in order to reconstruct . In Algorithm 1, these codewords are derived from the set of particles , which are randomly generated. In practice, a pseudo-random generator needs to be used, so that the same particles can be generated at both the encoder and decoder ends.

C. Complexity Analysis

In this section, we study the numerical complexity of Algorithm 1, proposed in Section IV-B. We summarize below the operations requiring floating-point multiplications. These operations need to be carried out at both the encoder and the decoder, unless explicitly stated, and at each time interval (i.e., once every seconds).

1) The KL transform at the encoder and the inverse KL transform at the decoder each require multiplications, where denotes the (super-Nyquist) frequency used for sampling before processing.

2) The k-means algorithm, with points of dimension and clusters, requires multiplications, where denotes the average number of iterations required for convergence.
3) Coding the vector requires multiplications (this is only done at the encoder).

4) The random particle choices carried out in Step 4 of Algorithm 1 require generating uniform random variables, and the generation of requires generating Gaussian variables.

5) Computing and requires multiplications.

Among the tasks above, the complexity of tasks 2 and 5 is dominant. Hence, at each time interval, the number of multiplications at both the encoder and the decoder is approximately given by

(24)

V. NUMERICAL EXPERIMENTS

In order to evaluate the performance of the proposed LDC algorithm, we compare it with a number of standard quantization techniques. For the comparison, we use a random process generated by filtering Gaussian white noise using a fifth-order Butterworth filter with cutoff frequency Hz. The power spectral density (PSD) of the resulting signal is shown in Fig. 7 (top). While, in theory, the signal is continuous-time, we implement it as a discrete-time signal with sampling frequency Hz, which is much higher than the Nyquist rate of . We generate samples spanning 1000 seconds. We describe below the quantization methods used in the comparison. Within the description of each method, we include the complexity associated with the decoding task. We consider this complexity instead of that of the encoding task, because the former is higher than or equal to the latter in all cases.

Sampling and scalar quantization : This method samples the continuous-time signal using a super-Nyquist rate of Hz. Then, each sample is quantized using a (non-uniform) optimal scalar quantizer. This quantizer is designed using Lloyd's algorithm [7, Sec. 6.4]. The reconstruction of the continuous-time signal from the quantized samples is done using a filter derived from a non-causal infinite impulse response (IIR) filter . Hence, to achieve a prescribed reconstruction delay , we truncate the non-causal component of the impulse response of so that , for . In order to reduce the error introduced by such truncation, we choose to be a raised-cosine filter with cutoff frequency 0.5 Hz and roll-off factor . The frequency and impulse responses of this filter are shown in Fig. 7. Finally, to reduce computations, we also truncate the causal component of so that , for sec. The decoding complexity of this method is determined by the reconstruction process, and is given by multiplications per sample (i.e., per second).

Fig. 7. (Top) PSD of and frequency response of the reconstruction filter. (Bottom) Impulse response of the reconstruction filter.

Sampling and linear predictive quantization

: This method is similar to the SMP+SQ method described above, with the only difference in the quantization stage. This consists of a linear predictive quantizer, with a linear predictor of 10th order. The design of the quantizer is done using the method described in [25], which is summarized in Appendix C. To do so, we use a training signal with samples. The decoding complexity of this method is due to the reconstruction task and the prediction, i.e., multiplications per sample (i.e., per second).

KL decomposition and scalar quantization :

This method uses a KL decomposition to obtain a sequence of vector coefficients of , as explained in Section III-A. The number of components of the vectors is chosen as the smallest number of components yielding a truncation error at least 50 dB smaller than the total power. Then, each component of is quantized using the scalar scheme described in SMP+SQ. To do so, the bits available to quantize the vector need to be allocated over its components. We do so using the greedy bit allocation algorithm described in [7, Sec. 8.4]. In this method, the bits are sequentially allocated by assigning, at each step, one additional bit to the component having the highest reconstruction error. The decoding complexity of this method is due to the KL transform, and is given by multiplications every seconds.

KL decomposition and linear predictive quantization

: This method is similar to the KL+SQ method, with the only difference in how the components are quantized. This is done using the linear predictive method described in . The decoding complexity is given by the KL transform and the prediction, i.e., multiplications every seconds.

Fig. 8. Distortion comparison for different reconstruction delays , and fixed bit rate .

KL decomposition and vector quantization :

This method uses a KL decomposition as described for the method. Then, each vector is jointly quantized using an optimal vector quantizer. This quantizer is designed using the generalized Lloyd's algorithm [7, Sec. 11.3] and a training signal of samples. The decoding complexity of this method is given by the KL transform, i.e., multiplications every seconds.

KL decomposition and linear predictive vector quantization : This method is similar to the KL+VQ method, with the only difference in how the vectors are quantized. This is done using a linear predictive vector quantizer. The quantizer is designed using the method in [25] and summarized in Appendix C. Following [25], we use a first-order vector linear predictor in KL+LPVQ, since higher orders lead to instability during the design procedure. The decoding complexity of this method is given by the KL transform and linear prediction, i.e., multiplications every seconds.

Proposed method: The proposed method uses the KL transform described for the method. For obtaining the state-space model (11)–(12), we use the algorithm described in Appendix D. Also, for the pdf tracking and codeword design algorithm described in Section IV-B, we use particles.

In the first simulation, we compare the distortion of the proposed method to those of the methods listed above. To do so, we use a fixed rate of bits per second, and we vary the reconstruction delay from 0.125 to 1 s (so that the quantization of is done using 1 to 8 bits). The result of this comparison is shown in Fig. 8. For comparison purposes, we point out that the rate distortion function of the continuous-time signal , evaluated at , equals dB. We see that the distortions of the and the proposed methods are noticeably smaller than those of the other methods. Also, the proposed method outperforms the method for low reconstruction delays. The reason for this is discussed in the next paragraph.

As shown in Section III-B2, a nearly optimal quantization strategy is achieved by designing the vector quantizer for each vector sample using the conditional pdf



Fig. 9. Comparison between the and the proposed methods, for different reconstruction delays and bit rates .

Fig. 10. Comparison between the and the proposed methods, for different bit rates and minimum delay .

of that sample, given the previous quantization values. This requires to be redesigned at each , as done in the proposed method. Instead of doing so, the KL+LPVQ method uses , where is the predicted value of ; i.e., at each it uses the same quantizer (which we design using the method described in Appendix C), whose center is shifted by . Now, if there is no quantization (or, equivalently, as the number of quantization bits tends to infinity), becomes a Gaussian distribution with mean and a covariance matrix which is independent of . Hence, the KL+LPVQ method becomes optimal as increases; and therefore, the proposed coding method is advantageous when is small. When the coding bit rate is fixed, this (i.e., a small value of ) corresponds to a small delay .

To see this point in more detail, in the second simulation we repeat the experiment, involving only the KL+LPVQ method and the proposed method, and considering different bit rates . The result is shown in Fig. 9, which confirms the advantage of the proposed method for low reconstruction delays.

Finally, in Fig. 10 we compare the distortions of the KL+LPVQ and the proposed methods for different bit rates , using for each rate the minimum delay, i.e., , so that is always quantized with one bit (i.e., ). We see that the advantage of the proposed method becomes clearer at high rates, where the distortion becomes smaller.

Table I shows the details for computing the complexity of the proposed method, measured in number of multiplications per interval of seconds. These details correspond to a rate of bits per second, and reconstruction delays ranging from 0.125 to 0.5 s. Table II shows the complexity of all other methods used in the comparison. We see that the complexity of the proposed method is significantly higher than those of the other methods. Hence, in view of the performance comparison presented above, we conclude that the proposed method is a valid option for encoding under low reconstruction delay (i.e., when transmitting one or two bits at a time), if the extra complexity can be afforded.

TABLE I
DETAILS OF THE COMPUTATION OF THE COMPLEXITY (MULT./ ) OF THE PROPOSED METHOD, FOR

TABLE II
COMPLEXITY (MULT./ ) OF ALL OTHER METHODS, FOR

VI. CONCLUSION

We have proposed a fixed-rate encoding method for continuous-time Gaussian signals to reduce the reconstruction distortion under given constraints on the reconstruction delay and data rate. The proposed approach uses a Karhunen-Loève decomposition to obtain a sequence of coefficient vectors, whose innovation is vector-quantized. This is done recursively, by considering the conditional pdf of the current coefficient vector given the past quantized data, which is already available at the decoder. While the proposed method achieves a suboptimal reconstruction distortion, numerical experiments show that the difference with the optimal reconstruction is negligible in the Gaussian case, while the method offers advantages over other available approaches, especially when the reconstruction delay is small.

APPENDIX A
PROOF OF (6)

Since the quantized value does not depend on the future values , we have



Now, since whenever , it follows that:

and since

it follows that:

Finally, (6) follows from:

APPENDIX B
MONOTONIC DECREASE OF WITH

For any , define

and let denote the optimal quantizer for of rate . Let also denote the distortion obtained by using the quantizer on and ignoring the remaining coefficients of the expansion (5).

Let , and define the (not necessarily optimal) quantizer for , obtained from the quantizer as follows:

for all , where denotes the th component of . That is, quantizes the first components of according to the optimal quantizer for , and sets the remaining components to zero. Since is the optimal quantizer for of rate , it follows that

APPENDIX C
DESIGN OF AN LPQ USING THE METHOD IN [25]

For simplicity, we describe the design of a scalar LPQ. The same procedure can be straightforwardly applied to design a vector LPQ.

Let be a scalar discrete-time signal, be an th-order predictor ( is the forward shift operator, i.e., ), and be a scalar quantizer. In the LPQ scheme, the reconstructed version of is obtained as follows:

Now, given the statistics (i.e., the autocorrelation) of the input signal, and the predictor (which is straightforwardly designed using linear least-squares [7, Sec. 4.3]), the problem is how to design . To do so, we use a realization as a training signal, together with the following iterative procedure. At iteration , let denote the reconstructed version of obtained from the previous iteration. Then, the quantizer is computed using the k-means algorithm on the residuals

After doing so, the reconstruction is computed by

The iterations stop when the reconstruction error stops decreasing. Also, the iterations are initialized by choosing

where the quantizer is computed using the k-means algorithm on the residuals .
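A hedged sketch of the closed-loop LPQ operation just described (not the design iterations of [25] themselves; the first-order predictor, codebook, and process pole below are illustrative assumptions) is:

```python
# Toy first-order scalar LPQ: predict from the reconstruction, quantize
# the residual with a fixed codebook, and feed the reconstruction back.
# Predictor coefficient and codebook are illustrative, not from [25].
import numpy as np

def lpq(x, codebook, a):
    xhat = np.zeros_like(x)
    prev = 0.0
    for t in range(len(x)):
        pred = a * prev                                # predict from past reconstruction
        q = codebook[np.argmin(np.abs(codebook - (x[t] - pred)))]  # quantize residual
        xhat[t] = pred + q                             # closed-loop reconstruction
        prev = xhat[t]
    return xhat

rng = np.random.default_rng(0)
a, n = 0.9, 5000
x = np.zeros(n)
for t in range(1, n):
    x[t] = a * x[t - 1] + rng.standard_normal()

codebook = np.array([-1.5, -0.5, 0.5, 1.5])  # 2-bit residual codebook
xhat = lpq(x, codebook, a)
lpq_mse = np.mean((x - xhat) ** 2)
# For reference: memoryless quantization of x with the same codebook.
xq = codebook[np.argmin(np.abs(x[:, None] - codebook[None, :]), axis=1)]
direct_mse = np.mean((x - xq) ** 2)
```

Predicting from the reconstruction, rather than from the signal itself, is what lets the decoder, which only sees the quantized residuals, run the identical predictor.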

APPENDIX D
SPECTRAL REALIZATION METHOD USED FOR COMPUTING (11)–(12)

In this Appendix, we describe the method we use to compute the state space model (11)–(12) from the autocorrelation (10).



Let be such that the tail is negligible, and let

(25)

Now, for a given , we can find and such that

This is done by solving the linear least-squares problem

where denotes the matrix Frobenius norm. For this approximation, the value of is chosen using Akaike's criterion [26, p. 442].

The next step is to build a state-space model

[similar to the one in (11)–(12)] such that its transfer function approximates .

This is done by choosing [27, p. 101]

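The companion-form (controllable canonical) realization of [27, p. 101] can be sketched for a scalar second-order transfer function; the coefficients below are illustrative stand-ins for the least-squares fit described above:

```python
# Sketch of a controllable-canonical (companion) realization of
# H(z) = (b1 z^-1 + b2 z^-2) / (1 + a1 z^-1 + a2 z^-2); the
# coefficients are illustrative, not the paper's fitted values.
import numpy as np

def companion_realization(a, b):
    """Return (A, B, C) with C (zI - A)^{-1} B = H(z)."""
    p = len(a)
    A = np.zeros((p, p))
    A[0, :] = -np.asarray(a, dtype=float)   # first row: negated denominator
    A[1:, :-1] = np.eye(p - 1)              # shift structure below it
    B = np.zeros((p, 1))
    B[0, 0] = 1.0
    C = np.asarray(b, dtype=float).reshape(1, p)
    return A, B, C

a_coef = [-0.5, 0.06]   # denominator 1 - 0.5 z^-1 + 0.06 z^-2
b_coef = [1.0, 0.3]     # numerator z^-1 + 0.3 z^-2
A, B, C = companion_realization(a_coef, b_coef)

# Impulse response of the realization: h[k+1] = C A^k B.
h = [(C @ np.linalg.matrix_power(A, k) @ B).item() for k in range(5)]
```

One can verify by direct recursion that the first impulse-response samples are 1.0, 0.8, 0.34, matching the transfer function above; this is the sense in which the companion form realizes the fitted coefficients.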

The state-space model is typically not minimal. Hence, we use balanced truncation [27, p. 197] to obtain a model of reduced order (i.e., with a matrix of smaller dimension than ), whose transfer matrix is approximately equal to . Then, we have

(26)

Now, from (25) and (26), it follows that the spectrum of is given by:

Using the spectral factorization technique in [28, Sec. 8.5], it follows that:

(27)

where and , with

and the matrix being the solution of the Riccati equation

Equation (27) implies that a spectral realization of is given by the following state-space model

(28)

(29)

However, the model above includes an output noise component given by the matrix . To obtain the desired model (11)–(12) from (28)–(29), we use the transformation in [26, p. 178]. Doing so, we obtain

REFERENCES

[1] H. Farhangi, "The path of the smart grid," IEEE Power Energy Mag., vol. 8, no. 1, pp. 18–28, Jan.–Feb. 2010.
[2] K. Moslehi, "Intelligent infrastructure for coordinated control of a self-healing power grid," in Proc. IEEE Electr. Power Syst. Conf. (EPEC '08), 2008.
[3] G. Nair, F. Fagnani, S. Zampieri, and R. Evans, "Feedback control under data rate constraints: An overview," Proc. IEEE, vol. 95, no. 1, pp. 108–137, 2007.
[4] M. Fu and C. de Souza, "State estimation for linear discrete-time systems using quantized measurements," Automatica, vol. 45, no. 12, pp. 2937–2945, 2009.
[5] M. Unser, "Sampling-50 years after Shannon," Proc. IEEE, vol. 88, no. 4, pp. 569–587, 2000.
[6] R. Zamir and M. Feder, "Rate-distortion performance in coding bandlimited sources by sampling and dithered quantization," IEEE Trans. Inf. Theory, vol. 41, no. 1, pp. 141–154, 1995.
[7] A. Gersho and R. M. Gray, Vector Quantization and Signal Compression, 1st ed. New York: Springer, 1991.
[8] R. G. Gallager, Information Theory and Reliable Communication. New York: Wiley, 1968.
[9] R. Balasubramanian, C. Bouman, and J. Allebach, "Sequential scalar quantization of vectors: An analysis," IEEE Trans. Image Process., vol. 4, no. 9, pp. 1282–1295, Sept. 1995.
[10] N. Gaarder and D. Slepian, "On optimal finite-state digital transmission systems," IEEE Trans. Inf. Theory, vol. 28, no. 2, pp. 167–186, 1982.
[11] H. Witsenhausen, "On the structure of real-time source coders," Bell Syst. Tech. J., vol. 58, no. 6, pp. 1437–1451, 1979.
[12] J. Walrand and P. Varaiya, "Optimal causal coding-decoding problems," IEEE Trans. Inf. Theory, vol. 29, no. 6, pp. 814–820, 1983.
[13] D. Teneketzis, "On the structure of optimal real-time encoders and decoders in noisy communication," IEEE Trans. Inf. Theory, vol. 52, no. 9, pp. 4017–4035, 2006.
[14] V. Borkar, S. Mitter, and S. Tatikonda, "Optimal sequential vector quantization of Markov sources," SIAM J. Contr. Optimiz., vol. 40, no. 1, pp. 135–148, 2001.
[15] Y. Kaspi and N. Merhav, "Structure theorems for real-time variable-rate coding with and without side information," arXiv preprint arXiv:1108.2881, 2011.
[16] D. Neuhoff and R. Gilbert, "Causal source codes," IEEE Trans. Inf. Theory, vol. 28, no. 5, pp. 701–713, 1982.
[17] T. Linder and R. Zamir, "Causal coding of stationary sources and individual sequences with high resolution," IEEE Trans. Inf. Theory, vol. 52, no. 2, pp. 662–680, 2006.
[18] T. Linder and G. Lugosi, "A zero-delay sequential scheme for lossy coding of individual sequences," IEEE Trans. Inf. Theory, vol. 47, no. 6, pp. 2533–2538, 2001.
[19] T. Weissman and N. Merhav, "On limited-delay lossy coding and filtering of individual sequences," IEEE Trans. Inf. Theory, vol. 48, no. 3, pp. 721–733, 2002.
[20] A. Gyorgy, T. Linder, and G. Lugosi, "Efficient adaptive algorithms and minimax bounds for zero-delay lossy source coding," IEEE Trans. Signal Process., vol. 52, no. 8, pp. 2337–2347, 2004.
[21] J. Conway, A Course in Functional Analysis. New York: Springer, 1990.
[22] A. C. Antoulas, Approximation of Large-Scale Dynamical Systems (Advances in Design and Control). Philadelphia, PA: SIAM, Jul. 2005.
[23] M. West and J. Harrison, Bayesian Forecasting and Dynamic Models, ser. Springer Series in Statistics, 2nd ed. New York: Springer, Mar. 1999.
[24] M. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, "A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking," IEEE Trans. Signal Process., vol. 50, no. 2, pp. 174–188, 2002.
[25] H. Khalil, K. Rose, and S. Regunathan, "The asymptotic closed-loop approach to predictive vector quantizer design with application in video coding," IEEE Trans. Image Process., vol. 10, no. 1, pp. 15–23, Jan. 2001.
[26] T. Soderstrom and P. Stoica, System Identification, ser. Prentice-Hall International Series in Systems and Control Engineering. Englewood Cliffs, NJ: Prentice-Hall, Mar. 1994.
[27] C.-T. Chen, Linear System Theory and Design (Oxford Series in Electrical and Computer Engineering), 3rd ed. Oxford: Oxford Univ. Press, Sep. 1998.
[28] T. Kailath, A. H. Sayed, and B. Hassibi, Linear Estimation, 1st ed. Englewood Cliffs, NJ: Prentice-Hall, Apr. 2000.

Damián Marelli received the Bachelor's degree in electronics engineering from the Universidad Nacional de Rosario, Argentina, in 1995, and the Ph.D. degree in electrical engineering and the Bachelor (Honors) degree in mathematics from the University of Newcastle, Australia, in 2003.

From 2004 to 2005, he held a postdoctoral position at the Laboratoire d'Analyse, Topologie et Probabilités, CNRS/Université de Provence, France. Since 2006, he has been a Research Academic at the School of Electrical Engineering and Computer Science, University of Newcastle. In 2007, he received a Marie Curie Postdoctoral Fellowship, hosted at the Faculty of Mathematics, University of Vienna, Austria, and in 2010, he received a Lise Meitner Senior Fellowship, hosted at the Acoustics Research Institute of the Austrian Academy of Sciences. His main research interests include signal processing and communications.

Kaushik Mahata received the Ph.D. degree in electrical engineering from Uppsala University, Sweden, in 2003.

Since then, he has been with the University of Newcastle, Australia. His research interest is in signal processing and its applications.

Minyue Fu (F’83) received the Bachelor’s degree inelectrical engineering from the University of Scienceand Technology of China, Hefei, China, in 1982, andthe M.S. and Ph.D. degrees in electrical engineeringfrom the University of Wisconsin-Madison, in 1983and 1987, respectively.From 1983 to 1987, he held a Teaching Assistant-

ship and a Research Assistantship at the Universityof Wisconsin-Madison. He worked as a ComputerEngineering Consultant at Nicolet Instruments,Inc., Madison, during 1987. From 1987 to 1989, he

served as an Assistant Professor in the Department of Electrical and ComputerEngineering, Wayne State University, Detroit, MI. He joined the Departmentof Electrical and Computer Engineering, University of Newcastle, Australia, in1989, where he is currently Chair Professor in Electrical Engineering and Headof School of Electrical Engineering and Computer Science. In addition, he wasa Visiting Associate Professor at the University of Iowa during 1995–1996,and a Senior Fellow/Visiting Professor at Nanyang Technological University,Singapore, 2002. He holds a Qian-ren Professorship at Zhejiang University,China. His main research interests include control systems, signal processing,and communications.Dr. Fu has been an Associate Editor for the IEEE TRANSACTIONS ON

AUTOMATIC CONTROL, Automatica, and the Journal of Optimization andEngineering.