Innovations and singular value decomposition for blind sequence detection in wireless channels

Signal Processing 83 (2003) 1945–1959

www.elsevier.com/locate/sigpro


Sujit Sen∗, Subbarayan Pasupathy

c/o Graduate Office, Department of Electrical and Computer Engineering, University of Toronto, 10 King's College Road, Toronto, ON, Canada M5S 3G4

Received 28 January 2002; received in revised form 4 August 2002

Abstract

A blind sequence detection algorithm based on the innovations approach is proposed and its performance in a slow and flat Rayleigh fading environment is evaluated. For a blind sequence detection algorithm, the main goal is to estimate the source correlation from the observed received samples. The Viterbi algorithm is then used to detect the input symbols. The proposed detector is a blind algorithm since it does not require any channel parameter information in order to detect the input symbols. A comparison between the innovations and the singular value decomposition (SVD) [14] based blind sequence detection algorithms is also presented. Our study considers only linearly modulated signals, which are useful for diversity receivers. Simulation results show that the innovations blind sequence detector (BSD) has about a 3 dB gain over the SVD BSD in the fading model studied in this paper. Various performance measures are presented to explain the behaviour of these BSDs in both the additive white Gaussian noise and fading environments.

© 2003 Elsevier B.V. All rights reserved.

Keywords: Blind sequence detection; Innovations; Singular value decomposition; Viterbi algorithm; Wireless; Multipath fading channels

1. Introduction

In order to have high-speed, reliable communication on a wireless channel, channel identification and equalization are required to combat intersymbol interference (ISI). Normally, channel identification and equalization are done either by sending long training sequences or by designing the equalizer based upon prior knowledge of the channel. Unfortunately, in radio communication channels little is known about the channel a priori, and many standard adaptive detectors used in radio environments waste some of their transmission time on a training sequence. Recently, blind sequence detection and innovations based estimation techniques have received a lot of attention for recovering a transmitted data sequence corrupted by an unknown channel.

∗ Corresponding author. Tel.: +1-416-946-8809; fax: +1-416-978-4425.

E-mail addresses: [email protected] (S. Sen), [email protected] (S. Pasupathy).

0165-1684/03/$ - see front matter © 2003 Elsevier B.V. All rights reserved. doi:10.1016/S0165-1684(03)00113-0

Blind equalization is where one estimates the channel without knowing what the transmission channel characteristics are. Numerous methods (e.g. [4,9,13]) have been proposed that use higher-order statistics (HOS) for blind equalization. Methods that employ HOS can correctly identify non-minimum phase channels. However, the convergence time of these HOS algorithms makes them unsuitable for mobile channels. Second-order cyclostationary blind signal processing methods that converge faster have been proposed (e.g. [15–17]), but these methods cannot properly identify a channel when the channel has certain special zeroes [14].

Adaptive detectors are widely used in blind sequence detection for channel estimation, sequence detection and data recovery. Ghosh and Weber [3] have proposed a receiver that does joint data and channel estimation in a recursive manner. The data estimation part is accomplished by using maximum likelihood sequence estimation (MLSE) (Viterbi decoding), while the channel estimation is done by the least-squares estimation technique. In addition, Ghosh and Weber employ a delayed decision feedback sequence estimation (DDFSE) algorithm [2] to reduce the number of states in the Viterbi trellis in their maximum-likelihood blind equalization method. Seshadri [12] has developed a joint data and channel estimation algorithm using a fast blind trellis search technique. Even though the Viterbi based blind technique developed by Seshadri offers an optimal solution, the price paid is high computational complexity. Raheli et al. [10] discuss the per-survivor processing (PSP) algorithm, which can be used to approximate MLSE algorithms whenever unknown quantities exist that prevent the exact use of the standard Viterbi algorithm. The main principle behind PSP is that the data-aided estimation of unknown parameters can be embedded into the framework of the Viterbi algorithm. Unfortunately, the use of PSP results in a longer computation time. PSP can operate without any training sequences, but only when there are enough data for PSP to converge to the channel estimates [1]. Kim and Cox [5] have developed a blind sequence estimator that does not require any training sequences. Their estimator is based on PSP-MLSE, where the initial channel estimates needed for the conventional MLSE are obtained blindly at the instant when a burst arrives. This blind initial channel estimate helps PSP to converge quickly. One application of this estimator is in short-burst TDMA formats.

Tong [14] has developed a blind sequence estimation technique that is very suitable for use in mobile channels. His technique is primarily based upon singular value decomposition (SVD). In blind sequence detection, the training period of the transmission over an ISI channel can be reduced or eliminated, and convergence occurs within 100 symbols. Many mobile channels require channel identification within 100 symbols [14]. Similarly, the innovations process has found numerous applications in the area of wireless communications. Yu and Pasupathy [19] have applied the innovations approach to Rayleigh fading channels, and have developed a general and practical MLSE receiver which demodulates a received signal recursively in a non-coherent fashion. So far in the current literature, nobody has studied the application of an innovations approach to blind sequence detection (especially in the case of fading channels) or compared it to the SVD based blind sequence detection algorithm proposed in [14].

In this paper we hope to achieve two objectives.

First, we will present an alternative method for blind sequence detection. One of the main motivations for using an innovations approach for blind sequence detection is that it has several inherent advantages over SVD. For example, for an $N \times N$ autocorrelation matrix $\mathbf{R}_x$, the innovations process, which is based on Cholesky decomposition, results in a lower triangular matrix with $N(N+1)/2$ elements, while SVD has $N^2$ elements [18]. Therefore, innovations has about half the number of coefficients when compared to SVD. Furthermore, innovations is a causal system. Hence, the innovations approach is a simple, causal and computationally efficient system when compared to SVD. It will be shown later that the Cholesky decomposition of $\mathbf{R}_x$ is required to construct an innovations based blind sequence detector (BSD). This is similar to the detector developed in [14], where the SVD of $\mathbf{R}_x$ was used to develop an SVD based blind sequence detection algorithm.

An innovations based BSD will be presented and implemented for Rayleigh fading channels. Similar to Tong's approach [14], we focus on blind sequence detection using an innovations approach instead of channel identification. The main idea is to estimate the source (deterministic) correlation from the observations (without knowing the channel). The Viterbi algorithm is then used to reconstruct the input symbols. The estimates of the signal correlation function require only channel orthogonalization. Second, a comparison will be made between the SVD BSD and the proposed innovations based BSD. These two BSDs are studied in a slow and flat fading channel and in an additive white Gaussian noise (AWGN) channel. Furthermore, our work considers linearly modulated signals, which operate well for diversity receivers.

This paper is organized as follows. Section 2 describes the system model and assumptions. In Section 3, an innovations based blind sequence detection algorithm is developed and presented. The computer simulations, together with an in-depth discussion of the results, are presented in Section 4. Finally, conclusions are stated in Section 5.

2. System model and assumptions

In this section we outline the system model and parameters used to develop our innovations based BSD. This model is similar to the one used in [14]. A multi-path Rayleigh fading channel is often used to represent the channel characteristic between a transmitter and the $i$th mobile receiver in a wireless environment. The multi-path Rayleigh fading channel envelope impulse response can be written as

$$c_i(t) = \sum_{k=1}^{K} \alpha_{ik}(t)\,\delta(t - \tau_{ik}), \tag{1}$$

where $K$ is the number of multi-paths, $\alpha_{ik}(t)$ are zero-mean Gaussian processes and $\tau_{ik}$ are the associated path delays. The subscript "$i$" denotes the $i$th receiver in the system. The received baseband signal at the $i$th receiver can be written as

$$x_i(t) = \sum_{n} s_n\, p(t - nT_s) * c_i(t) + n_i(t) \tag{2}$$

$$\phantom{x_i(t)} = \sum_{n} s_n\, h_i(t - nT_s) + n_i(t). \tag{3}$$

Here $s_n$ represents the symbol sequence, where one symbol is transmitted in every time interval $T_s$. The "combined" channel $h_i(t)$ includes the channel, pulse shaping ($p(t)$) and receiver filters. AWGN is represented by $n_i(t)$. The signal $x_i(t)$ is bandlimited and sampled at $t = l\Delta$. As such, the discrete time model can be expressed as

$$x_i(l) = \sum_{n} s_n\, h_i(l - nN) + n_i(l), \tag{4}$$

where $N = T_s/\Delta$ is an integer.

In this model, we use oversampling and not symbol rate sampling. Symbol rate sampling is used for a system that utilizes matched filters. In order to use a matched filter, one must have knowledge of the channel. However, in blind sequence detection the channel is unknown and hence symbol rate sampling cannot be used. Generally speaking, oversampling is used when the channel is unknown in order to get sufficient statistics.

Assume $h_i(l)$ lasts $d$ symbol intervals, or in other words $h_i(l) = 0$ for $l < 0$ and $l > dN$. Thus, the received signal can be written in the following vector form:

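$$\mathbf{x}_i(t) = [x_i(tN), \ldots, x_i((t+1)N - 1)]^{\mathrm{T}}, \tag{5}$$

$$\mathbf{n}_i(t) = [n_i(tN), \ldots, n_i((t+1)N - 1)]^{\mathrm{T}}, \tag{6}$$

$$\mathbf{h}_i(t) = [h_i(tN), \ldots, h_i((t+1)N - 1)]^{\mathrm{T}}, \tag{7}$$

$$\mathbf{H}_i = [\mathbf{h}_i(0), \ldots, \mathbf{h}_i(d-1)], \tag{8}$$

$$\mathbf{s}(t) = [s_t, s_{t-1}, \ldots, s_{t-d+1}]^{\mathrm{T}}. \tag{9}$$

Therefore, with $\mathbf{H}_i$ an $N \times d$ matrix, we have

$$\mathbf{x}_i(t) = \mathbf{H}_i \mathbf{s}(t) + \mathbf{n}_i(t). \tag{10}$$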
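The step from the sample-level model (4) to the vector form (10) can be checked numerically. The following is a minimal sketch, assuming a randomly drawn combined channel of exactly $dN$ taps, BPSK symbols and no noise; the function names `received_samples` and `channel_matrix` and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def received_samples(symbols, h, N, noise=None):
    """Oversampled output x(l) = sum_n s_n h(l - n N) + n(l), as in (4)."""
    symbols = np.asarray(symbols)
    h = np.asarray(h)
    # Place symbol s_n at sample index n*N, then filter with the combined channel.
    upsampled = np.zeros(len(symbols) * N, dtype=complex)
    upsampled[::N] = symbols
    x = np.convolve(upsampled, h)[: len(upsampled)]
    if noise is not None:
        x = x + noise
    return x

def channel_matrix(h, N, d):
    """N x d matrix whose column t holds h(tN), ..., h((t+1)N - 1), as in (7)-(8)."""
    return np.asarray(h).reshape(d, N).T

rng = np.random.default_rng(0)
N, d = 4, 2                                    # hypothetical oversampling factor and channel length
h = rng.standard_normal(d * N) + 1j * rng.standard_normal(d * N)
s = rng.choice([-1.0, 1.0], size=20)           # BPSK symbols
x = received_samples(s, h, N)

H = channel_matrix(h, N, d)
t = 7
x_vec = x[t * N:(t + 1) * N]                   # x(t) as in (5)
s_vec = s[t - np.arange(d)]                    # s(t) = [s_t, ..., s_{t-d+1}] as in (9)
print(np.allclose(x_vec, H @ s_vec))
```

The check only holds for $t \ge d-1$, since earlier blocks depend on symbols before the start of the simulated sequence.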

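If we have $L$ receivers, we can put all the information from the $L$ receivers into a compact form as shown below:

$$\mathbf{x}(t) = \begin{bmatrix} \mathbf{x}_1(t) \\ \vdots \\ \mathbf{x}_L(t) \end{bmatrix} \in \mathbb{C}^{NL \times 1}, \tag{11}$$

$$\mathbf{n}(t) = \begin{bmatrix} \mathbf{n}_1(t) \\ \vdots \\ \mathbf{n}_L(t) \end{bmatrix} \in \mathbb{C}^{NL \times 1}, \tag{12}$$

$$\mathbf{H} = \begin{bmatrix} \mathbf{H}_1 \\ \vdots \\ \mathbf{H}_L \end{bmatrix} \in \mathbb{C}^{NL \times d}. \tag{13}$$

In vector form this reduces to

$$\mathbf{x}(t) = \mathbf{H}\mathbf{s}(t) + \mathbf{n}(t). \tag{14}$$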

The main idea behind blind sequence detection is to detect the information symbols $s_n$ without knowing the channel parameter $\mathbf{H}$.

The following assumptions will be made in our model [14]:

(A1) The information symbol sequence $s_n$ is zero mean and $E(s_i s_j^*) = \delta(i - j)$.

(A2) The noise $n_j(\cdot)$ is zero mean for all $j$ and $E(n_i(t_1) n_j^*(t_2)) = \sigma^2 \delta(i - j)\delta(t_1 - t_2)$.

(A3) The noise process is uncorrelated with $\{s_n\}$.

(A4) The channel identification matrix $\mathbf{H}$ is an $NL$ by $d$ matrix ($N$ by $d$ for a single receiver) with full column rank.

3. An innovations based BSD

There appears to be a connection between the work done by Tong [14] and by Yu and Pasupathy [19]. It can be shown that a relationship exists between innovations and SVD [11]. Since Tong has used an SVD approach to BSD, it is theoretically possible to derive an innovations based BSD. The innovations approach is simple, causal and numerically efficient [18]. In this section, a new approach towards blind sequence detection is proposed based on innovations. This proposed approach does not require any channel identification and follows the same principles behind the SVD BSD used in [14]. The fundamental idea behind this methodology is to estimate the source (deterministic) correlation without knowing the channel characteristics, and then to use the Viterbi Algorithm to obtain the source symbols. We will also show that, in order to estimate the signal correlation function, one only has to orthogonalize the channel. The innovations approach is used as a motivation for this orthogonalization.

3.1. Innovations and channel orthogonalization

The innovations process plays an important role in our BSD. The main idea in our approach is to orthogonalize the channels using only the observation data.

Let

$$\mathbf{R}_x(k) = E(\mathbf{x}(t)\mathbf{x}^*(t - k)). \tag{15}$$

From (15) and assumptions (A1)–(A4) we get

$$\mathbf{R}_x(0) = \mathbf{H}\mathbf{H}^* + \sigma^2 \mathbf{I}. \tag{16}$$
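Relation (16) can be illustrated by comparing a sample covariance against $\mathbf{H}\mathbf{H}^* + \sigma^2\mathbf{I}$. This is a rough sketch, assuming i.i.d. unit-variance BPSK symbols, circular complex Gaussian noise and a randomly drawn $\mathbf{H}$; the sizes, the seed and the noise variance are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
NL, d, sigma2, T = 4, 2, 0.1, 200_000        # hypothetical sizes and noise variance

H = (rng.standard_normal((NL, d)) + 1j * rng.standard_normal((NL, d))) / np.sqrt(2)
stream = rng.choice([-1.0, 1.0], size=T + d - 1)       # i.i.d. unit-variance BPSK symbols
# Column t of S is s(t) = [s_t, s_{t-1}, ..., s_{t-d+1}]^T.
S = np.stack([stream[d - 1 - j: d - 1 - j + T] for j in range(d)])
noise = np.sqrt(sigma2 / 2) * (rng.standard_normal((NL, T))
                               + 1j * rng.standard_normal((NL, T)))
X = H @ S + noise                                       # x(t) = H s(t) + n(t), as in (14)

R_hat = X @ X.conj().T / T                              # sample estimate of R_x(0)
R_model = H @ H.conj().T + sigma2 * np.eye(NL)          # right-hand side of (16)
print(np.max(np.abs(R_hat - R_model)))                  # shrinks as T grows
```

The residual is a finite-sample effect; with far fewer samples (for example the 100-symbol blocks discussed later) the estimate is correspondingly noisier.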

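By the innovations approach or Cholesky factorization [8], $\mathbf{R}_x(0)$ can be written as

$$\mathbf{R}_x(0) = \mathbf{P}^{-1}\mathbf{D}(\mathbf{P}^*)^{-1}, \tag{17}$$

where $\mathbf{P}$ is a lower triangular matrix composed of all orders of prediction coefficients of $\mathbf{x}(t)$ and $\mathbf{D}$ is a diagonal matrix whose diagonal entries are the variances of the corresponding prediction errors. If there is no noise in (16) ($\sigma^2 = 0$), we then have

$$\mathbf{H}\mathbf{H}^* = \mathbf{P}^{-1}\mathbf{D}(\mathbf{P}^*)^{-1}, \tag{18}$$

$$(\mathbf{D}^{-1/2}\mathbf{P}\mathbf{H})(\mathbf{D}^{-1/2}\mathbf{P}\mathbf{H})^* = \mathbf{I}. \tag{19}$$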

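Therefore, the transform $\boldsymbol{\Gamma} = \mathbf{D}^{-1/2}\mathbf{P}$ orthogonalizes the channel $\mathbf{H}$. In other words, there is an orthogonal matrix $\mathbf{V}$ such that $\boldsymbol{\Gamma}\mathbf{H} = \mathbf{V}$. The matrix $\boldsymbol{\Gamma}$ is called the whitening filter and its inverse $\mathbf{L} = \boldsymbol{\Gamma}^{-1}$ is called the innovations filter of $\mathbf{x}(t)$. In the absence of noise, the (deterministic) correlation of the $\boldsymbol{\Gamma}$-transformed output

$$\mathbf{y}(t) = \boldsymbol{\Gamma}\mathbf{x}(t) = \mathbf{V}\mathbf{s}(t) \tag{20}$$

is identical to that of the source [14]:

$$\mathbf{y}^*(t)\mathbf{y}(t - k) = \mathbf{s}^*(t)\mathbf{s}(t - k). \tag{21}$$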

Without having any knowledge of the channel, the (deterministic) correlation of $\mathbf{s}(t)$ can be found from the received samples.

If $\mathbf{H}\mathbf{H}^*$ were a square positive definite matrix, then Cholesky factorization would work in the absence of noise. Unfortunately, $\mathbf{H}\mathbf{H}^*$ is normally only positive semi-definite, since it is rank deficient whenever $d < NL$. Therefore the matrix $\mathbf{H}\mathbf{H}^*$ cannot be whitened by the inversion of the Cholesky factors, since the required inverses do not exist. However, by adding some noise to the system ($\sigma^2 \neq 0$), Cholesky factorization can be done because $\mathbf{R}_x(0)$ will be positive definite [11]. Fortunately, noise exists in almost all practical wireless systems ($\sigma^2 \neq 0$) and therefore an innovations based BSD can be developed.
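The factorization (17) and the whitening filter $\mathbf{D}^{-1/2}\mathbf{P}$ can be obtained from a standard Cholesky routine. The sketch below assumes NumPy's `numpy.linalg.cholesky`, which returns a lower triangular factor $\mathbf{L}_c$ with $\mathbf{R} = \mathbf{L}_c\mathbf{L}_c^*$; the helper name `innovations_factors` and the test matrix are hypothetical.

```python
import numpy as np

def innovations_factors(R):
    """Return (P, D) with R = P^{-1} D (P^*)^{-1} and P unit lower triangular."""
    Lc = np.linalg.cholesky(R)               # R = Lc Lc^*, Lc lower triangular
    diag = np.real(np.diag(Lc))
    P_inv = Lc / diag                        # scale each column so the diagonal is 1
    D = np.diag(diag ** 2)                   # prediction error variances
    P = np.linalg.inv(P_inv)
    return P, D

rng = np.random.default_rng(2)
NL, d, sigma2 = 4, 2, 0.05                   # hypothetical sizes and noise variance
H = rng.standard_normal((NL, d)) + 1j * rng.standard_normal((NL, d))
R = H @ H.conj().T + sigma2 * np.eye(NL)     # R_x(0) from (16)

P, D = innovations_factors(R)
Gamma = np.diag(1.0 / np.sqrt(np.diag(D))) @ P        # whitening filter D^{-1/2} P
print(np.allclose(Gamma @ R @ Gamma.conj().T, np.eye(NL)))
print(np.linalg.matrix_rank(H @ H.conj().T))          # rank d < NL: H H^* alone is singular
```

With $\sigma^2 > 0$ the filter whitens $\mathbf{R}_x(0)$ itself rather than $\mathbf{H}\mathbf{H}^*$, which is why the noisy case needs the modified filter discussed in Section 3.3.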

3.2. Implementation of the Viterbi Algorithm

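The key point of the blind sequence detection algorithm is that the inner product (21) is preserved. The Viterbi Algorithm can be used for a noisy channel to find the estimated sequence that results in the minimum (deterministic) correlation difference between the $\boldsymbol{\Gamma}$-transformed received samples $\mathbf{y}(t)$ and the estimated sequences. If we let

$$r_y^{(k)}(t) = \mathbf{y}^*(t)\mathbf{y}(t - k), \tag{22}$$

$$r_s^{(k)}(t) = \mathbf{s}^*(t)\mathbf{s}(t - k) \tag{23}$$

$$\phantom{r_s^{(k)}(t)} = \sum_{l=0}^{d-1} s^*_{t-l}\, s_{t-l-k}, \tag{24}$$

then one obtains

$$r_y^{(k)}(t) = r_s^{(k)}(t) + w^{(k)}(t), \tag{25}$$

where $w^{(k)}(t)$ represents the interfering noise components of the observation:

$$w^{(k)}(t) = \mathbf{s}^*(t)\mathbf{H}^*\boldsymbol{\Gamma}^*\boldsymbol{\Gamma}\mathbf{n}(t - k) + \mathbf{n}^*(t)\boldsymbol{\Gamma}^*\boldsymbol{\Gamma}\mathbf{H}\mathbf{s}(t - k) + \mathbf{n}^*(t)\boldsymbol{\Gamma}^*\boldsymbol{\Gamma}\mathbf{n}(t - k).$$

We can use the definition found in [14] to state the optimum sequence detection as

$$\min \sum_{t} \left| r_y^{(k)}(t) - r_s^{(k)}(t) \right|^2. \tag{26}$$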

This detection problem can be solved by applying the Viterbi Algorithm to a $K^{d+k-1}$ state trellis for a $K$-QAM constellation. The delay parameter $k$ should be chosen so that the number of states in the Viterbi Algorithm is minimized. Increasing $k$ results in a more complex system and, as shown in [11], the added complexity of a higher $k$ is not very beneficial to blind sequence detection algorithms. Thus we set $k = 1$ as being sufficient to determine the information sequence $\{s_n\}$.
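To make the growth of the trellis with $k$ concrete, the state count $K^{d+k-1}$ can be tabulated directly. This is a small illustrative sketch; the function name `trellis_states` and the chosen constellation sizes are assumptions made for the example.

```python
def trellis_states(K, d, k=1):
    """Number of Viterbi states, K**(d + k - 1), for a K-point constellation."""
    return K ** (d + k - 1)

# Binary signaling (K = 2) with d = 2 and k = 1 gives four states.
for k in (1, 2, 3):
    print(k, trellis_states(2, 2, k), trellis_states(4, 2, k))
```

Each unit increase in $k$ multiplies the number of states by $K$, which is the complexity cost referred to above.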

Once the algorithm has decided on a $d$ value, the corresponding trellis for the Viterbi Algorithm is easy to determine. For example, suppose $d = 2$ and a BPSK signaling constellation is used; then we know that

$$r_y^{(1)}(t) = s_t s_{t-1} + s_{t-1} s_{t-2} + w^{(1)}(t). \tag{27}$$

In Fig. 1, the state transition diagram is shown, where the state $\omega_n$ is given as $\omega_n = (s_{n-1}, s_n)$. The corresponding path for the transition from $\omega_n$ to $\omega_{n+1}$ is labeled as $r_s^{(1)}(n+1)$. In Fig. 2, the source sequence is estimated by feeding the estimates $r_y^{(1)}(t)$ of $r_s^{(1)}(t)$ to the Viterbi Algorithm. The Viterbi Algorithm then determines the most likely sequence by using the optimal sequence detection criterion in (26).
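The trellis search for the $d = 2$, $k = 1$ BPSK example can be sketched as follows. This is a minimal illustration of criterion (26) with the branch labels of (27), not the authors' implementation: it assumes a known initial state $(s_0, s_1)$, real-valued correlation inputs, and a small Gaussian perturbation standing in for $w^{(1)}(t)$; the function name `viterbi_bpsk_d2` is hypothetical.

```python
import itertools
import numpy as np

def viterbi_bpsk_d2(r_y, init_state):
    """Minimize sum_t |r_y(t) - r_s(t)|^2 with r_s(t) = s_t s_{t-1} + s_{t-1} s_{t-2}.

    init_state = (s_0, s_1) is assumed known; r_y[i] corresponds to t = i + 2.
    """
    states = list(itertools.product((-1, 1), repeat=2))    # (s_{t-2}, s_{t-1})
    cost = {st: (0.0 if st == tuple(init_state) else np.inf) for st in states}
    path = {st: list(st) for st in states}
    for r in r_y:
        new_cost = {st: np.inf for st in states}
        new_path = {}
        for (a, b) in states:
            if not np.isfinite(cost[(a, b)]):
                continue                                   # state not reachable yet
            for c in (-1, 1):                              # candidate next symbol s_t
                branch = c * b + b * a                     # noiseless branch label r_s(t)
                total = cost[(a, b)] + (r - branch) ** 2
                if total < new_cost[(b, c)]:
                    new_cost[(b, c)] = total
                    new_path[(b, c)] = path[(a, b)] + [c]  # keep the survivor
        cost, path = new_cost, new_path
    best = min(cost, key=cost.get)
    return np.array(path[best])

rng = np.random.default_rng(3)
s = rng.choice([-1, 1], size=30)
r_s = s[2:] * s[1:-1] + s[1:-1] * s[:-2]
r_y = r_s + 0.1 * rng.standard_normal(r_s.size)            # mildly perturbed correlations
s_hat = viterbi_bpsk_d2(r_y, init_state=(s[0], s[1]))
print(np.array_equal(s_hat, s))
```

Storing whole survivor paths keeps the sketch short; a practical decoder would use a traceback array instead. The known initial state matters because the correlation sequence is unchanged when every symbol is negated.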

3.3. Source correlation estimators

The innovations process only allows us to find the source correlation of $\mathbf{s}(t)$ from the correlation of $\mathbf{y}(t)$ in the absence of noise. In many communication systems, noise is a common phenomenon and must be dealt with. Unfortunately, the whitening filter $\boldsymbol{\Gamma}$ is not optimum in the presence of noise.

The main goal of the innovations based BSD is to obtain an optimum estimate of the source correlation

$$r_s(t) = \mathbf{s}^*(t)\mathbf{s}(t - 1). \tag{28}$$

The Viterbi Algorithm can then be used with the estimated $r_s(t)$ to recover the input symbols. We can use the innovations process as a motivation to estimate $r_s(t)$ from the correlation function of the transformed observations. Define

$$\mathbf{y}(t) = \boldsymbol{\Gamma}_o \mathbf{x}(t), \tag{29}$$

$$r_y(t) = \mathbf{y}^*(t)\mathbf{y}(t - 1). \tag{30}$$

The goal is to find an optimum filter $\boldsymbol{\Gamma}_o$ that minimizes [14]

$$J(\boldsymbol{\Gamma}) = E\left(\left| r_y^{(k)}(t) - r_s^{(k)}(t) \right|^2\right). \tag{31}$$

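Fig. 1. Example of a state transition diagram when d = 2. (Transitions from state (s(n-1), s(n)) to state (s(n), s(n+1)); branch labels take the values 2, 0 and -2.)

Fig. 2. The optimal path determined by the Viterbi Algorithm. (Example inputs r_y: 0.023, -2.34, -2.13, -0.12.)

The optimization problem for the SVD BSD was solved in [14]. Using techniques similar to those presented in [14], we can attempt to derive an optimal solution for the innovations BSD. It can be shown that (see the appendix for a derivation of this cost function)

$$J = \mathrm{tr}(\mathbf{A}\mathbf{Q}\mathbf{A}\mathbf{Q}) + \sigma^4\, \mathrm{tr}(\mathbf{A}^2) \tag{32}$$

$$\phantom{J =} + 2\sigma^2\, \mathrm{tr}(\mathbf{A}^2\mathbf{Q}) - 2\,\mathrm{tr}(\mathbf{A}\mathbf{Q}) + d, \tag{33}$$

where $\mathbf{A}$ and $\mathbf{Q}$ are Hermitian matrices:

$$\mathbf{A} = \boldsymbol{\Gamma}^*\boldsymbol{\Gamma}, \tag{34}$$

$$\mathbf{Q} = \mathbf{H}\mathbf{H}^*. \tag{35}$$

In order to find $\boldsymbol{\Gamma}_o$, we have to find the optimum $\mathbf{A}_o$ that minimizes $J(\boldsymbol{\Gamma})$ in (31) by letting

$$\frac{\partial J(\boldsymbol{\Gamma})}{\partial \mathbf{A}} = 0. \tag{36}$$

By doing this we get [14]:

$$\mathbf{Q}\mathbf{A}\mathbf{Q} + \sigma^4\mathbf{A} + 2\sigma^2\mathbf{A}\mathbf{Q} - \mathbf{Q} = 0. \tag{37}$$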

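From $\mathbf{A}_o$ the optimum filter $\boldsymbol{\Gamma}_o$ can be found. Solving for $\mathbf{A}$ in this matrix equation does not seem straightforward and we were unable to find a closed form solution.

However, we can look at a one-dimensional system ($d = 1$) in order to get an intuitive feel for what a sub-optimal source correlation estimator may be for a higher-dimensional innovations based BSD. We will now solve the matrix equation (37) for the one-dimensional case. In the one-dimensional case, (37) can be written as

$$q^2 a + \sigma^4 a + 2\sigma^2 a q = q. \tag{38}$$

Recall that the Cholesky factorization of $R_x(0)$ is

$$R_x(0) = hh^* + \sigma^2 \tag{39}$$

$$\phantom{R_x(0)} = p^{-1} d (p^*)^{-1}. \tag{40}$$

Since we previously defined $q = hh^*$ we have

$$q = p^{-1} d (p^*)^{-1} - \sigma^2 \tag{41}$$

$$\phantom{q} = p^{-1}(d - \sigma^2 pp^*)(p^*)^{-1} \tag{42}$$

$$\phantom{q} = p^{-1} c (p^*)^{-1}, \tag{43}$$

where $c = d - \sigma^2 pp^*$. Using this substitution and solving for $a$ in (38) we have

$$\begin{aligned}
a &= \frac{q}{q^2 + 2\sigma^2 q + \sigma^4} \\
  &= \frac{q}{(q + \sigma^2)^2} \\
  &= \frac{p^{-1} c (p^*)^{-1}}{(p^{-1} c (p^*)^{-1} + \sigma^2)^2} \\
  &= p^{-1}(p^{-1} c (p^*)^{-1} + \sigma^2)^{-2} c (p^*)^{-1} \\
  &= p^{-1}(p^{-1} d (p^*)^{-1})^{-2} c (p^*)^{-1} \\
  &= p^{-1} c^{1/2} (p^{-1} d (p^*)^{-1})^{-1} \times (p^{-1} d (p^*)^{-1})^{-1} c^{1/2} (p^*)^{-1}.
\end{aligned}$$

Thus, for $\Gamma_o$ we have

$$\Gamma_o = (p^{-1} d (p^*)^{-1})^{-1} c^{1/2} (p^*)^{-1} \tag{44}$$

$$\phantom{\Gamma_o} = (d - \sigma^2 pp^*)^{1/2} d^{-1} p. \tag{45}$$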

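Thus, in (45) we have an exact optimal filter for a one-dimensional system. Notice that as $\sigma^2 \to 0$ the $\Gamma_o$ filter approaches the whitening filter, as expected.

Perhaps a sub-optimal filter may arise if a minimum variance estimate of $\mathbf{s}(t)$ is used. A minimum variance estimate $\hat{\mathbf{s}}(t)$ of $\mathbf{s}(t)$ will be formed in order to estimate $r_s(t) = \mathbf{s}^*(t)\mathbf{s}(t-1)$ by using $\hat{r}_s(t) = \hat{\mathbf{s}}^*(t)\hat{\mathbf{s}}(t-1)$. Our task now is to estimate $\mathbf{s}(t)$ from $\mathbf{x}(t)$. The minimum variance linear estimate of $\mathbf{s}(t)$ is [7]:

$$\hat{\mathbf{s}}(t) = \boldsymbol{\Gamma}_{mv}\mathbf{x}(t) \tag{46}$$

$$\phantom{\hat{\mathbf{s}}(t)} = E(\mathbf{s}(t)\mathbf{x}^*(t))\,\mathbf{R}_x^{-1}(0)\,\mathbf{x}(t) \tag{47}$$

$$\phantom{\hat{\mathbf{s}}(t)} = (\mathbf{D} - \sigma^2\mathbf{P}\mathbf{P}^*)^{1/2}\mathbf{D}^{-1}\mathbf{P}\mathbf{x}(t). \tag{48}$$

Fig. 3. An innovations based blind sequence detection algorithm. (Block diagram with blocks labeled Innovations, $\Gamma_o$, correlator $\langle\cdot,\cdot\rangle$ with delay, and Viterbi Algorithm; signals labeled x, y, $R_x(0)$, the quantities d, P, D, σ, and the estimated $s_k$.)

Thus, the linear transform $\boldsymbol{\Gamma}_{mv}$ that gives the minimum variance estimate is

$$\boldsymbol{\Gamma}_{mv} = (\mathbf{D} - \sigma^2\mathbf{P}\mathbf{P}^*)^{1/2}\mathbf{D}^{-1}\mathbf{P}, \tag{49}$$

which is the same matrix that provides the minimum variance estimate of the source correlation $r_s$ (refer to (45)) for a one-dimensional system. Intuitively, since (49) and (45) agree, it seems that this filter may be well suited for an innovations based BSD.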
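The one-dimensional results can be verified with a few lines of arithmetic. The sketch below assumes an arbitrary scalar channel $h$ and noise variance, and uses $p = 1$, $d = q + \sigma^2$ as the scalar factorization of $R_x(0)$; these numerical values are illustrative only.

```python
import numpy as np

h, sigma2 = 0.8 - 0.6j, 0.3           # hypothetical scalar channel and noise variance
q = abs(h) ** 2                        # q = h h^*
p, d = 1.0, q + sigma2                 # 1-D factorization: R_x(0) = p^{-1} d (p^*)^{-1}

a = q / (q + sigma2) ** 2              # solution of (38)
gamma_o = np.sqrt(d - sigma2 * p * p) / d * p          # filter (45)
gamma_mv = np.conj(h) / (abs(h) ** 2 + sigma2)         # E(s x^*) R_x^{-1}(0) for x = h s + n

print(np.isclose(q**2 * a + sigma2**2 * a + 2 * sigma2 * a * q, q))   # (38) holds
print(np.isclose(gamma_o ** 2, a))                                     # Gamma_o^* Gamma_o = a
print(np.isclose(abs(gamma_mv), gamma_o))                              # same gain as (47)
```

In this scalar case the two filters differ only by a unit-modulus phase factor, which cancels in the product $y^*(t)y(t-1)$, so both give the same correlation estimate.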

3.4. An innovations based BSD: outline and implementation

A schematic of the innovations approach to blind sequence detection is shown in Fig. 3. The following is a brief outline of the innovations based blind sequence detection algorithm.

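1. Form the data vectors $\mathbf{x}_i(t)$ by sampling the received signal at each receiver:

$$\mathbf{x}_i(t) = [x_i(tN), \ldots, x_i((t+1)N - 1)]^{\mathrm{T}}. \tag{50}$$

2. Form the total data vector $\mathbf{x}(t)$ by

$$\mathbf{x}(t) = [\mathbf{x}_1^{\mathrm{T}}(t), \ldots, \mathbf{x}_L^{\mathrm{T}}(t)]^{\mathrm{T}}. \tag{51}$$

3. Estimate $\mathbf{R}_x(0)$ by

$$\hat{\mathbf{R}}_x(0) = \frac{1}{Q}\sum_{i=0}^{Q-1} \mathbf{x}(t - i)\mathbf{x}^*(t - i). \tag{52}$$

4. Compute the Cholesky factorization of $\hat{\mathbf{R}}_x(0)$:

$$\hat{\mathbf{R}}_x(0) = \mathbf{P}^{-1}\mathbf{D}(\mathbf{P}^*)^{-1}. \tag{53}$$

5. Estimate the signal dimension $d$ (see the discussion below on estimating $d$).

6. Form the optimal transform matrix $\boldsymbol{\Gamma}$ (see Eq. (49)).

7. Form the reduced optimal $\boldsymbol{\Gamma}_o$ by selecting the first $d$ rows of $\boldsymbol{\Gamma}$.

8. Apply the linear transform to the data vector:

$$\mathbf{y}(t) = \boldsymbol{\Gamma}_o\mathbf{x}(t). \tag{54}$$

9. Compute the correlation

$$r_y(t) = \mathbf{y}^*(t)\mathbf{y}(t - 1). \tag{55}$$

10. Estimate the source symbol sequence using the Viterbi Algorithm by minimizing

$$\min \sum_{t} |r_y(t) - r_s(t)|^2, \tag{56}$$

$$r_s(t) = \sum_{i=0}^{d-1} s^*_{t-i}\, s_{t-i-1}. \tag{57}$$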
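Steps 3, 4 and 6 to 9 of the outline can be sketched in NumPy as follows. This is an illustrative sketch under several assumptions: $\sigma^2$ is known, the matrix square root in (49) is taken to be the Hermitian positive semi-definite root (the outline does not state which root is meant, and whether the first $d$ rows are the appropriate ones may depend on that convention), and small negative eigenvalues caused by estimation error are clipped to zero. All function names and the simulated data are hypothetical.

```python
import numpy as np

def estimate_covariance(X):
    """Step 3: sample estimate of R_x(0); the columns of X are the vectors x(t)."""
    return X @ X.conj().T / X.shape[1]

def psd_sqrt(C):
    """Hermitian square root; small negative eigenvalues are clipped to zero."""
    w, U = np.linalg.eigh((C + C.conj().T) / 2)
    return (U * np.sqrt(np.clip(w, 0.0, None))) @ U.conj().T

def min_variance_transform(R, sigma2):
    """Steps 4 and 6: Gamma = (D - sigma^2 P P^*)^{1/2} D^{-1} P, as in (49)."""
    Lc = np.linalg.cholesky(R)                # R = Lc Lc^*
    diag = np.real(np.diag(Lc))
    P = np.linalg.inv(Lc / diag)              # unit lower triangular P
    D = np.diag(diag ** 2)
    C = D - sigma2 * P @ P.conj().T
    return psd_sqrt(C) @ np.linalg.inv(D) @ P

def transformed_correlation(Gamma_o, X):
    """Steps 8 and 9: y(t) = Gamma_o x(t) and r_y(t) = y^*(t) y(t-1)."""
    Y = Gamma_o @ X
    return np.sum(Y[:, 1:].conj() * Y[:, :-1], axis=0)

rng = np.random.default_rng(4)
NL, d, sigma2, T = 4, 2, 0.05, 100            # hypothetical sizes; T matches a 100-symbol block
H = rng.standard_normal((NL, d)) + 1j * rng.standard_normal((NL, d))
stream = rng.choice([-1.0, 1.0], size=T + d - 1)
S = np.stack([stream[d - 1 - j: d - 1 - j + T] for j in range(d)])
noise = np.sqrt(sigma2 / 2) * (rng.standard_normal((NL, T))
                               + 1j * rng.standard_normal((NL, T)))
X = H @ S + noise

R_hat = estimate_covariance(X)                 # step 3
Gamma = min_variance_transform(R_hat, sigma2)  # steps 4 and 6
Gamma_o = Gamma[:d, :]                         # step 7: first d rows
r_y = transformed_correlation(Gamma_o, X)      # steps 8 and 9
print(Gamma_o.shape, r_y.shape)
```

The resulting `r_y` values would then be passed to a trellis search such as the one described for step 10; estimating $d$ (step 5) is not covered by this sketch.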

Simulation results have shown that estimating the dimension of the signal subspace $d$ is very important [11]. Using a lower dimension approximation is often satisfactory [14]. A smaller $d$ results in a smaller trellis for the Viterbi Algorithm. In the current literature, there are many detection schemes which can be used as a starting point to estimate $d$ [14,6]. In [11] the effects of varying $d$ were investigated and it was determined that an optimal $d$ existed for both algorithms. Simply increasing $d$ did not always result in an improved performance. In the next section, we will briefly discuss this point for the slow and flat fading model used in our simulation runs. For both the innovations and SVD approach to blind sequence detection we have assumed that $\sigma^2$ is known. This was done to make a fair comparison between the two estimators.

4. Simulation results: an analysis of BSD

In this section we present a new fading model different from the one used in [14]. Only two-ray multi-path channels were used, and we assumed only one receiver. We keep everything else in our simulation setup the same (i.e. sampling rate, SNR definition, etc.) as in [14] in order to make a fair comparison between the two receivers. The sampling rate at each receiver was four times the symbol rate ($T = 4$). The composite channel ($h(t)$) used in our model can be described as

$$h(t) = \alpha_{11}\, p(t) + \alpha_{12}\, p(t - \tau), \tag{58}$$

where $p(\cdot)$ was a raised cosine pulse with 90% roll-off. The time delay $\tau$ was varied from $\tau = 0.1T$ to $\tau = 1.0T$ in order to see how the BSDs behaved in different "delay spread" environments. This particular model was chosen because we wish to see how these BSDs work in an environment where we can easily vary channel parameters and see their direct effect on the performance of blind sequence detection algorithms. By keeping the model relatively simple yet realistic, we can gain more insight into how blind sequence detection algorithms operate. The gain $\alpha_{ij}$ at the $i$th receiver with respect to the $j$th multipath is

$$\alpha_{ij} = [\,1.2094 + 0.8260\mathrm{j} \quad -0.5123 - 1.3070\mathrm{j}\,]. \tag{59}$$

The gain ($\alpha_{ij}$) was generated from a zero mean unit variance Gaussian distribution. AWGN was added at the receiver end in our digital communication system model. We used the following definition for the signal-to-noise ratio (SNR) in our simulations [14]:

$$\mathrm{SNR} = 10 \log_{10} \frac{E(\|\mathbf{H}\mathbf{s}(t)\|^2)}{E(\|\mathbf{n}(t)\|^2)}. \tag{60}$$
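Under assumptions (A1) and (A2), the expectations in (60) reduce to $E(\|\mathbf{H}\mathbf{s}(t)\|^2) = \|\mathbf{H}\|_F^2$ and $E(\|\mathbf{n}(t)\|^2) = NL\,\sigma^2$, which gives a simple way to pick the noise variance for a target SNR. The sketch below relies on that reduction; the function names and the example channel matrix are hypothetical and not taken from the paper's simulations.

```python
import numpy as np

def snr_db(H, sigma2):
    """SNR of (60) under (A1)-(A2): E||Hs||^2 = ||H||_F^2, E||n||^2 = rows * sigma^2."""
    signal = np.linalg.norm(H, "fro") ** 2
    noise = H.shape[0] * sigma2
    return 10 * np.log10(signal / noise)

def sigma2_for_snr(H, target_db):
    """Noise variance per sample that gives the requested SNR in dB."""
    return np.linalg.norm(H, "fro") ** 2 / (H.shape[0] * 10 ** (target_db / 10))

# Hypothetical 4 x 2 channel matrix (N = 4 samples per symbol, d = 2).
H = np.array([[1.0, 0.5], [0.2, -0.3], [0.0, 0.1], [0.4, 0.0]])
s2 = sigma2_for_snr(H, 10.0)
print(s2, snr_db(H, s2))
```

A simulation loop would call `sigma2_for_snr` once per SNR point and scale the generated noise accordingly.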

A binary phase shift keying (BPSK) signal constellation was used as the input source to our communication system model. The covariance matrix $\mathbf{R}_x(0)$ was estimated every time 100 bits of data were received. The transformation matrix $\boldsymbol{\Gamma}_o$ was calculated from the Cholesky factorization of the estimated covariance matrix $\hat{\mathbf{R}}_x(0)$. The received data $\mathbf{x}(t)$ was transformed by $\boldsymbol{\Gamma}_o$ and its correlation was computed. The correlation of the transformed samples was then fed to the Viterbi Algorithm in order to reconstruct the input symbols. The initial state of the trellis was known; this was achieved by sending a few flag bits at the beginning of transmission.

The innovations and SVD based BSDs were first compared and analyzed in a simple AWGN channel. From Fig. 4, the SVD based BSD has a better BER than the innovations based BSD. There is roughly a 2.5 dB improvement in performance when the SVD BSD is used. Perhaps by analyzing the minimum mean squared distance measure of the shortest error path in the trellis ($d^2_{\min}$) we can explain why the SVD BSD has a better performance. For both BSDs the theoretical minimum mean squared distance is the same ($d^2_{\min} = 8$). As such the shortest error path in the trellis is identical for both estimators (see Fig. 10). Simulation results also agree with the theoretical $d^2_{\min}$ for both BSDs. Thus, in this case, the $d^2_{\min}$ measure cannot explain why the SVD BSD has a better performance.

Fig. 4. Performance of the SVD and Innovations based BSD in AWGN. (Estimated BER versus SNR in dB.)

Fortunately, we can introduce a new measure known as the average correlation estimation error (CER) to explain why the SVD BSD has a better performance in AWGN. This measure is defined as

$$\mathrm{CER} = \frac{\sum_{i=1}^{N} |\mathbf{r}_{y_i} - \mathbf{r}_{s_i}|^2}{N}. \tag{61}$$

Here, $\mathbf{r}_{y_i}$ is a vector of the estimated correlation determined by the BSD, $\mathbf{r}_{s_i}$ is a vector of the actual correlation and $N$ is the number of times the correlation vectors were estimated. The same data (signal and noise) was used for both estimators in order to determine and compare their respective CER. From Fig. 5, the SVD BSD has an overall lower CER. As the SNR increases, the CER for both estimators appears to converge to some asymptotic value. Since the SVD BSD has a lower CER, this estimator provides better estimates of the correlation of the source and as such has a better performance in terms of BER.

Fig. 5. Average CER for SVD and Innovations based BSD in AWGN. (Average estimated correlation error versus SNR in dB.)

In Fig. 6 the optimal results in terms of BER for the fading model using the SVD and innovations BSD are displayed. Notice that for both BSDs, as the delay spread $\tau$ increases, so does the SNR required for a fixed BER. Therefore, it appears that both blind sequence detection algorithms are sensitive to changes in the delay spread. The SVD BSD is especially sensitive when $\tau$ goes from $0.4T$ to $0.5T$. This can be explained by using the CER measure introduced earlier (61). In Fig. 7 the CER is plotted for two delay spreads for the SVD BSD. As $\tau$ goes from $0.4T$ to $0.5T$ the estimation error increases and results in a BER increase, since at a larger delay spread the SVD BSD poorly estimates the source correlation. It was also observed that increasing $d$ did not always result in an improved BER for both BSDs. In fact, the simulations showed that the optimal $d$ for the innovations BSD was different from that for the SVD BSD.

Fig. 6. Varying the delay spread τ for the innovations and SVD blind sequence detectors. (Estimated BER versus SNR in dB; curves for τ = 0.1T, 0.2T, 0.4T, 0.5T and 1.0T for each detector.)

Fig. 7. Average CER for SVD BSD with varying time delay. (Average correlation estimation error versus SNR in dB; curves for τ = 0.4T and τ = 0.5T.)

In Fig. 6, by using the innovations approach, there is a significant gain of about 3 dB when compared to the SVD BSD. In fact, for $\tau = 0.5T$, the innovations BSD has a tremendous gain over the SVD BSD in terms of BER performance. By looking at the CER measure, more insight is gained as to why the innovations BSD has a better performance. However, in order to make an accurate comparison when using the CER measure, the same $d$ should be used for both BSDs. Unfortunately, the optimal $d$ for the innovations BSD is different from that for the SVD BSD. In Fig. 8, the CER is plotted for both estimators, where $d = 1$ for the innovations BSD and $d = 2$ for the SVD BSD. The delay spread for this figure is $\tau = 0.5T$. Initially, the innovations BSD has a lower estimation error. However, when the SNR goes above 16 dB the SVD BSD has a lower estimation error, yet the SVD BSD still has a much higher BER than the innovations BSD.

As mentioned earlier, we are using different signal dimensions ($d$) for these estimators and as such a fair comparison cannot be made, since the two estimators use different trellises for their Viterbi Algorithm. However, we can still give some explanation as to why the innovations BSD has a better performance. In Figs. 9 and 10, the trellises for $d = 1$ and 2 with the minimum error event path are shown. If an error event does happen when $d = 2$, it results in more bit errors than in the case of $d = 1$. Notice that when $d = 1$, only 1 bit is in error for the minimum error path, while for $d = 2$ two bits are in error. Since the SVD BSD has a higher $d$ parameter, when an error event does occur more bit errors will result. If we take into account both the number of bit errors and the corresponding bit pattern lengths for each $d$ and carefully examine the resulting bit error rate for an error event, we notice that the BER changes for each $d_{\min}$ (i.e. BER for $d = 1$ is 1/3 and BER for $d = 2$ is 2/5, see Figs. 9 and 10).


Fig. 8. CER for innovations BSD (d = 1) and SVD BSD (d = 2) with τ = 0.5T. (Estimated correlation error versus SNR in dB.)

Fig. 9. Trellis with d = 1 with a minimum error event path. (States -1 and 1; input = -1 -1 -1; minimum error path = -1 1 -1.)

The BER is higher for d = 2. Referring to Fig. 8, at high SNR the CERs of the two BSDs are very similar, so both detectors appear to make about the same number of correlation errors. Since the SVD BSD has d = 2 whereas the innovations BSD has d = 1, we expect more bit errors for the SVD BSD when the CER is similar for both BSDs. Therefore, by considering the combined effect of the number of bit errors in an error event (for a particular d_min) and the CER, we can explain why the SVD BSD performs worse than the innovations BSD. We conclude that one must be careful when treating estimation error as an absolute performance measure. A low estimation error does not necessarily result in a lower BER, especially when different trellises are used for the Viterbi algorithm.

Fig. 10. Trellis with d = 2 with a minimum error event path. (States: (-1,-1), (-1,1), (1,-1), (1,1); input sequence = -1 -1 -1 -1 -1; minimum error path = -1 1 1 -1 -1.)

Using the model presented in [14], we will now investigate the effects of diversity on the performance of BSDs. We use linearly modulated signals for our simulation study, since these signals are useful for diversity receivers. Simulations have shown that increasing the number of "diversity paths" for the innovations BSD does not provide a significant improvement in performance [11]. The innovations approach to blind sequence detection is essentially unaffected by diversity techniques. However, the SVD BSD performs very well when diversity is used. From Fig. 11, we can clearly see that as the diversity is increased the performance of the SVD BSD improves significantly. This result can be explained by using the average CER measure. From Fig. 12, the estimated correlation error decreases as the diversity is increased. For all three cases, as the SNR increases the CER approaches some asymptotic limit. This explains why the BER approaches some limit as the SNR is increased. In this case, the CER is a valid performance measure because the same trellis is used for all types of diversity reception.

Fig. 11. Diversity reception for SVD BSD. (Estimated BER versus SNR (dB) for diversity = 1, 2 and 3.)
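The error-event argument made earlier in this section can be illustrated with the two sequences listed in Fig. 10. The following is only a minimal sketch: the helper name `symbol_errors` is hypothetical, and it simply counts the positions at which the minimum error path disagrees with the transmitted sequence.

```python
def symbol_errors(sent, detected):
    """Count positions where the detected sequence differs from the sent one."""
    if len(sent) != len(detected):
        raise ValueError("sequences must have equal length")
    return sum(1 for a, b in zip(sent, detected) if a != b)

# Sequences taken from Fig. 10 (trellis with d = 2).
input_sequence = [-1, -1, -1, -1, -1]
minimum_error_path = [-1, 1, 1, -1, -1]

errors = symbol_errors(input_sequence, minimum_error_path)
print(errors)  # 2
```

A single minimum error event in the d = 2 trellis therefore costs two symbol errors, which is why similar CERs can still translate into different BERs when the trellises differ.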

5. Conclusions

An innovations approach to blind sequence detection was studied and compared to SVD based blind sequence detection. Each of these techniques has its advantages and disadvantages. The obvious benefits of the innovations approach are that it is causal, simpler, less expensive and computationally more efficient than the SVD approach. An innovations based blind sequence detector (BSD) for Rayleigh fading channels was proposed and its performance was studied using computer simulations. Simulations were also carried out to compare the error performance of the SVD and innovations based blind sequence detection algorithms. The AWGN channel was investigated and it was found that the SVD BSD performed slightly better than the innovations BSD. However, when the two estimators were compared in a slow and flat fading channel, the innovations BSD outperformed the SVD BSD. It was also determined that ISI can degrade the performance of both BSDs. When diversity reception was employed, the SVD BSD had a better performance gain in terms of BER. By looking at the CER measure and the trellis diagrams for both BSDs, more insight was gained into why a certain estimator performed better than the other in a particular environment. The results indicate that one must be careful when using the CER criterion to explain the performance of BSDs, since the CER measure at times seems to give indications that are inconsistent with the BER. Further studies are needed to gain more insight into the CER and to make it more useful in analyzing BSDs. This area is still open for more research.

Fig. 12. CERs for SVD BSD with varying diversity. (Correlation estimation error versus SNR (dB) for diversity = 1, 2 and 3.)


Appendix A.

We will now mathematically derive $J(\sigma)$. It turns out that the derivation of $J(\sigma)$ is very similar to the derivation in [14]. Let us define the following terms:

\begin{align}
x(t) &= H s(t) + n(t), \tag{A.1}\\
y(t) &= \Theta x(t), \tag{A.2}\\
r_s &= s^*(t)\, s(t-1), \tag{A.3}\\
r_y &= y^*(t)\, y(t-1), \tag{A.4}\\
A &= \Theta^* \Theta, \tag{A.5}\\
Q &= H H^*, \tag{A.6}\\
J &= E(|r_y - r_s|^2). \tag{A.7}
\end{align}

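We can express

\begin{align}
J &= E(|r_y - r_s|^2) \tag{A.8}\\
&= E(|r_y|^2) - E(r_y r_s^*) - E(r_s r_y^*) + E(|r_s|^2). \tag{A.9}
\end{align}

In order to evaluate $E(|r_y|^2)$ we observe that

\begin{align}
r_y &= y^*(t)\, y(t-1) \tag{A.10}\\
&= x^*(t) A x(t-1) \tag{A.11}\\
&= r_y^{(i)} + r_y^{(ii)} + r_y^{(iii)} + r_y^{(iv)}, \tag{A.12}
\end{align}

where

\begin{align}
r_y^{(i)} &= s^*(t) H^* A H s(t-1), \tag{A.13}\\
r_y^{(ii)} &= n^*(t) A n(t-1), \tag{A.14}\\
r_y^{(iii)} &= s^*(t) H^* A n(t-1), \tag{A.15}\\
r_y^{(iv)} &= n^*(t) A H s(t-1). \tag{A.16}
\end{align}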
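The decomposition in (A.12)–(A.16) can be checked numerically. The sketch below is illustrative only: the dimensions, the random seed, the BPSK symbols and the helper names (`matvec`, `inner`, `rand_c`) are assumptions made for the example, and the third term is taken as $s^*(t)H^*An(t-1)$, the form used in (A.32).

```python
import random

def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in M]

def inner(u, v):
    """u^* v for two column vectors (conjugate on the first argument)."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

rng = random.Random(0)

def rand_c():
    """Hypothetical helper: one complex Gaussian sample."""
    return complex(rng.gauss(0, 1), rng.gauss(0, 1))

m, d = 3, 2  # assumed sizes: m observations, d source symbols
H = [[rand_c() for _ in range(d)] for _ in range(m)]
G = [[rand_c() for _ in range(m)] for _ in range(m)]
# A = G^* G is Hermitian by construction.
A = [[sum(G[k][i].conjugate() * G[k][j] for k in range(m))
      for j in range(m)] for i in range(m)]

sym = [rng.choice((-1, 1)) for _ in range(d + 1)]
s_t, s_prev = sym[:d], sym[1:]            # s(t) and s(t-1)
n_t = [rand_c() for _ in range(m)]        # n(t)
n_prev = [rand_c() for _ in range(m)]     # n(t-1)

Hs_t, Hs_prev = matvec(H, s_t), matvec(H, s_prev)
x_t = [a + b for a, b in zip(Hs_t, n_t)]
x_prev = [a + b for a, b in zip(Hs_prev, n_prev)]

r_y = inner(x_t, matvec(A, x_prev))       # x^*(t) A x(t-1)
r_i = inner(Hs_t, matvec(A, Hs_prev))     # s^*(t) H^* A H s(t-1)
r_ii = inner(n_t, matvec(A, n_prev))      # n^*(t) A n(t-1)
r_iii = inner(Hs_t, matvec(A, n_prev))    # s^*(t) H^* A n(t-1)
r_iv = inner(n_t, matvec(A, Hs_prev))     # n^*(t) A H s(t-1)

residual = abs(r_y - (r_i + r_ii + r_iii + r_iv))
print(residual)
```

The printed residual should be at the level of floating-point rounding, since the four terms are just the expansion of the product of the two sums in $x(t)$ and $x(t-1)$.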

In addition to the previous assumptions ((A1)–(A4)), assume $\{s_k\}$ is i.i.d. and the noise $n_i(\cdot)$ is independent of $n_j(\cdot)$ and the source $\{s_k\}$ [14]. Taking these assumptions into consideration we can show that

\begin{equation}
E(|r_y|^2) = E(|r_y^{(i)}|^2) + E(|r_y^{(ii)}|^2) + E(|r_y^{(iii)}|^2) + E(|r_y^{(iv)}|^2). \tag{A.17}
\end{equation}

We can write $E(|r_y^{(i)}|^2)$ as

\begin{equation}
E(|r_y^{(i)}|^2) = E\big(s^*(t) H^* A H s(t-1)\, s^*(t-1) H^* A H s(t)\big). \tag{A.18}
\end{equation}

Identity 1. If $X$ is a $1 \times N$ matrix, $Y$ is an $N \times N$ matrix and $Z$ is an $N \times 1$ matrix then

\begin{equation}
XYZ = \operatorname{tr}(YZX). \tag{A.19}
\end{equation}
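Identity 1 is easy to confirm numerically. The following is a small sketch with arbitrary real-valued test data (the size `N = 4` and the random seed are assumptions for illustration): it forms the scalar $XYZ$ directly and compares it with the trace of the explicitly built $N \times N$ matrix $YZX$.

```python
import random

rng = random.Random(1)
N = 4  # assumed size for the illustration

X = [rng.gauss(0, 1) for _ in range(N)]                      # 1 x N row vector
Y = [[rng.gauss(0, 1) for _ in range(N)] for _ in range(N)]  # N x N matrix
Z = [rng.gauss(0, 1) for _ in range(N)]                      # N x 1 column vector

# Column vector YZ, then the scalar XYZ.
YZ = [sum(Y[i][j] * Z[j] for j in range(N)) for i in range(N)]
xyz = sum(X[i] * YZ[i] for i in range(N))

# Outer product (YZ) X is the N x N matrix YZX; take its trace.
YZX = [[YZ[i] * X[j] for j in range(N)] for i in range(N)]
trace_yzx = sum(YZX[i][i] for i in range(N))

print(abs(xyz - trace_yzx))
```

The identity is what allows each scalar quadratic form in the derivation to be rewritten as a trace, so that the expectation can be moved inside onto the outer products of the signal and noise vectors.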

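Using the identity (A.19), we can express (A.18) as

\begin{equation}
E(|r_y^{(i)}|^2) = \operatorname{tr}\big(H^* A H\, E(s(t-1) s^*(t-1) H^* A H s(t) s^*(t))\big). \tag{A.20}
\end{equation}

Define $B = H^* A H$ and let $B_{ij}$ be the $(i,j)$th component of $B$. We can rewrite (A.20) as

\begin{equation}
E(|r_y^{(i)}|^2) = \operatorname{tr}\big(B^* E(s(t-1) s^*(t-1) B s(t) s^*(t))\big). \tag{A.21}
\end{equation}

It can be shown that the $(i,j)$th component of $s(t-1) s^*(t-1) B s(t) s^*(t)$ is

\begin{equation}
s_{t-i}\, s^*_{t-j+1} \sum_{k,l} B_{k,l}\, s_{t-l+1}\, s^*_{t-k}. \tag{A.22}
\end{equation}

Thus the $(i,j)$th component of $E(s(t-1) s^*(t-1) B s(t) s^*(t))$ is given by

\begin{equation}
E\Big(s_{t-i}\, s^*_{t-j+1} \sum_{k,l} B_{k,l}\, s_{t-l+1}\, s^*_{t-k}\Big) = B_{ij}. \tag{A.23}
\end{equation}

Since $s_k$ is an i.i.d. sequence the above equation holds. Therefore, we can write

\begin{align}
E(|r_y^{(i)}|^2) &= \operatorname{tr}\big(B^* E(s(t-1) s^*(t-1) B s(t) s^*(t))\big) \tag{A.24}\\
&= \operatorname{tr}(B^* B) \tag{A.25}\\
&= \operatorname{tr}(H^* A^* H H^* A H) \tag{A.26}\\
&= \operatorname{tr}(A^* H H^* A H H^*) \tag{A.27}\\
&= \operatorname{tr}(A^* Q A Q^*). \tag{A.28}
\end{align}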

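We will now determine $E(|r_y^{(ii)}|^2)$:

\begin{align}
E(|r_y^{(ii)}|^2) &= E\big(n^*(t) A n(t-1)\, n^*(t-1) A^* n(t)\big) \tag{A.29}\\
&= \operatorname{tr}\big(A\, E(n(t-1) n^*(t-1) A^* n(t) n^*(t))\big) \tag{A.30}\\
&= \sigma^4 \operatorname{tr}(A A^*). \tag{A.31}
\end{align}

The remaining third and fourth terms are evaluated in a similar fashion:

\begin{align}
E(|r_y^{(iii)}|^2) &= E\big(s^*(t) H^* A n(t-1)\, n^*(t-1) A^* H s(t)\big) \tag{A.32}\\
&= \operatorname{tr}\big(H^* A\, E(n(t-1) n^*(t-1) A^* H s(t) s^*(t))\big) \tag{A.33}\\
&= \operatorname{tr}\big(H^* A\, E(n(t-1) n^*(t-1))\, A^* H\, E(s(t) s^*(t))\big) \tag{A.34}\\
&= \sigma^2 \operatorname{tr}(H^* A A^* H) \tag{A.35}\\
&= \sigma^2 \operatorname{tr}(A A^* Q), \tag{A.36}
\end{align}

\begin{align}
E(|r_y^{(iv)}|^2) &= E\big(n^*(t) A^* H s(t-1)\, s^*(t-1) H^* A n(t)\big) \tag{A.37}\\
&= \operatorname{tr}\big(A^* H\, E(s(t-1) s^*(t-1))\, H^* A\, E(n(t) n^*(t))\big) \tag{A.38}\\
&= \sigma^2 \operatorname{tr}(A^* H H^* A) \tag{A.39}\\
&= \sigma^2 \operatorname{tr}(A A^* Q). \tag{A.40}
\end{align}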

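Therefore, collecting (A.28), (A.31), (A.36) and (A.40) and using the fact that $A$ and $Q$ are Hermitian, we can say that

\begin{equation}
E(|r_y|^2) = \operatorname{tr}(AQAQ) + \sigma^4 \operatorname{tr}(A^2) + 2\sigma^2 \operatorname{tr}(A^2 Q). \tag{A.41}
\end{equation}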

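Using the previously mentioned assumptions we can evaluate $E(r_s r_y^*)$ and $E(r_y r_s^*)$. We can express

\begin{align}
E(r_s r_y^*) &= E\big(s^*(t) s(t-1) s^*(t-1) H^* A H s(t) \tag{A.42}\\
&\qquad + s^*(t) s(t-1) s^*(t-1) H^* A n(t) \notag\\
&\qquad + s^*(t) s(t-1) n^*(t-1) A H s(t) \tag{A.43}\\
&\qquad + s^*(t) s(t-1) n^*(t-1) A n(t)\big) \tag{A.44}\\
&= E\big(s^*(t) s(t-1) s^*(t-1) H^* A H s(t)\big) \tag{A.45}\\
&= \operatorname{tr}\big(E(s(t-1) s^*(t-1) H^* A H s(t) s^*(t))\big) \tag{A.46}\\
&= \operatorname{tr}(H^* A H) \tag{A.47}\\
&= \operatorname{tr}(AQ) \tag{A.48}
\end{align}

and

\begin{align}
E(r_y r_s^*) &= E\big(s^*(t) H^* A H s(t-1) s^*(t-1) s(t) \tag{A.49}\\
&\qquad + s^*(t) H^* A n(t-1) s^*(t-1) s(t) \tag{A.50}\\
&\qquad + n^*(t) A H s(t-1) s^*(t-1) s(t) \notag\\
&\qquad + n^*(t) A n(t-1) s^*(t-1) s(t)\big) \tag{A.51}\\
&= E\big(s^*(t) H^* A H s(t-1) s^*(t-1) s(t)\big) \tag{A.52}\\
&= \operatorname{tr}\big(E(H^* A H s(t-1) s^*(t-1) s(t) s^*(t))\big) \tag{A.53}\\
&= \operatorname{tr}(H^* A H) \tag{A.54}\\
&= \operatorname{tr}(AQ). \tag{A.55}
\end{align}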


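For evaluating $E(|r_s|^2)$ we find that

\begin{align}
E(|r_s|^2) &= E(r_s r_s^*) \tag{A.56}\\
&= E\big(s^*(t) s(t-1) s^*(t-1) s(t)\big) \tag{A.57}\\
&= E\Big(\sum_{i,j=1}^{d} s^*_{t-i+1}\, s_{t-i}\, s^*_{t-j}\, s_{t-j+1}\Big) \tag{A.58}\\
&= \sum_{i,j=1}^{d} E\big(s^*_{t-i+1}\, s_{t-i}\, s^*_{t-j}\, s_{t-j+1}\big) \tag{A.59}\\
&= \sum_{i=1}^{d} E\big(|s_{t-i+1}|^2 |s_{t-i}|^2\big) \tag{A.60}\\
&= \sum_{i=1}^{d} E\big(|s_{t-i+1}|^2\big) E\big(|s_{t-i}|^2\big) \tag{A.61}\\
&= \sum_{i=1}^{d} 1 \tag{A.62}\\
&= d. \tag{A.63}
\end{align}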
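The result $E(|r_s|^2) = d$ can be verified by brute force for a simple constellation. The sketch below assumes equiprobable, unit-power BPSK symbols and takes the $i$th component of $s(t)$ to be $s_{t-i+1}$, as the indices in (A.58) suggest; the function name `mean_abs_rs_squared` is hypothetical.

```python
from itertools import product

def mean_abs_rs_squared(d):
    """Average |r_s|^2 over all equiprobable BPSK symbol patterns."""
    total = 0
    count = 0
    # sym[k] plays the role of s_{t-k}; d + 1 symbols cover s(t) and s(t-1).
    for sym in product((-1, 1), repeat=d + 1):
        s_t = sym[:d]       # s(t)   = [s_t, ..., s_{t-d+1}]
        s_prev = sym[1:]    # s(t-1) = [s_{t-1}, ..., s_{t-d}]
        r_s = sum(a * b for a, b in zip(s_t, s_prev))  # real symbols, so no conjugate
        total += r_s * r_s
        count += 1
    return total / count

for d in range(1, 5):
    print(d, mean_abs_rs_squared(d))
```

For each value of d the exhaustive average equals d, because every cross term in the double sum contains at least one symbol raised to an odd power and therefore averages to zero.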

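Therefore, we can finally show that

\begin{equation}
J = \operatorname{tr}(AQAQ) + \sigma^4 \operatorname{tr}(A^2) + 2\sigma^2 \operatorname{tr}(A^2 Q) - 2 \operatorname{tr}(AQ) + d, \tag{A.64}
\end{equation}

where $A$ and $Q$ are Hermitian matrices.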
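As a rough illustration of how (A.64) might be evaluated numerically, the sketch below computes $J$ from given matrices $A$ and $Q$, a noise variance (written `sigma2` here, standing for $\sigma^2$) and the source vector length $d$. The function names `matmul`, `trace` and `cer` are hypothetical, and matrices are plain lists of rows.

```python
def matmul(P, R):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * R[k][j] for k in range(len(R)))
             for j in range(len(R[0]))] for i in range(len(P))]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

def cer(A, Q, sigma2, d):
    """Evaluate (A.64) for Hermitian A and Q, noise variance sigma2, length d."""
    AQ = matmul(A, Q)
    AA = matmul(A, A)
    J = (trace(matmul(AQ, AQ))
         + sigma2 ** 2 * trace(AA)
         + 2 * sigma2 * trace(matmul(AA, Q))
         - 2 * trace(AQ)
         + d)
    # For Hermitian A and Q every trace above is real; drop any rounding residue.
    return J.real

identity = [[1, 0], [0, 1]]
print(cer(identity, identity, 0.0, 2))  # 0.0
print(cer(identity, identity, 0.5, 2))  # 2.5
```

With $A = Q = I$, $d = 2$ and no noise the expression reduces to $2 - 4 + 2 = 0$, and with $\sigma^2 = 0.5$ it gives $2\sigma^4 + 4\sigma^2 = 2.5$, which matches a term-by-term evaluation of (A.64).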

References

[1] K.M. Chugg, Acquisition performance of blind sequence detectors using per-survivor-processing, Proceedings of the VTC97, Phoenix, AZ, USA, May 1997, pp. 539–543.

[2] A. Duel-Hallen, C. Heegard, Delayed decision-feedback sequence estimation, IEEE Trans. Comm. 37 (1989) 428–436.

[3] M. Ghosh, C.L. Weber, Maximum-likelihood blind equalization, Opt. Eng. 31 (June 1992) 1224–1228.

[4] D.N. Godard, Self-recovering equalization and carrier tracking in two-dimensional data communication systems, IEEE Trans. Comm. 28 (November 1980) 1867–1875.

[5] B.J. Kim, D.C. Cox, Blind sequence estimation for short burst communications over time varying wireless channels, Proceedings of the ICUPC 97, San Diego, CA, USA, 1997, pp. 713–717.

[6] K. Konstantinides, K. Yao, Statistical analysis of effective singular values in matrix rank determination, IEEE Trans. Acoust. Speech Signal Process. 36 (May 1998) 757–763.

[7] D.G. Luenberger, Optimization by Vector Space Methods, Wiley, New York, NY, 1969.

[8] A. Papoulis, Probability, Random Variables and Stochastic Processes, McGraw-Hill, New York, NY, 1991, pp. 401–405 (Chapter 12).

[9] B. Porat, Blind equalization of digital communication channels using higher order moments, IEEE Trans. Signal Process. 39 (1991) 522–526.

[10] R. Raheli, et al., Per-survivor processing: a general approach to MLSE in uncertain environments, IEEE Trans. Comm. 43 (February 1995) 534–564.

[11] S. Sen, Innovations and singular value decomposition for blind sequence detection in wireless channels, M.S. Thesis, University of Toronto, Toronto, 1999.

[12] N. Seshadri, Joint data and channel estimation using fast blind trellis search techniques, Proceedings of the Globecom, Tucson, AZ, USA, 1991, pp. 1659–1663.

[13] D. Shalvi, E. Weinstein, New criteria for blind deconvolution of non minimum phase systems (channels), IEEE Trans. Inform. Theory 36 (March 1990) 312–320.

[14] L. Tong, Blind sequence estimation, IEEE Trans. Comm. 43 (December 1995) 2986–2994.

[15] L. Tong, et al., Fast blind equalization of multipath channels via antenna arrays, Proceedings of the ICASSP, 1993.

[16] L. Tong, et al., Blind identification and equalization based on second order statistics: a time domain approach, IEEE Trans. Inform. Theory 40 (March 1994).

[17] L. Tong, et al., Blind identification and equalization of multipath channels: a frequency domain approach, IEEE Trans. Inform. Theory 41 (January 1995) 329–334.

[18] X. Yu, Innovations based maximum likelihood sequence estimation for Rayleigh channels, Ph.D. Thesis, University of Toronto, Toronto, 1995.

[19] X. Yu, S. Pasupathy, Innovations based MLSE for Rayleigh fading channels, IEEE Trans. Comm. 43 (February 1995) 1534–1544.
