Semiblind space–time decoding in wireless …jmiguez/papers/P7_2002_Semiblind Space...The wireless channel is a hostile medium where communication signals are severely distorted

Signal Processing 82 (2002) 1–18www.elsevier.com/locate/sigpro

Semiblind space–time decoding in wireless communications:a maximum likelihood approach

Joaqu!"n M!"guez∗;1, Luis CastedoDepartamento de Electr!onica e Sistemas, Universidade da Coruna, Spain

Received 29 May 2000; received in revised form 12 September 2001; accepted 12 September 2001

Abstract

It has been recently shown that deploying multiple transmitting and receiving antennae can substantially improve thecapacity of multipath wireless channels if the rich time-scattering is properly exploited. Space–time coding (STC) is anovel proposal that combines channel coding techniques suitable for multiple transmitting elements with signal processingalgorithms that exploit the spatial and temporal diversity at the receiver. In this paper, we focus on the signal processing per-spective and propose a novel space–time semiblind decoding scheme that performs maximum likelihood (ML) based channelestimation and data detection. The multiple input multiple output (MIMO) time-scattering channel is estimated using a blockiterative expectation-maximization (EM) algorithm that fully exploits the statistical features of the transmitted signal togetherwith the knowledge of a small number of transmitted symbols, hence the term semiblind. Data detection is e#ciently carriedout using the Viterbi algorithm. In order to reduce the computational load of the receiver, a modi$cation of the EM algorithmwith a potentially lower complexity is also suggested. Computer simulations show that the proposed semiblind decoder clearlyoutperforms conventional receivers that estimate the channel parameters exclusively from the a priori known transmittedsymbols. ? 2002 Elsevier Science B.V. All rights reserved.

Keywords: Maximum likelihood; Space–time processing; Space–time decoding; Channel estimation; Sequence detection; Multiuserdetection; Interference suppression; EM algorithm; Semiblind receivers

1. Introduction

The development of third generation mobile com-munication systems has brought wireless digital trans-mission to the focus of research in the last few years.

∗ Corresponding author. Department of Electrical and Com-puter Engineering, State University of New York at Stony Brook(SUNY), Light Engineering Building, Stony Brook, NY 11794,USA. Tel.: +34-981-167150; fax: +34-981-167160.1 Facultade de Inform!atica, Campus de Elvina s=n, 15071 A

Coruna, Spain.E-mail address: [email protected] (J. M!"guez), luis@des.

$.udc.es (L. Castedo).

The wireless channel is a hostile medium wherecommunication signals are severely distorted due tomultipath propagation, additive white Gaussian noise(AWGN) and inter-symbol interference (ISI) causedby time scattering. Therefore, it is hard to achievehigh data rates and it is expected that future mobilecommunication systems will incorporate sophisticatedcoding and signal processing capabilities in order toaccomodate a wide range of new wideband servicesincluding multimedia transmission [5].The key to overcome the limitations imposed by

the wireless channel and achieve high transmissionrates is to introduce some form of diversity. We willuse this term, hereinafter, to indicate the availability

0165-1684/02/$ - see front matter ? 2002 Elsevier Science B.V. All rights reserved.PII: S0165 -1684(01)00147 -5

2 J. M!"guez, L. Castedo / Signal Processing 82 (2002) 1–18

Nomenclature

fxe(·) Probability density function(p.d.f.) of the random vector xe

fxe;H(·) P.d.f. of the random vector xethat depends on the parametermatrix H

fxe|x(n);H(·) P.d.f. of the random vector xe,that depends on the parametermatrix H, conditioned to thevalue of the random vector x(n)

Exe(n)(·) Statistical expectation with re-spect to the random vector xe(n)

Exe(n);H(·) Statistical expectation with re-spect to the random vector xe(n)that depends on the parametermatrix H

Exe(n)|x(n);H(·) Statistical expectation with re-spect to the random vector xe(n),that depends on the parametermatrix H, conditioned to thevalue of the random vector x(n)

L(·) Expected log-likelihood functionL(·) Estimate of the expected log-likelihood

functionH (l) Sequence of N × L matrices that describe

the MIMO scattering channelH Nm× L MIMO channel matrixH MIMO channel matrix estimateH Overall MIMO channel matrix, with di-

mensions N (K−M +m−1)×L(K−M)s(n) nth transmitted symbol vector, with di-

mensions N × 1s(n) nth received symbol vector, with dimen-

sions Nm× 1s Space–time codeword, with dimensions

N (K −M)× 1sp Padded space–time codeword, with di-

mensions N (K −M + m− 1)× 1x(n) L× 1 observation vectorx Observation frame, with dimensions

L(K −M)× 1

of di%erent replicas of a signal of interest that canbe combined to enhance performance in some sense[35,51]. Two interesting examples are:• Temporal diversity: multipath propagation in wire-less channels causes several replicas of the trans-mitted signal to arrive at the receiver with di%erentdelays. Channel coding together with interleavingalso provides temporal redundancy that may be un-derstood as a form of diversity [24,35].

• Spatial diversity: spatially separated antennae canbe used at the receiver to obtain di%erent replicas ofthe communication signal. Spatial transmit diver-sity has also recently been considered as a meansto increase capacity [25,52].

In particular, it has been recently demonstrated thatdeploying multiple antennae both at the transmitterand the receiver leads to a signi$cant increase inthe system spectral e#ciency.Quantitatively, it was$rst shown by G.J. Foschini that the system capac-ity grows linearly with the number of transmittingantennae 2 [7,8]. Further combination with temporal

2 As long as this is less than or equal to the number of receivingantennae.

diversity has led to a novel proposal termed space–time coding (STC) that brings together channelcoding techniques suitable for multiple transmit-ting antennae with signal processing at the receiver[7,8,10,15,26,29,32–35,50].Fig. 1 shows the basic building blocks of a wire-

less communication systemwith space–time (ST) cod-ing capabilities [10]. The bit stream to be transmit-ted, b(l), is fed into a temporal coding stage. In thesimplest case, this processing block converts the bitstream into a sequence of independent symbols. In amore general setup, a channel code is used to producea temporally correlated symbol sequence with desir-able properties that enable error detection and correc-tion at the receiver. This sequence, s(l), is the input ofa serial-to-parallel converter that produces N symbolsubstreams, 3 s1(n); : : : ; sN (n). Parallel synchronoustransmission of the N substreams is carried out usinga bank of N conventional radio-frequency modulators

3 In some cases, such as the V-BLAST scheme proposed in[50], the di%erent data substreams are encoded independently andit is easier to think of the space–time encoder as a serial-to-parallelconverter followed by a bank of channel encoders.

J. M!"guez, L. Castedo / Signal Processing 82 (2002) 1–18 3

1s (n)

s (n)N

x (n)L

x (n)1

b(l)^

RX

RX

TX

TX

b(l) s(l)Channel Encoder

S/P Converter

Space-TimeDecoder

Space-Time EncoderRF

RFDownconverters

Upconverters

Mul

tipat

h Pr

opag

atio

n

Fig. 1. Block diagram of a space–time coded wireless communi-cation system.

and transmitting antennae. Due to the combination ofchannel coding and demultiplexing, the resulting com-munication signal may present both temporal and spa-tial correlation. Multipath propagation occurs betweeneach transmitting and receiving element, resulting ina multiple input multiple output (MIMO) channel. Abank of L¿N coherent demodulators is employed atthe receiver to obtain observations, x1(n); : : : ; xL(n),that are processed by the ST decoder in order to com-pute estimates of the source bits, b(l).The design of ST codes for &at fading MIMO

channels has been studied in [10,32–35]. Althoughthe time-scattering e%ect of typical wireless multipathchannels is viewed by these authors as an impairment,they show that ISI does not reduce the performancegain provided by adequately designed ST codes. In-deed, the analytical results in [29] indicate that furtherimprovements in the data rate can be achieved whentime-scattering is properly exploited. An optimal STCstructure that allows to achieve the capacity bound isalso proposed in [29]. Unfortunately, its implemen-tation requires to have full knowledge of the MIMOtime-scattering channel, which is not available inpractice.In this paper, we focus on the signal processing

aspect of STC systems. From this point of view, anumber of techniques previously developed for chan-nel estimation and data detection in MIMO systemsare also relevant 4 for STC. When a su#ciently large

4 Either channel state information or soft data estimates mustbe obtained before decoding.

number of transmitted data are known a priori bythe receiver (i.e., training sequences are employed),conventional signal processing [11,36] can be appliedfor supervised estimation of both the channel and thedata. Transmission of training sequences, however, isnot always possible or desirable and, therefore, sev-eral blind methods have been investigated during thelast decade. Blind techniques rely on the statisticaland structural knowledge of the communication signalin order to estimate the transmitted data without usingtraining sequences and without any a priori knowl-edge of the channel [3,18,37]. Many blind algorithmsfor blind source separation (BSS) [3,4,17,26,41],blind channel identi$cation [9,37,40,42,43] and jointchannel=data estimation [30,31,53] can be either di-rectly applied or straightforwardly extended for theiruse in STC systems. It is also widely recognized [33]that ST decoding resembles the multi-user detection(MUD) problem [44] encountered in code divisionmultiple access (CDMA) systems. In CDMA, di-versity is introduced by spreading the spectrum ofthe communication signals using distinct signaturewaveforms. Many blind MUD algorithms have beenproposed [1,2,21,22,38,39,44–48] but most of themcannot be directly used for ST decoding because theyrely on the knowledge of the users’ spreading codes.In the STC context, this requirement is equivalentto having some a priori structural knowledge of theMIMO channel, which is not usually available.Nevertheless, blind techniques also pose a number

of problems, which may include misconvergence dueto inadequate choice of initial conditions [30,31,53],the need of very large data records to attain ade-quate results [3,4,41] and poor performance in thelow signal-to-noise ratio (SNR) region [46]. Semi-blind techniques try to alleviate these drawbacks bycombining statistical=structural criteria and the trans-mission of short 5 training sequences. Most decisiondirected algorithms for joint channel=data estima-tion can be classi$ed as semiblind when a trainingsequence is used for computing initial estimates[30,31,53]. Several blind subspace approaches, suchas the sub-channel matching method in [16] and theprojection approach in [23], also lend themselvesto a semiblind implementation [14]. Heuristic

5 A training sequence is short if it is not enough for adequatesupervised-only channel estimation.


combinations of a supervised criterion, typically leastsquares (LS), and a blind one, such as constant modu-lus (CM), have also been claimed to exhibit enhancedperformance [12,13].In the present paper, we propose a novel approach

to ST decoding that consists of the estimation of theMIMO time-scattering channel coe#cients and the de-tection of the transmitted data in the maximum like-lihood (ML) sense. Channel estimation is carried outusing the expectation-maximization (EM) algorithm[19] whereas joint symbol detection is e#ciently per-formed with a dynamic programming algorithm (e.g.,the Viterbi algorithm [28]) that makes use of sideinformation obtained during the channel estimationstage. Notice that this approach is optimum when atrellis channel code and=or a modulation format withmemory are used at the ST encoder. Althoughwe showthat the MIMO channel can be estimated in a totallyblind way (i.e., without any knowledge of the trans-mitted symbols), we focus on a semiblind scheme thatassumes that a small part of the transmitted symbolsare known a priori at the receiver. Since the proposedsemiblind ST decoder exploits the statistical knowl-edge of the transmitted data to a great extent, a verysmall number of a priori known symbols are enough toattain practically optimum performance. In fact, com-puter simulations show that the novel semiblind re-ceiver is clearly more e#cient than conventional de-coders because much shorter training sequences areenough to attain the same bit error rate (BER).The main drawback of the proposed semiblind de-

coding scheme is the computational complexity of theresulting algorithms. It can be easily shown that thecomplexity of ML-based channel estimation grows ex-ponentially with the number of transmitting antennaeand the channel memory size. To avoid this limita-tion, we also introduce a modi$cation of the channelestimation EM algorithm that uses suboptimum detec-tion schemes in order to reduce the computational loadof channel estimation. As a result, a decision drivensemiblind receiver is obtained.The remaining of this paper is organized as fol-

lows. Section 2 presents the signal model under con-sideration. In Section 3 we introduce the optimizationproblem that leads to the estimation of the MIMOtime-scattering channel in the ML sense, together withiterative algorithms that solve it. The convergence ofthe iterative EM algorithm for channel estimation is

discussed in Section 4. The optimum detection proce-dure is described in Section 5. In Section 6, computersimulation results are presented that illustrate the per-formance of the proposed ST decoding structures. Fi-nally, Section 7 is devoted to the conclusions.

2. Signal model

Let us consider the STC system depicted inFig. 1. The source bit stream, b(l), is encoded intoN complex symbol substreams, si(n); i = 1; : : : ; N .This process consists, in general, of a channel codingoperation that introduces temporal correlation and aserial to parallel converter that allocates the codedsymbols into N di%erent transmitters, thus creatingthe spatial dependence structure. In its simplest form(no channel coding), the ST encoder maps the sourcebits into a sequence of independent symbols and thendemultiplexes this sequence into N branches. Dur-ing the nth symbol period, the symbol substreams,s1(n); : : : ; sN (n), are fed into the bank of transmitters.N di%erent baseband communication signals (1 persymbol substream) are generated, upconverted by aradio frequency carrier and synchronously transmittedthrough the wireless multipath link. At the receivingend, the signals are coherently demodulated back tobaseband at each receiving element to yield an L× 1observation vector, x(n) = [x1(n); : : : ; xL(n)]T. Theobservation drawn from the ith receiving element,xi(n), can be written in terms of the contributionsfrom the N transmitted symbol substreams as

xi(n) =N∑

j=1

m−1∑

l=0

h∗ij(l)sj(n− l) + gi(n); (1)

where gi(n) is a zero-mean complex Gaussian noisecomponent with variance !2g and h∗ij(l); l=0; : : : ; m−1 are the discrete-time channel coe#cients resultingfrom the multipath wireless propagation from the jthtransmitting antenna to the ith receiving antenna (su-perindex ∗ denotes complex conjugation). The quan-tity m will be referred to as the channel memory sizeand depends on the time-scattering introduced by thechannel. Using (1), the nth observation vector can bewritten as

x(n) =m−1∑

l=0

HH(l)s(n− l) + g(n) (2)


where superindex H denotes Hermitian transposi-tion, g(n) = [g1(n); : : : ; gL(n)]T is an additive whiteGaussian noise (AWGN) vector, s(n − l) = [s1(n −l); : : : ; sN (n− l)]T is the (n− l)th transmitted symbolvector and

H (l) =

⎡

⎢

⎢

⎢

⎢

⎢

⎢

⎣

h11(l) h21(l) · · · hL1(l)

h12(l) h22(l) · · · hL2(l)

......

. . ....

h1N (l) h2N (l) · · · hLN (l)

⎤

⎥

⎥

⎥

⎥

⎥

⎥

⎦

(3)

is the N × L channel matrix corresponding to the(n− l)th symbol vector during the nth symbol period.The ST channel can $nally be expressed as

x(n) =HHs(n) + g(n); (4)

where

H =

⎡

⎢

⎢

⎢

⎢

⎣

H (m− 1)

H (m− 2)...

H (0)

⎤

⎥

⎥

⎥

⎥

⎦

and s(n) =

⎡

⎢

⎢

⎢

⎢

⎢

⎢

⎣

s(n− m+ 1)

s(n− m+ 2)...

s(n)

⎤

⎥

⎥

⎥

⎥

⎥

⎥

⎦

(5)

are the Nm × L MIMO channel matrix and the nthreceived Nm× 1 symbol vector, respectively.A channel use consists of the transmission of K

complex symbols in each substream, including a train-ing sequence of M symbols. The information bear-ing observation frame, or simply observation frame,is built with the observations that involve unknownsymbols and can be expressed as

x =HHsp + g; (6)

where g=[gT(M) · · · gT(K−1)]T is the (K−M)L×1AWGN vector,

x =

⎡

⎢

⎢

⎢

⎢

⎢

⎣

x(M)

x(M + 1)...

x(K − 1)

⎤

⎥

⎥

⎥

⎥

⎥

⎦

(7)

is the (K −M)L× 1 observation frame,

H =

⎡

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎣

H (m− 1) 0 · · · 0

H (m− 2) H (m− 1) · · · 0

... H (m− 2). . .

...

H (0)...

... H (m− 1)

0 H (0). . . H (m− 2)

......

. . ....

0 0 · · · H (0)

⎤

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎦

(8)

is the N (K −M +m− 1)× (K −M)L overall MIMOchannel matrix and

sp =

⎡

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎣

s(M − m+ 1)...

s(M − 1)

s(M)

...

s(K − 1)

⎤

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎦

: (9)

Vector sp can be decomposed as sp = [sT(M − m +

1) · · · sT(M −1) sT]T where s=[sT(M) · · · sT(K−1)]T is the symbol frame or ST codeword that containsthe symbols to be detected. Thus, sp will be termedthe padded ST codeword.Finally, note that for the ST decoder to be able to

recover the N symbol substreams and decode theminto an output bit streamwithout ambiguity, the overallchannel matrixHmust have full row rank [29]. This isa realistic assumption in urban scenarios where severemultipath propagation and time-scattering e%ects aretypically observed [20,50].

3. Semiblind channel estimation

We propose to estimate the MIMO channel matrixH in the maximum likelihood (ML) sense by solvingthe optimization problem

H = argmaxH

{Ex log(fx;H(x))}; (10)


where fx;H(x) is the probability density function(p.d.f.) of a single observation vector x, that dependson matrix H, log(·) is the natural logarithm, Ex de-notes statistical expectation with respect to (w.r.t.)the random vector x and H is the resulting MIMOchannel matrix estimate. Assuming that the AWGN isstatistically independent of the transmitted symbols,the p.d.f. fx;H(·) reduces to (See Appendix A)

fx;H(x) =1

"L!LgEse−(1=!

2g)∥x−H

Hs∥2 ; (11)

where Es denotes statistical expectation w.r.t. the re-ceived symbol vector, s, and ∥v∥2 = vHv. Substituting(11) into (10) and neglecting constant terms yieldsthe equivalent optimization problem

H = argmaxH

{L(H) = Ex log(Ese−(1=!2g)∥x−H

Hs∥2 )};

(12)

where the cost functionL(H) will be termed the ex-pected log-likelihood of the MIMO channel. The ex-pectation Es can be analytically obtained because indigital communications the symbols are usually mod-elled as discrete random variables with known $nitealphabet p.d.f. As a consequence, given an arbitraryfunction #(s), the expectation w.r.t. s reduces to asimple summation

Es#(s) =∑

j

fs(sj)#(sj); (13)

where fs(·) is the joint p.d.f. of the received symbolsand sj is the jth possible realization of the receivedsymbol vector. When the received symbols are statis-tically independent and identically distributed (i.i.d.)and the modulation format hasM levels, there areMNm

di%erent realizations of s and fs(sj) = M−Nm ∀j. Ingeneral, however, fs(·) may take more complicatedforms because it depends on the spatio-temporal cor-relation induced by the ST encoder. The expectationEx, on the other hand, cannot be analytically derivedand must be estimated from the actually received ob-servation vectors. Thus, the channel estimate H is ob-tained in practice as

H= argmaxH

{

L(H)

=K−1∑

n=0

log(Es(n)e−(1=!2g)∥x(n)−H

Hs(n)∥2 )

}

; (14)

where L(H) is an estimate of the expected log-likelihood function. Notice that when the MIMOchannel is not time-scattering and the transmittedsymbols are i.i.d., L(·) is also the true likelihood ofthe MIMO channel matrix. Problem (14) leads to atotally blind estimation method because we have notexploited the availability of the training sequence,S = [s(0) · · · s(M − 1)], yet. When this informa-tion is taken into account, we arrive at the semiblindestimator

H= argmaxH

{

L(H)

=K−1∑

n=0

log(Es(n)|Se−(1=!2g)∥x(n)−H

Hs(n)∥2 )

}

(15)

where notation s(n)|S denotes conditioning of the nthreceived symbol vector to the known symbols. Noticethat, for M = 0, problem (15) reduces to the blindapproach (14) whereas, for M =K , (15) is simply theLS estimator.

3.1. The EM algorithm

Problem (15) does not have a closed form solu-tion and, therefore, some iterative optimization pro-cedure is required to compute H. The EM algorithm[19] is a suitable iterative method for ML estimationwhen direct maximization of the likelihood is not fea-sible. Together with the available observations, theEM paradigm postulates the existence of some unob-served data that, if known, would simplify the param-eter estimation problem. Let us de$ne the completedata set as the union of the observed and unobserveddata sets. The algorithm consists of a two-step itera-tion:• use the observed data and the current parameterestimate to compute su#cient statistics of thecomplete data (E step) and

• re-estimate the parameter using the computed com-plete data su#cient statistics (M step).

In our problem, the observed data are the observationvectors x(0); : : : ; x(K − 1) and the complete data arethe extended observations xe(n) = [sT(n) xT(n)]T,n = 0; : : : ; K − 1. Using this notation, the log-likelihood of a single observation vector, x(n), can be


written as [19]

log(fx;H(x(n))) = U (H; H; n)− V (H; H; n); (16)

where

U (H; H; n) = Exe(n)|x(n);H log(fxe;H(xe(n))); (17)

V (H; H; n) = Exe(n)|x(n);H log(fxe|x(n);H(xe(n))):(18)

Using (16) the estimate of the expected log-likelihoodcan be rewritten as

L(H) =K−1∑

n=0

U (H; H; n)− V (H; H; n)

−K log(

1"L!Lg

)

=U(H; H)−V(H; H)− K log(

1"L!Lg

)

(19)

where U(H; H)=∑K−1

n=0 U (H; H; n) andV(H; H)=∑K−1

n=0 V (H; H; n). Applying Jensen’s inequality andthe concavity of the logarithmic function it can beshown that V (H; H; n)6V (H; H; n) ∀H [19] and,subsequently,

V(H; H)6V(H; H): (20)

The above relationship allows to prove that the se-quence of channel estimates

Hi+1 = argmaxH

{U(H; Hi)} (21)

is monotonically non decreasing in the expectedlog-likelihood. Function U(H; H) constitutes a set ofcomplete data su#cient statistics and Eq. (21) com-prises the two steps of the EM algorithm into a singlerecursive rule.The complete data p.d.f., fxe;H(·), can be written as

(see Appendix A)

fxe;H(xe(n)) = fs(s(n))1

"L!Lge−(1=!

2g)∥x(n)−H

Hs(n)∥2 :

(22)

Substituting (22) into U(H; H) and conditioning thestatistical expectation to the training sequence S yields

the equivalent recursive rule

Hi+1=argminH

{

K−1∑

n=0

Es(n)|x(n); S ;Hi∥x(n)−HHs(n)∥2

}

(23)

where we have neglected constant terms and takeninto account that the only random part in xe(n) |x(n)is the received symbol vector s(n). At this point, wehave cast problem (15), which does not have a closedform solution, into a sequence of quadratic problemsthat can be analytically solved as

Hi+1 =

(

K−1∑

n=0

Es(n)|x(n); S;His(n)sH(n)

)−1

×(

K−1∑

n=0

Es(n)|x(n); S;His(n)xH(n)

)

: (24)

To compute the conditioned expectations in theprevious formula the p.d.f. fs|x(n)(·) is required.This function can be expressed in terms of fx;H(·),fx|s(n);H(·) and fs(·) using the Bayes theorem. No-tice that fx|s(n)(·) can be easily obtained from (11)by simply dropping the expectation Es. As a result,we arrive at the following relationship:

Es(n)|x(n);Hi#(s(n))

=Es(n)e−(1=!

2g)∥x(n)−H

Hi s(n)∥

2#(s(n))

Es(n)e−(1=!2g)∥x(n)−H

Hi s(n)∥2

: (25)

Finally, let us observe that the above expression dras-tically simpli$es for n=0; : : : ; M−1 due to the knowl-edge of the training sequence S

M−1∑

n=0

Es(n)|x(n); S;Hi#(s(n))

=M−1∑

n=0

#(s(n)); n= 0; : : : ; M − 1 (26)

thus reducing the number of operations required tocompute Hi+1.


3.2. The decision-directed EM algorithm

The EM algorithm computational complexity growsexponentially with the number of transmitting anten-nae, N , and the channel memory size, m, due to theneed of computing statistical expectations w.r.t. thereceived symbol vector s(n). An important simpli$ca-tion, however, can be achieved if we realize that theconditioned expectation Es(n)|x(n);His(n) is actually thenonlinear mean squared (NMS) [36] estimate of s(n)obtained from the corresponding observation vector,x(n), and the current channel estimate, Hi. Therefore,the channel estimation algorithm will still work if wesubstitute Es(n)|x(n);His(n) and Es(n)|x(n);His(n)s

H(n) bysi(n) and si(n)sHi (n), respectively, where si(n) is aconditional estimate of s(n) computed using Hi andthe available observations. In this way, we arrive at thedecision-directed expectation-maximization (DDEM)algorithm

si(n) = Detector{x(0); : : : ; x(K − 1); Hi}

Hi+1 =

(

M−1∑

n=0

s(n)sH(n) +K−1∑

n=M

si(n)sHi (n)

)−1

×(

M−1∑

n=0

s(n)xH(n) +K−1∑

n=M

si(n)xH(n)

)

:

(27)

This algorithm employs the last update of the channelestimate to detect the transmitted symbols and, after-wards, uses the detected symbols to improve the chan-nel estimate. The initial channel estimate, H0, can beeasily obtained from the training sequence S.As any other decision directed procedure, the

DDEM algorithm is sensitive to the choice of theinitial conditions and, therefore, we can expect thatthe longer training sequences are, the better per-formance will be attained. On the other hand, ifthe computation of si(n) is carried out in an inex-pensive way (e.g., linear detection) the complexityof the DDEM algorithm reduces drastically w.r.t.the original EM algorithm. In addition, the algo-rithm combines channel estimation and data detec-tion into a single stage because symbol estimates,si(n), are iteratively computed as the algorithmproceeds. Thus, the method may lend itself to the

implementation of low-complexity semiblind STdecoders.In Section 6, we will consider linear minimummean

square error (LMMSE) $ltering, followed by projec-tion onto the symbol alphabet, for the detection part ofthe DDEM algorithm. Assuming that the transmittedsymbols are zero-mean, the LMMSE detector com-putes soft estimates as

sLMMSE = argmins

{Es;x∥x− Hsp∥2}

= [(HHRpH + !2gIL(K−M))

−1HHRsp]Hx

(28)

where H is an estimate of the overall MIMOchannel matrix built from the channel estimate H,Rp = Essps

Hp is the padded ST codeword autocorre-

lation matrix, Rsp = EsspsH is the cross correlation

matrix between the padded and non-padded ST code-words and IL(K−M)) denotes the L(K−M)×L(K−M)identity matrix. When the transmitted symbols areuncorrelated and they have equal power, the abovecorrelation matrices reduce to

Rp =

[

!2s1N (m−1) 0N (m−1)×N (K−M)

0N (K−M)×N (m−1) !2s IN (K−M)

]

and

Rsp =

[

0N (m−1)×N (K−M)

!2s IN (K−M)

]

: (29)

Notation 0r×c denotes the zero matrix of the indi-cated dimensions, 1r is the r × r square matrix withall elements equal to 1 and !2s = Es|si(n)|2 ∀i; n is thetransmitted symbol power.Detector (28) is an extension of the well-known

Wiener $lter [13] to the ST signal model under consid-eration. The main di#culty in the derivation of (28)arises from the (possible) existence of statistical de-pendence among the symbols in s, which may renderthe exact computation of Rp and Rsp somehow in-volved.

3.3. Deterministic ML methods

In order to derive the EM channel estimation algo-rithm (24), we have taken a stochastic ML approachwhere the transmitted symbols are dealt with as ran-dom variables. In this way, channel estimation and


data detection are decoupled. It is also possible totake a deterministic ML approach in order to jointlyestimate the channel coe#cients and the transmittedcodeword. Solving this joint problem via the spacealternating generalized EM (SAGE) algorithm [6,53]leads to the decision directed procedure termed itera-tive least squares with enumeration (ILSE) [31]. TheILSE algorithm proceeds in two alternating steps:(1) ML estimation of the transmitted symbols, via

enumeration, using the current channel estimateand

(2) LS channel reestimation using the symbolestimates.

For reasons of computational complexity, the enu-meration procedure in the data detection step can besubstituted either by a tree search method [49] (whentransmission is synchronous and the MIMO channelis non-dispersive) or by the Viterbi algorithm (whentransmission is asynchronous or the channel intro-duces ISI). The latter choice leads to the extension forMIMO systems of the generalized ML sequence de-tection and channel estimation (GMLSDE) iterativescheme introduced in [53] for single input and singleoutput (SISO) channels.A reduced complexity alternative to the ILSE al-

gorithm is also presented in [31]. The iterative leastsquares with projection (ILSP) consists of three steps:(1) LS (soft) estimation of the data using the current

channel estimate,(2) projection of the estimates onto the $nite alphabet

of the symbols and(3) LS reestimation of the channel matrix using the

current (hard) symbol estimates.Although derived in a di%erent way, the DDEM algo-rithm proposed in this paper can be considered as anextension of the ILSP algorithm in which: (a) ISI isspeci$cally taken into account and (b) the data detec-tion step is not constrained to any particular method,so it can be matched to the system as illustrated inSection 6. Note that the DDEM method reduces to theILSP algorithm when considering no ISI (m= 1) andthe LS criterion, followed by projection, is used forthe detection step in (27).

4. Convergence of the EM algorithm

There is an inherent ambiguity in the totally blindestimation of the MIMO channel matrix that may

cause an arbitrary permutation of the rows in H w.r.t.the true channel matrix, H. Indeed, the received ob-servation vectors can be written either as in Eq. (4)or, alternatively, as

x(n) = (PH)H(P−Hs(n)) + g(n); (30)

where P is an arbitrary, and unknown, Nm×Nm per-mutation matrix. 6 As a consequence,H cannot be dis-tinguished from PH using the observations x(n) aloneand the estimate H may correspond to PH instead ofH. In this case, detection of the transmitted symbolsbecomes considerably involved. In addition, it is wellknown that, if the likelihood function presents unde-sired local maxima, the EM algorithm may su%er fromlocal misconvergence problems [19] that make blindestimation very dependent on the choice of the initialcondition, H0.Although a rigorous proof would involve the ana-

lytical maximization of L(H), which is not possible,the computer simulation results presented in Section 6show that the semiblind estimator incorporating shorttraining sequences is able to practically avoid the per-mutation and phase-rotation ambiguity and is also veryrobust to local misconvergence. This is also illustratedin this section by means of a graphical analysis of theexpected log-likelihood function for a two-coe#cientSISO channel (N = 1; L = 1; m = 2) and a numeri-cal study of a simple MIMO channel (N = 2; L = 2;m= 1).

4.1. Expected log-likelihood surface for atwo-coe#cient SISO channel

Let us consider a simple system with one transmit-ting antenna (N = 1), one receiving antenna (L= 1),BPSK modulation and small ISI (m = 2). The re-sulting channel matrix, H, reduces in this case to atwo-coe#cient vector, h = [h1 h2]. In order to allowgraphical representation, it is necessary to assume thatboth h1 and h2 are real-valued. In particular, let us seth1 = 0:1 and h2 = 0:4. Data are transmitted in framesof K = 150 binary symbols, including a training se-quence of length M at the beginning of the data burst.The average signal-to-noise ratio is set to 10 dB.

6 P presents exactly one non-zero element in each row andcolumn. This non-zero element has unit amplitude and an unknownphase rotation, i.e., it is of the form ej$, where $ is unknown.


Fig. 2. Level curves of the likelihood function L([0:1 0:4]) when: (a) K = M = 150, (b) K = M = 15, (c) K = 150 and M = 0, (d)K = 150; M = 15.

Fig. 2(a) plots the likelihood function L(h) whenK =M = 150. In this case, the ML criterion reducesto LS estimation and we observe a purely quadraticsurface with a single maximum (labeled with ⊙) thatcorresponds to a very accurate estimate of the truechannel (labeled with ?).Fig. 2(b) depicts the quadratic surface obtained

when the training sequence length is M =15 and con-ventional LS estimation of the channel coe#cients iscarried out (i.e., only the K = M = 15 observationsfrom the training sequence are used). It can be seenhow the maximum of the surface, that corresponds tothe channel estimate, is shifted from the true channellocation.Fig. 2(c) represents the likelihood function for the

totally blind case, when M = 0. The resulting surfaceis a considerably more complicated one, with one lo-cal minimum (labeled with ×), eight local maximaand four saddle points. The local maxima correspondto all possible permutations and sign changes of h andthey are due to the ambiguity inherent to blind chan-nel estimation, as explained at the beginning of thissection. The EM algorithm may also get trapped at asaddle point, but it can be shown that a small randomperturbation of the channel estimate would cause thesequence of iterations to diverge from that point [19].

Finally, Fig. 2(d) depicts the likelihood functionfor the semiblind case, i.e., K = 150 and M = 15. Itcan be seen that the local maxima corresponding topermutations and=or sign changes of the true channelare suppressed. The resulting surface presents a sin-gle maximum that corresponds to a very accurate esti-mate of the true channel. Moreover, the properties ofthe EM algorithm guarantee the convergence to thismaximum [19]. We must remark that the suppressionof local maxima is not ensured for a particular valueofM , since it depends on the other system parametersand the channel itself, but it is always achieved as Mincreases.

4.2. Expected log-likelihood surface for afour-coe#cient MIMO channel

As a second example, we tackle the numerical studyof the expected log-likelihood in a MIMO channelwith two transmitting and receiving antennae (N =L=2), BPSK modulation and no ISI (m=1). Hence,the channel is represented by a 2 × 2 matrix thatwe set as Ht =

[

0:900:40

−0:30−0:80

]

. As before, the framesize is K = 150, including M known symbols per


Table 1Maximum of the expected log-likelihood function in the [N=L=2; m=1] MIMO channel with K=150 and M=150 (ideal LS estimation)

True channel Channel estimate Log-likelihood Gradient

Ht =

[

0:90 −0:300:40 −0:80

]

H =

[

0:91 −0:320:40 −0:83

]

1K L(H) =−1:81 1

K∇HL(H) =

[

0:02 0:010:05 0:02

]

Table 2Maximum of the expected log-likelihood function in the [N = L= 2; m= 1] MIMO channel with K = 15 and M = 15 (conventional LSestimation)

True channel Channel estimate Log-likelihood Gradient

Ht =[

0:90 −0:300:40 −0:80

]

H =[

0:82 −0:340:45 −0:89

]

1K L(H) =−1:81 1

K∇HL(H) =[

−0:01 0:030:03 −0:03

]

transmitting antenna, and the average signal-to-noiseratio is 10 dB. Due to the increased dimensionality ofthe system, it is not possible to graphically representL(H) as in the previous subsection. Instead, we showthe numerical values (with a precision of ±0:01) ofthe points of interest in the four-dimensional space,together with the corresponding magnitude and slopeof the expected log-likelihood function. In summary,the results are analogous to the SISO case.Table 1 shows the single maximum of L(H) when

K =M = 150. As in the SISO case, the ML criterionreduces to the LS method and a very accurate estimateof Ht is obtained. The performance loss that occursin practice when a LS estimate of the channel is com-puted using only the available training sequence is il-lustrated by Table 2. The maximum of L(H) whenonly the initial M = 15 observation vectors are pro-cessed by the channel estimator is indicated.In the totally blind case (K = 150 and M = 0)

there are eight local maxima corresponding to thedi%erent permutations and sign changes of Ht . Theyare shown, together with the associated value of theexpected log-likelihood and its gradient, in Table 3.Notice, however, the improved accuracy w.r.t. the LSestimator with M = 15. This is because the ML esti-mator makes use of the whole frame of available data.Finally, we study the semiblind case, i.e., K = 150

and M = 15. Table 4 shows the gradient ∇HL eval-uated at each one of the eight former local maxima.It is clearly observed that the slope of the expected

log-likelihood function is non negligible except forthe case PT = I2 (i.e., no permutation), where thegradient practically vanishes. This is analogous to theSISO example, and shows that the transmission of theshort training sequence is enough to remove any localmaxima associated to permutations or sign inversionsof the coe#cients in Ho. Again, we must remark that,although the elimination of the non-desired localmaxima does not depend on the choice of M alone,it is always achieved as this parameter is increased.

5. Maximum likelihood ST decoding

The problem of ST decoding can be formulatedas the detection of the transmitted ST codeword.Assuming that some sort of ST trellis code is em-ployed, optimum decoding is achieved by ML de-tection of the codeword, s [34]. The decoder canbe implemented with a dynamic programming algo-rithm (e.g., the Viterbi algorithm [28]) that searchesthe path with the least cost in a trellis structurethat accounts for both the channel and the encodermemory.Given aMIMO channel estimate H, theML decoder

is

sML = argmaxs {log(fx|s;H(x))}

= argmins

{$ML(s) = ∥x− HHsp∥2}; (31)


Table 3Maximum of the expected log-likelihood function in the [N =L=2; m=1] MIMO channel with K=150 and M =0 (blind ML estimation)

Channel permutation Channel estimate Log-likelihood Gradient

PT =[

1 00 1

]

H =[

0:91 −0:320:40 −0:83

]

1K L(H) =−4:56 1

K∇HL(

H)

=[

0:02 0:020:02 0:01

]

PT =[

0 11 0

]

H =[

0:40 −0:830:91 −0:32

]

1K L(H) =−4:56 1

K∇HL(

H)

=[

0:02 0:020:01 0:02

]

PT =[

1 00 −1

]

H =[

0:91 −0:32−0:40 0:83

]

1K L(H) =−4:56 1

K∇HL(

H)

=[

0:02 −0:020:02 −0:01

]

PT =[

0 1−1 0

]

H =[

−0:40 0:830:91 −0:32

]

1K L(H) =−4:56 1

K∇HL(

H)

=[

−0:02 0:02−0:01 0:02

]

PT =[

−1 00 1

]

H =[

−0:91 0:320:40 −0:83

]

1K L(H) =−4:56 1

K∇HL(

H)

=[

−0:02 0:02−0:02 0:01

]

PT =[

0 −11 0

]

H =[

0:40 −0:83−0:91 0:32

]

1K L(H) =−4:56 1

K∇HL(

H)

=[

0:02 −0:020:01 −0:02

]

PT =[

−1 00 −1

]

H =[

−0:91 0:32−0:40 0:83

]

1K L(H) =−4:56 1

K∇HL(

H)

=[

−0:02 −0:02−0:02 −0:01

]

PT =[

0 −1−1 0

]

H =[

−0:40 0:83−0:91 0:32

]

1K L(H) =−4:56 1

K∇HL(

H)

=[

−0:02 −0:02−0:01 −0:02

]

where

fx|s;H(x) =1

"L(K−M)!L(K−M)ge−(1=!

2g)∥x−H

Hsp∥2(32)

is the conditional p.d.f. of the received data frame, x,that depends on the overall channel matrix, H. Thequadratic cost function $ML(s) results from neglectingthe terms and factors in log(fx|s;H(x)) that do notdepend on s. It admits an additive decomposition

$ML(s) = ∥x− HHsp∥2 =K−1∑

n=M

∥x(n)− HHs(n)∥2

(33)

that yields the branchmetrics for the Viterbi algorithm,%n = ∥x(n)− HHs(n)∥2¿ 0.Note that, when H has been calculated via the EM

algorithm (24), the computational load required tosolve (31) is alleviated because the branch metrics%n have already been computed in the last iterationof (24). Nevertheless, the complexity of the Viterbi

algorithm grows exponentially with the number oftransmitting antennae, N , and the channel memorysize, m.

6. Computer simulations

In this section we present computer simulation re-sults that illustrate the performance of the proposedsemiblind decoding schemes. Let us consider a STCsystem with N =3 and L=3 transmitting and receiv-ing antennae, respectively. The symbol stream, s(l), isfed into the serial-to-parallel converter that allocatesthe symbols into the N radio-frequency modulators ina round-robin fashion, i.e.,

s1(0) = s(0) s1(1) = s(4) s1(2) = s(7) · · ·

s2(0) = s(1) s2(1) = s(5) s2(2) = s(8) · · ·

s3(0) = s(2) s3(1) = s(6) s3(2) = s(9) · · ·

Frames of K = 150 symbols are transmitted througheach antenna. This frame includes a training sequence


Table 4Maximum of the expected log-likelihood function in the [N = L = 2; m = 1] MIMO channel with K = 150 and M = 15 (semiblind MLestimation)

Channel permutation Channel estimate Expected log-likelihood Gradient

PT =[

1 00 1

]

H =[

0:91 −0:320:40 −0:83

]

1K L

(

H)

=−4:28 1K∇HL

(

H)

=[

0:02 0:020:02 0:01

]

PT =[

0 11 0

]

H =[

0:40 −0:830:91 −0:32

]

1K L(H) =−5:07 1

K∇HL(

H)

=[

0:34 −0:300:46 −0:42

]

PT =[

1 00 −1

]

H =[

0:91 −0:32−0:40 0:83

]

1K L(H) =−7:85 1

K∇HL(

H)

=[

0:18 0:81−0:31 −1:76

]

PT =[

0 1−1 0

]

H =[

−0:40 0:830:91 −0:32

]

1K L(H) =−7:67 1

K∇HL(

H)

=[

1:14 −0:14−1:31 −0:76

]

PT =[

−1 00 1

]

H =[

−0:91 0:320:40 −0:83

]

1K L(H) =−7:77 1

K∇HL(

H)

=[

1:67 0:38−0:67 −0:11

]

PT =[

0 −11 0

]

H =[

0:40 −0:83−0:91 0:32

]

1K L(H) =−7:95 1

K∇HL(

H)

=[

0:71 1:340:33 −1:11

]

PT =[

−1 00 −1

]

H =[

−0:91 0:32−0:40 0:83

]

1K L(H) =−12:35 1

K∇HL(

H)

=[

1:83 1:18−1:00 −1:89

]

PT =[

0 −1−1 0

]

H =[

−0:40 0:83−0:91 0:32

]

1K L(H) =−11:57 1

K∇HL(

H)

=[

1:51 1:50−1:44 −1:45

]

composed of M = 15 symbols. The communicationsignals propagate through a multipath wireless linkthat introduces ISI between consecutive symbols (i.e.,the channel memory size ism=2). The signal-to-noiseratio (SNR) at the receiver is de$ned as

SNR =!2sTrace(HHH)

L!2g; (34)

where !2s =E|s(l)|2 is the symbol power and Trace[ · ]denotes the matrix trace operator.The performance of the STC system described

above is illustrated when the minimum shift keying(MSK) modulation format is employed [28]. TheMSK encoder can be interpreted as a trellis stage,that converts the source bit stream into a sequenceof correlated symbols, followed by a radio-frequencymodulator and, therefore, it $ts the system struc-ture depicted in Fig. 1. Symbol correlation has tobe incorporated into the EM channel estimationalgorithm when computing the statistical expecta-tions w.r.t. the received symbol vectors, s(n). In the

DDEM algorithm, projection of the soft symbol es-timates onto the $nite symbol alphabet is carried outas

#= arg min#∈{"=2;−"=2}

{|si(n)− ej#si(n− 1)|2} (35)

si(n) = ej#si(n− 1)where si(n) is the soft estimate for the nth sym-bol transmitted from the ith antenna and si(n) isthe corresponding hard estimate. Finally, the STdecoder has to take into account both the mod-ulation and the channel memory to simultane-ously detect the symbols in the N transmissionstreams. It is implemented using the Viterbi algo-rithm to search for the most likely ST codewordin a trellis structure with 2Nm states and 2N in-coming and outgoing transitions associated to eachstate.The results presented below have been averaged

over 60 independent random realizations of theMIMOchannel matrix, H(1); : : : ;H(60). To obtain these


0 2 4 6 8 10 12

1e−1

1e−2

1e−3

1e−4

1e−5

1

SNR (dB)

BER

Viterbi LS(15)+Viterbi DDEM(15,150)+ViterbiEM(15,150)+Viterbi

Fig. 3. BER in a system with N = 3 transmitting antennae, L= 3receiving antennae, channel memory size m=2, training sequencelength M=15 (per antenna) and frame size K=150 (per antenna).

matrices, we have considered a Rayleigh fadingmodel, i.e., the channel coe#cients, hij(l), are com-plex i.i.d. Gaussian random variables with zero-meanand variance !2h = 1.

6.1. Average performance for a 3× 3 system withMSK modulation

Fig. 3 shows the bit error rate (BER) attained withthe considered STC system using several decodingstrategies. The optimum decoder (labeled Viterbi)consists of a Viterbi detector with perfect knowl-edge of the MIMO channel matrix. The other curvescorrespond to:• The conventional decoder, labeled LS(15)+Viterbi.It consists of a LS channel estimator, that uses justthe training sequence (with length M = 15), fol-lowed by the Viterbi detector.

• The proposed EM-based decoder, labeled EM(15; 150) + Viterbi, that combines semiblind chan-nel estimation using the EM algorithm (24) andViterbi detection. Notation EM (15; 150) indicatesM = 15, the training sequence length for one an-tenna, and K = 150, the frame length for oneantenna.

• The DDEM-based decoder, labeled DDEM(15; 150) + Viterbi. Channel estimation is carriedout using the semiblind DDEM algorithm (27),that involves soft and hard symbol estimation us-ing (28) and (35), respectively. Once the channelis estimated, the ST codeword is detected using theViterbi algorithm.

0 2 4 6 8 10 12

1e−1

1e−2

1e−3

1e−4

1e−5

1

SNR (dB)

BER

Viterbi LS(45)+Viterbi DDEM(15,150)+Viterbi

Fig. 4. Comparison of the semiblind receiver based on DDEMchannel estimation with training sequence length M = 15 and theconventional (supervised) receiver with M = 45.

It is apparent that both semiblind approaches outper-form the conventional decoder and practically matchthe optimum BER.The proposed semiblind method has an inherent

advantage over conventional approaches because itmakes use of the whole block of observations in orderto estimate the channel parameters. The main goal ofthe training sequence is to guarantee the proper con-vergence of the EM channel estimation algorithm. Forthis reason, it can remain relatively short and the trans-mission overhead is minimized. This is illustrated inFig. 4. The BER corresponding to the semiblind de-coders, withM=15, is compared with the performanceachieved by the conventional decoder when the train-ing sequence length is M = 45. It can be clearly seenthat the conventional receiver requires a training se-quence three times longer (thus falling into in a veryhigh transmission overhead) in order to attain a BERclose to the one achieved by the semiblind one.We have also extended the iterative GMLSDE

scheme proposed in [53] to the MIMO case in orderto compare it with the proposed DDEM space–timedecoder. In Fig. 5, it is seen that both receiversperform nearly the same but the complexity of theDDEM approach is much lower because the Viterbialgorithm is only run once, after channel estimation,whereas the GMLSDE decoder requires one run ofthe Viterbi algorithm each time it iterates.Finally, we consider the convergence speed of the

EM-based channel estimation algorithms. Fig. 6 illus-trates the convergence speed of the EM and DDEM


0 2 4 6 8 10 12

1e−1

1e−2

1e−3

1e−4

1e−5

1

SNR (dB)

BER

Viterbi GMLSDE(15,150) DDEM(15,150)+Viterbi

Fig. 5. Comparison of the DDEM based receiver and the GMLSDEreceiver.

0 5 10 15

−1160

−1180

−1200

−1220

−1240

−1260

−1280

−1300

iteration no.

log−

likel

ihoo

d

EM(15,150) DDEM(15,150)

Fig. 6. Convergence speed, in terms of the expected log-likelihood,of the proposed EM and DDEM algorithms.

algorithms in terms of the averaged likelihood,

L=160

60∑

i=1

L(H(i)); (36)

for the STC system under consideration, with SNR =10 dB. It is apparent that both the EM and the DDEMconverge in a very small number of iterations. Al-though the estimates from the DDEM algorithm donot attain such a high likelihood as those from the EMone, this di%erence does not necessarily translate intoa higher channel estimation error (in the mean-squaredsense). In fact, it can be seen from the previous sim-ulations that both algorithms attain nearly the sameBER for low and moderate SNR values.

7. Conclusions

We have introduced a novel space–time semiblinddecoding scheme that performs maximum likeli-hood (ML) based channel estimation and data de-tection. The multiple input multiple output (MIMO)time-scattering channel is estimated using a blockiterative expectation-maximization (EM) algorithmthat fully exploits the statistical features of the trans-mitted signal together with the knowledge of a smallnumber of transmitted symbols, hence the term semi-blind. Data detection is e#ciently carried out usingthe Viterbi algorithm, that uses some side informa-tion provided by the EM algorithm in order to reducethe number of operations. The main limitation of thesemiblind decoder is its computational complexity.We have addressed this problem introducing a heuris-tic modi$cation of the EM algorithm that allowslower complexity implementations. Finally, computersimulation results have been presented showing thatthe proposed semiblind decoders clearly outperformconventional receivers that estimate the channel pa-rameters exclusively from the known transmitted data.

Acknowledgements

This work has been supported by FEDER funds(grant number 1FD97-0082) and Xunta de Galicia(PGIDT00PXI10504PR).

Appendix A. Probability density function ofan observation vector

The p.d.f. of a single observation vector, x (wedrop the discrete-time index for simplicity), can beexpressed in terms of the p.d.f. of the received symbolvector, fs(·), and the complex AWGN vector, fg(·),which are both assumed to be known. Let us de$ne the(Nm+L)×1 vectors se=[sT gT]T and xe=[sT xT]T.It is well known [27] that the p.d.f.’s of se and xe arerelated by

fxe;H(xe) =fse(T

−1H (xe))

|JT| ; (37)

where fxe;H(·) is the p.d.f. of xe, that depends onH, fse(·) is the p.d.f. of se, TH is the invertible


transformation

TH(se),

[

s

HHs + g

]

= xe; (38)

T−1H (xe) =

[

s

x−HHs

]

= se;

JT is the Jacobian ofTH(·) and | · | denotes absolutevalue. It is straightforward to show that JT reduces tounity, since

JT = det

⎡

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎣

@s1@s1

· · · @s1@sNm

@s1@g1

· · · @s1@gL

.... . .

......

. . ....

@sNm@s1

· · · @sNm@sNm

@sNm@g1

· · · @sNm@gL

@x1@s1

· · · @x1@sNm

@x1@g1

· · · @x1@gL

.... . .

......

. . ....

@xL@s1

· · · @xL@sNm

@xL@g1

· · · @xL@gL

⎤

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎦

= det

⎡

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎢

⎣

INm 0Nm×L⎡

⎢

⎢

⎢

⎢

⎢

⎢

⎣

@x1@s1

· · · @x1@sNm

.... . .

...

@xL@s1

· · · @xL@sNm

⎤

⎥

⎥

⎥

⎥

⎥

⎥

⎦

IL

⎤

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎥

⎦

= 1; (39)

where sj is the jth element in s. If we assume that theAWGN is statistically independent of the symbols, thep.d.f. fse(·) can be factorized as

fse(T−1H (xe)) = fs(s)fg(x−H

Hs): (40)

Substituting (39) and (40) into (37), the desired ex-pression for fx;H(·) is obtained as the marginal p.d.f.

fx;H(x) =∫

sfs(s)fg(x−HHs) ds

= Esfg(x−HHs)

=1

"L!LgEse−(1=!

2g)∥x−H

Hs∥2 : (41)

Expression (41) could have also been obtained, in asomehow simpler way, resorting to the total probabil-ity theorem. We have chosen the more general deriva-tion in this appendix, however, because it allows touse the same notation already employed to obtain theEM algorithm.

References

[1] S.E. Bensley, B. Aazhang, Subspace-based channel esti-mation for code division multiple access communicationsystems, IEEE Trans. Comm. 44 (8) (August 1996) 1009–1020.

[2] M.F. Bugallo, J. M!"guez, L. Castedo, A maximum likelihoodapproach to blind multiuser interference cancellation, IEEETrans. Signal Process. 49 (6) (June 2001) 1228–1239.

[3] J.F. Cardoso, Blind signal separation: statistical principles,Proc. IEEE 86 (10) (October 1998) 2009–2025.

[4] L. Castedo, C.J. Escudero, A. Dapena, A blind signalseparation method for multiuser communications, IEEE Trans.Signal Process. 45 (5) (May 1997) 1343–1348.

[5] P. Chaudhury, W. Mohr, S. Onoe, The 3GPP Proposal forIMT-2000, IEEE Comm. Mag. 37 (12) (December 1999)72–81.

[6] J.A. Fessler, A.O. Hero, Space-alternating generalizedexpectation-maximization algorithm, IEEE Trans. SignalProcess. 42 (10) (October 1994) 2664–2677.

[7] G.J. Foschini, Layered space–time arquitechture for wirelesscommunications in a fading environment when usingmulti-element anntenas, Bell Labs Technical J. 1 (2) (1996)41–59.

[8] G.J. Foschini, G.D. Golden, R.A. Valenzuela, P.W.Wolniansky, Simpli$ed processing for high spectral e#ciencywireless communication employing multi-element arrays,IEEE J. Selected Areas Comm. 17 (11) (November 1999)1841–1852.

[9] A. Gorokhov, P. Loubaton, Blind identi$cation of MIMO-FIRsystems: A generalized linear prediction approach, SignalProcessing 73 (1–2) (January 1999) 105–124.

[10] A.R. Hammons Jr., H. El-Gamal, On the theory of space–time codes for PSK modulation, IEEE Trans. Inform. Theory46 (2) (March 2000) 524–542.

[11] S. Haykin, Adaptive Filter Theory, Information and SystemSciences Series, 3rd Edition, Prentice-Hall, Englewood Cli%s,NJ, 1996.

[12] A.M. Kuzminskiy, Finite amount of data e%ects inspatio-temporal $ltering for equalization and interferencerejection in short burst wireless communications, SignalProcessing 80 (10) (October 2000) 1987–1997.

[13] A.M. Kuzminskiy, D. Hatzinakos, Semi-blind spatio-temporal processing with temporal scanning for short burstSDMA systems, Signal Processing 80 (10) (October 2000)2063–2073.


[14] G. Li, Z. Ding, Semi-blind channel identi$cation forindividual data bursts in GSM wireless systems, SignalProcessing 80 (10) (October 2000) 2017–2031.

[15] X. Lin, R.S. Blum, Improved space–time codes using serialconcatenation, IEEE Comm. Lett. 4 (7) (July 2000) 221–223.

[16] H. Liu, G. Xu, L. Tong, A deterministic approach to blindidenti$cation of multi-channel FIR systems, Proceedings ofthe 1994 IEEE ICASSP, Adelaide, May 1994, pp. IV:581–584

[17] O. Macchi, E. Moreau, Adaptive unsupervised separation ofdiscrete sources, Signal Processing 73 (1–2) (January 1999)49–66.

[18] U. Madhow, Blind adaptive interference suppression fordirect-sequence CDMA, Proc. IEEE 86 (10) (October 1998)2049–2069.

[19] G.J. McLachlan, T. Krishnan, The EM Algorithm andExtensions, Wiley Series in Probability and Statistics, Wiley,New York, 1997.

[20] A. Mehrotra, GSM System Engineering, MobileCommunications Series, Artech House Publishers, Norwood,USA, 1997.

[21] J. M!"guez, L. Castedo, A constant modulus blind adaptivereceiver for multiuser interference suppression, SignalProcessing 71 (1) (November 1998) 15–27.

[22] J. M!"guez, L. Castedo, A linearly constrained constantmodulus approach to blind adaptive multiuser interferencesuppression, IEEE Comm. Lett. 2 (8) (August 1998) 217–219.

[23] E. Moulines, Subspace methods for the blind identi$cationof multichannel FIR $lters, Proceedings of the 1994 IEEEICASSP, Adelaide, May 1994, pp. IV:573–576.

[24] A.F. Naguib, N. Seshadri, A.R. Calderbank, Increasing datarate over wireless channels, IEEE Signal Process. Mag. 17(3) (May 2000) 76–92.

[25] A. Narula, M.D. Trott, G.W. Wornell, Performance limits ofcoded diversity methods for transmitter antenna arrays, IEEETrans. Inform. Theory 45 (7) (November 1999) 2418–2433.

[26] C.B. Papadias, New unsupervised processing techniquesfor wideband multiple transmitter=multiple receiver systems,Proceedings of the 34th Asilomar Conference, November2000.

[27] A. Papoulis, Probability, Random Variables and StochasticProcesses, McGraw-Hill Series in Electrical Engineering,New York, 1984.

[28] J.G. Proakis, Digital Communications, McGraw-Hill,Singapore, 1995.

[29] G.G. Raleigh, J.M. Cio#, Spatio-temporal coding for wirelesscommunication, IEEE Trans. Comm. 46 (3) (March 1998)357–366.

[30] N. Seshadri, Joint data and channel estimation using blindtrellis search techniques, IEEE Trans. Comm. 42 (2=3=4)(February=April 1994) 1000–1011.

[31] S. Talwar, M. Viberg, A. Paulraj, Blind separation ofsynchronous co-channel digital signals using an antenna array— part i: algorithms, IEEE Trans. Signal Process. 44 (5)(May 1996) 1184–1197.

[32] V. Tarokh, H. Jafarkhani, A.R. Calderbank, Space–time blockcodes from orthogonal designs, IEEE Trans. Inform. Theory45 (5) (July 1999) 1456–1467.

[33] V. Tarokh, A. Naguib, N. Seshadri, A.R. Calderbank,Combined array processing and space–time coding, IEEETrans. Inform. Theory 45 (4) (May 1999) 1121–1128.

[34] V. Tarokh, A. Naguib, N. Seshadri, A.R. Calderbank, Space–time codes for high data rate wireless communications:performance criteria in the presence of channel estimationerrors, mobility and multiple paths, IEEE Trans. Comm. 47(2) (February 1999) 199–207.

[35] V. Tarokh, N. Seshadri, A.R. Calderbank, Space–time codesfor high data rate wireless communications: performanceanalysis and code construction, IEEE Trans. Inform. Theory44 (March 1998) 744–765.

[36] C.W. Therrien, Discrete Random Signals and StatisticalSignal Processing, Prentice-Hall, Englewood Cli%s, NJ, USA,1992.

[37] L. Tong, S. Perreau, Multichannel blind identi$cation: fromsubspace to maximum likelihood methods, IEEE Trans.Signal Process. 45 (5) (May 1997) 1343–1348.

[38] M. Torlak, G. Xu, Blind multiuser channel estimation inasynchronous CDMA systems, IEEE Trans. Signal Process.45 (January 1997) 137–147.

[39] M.K. Tsatsanis, Inverse $ltering criteria for CDMA systems,IEEE Trans. Signal Process. 45 (1) (January 1997) 102–112.

[40] J.K. Tugnait, Blind spatio-temporal equalization and impulseresponse estimation for MIMO channels using a Godard costfunction, IEEE Trans. Signal Process. 45 (1) (January 1997)268–271.

[41] J.K. Tugnait, Adaptive blind separation of convolutivemixtures of independent linear signals, Signal Processing 73(1–2) (January 1999) 139–152.

[42] J.K. Tugnait, B. Huang, Multistep linear predictors-basedblind identi$cation and equalization of multiple-inputmultiple-output channels, IEEE Trans. Signal Process. 48 (1)(January 2000) 26–38.

[43] J.K. Tugnait, B. Huang, On a whitening approach topartial channel estimation and blind equalization of FIR=IIRmultiple-input multiple-output channels, IEEE Trans. SignalProcess. 48 (3) (March 2000) 832–845.

[44] S. Verd!u, Multiuser Detection, Cambridge University Press,Cambridge, UK, 1998.

[45] X. Wang, A. Host-Madsen, Group-blind multiuser detectionfor uplink CDMA, IEEE J. Selected Areas Comm. 17 (11)(November 1999) 1971–1984.

[46] X. Wang, H.V. Poor, Blind equalization and multiuserdetection in dispersive CDMA channels, IEEE Trans. Comm.46 (1) (January 1998) 91–103.

[47] X. Wang, H.V. Poor, Blind multiuser detection: a subspaceapproach, IEEE Trans. Inform. Theory 44 (2) (March 1998)677–690.

[48] X. Wang, H.V. Poor, Space–time multiuser detection inmultipath CDMA channels, IEEE Trans. Signal Process. 47(9) (September 1999) 2356–2374.

[49] G. Wei, L. Rasmussen, R. Wyrwas, Near optimumtree-search detection schemes for bit-synchronous multiuser


CDMA systems over Gaussian and two-path Rayleigh-Fadingchannels, IEEE Trans. Comm. 45 (6) (June 1997) 691–700.

[50] P.W. Wolniansky, G.J. Foschini, G.D. Golden, R.A.Valenzuela, V-BLAST: an architecture for realizing veryhigh data rates over the rich-scattering wireless channel,Proceedings of ISSSE’98, Pisa, Italy, 1998.

[51] G. Wornell, Linear diversity techniques for fading channels,in: Wireless Communications: Signal Processing Perspectives,Prentice-Hall, Englewood Cli%s, NJ, 1998, pp. 1–63.

[52] G. Wornell, M.D. Trott, E#cient signal processing techniquesfor exploiting transmit antenna diversity on fading channels,IEEE Trans. Signal Process. 45 (1) (January 1997) 191–205.

[53] H. Zamiri-Jafarian, S. Pasupathy, Adaptive MLSDE using theEM algorithm, IEEE Trans. Comm. 47 (8) (August 1999)1181–1193.

Documents

Semiblind space–time decoding in wireless …jmiguez/papers/P7_2002_Semiblind Space...The wireless channel is a hostile medium where communication signals are severely distorted