

ISIT 2009, Seoul, Korea, June 28 - July 3, 2009

Low-Complexity Near-ML Decoding of Large Non-Orthogonal STBCs using Reactive Tabu Search

N. Srinidhi, Saif K. Mohammed, A. Chockalingam, and B. Sundar Rajan
Department of ECE, Indian Institute of Science, Bangalore 560012, INDIA

Abstract: Non-orthogonal space-time block codes (STBC) with large dimensions are attractive because they can simultaneously achieve both high spectral efficiencies (same spectral efficiency as in V-BLAST for a given number of transmit antennas) and full transmit diversity. Decoding of non-orthogonal STBCs with large dimensions has been a challenge. In this paper, we present a reactive tabu search (RTS) based algorithm for decoding non-orthogonal STBCs from cyclic division algebras (CDA) having large dimensions. Under i.i.d. fading and perfect channel state information at the receiver (CSIR), our simulation results show that RTS based decoding of a 12 × 12 STBC from CDA with 4-QAM (288 real dimensions) achieves i) $10^{-3}$ uncoded BER at an SNR just 0.5 dB away from SISO AWGN performance, and ii) a coded BER performance within about 5 dB of the theoretical MIMO capacity, using a rate-3/4 turbo code at a spectral efficiency of 18 bps/Hz. RTS is shown to achieve near SISO AWGN performance with fewer dimensions than the LAS algorithm (which we reported recently), at some extra complexity compared to LAS. We also report good BER performance of RTS when the i.i.d. fading and perfect CSIR assumptions are relaxed by considering a spatially correlated MIMO channel model, and by using a training based iterative RTS decoding/channel estimation scheme.

I. INTRODUCTION

MIMO systems that employ non-orthogonal space-time block codes (STBC) from cyclic division algebras (CDA) for an arbitrary number of transmit antennas, $N_t$, are attractive because they can simultaneously provide both full-rate (i.e., $N_t$ complex symbols per channel use, which is the same as in V-BLAST) and full transmit diversity [1],[2]. The 2 × 2 Golden code is a well known non-orthogonal STBC from CDA for 2 transmit antennas [3]. High spectral efficiencies of the order of tens of bps/Hz can be achieved using large non-orthogonal STBCs. For example, a 16 × 16 STBC from CDA has 256 complex symbols in it with 512 real dimensions; with 16-QAM and a rate-3/4 turbo code, this system offers a high spectral efficiency of 48 bps/Hz. Decoding of non-orthogonal STBCs with such large dimensions, however, has been a challenge. The sphere decoder and its low-complexity variants are prohibitively complex for decoding such STBCs with hundreds of dimensions. Recently, we proposed a low-complexity near-ML achieving algorithm to decode large non-orthogonal STBCs from CDA; this algorithm, which is based on a bit-flipping approach, is termed the likelihood ascent search (LAS) algorithm [4]-[6]. In this paper, we present a reactive tabu search (RTS) based approach to near-ML decoding of non-orthogonal STBCs with large dimensions.
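The dimension and spectral efficiency figures quoted above follow from simple counting. The sketch below is only an illustration of that arithmetic, assuming a square full-rate STBC ($N_t$ complex symbols per channel use, $N_t^2$ complex symbols per matrix); the function names are ours and do not come from the paper.

```python
def spectral_efficiency(n_t, bits_per_symbol, code_rate):
    """bps/Hz of a full-rate STBC: n_t complex symbols per channel use."""
    return n_t * bits_per_symbol * code_rate


def real_dimensions(n_t):
    """A square full-rate n_t x n_t STBC carries n_t**2 complex symbols,
    i.e. 2 * n_t**2 real dimensions."""
    return 2 * n_t * n_t


# 16 x 16 STBC, 16-QAM (4 bits/symbol), rate-3/4 outer code
print(spectral_efficiency(16, 4, 0.75), real_dimensions(16))
# 12 x 12 STBC, 4-QAM (2 bits/symbol), rate-3/4 outer code
print(spectral_efficiency(12, 2, 0.75), real_dimensions(12))
```

The two calls reproduce the 48 bps/Hz, 512-dimension and 18 bps/Hz, 288-dimension operating points mentioned in the text.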

Key attractive features of the proposed RTS based decoding are its low complexity and near-ML performance in systems with large dimensions (e.g., hundreds of dimensions). While creating hundreds of dimensions in space alone (e.g., V-BLAST) requires hundreds of antennas, the use of non-orthogonal STBCs from CDA can create hundreds of dimensions with just tens of antennas (space) and tens of channel uses (time). Given that 802.11 smart WiFi products with 12 transmit antennas¹ at 2.5 GHz are now commercially available [7] (which establishes that issues related to the placement of many antennas and RF/IF chains can be solved in large aperture communication terminals like set-top boxes/laptops), large non-orthogonal STBCs (e.g., a 16 × 16 STBC from CDA) in combination with large dimension near-ML decoding using RTS can enable communications at increased spectral efficiencies of the order of tens of bps/Hz (note that current standards achieve only < 10 bps/Hz using only up to 4 tx antennas).

Tabu search (TS), a heuristic originally designed to obtain approximate solutions to combinatorial optimization problems [8]-[10], is increasingly applied in communication problems [11]-[13]. For example, in [11], the design of constellation label maps to maximize asymptotic coding gain is formulated as a quadratic assignment problem (QAP), which is solved using RTS [10]. The RTS approach is shown to be effective in terms of BER performance and efficient in terms of computational complexity in CDMA multiuser detection [12]. In [13], a fixed TS based detection in V-BLAST is presented. In this paper, we establish that RTS based decoding of non-orthogonal STBCs can achieve excellent BER performance (near-ML and near-capacity performance) in large dimensions at practically affordable low complexities. We also present a stopping criterion for the RTS algorithm. RTS for large dimension non-orthogonal STBC decoding has not been reported so far. Our results in this paper can be summarized as follows:

• Under i.i.d. fading and perfect channel state information at the receiver (CSIR), our simulation results show that RTS based decoding of a 12 × 12 STBC from CDA with 4-QAM (288 real dimensions) achieves i) $10^{-3}$ uncoded BER at an SNR just 0.5 dB away from SISO AWGN performance, and ii) a coded BER performance within about 5 dB of the theoretical capacity using a rate-3/4 turbo code at a spectral efficiency of 18 bps/Hz.

• Compared to the LAS algorithm we reported recently in [4]-[6], RTS achieves near-SISO AWGN performance with fewer dimensions; this is achieved at some extra complexity compared to LAS.

• We report good BER performance when the i.i.d. fading and perfect CSIR assumptions are relaxed by adopting a spatially correlated MIMO channel model, and a training based iterative RTS decoding/channel estimation scheme.

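¹ 12 antennas in these products are now used only for beamforming. Single-beam multi-antenna approaches can offer range increase and interference avoidance, but not spectral efficiency increase.

978-1-4244-4313-0/09/$25.00 ©2009 IEEE

Authorized licensed use limited to: INDIAN INSTITUTE OF SCIENCE. Downloaded on January 19, 2010 at 05:43 from IEEE Xplore. Restrictions apply.

II. NON-ORTHOGONAL STBC MIMO SYSTEM MODEL

Consider an STBC MIMO system with multiple transmit and receive antennas. An $(n, p, k)$ STBC is represented by a matrix $\mathbf{X}_c \in \mathbb{C}^{n \times p}$, where $n$ and $p$ denote the number of transmit antennas and number of time slots, respectively, and $k$ denotes the number of complex data symbols sent in one STBC matrix. The $(i,j)$th entry in $\mathbf{X}_c$ represents the complex number transmitted from the $i$th transmit antenna in the $j$th time slot. The rate of an STBC is $k/p$. Let $N_r$ and $N_t = n$ denote the number of receive and transmit antennas, respectively. Let $\mathbf{H}_c \in \mathbb{C}^{N_r \times N_t}$ denote the channel gain matrix, where the $(i,j)$th entry in $\mathbf{H}_c$ is the complex channel gain from the $j$th transmit antenna to the $i$th receive antenna. We assume that the channel gains remain constant over one STBC matrix duration. Assuming rich scattering, we model the entries of $\mathbf{H}_c$ as $\mathcal{CN}(0,1)$. The received space-time signal matrix, $\mathbf{Y}_c \in \mathbb{C}^{N_r \times p}$, can be written as

$$\mathbf{Y}_c = \mathbf{H}_c \mathbf{X}_c + \mathbf{N}_c, \tag{1}$$

where $\mathbf{N}_c \in \mathbb{C}^{N_r \times p}$ is the noise matrix at the receiver and its entries are modeled as i.i.d. $\mathcal{CN}\big(0, \sigma^2 = \frac{N_t E_s}{\gamma}\big)$, where $E_s$ is the average energy of the transmitted symbols, and $\gamma$ is the average received SNR per receive antenna [14], and the $(i,j)$th entry in $\mathbf{Y}_c$ is the received signal at the $i$th receive antenna in the $j$th time slot. Consider linear dispersion STBCs, where $\mathbf{X}_c$ can be written in the form [14]

$$\mathbf{X}_c = \sum_{i=1}^{k} x_c^{(i)} \mathbf{A}_c^{(i)}, \tag{2}$$

where $x_c^{(i)}$ is the $i$th complex data symbol, and $\mathbf{A}_c^{(i)} \in \mathbb{C}^{N_t \times p}$ is its corresponding weight matrix. The received signal model in (1) can be written in an equivalent V-BLAST form as

$$\mathbf{y}_c = \sum_{i=1}^{k} x_c^{(i)} \big(\widehat{\mathbf{H}}_c \mathbf{a}_c^{(i)}\big) + \mathbf{n}_c = \widetilde{\mathbf{H}}_c \mathbf{x}_c + \mathbf{n}_c, \tag{3}$$

where $\mathbf{y}_c \in \mathbb{C}^{N_r p \times 1} = vec(\mathbf{Y}_c)$, $\widehat{\mathbf{H}}_c \in \mathbb{C}^{N_r p \times N_t p} = (\mathbf{I}_p \otimes \mathbf{H}_c)$, $\mathbf{I}_p$ is the $p \times p$ identity matrix, $\mathbf{a}_c^{(i)} \in \mathbb{C}^{N_t p \times 1} = vec(\mathbf{A}_c^{(i)})$, $\mathbf{n}_c \in \mathbb{C}^{N_r p \times 1} = vec(\mathbf{N}_c)$, $\mathbf{x}_c \in \mathbb{C}^{k \times 1}$ is the vector whose $i$th entry is the data symbol $x_c^{(i)}$, and $\widetilde{\mathbf{H}}_c \in \mathbb{C}^{N_r p \times k}$ is the matrix whose $i$th column is $\widehat{\mathbf{H}}_c \mathbf{a}_c^{(i)}$, $i = 1, 2, \cdots, k$. Each element of $\mathbf{x}_c$ is an $M$-PAM/$M$-QAM symbol. $M$-PAM symbols take discrete values from $\mathbb{A} \triangleq \{a_q, q = 1, \cdots, M\}$, where $a_q = (2q - 1 - M)$, and $M$-QAM is nothing but two PAMs in quadrature. Let $\mathbf{y}_c$, $\widetilde{\mathbf{H}}_c$, $\mathbf{x}_c$, $\mathbf{n}_c$ be decomposed into real and imaginary parts as:

$$\mathbf{y}_c = \mathbf{y}_I + j\mathbf{y}_Q, \quad \mathbf{x}_c = \mathbf{x}_I + j\mathbf{x}_Q, \quad \mathbf{n}_c = \mathbf{n}_I + j\mathbf{n}_Q, \quad \widetilde{\mathbf{H}}_c = \mathbf{H}_I + j\mathbf{H}_Q. \tag{4}$$

Further, we define $\mathbf{H}_r \in \mathbb{R}^{2N_r p \times 2k}$, $\mathbf{y}_r \in \mathbb{R}^{2N_r p \times 1}$, $\mathbf{x}_r \in \mathbb{A}^{2k \times 1}$, and $\mathbf{n}_r \in \mathbb{R}^{2N_r p \times 1}$ as

$$\mathbf{H}_r = \begin{pmatrix} \mathbf{H}_I & -\mathbf{H}_Q \\ \mathbf{H}_Q & \mathbf{H}_I \end{pmatrix}, \quad \mathbf{y}_r = [\mathbf{y}_I^T \ \mathbf{y}_Q^T]^T, \tag{5}$$

$$\mathbf{x}_r = [\mathbf{x}_I^T \ \mathbf{x}_Q^T]^T, \quad \mathbf{n}_r = [\mathbf{n}_I^T \ \mathbf{n}_Q^T]^T. \tag{6}$$

Now, (3) can be written as

$$\mathbf{y}_r = \mathbf{H}_r \mathbf{x}_r + \mathbf{n}_r. \tag{7}$$

Henceforth, we work with the real-valued system in (7). For notational simplicity, we drop the subscript $r$ in (7) and write

$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n}, \tag{8}$$

where $\mathbf{H} = \mathbf{H}_r \in \mathbb{R}^{2N_r p \times 2k}$, $\mathbf{y} = \mathbf{y}_r \in \mathbb{R}^{2N_r p \times 1}$, $\mathbf{x} = \mathbf{x}_r \in \mathbb{A}^{2k \times 1}$, and $\mathbf{n} = \mathbf{n}_r \in \mathbb{R}^{2N_r p \times 1}$. We assume that the channel coefficients are known at the receiver but not at the transmitter. The ML solution is given by

$$\mathbf{x}_{ML} = \arg\min_{\mathbf{x} \in \mathbb{A}^{2k}} \ \mathbf{x}^T\mathbf{H}^T\mathbf{H}\mathbf{x} - 2\mathbf{y}^T\mathbf{H}\mathbf{x}, \tag{9}$$

whose complexity is exponential in $k$.

A. High-rate Non-orthogonal STBCs from CDA

We focus on the decoding of square (i.e., $n = p = N_t$), full-rate (i.e., $k = pn = N_t^2$), circulant (where the weight matrices $\mathbf{A}_c^{(i)}$ are of permutation type), non-orthogonal STBCs from CDA [1], whose construction for an arbitrary number of transmit antennas $n$ is given by the matrix in Eqn. (9.a):

$$\begin{bmatrix}
\sum_{i=0}^{n-1} d_{0,i}\,t^i & \delta\sum_{i=0}^{n-1} d_{n-1,i}\,\omega_n^{i}\,t^i & \delta\sum_{i=0}^{n-1} d_{n-2,i}\,\omega_n^{2i}\,t^i & \cdots & \delta\sum_{i=0}^{n-1} d_{1,i}\,\omega_n^{(n-1)i}\,t^i \\
\sum_{i=0}^{n-1} d_{1,i}\,t^i & \sum_{i=0}^{n-1} d_{0,i}\,\omega_n^{i}\,t^i & \delta\sum_{i=0}^{n-1} d_{n-1,i}\,\omega_n^{2i}\,t^i & \cdots & \delta\sum_{i=0}^{n-1} d_{2,i}\,\omega_n^{(n-1)i}\,t^i \\
\sum_{i=0}^{n-1} d_{2,i}\,t^i & \sum_{i=0}^{n-1} d_{1,i}\,\omega_n^{i}\,t^i & \sum_{i=0}^{n-1} d_{0,i}\,\omega_n^{2i}\,t^i & \cdots & \delta\sum_{i=0}^{n-1} d_{3,i}\,\omega_n^{(n-1)i}\,t^i \\
\vdots & \vdots & \vdots & & \vdots \\
\sum_{i=0}^{n-1} d_{n-2,i}\,t^i & \sum_{i=0}^{n-1} d_{n-3,i}\,\omega_n^{i}\,t^i & \sum_{i=0}^{n-1} d_{n-4,i}\,\omega_n^{2i}\,t^i & \cdots & \delta\sum_{i=0}^{n-1} d_{n-1,i}\,\omega_n^{(n-1)i}\,t^i \\
\sum_{i=0}^{n-1} d_{n-1,i}\,t^i & \sum_{i=0}^{n-1} d_{n-2,i}\,\omega_n^{i}\,t^i & \sum_{i=0}^{n-1} d_{n-3,i}\,\omega_n^{2i}\,t^i & \cdots & \sum_{i=0}^{n-1} d_{0,i}\,\omega_n^{(n-1)i}\,t^i
\end{bmatrix} \tag{9.a}$$

In (9.a), $\omega_n = e^{j2\pi/n}$, $j = \sqrt{-1}$, and $d_{u,v}$, $0 \le u, v \le n-1$ are the $n^2$ data symbols from a QAM alphabet. When $\delta = e^{\sqrt{5}\,j}$ and $t = e^{j}$, the STBC in (9.a) achieves full transmit diversity (under ML decoding) as well as information-losslessness [1]. When $\delta = t = 1$, the code ceases to be of full-diversity (FD), but continues to be information-lossless (ILL). High spectral efficiencies with large $n$ can be achieved using this code construction. However, since these STBCs are non-orthogonal, ML detection gets increasingly impractical for large $n$. Consequently, a key challenge in realizing the benefits of these large STBCs in practice is that of achieving near-ML performance for large $n$ at low decoding complexities. The RTS based decoding algorithm we present in the following section essentially addresses this challenge.

III. RTS ALGORITHM FOR LARGE NON-ORTHOGONAL STBC DECODING

In this section, we present the RTS algorithm, which is an iterative local search algorithm, for decoding non-orthogonal STBCs. The goal is to get $\hat{\mathbf{x}}$, an estimate of $\mathbf{x}$, given $\mathbf{y}$ and $\mathbf{H}$.

Neighborhood Definition: Let $a_q \in \mathbb{A}$, $q = 1, \cdots, M$. Define a set $\mathcal{N}(a_q)$ as a fixed subset of $\mathbb{A} \setminus a_q$, which we refer to as the symbol neighborhood of $a_q$. We choose the cardinality of this set to be the same for all $a_q$, $q = 1, \cdots, M$; i.e., we take $|\mathcal{N}(a_q)| = N, \ \forall q$. Note that the maximum and minimum values of $N$ are $M - 1$ and 1, respectively. For example, $\mathbb{A} = \{-3, -1, 1, 3\}$ for 4-PAM, and choosing $N$ to be 2, $\mathcal{N}(-3) = \{-1, 1\}$, $\mathcal{N}(-1) = \{-3, 1\}$, $\mathcal{N}(1) = \{-1, 3\}$, $\mathcal{N}(3) = \{1, -1\}$ are possible symbol neighborhoods. Let $w_v(a_q)$, $v = 1, \cdots, N$ denote the $v$th element in $\mathcal{N}(a_q)$; i.e., we say $w_v(a_q)$ is the $v$th symbol neighbor of $a_q$.

Let $\mathbf{x}^{(m)} = [x_1^{(m)} \ x_2^{(m)} \cdots x_{2k}^{(m)}]$ denote the data vector belonging to the solution space, in the $m$th iteration, where each $x_i^{(m)} = a_q$, $q \in \{1, \cdots, M\}$. We refer to the vector

$$\mathbf{z}^{(m)}(u,v) = [z_1^{(m)}(u,v) \ z_2^{(m)}(u,v) \cdots z_{2k}^{(m)}(u,v)], \tag{10}$$

as the $(u,v)$th vector neighbor (or simply the $(u,v)$th neighbor) of $\mathbf{x}^{(m)}$, $u = 1, \cdots, 2k$, $v = 1, \cdots, N$, if i) $\mathbf{x}^{(m)}$ differs from $\mathbf{z}^{(m)}(u,v)$ in the $u$th coordinate only, and ii) the $u$th element of $\mathbf{z}^{(m)}(u,v)$ is the $v$th symbol neighbor of $x_u^{(m)}$. That is,

$$z_i^{(m)}(u,v) = \begin{cases} x_i^{(m)} & \text{for } i \ne u \\ w_v\big(x_u^{(m)}\big) & \text{for } i = u. \end{cases} \tag{11}$$

So we will have $2kN$ vectors which differ from a given vector in the solution space in only one coordinate.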
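The symbol and vector neighborhoods defined in Section III can be illustrated with a small sketch. The rule used here to build $\mathcal{N}(a_q)$ (the $N$ closest alphabet symbols, ties broken toward the smaller value) is our assumption; the paper only requires a fixed subset of size $N$, and this rule happens to reproduce its 4-PAM example. All function names are hypothetical.

```python
def pam_alphabet(M):
    """M-PAM alphabet {a_q = 2q - 1 - M, q = 1..M}."""
    return [2 * q - 1 - M for q in range(1, M + 1)]


def symbol_neighborhood(alphabet, N):
    """Assumed rule: for each symbol, its N nearest other symbols
    (ties broken toward the smaller value)."""
    nbrs = {}
    for a in alphabet:
        others = sorted((b for b in alphabet if b != a),
                        key=lambda b: (abs(b - a), b))
        nbrs[a] = others[:N]
    return nbrs


def vector_neighbors(x, nbrs):
    """Yield (u, v, z): z equals x except that coordinate u (1-indexed)
    is replaced by the v-th symbol neighbor of x_u, as in eq. (11)."""
    for u, xu in enumerate(x, start=1):
        for v, w in enumerate(nbrs[xu], start=1):
            z = list(x)
            z[u - 1] = w
            yield u, v, z


alphabet = pam_alphabet(4)
nbrs = symbol_neighborhood(alphabet, 2)
print(nbrs)
# A length-4 vector (2k = 4) with N = 2 has 2kN = 8 vector neighbors.
print(len(list(vector_neighbors([-3, 1, 3, -1], nbrs))))
```

Each neighbor differs from the current vector in exactly one coordinate, so enumerating them costs $2kN$ steps per iteration.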

The $2kN$ vectors that differ from a given vector in the solution space in only one coordinate form the neighborhood of the given vector. We note that the neighborhood definition based on bit-flipping [4] is a special case of the above neighborhood definition for $N = 1$, $M = 2$.

The algorithm is said to execute a move $(u,v)$ if $\mathbf{x}^{(m+1)} = \mathbf{z}^{(m)}(u,v)$. The number of candidates to be considered for a move in the $m$th iteration is $2kN$. Since the coordinate that changes in a move can take $M$ possible values for $M$-PAM, the total number of possible moves is $2kMN$. The tabu value of a move, which is a non-negative integer, means that the move cannot be considered for that many subsequent iterations, unless certain conditions are satisfied.

Tabu Matrix: A tabu_matrix of size $2kM \times N$ is the matrix whose entries denote the tabu values of moves. The $(r,s)$th entry of the tabu_matrix corresponds to the move $(u,v)$ from $\mathbf{x}^{(m)}$ when $u = \lfloor \frac{r-1}{M} \rfloor + 1$, $v = s$ and $x_u^{(m)} = a_q$, where $q = \mathrm{mod}(r-1, M) + 1$.

RTS Algorithm: Let $\mathbf{g}^{(m)}$ be the vector which has the least ML cost found till the $m$th iteration of the algorithm. Let $l_{rep}$ be the average length (in number of iterations) between two successive occurrences of the same solution vector (repetitions), at the end of an iteration. Tabu period, $P$, a dynamic non-negative integer parameter, is defined. If a move is marked as tabu in an iteration, it will remain as tabu for $P$ subsequent iterations. The algorithm starts with an initial solution vector $\mathbf{x}^{(0)}$, which, for example, could be the MMSE or MF output vector. Set $\mathbf{g}^{(0)} = \mathbf{x}^{(0)}$, $l_{rep} = 0$, and $P = P_0$. All the entries of the tabu_matrix are set to zero. The following steps 1) to 3) are performed in each iteration. Consider the $m$th iteration in the algorithm, $m \ge 0$.

Step 1): Define $\mathbf{y}_{mf} \triangleq \mathbf{H}^T\mathbf{y}$, $\mathbf{R} \triangleq \mathbf{H}^T\mathbf{H}$, and $\mathbf{f}^{(m)} \triangleq \mathbf{R}\mathbf{x}^{(m)} - \mathbf{y}_{mf}$. Let $\mathbf{e}^{(m)}(u,v) = \mathbf{z}^{(m)}(u,v) - \mathbf{x}^{(m)}$. The ML costs of the $2kN$ neighbors of $\mathbf{x}^{(m)}$, namely, $\mathbf{z}^{(m)}(u,v)$, $u = 1, \cdots, 2k$, $v = 1, \cdots, N$, are computed as

$$\begin{aligned}
\phi\big(\mathbf{z}^{(m)}(u,v)\big) &= \big(\mathbf{x}^{(m)} + \mathbf{e}^{(m)}(u,v)\big)^T \mathbf{R} \big(\mathbf{x}^{(m)} + \mathbf{e}^{(m)}(u,v)\big) - 2\big(\mathbf{x}^{(m)} + \mathbf{e}^{(m)}(u,v)\big)^T \mathbf{y}_{mf} \\
&= \phi\big(\mathbf{x}^{(m)}\big) + 2\big(\mathbf{e}^{(m)}(u,v)\big)^T \mathbf{R}\mathbf{x}^{(m)} + \big(\mathbf{e}^{(m)}(u,v)\big)^T \mathbf{R}\,\mathbf{e}^{(m)}(u,v) - 2\big(\mathbf{e}^{(m)}(u,v)\big)^T \mathbf{y}_{mf} \\
&= \phi\big(\mathbf{x}^{(m)}\big) + \underbrace{2\, e_u^{(m)}(u,v)\, f_u^{(m)} + \big(e_u^{(m)}(u,v)\big)^2 R_{u,u}}_{\triangleq\ \mathcal{C}\left(e_u^{(m)}(u,v)\right)},
\end{aligned} \tag{12}$$

where $e_u^{(m)}(u,v)$ is the $u$th element of $\mathbf{e}^{(m)}(u,v)$, $f_u^{(m)}$ is the $u$th element of $\mathbf{f}^{(m)}$, and $R_{u,u}$ is the $(u,u)$th element of $\mathbf{R}$. $\phi(\mathbf{x}^{(m)})$ on the RHS in (12) can be dropped since it will not affect the cost minimization. Let

$$(u_1, v_1) = \arg\min_{u,v} \ \mathcal{C}\big(e_u^{(m)}(u,v)\big). \tag{13}$$

The move $(u_1, v_1)$ is accepted if any one of the following two conditions is satisfied:

i) $\phi\big(\mathbf{z}^{(m)}(u_1,v_1)\big) < \phi\big(\mathbf{g}^{(m)}\big)$

ii) tabu_matrix$\big((u_1 - 1)M + q, v_1\big) = 0$, where $q$: $x_{u_1}^{(m)} = a_q \in \mathbb{A}$.

If move $(u_1, v_1)$ is accepted, then make

$$\mathbf{x}^{(m+1)} = \mathbf{x}^{(m)} + \mathbf{e}^{(m)}(u_1,v_1). \tag{14}$$

If move $(u_1, v_1)$ is not accepted (i.e., neither of conditions i) and ii) is satisfied), find $(u_2, v_2)$ such that

$$(u_2, v_2) = \arg\min_{u,v\,:\,u \ne u_1,\, v \ne v_1} \ \mathcal{C}\big(e_u^{(m)}(u,v)\big), \tag{15}$$

and check for acceptance of the $(u_2, v_2)$ move. If this also cannot be accepted, repeat the procedure for $(u_3, v_3)$, and so on. If all the $2kN$ moves are tabu, then all the tabu_matrix entries are decremented by the minimum value in the tabu_matrix; this goes on till one of the moves becomes permissible. Let $(u', v')$ be the index of the neighbor with the minimum cost for which the move is permitted. The variables $q'$, $q''$, $v''$ are implicitly defined by $x_{u'}^{(m)} = a_{q'} = w_{v''}\big(x_{u'}^{(m+1)}\big)$, and $x_{u'}^{(m+1)} = a_{q''}$, where $a_{q'}, a_{q''} \in \mathbb{A}$.

Step 2): After a move is done, the new solution vector is checked for repetition. For the channel model in (8), repetition can be checked by comparing the ML costs of the solutions in the previous iterations. If there is a repetition, the length of the repetition from the previous occurrence is found, the average length, $l_{rep}$, is updated, and the tabu period $P$ is modified as $P = P + 1$. If the number of iterations elapsed since the last change of the value of $P$ exceeds $\beta\, l_{rep}$, for a fixed $\beta > 0$, make $P = P - 1$. The minimum value of $P$, however, will be 1. Note that this step, if executed, also qualifies as the one which changed $P$. After a move $(u', v')$ is accepted, if $\phi\big(\mathbf{x}^{(m+1)}\big) < \phi\big(\mathbf{g}^{(m)}\big)$, make

$$\text{tabu\_matrix}\big((u'-1)M + q', v'\big) = 0, \quad \text{tabu\_matrix}\big((u'-1)M + q'', v''\big) = 0, \tag{16}$$

and $\mathbf{g}^{(m+1)} = \mathbf{x}^{(m+1)}$; else,

$$\text{tabu\_matrix}\big((u'-1)M + q', v'\big) = P + 1, \quad \text{tabu\_matrix}\big((u'-1)M + q'', v''\big) = P + 1, \tag{17}$$

and $\mathbf{g}^{(m+1)} = \mathbf{g}^{(m)}$.

Step 3): Update the entries of the tabu_matrix as

$$\text{tabu\_matrix}(r,s) = \max\{\text{tabu\_matrix}(r,s) - 1, 0\}, \tag{18}$$

for $r = 1, \cdots, 2kM$, $s = 1, \cdots, N$. $\mathbf{f}^{(m)}$ is updated as

$$\mathbf{f}^{(m+1)} = \mathbf{f}^{(m)} + e_{u'}^{(m)}(u',v')\, \mathbf{R}_{u'}, \tag{19}$$

where $\mathbf{R}_{u'}$ is the $u'$th column of $\mathbf{R}$.

Stopping criterion: The algorithm can be stopped based on a fixed number of iterations. Though convergence can be slow at low SNRs (typically hundreds of iterations), it can be fast (typically tens of iterations) at moderate to high SNRs. So rather than fixing a large number of iterations to stop the algorithm irrespective of the SNR, we use an efficient stopping criterion which makes use of the knowledge of the best ML cost in a given iteration, as follows.

Since the ML criterion is to minimize $\|\mathbf{H}\mathbf{x} - \mathbf{y}\|^2$, the minimum value of the objective function $\mathbf{x}^T\mathbf{H}^T\mathbf{H}\mathbf{x} - 2\mathbf{x}^T\mathbf{H}^T\mathbf{y}$ is never less than $-\mathbf{y}^T\mathbf{y}$. We stop the algorithm when the least ML cost achieved in an iteration is within a certain range of this lower bound on the global minimum, $-\mathbf{y}^T\mathbf{y}$. We stop the algorithm in the $m$th iteration if the condition

$$\frac{\big|\phi\big(\mathbf{g}^{(m)}\big) - (-\mathbf{y}^T\mathbf{y})\big|}{\big|-\mathbf{y}^T\mathbf{y}\big|} < \alpha_1 \tag{20}$$

is met.
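Steps 1) and 3) of the RTS algorithm, the move acceptance rule, and the stopping test in (20) can be sketched in a few lines. This is a simplified illustration and not the authors' implementation: the tabu period $P$ is held fixed (the reactive update of Step 2 is omitted), a reverse move is marked only when it exists in the symbol neighborhood, the "all moves tabu" case is resolved by decrementing by the smallest tabu value among the current candidates, and the relaxed bound and repetition limit are left out. The function name `rts_sketch` and its parameters are hypothetical.

```python
def rts_sketch(H, y, alphabet, nbrs, x0, P=2, min_iter=20, max_iter=300, alpha1=0.05):
    """Simplified tabu search for min_x x^T R x - 2 x^T y_mf (fixed tabu period P)."""
    n, d, M = len(H), len(x0), len(alphabet)
    N = len(nbrs[alphabet[0]])
    # Step 1 definitions: R = H^T H, y_mf = H^T y, f = R x - y_mf.
    R = [[sum(H[i][a] * H[i][b] for i in range(n)) for b in range(d)] for a in range(d)]
    y_mf = [sum(H[i][a] * y[i] for i in range(n)) for a in range(d)]
    yTy = sum(v * v for v in y)
    x = list(x0)
    f = [sum(R[a][b] * x[b] for b in range(d)) - y_mf[a] for a in range(d)]
    phi_x = (sum(x[a] * R[a][b] * x[b] for a in range(d) for b in range(d))
             - 2 * sum(x[a] * y_mf[a] for a in range(d)))
    g, phi_g = list(x), phi_x
    tabu = [[0] * N for _ in range(d * M)]

    def row(u, sym):
        # 0-based version of the tabu_matrix row index (u - 1)M + q.
        return u * M + alphabet.index(sym)

    for m in range(1, max_iter + 1):
        # Cost increments C of all d*N neighbors, eq. (12), cheapest first.
        cands = []
        for u in range(d):
            for v, w in enumerate(nbrs[x[u]]):
                e = w - x[u]
                cands.append((2 * e * f[u] + e * e * R[u][u], u, v))
        cands.sort()
        move = None
        while move is None:
            for C, u, v in cands:
                # Accept if better than the best so far, or if not tabu.
                if phi_x + C < phi_g or tabu[row(u, x[u])][v] == 0:
                    move = (C, u, v)
                    break
            else:
                # Every candidate is tabu: release the least-tabu ones.
                dec = min(tabu[row(u, x[u])][v] for _, u, v in cands)
                tabu = [[max(t - dec, 0) for t in r] for r in tabu]
        C, u, v = move
        old, new = x[u], nbrs[x[u]][v]
        x[u], phi_x = new, phi_x + C
        # Mark the forward move and, if it exists, the reverse move, eqs. (16)-(17).
        marks = [(row(u, old), v)]
        if old in nbrs[new]:
            marks.append((row(u, new), nbrs[new].index(old)))
        value = 0 if phi_x < phi_g else P + 1
        if phi_x < phi_g:
            g, phi_g = list(x), phi_x
        for r, s in marks:
            tabu[r][s] = value
        # Step 3: age the tabu entries, eq. (18), and update f, eq. (19).
        tabu = [[max(t - 1, 0) for t in r] for r in tabu]
        f = [f[a] + (new - old) * R[a][u] for a in range(d)]
        # Stopping test of eq. (20), applied after min_iter iterations.
        if m >= min_iter and abs(phi_g + yTy) / yTy < alpha1:
            break
    return g, phi_g


ALPHABET = [-3, -1, 1, 3]
NBRS = {-3: [-1, 1], -1: [-3, 1], 1: [-1, 3], 3: [1, -1]}
H = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
y = [2.9, -1.2, 0.8, -3.1]
g, phi_g = rts_sketch(H, y, ALPHABET, NBRS, x0=[1, 1, 1, 1])
print(g, phi_g)
```

The point of tracking $\mathbf{f}^{(m)}$ is that each candidate move is scored from two scalars, $f_u$ and $R_{u,u}$, so an iteration costs on the order of $2kN$ operations for scoring plus one column update, rather than a full cost evaluation per neighbor.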

The condition in (20) must be met with at least min_iter iterations completed, to make sure the search algorithm has 'settled.' The bound is gradually relaxed as the number of iterations increases, and the algorithm is terminated when

$$\frac{\big|\phi\big(\mathbf{g}^{(m)}\big) - (-\mathbf{y}^T\mathbf{y})\big|}{\big|-\mathbf{y}^T\mathbf{y}\big|} < m\,\alpha_2. \tag{21}$$

In (20) and (21), $\alpha_1$ and $\alpha_2$ are positive constants. In addition, we terminate the algorithm whenever the number of repetitions of solutions exceeds max_rep. Also, the maximum number of iterations is set to max_iter. We have found that use of the following stopping criterion parameters results in low complexity without compromising much on the performance (compared to a fixed number of iterations of 300) for 4-QAM: min_iter = 20, max_iter = 300, max_rep = 75, $\alpha_1 = 0.05$, and $\alpha_2 = 0.0005$.

IV. SIMULATION RESULTS

We evaluated the uncoded/coded BER performance of the RTS algorithm in decoding non-orthogonal STBCs with $\delta = t = 1$ (i.e., ILL) and $\delta = e^{\sqrt{5}\,j}$, $t = e^{j}$ (i.e., FD-ILL²) through simulations. The following RTS parameters are used in all the simulations: MMSE initial vector, $P_0 = 2$, $\beta = 1, 0.1$, $\alpha_1 = 5\%$, $\alpha_2 = 0.05\%$, max_rep = 75, max_iter = 300, min_iter = 20.

A. Uncoded BER performance of RTS

RTS versus LAS Performance: In Fig. 1, we plot the uncoded BER of the RTS algorithm as a function of average received SNR per receive antenna, $\gamma$, in decoding 4 × 4 (32 dimensions), 8 × 8 (128 dimensions) and 12 × 12 (288 dimensions) non-orthogonal ILL STBCs for 4-QAM and $N_t = N_r$. Perfect CSIR and i.i.d. fading are assumed. For the same settings, the performance of the LAS algorithm in [4]-[6] is also plotted for comparison. The MMSE initial vector is used in both RTS and LAS. As a reference, we have plotted the BER performance on a SISO AWGN channel as well. From Fig. 1, the following interesting observations can be made:

• The BER of the RTS algorithm improves and approaches SISO AWGN performance as $N_t = N_r$ (i.e., STBC size) is increased; e.g., performance within 0.5 dB of SISO AWGN performance is achieved at $10^{-3}$ uncoded BER in decoding a 12 × 12 STBC with 288 real dimensions.

• The RTS algorithm performs better than the LAS algorithm (see the RTS and LAS BER plots for 4 × 4 and 8 × 8 STBCs). Further, while both RTS and LAS algorithms exhibit large system behavior (i.e., BER improves as $N_t = N_r$ is increased), RTS is able to achieve nearness to SISO AWGN performance at $10^{-3}$ BER with fewer dimensions than LAS. This is evident by observing that, while LAS requires 512 dimensions (16 × 16 STBC) to achieve 1 dB closeness to SISO AWGN performance at $10^{-3}$ BER, RTS is able to achieve even 0.5 dB closeness with just 288 dimensions (12 × 12 STBC). RTS is able to achieve this better performance because, while the bit/symbol-flipping strategies are similar in both RTS and LAS, the inherent escape strategy in RTS allows it to move out of local minima and move towards better solutions. Consequently, RTS incurs some extra complexity compared to LAS, without increase in the order of complexity.

[Figure 1: uncoded BER versus average received SNR (dB). Curves: 4 × 4, 8 × 8 and 16 × 16 ILL STBC with LAS [6]; 4 × 4, 8 × 8 and 12 × 12 ILL STBC with RTS. MMSE initial vector, $N_t = N_r$, 4-QAM.]

Fig. 1. Uncoded BER of RTS decoding of 4 × 4, 8 × 8 and 12 × 12 non-orthogonal STBCs from CDA. $N_t = N_r$, ILL STBCs ($\delta = t = 1$), 4-QAM. RTS achieves near SISO AWGN performance for increasing $N_t = N_r$ (i.e., STBC size). RTS performs better than LAS.

RTS performance in V-BLAST: A similar observation can be made with the uncoded BER of RTS detection in V-BLAST in Fig. 2 for $N_t = N_r$ and 4-QAM. From Fig. 2, it is seen that LAS requires 128 dimensions (64 × 64 V-BLAST) to achieve performance within 1 dB of SISO AWGN performance at $10^{-3}$ BER, whereas RTS is able to achieve even better closeness with just 64 dimensions (32 × 32 V-BLAST). In summary, the ability to achieve near SISO AWGN performance with fewer dimensions than LAS is an attractive feature of RTS.

B. Turbo coded BER performance of RTS

Figure 3 shows the rate-3/4 turbo coded BER of RTS decoding of the 12 × 12 non-orthogonal ILL STBC with $N_t = N_r$ and 4-QAM (corresponding to a spectral efficiency of 18 bps/Hz), under perfect CSIR and i.i.d. fading. The theoretical minimum SNR required to achieve 18 bps/Hz spectral efficiency on an $N_t = N_r = 12$ MIMO channel with perfect CSIR and i.i.d. fading is 4.27 dB (obtained through simulation of the ergodic capacity formula [14]). From Fig. 3, it is seen that RTS decoding is able to achieve a vertical fall in coded BER within about 5 dB of the theoretical minimum SNR, which is good nearness to capacity performance. This nearness to capacity can be further improved by 1 to 1.5 dB if soft decision values, proposed in [5], are fed to the turbo decoder.

C. Iterative RTS Decoding/Channel Estimation

Next, we relax the perfect CSIR assumption by considering a training based iterative RTS decoding/channel estimation scheme. Transmission is carried out in frames, where one $N_t \times N_t$ pilot matrix (for training purposes) followed by $N_d$ data STBC matrices are sent in each frame. One frame length, $T$ (taken to be the channel coherence time), is $T = (N_d + 1)N_t$ channel uses. The proposed scheme works as follows: i) obtain an MMSE estimate of the channel matrix during the pilot phase, ii) use the estimated channel matrix to decode the data STBC matrices using the RTS algorithm, and iii) iterate between channel estimation and RTS decoding for a certain

² Our simulation results show that the BER performance of FD-ILL and ILL STBCs with RTS decoding are almost the same.
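The frame structure of the training based scheme in Section IV-C implies a simple overhead calculation. The sketch below is our own illustration: the value $N_d = 8$ is hypothetical, and the "effective spectral efficiency" (discounting the one pilot matrix per frame) is a quantity we derive from the frame layout, not a figure reported in the paper.

```python
def frame_length(n_d, n_t):
    """Channel uses per frame: one N_t x N_t pilot matrix plus N_d data
    STBC matrices, each spanning N_t channel uses."""
    return (n_d + 1) * n_t


def pilot_overhead(n_d):
    """Fraction of each frame spent on the pilot matrix."""
    return 1 / (n_d + 1)


def effective_spectral_efficiency(eta, n_d):
    """Spectral efficiency after discounting the pilot matrix (assumed metric)."""
    return eta * n_d / (n_d + 1)


n_t, n_d = 12, 8  # n_d = 8 is a hypothetical choice for illustration
print(frame_length(n_d, n_t))
print(pilot_overhead(n_d))
print(effective_spectral_efficiency(18.0, n_d))
```

A longer coherence time allows a larger $N_d$, which shrinks the pilot overhead toward zero.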

[Figure 4: coded BER versus average received SNR (dB). Curves: 12 × 12 FD-ILL STBC, $N_t = N_r = 12$, i.i.d. channel; 12 × 12 FD-ILL STBC, $N_t = N_r = 12$, correlated channel; 12 × 12 FD-ILL STBC, $N_t = 12$, $N_r = 14$, correlated channel.]

Fig. 4. Effect of spatial correlation on the performance of RTS decoding of 12 × 12 FD-ILL STBC with $N_t = 12$, $N_r = 12, 14$, 4-QAM, rate-3/4 turbo code, 18 bps/Hz. $f_c = 5$ GHz, $R = 500$ m, $S = 30$, $D_t = D_r = 20$ m, $\theta_t = \theta_r = 90°$, $N_r d_r = N_t d_t = 72$ cm. Spatial correlation degrades the achieved diversity order compared to i.i.d. fading. Increasing $N_r$ alleviates this performance loss.

... compared to i.i.d. fading, there is a loss in diversity order in spatial correlation for $N_t = N_r = 12$; further, use of more receive antennas ($N_r = 14$, $N_t = 12$) alleviates this loss in performance. Finally, we note that we have carried out simulations of RTS decoding for 16-QAM as well, where results similar to those reported here for 4-QAM are observed. The RTS decoding can be used to decode perfect codes of large dimensions as well.

REFERENCES

[1] B. A. Sethuraman, B. Sundar Rajan, and V. Shashidhar, "Full-diversity high-rate space-time block codes from division algebras," IEEE Trans. Inform. Theory, vol. 49, no. 10, pp. 2596-2616, October 2003.

[2] F. Oggier, J.-C. Belfiore, and E. Viterbo, Cyclic Division Algebras: A Tool for Space-Time Coding, Foundations and Trends in Commun. and Inform. Theory, vol. 4, no. 1, pp. 1-95, Now Publishers, 2007.

[3] J.-C. Belfiore, G. Rekaya, and E. Viterbo, "The golden code: A 2 x 2 full-rate space-time code with non-vanishing determinants," IEEE Trans. Inform. Theory, vol. 51, no. 4, April 2005.

[4] K. Vishnu Vardhan, Saif K. Mohammed, A. Chockalingam, and B. Sundar Rajan, "A low-complexity detector for large MIMO systems and multicarrier CDMA systems," IEEE JSAC Spl. Iss. on Multiuser Detection for Adv. Commun. Systems and Networks, pp. 473-485, April 2008.

[5] Saif K. Mohammed, A. Chockalingam, and B. Sundar Rajan, "A low-complexity near-ML performance achieving algorithm for large MIMO detection," Proc. IEEE ISIT'2008, Toronto, July 2008.

[6] Saif K. Mohammed, A. Chockalingam, and B. Sundar Rajan, "High-rate space-time coded large MIMO systems: Low-complexity detection and performance," Proc. IEEE GLOBECOM'2008, December 2008.

[7] http://www.ruckuswireless.com/technology/beamflex.php

[8] F. Glover, "Tabu Search - Part I," ORSA Journal on Computing, vol. 1, no. 3, Summer 1989, pp. 190-206.

[9] F. Glover, "Tabu Search - Part II," ORSA Journal on Computing, vol. 2, no. 1, Winter 1990, pp. 4-32.

[10] R. Battiti and G. Tecchiolli, "The reactive tabu search," ORSA Journal on Computing, no. 2, pp. 126-140, 1994.

[11] Y. Huang and J. A. Ritcey, "Improved 16-QAM constellation labeling for BI-STCM-ID with the Alamouti scheme," IEEE Commun. Letters, vol. 9, no. 2, pp. 157-159, February 2005.

[12] P. H. Tan and L. K. Rasmussen, "Multiuser detection in CDMA - A comparison of relaxations, exact, and heuristic search methods," IEEE Trans. Wireless Commun., pp. 1802-1809, September 2004.

[13] H. Zhao, H. Long, and W. Wang, "Tabu search detection for MIMO systems," Proc. IEEE PIMRC'2007, Athens, September 2007.

[14] H. Jafarkhani, Space-Time Coding: Theory and Practice, Cambridge University Press, 2005.

[15] D. Shiu, G. J. Foschini, M. J. Gans, and J. M. Kahn, "Fading correlation and its effect on the capacity of multi-antenna systems," IEEE Trans. on Commun., vol. 48, pp. 502-513, March 2000.

[16] D. Gesbert, H. Bolcskei, D. A. Gore, and A. J. Paulraj, "Outdoor MIMO wireless channels: Models and performance prediction," IEEE Trans. on Commun., vol. 50, pp. 1926-1934, December 2002.

[Fig. 2 plot: BER versus average received SNR (dB). Legible legend entries: 64x64 V-BLAST, LAS [5]; 32x32 V-BLAST, RTS; SISO AWGN.]

Fig. 2. Uncoded BER of RTS detection of V-BLAST with N_t = N_r and 4-QAM. RTS achieves near SISO AWGN performance for increasing N_t = N_r. RTS performs better than LAS.

[Fig. 3 plot: BER versus average received SNR (dB), 3 to 12 dB. Legend: Perfect CSIR, 18 bps/Hz; Estimated CSIR, Nd=20, 17.14 bps/Hz; Estimated CSIR, Nd=8, 16 bps/Hz; Min SNR required to achieve capacity of 18 bps/Hz.]

Fig. 3. Turbo coded BER of RTS decoding of 12 x 12 non-orthogonal ILL STBC with N_t = N_r, 4-QAM, rate-3/4 turbo code, and 18 bps/Hz. BER of RTS with estimated CSIR approaches close to that with perfect CSIR for increasing N_d (i.e., slow fading).

number of times. For 12 x 12 ILL STBC, in addition to perfect CSIR performance, Fig. 3 also shows the performance with CSIR estimated using the above iterative RTS decoding/channel estimation scheme for N_d = 8 and N_d = 20. Two iterations between RTS decoding and channel estimation are used. With N_d = 20 (which corresponds to large coherence times, i.e., slow fading), the BER and bps/Hz with estimated CSIR get closer to those with perfect CSIR.
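The spectral efficiencies quoted for estimated CSIR (16 and 17.14 bps/Hz) are consistent with a simple overhead calculation. The sketch below assumes a frame of one pilot block followed by N_d data blocks; this layout is inferred from the quoted numbers, and the function and variable names are hypothetical.

```python
from fractions import Fraction

def effective_rate(full_rate_bps_hz, n_data_blocks, n_pilot_blocks=1):
    """Spectral efficiency after training overhead (assumed frame layout)."""
    frame_blocks = n_data_blocks + n_pilot_blocks
    return Fraction(full_rate_bps_hz) * n_data_blocks / frame_blocks

# Perfect-CSIR rate: 12 symbols per channel use x 2 bits (4-QAM) x rate 3/4.
full_rate = 12 * 2 * Fraction(3, 4)

for n_d in (8, 20):
    print(n_d, round(float(effective_rate(full_rate, n_d)), 2))
```

With these assumptions the perfect-CSIR rate is 18 bps/Hz, and the overhead factor N_d/(N_d + 1) gives 16 bps/Hz for N_d = 8 and about 17.14 bps/Hz for N_d = 20, so the rate approaches 18 bps/Hz as the coherence time grows.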

D. Effect of MIMO Spatial Correlation

In Figs. 1 to 3, we assumed i.i.d. fading. But spatial correlation at transmit/receive antennas and the structure of the scattering and propagation environment can affect the rank structure of the MIMO channel, resulting in degraded performance [15],[16]. We relaxed the i.i.d. fading assumption by considering the correlated MIMO channel model proposed by Gesbert et al. in [16], which takes into account carrier frequency (f_c), spacing between antenna elements (d_t, d_r), distance between tx and rx antennas (R), and scattering environment. In Fig. 4, we plot the uncoded BER of RTS decoding of 12 x 12 FD-ILL STBC with perfect CSIR in i) i.i.d. fading, and ii) correlated MIMO fading model in [16]. It is seen that, com-

