
2010 6th International Symposium on Turbo Codes & Iterative Information Processing (ISTC), Brest, France, September 6-10, 2010

Interblock Memory for Turbo Trellis Coded Modulation

Yeong-Luh Ueng, Dept. of Electrical Engineering, National Tsing Hua University, Hsinchu, Taiwan, R.O.C.

Chia-Jung Yeh, Graduate Institute of Communication Engineering, National Taiwan University, Taipei, Taiwan, R.O.C.

Mao-Chao Lin, Dept. of Electrical Engineering, National Taiwan University, Taipei, Taiwan, R.O.C.

Abstract—This paper presents a turbo trellis coded modulation (TTCM) scheme for which the encoding is implemented by serially concatenating a demultiplexer, a multilevel delay processor, and a signal mapper to the encoder of a binary turbo code. Through the delay processor and the signal mapper, interblock memory is introduced between consecutive binary turbo codewords. The proposed TTCM can be decoded based on the demapper and the binary turbo decoder. We can design a TTCM with a rate of 2 bits per 8PSK symbol and a pinch-off limit of 3.08 dB, which is only 0.18 dB away from the capacity (2.9 dB).

Index Terms—turbo codes, concatenated codes, binary codes, turbo coded modulation.

I. INTRODUCTION

Random-like binary codes such as conventional turbo codes [1], low-density parity-check (LDPC) codes [2][3], and repeat-accumulate (RA) codes [4][5] have pushed the error performance of channel coding close to the Shannon limit. Random-like coded modulation for near-capacity performance has also been extensively studied. The related results include turbo trellis coded modulation (TTCM) [6]-[11], multilevel coded modulation employing random-like binary codes [12][13], LDPC codes for PSK modulation [14], and many more which are not listed here.

On the other hand, block coded modulation (BCM) [15] with interblock memory (BCMIM) was proposed in [16]. It was demonstrated in [17] that BCMIM has the advantage of rate gains without distance loss compared to conventional BCM. In [18][19], a trellis coded modulation (TCM) scheme with large free distance was proposed. In [20], a tail-biting design of the TCM presented in [19] was proposed. The resultant coded modulation can be viewed as a bit-interleaved coded modulation (BICM) [26][27] with a regular bit interleaver and was demonstrated to have better error performance than BICM when the code length is short [20]. In [16][18][19][20], non-random codes are modified to achieve larger free (or minimum) distances. The associated distance spectra cannot be as thin as those of turbo codes even if optimum decoding is used. In addition, the suboptimal decoding used in [18][19] increases the error coefficients (equivalently, degrading the distance spectra). Hence, the error performance at low signal-to-noise ratio (SNR) is poor.

978-1-4244-6746-4/10/$26.00 ©2010 IEEE

For a random-like block code [1]-[5], the error performance can be improved by increasing the block length, which implies more correlation among code bits. There is a construction of random-like code that is not in block form, called the LDPC convolutional code [28], for which increased correlation among code bits can be obtained by increasing the constraint length. Consequently, it is interesting to increase the correlation among code bits by providing correlation among consecutive codewords of random-like block codes. In [21], we considered a binary coding scheme T which is implemented by serially concatenating a demultiplexer, a multilevel delay processor, and a binary signal mapper to the encoder of a binary turbo code C. With the delay processor and the signal mapper, extra checks are added between code bits of consecutive binary turbo codewords. Hence, this coding is a kind of binary turbo coding scheme with interblock memory.

For T, two suboptimum decoding methods were proposed in [21]. The first is IDSC (iterative decoding within a single codeword of C) and the second is IDAC (iterative decoding between adjacent codewords of C). IDSC is a two-stage decoding which employs a demapper in the first stage and a turbo decoder of C in the second stage. Using IDSC to decode T will result in significant error coefficients which will degrade the error performance of T. The reason for the degradation is that, in the check-node operation defined by the signal mapper, we do not utilize the extrinsic information provided by the following codeword of turbo code C that is correlated with the current codeword of C. To alleviate this problem, we can use IDAC, which efficiently utilizes this extrinsic information. Usually, using IDAC, we can achieve better error performance with some additional complexity and decoding delay compared to using IDSC. T can achieve a much lower error floor than that of C for short-to-long code lengths using either IDSC or IDAC.

In this paper, we investigate turbo trellis coded modulation using interblock memory by replacing the binary signal mapper with a multi-level constellation. We derive the pinch-off limits of T using IDSC and IDAC respectively by EXIT (extrinsic information transfer) charts [33] under some additional assumptions. Simulation is performed to evaluate the performance of T. In all the investigated cases, there always exists a TTCM T which performs at least as well as the TTCM presented in [6] regarding the waterfall region.


In some cases, the superiority is very significant. In particular, we can design a TTCM T with a rate of 2 bits per 8PSK symbol and a pinch-off limit of 3.08 dB, which is slightly better than that of the LDPC coded 8PSK presented in [14] and is only 0.18 dB away from the capacity (2.9 dB).

The remainder of this paper is organized as follows. Encoding, decoding (including IDSC and IDAC), and the distance property of binary turbo coding with interblock memory are reviewed in Section II. Turbo trellis coded modulation with interblock memory is given in Section III. Conclusions are given in Section IV.

II. BINARY TURBO CODING WITH INTERBLOCK MEMORY

A. Encoding

Let C be a turbo code with codeword length mλ. The schematic diagram of turbo coding with interblock memory, denoted as T, is shown in Fig. 1. The output of C is split into multiple streams, and then each stream undergoes a different delay. These streams are linearly combined to form T.

Denote the t-th codeword of C as a(t) = [a(t,0), a(t,1), ..., a(t,mλ−1)]. The turbo coded sequence ā = {..., a(t), a(t+1), ...} is sequentially processed by a demultiplexer and a delay processor to produce the associated output sequences v̄ = {..., v(t), v(t+1), ...} and s̄ = {..., s(t), s(t+1), ...}, respectively, where v(t) = [v(t,0), v(t,1), ..., v(t,λ−1)], s(t) = [s(t,0), s(t,1), ..., s(t,λ−1)], and each of v(t,k) = [v1(t,k), ..., vm(t,k)] and s(t,k) = [s1(t,k), ..., sm(t,k)] is a binary m-tuple. The sequence s̄ is then processed by the signal mapper to produce the output sequence z̄ = {..., z(t), z(t+1), ...}, where z(t) = [z(t,0), z(t,1), ..., z(t,λ−1)] and z(t,k) = [z1(t,k), ..., zm(t,k)] is an m-bit symbol. The resultant code T takes z̄ as its code sequence.

The relation between the output and input of the demultiplexer is vj(t,k) = a(t, (j−1)λ + k), 1 ≤ j ≤ m, k = 0, 1, ..., λ−1. The delay processor acts as a convolutional interleaver with block delay. For an m-level delay processor, the relationship between vj(t,k) and sj(t,k) is sj(t,k) = vj(t−(m−j), k), 1 ≤ j ≤ m, 0 ≤ k ≤ λ−1. The rates of T and C are the same. For T, the insertion of the delay processor introduces interblock memory since z(t) and z(t+1) are correlated.
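As a concreteness aid (not the authors' code), the two relations above can be sketched in Python; the function names and the list-of-blocks representation are my own assumptions.

```python
# Sketch of the demultiplexer relation v_j(t,k) = a(t,(j-1)*lam + k) and the
# m-level delay processor relation s_j(t,k) = v_j(t-(m-j),k); blocks before
# the start of transmission are taken as all-zero.

def demultiplex(a, m, lam):
    """Split one codeword a (length m*lam) into lam binary m-tuples v(k)."""
    return [[a[(j - 1) * lam + k] for j in range(1, m + 1)]
            for k in range(lam)]

def delay_process(v_blocks, m, lam):
    """v_blocks: list of demultiplexed codewords v(0), v(1), ...
    Returns s(0), s(1), ... with s_j(t,k) = v_j(t-(m-j),k)."""
    T = len(v_blocks)

    def v(t, k, j):
        # Level-j bit of block t; all-zero outside the transmitted range.
        return v_blocks[t][k][j - 1] if 0 <= t < T else 0

    return [[[v(t - (m - j), k, j) for j in range(1, m + 1)]
             for k in range(lam)]
            for t in range(T)]
```

For m = 2, level 1 of block t is drawn from block t−1 while level 2 passes through undelayed, which is exactly the interblock coupling exploited in the rest of the section.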

The signal mapper maps a binary m-tuple s(t,k) into a binary m-tuple z(t,k) = w(s(t,k)) = s(t,k)M ∈ Ω0, where M is a binary m × m nonsingular matrix and Ω0 = {0,1}^m. According to s = (s1, s2, ..., sm) and z = w(s) ∈ Ω0, we can construct an m-level partition chain Ω0/Ω1/Ω2/.../Ωm, where si, 1 ≤ i ≤ m, is the label of the partitioning Ωi−1/Ωi, according to the set-partitioning principle in [22][23].

Let w(s) and w(s') denote two signal points which are in the same coset of Ωj−1 but in distinct cosets of Ωj, labelled by sj = 1 and s'j = 0, respectively. Let Δ(w(s), w(s')) represent the bit-wise Hamming distance between w(s) and w(s'). We define Δj to be the least of all the possible nonzero Δ(w(s), w(s')), and nj to be the average number of signal points w(s') satisfying Δ(w(s), w(s')) = Δj. We have Δm ≥ Δm−1 ≥ ... ≥ Δ1.

Example 1: Let m = 2, Ω0 = {0,1}^2 and M = [1 0; 1 1]. We have z1 = s1 ⊕ s2 and z2 = s2. We have Δ1 = min{Δ(w(s), w(s')) : s ∈ {(1,0), (1,1)}; s' ∈ {(0,0), (0,1)}} = 1. Both w(s = (1,0)) and w(s = (1,1)) are possible neighbors labelled with s1 = 1 which are at a Hamming distance of 1 from w(s' = (0,0)), which is labelled with s'1 = 0. Hence, we have n1 = 2. Similarly, we have Δ2 = 2 and n2 = 1. □
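The partition parameters Δj and nj can be checked by brute force for any nonsingular M. A minimal sketch (the helper names are my own) for the mapper of Example 1:

```python
from itertools import product

# Brute-force computation of Delta_j and n_j for the GF(2) mapper
# w(s) = s*M, with M = [[1,0],[1,1]] as in Example 1 (z1 = s1 xor s2,
# z2 = s2).

M = [[1, 0], [1, 1]]

def w(s):
    m = len(s)
    return tuple(sum(s[i] * M[i][j] for i in range(m)) % 2 for j in range(m))

def partition_params(m):
    """For level j, compare points whose labels agree on s_1..s_{j-1} and
    differ in s_j. Delta_j is the least bit-wise Hamming distance; n_j is
    the average number of nearest neighbours."""
    params = {}
    for j in range(1, m + 1):
        dists = []
        for s in product((0, 1), repeat=m):
            ds = [sum(x != y for x, y in zip(w(s), w(sp)))
                  for sp in product((0, 1), repeat=m)
                  if sp[:j - 1] == s[:j - 1] and sp[j - 1] != s[j - 1]]
            dists.append(ds)
        delta = min(min(ds) for ds in dists)
        n = sum(ds.count(delta) for ds in dists) / len(dists)
        params[j] = (delta, n)
    return params
```

Running `partition_params(2)` reproduces Δ1 = 1, n1 = 2 and Δ2 = 2, n2 = 1 from Example 1.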

The encoding structure of T using the parameters given in Example 1 is shown in Fig. 2.

B. Distance property

With the delay processor and the signal mapper, we can check the distance properties of the resultant code T in a manner similar to that used in [19]. Let v̄ = {..., v(−1), v(0), ...} be a weight-d binary sequence that is the demultiplexed version of a turbo coded sequence ā from C. Let z̄ and z̄0 be the binary output sequences associated with v̄ and 0̄, respectively, where 0̄ is the all-zero sequence. Let ΔT(z̄, z̄0) be the bit-wise Hamming distance between the output sequences z̄ and z̄0.

Assume v(t) = 0 for t < 0 and v(0) ≠ 0, where 0 is the all-zero codeword consisting of λ binary m-tuples. Remember that si(m−j, k) = vi(m−j−(m−i), k) = vi(i−j, k) = 0 for i < j. From the set-partitioning property, we have Δ(w(s(m−j, k)), w(0)) ≥ (sj(m−j, k) ⊕ 0)Δj = vj(0, k)Δj. Thus

ΔT(z̄, z̄0) ≥ Σ_{k=0}^{λ−1} Σ_{j=1}^{m} Δ(w(s(m−j, k)), w(0))
          ≥ Σ_{k=0}^{λ−1} Σ_{j=1}^{m} vj(0, k)Δj ≡ D_LB,T(v̄, 0̄).   (1)

Example 2: Consider a delay processor with m = 2 and λ = 4. Let v̄ = {..., (00), v(0,0) = (v1(0,0), v2(0,0)) = (01), v(0,1) = (10), v(0,2) = (11), v(0,3) = (01), (00), (00), ...} be the input sequence of the delay processor, which is also a code sequence of C. Hence, the associated output sequence s̄ of the delay processor is given by s(0,0) = (01), s(0,1) = (00), s(0,2) = (01), s(0,3) = (01), s(1,0) = (00), s(1,1) = (10), s(1,2) = (10), s(1,3) = (00), with all other s(t,k) = (00).

With the signal mapper given in Example 1, we have ΔT(z̄, z̄0) ≥ v1(0,1)Δ1 + v1(0,2)Δ1 + v2(0,0)Δ2 + v2(0,2)Δ2 + v2(0,3)Δ2 = 2Δ1 + 3Δ2. Since Δ1 = 1 and Δ2 = 2, we have ΔT(z̄, z̄0) ≥ 8. Indeed, ΔT(z̄, z̄0) = 8. In contrast, the Hamming distance between v̄ and 0̄ is only 5. □
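Both the bound and the exact distance in Example 2 can be verified numerically. A short script (the sequence representation is my own), assuming the m = 2 delay processor and the mapper of Example 1:

```python
# Recomputing Example 2: v(0,k) for k = 0..3 is (01),(10),(11),(01) and all
# other blocks are zero. The mapped sequence should be at Hamming distance 8
# from the all-zero sequence, while the input weight is only 5.

def mapper(s):
    # z1 = s1 xor s2, z2 = s2 (Example 1).
    return (s[0] ^ s[1], s[1])

v = {0: [(0, 1), (1, 0), (1, 1), (0, 1)]}        # v(0,k); other t are zero

def v_bit(t, k, j):                               # j in {1, 2}
    return v.get(t, [(0, 0)] * 4)[k][j - 1]

dist = 0
for t in range(0, 3):                             # blocks affected by v(0)
    for k in range(4):
        s = (v_bit(t - 1, k, 1), v_bit(t, k, 2))  # s_j(t,k) = v_j(t-(2-j),k)
        dist += sum(mapper(s))                    # Hamming weight vs all-zero

in_weight = sum(sum(blk) for blk in v[0])
print(dist, in_weight)                            # 8 5
```

The distance gain (8 versus input weight 5) illustrates how the interblock memory spreads a low-weight input over two mapped blocks.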

Let NC(d) and NT(d) denote the multiplicities of weight-d code sequences of C and T, respectively. In C, each weight-d binary sequence v̄ contributes only "one" neighbor in counting NC(d). In T, each binary sequence v̄ for which ΔT(z̄, z̄0) = d contributes "one" neighbor in counting NT(d), in case optimum decoding is employed. Suppose that, for T, a suboptimum decoding of a(0) is designed based on the m-level partition structure and we have no a priori information about a(1), a(2), .... Then, each binary sequence v̄ for which ΔT(z̄, z̄0) = d may contribute more than one neighbor in counting NT(d). We illustrate this point by the following example.

Example 3: Write the v̄ in Example 2 as {v̄P, (00), (00), ...}, where v̄P = {..., (00), ..., (00), v(0,0) = (01), v(0,1) = (10), v(0,2) = (11), v(0,3) = (01)}. We have ΔT(z̄, z̄0) = 8. By feeding v̄2 = {v̄P, (00), (01), (00), (00), (00), ...}, v̄3 = {v̄P, (00), (00), (01), (00), (00), ...}, or v̄4 = {v̄P, (00), (01), (01), (00), (00), ...} to the delay processor and signal mapper specified in Example 2, we can obtain sequences z̄2, z̄3, and z̄4, respectively, at the output of the signal mapper. Note that z̄2, z̄3, and z̄4 may not be code sequences of T since v̄2, v̄3, and v̄4 may not be code sequences of C. In addition, z̄2, z̄3, and z̄4 are also at a distance of 8 from z̄0. In a suboptimum decoding, we have no a priori information about s2(1,k) and hence s2(1,k) can be either 0 or 1 for k = 0, 1, 2, 3. Hence, z̄2, z̄3, and z̄4 are regarded as neighbors which are at a distance of 8 from z̄0 even though they may not be code sequences of T. Hence, the code sequence v̄ contributes 4 = 2^{s1(1,1)} 2^{s1(1,2)} = n1^{v1(0,1)} n1^{v1(0,2)} neighbors in counting NT(ΔT(z̄, z̄0) = 8). □
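The neighbor count of 4 obtained in Example 3 is an instance of a product over the bits of v(0). A sketch (the helper name is hypothetical), using (n1, n2) = (2, 1) from Example 1:

```python
# Neighbour count n_T = prod over j,k of n_j^{v_j(0,k)}, evaluated for the
# v(0) of Examples 2 and 3 with n_1 = 2, n_2 = 1.

def neighbour_count(v0, n):
    """v0: list of binary m-tuples v(0,k); n: list [n_1, ..., n_m]."""
    total = 1
    for vk in v0:
        for j, bit in enumerate(vk):
            if bit:
                total *= n[j]
    return total

print(neighbour_count([(0, 1), (1, 0), (1, 1), (0, 1)], [2, 1]))  # 4
```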

Now consider the general case. If vj(0,k) = 1, then there are nj possible neighbors labelled with sj(m−j,k) = 1 which are at a Hamming distance of Δj from w(0). Hence, the code sequence v̄ contributes

nT(v̄, 0̄) = Π_{j=1}^{m} Π_{k=0}^{λ−1} nj^{sj(m−j,k)} = Π_{j=1}^{m} Π_{k=0}^{λ−1} nj^{vj(0,k)}   (2)

neighbors in counting NT(ΔT(z̄, z̄0)). In case we have the a priori information about a(1), a(2), ..., we are able to reduce nT(v̄, 0̄).

C. Iterative decoding within a single codeword

Now we present a suboptimum decoding for T, called iterative decoding within a single codeword (IDSC). Here, we present the IDSC with m = 2. Extension to IDSC with m > 2 is straightforward.

Suppose that we have received ..., y(t−1), y(t), y(t+1), ..., where y(t) is the received version of z(t) that is corrupted by additive Gaussian noise. Now consider the decoding of a(t). The decoder of IDSC consists of an MAP (maximum a posteriori probability) demapper and a turbo decoder of C. Suppose the extrinsic L-values (or log-likelihood ratios) LD,e(a(t−1)) = LD,p(a(t−1)) − LD,c(a(t−1)) of the bits in a(t−1) have been obtained from the decoding of a(t−1), where LD,p(a(t−1)) and LD,c(a(t−1)) are the associated a posteriori and channel values, respectively. In [35], the symbol ⊞ is used as the notation for the box-plus addition of L-values.

37

Step 1: The demapper computes the a posteriori L-values LM,p(v2(t)) for the bits in v2(t) based on y(t) and the a priori L-values LM,a(v1(t−1)), which can be obtained from LD,e(a(t−1)). Specifically, for k = 0, 1, ..., λ−1, we have LM,p(v2(t,k)) = L(v2(t,k)|y1(t,k), y2(t,k)) = (Lc · y1(t,k)) ⊞ LM,a(v1(t−1,k)) + Lc · y2(t,k), where Lc is the channel reliability value.

Step 2: The demapper computes LM,p(v1(t)) for the bits in v1(t) based on y(t+1) and LM,a(v2(t+1)) = 0. Specifically, for k = 0, 1, ..., λ−1, we have LM,p(v1(t,k)) = L(v1(t,k)|y1(t+1,k), y2(t+1,k)) = (Lc · y1(t+1,k)) ⊞ (Lc · y2(t+1,k)).

Step 3: The turbo decoder of C uses LM,p(v(t)) as the channel values LD,c(a(t)) for the bits in a(t) to recover the bits of a(t) and obtain the values LD,e(a(t)), which are stored for the calculation of LM,p(v2(t+1)).

Throughout this paper, the Max-Log-MAP algorithm with correction factors [34][35][36] is employed, and the number of iterations used for the decoding of C is N_I.
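Step 1 above relies on the box-plus addition ⊞ of L-values. A common formulation (a sketch; the exact correction-factor implementation in [34]-[36] may differ) relates the exact operation to its max-log approximation plus two correction terms:

```python
import math

# Box-plus addition of L-values: exact form, and the max-log form with the
# correction terms log(1+e^-|x+y|) - log(1+e^-|x-y|). With both correction
# terms included, the two expressions are mathematically identical.

def boxplus_exact(x, y):
    return 2.0 * math.atanh(math.tanh(x / 2.0) * math.tanh(y / 2.0))

def boxplus_maxlog_corrected(x, y):
    return (math.copysign(1.0, x) * math.copysign(1.0, y)
            * min(abs(x), abs(y))
            + math.log1p(math.exp(-abs(x + y)))
            - math.log1p(math.exp(-abs(x - y))))
```

Dropping the two `log1p` terms gives the plain max-log approximation; the correction factors recover the exact value.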

D. Iterative decoding between adjacent codewords

For T, IDSC results in large error coefficients, as indicated in equation (2). The error coefficients can be reduced by message passing between two adjacent turbo codewords, a(t) and a(t+1). This decoding is referred to as iterative decoding between adjacent codewords (IDAC). We now present the decoding of a(t) for IDAC with m = 2. Suppose that LD,e(a(t−1)) has been obtained.

Step 1: With LD,e(a(t)) = 0 (equivalently LM,a(v(t)) = 0), we use IDSC to decode a(t+1) and obtain LD,e(a(t+1)).

Step 2: Based on the LD,e(a(t+1)) obtained in Step 1, LD,e(a(t−1)), and LM,a(v(t)) = 0, the demapper calculates LM,p(v(t)). The decoder of C then uses LM,p(v(t)) as LD,c(a(t)) to decode a(t) and obtain LD,e(a(t)).

Step 3: Based on the LD,e(a(t)) obtained in Step 2, we use IDSC to re-decode a(t+1) and update LD,e(a(t+1)).

Step 4: Based on the updated LD,e(a(t+1)) obtained in Step 3, LD,e(a(t−1)), and LM,a(v(t)) = 0, the demapper calculates LM,p(v(t)). The decoder of C uses the updated LM,p(v(t)) as new channel values LD,c(a(t)) to decode a(t) and obtain LD,e(a(t)).

Step 5: After repeating Steps 3 and 4 N_IDAC − 1 times, we can recover a(t) and obtain LD,e(a(t)).

For each turbo decoding of C, the number of iterations used is N_I. It can be seen that IDSC can be viewed as a scheduling for decoding a(t) (equivalently, v(t)) over a window covering z(t) and z(t+1), while IDAC can be viewed as a scheduling for decoding a(t) (equivalently, v(t)) over a window covering z(t), z(t+1), and z(t+2).


E. A specific construction

In this section, we use the rate-1/2 conventional binary turbo code [1] as the code C in constructing T. This C is composed of two recursive systematic convolutional codes, RSC1 and RSC2. The parameters m, Ω0, and M given in Example 1 are used. The relation between the input bits and output bits of the demultiplexer is given by v1(t,k) = u(t,k), v2(t,k) = p1(t,k) for even k, and v2(t,k) = p2(t,k) for odd k, where u(t,k), p1(t,k), and p2(t,k) are the message bit, the parity bit of RSC1, and the parity bit of RSC2, respectively. The interleaver size is K = λ. The constituent codes used in this paper are 4-state codes with generator matrix (1, 5/7) in octal. Additive white Gaussian noise (AWGN) channels are assumed.

In [32], the effective minimum distance d2,min(C) of C, which equals 2 + a_min, is defined to be the minimum Hamming weight generated by weight-2 message words, where a_min is the associated parity-check weight. Let NC(d2,min(C)) denote the associated multiplicity.

For T, we can calculate effective minimum distances or effective free distances with the aid of (1). We have Δ1 = 1 and Δ2 = 2 since the signal mapper in Example 1 is used. Now consider a sequence z̄ of T associated with a sequence ā = {..., a(t), a(t+1), ...}, where a(t) is a weight-d2,min(C) codeword of C. In particular, the message part of a(t), i.e., u(t), has weight 2 and the parity part of a(t), i.e., p(t), has weight a_min. In case a(t') = 0 for t' > t, the Hamming weight of the z̄ of T associated with the ā of C is 2Δ1 + a_min·Δ2 = 2 + 2a_min ≡ d2,min(T), which is larger than d2,min(C) if a_min > 0. In case a(t') ≠ 0 for some t' > t, the Hamming weight of z̄ will be increased beyond d2,min(T). This argument implies that NT(d2,min(T)), the multiplicity of weight d2,min(T) in T, is equal to NC(d2,min(C)), the multiplicity of weight d2,min(C) in C.

From the analysis of the effective minimum distance and the associated multiplicity, we expect that the asymptotic performance of T in the error floor region will be better than that of C in case optimum decoding is used. To see whether this trend still holds when the suboptimum decoding methods presented in this paper are used, we estimate the low-weight part of the weight spectra of T resulting not only from the weight-2 message sequences but also from the weight-3 message sequences. We consider the case of an S-random interleaver of size K = 1024 with S = 18. We use (1) and (2) to calculate the distance spectra of T. The effective minimum distances d2,min(T) and d2,min(C) are 18 and 10, respectively. The associated multiplicities, NT(d2,min(T)) and NC(d2,min(C)), are 3 and 3, respectively. Let d3,min(T) and d3,min(C) denote the minimum weight of sequences in T and C resulting from the weight-3 message sequences, respectively. Then, d3,min(T) and d3,min(C) are 25 and 14, respectively. The associated multiplicities, NT(d3,min(T)) and NC(d3,min(C)), are 1 and 1, respectively. Since IDAC takes the a priori information of a(t+1) into account, we assume that, for the high-SNR condition, the associated multiplicities are close to those derived earlier using optimum decoding. As to IDSC, the multiplicity NT(d2,min(T)) associated with d2,min(T) is 12, and the multiplicity NT(d3,min(T)) associated with d3,min(T) is 4.

Using the equation for BER given in [31], the asymptotic performance calculated based on NT(d2,min(T)) and NT(d3,min(T)) for IDSC and IDAC, respectively, and the BER obtained from simulation are shown in Fig. 3. We see that the asymptotic performance of T is dominated by sequences encoded from low-weight messages, as in the case of the conventional turbo code C. The suboptimum decoding IDAC can achieve BER performance in the error floor region which is close to that predicted based on optimum decoding. The BER performance of the suboptimum decoding IDSC is inferior to that of IDAC due to the increased multiplicities.

III. TURBO TRELLIS CODED MODULATION WITH INTERBLOCK MEMORY

A. Encoding and Decoding

The scheme of turbo coding with interblock memory can also be used to construct coded modulation if the binary signal mapper M in Fig. 1 is replaced by the signal mapper for a modulation constellation. For a constellation with 2^m points, we consider a q-level, q ≤ m, partition. Let 0 = i0 < i1 < ... < i_{q−1} < i_q = m, where each i_j is an integer. Write s = [θ1, ..., θq], where θj = [s_{i_{j−1}+1}, s_{i_{j−1}+2}, ..., s_{i_j}]. With this, we can have a q-level partition chain Ω = Ω0/Ω1/Ω2/.../Ωq, where Ωj−1 is partitioned into 2^{i_j − i_{j−1}} cosets of Ωj and Ωj represents one of the 2^{i_j − i_{j−1}} cosets [22][23]. Since Ω is a signal constellation, the distance measure Δ(z, z') is the squared Euclidean distance between z and z'. Let w(s) and w(s') denote two signal points which are in the same coset of Ωj−1 but in distinct cosets of Ωj, labelled by θj = α and θ'j = β, respectively. We define Δj(α, β) to be the least of all the possible Δ(w(s), w(s')), and nj(α, β) to be the average number of signal points w(s') satisfying Δ(w(s), w(s')) = Δj(α, β). Furthermore, let w(s), w(s') ∈ Ωj−1 and s_{i_{j−1}+k} ≠ s'_{i_{j−1}+k}. We define δ_{i_{j−1}+k}, k = 1, ..., i_j − i_{j−1}, to be the least of all the possible Δ(w(s), w(s')), and η_{i_{j−1}+k} to be the average number of signal points w(s') satisfying Δ(w(s), w(s')) = δ_{i_{j−1}+k}. Note that if q = m, then δj = Δj(1, 0) = Δj and ηj = nj(1, 0) = nj for j = 1, ..., m. The necessity of introducing the m-level parameters, i.e., ηi and δi for all possible i, in addition to the q-level parameters {Δ1(·,·), ..., Δq(·,·)} and {n1(·,·), ..., nq(·,·)}, is that even though we use a q-level partition structure in the encoding, the bit-by-bit decoding used in the turbo decoding views the coding as an m-level structure.

Example 4: Let m = 3, q = 3 and {i1, i2, i3} = {1, 2, 3}. Consider an 8PSK signal constellation Ω with Ungerboeck's labelling [22]. It can be checked that {Δ1(1,0) = 0.586, Δ2(1,0) = 2, Δ3(1,0) = 4}, {n1(1,0) = 2, n2(1,0) = 2, n3(1,0) = 1}, {δ1 = 0.586, δ2 = 2, δ3 = 4} and {η1 = 2, η2 = 2, η3 = 1}. □
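The 8PSK distances quoted in Example 4 can be reproduced by brute force; the natural-labelling bit assignment below (point index b1 + 2·b2 + 4·b3) is my assumption for Ungerboeck's labelling, on a unit-energy constellation:

```python
import math

# Brute-force check of the intra-set squared Euclidean distances of 8PSK
# under natural (set-partitioning) labelling.

def point(i):
    return complex(math.cos(2 * math.pi * i / 8), math.sin(2 * math.pi * i / 8))

def sqdist(i, j):
    return abs(point(i) - point(j)) ** 2

# Delta_1(1,0): labels differing in bit 1 (adjacent points are always such
# a pair): 4*sin^2(pi/8) = 2 - sqrt(2) ~ 0.586.
d1 = min(sqdist(i, j) for i in range(8) for j in range(8) if (i ^ j) & 1)
# Delta_2(1,0): within a QPSK subset (same bit 1), labels differing in bit 2.
d2 = min(sqdist(i, j) for i in range(8) for j in range(8)
         if (i & 1) == (j & 1) and (i ^ j) & 2)
# Delta_3(1,0): within a BPSK subset (same bits 1 and 2): antipodal pair.
d3 = min(sqdist(i, j) for i in range(8) for j in range(8)
         if (i & 3) == (j & 3) and i != j)
print(round(d1, 3), round(d2, 3), round(d3, 3))   # 0.586 2.0 4.0
```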


Example 5: Let m = 3, q = 2 and {i1, i2} = {1, 3}. Consider an 8PSK signal constellation Ω with mixed labelling [24]. It can be checked that {Δ1(1,0) = 0.586, Δ2(00,10) = 2, Δ2(00,01) = 2, Δ2(00,11) = 4}, {n1(1,0) = 2, n2(00,10) = 1, n2(00,01) = 1, n2(00,11) = 1}, {δ1 = 0.586, δ2 = 2, δ3 = 2} and {η1 = 2, η2 = 1, η3 = 1}. □

With such a q-level partition, we can use a general q-level delay processor instead of the m-level delay processor to construct coded modulation. Write v(t,k) = [γ1(t,k), ..., γq(t,k)], where γj(t,k) = [v_{i_{j−1}+1}(t,k), v_{i_{j−1}+2}(t,k), ..., v_{i_j}(t,k)]. The relationship between γj(t,k) and θj(t,k) is θj(t,k) = γj(t−(q−j), k) for 1 ≤ j ≤ q, k = 0, 1, 2, ..., λ−1, where mλ is the codeword length of C.

With the general q-level delay processor, the lower bound on the pairwise distance measure ΔT(z̄, z̄0) between the sequences z̄ and z̄0 in (1) is modified to Δ_LB(v̄, 0̄) = Σ_{j=1}^{q} Σ_{k=0}^{λ−1} Δj(γj(0,k), 0j), where 0j denotes the all-zero (i_j − i_{j−1})-tuple.

We can use either IDSC or IDAC to decode T in the case of coded modulation. Note that, for either IDSC or IDAC, the extrinsic value for each labelling bit is calculated at the bit level rather than at the symbol level in the MAP demapper, since bit-by-bit decoding is used in the turbo decoding of C. In IDSC, the increased number of neighbors given in (2) is modified to n(v̄, 0̄) = Π_{j=1}^{q} Π_{k=1}^{i_j − i_{j−1}} Π_{ℓ=0}^{λ−1} [η_{i_{j−1}+k}]^{v_{i_{j−1}+k}(0,ℓ)}. The increased error coefficient can also be reduced by IDAC.

B. Specific constructions

We now consider the design of TTCM with a rate of 2 bits per 8PSK symbol. Let C be a rate-2/3 conventional binary turbo code with an interleaver size of K message bits. Let a(t) = [u(t), p(t)] be the t-th codeword of C, where p(t) = [p1(t,0), p2(t,1), p1(t,4), p2(t,5), ..., p1(t,K−2), p2(t,K−1)].

Construction A: We use the parameters m, Ω, q, {i1, i2} and the signal labelling given in Example 5. The relation between the inputs and outputs of the demultiplexer is described by v1(t,k) = u(t,2k) and v2(t,k) = u(t,2k+1). In addition, v3(t,k) = p1(t,2k) for even k and v3(t,k) = p2(t,2k−1) for odd k. Let λ = K/2. The relation between the inputs and outputs of the delay processor is described by s1(t,k) = v1(t−1,k), s2(t,k) = v2(t,k), and s3(t,k) = v3(t,k), k = 0, 1, ..., λ−1. The relation among the sequences is given in Fig. 4. Using this construction, a TTCM T with a rate of 2 bits per 8PSK symbol can be obtained. □

We may consider a modified version of Construction A, denoted Construction A', which employs Ungerboeck's labelling instead of mixed labelling and a three-level delay processor, i.e., m = q = 3 and λ = K/2, which yields {Δ1(1,0) = 0.586, Δ2(1,0) = 2, Δ3(1,0) = 4} and {η1 = 2, η2 = 2, η3 = 1}. Although the asymptotic performance is quite good, {η1 = 2, η2 = 2, η3 = 1} implies a dense distance spectrum and hence causes poor error performance using IDSC for the SNR range of interest to us. In addition, the decoding delay will be increased. Although we can use IDAC to decode codes obtained by Construction A' to reduce the increased error coefficients, we must apply IDSC to a(t), a(t+1), and a(t+2) respectively to sufficiently reduce the increased error coefficients, and hence the decoding complexity is significantly increased. Moreover, the decoding delay is further increased.

Suppose that we switch the positions of u(t,2k) and p1(t,2k), and switch the positions of u(t,2(k+1)) and p2(t,2k+1), where k is an even integer. Then, messages for which u(t,2k) and u(t,2(k+1)) are zero for all even k will greatly reduce the code distance. The compromise given in the following construction achieves a smaller code distance but a thinner distance spectrum compared to Construction A.

Construction B: This construction is the same as Construction A except that we switch the positions of u(t,2(k+1)) and p2(t,2k+1), where k is an even integer. The resultant code T is still a TTCM with a rate of 2 bits per 8PSK symbol. The relation among the sequences is given in Fig. 5. □

C. Performance analysis

We now consider the error performance of T obtained using Construction A or B. In our simulation, 4-state and 16-state constituent codes are considered. The generator matrices for the 4-state and 16-state constituent codes are (1, 5/7) and (1, 35/23) in octal, respectively. Additive white Gaussian noise (AWGN) channels are assumed.

1) BER results for short-to-moderate interleavers: In this subsection, we use S-random interleavers with S = 24, 28, and 32 for K = 2048, 4096, and 8192, respectively. Simulation results for 4-state T using either IDSC or IDAC are given in Fig. 6, where K = 2048. Also included in Fig. 6 are the simulation results for the 4-state TTCM constructed from [6] with K = 2048, 4096, and 8192. We can see that the error performance of T obtained using either Construction A or B is better than that of the TTCM of [6] at moderate-to-high SNR, not only for the same interleaver size but also for the same decoding delay. We also observe that T can achieve much better error performance, at higher decoding complexity, using IDAC compared to using IDSC. The inferior asymptotic error performance of the TTCM given in [6] compared to T may be due to the minimum distance of the TTCM in [6] being smaller than that of T. In addition, T obtained by Construction A is suitable for low-BER (or high-SNR) conditions, while T obtained by Construction B is suitable for moderate-BER (or moderate-SNR) conditions.

2) EXIT chart analysis for pinch-off limits: The EXIT chart is a powerful tool for analyzing the convergence behavior of iterative decoding of turbo-like codes. To investigate the pinch-off limit (or turbo cliff) of T using IDSC and IDAC, the EXIT chart technique is adopted. Let Ia[ck, La(ck)] denote the mutual information between bit ck and its a priori L-value (LLR) La(ck). Similarly, Ie[ck, Le(ck)] denotes the mutual information between bit ck and its extrinsic L-value Le(ck). In addition, we use Ia(c) and Ie(c) to denote the average mutual information (1/N) Σ_{k=0}^{N−1} Ia[ck, La(ck)] and (1/N) Σ_{k=0}^{N−1} Ie[ck, Le(ck)] of the bits in c = (c0, c1, ..., cN−1), respectively. The EXIT charts of a(t) are calculated based on the conditions that K = 2097152 and Ia(a(t−1)) = 1. For IDSC, we consider the information exchange between the two constituent codes RSC1 and RSC2 based on the assumption Ia(a(t+1)) = 0. For T, the channel values LD,c(a(t)) for the code bits of RSC1 and RSC2 are obtained from the demapper instead of directly from the channel. The EXIT curves are shown in Fig. 7, where Construction B using the 4-state component codes is considered. The pinch-off limit for the 4-state T using IDSC is 3.24 dB, while the pinch-off limit for the TTCM presented in [6] is 3.55 dB.

For IDAC, we need to consider the information exchange between a(t) and a(t+1) for T. In fact, we only consider the information exchange between the bits in v1(t) and [v2(t+1), v3(t+1)], where v1(t) and [v2(t+1), v3(t+1)] are parts of a(t) and a(t+1), respectively. The EXIT curves of T using Construction B are shown in Fig. 8, in which Ia(a(t-1)) = 1 and Ia(a(t+2)) = 0 are assumed. We run a simulation on decoding a(t) with NI = 60 iterations to obtain Ie1(v1(t)) for various Ia1(v2(t+1), v3(t+1)). Similarly, we run a simulation on decoding a(t+1) with NI = 60 iterations to obtain Ie2(v2(t+1), v3(t+1)) for various Ia2(v1(t)). From the EXIT curves of T using IDAC, we find that T using IDAC can converge to a large value of mutual information with a small number NIDAC of interblock iterations. The pinch-off limit for the 4-state T using IDAC is 3.08 dB.
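The convergence behavior read off such a chart can be mimicked by a toy trajectory computation: given two monotone EXIT transfer functions (hypothetical stand-ins for the measured curves, not the paper's data), the exchange converges when iterating them drives the mutual information toward 1.

```python
def exit_trajectory(f1, f2, max_iters=60, target=0.999):
    """Iterate the extrinsic-information exchange between two component
    decoders whose EXIT transfer functions are f1 and f2.  The extrinsic
    output of one decoder becomes the a priori input of the other.
    Returns (converged, number_of_iterations_used)."""
    ia1 = 0.0
    for it in range(1, max_iters + 1):
        ia2 = f1(ia1)   # decoder 1's extrinsic -> decoder 2's a priori
        ia1 = f2(ia2)   # decoder 2's extrinsic -> decoder 1's a priori
        if ia1 >= target:
            return True, it
    return False, max_iters

# Hypothetical monotone transfer curve with an open tunnel (f(0) > 0 and
# the curve stays above the diagonal), so the trajectory climbs to I = 1.
f_open = lambda ia: min(1.0, 0.55 + 0.5 * ia)
print(exit_trajectory(f_open, f_open))   # -> (True, 2)
```

The pinch-off limit is the lowest SNR at which the tunnel between the two curves remains open, i.e., at which such a trajectory still reaches the top-right corner of the chart.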

To verify the pinch-off limit (or turbo cliff) for T obtained by EXIT charts, we perform BER simulations for T with very long interleavers (K = 262144) to see whether the proposed TTCM can achieve near-capacity performance. The simulation results are given in Fig. 9, where the interleavers used are S-random interleavers with S = 90. We see that the 4-state T (Construction B) with K = 262144 can achieve a BER of 4.45 × 10^-8 at Eb/N0 = 3.12 dB, which is slightly better than the threshold (pinch-off SNR) of 3.15 dB for the irregular LDPC-CM given in [14]. In addition, the pinch-off limit of the 4-state T using IDAC is 3.08 dB, which is only 0.18 dB from the constraint capacity (2.9 dB).
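The S-random constraint used for these interleavers requires that any two input positions within S of each other map to outputs at least S apart. A minimal rejection-based construction (a sketch only; the generator actually used for the large interleavers here is not specified) is:

```python
import random

def s_random_interleaver(K, S, seed=0, max_tries=100):
    """Build a length-K S-random interleaver: inputs within S positions
    of each other map to outputs at least S apart.  Greedy selection
    from a shuffled pool, restarted on failure; practical for modest
    K and S < sqrt(K/2)."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        pool = list(range(K))
        rng.shuffle(pool)
        perm = []
        while pool:
            for i, cand in enumerate(pool):
                # candidate must be >= S away from the last S placed values
                if all(abs(cand - p) >= S for p in perm[-S:]):
                    perm.append(pool.pop(i))
                    break
            else:
                break  # stuck: reshuffle and retry
        if len(perm) == K:
            return perm
    raise RuntimeError("failed to build an S-random interleaver")

perm = s_random_interleaver(256, 8)
print(len(perm), len(set(perm)))
```

The spreading condition breaks up short error events between the constituent codes, which is why larger S (e.g., S = 90 at K = 262144) is paired with the longer interleavers.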

In our EXIT chart analysis, Ia(a(t-1)) = 1 is assumed. In the BER simulation, we do not use this assumption; in addition, we consider only a finite NI for IDSC and IDAC and a finite NIDAC for IDAC. Hence, the pinch-off limits for T obtained from the EXIT chart analysis are better than those obtained from the BER simulation. However, the EXIT analysis does provide valuable information for the design of T. For example, the EXIT chart analysis for the 4-state T shows that there is significant room for obtaining better BER performance by using interblock memory, and we do obtain better BER performance through simulation.

IV. CONCLUSION

The performance of turbo trellis coded modulation (TTCM) with interblock memory, which can be decoded using either IDSC or IDAC, has been investigated. The designed TTCM can achieve better error performance in both the error-floor and waterfall regions compared to the conventional TTCM presented in [6].


In particular, we can design a TTCM with a rate of 2 bits per 8PSK symbol and a pinch-off limit of 3.08 dB, which is only 0.18 dB away from the capacity (2.9 dB).

REFERENCES

[1] C. Berrou, A. Glavieux, and P. Thitimajshima, "Near Shannon limit error-correcting coding and decoding: turbo-codes (1)," in Proc. IEEE ICC, Geneva, Switzerland, May 1993, pp. 1064-1070.

[2] D. J. C. MacKay, "Good error-correcting codes based on very sparse matrices," IEEE Trans. Inform. Theory, vol. 45, no. 2, pp. 399-431, Mar. 1999.

[3] T. J. Richardson, M. A. Shokrollahi, and R. L. Urbanke, "Design of capacity-approaching irregular low-density parity-check codes," IEEE Trans. Inform. Theory, vol. 47, no. 2, pp. 619-637, Feb. 2001.

[4] D. Divsalar, H. Jin, and R. McEliece, "Coding theorems for 'turbo-like' codes," in Proc. 36th Allerton Conf. on Commun., Control, and Computing, Sept. 1998, pp. 201-210.

[5] H. Jin, A. Khandekar, and R. McEliece, "Irregular repeat-accumulate codes," in Proc. 2nd Int. Symp. on Turbo Codes, Sept. 2000, pp. 1-8.

[6] S. Le Goff, A. Glavieux, and C. Berrou, "Turbo-codes and high spectral efficiency modulation," in Proc. IEEE ICC'94, 1994, pp. 1064-1070.

[7] P. Robertson and T. Worz, "Bandwidth-efficient turbo trellis coded modulation using punctured component codes," IEEE J. Select. Areas Commun., vol. 16, pp. 206-218, Feb. 1998.

[8] S. Benedetto, D. Divsalar, G. Montorsi, and F. Pollara, "Parallel concatenated trellis coded modulation," in Proc. IEEE ICC, 1996, pp. 974-978.

[9] T. M. Duman and M. Salehi, "Performance bounds for turbo-coded modulation systems," IEEE Trans. Commun., vol. 47, no. 4, pp. 511-521, Apr. 1999.

[10] C. Fragouli and R. D. Wesel, "Turbo-encoder design for symbol-interleaved parallel concatenated trellis coded modulation," IEEE Trans. Commun., vol. 49, no. 3, pp. 425-435, Mar. 2001.

[11] P. Li, B. Bai, and X. Wang, "Low-complexity concatenated two-state TCM schemes with near-capacity performance," IEEE Trans. Inform. Theory, vol. 49, no. 12, pp. 3225-3234, Dec. 2003.

[12] U. Wachsmann, R. F. H. Fischer, and J. B. Huber, "Multilevel codes: theoretical concepts and practical design rules," IEEE Trans. Inform. Theory, vol. 45, pp. 1361-1391, July 1999.

[13] J. Hou, P. H. Siegel, L. B. Milstein, and H. D. Pfister, "Capacity-approaching bandwidth-efficient coded modulation schemes based on low-density parity-check codes," IEEE Trans. Inform. Theory, vol. 49, no. 9, pp. 2141-2155, Sept. 2003.

[14] D. Sridhara and T. E. Fuja, "LDPC codes over rings for PSK modulation," IEEE Trans. Inform. Theory, vol. 51, no. 9, pp. 3209-3220, Sept. 2005.

[15] H. Imai and S. Hirakawa, "A new multilevel coding method using error-correcting codes," IEEE Trans. Inform. Theory, vol. IT-23, no. 3, pp. 371-376, May 1977.

[16] M. C. Lin and S. C. Ma, "A coded modulation scheme with interblock memory," IEEE Trans. Commun., vol. 42, pp. 911-916, Feb./Mar./Apr. 1994.

[17] C.-N. Peng, H. Chen, J. T. Coffey, and R. G. C. Williams, "Rate gains in block-coded modulation systems with interblock memory," IEEE Trans. Inform. Theory, vol. 46, no. 3, pp. 851-868, May 2000.

[18] G. Hellstern, "Coded modulation with feedback decoding trellis codes," in Proc. IEEE ICC'93, Geneva, Switzerland, May 1993, pp. 1071-1075.

[19] J. Y. Wang and M. C. Lin, "On constructing trellis codes with large free distances and low decoding complexities," IEEE Trans. Commun., vol. 45, no. 9, pp. 1017-1020, Sept. 1997.

[20] Y. L. Ueng, C. J. Yeh, and M. C. Lin, "On trellis codes with delay processor and signal mapper," IEEE Trans. Commun., vol. 50, pp. 1906-1917, Dec. 2002.

[21] C. J. Yeh, Y. L. Ueng, M. C. Lin, and M. C. Lu, "Interblock memory for turbo coding," IEEE Trans. Commun., vol. 58, no. 2, Feb. 2010.

[22] G. Ungerboeck, "Channel coding with multilevel/phase signals," IEEE Trans. Inform. Theory, vol. IT-28, no. 1, pp. 55-67, Jan. 1982.

[23] G. Pottie and D. Taylor, "Multilevel codes based on partitioning," IEEE Trans. Inform. Theory, vol. 35, pp. 87-98, Jan. 1989.

[24] X. Li and J. A. Ritcey, "Bit-interleaved coded modulation with iterative decoding," IEEE Commun. Letters, vol. 1, no. 6, pp. 77-79, Nov. 1997.

[25] J. L. Ramsey, "Realization of optimum interleavers," IEEE Trans. Inform. Theory, vol. IT-16, pp. 338-345, 1970.


[26] G. Caire, G. Taricco, and E. Biglieri, "Bit-interleaved coded modulation," IEEE Trans. Inform. Theory, vol. 44, pp. 927-946, May 1998.

[27] X. Li, A. Chindapol, and J. A. Ritcey, "Bit-interleaved coded modulation with iterative decoding and 8PSK signaling," IEEE Trans. Commun., vol. 50, no. 8, pp. 1250-1257, Aug. 2002.

[28] A. J. Felström and K. Sh. Zigangirov, "Time-varying periodic convolutional codes with low-density parity-check matrix," IEEE Trans. Inform. Theory, vol. 45, no. 6, pp. 2181-2191, Sept. 1999.

[29] D. Divsalar and F. Pollara, "Turbo codes for PCS applications," in Proc. IEEE ICC, Seattle, WA, June 1995, pp. 1064-1070.

[30] S. Benedetto and G. Montorsi, "Unveiling turbo codes: some results on parallel concatenated coding schemes," IEEE Trans. Inform. Theory, vol. 42, no. 2, pp. 409-428, Mar. 1996.

[31] L. C. Perez, J. Seghers, and D. J. Costello, Jr., "A distance spectrum interpretation of turbo codes," IEEE Trans. Inform. Theory, vol. 42, no. 6, pp. 1698-1709, Nov. 1996.

[32] S. Benedetto and G. Montorsi, "Design of parallel concatenated convolutional codes," IEEE Trans. Commun., vol. 44, no. 5, pp. 591-600, May 1996.

[33] S. ten Brink, "Convergence behavior of iteratively decoded parallel concatenated codes," IEEE Trans. Commun., vol. 49, no. 10, pp. 1727-1737, Oct. 2001.

[34] L. Bahl, J. Cocke, F. Jelinek, and J. Raviv, "Optimal decoding of linear codes for minimizing symbol error rate," IEEE Trans. Inform. Theory, vol. 20, pp. 284-287, Mar. 1974.

[35] J. Hagenauer, E. Offer, and L. Papke, "Iterative decoding of binary block and convolutional codes," IEEE Trans. Inform. Theory, vol. 42, no. 2, pp. 429-445, Mar. 1996.

[36] P. Robertson, E. Villebrun, and P. Hoeher, "A comparison of optimal and suboptimal MAP decoding algorithms operating in the log domain," in Proc. IEEE ICC'95, 1995, pp. 1009-1013.

[Figure: serial concatenation of a binary turbo encoder, a multilevel delay processor, and a signal mapper.]

Fig. 1. Encoding structure of binary turbo coding with interblock memory.

Fig. 2. Encoding structure of T with the parameters given in Example 1.


[Figure: BER versus Eb/N0 curves from 0.6 to 2.4 dB for T with IDSC, T with IDAC (NIDAC=3), and the conventional turbo code C.]

Fig. 3. Simulation and estimation results for the 4-state binary T (NI=10 for IDAC, NI=30 for IDSC) and the conventional turbo code C with K=1024 (NI=30). The interleaver used is an S-random interleaver with S = 18 and K=1024.

[Figure: bit arrangement across blocks t-1 through t+2, showing how the subsequences v1, v2, and v3 are drawn from consecutive blocks.]

Fig. 5. Relation among the sequences s, η, and a for Construction B.



[Figure: BER versus Eb/N0 curves from 3.4 to 4.4 dB comparing Constructions A and B under IDSC and IDAC (NIDAC=3, K=2048) with the TTCM of [6] at K=8192.]

Fig. 6. Simulation results for the 4-state T and the 4-state TTCM [6]. All results are based on NI=18. The interleavers used are S-random interleavers with S ≫ 1.

[Figure: EXIT curves for the RSC1 and RSC2 constituent decoders of T at Eb/N0 = 3.24 dB and of the original TTCM at Eb/N0 = 3.55 dB, plotted as Ie1(a(t)) [Ia2(a(t))] versus Ia1(a(t)) [Ie2(a(t))].]

Fig. 7. EXIT charts of a(t) for the 4-state T (Construction B) based on IDSC and for the 4-state TTCM [6].


[Figure: interblock EXIT curves — decoding a(t+1) with input Ia2(v1(t)) and output Ie2(v2(t+1), v3(t+1)), and decoding a(t) with input Ia1(v2(t+1), v3(t+1)) and output Ie1(v1(t)), plotted over Ia2(v1(t)) [Ie1(v1(t))].]

Fig. 8. EXIT charts of a(t) and a(t + 1) for the 4-state T (Construction B) based on IDAC at Eb/N0 = 3.08 dB.

[Figure: BER versus Eb/N0 curves from 3.0 to 3.9 dB for the 4-state T (Construction B) with K=524288 and the original 4-state TTCM.]

Fig. 9. Simulation results for the 4-state T (Construction B) and the 4-state and 16-state original TTCM [6]. All results are based on NI=18. The interleavers used are S-random interleavers with S ≫ 1.
