
Error Control Codes

Tarmo Anttalainen, email: [email protected]

01.02.2002

Abstract: This paper gives a brief introduction to error control coding. It introduces block codes, convolutional codes and trellis coded modulation (TCM). Only binary codes are considered in this brief introduction.

Contents

Introduction
1. Block Codes
1.1. Binary Block Codes
1.1.1. Minimum or Free Hamming Distance
1.1.2. Syndromes
1.1.3. Error detection
1.1.4. Weight Distribution
1.1.5. Probability of undetected error
1.1.6. Error Correction
1.1.7. Standard Array
1.1.8. Syndrome Decoding
2. Convolutional Codes
2.1. Encoder Description
2.2. Convolutional Encoding and Decoding
2.3. Recursive systematic convolutional code
3. Trellis Coded Modulation
3.1. Encoder Description
3.2. Mapping by Set Partitioning
4. Conclusions
Index
References


Introduction

The history of error control codes began in 1950, when a class of single-error-correcting block codes was introduced by Hamming [7, p 399]. The correcting capability of Hamming codes is quite weak, but they are still used in many applications, such as TV teletext and Bluetooth. During the 1960s convolutional codes were introduced, and Viterbi decoding made them practical to implement. In the beginning of the 1980s trellis coded modulation (TCM), which combines convolutional codes and modulation, was invented. With the help of TCM the data rate of voice-band modems has increased from 9.6 kbit/s to 33.6 kbit/s. All three error control methods are introduced in this paper. The latest major invention in the area of error control is the Turbo code, but it is not discussed here.

1. Block Codes

Block codes are described by two parameters: n, the length of the code word, and k, the number of information symbols encoded into each code word. The redundant information of n-k symbols is used for Forward Error Correction (FEC), or for error detection if Automatic Repeat Request (ARQ), also called Backward Error Correction (BEC), is in use. How codewords are generated for each set of k information bits is defined by the generator polynomial or generator matrix of the code.

1.1. Binary Block Codes

Let V be the set of all possible combinations of n symbols (in the binary case, bits) xi, called n-tuples, where i = 1, ..., n [2]. In the binary case these symbols xi take the value 0 or 1.

V = {x1, x2, . . . , xn} (1.1.1)

We call a subset C of V a code, and the selected n-tuples of C we call codewords. We use M for the number of code words in C. Note that error control requires redundancy, which means that not all possible combinations of n bits (the whole of V) are used as code words. If all code words have a constant length n, we speak of a block code with block length n.

For a certain code we select M = 2^k vectors of V as code words. For the transmission of k information bits we need M codewords, each representing one of the possible sets of k information bits.

Block codes encode k information bits into n-bit codewords. Encoding is done block by block, and each block is independent of the preceding and following blocks. There is a one-to-one relationship between each set of k information bits and one of the codewords.

Code Rate

We write a block code as an (n, k) code and define the code rate of a block code as

Rc = k/n (1.1.2)

The code rate is always smaller than one; it gives the fraction of information in the transmitted data, and 1 - Rc gives the fraction of redundancy in the transmitted data. In many error correction codes in use the code rate is of the order of 0.5 to make the performance good enough. If only error detection is needed, code rates close to 1 give adequate performance.

Hamming Distance

The Hamming distance d(x, y) of two code words x and y is the number of places in which they differ [3]. For example, if c1 = 10101 and c2 = 01100, then d(c1, c2) = d(10101, 01100) = 3.
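As an illustration (my addition, not from the original paper), the distance can be computed position by position; the function below is a minimal sketch:

```python
def hamming_distance(x: str, y: str) -> int:
    """Number of positions in which two equal-length words differ."""
    assert len(x) == len(y)
    return sum(a != b for a, b in zip(x, y))

print(hamming_distance("10101", "01100"))  # -> 3, as in the example above
```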

1.1.1. Minimum or Free Hamming Distance

The minimum Hamming distance, dfree, is the smallest number of places in which any two code words of a code differ [1; 3]:

dfree = min d(c1,c2) (1.1.3)

where c1 and c2 range over all distinct code words of the code. The smallest of all possible distances is taken as the minimum or free distance. The free distance plays an important role in error control coding because it tells the minimum number of errors that may change one code word into another.
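A brute-force sketch of this definition (my addition); the codeword list used here is that of the (5, 2) code appearing later in Example 1.1.7:

```python
from itertools import combinations

def free_distance(codewords):
    """Smallest pairwise Hamming distance over all distinct codewords."""
    return min(sum(a != b for a, b in zip(c1, c2))
               for c1, c2 in combinations(codewords, 2))

# The four codewords of the (5, 2) code used in Example 1.1.7:
print(free_distance(["00000", "01011", "10101", "11110"]))  # -> 3
```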

Systematic Codes

For systematic codes the first bits of the transmitted code word equal the information bits; that is, the first k bits of the n-bit code word contain the information bits as they are, and the remaining n-k bits are redundant bits for error control.

Error Correction and Detection Capability of Block Codes

When a code word in error is received, the task of the error correcting decoder is to find the codeword that was most probably transmitted. For that we assume that the code word with the smallest Hamming distance to the received word is the best choice. This is a relevant assumption because, in an operational system, a small number of errors occurs with higher probability than a larger number.

If the received word equals one of the error-free code words, the decoder is not able to correct, or even detect, that errors have occurred. Generally, if t errors occur, the decoder is always able to correct them if

dfree ≥ 2t + 1 (1.1.4)

Sometimes error correction is possible although the inequality above is not satisfied, but t-error correction is not guaranteed if dfree < 2t + 1. A code is able to detect an error if the received word is not one of the code words. This is the case if fewer errors than the minimum distance have occurred. Up to l bit errors are always detected if

dfree ≥ l + 1 (1.1.5)

If l errors occur and Equation 1.1.5 holds, it is still certain that the code word has not changed into another code word, because all code words differ in more than l bit positions, i.e., dfree is larger than l.


Structure of Linear Block Codes

Code words of linear block codes are n-tuples of elements that in the binary case are from GF(2), i.e., bits with value 0 or 1. The general properties of linear codes are, according to the definition [3, p 46]:
• the sum of two codewords is another codeword;
• the all-zero vector is one of the code words (the sum of any codeword and itself).

Hamming Weight and Free Distance

The Hamming weight w(c) of a codeword c is the number of nonzero places or components in the code word [3, p 46]. We saw above that the free distance dfree is an important measure when we study the error control capability of a code. Let c be a code word of a linear code; naturally c - c (or c + c) is the all-zero code word. By evaluating all non-zero code words we get the Hamming weight w(c) of each code word, i.e., the number of its non-zero places. Let us take two binary codewords X and Z. The Hamming distance between these two codeword vectors is [5, p 481]

d(X, Z) = w(X+Z) (1.1.6)

For example, if X = [1010] and Z = [1100], then X + Z = [0110], where the words have been added modulo 2 bit by bit. In the case of a linear code, X + Z = Y is also a codeword. The distance between X and Z therefore equals the weight of another codeword Y, i.e.,

d(X,Z) = w(X+Z) = w(Y) (1.1.7)

Thus when we calculate the distances between all pairs of codewords, we actually evaluate the weight of a third codeword, the sum of the two under study. When the code word Z is the all-zero vector, X + Z = X, and the evaluation gives the weights of all non-zero codewords. In the case of a linear code, every distance between two codewords equals the weight of some code word of the code, so the weights of the codewords give all the Hamming distances of the code. The free or minimum distance is then given by the minimum Hamming weight of the code:

dfree = min w(c) = w(c)min (1.1.8)

where c is any code word except the all-zero word and the minimum is taken over all such code words. Now we can rewrite Equations 1.1.4 and 1.1.5 as

dfree = w(c)min ≥ 2t + 1 (1.1.9)

dfree = w(c)min ≥ l + 1 (1.1.10)

where t is the number of errors that the code can always correct and l is the maximum number of errors that are always detected.
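For linear codes this shortcut is easy to exercise numerically. The sketch below (my addition, assuming NumPy) generates all codewords from a generator matrix and takes the minimum non-zero weight:

```python
import itertools
import numpy as np

def min_weight(G):
    """dfree of a binary linear code = minimum weight of a non-zero codeword."""
    k = G.shape[0]
    words = ((np.array(i) @ G) % 2
             for i in itertools.product([0, 1], repeat=k))
    return min(int(c.sum()) for c in words if c.any())

# Generator matrix of the (5, 3) code of Example 1.1.1 below:
G = np.array([[1, 0, 0, 1, 0],
              [0, 1, 0, 0, 1],
              [0, 0, 1, 1, 1]])
print(min_weight(G))  # -> 2: this simple code detects single errors only
```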

Matrix Representation of Linear Block Codes

Generally, linear block codes are defined by the generator matrix G of the code [3, p 417]:

G =
g11 g12 . . . g1n
g21 g22 . . . g2n
 .   .         .
gk1 gk2 . . . gkn     (1.1.11)

Page 5: Error Control Codes - Aalto · Tarmo Anttalainen Error Control Codes _____ _____ Tarmo Anttalainen Page 1 02/04/02

Tarmo Anttalainen Error Control Codes___________________________________________________________________________

________________________________________________________________________Tarmo Anttalainen Page 4 02/04/02

where each row is a row vector gi; for example, g1 = (g11, g12, . . . , g1n).

The matrix has k rows, which equals the number of information bits encoded into each codeword, and the number of columns n equals the length of the codewords.

Encoding

When the information vector is

i = [i1 i2 . . . ik] (1.1.12)

then the code words, c = [c1 c2 . . . cn], are given by

c = i G (1.1.13)

that is,

c = [i1 g11 + i2 g21 + · · · + ik gk1, i1 g12 + i2 g22 + · · · + ik gk2, . . . , i1 g1n + i2 g2n + · · · + ik gkn] (1.1.14)

Let us assume now that the information words and code words are binary, i.e., sequences of binary symbols, bits from GF(2). Any code word is a linear combination of the row vectors of G, because the positions of logical 1s in the information word define which rows are added to make up the codeword. The rows of the matrix have to be linearly independent; that is, there is no combination of rows whose sum is the all-zero word. Otherwise different information words would produce identical codewords. We can see the rows as a basis of a vector space, where each row represents the basis function of one dimension. The number of dimensions is the number of information bits, which equals the number of rows in the generator matrix. Codewords are vectors in this space.

Two codes are equivalent if and only if their generator matrices are related [3] by
1. column permutations, and
2. elementary row operations.
Equivalent codes have similar performance, but the sets of code words may be different: the set of code words of a code always equals a permuted set of the code words of an equivalent code.

The code itself is not changed (the codes are equal, i.e., the set of codewords remains the same) under elementary row operations. The elementary row operations on a generator matrix are as follows [3]:
1. interchange of any two rows;
2. multiplication of any row by a nonzero field element (only 1 in the binary case);
3. replacement of any row by the sum of itself and (a multiple of) any other row.

Elementary row operations change the mapping of information words to code words, but the performance of the code remains the same because the set of code words is unchanged.

Any generator matrix of an (n, k) linear code can be changed by row operations and column permutations into the systematic form [3, p 49] [6, p 417]:


G = [ Ik  P ] =
1 0 . . . 0   p11 p12 . . . p1,n-k
0 1 . . . 0   p21 p22 . . . p2,n-k
.  .          .
0 0 . . . 1   pk1 pk2 . . . pk,n-k     (1.1.15)

where Ik is the k × k identity matrix and P is a k × (n-k) matrix that determines the n-k redundant bits, or parity check bits. Every linear code has an equivalent systematic code [3, p 50]. A generator matrix in systematic form generates a systematic linear block code in which the first k bits of each code word are identical to the information bits and the remaining n-k bits of each code word are linear combinations of the k information bits.

Example 1.1.1

Let us take the generator matrix of a simple systematic binary linear code [3].

G =
1 0 0 1 0
0 1 0 0 1
0 0 1 1 1

If the information vector is i = [0 1 1], the encoded codeword becomes

c = [0 1 1] ·
1 0 0 1 0
0 1 0 0 1
0 0 1 1 1
= [0 1 1 1 0]
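A quick numerical check of this example (my addition, assuming NumPy):

```python
import numpy as np

G = np.array([[1, 0, 0, 1, 0],
              [0, 1, 0, 0, 1],
              [0, 0, 1, 1, 1]])
i = np.array([0, 1, 1])

c = (i @ G) % 2        # encoding c = iG over GF(2)
print(c)               # -> [0 1 1 1 0], as computed above
```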

Singleton Bound

To derive an upper bound on dfree, we can put any linear block code into systematic form. The maximum number of non-zero elements in any row of P cannot exceed n-k. Then the number of non-zero elements in any row of G cannot exceed 1 + n - k. Since all rows of G are valid codewords [5; 3],

dfree = w(c)min ≤ 1 + n - k (1.1.16)

This is known as the Singleton bound, and we can say without any knowledge of the generator or parity check matrices that the free distance can never exceed 1 + n - k. Note that Equation 1.1.16 usually gives a very optimistic value for dfree. One code that meets the Singleton bound is the binary repetition code, for which [1, p 396]

0 → c0 = [0, 0, …, 0]
1 → c1 = [1, 1, …, 1]

In this case dfree = d(c0, c1) = n = 1 + n - k, since k = 1. Codes that meet the Singleton bound are called maximum distance separable (MDS) codes, and the repetition code is the only binary MDS code. The non-binary Reed-Solomon codes are also MDS codes.


Parity Check Matrix

The decoder in the receiver checks whether the received codeword is an original codeword or whether it is in error. For this it needs the parity check matrix H, which gives for any error-free codeword

c HT = 0 (1.1.17)

The decoder computes according to (1.1.17), and if the result is the all-zero vector 0, most probably no errors have occurred. Naturally H must be compatible with the G used for generation of the codewords.

To derive the parity check matrix we start with the generator matrix in systematic form. The encoder uses the generator matrix to compute the codewords as

c = i G = i [ Ik  P ] = [ i  iP ] = [ck cn-k] (1.1.18)

where [ck cn-k] represents a codeword in systematic form divided into two parts: the first k bits of the codeword are identical to the information bits, and the second part is the parity check section containing n-k bits. For a systematic codeword i = ck, and we can see from the equation above that

cn-k = i P = ck P (1.1.19)

where ck represents the first k bits of the codeword. Now we may write

- ck P + cn-k = 0

which can be written in another form as

[ck cn-k] [ -P ; In-k ] = c [ -P ; In-k ] = 0 (1.1.20)

where [ -P ; In-k ] denotes the n × (n-k) matrix formed by stacking the k × (n-k) block -P on top of the (n-k) × (n-k) identity matrix In-k.

Comparing (1.1.17) with (1.1.20), we notice that we have obtained the transpose of the parity check matrix:

HT = [ -P ; In-k ] (1.1.21)

which has n rows and n-k columns. The parity check matrix itself we get in the form

H = [ -PT  In-k ] (1.1.22)

which has n-k rows and n columns. Note that -PT equals PT when we are dealing with binary codes, where the elements are from GF(2).

Because c HT = 0 must hold for all code words, and every row of G is a codeword, we get [3]

G HT = [ Ik  P ] [ -P ; In-k ] = -P + P = 0

where 0 is the k × (n-k) matrix in which all elements equal zero. The parity check matrix we have found is valid because it fulfills the requirement of Equation 1.1.17.

Example 1.1.2

The generator matrix of the (5, 3) systematic linear block code in Example 1.1.1 was:


G =
1 0 0 1 0
0 1 0 0 1
0 0 1 1 1
= [ I P ]

Now, according to (1.1.22), we may write the corresponding parity check matrix as

H = [ -PT  In-k ] =
1 0 1 1 0
0 1 1 0 1

This has n-k = 2 rows and n = 5 columns. To check a received code word, for example c = [0 1 1 1 0], which corresponds to the information vector i = [0 1 1], the decoder computes

c HT = [0 1 1 1 0] ·
1 0
0 1
1 1
1 0
0 1
= [0 0] = 0

The code word is detected to be error free. If we computed G HT we would get the zero matrix.
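The whole derivation can be condensed into a few lines (my addition, assuming NumPy): build H from the P part of a systematic G and verify Equation 1.1.17:

```python
import numpy as np

G = np.array([[1, 0, 0, 1, 0],
              [0, 1, 0, 0, 1],
              [0, 0, 1, 1, 1]])
k, n = G.shape
P = G[:, k:]                                      # G = [I P]

H = np.hstack([P.T, np.eye(n - k, dtype=int)])    # H = [-P^T I] (= [P^T I] in GF(2))

c = np.array([0, 1, 1, 1, 0])
print((c @ H.T) % 2)     # -> [0 0]: the received word passes the parity check
print((G @ H.T) % 2)     # -> zero matrix, so H is compatible with G
```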

1.1.2. Syndromes

As we saw above, the parity check matrix H is directly related to the generator matrix G used in the encoder. The decoder multiplies the received word by the transpose of the parity check matrix, and if the received word is one of the error-free codewords c, it gets:

c HT = c [ -P ; In-k ] = [0 . . . 0] (1.1.23)

If the received word is in error (and not equal to any of the error-free code words), we write it as c'. Then multiplication according to (1.1.23) gives at least one non-zero element in the resulting vector. We call this vector, which has as many elements as the parity-check matrix has rows (or HT has columns), the syndrome s, given by

s = c’HT (1.1.24)

where c' now represents a received word, which may be error free or in error. If the syndrome s = 0, i.e., all elements of the syndrome equal zero, the received word is error free, or errors have changed it into another codeword, in which case they are undetectable. Otherwise errors are indicated by the presence of non-zero elements in s. For error detection it is enough to check whether the syndrome contains one or more non-zero elements.

Error correction can also be based on the syndrome. To develop the decoding method, we introduce the n-bit error vector e, whose nonzero elements mark the positions of transmission errors in c'. For instance, if the transmitted codeword is c = [1 0 1 1 0] and the received word in error is c' = [1 0 0 1 1], then e = [0 0 1 0 1]. In general,

c’ = c + e (1.1.25)


and in the case of binary codes we can write (in the finite field GF(2) the additive inverse of 1 is -1, and -1 = 1 and -0 = 0)

c = c’ - e = c’ + e (1.1.26)

We can think of this as a second error in the same bit location cancelling the original error, so that the resulting code word is the original one. If we now substitute (1.1.25) into (1.1.24), we obtain

s = c'HT = (c + e)HT = cHT + eHT = eHT (1.1.27)

We see that the syndrome depends only on the error pattern; it does not depend on the transmitted codeword.

Example 1.1.3

Let us take a systematic (7, 4) Hamming code defined by the generator matrix [1, p 395]

G = [ I | P ] =
1 0 0 0 1 1 0
0 1 0 0 0 1 1
0 0 1 0 1 1 1
0 0 0 1 1 0 1

The corresponding parity check matrix becomes

H = [ -PT  I ] =
1 0 1 1 1 0 0
1 1 1 0 0 1 0
0 1 1 1 0 0 1

We see that all columns of H are different and contain at least one non-zero element. This is a characteristic of Hamming codes, which produce a unique syndrome for every single-error case.

The transpose of H is:

HT =
1 1 0
0 1 1
1 1 1
1 0 1
1 0 0
0 1 0
0 0 1

The syndrome is now given by s = c'HT = e HT. Decoding with the help of the syndrome is discussed in Section 1.1.8.

1.1.3. Error detection

A linear block code detects all error patterns with a smaller number of errors than dfree [2]. If e ≠ 0 is itself a code word, the errors are not noticed. There are 2^k - 1 undetectable error patterns (the same as the number of non-zero codewords), but 2^n - 1 non-zero error patterns. Hence the number of detectable error patterns is


2^n - 1 - (2^k - 1) = 2^n - 2^k

Usually the number of undetectable error patterns is much smaller than the total number of possible error patterns. For example, for the (7, 4) Hamming code in Example 1.1.3 there are 2^4 - 1 = 15 undetectable error patterns and 2^7 - 2^4 = 112 detectable error patterns [1].

Cyclic Redundancy Check (CRC) is the most popular code designed specially for error detection. It is used together with Automatic Repeat Request (ARQ) protocols, where errors are detected and frames in error are retransmitted. This is a much more efficient error control scheme than the Forward Error Correction (FEC) we discuss here.

Example 1.1.4

Let us assume that the bit error rate in the channel is BER = 1·10^(-6) and frames, each containing 1000 bits, are transmitted. The probability of a given number of errors we get with the help of the Poisson distribution:

P(i) = (m^i / i!) e^(-m)

where i is the number of errors in a frame whose probability we want to find, and m is the average number of errors per frame, which in our case is

m = 1000 · 1·10^(-6) = 0.001

Now the probabilities P(i) for i errors are:

P(0) = e^(-0.001) = 0.9990
P(1) = 0.001 · e^(-0.001) ≈ 1·10^(-3)
P(2) = (0.001^2 / 2) · e^(-0.001) ≈ 5·10^(-7)
P(3) = (0.001^3 / 6) · e^(-0.001) ≈ 1.7·10^(-10)

etc. We see from the results that approximately one frame in a thousand contains one error, and roughly one frame in two million has more than one error (P(2) + P(3) + P(4) + …).

Error correction of a single error requires information about which of the 1000 bits in the frame is in error; 10 redundant bits (2^10 = 1024) are enough to tell the location of the bit in error. The residual frame error probability is then approximately 5·10^(-7) (the probability of more than one error in a frame) when all single-error cases are corrected.

Error detection of a single error requires only one parity bit. A parity bit is able to detect all single-bit errors and any odd number of errors. In this case the residual frame error probability is approximately ½·10^(-6) = 5·10^(-7), roughly the probability of a frame with exactly two errors. The performance of error detection using only a single redundant bit is thus even slightly better than that of error correction with 10 redundant bits!
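These probabilities are easy to reproduce (my addition):

```python
import math

m = 1000 * 1e-6                 # average number of errors per 1000-bit frame

def poisson(i: int, m: float) -> float:
    """Probability of exactly i errors in a frame."""
    return m**i / math.factorial(i) * math.exp(-m)

for i in range(4):
    print(i, poisson(i, m))     # -> 0.999, 1.0e-3, 5.0e-7, 1.7e-10

print(1 - poisson(0, m) - poisson(1, m))   # P(more than one error) ~ 5e-7
```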

However, there are applications that do not tolerate the variable delay caused by ARQ, and for them FEC is the only choice. From now on we concentrate on error correction only.


1.1.4. Weight Distribution

Consider a block code C and let Ai be the number of codewords of weight i. The set {A0, A1, …, An} is called the weight distribution of the code C. The weight distribution can be expressed as a weight enumerator polynomial [1, p 397]

A(z) = A0 z^0 + A1 z^1 + … + An z^n (1.1.28)

Example 1.1.5:

The codewords of the Hamming code in Example 1.1.3 are:

i          c                weight
0 0 0 0    0 0 0 0 0 0 0    0
0 0 0 1    0 0 0 1 1 0 1    3
0 0 1 0    0 0 1 0 1 1 1    4
0 0 1 1    0 0 1 1 0 1 0    3
0 1 0 0    0 1 0 0 0 1 1    3
0 1 0 1    0 1 0 1 1 1 0    4
0 1 1 0    0 1 1 0 1 0 0    3
0 1 1 1    0 1 1 1 0 0 1    4
1 0 0 0    1 0 0 0 1 1 0    3
1 0 0 1    1 0 0 1 0 1 1    4
1 0 1 0    1 0 1 0 0 0 1    3
1 0 1 1    1 0 1 1 1 0 0    4
1 1 0 0    1 1 0 0 1 0 1    4
1 1 0 1    1 1 0 1 0 0 0    3
1 1 1 0    1 1 1 0 0 1 0    4
1 1 1 1    1 1 1 1 1 1 1    7

The number of zero-weight codewords is A0 = 1, the number of weight-one codewords is A1 = 0, etc. That is, A0 = 1, A1 = 0, A2 = 0, A3 = 7, A4 = 7, A5 = 0, A6 = 0, A7 = 1. Hence the weight enumerator polynomial becomes [1, p 397]

A(z) = 1 + 7z^3 + 7z^4 + z^7
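The distribution can be enumerated directly from the generator matrix of Example 1.1.3 (a sketch of my own, assuming NumPy):

```python
import itertools
from collections import Counter
import numpy as np

G = np.array([[1, 0, 0, 0, 1, 1, 0],
              [0, 1, 0, 0, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 1],
              [0, 0, 0, 1, 1, 0, 1]])

weights = Counter(int(((np.array(i) @ G) % 2).sum())
                  for i in itertools.product([0, 1], repeat=4))
print(sorted(weights.items()))   # -> [(0, 1), (3, 7), (4, 7), (7, 1)]
```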

1.1.5. Probability of undetected error

The code cannot detect that errors have occurred if the received word happens to equal one of the codewords, i.e., if c1 + e = c2. The probability of undetected error is then [1, p 397]

Pe(U) = P(e is a nonzero codeword) = Σ(i=1..n) Ai P(w(e) = i) (1.1.29)

The error probability P(w(e) = i) depends on the coding channel (the portion of a communication system seen by the coding system). The simplest coding channel is the binary symmetric channel (BSC), where the probability that a received bit c'i differs from the transmitted codeword bit ci (the bit error probability, BER) is

P(c’i ≠ ci) = p = 1 - P(c’i = ci) (1.1.30)

For a BSC


P(w(e) = i) = p^i (1-p)^(n-i) (1.1.31)

which is the probability of one specific error pattern of weight i, and hence the probability of undetected error becomes

Pe(U) = Σ(i=1..n) Ai p^i (1-p)^(n-i) (1.1.32)

where Ai is the number of codewords containing i ones.

Example 1.1.6:

The Hamming code of Examples 1.1.3 and 1.1.5 has an undetected error probability of

Pe(U) = 7p^3 (1-p)^4 + 7p^4 (1-p)^3 + p^7

For a raw channel bit error rate of p = 10^(-2), we get Pe(U) ≈ 7 × 10^(-6). Hence the undetected error rate can be very small even for a fairly simple block code [1, p 397].
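Numerically (my addition):

```python
def p_undetected(p: float) -> float:
    """Pe(U) of the (7, 4) Hamming code, Equation 1.1.32."""
    return 7 * p**3 * (1 - p)**4 + 7 * p**4 * (1 - p)**3 + p**7

print(p_undetected(1e-2))   # -> ~6.9e-06, i.e. about 7 x 10^-6
```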

1.1.6. Error Correction

As explained in Section 1.1.1, a linear block code can correct all error patterns of t or fewer errors if

dfree ≥ 2t + 1 (1.1.33)

Then the number of errors that is always corrected successfully is

t ≤ ⌊(dfree - 1) / 2⌋ (1.1.34)

where ⌊x⌋ stands for the largest integer contained in x.

A code is usually capable of correcting many error patterns of more than t errors. For a BSC, the probability of codeword error, i.e., of unsuccessful error correction, is [1, p 398]

P(E) ≤ 1 - P(t or fewer errors) = 1 - Σ(i=0..t) (n choose i) p^i (1-p)^(n-i) (1.1.35)

1.1.7. Standard Array

One conceptually simple method for decoding any linear block code is standard array decoding. We construct a decoding table, a standard array, as follows [1, p 398]:
1. Write all 2^k codewords in the first row, beginning with the all-zero code word c0. This all-zero codeword also represents the all-zero error pattern.
2. From the remaining n-tuples (not yet written into the table), select an error pattern e2 of weight 1 and place it in the first column. Under each codeword in the other columns write ci + e2, i = 1, ..., 2^k - 1.
3. Select a minimum-weight error pattern e3 from the remaining unused n-tuples and place it in the first column under c0 = 0. Under each codeword put ci + e3, i = 1, ..., 2^k - 1.
4. Repeat step 3 until all n-tuples are used.

Note that every n-tuple (n-bit word) appears once and only once in the standard array.


Table 1.1.1 Standard array.

c1 = 0       c2             c3             . . .   c2^k
e2           c2 + e2        c3 + e2        . . .   c2^k + e2
e3           c2 + e3        c3 + e3        . . .   c2^k + e3
 .            .              .                      .
e2^(n-k)     c2 + e2^(n-k)  c3 + e2^(n-k)  . . .   c2^k + e2^(n-k)

Each row of the standard array consists of all received words that would result from the corresponding error pattern in the first column. Each row is called a coset, and the first, left-most word, the error pattern, is called the coset leader. The table contains all possible received n-bit words, error-free words and words in error, and the coset leader gives the error vector.

Example 1.1.7

Let us construct the standard array for the (5, 2) systematic code with the generator matrix

G =
1 0 1 0 1
0 1 0 1 1
= [ I P ]

We can easily see that the minimum distance dfree = 3: the code has only 4 different code words and the minimum non-zero weight is 3. The standard array is given in Table 1.1.2.

Table 1.1.2 Standard array for the (5, 2) code [6, p 447].

Code words:
0 0 0 0 0   0 1 0 1 1   1 0 1 0 1   1 1 1 1 0
0 0 0 0 1   0 1 0 1 0   1 0 1 0 0   1 1 1 1 1
0 0 0 1 0   0 1 0 0 1   1 0 1 1 1   1 1 1 0 0
0 0 1 0 0   0 1 1 1 1   1 0 0 0 1   1 1 0 1 0
0 1 0 0 0   0 0 0 1 1   1 1 1 0 1   1 0 1 1 0
1 0 0 0 0   1 1 0 1 1   0 0 1 0 1   0 1 1 1 0
1 1 0 0 0   1 0 0 1 1   0 1 1 0 1   0 0 1 1 0
1 0 0 1 0   1 1 0 0 1   0 0 1 1 1   0 1 1 0 0

The coset leaders consist of the all-zero error pattern, all error patterns of weight 1, and two error patterns of weight 2. There are many more double-error patterns, but there is room in the table for only two of them. In fact the number of rows equals the number of syndromes, which is 2^(n-k) = 2^3 = 8. Double-error patterns selected according to the procedure explained above give syndromes that differ from the syndromes of the single-error patterns. We might choose other double-error patterns instead of the ones written in the table above, but if we follow the procedure, their syndromes would still be unique and the table would still contain each n-tuple exactly once.

Note that the standard array contains all possible received words, error-free codewords and words in error: all binary words of length n. The number of rows is 2^(n-k) and the number of columns is 2^k, and hence the number of words in the table is 2^(n-k) · 2^k = 2^n, i.e., all n-tuples are present. If the standard array were stored in the decoder, the decoder would check to which column the received word belongs and take the uppermost word in that column as the corrected code word. The coset leader represents the most probable error pattern. If the actual error pattern does not equal a coset leader, erroneous decoding results. However, it is usually not reasonable to store the standard array, because in the case of efficient (long) codes it is very large. A decoding principle that requires less memory, syndrome decoding, is presented below.
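The construction procedure above is short enough to sketch in code (my addition, assuming NumPy); note that the choice among equal-weight coset leaders is not unique, so the double-error rows may differ from Table 1.1.2:

```python
import itertools
import numpy as np

G = np.array([[1, 0, 1, 0, 1],
              [0, 1, 0, 1, 1]])
n, k = 5, 2

codewords = [tuple((np.array(i) @ G) % 2)
             for i in itertools.product([0, 1], repeat=k)]

used = set(codewords)
rows = [codewords]                      # first row: the codewords themselves
for e in sorted(itertools.product([0, 1], repeat=n), key=sum):
    if e in used:                       # pick unused n-tuples of minimum weight
        continue
    coset = [tuple((np.array(e) + np.array(c)) % 2) for c in codewords]
    used.update(coset)
    rows.append(coset)

for r in rows:                          # 2^(n-k) = 8 rows, every 5-tuple once
    print(["".join(map(str, w)) for w in r])
```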

1.1.8. Syndrome Decoding

The syndrome has as many elements as the codewords have redundant bits, that is, n-k. This equals the number of columns in the submatrix P of G and the number of rows of H (or the number of columns of HT). This restricts the number of different syndromes to 2^(n-k).

The number of different detectable error vectors equals the total number of different n-bit words minus the number of code words, that is, 2^n - 2^k. This is usually much larger than the number of syndromes, so the syndromes are not unique for all possible error patterns. For example, in the case of the (7, 4) Hamming code the number of detectable error patterns is 2^n - 2^k = 112, while the number of syndromes is only 2^(n-k) = 8. This means that all 112 error cases are detected (all that are not identical to code words), but only 7 of them, one per non-zero syndrome, can be corrected by the syndrome decoder.

The error patterns that we selected as coset leaders in the standard array give unique syndromes, and these are used for error correction as follows:
1. Compute the syndrome s = c'HT.
2. Locate the coset leader el for which s = el HT.
3. Decode c' into cc = c' + el.
The calculation in step 2 can be done by using a simple look-up table, as shown in Figure 1.1.1.

[Figure 1.1.1 Table-lookup decoder: the received code word c' feeds a syndrome calculator; the syndrome s indexes a table of stored error patterns e, and the decoder outputs the corrected word c' + e [5, p 485].]

When e corresponds to a single error in the jth bit of the code word, s is identical to the jth row of HT. Therefore, to provide a distinct syndrome for each single-error pattern and for the error-free pattern, the rows of HT (or the columns of H) must all be different (every pair of rows contains two linearly independent vectors), and each of them must contain at least one nonzero element.

Example 1.1.8

For the (7, 4) Hamming code of the previous examples we may compute the syndromes for all single-error vectors [1 0 0 0 0 0 0], [0 1 0 0 0 0 0], . . ., [0 0 0 0 0 0 1] according to s = e HT, and we get the table below. There are 2^(n-k) - 1 = 2^3 - 1 = 7 single-error patterns, each corresponding to one of the syndromes. The syndromes equal the rows of HT.

Table 1.1.3 Syndromes for the (7, 4) Hamming code.

e                  s
0 0 0 0 0 0 0      0 0 0
1 0 0 0 0 0 0      1 1 0
0 1 0 0 0 0 0      0 1 1
0 0 1 0 0 0 0      1 1 1
0 0 0 1 0 0 0      1 0 1
0 0 0 0 1 0 0      1 0 0
0 0 0 0 0 1 0      0 1 0
0 0 0 0 0 0 1      0 0 1

We see that all single-error patterns have unique syndromes, and this code is able to correct all of them. Let us now suppose that the all-zero code word is transmitted and the received word contains two errors, so that e = [1 0 0 1 0 0 0]. The decoder calculates s = e HT = [0 1 1], and the table gives the error pattern e = [0 1 0 0 0 0 0], assuming that the second bit has been in error. The decoder inverts the second bit, which actually was not in error, and the decoded word then includes three errors, one of them generated by the erroneous correction in the decoder. This code can decode properly only single-error cases.
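A minimal syndrome decoder for this code (my addition, assuming NumPy), which also reproduces the double-error failure described above:

```python
import numpy as np

H = np.array([[1, 0, 1, 1, 1, 0, 0],
              [1, 1, 1, 0, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])
n = H.shape[1]

table = {}                        # syndrome -> single-bit error pattern
for j in range(n):
    e = np.zeros(n, dtype=int)
    e[j] = 1
    table[tuple((e @ H.T) % 2)] = e

def decode(r):
    """Correct (at most) a single-bit error via the syndrome look-up table."""
    s = tuple((r @ H.T) % 2)
    return r if not any(s) else (r + table[s]) % 2

r = np.array([1, 0, 0, 1, 0, 0, 0])   # two errors on the all-zero codeword
print(decode(r))   # -> [1 1 0 1 0 0 0]: a valid codeword, but three bits wrong
```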

As we saw, the number of nonzero syndromes, 2^(n-k) - 1, defines how many different error patterns we can correct in the decoder, and this depends on the number of redundant bits, n-k, of the code. In an n-bit word the number of different j-error patterns is

(n choose j) = n! / (j! (n-j)!) (1.1.36)

Hence, to correct up to t errors, k and n must satisfy

2^(n-k) - 1 ≥ (n choose 1) + (n choose 2) + · · · + (n choose t) (1.1.37)

where the left-hand side equals the number of different non-zero syndromes and the right-hand side gives the number of all error patterns of up to t errors. It simply states that each correctable error pattern must have a unique syndrome. In the case of single-error-correcting codes, such as the Hamming code, the equation reduces to

2^(n-k) - 1 ≥ n

Equation 1.1.37 gives the relationship between the block length n, the number of parity bits n-k, and the number of correctable error patterns.


2. Convolutional Codes

Unlike in the case of block codes, the input symbols (usually binary, one bit per symbol) of convolutional codes are not grouped into blocks; instead, each input bit influences a span of output bits. When we say that a certain encoder produces an (n, k, K) convolutional code, we mean that for k input bits, n output bits are generated, giving a code rate of k/n, and K is the constraint length; the encoder's memory, measured in input symbols, is K-1. Convolutional encoding may be a continuous process, but in many applications subsequent data blocks are encoded independently. For example, in GSM each speech frame is encoded independently using a convolutional encoder.

2.1. Encoder Description

The encoder of a binary rate-1/n convolutional code can be seen as a finite-state machine (FSM). The encoder consists of a ν-stage shift register connected to modulo-2 adders and a multiplexer that converts the adder outputs into a serial data stream. The constraint length K of a convolutional code is defined as the number of shifts through the FSM over which a single input data bit can affect the encoder output. For an encoder having a ν-stage shift register the constraint length is

K = ν + 1 (2.1.1)

Figure 2.1.1 shows a simple rate-1/2 binary convolutional encoder with constraint length K = 3.

[Figure 2.1.1 Binary convolutional encoder, Rc = 1/2, K = 3: the input a is shifted into a register, and two modulo-2 adders form the outputs b(1) and b(2) [1].]

The binary convolutional encoder can be generalized to a rate-k/n binary convolutional code by using k shift registers and n modulo-2 adders. For a rate-k/n code, the k-bit information vector al = {al(1), . . . , al(k)} is input to the encoder at time instant l and generates the n-bit code vector bl = {bl(1), . . . , bl(n)}, as shown in Figure 2.1.2. The first k-bit register stage is drawn with a dashed line to indicate that it is needed only for multiplexing purposes (the other registers are needed to store previous input vectors; the present one need not necessarily be stored).


[Figure 2.1.2 General binary convolutional encoder for the CC(n, k, K) code: a Kk-stage shift register holds the k-bit information vectors al, and n modulo-2 adders form the n-bit output vector bl, according to [7, p 359].]

A convolutional encoder can be described by a set of impulse responses, {gi(j)}, where gi(j) is the jth output sequence b(j) that results from the ith input sequence a(i) = (1, 0, 0, 0, . . .). An impulse response can have a duration of at most K and has the form gi(j) = (gi,0(j), gi,1(j), gi,2(j), . . . , gi,K-1(j)). The {gi(j)} are sometimes called generator sequences. For the encoder in Figure 2.1.1 [1, p 400]

g(1) = (1, 1, 1)   g(2) = (1, 0, 1)
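These generator sequences translate directly into a small encoder (a sketch of my own; the output below matches the trellis example of Figure 2.1.6 further on):

```python
def conv_encode(bits, g=((1, 1, 1), (1, 0, 1))):
    """Rate-1/2, K = 3 feed-forward encoder of Figure 2.1.1.

    Each tap set g(j) is applied to (current input, previous, one before);
    the shift register starts in the all-zero state.
    """
    s1, s2 = 0, 0                       # the two memory stages (K - 1 = 2)
    out = []
    for a in bits:
        window = (a, s1, s2)
        out.append(tuple(sum(t * w for t, w in zip(gj, window)) % 2
                         for gj in g))
        s1, s2 = a, s1                  # shift the register
    return out

print(conv_encode([1, 0, 1, 0, 0]))
# -> [(1, 1), (1, 0), (0, 0), (1, 0), (1, 1)], i.e. 11 10 00 10 11
```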

Figure 2.1.3 shows a simple rate k/n = 2/3, constraint length K = 2 convolutional encoder. For this encoder the generator sequences are [1, p 401]:

g1(1) = (1, 1)   g1(2) = (0, 1)   g1(3) = (1, 1)
g2(1) = (0, 1)   g2(2) = (0, 1)   g2(3) = (0, 1)

where, for example, g1(2) = (0, 1) gives the sequence of second output bits for the input sequence (10, 00), i.e., a(1) = (1, 0) and a(2) = (0, 0).


[Figure 2.1.3 Binary convolutional encoder, Rc = 2/3, K = 2, with input blocks a = (a(1), a(2)) and output blocks b = (b(1), b(2), b(3)) [1, p 400].]

The jth output b(j), corresponding to the ith input sequence a(i), is the discrete convolution b(j) = a(i) * gi(j), where * denotes modulo-2 convolution. The time-domain convolution can be replaced by polynomial multiplication in a D-transform domain, so that

bi(j)(D) = a(i)(D) gi(j)(D)

where

a(i)(D) = Σ(k≥0) ai,k D^k is the ith input data polynomial,

bi(j)(D) = Σ(k≥0) bi,k(j) D^k is the jth output polynomial corresponding to the ith input, and

gi(j)(D) = Σ(k=0..K-1) gi,k(j) D^k is the associated generator polynomial.

The jth output sequence becomes

b(j)(D) = Σ(i=1..k) bi(j)(D) = Σ(i=1..k) a(i)(D) gi(j)(D)

The corresponding matrix representation is

[b(1)(D), . . . , b(n)(D)] = [a(1)(D), . . . , a(k)(D)] G(D)

where

G(D) =
g1(1)(D)  . . .  g1(n)(D)
   .               .
gk(1)(D)  . . .  gk(n)(D)

is the generator matrix of the code.

After multiplexing the outputs, the final codeword has the polynomial representation

b(D) = Σ(j=1..n) D^(j-1) b(j)(D^n)

Example 2.1.1

The generator matrix for the code in Figure 2.1.1 is


G(D) = [1 + D + D^2,  1 + D^2]

It tells us, for example, that the second bit of each output block is the sum of the present input bit and the input bit two time instants earlier. For the convolutional encoder in Figure 2.1.3 the generator matrix is

G(D) =
1+D   D   1+D
 D    D    D

which defines that, for example, the second output bit of each output block is generated as the sum of the second bit of the present input block and the first bit of the previous input block; see Figure 2.1.3.

Systematic convolutional codes are those where the first k encoder output sequences, b(1), . . . , b(k), are equal to the k encoder input sequences a(1), . . . , a(k), i.e., the first k bits of each output block are equal to the k-bit input block.

2.2. Convolutional Encoding and Decoding

A convolutional encoder is a finite-state machine (FSM), so its operation can be described by a state diagram and a trellis diagram. The state of the encoder is defined by the shift register contents. For a k/n code, the ith shift register contains νi information bits. The state of the encoder at time instant l is defined by all the bits stored in the shift registers:

σl = (a(1)l-1, . . . , a(1)l-ν1; . . . ; a(k)l-1, . . . , a(k)l-νk)

where, for example, a(1)l-1 is the first bit of the previous k-bit information block. For a rate 1/n code, the encoder state is

σl = (al-1; . . . ; al-ν)

The encoder's total memory size is

νTot = Σ(i=1..k) νi

and the total number of states is NS = 2^νTot.

The state diagram for the encoder of Figure 2.1.1 is shown in Figure 2.1.4. States are labeled S(i), where

i = Σ(j=1..νTot) cj 2^(j-1)

and cj is the content of the jth one-bit memory of the encoder (from left to right in Figure 2.1.1). The register contents are also shown in Figure 2.1.4.


[Figure 2.1.4 State diagram for the convolutional encoder of Figure 2.1.1, with states S(0) = 00, S(1) = 10, S(2) = 01 and S(3) = 11, and branches labeled input/output [1, p 403].]

In general, a k/n code has 2^k branches leaving each state. The branches are labeled a/b = (a(1), a(2), ..., a(k) / b(1), b(2), ..., b(n)), i.e., k-bit input / n-bit output. As an example, if the encoder of Figure 2.1.4 is in state S(1) = (10) and the input is a = 1, the encoder produces the output b = 01 and the next state will be S(3) = (11).

Convolutional codes are generated by linear adders, and they are linear codes: the sum of any two codewords is another codeword, and the all-zero sequence is one of the codewords. The weight distribution therefore provides information about the distance properties of the code. For performance analysis we may thus assume that the all-zero codeword was transmitted and analyze only the error events that make the decoder leave the all-zero state of the state diagram. For that we construct the modified state diagram by splitting the all-zero state into an input and an output state, as shown in Figure 2.1.5.

[Figure 2.1.5 Modified state diagram for the binary convolutional encoder of Figure 2.1.1: the all-zero state S(0) is split into an input and an output state, and the branches carry labels of the form D^2NL, DNL, NL, DL and D^2L [1].]

We label the branches D^i N^j L, where i is the number of ones in the encoder's output block and j is the number of ones in the input block corresponding to a particular transition. L acts as a counter of transitions.

Let XS be a variable that represents the accumulated weight of each path that enters state S. Then we can write equations that define how each state is reached from the other states:

X1 = D^2 NL X0in + NL X2


X3 = DNL X1 + DNL X3

X2 = DL X1 + DL X3

X0out = D^2 L X2

Solving this group of equations yields the transfer function

T(D, N, L) = X0out / X0in = D^5 N L^3 / (1 - DNL(1 + L)) = D^5 N L^3 + D^6 N^2 L^4 (1 + L) + D^7 N^3 L^5 (1 + L)^2 + . . .

The first term of the transfer function tells us that the shortest path leaving the zero state and re-entering it has a length of 3 hops and Hamming distance 5, which is also the minimum distance dfree.

The transfer function can be simplified if we are only interested in the distance properties of the code. Then we can set L = N = 1 and get

T(D) = D^5 + 2D^6 + 4D^7 + . . .

which gives the weight distribution.

Instead of the state diagram, a trellis diagram is often used to describe both the encoding and decoding processes of convolutional codes. To draw the trellis diagram for the encoder of Figure 2.1.1, we write all states in a column at each time instant J, as shown in Figure 2.1.6. Each state corresponds to one row of the trellis. Initially the encoder is in the all-zero state and only two branches are possible. The encoder generates 00 if the input bit is 0 and 11 if the input bit is 1, as seen in Figure 2.1.6.

[Figure 2.1.6 Trellis diagram for the binary convolutional encoder of Figure 2.1.1: the states S(0)–S(3) are drawn as rows at time instants J = 0, 1, 2, . . ., and each branch is labeled with the output bits of the corresponding state transition [1, p 405].]

The trellis explains encoding very clearly: we simply follow the path through the trellis given by the information sequence to be encoded. In Figure 2.1.6 the input sequence (1 0 1 0 0) is encoded as an example. It generates the output sequence (11 10 00 10 11), which we get by following the path corresponding to the given input sequence. Each input sequence has a unique path through the trellis.

We may now compare the transfer function derived above with the trellis diagram in Figure 2.1.6. We can easily see that there is a path corresponding to the term D^5 N L^3 of the transfer function (3 hops, distance 5, one input bit with value 1). The paths corresponding to the other terms can also be found in the trellis.

Hard-Decision Decoding and the Viterbi Algorithm

To explain the decoding process we first assume that hard-decision decoding is used. This means that the decoder receives a sequence of binary elements, i.e., values 0 and 1. The Viterbi decoder computes a path metric for each survivor path in the trellis. In the hard-decision case this metric is the Hamming distance between the path and the received sequence.

We illustrate Viterbi decoding with a simple example. In Figure 2.1.7 the first bits of an example received sequence are 00. The decoder knows that, according to the trellis, there are only two possible outputs that may be transmitted when encoding starts in the all-zero state: 00 and 11. The decoder records path metric 0 for state S(0) at time instant J = 1, because the output bits of that transition are the same as the received bits. If the path S(0) – S(1) was followed by the encoder, the sequence 11 was transmitted and two errors have occurred. The decoder records metric 2 for the path entering state S(1) at time instant J = 1.

[Figure 2.1.7 Decoding example for the CC(2, 1, 3) code of Figure 2.1.1: the received sequence 00 01 10 01 11 is decoded in the trellis, terminated paths are marked with X, and the surviving path (bold) gives the corrected sequence 11 01 10 01 11 and the decoded information sequence 1 1 1 0 0.]

At time instant J = 2 the metric for the path S(0) – S(1) – S(2), as an example, is 4, because this path corresponds to the transmitted sequence 11 10 while the received sequence was 00 01. At time instant J = 3 two branches enter each state. The decoder computes the path metrics of both possible paths entering a state and terminates the path with the higher metric. The paths that remain are called survivor paths, and there are four survivor paths in our example. At each later time instant four paths are terminated and four paths and their metrics are recorded. At the end of the data block a decision is made as to which of the four survivor paths was the transmitted one. In our example the decision is made at time instant J = 5 and the path shown by the bold line is chosen. The resulting error-corrected data sequence is shown in Figure 2.1.7, as well as the decoded information sequence.

Note that if the decision were made at time instant J = 1, 2 or 3, another, most probably wrong, path would have been selected. We see from Figure 2.1.7 that when there are no errors after J = 1, the distances of the wrong paths increase while the distance of the correct path remains the same; its metric equals the number of errors that occurred.

The transfer function gives free distance 5 for our example code, and the corresponding path in the trellis is S(0) – S(1) – S(2) – S(0). Its weight is 5, and we know that if there are two errors in the received sequence this code never fails. Actually, the longer the sequence we decode, the more errors it tolerates.
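The decoding procedure described above fits in a short function (my sketch, not from the paper); it uses the hard-decision Hamming metric and, as in the tail-bit discussion below, picks the survivor that ends in the all-zero state:

```python
def viterbi(received, g=((1, 1, 1), (1, 0, 1))):
    """Hard-decision Viterbi decoder for the rate-1/2, K = 3 code.

    `received` is a sequence of 2-bit tuples; the path metric is the
    Hamming distance between a trellis path and the received sequence.
    """
    metric = {(0, 0): 0}                     # encoding starts in state 00
    paths = {(0, 0): []}
    for r in received:
        new_metric, new_paths = {}, {}
        for (s1, s2), m in metric.items():
            for a in (0, 1):                 # extend every survivor by 0 and 1
                out = tuple(sum(t * w for t, w in zip(gj, (a, s1, s2))) % 2
                            for gj in g)
                ns = (a, s1)                 # next shift register contents
                nm = m + sum(x != y for x, y in zip(out, r))
                if nm < new_metric.get(ns, float("inf")):
                    new_metric[ns] = nm      # keep only the better path
                    new_paths[ns] = paths[(s1, s2)] + [a]
        metric, paths = new_metric, new_paths
    return paths[(0, 0)], metric[(0, 0)]     # survivor ending in state 00

print(viterbi([(0, 0), (0, 1), (1, 0), (0, 1), (1, 1)]))
# -> ([1, 1, 1, 0, 0], 2): the two errors of Figure 2.1.7 are corrected
```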

Interleaving
The weak point of convolutional codes is that they fail if errors occur in bursts. Decoding relies on adjacent and preceding bits in the sequence, and the right path may be terminated if there are many errors in a short period of time. As an example, we see from Figure 2.1.7 that if all three first bits in the sequence were in error, the correct path would be terminated at time instant J=3 and decoding would fail no matter how long a sequence we decoded. This is why interleaving is used together with convolutional codes. Interleaving distributes errors over the sequence to be decoded and substantially improves the performance of convolutional codes. For example, in GSM encoded speech frames are mixed in such a way that errors in two subsequent bits at the air interface cause errors that are 8 bits apart in the sequence to be decoded.
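
The following toy block interleaver illustrates the idea; the 8-position separation mimics the GSM figure quoted above, but the actual GSM interleaving scheme is more elaborate than this sketch.

    # Write bits row by row into an 8-column matrix, read column by column.
    # Two adjacent errors on the channel then land 8 positions apart
    # after de-interleaving, where the Viterbi decoder handles them easily.

    def interleave(bits, cols=8):
        rows = len(bits) // cols
        return [bits[r * cols + c] for c in range(cols) for r in range(rows)]

    def deinterleave(bits, cols=8):
        rows = len(bits) // cols
        return [bits[c * rows + r] for r in range(rows) for c in range(cols)]

    data = list(range(32))                    # stand-ins for encoded bits
    assert deinterleave(interleave(data)) == data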

Tail bits
Convolutional codes are often used for coding of independent data blocks. For example, in GSM each speech frame is encoded independently. Encoding starts from the all-zero state, and the block is terminated by adding zero tail bits to the end of the data block so that the encoder returns to the all-zero state. In our example above two zeros are enough to force the encoder to the all-zero state. The decoder may then always choose the path that starts from the all-zero state and returns to it at the end of the data block.
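
As a sketch, block termination for our CC(2,1,3) example looks as follows (encode_step is the helper from the Viterbi sketch above):

    # Append K - 1 = 2 zero tail bits so the encoder returns to state 00.

    def encode_block(data, tail=2):
        state, out = (0, 0), []
        for a in data + [0] * tail:
            pair, state = encode_step(state, a)
            out.extend(pair)
        assert state == (0, 0)        # encoder ends in the all-zero state
        return out

    print(encode_block([1, 1, 1]))
    # -> [1, 1, 0, 1, 1, 0, 0, 1, 1, 1], i.e. 11 01 10 01 11,
    # the corrected sequence of Figure 2.1.7.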

Soft-decision decoding
In soft-decision decoding the input of the decoder is a sequence of quantized symbols, not just bits. The decoding principle is the same as illustrated above, but the path metrics are computed more accurately. These so-called confidence measures are used for the selection of survivor paths in the same way as Hamming distances are used in hard-decision decoding. Soft-decision decoding gives about 2 dB better performance than hard-decision decoding.
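
As a rough sketch of the difference, one common soft metric is the squared Euclidean distance between the received quantized values and the nominal code symbols (here BPSK-mapped, an illustrative assumption); only the branch metric changes, the Viterbi search itself stays the same.

    # Soft branch metric: squared Euclidean distance to the nominal levels
    # 0 -> +1, 1 -> -1. A received value near +1 "confidently" supports a 0.

    def soft_branch_metric(out_bits, received_values):
        return sum((r - (1 - 2 * b)) ** 2
                   for b, r in zip(out_bits, received_values))

    print(soft_branch_metric((0, 1), (0.9, -0.2)))   # 0.65, a likely branch
    print(soft_branch_metric((1, 0), (0.9, -0.2)))   # 5.05, an unlikely one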

2.3. Recursive systematic convolutional code

It is possible to construct a recursive systematic convolutional (RSC) encoder from every rate Rc = 1/n feed-forward non-systematic convolutional encoder, such that the weight distributions of the codes are identical. A rate 1/n code uses generator polynomials g(1)(D), ..., g(n)(D), one for each output bit. The output sequences are described as

b(j)(D) = a(D) g(j)(D), j = 1,...,n

To obtain a systematic code, we need to have b(1)(D) = a(D), i.e., the first bit in each n-bit block equals the input bit. If we divide each output by g(1)(D) we get:

b~(1)(D) = b(1)(D) / g(1)(D) = a(D), for j = 1, and


b~(j)(D) = b(j)(D) / g(1)(D) = a(D) g(j)(D) / g(1)(D), for j = 2, ..., n

The g(j)(D) are often called feed-forward polynomials and g(1)(D) is called the feedback polynomial.

Example 2.3.1
The generator polynomials of the rate ½ convolutional encoder in Figure 2.1.1 are

g(1)(D) = 1 + D + D²

g(2)(D) = 1 + D²

According to the procedure above we may derive the generators for the corresponding RSC encoder; they are

g~(1)(D) = 1

g~(2)(D) = g(2)(D) / g(1)(D) = (1 + D²) / (1 + D + D²)
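
The division does not terminate: over GF(2) the quotient is an infinite power series, which is exactly why the encoder is called recursive (its impulse response is infinite). A small sketch of the series expansion, with illustrative names:

    # Expand g~(2)(D) = (1 + D^2)/(1 + D + D^2) as a power series over GF(2).

    def series_div(num, den, terms=10):
        """num, den: coefficient lists, lowest degree first, den[0] = 1."""
        num = list(num) + [0] * terms
        q = []
        for i in range(terms):
            qi = num[i]
            if qi:
                for j, d in enumerate(den):   # cancel qi * den * D^i
                    num[i + j] ^= d
            q.append(qi)
        return q

    print(series_div([1, 0, 1], [1, 1, 1]))
    # -> [1, 1, 1, 0, 1, 1, 0, 1, 1, 0]: 1 + D + D^2 + D^4 + D^5 + D^7 + ...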

The corresponding RSC encoder is shown in Figure 2.3.1.

[Figure: RSC encoder with input a, two delay elements, feedback polynomial g(1)(D) = 1 + D + D² and feed-forward polynomial g(2)(D) = 1 + D²; outputs b~(1) (systematic) and b~(2) (parity).]

Figure 2.3.1 Binary convolutional encoder, Rc = ½, K = 3 [1, p407].
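
A minimal sketch of the encoder of Figure 2.3.1 in Python (names illustrative): the feedback taps implement g(1)(D) = 1 + D + D², the feed-forward taps implement 1 + D², and the first output is simply the input bit.

    # RSC encoder: b(1) systematic, b(2) = a(D)(1 + D^2)/(1 + D + D^2).

    def rsc_encode(data):
        s1 = s2 = 0                    # the two delay elements
        out = []
        for a in data:
            w = a ^ s1 ^ s2            # feedback: w(D)(1 + D + D^2) = a(D)
            parity = w ^ s2            # feed-forward taps 1 and D^2
            out.append((a, parity))    # (systematic bit, parity bit)
            s1, s2 = w, s1
        return out

    print(rsc_encode([1, 0, 0, 0, 0, 0]))
    # parity stream 1, 1, 1, 0, 1, 1, ... = the infinite series expansion
    # of (1 + D^2)/(1 + D + D^2) computed above.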

The weight distribution of the RSC code can be obtained by constructing the state diagram and computing the transfer function the same way as we did earlier for the feed-forward encoder. For the encoder in Figure 2.3.1 the transfer function is

T(D,N,L) = (D⁵N³L³ + D⁶N²L⁴ − D⁶N⁴L⁴) / (1 − DNL − DNL² − D²L³ + D²N²L³)

= D⁵N³L³ + D⁶N²L⁴ + D⁶N⁴L⁵ + ...

If we now set N = L = 1, the weight distribution becomes

T(D) = D⁵ + 2D⁶ + 4D⁷ + ...

This is the same as the weight distribution of the feed-forward encoder that we derived earlier.
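
This can be checked by brute force: enumerate short terminated input blocks on the feed-forward encoder, keep only paths that leave the all-zero state once and re-merge only at the end, and count codeword weights. A sketch, reusing encode_step from the Viterbi example:

    from itertools import product
    from collections import Counter

    def single_event_weight(data, tail=2):
        """Weight of the terminated codeword, or None if the path returns
        to state 00 anywhere before the final merge."""
        state, weight = (0, 0), 0
        for i, a in enumerate(list(data) + [0] * tail):
            pair, state = encode_step(state, a)
            weight += sum(pair)
            is_last = (i == len(data) + tail - 1)
            if (state == (0, 0)) != is_last:     # merge exactly at the end
                return None
        return weight

    weights = Counter()
    for n in range(1, 9):                        # long enough for weight <= 7
        for data in product((0, 1), repeat=n):
            w = single_event_weight(data)
            if w is not None:
                weights[w] += 1
    print(sorted(weights.items())[:3])           # -> [(5, 1), (6, 2), (7, 4)]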

When we compare the transfer functions we see that the input weights (the exponents of N) are very different. For example, a weight-5 codeword is generated by the feed-forward encoder when the input weight is 1. The RSC encoder requires input weight 2 to generate a distance-5 codeword, i.e., to leave the all-zero state and to return there. This property is exploited by Turbo codes.


Both feed-forward and RSC convolutional codes are time invariant, i.e., if the input sequence a(D) produces the output sequence b(D), then the input sequence D^i a(D) produces the output D^i b(D). We will see that this is not valid for Turbo codes. Note that both codewords, b(D) and D^i b(D), have the same weight.
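
With encode_block and rsc_encode from the sketches above, this property is a one-line check: delaying the input by one epoch simply prepends one all-zero output block.

    # Time invariance: a one-epoch input delay yields the same codeword
    # delayed by one epoch (an extra all-zero output pair in front), so
    # the codeword weight is unchanged.

    a = [1, 0, 1, 1]
    assert encode_block([0] + a) == [0, 0] + encode_block(a)
    assert rsc_encode([0] + a) == [(0, 0)] + rsc_encode(a)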

3. Trellis Coded Modulation

3.1. Encoder description

Convolutional codes with good performance have quite low code rates, and they are not attractive for bandwidth-limited applications. Trellis coded modulation (TCM) extends the signal constellation and uses the extra bits for error control. It makes the error rate of the information data smaller although the raw bit error rate in the channel increases because of the reduced Euclidean distances. TCM has three basic features:

1. An expanded signal constellation is used, larger than would be needed for uncoded transmission. The additional signal points are used for redundancy that is inserted without sacrificing data rate or bandwidth.

2. The expanded signal constellation is partitioned such that the Euclidean distance is maximised inside each signal point subset.

3. Convolutional encoding is used so that only certain sequences are allowed.

Figure 3.1.1 shows the basic structure of Ungerboeck's trellis encoder. The n-bit information vector a = (ak(1), ak(2), ..., ak(n)) is transmitted at epoch k. At each epoch, m ≤ n information bits are encoded by a convolutional encoder into m+r code bits, which select one of the 2^(m+r) subsets of the 2^(n+r)-point signal constellation. The number of bits added to each transmitted symbol by trellis coding is r.

[Figure: information bits ak(1)…ak(m) enter a binary m/(m+r) convolutional encoder whose output bits bk(1)…bk(m+r) select the subset of signals; the uncoded bits ak(m+1)…ak(n) pass through as bk(m+r+1)…bk(n+r) and select the signal point xk from the subset.]

Figure 3.1.1 Ungerboeck trellis encoder [1].

The uncoded n−m information bits select one signal from a 2^(n−m)-signal subset. Figure 3.1.2 shows a 4-state trellis encoder for 8-PSK; now n = 2, r = 1 and m = 1. A sketch of the corresponding bit-to-signal mapping is given after the figure.


[Figure: the 2-bit input (ak(1), ak(2)) produces output bits bk(1) bk(2) bk(3), whose eight combinations select the 8-PSK signals 0…7; the constellation points are labelled with the corresponding 3-bit patterns.]

Figure 3.1.2 Ungerboeck trellis encoder [1, p409].
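
A sketch of the mapping stage of this encoder; the index convention (bk(1) as the least significant bit, so that the coded bits pick the subset and the uncoded bit bk(3) picks the point inside it) is an assumption chosen to be consistent with the partitioning of Figure 3.2.1.

    import cmath, math

    # Map (bk1, bk2, bk3) to a unit-energy 8-PSK point; bk3 is uncoded.
    # ASSUMED convention: signal index = bk1 + 2*bk2 + 4*bk3.

    def psk8_point(bk1, bk2, bk3):
        index = bk1 + 2 * bk2 + 4 * bk3
        return cmath.exp(1j * 2 * math.pi * index / 8)

    # Coded bits 00 select subset C0 = {0, 4}; the uncoded bit chooses:
    p0, p4 = psk8_point(0, 0, 0), psk8_point(0, 0, 1)
    print(abs(p0 - p4))                  # -> 2.0, the intra-subset distance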

We could transmit pairs of information bits as they are with 4-PSK, or with the 8-PSK needed after the trellis encoder in Figure 3.1.2. We will see that 8-PSK, where the four additional signal points are used for redundancy, gives better performance. We can view uncoded 4-PSK as a one-state system where subsequent signals may be any of the four signals D0, D2, D4, D6 and the next state is the same as the previous one. Its trellis contains four parallel transitions as shown in Figure 3.1.3, and the receiver has to choose, without any help from coding, which of those was the transmitted one.

[Figure: one-state trellis whose four parallel transitions per symbol interval correspond to the signals D0, D2, D4 and D6 (points 0, 2, 4, 6).]

Figure 3.1.3 Trellis diagram for uncoded 4-PSK.

The convolutional encoder in Figure 3.1.2 has four states, and its trellis is shown in Figure 3.1.4. The convolutionally encoded bits define the transitions in the trellis, and because one uncoded bit is associated with each transition, there are actually two parallel branches for each transition in the trellis.


[Figure: four-state trellis with states S(0) = 00, S(1) = 10, S(2) = 01, S(3) = 11. Each state transition is labelled with the input bit and coded output (e.g. 0/00, 1/01) and with the signal subset C0…C3 it selects; every transition consists of two parallel branches.]

Figure 3.1.4 Trellis diagram for the 4-state trellis code.

3.2. Mapping by set partitioning

The critical step in the design of Ungerboeck's codes is the method of mapping the outputs of the convolutional encoder to the points of the expanded signal constellation. Figure 3.2.1 shows the partitioning of the 8-PSK constellation. The equivalent uncoded system is 4-PSK, where the Euclidean distance is √2. Note that the minimum Euclidean distance is increased step by step by the partitioning.

[Figure: the rate 1/2 convolutional encoder (n = 2, r = 1, m = 1) selects the subset of signals and the uncoded bit selects the signal point xk. Partition tree: bk(1) splits A into B0 and B1, bk(2) splits these into C0, C1, C2 and C3, and bk(3) selects one of the signals D0…D7. The minimum intra-subset Euclidean distance grows from 0.765 (A) to √2 ≈ 1.41 (B level) and 2 (C level).]

Figure 3.2.1 Set partitioning for an 8-PSK signal constellation.
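
The partition distances (and the 2.141 path distance used below) are easy to verify numerically for a unit-energy 8-PSK constellation; a small sketch:

    import cmath, math

    def min_distance(indices):
        pts = [cmath.exp(1j * 2 * math.pi * k / 8) for k in indices]
        return min(abs(p - q) for i, p in enumerate(pts) for q in pts[i + 1:])

    print(min_distance(range(8)))       # level A: 2 sin(pi/8) = 0.765
    print(min_distance([0, 2, 4, 6]))   # level B: sqrt(2)     = 1.414
    print(min_distance([0, 4]))         # level C:               2.0

    # Distance of the path S(0)-S(1)-S(2)-S(0) from the all-zero path:
    print(math.sqrt(2 + 0.765 ** 2 + 2))   # -> 2.141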

From the trellis diagram we see that the first encoded bit bk(1) defines the next state in the trellis: for 0 the next state is either S(0) or S(1) and for 1 it is S(2) or S(3), no matter which was the previous state. This corresponds to the partition from A to B0 and B1 in Figure 3.2.1. The second encoded bit bk(2) defines which transition takes place, corresponding to the next step of the partition. For example, from state S(0) only two output bit combinations, 00 and 01, are possible. They correspond to subsets C0 and C2, which are associated with the states on the left-hand side in Figure 3.1.4. Finally the uncoded bit bk(3) selects a signal Dx from a subset of two signals as shown in Figure 3.2.1.

Note that the minimum distance between paths leaving one state and remerging with the same state later is 2.141. For example, the path S(0) – S(1) – S(2) – S(0) has this minimum distance to the all-zero path. The first transition with input zero corresponds to signal 0 or 4 (depending on the uncoded bit) in the constellation of Figure 3.1.2. With input 1 the signal will be either 2 or 6, and the minimum distance between signals 0 or 4 and 2 or 6 is √2. The second transition from state S(1) to state S(2) generates either signal 1 or 5, and their distance to the all-zero path (signal 0 or 4) is 0.765. The third transition merges the path back to the all-zero state and generates either signal 2 or 6, whose distance to the all-zero-state signals is √2. We have now obtained the minimum Euclidean distance between different paths in the trellis, and it is

dmin = √((√2)² + 0.765² + (√2)²) = 2.141

On the other hand, each transition has two parallel branches corresponding to the value of the uncoded bit. Two different but parallel paths in the trellis may then differ only in one transition where they use different parallel branches. From Figure 3.1.2 we see that the Euclidean distance between parallel transitions is 2. For example, the transition S(0) – S(1) occurs only if the coded bit values are 01, corresponding to signals 2 and 6. Which of these two is chosen depends on the uncoded bit.

As explained above, the weak point (the shortest minimum distance) is the decision between two signals on opposite sides of the signal constellation. Convolutional encoding has made all other distances larger. At high signal-to-noise ratio (SNR) in an AWGN channel the bit error rate performance is dominated by minimum Euclidean distance error events. The pairwise error probability between two coded sequences x and x' separated by Euclidean distance dmin is [1, p411]

P(x → x') = Q(√(d²min / 4N0))

The asymptotic coding gain gives the performance improvement in dB at high SNR. It is defined by

Ga = 10 log10[ (d²min,coded / Eav,coded) / (d²min,uncoded / Eav,uncoded) ] dB

where Eav is the average energy per symbol in the signal constellation. Our comparison of uncoded 4-PSK and coded 8-PSK gives the coding gain

Ga = 10 log10[ (2² / Eav,coded) / ((√2)² / Eav,uncoded) ] = 10 log10 2 = 3 dB

The average energy per symbol is the same for both systems. Changing from uncoded 4-PSK to uncoded 8-PSK would increase the data rate by a factor of 3/2 but make the system 5.3 dB worse because of the decreased Euclidean distance. But if we use the increased transmission data rate for error protection (TCM) we get a 3 dB improvement!


This is a result of the convolutional encoding, which corrects (close to) all errors if the SNR is not very low. The convolutional code selects the signal subset where the Euclidean distances are larger than in the corresponding uncoded system. An error then occurs only if the uncoded bit selects the wrong signal, but this probability is smaller than in the uncoded system because the Euclidean distance inside a subset is larger than that of the uncoded system.
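
Both figures above (the 3 dB gain and the 5.3 dB loss) are quick to verify numerically; a sketch:

    import math

    d2_coded = 2.0 ** 2              # parallel-transition distance squared
    d2_uncoded = math.sqrt(2) ** 2   # uncoded 4-PSK minimum distance squared
    print(10 * math.log10(d2_coded / d2_uncoded))   # -> 3.01 dB gain

    # Uncoded 8-PSK instead of 4-PSK would lose 5.3 dB in distance:
    print(10 * math.log10((2 * math.sin(math.pi / 8)) ** 2 / 2))  # -> -5.33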

4. Conclusions

An introduction to the structure and performance of error-correcting binary linear block codes, binary convolutional codes and trellis coded modulation was given in this paper. The encoding and decoding principles of all these methods were explained as well. It was shown that to get the whole advantage of the error-correcting capability of convolutional codes, the encoded data block should be interleaved or mixed before transmission. This is important because convolutional codes do not tolerate error bursts very well. It was also shown that an increase of the transmission data rate may improve system performance. The improvement can be achieved by trellis coded modulation (TCM), which combines convolutional coding and modulation. The design principle of TCM was explained with the help of an example.

Index

asymptotic coding gain, 27
Automatic Repeat Request (ARQ), 1
Backward Error Correction (BEC), 1
binary symmetric channel (BSC), 10
block length, 1
code rate, 1
code words, 1
constraint length, 15
convolutional codes, 15
coset, 12
coset leader, 12
distance, 2
elementary row operations, 4
equivalent code, 4
finite state machine (FSM), 15
forward error correction (FEC), 1
Hamming distance, 2
Hamming weight, 3
linear codes, 3
linearly independent, 4
maximum distance separable (MDS), 5
minimum distance, 2
n-tuples, 1
parity check bits, 5
parity check matrix, 6
recursive systematic convolutional (RSC), 22
redundant bits, 5
Singleton bound, 5
survivor paths, 21
syndrome, 7
systematic form, 4
systematic linear block code, 5
trellis coded modulation (TCM), 24
weight distribution, 10
weight enumerator polynomial, 10

References
[1] Gordon L. Stuber, Principles of Mobile Communication, 2nd Edition, Kluwer Academic Publishers, 2001.
[2] Chester J. Salwach, "Codes that Detect and Correct Errors", The College Mathematics Journal.
[3] Richard E. Blahut, Theory and Practice of Error Control Codes, Addison-Wesley, 1983.
[4] Richard E. Blahut, Digital Transmission of Information, Addison-Wesley, 1990.
[5] A. Bruce Carlson, Communication Systems: An Introduction to Signals and Noise in Electrical Communication, 3rd Edition, McGraw-Hill International Editions, 1988.
[6] John G. Proakis, Digital Communications, 3rd Edition, McGraw-Hill International Editions, 1995.
[7] Raymond Steele, Mobile Radio Communications, IEEE Press / Pentech Press.