Bachelor Degree Project

Reed-Solomon Codes: Error Correcting Codes

Author: Isabell Skoglund
Supervisor: Per-Anders Svensson
Examinator: Marcus Nilsson
Subject: Mathematics
Semester: VT2020



Abstract

In the following pages an introduction to the error correcting codes known as Reed-Solomon codes will be presented, together with different approaches for decoding. This is supplemented by a Mathematica program and a description of this program, which gives an understanding of how the choice of decoding algorithm affects the time it takes to find errors in stored or transmitted information.

Contents

1 Introduction
2 Error Correcting Codes
2.1 Bounds on Codes
2.2 Linear Codes for Finite Fields
2.3 Cyclic Codes
2.4 BCH Codes
3 Reed-Solomon Codes
3.1 Encoding
3.2 Properties
4 Methods
4.1 Decoding
4.1.1 Direct Method
4.1.2 Berlekamp–Massey algorithm
4.2 My program
5 Result
6 Conclusion
A The Mathematica Code


1 Introduction

To be able to safely store or transmit information, without losing important messages, one needs a way to detect and correct errors that can occur during the process. All available channels of communication have some degree of noise or interference, such as a scratch on a CD or a neighbouring channel in radio transmission, and these have to be considered when transmitting information.

One way to still be able to transmit messages safely over a noisy channel is to add some redundancy to the message. This gives the ability to reconstruct the interfered message. It can be done by replacing the symbols in the original message by codewords that have some redundancy in them.

There are several different ways to do this; one of the simplest is to just repeat the message, which gives a repetition code. That is, the message that is sent is repeated some number of times, so even if some part is disrupted the message is most likely still readable. Here it is usually quite easy to see whether there is an error, but one is required to send a lot of extra information.

Example 1. If the message that should be sent is hello, it is repeated some number of times, for example three times, so the transmitted code is hellohellohello. If some error occurs, say that the received message is weloohellohello, one can still see what the original message is. If there are too many errors, one needs to ask for the message to be resent.

Another way is to append a parity check digit at the end of the message. That is, if the message is, for example, in binary, one can make the rule that the total number of 1's in the transmitted word should be even (or odd), and append a 1 or a 0 to the message accordingly.

Example 2. Suppose the message 1100001 should be sent, and the rule is that the total number of ones should be even. The message contains three ones, so an extra 1 is added at the end; this is the parity check digit (had the number of ones been even, a 0 would have been added). The transmitted code is then 11000011. If an error occurs, for example if the received message is 11100011, the total number of ones is now 5. Since this is not even, some error has occurred and the message needs to be resent.
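The parity rule of Example 2 can be sketched in a few lines of Python (the thesis's own program is written in Mathematica; the function names here are ours and purely illustrative):

```python
def add_parity(bits):
    """Append a parity bit so the total number of 1's is even."""
    return bits + [sum(bits) % 2]

def check_parity(bits):
    """Return True if the word has an even number of 1's (no error detected)."""
    return sum(bits) % 2 == 0

msg = [1, 1, 0, 0, 0, 0, 1]        # the message 1100001 from Example 2
sent = add_parity(msg)             # -> [1, 1, 0, 0, 0, 0, 1, 1]
received = sent.copy()
received[2] ^= 1                   # flip one bit to simulate channel noise
print(check_parity(sent), check_parity(received))  # True False
```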

This method gives the ability to notice when there is only one error in the message. The error cannot be corrected, it is only detected. The receiver can then ask for the message to be resent [8]. If the message is disrupted it cannot be read, but almost no extra information is sent. This illustrates the most important balance of error correcting codes: to send as little as possible and still be able to recover the information when it is disrupted.

2 Error Correcting Codes

Error correcting codes are used to ensure that potential errors in a message that is sent over a noisy communication channel, or is stored on sensitive devices, can be detected and corrected within specific limitations [9]. A message that is transmitted over a noisy channel needs to be encoded to obtain codewords. These codewords consist of symbols that come from some alphabet A. The codewords are what is transmitted, and the receiver then decodes the received words, which might no longer be actual codewords. In the decoding process any possible errors are detected and corrected, to some extent. This is represented in Figure 1, found in Introduction to Cryptography with Coding Theory, Second Edition, page 399 [8].

The alphabets used in Example 1 and Example 2 are the English alphabet and the binary numbers, respectively. If A is an alphabet, A^n denotes the set of n-tuples of elements in A, and the elements of a subset of A^n are the codewords of a block code of length n. A block code is a code where all codewords have the same length. Block codes with some additional conditions are mostly used in practice; one common condition is to require that A is a finite field, which makes A^n a vector space. These types of codes are called linear codes [8].

Figure 1: Overview for message transmission over a noisy channel [8].


Example 3. Let A = {0, 1} be the alphabet of a binary repetition code where each symbol is repeated four times. The code is then the set {(0, 0, 0, 0), (1, 1, 1, 1)}, which is a subset of A^4.

To be able to decode over any alphabet, it is useful to define a measure of how close two words are to each other. This measure is called the Hamming distance, denoted d(v1, v2) for words v1 and v2 from A^n. The Hamming distance is defined as the number of places where the two words differ, that is, the minimum number of errors that need to occur for v1 to be changed into v2.

Example 4. If A = {0, 1}, the Hamming distance d(v1, v2) between v1 = 1100 and v2 = 0111 in A^4 is equal to 3, since the two words differ in places 1, 3 and 4.

If one calculates the Hamming distance between all pairs of different codewords in a code C, there exists a minimum value, the minimum distance of the code C, denoted d(C):

d(C) = min {d(v1, v2) | v1, v2 ∈ C, v1 ≠ v2}.

The minimum distance of C is important since it gives the smallest number of errors that can change one codeword into another codeword. It is used when a received message has some error, so that it does not correspond to an existing codeword. Such errors are then corrected by finding the codeword that has the smallest Hamming distance to the received message. This is called nearest neighbour decoding, that is, changing the received message into a codeword by changing as few symbols as possible.
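Nearest neighbour decoding can be sketched directly from the definition (a brute-force Python illustration of ours, not the thesis's implementation):

```python
def hamming(u, v):
    """Number of places where u and v differ."""
    return sum(a != b for a, b in zip(u, v))

def nearest_neighbour(word, code):
    """Decode by picking a codeword at minimum Hamming distance."""
    return min(code, key=lambda c: hamming(word, c))

code = [(0, 0, 0, 0), (1, 1, 1, 1)]           # the repetition code from Example 3
print(nearest_neighbour((1, 1, 0, 1), code))  # (1, 1, 1, 1): one flipped symbol corrected
```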

Rules can be set up that guarantee that nearest neighbour decoding actually gives the correct answer when there are at most t errors. These are described in Theorem 1. Some trouble can occur when there is more than one nearest neighbour to the received message. For example, with the same setup as in Example 4, the word 1000 has the same Hamming distance to all four of {0000, 1100, 1010, 1001}. Here one approach is to just guess one of them, which can seem risky, but if one symbol in a long message is guessed wrong the message will probably still be readable. Or, if it represents a pixel in a picture and the colour of this pixel is guessed wrong, one will still be able to see the picture. If it is more sensitive information, where the meaning of the message changes depending on one symbol, the safest way is to have the message resent.


Theorem 1. A code C can detect up to s errors if d(C) ≥ s + 1, and a code C can correct up to t errors if d(C) ≥ 2t + 1 [8].

According to Theorem 1, a code can detect up to s errors if one is able to change any codeword v1 in s places without changing it into another existing codeword v2. The code can correct up to t errors if one is able to change any codeword v1 in t places and still have v1 as the closest codeword according to the Hamming distance [8].

2.1 Bounds on Codes

As mentioned in the introduction, the balance between sending as little additional information as possible and still being able to recover a message after an error has occurred is really important. This is measured by the code rate, or information rate, R, of a code, which represents the ratio of input data symbols to transmitted code symbols. For a q-ary (n, M, d) code, where n is the length of the code, M is the number of codewords in the code and d is the minimum distance of the code, it is calculated as

R = log_q(M) / n.

This represents the part of the bandwidth that is being used to transmit actual data. When using a code to transmit messages, one would like the relative minimum distance, d/n, to be as large as possible, to be able to correct a great number of errors. The relative minimum distance is a measure of the error correcting capability of the code relative to its length [6]. One would also like M to be as large as possible, so that the code rate R is close to 1, since this gives bandwidth efficiency when transmitting messages over noisy channels. The problem is that increasing d tends to decrease M, or increase n, which in turn lowers the code rate. This creates a dilemma, where one wants both the code rate and the relative minimum distance to be as large as possible. This is described by the so called Singleton bound, given by R. Singleton in 1964 and presented in Theorem 2.
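The two quantities above are simple to compute; as a small Python sketch (function name ours):

```python
from math import log

def code_rate(n, M, q):
    """R = log_q(M) / n for a q-ary (n, M, d) code."""
    return log(M, q) / n

# the (4, 2, 4) binary repetition code of Example 3:
# low rate (0.25) but the largest possible relative minimum distance (1.0)
n, M, d, q = 4, 2, 4, 2
print(code_rate(n, M, q), d / n)
```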

Theorem 2. Let C be a q-ary (n, M, d) code. Then

M ≤ q^(n−d+1) [8].


A code that satisfies the Singleton bound with equality is called a maximum distance separable, MDS, code. This is a code that has the largest possible value of M for a given n and d.

Proof. If two codewords c1 = (a1, ..., an) and c2 are different from each other, then they differ in at least d places. Let c′1 and c′2 be obtained by removing the same d − 1 entries from c1 and c2; then c′1 and c′2 must still differ in at least one place. The number M of codewords c is therefore equal to the number of vectors c′ obtained in this way. There are at most q^(n−d+1) vectors c′, since there are n − d + 1 positions in these vectors. This implies that M is less than or equal to q^(n−d+1), as desired [8].

One class of codes that attains equality in the Singleton bound, and hence is MDS, is the Reed-Solomon codes [8].

Example 5. Consider the code in Example 3, the binary repetition code of length 4. This is a (4, 2, 4) code, so the Singleton bound gives

2 = M ≤ q^(4−4+1) = q^1 = 2^1 = 2,

where q is 2 since it is a binary code. Since there is equality in the Singleton bound, this code is an MDS code.
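The check in Example 5 can be carried out mechanically: compute d(C) over all pairs of codewords and compare M with q^(n−d+1). A Python sketch (helper names ours):

```python
from itertools import combinations

def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))

def min_distance(code):
    """Brute-force d(C) over all pairs of distinct codewords."""
    return min(hamming(u, v) for u, v in combinations(code, 2))

code = [(0, 0, 0, 0), (1, 1, 1, 1)]   # the (4, 2, 4) code from Example 5
n, M, q = 4, len(code), 2
d = min_distance(code)
print(M == q ** (n - d + 1))          # True: equality in the bound, so the code is MDS
```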

2.2 Linear Codes for Finite Fields

Being able to decode a code efficiently is really important. For the decoding process to be quick it is useful to impose some conditions on the code; this motivates linear codes. Here the alphabet A will be a finite field F, where F can still be a lot of different alphabets as long as they are finite, for example the binary numbers, which give the alphabet F = Z2, or the integers modulo a prime p, which give the alphabet F = Zp. The corresponding vector space over F is the set of n-tuples of elements in F, denoted F^n. A subspace of F^n is a nonempty subset S that is closed under linear combinations; that is, for all s1, s2 ∈ S and a1, a2 ∈ F it holds that a1s1 + a2s2 ∈ S. For the finite fields Z2 and Zp all calculations on elements are done modulo 2 or modulo p, respectively.

Definition 1. A linear code of dimension k and length n over a finite field F is a k-dimensional subspace of F^n. This type of code is called an [n, k] code, or an [n, k, d] code when the minimum distance d of the code is known [8].

For example, the binary repetition code in Example 3 is a linear code, a one-dimensional subspace of Z2^4. The binary parity check code in Example 2 is a linear code, a seven-dimensional subspace of Z2^8. This binary code of dimension 7 and length 8 consists of the binary vectors such that the sum of all entries is zero modulo 2. The vectors

(1, 0, 0, 0, 0, 0, 0, 1), (0, 1, 0, 0, 0, 0, 0, 1), ..., (0, 0, 0, 0, 0, 0, 1, 1)

form a basis of this subspace.

The ISBN code, the International Standard Book Number, is an error detecting code that is not linear. When a book is published it is assigned an ISBN number, a 10 digit codeword. The first digit gives the language, the second and third digits represent the publisher, and the fourth to ninth digits represent a book identity number that the publisher assigns to the book. The last digit is chosen to fulfil

∑_{j=1}^{10} j·a_j ≡ 0 mod 11,

where a1, ..., a10 are the digits of the ISBN number for a specific book. Since the calculation is made modulo 11, the tenth digit can be 10, which is then represented by X. The first nine digits can only be chosen from {0, 1, ..., 9}, and it is this that makes the code not linear: the code is not closed under linear combinations, since one cannot choose 10 as one of the first nine entries.
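The check digit condition above can be turned into code. Note that the weight of a10 is 10 ≡ −1 (mod 11), so a10 is simply the weighted sum of the first nine digits reduced mod 11. A Python sketch (function names ours; the sample ISBN is the commonly cited 0-306-40615-2):

```python
def isbn10_check_digit(first9):
    """Choose a10 so that the sum of j*a_j for j = 1..10 is 0 mod 11."""
    s = sum(j * a for j, a in enumerate(first9, start=1)) % 11
    return 'X' if s == 10 else str(s)

def isbn10_valid(digits):
    """digits: all ten digits, with the last given as 10 when it is printed as X."""
    return sum(j * a for j, a in enumerate(digits, start=1)) % 11 == 0

print(isbn10_check_digit([0, 3, 0, 6, 4, 0, 6, 1, 5]))  # 2
print(isbn10_valid([0, 3, 0, 6, 4, 0, 6, 1, 5, 2]))     # True
```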

When a linear code C of dimension k is defined over a finite field F with q elements, the code C has q^k elements. This is seen by taking a basis of C containing k elements, v1, ..., vk. Every element of the code C can then be uniquely written in the form a1v1 + · · · + akvk, where a1, ..., ak ∈ F. There are q choices for each ai, since F contains q elements, and there are k coefficients ai, since the dimension of the code is k. Hence there are q^k different elements in C. The Singleton bound can then be rewritten for linear codes as q^k ≤ q^(n−d+1), where d is the minimum distance of the code and n is the length. This implies that k + d ≤ n + 1.


As discussed before, the minimum distance of a code is the smallest number of symbols that have to be changed to transform one codeword into another codeword, and is given by the Hamming distance for the code. Computing the minimum distance of an arbitrary code, which might not be linear, can be tiresome, since it could require computing d(v1, v2) for every pair of different codewords in the code C. When the code is known to be linear, the minimum distance can instead be found using the Hamming weight. The Hamming weight is defined as wt(v1) = d(v1, 0), where 0 = (0, 0, ..., 0), that is, the number of nonzero places of v1. Then d(C) is the smallest Hamming weight over all nonzero codewords:

d(C) = min {wt(v1) | 0 ≠ v1 ∈ C}.

This gives an advantage, since one no longer needs to compare every pair of codewords; only one calculation per codeword is needed, which is much faster.
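The weight shortcut for linear codes can be sketched as follows (Python, with names of our choosing):

```python
def weight(v):
    """Hamming weight: the number of nonzero places of v."""
    return sum(1 for x in v if x != 0)

def min_distance_linear(code):
    """For a linear code, d(C) is the smallest weight of a nonzero codeword."""
    return min(weight(c) for c in code if any(c))

# the binary repetition code of length 4 from Example 3
code = [(0, 0, 0, 0), (1, 1, 1, 1)]
print(min_distance_linear(code))  # 4, matching the pairwise computation
```

For a code with q^k codewords this is q^k weight computations instead of roughly q^(2k)/2 pairwise distances.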

To construct a linear [n, k] code, one needs a k-dimensional subspace of F^n. One way to do this is to choose k linearly independent vectors and take their span, that is, the set of all linear combinations of these vectors [1]. This can be done by choosing a k × n generating matrix G of rank k, with entries in F. The subspace is then given by the set of vectors of the form vG, where v runs through all row vectors in F^k. The rows of the generator matrix G are then a basis for a k-dimensional subspace of F^n. This subspace is the linear code C, and every codeword is uniquely expressible as a linear combination of the rows of G.

Definition 2. Let G be a k × n generating matrix for a linear [n, k] code C. If H is an (n − k) × n matrix such that

GH^T = 0,

then H is called a parity check matrix for the code C with generating matrix G.

Theorem 3. If a linear code C has the generating matrix G = [Ik, P], then H = [−P^T, In−k] is a parity check matrix for C [8].

For a generator matrix G = [Ik, P], where Ik is the k × k identity matrix, the last n − k columns give the redundancy that, together with the first k columns, which still carry the message, forms the full codeword. Such a code is called systematic: the first k symbols are the information symbols and the rest are the check symbols.

Example 6. The code in Example 2 is an [8, 7] code, and its generating matrix G looks like this:

G =
1 0 0 0 0 0 0 1
0 1 0 0 0 0 0 1
0 0 1 0 0 0 0 1
0 0 0 1 0 0 0 1
0 0 0 0 1 0 0 1
0 0 0 0 0 1 0 1
0 0 0 0 0 0 1 1

The codeword 11000011 is the sum of the first, second and seventh rows modulo 2; it is obtained by multiplying (1, 1, 0, 0, 0, 0, 1) with the generating matrix.

To check whether any errors have occurred one can use the parity check matrix H = [−P^T, In−k], where P^T is the transpose of the matrix P used to construct the generator matrix G. If the product of a received word and H^T is not zero, there is some error.

The corresponding parity check matrix H for Example 6 is (1, 1, 1, 1, 1, 1, 1, 1), since

−P^T = (−1, −1, −1, −1, −1, −1, −1) ≡ (1, 1, 1, 1, 1, 1, 1) mod 2

and In−k = I8−7 = I1 = 1. If v = 11000011 is a codeword, the product vH^T should be zero; here vH^T = 1·1 + 1·1 + 0·1 + 0·1 + 0·1 + 0·1 + 1·1 + 1·1 = 4 ≡ 0 mod 2.
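The encode-and-check computation for Example 6 can be sketched in Python (matrices as lists of rows; the helper name is ours):

```python
def row_times_matrix(u, M):
    """Row vector u times matrix M over Z2 (M given as a list of rows)."""
    return [sum(ui * row[j] for ui, row in zip(u, M)) % 2
            for j in range(len(M[0]))]

# G = [I7, P] for the [8, 7] parity check code of Example 6; H is one row of ones
G = [[1 if i == j else 0 for j in range(7)] + [1] for i in range(7)]
H = [[1] * 8]
H_T = [[row[j] for row in H] for j in range(8)]   # transpose of H

u = [1, 1, 0, 0, 0, 0, 1]
v = row_times_matrix(u, G)          # the codeword 11000011
s = row_times_matrix(v, H_T)        # the product v H^T
print(v, s)                         # [1, 1, 0, 0, 0, 0, 1, 1] [0]
```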

Generally, C = {uG | u ∈ A^k} is a subspace of A^n, where G is the k × n generating matrix. Then, if v1 = uG is a codeword, v1H^T should be equal to zero, as

v1H^T = (uG)H^T = u(GH^T) = 0,

since GH^T = 0 for every generating matrix and its corresponding parity check matrix. If some error e is introduced, the received vector is v2 = uG + e, and multiplying this with the parity check matrix yields

v2H^T = (uG + e)H^T = uGH^T + eH^T = eH^T ≠ 0,

and an error is detected, provided e is not a codeword.

If a codeword is transmitted and a vector v is received, the receiver computes vH^T to see whether some error has occurred; if this is not equal to zero, at least one error is detected. The value of vH^T is called the syndrome of the vector v, and the syndrome of a vector v is denoted S(v). When vH^T is equal to zero one cannot say that there are no errors, only that v is a codeword. Since it is more likely that no errors occurred when vH^T = 0 than that enough errors occurred to change one codeword into another codeword, one can assume that no errors have occurred. The parity check matrix can thus be used to detect and correct errors in the process of decoding a received message. Two definitions about cosets will help in understanding the general decoding procedure using the parity check matrix.

Definition 3. Let C be a linear code and let u be an n-dimensional vector. The set u + C given by

u + C = {u + c | c ∈ C}

is called a coset of C [8].

Definition 4. A vector having minimum Hamming weight in a coset is calleda coset leader [8].

Using syndrome decoding requires far fewer steps than searching for the nearest codeword to the received vector. It can be done by using a syndrome lookup table that consists of the coset leaders and their corresponding syndromes. Decoding is then done in three steps.

1. Calculate the syndrome of the received vector r, S(r) = rH^T.

2. Find the coset leader that has the same syndrome as S(r). Call it c0.

3. Decode the received vector r as r − c0.
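The three steps above can be sketched in Python for the code and table used in the examples below (the coset leaders are the first column of the decoding table; function names are ours):

```python
def syndrome(v, H):
    """S(v) = v H^T over Z2, with H given as a list of rows."""
    return tuple(sum(vi * hi for vi, hi in zip(v, row)) % 2 for row in H)

H = [(1, 0, 1, 0), (0, 1, 0, 1)]     # parity check matrix for the code of Example 7
leaders = [(0, 0, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0), (1, 1, 0, 0)]
table = {syndrome(l, H): l for l in leaders}   # the syndrome lookup table

def decode(r):
    leader = table[syndrome(r, H)]             # steps 1 and 2
    return tuple((ri - li) % 2 for ri, li in zip(r, leader))   # step 3

print(decode((0, 0, 0, 1)))   # (0, 1, 0, 1)
```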

Example 7. Let C be a binary linear code with the generating matrix

G =
1 0 1 0
0 1 0 1


Then the code C consists of the codewords

{(0, 0, 0, 0), (1, 0, 1, 0), (0, 1, 0, 1), (1, 1, 1, 1)},

and these elements will be the first row of a decoding table that helps in the decoding process. To create the next row, take a vector of smallest Hamming weight that does not already have a place in the table (there can be more than one choice); the next three elements of the row are then obtained by adding the first element of the row to the element at the top of each column. This is done row by row until all possible vectors of length four are used, which gives the table

(0,0,0,0) (1,0,1,0) (0,1,0,1) (1,1,1,1)
(1,0,0,0) (0,0,1,0) (1,1,0,1) (0,1,1,1)
(0,1,0,0) (1,1,1,0) (0,0,0,1) (1,0,1,1)
(1,1,0,0) (0,1,1,0) (1,0,0,1) (0,0,1,1)

When a vector is received, look it up in the table and decode it to the vector at the top of the same column. If the received vector v is (0, 0, 0, 1), it is decoded to (0, 1, 0, 1).

Example 7 is quite small, so even though (0, 0, 0, 1) is decoded to one of its nearest neighbours, this is not the only codeword that is equally close: (0, 0, 0, 0) is also a nearest neighbour of (0, 0, 0, 1). This becomes a problem, since the minimum distance of this code is 2, which means that general error correction might not be possible. If the code had fulfilled the conditions described in Theorem 1, the same procedure would decode the vectors correctly.

This small code was used because writing out the table and searching for the received vector in it can be difficult for large codes. Here the parity check matrix H can be used to make the process more manageable. The vectors in the first column are the coset leaders l; if v is in the same row as l, then v = l + c for some codeword c. This gives that

vH^T = lH^T + cH^T = lH^T,

since c is a codeword, so cH^T = 0. The syndrome is here the vector S(v) = vH^T; if two vectors have the same syndrome they belong to the same coset and have the same coset leader, so the table in Example 7 can be replaced by the smaller table

(0,0,0,0) (0,0)
(1,0,0,0) (1,0)
(0,1,0,0) (0,1)
(1,1,0,0) (1,1)

Example 8. Using the same code C as in Example 7, with the same generating matrix G, decoding the received vector v = (0, 0, 0, 1) is now done by multiplying it by H^T. This gives

S(v) = vH^T = (0, 0, 0, 1) ·
1 0
0 1
1 0
0 1
= (0, 1).

This is the syndrome of the third row in the smaller table. Subtracting the coset leader from the vector v modulo 2 gives the codeword (0, 1, 0, 1), the same as in Example 7.

For large codes this procedure is too inefficient to be practical [8]. For a general linear code, the problem of finding the nearest neighbour is hard and is considered an NP-complete problem, where NP stands for "nondeterministic polynomial time" and is a classification of how hard the problem is to solve [4]. There are certain types of codes that have more efficient decoding procedures, for example the cyclic codes.

2.3 Cyclic Codes

A linear code C is called cyclic if a cyclic shift of any codeword in C gives another codeword in C. That is, if C is cyclic then

(c0, c1, ..., cn−1) ∈ C implies (cn−1, c0, c1, ..., cn−2) ∈ C.

Continuing to do cyclic shifts generates more codewords, so all cyclic permutations of a codeword are also codewords. The code used in Example 7 is therefore also a cyclic code: if any of its codewords is shifted cyclically, the result is still a codeword.

If F is a finite field, as before consisting of the integers mod p, where p is a prime, then let F[x] denote the set of all polynomials in x with coefficients in F. For a positive integer n, the code will work in

F[x] / (x^n − 1),

which denotes the elements of F[x] mod (x^n − 1). These are the polynomials of degree less than n; if a polynomial of degree n or larger is encountered, it is divided by (x^n − 1) and the remainder is taken as the new polynomial. A cyclic shift of a word corresponds to multiplying the corresponding polynomial in F[x]/(x^n − 1) by x modulo x^n − 1. The general description of a cyclic code is given in Theorem 4.

Example 9. In the code from Example 7, the codeword (1, 0, 1, 0), which is the first row of the generating matrix G, can be represented as the polynomial g(x) = 1 + x^2. Then g(x)·x gives the second row of G. Continuing with g(x)·x^2, which represents two cyclic shifts,

g(x)·x^2 = x^2 + x^4 ≡ 1 + x^2 mod x^4 − 1.

Since the degree reaches n = 4, the computation is done modulo x^4 − 1, and this gives the first row of G again.
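The correspondence between cyclic shifts and multiplication by x can be checked with a small Python sketch (coefficient lists with index i holding the coefficient of x^i; names ours):

```python
def cyclic_shift(word):
    """(c0, ..., c_{n-1}) -> (c_{n-1}, c0, ..., c_{n-2})"""
    return word[-1:] + word[:-1]

def times_x(coeffs):
    """Multiply c0 + c1 x + ... + c_{n-1} x^{n-1} by x modulo x^n - 1:
    every exponent goes up by one, and x^n wraps around to 1."""
    return coeffs[-1:] + coeffs[:-1]

c = [1, 0, 1, 0]                       # 1 + x^2, the codeword (1, 0, 1, 0) of Example 9
print(times_x(c))                      # [0, 1, 0, 1] = x + x^3, the second row of G
print(times_x(times_x(c)))             # [1, 0, 1, 0]: back to 1 + x^2, as in Example 9
print(times_x(c) == cyclic_shift(c))   # True: the two operations agree
```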

Theorem 4. Let C be a cyclic code of length n over a finite field F. To each codeword (c0, ..., cn−1) in C, associate the polynomial c0 + c1x + · · · + cn−1x^(n−1) in F[x]. Let g(x) be the polynomial of smallest degree among all the nonzero polynomials obtained from C in this way. Dividing g(x) by its highest coefficient, one may assume that g(x) is a monic polynomial, that is, a polynomial whose leading coefficient is one. This polynomial g(x) is then called the generating polynomial for C, and

1. g(x) is uniquely determined by C.

2. g(x) is a divisor of (x^n − 1), i.e. g(x)h(x) = x^n − 1 for some h(x) ∈ F[x].

3. C is exactly the set of coefficients of the polynomials of the form g(x)f(x), where deg(f) ≤ n − 1 − deg(g).

4. A polynomial m(x) ∈ F[x]/(x^n − 1) corresponds to a codeword in C if and only if h(x)m(x) ≡ 0 mod (x^n − 1), where h(x) is defined in 2 [8].


If g(x) = g0 + g1x + · · · + gk−1x^(k−1) + x^k is built as in Theorem 4, then by part 3 of the theorem every codeword of C corresponds to a polynomial of the form g(x)f(x), where deg(f) ≤ n − 1 − deg(g). Since f(x) is a linear combination of 1, x, x^2, ..., x^(n−1−k), every codeword in C is a linear combination of the codewords corresponding to the polynomials

g(x), g(x)x, g(x)x^2, ..., g(x)x^(n−1−k).

These in turn correspond to the vectors

(g0, ..., gk, 0, 0, ...), (0, g0, ..., gk, 0, ...), ..., (0, ..., 0, g0, ..., gk).

A generating matrix for C can then be built similarly to the one for linear codes:

G =
g0 g1 . . . gk 0 0 . . . 0
0 g0 g1 . . . gk 0 . . . 0
. . .
0 . . . 0 g0 g1 . . . gk

To construct the parity check matrix for C corresponding to this generating matrix, one uses part 4 of Theorem 4. With h(x) = h0 + h1x + · · · + hmx^m, where m = n − k,

H =
hm hm−1 . . . h0 0 0 . . . 0
0 hm hm−1 . . . h0 0 . . . 0
. . .
0 . . . 0 hm hm−1 . . . h0

This fulfils g(x)h(x) = x^n − 1, which is equivalent to g(x)h(x) ≡ 0 mod (x^n − 1); this in turn gives that GH^T = 0, which is true for every generating matrix and its corresponding parity check matrix. As mentioned for linear codes, a parity check matrix H for a linear code C means that vH^T = 0 if and only if v ∈ C. The same holds for a cyclic code: cH^T = 0 if and only if c ∈ C.

Example 10. A generating matrix G for a code of length 7 can be constructed by factorising the polynomial x^7 − 1, since the generating polynomial g(x) should divide it:

x^7 − 1 = (x − 1)(x^3 + x^2 + 1)(x^3 + x + 1).


The generating polynomial can then be chosen as g(x) = 1 + x + x^2 + x^4, which generates the matrix

G =
1 1 1 0 1 0 0
0 1 1 1 0 1 0
0 0 1 1 1 0 1

Here a cyclic shift of the first row gives all the nonzero codewords, so all codewords in C are

C = {(0, 0, 0, 0, 0, 0, 0), (1, 1, 1, 0, 1, 0, 0), (0, 1, 1, 1, 0, 1, 0), (0, 0, 1, 1, 1, 0, 1), (1, 0, 0, 1, 1, 1, 0), (0, 1, 0, 0, 1, 1, 1), (1, 0, 1, 0, 0, 1, 1), (1, 1, 0, 1, 0, 0, 1)}.

Note that this happens to be the case for this particular example; for cyclic codes in general there can be additional codewords, whose cyclic shifts are also codewords. To check that these are all the codewords, one can take every possible linear combination of the rows of G and check that it gives one of the codewords in this list. The code is cyclic, since a cyclic shift of one codeword gives another codeword. The parity check matrix H is constructed from the parity check polynomial h(x) that satisfies g(x)h(x) = x^7 − 1, hence h(x) = x^3 + x + 1, which gives the matrix

H =
1 0 1 1 0 0 0
0 1 0 1 1 0 0
0 0 1 0 1 1 0
0 0 0 1 0 1 1
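The construction in Example 10 can be verified mechanically: build G from the shifts of g(x), enumerate the 2^3 linear combinations of its rows, and check that cyclic shifts stay inside the code. A Python sketch of ours:

```python
from itertools import product

g = [1, 1, 1, 0, 1]                # g(x) = 1 + x + x^2 + x^4 from Example 10
n = 7
k = n - (len(g) - 1)               # dimension 3

# rows of G: coefficient vectors of g(x), g(x)x, g(x)x^2
G = [[0] * i + g + [0] * (n - len(g) - i) for i in range(k)]

# all 2^k codewords as Z2-linear combinations of the rows
code = {tuple(sum(u[i] * G[i][j] for i in range(k)) % 2 for j in range(n))
        for u in product([0, 1], repeat=k)}
print(len(code))                                   # 8, as listed in Example 10

# every cyclic shift of a codeword is again a codeword
print(all(c[-1:] + c[:-1] in code for c in code))  # True
```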

The parity check matrix gives a way to detect errors in a transmitted message for cyclic codes, in a similar way as for linear codes. It can still be hard to correct the occurring errors for a general cyclic code; to make this easier one can give even more structure to the code, as in a BCH code [8].

2.4 BCH Codes

BCH codes were discovered in the late 1950’s by R. C. Bose and D. K. Ray-Chaudhuri and independently by A. Hocquenghem, hence the name BCH.BCH codes are a class of cyclic codes, that has a decoding algorithm thatcan correct multiple occurring errors. These types of codes are specificityused in satellites and the special BCH codes called Reed-Solomon codes have

15

Page 17: Reed-Solomon Codes: Error Correcting Codes1455935/FULLTEXT01.pdf · 1 Introduction To be able to safely store or transmit information, without losing impor-tant messages, one needs

a lot of different applications. To construct a BCH code one needs some background information regarding polynomials.

If a polynomial d(x) is a divisor of the polynomial f(x), then f(x) = d(x)g(x) for some g(x). Here 1 and f(x) are trivial divisors of f(x), since they are always divisors of f(x). All other divisors are called non-trivial, or proper, divisors of f(x). If a polynomial f(x) has no proper divisors in a finite field F, then f(x) is said to be irreducible over F.

Let the polynomial f(x) be of degree n ≥ 1 and irreducible over Z_p, where p is a prime. Then

F = GF(p^n) = Z_p[x]/f(x) = {a_0 + a_1 α + · · · + a_{n−1} α^{n−1} | a_i ∈ Z_p, f(α) = 0},

where GF(p^n) denotes the Galois field with p^n elements. If F* = F \ {0} denotes the group F without the zero element, this group is cyclic. If α is a generating element of this group, such that ⟨α⟩ = F*, then f(x) is called a primitive polynomial [1].

Addition and multiplication of polynomials is done modulo some irreducible polynomial h(x) of degree n. Let F^n[x] be the set of all polynomials in F[x] with degree less than n. Here each codeword in F^n corresponds to a polynomial in F^n[x], so one can also use addition and multiplication of codewords in F^n. Multiplication in F^n is then defined modulo an irreducible polynomial of degree n [5].

When a primitive polynomial is used to construct GF(2^r), the Galois field with 2^r elements, where each element in binary is an r-bit number, all computations in the field are easier than when a non-primitive irreducible polynomial is used.

Let β be in F^n and represent the codeword corresponding to x mod h(x), where h(x) is a primitive polynomial of degree n. Then β^i corresponds to x^i mod h(x). Note that if 1 ≡ x^m mod h(x), then 0 ≡ 1 + x^m mod h(x), which gives that h(x) divides 1 + x^m. Since h(x) is a primitive polynomial, h(x) does not divide 1 + x^m for m less than 2^n − 1; this gives that β^m is not equal to 1 for m less than 2^n − 1. Moreover, β^j = β^i for j > i if and only if β^j = β^{j−i} β^i, which implies β^{j−i} = 1. From this one can say that

F^n \ {0} = {β^i | i = 0, 1, ..., 2^n − 2}.

That is, every non-zero codeword in F^n can be represented by some power of β. This property makes multiplication in this field easy. An example of


this, using GF(2^4) and h(x) = 1 + x + x^4, is shown in Table 1, found in Coding Theory and Cryptography: The Essentials, page 114 [5].

Example 11. Using Table 1, multiplication of codewords is done using powers of β. To compute (1100)(1010), transform the codewords to powers of β; then

(1100)(1010) = β^4 β^8 = β^12 = 1111.

This can be done since

(1 + x)(1 + x^2) ≡ (1 + x + x^2 + x^3) mod h(x).

codeword   polynomial in x mod h(x)    power of β
0000       0                           -
1000       1                           β^0 = 1
0100       x                           β^1
0010       x^2                         β^2
0001       x^3                         β^3
1100       1 + x ≡ x^4                 β^4
0110       x + x^2 ≡ x^5               β^5
0011       x^2 + x^3 ≡ x^6             β^6
1101       1 + x + x^3 ≡ x^7           β^7
1010       1 + x^2 ≡ x^8               β^8
0101       x + x^3 ≡ x^9               β^9
1110       1 + x + x^2 ≡ x^10          β^10
0111       x + x^2 + x^3 ≡ x^11        β^11
1111       1 + x + x^2 + x^3 ≡ x^12    β^12
1011       1 + x^2 + x^3 ≡ x^13        β^13
1001       1 + x^3 ≡ x^14              β^14

Table 1: Construction of GF(2^4) where h(x) = 1 + x + x^4 [5].
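Table 1 is in effect a discrete logarithm table for GF(2^4). A small Python sketch (illustrative; the thesis's own program is written in Mathematica) that rebuilds the table from h(x) = 1 + x + x^4 and redoes the multiplication in Example 11:

```python
def build_gf16():
    # Build GF(2^4) as powers of beta, where beta is the class of x modulo
    # the primitive polynomial h(x) = 1 + x + x^4 (bitmask 0b10011, bit i
    # holding the coefficient of x^i).
    h = 0b10011
    power_to_elem = []          # power_to_elem[i] = bitmask of beta^i
    elem = 1                    # beta^0 = 1
    for _ in range(15):
        power_to_elem.append(elem)
        elem <<= 1              # multiply by x
        if elem & 0b10000:      # degree 4 appeared: reduce using x^4 = x + 1
            elem ^= h
    elem_to_power = {e: i for i, e in enumerate(power_to_elem)}
    return power_to_elem, elem_to_power

def gf16_mul(a, b):
    # Multiply two field elements via the log table: a*b = beta^(log a + log b mod 15).
    power_to_elem, elem_to_power = build_gf16()
    if a == 0 or b == 0:
        return 0
    return power_to_elem[(elem_to_power[a] + elem_to_power[b]) % 15]

# Codewords list coefficients lowest degree first: (1100) is 1 + x -> 0b0011,
# and (1010) is 1 + x^2 -> 0b0101.
product = gf16_mul(0b0011, 0b0101)
print(format(product, '04b')[::-1])   # prints 1111, i.e. beta^4 * beta^8 = beta^12
```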

An element α in GF(2^r) is called a primitive element if α^m is not equal to 1 for 1 ≤ m < 2^r − 1. That is, α is a primitive element if every non-zero codeword in GF(2^r) can be expressed as a power of α. If a primitive polynomial h(x) is used to construct the finite field GF(2^r), with β defined as above, then β is a primitive element.


Usually the order of a non-zero element α in GF(2^r) is defined as the smallest positive integer m such that α^m = 1. For any non-zero element α in GF(2^r), the order m of α divides 2^r − 1. Hence α is a primitive element precisely if it has order 2^r − 1 [5].

This definition of a primitive element α will be useful when one wants to construct the class of BCH codes called Reed-Solomon codes, since it is used when constructing the generator polynomial for the code.

To start the construction of a BCH code of length n over a finite field F, one needs to factorize x^n − 1 in the same way as in the section on cyclic codes,

x^n − 1 = f_1(x) f_2(x) . . . f_r(x),

where each f_i(x) is an irreducible polynomial over the field F. If α is a primitive root modulo n, then α^0, α^1, ..., α^{n−1} are the roots of x^n − 1, such that

x^n − 1 = (x − 1)(x − α) . . . (x − α^{n−1}).

This means that each f_i(x) is a product of some of the factors x − α^j, so each α^j is a root of one of the polynomials f_i(x). For each j, let q_j(x) be the polynomial f_i(x) that fulfils f_i(α^j) = 0; then the polynomials q_0(x), q_1(x), ..., q_{n−1}(x) are formed. The polynomials q_j(x) are not all distinct, since a polynomial f_i(x) can have two different powers α^j, α^l as roots; then the polynomial f_i(x) will serve as both q_j(x) and q_l(x). A BCH code of designed distance δ is then a code with the generator polynomial

g(x) = lcm{q_{k+1}(x), q_{k+2}(x), ..., q_{k+δ−1}(x)},

where k is some chosen integer. A BCH code C with designed distance δ satisfies d(C) ≥ δ. This is the so-called BCH bound, which says that if the generator polynomial of a cyclic code C of length n over F has δ − 1 consecutive powers of α among its roots, for some integer δ, then the minimum weight d of C is greater than or equal to δ [6]. Each polynomial q_j(x) is the minimal polynomial of α^j over F, since it is the monic polynomial of minimal degree in F[x] such that q_j(α^j) = 0.

Example 12. Using the same polynomial as in Example 10, we have

x^7 − 1 = (x − 1)(x^3 + x^2 + 1)(x^3 + x + 1),

and we use the other possible generator polynomial g(x) = x^4 + x^3 + x^2 + 1 = (x − 1)(x^3 + x + 1). If α is a root of x^3 + x + 1, then it is a primitive root


modulo n, where n is equal to 7. This gives that g(α) = 0, and also g(α^2) = 0, since (α^2)^3 + α^2 + 1 = 0: the computations are done with binary coefficients, α^3 = α + 1, and squaring gives (α^3)^2 = (α + 1)^2 = α^2 + 2α + 1 = α^2 + 1. So the square of the root α is also a root of x^3 + x + 1, and then α^4 = (α^2)^2 is a root as well. Now the factor x^3 + x + 1 can be rewritten as

x^3 + x + 1 = (x − α)(x − α^2)(x − α^4).

All the remaining powers of α must be roots of (x − 1) and (x^3 + x^2 + 1), respectively. In summary, the different polynomials q_i are

q_0(x) = x − 1,
q_1(x) = q_2(x) = q_4(x) = x^3 + x + 1,
q_3(x) = q_5(x) = q_6(x) = x^3 + x^2 + 1.

If the chosen integer k is equal to −1 and the designed distance is δ = 3, then the generator polynomial is

g(x) = lcm{q_{k+1}(x), q_{k+2}(x), ..., q_{k+δ−1}(x)} = lcm{q_0(x), q_1(x)} = x^4 + x^3 + x^2 + 1.

This says that the minimum weight of the code is at least 3. If k = −1 and δ is instead chosen to be 4, then the generator polynomial g_1(x) is

g_1(x) = lcm{q_0(x), q_1(x), q_2(x)} = g(x);

since q_1(x) = q_2(x), the least common multiple does not change, and now the minimum weight of this code is at least 4. The actual minimum weight of this code is equal to 4, which can be seen by calculating the weights of the codewords.
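The computations in Example 12 can be checked mechanically. The sketch below (illustrative helper names) represents polynomials over GF(2) as integer bitmasks, bit i holding the coefficient of x^i:

```python
def gf2_mul(a, b):
    # Carry-less multiplication of GF(2)[x] polynomials stored as bitmasks.
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

# x^7 - 1 over GF(2) is x^7 + 1 -> bitmask 0b10000001.
x7_minus_1 = 0b10000001
f1 = 0b11          # x - 1 = x + 1
f2 = 0b1101        # x^3 + x^2 + 1
f3 = 0b1011        # x^3 + x + 1

# The factorization from Example 12:
assert gf2_mul(gf2_mul(f1, f2), f3) == x7_minus_1

# g(x) = lcm{q0, q1} = (x - 1)(x^3 + x + 1), since q0 and q1 are coprime.
g = gf2_mul(f1, f3)
print(bin(g))      # 0b11101, i.e. x^4 + x^3 + x^2 + 1
```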

3 Reed-Solomon Codes

The Reed-Solomon codes were introduced in 1960 by I. S. Reed and G. Solomon and are a type of BCH codes. Let F be a finite field with q elements, where q = p^r for a prime p; for a binary code, q = 2^r. If n = q − 1,


then F contains a primitive element α. The generator polynomial is then constructed as

g(x) = (x − α^b)(x − α^{b+1}) . . . (x − α^{b+d−2}),   (1)

where d is between 1 and n. Usually b is chosen to be 0 or 1; from here on it will be chosen to be 1. This generator polynomial has coefficients in F and generates a BCH code C over F of length n that is called a Reed-Solomon code.

Since g(α^i) equals zero for the d − 1 consecutive powers i = b, ..., b + d − 2, the BCH bound gives that the minimum distance of C is at least d.

A Reed-Solomon code is a cyclic [n, n + 1 − d, d] code, where n = 2^r − 1 and the elements are from GF(2^r). This rests on the fact that the generator polynomial is a polynomial of degree d − 1, so it has at most d nonzero coefficients. This gives that the codeword corresponding to the generator polynomial is a codeword of weight at most d. Hence the minimum weight of C is exactly d, and the dimension of C is n − deg(g) = n + 1 − d, where g is the generator polynomial, which gives the notation for the code. The notation [n, k] can also be used, where d = n + 1 − k.

The codewords in C are given by the polynomials

g(x) f(x),

where deg(f) is less than or equal to n − d. Since there are q different choices for each of the n − d + 1 coefficients of f(x), there are q^{n−d+1} polynomials f(x). Hence there are q^{n−d+1} different codewords in C. Due to this the Reed-Solomon code fulfils the criterion for an MDS code, that is, equality holds in the Singleton bound [8].

3.1 Encoding

To start an encoding process one needs to define the message polynomial. For a Reed-Solomon [n, k] code, k information symbols form the message that is encoded as one block and can be represented by the message polynomial m(x). This message polynomial m(x) is then of degree k − 1,

m(x) = m_{k−1} x^{k−1} + · · · + m_1 x + m_0,


where the coefficients m_{k−1}, ..., m_0 are message symbols from an alphabet, usually the Galois field GF(2^r).

The Reed-Solomon encoding can be done both cyclically and systematically; when it is done systematically there is still a cyclic structure in the background, for example when constructing the generator polynomial. For both methods of encoding the same generator polynomial g(x) is used, shown in Equation (1). For the cyclic approach the generator matrix is made in the same manner as in the section on cyclic codes. The coefficients of the message polynomial then represent a message vector, which is multiplied by the generator matrix to construct a codeword. The generator polynomial is then

g(x) = (x − α)(x − α^2) . . . (x − α^{d−1}) = g_0 + g_1 x + · · · + g_{d−1} x^{d−1},

and the corresponding generator matrix for g(x) is

        | g_0  g_1  ...  g_{d−1}  0        0    ...  0       |
    G = | 0    g_0  g_1  ...      g_{d−1}  0    ...  0       |
        | ...                                                |
        | 0    ...  0    g_0      g_1      ...  g_{d−1}      |,

where there are n columns and k rows. The codeword c is then

c = (m_{k−1}, ..., m_1, m_0) G.
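The cyclic encoding step c = mG is equivalent to multiplying polynomials: c(x) = m(x)g(x). A Python sketch of this (following the appendix program's convention of computing modulo the prime 257 rather than in GF(2^r); function names are illustrative):

```python
def rs_generator_poly(n_minus_k, alpha, p):
    # g(x) = (x - alpha)(x - alpha^2)...(x - alpha^(n-k)) mod p,
    # coefficients stored lowest degree first.
    g = [1]
    root = 1
    for _ in range(n_minus_k):
        root = root * alpha % p
        # multiply the current g(x) by the factor (x - root)
        g = [(a - root * b) % p for a, b in zip([0] + g, g + [0])]
    return g

def encode_block(msg, g, n, p):
    # Non-systematic cyclic encoding: c(x) = m(x) g(x) mod p. Row i of the
    # k x n generator matrix G is g shifted i steps, so msg . G does the same.
    c = [0] * n
    for i, m in enumerate(msg):
        for j, coeff in enumerate(g):
            c[i + j] = (c[i + j] + m * coeff) % p
    return c

p = n = 257                  # parameters used by the thesis's program
k, alpha = 249, 3
g = rs_generator_poly(n - k, alpha, p)
c = encode_block([5, 0, 42, 7] + [0] * (k - 4), g, n, p)

# Every codeword evaluates to zero at alpha^1, ..., alpha^(n-k):
for i in range(1, n - k + 1):
    x = pow(alpha, i, p)
    assert sum(cj * pow(x, j, p) for j, cj in enumerate(c)) % p == 0
```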

3.2 Properties

For many types of applications the errors often occur in bursts and are not randomly distributed. A burst error is when an error occurs in many adjacent bits, for example from a scratch on a CD. Consider Reed-Solomon codes where the message symbols come from the finite field F = GF(2^r), the Galois field with 2^r elements. Using the elements of a Galois field as message symbols, the coefficients of the message polynomial are represented by r-bit binary numbers. It is due to this that the Reed-Solomon code is good with burst errors: even if all bits in a symbol are in error, it only counts as one symbol error in terms of the correction capacity of the code [3].


4 Methods

When one starts to decode a received message there are different approaches to consider. There is a direct method based on some trial and error, and there are different algorithms to speed up the direct method. Another type of decoding algorithm is the Berlekamp-Massey algorithm, which will be used later on.

4.1 Decoding

Let [n, k, d] be some Reed-Solomon code, where n = 2^r − 1, k is the dimension and d is the minimum weight. Since the elements of the code come from GF(2^r), correcting a received vector means that one needs to find both the locations and the magnitudes of the errors. An error location is defined as a position in the received vector that holds an error, and it is referred to by an error location number: if the jth coordinate of the vector is an error location, then its error location number is x^j. The error magnitude of an error location j is the size of the error in this location, that is, the error in the coefficient of x^j. To decode a Reed-Solomon code one needs to find both the error locations and the corresponding error magnitudes [5].

If the received vector is represented as a polynomial R(x), then it can be seen as

R(x) = T(x) + E(x),   (2)

where T(x) is the transmitted codeword and E(x) is the error that has occurred. Here E(x) = E_{n−1} x^{n−1} + · · · + E_1 x + E_0, where each coefficient is an element of GF(2^r). The positions of the errors are determined by the degrees of x. For the correction capacity of the code, as discussed before, d needs to be greater than or equal to 2t + 1, where t is the number of errors that can be corrected. This gives that if more than t = (d − 1)/2 of the coefficients in E(x) are non-zero, the errors may not be corrected, as in Example 7, where the algorithm still worked but the nearest codeword might be another one than the one that was transmitted.

To know if any errors have occurred, one needs to calculate the syndromes for the received polynomial. This can be done in a few different ways. One way is to divide the received polynomial by the generator polynomial; if the received polynomial is an actual codeword, this can be done without any


remainder. This property extends to the factors of the generator polynomial, which gives the ability to find each syndrome value S_1, ..., S_d by dividing the received polynomial by each of the factors x − α^i corresponding to the factors in Equation (1),

R(x)/(x − α^i) = Q_i(x) + S_i/(x − α^i),   (3)

where Q_i(x) is the quotient and i goes from 1 to d. Here the remainders are the sought syndrome values S_1, ..., S_d. Rearranging Equation (3) gives the equation for each syndrome value,

R(x) = Q_i(x)(x − α^i) + S_i.

Hence when x = α^i this reduces to

S_i = R(α^i) = R_{n−1}(α^i)^{n−1} + · · · + R_1 α^i + R_0,

where the coefficients R_{n−1}, ..., R_0 are the symbols of the received polynomial. This gives an alternative way of finding the syndrome values, namely by substituting x = α^i into the received polynomial. This is possible since the syndrome values depend only on the error pattern, that is,

R(α^i) = T(α^i) + E(α^i),

where T(α^i) is equal to zero, since x − α^i is a factor of the generator polynomial, which in turn is a factor of T(x). Hence this reduces to

R(α^i) = E(α^i) = S_i.   (4)
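The substitution S_i = R(α^i) is easy to sketch in code. The toy example below is illustrative (mod the prime 257 as in the appendix program, with a deliberately tiny generator (x − 3)(x − 9)): a codeword has zero syndromes, and one injected error makes them non-zero.

```python
def syndromes(received, alpha, p, count):
    # S_i = R(alpha^i) mod p for i = 1, ..., count; coefficients lowest degree first.
    return [sum(c * pow(alpha, i * j, p) for j, c in enumerate(received)) % p
            for i in range(1, count + 1)]

p, alpha = 257, 3
g = [27, 245, 1]                          # (x - 3)(x - 9) = x^2 - 12x + 27 mod 257
codeword = [(2 * c) % p for c in g]       # message polynomial f(x) = 2
print(syndromes(codeword, alpha, p, 2))   # prints [0, 0]: no error detected

received = codeword[:]
received[1] = (received[1] + 100) % p     # inject one error of magnitude 100
print(syndromes(received, alpha, p, 2))   # prints [43, 129]: error detected
```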

When no error is detected, the syndrome values S_1, ..., S_d will be zero [3].

The relation between the syndromes and the error polynomial can be used to set up a set of simultaneous equations, and from these the errors can be found. Here the error polynomial is rewritten to only include the error locations. Assume that v errors have occurred; v needs to be less than or equal to t, otherwise the errors cannot be corrected. The error polynomial is rewritten as

E(x) = Y_1 x^{j_1} + Y_2 x^{j_2} + · · · + Y_v x^{j_v},


where j_1, ..., j_v are the error location numbers of the respective errors. The error magnitude at each location is represented by Y_1, ..., Y_v. Substituting this back into the syndrome Equation (4) gives

S_i = E(α^i) = Y_1 α^{i j_1} + Y_2 α^{i j_2} + · · · + Y_v α^{i j_v}
             = Y_1 X_1^i + Y_2 X_2^i + · · · + Y_v X_v^i,

where X_1 = α^{j_1}, ..., X_v = α^{j_v} are the error locators with error location numbers j_1, ..., j_v. Now the 2t syndrome equations can be set up as a matrix equation:

    [ S_1  ]   [ X_1^1      X_2^1      ...   X_v^1     ]   [ Y_1 ]
    [ S_2  ] = [ X_1^2      X_2^2      ...   X_v^2     ] × [ Y_2 ]
    [ ...  ]   [ ...        ...              ...       ]   [ ... ]
    [ S_2t ]   [ X_1^{2t}   X_2^{2t}   ...   X_v^{2t}  ]   [ Y_v ]   (5)

Note that the syndromes S_1, ..., S_{2t} correspond to the roots α, ..., α^{d−1} of the generator polynomial chosen in Equation (1).

There are now two different ways to construct the error locator polynomial. The first one, denoted σ(x), is constructed as

σ(x) = (x − X_1)(x − X_2) . . . (x − X_v),

where the error locators X_1, ..., X_v are roots of the polynomial; this produces a polynomial of degree v with coefficients σ_1, ..., σ_v.

The second error locator polynomial, denoted Λ(x), is constructed as

Λ(x) = (1 − X_1 x)(1 − X_2 x) . . . (1 − X_v x),   (6)

where the factors (1 − X_j x) give that the inverses X_1^{−1}, ..., X_v^{−1} of the error locators are the roots of the polynomial, with coefficients Λ_1, ..., Λ_v. However, the coefficients of σ and Λ are the same, since σ(x) can be rewritten as x^v × Λ(1/x). The two error locator polynomials are used in different places to make the computations easier.

When the error locators X_1, ..., X_v have been found, they can be substituted back into the syndrome equation, which can then be solved by direct calculation using matrix inversion in the matrix equation (5); this produces the error magnitudes Y_1, ..., Y_v. If the matrix obtained is not invertible, an


alternative method of calculating the error values Y_j is to use the Forney algorithm, which will not be described here; see Algebraic Coding Theory by Berlekamp, E. for further reading [2]. Now that the symbols containing errors have been identified by X_j and the magnitudes Y_j of these errors have been found, the errors can be corrected by subtracting the error polynomial E(x) from the received vector R(x); by rewriting Equation (2), this gives the transmitted codeword T(x) [3].

4.1.1 Direct Method

There are some different approaches to the task of finding the coefficients of the error locator polynomial; this section describes the direct method.

Each error has a corresponding root X_j^{−1} that makes Λ(x) equal to zero; this can be written as

1 + Λ_1 X_j^{−1} + · · · + Λ_{v−1} X_j^{−v+1} + Λ_v X_j^{−v} = 0.

This can be multiplied through by Y_j X_j^{i+v} and rewritten as

Y_j X_j^{i+v} + Λ_1 Y_j X_j^{i+v−1} + · · · + Λ_v Y_j X_j^{i} = 0,

and summing these equations over j = 1, ..., v, the sums of the terms Y_j X_j to the various powers can be rewritten as syndromes,

S_{i+v} + Λ_1 S_{i+v−1} + · · · + Λ_v S_i = 0,

where i = 1, ..., 2t − v. This produces a set of 2t − v simultaneous key equations, where Λ_1, ..., Λ_v are unknown. To solve these equations for Λ_1, ..., Λ_v one can use the first v equations,

    [ −S_{v+1} ]   [ S_v       S_{v−1}   ...  S_1 ]   [ Λ_1 ]
    [ −S_{v+2} ]   [ S_{v+1}   S_v       ...  S_2 ]   [ Λ_2 ]
    [ −S_{v+3} ] = [ S_{v+2}   S_{v+1}   ...  S_3 ] × [ Λ_3 ]   (7)
    [    ...   ]   [  ...                         ]   [ ... ]
    [ −S_{2v}  ]   [ S_{2v−1}  S_{2v−2}  ...  S_v ]   [ Λ_v ]

except that v is unknown here. So to find v it is necessary to calculate the determinant of the matrix for each value of v. These calculations should start with v = t and continue downwards until a non-zero determinant is found [7].


This non-zero determinant gives that the equations are independent and can be solved. The coefficients of the error locator polynomial are then found by inverting the matrix and solving the equations. When the coefficients of the error locator polynomial have been found, they can be used to find the sought error location numbers that indicate the location of each error. When the error locator polynomial is written as

Λ(x) = X_1(x − X_1^{−1}) X_2(x − X_2^{−1}) . . . ,

the function value will be zero if x = X_1^{−1}, X_2^{−1}, ..., and this is the case when x = α^{−j_1}, α^{−j_2}, ...; hence the values of X_1, ..., X_v are found by trial and error. This can be done using the Chien search, that is, trying the powers α^j for j between 0 and n − 1, since this covers the whole field, to find the roots of Λ(x). The values α^j for j between 0 and n − 1 are substituted into Equation (6) and each result is evaluated. If the expression evaluates to zero, then that value of x is a root and identifies an error location. Here it is j that gives the error location number [3].

4.1.2 Berlekamp–Massey algorithm

This algorithm gives an alternative method for finding the error locator polynomial that is faster than the direct method. Here the error locator polynomial σ(x) is calculated from the syndromes S_1, ..., S_{2t}. Let σ^R(x) = 1 + σ_{t−1} x + σ_{t−2} x^2 + · · · + σ_0 x^t; this can be seen as the reverse of the error locator polynomial σ(x). Then let S(x) = 1 + S_1 x + S_2 x^2 + · · · + S_{2t} x^{2t} be the syndrome polynomial. By using the division algorithm one can write

σ^R(x) S(x) = q(x) x^{2t+1} + r(x),

where the degree of r(x) is less than or equal to 2t.

This version of the algorithm will produce a polynomial P_{2t}(x) satisfying P_{2t}(x) S(x) = q_{2t}(x) x^{2t+1} + r_{2t}(x), where the degree of P_{2t}(x) is less than or equal to t and the degree of r_{2t}(x) is also less than t. Hence P_{2t}(x) is equal to σ^R(x). Now let

q_i(x) = q_{i,0} + q_{i,1} x + · · · + q_{i,2t−1−i} x^{2t−1−i}

and also let

p_i(x) = x^{2t+1−i} P_i(x) = p_{i,0} + p_{i,1} x + · · · + p_{i,l} x^l;


then at step i, the algorithm calculates q_i(x), p_i(x) and the integers D_i and z_i.

The following steps are then used to calculate the error locator polynomial with the Berlekamp-Massey algorithm. Let T(x) be the transmitted codeword, encoded using a generator polynomial g(x) constructed in the same manner as in Equation (1), and let the received vector with some errors be R(x). The decoding process then continues as follows:

1. Calculate the syndromes for the received vector as S_i = R(α^i), where i = 1, ..., 2t.

2. Now define

q_{−1} = 1 + S_1 x + S_2 x^2 + · · · + S_{2t} x^{2t},
q_0 = S_1 + S_2 x + · · · + S_{2t} x^{2t−1},
p_{−1} = x^{2t+1} and
p_0 = x^{2t},

as well as the initial conditions D_{−1} = −1, D_0 = 0 and z_0 = 1.

3. Then for i = 1, ..., 2t, the quantities q_i(x), p_i(x), D_i and z_i are recursively defined in two different cases as follows.

(a) If q_{i−1,0} = 0, then

q_i(x) = q_{i−1}(x)/x,
p_i(x) = p_{i−1}(x)/x,
D_i = 2 + D_{i−1} and
z_i = z_{i−1}.

(b) If q_{i−1,0} ≠ 0, then

q_i(x) = ( q_{i−1}(x) − (q_{i−1,0}/q_{z_{i−1},0}) q_{z_{i−1}}(x) ) / x,
p_i(x) = ( p_{i−1}(x) − (q_{i−1,0}/q_{z_{i−1},0}) p_{z_{i−1}}(x) ) / x,
D_i = 2 + min{D_{i−1}, D_{z_{i−1}}}, and
z_i = i − 1 if D_{i−1} ≥ D_{z_{i−1}}, otherwise z_i = z_{i−1}.


If e errors, with e less than or equal to t, have occurred during the transmission of the codeword, then p_{2t}(x), which is equal to σ^R(x), has degree e, and the error locator polynomial σ(x) = p_{2t,e} + p_{2t,e−1} x + · · · + p_{2t,1} x^{e−1} + x^e then has e distinct roots. These roots can be found in the same way as in the direct method, and they give the error location numbers [5].
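The Euclidean-style recursion above is one formulation; the more common discrepancy-based statement of the Berlekamp-Massey algorithm computes the same error locator polynomial. A Python sketch of that standard formulation over a prime field (so that it can be run on the syndrome values of Example 13 below; this is not the exact recursion of steps 1-3):

```python
def berlekamp_massey(S, p):
    # Discrepancy-based Berlekamp-Massey over GF(p), p prime: returns the
    # shortest C(x) = 1 + C_1 x + ... + C_L x^L with
    # S_n + C_1 S_{n-1} + ... + C_L S_{n-L} = 0 for all valid n, plus L.
    C, B = [1], [1]           # current and previous connection polynomials
    L, m, b = 0, 1, 1         # L = register length, b = last non-zero discrepancy
    for n in range(len(S)):
        # discrepancy of the next syndrome against the current polynomial
        d = S[n] % p
        for i in range(1, L + 1):
            d = (d + C[i] * S[n - i]) % p
        if d == 0:
            m += 1
            continue
        coef = d * pow(b, p - 2, p) % p    # d / b via Fermat inverse
        T = C[:]
        if len(B) + m > len(C):
            C = C + [0] * (len(B) + m - len(C))
        for i in range(len(B)):            # C(x) -= (d/b) * x^m * B(x)
            C[i + m] = (C[i + m] - coef * B[i]) % p
        if 2 * L <= n:
            L, B, b, m = n + 1 - L, T, d, 1
        else:
            m += 1
    return C, L

# Syndrome values from Example 13 (computations mod 257):
S = [17, 31, 133, 61, 71, 20, 176, 155]
Lam, num_errors = berlekamp_massey(S, 257)
print(Lam, num_errors)   # [1, 240, 129] 2, i.e. Lambda(x) = 1 + 240x + 129x^2
```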

4.2 My program

The Mathematica program that I have used can be found in Appendix A; in this section there is a specific example run of this program to see how it works. The first section of the program takes a text together with some n and k, representing the Reed-Solomon code [n, k], and encodes the text. The encoding process starts with each letter being represented by an element of F_n, and k such elements then create one block of length k. This is then encoded by the k × n generator matrix G that is constructed from the generator polynomial g(x). This polynomial g(x) consists of consecutive powers of the smallest primitive element α modulo n:

g(x) = (x − α^1)(x − α^2) . . . (x − α^{n−k}).

For this example n = 257 and k = 249, and the generator polynomial is

g(x) = 44 + 118x + 4x^2 + 174x^3 + 156x^4 + 42x^5 + 157x^6 + 183x^7 + x^8;

the computations are done mod n, and here α = 3. The generator matrix G is then constructed in the same manner as for a cyclic code. The blocks consisting of the message elements are then multiplied by G to encode the message. An output of the program states how many errors are introduced in each block (at most t errors, if d(C) ≥ 2t + 1, according to Theorem 1), and then gives one calculation of the error positions from the direct method and one from the Berlekamp-Massey algorithm.
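The stated generator polynomial is easy to reproduce by multiplying out the linear factors mod 257 (a sketch, not the thesis's Mathematica code):

```python
def generator_poly(alpha, n_minus_k, p):
    # g(x) = (x - alpha^1)(x - alpha^2)...(x - alpha^(n-k)) mod p,
    # coefficients stored lowest degree first.
    g = [1]
    root = 1
    for _ in range(n_minus_k):
        root = root * alpha % p
        # multiply the current g(x) by the factor (x - root)
        g = [(a - root * b) % p for a, b in zip([0] + g, g + [0])]
    return g

print(generator_poly(3, 8, 257))
# [44, 118, 4, 174, 156, 42, 157, 183, 1], i.e. the g(x) stated above
```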

Example 13. Suppose 2 errors are introduced in one block of the text found in Appendix A, so that there are errors in two different positions, represented by x^j. The number of errors as well as their positions are of course unknown when the decoding process begins. First the syndrome values for the corresponding vector are calculated; here they are

{17, 31, 133, 61, 71, 20, 176, 155},

and they are the same for both methods. From here on the methods differ; the first description will consider the direct method.


The first step of the direct method is to find the matrix equation that corresponds to Equation (7); for the syndrome values in this example the corresponding matrix equation is the following:

    [ −133 ]   [ 17   31 ]   [ Λ_1 ]
    [  −61 ] = [ 31  133 ] × [ Λ_2 ].

Here the setup is slightly different from the one in Equation (7); I found that this setup was easier to implement. The one used in the program is from the book Coding Theory and Cryptography: The Essentials, Second Edition [5]. Since the first nonzero determinant is encountered when v = 2, there are two errors and therefore two coefficients Λ_1 and Λ_2. These give the error locator polynomial, where the entries are in reversed order,

1 + Λ_2 x + Λ_1 x^2 = 1 + 240x + 129x^2.

From here the zeros of the error locator polynomial are found using the Chien search; this gives the error position numbers 235 and 229. This means that there are errors in the positions corresponding to x^235 and x^229, respectively.
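This run can be reproduced with a short sketch (mod 257, α = 3; the 2 × 2 system is solved with Cramer's rule, and the root search mirrors the Chien search):

```python
p, alpha = 257, 3
S = [17, 31, 133, 61, 71, 20, 176, 155]

# Solve [[S1, S2], [S2, S3]] (L1, L2)^T = (-S3, -S4)^T mod p by Cramer's rule.
det = (S[0] * S[2] - S[1] * S[1]) % p
inv_det = pow(det, p - 2, p)               # Fermat inverse, p prime
L1 = ((-S[2]) * S[2] - S[1] * (-S[3])) * inv_det % p
L2 = (S[0] * (-S[3]) - (-S[2]) * S[1]) * inv_det % p
lam = [1, L2, L1]                          # reversed order: 1 + L2*x + L1*x^2
print(lam)                                 # prints [1, 240, 129]

# Chien-style search: j is an error position when Lambda(alpha^(-j)) = 0;
# j = 0, ..., 255 runs through every non-zero field element once.
inv_alpha = pow(alpha, p - 2, p)
positions = sorted(j for j in range(256)
                   if sum(c * pow(pow(inv_alpha, j, p), k, p)
                          for k, c in enumerate(lam)) % p == 0)
print(positions)                           # prints [229, 235]
```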

For the Berlekamp-Massey algorithm the error locator polynomial is estimated at every step, eventually arriving at the correct one. The initial values are set up as in the section on the Berlekamp-Massey algorithm; the first iteration, i = 1, gives

q_i(x) = 256 + 120x + 113x^2 + 62x^3 + 98x^4 + 93x^5 + 247x^6 + 192x^7,
p_i(x) = x^7 + 240x^8,

where p_i(x) is the guessed error locator polynomial. When i = 2t, the list of estimated error locator polynomials is as follows, where the last estimate is the actual error locator polynomial. Computing the error positions in the same way as for the direct method then gives the same error position numbers, since the error locator polynomial is the same in both cases; hence there are again errors in the positions corresponding to x^235 and x^229, respectively.

5 Result

The purpose of the Mathematica program is to compare the two different approaches to finding the error positions, that is, the direct method and the


i        estimated polynomial
1        x^7 + 240x^8
2        x^6 + 240x^7 + x^8
3        x^5 + 240x^6 + 129x^7
4        x^4 + 240x^5 + 129x^6
5        x^3 + 240x^4 + 129x^5
6        x^2 + 240x^3 + 129x^4
7        x + 240x^2 + 129x^3
8 = 2t   1 + 240x + 129x^2

Number of errors   Direct method   Berlekamp-Massey
1                  0.457           0.450
1                  0.440           0.452
1                  0.440           0.440
1                  0.453           0.431
2                  0.437           0.434
2                  0.440           0.440
2                  0.520           0.500
2                  0.520           0.470
Random             0.550           0.500
Random             0.310           0.473
Random             0.442           0.435
Random             0.440           0.440

Table 2: A quick overview of some runs of the Mathematica program; the timing is measured in seconds.

Berlekamp-Massey algorithm. The testing uses the RepeatedTiming command in Mathematica, which gives a trimmed mean of the obtained timings (the lower and upper quartiles are discarded) and executes at least four evaluations. The command gives the average time in seconds it takes to evaluate some calculation, in this case the different decoding methods. Some of the results are presented in Table 2, where both methods are evaluated on the same errors in each row. The Berlekamp-Massey algorithm beats the direct method for 2 or 3 errors almost every time. When only one error is introduced they are almost identical in the time it takes to find the error position. When a mixture of errors is introduced,


there are different results depending on the number of errors in the different blocks of code. One thing that stands out is that the Berlekamp-Massey algorithm stays between about 0.43 and 0.50 seconds, seemingly independent of the number of errors, whereas the direct method is quite dependent on the number of errors. In the rows with a random number of errors where the direct method beats the Berlekamp-Massey algorithm, there were some blocks with 0 errors as well as single errors, which made the direct method faster.

6 Conclusion

For codes where more errors are possible, the direct method needs to calculate determinants of larger matrices to find the number of errors, and then potentially solve a large linear matrix equation to find the error locations. The complexity of the direct method increases rapidly, and it cannot be used efficiently for correcting many errors. For the Berlekamp-Massey algorithm the complexity increases more slowly, which gives the possibility to correct more errors without losing efficiency.


References

[1] Beachy, J. A.; Blair, W. D. Abstract Algebra, Third Edition, Waveland Press, Inc., Long Grove, Illinois, 2006. Chapter 3 and page 459.

[2] Berlekamp, E. Algebraic Coding Theory, World Scientific Publishing Co. Pte. Ltd., Singapore, 2015.

[3] Clarke, C. K. P. Reed-Solomon Error Correction, BBC Research and Development, 2002. PDF.

[4] Cormen, T. H.; Leiserson, C. E.; Rivest, R. L.; Stein, C. Introduction to Algorithms (2nd ed.), MIT Press, McGraw-Hill, 2001. "Chapter 34: NP-Completeness".

[5] Hankerson, D. R.; Hoffman, D. G.; Leonard, D. A.; Lindner, C. C.; Phelps, K. T.; Rodger, C. A.; Wall, J. R. Coding Theory and Cryptography: The Essentials, Second Edition, Revised and Expanded, CRC Press, Boca Raton, 2000.

[6] Huffman, W. C.; Pless, V. Fundamentals of Error-Correcting Codes, Cambridge University Press, New York, 2003. Pages 89 and 151.

[7] Reed, I. S.; Solomon, G. "Polynomial Codes over Certain Finite Fields", J. Soc. Indust. Appl. Math., Vol. 8, No. 2, June 1960.

[8] Trappe, W.; Washington, L. C. Introduction to Cryptography with Coding Theory, Second Edition, Pearson Education, Inc., Upper Saddle River, 2006. "Chapter 18: Error Correcting Codes".

[9] Weisstein, Eric W. "Error-Correcting Code." From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/Error-CorrectingCode.html


A

The Mathematica Code

ClearAll [ x ] ;(∗The t e x t i s from the book A Study In S c a r l e t by Arthur Conan Doyle \t h a t i s about S h e r l o c k Holmes . The t e x t does on ly c o n s i s t s o f \Lowercase l e t t e r s and b lank spaces . ∗)t ex t = ” i con s id e r that a mans bra in o r i g i n a l l y i s l i k e a l i t t l e \empty a t t i c and you have to s tock i t with such f u r n i t u r e as you \choose a f o o l takes in a l l the lumber o f every s o r t that he comes \a c ro s s so that the knowledge which might be u s e f u l to him ge t s \crowded out or at bes t i s jumbled up with a l o t o f other th ing s so \that he has a d i f f i c u l t y in l ay ing h i s hands upon i t now the s k i l f u l \workman i s very c a r e f u l indeed as to what he takes in to h i s bra in \a t t i c he w i l l have nothing but the t o o l s which may help him in doing \h i s work but o f the se he has a l a r g e assortment and a l l in the most \p e r f e c t order i t i s a mistake to th ink that that l i t t l e room has \e l a s t i c wa l l s and can d i s t end to any extent depend upon i t the re \comes a time when f o r every add i t i on o f knowledge you f o r g e t \something that you knew be f o r e i t i s o f the h i ghe s t importance \t h e r e f o r e not to have u s e l e s s f a c t s e lbowing out the u s e f u l ones ” ;(∗ Construc t ing the (n , k ) RS code where n=2ˆm −1=255 where n i s the \code word wi th n symbols , k=239 where k i s the o r i g i n a l message wi th \k symbols , each symbol c o n s i s t s o f m=8 b i t s . There are n−k=2t p a r i t y \check symbols and t symbol e r r o r s t h a t can be c o r r e c t e d in a b l o c k . ∗)\

n = 257; (*first prime large enough to cover 8-bit symbols*)
k = 249;
code = reedSolomonCode[text, n, k];
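For a reader without Mathematica at hand, the relationship between the parameters above can be checked with a short Python sketch (the helper name `is_prime` is my own, not part of the thesis code):

```python
def is_prime(m: int) -> bool:
    """Trial-division primality test (sufficient for small n)."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

n, k = 257, 249            # the parameters chosen above
t = (n - k) // 2           # n - k = 2t parity symbols -> t correctable errors

assert is_prime(n) and n > 2**8   # every 8-bit character code fits in GF(n)
print(t)                          # prints: 4
```

So each 257-symbol block carries 249 information symbols and can correct up to 4 symbol errors.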

(*-----------------------------------------------------------*)
(*Introducing some errors*)

codeWithError = {};
For[i = 1, i < Length[code] + 1, i++,
  AppendTo[codeWithError, code[[i]].x^Range[0, Length[code[[i]]] - 1]];


  numberOfError = RandomInteger[Floor[(n - k - 1)/2]];
  Print[numberOfError, " errors are introduced"];
  For[j = 0, j < numberOfError, j++,

   randomError = x^RandomInteger[n - 1]; (*positions 0 .. n - 1*)
   codeWithError[[i]] = codeWithError[[i]] + randomError;
   (*Print[randomError];*)

  ];
];
Print["**********************"];
(*-----------------------------------------------------------*)
(*Direct method*)
Print["Direct Method"];
ClearAll[x, syndromeValues, coeffErrorLocator, syndromeMatrix,
  syndromeMatrixAnw, errorPosition];
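The corruption step that the loop above performs can be mimicked outside Mathematica as well; the following Python sketch (the function name is hypothetical) flips at most t symbols of a codeword over GF(p):

```python
import random

def add_random_errors(codeword, t, p, rng):
    """Return a copy of codeword with at most t random symbol errors mod p."""
    corrupted = list(codeword)
    for _ in range(rng.randint(0, t)):          # 0 .. t errors, as in the listing
        pos = rng.randrange(len(corrupted))     # random error position
        corrupted[pos] = (corrupted[pos] + rng.randrange(1, p)) % p
    return corrupted

rng = random.Random(2020)
received = add_random_errors([0] * 257, 4, 257, rng)
print(sum(1 for s in received if s != 0))       # number of corrupted positions
```

Starting from the all-zero codeword makes the corrupted positions directly visible in the output.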

RepeatedTiming[
 For[a = 1, a < Length[codeWithError] + 1, a++,

  (*Computing the syndrome values for the current transmitted vector.
  Each syndrome value is obtained by substituting x with the primitive
  root and its powers that build the generator polynomial.*)

  syndromeValues = {};
  For[i = 1, i < n - k + 1, i++,

   x = PrimitiveRoot[n, 1]^i;
   AppendTo[syndromeValues, Mod[codeWithError[[a]], n]];

  ];
  ClearAll[x];
  (*If all syndromes are zero, the received symbols form a codeword and
  no error exists*)
  t = Floor[(n - k)/2];
  If[Total[syndromeValues] != 0,
   v;
   coeffErrorLocator;

   (*Find the actual number of errors*)
   For[j = t, j > 0, j--,

    syndromeMatrix =
     Table[syndromeValues[[i + l - 1]], {i, j}, {l, j}];


    syndromeMatrixAnw = Table[-syndromeValues[[i + j]], {i, j}];

    If[Mod[Det[syndromeMatrix], n] != 0,
     coeffErrorLocator =
      LinearSolve[syndromeMatrix, syndromeMatrixAnw, Modulus -> n];

     v = j; (*actual number of errors*)
     j = 0;

    ];
   ];

   (*Find the places of the errors*)
   errorLocatorPolynomial =
    1 + Reverse[coeffErrorLocator].x^Range[1, v];

   errorPosition = {};
   (*Solving the error locator polynomial using the Chien search*)
   For[i = 0, i < n, i++,

    x = PrimitiveRoot[n, 1]^i;
    If[Mod[errorLocatorPolynomial, n] == 0,
     AppendTo[errorPosition, Mod[-i - 1, n]];

    ];
   ];

   Print["errors are in positions ", errorPosition];
  ];
 ];
];
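Since RepeatedTiming only reports running times, it may help to see the direct method on a toy example. The sketch below is my own illustration (not part of the thesis code): it works over GF(7) with primitive root 3 and a single parity pair, so one symbol error can be located and its magnitude recovered.

```python
p, alpha = 7, 3            # toy field GF(7); 3 is a primitive root mod 7

def syndromes(received, num, p, alpha):
    """S_i = r(alpha^i) mod p for i = 1 .. num."""
    return [sum(c * pow(pow(alpha, i, p), j, p)
                for j, c in enumerate(received)) % p
            for i in range(1, num + 1)]

def locate_single_error(S1, S2, p, alpha):
    """Direct solve for one error: S1 = e*alpha^j, S2 = e*alpha^(2j)."""
    X1 = S2 * pow(S1, -1, p) % p          # X1 = alpha^position
    e = S1 * S1 * pow(S2, -1, p) % p      # error magnitude
    for j in range(p - 1):                # Chien-style search for the exponent
        if pow(alpha, j, p) == X1:
            return j, e

codeword = [6, 2, 1, 0, 0, 0, 0]          # g(x) = 6 + 2x + x^2 as a codeword
received = codeword[:]
received[4] = (received[4] + 1) % p       # one symbol error at position 4

S1, S2 = syndromes(received, 2, p, alpha)
pos, mag = locate_single_error(S1, S2, p, alpha)
print(pos, mag)                           # prints: 4 1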

Print["**********************"];
(*---------------------------------------------------------------*)
(*Berlekamp's algorithm: successive approximation of the error
locator polynomial.*)
Print["Berlekamp's Method"];
ClearAll[x];

RepeatedTiming[
 For[a = 1, a < Length[codeWithError] + 1, a++,

  (*Computing the syndrome values for the current transmitted vector.
  Each syndrome value is obtained by substituting x with the primitive
  root and its powers that build the generator polynomial.*)
  syndromeValues = {};
  For[i = 1, i < n - k + 1, i++,

   x = PrimitiveRoot[n, 1]^i;
   AppendTo[syndromeValues, Mod[codeWithError[[a]], n]];

  ];

  ClearAll[x];
  (*If all syndromes are zero, the received symbols form a codeword and
  no error exists*)

  t = Floor[(n - k)/2];
  If[Total[syndromeValues] != 0,

   qlist = {};
   plist = {};
   dlist = {-1, 0};
   zlist = {0, 1};

   AppendTo[qlist, 1 + syndromeValues.x^Range[1, 2 t]];
   AppendTo[qlist, syndromeValues.x^Range[0, 2 t - 1]];
   AppendTo[plist, x^(2 t + 1)];
   AppendTo[plist, x^(2 t)];

   (*list position 1 holds step -1, position 2 holds step 0*)
   For[i = 1, i < 2 t + 1, i++,

    If[CoefficientList[qlist[[i - 1 + 2]], x][[1]] == 0,
     AppendTo[qlist, PolynomialMod[qlist[[i - 1 + 2]]/x, n]];
     AppendTo[plist, PolynomialMod[plist[[i - 1 + 2]]/x, n]];
     AppendTo[dlist, 2 + dlist[[i - 1 + 2]]];
     AppendTo[zlist, zlist[[i - 1 + 2]]];

     ,
     AppendTo[qlist,

      PolynomialMod[(qlist[[i - 1 + 2]] - (CoefficientList[
            qlist[[i - 1 + 2]], x][[1]]/
           CoefficientList[qlist[[zlist[[i - 1 + 2]]]], x][[1]])*
         qlist[[zlist[[i - 1 + 2]]]])/x, n]];

     AppendTo[plist,
      PolynomialMod[(plist[[i - 1 + 2]] - (CoefficientList[
            qlist[[i - 1 + 2]], x][[1]]/
           CoefficientList[qlist[[zlist[[i - 1 + 2]]]], x][[1]])*
         plist[[zlist[[i - 1 + 2]]]])/x, n]];

     AppendTo[dlist,
      2 + Min[dlist[[i - 1 + 2]], dlist[[zlist[[i - 1 + 2]]]]]];

     If[dlist[[i - 1 + 2]] >= dlist[[zlist[[i - 1 + 2]]]],
      AppendTo[zlist, i];

      ,
      AppendTo[zlist, zlist[[i - 1 + 2]]];

     ];
    ];

   ];
  ];

  errorLocatorPolynomialB = plist[[2 t + 2]];

  (*Find the places of the errors*)
  errorPosition = {};

  (*Solving the error locator polynomial using the Chien search*)
  For[i = 0, i < n, i++,

   x = PrimitiveRoot[n, 1]^i;
   If[Mod[errorLocatorPolynomialB, n] == 0,
    AppendTo[errorPosition, Mod[-i - 1, n]];

   ];
  ];

  Print["errors are in positions ", errorPosition];
 ];
];
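The q/p/d/z bookkeeping above follows the listing's source. An equivalent and perhaps more familiar formulation of the same idea is the Berlekamp-Massey algorithm; the following Python sketch over a prime field is my own illustration, not a translation of the Mathematica code:

```python
def berlekamp_massey(S, p):
    """Error-locator polynomial for syndrome sequence S over GF(p).

    Returns the ascending coefficient list C with C[0] = 1; the error
    locators are the inverses of the roots of C(x)."""
    C, B = [1], [1]          # current and previous connection polynomials
    L, m, b = 0, 1, 1        # register length, shift, last nonzero discrepancy
    for i in range(len(S)):
        d = S[i]             # discrepancy of the current prediction
        for j in range(1, L + 1):
            d = (d + C[j] * S[i - j]) % p
        if d == 0:
            m += 1
            continue
        T = C[:]
        coef = d * pow(b, -1, p) % p
        if len(B) + m > len(C):
            C += [0] * (len(B) + m - len(C))
        for j, Bj in enumerate(B):
            C[j + m] = (C[j + m] - coef * Bj) % p
        if 2 * L <= i:
            L, B, b, m = i + 1 - L, T, d, 1
        else:
            m += 1
    return C[:L + 1]

# Two errors of value 1 at positions 1 and 2 over GF(7) with alpha = 3:
# syndromes S_i = 3^i + 3^(2i) mod 7 for i = 1..4.
S = [(pow(3, i, 7) + pow(3, 2 * i, 7)) % 7 for i in range(1, 5)]
print(berlekamp_massey(S, 7))   # prints: [1, 2, 6]
```

The result 1 + 2x + 6x^2 factors as (1 - 3x)(1 - 2x) mod 7, whose locators 3 = alpha^1 and 2 = alpha^2 point at the two error positions.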

(*Input: some text as well as the prime n and the dimension k (number
of information symbols) of some code.
Output: the text encoded.
Function: encode some text through the cyclic generator matrix made
from the generator polynomial that is constructed from the smallest
primitive root of n.*)
reedSolomonCode[text_, n_, k_] :=


 Module[{tempText, padX, returnText, GP, coeffG, padG, matrixG,
   message},
  (*From letters to integer representation*)
  tempText = ToCharacterCode[text];
  padX = ToCharacterCode["x"];

  (*Add padding so that the length is divisible by k*)
  If[Mod[Length[tempText], k] != 0,
   For[i = 0, i < k - Mod[Length[tempText], k], i++,
    AppendTo[tempText, padX[[1]]];
   ];
  ];
  message = {};
  (*Divide the text into messages of length k*)
  For[i = 0, i < Length[tempText]/k, i++,
   AppendTo[message, tempText[[i*k + 1 ;; (i*k) + k]]]
  ];

  (*PrimitiveRoot[n, 1] gives the smallest primitive root modulo n,
  that is, a generator of the multiplicative group modulo n*)
  (*Construct the generator polynomial*)
  GP = 1; (*multiplicative identity, so the product below starts correctly*)
  For[i = 1, i < n - k + 1, i++,
   GP = GP (x - PrimitiveRoot[n, 1]^i);
  ];

  GP = PolynomialMod[GP, n];
  (*Construct the generator matrix*)
  coeffG = CoefficientList[GP, x];
  padG = PadRight[coeffG, n];
  matrixG = Table[RotateRight[padG, i], {i, 0, k - 1}];

  returnText = Mod[message.matrixG, n];

  Return[returnText];
  ];
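The same encoding scheme, a generator polynomial built from consecutive powers of a primitive root whose padded coefficients are rotated into a cyclic generator matrix, can be reproduced in a few lines of Python. This sketch of mine uses the toy field GF(7) rather than GF(257):

```python
p, alpha = 7, 3          # prime modulus and a primitive root mod 7
n, k = 7, 5              # toy code: n - k = 2 parity symbols

# g(x) = (x - alpha)(x - alpha^2) over GF(p), ascending coefficients
g = [1]
for i in range(1, n - k + 1):
    root = pow(alpha, i, p)
    # multiply g by (x - root): shift gives the x*g part, then subtract root*g
    g = [(hi - root * lo) % p for hi, lo in zip([0] + g, g + [0])]

# cyclic generator matrix: rows are right-rotations of the padded g
row = g + [0] * (n - len(g))
G = [row[-i:] + row[:-i] if i else row[:] for i in range(k)]

def encode(message, G, p):
    """codeword = message . G  (mod p)"""
    return [sum(m * r[j] for m, r in zip(message, G)) % p
            for j in range(len(G[0]))]

print(encode([1, 0, 0, 0, 0], G, p))   # prints: [6, 2, 1, 0, 0, 0, 0]
```

Encoding the first unit message simply reads off the generator polynomial 6 + 2x + x^2, padded to length n, exactly as the first row of matrixG does in the Mathematica function.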
