Error Control Coding and Applications
Eric Chen
Computer Science Group, HKr

Internet Banking

• Postgiro
• Bankgiro
• OCR

OCR number (OCR-nummer)?

• A reference number links your payment to the invoice
• What if the OCR number is typed incorrectly? Is the error detected?
• E24.se reported (how is this possible?)
  – Business owner Thomas Hultberg entered the wrong OCR number when paying his tax. Now he is forced to pay the 229 000 kronor once more.

Outline of the presentation

• About me
• Errors and Error Effect
• Error Control Coding
• Personnummer, OCR
• Hamming code
• My Research and Results

About Me

• Education
  – 1978.7-1982.7 Electronics Engineering, Harbin C. U.
  – 1982.7-1986.4 Computer Science, SWJTU
  – 1987.2-1991.4 Electrical Engineering, SWJTU
• Work
  – 1986.1-1993.4 Lecturer, Associate Prof., SWJTU
  – 1993.4-1994.6 Guest researcher, Linköping Uni.
  – 1994.7-present HKr

Information Transmission System

• Source encoding (remove redundancy)
• Encoder (add redundancy)
• Decoder (error detection/correction)
• Source decoding

[Diagram: Information Source → (k-digit) → Transmitter (Encoder) → (n-digit) → Communication Channel, with Noise → (n-digit) → Receiver (Decoder) → (k-digit) → Information Sink]

Introduction to Coding Theory

• Also called channel coding
• The study of methods for efficient and accurate transfer of information
  – Detecting and correcting transmission errors

Errors in digital communications

Errors and Error Effect

• Errors
  – Bits can be flipped: 0 → 1 or 1 → 0
  – Bits can be lost
• Error effect
  – Programs downloaded from the Internet?
  – CD music?
  – Internet banking services?
• Errors must be detected and corrected!

Channel model -- BSC

• Binary symmetric channel
  – p: the bit error probability

[Diagram: 0 → 0 and 1 → 1 each with probability 1 - p; 0 → 1 and 1 → 0 each with probability p]
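
The channel model can be tried out with a short simulation. This is a minimal sketch, assuming bits are flipped independently of one another; the function name `bsc` and the `rng` parameter are illustrative choices, not part of the talk.

```python
import random


def bsc(bits, p, rng=None):
    """Pass bits through a binary symmetric channel with bit error probability p."""
    rng = rng or random.Random()
    # Assumption: each bit is flipped independently with probability p.
    return [b ^ 1 if rng.random() < p else b for b in bits]


sent = [0, 1, 0, 1, 1, 1, 0]
received = bsc(sent, p=0.1, rng=random.Random(1))
errors = sum(s != r for s, r in zip(sent, received))
print(received, errors)
```

With p = 0 the output equals the input, and with p = 1 every bit is inverted, which are handy sanity checks.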

Why Error Control Coding ?

• Bit error rate p
  – p = 1/100000 = 10^-5 for optical disks
  – p = 10^-11 for a fiber link
• Some calculations
  – p = 10^-6
    • Downloading a file of length 10^7 bits → 10 bit errors on average
    • Data rate of 10 Mbps → 10 bit errors every second on average (one every 0.1 s)
  – p = 10^-11 and a data rate of 10 Gbit/s → 1 bit error every 10 seconds!
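
The arithmetic behind these figures is just the bit error rate multiplied by the number of bits. A small sketch, assuming independent errors so that the expected count is p times the number of bits; both function names are hypothetical helpers.

```python
def expected_errors(p, n_bits):
    """Average number of bit errors among n_bits at bit error rate p."""
    return p * n_bits


def seconds_per_error(p, bits_per_second):
    """Average time between bit errors at the given data rate."""
    return 1.0 / (p * bits_per_second)


print(expected_errors(1e-6, 10**7))     # file of 10^7 bits, p = 10^-6
print(seconds_per_error(1e-6, 10e6))    # 10 Mbps link, p = 10^-6
print(seconds_per_error(1e-11, 10e9))   # 10 Gbit/s link, p = 10^-11
```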

Error Control Coding – Principle

• Add additional information, or redundancy, to the data
• Added by the sender, checked by the receiver
• k data digits are encoded to a codeword of n digits
• Code rate r = k / n

[Diagram: k data digits encoded as an n-digit codeword]

Application Example – Swedish personal ID

• 640823-3234 ?
• yy mm dd – nnnP
  – yy mm dd: year, month, day
  – nnn: serial number, odd for male, even for female
  – P: the parity check digit, used for error detection!
• The OCR number uses the same technique

Personal ID Encoding Method

position       1   2   3   4   5   6   7   8   9   10
digit          6   4   0   8   2   3   3   2   3   ?
2 × odd        12  4   0   8   4   3   6   2   6
add 2 digits   3   4   0   8   4   3   6   2   6

sum = 3 + 4 + 0 + 8 + 4 + 3 + 6 + 2 + 6 = 36
take the last digit of the sum: 6
parity check digit = 10 - 6 = 4
640823-3234
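
The steps of this encoding method can be sketched in a few lines. The function name `check_digit` is illustrative, and the final `% 10` is an assumption that goes beyond the slide: it maps a sum ending in 0 to the check digit 0 rather than 10.

```python
def check_digit(first_nine):
    """Parity check digit for the nine digits yymmddnnn."""
    total = 0
    for position, ch in enumerate(first_nine, start=1):
        value = int(ch)
        if position % 2 == 1:          # digits in odd positions are doubled
            value *= 2
            if value > 9:              # add the two digits, e.g. 12 -> 1 + 2 = 3
                value = value // 10 + value % 10
        total += value
    # Assumption: a sum ending in 0 gives check digit 0, not 10.
    return (10 - total % 10) % 10


print(check_digit("640823323"))
```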

Personal ID Error Detection

640823-3234 → 460823-3234 ?

position       1   2   3   4   5   6   7   8   9   10
digit          4   6   0   8   2   3   3   2   3   ?
2 × odd        8   6   0   8   4   3   6   2   6
add 2 digits   8   6   0   8   4   3   6   2   6

sum = 8 + 6 + 0 + 8 + 4 + 3 + 6 + 2 + 6 = 43
take the last digit of the sum: 3
parity check digit = 10 - 3 = 7
It is not equal to 4: error in the number!
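
A receiver can run the same check over all ten digits at once: the number is accepted only if the weighted sum, including the check digit, ends in 0. This is a sketch under that reading of the method; `is_valid` is a hypothetical name, and subtracting 9 is used as a shortcut for adding the two digits of a doubled value.

```python
def is_valid(number):
    """Return True if a ten-digit personal ID passes the parity check."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) != 10:
        return False
    total = 0
    for position, value in enumerate(digits, start=1):
        if position % 2 == 1:          # odd positions are doubled
            value *= 2
            if value > 9:
                value -= 9             # same result as adding the two digits
        total += value
    return total % 10 == 0


print(is_valid("640823-3234"))   # the correct number from the example
print(is_valid("460823-3234"))   # first two digits swapped
```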

Parity Check Applications

• The same coding method has been used for
  – OCR reference number
  – Bankgironummer
  – Organisationsnummer
• Reference
  – http://www.lur.nu/OCR/generera.php

Error Correcting Code – Hamming Code

• Binary Hamming [7, 4] code: k = 4, n = 7
  – Encodes 4 data bits by adding 3 parity bits
  – Can correct any single error
• Encoding
  – a b c d → a b c d x y z
  – where a, b, c, d are information bits and x, y, z are parity check bits

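Hamming Code

• Given a, b, c, d. How to get x, y, z?
  – Place a, b, c, d in the intersections
  – Label the circles by x, y, z
  – Parity checking rule: the sum of each circle is 0 (mod 2)
  – x = a + b + c, y = a + c + d, z = b + c + d

[Figure: three overlapping circles labelled x, y, z; a lies in circles x and y, b in x and z, d in y and z, and c in all three]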
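
The parity rule translates directly into code. A minimal sketch using the three equations x = a + b + c, y = a + c + d, z = b + c + d with addition modulo 2; the function name `hamming74_encode` is an illustrative choice.

```python
def hamming74_encode(a, b, c, d):
    """Append the parity bits x, y, z to the data bits a, b, c, d."""
    x = (a + b + c) % 2
    y = (a + c + d) % 2
    z = (b + c + d) % 2
    return [a, b, c, d, x, y, z]


print(hamming74_encode(0, 1, 0, 1))
```

For the data bits 0101 this yields the codeword 0101 110.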

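Hamming Code Example

• Given a, b, c, d. How to get x, y, z?
  – 0101 → 0101 xyz
  – x = 0 + 1 + 0 = 1, y = 0 + 0 + 1 = 1, z = 1 + 0 + 1 = 0 (mod 2)
  – so the codeword is 0101 110

[Figure: the three-circle diagram filled in with a = 0, b = 1, c = 0, d = 1, giving x = 1, y = 1, z = 0]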

Hamming Code for Error Correction

• 0101 110 sent, 0100 110 received
  – Encode the received data bits 0100: 0100 → 0100 101
  – Compare the recomputed 101 with the received 110: 101 ⊕ 110 = 011, so there is an error
  – Bit d must be in error: it is the only bit that affects exactly y and z
  – Correction: 0101

[Figure: three-circle diagrams of the received word and the reconstructed word]
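
The correction step can be sketched as a table lookup: recompute x, y, z from the received data bits, compare them with the received parity bits, and flip the one bit whose parity pattern matches. The table `ERROR_PATTERNS` and the function name are illustrative; the sketch assumes at most one bit is in error.

```python
# Which of the parity bits (x, y, z) disagree when a single position is flipped.
ERROR_PATTERNS = {
    (1, 1, 0): 0,  # a appears in x and y
    (1, 0, 1): 1,  # b appears in x and z
    (1, 1, 1): 2,  # c appears in x, y and z
    (0, 1, 1): 3,  # d appears in y and z
    (1, 0, 0): 4,  # x itself
    (0, 1, 0): 5,  # y itself
    (0, 0, 1): 6,  # z itself
}


def hamming74_correct(word):
    """Correct at most one flipped bit in a received 7-bit word."""
    a, b, c, d, x, y, z = word
    recomputed = ((a + b + c) % 2, (a + c + d) % 2, (b + c + d) % 2)
    diff = tuple(r ^ p for r, p in zip(recomputed, (x, y, z)))
    corrected = list(word)
    if diff != (0, 0, 0):
        corrected[ERROR_PATTERNS[diff]] ^= 1
    return corrected


print(hamming74_correct([0, 1, 0, 0, 1, 1, 0]))
```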

Error Detecting Codes

• Only detect errors
  – Using a protocol to correct errors:
    • ACK: positive acknowledgement (I got it)
    • NAK: negative acknowledgement (sorry)
• Simple, reliable, high code rate
• Used in data communications

[Diagram: sender → receiver: codeword; receiver → sender: ACK/NAK]
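
The detect-and-retransmit idea can be illustrated with a toy protocol. Everything here is an assumption made for the sketch: a single even parity bit as the error detecting code, a `channel` callable standing in for the link, and a retry limit of five. It is not a description of any particular protocol.

```python
def add_parity(data):
    """Append one even parity bit (an assumed, very simple detecting code)."""
    return data + [sum(data) % 2]


def parity_ok(word):
    return sum(word) % 2 == 0


def send_with_arq(data, channel, max_tries=5):
    """Resend the codeword until the receiver's check passes (ACK)."""
    codeword = add_parity(data)
    for attempt in range(1, max_tries + 1):
        received = channel(codeword)
        if parity_ok(received):        # receiver replies ACK
            return received[:-1], attempt
        # receiver replies NAK, so the sender transmits again
    raise RuntimeError("no ACK received")


calls = {"n": 0}


def flaky_channel(word):
    """Hypothetical channel that corrupts only the first transmission."""
    calls["n"] += 1
    out = list(word)
    if calls["n"] == 1:
        out[0] ^= 1
    return out


delivered, tries = send_with_arq([1, 0, 1, 1], flaky_channel)
print(delivered, tries)
```

A single parity bit only detects an odd number of bit errors, which is why real links use stronger detecting codes.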

Error Correcting Codes

• Detect and correct errors
• No feedback channel required
• Complicated, lower code rate (k/n)
• Used in storage systems (computer storage, CD, DVD) and space communications

[Diagram: sender → receiver: codeword]

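Generator Matrix and Encoding

• Generator matrix G
  – Example: Hamming [7, 4] code
  – x = a + b + c, y = a + c + d, z = b + c + d

        1 0 0 0 1 1 0
    G = 0 1 0 0 1 0 1
        0 0 1 0 1 1 1
        0 0 0 1 0 1 1

  – Encoding: (a, b, c, d) G = (a, b, c, d, x, y, z), the codeword

[Figure: three-circle diagram with a, b, c, d in the intersections and circles labelled x, y, z]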
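
Encoding with a generator matrix is a vector-matrix product over GF(2). A short sketch in plain Python, assuming G is in the systematic form implied by x = a + b + c, y = a + c + d, z = b + c + d (one row per data bit, identity part first); the name `encode` is illustrative.

```python
# Rows correspond to the data bits a, b, c, d; columns to a b c d x y z.
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1, 1],
    [0, 0, 0, 1, 0, 1, 1],
]


def encode(message, generator=G):
    """Multiply the message row vector by the generator matrix, modulo 2."""
    n = len(generator[0])
    return [
        sum(m * row[j] for m, row in zip(message, generator)) % 2
        for j in range(n)
    ]


print(encode([0, 1, 0, 1]))
```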

Parity Check Matrix and Decoding

• Parity check matrix H
  – H G^tr = 0
  – Example: Hamming [7, 4] code

        1 1 1 0 1 0 0
    H = 1 0 1 1 0 1 0
        0 1 1 1 0 0 1

  – Syndrome s: a column vector of length n - k
  – Decoding
    • Received word y: H y^tr = s, the syndrome of y
    • If s = 0, no error is detected
    • Otherwise, there must be errors

[Figure: three-circle diagram with a, b, c, d in the intersections and circles labelled x, y, z]
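
Syndrome computation can be sketched the same way. The matrix below is an assumption: the parity check matrix in the form [P^tr | I] that matches the equations x = a + b + c, y = a + c + d, z = b + c + d, with columns ordered a b c d x y z. For a single error the syndrome equals the column of H at the error position, which `locate_single_error` (a hypothetical helper) uses.

```python
# Rows are the checks for x, y, z; columns are a b c d x y z.
H = [
    [1, 1, 1, 0, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]


def syndrome(word, check=H):
    """Compute s = H * word^tr over GF(2)."""
    return [sum(h * w for h, w in zip(row, word)) % 2 for row in check]


def locate_single_error(word, check=H):
    """Return the index of a single flipped bit, or None if s = 0."""
    s = syndrome(word, check)
    if not any(s):
        return None
    columns = [[row[j] for row in check] for j in range(len(word))]
    return columns.index(s)


print(syndrome([0, 1, 0, 1, 1, 1, 0]))             # a valid codeword
print(locate_single_error([0, 1, 0, 0, 1, 1, 0]))  # bit d (index 3) flipped
```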

Optimal codes

• Distance or d-optimal code
  – A linear [n, k, d]_q code is d-optimal if there does not exist an [n, k, d+1]_q code.
• Length or n-optimal code
  – A linear [n, k, d]_q code is n-optimal if there does not exist an [n-1, k, d]_q code.
• k-optimal code
  – A linear [n, k, d]_q code is k-optimal if there does not exist an [n, k+1, d]_q code.

My Research

• Difference triangle sets
  – A generalization of Golomb rulers
  – 1989-1995
• Majority-logic decodable codes
  – 1988-1992
• Quasi-Twisted codes (e.g. a [112, 13, 48] code)
  – 1989-present
• Two-weight codes and graphs
  – 2006-present

Difference Triangle Set

• Golomb ruler
  – R = {0, 1, 4, 6}
  – Difference triangle:

      1 4 6
      3 5
      2

• Difference triangle sets
  – A generalization of Golomb rulers
  – A set of t Golomb rulers
  – R1 = {0, 6, 11, 13}, R2 = {0, 8, 17, 18}, R3 = {0, 3, 15, 19}
  – Difference triangles:

      6 11 13     8 17 18     3 15 19
      5  7        9 10       12 16
      2           1           4
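
The defining property can be checked mechanically: every pairwise difference, taken over all rulers in the set, must be distinct. A small sketch; the function names are illustrative.

```python
from itertools import combinations


def differences(ruler):
    """All positive differences between pairs of marks on one ruler."""
    return [b - a for a, b in combinations(sorted(ruler), 2)]


def is_difference_triangle_set(rulers):
    """True if no difference occurs twice, within or across the rulers."""
    all_diffs = [d for ruler in rulers for d in differences(ruler)]
    return len(all_diffs) == len(set(all_diffs))


print(is_difference_triangle_set([[0, 1, 4, 6]]))
print(is_difference_triangle_set([[0, 6, 11, 13], [0, 8, 17, 18], [0, 3, 15, 19]]))
```

A single Golomb ruler is the special case t = 1.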

Current Research Interests

• Quasi-Twisted codes
  – Many QT codes are good or optimal
  – Computer constructions
• 2-weight codes
  – Non-zero codewords have weight w1 or w2
  – Related to strongly regular graphs

Computer search for QT Codes

• Given a cyclic weight matrix of order s

        d_0      d_1      d_2      ...  d_{s-1}
        d_{s-1}  d_0      d_1      ...  d_{s-2}
    D = d_{s-2}  d_{s-1}  d_0      ...  d_{s-3}
        ...      ...      ...      ...  ...
        d_1      d_2      d_3      ...  d_0

• How to select p columns such that
  – the minimum row sum over the p columns is maximized
  – the row sums take only the values w1 or w2

Computer search for QT Codes

• Given a cyclic weight matrix of order s (here s = 7)

        2 3 2 3 3 1 2
        2 2 3 2 3 3 1
        1 2 2 3 2 3 3
    D = 3 1 2 2 3 2 3
        3 3 1 2 2 3 2
        2 3 3 1 2 2 3
        3 2 3 3 1 2 2

  – Columns 0, 1, 3 produce a QT code with minimum distance 6
  – The row sums take the values 6 and 8 → a two-weight code is found
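
The column selection in this example can be reproduced with a naive exhaustive search. This is only an illustration, not the search method used in the research: it tries every set of p columns, which is feasible only for small s. `FIRST_ROW` is an assumption, namely the first row of the example matrix D read so that columns 0, 1, 3 give the stated row sums 6 and 8.

```python
from itertools import combinations

# Assumed first row of the order-7 cyclic weight matrix in the example.
FIRST_ROW = [2, 3, 2, 3, 3, 1, 2]


def circulant(first_row):
    """Row i is the first row cyclically shifted i places to the right."""
    s = len(first_row)
    return [[first_row[(j - i) % s] for j in range(s)] for i in range(s)]


def row_sums(matrix, columns):
    """Sum the selected columns within each row."""
    return [sum(row[j] for j in columns) for row in matrix]


def best_columns(matrix, p):
    """Exhaustively pick p columns that maximize the minimum row sum."""
    s = len(matrix)
    return max(
        combinations(range(s), p),
        key=lambda cols: min(row_sums(matrix, cols)),
    )


D = circulant(FIRST_ROW)
print(row_sums(D, (0, 1, 3)))
print(min(row_sums(D, best_columns(D, 3))))
```

Each row of D sums to 16, so for p = 3 the row sums average 48/7 and the minimum cannot exceed 6; the choice of columns 0, 1, 3 therefore attains the best possible minimum for this matrix.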

Publications

• Co-authored books
  – One textbook (VAX-11 assembly language programming)
  – One book (combinatorial coding theory and applications)
• IEEE Trans. Information Theory
  – 8 papers
• IEE Electronics Letters
  – 6 papers
• Designs, Codes and Cryptography
  – 1 paper (DDD: disjoint distinct difference sets)

Online Database on Codes

• A web database of binary quasi-cyclic codes
  – http://moodle.tec.hkr.se/~chen/research/codes/searchqc2.htm
  – See also codetables: http://www.codetables.de
• A web database of two-weight codes
  – http://moodle.tec.hkr.se/~chen/research/2-weight-codes/search.php

References

• Personnummer: http://skatteverket.se/download/18.1e6d5f87115319ffba380001857/70408.pdf
• E24.se: http://www.e24.se/pengar24/dinekonomi/familjeekonomi/artikel_360879.e24
• OCR: http://www.lur.nu/OCR/generera.php