The Math Behind the Compact Disc Linear Algebra and Error-Correcting Codes william j. martin. mathematical sciences. wpi wednesday december 3. 2008 fairfield university


The Math Behind the Compact Disc

Linear Algebra and Error-Correcting Codes

william j. martin. mathematical sciences. wpi

wednesday december 3. 2008

fairfield university


04/21/23 W J Martin Mathematical Sciences WPI

How the device works

The compact disc is a complex system incorporating interesting ideas from engineering, physics, CS and math. We will focus only on the mathematics of the error-correction strategy.

For more info on the CD, see Kelin Kuhn’s book “Laser Engineering”.


Borrowed from K J Kuhn’s book “Laser Engineering”


The Pits

Each pit is 0.5 microns wide and 0.83 to 3.56 microns long.

Tracks are separated by 1.6 microns of “land”.

The wavelength of green light is about 0.5 micron.

40 tracks fit under one strand of human hair.


Modelling a Communications Channel

Linear algebra model: r = m+e (vector add.)


Channel with Error Correction


Turn it into an algebra problem!

A number system that the computer can understand: F = { 0, 1 }

Ordinary multiplication

Addition: 1+1=0

Now music is turned into binary vectors!


A bit (or a nibble?) of graph theory

The n-cube is a type of Hamming graph

Vertices are all binary n-tuples

n-tuples are adjacent if they differ in only one coordinate

Nice ‘eigenvalues’!


Binary Vector Spaces

The vectors are all possible binary n-tuples

0 0 1 0 1 1 1 0 1 0 1 1 0 0 0

+

0 0 1 1 1 1 0 0 0 0 0 0 0 0 1

=

0 0 0 1 0 0 1 0 1 0 1 1 0 0 1
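The addition above can be checked with a few lines of Python. This is a minimal sketch; the function name `add_gf2` is made up for illustration and is not part of the talk.

```python
def add_gf2(u, v):
    """Add two binary vectors coordinate by coordinate over F = {0, 1}, where 1 + 1 = 0."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return [(a + b) % 2 for a, b in zip(u, v)]

# The two vectors from the slide.
u = [0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0]
v = [0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

total = add_gf2(u, v)
print(" ".join(map(str, total)))
```

The same function models the channel r = m+e: adding an error vector e to a message m flips exactly the coordinates where e has a 1, and adding e a second time undoes the damage.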


Hamming Distance

The distance between two binary n-tuples x and y is the number of coordinates in which they differ

This is a metric:

$\mathrm{dist}(x, y) \ge 0$, with $\mathrm{dist}(x, y) = 0$ iff $x = y$

$\mathrm{dist}(x, y) = \mathrm{dist}(y, x)$

Triangle inequality: $\mathrm{dist}(x, z) \le \mathrm{dist}(x, y) + \mathrm{dist}(y, z)$

dist( 001100, 001011 ) = 3
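As a quick sketch, the distance can be computed by counting mismatched coordinates. The function name `hamming_distance` is a hypothetical helper for illustration.

```python
def hamming_distance(x, y):
    """Number of coordinates in which two equal-length tuples (or strings) differ."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    return sum(1 for a, b in zip(x, y) if a != b)

# The example from the slide.
d = hamming_distance("001100", "001011")
print(d)
```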


Theorem

Let C (the “code”) be a subset of $F^n$ with minimum distance between any two codewords equal to d.

Then there exists an algorithm which corrects up to t errors per transmitted codeword if and only if $d \ge 2t + 1$.


Proof

If x and y are distinct codewords, then the balls of radius t around them are disjoint: a vector within distance t of both would force dist( x, y ) to be at most 2t, which is less than d. So if the received vector is within distance t of x, it must be at distance > t from any other codeword. So decoding is unique.


A Useful Extension of the Theorem

The above (computationally infeasible) decoding algorithm also correctly recovers from any t symbol errors and any s symbol erasures provided d > 2t+s.

transmit: 0 1 1 2 2 3 0

receive: 0 1 3 3 ? ? ?

(here, t=2 errors and s=3 erasures)
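The counts t and s for the pattern above can be read off mechanically. The sketch below assumes erasures are marked with the string "?"; the helper name `count_errors_and_erasures` is invented for illustration.

```python
def count_errors_and_erasures(sent, received, erasure="?"):
    """Return (t, s): t counts wrong symbols, s counts erased positions."""
    errors = sum(1 for a, b in zip(sent, received) if b != erasure and a != b)
    erasures = sum(1 for b in received if b == erasure)
    return errors, erasures

sent = [0, 1, 1, 2, 2, 3, 0]
received = [0, 1, 3, 3, "?", "?", "?"]

t, s = count_errors_and_erasures(sent, received)
threshold = 2 * t + s  # the theorem needs d > 2t + s
print(t, s, threshold)
```

Note that 2t + s = 7 here, so the condition asks for d of at least 8, which no code of length 7 can reach; the pattern on the slide serves to illustrate the notation.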


Small Example

Let C denote the “rowspace” of the matrix

Then C = { 000000, 110100, 011010, 101110, 001101, 111001, 010111, 100011 } and C has minimum distance 3, so C allows correction of any single-bit error in any transmitted codeword.
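The small example can be reproduced by brute force. The matrix itself is missing from this transcript, so the generator matrix `G` below is an assumption: it was chosen because its row space equals the eight codewords listed. The decoder is the computationally naive nearest-codeword search, shown only as a sketch.

```python
from itertools import product

# ASSUMED generator matrix (the slide's matrix is not in the transcript).
G = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 0, 1, 1, 0, 1],
]

def rowspace(G):
    """All binary linear combinations of the rows of G, as bit strings."""
    n = len(G[0])
    words = set()
    for coeffs in product([0, 1], repeat=len(G)):
        word = [0] * n
        for c, row in zip(coeffs, G):
            if c:
                word = [(a + b) % 2 for a, b in zip(word, row)]
        words.add("".join(map(str, word)))
    return words

def distance(x, y):
    return sum(1 for a, b in zip(x, y) if a != b)

def minimum_distance(code):
    words = sorted(code)
    return min(distance(x, y) for i, x in enumerate(words) for y in words[i + 1:])

def decode_nearest(received, code):
    """Return the codeword closest to the received word (exhaustive search)."""
    return min(sorted(code), key=lambda c: distance(received, c))

C = rowspace(G)
d = minimum_distance(C)
corrected = decode_nearest("110110", C)  # 110100 with its fifth bit flipped
print(sorted(C), d, corrected)
```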


The binary Hamming code

Codewords:

0 0 0 0 0 0 0     1 1 1 1 1 1 1
1 1 0 1 0 0 0     0 0 1 0 1 1 1
0 1 1 0 1 0 0     1 0 0 1 0 1 1
0 0 1 1 0 1 0     1 1 0 0 1 0 1
0 0 0 1 1 0 1     1 1 1 0 0 1 0
1 0 0 0 1 1 0     0 1 1 1 0 0 1
0 1 0 0 0 1 1     1 0 1 1 1 0 0
1 0 1 0 0 0 1     0 1 0 1 1 1 0

Quadratic Residues!

In $\mathbb{Z}_7$ we have

$1^2 = 1$, $6^2 = 1$

$2^2 = 4$, $5^2 = 4$

$3^2 = 2$, $4^2 = 2$
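The squares modulo 7 can be tabulated directly. The second half of this sketch is one possible reading of why the slide mentions quadratic residues: the word with ones in positions 1, 2 and 4 (counting from 0), together with its cyclic shifts, gives the seven weight-3 codewords listed. That interpretation is an assumption, not something stated in the transcript.

```python
p = 7

# x -> x^2 mod 7 for the nonzero elements of Z_7.
squares = {x: (x * x) % p for x in range(1, p)}
residues = sorted(set(squares.values()))

# Indicator vector of the residues, positions numbered 0..6.
indicator = "".join("1" if i in residues else "0" for i in range(p))

# All cyclic shifts of the indicator vector.
shifts = {indicator[k:] + indicator[:k] for k in range(p)}

print(squares, residues, indicator)
```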


The Fano projective plane

Vector Space: $\mathbb{F}_2^3$

“Poynts”: 1-dim. subspaces

“Lynes”: 2-dim. subspaces


All codewords:

0 0 0 0 0 0 0     1 1 1 1 1 1 1
0 0 0 1 1 1 1     1 1 1 0 0 0 0
0 1 1 0 0 1 1     1 0 0 1 1 0 0
0 1 1 1 1 0 0     1 0 0 0 0 1 1
1 0 1 0 1 0 1     0 1 0 1 0 1 0
1 0 1 1 0 1 0     0 1 0 0 1 0 1
1 1 0 0 1 1 0     0 0 1 1 0 0 1
1 1 0 1 0 0 1     0 0 1 0 1 1 0

C = nullsp(H) where
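The matrix H is not present in this transcript. As a sketch, the code below assumes the standard parity-check matrix whose j-th column is the number j written in binary (j = 1, ..., 7); with that assumption, a brute-force search over all 128 binary 7-tuples returns exactly the sixteen codewords listed above.

```python
from itertools import product

# ASSUMED parity-check matrix: column j is j in binary, j = 1..7.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]

def in_nullspace(H, x):
    """True if H x = 0 over F = {0, 1}."""
    return all(sum(h * xi for h, xi in zip(row, x)) % 2 == 0 for row in H)

code = ["".join(map(str, x)) for x in product([0, 1], repeat=7) if in_nullspace(H, x)]
print(len(code))
print(code)
```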


Codes from polynomials

Let’s replace F={0,1} with F={0,1,…,6} (with modular arithmetic). Now consider the vector space F[z] of all polynomials in z with coefficients in F. For any subset N of F, we have a linear transformation

$L: F[z] \to F^N$ via $f(z) \mapsto [\, f(0), f(1), f(2), f(3), f(4), f(5) \,]$

(Here, we use N={0,1,2,3,4,5}.)

This is a Reed-Solomon code.


Polynomials to Codewords

Example:

Let the message be [1, 2, 2] (working mod 7)

Polynomial is $f(z) = z^2 + 2z + 2$

Codeword is [f(0), f(1), f(2), f(3), f(4), f(5)] = [ 2, 5, 3, 3, 5, 2]
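The encoding step is just polynomial evaluation mod 7. A minimal sketch, assuming the message lists coefficients from the highest power down (as the example suggests); the name `rs_encode` is made up for illustration.

```python
def rs_encode(message, points, q):
    """Evaluate the message polynomial at each point, working mod q.

    `message` lists coefficients from the highest degree down to the constant term.
    """
    codeword = []
    for a in points:
        value = 0
        for coeff in message:  # Horner's rule
            value = (value * a + coeff) % q
        codeword.append(value)
    return codeword

# Message [1, 2, 2] means f(z) = z^2 + 2z + 2 over Z_7, evaluated at 0..5.
codeword = rs_encode([1, 2, 2], points=range(6), q=7)
print(codeword)
```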


Reed-Solomon Codes

FACT: Two polynomials of degree less than k having k points of intersection must be equal.

SO: Reed-Solomon code of length n<q and dim k has min. dist. n-k+1
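For the small code used in the example (q = 7, n = 6, k = 3), the minimum distance claim can be confirmed by exhaustive search. This is only a sanity check under those assumed parameters, not a proof; since the code is linear, the minimum distance equals the smallest weight of a nonzero codeword.

```python
from itertools import product

q, n, k = 7, 6, 3  # parameters of the running example

def evaluate(coeffs, a):
    """Evaluate a polynomial (highest-degree coefficient first) at a, mod q."""
    value = 0
    for c in coeffs:
        value = (value * a + c) % q
    return value

# One codeword for every polynomial of degree less than k.
codewords = [
    tuple(evaluate(message, a) for a in range(n))
    for message in product(range(q), repeat=k)
]

# Smallest number of nonzero coordinates among nonzero codewords.
min_weight = min(sum(1 for s in c if s != 0) for c in codewords if any(c))
print(len(codewords), min_weight, n - k + 1)
```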


Compact Disc Parameters

SONY/Philips design (1980)

Music is sampled 44,100 times per second

Each sample consists of 32 bits, representing left and right channel signal magnitude 0 to 65535 (Pulse Code Modulation, PCM)

So chip must process 1,411,200 raw data bits per second

But it gets much worse!


Cross-Interleaved RS Codes

Inner code is a 28-dimensional subspace of a 32-dimensional vector space over a finite field of size 256.

Outer code is a 24-dimensional subspace of a 28-dimensional vector space.

Six 32-bit samples make up a 192-bit frame which is encoded as a 224-bit codeword. (Eventually, codewords have length 588 bits!)


Encoding – The numbers

The codewords from the first code are interleaved into a virtually infinite array of 28 rows of symbols over GF(256).

We pull out 8 binary columns (one symbol) to obtain a 28x8=224-bit frame which is then encoded using another Reed-Solomon code to obtain a codeword of length 256 bits.


Interleaving to disperse errors

Codewords of first code are stacked like bricks

28 rows of vectors over GF(256)

Extract columns and re-encode using second Reed-Solomon code
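To see why reading columns disperses errors, here is a deliberately simplified sketch: a plain block interleaver that writes codewords as rows and reads the array column by column. The CD's actual arrangement is staggered ("stacked like bricks") and is not reproduced here; the toy sizes and symbol labels are made up for illustration.

```python
def interleave(rows):
    """Read a list of equal-length rows column by column into one stream."""
    return [row[j] for j in range(len(rows[0])) for row in rows]

def deinterleave(stream, num_rows):
    """Invert interleave(): every num_rows-th symbol belongs to the same row."""
    return [stream[i::num_rows] for i in range(num_rows)]

# Four toy "codewords" of five labelled symbols each.
rows = [[f"c{i}s{j}" for j in range(5)] for i in range(4)]
stream = interleave(rows)

# A burst wipes out four consecutive symbols of the stream.
burst = set(stream[6:10])
hits_per_codeword = [sum(1 for sym in row if sym in burst) for row in rows]
print(hits_per_codeword)
```

In this toy setting a burst no longer than the number of rows touches each codeword at most once, so a code that corrects a single symbol error per codeword survives it.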


Splitting Odd and Even Bits


Back to the Pits

Each pit is 0.5 microns wide and 0.83 to 3.56 microns long.

Tracks are separated by 1.6 microns of “land”.

Not all 01-sequences can be recorded.


EFM: Eight-to-Fourteen Modulation

This encoding scheme can only store sequences where each consecutive pair of ones is separated by at least 2 and at most 10 zeros

This is achieved by a mapping $\mathbb{F}_2^8 \to \mathbb{F}_2^{14}$ which is given by a lookup table.
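The run-length rule itself is easy to test for a given bit string. The sketch below checks only the gaps between consecutive ones, exactly as the rule is stated above; it says nothing about leading or trailing zeros, and it does not reproduce the actual lookup table. The function name is hypothetical.

```python
def satisfies_run_length_rule(bits, min_zeros=2, max_zeros=10):
    """True if every pair of consecutive ones in `bits` (a string of '0'/'1')
    is separated by at least min_zeros and at most max_zeros zeros."""
    ones = [i for i, b in enumerate(bits) if b == "1"]
    return all(
        min_zeros <= (j - i - 1) <= max_zeros
        for i, j in zip(ones, ones[1:])
    )

print(satisfies_run_length_rule("1001"))                 # two zeros between the ones
print(satisfies_run_length_rule("101"))                  # only one zero
print(satisfies_run_length_rule("1" + "0" * 11 + "1"))   # eleven zeros
```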


Further Processing

Three more ‘merge bits’ are added to each of these 14

So 256+8=264=33x8 bits, carrying six samples, or 192 information bits, get encoded as 588 channel bits on the disc

This represents 0.000136 seconds of music


What actually goes on the disc?

We must do this 7,350 times per second

So CD player reads 4,321,800 bits per second of music produced

To get 74 minutes of music, we must store 74x60x4321800 = 19,188,792,000 bits of data on the compact disc!
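The bit budget quoted on these slides follows from a few multiplications. The sketch below uses only figures from the talk; the constant names are made up for readability.

```python
SAMPLE_RATE = 44_100           # samples per second
BITS_PER_SAMPLE = 32           # left and right channel, 16 bits each
SAMPLES_PER_FRAME = 6
CHANNEL_BITS_PER_FRAME = 588   # bits on the disc per frame
MINUTES = 74

raw_bits_per_second = SAMPLE_RATE * BITS_PER_SAMPLE
frames_per_second = SAMPLE_RATE // SAMPLES_PER_FRAME
channel_bits_per_second = frames_per_second * CHANNEL_BITS_PER_FRAME
total_channel_bits = MINUTES * 60 * channel_bits_per_second
seconds_per_frame = SAMPLES_PER_FRAME / SAMPLE_RATE

print(f"{raw_bits_per_second:,} raw data bits per second")
print(f"{frames_per_second:,} frames per second")
print(f"{channel_bits_per_second:,} channel bits per second")
print(f"{total_channel_bits:,} channel bits for {MINUTES} minutes")
print(f"{seconds_per_frame:.6f} seconds of music per frame")
```

Each 192-bit frame of audio therefore occupies 588 bits on the disc, a little over three channel bits per information bit.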


When in doubt, erase

Inner code has minimum distance 5 (over GF(256))

Rather than correct two-symbol errors, the CD just erases the entire received vector.


So…how good is it?

The two Reed-Solomon codes team up to correct ‘burst’ errors of up to 4000 consecutive data bits (2.5 mm scratch on disc)

If signal at time t cannot be recovered, interpolate

With smart data distribution, this allows for recovery from burst errors of up to 12,000 data bits (7.5 mm track length on disc)

If all else fails, mute, giving 0.00028 sec of silence.


Other Applications

Space communications (Mariner, Voyager, etc.)

DVD, CD-R, CD-ROM

Cell phones, internet packets

Memory: chips, hard drives, USB sticks

RAID disk arrays

Quantum computing


The Last Slide

Thank You All!