A Brief Tour of FEC for Serial Link Systems

Prof. Shu Lin, U.C. Davis
Dr. Cathy Liu, Avago
Dr. Michael Steinberger, SiSoft
Outline
• Introduction
• Finite Fields and Vector Spaces
• Linear Block Codes
• Cyclic Codes
• Important Classes of Cyclic Codes
• FEC Applications to Serial Link Systems
• System Performance Estimation with FEC
• Error Correlation Study
Block Diagram of a Serial Link

Information Source → Source Encoder → Channel Encoder → Modulator → Channel → Demodulator → Channel Decoder → Source Decoder → Destination
(noise enters at the channel)

u: message, v: transmitted codeword, r: received vector, v̂: decoded codeword
Encoding

Source Encoder → Channel Encoder → Modulator

u = (u0, u1, …, u_k−1)   message: k single-bit or m-bit symbols
v = (v0, v1, …, v_n−1)   codeword: n symbols

n − k check bits or symbols added → (n, k) block code
Decoding

Demodulator → Channel Decoder → Source Decoder

v = (v0, v1, …, v_n−1)   transmitted codeword
r = (r0, r1, …, r_n−1)   received vector

Hard decision: each r_j ∈ {0, 1}, a binary symmetric channel with
P(r_j = 1 | v_j = 0) = P(r_j = 0 | v_j = 1) = p and
P(r_j = 0 | v_j = 0) = P(r_j = 1 | v_j = 1) = 1 − p

Soft decision: binary input, 8-ary output discrete channel with transition probabilities P(r_j = i | v_j), i = 0, 1, …, 7
Optimal Decoding

Minimize P(E). The optimum decoding rule minimizes P(E | r), i.e., maximizes P(v̂ = v | r), where

P(v | r) = P(v) P(r | v) / P(r)

Optimal decoder concept:
1. Compute P(v | r) for every possible value of v.
2. Choose v̂ to be the value with the largest probability.

Maximum likelihood decoder (all messages equally likely!):
1. Compute P(r | v_j) for every possible codeword v_j.
2. Choose v̂ = v_k, the codeword with the largest probability.
3. Find the message u_k corresponding to v_k.
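The maximum likelihood rule can be made concrete with a small sketch. Assuming a BSC with p < 0.5, maximizing P(r | v) reduces to minimizing Hamming distance; the (6,3) code used here is the example code that appears on a later slide:

```python
from itertools import product

# Generator rows of the (6,3) example code from these slides
G = [(0, 1, 1, 1, 0, 0), (1, 0, 1, 0, 1, 0), (1, 1, 0, 0, 0, 1)]

def encode(u):
    """v = u·G over GF(2)."""
    return tuple(sum(u[i] & G[i][j] for i in range(3)) % 2 for j in range(6))

# All 8 codewords, indexed by message
codebook = {u: encode(u) for u in product((0, 1), repeat=3)}

def ml_decode(r):
    """For a BSC with p < 0.5, maximizing P(r|v) is equivalent to
    minimizing the Hamming distance d(r, v)."""
    return min(codebook, key=lambda u: sum(a != b for a, b in zip(codebook[u], r)))

# Flip one bit of the codeword for message (1,0,1), then decode
v = codebook[(1, 0, 1)]          # (1, 0, 1, 1, 0, 1)
r = (1, 1, 1, 1, 0, 1)           # error in position 1
print(ml_decode(r))              # -> (1, 0, 1)
```

Brute-force search over all 2^k codewords is only practical for tiny codes; the algebraic structure introduced below is what makes decoding tractable at scale.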
Signal to Noise Ratio

Eb: energy per bit, a measure of signal energy
N0: noise spectral density, a measure of noise energy
Eb/N0: a unit-less measure of signal to noise ratio

Noise limited channel: coding gain is the reduction in required Eb/N0 at a given BER.

[Figure: BER (10^-15 to 10^-1) vs. Eb/N0 (5 to 15 dB) for uncoded and coded transmission; the horizontal gap between the curves at a target BER is the coding gain.]
Shannon Capacity Limit

Assume:
• Occupied bandwidth W (use the most restrictive definition possible)
• Transmitted power Ps
• Data rate R_channel

Then define channel capacity**

C ≜ W log2(1 + Ps / (W N0))

For R_channel < C, error-free transmission is theoretically possible; for R_channel > C, there is a nonzero theoretical minimum BER.

Hard to approach the Shannon limit without using FEC.

** Wozencraft and Jacobs, Principles of Communication Engineering, pg. 323, John Wiley and Sons, Inc., copyright 1965
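As a quick sanity check of the capacity formula, a few lines of Python. The bandwidth and SNR numbers are illustrative assumptions, not values from the slides:

```python
import math

def capacity(W, Ps, N0):
    """Shannon capacity C = W * log2(1 + Ps / (W * N0)) in bits/s."""
    return W * math.log2(1 + Ps / (W * N0))

# Illustrative: 10 GHz bandwidth with Ps/(W*N0) = 15 gives log2(16) = 4 b/s/Hz
C = capacity(10e9, 15.0, 1.0 / 10e9)
print(C / 1e9)   # -> 40.0  (Gb/s)
```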
High Speed Serial Channel

• Dispersion limited, not noise limited
• Hard decision decoding, not soft decision decoding
• A different animal
Binary Arithmetic

Addition (XOR):        Multiplication (AND):
0 + 0 = 0              0 · 0 = 0
0 + 1 = 1              0 · 1 = 0
1 + 0 = 1              1 · 0 = 0
1 + 1 = 0              1 · 1 = 1

Operands and results all lie in {0, 1}: the Galois field GF(2).

NOTE: Subtraction = Addition (?!)
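The two operation tables map directly onto single-bit operators; a minimal sketch:

```python
# GF(2): addition is XOR, multiplication is AND.
gf2_add = lambda a, b: a ^ b
gf2_mul = lambda a, b: a & b

assert [gf2_add(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]] == [0, 1, 1, 0]
assert [gf2_mul(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]] == [0, 0, 0, 1]

# Subtraction = addition: every element is its own additive inverse,
# since a + a = 0 for a in {0, 1}.
for a in (0, 1):
    assert gf2_add(a, a) == 0
```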
Finite (Galois) Fields

For any
• p: positive prime integer
• m: positive (non-zero) integer
define p^m symbols with a unique arithmetic:
• Addition (maps back to the same p^m symbols)
• Multiplication (maps back to the same p^m symbols)

Familiar algebraic properties, the same as for real or complex numbers, so we can define vectors and polynomials!

This field is GF(p^m). For p = 2, GF(2^m) uses m-bit symbols (especially m = 1).
• Implement arithmetic using linear feedback shift registers.
• m > 1 is essential for tolerating bursts of errors.
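A minimal sketch of GF(2^m) multiplication, mirroring the shift-and-XOR behavior of a linear feedback shift register. The choice of GF(8) and the primitive polynomial X^3 + X + 1 are illustrative (that same polynomial reappears later as the generator of the (7,4) example code):

```python
def gf2m_mul(a, b, m, prim):
    """Multiply a and b in GF(2^m).  Elements are m-bit integers whose bits
    are polynomial coefficients; 'prim' is the degree-m primitive polynomial.
    The shift-and-XOR loop mirrors a linear feedback shift register."""
    result = 0
    for _ in range(m):
        if b & 1:
            result ^= a        # add this partial product (GF(2) add = XOR)
        b >>= 1
        a <<= 1                # multiply a by X
        if a & (1 << m):       # degree reached m: reduce modulo prim
            a ^= prim
    return result

# GF(8) with primitive polynomial X^3 + X + 1 (0b1011).
# alpha = 0b010 (the element X); alpha^3 should equal X + 1 = 0b011.
alpha = 0b010
a3 = gf2m_mul(gf2m_mul(alpha, alpha, 3, 0b1011), alpha, 3, 0b1011)
print(bin(a3))   # -> 0b11
```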
Vectors

V_n: (a0, a1, …, a_n−1), a_i ∈ GF(2^m)

Addition:
(a0, a1, …, a_n−1) + (b0, b1, …, b_n−1) = (a0 + b0, a1 + b1, …, a_n−1 + b_n−1)
Scalar multiplication:
c (a0, a1, …, a_n−1) = (c·a0, c·a1, …, c·a_n−1)
Inner product:
⟨a, b⟩ = a0·b0 + a1·b1 + ⋯ + a_n−1·b_n−1
Orthogonal:
⟨a, b⟩ = 0
Subspace:
V_k is a subset of V_n such that a ∈ V_k and b ∈ V_k → a + b ∈ V_k

The concept of subspaces is critical to error correction coding.
Linear Block Code

V_n: (a0, a1, …, a_n−1), a_i ∈ GF(2)

Decompose V_n into orthogonal subspaces V_k and V_n−k
(orthogonal: ⟨p, q⟩ = 0 for all p ∈ V_k, q ∈ V_n−k), so any r ∈ V_n can be written r = p + q with p ∈ V_k, q ∈ V_n−k.

Codewords of the (n, k) Linear Block Code form V_k.
• Linear: includes all linear combinations within V_k
• Block: each codeword is a block of n bits or n m-bit symbols

rate = k / n
Binary Linear Block Code
Encoding

(n, k) linear block code C: V_k is generated by k basis vectors g0, g1, …, g_k−1. These vectors can be organized into the generator matrix

        | g0    |   | g_0,0    g_0,1    ⋯  g_0,n−1    |
    G = | g1    | = | g_1,0    g_1,1    ⋯  g_1,n−1    |
        | ⋮     |   | ⋮        ⋮        ⋱  ⋮          |
        | g_k−1 |   | g_k−1,0  g_k−1,1  ⋯  g_k−1,n−1  |

message u = (u0, u1, …, u_k−1) produces
codeword v = u·G = u0 g0 + u1 g1 + ⋯ + u_k−1 g_k−1,  v ∈ V_k
Linear Systematic Block Code

Systematic: v = (v0, v1, …, v_n−k−1, u0, u1, …, u_k−1)

The first n − k symbols are the parity check: linear combinations of the message bits/symbols, used to check for errors (and correct if possible). The last k symbols are the unaltered message.

Linear Block Code + Systematic = Linear Systematic Block Code
Parity-Check Matrix

V_n−k is generated by n − k basis vectors h0, h1, …, h_n−k−1. These vectors can be organized into the parity-check matrix

        | h0      |   | h_0,0      h_0,1      ⋯  h_0,n−1      |
    H = | h1      | = | h_1,0      h_1,1      ⋯  h_1,n−1      |
        | ⋮       |   | ⋮          ⋮          ⋱  ⋮            |
        | h_n−k−1 |   | h_n−k−1,0  h_n−k−1,1  ⋯  h_n−k−1,n−1  |

Since v ∈ V_k, ⟨v, h_i⟩ = 0 for all i. (We said the subspaces were orthogonal.) Hence the parity constraint:

v·H^T = (0, 0, …, 0)
Linear Systematic Block Code Example (k = 3, n = 6)

Message (u0, u1, u2)   Codeword (v0, v1, v2, v3, v4, v5)
(000)                  (000000)
(100)                  (011100)
(010)                  (101010)
(110)                  (110110)
(001)                  (110001)
(101)                  (101101)
(011)                  (011011)
(111)                  (000111)
Matrices for Example Code

        | g0 |   | 0 1 1 1 0 0 |
    G = | g1 | = | 1 0 1 0 1 0 |
        | g2 |   | 1 1 0 0 0 1 |

The first n − k = 3 columns of G generate the parity symbols; the last k = 3 columns are the identity (the unaltered message u).

        | 1 0 0 0 1 1 |
    H = | 0 1 0 1 0 1 |
        | 0 0 1 1 1 0 |

Parity equations:
v0 = u1 + u2
v1 = u0 + u2
v2 = u0 + u1
v3 = u0,  v4 = u1,  v5 = u2

Suppose message u = (101):
v = u·G
  = 1·g0 + 0·g1 + 1·g2
  = 1·(011100) + 0·(101010) + 1·(110001)
  = (011100) + (000000) + (110001)
  = (101101)
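A short sketch checking the matrices above: every codeword u·G satisfies the parity constraint v·H^T = 0 as well as the systematic parity equations:

```python
import itertools

G = [[0, 1, 1, 1, 0, 0], [1, 0, 1, 0, 1, 0], [1, 1, 0, 0, 0, 1]]
H = [[1, 0, 0, 0, 1, 1], [0, 1, 0, 1, 0, 1], [0, 0, 1, 1, 1, 0]]

def matvec(u, M, n):
    """Row vector times matrix over GF(2)."""
    return [sum(u[i] * M[i][j] for i in range(len(M))) % 2 for j in range(n)]

# Every codeword v = u·G satisfies the parity constraint v·H^T = 0.
for u in itertools.product((0, 1), repeat=3):
    v = matvec(u, G, 6)
    syndrome = [sum(v[j] * H[i][j] for j in range(6)) % 2 for i in range(3)]
    assert syndrome == [0, 0, 0]
    # Systematic: last k symbols are the unaltered message, and v0 = u1 + u2.
    assert v[3:] == list(u)
    assert v[0] == (u[1] + u[2]) % 2
print("all 8 codewords satisfy v·H^T = 0")
```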
Error Detection

v: transmitted codeword; r: received vector (hard decision)

• r = v, so r ∈ V_k: correct transmission
• r ≠ v and r ∉ V_k: error (detectable)
• r ≠ v but r ∈ V_k: error (undetectable)

Syndrome: s ≜ r·H^T = (s0, s1, …, s_n−k−1)
• s = r·H^T = 0: correct transmission or undetectable error
• s = r·H^T ≠ 0: detectable error
Error Correction
1. Compute the syndrome 𝑠 of the received vector 𝑟 to detect errors.
2. Identify the locations of the errors (the hardest part).
3. Correct the errors.
Identifying Error Locations

Error vector: e ≜ r + v = (e0, e1, …, e_n−1) = (r0 + v0, r1 + v1, …, r_n−1 + v_n−1)
(Remember we noted that subtraction = addition? Here it is.)

e_j = 0 if r_j = v_j;  e_j = 1 if r_j ≠ v_j (an error location)

Estimated error vector: e*
Estimated transmitted codeword: v* = r + e*, v* ∈ V_k

Choose e* to minimize the number of error locations needed to make v* a valid codeword.
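One way to realize "choose the minimum-weight e*" is a syndrome lookup table, sketched here for the (6,3) example code with H as given earlier. Single errors are correctable since d_min = 3:

```python
from itertools import product

H = [[1, 0, 0, 0, 1, 1], [0, 1, 0, 1, 0, 1], [0, 0, 1, 1, 1, 0]]
n = 6

def syndrome(r):
    return tuple(sum(r[j] * H[i][j] for j in range(n)) % 2 for i in range(3))

# Map each syndrome to the minimum-weight error pattern e* that produces it
# (the most likely error pattern on a BSC with p < 0.5).
table = {}
for e in sorted(product((0, 1), repeat=n), key=sum):
    table.setdefault(syndrome(e), e)   # first hit has minimum weight

def correct(r):
    """v* = r + e*, the closest codeword to r."""
    e = table[syndrome(r)]
    return tuple((a + b) % 2 for a, b in zip(r, e))

# Codeword (101101) with a single error in position 2:
print(correct((1, 0, 0, 1, 0, 1)))   # -> (1, 0, 1, 1, 0, 1)
```

The table has 2^(n−k) entries, so this approach only scales to small n − k; algebraic decoders (below) avoid the table entirely.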
Error Correction Capability

Hamming distance d(v, w), v, w ∈ V_n defined over GF(p^m): the number of symbol locations where v and w differ. (Here we apply it to C defined over GF(2).)

Minimum distance of an (n, k) linear block code C:
d_min(C) = min{ d(v, w) : v, w ∈ C, v ≠ w }

Error-correction capability:
t = ⌊(d_min(C) − 1) / 2⌋

(n, k, d_min) linear block code

[Figure: decoding spheres of radius t around codewords v and w; any received vector r within distance t of a codeword decodes to that codeword.]
Cyclic Codes

a = (a0, a1, …, a_n−1), a_i ∈ GF(2^m)
Right shift operator: a^(1) ≜ (a_n−1, a0, …, a_n−2)
Cyclic code C: v ∈ C → v^(1) ∈ C

• Encoding and syndrome computation are implemented using shift registers with simple feedback.
• The inherent algebraic structure enables many implementation options.
Polynomials over GF(2^m)

For each vector (a0, a1, a2, …, a_n−1) there is a corresponding polynomial

a(X) = a0 + a1 X + a2 X^2 + ⋯ + a_n−1 X^(n−1)

(allowing for the unique arithmetic of GF(2^m)). Same arithmetic and algebraic properties as polynomials with real or complex coefficients.

Polynomial version of the right shift operator:
1. Multiply by X:  X a(X) = a0 X + a1 X^2 + a2 X^3 + ⋯ + a_n−2 X^(n−1) + a_n−1 X^n
2. Divide by X^n + 1:  X a(X) = a_n−1 + a0 X + a1 X^2 + ⋯ + a_n−2 X^(n−1) + a_n−1 (X^n + 1)
3. Keep the remainder:  X a(X) = a^(1)(X) + a_n−1 (X^n + 1)

(Remember: in GF(2^m), subtraction = addition.)
Generator Polynomial

g(X) = 1 + g1 X + g2 X^2 + ⋯ + g_(n−k−1) X^(n−k−1) + X^(n−k)

has degree n − k and a non-zero constant term. g(X) is the generator polynomial for the code C, and g(X) is a factor of X^n + 1.

A code polynomial v(X) is in code C iff it has the form

v(X) = a(X) g(X),  degree(a(X)) < k,  so n − k ≤ degree(v(X)) < n for nonzero codewords

In principle, a(X) could be a message. However, the resulting code would not be systematic.
Systematic Encoding

1. Right-shift the message by n − k symbols (that is, multiply by X^(n−k)):

X^(n−k) u(X) = u0 X^(n−k) + u1 X^(n−k+1) + ⋯ + u_(k−1) X^(n−1)

2. Fill in the parity-check symbols in a way that creates a codeword. Divide by g(X) (degree n − k) and keep the remainder b(X) (degree < n − k):

X^(n−k) u(X) = a(X) g(X) + b(X)

b(X) + X^(n−k) u(X) = a(X) g(X) = v(X)
parity-check + shifted message = codeword
Systematic Encoding Circuit

[Figure: (n−k)-stage feedback shift register with registers b0, b1, b2, …, b_(n−k−1) and feedback taps g1, g2, …, g_(n−k−1); a gate shifts the message u in, then the parity-check symbols are shifted out to form the codeword v.]
Example: (7, 4) Cyclic Code

g(X) = X^3 + X + 1
X^7 + 1 = (X^4 + X^2 + X + 1) · g(X)
Message  Codeword   Code Polynomial
(0000)   (0000000)  0 = 0·g(X)
(1000)   (1101000)  1 + X + X^3 = g(X)
(0100)   (0110100)  X + X^2 + X^4 = X g(X)
(1100)   (1011100)  1 + X^2 + X^3 + X^4 = (1 + X) g(X)
(0010)   (1110010)  1 + X + X^2 + X^5 = (1 + X^2) g(X)
(1010)   (0011010)  X^2 + X^3 + X^5 = X^2 g(X)
(0110)   (1000110)  1 + X^4 + X^5 = (1 + X + X^2) g(X)
(1110)   (0101110)  X + X^3 + X^4 + X^5 = (X + X^2) g(X)
(0001)   (1010001)  1 + X^2 + X^6 = (1 + X + X^3) g(X)
(1001)   (0111001)  X + X^2 + X^3 + X^6 = (X + X^3) g(X)
(0101)   (1100101)  1 + X + X^4 + X^6 = (1 + X^3) g(X)
(1101)   (0001101)  X^3 + X^4 + X^6 = X^3 g(X)
(0011)   (0100011)  X + X^5 + X^6 = (X + X^2 + X^3) g(X)
(1011)   (1001011)  1 + X^3 + X^5 + X^6 = (1 + X + X^2 + X^3) g(X)
(0111)   (0010111)  X^2 + X^4 + X^5 + X^6 = (X^2 + X^3) g(X)
(1111)   (1111111)  1 + X + X^2 + X^3 + X^4 + X^5 + X^6 = (1 + X^2 + X^3) g(X)
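The systematic encoding procedure (shift by X^(n−k), divide by g(X), keep the remainder) can be sketched as polynomial long division over GF(2); the results match the table rows above:

```python
def cyclic_encode(u, g, n):
    """Systematic encoding: v(X) = b(X) + X^(n-k) u(X), where b(X) is the
    remainder of X^(n-k) u(X) divided by g(X).  Coefficient lists are
    low-order first, matching the codeword bit order in the table."""
    k = len(u)
    r = n - k                              # number of parity bits
    rem = [0] * r + list(u)                # X^(n-k) * u(X)
    for i in range(n - 1, r - 1, -1):      # polynomial long division over GF(2)
        if rem[i]:
            for j, gj in enumerate(g):
                rem[i - r + j] ^= gj
    return rem[:r] + list(u)               # parity-check symbols, then message

g = [1, 1, 0, 1]                           # g(X) = 1 + X + X^3
print(cyclic_encode([1, 0, 0, 0], g, 7))   # -> [1, 1, 0, 1, 0, 0, 0]
print(cyclic_encode([1, 0, 1, 0], g, 7))   # -> [0, 0, 1, 1, 0, 1, 0]
```

In hardware the same division is the (n−k)-stage feedback shift register of the encoding circuit above; here the loop simply plays the register forward in software.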
Error Detection and Correction

Divide r(X) by g(X):  r(X) = a(X) g(X) + s(X)
(Requires a feedback shift register circuit with n − k flip-flops.)

• If s(X) = 0, assume that r(X) = v(X). (Transmission was correct.)
• If s(X) ≠ 0, an error definitely occurred. Locate and correct the error(s).
Example Error Detection Circuit

g(X) = X^3 + X + 1

[Figure: three-stage feedback shift register implementing division by g(X), with a gated input for r(X) and feedback taps matching the nonzero terms of g(X).]
Important Classes of Cyclic Codes

[Figure: code families arranged by error correction capacity t and by target error type, random errors (e.g., a satellite channel) vs. burst errors (e.g., a scratch in a CD): Hamming, Bose-Chaudhuri-Hocquenghem (BCH), Reed-Solomon, Low-Density Parity-Check (LDPC), and Fire codes.]
Hamming Codes
• For any positive integer m ≥ 3, there exists a (2^m − 1, 2^m − m − 1, 3) Hamming code with minimum distance 3.
• This code is capable of correcting a single error at any location over a block of 2^m − 1 bits.
• Decoding is simple.
BCH Codes
• For any positive integers m ≥ 3 and t < 2^(m−1), there exists a binary cyclic BCH code with the following parameters:
  Length: n = 2^m − 1
  Number of parity-check bits: n − k ≤ mt
  Minimum distance: d_min ≥ 2t + 1
• This code is capable of correcting t or fewer random errors over a span of 2^m − 1 bit positions and is hence called a t-error-correcting BCH code.
Reed-Solomon (RS) Codes
• For any q that is a power of a prime and any t with 1 ≤ t < q, there exists an RS code with code symbols from a finite field GF(q) of order q with the following parameters:
  Length: n = q − 1
  Dimension: k = q − 2t − 1
  Number of parity-check symbols: n − k = 2t
  Minimum distance: d_min = 2t + 1
• If q = 2^m, then each symbol consists of m consecutive bits.
• Corrects up to t symbols in a block:
  – Effective for correcting random errors.
  – Effective for correcting bursts of errors (when multiple errors occur in a single symbol).
• The most commonly used RS code is the (255, 239, 17) code, used in optical and satellite communications and in data storage systems.
Fire Codes
• Optimized for bursts of errors (errors all occur in isolated windows of length l).
• Decoding is very simple; it is called error-trapping decoding.
Low-Density Parity-Check (LDPC) Codes
• An LDPC code over GF(q), a finite field with q elements, is a q-ary linear block code given by the null space of a sparse parity-check matrix H over GF(q).
• An LDPC code is said to be regular if its parity-check matrix H has a constant number of ones in its columns (the column weight) and a constant number of ones in its rows (the row weight). Both weights are typically quite small compared to n.
• LDPC codes are currently the most promising coding technique for approaching the Shannon capacities (or limits) for a wide range of channels.
FEC Applications to Communication Systems

• FEC is used in many dispersion- and noise-limited systems:
  – Single-burst-error-correcting OIF CEI-P Fire code (1604, 1584) and 10GBASE-R QC code (2112, 2080) for 10G serial link systems
  – Reed-Solomon (RS) codes widely applied in telecommunication systems such as 100GBASE-KR4 and KP4
  – Turbo codes in deep space satellite communications
  – Low-Density Parity-Check (LDPC) codes in 10GBASE-T, DVB, WiMAX, disk drive read channels, and the NASA standard code (8176, 7156) used in the NASA Landsat (near-earth satellite communications) and Interface Region Imaging Spectrograph (IRIS) missions
  – Trellis-coded modulation (TCM) in 1000BASE-T
• Recently adopted FEC:
  – Fire code (1604, 1584) – OIF CEI-P
  – QC code (2112, 2080) – 10GBASE-KR
  – RS(528, 514, 7) over GF(2^10) – 100GBASE-KR4
  – RS(544, 514, 15) over GF(2^10) – 100GBASE-KP4
• Applying FEC to a serial link system needs to consider:
  – Coding gain
  – Coding overhead
  – Encoder/decoder latency
  – Encoder/decoder complexity
FEC Applications to Serial Link Systems

• DFE error propagation causes long burst errors and degrades coding gain.
• Hence, burst-error-correcting FEC such as RS codes is preferred.
• Moreover, RS(544, 514, 15) relaxes the raw BER requirement from 1e-15 to 1e-6 for the 100GBASE-KP4 serial link system.

Coding Gain => BER Relaxing

• There is a trade-off between coding gain and the channel loss due to coding overhead:
  – KR4 absorbs the RS(528, 514, 7) overhead; the 25.8 Gb/s link rate remains unchanged.
  – KP4 RS(544, 514, 15) increases the link rate from 25.8 Gb/s to 27.2 Gb/s.

Coding Overhead => Higher Link Rate
[Figure: BER vs. SNR (10 to 17 dB) for uncoded transmission and RS(n, 514, t) codes with t = 1, 2, 3, 4, …, 16; post-FEC BER falls from 10^0 toward 10^-60 as t increases.]
Encoding/Decoding Time => Link Latency

• Encoder: normally takes relatively small latency, proportional to t.
• Decoder:
  – Syndrome Computation (SC): n/p cycles
  – Key Equation Solver (KES): 2t cycles
  – Chien Search and Forney (CSnF): n/p + (1~2) cycles
  (p is the parallelism level of processing in a design)
• Total: about 50 ns to 200 ns for KR4 and KP4 FEC @ 100 Gb/s

[Figure: decoder pipeline, a delay line in parallel with syndrome computation → KES → Chien search/Forney.]
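The cycle counts above can be turned into a rough latency estimate. The parallelism p = 16 symbols/cycle and the 700 MHz clock are illustrative assumptions, not values from the slides:

```python
def rs_decode_cycles(n, t, p):
    """Approximate cycle count from the stage breakdown above:
    syndromes n/p, key equation solver 2t, Chien/Forney n/p + ~2.
    n is in symbols; p is the parallelism level (symbols per cycle)."""
    return n / p + 2 * t + n / p + 2

# Illustrative (assumed): p = 16 symbols/cycle, 700 MHz decoder clock.
for name, n, t in [("KR4 RS(528,514)", 528, 7), ("KP4 RS(544,514)", 544, 15)]:
    cycles = rs_decode_cycles(n, t, 16)
    print(f"{name}: {cycles:.0f} cycles ~ {cycles / 0.7:.0f} ns")
```

With these assumptions the two codes come out near 82 and 100 cycles (roughly 117 ns and 143 ns), consistent with the 50 ns to 200 ns range quoted above.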
Encoder/Decoder Complexity => Cost (Area + Power)

• Decoder complexity is normally proportional to t.
• Approximate decoder area/power split: syndrome computation ~20%, KES ~35-55%, Chien search/Forney ~25-45%.

Code              t   Gates     Relative Area/Power
KR4 RS(528,514)   7   100-150k  1
KP4 RS(544,514)   15  200-350k  2-2.5
System Performance Estimation with FEC
• Random error: BSC model
• Burst error: Gilbert model
• Multinomial statistical model
• PDA and Importance Sampling
Random and Gilbert Burst Error Models

• Random error model: binary symmetric channel with AWGN noise,

  BER_pre = Q(√SNR) = (1/2) erfc(√(SNR/2))

• Gilbert burst error model:
  – Probability of a successive error with a 1-tap DFE (main cursor b0, tap b1):

    p_ep = (1/4) [erfc((b0 − 2 b1) √(SNR/2)) + erfc((b0 + 2 b1) √(SNR/2))]

  – Probability of burst error length k:

    p(bl = k) = (1 − p_ep) p_ep^k,  k ≥ 0

• Then, post-FEC BER can be calculated from p(bl) and the error correction capability of the code.
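A small sketch of the random error model and the geometric burst length distribution. The SNR and p_ep values are illustrative assumptions:

```python
import math

def q(x):
    """Q-function: Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2))

# Pre-FEC BER for the random (AWGN/BSC) model: BER_pre = Q(sqrt(SNR))
snr = 16.0                       # illustrative linear SNR
ber_pre = q(math.sqrt(snr))
print(f"{ber_pre:.2e}")          # -> 3.17e-05

# Gilbert burst model: with successive-error probability p_ep (from the
# erfc expression above, set by the DFE tap), burst length is geometric:
# p(bl = k) = (1 - p_ep) * p_ep**k, summing to 1 over k >= 0.
p_ep = 0.3                       # illustrative value
p_bl = [(1 - p_ep) * p_ep**k for k in range(6)]
assert abs(sum((1 - p_ep) * p_ep**k for k in range(1000)) - 1) < 1e-12
```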
Generic FEC Model - Multinomial Distribution
• A generic FEC model based on the multinomial distribution can be used to calculate post-FEC BER performance.
  – Assume that errors are caused by independent random or burst error events at the output of SerDes detection.
  – Let w_i (i = 1, 2, 3, 4, …) be the probability of having an i-byte (m bits per byte) error event.
  – Then the codeword failure probability is obtained from the multinomial distribution of the event counts k_i, where k = k1 + k2 + k3 + k4 and each k_i ranges from 1 to an upper limit such as t + 1.
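A hedged sketch of how such a multinomial model might be evaluated. The function name, the event slot count N, and the rule that an i-byte event corrupts i symbols are assumptions for illustration, not the authors' exact formula:

```python
from itertools import product
from math import comb, prod

def codeword_failure_prob(w, N, t):
    """Sketch of the multinomial FEC model: w[i] is the probability that a
    given event slot produces an (i+1)-byte error event; N is the number of
    independent event slots per codeword; t is the symbol correction
    capability.  Sums the multinomial probability of every combination of
    event counts whose total corrupted symbols exceeds t."""
    p0 = 1 - sum(w)                              # probability of no event
    fail = 0.0
    # k[i] = number of events of size i+1; cap each count at t+1 as above
    for k in product(range(t + 2), repeat=len(w)):
        symbols = sum((i + 1) * ki for i, ki in enumerate(k))
        events = sum(k)
        if events > N or symbols <= t:
            continue                             # correctable, or impossible
        coeff = 1                                # multinomial coefficient
        rem = N
        for ki in k:
            coeff *= comb(rem, ki)
            rem -= ki
        fail += coeff * prod(wi**ki for wi, ki in zip(w, k)) * p0**(N - events)
    return fail

# Illustrative numbers only: 4 event sizes, 100 slots, t = 2
print(codeword_failure_prob([1e-4, 1e-5, 1e-6, 1e-7], 100, 2))
```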
Error Correlation Study

[Figure: 12.5 Gb/s PRBS source (variable shift register length) → channel → CTLE + DFE receiver, with additive white Gaussian noise added at the receiver.]

Goals:
• BER ~ 10^-5
• Vary correlation to the data pattern
• Vary DFE error propagation
• Determine error correlation
Error Correlation Study Method
• Time domain simulation.
• Simulated 500 million bits for each case.
• For each bit error, recorded:
  – Bit time
  – Surrounding data pattern
  – Previous bit errors less than 64 bits from the current error (the autocorrelation function of the error process)

Bit error log for analysis probe ARX1_Probe:
Time      bit_no   pattern
4.27E-05  533466   1 1 1 0 0 1 0 1 0 0 0 0 0 0 0 1 1 1 1
7.34E-05  917369   0 1 1 0 1 0 1 1 0 0 1 0 1 1 1 1 1 0 1
9.85E-05  1231038  0 1 1 1 0 1 1 1 1 1 1 0 1 0 1 0 0 0 0
1.00E-04  1254936  0 1 1 0 1 1 1 0 0 0 0 0 1 1 0 0 0 0 1
1.07E-04  1343202  0 1 1 1 0 0 0 0 0 1 0 0 1 0 1 0 1 0 1
2.29E-04  2862468  0 1 1 1 1 0 0 0 1 0 1 0 1 0 1 1 1 0 0
2.94E-04  3676209  1 1 1 1 1 0 0 1 0 0 1 0 0 1 0 0 0 0 0
3.09E-04  3858573  0 1 1 0 1 1 1 1 0 0 0 1 1 0 0 1 1 1 0
3.14E-04  3920436  1 1 1 0 0 0 1 1 1 0 0 0 1 1 0 0 0 0 0
3.38E-04  4220051  1 1 1 0 0 0 1 0 1 1 1 1 0 1 0 0 0 1 1
3.64E-04  4545431  0 1 1 1 1 0 0 0 0 0 1 0 0 0 1 1 0 1 1
4.92E-04  6153288  1 1 1 0 1 1 0 0 1 1 1 1 0 0 0 1 0 0 1
6.10E-04  7624993  1 1 1 0 0 1 1 1 0 1 0 1 1 1 0 1 1 1 1
Error Correlation Example

PRBS63, CTLE + DFE (fully adaptive), Rx noise 50 mV rms

Total errors:                 1169
Minimum bits between errors:  1703
Maximum bits between errors:  3237475

There should have been at least twenty errors at distance equal to one. Where did they go?
Pattern Correlation

Q: To what extent were the bits adjacent to the errored bit correlated with the errored bit?

[Figure: correlation coefficient (+1 = perfectly correlated, 0 = uncorrelated, −1 = perfectly anti-correlated) vs. relative bit position (−80 to +20) around the errored bit.]
Error Spacing vs. Equalization

Q: To what extent are errors grouped close together?
A: It depends…

                             PRBS63          PRBS63            PRBS63
                             CTLE+DFE+Noise  CTLE+minimal DFE  DFE only
Total errors                 1169            4815              2987
Minimum bits between errors  1703            70                2
Maximum bits between errors  3237475         928258            1728641

Increasing pattern dependence →
DFE Error Autocorrelation vs. Data Pattern

• Lots of errors close together.
• Significant pattern dependence.
• DFE error propagation is not the primary impairment.

[Figure: error autocorrelation counts vs. bit separation (1 to 63) for PRBS28, PRBS31, PRBS39, PRBS47, and PRBS63.]

         PRBS28  PRBS31  PRBS39  PRBS47  PRBS63
errors   2987    2863    1305    929     454
Importance Sampling

What if you already knew which data patterns (events) were likely to cause errors? Those are the only ones you'd bother to simulate.*

If the full space contains N patterns and you simulate only the M critical patterns, measuring an error probability P_IS(err), then

P(err) = (M/N) P_IS(err)

* If you choose the wrong data patterns, your results are going to be worthless.
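A toy sketch of the importance sampling bookkeeping. The "simulator," the bad-pattern template, and the bit counts are all invented for illustration; the point is the M/N rescaling:

```python
import random

random.seed(1)

# Toy stand-in for a link simulation (assumption, not the slides' simulator):
# a pattern can only error if its 12 constrained bits match a bad template.
N_BITS = 20
BAD_MASK, BAD_VAL = 0xFFF, 0xABC        # 12 constrained bit positions

def simulate(pattern):
    """Pretend time-domain simulation: returns True on a bit error."""
    return (pattern & BAD_MASK) == BAD_VAL and random.random() < 0.5

# Importance sampling: simulate only the M patterns matching the template,
# then scale the measured error probability by M/N.
N = 2 ** N_BITS
M = 2 ** (N_BITS - 12)                  # free bits outside the template
hits = sum(simulate((free << 12) | BAD_VAL) for free in range(M))
p_is = hits / M
p_err = (M / N) * p_is                  # unbiased estimate for the full space
print(p_err)                            # ~ 0.5 * 2**-12 ≈ 1.2e-4
```

The estimate is unbiased only because, by construction, no pattern outside the template can error; that is exactly the footnoted warning about choosing the wrong patterns.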
Distortion Analysis

Invent some really nasty data patterns:

[Figure: single-bit response; the pattern x x x x 0 x x x x 0 1 0 x x x x (x = don't care) excites the worst-case ISI.]

Interleave them!

x x x x 0 x x x x 0 1 0 x x x x
x x x x 1 x x x x 1 0 1 x x x x
x x x x 0 x x x x 0 1 0 x x x x
x x x x 1 x x x 0 1 0 1 0 0 1 0 x 0 1 0 x x x x

Fixing 12 bit positions gives M/N = 2^-12.
Serial Channel Error Correlation Study: Preliminary Conclusions

• Pattern dependence appears to be the primary impairment.
• Use Distortion Analysis to identify the critical data patterns.
• Use Importance Sampling to simulate only the critical data patterns and yet get unbiased results for the real system.

Should we have codes designed specifically for high speed serial channels?