01 - Fountain Codes


What if N is slightly greater than K? Let N = K + E, where E is the small number of excess packets. Our question now is: what is the probability that the random K-by-N binary matrix G contains an invertible K-by-K matrix? Let us call this probability $1 - \delta$, so that $\delta$ is the probability that the receiver will not be able to decode the file when E excess packets have been received. This failure probability $\delta$ is plotted against E for the case K = 100 in Fig. 3 (it looks identical for all $K > 10$). For any K, the probability of failure is bounded above by

$\delta(E) \le 2^{-E}$  (3)

This bound is shown by the thin dotted line in Fig. 3.

In summary, the number of packets required to have probability $1 - \delta$ of success is $K + \log_2(1/\delta)$. The expected encoding cost per packet is K/2 packet operations, since on average half of the packets must be added up (a packet operation is the exclusive-or of two packets of size $l$ bits). The expected decoding cost is the sum of the cost of the matrix inversion, which is about $K^3$ binary operations, and the cost of applying the inverse to the received packets, which is about $K^2/2$ packet operations.
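To make this concrete, here is a minimal simulation of the random linear fountain (a sketch assuming numpy; the function names are ours, not from any library). The encoder XORs random subsets of the source packets; the decoder does Gaussian elimination over GF(2), and fails exactly when G contains no invertible K-by-K submatrix:

```python
import numpy as np

rng = np.random.default_rng(1)

def encode(source, N):
    """Each encoded packet t_n is the mod-2 sum of a uniformly random
    subset of the K source packets, chosen via a random binary matrix G."""
    K = len(source)
    G = rng.integers(0, 2, size=(K, N))
    return (G.T @ source) % 2, G

def decode(packets, G):
    """Gaussian elimination over GF(2); returns None if G has rank < K."""
    K, N = G.shape
    A = np.hstack([G.T, packets])        # N rows of [coefficients | payload]
    for col in range(K):
        pivots = np.nonzero(A[col:, col])[0]
        if len(pivots) == 0:
            return None                  # no invertible K-by-K submatrix
        A[[col, col + pivots[0]]] = A[[col + pivots[0], col]]
        rows = (A[:, col] == 1)
        rows[col] = False
        A[rows] ^= A[col]                # clear the column above and below the pivot
    return A[:K, K:]

K, E, l = 100, 10, 8                     # K source packets of l bits, E excess
source = rng.integers(0, 2, size=(K, l))
packets, G = encode(source, K + E)
decoded = decode(packets, G)
print(np.array_equal(decoded, source))   # True with probability >= 1 - 2**-E
```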

While a random code is not in the technical sense a perfect code for the erasure channel (it has only a chance of 0.289 of recovering the file when K packets have arrived), it is almost perfect. An excess of E packets increases the probability of success to at least $1 - \delta$, where $\delta = 2^{-E}$. Thus, as the file size K increases, random linear fountain codes can get arbitrarily close to the Shannon limit. The only bad news is that their encoding and decoding costs are quadratic and cubic in the number of packets encoded. This scaling is not important if K is small (less than one thousand, say); but we would prefer a solution with lower computational cost.

    4 Intermission

Before we study better fountain codes, it will help to solve the following exercises. Imagine that we throw balls independently at random into K bins, where K is a large number such as 1000 or 10 000.

1. After N = K balls have been thrown, what fraction of the bins do you expect have no balls in them?

2. If we throw three times as many balls as there are bins, is it likely that any bins will be empty? Roughly how many balls must be thrown for it to be likely that every bin has a ball?

3. Show that in order for the probability that all K bins have at least one ball to be $1 - \delta$, we require $N \simeq K \log_e(K/\delta)$ balls.

Rough calculations like these are often best solved by finding expectations instead of probabilities. Instead of finding the probability distribution of the number of empty bins, we find the expected number of empty bins. This is easier because means add, even where random variables are correlated.

The probability that one particular bin is empty after N balls have been thrown is

$\left(1 - \frac{1}{K}\right)^N \simeq e^{-N/K}$  (4)

So when N = K, the probability that one particular bin is empty is roughly 1/e, and the fraction of empty bins must be roughly 1/e too. If we throw a total of 3K balls, the empty fraction drops to $1/e^3$, about 5%. We have to throw a lot of balls to make sure all the bins have a ball! For general N, the expected number of empty bins is

$K e^{-N/K}$  (5)

This expected number is a small number $\delta$ (which roughly implies that the probability that all bins have a ball is $1 - \delta$) only if

$N > K \log_e \frac{K}{\delta}$  (6)
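These answers are easy to check numerically. A quick simulation (a sketch assuming numpy) confirms the 1/e and $1/e^3$ fractions and the scaling of equation (6):

```python
import numpy as np

rng = np.random.default_rng(0)
K = 10_000

def empty_fraction(N, trials=100):
    """Throw N balls into K bins; return the average fraction of empty bins."""
    return np.mean([np.mean(np.bincount(rng.integers(0, K, size=N),
                                        minlength=K) == 0)
                    for _ in range(trials)])

print(empty_fraction(K), 1 / np.e)        # ~0.368: a 1/e fraction is empty
print(empty_fraction(3 * K), np.e ** -3)  # ~0.050: still about 5% empty
delta = 0.01
N = int(K * np.log(K / delta))            # eq. (6)
print(K * np.exp(-N / K))                 # expected number of empty bins ~ delta
```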

    5 The LT code

The LT code retains the good performance of the random linear fountain code, while drastically reducing the encoding and decoding complexities. You can think of the LT code as a sparse random linear fountain code, with a super-cheap approximate decoding algorithm.

5.1 Encoder

Each encoded packet $t_n$ is produced from the source file $s_1, s_2, s_3, \ldots, s_K$ as follows:

1. Randomly choose the degree $d_n$ of the packet from a degree distribution $\rho(d)$; the appropriate choice of $\rho$ depends on the source file size K, as we will discuss later.

2. Choose, uniformly at random, $d_n$ distinct input packets, and set $t_n$ equal to the bitwise sum, modulo 2, of those $d_n$ packets.

This encoding operation defines a graph connecting encoded packets to source packets. If the mean degree $\bar{d}$ is significantly smaller than K then the graph is sparse. We can think of the resulting code as an irregular low-density generator-matrix code.
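In code, one encoding step might look like the following sketch (plain Python; the helper name and the representation of packets as equal-length byte strings are our assumptions, not part of any standard):

```python
import random

def lt_encode_packet(source_packets, sample_degree, rng=random):
    """One LT encoding step: draw a degree d_n from rho(d), pick d_n
    distinct source packets uniformly at random, and XOR them together.
    Returns the chosen indices (the packet's edges) and the payload."""
    d = sample_degree()
    chosen = rng.sample(range(len(source_packets)), d)
    payload = bytearray(source_packets[chosen[0]])
    for k in chosen[1:]:
        for i, byte in enumerate(source_packets[k]):
            payload[i] ^= byte
    return chosen, bytes(payload)
```

In practice the receiver must also learn each packet's set of edges, e.g. by synchronising a pseudorandom generator with the sender, just as the matrix G was assumed known earlier.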

5.2 Decoder

Decoding a sparse-graph code is especially easy in the case of an erasure channel. The decoder's task is to recover s from t = sG, where G is the matrix associated with the graph (just as in the random linear fountain code, we assume the decoder somehow knows the pseudorandom matrix G).

[Fig. 3: Performance of the random linear fountain. The solid line shows the probability that complete decoding is not possible as a function of the number of excess packets, E; the thin dashed line shows the upper bound, $2^{-E}$, on the probability of error. One panel plots the failure probability on a linear scale against 0-10 redundant packets, the other on a logarithmic scale ($10^0$ down to $10^{-7}$) against 0-20.]

The simple way to attempt to solve this problem is by message passing. We can think of the decoding algorithm as

the sum-product algorithm [5, Chaps. 16, 26 and 47] if we wish, but all messages are either completely uncertain or completely certain. Uncertain messages assert that a message packet $s_k$ could have any value, with equal probability; certain messages assert that $s_k$ has a particular value, with probability one.

This simplicity of the messages allows a simple description of the decoding process. We will call the encoded packets $t_n$ check nodes.

1. Find a check node $t_n$ that is connected to only one source packet $s_k$ (if there is no such check node, this decoding algorithm halts at this point, and fails to recover all the source packets).

(a) Set $s_k = t_n$.

(b) Add $s_k$ to all checks $t_{n'}$ that are connected to $s_k$: $t_{n'} := t_{n'} + s_k$ for all $n'$ such that $G_{n'k} = 1$.

(c) Remove all the edges connected to the source packet $s_k$.

2. Repeat (1) until all $s_k$ are determined.
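A minimal sketch of this peeling process (plain Python, in the same hypothetical representation as the encoder above: each check node is a pair of an edge set and a bytearray payload, e.g. built as `(set(chosen), bytearray(payload))` from `lt_encode_packet`):

```python
def lt_decode(checks, K):
    """Peeling decoder.  `checks` is a list of (set_of_source_indices,
    bytearray) pairs.  Returns the K recovered source packets, or None
    if no degree-one check node can be found."""
    source = [None] * K
    while any(s is None for s in source):
        # step 1: find a check node connected to exactly one source packet
        degree_one = next((c for c in checks if len(c[0]) == 1), None)
        if degree_one is None:
            return None                       # decoder halts and fails
        edges, payload = degree_one
        k = edges.pop()
        source[k] = bytes(payload)            # (a) set s_k = t_n
        for other_edges, other_payload in checks:
            if k in other_edges:
                for i, b in enumerate(source[k]):
                    other_payload[i] ^= b     # (b) add s_k to its checks
                other_edges.discard(k)        # (c) remove the edges
    return source
```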

This decoding process is illustrated in Fig. 4 for a toy case where each packet is just one bit. There are three source packets (shown by the upper circles) and four received packets (shown by the lower check symbols), which have the values $t_1, t_2, t_3, t_4 = 1, 0, 1, 1$ at the start of the algorithm.

At the first iteration, the only check node that is connected to a sole source bit is the first check node (panel a). We set that source bit $s_1$ accordingly (panel b), discard the check node, then add the value of $s_1$ (1) to the checks to which it is connected (panel c), disconnecting $s_1$ from the graph. At the start of the second iteration (panel c), the fourth check node is connected to a sole source bit, $s_2$. We set $s_2$ to $t_4$ (0, in panel d), and add $s_2$ to the two checks it is connected to (panel e). Finally, we find that two check nodes are both connected to $s_3$, and they agree about the value of $s_3$ (as we would hope!), which is restored in panel f.

5.3 Designing the degree distribution

The probability distribution $\rho(d)$ of the degree is a critical part of the design: occasional encoded packets must have high degree (i.e., $d$ similar to K) in order to ensure that there are not some source packets that are connected to no-one. Many packets must have low degree, so that the decoding process can get started, and keep going, and so that the total number of addition operations involved in the encoding and decoding is kept small. For a given degree distribution $\rho(d)$, the statistics of the decoding process can be predicted by an appropriate version of density evolution, a technique first developed for low-density parity-check codes [5, p. 566].

Before giving Luby's choice for $\rho(d)$, let us think about the rough properties that a satisfactory $\rho(d)$ must have. The encoding and decoding complexity are both going to scale linearly with the number of edges in the graph, so the crucial quantity is the average degree of the packets. How small can this be? The balls-in-bins exercise helps here: think of the edges that we create as the balls and the source packets as the bins. In order for decoding to be successful, every source packet must surely have at least one edge in it. The encoder throws edges into source packets at random, so the number of edges must be at least of order $K \log_e K$. If the number of packets received is close to Shannon's optimal K, and decoding is possible, the average degree of each packet must be at least $\log_e K$, and the encoding and decoding complexity of an LT code will definitely be at least $K \log_e K$. Luby showed that this bound on complexity can indeed be achieved by a careful choice of degree distribution.

Ideally, to avoid redundancy, we would like the received graph to have the property that just one check node has degree one at each iteration. At each iteration, when this check node is processed, the degrees in the graph are reduced in such a way that one new degree-one check node appears. In expectation, this ideal behaviour is achieved by the ideal soliton distribution,

$\rho(1) = 1/K$
$\rho(d) = \frac{1}{d(d-1)}$ for $d = 2, 3, \ldots, K$  (7)

The expected degree under this distribution is roughly $\log_e K$.
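(A two-line check, assuming numpy: the telescoping sum $\sum_{d=2}^{K} \frac{1}{d(d-1)} = 1 - 1/K$ makes $\rho$ normalised, and its mean comes out close to $\log_e K$.)

```python
import numpy as np

K = 10_000
d = np.arange(1, K + 1, dtype=float)
rho = np.empty(K)
rho[0] = 1 / K
rho[1:] = 1 / (d[1:] * (d[1:] - 1))  # eq. (7)
print(rho.sum())                     # 1.0: rho is a proper distribution
print((d * rho).sum(), np.log(K))    # ~9.79 vs ~9.21: roughly log_e K
```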

This degree distribution works poorly in practice, because fluctuations around the expected behaviour make it very likely that at some point in the decoding process there will be no degree-one check nodes; and, furthermore, a few source nodes will receive no connections at all. A small modification fixes these problems.

[Fig. 4: Example decoding for a fountain code with K = 3 source bits and N = 4 encoded bits. From [5].]

The robust soliton distribution has two extra parameters, c and $\delta$; it is designed to ensure that the expected number of degree-one checks is about

$S \equiv c \log_e(K/\delta) \sqrt{K}$  (8)

rather than 1, throughout the decoding process. The parameter $\delta$ is a bound on the probability that the decoding

fails to run to completion after a certain number $K'$ of packets have been received. The parameter c is a constant of order 1, if our aim is to prove Luby's main theorem about LT codes; in practice however it can be viewed as a free parameter, with a value somewhat smaller than 1 giving good results. We define a positive function

$\tau(d) = \begin{cases} \frac{S}{K}\frac{1}{d} & \text{for } d = 1, 2, \ldots, (K/S) - 1 \\ \frac{S}{K}\log_e(S/\delta) & \text{for } d = K/S \\ 0 & \text{for } d > K/S \end{cases}$  (9)

(see Fig. 5) then add the ideal soliton distribution $\rho$ to $\tau$ and normalise to obtain the robust soliton distribution, $\mu$:

$\mu(d) = \frac{\rho(d) + \tau(d)}{Z}$  (10)

where $Z = \sum_d \rho(d) + \tau(d)$. The number of encoded packets required at the receiving end to ensure that the decoding can run to completion, with probability at least $1 - \delta$, is $K' = KZ$.
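Equations (7)-(10) translate directly into code. A sketch (assuming numpy) that reproduces the quantities quoted in the caption of Fig. 5:

```python
import numpy as np

def robust_soliton(K, c, delta):
    """Return (mu, Z, S): the robust soliton distribution over d = 1..K,
    its normaliser Z, and the expected number of degree-one checks S."""
    d = np.arange(1, K + 1, dtype=float)
    rho = np.empty(K)
    rho[0] = 1 / K
    rho[1:] = 1 / (d[1:] * (d[1:] - 1))           # eq. (7)
    S = c * np.log(K / delta) * np.sqrt(K)        # eq. (8)
    spike = int(round(K / S))
    tau = np.zeros(K)
    tau[:spike - 1] = S / (K * d[:spike - 1])     # eq. (9), d < K/S
    tau[spike - 1] = (S / K) * np.log(S / delta)  # eq. (9), d = K/S
    Z = (rho + tau).sum()
    return (rho + tau) / Z, Z, S

mu, Z, S = robust_soliton(K=10_000, c=0.2, delta=0.05)
print(round(S), round(10_000 / S), round(Z, 2))   # 244, 41, 1.31 (cf. Fig. 5)
print(round(10_000 * Z))                          # K' = KZ ~ 1.31 K packets
```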

Luby's analysis [3] explains how the small-d end of $\tau$ has the role of ensuring that the decoding process gets started, and the spike in $\tau$ at d = K/S is included to ensure that every source packet is likely to be connected to a check at least once. Luby's key result is that (for an appropriate value of the constant c) receiving $K' = K + 2 \log_e(S/\delta) S$ checks ensures that all packets can be recovered with probability at least $1 - \delta$. In the illustrative figures (Figs. 6a and b) the allowable decoder failure probability $\delta$ has been set quite large, because the actual failure probability is much smaller than is suggested by Luby's conservative analysis.

In practice, LT codes can be tuned so that a file of original size K = 10 000 packets is recovered with an overhead of about 5%. Figure 7 shows histograms of the actual number of packets required for a couple of settings of the parameters, achieving mean overheads smaller than 5% and 10% respectively. Figure 8 shows the time-courses of three decoding runs. It is characteristic of a good LT code that very little decoding is possible until slightly more than K packets have been received, at which point an avalanche of decoding takes place.

    6 Raptor codes

You might think that we could not do any better than LT codes: their encoding and decoding costs scale as $K \log_e K$, where K is the file size. But raptor codes [6] achieve linear-time encoding and decoding by concatenating a weakened LT code with an outer code that patches the gaps in the LT code.

[Fig. 5: The distributions $\rho(d)$ and $\tau(d)$ for the case K = 10 000, c = 0.2, $\delta$ = 0.05, which gives S ≈ 244, K/S ≈ 41, and Z ≈ 1.3. The distribution $\tau$ is largest at d = 1 and at d = K/S. From [5].]

[Fig. 6: The number of degree-one checks S (a) and the quantity K' (b) against the two parameters c and $\delta$, for K = 10 000, with curves for $\delta$ = 0.01, 0.1 and 0.9. Luby's main theorem proves that there exists a value of c such that, given K' received packets, the decoding algorithm will recover the K source packets with probability $1 - \delta$. From [5].]

LT codes had decoding and encoding complexity that scaled as $\log_e K$ per packet, because the average degree of the packets in the sparse graph was $\log_e K$. Raptor codes use an LT code with average degree $\bar{d}$ about 3. With this lower average degree, the decoder may work in the sense that it does not get stuck, but a fraction of the source packets will not be connected to the graph and so will not be recovered. What fraction? From the balls-in-bins exercise, the expected fraction not recovered is $\tilde{f} = e^{-\bar{d}}$, which for $\bar{d} = 3$ is 5%. Moreover, if K is large, the law of large numbers assures us that the fraction of packets not recovered in any particular realisation will be very close to $\tilde{f}$. So, here is Shokrollahi's trick: we transmit a K-packet file by first pre-coding the file into $\tilde{K} \simeq K/(1 - \tilde{f})$ packets with an excellent outer code that can correct erasures if the erasure rate is exactly $\tilde{f}$; then we transmit this slightly enlarged file using a weak LT code that, once slightly more than K packets have been received, can recover $(1 - \tilde{f})\tilde{K}$ of the pre-coded packets, which is roughly K packets; then we use the outer code to recover the original file (Fig. 9).
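The sizing behind this trick is a short calculation; a sketch (assuming numpy) that also re-runs the balls-in-bins estimate of the unconnected fraction:

```python
import numpy as np

rng = np.random.default_rng(0)
K, d_bar = 10_000, 3
f = np.exp(-d_bar)                   # expected unrecovered fraction, ~5%
K_tilde = int(np.ceil(K / (1 - f)))  # pre-coded file size, ~10 524 packets
print(f, K_tilde)

# empirical check: throw ~d_bar * K_tilde edges into K_tilde bins and
# count the bins (pre-coded packets) left with no edge at all
edges = rng.integers(0, K_tilde, size=d_bar * K_tilde)
print(np.mean(np.bincount(edges, minlength=K_tilde) == 0))  # close to e**-3
```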

Figure 10 shows the properties of a crudely weakened LT code. Whereas the original LT code usually recovers K = 10 000 packets within a number of received packets N ≈ 11 000, the weakened LT code usually recovers 8000 packets within a received number of 9250. Better performance can be achieved by optimising the degree distribution.

For our excellent outer code, we require a code that can correct erasures at a known rate of 5% with low decoding complexity. Shokrollahi uses an irregular low-density parity-check code. For further information about irregular low-density parity-check codes, and fast encoding algorithms for them, see [5, pp. 567-572] and [7, 8].

    7 Applications

Fountain codes are an excellent solution in a wide variety of situations. Here we mention two.

[Fig. 7: Histograms of the actual number of packets N required in order to recover a file of size K = 10 000 packets. (a) c = 0.01, $\delta$ = 0.5 (S = 10, K/S = 1010, Z ≈ 1.01); (b) c = 0.03, $\delta$ = 0.5 (S = 30, K/S = 337, Z ≈ 1.03); (c) c = 0.1, $\delta$ = 0.5 (S = 99, K/S = 101, Z ≈ 1.1). Horizontal axes run from 10 000 to 12 000 received packets. From [5].]

[Fig. 8: Practical performance of LT codes. Three experimental decodings are shown, all for codes created with the parameters c = 0.03, $\delta$ = 0.5 (S = 30, K/S = 337, Z ≈ 1.03) and a file of size K = 10 000. The decoder is run greedily as packets arrive. The vertical axis shows the number of packets decoded as a function of the number of received packets. The right-hand vertical line is at N = 11 000 received packets, i.e., an overhead of 10%.]

[Fig. 10: The idea of a weakened LT code. The LT degree distribution with parameters c = 0.03, $\delta$ = 0.5 is truncated so that the maximum degree is 8; the resulting graph has mean degree 3. The decoder is run greedily as packets arrive. As in Fig. 8, the thick lines show the number of recovered packets as a function of the number of received packets; the thin lines are the curves for the original LT code from Fig. 8. Just as the original LT code usually recovers K = 10 000 packets within a number of received packets N ≈ 11 000, the weakened LT code recovers 8000 packets within a received number of 9250.]

[Fig. 9: Schematic diagram of a raptor code. In this toy example, K = 16 source packets (top row) are encoded by the outer code into $\tilde{K}$ = 20 pre-coded packets (centre row); the details of this outer code are not given here. These packets are encoded into N = 18 received packets (bottom row) with a weakened LT code. Most of the received packets have degree 2 or 3; the average degree is 3. The weakened LT code fails to connect some of the pre-coded packets to any received packet; these 3 lost packets are highlighted in grey. The LT code recovers the other 17 pre-coded packets, then the outer code is used to deduce the original 16 source packets.]
