UNIT V
INTRODUCTION TO INFORMATION THEORY
ERROR PROBABILITY
In any mode of communication, as long as channel noise exists, the communication cannot be error-free.
However, in the case of digital systems, the accuracy can be improved by reducing the error probability. For all digital systems,
Pe = K e^(-Eb)
where Eb = bit energy = Si/Rb, Si = signal power, and Rb = bit rate.
Limitations on Minimising Error Probability
Increasing the bit energy Eb means:
Increasing the signal power Si (for a given bit rate Rb)
OR
Decreasing the bit rate Rb (for a given signal power Si)
OR
Both.
Due to physical limitations, Si cannot be increased beyond a certain limit. Therefore, in order to reduce Pe further, we must reduce the rate of transmission of information bits, Rb.
This implies that to obtain Pe → 0, we must have Rb → 0. It would therefore seem that, in the presence of channel noise, it is not possible to obtain error-free communication at any non-zero rate.
Shannon Theory
Shannon, in 1948 showed that for a given channel, as long as the rate of transmission of information digits (Rb) is maintained within a certain limit
(known as Channel Capacity), it is possible to achieve error-free communication.
Therefore, in order to obtain Pe → 0, it is not necessary to make Rb → 0. It can be obtained by maintaining Rb < C,
where C = channel capacity (per sec).
The presence of random disturbance in the channel does not, by itself, set any limit on the transmission accuracy. Instead, it sets a limit on the information rate Rb for which Pe → 0 is achievable.
Information Capacity: Hartley’s Law
Therefore, there is a theoretical limit to the rate at which information can be sent along a channel with a given bandwidth and signal-to-noise ratio
The relationship between time, information capacity, and channel bandwidth is given by:
HARTLEY’S LAW
I = k t B
where I = amount of information to be sent
t = transmission time
B = channel bandwidth
k = a constant which depends on the type of data coding used and the signal-to-noise ratio of the channel
Information Capacity: Shannon Hartley Law
Ignoring the noise, the theoretical limit to the amount of data that can be sent in a given bandwidth is given by the Shannon-Hartley Theorem:
C = 2B log2 M
where C = information capacity in bits per second
B = the channel bandwidth in hertz
M = number of levels transmitted
Shannon-Hartley Law : Explanation
Consider a channel that can pass all frequencies from 0 to B hertz. A simple binary signal with alternate 1's and 0's is transmitted. Such a signal would be a simple square wave with frequency one-half the bit rate.
Since it is a square wave, the signal will have harmonics at all odd multiples of its fundamental frequency, with declining amplitude as the frequency increases. At very low bit rates, the output signal after passage through the channel will be similar to the input signal, but as the bit rate increases, the frequency of the signal also increases and more of its harmonics are filtered out, making the output more and more distorted.
Finally, for a bit rate of 2B, the frequency of the input signal becomes B and only the fundamental frequency component of the input square-wave signal will pass through the channel.
[Figure: input square-wave signal (±1V) and the output signal at the maximum bit rate 2B]
Thus, with a binary input signal, the channel capacity would be
C = 2B
Multilevel Signalling
The previously discussed idea can be extended to any number of levels
Consider an input signal with 4 voltage levels, -1V, -0.5V, 0.5V and 1V; then each voltage level would correspond to two bits of information.
Four-level Code
Therefore, we have managed to transmit twice as much information in the same bandwidth. However, the maximum frequency of the signal would not change.
[Figure: four-level signal between -1V and 1V, the levels carrying the bit pairs 11, 10, 01, 00]
Shannon Limit
From the previous discussion it seems that any amount of information can be transmitted through a given channel by simply increasing the number of levels. This is, however, not true, because of noise.
As the number of levels increase, the probability of occurrence of error due to noise also increases. Therefore, for a given noise level, the maximum data rate is given by:
C = B log2(1 + S/N)
where C = information capacity in bits per second
B = bandwidth in hertz
S/N = signal-to-noise ratio
Example
Ques: A telephone line has a bandwidth of 3.2 kHz and a signal-to-noise ratio of 35 dB. A signal is transmitted down this line using a four-level code. Find the maximum theoretical data rate.
Ans: The maximum data rate ignoring noise is given as:
C = 2B log2 M = 2 × 3.2 × 10^3 × log2 4 = 12.8 kb/s
The maximum data rate using the Shannon limit is given as:
S/N = antilog10(35/10) ≈ 3162
Therefore, C = B log2(1 + S/N) = 3.2 × 10^3 × log2(1 + 3162) ≈ 37.2 kb/s
We take the lesser of the two results, 12.8 kb/s. It also implies that it is possible to increase the data rate over this channel by using more levels.
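The two limits worked out above can be checked numerically; the following is a quick sketch (variable names are mine, not from the notes):

```python
from math import log2

B = 3.2e3            # channel bandwidth, Hz
M = 4                # number of signalling levels
snr_db = 35          # signal-to-noise ratio, dB

c_levels = 2 * B * log2(M)         # noiseless multilevel limit: 12.8 kb/s
snr = 10 ** (snr_db / 10)          # dB -> power ratio, about 3162
c_shannon = B * log2(1 + snr)      # Shannon limit, about 37.2 kb/s

print(min(c_levels, c_shannon))    # the lesser limit governs: 12800.0
```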
Measure of Information
Common-Sense Measure of Information
Consider the following three hypothetical headlines in a morning paper
Tomorrow the sun will rise in the east
United States invades Cuba
Cuba invades the United States
Now, from the point of view of common sense, the first headline conveys hardly any information, the second one conveys a large amount of information, and the third one conveys yet a larger amount of information.
If we look at the probability of occurrence of the above three events, then the probability of occurrence of the first event is unity, that of the second event is very low, and that of the third event is practically zero.
Common-Sense Measure of Information contd…
Therefore, an event with lower probability of occurrence has a greater surprise element associated with it and hence conveys larger amount of information as compared to an event with greater probability of occurrence.
Mathematically,
If P = probability of occurrence of a message and I = information content of the message, then for P = 1, I = 0, and as P → 0, I → ∞. Hence,
I ∝ log(1/P)
The Engineering Measure of Information
From the engineering point of view, the information in a message is directly proportional to the minimum time required to transmit the message.
It implies that a message with higher probability can be transmitted in shorter time than required for a message with lower probability.
For efficient transmission, shorter code words are assigned to alphabets like a, e, o, t, etc., which occur more frequently, and longer code words are assigned to alphabets like x, z, k, q, etc., which occur less frequently. Therefore, the alphabets that occur more frequently in a message (i.e., those with a higher probability of occurrence) need a shorter time to transmit than those with a smaller probability of occurrence.
Therefore, the time required to transmit a symbol (or a message) with probability of occurrence P is proportional to log(1/P).
The Engineering Measure of Information contd…
Let us consider two equiprobable binary messages m1 and m2. These two equiprobable messages require a minimum of one binary digit (which can assume two values)
Similarly, to encode 4 equiprobable messages, we require a minimum of 2 binary digits per message. Therefore, each of these four messages takes twice as much transmission time as each of the two equiprobable messages and hence contains twice as much information.
Therefore, in general, the number of binary digits required to encode each of n equiprobable messages is:
r = log2 n
The Engineering Measure of Information contd…
Since all the messages are equiprobable, the number of binary digits required to encode each message of probability P is log2(1/P).
Therefore, the information content I of each message of probability P is given as:
I = K log2(1/P)
I = log2(1/P) bits ; for K = 1
Similarly, for r-ary digits,
I = log_r(1/P) r-ary units
Units of Information
Now, it is evident that:
I = log_r(1/P) r-ary units = log2(1/P) bits
and in general,
1 r-ary unit = log2 r bits
1 r-ary unit = log_s r s-ary units
The 10-ary unit of information is called the hartley (in honour of R.V.L. Hartley):
1 hartley = log2 10 bits = 3.32 bits
1 nat = log2 e bits = 1.44 bits
AVERAGE INFORMATION PER MESSAGE: Entropy of a Source
Consider a memoryless source (i.e., each message emitted is independent of the previous message(s)) emitting messages m1, m2, ..., mn with probabilities P1, P2, ..., Pn respectively (P1 + P2 + ... + Pn = 1).
Now, the information content of message mi is given by:
Ii = log2(1/Pi) bits
Thus, the average or mean information per message emitted by the source, called the entropy of the source, is given by:
H(m) = Σ Pi Ii = Σ Pi log2(1/Pi) bits   (sum over i = 1 to n)
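As a small illustrative sketch (mine, not from the notes), the entropy formula above can be computed directly:

```python
from math import log2

def entropy(probs):
    """H(m) = sum of Pi * log2(1/Pi) over all messages, in bits."""
    return sum(p * log2(1 / p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))    # 1.0 bit: one fair binary choice
```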
Maximum value of Entropy
Since entropy is the measure of uncertainty, the probability distribution that generates maximum uncertainty will have the maximum entropy.
The maximum value of H(m) is obtained by setting:
dH(m)/dPi = 0, for i = 1, 2, ..., n-1, where Pn = 1 - (P1 + P2 + ... + P(n-1))
Now,
H(m) = Σ Pi log2(1/Pi) = -(P1 log2 P1 + ... + P(n-1) log2 P(n-1)) - Pn log2 Pn
Differentiating with respect to Pi (noting that dPn/dPi = -1):
dH(m)/dPi = -(log2 Pi + log2 e) + (log2 Pn + log2 e) = log2(Pn/Pi)
Setting this to zero gives:
log2(Pn/Pi) = 0, i.e., Pi = Pn
Maximum value of Entropy contd…
Therefore, the maximum value of entropy occurs for equiprobable messages, i.e., when
Therefore, the maximum value of entropy = the minimum no. of binary digits required to encode the message
The previous equation is true if P1 = P2 = P3 = ... = Pn = 1/n.
Then,
H(m)max = Σ (1/n) log2 n = log2 n bits   (sum over i = 1 to n)
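Using the same entropy formula, a small numerical check (my own sketch; the skewed distribution is an assumed example) that the equiprobable case maximizes H(m):

```python
from math import log2

def entropy(probs):
    """H(m) = sum of Pi * log2(1/Pi), in bits."""
    return sum(p * log2(1 / p) for p in probs if p > 0)

n = 4
uniform = [1 / n] * n               # equiprobable case
skewed = [0.7, 0.1, 0.1, 0.1]       # any other distribution

print(entropy(uniform))                      # 2.0 = log2(4), the maximum
print(entropy(skewed) < entropy(uniform))    # True
```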
The Intuitive (Common Sense) and Engineering Interpretation of Entropy
From the engineering point of view, the information content of any message is equal to the minimum number of digits required to encode the message. Therefore,
Entropy = H(m) = The average value of the minimum number of digits required for encoding each message
Now, from the intuitive (common sense) point of view, information is considered synonymous with the uncertainty or the surprise element associated with a message: a message with a lower probability of occurrence has greater uncertainty and conveys larger information.
Therefore, if log(1/Pi) is a measure of the uncertainty (unexpectedness) of the message mi, then
Σ Pi log(1/Pi) = avg. uncertainty per message of the source that generates the messages.
If the source is not memoryless (i.e., a message emitted at any time is not independent of the previous messages emitted), then the source entropy will be less than Σ Pi log(1/Pi). This is because the dependence on the previous messages reduces the uncertainty.
Source Encoding
We know that the minimum number of binary digits required to encode a message is equal to the source entropy H(m) if all the messages are equiprobable (with probability P). It can be shown that this result is true even for non-equiprobable messages.
Let a source emit n messages m1, m2, ..., mn with probabilities P1, P2, ..., Pn respectively. Consider a sequence of N messages with N → ∞. Let Ki be the number of times message mi occurs in this sequence. Then,
lim (N → ∞) Ki/N = Pi
Thus, the message mi occurs NPi times in the whole sequence of N messages.
Source Encoding contd…
Now, consider a typical sequence Sn of N messages from the source.
Because the n messages (of probabilities P1, P2, ..., Pn) occur NP1, NP2, ..., NPn times respectively, and because each message is independent, the probability of occurrence of a typical sequence SN of N messages is given by:
P(SN) = (P1)^(NP1) (P2)^(NP2) ... (Pn)^(NPn)
Therefore, the number of binary digits required to encode such a sequence is:
LN = log2[1/P(SN)] = N Σ Pi log2(1/Pi) = N H(m) bits
Source Encoding contd…
Therefore, the average number of digits required per message is given as:
L = LN/N = H(m) bits
This shows that we can encode a sequence of non-equiprobable messages by using, on the average, H(m) bits per message.
Compact Codes
The source coding theorem states that in order to encode a source with entropy H(m), we need a minimum of H(m) binary digits per message, or Hr(m) r-ary digits per message.
Thus, the average wordlength of an optimum code is H(m), but to attain this length, we have to encode a sequence of N messages (N → ∞) at a time.
However, if we wish to encode each message directly without using longer sequences, then the average length of the code word per message will be > H(m).
In practice, it is not desirable to use long sequences, as they cause transmission delay and add to the equipment complexity. Therefore, it is preferred to encode messages directly, even if a price has to be paid in terms of increased wordlength.
Huffman Code
Let us suppose that we are given a set of n messages (m1, m2, ..., mn). Then, to find the Huffman Code:
Step 1 All the messages are arranged in the order of decreasing probability
Step 2 The last two messages (messages with least probabilities) are then combined into one message (i.e. their probabilities are added up)
Step 3 These messages are now again arranged in the decreasing order of probability
Step 4 The whole of the process is repeated until only two messages are left (in the case of binary digits coding) or r messages are left ( in the case of r-ary digits coding).
Step 5 In the case of binary digits coding, the two (reduced) messages are assigned 0 and 1 as their first digits in the code sequence (and in the case of r-ary digits coding, the reduced messages are assigned 0, 1, ..., r-1).
Huffman Code contd…
Step 6 We then go back and assign the numbers 0 and 1 to the second digit for the two messages that were combined in the previous step. We keep regressing this way until the first column is reached.
Step 7 The optimum (Huffman) code obtained this way is called the Compact Code.
The average length of the compact code is given as:
L = Σ Pi Li   (sum over i = 1 to n)
This is compared with the source entropy given by:
Hr(m) = Σ Pi log_r(1/Pi)   (sum over i = 1 to n)
Huffman Code contd…
Code Efficiency: η = H(m)/L
Redundancy: 1 - η
Huffman Code contd…
For an r-ary code, we will have exactly r messages left in the last reduced set if, and only if, the total number of original messages is equal to r + k(r-1), where k is an integer. This is because each reduction decreases the number of messages by (r-1); therefore, if there are k reductions, there must be r + k(r-1) original messages.
In case the original number of messages does not satisfy this condition, we must add some dummy messages with zero probability of occurrence until this condition is satisfied
For e.g., if r = 4 and the number of messages n = 6, then we must add one dummy message with zero probability of occurrence to make the total number of messages 7, i.e., [4 + 1(4 – 1)]
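For concreteness, here is a small sketch of the binary reduction procedure above, computing only the code-word lengths (the function name and tie-breaking counter are my own, not from the notes):

```python
import heapq
from itertools import count

def huffman_lengths(probs):
    """Binary Huffman code-word lengths for the given probabilities.

    Each heap entry carries the set of original symbols merged into it;
    every merge of the two least probable entries (Step 2 above) adds
    one digit to the code word of each of those symbols.
    """
    tiebreak = count()    # avoids comparing lists when probabilities tie
    heap = [(p, next(tiebreak), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)   # two least probable messages
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, next(tiebreak), s1 + s2))
    return lengths

probs = [0.30, 0.25, 0.15, 0.12, 0.10, 0.08]   # the set from Exercise-1 below
avg = sum(p * l for p, l in zip(probs, huffman_lengths(probs)))
print(round(avg, 2))    # 2.45
```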
EXERCISE-1
Ques. 1 Obtain the compact (Huffman) Code for the following set of messages:
Messages Probabilities
m1 0.30
m2 0.15
m3 0.25
m4 0.08
m5 0.10
m6 0.12
EXERCISE-1
Ans. 1 The optimum Huffman code is obtained as follows:
Messages Probabilities S1 S2 S3 S4
m1 0.30 00 0.30 00 0.30 00 0.43 1 0.57 0
m2 0.25 10 0.25 10 0.27 01 0.30 00
0.43 1
m3 0.15 010 0.18 11 0.25 10 0.27 01
m4 0.12 011 0.15 010 0.18 11
m5 0.10 110 0.12 011
m6 0.08 111
Ans. 1 contd…
The average length of the compact code is given by:
L = Σ Pi Li = (0.30 × 2) + (0.25 × 2) + (0.15 × 3) + (0.12 × 3) + (0.10 × 3) + (0.08 × 3)
  = 0.60 + 0.50 + 0.45 + 0.36 + 0.30 + 0.24
  = 2.45 bits
The entropy H(m) of the source is given by:
H(m) = Σ Pi log2(1/Pi)
     = 0.5211 + 0.5 + 0.4105 + 0.3671 + 0.3322 + 0.2915
     = 2.422 bits
Code Efficiency = H(m)/L ≈ 0.989
Redundancy = 1 - 0.989 = 0.011
EXERCISE-1 contd…
Ques. 2 A zero memory source emits six messages as shown Find the 4-ary (quaternary) Huffman Code. Determine its avg. wordlength, the efficiency and the redundancy.
Messages Probabilities
m1 0.30
m2 0.15
m3 0.25
m4 0.08
m5 0.10
m6 0.12
EXERCISE-1 contd…
Ans. 2 The optimum Huffman code is obtained as follows:
Messages Probabilities S1
m1 0.30 0 0.30 0
m2 0.25 2 0.30 1
m3 0.15 3 0.25 2
m4 0.12 10 0.15 3
m5 0.10 11
m6 0.08 12
m7 0.0 13
Ans. 2 contd…
The average length of the compact code is given by:
L = Σ Pi Li = (0.30 × 1) + (0.25 × 1) + (0.15 × 1) + (0.12 × 2) + (0.10 × 2) + (0.08 × 2)
  = 1.3 4-ary digits
The entropy H(m) of the source is given by:
H4(m) = Σ Pi log4(1/Pi) = 1.211 4-ary digits
Code Efficiency = H4(m)/L ≈ 0.93
Redundancy ≈ 0.07
EXERCISE-1 contd…
Ques. 3 A zero-memory source emits messages m1 and m2 with probabilities 0.8 and 0.2 respectively. Find the optimum binary code for the source as well as for the second and third order extensions (i.e., N = 2 and 3).
(Ans. For N=1 H(m) = 0.72 ; η = 0.72
For N=2 L’ = 1.56 ; η = 0.923
For N=3 L’’ = 2.184 ;η = 0.989 )
Shannon Fano Code
An efficient code can be generated by the following simple procedure known as the Shannon-Fano algorithm:
Step 1 List the source symbols in decreasing probability
Step 2 Partition the set into two sets that are as close to equiprobable as possible. Assign 0 to the upper set and 1 to the lower set.
Step 3 Continue this process, each time partitioning the sets into subsets with as nearly equal probabilities as possible, until further partitioning is not possible.
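A sketch of this partitioning procedure in Python (my own illustration; the greedy choice of split point and the function name are assumptions, not part of the notes):

```python
def shannon_fano_lengths(probs):
    """Code-word lengths from the Shannon-Fano partitioning procedure.

    probs must already be listed in decreasing order (Step 1); each
    recursive split picks the point that makes the two halves' total
    probabilities as nearly equal as possible (Steps 2 and 3).
    """
    lengths = [0] * len(probs)
    def split(lo, hi):                       # half-open index range
        if hi - lo <= 1:
            return
        total, run = sum(probs[lo:hi]), 0.0
        best, best_i = float('inf'), lo + 1
        for i in range(lo, hi - 1):
            run += probs[i]
            diff = abs(run - (total - run))
            if diff < best:
                best, best_i = diff, i + 1
        for i in range(lo, hi):              # each split adds one digit to every member
            lengths[i] += 1
        split(lo, best_i)
        split(best_i, hi)
    split(0, len(probs))
    return lengths

# the message set of Exercise-2, Ques. 1 below
print(shannon_fano_lengths([0.30, 0.25, 0.20, 0.12, 0.08, 0.05]))   # [2, 2, 2, 3, 4, 4]
```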
EXERCISE-2
Ques. 1 Obtain the Shannon-Fano Code for the following set of messages xi, with probabilities P(xi), from a memoryless source:
Messages Probabilities
X1 0.30
X2 0.25
X3 0.20
X4 0.12
X5 0.08
X6 0.05
EXERCISE-2 contd…
Ans. 1
Ans. 1 contd…
The optimum wordlength of the code word is:
L = 2 × 0.3 + 2 × 0.25 + 2 × 0.20 + 3 × 0.12 + 4 × 0.08 + 4 × 0.05
  = 2.38 bits
H(m) = -(0.30 log2 0.30 + 0.25 log2 0.25 + 0.20 log2 0.20 + 0.12 log2 0.12 + 0.08 log2 0.08 + 0.05 log2 0.05)
     = 2.36 bits
Efficiency = H(m)/L ≈ 0.99
Redundancy ≈ 0.01
Some more examples
Ques. 2 A memoryless source emits messages x1, x2, x3, x4 and x5 with
P(x1) = 0.4, P(x2) = 0.19, P(x3) = 0.16, P(x4) = 0.15 and P(x5) = 0.1
(i) Construct a Shannon-Fano Code for X and calculate the efficiency of the code
(ii) Repeat for the Huffman Code and compare the results.
Ans. 2
The Shannon-Fano Code can be obtained as: x1: 00, x2: 01, x3: 10, x4: 110, x5: 111
Ans. 2 contd…
The optimum wordlength of the code word is:
L = 2 × (0.40 + 0.19 + 0.16) + 3 × (0.15 + 0.10)
  = 2.25 bits
H(m) = -(0.40 log2 0.40 + 0.19 log2 0.19 + 0.16 log2 0.16 + 0.15 log2 0.15 + 0.10 log2 0.10)
     = 2.1497523 bits
Efficiency = H(m)/L = 0.9554454
Redundancy = 1 - 0.9554454 = 0.0445546
Ans. 2 contd…
Now, let us generate the Huffman Code and compare the results. Huffman code is generated as:
Messages Probabilities S1 S2 S3
m1 0.40 1 0.40 1 0.40 1 0.60 0
m2 0.19 000 0.25 01 0.35 00 0.40 1
m3 0.16 001 0.19 000 0.25 01
m4 0.15 010 0.16 001
m5 0.10 011
Ans. 2 contd…
The average length of the compact code is given by:
L = Σ Pi Li = 0.40 × 1 + (0.19 + 0.16 + 0.15 + 0.10) × 3
  = 2.2 bits
Efficiency = H(m)/L = 0.9771601
Therefore, we find that the efficiency of the Huffman Code is better than that of the Shannon-Fano Code.
Some more examples contd…
Ques. 3 Construct Shannon-Fano Code and Huffman Code for five equiprobable messages emitted from a memory-less
source with probability P=0.2
Ans. 3
The Shannon-Fano Code can be obtained as: m1: 00, m2: 01, m3: 10, m4: 110, m5: 111
The optimum wordlength of the code word is:
L = (0.2 × 2) × 3 + (0.2 × 3) × 2
  = 2.4 bits
H(m) = 0.2 log2(1/0.2) × 5 = log2 5
     = 2.3219281 bits
Efficiency = H(m)/L = 0.96747
Redundancy = 1 - 0.96747 = 0.03253
Ans. 3 contd…
Now, Huffman code is generated as:
Messages Probabilities S1 S2 S3
m1 0.20 01 0.40 1 0.40 1 0.60 0
m2 0.20 000 0.20 01 0.40 00 0.40 1
m3 0.20 001 0.20 000 0.20 01
m4 0.20 10 0.20 001
m5 0.20 11
Ans. 3 contd…
The average length of the compact code is given by:
L = Σ Pi Li = (0.2 × 3) × 2 + (0.2 × 2) × 3
  = 2.4 bits
Efficiency = H(m)/L = 0.96747
The efficiency is the same for both the codes.
Construct Shannon-Fano Code and Huffman Code for the following:
Messages Probabilities
m1 0.50
m2 0.25
m3 0.125
m4 0.125
Ans: For Shannon Fano Code: H (m) =1.75; L=1.75; η=1
Messages Probabilities
m1 1/2
m2 1/4
m3 1/8
m4 1/16
m5 1/32
m6 1/32
Ans: For Shannon-Fano Code: H(m) = 1.9375; L = 1.9375; η = 1
Messages Probabilities
m1 1/3
m2 1/3
m3 1/9
m4 1/9
m5 1/9
Ans: For Shannon-Fano Code:
H(m) = 4/3; L = 4/3; η = 1 (in 3-ary units)
CHANNEL CAPACITY OF A DISCRETE MEMORYLESS CHANNEL
Let a source emit symbols x1, x2, ..., xr. The receiver receives symbols y1, y2, ..., ys. The set of received symbols may or may not be identical to the transmitted set.
If the channel is noiseless, the reception of some symbol yj uniquely determines the message transmitted.
Because of noise, however, there is a certain amount of uncertainty regarding the transmitted symbol when yj is received.
If P(xi | yj) represents the conditional probability that xi was transmitted when yj is received, then there is an uncertainty of log[1/P(xi | yj)] about xi when yj is received.
Thus, the average loss of information over the transmitted symbol when yj is received is given as:
H(x | yj) = Σi P(xi | yj) log[1/P(xi | yj)] bits per symbol
Contd…
When this uncertainty is averaged over all xi and yj, we obtain the average uncertainty about a transmitted symbol when a symbol is received:
H(x | y) = Σj P(yj) H(x | yj) bits per symbol
         = Σj Σi P(yj) P(xi | yj) log[1/P(xi | yj)]
Since P(yj) P(xi | yj) = P(xi, yj), the joint probability, this becomes:
H(x | y) = Σi Σj P(xi, yj) log[1/P(xi | yj)] bits per symbol
This uncertainty is caused by channel noise. Hence, it is the average loss of information about a transmitted symbol when a symbol is received. Therefore, H(x | y) is also called the equivocation of x with respect to y.
Contd…
If the channel is noiseless, H(x | y) = 0.
P(yj | xi) = probability that yj is received when xi is transmitted.
This is characteristic of the channel and the receiver. Thus, a given channel (with its receiver) is specified by the channel matrix of the probabilities P(yj | xi).
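The equivocation formula can be sketched numerically (my own illustration; the joint-probability table assumes a binary symmetric channel with transition probability 0.1 and equally likely inputs):

```python
from math import log2

def equivocation(pxy):
    """H(x|y) = sum over i, j of P(xi, yj) * log2[ P(yj) / P(xi, yj) ],
    using P(xi | yj) = P(xi, yj) / P(yj)."""
    py = [sum(row[j] for row in pxy) for j in range(len(pxy[0]))]
    return sum(p * log2(py[j] / p)
               for row in pxy for j, p in enumerate(row) if p > 0)

# assumed joint matrix P(xi, yj) for a BSC with p = 0.1, P(x1) = P(x2) = 0.5
pxy = [[0.45, 0.05],
       [0.05, 0.45]]
print(round(equivocation(pxy), 3))   # 0.469 bits lost per symbol
```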
Contd…
We can obtain the reverse conditional probabilities P(xi | yj) using Bayes' Rule: P(xi | yj) = P(yj | xi) P(xi) / P(yj).
DISCRETE MEMORYLESS CHANNELS
Channel Representation
Contd…
The figure shows a Discrete Memoryless Channel
It is a statistical model with an input X and output Y
During each unit of time (signaling interval), the channel accepts an input symbol from X, and in response it generates an output symbol from Y.
The channel is discrete when the alphabets of both X and Y are finite
It is memoryless when the current output depends on only the current input and not on any of the previous inputs.
The Channel Matrix
As discussed earlier, a channel is completely specified by the complete set of transition probabilities. Accordingly, the DMC shown above is specified by a matrix of transition probabilities [P(Y|X)].
Since each input to the channel results in some output, each row of the channel matrix must add up to unity, i.e.,
Σj P(yj | xi) = 1, for all i
Contd…
If the input probabilities P(X) are represented by the row matrix
[P(X)] = [P(x1) P(x2) ... P(xm)]
then the output probabilities P(Y) are given by the row matrix
[P(Y)] = [P(X)] [P(Y|X)]
Contd…
If P(X) is represented as a diagonal matrix [P(X)]d, then
[P(X, Y)] = [P(X)]d [P(Y|X)] = Joint Probability Matrix
The element P(xi, yj) is the joint probability of transmitting xi and receiving yj.
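These matrix relations can be sketched with plain Python lists (the helper names and the 2×2 channel matrix below are my own assumed example, not from the notes):

```python
def matmul(a, b):
    """Multiply matrix a (list of rows) by matrix b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def diag(v):
    """Diagonal matrix [P(X)]d built from a probability row vector."""
    return [[v[i] if i == j else 0.0 for j in range(len(v))]
            for i in range(len(v))]

px  = [0.5, 0.5]                    # assumed input probabilities
pyx = [[0.9, 0.1],                  # assumed channel matrix [P(Y|X)];
       [0.2, 0.8]]                  # note each row sums to 1

py  = matmul([px], pyx)[0]          # [P(Y)] = [P(X)] [P(Y|X)]
pxy = matmul(diag(px), pyx)         # [P(X,Y)] = [P(X)]d [P(Y|X)]
print(py)                           # approximately [0.55, 0.45]
```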
SPECIAL CHANNELS
Lossless Channel
A channel described by a channel matrix with only one non-zero element in each column is called a lossless channel
A lossless channel is represented as:
Therefore, in a lossless channel, no source information is lost in transmission.
Deterministic Channel
A channel described by a channel matrix with only one non-zero element in each row is called a deterministic channel
A deterministic channel is represented as:
Since each row has only one non-zero element, therefore, this element must be unity. Thus, when a given symbol is sent in a deterministic channel, it is clear which output symbol will be received.
Noiseless Channel
A channel is called noiseless if it is both lossless and deterministic.
A noiseless channel is represented as:
Therefore, the channel matrix has only one element in each row and in each column, and this element is unity. Also, the number of input and output symbols is the same, i.e., m = n, for a noiseless channel.
Binary Symmetric Channel (BSC)
A binary symmetric channel is represented as:
This channel has two inputs (x1 = 0, x2 = 1) and two outputs (y1 = 0, y2 = 1).
It is a symmetric channel because the probability of receiving a 1 if a 0 is sent is the same as the probability of receiving a 0 if a 1 is sent.
This common transition probability is denoted by p.
EXERCISE-3
Ques. 1 Consider a binary channel as shown:
Find the channel matrix of the channel
Find P(y1) and P(y2) when P(x1) = P(x2) = 0.5
Find the joint probabilities P(x1, y2) and P(x2, y1)
Ans. 1
The Channel Matrix can be given as:
The Output Probability Matrix is given as:
Now, the Joint Probability Matrix is given as:
EXERCISE-3 contd…
Ques. 2 Two binary channels discussed in the previous question are connected in cascade as shown:
Find the overall channel matrix of the resultant channel, and draw the equivalent channel diagram
Find P(z1) and P(z2) when P(x1) = P(x2) = 0.5
Ans. 2
We have
The resultant equivalent channel diagram is shown as:
Ans. 2 contd…
EXERCISE-3 contd…
Ques. 3 A channel has the following channel matrix:
Draw the channel diagram
If the source has equally likely outputs, compute the probabilities associated with the channel outputs for p = 0.2
Ans. 3
This is a Binary Erasure Channel with two inputs, x1 = 0 and x2 = 1, and three outputs, y1 = 0, y2 = e and y3 = 1, where e indicates an erasure, which means that the output is in doubt and hence should be erased.
The output matrix for the above given channel at p=0.2 can be given as:
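The output probabilities can be sketched numerically (my own code; the channel matrix below follows the erasure-channel description with p = 0.2):

```python
# Binary erasure channel, p = 0.2: rows are inputs x1 = 0 and x2 = 1,
# columns are outputs y1 = 0, y2 = e (erasure), y3 = 1.
p = 0.2
pyx = [[1 - p, p, 0.0],
       [0.0, p, 1 - p]]
px = [0.5, 0.5]                    # equally likely inputs

# [P(Y)] = [P(X)] [P(Y|X)]
py = [sum(px[i] * pyx[i][j] for i in range(2)) for j in range(3)]
print(py)   # [0.4, 0.2, 0.4]
```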
ERROR-FREE COMMUNICATION OVER A NOISY CHANNEL
We know that messages from a source with entropy H(m) can be encoded by using an average of H(m) digits per message. This encoding has, however, zero redundancy.
Hence, if we transmit these coded messages over a noisy channel, some of the information will be received erroneously. Therefore, we cannot have error-free communication over a noisy channel when the messages are encoded with zero redundancy. Redundancy in general helps combat noise.
A simple example of the use of redundancy is the Single Parity Check Code in which an extra binary digit is added to each code word to ensure that the total number of 1’s in the resulting codeword is always even (or odd). If a single error occurs in the received code-word, the parity is violated, and the receiver requests retransmission.
Error-Correcting Codes
The two important types of Error correcting codes include:
Block Codes Convolutional Codes
Block and Convolutional codes
Block Codes
In block codes, a block of k data digits is encoded by a code word of n digits (n>k), i.e., for each sequence of k data digits, there is a distinct code word of n digits
In block codes, k data digits are accumulated and then encoded into a n-digit code word.
If k data digits are transmitted by a code word of n digits, the number of check digits is m = n - k.
The Code Efficiency (also known as the code rate) = k/n.
Such a code is called an (n, k) code.
Convolutional Codes
In convolutional or recurrent codes, the coded sequence of n digits depends not only on the k data digits but also on the previous N-1 data digits (N > 1). Hence, the coded sequence for a certain k data digits is not unique but depends on the N-1 earlier data digits.
In convolutional codes, the coding is done on a continuous, or running, basis.
LINEAR BLOCK CODES
A code word comprising n digits and a data word comprising k digits can be represented by the row matrices:
c = (c1, c2, c3, ..., cn)    d = (d1, d2, d3, ..., dk)
Generally, in linear block codes, all the n digits of c are formed by linear combinations (modulo-2 additions) of the k data digits.
The special case where c1 = d1, c2 = d2, ..., ck = dk, and the remaining digits c(k+1) to cn are linear combinations of d1, d2, ..., dk, is known as a systematic code.
LINEAR BLOCK CODES
For linear block codes:
Minimum distance between code words: Dmin
Number of errors that can be detected: Dmin - 1
Number of errors that can be corrected:
(Dmin - 1)/2 if Dmin is odd
(Dmin - 2)/2 if Dmin is even
In a systematic code, the first k digits of a code word are the data digits d1, d2, ..., dk and the last m = n - k digits are the parity-check digits, formed by linear combinations of the data digits.
Example 1:
For a (6, 3) code, the generator matrix is
For all eight possible data words find the corresponding code words, and verify that this code is a single-error correcting code.
Solution
Solution contd… :Decoding
Since the modulo-2 sum of any sequence with itself is zero, we get:
Decoding
Decoding contd…
But because of possible channel errors, rHᵀ is, in general, a non-zero row vector s, called the syndrome.
Therefore, from the received word r we can get s and hence the error word ei. But this procedure does not give a unique solution, because r can be expressed in terms of code words other than ci.
Decoding contd…
Since for k-dimensional data words there are 2^k code words, the equation s = eHᵀ is satisfied by 2^k error vectors.
For example,
If d = 100, the corresponding c = 100101; if an error occurred in the third digit, then r = 101101 and e = 001000.
But for c = 101011 and e = 000110, the received word would also be r = 101101.
Similarly, for c = 110110 and e = 011011, again the received word would be r = 101101.
Therefore, for 3-bit data there are 8 possible data words, 8 corresponding code words, and hence 8 possible error vectors for each received word.
Maximum-likelihood Rule
If we receive r, then we decide in favour of that c for which r is most likely to have been received, i.e., the c corresponding to that e which represents the minimum number of bit errors.
Example-2
A (6, 3) code is generated according to the generating matrix in the
previous example. The receiver receives r = 100011. Determine the corresponding data word d.
Solution
Solution contd…
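The generator matrix image did not survive extraction, but it can be inferred from the code words quoted earlier (100 → 100101, 101 → 101011, 110 → 110110): G = [I3 | P] with the parity sub-matrix P below. On that assumption, encoding and single-error syndrome decoding can be sketched as:

```python
# Parity sub-matrix inferred from the code words quoted in the text.
P = [[1, 0, 1],
     [0, 1, 1],
     [1, 1, 0]]

def encode(d):
    """Systematic encoding c = dG (mod 2) with G = [I3 | P]."""
    return list(d) + [sum(d[i] * P[i][j] for i in range(3)) % 2 for j in range(3)]

# Parity-check matrix H = [P^T | I3]; the syndrome is s = r H^T (mod 2).
H = [[P[0][j], P[1][j], P[2][j]] + [1 if k == j else 0 for k in range(3)]
     for j in range(3)]

def decode(r):
    """Correct a single error: a non-zero syndrome equals the column of H
    at the error position; the first 3 digits are then the data word."""
    r = list(r)
    s = [sum(r[i] * H[j][i] for i in range(6)) % 2 for j in range(3)]
    if any(s):
        cols = [[H[j][i] for j in range(3)] for i in range(6)]
        r[cols.index(s)] ^= 1
    return r[:3]

print(encode([1, 0, 0]))            # [1, 0, 0, 1, 0, 1]
print(decode([1, 0, 0, 0, 1, 1]))   # [1, 0, 1], the data word for Example-2
```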
CYCLIC CODES
Cyclic codes are a subclass of linear block codes.
In linear block codes, the procedure for selecting a generator matrix is relatively easy for single-error correcting codes. However, it cannot carry us very far in constructing higher-order error-correcting codes. Cyclic codes have a fair amount of mathematical structure that permits the design of higher-order error-correcting codes.
For Cyclic codes, encoding and syndrome calculations can be easily implemented using simple shift registers.
CYCLIC CODES contd…
One of the important properties of code polynomials is that when x^i c(x) is divided by x^n + 1, the remainder is c^(i)(x), the code polynomial obtained by cyclically shifting c(x) i times.
This property can be easily verified.
Proof:
Consider a polynomial c(x) = d(x) g(x), where d(x) is a data polynomial of degree k-1 or less and g(x) is the generator polynomial of degree n-k.   ---- (A)
This is a polynomial of degree n-1 or less. There are a total of 2^k such polynomials, corresponding to the 2^k data vectors.
Thus, we obtain a linear (n, k) code generated by (A) .
Now, let us prove that the code generated in this way is indeed cyclic.
Proof contd…
EXERCISE-3
Ques. 1 Find a generator polynomial g(x) for a (7, 4) cyclic code and find code vectors for the following data vectors: 1010, 1111, 0001, and 1000.
Ans. 1 Now, in this case n = 7 and n - k = 3, and
x^7 + 1 = (x + 1)(x^3 + x^2 + 1)(x^3 + x + 1)
The generator polynomial should be of order n - k = 3.
Let us take:
g(x) = x^3 + x^2 + 1
For d = [1 0 1 0],
d(x) = x^3 + x
Ans. 1 contd…
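The non-systematic code word c(x) = d(x) g(x) can be checked with a short GF(2) polynomial multiply (my own sketch; coefficients are bit lists, lowest power first):

```python
def poly_mul(a, b):
    """Product of two GF(2) polynomials; bit lists, lowest power first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                out[i + j] ^= bj       # modulo-2 addition of shifted copies
    return out

g = [1, 0, 1, 1]   # g(x) = x^3 + x^2 + 1
d = [0, 1, 0, 1]   # d(x) = x^3 + x, i.e. data 1010
c = poly_mul(d, g)
print(c)           # [0, 1, 0, 0, 1, 1, 1] -> c(x) = x^6 + x^5 + x^4 + x
```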
SYSTEMATIC CYCLIC CODES
In the previous example, the first k digits were not necessarily the data digits. Therefore, it is not a systematic code.
In a systematic code, the first k digits are the data digits and the last n-k digits are the parity check digits
In a systematic cyclic code, the code word polynomial is given as:
c(x) = x^(n-k) d(x) + ρ(x)    ---- (B)
where ρ(x) is the remainder when x^(n-k) d(x) is divided by g(x).
Example:
Construct a systematic (7, 4) cyclic code using a generator polynomial
Example: Solution contd…
Cyclic Code Generation
Coding and Decoding of Cyclic codes can be very easily implemented using Shift Registers and Modulo-2 adders
Systematic code generation involves division of x^(n-k) d(x) by g(x), which is implemented using a shift register with feedback connections according to the generator polynomial g(x).
An encoding circuit with n-k shift registers is shown as:
Cyclic Code Generation contd…
The k-data digits are shifted in one at a time at the input with the switch s held at position p1 . The symbol D represents one-digit delay
As the data digits move through the encoder, they are also shifted out onto the output line, because the first k digits of the code word are the data digits themselves
As soon as the last (or kth) data digit clears the last [(n-k)th] register, all the registers contain the parity-check digits. The switch s is now thrown to position p2, and the parity-check digits are shifted out one at a time onto the line.
Every valid code polynomial c(x) is a multiple of g(x). In case of error during transmission , the received word polynomial r(x) will not be a multiple of g(x). Thus,
Cyclic Code Generation contd…
r(x)/g(x) = m(x) + s(x)/g(x)
where s(x) = Rem[r(x)/g(x)] is a polynomial of degree n-k-1 or less, called the syndrome polynomial.
Also, r(x) = c(x) + e(x)
where e(x) = error polynomial. Then, since c(x) is a multiple of g(x),
s(x) = Rem[e(x)/g(x)]
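The syndrome s(x) = Rem[r(x)/g(x)] is just GF(2) polynomial division; a minimal sketch (my own, with bit lists written highest power first):

```python
def poly_mod(r, g):
    """Remainder of r(x) divided by g(x) over GF(2); bit lists, highest power first."""
    r = list(r)
    for i in range(len(r) - len(g) + 1):
        if r[i]:                        # cancel the leading term with a shifted copy of g
            for j in range(len(g)):
                r[i + j] ^= g[j]
    return r[len(r) - len(g) + 1:]      # last n-k coefficients are the remainder

g = [1, 1, 0, 1]            # g(x) = x^3 + x^2 + 1
c = [1, 1, 1, 0, 0, 1, 0]   # valid code word 1110010 = (x^3 + x) g(x)
print(poly_mod(c, g))       # [0, 0, 0]: zero syndrome, no error detected
print(poly_mod([1, 1, 1, 0, 0, 1, 1], g))   # non-zero syndrome: error detected
```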