Source Coding Techniques
1. Huffman Code.
2. Two-pass Huffman Code.
3. Lempel-Ziv Code.
4. Fano Code.
5. Shannon Code.
6. Arithmetic Code.
Source Coding Techniques
1. Huffman Code.
With the Huffman code, in the binary case, the two least probable source output symbols are joined together, resulting in a new message alphabet with one symbol fewer:
1. take the two smallest probabilities together: P(i) + P(j)
2. replace symbols i and j by a new symbol
3. go to 1 — until only one symbol is left
Application examples: JPEG, MPEG, MP3
ADVANTAGES:
• uniquely decodable code
• smallest average codeword length
DISADVANTAGES:
• LARGE tables give complexity
• sensitive to channel errors
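The merging loop above can be sketched in Python (a minimal illustration, not the notation used in these slides), using a heap to repeatedly join the two least probable entries:

```python
import heapq
from itertools import count

def huffman(probs):
    """Build a binary Huffman code from {symbol: probability}."""
    tiebreak = count()  # keeps heap comparisons away from the dicts
    # Each heap entry: (probability, tiebreaker, {symbol: partial code word})
    heap = [(p, next(tiebreak), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # 1. take the two smallest probabilities together
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        # 2. prepend one branch bit per group and merge into a new symbol
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return heap[0][2]

probs = {"s0": 0.1, "s1": 0.2, "s2": 0.4, "s3": 0.2, "s4": 0.1}
codes = huffman(probs)
avg_len = sum(probs[s] * len(c) for s, c in codes.items())
print(codes, avg_len)  # average length 2.2; exact code words depend on tie-breaking
```

Because ties between equal probabilities can be merged in different orders, different runs of the procedure can give different (but equally good) codes, which is exactly the point of Solutions A and B below.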
1. Huffman Code.
Huffman is not universal! It is only valid for one particular type of source!
For COMPUTER DATA, data reduction must be:
lossless → no errors at reproduction
universal → effective for different types of data
1. Huffman Code.
Huffman Coding: Example
• Compute the Huffman Code for the source shown

Source Symbol sk   Symbol Probability pk
s0                 0.1
s1                 0.2
s2                 0.4
s3                 0.2
s4                 0.1
H(S) = 0.4·log2(1/0.4) + 2 × 0.2·log2(1/0.2) + 2 × 0.1·log2(1/0.1) = 2.12193 bits/symbol
Solution A

At each stage the two smallest probabilities are joined (the two branches being labelled 0 and 1) and the sums are re-sorted into the next stage:

Source Symbol sk   Stage I   Stage II   Stage III   Stage IV   Code
s2                 0.4       0.4        0.4         0.6        00
s1                 0.2       0.2        0.4         0.4        10
s3                 0.2       0.2        0.2                    11
s0                 0.1       0.2                               010
s4                 0.1                                         011
Solution A Cont’d

Source Symbol sk   Symbol Probability pk   Code word ck
s0                 0.1                     010
s1                 0.2                     10
s2                 0.4                     00
s3                 0.2                     11
s4                 0.1                     011

H(S) = 2.12193
H(S) ≤ L < H(S) + 1
L = 0.4×2 + 0.2×2 + 0.2×2 + 0.1×3 + 0.1×3 = 2.2

THIS IS NOT THE ONLY SOLUTION!
Another Solution B

Here the combined probability is placed as low as possible in the next stage, so the merged symbol keeps being merged again:

Source Symbol sk   Stage I   Stage II   Stage III   Stage IV   Code
s2                 0.4       0.4        0.4         0.6        1
s1                 0.2       0.2        0.4         0.4        01
s3                 0.2       0.2        0.2                    000
s0                 0.1       0.2                               0010
s4                 0.1                                         0011
Another Solution B Cont’d

Source Symbol sk   Symbol Probability pk   Code word ck
s0                 0.1                     0010
s1                 0.2                     01
s2                 0.4                     1
s3                 0.2                     000
s4                 0.1                     0011

H(S) = 2.12193
H(S) ≤ L < H(S) + 1
L = 0.4×1 + 0.2×2 + 0.2×3 + 0.1×4 + 0.1×4 = 2.2
What is the difference between the two solutions?
• They have the same average length
• They differ in the variance of the codeword length

σ² = Σk pk (lk − L)²

• Solution A: σ² = 0.16
• Solution B: σ² = 1.36
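These figures can be checked directly; a quick computation over the two code tables above:

```python
# Verify average length and codeword-length variance for the two Huffman solutions.
probs = {"s0": 0.1, "s1": 0.2, "s2": 0.4, "s3": 0.2, "s4": 0.1}
solution_a = {"s0": "010", "s1": "10", "s2": "00", "s3": "11", "s4": "011"}
solution_b = {"s0": "0010", "s1": "01", "s2": "1", "s3": "000", "s4": "0011"}

def stats(code):
    avg = sum(probs[s] * len(c) for s, c in code.items())               # L
    var = sum(probs[s] * (len(c) - avg) ** 2 for s, c in code.items())  # sigma^2
    return round(avg, 4), round(var, 4)

print(stats(solution_a))  # (2.2, 0.16)
print(stats(solution_b))  # (2.2, 1.36)
```

The lower variance of Solution A is usually preferred, since it keeps the encoder's output rate more even.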
Source Coding Techniques
2. Two-pass Huffman Code.
This method is used when the probabilities of the symbols in the information source are unknown. We first estimate these probabilities by counting the occurrences of the symbols in the given message, and then build the Huffman code from the estimates. This can be summarized in the following two passes.
Pass 1: Measure the occurrence probability of each character in the message
Pass 2: Build the Huffman code from these probabilities
Source Coding Techniques
2. Two-pass Huffman Code.
Example
Consider the message: ABABABABABACADABACADABACADABACAD
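The two passes for this message can be sketched as follows (a minimal illustration reusing the merging procedure from the Huffman section):

```python
import heapq
from collections import Counter
from itertools import count

message = "ABABABABABACADABACADABACADABACAD"

# Pass 1: measure the occurrence probability of each character
freq = Counter(message)
probs = {s: n / len(message) for s, n in freq.items()}
print(probs)  # {'A': 0.5, 'B': 0.25, 'C': 0.125, 'D': 0.125}

# Pass 2: build a Huffman code from the measured probabilities
tiebreak = count()
heap = [(p, next(tiebreak), {s: ""}) for s, p in probs.items()]
heapq.heapify(heap)
while len(heap) > 1:
    p1, _, c1 = heapq.heappop(heap)
    p2, _, c2 = heapq.heappop(heap)
    merged = {s: "0" + c for s, c in c1.items()}
    merged.update({s: "1" + c for s, c in c2.items()})
    heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
codes = heap[0][2]
print(codes)  # code word lengths come out as A:1, B:2, C:3, D:3
```

With the dyadic frequencies of this message (1/2, 1/4, 1/8, 1/8) the Huffman code lengths match the symbol information contents exactly.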
Lempel-Ziv Coding
• Huffman coding requires knowledge of a probabilistic model of the source
• This is not necessarily always feasible
• The Lempel-Ziv code is an adaptive coding technique that does not require prior knowledge of the symbol probabilities
• Lempel-Ziv coding is the basis of the well-known ZIP data compression
• GIF, TIFF, V.42bis modem compression standard, PostScript Level 2

Lempel-Ziv Coding History
• 1977 published by Abraham Lempel and Jacob Ziv
• 1984 LZ-Welch algorithm published in IEEE Computer
• Sperry patent transferred to Unisys (1986)
• GIF file format required use of the LZW algorithm

Properties:
• Universal
• Lossless
Lempel-Ziv Coding Example
Input sequence: 0 0 0 1 0 1 1 1 0 0 1 0 1 0 0 1 0 1 …

The codebook is initialized with the single symbols 0 and 1 (indices 1 and 2). The sequence is then parsed into the shortest subsequences not encountered before. Each new subsequence is an already-stored subsequence (its prefix) followed by one innovation bit; its representation is the index of the prefix followed by the index of the innovation symbol, and its source code is a 3-bit pointer to the prefix followed by the innovation bit:

Codebook Index   1   2   3      4      5      6      7      8      9
Subsequence      0   1   00     01     011    10     010    100    101
Representation           11     12     42     21     41     61     62
Source code              0010   0011   1001   0100   1000   1100   1101

Source encoded bits: 0010 0011 1001 0100 1000 1100 1101
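The parsing above can be reproduced with a few lines of Python (the 3-bit pointer width is chosen for this example, where the codebook never exceeds 8 prefixes):

```python
def lz_parse(bits):
    # Codebook pre-loaded with the single symbols, indices 1 and 2
    book = {"0": 1, "1": 2}
    blocks = []
    phrase = ""
    for b in bits:
        if phrase + b in book:
            phrase += b                       # known subsequence: keep extending
        else:
            book[phrase + b] = len(book) + 1  # store the new subsequence
            # fixed-length block: 3-bit pointer to the prefix + innovation bit
            blocks.append(f"{book[phrase]:03b}{b}")
            phrase = ""
    return book, blocks

book, blocks = lz_parse("000101110010100101")
print(blocks)  # ['0010', '0011', '1001', '0100', '1000', '1100', '1101']
```

The stored subsequences come out in exactly the codebook order of the table above.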
How Come this is Compression?!
• The hope is: if the bit sequence is long enough, eventually the fixed-length code words will be shorter than the subsequences they represent.
• When applied to English text:
• Lempel-Ziv achieves approximately 55% compression
• Huffman coding achieves approximately 43% compression
Encoding idea Lempel-Ziv-Welch (LZW)

Assume we have just read a segment w from the text, and a is the next symbol.
• If wa is not in the dictionary:
  • Write the index of w in the output file.
  • Add wa to the dictionary, and set w ← a.
• If wa is in the dictionary:
  • Process the next symbol with segment wa.
LZ Encoding example
• Dictionary initially: address 0: a, address 1: b, address 2: c
Input string: a a b a a c a b c a b c b

aa not in dictionary → output 0, add aa to the dictionary; continue with a, store ab in the dictionary; continue with b, store ba in the dictionary; aa in the dictionary, aac not; and so on:

output   update
0        aa 3
0        ab 4
1        ba 5
3        aac 6
2        ca 7
4        abc 8
7        cab 9

LZ Encoder: aabaacabcabcb → 0 0 1 3 2 4 7 …
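The encoding idea translates directly into code. Note that the table above stops at output 7 (the parse up to cab); running the full input also emits the codes for the remaining tail and flushes the final segment:

```python
def lzw_encode(text, alphabet="abc"):
    # Dictionary initially holds the basic symbol set: a->0, b->1, c->2
    dictionary = {s: i for i, s in enumerate(alphabet)}
    output = []
    w = ""
    for a in text:
        if w + a in dictionary:
            w += a                               # wa known: keep extending
        else:
            output.append(dictionary[w])         # write the index of w
            dictionary[w + a] = len(dictionary)  # add wa to the dictionary
            w = a                                # set w <- a
    if w:
        output.append(dictionary[w])             # flush the final segment
    return output

print(lzw_encode("aabaacabcabcb"))  # [0, 0, 1, 3, 2, 4, 7, 1, 2, 1]
```

The first seven codes match the slide's output 0 0 1 3 2 4 7.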
UNIVERSAL (LZW) (decoder)
1. Start with the basic symbol set.
2. Read a code c from the compressed file. The address c in the dictionary determines the segment w. Write w in the output file.
3. Add wa to the dictionary: a is the first letter of the next segment.
LZ Decoding example
• Dictionary initially: address 0: a, address 1: b, address 2: c
Input codes: 0 0 1 3 2 4 7 …

The first input 0 outputs a; the next input 0 again determines the segment a, and the update aa is stored; input 1 determines b, and ab is stored; and so on:

input   output   update
0       a
0       a        aa 3
1       b        ab 4
3       aa       ba 5
2       c        aac 6
4       ab       ca 7
7       ca       abc 8

Output string: a a b a a c a b c a …
LZ Decoder: 0 0 1 3 2 4 7 … → aabaacabca…
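A matching decoder sketch; it rebuilds the same dictionary one step behind the encoder, which is why each update uses the first letter of the *next* segment:

```python
def lzw_decode(codes, alphabet="abc"):
    # Start with the basic symbol set: 0->a, 1->b, 2->c
    dictionary = {i: s for i, s in enumerate(alphabet)}
    w = dictionary[codes[0]]
    output = [w]
    for c in codes[1:]:
        if c in dictionary:
            entry = dictionary[c]
        else:                       # code not yet in the dictionary: w + first letter of w
            entry = w + w[0]
        output.append(entry)
        dictionary[len(dictionary)] = w + entry[0]  # add w + first letter of next segment
        w = entry
    return "".join(output)

print(lzw_decode([0, 0, 1, 3, 2, 4, 7]))  # aabaacabca
```

Feeding in the full code stream for aabaacabcabcb reproduces the original message exactly, confirming that the code is lossless.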
Exercise

1. Find the Huffman code for the following source:

Symbol   Probability
r        0.05
w        0.05
o        0.25
l        0.4
d        0.05
e        0.1
h        0.1

2. Find the LZ code for the following input:
0011001111010100010001001
4. Fano Code.
The Fano code is constructed as follows:
1. arrange the information source symbols in order of decreasing probability
2. divide the symbols into two groups whose probabilities are as nearly equal as possible
3. each group receives one of the binary symbols (i.e. 0 or 1) as its first code symbol
4. repeat steps 2 and 3 within each group as many times as possible
5. stop when there are no more groups to divide
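The recursive splitting can be sketched as follows (a minimal illustration, not part of the slides):

```python
def fano(symbols):
    """Fano construction; symbols is a list of (symbol, probability)
    already arranged in order of decreasing probability."""
    codes = {s: "" for s, _ in symbols}

    def split(group):
        if len(group) < 2:
            return                          # step 5: nothing left to divide
        total = sum(p for _, p in group)
        # step 2: find the split making the two groups as equally probable as possible
        best, running, best_diff = 1, 0, total
        for i in range(1, len(group)):
            running += group[i - 1][1]
            diff = abs(2 * running - total)
            if diff < best_diff:
                best_diff, best = diff, i
        # step 3: one group receives 0, the other receives 1
        for j, (s, _) in enumerate(group):
            codes[s] += "0" if j < best else "1"
        split(group[:best])                 # step 4: repeat within each group
        split(group[best:])

    split(symbols)
    return codes

source = [("A", 1/4), ("B", 1/4), ("C", 1/8), ("D", 1/8), ("E", 1/16),
          ("F", 1/16), ("G", 1/32), ("H", 1/32), ("I", 1/32), ("J", 1/32)]
print(fano(source))  # A:00 B:01 C:100 D:101 E:1100 ... J:11111
```

When two splits are equally good (as in Example 2 below), either choice yields a valid Fano code with the same average length; this sketch simply takes the first such split.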
Example 1:
4. Fano Code.

Step 1: the symbols are already arranged in order of decreasing probability. Step 2: the first division gives the equally probable groups {A, B} (probability 1/2) and {C, …, J} (probability 1/2). Step 3: the first group receives 0 as its first code symbol, the second receives 1. Steps 2 and 3 are then repeated within each group ({A}|{B}, {C, D}|{E, …, J}, {E, F}|{G, H, I, J}, and so on) until every group contains a single symbol (step 5):

Symbol   Probability   Fano Code
A        1/4           00
B        1/4           01
C        1/8           100
D        1/8           101
E        1/16          1100
F        1/16          1101
G        1/32          11100
H        1/32          11101
I        1/32          11110
J        1/32          11111
4. Fano Code.
Note that: if it is not possible to divide the probabilities precisely into equally probable groups, we should make the division as good as possible, as we can see from the following example.

Example 2:

Symbol   Probability   Fano Code
T        1/3           00
U        1/3           01
V        1/9           10
W        1/9           110
X        1/9           111

Here no division is exactly even: the first split {T, U} (probability 2/3) against {V, W, X} (probability 1/3) is the best available.
5. Shannon Code.
The Shannon code is constructed as follows:
1. calculate the series of cumulative probabilities qk = p1 + p2 + … + pk−1, k = 1, 2, …, n (so q1 = 0)
2. calculate the code length for each symbol from log2(1/pk) ≤ lk < log2(1/pk) + 1
3. write qk in the form c1 2^-1 + c2 2^-2 + … + clk 2^-lk, where each ci is either 0 or 1; the code word of the symbol is c1 c2 … clk
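The three steps map directly onto code; a minimal sketch (illustrative, not from the slides), shown on the four-symbol source used at the end of this section:

```python
import math

def shannon(symbols):
    """symbols: list of (symbol, probability) sorted by decreasing probability."""
    codes = {}
    q = 0.0                               # cumulative probability q_k (q_1 = 0)
    for s, p in symbols:
        l = math.ceil(-math.log2(p))      # log2(1/p) <= l < log2(1/p) + 1
        # binary expansion of q_k to l places: q_k = c1*2^-1 + ... + cl*2^-l
        frac, bits = q, []
        for _ in range(l):
            frac *= 2
            bit = int(frac)
            bits.append(str(bit))
            frac -= bit
        codes[s] = "".join(bits)
        q += p
    return codes

source = [("W", 0.4), ("X", 0.3), ("Y", 0.2), ("Z", 0.1)]
print(shannon(source))  # {'W': '00', 'X': '01', 'Y': '101', 'Z': '1110'}
```

Running it on the ten-symbol source of Example 3 reproduces the code table derived below.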
Example 3:
5. Shannon Code.

Step 1: calculate the cumulative probabilities qk. Step 2: calculate the code lengths; e.g. for A: log2(1/(1/4)) ≤ l1 < log2(1/(1/4)) + 1, i.e. 2 ≤ l1 < 2 + 1, so l1 = 2. Step 3: write qk in the form c1 2^-1 + c2 2^-2 + … + clk 2^-lk; e.g. for A: 0 = c1 2^-1 + c2 2^-2 gives c1 = 0, c2 = 0, so the code word is 00.

Symbol   Probability   qk      Length lk   Shannon Code
A        1/4           0       2           00
B        1/4           1/4     2           01
C        1/8           1/2     3           100
D        1/8           5/8     3           101
E        1/16          3/4     4           1100
F        1/16          13/16   4           1101
G        1/32          7/8     5           11100
H        1/32          29/32   5           11101
I        1/32          15/16   5           11110
J        1/32          31/32   5           11111
5. Shannon Code.
Note that: from examples 1 and 3 one may conclude that Fano coding and Shannon coding produce the same code; however, this is not true in general, as we can see from the following example.

Example

Symbol   Probability   Fano code   qk    Length lk   Shannon Code
W        0.4           0           0     2           00
X        0.3           10          0.4   2           01
Y        0.2           110         0.7   3           101
Z        0.1           111         0.9   4           1110
6. Arithmetic Code.
Coding

In arithmetic coding a message is encoded as a number from the interval [0, 1).
The number is found by narrowing this interval according to the probability of the currently processed letter of the message being encoded.
This is done using a set of interval ranges IR determined by the probabilities of the information source as follows:

IR = { [0, p1), [p1, p1+p2), [p1+p2, p1+p2+p3), …, [p1+…+pn−1, p1+…+pn) }

Putting qj = p1 + p2 + … + pj, we can write IR = { [0, q1), [q1, q2), …, [qn−1, 1) }

These subintervals also determine the proportional division of any other interval [L, R) contained in [0, 1) into subintervals IR[L,R) as follows:

IR[L,R) = { [L, L+(R−L)q1), [L+(R−L)q1, L+(R−L)q2), …, [L+(R−L)qn−1, R) }

Using these definitions, arithmetic encoding is given by the following algorithm:

ArithmeticEncoding ( Message )
1. CurrentInterval = [0, 1);
While the end of the message is not reached:
2. Read letter xi from the message;
3. Divide CurrentInterval into subintervals IRCurrentInterval;
4. CurrentInterval = the subinterval of IRCurrentInterval corresponding to xi;
Output any number from the CurrentInterval (usually its left boundary).

This output number uniquely encodes the input message.
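The algorithm can be sketched as follows. Exact fractions are used so the interval boundaries come out without rounding error; the probability table is the one from the example below (A: 0.4, B: 0.3, C: 0.1, #: 0.2):

```python
from fractions import Fraction

# Source symbols with the end-of-message marker '#', in fixed order
PROBS = {"A": Fraction(4, 10), "B": Fraction(3, 10),
         "C": Fraction(1, 10), "#": Fraction(2, 10)}

def arithmetic_encode(message):
    low, high = Fraction(0), Fraction(1)   # 1. CurrentInterval = [0, 1)
    for x in message:                      # 2. read letter xi
        width = high - low
        q = Fraction(0)
        for symbol, p in PROBS.items():
            if symbol == x:                # 3.-4. select the subinterval of xi
                low, high = low + width * q, low + width * (q + p)
                break
            q += p
    return low                             # the left boundary encodes the message

code = arithmetic_encode("ABBC#")
print(float(code))  # 0.23608
```

A practical encoder would emit the number bit by bit and rescale the interval; this sketch keeps the whole interval in memory for clarity.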
6. Arithmetic Code.
Coding

Example: Consider the information source

Symbol        A     B     C     #
Probability   0.4   0.3   0.1   0.2

Then the input message ABBC# has the unique encoding number 0.23608, as explained in the next slides.
6. Arithmetic Code.
Coding

Example — input message: A B B C #

1. CurrentInterval = [0, 1); then repeatedly: 2. read xi; 3. divide CurrentInterval into the subintervals [L+(R−L)qi, L+(R−L)qi+1); 4. select the subinterval corresponding to xi:

Xi   Current interval   Subintervals (A, B, C, #)
A    [0, 1)             [0, 0.4), [0.4, 0.7), [0.7, 0.8), [0.8, 1)
B    [0, 0.4)           [0, 0.16), [0.16, 0.28), [0.28, 0.32), [0.32, 0.4)
B    [0.16, 0.28)       [0.16, 0.208), [0.208, 0.244), [0.244, 0.256), [0.256, 0.28)
C    [0.208, 0.244)     [0.208, 0.2224), [0.2224, 0.2332), [0.2332, 0.2368), [0.2368, 0.244)
#    [0.2332, 0.2368)   [0.2332, 0.23464), [0.23464, 0.23572), [0.23572, 0.23608), [0.23608, 0.2368)

Reading A selects the No. 1 subinterval [0, 0.4); reading B selects the No. 2 subinterval [0.16, 0.28); and so on. Reading the final # selects the No. 4 subinterval [0.23608, 0.2368).
# is the end of the input message → Stop → Return current interval [0.23608, 0.2368)

Return the lower bound of the current interval as the codeword of the input message:

Input message   Codeword
ABBC#           0.23608
6. Arithmetic Code.
Decoding

Arithmetic decoding is given by the following algorithm:

ArithmeticDecoding ( Codeword )
0. CurrentInterval = [0, 1);
While (1):
1. Divide CurrentInterval into subintervals IRCurrentInterval;
2. Determine the subinterval_i of CurrentInterval to which Codeword belongs;
3. Output the letter xi corresponding to this subinterval;
4. If xi is the symbol ‘#’, Return;
5. CurrentInterval = subinterval_i in IRCurrentInterval;
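A matching decoder sketch, again with exact fractions so the subinterval boundaries are compared without floating-point trouble (the PROBS table mirrors the example source):

```python
from fractions import Fraction

PROBS = {"A": Fraction(4, 10), "B": Fraction(3, 10),
         "C": Fraction(1, 10), "#": Fraction(2, 10)}

def arithmetic_decode(codeword):
    codeword = Fraction(codeword)          # exact arithmetic throughout
    low, high = Fraction(0), Fraction(1)   # 0. CurrentInterval = [0, 1)
    message = ""
    while True:
        width = high - low                 # 1. divide the current interval
        q = Fraction(0)
        for symbol, p in PROBS.items():
            sub_low = low + width * q
            sub_high = low + width * (q + p)
            if sub_low <= codeword < sub_high:  # 2. subinterval containing the codeword
                message += symbol               # 3. output the corresponding letter
                if symbol == "#":               # 4. stop at the end marker
                    return message
                low, high = sub_low, sub_high   # 5. shrink the current interval
                break
            q += p

print(arithmetic_decode("0.23608"))  # ABBC#
```

Note how the ‘#’ marker is what terminates the loop; without an end-of-message symbol the decoder could not tell where the message stops.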
6. Arithmetic Code.
Decoding

Example: Consider the information source

Symbol   Probability
A        0.4
B        0.3
C        0.1
#        0.2

Then the input codeword 0.23608 is decoded to the message ABBC#, as explained in the next slides.
6. Arithmetic Code.
Decoding

Example — input codeword: 0.23608

0. CurrentInterval = [0, 1); then steps 1–5 are repeated until the output symbol = ‘#’:

Current interval   Subintervals (A, B, C, #)                                                      Output
[0, 1)             [0, 0.4), [0.4, 0.7), [0.7, 0.8), [0.8, 1)                                     A
[0, 0.4)           [0, 0.16), [0.16, 0.28), [0.28, 0.32), [0.32, 0.4)                             B
[0.16, 0.28)       [0.16, 0.208), [0.208, 0.244), [0.244, 0.256), [0.256, 0.28)                   B
[0.208, 0.244)     [0.208, 0.2224), [0.2224, 0.2332), [0.2332, 0.2368), [0.2368, 0.244)          C
[0.2332, 0.2368)   [0.2332, 0.23464), [0.23464, 0.23572), [0.23572, 0.23608), [0.23608, 0.2368)  #

In the first step, 0 ≤ 0.23608 < 0.4, so the codeword lies in the No. 1 subinterval and the output letter is A; since A is not ‘#’, the current interval becomes [0, 0.4). The remaining rows follow in the same way.

4. If xi is the symbol ‘#’ → Yes → Stop
Return the output message: A B B C #