Multimedia Communications
Dr.-Ing. Aljoscha Smolic

FOR CLASS USE ONLY. DO NOT DISTRIBUTE.

Information Theory, Entropy Coding

Overview
Basic terms and principles of information theory and signal processing for multimedia
- Information theory, entropy coding
- Communication channel
- Sampling
- Quantisation
- Transformation
- Signal processing (filtering)
- Statistics
- Prediction

Materials

- J.-R. Ohm, Multimedia Communication Technology, Springer-Verlag
- Google, Wikipedia: information theory, entropy coding
- C.E. Shannon, "A Mathematical Theory of Communication", Bell System Technical Journal, vol. 27, pp. 379-423, 623-656, July, October 1948
  http://cm.bell-labs.com/cm/ms/what/shannonday/shannon1948.pdf

Relevance vs. Irrelevance

Relevance: parts of a message that matter to the receiver
Irrelevance: unimportant parts of a message
- Things we cannot perceive (tones above 20 kHz, infrared light, masking in mp3)
- Things the presentation device (e.g. display) cannot handle (conversion cinema -> TV)
Fundamental instrument of compression: irrelevance reduction, the detection and removal of irrelevant parts of messages
Example: mp3 detects frequencies we cannot hear and does not transmit them

JPEG

[Image example: original vs. compressed at 1:150]

Redundancy

Redundancy: parts of a message that follow from the rest, i.e. that can be reconstructed given the rest
E.g. "der weiße Schimmel" (a "white white horse", a pleonasm); "He jumped into the lake and got wet."
We can often find another, more compact representation:
Red, red, red, red, red, blue, blue, blue, red, red, red -> 5 x red, 3 x blue, 3 x red
Run-length coding, used e.g. in fax (black, white); see the sketch below
Fundamental instrument of compression: redundancy reduction, the detection and removal of redundant parts of messages

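A minimal run-length coding sketch in Python (illustrative helper names, not from the slides):

```python
# Run-length coding: collapse runs of equal symbols into (count, symbol)
# pairs, and expand them again; the reduction is fully reversible.
from itertools import groupby

def rle_encode(symbols):
    return [(len(list(run)), sym) for sym, run in groupby(symbols)]

def rle_decode(pairs):
    return [sym for count, sym in pairs for _ in range(count)]

seq = ["red"] * 5 + ["blue"] * 3 + ["red"] * 3
print(rle_encode(seq))                     # [(5, 'red'), (3, 'blue'), (3, 'red')]
assert rle_decode(rle_encode(seq)) == seq  # lossless
```
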
Reversibility

Redundancy reduction is fully reversible, i.e. the complete, exact information can be reconstructed
Irrelevance reduction is irreversible, i.e. the discarded parts cannot be reconstructed later
Redundancy relates to statistics and models
Irrelevance relates to human perception

Communication Plane

[Figure: messages plotted in a plane with axes relevancy and non-redundancy; the interesting messages are both relevant and non-redundant]

Information Content

Information content (IC): a numerical measure for the information that is actually contained in a message
A numerical measure for the predictability of an event within its context
Measured via the probability of a message/event:
- "The Tagesschau airs at 20:00" (as every day): low IC
- "The Tagesschau airs today at 20:30": high IC
The more surprising the message, the higher the numerical value of its IC
Information is surprise, uncertainty

Information Content

[Figure: two example signals over time t, one with low IC, one with high IC]

Information Content

[Figure: image example contrasting regions of low IC and high IC]

Information Content

j: event from the event space J, e.g. "Kopf" (heads) from {Kopf, Zahl} (heads, tails)
Probability of the outcome "Kopf" for a fair coin: P(j) = 1/2
Definition of information content (unit: bit):
i(j) = log2(1/P(j))
i("Kopf") = log2(1/(1/2)) = 1 bit

Information Content

Manipulated coin:
- Probability of "Kopf": P(j) = 1/4
- Probability of "Zahl": P(j) = 3/4
The event "Kopf" is more surprising -> higher IC:
i("Kopf") = log2(1/(1/4)) = 2 bit
i("Zahl") = log2(1/(3/4)) = 0.42 bit

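A small sketch of the definition i(j) = log2(1/P(j)), checked against the coin examples above:

```python
import math

def information_content(p):
    """Information content in bit of an event with probability p."""
    return math.log2(1.0 / p)

print(information_content(1/2))  # fair coin "Kopf": 1.0 bit
print(information_content(1/4))  # manipulated coin "Kopf": 2.0 bit
print(information_content(3/4))  # manipulated coin "Zahl": ~0.415 bit
```
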
Entropy

Entropy = mean information content of a source
Determines how many bit per symbol are necessary to encode a source binary (theoretical bound):
H(J) = Σ_{j=1..J} P(j) · i(j)
H("Coin") = 1/2 · 1 + 1/2 · 1 = 1 bit
H("Manipulated Coin") = 3/4 · 0.42 + 1/4 · 2 = 0.815 bit

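A minimal entropy sketch following the formula above (the slides' 0.815 uses the rounded 0.42; the exact value is about 0.811):

```python
import math

def entropy(probs):
    """Mean information content in bit per event of a source."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

print(entropy([1/2, 1/2]))  # fair coin: 1.0 bit
print(entropy([1/4, 3/4]))  # manipulated coin: ~0.811 bit
```
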
Entropy

The source "Coin" has higher entropy (mean information content) than the source "Manipulated Coin"
The source "Manipulated Coin" contains redundancy, it is predictable
In theory: coding the source "Coin" requires a mean of 1 bit per event (0 = Kopf, 1 = Zahl)
Coding the source "Manipulated Coin" requires a mean of 0.815 bit per event (see below how to do the code assignment)

Entropy

Exercise: calculate information content and entropy for the source "Würfel" (die), assuming equal probabilities for all events: P(j) = 1/6
i("1") = ... = i("6") = log2(1/(1/6)) = 2.58 bit
H("Würfel") = Σ_{j=1..6} (1/6) · 2.58 = 2.58 bit

Entropy

Exercise: calculate information content and entropy for the source "gezinkter Würfel" (loaded die), assuming the following probabilities:
P(1) = P(2) = P(3) = P(4) = 1/8; P(5) = P(6) = 2/8 = 1/4
i("1") = ... = i("4") = log2(1/(1/8)) = 3 bit
i("5") = i("6") = log2(1/(1/4)) = 2 bit

Entropy

i("1") = ... = i("4") = log2(1/(1/8)) = 3 bit
i("5") = i("6") = log2(1/(1/4)) = 2 bit
H("gezinkter Würfel") = Σ_{j=1..4} (1/8) · 3 + Σ_{j=5..6} (1/4) · 2 = 1.5 + 1 = 2.5 bit

Entropy "Würfel"

[Figure: P(j) = 1/6 and i(j) = 2.58 plotted over j = 1..6]
i("1") = ... = i("6") = log2(6) = 2.58 bit
H(J) = Σ_{j=1..6} (1/6) · 2.58 = 2.58 bit = log2(6)

Decision Content

Uniformly distributed source: all events have the same probability
Maximum entropy of a source with N events: decision content H0 = log2(N) = max(H)
- Equal to the information content of every possible event
- None of the events is favored
- The source is not predictable, completely random
- No redundancy
[Figure: P(j) = 1/N and i(j) = log2(N) plotted over j = 1..N; H(J) = log2(N) = H0]

Entropy of a Source with a Known Event

P(6) = 1, P(1) = ... = P(5) = 0
[Figure: P(j) and i(j) plotted over j = 1..6, with all probability on j = 6]
i("6") = log2(1/1) = 0
i("1") = ... = i("5") = log2(1/0) -> unbounded, but weighted by P(j) = 0 in the entropy sum
H(J) = Σ_{j=1..5} 0 · i(j) + 1 · 0 = 0 = min(H)
The source does not transmit information: H(J) = 0

Entropy "gezinkter Würfel"

P(1) = P(2) = P(3) = P(4) = 1/8; P(5) = P(6) = 1/4
[Figure: P(j) (1/8 and 1/4) and i(j) (3 and 2) plotted over j = 1..6]
i("1") = ... = i("4") = log2(1/(1/8)) = 3 bit
i("5") = i("6") = log2(1/(1/4)) = 2 bit
H("gezinkter Würfel") = Σ_{j=1..4} (1/8) · 3 + Σ_{j=5..6} (1/4) · 2 = 1.5 + 1 = 2.5 bit

Redundancy

Non-uniformly distributed source: events with different probabilities
Entropy smaller than the decision content: 0 <= H(J) <= H0 = log2(N)
The source contains redundancy, it is predictable: R = H0 - H(J)
[Figure: non-uniform P(j) and i(j) plotted over j = 1..N; 0 <= H(J) <= H0]

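A sketch computing decision content and redundancy for the loaded die from the earlier slide:

```python
import math

def entropy(probs):
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

probs = [1/8, 1/8, 1/8, 1/8, 1/4, 1/4]  # "gezinkter Würfel"
H0 = math.log2(len(probs))              # decision content: ~2.585 bit
H = entropy(probs)                      # 2.5 bit
print(H0 - H)                           # redundancy R: ~0.085 bit
```
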
Digital Representation

Continuous signals are analogue: the representation (e.g. a voltage) follows the physical phenomenon (sound, light)
Goal: digital representation, i.e. representation via a sequence of symbols
Mostly binary symbols (bits) with 2 possible values (states), e.g. 0 and 1, A and B
The physical representation of this state information is then given e.g. by 2 voltage values
[Figure: voltage U(t) switching on/off over time t to represent the bit sequence 1 0 1 1 0]

Digital Representation

Not the signal itself, but a different description of it is represented: the information
High quality at low cost
Error- and noise-free transmission, storage and copying become possible
Digital manipulation (editing) of the signals becomes possible, in software
Integration into multimedia becomes possible, new formats of content

Digital Representation

Coding: assignment of information to a symbol
Kopf = 1, Zahl = 0
Source alphabet (discrete) -> codebook, e.g. for colors:
rot (red) = 00, gelb (yellow) = 01, grün (green) = 10, weiß (white) = 11
[Figure: a sequence of four colored squares, gelb rot grün gelb, encoded as 01 00 10 01]

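A tiny sketch of such a codebook as a lookup table (identifiers are illustrative):

```python
codebook = {"rot": "00", "gelb": "01", "gruen": "10", "weiss": "11"}

def encode(symbols):
    """Concatenate the fixed-length code words of a symbol sequence."""
    return "".join(codebook[s] for s in symbols)

print(encode(["gelb", "rot", "gruen", "gelb"]))  # 01001001
```
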
Digital Representation

A codebook with N bit can represent 2^N code words:
N = 1: code words 0, 1 => 2 source events can be represented (source alphabet Kopf, Zahl)
N = 2: code words 00, 01, 10, 11 => 4 source events can be represented (e.g. the four colors above)

Fixed Length Coding

Source alphabet (discrete) -> codebook: a source with M events requires a codebook with N >= log2(M)
"Würfel": M = 6 -> N >= log2(6) = 2.58 bit -> N = 3 bit
E.g. 000 = 1, 001 = 2, 010 = 3, 011 = 4, 100 = 5, 101 = 6
Unused code words: 110, 111
German alphabet: M = 26 -> N >= log2(26) = 4.7 bit -> N = 5 bit
E.g. 00000 = a, 00001 = b, ...
Possible code words with 5 bit: 2^5 = 32 -> 6 code words unused

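A sketch of the rule N = ceil(log2(M)), checked against the examples above and the two exercises that follow:

```python
import math

def fixed_length_bits(m):
    """Bits per code word needed for m source events."""
    return math.ceil(math.log2(m))

for m in (6, 26, 18, 26**3):  # die, alphabet, Bundesliga, airport codes
    n = fixed_length_bits(m)
    print(m, n, 2**n - m)     # unused code words: 2, 6, 14, 15192
```
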
Fixed Length Coding

Exercise: create a binary fixed-length code for the teams of the Fussball Bundesliga (18 teams). Which code length is required? How many code words are not used?
M = 18 => N >= log2(18) bit = 4.17 bit => N = 5 bit
E.g. 00000 = Hertha BSC, 00001 = Bayern Muenchen, ...
Possible code words with 5 bit: 2^5 = 32 -> 14 code words unused

Fixed Length Coding

Exercise: create a binary fixed-length code for international airport codes (e.g. TXL, SFO), assuming 26 letters. How many elements does the source alphabet contain? Which code length is required? How many code words are not used?
M = 26 · 26 · 26 = 26^3 = 17576 => N >= log2(17576) bit = 14.10 bit => N = 15 bit
Possible code words with 15 bit: 2^15 = 32768 -> 15192 code words unused

Reminder: Information Content

Manipulated coin:
- Probability of "Kopf": P(j) = 1/4
- Probability of "Zahl": P(j) = 3/4
The event "Kopf" is more surprising -> higher IC:
i("Kopf") = log2(1/(1/4)) = 2 bit
i("Zahl") = log2(1/(3/4)) = 0.42 bit

Reminder: Entropy

Entropy = mean information content of a source
Determines how many bit per symbol are necessary to encode a source binary (theoretical bound):
H(J) = Σ_{j=1..J} P(j) · i(j)
H("Coin") = 1/2 · 1 + 1/2 · 1 = 1 bit
H("Manipulated Coin") = 3/4 · 0.42 + 1/4 · 2 = 0.815 bit

Combined Events

Define a combined event: double manipulated coin
Source: KK, KZ, ZK, ZZ (K = Kopf, Z = Zahl)
Probabilities:
P(KK) = 1/4 · 1/4 = 1/16
P(KZ) = 1/4 · 3/4 = 3/16
P(ZK) = 3/4 · 1/4 = 3/16
P(ZZ) = 3/4 · 3/4 = 9/16

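A sketch of these joint probabilities as products, assuming independent tosses:

```python
from itertools import product

p = {"K": 1/4, "Z": 3/4}  # manipulated coin
pairs = {a + b: p[a] * p[b] for a, b in product("KZ", repeat=2)}
print(pairs)  # {'KK': 0.0625, 'KZ': 0.1875, 'ZK': 0.1875, 'ZZ': 0.5625}
```
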
Variable Length Codes

Variable length code (VLC), arbitrarily defined (not a Huffman code!):
1 = ZZ, 000 = ZK, 001 = KZ, 010 = KK
Mean bit length per event (Σ [probability · number of bits for event j]):
N(J) = Σ_{j=1..J} P(j) · N(j) = 9/16 · 1 + 3/16 · 3 + 3/16 · 3 + 1/16 · 3 = 1.875 bit

Variable Length Codes

Mean number of bits per combined event: 1.875 bit
I.e. coding of one single event requires a mean of 1.875 / 2 = 0.9375 bit
Coding as single events requires 1 bit per event
The use of combined events and variable length codes adapted to the probabilities allows an approximation of the theoretical bound given by the entropy, 0.815 bit

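A sketch checking the mean code length of this ad-hoc VLC:

```python
code = {"ZZ": "1", "ZK": "000", "KZ": "001", "KK": "010"}
prob = {"KK": 1/16, "KZ": 3/16, "ZK": 3/16, "ZZ": 9/16}

mean_bits = sum(prob[e] * len(code[e]) for e in code)
print(mean_bits)      # 1.875 bit per combined event
print(mean_bits / 2)  # 0.9375 bit per single coin toss
```
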
Combined Events

Define a combined event: triple manipulated coin
Source: KKK, KKZ, KZK, KZZ, ZKK, ZKZ, ZZK, ZZZ
2^3 = 8 events

Combined Events

Probabilities:
P(KKK) = 1/4 · 1/4 · 1/4 = 1/64
P(KKZ) = 1/4 · 1/4 · 3/4 = 3/64
P(KZK) = 1/4 · 3/4 · 1/4 = 3/64
P(KZZ) = 1/4 · 3/4 · 3/4 = 9/64
P(ZKK) = 3/4 · 1/4 · 1/4 = 3/64
P(ZKZ) = 3/4 · 1/4 · 3/4 = 9/64
P(ZZK) = 3/4 · 3/4 · 1/4 = 9/64
P(ZZZ) = 3/4 · 3/4 · 3/4 = 27/64

Variable Length Codes

Variable length code (Huffman):
KKK = 00000, KKZ = 00001, KZK = 00010, KZZ = 010, ZKK = 00011, ZKZ = 011, ZZK = 001, ZZZ = 1

Variable Length Codes

Mean bit length per event (Σ [probability · number of bits for event j]):
N(J) = Σ_{j=1..J} P(j) · N(j)
     = 1/64 · 5 + 3/64 · 5 + 3/64 · 5 + 3/64 · 5 + 9/64 · 3 + 9/64 · 3 + 9/64 · 3 + 27/64 · 1
     = 2.469 bit

Variable Length Codes

Mean number of bits per combined event: 2.469 bit
I.e. coding of one single event requires a mean of 2.469 / 3 = 0.823 bit
Coding as single events requires 1 bit per event
The use of combined events and variable length codes adapted to the probabilities allows an approximation of the theoretical bound given by the entropy, 0.815 bit

Variable Length Codes

Defining longer combined events and adequate VLCs allows an even better approximation of the entropy
In theory: for combined events of infinite length the entropy is reached for every source
But: infinite processing time (delay)
In practice: optimization of efficiency and delay for the given application

Shannon's Theorem

The entropy of a source determines the minimum bitrate required for error-free transmission of the source symbols.
It can be achieved by block codes and appropriate variable length coding.

Entropy Coding

Purpose: lossless coding of a discrete source A = {a1, a2, ..., aJ} using a codebook C = {c1, c2, ..., cJ}
Goal: compression exploiting the redundancy of the source
Theoretical bound: the entropy of the source determines the minimum mean number of bits per symbol required for binary coding of the source A:
H(A) = Σ_{j=1..J} P(aj) · i(aj)

Entropy Coding

Instruments: variable length coding, block codes (for combined events)
Code words of different bit lengths are assigned according to the probability of the source events:
events with high probability get short code words, unlikely events get longer code words
On average this results in a reduction of the overall bitrate

Code Rate

N(aj): bit length of the code word for aj
Rate = average bit length of the code words of a code:
R = Σ_{j=1..J} P(aj) · N(aj) >= Σ_{j=1..J} P(aj) · i(aj) = H(A)
The code must be decodable unambiguously:
- Separators between code words: increases the rate
- Prefix-free codes: no code word may be the beginning of any other code word

Code Tree

Prefix-free codes can be represented by a code tree
[Figure: binary code tree with labels Wurzel (root), Knoten (node), Zweig (branch), Blatt (leaf); each branch carries a 0 or 1, and the leaves are the code words "00", "100", "011", "111", "0100", "0101", "1010", "1011", "1100", "1101"]

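A sketch of why prefix-free codes are decodable without separators, using the triple-coin Huffman code from the earlier slide (helper names are illustrative):

```python
code = {"ZZZ": "1", "ZZK": "001", "KZZ": "010", "ZKZ": "011",
        "KKK": "00000", "KKZ": "00001", "KZK": "00010", "ZKK": "00011"}
table = {cw: sym for sym, cw in code.items()}

def decode(bits):
    """Walk the bit string; a match is unambiguous because
    no code word is a prefix of another."""
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in table:
            out.append(table[buf])
            buf = ""
    return out

print(decode("1001010"))  # ['ZZZ', 'ZZK', 'KZZ'], no separators needed
```
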
Huffman Codes

[Figure: Huffman tree construction for p(7) = 0.29, p(6) = 0.28, p(5) = 0.16, p(4) = 0.14, p(3) = 0.07, p(2) = 0.03, p(1) = 0.02, p(0) = 0.01; intermediate node probabilities 0.03, 0.06, 0.13, 0.27, 0.43, 0.57; resulting code words: 7 = "11", 6 = "10", 5 = "01", 4 = "001", 3 = "0001", 2 = "00001", 1 = "000001", 0 = "000000"]

Huffman Codes

Algorithm of Huffman coding:
1. Sort the events by probability
2. Combine the two least probable branches, assigning 0 and 1
3. Create a new combined branch with the sum of the probabilities
4. Combine again the two least probable branches
5. Repeat until the root is reached
A code tree is created; the code words result from reading from the root to the leaves (a sketch follows below)

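A compact heap-based Huffman sketch, checked against the example tree above (illustrative names; ties are broken by insertion order):

```python
import heapq
from itertools import count

def huffman(probs):
    """Return {symbol: code word} for a dict of event probabilities."""
    tick = count()  # tie-breaker so the heap never compares dicts
    heap = [(p, next(tick), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)  # two least probable branches
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tick), merged))
    return heap[0][2]

probs = {7: 0.29, 6: 0.28, 5: 0.16, 4: 0.14, 3: 0.07, 2: 0.03, 1: 0.02, 0: 0.01}
code = huffman(probs)
print(sum(probs[s] * len(w) for s, w in code.items()))  # rate: 2.49 bit
```
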
Huffman Codes

In this example:
Fixed length coding: M = 8 => N = log2(8) = 3 bit
Huffman code rate: R = Σ_{j=1..J} P(aj) · N(aj) = 2.49 bit
Entropy: H(A) = 2.45 bit