
STATISTIC & INFORMATION THEORY

(CSNB134)

MODULE 9: INFORMATION CONTENT,

ENTROPY & CODING EFFICIENCY

Information Uncertainty

In Module 8, we have learned that the basic model of Information Theory is:

Here, information is generated at the source, sent through a channel and consumed at the drain.

In other words, information is transmitted from the sender to the receiver. Prior to the transmission, the receiver has no idea what the content of the information is.

This implies the concept of information as a random variable and ‘Information Uncertainty’ (i.e. the receiver is uncertain of the content of the information until he/she has received it through the transmission channel).

Information Uncertainty & Probability

Consider the following statements:

(a) Tomorrow, the sun will rise from the East.

(b) The phone will ring in the next 1 hour.

(c) It will snow in Malaysia this December.

Everybody knows the sun always rises in the East. The probability of this event is almost 1. Thus, this statement hardly carries any information.

The phone may or may not ring in the next 1 hour. The probability of this event is less than the probability of the event in statement (a). Statement (b) carries more information than statement (a).

It has never snowed in Malaysia. The probability of this event is almost 0. Statement (c) carries the most information of the three.

Information Uncertainty & Probability (cont.)

Therefore, we can conclude that the amount of information carried by each statement (or the information content of a single event) is inversely proportional to the probability of that event.

The formula is:

I(x_i) = \log \frac{1}{P(x_i)} = -\log P(x_i)

The unit of I(x_i) is determined by the base of the logarithm. Since in the digital world the basic unit is the bit, the formula becomes:

I(x_i) = \log_2 \frac{1}{P(x_i)} = -\log_2 P(x_i)
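As a quick illustration (not part of the original slides), the short Python sketch below evaluates I(x_i) in bits for the three example statements, using assumed probabilities of 0.999, 0.5 and 0.001; the exact values are only for illustration, the point being that a less probable event carries more information.

import math

def information_content(p: float) -> float:
    # Information content in bits: I(x) = -log2 P(x).
    return -math.log2(p)

# Assumed illustrative probabilities (not given in the slides):
events = {
    "the sun rises in the East tomorrow": 0.999,
    "the phone rings in the next 1 hour": 0.5,
    "it snows in Malaysia this December": 0.001,
}

for name, p in events.items():
    print(f"P = {p:<5}  I = {information_content(p):8.4f} bits  ({name})")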

Entropy

Entropy is defined as the weighted sum of all information contents.

Entropy is denoted H(X), and its formula in base 2 is:

H(X) = \sum_{i=1}^{n} P(x_i) \log_2 \frac{1}{P(x_i)} = -\sum_{i=1}^{n} P(x_i) \log_2 P(x_i)

Note: It is weighted / normalized by multiplying each information content by the probability of its event.
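A minimal Python sketch of this formula is given below (not part of the slides); the four-symbol distribution used in the example is assumed, chosen only for illustration.

import math

def entropy(probabilities) -> float:
    # H(X) = -sum_i P(x_i) * log2 P(x_i), in bits.
    # Terms with P(x_i) = 0 contribute nothing, by convention.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# An assumed four-symbol distribution, chosen only for illustration.
print(entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75 bits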

Binary Entropy Function

Consider a discrete binary source that emits a sequence of statistically independent symbols. The output is either a ‘0’ with probability p or a ‘1’ with probability 1-p. The entropy of the binary source is:

H(X) = -\sum_{i=1}^{n} P(x_i) \log_2 P(x_i) = -\{[p \log_2 p] + [(1-p) \log_2 (1-p)]\}

The overall Binary Entropy Function can be plotted as:

[Figure: plot of the binary entropy function H(X) against p]
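The sketch below (a Python illustration, not part of the slides) tabulates the binary entropy function at a few values of p; the curve is symmetric about p = 0.5, where it reaches its maximum of 1 bit.

import math

def binary_entropy(p: float) -> float:
    # H(p) = -[p*log2(p) + (1 - p)*log2(1 - p)], with 0*log2(0) taken as 0.
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Tabulate a few points of the curve.
for p in (0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0):
    print(f"p = {p:.1f}  H = {binary_entropy(p):.6f} bits")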

Exercise 1

Calculate the entropy of a binary source that has an equal probability of transmitting a ‘1’ and a ‘0’.

Equal probability of transmitting a ‘1’ and a ‘0’ means p = 0.5.

H(X) = -\sum_{i=1}^{n} P(x_i) \log_2 P(x_i)
H(X) = -[(0.5) \log_2(0.5)] - [(0.5) \log_2(0.5)]
H(X) = -2[(0.5) \log_2(0.5)]
H(X) = -\ln(0.5) / \ln(2)
H(X) = -(-0.693147) / 0.693147 = 1

Note: If your calculator does not have the log_2 function, you may use the formula log_b x = ln x / ln b.

Exercise 2

Calculate the entropy of a binary source that only transmits ‘1’ (i.e. it transmits ‘1’ all the time!).

Always transmitting a ‘1’ means p = 1.

H(X) = -\sum_{i=1}^{n} P(x_i) \log_2 P(x_i)
H(X) = -[(1) \log_2(1)] - [(0) \log_2(0)]
H(X) = 0

Note: the term (0) \log_2(0) is taken as 0 by convention.

Exercise 3

Calculate the entropy of a binary source that transmits a ‘1’ after every series of nine ‘0’s.

Transmitting a ‘1’ after every nine ‘0’s gives the pattern 0000000001, which means p = 0.1 and 1-p = 0.9.

H(X) = -\sum_{i=1}^{n} P(x_i) \log_2 P(x_i)
H(X) = -[(0.1) \log_2(0.1)] - [(0.9) \log_2(0.9)]
H(X) = -[(0.1)(-3.32193)] - [(0.9)(-0.15200)]
H(X) = 0.468993

Exercise 4

Assume a binary source is used to represent the outcome of a cast of a fair dice. What is the entropy of the dice in bits?

A fair dice gives p = 1/6 for each of the 6 possible outcomes.

H(X) = -\sum_{i=1}^{n} P(x_i) \log_2 P(x_i)
H(X) = -(6)(1/6) \log_2(1/6)
H(X) = \log_2 6 = 2.584962
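The result can be checked with a few lines of Python (an illustrative sketch, not from the slides): for a uniform source with 6 outcomes, H(X) = log_2 6.

import math

# Entropy of a fair dice: six equally likely outcomes, P(x_i) = 1/6 each.
H = -sum((1 / 6) * math.log2(1 / 6) for _ in range(6))
print(H)             # 2.584962500721156
print(math.log2(6))  # the same value, since H(X) = log2(n) for a uniform source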

Recaps…

Entropy is the weighted sum of all information contents.

In the example of the fair dice whose outcome is represented in bits, the entropy is 2.584962.

This can be concluded as: 2.584962 is the recommended average number of bits needed to sufficiently describe the outcome of a single cast.

However, we know that by using fixed length coding we need an average of 3 bits to represent 6 symbols (Note: we denote Ṝ as the average bits per symbol, thus Ṝ = 3).

We know that 2^3 = 8, thus it is sub-optimal to represent only 6 symbols with 3 bits.

Note: there is a difference between what is recommended and what is actually used!

Average Number of Bits per Symbol

Previously, we denoted Ṝ as the average number of bits per symbol.

In fixed length coding (FLC), Ṝ is equal to n bits, where 2^n = the maximum number of symbols that can be represented with n bits.

In fixed length coding, all symbols are represented by the same fixed number of bits.

There is also another type of coding known as variable length coding (VLC). Here, the symbols need not all be represented by the same number of bits.

In both FLC and VLC, the formula for Ṝ is (where l(x_i) = the codeword length for symbol i):

Ṝ = \sum_{i=1}^{n} l(x_i) P(x_i)
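As an illustration (not from the slides), the sketch below evaluates Ṝ for a 3-bit FLC over six equally likely symbols, and for an assumed VLC whose codeword lengths are chosen so that more probable symbols get shorter codewords.

def average_length(lengths, probabilities) -> float:
    # Average number of bits per symbol: R = sum_i l(x_i) * P(x_i).
    return sum(l * p for l, p in zip(lengths, probabilities))

# FLC for 6 equally likely symbols: every codeword is 3 bits long.
print(average_length([3] * 6, [1 / 6] * 6))                     # 3.0

# An assumed VLC over four symbols: likelier symbols get shorter codewords.
print(average_length([1, 2, 3, 3], [0.5, 0.25, 0.125, 0.125]))  # 1.75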

Coding Efficiency

The formula for entropy (H) is:

H(X) = -\sum_{i=1}^{n} P(x_i) \log_2 P(x_i)

Whereas the formula for the average number of bits per symbol (Ṝ) is:

Ṝ = \sum_{i=1}^{n} l(x_i) P(x_i)

From here we can derive the coding efficiency (η) as:

η = H / Ṝ

Exercise 5

Calculate the coding efficiency of FLC for the example of the fair dice, where the entropy is 2.584962 and the average number of bits per symbol using FLC is 3.

An ideal optimal coding would yield an efficiency of 1.

η = H / Ṝ = 2.584962 / 3 = 0.861654
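For completeness, the efficiency in Exercise 5 can be reproduced with a short Python check (an illustrative sketch, not part of the slides):

import math

# Coding efficiency of the 3-bit FLC for the fair dice: efficiency = H / R.
H = math.log2(6)  # entropy of the fair dice, about 2.584962 bits
R = 3             # average bits per symbol using fixed length coding
print(H / R)      # about 0.861654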

STATISTIC & INFORMATION THEORY

(CSNB134)

INFORMATION CONTENT, ENTROPY & CODING EFFICIENCY

--END--