Upload
others
View
13
Download
0
Embed Size (px)
Citation preview
Feb
ruar
y 6,
200
5LH
-1
Hig
her-
Ord
er L
ossl
ess
Sou
rce
Cod
es
(1)
Blo
ck c
odes
(a
ka d
ictio
nary
-bas
ed c
odes
)
(a)
Fix
ed-t
o-fix
ed le
ngth
cod
es (
FF
LC)
(b)
Fix
ed-t
o-va
riabl
e le
ngth
cod
es (
FV
LC)
(incl
udin
g th
e fir
st-o
rder
cod
es d
iscu
ssed
ear
lier)
(c)
Var
iabl
e-to
-fix
ed-le
ngth
(V
FLC
)
(d)
Var
iabl
e-to
-var
iabl
e-le
ngth
cod
es (
VV
LC)
(e)
Run
-leng
th c
odes
(a
typ
e V
FLC
or
VV
LC)
(2)
Con
ditio
nal c
odes
(3)
Arit
hmet
ic c
odes
(4)
Uni
vers
al/a
dapt
ive
code
s
For
war
d (t
wo
pass
)or
Bac
kwar
d/on
e pa
ss
Dic
tiona
ry b
ased
(pr
inci
pally
, Le
mpe
l-Ziv
typ
e) o
rS
tatis
tical
(p
rinci
pally
, ad
aptiv
e ar
ithm
etic
cod
ing)
We
will
dis
cuss
(1
b) a
nd (
2) h
ere.
W
ill d
iscu
ss o
ther
s la
ter
as t
ime
pe
rmits
.
Feb
ruar
y 6,
200
5LH
-2
Exa
mpl
e of
FV
LC
So
urc
e:
stat
iona
ry r
and.
pro
cess
{X
n},
alph
abet
A =
{1,
2,3}
, H
(X)
= lo
g 2 3
= 1
.585
P(X
n=i)
= 1 3
, P
(Xn+
1=j|X
n=i)
= 1 2
, j=
i1 4
, j≠i
firs
t-o
rder
co
de
P(X
=x)
xco
de
wo
rd
1/3
10
1/3
210
1/3
311
rate
1.67
----
----
----
----
----
----
----
----
----
----
----
----
How
is it
tha
t w
e ha
ve b
eate
n th
een
trop
y?
As
prev
ious
ly d
efin
ed,
the
entr
opy
is a
low
er b
ound
onl
y to
the
perf
orm
ance
of
first
-ord
er c
odin
g.
se
con
d-o
rder
co
de
P(X
n=x 1
,Xn+
1=x 2
)x 1
x 2co
de
wo
rd1/
611
000
1/12
1201
101/
1213
0111
1/12
2110
01/
622
001
1/12
2310
11/
1231
110
1/12
3211
11/
633
010
avg
leng
th19
/6
rate
1.58
3
Feb
ruar
y 6,
200
5LH
-3
For
mal
ities
of F
ixed
-to-
Var
iabl
e-Le
ngth
Blo
ck C
odes
So
urc
e
Dis
cret
e-tim
e, d
iscr
ete-
valu
ed w
ith a
lpha
bet
A =
{a 1
,...}
= {
1, 2
, ...
},
finite
or c
ount
ably
infin
ite
Mod
eled
as
a di
scre
te-t
ime
rand
om p
roce
ss
Not
atio
n: X
or
{X
n} o
r {
Xn}
∞ n=-∞
or
{X
n}∞ n=
1
Ass
ume
prob
abili
ty d
istr
ibut
ion
is k
now
n,
i.e.
Pr(
X1=
x 1,..
.,Xk=
x k)
iskn
own
for
all c
hoic
es o
f k
and
(x
1,…
,xk)
Ass
ume
X i
s st
atio
nary
, i.e
.
Pr(
X1=
x 1,..
.,Xk=
x k)
= P
r(X
n+1=
x 1,..
.,Xn+
k=x k
)
for
any
n an
d x 1
,…,x
k
Seq
uenc
e no
tatio
n:
Ak
= a
ll se
quen
ces
x =
(x 1
,…,x
k)
of le
ngth
k
fro
m a
lpha
bet
A,
If A
ha
s M
sy
mbo
ls,
then
A
k has
M
k s
eque
nces
.
Alte
rnat
e no
tatio
n:
Ak
= {a
1, a
2, …
},
whe
re e
ach
ai
is s
ome
sequ
ence
of
leng
th k
fro
m
A.
Pro
babi
litie
s:
(P1,
P2,
…),
w
here
P
i = P
r((X
1,…
,Xk)
= a
i) o
r p
(x)
=p X
(x)
= P
r(X
=x)
Feb
ruar
y 6,
200
5LH
-4
Fix
ed-t
o-v
aria
ble
len
gth
(F
VL
) C
od
es
An
FV
L co
de w
ith (
inpu
t) b
lock
leng
th (
aka
"ord
er"
or "
dim
ensi
on")
k
divi
des
the
sour
ce s
eque
nce
into
blo
cks
of le
ngth
k
enco
des
each
blo
ck w
ith a
uni
quel
y de
coda
ble,
var
iabl
e-le
ngth
cod
eha
ving
one
cod
ewor
d fo
r ev
ery
poss
ible
sou
rce
sequ
ence
of
leng
th
k.
Thu
s, it
is e
ssen
tially
the
sam
e as
a f
irst-
orde
r va
riabl
e-le
ngth
cod
e,ex
cept
tha
t a
larg
er c
odeb
ook
is n
eede
d be
caus
e on
e co
dew
ord
isne
eded
for
eac
h bl
ock
of le
ngth
k,
as
opp
osed
to
a sm
alle
r co
debo
okw
ith o
ne c
odew
ord
for
each
ind
ivid
ual
sour
ce s
ymbo
ls.
The
key
cha
ract
eris
tics
of s
uch
a co
de a
re:
k =
blo
ckle
ngth
,
C =
{ v 1
, v2,
… }
=
code
book
, un
ique
ly d
ecod
eabl
e (u
sual
ly p
refix
),w
ith o
ne c
odew
ord
for
each
seq
uenc
e x
in
A
k ,
Cod
ewor
d le
ngth
s l 1
,l 2,..
.,
whe
re
l i =
l(v i
) =
l(i)
= l(
a i)
Eac
h co
dew
ord
has
the
form
v i
= (
v i,1,..
.,v i,l
i) ∈
{0,1
}*,
whe
re
{0,1
}*
is t
he s
et o
f al
l fin
ite-le
ngth
bin
ary
sequ
ence
s.
e =
en
codi
ng r
ule,
whi
ch
assi
gns
a co
dew
ord
in
C
to e
ach
x
in A
k .
Spe
cific
ally
, z
= e
(x)
= v
i w
hen
x =
ai.
deco
ding
pro
cedu
re d
epen
ds o
n th
e co
debo
ok.
Feb
ruar
y 6,
200
5LH
-5
The
ope
ratio
n of
an
FV
L co
de w
ith b
lock
leng
th
k =
3
is il
lust
rate
d be
low
:
sou
rce
seq
ue
nce
en
cod
ed
b
ina
ry s
eq
ue
nce
X
X
X
X
X
X
X
X
X
X
...
1 2
3
4 5
6
7
8
9
10
e(X
X
X
)1 2 3
e(X
X
X
)4 5 6
e(X
X
X
)7 8 9
e(X
.
..)
10
The
per
form
ance
of
the
code
is d
eter
min
ed b
y its
Ave
rage
len
gth
– l =
E l(
X)
= E
l(e(
X))
= ∑ x∈
Ak
l(
x) p
(x)
=
∑ i
l i P
i
Ra
te
R =
– l k
bits
/sou
rce
sym
bol
Feb
ruar
y 6,
200
5LH
-6
Key
Que
stio
ns
1.
How
to
desi
gn b
lock
leng
th F
VL
code
s w
ith in
put
bloc
klen
gth
ksm
all a
vera
ge le
ngth
or
rate
?
2.
How
sm
all c
an b
e th
e ra
te o
f an
FV
L co
de w
ith b
lock
engt
h k
?
3.
How
sm
all c
an b
e th
e ra
te o
f an
y F
VL
code
with
any
blo
ckle
ngth
wha
tsoe
ver?
W
hat
valu
e of
k
is b
est?
An
swe
rs
1.
How
to
desi
gn F
VL
code
s w
ith s
mal
l ave
rage
leng
th o
r ra
te?
Use
sam
e ap
proa
ches
as
befo
re,
i.e.
appl
y S
hann
on o
r H
uffm
ande
sign
str
ateg
ies
to t
he s
et c
onta
inin
g pr
obab
ilitie
s of
(
P1,
P2,
… ),
equi
vale
ntly
to
{p(
x):
x ∈
Ak }
Feb
ruar
y 6,
200
5LH
-7
2.
How
sm
all c
an b
e th
e ra
te o
f an
FV
L co
de w
ith b
lock
engt
h k
?
Def
ine:
− l* k =
sm
alle
st a
vg le
ngth
am
ong
all p
refix
(or
UD
) co
des
with
bloc
klen
gth
k
R* k
= − l* k k
=
sm
alle
st r
ate
amon
g al
l FV
L: c
odes
with
blo
ckle
ngth
k
Fro
m t
he C
odin
g T
heor
em f
or f
irst-
orde
r co
des,
we
dedu
ce
FV
L C
od
ing
Th
eore
m 1
:
H(X
1,...
,Xk)
≤
− l* k <
H(X
1,...
,Xk)
+ 1
Hk(
X)
≤
R* k
< H
k(X
) +
1 k
wh
ere H
(X1,
...,X
k)
= -
∑ i P
i log
2 P
i =
-∑ x∈
Ak
p
(x)
log 2
p(x
)
=
entr
opy
of
X1,
...,X
k
Hk
= H
k(X
) =
1 k H(X
1,...
,Xk)
= k
th-o
rder
ent
ropy
of
X
Feb
ruar
y 6,
200
5LH
-8
3.
How
sm
all c
an b
e th
e ra
te o
f an
y F
VL
code
with
any
blo
ckle
ngth
wha
tsoe
ver?
W
hat
valu
e of
k
is b
est?
Incr
easi
ng
k d
ecre
ases
the
1 k ter
m,
whi
ch is
a g
ood
thin
g.
But
we
also
need
to
know
how
H
k w
ith
k?
Fac
t 1:
F
or a
sta
tiona
ry r
ando
m p
roce
ss,
Hk+
1 ≤
Hk
≤ H
1
Hk'
s co
nver
ge t
o a
limit.
lim k→∞ H
k =
inf k
Hk
Pro
of:
The
ine
qual
ities
will
be
prov
ed l
ater
whe
n w
e di
scus
s co
nditi
onal
codi
ng.
Sin
ce t
he
Hk'
s a
re n
onin
crea
sing
and
bou
nded
fro
m b
elow
by
zero
,th
ey m
ust
appr
oach
a li
mit.
(R
esul
t fr
om a
naly
sis
(e.g
. M
ath
451)
.)
Sin
ce t
heH
k's
are
non
incr
easi
ng t
he li
mit
is a
lso
the
inf.
Def
ine: H
∞ =
lim k→
∞ H
k =
ent
ropy
rat
e of
the
sou
rce
X
Feb
ruar
y 6,
200
5LH
-9
Cod
ing
The
orem
1 a
nd F
act
1 im
ply
that
tha
t ra
te
R
of a
ny
code
with
bloc
klen
gth
k
satis
fies
R ≥
Hk
≥ H
∞
Tha
t is
, no
FV
L co
de w
hats
oeve
r ca
n ha
ve r
ate
less
tha
n H
∞.
Mor
eove
r, f
or a
ny s
mal
l num
ber
ε,
we
can
choo
se
k la
rge
enou
gh t
hat
Hk(
X)
≤ H
∞ +
ε 2
and
1 k ≤ ε 2
.
The
n C
odin
g T
heor
em 1
impl
ies
that
for
thi
s k
, t
here
is a
cod
e w
ith r
ate
R
=
R* k
< H
k(X
) +
1 k ≤
H∞ +
ε 2 +
ε 2 =
H∞ +
ε.
Sin
ce
ε c
an b
e ar
bitr
arily
sm
all,
this
sho
ws
ther
e ar
e co
des
with
rat
ear
bitr
arily
clo
se t
o H
∞.
In
som
e ca
ses,
the
re m
ay a
ctua
lly b
e a
code
with
rate
H
∞,
but
in o
ther
s th
ere
is n
ot.
In
any
even
t, w
e ha
ve p
rove
d th
efo
llow
ing:
FV
L C
od
ing
Th
eore
m 2
: F
or a
sta
tiona
ry s
ourc
e X
, t
he "
leas
t" a
vera
gera
te o
f F
VL
code
s of
any
blo
ckle
ngth
is
R* F
VL
=∆ i
nf k
R* k
= H
∞.
Equ
ival
ently
, no
FV
L co
de c
an h
ave
rate
less
tha
n H
∞,
and
for
any
ε >
0,
ther
e is
a c
ode
with
rat
e le
ss t
han
H∞
+ ε
.
Feb
ruar
y 6,
200
5LH
-10
Not
es a
nd E
xam
ples
(1)
The
R
* k's
nee
d no
t de
crea
se m
onot
onic
ally
. (
The
y ar
e su
badd
itive
.)
Exa
mpl
e:
Con
side
r a
bina
ry I
ID s
ourc
e w
ith
p(0
)=.7
, p
(1)
= .
2
R* 1
= 1
,
{0,
1}
is a
n op
timal
cod
e w
ith b
lock
leng
th 1
R* 2
= 1.
81 2 =
.905
, {0
, 10
, 11
0, 1
11}
is a
n op
timal
cod
e w
ith b
lock
leng
th 2
R* 3
= 2.
726
3 =
.908
7{0
0, 1
0, 0
10, 0
11, 1
100,
110
1, 1
110,
111
1}is
an
optim
al c
ode
with
blo
ckle
ngth
3
(2)
R* k
is s
ubad
ditiv
e,
i.e.
for
any
pos
itive
inte
gers
k
and
l,
R* k+
l ≤ k k+
l R* k
+ l l+k
R* l
As
we
have
dis
cuss
ed p
revi
ousl
y, a
sub
addi
tive
sequ
ence
has
a l
imit
that
equa
ls it
s in
f. T
here
fore
, su
badd
itivi
ty im
plie
s
R*
= i
nf k R
* k =
lim k→
∞ R
* k
Sub
addi
tivity
impl
ies
that
R
* k d
ecre
ases
whe
n k
in
crea
ses
by a
n in
tege
rm
ultip
le:
R* 1
≥ R
* k ≥
R* mk
≥ R
* f
or a
ll po
sitiv
e in
tege
rs
k a
nd
m
Feb
ruar
y 6,
200
5LH
-11
Pro
of
of
sub
add
itiv
ity:
Let
Ck
and
C
l be
cod
es w
ith b
lock
leng
ths
k
and
l,
resp
ectiv
ely,
tha
t ha
vera
tes
R* k
and
R
* l,
resp
ectiv
ely.
Le
t C
be
the
blo
ckle
ngth
k+
l co
deob
tain
ed b
y us
ing
Ck
on
the
first
k
sou
rce
sym
bols
and
C
l on
the
nex
t l
sour
ce s
ymbo
ls.
Tha
t is
, C
= C
k×C
l. T
hen,
the
ave
rage
of
leng
th o
f C
is
sum
of
the
aver
age
leng
ths
of
Ck
and
C
l, i.
e.
– l(C
) =
– l(C
k) +
– l(C
l) ,
and
the
rate
of
C
R(C
) =
– l(C
)k+
l =
k k+l – l(
Ck)
k +
l k+l – l(
Cl)
l
=
k k+l R
(Ck)
+
l k+l R
(Cl)
=
k k+l R
* k +
l k+l R
* l
Sin
ce
R* k
≤ R
(C),
w
e ha
ve f
rom
the
abo
ve t
hat
R
* k ≤
k k+
l R* k
+
l k+l R
* l
(3)
In m
ost
case
s, t
here
is n
o co
de w
ith r
ate
exac
tly e
qual
to
H∞
,
and
ther
e is
no
code
with
blo
ckle
ngth
k
and
rat
e ex
actly
equ
al t
o H
k.
(4)
We
will
sho
w la
ter
that
for
an
IID s
ourc
e (i.
e. s
tatio
nary
and
mem
oryl
ess)
H1
= H
2 =
... =
Hk
= H
∞
Thi
s ha
ppen
s on
ly w
hen
the
sour
ce i
s m
emor
yles
s.
So
ther
e is
les
s be
nefit
to in
crea
sing
k
for
an
IID s
ourc
e th
an f
or a
sou
rce
with
mem
ory.
Feb
ruar
y 6,
200
5LH
-12
(5)
We
will
sho
w la
ter
that
for
a (
first
-ord
er)
Mar
kov
sour
ce
Hk
= k-
1 k H
(X1,
X2)
- k-
2 k H
(X1)
=
1 k H
(X1)
+ k-
1 k H
(X2|
X1)
,
whe
re
H(X
2|X
1)
is t
he c
ondi
tiona
l ent
ropy
to
be d
efin
ed la
ter.
Exa
mpl
e:
A s
tatio
nary
Mar
kov
sour
ce
X
alph
abet
A =
{1,
2,3}
, P
(Xn=
i) =
1 3 ,
P(X
n+1=
j|Xn=
i) =
1 2 ,
j=i
1 4 , j≠
ik
12
3∞
Hk
log 2
3
1.58
5
1 2 (lo
g 2 3
+ 3 2)
1.54
2
1 3 (lo
g 2 3
+ 2
3 2)
1.52
8
3 2
1.50
0
R* k
1.66
71.
583
1.55
61.
500
Hk
+ 1 k
2.58
52.
043
1.86
11.
500
Hk
+ P
max k
1.91
81.
626
1.55
61.
500
1.0
00
1.5
00
2.0
00
2.5
00
3.0
00
02
46
81
01
2
blo
ckle
ng
th
Hk
Hk
+ 1/
k
R* k
Feb
ruar
y 6,
200
5LH
-13
(6)
"Red
unda
ncy
of a
cod
e"
=∆ R
ate
- H
∞
By
Cod
ing
The
orem
1:
redu
ndan
cy o
f th
e be
st F
VL
code
with
blo
ckle
ngth
k
is a
t m
ost
1/k.
(7)
Tig
hter
upp
er b
ound
on
R* k
(G
alla
ger1 )
It
can
be s
how
n th
at
− l* k ≤
H(X
1,...
,Xk)
+ P
max
if
Pm
ax <
1/2
H(X
1,...
,Xk)
+ P
max
+ 0
.086
if P
max
≥ 1
/2
whe
re
Pm
ax
=
max
{P1,
P2,
… }
. H
ence
,
R* k
≤ H
k(X
)+ P
max
/k i
f P
max
< 1
/2
Hk(
X)
+ P
max
/k +
0.0
86/k
if P
max
≥ 1
/2
Sin
ce
Pm
ax
decr
ease
s w
ith k
(u
sual
ly q
uite
rap
idly
), t
his
indi
cate
s th
atre
dund
ancy
dec
reas
es m
ore
rapi
dly
than
1/k
.
1 R.G
. Gal
lage
r, "
Var
iatio
ns o
n a
them
by
Huf
fman
," IE
EE
Tra
ns. I
nfor
m. T
hy.,
pp. 6
68-6
74, N
ov. 1
978.
Feb
ruar
y 6,
200
5LH
-14
(8)
Alth
ough
the
re a
re o
ther
typ
es o
f lo
ssle
ss c
odes
bes
ides
FV
L bl
ock
code
s,
it tu
rns
out
that
for
a s
tatio
nary
sou
rce,
no
code
can
hav
e ra
te le
ssth
an
H∞
. T
hus
FV
L co
des
perf
orm
as
wel
l as
any
code
s.
(9)
Com
plex
ity:
A
n F
VL
code
with
blo
ckle
ngth
k
for
a s
ourc
e w
ith a
n M
-sy
mbo
l alp
habe
t ha
s M
k co
dew
ords
. T
hus,
com
plex
ity o
f th
is t
ype
ofco
ding
gro
ws
rapi
dly
(exp
onen
tially
) w
ith k
.
Feb
ruar
y 6,
200
5LH
-15
Est
imat
es o
f H
k fo
r V
ario
us L
angu
ages
M=
|A|
log 2
MH
1H
2H
3H
4H
∞
Eng
lish
264.
704.
143.
853.
671.
3E
nglis
h27
4.75
4.03
3.68
3.48
Eng
lish
264.
704.
12E
nglis
h27
4.75
4.09
3.66
3.39
3.21
Fre
nch
264.
703.
98S
pani
sh26
4.70
4.02
Ge
rma
n26
4.70
4.10
Po
rtu
gu
ese
26?
4.70
?3.
923.
723.
53R
uss
ian
365.
174.
554.
003.
653.
42S
am
oan
174.
093.
403.
042.
832.
69T
amil
304.
914.
34A
rab
ic32
5.00
4.21
3.99
3.49
Ch
ine
se47
0012
.20
9.63
Ref
eren
ce:
Tex
t C
ompr
essi
on,
Bel
l, C
lear
y, W
itten
, p.
95.
(Som
e "c
onve
rsio
n" w
as n
eede
d to
obt
ain
H2,
H3,
H4
from
the
tab
le g
iven
ther
e.)
Feb
ruar
y 6,
200
5LH
-16
Con
ditio
nal L
ossl
ess
Cod
es
Exa
mp
le:
X, s
tatio
nary
, alp
habe
t: A
= {
1,2,
3},
P(X
n=i)
= 1 3
, P
(Xn+
1=j|X
n=i)
= 1 2
, j=
i1 4
, j≠i
firs
t-o
rder
co
de
P(X
=x)
xco
de
wo
rd1/
31
01/
32
101/
33
11R
ate
1.67
seco
nd
-ord
er c
od
e
P(X
n=x 1
,Xn+
1=x 2
)x 1
x 2co
de
wo
rd
1/6
1100
01/
1212
0110
1/12
1301
111/
1221
100
1/6
2200
11/
1223
101
1/12
3111
01/
1232
111
1/6
3301
0av
g le
ngth
19/
6R
ate
1.58
3
con
dit
ion
al c
od
e
pr
evio
us s
ymbo
l Xn
-1
Xn
12
3
11/
20
1/4
111/
410
21/
410
1/2
01/
411
31/
411
1/4
101/
20
− l1.
51.
51.
5
Rat
e =
1.5
bits
/sym
bol
=
H∞
Is c
ode
uniq
uely
dec
odab
le?
Yes
, kn
owin
g th
e pr
evio
us s
ymbo
l, th
ede
code
r kn
ows
whi
ch p
refix
cod
e w
asus
ed t
o en
code
the
cur
rent
sym
bol.
It is
sai
d to
be
"bac
kwar
d ad
aptiv
e".
Feb
ruar
y 6,
200
5LH
-17
sour
cese
quen
ce
enco
ded
bina
ry s
eque
nce
X
X
X
X
X
X
X
X
...
1
2
3
4
5
6
7
8
e(X
|X
)2
1 e(
X |
X )
3
2e(
X |
X )
5
4e(
X |
X )
7
6
e(X
|X
)4
3
e(X
|X
)6
5
e(X
|X
)8
7
Con
ditio
nal L
ossl
ess
Cod
ing:
For
mal
Intr
oduc
tion
Ass
ume
disc
rete
-val
ued,
sta
tiona
ry s
ourc
e X
, as
usu
al.
Bas
ic id
ea:
Whe
n en
codi
ng
Xn,
use
a c
ode
desi
gned
for
the
con
ditio
nal p
roba
bilit
ydi
strib
utio
n of
X
n g
iven
the
val
ue o
f th
e pr
evio
us s
ymbo
l X
n-1.
Tha
t is
, w
hen
Xn-
1=a,
us
e a
pref
ix c
ode
Ca
= {
v a,1
, va,
2, …
} w
ith le
ngth
s {
l a,1
, la,
2, ..
. }
desi
gned
for
p(1|
a), p
(2|a
), p
(3|a
), ..
.
wh
ere
p(i|a
) =
Pr(
Xn=
i | X
n-1=
a )
Con
ditio
nal R
ate:
giv
en
Xn-
1=a,
Ra
= – l a
= ∑
i l a
,i p(
i|a)
Ove
rall
Rat
e:
R =
∑ a p
(a)
– l a =
∑a
∑ i l a,i
p(a)
p(i|
a)
Feb
ruar
y 6,
200
5LH
-18
If fo
r ea
ch a
, C
a is
opt
imal
for
( p
(1|a
), p
(2|a
), p
(3|a
), ..
. ),
then
H(X
|a)
≤ R
a <
H(X
|a)
+ 1
wh
ere
H(X
|a)
= H
(X2|
X1=
a) =
-∑ i
p(i|
a) lo
g 2 p
(i|a)
= c
ond'
l ent
r’y o
f X
2 gi
ven
X1=
a
and
the
over
all r
ate
is
H(X
2|X
1) ≤
R <
H(X
2|X
1) +
1
wh
ere
H(X
2|X
1) =
∑ a p
(a)
H(X
2|X
1=a)
=
-∑ a
∑ i p
(a)
p(i|a
) lo
g 2 p
(i|a)
=
cond
ition
al e
ntro
py o
f X
2 gi
ven
X1
Not
e ho
w s
imila
r th
is n
otat
ion
is t
o th
at o
f H
(X2|
X1=
a).
The
y ar
e no
t th
e sa
me.
Co
nd
itio
nal
Lo
ssle
ss C
od
ing
Th
eore
m:
For
a s
tatio
nary
sou
rce
X,
H(X
2|X
1) ≤
R* c
< H
(X2|
X1)
+ 1
,
whe
re
R* c
den
otes
the
leas
t ra
te o
f co
nditi
onal
cod
es u
sed
to e
ncod
eso
urce
X
.
Feb
ruar
y 6,
200
5LH
-19
No
tes
:
(1)
Thi
s is
a k
ind
of b
ackw
ard
adap
tive
codi
ng.
The
enc
oder
and
dec
oder
adap
t th
e co
de b
ased
on
the
prev
ious
sym
bol.
(2)
It is
not
abs
olut
ely
esse
ntia
l tha
t th
e C
a's
be
pref
ix c
odes
. H
owev
er,
if no
t, on
e w
ould
hav
e to
req
uire
a k
ind
of "
mut
ual"
uni
que
deco
dabi
lity,
nam
ely,
tha
t an
y fin
ite s
eque
nce
of c
odew
ords
fro
m a
ny o
f th
e co
des
be d
ecod
able
in o
nly
one
way
.
(3)
The
upp
er b
ound
R
* c <
H(X
2|X
1)+
1 c
an b
e tig
hten
ed b
y us
ing
the
boun
d R
< H
(X2|
X1)
+p c
,max
+0.
0836
, w
here
p c
,max
is
the
larg
est v
alue
of
p(i|a
) o
ver
all c
hoic
es o
f i
and
a.
Feb
ruar
y 6,
200
5LH
-20
Pre
dict
ive/
Diff
eren
tial C
odin
g
A s
peci
al k
ind
of c
ondi
tiona
l cod
e
FV
Len
code
rF
VL
deco
der
Exa
mp
le:
Alp
habe
t of
X:
AX =
{0,
1,2}
Alp
habe
t of
U:
AU =
{-2
,-1,
0,1,
2}
Cod
eboo
k fo
r U
: C
U =
{110
, 01,
00,
10,
111
}
Equ
ival
ent
cond
ition
al e
ncod
ing
tabl
e
prev
ious
sym
bol X
n-1
Xn
01
2
000
0111
01
1000
012
111
1000
Com
plex
ity:
Com
pute
U
n =
Xn-
Xn-
1
Sto
re ju
st t
he o
ne c
odeb
ook
CU
with
2|
A|-
1 co
dew
ords
Ra
te:
R ≅
H(U
)
Ofte
n, H
(U)
≅ H
(X2|
X1)
Imp
rove
me
nt:
Rep
lace
"de
lay"
with
a p
redi
ctor
~ Xn
= g
(Xn-
1,X
n-2,
...)
Wid
ely
used
in
loss
less
im
age
codi
ng,
e.g.
"lo
ssle
ss J
PE
G".
Feb
ruar
y 6,
200
5LH
-21
Impr
oved
Ver
sion
s of
Con
ditio
nal L
ossl
ess
Cod
ing
The
re a
re t
wo
way
s to
im
prov
e co
nditi
onal
cod
ing.
(1)
mth
-ord
er c
ondi
tiona
l enc
odin
g:
cond
ition
on
m p
ast
sym
bols
(ra
ther
tha
n on
on
e)
(2)
(m
,k)
cond
ition
al b
lock
cod
ing:
co
nditi
onal
ly e
ncod
e k
sym
bols
at
atim
e (r
athe
r th
an o
ne).
Feb
ruar
y 6,
200
5LH
-22
(1)
mth
-ord
er c
on
dit
ion
al c
od
ing
In m
th-o
rder
con
ditio
nal c
odin
g, w
hen
enco
ding
sym
bol
Xn
one
use
s a
pref
ixco
debo
ok d
eter
min
ed b
y th
e pr
evio
us
m
sym
bols
Xn-
m,…
,Xn-
1.
Tha
t is
, fo
r ev
ery
m-t
uple
x =
(x 1
,…,x
m)
in A
m ,
the
re is
a p
refix
cod
eboo
k
Cx
= {
v x,1
, vx,
2, …
}w
ith l
engt
hs { lx,
1, l x
,2, …
}an
d a
dist
inct
cod
ewor
d fo
r ev
ery
sym
bol i
n A
. W
hen
(X
n-m
,…,X
n-1)
= x
,th
en
Xn
is e
ncod
ed w
ith c
odeb
ook
Cx.
The
ope
ratio
n of
a 2
nd-o
rder
con
ditio
nal c
ode
is il
lust
rate
d be
low
:
sourc
ese
quence
enco
ded
bin
ary
sequence
X X
X
X
X
X
...
1
2
3
4
5
6
e(X
|X
X
)
3
2
1
e(X
|X
X
)
4
3
2(X |X
X
)
5
4
3
e(X
|X
X
)
6
5
4
Def
init
ion
:
R* 1|
m =
leas
t ra
te o
f m
th-o
rder
con
ditio
nal c
odin
g fo
r so
urce
X
Feb
ruar
y 6,
200
5LH
-23
An
alys
is
Giv
en t
hat
(X
n-m
,…,X
n-1)
= x
, t
he c
ondi
tiona
l ave
rage
rat
e of
the
cod
e is
Rx
= – l x
= ∑
i
l x,i
p(i|x
)
whe
re
p(i|x
) =
Pr(
Xn=
i|Xn-
m,…
,Xn-
1=x)
,
and
the
over
all r
ate
is
R =
∑ x p
(x)
Rx
= ∑ x
p(x
) – l x
whe
re
p(x)
= P
r(X
n-1
n-m
=x)
= P
r(X
m 1=
x),
(
the
latte
r by
sta
tiona
rity)
.
If fo
r ea
ch
x, C
x is
opt
imal
for
(p(
1|x)
, p(
2|x)
, p(
3|x)
, ...
),
then
H(X
n|x)
≤ R
x <
H(X
n|x)
+ 1
wh
ere
H(X
n|x)
=
H(X
n|X
n-m
,…,X
n-1=
x) =
- ∑ i
p(i|
x) lo
g 2 p
(i|x)
=
con
ditio
nal e
ntro
py o
f X
2 g
iven
Xn-
m,…
,Xn-
1=x.
The
ove
rall
rate
is
R* 1|
m
and
from
the
abo
ve
H1|
m ≤
R* 1|
m <
H1|
m +
1
wh
ere
H1|
m =
H(X
n|X
n-m
,…,X
n-1)
= ∑ a
p(x
) H
(Xn|
Xn-
m,…
,Xn-
1=x)
= c
ondi
tiona
l ent
ropy
of
H(X
n|X
n-m
,…,X
n-1)
Aga
in,
note
how
sim
ilar
this
nam
e is
to
that
of
H(X
n|X
n-m
,…,X
n-1=
x).
Feb
ruar
y 6,
200
5LH
-24
Est
imat
es o
f H
(Xn|
Xn-
m,…
,Xn-
1) f
or v
ario
us la
ngua
ges
H1|
m
=∆ H
(Xn|
Xn-
m,…
,Xn-
1)
M=
|A|
log 2
MH
(X1)
H1|
2H
1|3
H1|
4H
1|8
H1|
12H
1|∞
Eng
lish
264.
704.
143.
563.
31.
3
Eng
lish
274.
754.
033.
323.
1
Eng
lish
264.
704.
12
Eng
lish
274.
754.
093.
232.
852.
662.
432.
40
Fre
nch
264.
703.
98
Spa
nish
264.
704.
02
Ger
man
264.
704.
10
Por
tugu
ese
26?
4.70
?3.
923.
513.
15
Rus
sian
365.
174.
553.
442.
952.
722.
452.
40
Sam
oan
174.
093.
402.
682.
402.
282.
162.
14
Tam
il30
4.91
4.34
Ara
bic
325.
004.
213.
772.
49
Chi
nese
4700
12.2
09.
63
Ref
eren
ce:
Tex
t C
ompr
essi
on,
Bel
l, C
lear
y, W
itten
, p.
95.
Feb
ruar
y 6,
200
5LH
-25
(2)
Co
nd
itio
nal
Blo
ck C
od
ing
.
In (
k|m
)-co
nditi
onal
-blo
ck e
ncod
ing,
just
as
with
FV
L en
codi
ng w
ithbl
ockl
engt
h k
, o
ne d
ivid
es t
he s
ourc
e se
quen
ce in
to b
lock
s of
leng
th
k.
The
n as
with
mth
-ord
er c
ondi
tiona
l cod
ing,
whe
n en
codi
ng a
blo
ckbe
ginn
ing
with
sym
bol
Xn,
i.e
. th
e bl
ock
(X
n,…
,Xn+
k-1)
on
e us
es a
pref
ix c
odeb
ook
dete
rmin
ed b
y th
e pr
evio
us
m
sym
bols
X
n-m
,…,X
n-1.
Tha
t is
, fo
r ev
ery
m-t
uple
x
= (
x 1,…
,xm
) in
A
m ,
the
re is
a p
refix
code
book
C
x =
{ v x
,1, v
x,2,
… }
with
leng
ths
{ l x
,1, l
x,2,
… }
and
one
dist
inct
cod
ewor
d fo
r ev
ery
sequ
ence
in
Ak .
W
hen
(X
n-m
,…,X
n-1)
= x
,th
en
(Xn,
…,X
n+k-
1)
is e
ncod
ed w
ith c
odeb
ook
Cx.
The
ope
ratio
n of
a (
1,2)
-con
d'l b
lock
cod
e is
illu
stra
ted
belo
w:
sourc
ese
quence
enco
ded
bin
ary
sequence
X X
X
X
X
X
X
..
. 1
2
3
4
5
6
7
e(X
X
|X
)
3
2
1e(X
X
|X
)
7
6
5e(X
X
|X
)
5
4
3
Def
init
ion
:
R* k|
m
=
leas
t ra
te o
f an
y (
k|m
)-co
nditi
onal
cod
e fo
r X
Feb
ruar
y 6,
200
5LH
-26
An
alys
is
To
allo
w m
ore
com
pact
exp
ress
ions
, le
t X
v u d
enot
e th
e se
quen
ce
(Xu,
…,X
v).
Giv
en th
at X
n-1
n-m
= x
, t
he a
vera
ge r
ate
of t
he c
ode
is
Rx
=
1 k – l x =
1 k ∑ i
l x,i
p(a i
|x)
whe
re
p(a i
|x)
= P
r(X
n+k-
1n
= a i
|Xn-
1n-
m =
x),
an
d th
e ov
eral
l rat
e is
R =
1 k ∑ x p
(x)
Rx
=
1 k ∑ x p
(x)
– l x
whe
re
p(x)
= P
r(X
n-1
n-m
=x)
= P
r(X
m 1=
x),
(th
e la
tter
by s
tatio
narit
y).
If C
x is
opt
imal
for
(p(
a 1|x
), p
(a2|
x), p
(a3|
x), .
.. ),
th
en
1 k H(X
n+k-
1n
|x)
≤
Rx
<
1 k H
(Xn+
k-1
n|x
) +
1 k
wh
ere
H(X
n+k-
1n
|x)
=
H(X
n+k-
1n
|Xn-
1n-
m=x
) =
- ∑ i
p(a
i|x)
log 2
p(a
i|x)
=
con
ditio
nal e
ntro
py o
f X
n+k-
1
giv
en X
n-1
n-m
=x.
and
from
the
abo
ve,
the
over
all r
ate
is
Hk|
m ≤
R* k|
m <
Hk|
m +
1 k
Feb
ruar
y 6,
200
5LH
-27
wh
ere
Hk|
m =
1 k H
(Xn+
k-1
n|X
n-1
n-m
)
=
(k|m
)-th
ord
er e
ntro
py
and
H(X
n+k-
1n
|Xn-
1n-
m)
= ∑ x
p(x
) H
(Xn+
k-1
n|x
) =
- ∑ x
∑ i p
(x)
p( a
i|x)
log 2
p(a
i|x)
=
cond
ition
al e
ntro
py o
f X
n+k-
1
giv
en X
n-1
n-m
Aga
in,
note
how
sim
ilar
this
nam
e is
to
that
of
H(X
n+k-
1n
|Xn-
1n-
m=x
).
Feb
ruar
y 6,
200
5LH
-28
We
now
sum
mar
ize
and
take
the
lim
it.
Co
din
g T
heo
rem
fo
r C
on
dit
ion
al B
lock
Co
des
:
For
a s
tatio
nary
sou
rce
X a
nd in
tege
rs
k≥1,
m≥0
(a)
Hk|
m ≤
R* k|
m <
Hk|
m +
1 k ,
wh
ere
R* k|
m
deno
tes
the
leas
t ra
te o
f an
y (k
|m)
cond
ition
al b
lock
cod
e
Hk|
m =
1 k H
(Xn+
k-1
n|X
n-1
n-m
) is
the
(k|
m)-
th o
rder
ent
ropy
.
Mo
reo
ver,
(b)
lim k→∞
R* k|
m =
H∞,
for
any
m
(c)
H∞ ≤
lim m→
∞R
* k|m
< H
∞ +
1 k ,
for
any
k.
Whe
n m
=0,
w
e in
terp
ret
a (k
|m)
cond
ition
al b
lock
cod
e to
mea
n an
unco
nditi
onal
blo
ck c
ode.
T
his
theo
rem
gen
eral
izes
tho
se o
n pp
. 7,
9.
Pro
of:
F
or
m=
0,
(a)
and
(b)
wer
e es
tabl
ishe
d in
the
the
orem
s on
pp.
7,9
.F
or
m≥1
, (
a) w
as e
stab
liish
ed o
n pr
evio
us p
ages
. (
b) a
nd (
c) f
ollo
w f
rom
(a)
and
Pro
pert
ies
15 a
nd 1
6, g
iven
late
r, w
hich
sho
w
lim k→∞H
k|m
= H
∞ f
or a
ny m
, a
nd
lim
m→
∞H
k|m
= H
∞ f
or a
ny k
Thi
s th
eore
m d
emon
stra
tes
that
one
can
app
roac
h th
e le
ast
poss
ible
rat
e, H
∞,
in a
num
ber
of w
ays.
Feb
ruar
y 6,
200
5LH
-29
Co
mp
lexi
ty o
f co
nd
itio
nal
an
d c
on
dit
ion
al b
lock
co
des
Sup
pose
the
sou
rce
alph
abet
con
tain
s M
sy
mbo
ls.
A d
irect
impl
emen
tatio
n of
an
(k|m
) co
nditi
onal
blo
ck c
ode
requ
ires
stor
ing
the
Mk
cod
ewor
ds o
f ea
ch o
f th
e M
m
code
s C
x.
Tha
t is
, th
eto
tal s
tora
ge r
equi
red
for
enco
ding
or
deco
ding
is p
ropo
rtio
nal t
o M
k+m
.
Thu
s co
mpl
exity
gro
ws
expo
nent
ially
with
m
an
d k
.
Sup
pose
we
fix a
val
ue
K
and
requ
ire t
hat
k+
m =
K,
the
reby
fix
ing
the
com
plex
ity,
at le
ast
appr
oxim
atel
y.
Wha
t va
lues
of
k
and
m
lead
to
the
low
est
rate
?
The
ans
wer
dep
ends
on
the
sour
ce,
so t
here
is
no u
nive
rsal
ly b
est
choi
ce.
One
can
use
H
k|m
+ 1
/k
as a
gui
de t
o th
e ch
oice
of
k
and
m.
The
fac
ts g
iven
late
r sh
ow t
hat
amon
g ch
oice
s of
k
,m
such
tha
tk+
m=
K,
Hk|
m
is s
mal
lest
whe
n k
=1
and
m
=K
-1,
i.e.
H1|
K-1
is
sma
llest
.
Feb
ruar
y 6,
200
5LH
-30
Exa
mp
le:
A h
ypo
thet
ical
20t
h-o
rder
Mar
kov
sou
rce
wit
h
H1=
10,
H
∞ =
2
Hk|
m
02468
10
12
05
10
15
20
25
bloc
k si
ze k
bits/symbol
m=
0
m=
1
m=
2
m=
4
m=
6
m=
8
m=
10
mem
ory
leng
th
Feb
ruar
y 6,
200
5LH
-31
1 5 9 13 1
7m=0
m=2
m=4
m=6
m=8
m=10
012345678910
bits/symbol
bloc
k si
ze k
Hk|
m
9-1
0
8-9
7-8
6-7
5-6
4-5
3-4
2-3
1-2
0-1
mem
ory
Feb
ruar
y 6,
200
5LH
-32
Tab
le o
f H
k|m
m =
01
23
45
67
89
10
11
12
k=
11
08.
217.
476.
96.
426
5.62
5.27
4.94
4.6
4.34
4.07
3.8
29.
117.
847.
196.
666.
25.
815.
445.
14.
84.
494.
213.
943.
68
38.
567.
536.
936.
46.
015.
635.
284.
94.
644.
354.
073.
813.
6
48.
037.
156.
66.
155.
755.
385
4.73
4.43
4.15
3.88
3.6
3.38
57.
626.
86.
345.
915.
525.
24.
854.
544.
253.
983.
73.
473.
23
67.
36.
586.
15.
695.
34.
994.
674.
384.
13.
83.
573.
333.
09
77
6.34
5.89
5.5
5.15
4.82
4.51
4.22
3.9
3.69
3.44
3.2
2.96
86.
746.
135.
75.
324.
984.
664.
364.
13.
813.
563.
313.
072.
84
96.
515.
95.
525.
154.
824.
514.
23.
943.
683.
433.
192.
952.
75
10
6.3
5.77
5.37
5.02
4.69
4.4
4.1
3.83
3.57
3.33
3.09
2.86
2.67
11
6.14
5.62
5.23
4.88
4.6
4.27
3.99
3.72
3.47
3.22
2.99
2.78
2.61
12
5.95
5.45
5.07
4.7
4.42
4.13
3.86
3.6
3.35
3.12
2.91
2.71
2.56
13
5.76
5.28
4.9
4.59
4.28
43.
733.
473.
243.
042.
842.
662.
52
14
5.59
5.1
4.77
4.45
4.15
3.87
3.61
3.37
3.15
2.96
2.78
2.61
2.48
15
5.4
4.97
4.63
4.31
4.02
3.75
3.5
3.28
3.08
2.9
2.73
2.57
2.45
16
5.26
4.82
4.49
4.18
3.89
3.64
3.41
3.2
3.01
2.84
2.68
2.54
2.42
17
5.1
4.68
4.35
4.05
3.78
3.54
3.32
3.13
2.95
2.79
2.64
2.5
2.4
18
4.95
4.54
4.22
3.94
3.68
3.46
3.25
3.06
2.9
2.75
2.6
2.48
2.37
19
4.81
4.41
4.1
3.84
3.6
3.38
3.18
3.01
2.85
2.71
2.57
2.45
2.35
20
4.67
4.29
43.
743.
523.
313.
122.
962.
812.
672.
542.
432.
34
The
val
ues
on a
dia
gona
l, su
ch a
s th
ose
prin
ted
in b
old,
hav
e th
e sa
me
com
ple
xity
.
Feb
ruar
y 6,
200
5LH
-33
Mor
e D
efin
ition
s an
d P
rope
rtie
s of
Ent
ropy
Fo
r T
wo
Ran
do
m V
aria
ble
s
Def
init
ion
s:
(a)
The
ent
ropy
of
a pa
ir of
ran
dom
var
iabl
es X
and
Y
(som
etim
es c
alle
d th
eir
"joi
nt"
entr
opy)
is
H(X
,Y)
= -
∑ x,y p
(x,y
) lo
g 2 p
(x,y
) ,
w
here
p(
x,y)
=∆ P
r(X
=x
and
Y=
y).
Thi
nk o
f (
X,Y
) as
one
ran
dom
var
iabl
e w
ith a
n al
phab
et c
onsi
stin
g of
pai
rs.
The
nth
e ab
ove
is ju
st t
he u
sual
def
initi
on o
f en
trop
y, a
nd h
as t
he u
sual
pro
pert
ies.
(b)
The
con
ditio
nal e
ntro
py o
f ra
ndom
var
iabl
e X
gi
ven
a pa
rtic
ular
val
ue o
f ra
ndom
varia
ble
Y H(X
|Y=y
) =
-∑ x
p(x
|y)
log 2
p(x
|y),
whe
re
p(x|
y) =
Pr(
X=
x|Y
=y)
.
Thi
s is
an
ordi
nary
ent
ropy
--
nam
ely
the
entr
opy
of
X
whe
n a
spec
ific
valu
e of
Y
is g
iven
--
and
as s
uch
it ha
s al
l the
usu
al p
rope
rtie
s of
ent
ropy
.
(c)
Con
ditio
nal e
ntro
py o
f ra
ndom
var
iabl
e X
giv
en a
ran
dom
var
iabl
e Y
H(X
|Y)
= ∑ y
p(y
) H
(X|Y
=y)
= -
∑ x,y
p(x
,y)
log 2
p(x
|y)
Thi
s is
an
aver
age
of t
he p
revi
ous
kind
of
cond
ition
al e
ntro
py.
Feb
ruar
y 6,
200
5LH
-34
Pro
per
ties
: (
elem
enta
ry i
n in
form
atio
n th
eory
)
1.H
(X|Y
) ≥
0
with
equ
ality
iff
X
is a
func
tion
of
Y
Pro
of:
H(X
|Y=
y) ≥
0
for
all y
be
caus
e it
is a
n or
dina
ry e
ntro
py.
Sin
ce H
(X|Y
)is
the
ave
rage
of
such
, it
too
is n
onne
gativ
e.
H(X
|Y)
= 0
if
and
only
ifH
(X|Y
=y)
=0
for
each
y
with
pos
itive
pro
babi
lity.
T
his
happ
ens
if an
d on
ly f
orea
ch y
with
pos
itive
pro
babi
lity,
the
re is
a v
alue
x s
uch
that
P
r(X
=x|
Y=
y)=
1.
Inot
her
wor
ds,
Y d
eter
min
es
X;
i.e.
X
is
a f
unct
ion
of
Y.
2.H
(X|Y
) ≤
H(X
)
with
equ
ality
iff
X a
nd Y
are
in
depe
nden
t.
Pro
of:
We'
ll sh
ow
H(X
) -
H(X
|Y)
≥ 0
with
equ
ality
iff
X in
dep
of Y
.
H(X
) -
H(X
|Y)
= -
∑ x,y p
(x,y
) lo
g 2 p
(x)
+ ∑ x,
y p
(x,y
) lo
g 2 p(
x,y)
p(y)
= -
∑ x,y p
(x,y
) ln
p(x)
p(y
)p(
x,y)
1 ln
2
≥ -
∑ x,y p
(x,y
)
p
(x) p
(y)
p(x,
y) -
1
1 ln 2
u
sing
ln z
≤ z
-1
= -
∑ x,y p
(x) p
(y)
1 ln 2
+ ∑ x,
y p
(x,y
) 1 ln
2 =
0.
Equ
ality
hol
ds if
and
onl
y if
p(x
) p(y
) =
p(x
,y)
for
all x
,y; i
.e.
if an
d on
ly if
X a
ndY
ar
e in
depe
nden
t.
Feb
ruar
y 6,
200
5LH
-35
3.
Cha
in r
ule:
H(X
,Y)
= H
(X)
+ H
(Y|X
) =
H(Y
) +
H(X
|Y)
Pro
of: H
(X,Y
)=
-∑ x,
y p
(x,y
) lo
g 2 p
(x) p
(y|x
)
= -∑ x,
y p
(x,y
) lo
g 2 p
(x)
-∑ x,
y p
(x,y
) lo
g 2 p
(y|x
)
= H
(X)
+ H
(Y|X
).
Inte
rcha
ngin
g X
an
d Y
in
thi
s ar
gum
ent
show
s H
(X,Y
) =
H(Y
)+H
(X|Y
).
4.H
(X,Y
) ≤
H(X
) +
H(Y
)
with
equ
ality
iff
X a
nd Y
are
inde
pend
ent
Pro
of:
Usi
ng F
acts
3 a
nd 2
,
H(X
,Y)
= H
(Y)
+ H
(X|Y
) ≤
H(Y
) +
H(X
)
with
equ
ality
iff
X a
nd Y
are
inde
pend
ent.
5.H
(X)
≤ H
(X,Y
)w
ith e
qual
ity if
f Y
is a
func
tion
of X
Pro
of:
By
Fac
t 3,
H(X
) =
H(X
,Y)
- H
(Y|X
) ≤
H(X
,Y),
w
here
the
ineq
ualit
y is
from
Fac
t 1, a
nd e
qual
ity h
olds
iff
H(X
|Y)
= 0
, whi
ch b
y F
act 1
hap
pens
iff
X i
s a
func
tion
of
Y.
Feb
ruar
y 6,
200
5LH
-36
Mo
re T
han
Tw
o R
and
om
Var
iab
les
Def
init
ion
s:
(d)
Ent
ropy
of
X1,
…,X
k (
som
etim
es c
alle
d th
eir
"joi
nt e
ntro
py")
H(X
1,...
,Xk)
= -
∑ x
p(x
) lo
g 2 p
(x)
w
here
x
= (
x 1,…
,xk)
(e)
Con
ditio
nal e
ntro
py o
f ra
ndom
var
iabl
es
X1,
…,X
k g
iven
par
ticul
ar v
alue
s of
rand
om v
aria
bles
Y
1,…
,Ym
H(X
1,…
,Xk|
Y1,
…,Y
m=
y 1,…
,ym
) =
-∑ x
p(x
|y)
log 2
p(x
|y),
whe
re
y =
(y 1
,…,y
m)
(f)
C
ondi
tiona
l ent
ropy
of
rand
om v
aria
bles
X1,
…,X
k g
iven
ran
dom
var
iabl
esY
1,…
,Ym
H(X
1,…
,Xk|
Y1,
…,Y
m)
= ∑ y
p(y
) H
(X1,
…,X
k|Y
1,…
,Ym
=y 1
,…,y
m)
= -∑ x,
y p
(x,y
) lo
g 2 p
(x|y
)
Not
e:If
one
thin
ks o
f (
X1,
…,X
k)
and
(Y
1,…
,Ym
) e
ach
as a
sin
gle
rand
om v
aria
ble
with
a v
ecto
r-va
lued
alp
habe
t, th
en t
he a
bove
for
mul
as a
re t
he s
ame
as t
heco
rres
pond
ing
"tw
o-ra
ndom
var
iabl
e" f
orm
ulas
.
Feb
ruar
y 6,
200
5LH
-37
Pro
per
ties
:
6.H
(Y1,
...,Y
n|X
1,…
,Xm
) ≤
H(Y
1,…
,Yn|
X1,
…,X
m'),
0 ≤
m' <
m,
with
equ
ality
iff
Y1,
…,Y
n is
con
ditio
nally
inde
pend
ent
of
Xm
'+1,
…,X
m
give
nX
1,…
,Xm
'.
Pro
of:
Like
that
of F
act 2
.
7.C
hain
rul
e:
H(X
1,...
,Xk)
= H
(X1)
+ H
(X2|
X1)
+ H
(X3|
X1,
X2)
+ H
(X4|
X1,
X2,
X3)
+
... +
H(X
k|X
1,X
2,…
,Xk-
1)
Pro
of:
Like
that
of F
act 3
.
8.H
(X1,
...,X
k) ≤
H(X
1) +
H(X
2) +
... +
H(X
k)
with
equ
ality
iff
the
Xi's
ar
e in
depe
nden
t
Pro
of:
Fol
low
s fr
om th
e ch
ain
rule
and
the
fact
that
H
(Xi|X
1,…
,Xi-1
) ≤
H(X
i) w
itheq
ualit
y if
and
only
if X
i is
inde
pend
ent o
f X
1,…
,Xi-1
.
Feb
ruar
y 6,
200
5LH
-38
Fo
r S
tati
on
ary
Ran
do
m P
roce
sses
Let
{X
k}
be s
tatio
nary
ran
dom
pro
cess
.
Def
init
ion
s:
(g)
Hk
=∆
1 k H(X
1,...
,Xk)
(h)
H1|
m =∆
H(X
n|X
n-m
,X2,
…,X
n-1)
( H
1|0
=∆ H
(X1)
= H
1 )
(i)H
k|m
=∆
1 k H
(Xn+
k-1
n|X
n-1
n-m
)(H
1|m
= H
1|m
)
(j)H
∞ =∆
lim k→∞ H
k =
ent
ropy
rat
e of
X
Fac
ts t
o b
e es
tab
lish
ed:
Hk|
m
decr
ease
s or
sta
ys t
he s
ame
as
k o
r m
de
crea
ses
Hk|
m →
H∞ a
s k
and
/or
m g
o to
infin
ity..
Feb
ruar
y 6,
200
5LH
-39
Pro
per
ties
:
9.H
1|m
+1 ≤
H1|
m
Pro
of:
Fol
low
s fr
om P
rope
rty
6 an
d st
atio
narit
y.
10.
Hk
= 1 k
(H1
+ H
1|1
+ H
1|2
+ …
+H
1|k-
1) ≥
H1|
k-1
≥ H
1|k
Pro
of:
Hk
= 1 k
H(X
1,…
,Xk)
= 1 k
(H(X
1) +
H(X
2|X
1) +
H(X
3|X
1,X
2) +
... +
H(X
k|X
1,X
2,…
,Xk-
1))
by c
hain
rul
e=
1 k (H
1 +
H1|
1 +
H1|
2 +
… +
H1|
k-1)
by
sta
tiona
rity
≥ 1 k
(H1|
k-1
+ H
1|k-
1 +
H1|
k-1
+ …
+ H
1|k-
1)
by
Pro
p. 9
.
= H
1|k-
1 ≥
H1|
k b
y P
rop.
9
11.
Hk+
1 ≤
Hk
Sin
ce
Hk'
s a
re n
onin
crea
sing
and
bou
nded
bel
ow b
y ze
ro,
they
mus
t ha
ve a
limit.
H
ence
, H
∞
is a
wel
l-def
ined
qua
ntity
.
Pro
of:
By
Pro
p. 1
0,
Hk
is th
e av
erag
e of
k te
rms
H1,
H1|
1, H
1|2,
…, H
1|k-
1.
Sim
ilarly
, H
k+1
is a
vera
ge o
f k+
1 te
rms
H(X
1), H
1|1,
H1|
2, …
, H
1|k-
1, H
1|k
.
Sin
ce t
he e
xtra
ter
m in
H
k+1
is n
o la
rger
tha
n al
l oth
er t
erm
s,
Hk+
1 ≤
Hk.
Feb
ruar
y 6,
200
5LH
-40
12.
lim k→∞H
1|k
= lim k→
∞H
k =∆
H
∞
Pro
of:
Sin
ce t
he
H1|
k's
are
non
incr
easi
ng w
ith
k a
nd b
ound
ed b
elow
by
zero
, th
ey m
ust
have
a li
mit.
S
ince
by
Pro
p. 1
0,
Hk
is t
he a
vera
ge o
f th
e k
term
s H
1, H
1|1,
H1|
2, …
, H1|
k-1,
th
e lim
it of
the
H1|
k's
equ
als
the
limit
of t
heH
k's,
w
hich
by
defin
ition
is
H∞
.
13.
Hk|
m =
1 k (H
1|m
+ H
1|m
+1 +
... +
H1|
m+k
-1)
Pro
of:
Hk|
m =
1 k
H(X
n+k-
1n
|Xn-
1n-
m)
= 1 k
(H(X
n|X
n-1
n-m
) +
H(X
n+1|
Xn n-
m)
+ ...
+ H
(Xn+
k-1|
Xn+
k-2
n-m
)) =
1 k (H
1|m
+ H
1|m
+2 +
... +
H1|
m+
k-1)
) b
y st
atio
narit
y
14.
H1|
k+m
≤ H
k|m
≤ H
k
Pro
of:
(a)
Hk|
m =
1 k H
(Xn+
k-1
n|X
n-1
n-m
) ≤
1 k H
(Xn+
k-1
n)
= H
k
(b)
By
Pro
p. 1
3,
Hk|
m
is t
he a
vera
ge o
f te
rms,
eac
h of
whi
ch is
at
leas
t H
1|k+
m.
The
refo
re,
Hk|
m ≥
H1|
k+m
.
Feb
ruar
y 6,
200
5LH
-41
15H
K+1
|m ≤
HK
|m
Hk|
m+1
≤ H
K|m
The
refo
re,
H1
≥ H
2 ≥
H3
≥
... d
ecre
ases
with
k
H1|
1
≥H
2|1
≥H
3|1
≥H
1|2
≥
H3|
2≥
H3|
2≥
......
......
......
decr
ease
s w
ith
m
16.
lim k→∞H
k|m
= H
∞ f
or a
ny m
Pro
of:
Rec
all f
rom
Pro
p. 1
3 th
at
Hk|
m =
1 k (H
1|m
+ H
1|m
+2 +
... +
H1|
k+m
)
It fo
llow
s th
at t
he li
mit
as
k→∞
of
H
k|m
eq
uals
lim k→∞H
1|k+
m =
H∞
17.
lim m→
∞H
k|m
= H
∞
for
any
k
Pro
of:
Rec
all f
rom
Pro
p. 1
3 th
at
Hk|
m =
1 k (H
1|m
+ H
1|m
+2 +
... +
H1|
k+m
)
It fo
llow
s th
at t
he li
mit
as
m→
∞
of
Hk|
m
equa
ls
lim m→
∞H
1|k+
m =
H∞
Feb
ruar
y 6,
200
5LH
-42
Exa
mp
les:
A.
IID
sou
rce:
H(X
1) =
H1
= H
k =
H∞ =
H1|
1 =
H1|
k =
Hk|
m,
for
all k
,m
B.
(firs
t-or
der)
Mar
kov
sour
ce:
Def
init
ion
: A
sta
tiona
ry s
ourc
e is
firs
t-or
der
Mar
kov
if
p(x m
|x1,
…,x
m-1
) =
p(x
m|x
m-1
) f
or a
ll m
and
x1,
…,x
m
The
n fo
r an
y m
≥ 1
,
H1|
m =
H(X
m|X
1,…
,Xm
-1)
=
-∑
x 1…
x m
p(x
1,…
,xm
) log
2 p(
x m|x
1,…
,xm
-1)
= -
∑x 1
…x m
p(x
1,…
,xm
) lo
g 2 p
(xm
|xm
-1)
= -
∑x m
-1,x
m
p(x
m-1
,xm
) lo
g 2 p
(xm
|xm
-1)
=
H(X
m|X
m-1
) =
H
(X2|
X1)
=
H1|
1
by s
tatio
narit
y
Sin
ce a
ll th
e H
1|m
's
equa
l H
1|1,
H∞ =
lim m→
∞ H
1|m
= H
1|1
By
Pro
p. 1
0 a
nd t
he a
bove
Hk
= 1 k
(H1+
H1|
1+H
1|2+
…+H
1|k-
1) =
1 k(
H1+
(k-1
)H1|
1) =
1 k
H1+
k-1 k H
1|1
from
whi
ch w
e se
e th
at
Hk
con
verg
es d
own
to
H∞
, a
s it
shou
ld.
Feb
ruar
y 6,
200
5LH
-43
Sim
ilarly
, on
e ca
n sh
ow
Hk|
m =
H1|
1 =
H∞
f
or a
ll k,
m
Bec
ause
H
1|1
= H
∞,
we
see
that
firs
t-or
der
cond
ition
al c
odin
g ca
n co
me
with
inon
e bi
t of
opt
imal
for
firs
t-or
der
Mar
kov
sour
ces.
In
oth
er w
ords
, th
ere
is n
ore
ason
to
use
2nd
or h
ighe
r or
der
cond
ition
al c
odin
g, a
nd t
he o
nly
reas
on t
ous
e bl
ock
codi
ng is
to
redu
ce t
he p
ossi
bly
one
bit
of r
edun
danc
y.
C.
nth
-ord
er M
arko
v so
urc
e:
Def
init
ion
: A
sta
tiona
ry s
ourc
e is
nth
-ord
er M
arko
v if
p(x k
|x1,
…,x
k-1)
= p
(xk|
x k-n
,…,x
k-1)
fo
r al
l k>
n &
x1,
…,x
k
The
n as
in t
he c
ase
of 1
st-o
rder
Mar
kov
sour
ces,
it c
an b
e sh
own
that
if
m≥n
,
H1|
m =
Hk|
m =
H1|
n =
H∞ ,
fo
r ev
ery
k,
and
that
if k
> n
,
Hk
=
1 k (nH
n+H
1|n+
H1|
n+1+
…+H
1|k-
1) =
1 k (nH
n+H
1|n+
H1|
n+…
+H1|
n)
=
n k H
n +
k-n k H
1|n
whi
ch c
onve
rges
dow
n to
H
∞ =
H1|
n a
s k
→ ∞
Sin
ce
H1|
n =
H∞
, n
th-o
rder
con
ditio
nal c
odin
g co
mes
with
in 1
bit
of o
pt'l
for
nth-
orde
r M
arko
v so
urce
s.
Thu
s th
ere
is n
o re
ason
to
use
high
er o
rder
con
di-
tiona
l cod
ing,
and
the
onl
y re
ason
to
use
bloc
k co
ding
is t
o re
duce
the
at
mos
t1
bit
redu
ndan
cy.