Chapter 9
Gaussian Channel
Peng-Hua Wang
Graduate Inst. of Comm. Engineering
National Taipei University
Chapter Outline
Chap. 9 Gaussian Channel
9.1 Gaussian Channel: Definitions
9.2 Converse to the Coding Theorem for Gaussian Channels
9.3 Bandlimited Channels
9.4 Parallel Gaussian Channels
9.5 Channels with Colored Gaussian Noise
9.6 Gaussian Channels with Feedback
9.1 Gaussian Channel: Definitions
Introduction
  Y_i = X_i + Z_i,   Z_i ∼ N(0, N)

• X_i: input, Y_i: output, Z_i: noise. Z_i is independent of X_i.
• Without further constraints, the capacity of this channel may be infinite.
  ◦ If the noise variance N is zero, the channel can transmit an arbitrary real number with no error.
  ◦ If the noise variance N is nonzero, we can choose an infinite subset of inputs arbitrarily far apart, so that they are distinguishable at the output with arbitrarily small probability of error.
Introduction
• The most common limitation on the input is an energy or power constraint.
• We assume an average power constraint. For any codeword (x_1, x_2, . . . , x_n) transmitted over the channel, we require that

  (1/n) ∑_{i=1}^{n} x_i^2 ≤ P.

  (A small sketch of checking this constraint follows below.)
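A minimal sketch of such a check; the helper name and numbers are illustrative assumptions, not from the text:

    import numpy as np

    def satisfies_power_constraint(x, P):
        """Check the average power constraint (1/n) * sum(x_i^2) <= P."""
        x = np.asarray(x, dtype=float)
        return float(np.mean(x ** 2)) <= P

    # Hypothetical codeword of length n = 3 and power constraint P = 2.
    print(satisfies_power_constraint([1.0, -2.0, 0.5], P=2.0))  # True: mean power = 1.75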
Information Capacity
Definition 1 (Capacity) The information capacity of the Gaussian channel with power constraint P is

  C = max_{f(x): E[X^2] ≤ P} I(X;Y).

We can calculate the information capacity as follows.

  I(X;Y) = h(Y) − h(Y|X) = h(Y) − h(X + Z|X)
         = h(Y) − h(Z|X) = h(Y) − h(Z)
         ≤ (1/2) log 2πe(P + N) − (1/2) log 2πeN
         = (1/2) log(1 + P/N)

Note that E[Y^2] = E[(X + Z)^2] = E[X^2] + N ≤ P + N, that a Gaussian maximizes differential entropy for a given second moment, and that the differential entropy of a Gaussian with variance σ^2 is (1/2) log 2πeσ^2.
Information Capacity
Therefore, the information capacity of the Gaussian channel is

  C = max_{E[X^2] ≤ P} I(X;Y) = (1/2) log(1 + P/N),

and equality holds when X ∼ N(0, P). (A numerical sketch of this formula follows below.)

• Next, we will show that this capacity is achievable.
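As a quick numerical sketch, the capacity formula can be evaluated directly; the values of P and N below are assumptions chosen only for illustration:

    import math

    def gaussian_capacity(P, N):
        """Information capacity (1/2) * log2(1 + P/N) in bits per transmission."""
        return 0.5 * math.log2(1 + P / N)

    # Assumed example: power constraint P = 10, noise variance N = 1.
    print(gaussian_capacity(10.0, 1.0))  # ~1.73 bits per transmission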
Code for Gaussian Channel
Definition 2 ((M,n) code for the Gaussian channel) An (M,n) code for the Gaussian channel with power constraint P consists of the following:
1. An index set {1, 2, . . . , M}.
2. An encoding function x : {1, 2, . . . , M} → X^n, yielding codewords x^n(1), x^n(2), . . . , x^n(M), each satisfying the power constraint P:

   (1/n) ∑_{i=1}^{n} x_i^2(w) ≤ P,   w = 1, 2, . . . , M.

3. A decoding function g : Y^n → {1, 2, . . . , M}.
Definitions
Definition 3 (Conditional probability of error)

  λ_i = Pr(g(Y^n) ≠ i | X^n = x^n(i)) = ∑_{y^n: g(y^n) ≠ i} p(y^n | x^n(i))
      = ∑_{y^n} p(y^n | x^n(i)) I(g(y^n) ≠ i)

• I(·) is the indicator function.
Definitions
Definition 4 (Maximal probability of error)

  λ^(n) = max_{i ∈ {1,2,...,M}} λ_i

Definition 5 (Average probability of error)

  P_e^(n) = (1/M) ∑_{i=1}^{M} λ_i

• The decoding error is

  Pr(g(Y^n) ≠ W) = ∑_{i=1}^{M} Pr(W = i) Pr(g(Y^n) ≠ i | W = i).

  If the index W is chosen uniformly from {1, 2, . . . , M}, then P_e^(n) = Pr(g(Y^n) ≠ W).
Definitions
Definition 6 (Rate) The rate R of an (M,n) code is

  R = (log M)/n  bits per transmission.

Definition 7 (Achievable rate) A rate R is said to be achievable for a Gaussian channel with a power constraint P if there exists a sequence of (⌈2^{nR}⌉, n) codes with codewords satisfying the power constraint such that the maximal probability of error λ^(n) tends to 0 as n → ∞.

Definition 8 (Channel capacity) The capacity of a channel is the supremum of all achievable rates.
Capacity of a Gaussian Channel
Theorem 1 (Capacity of a Gaussian channel) The capacity of a Gaussian channel with power constraint P and noise variance N is

  C = (1/2) log(1 + P/N)  bits per transmission.
Sphere Packing Argument

[Figure: decoding spheres of radius √(nN) packed inside the sphere of received vectors of radius √(n(P+N)).]
Sphere Packing Argument
For each sent codeword, the received vector lies (with high probability) in a sphere of radius √(nN) around it. The received vectors have energy no greater than n(P + N), so they lie in a sphere of radius √(n(P + N)). How many codewords can we use without intersection of the decoding spheres?

  M = A_n (√(n(P + N)))^n / (A_n (√(nN))^n) = (1 + P/N)^{n/2},

where A_n is the constant in the volume of an n-dimensional sphere; for example, A_2 = π and A_3 = (4/3)π. Therefore, the capacity is

  (1/n) log M = (1/2) log(1 + P/N).
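A small numerical sketch of the sphere-packing count M and the resulting rate; the blocklength and power values are assumed:

    import math

    def sphere_packing(n, P, N):
        """Number of non-intersecting decoding spheres M = (1 + P/N)^(n/2)
        and the corresponding rate (1/n) * log2(M)."""
        M = (1.0 + P / N) ** (n / 2.0)
        return M, math.log2(M) / n

    # Assumed example: blocklength n = 100, P = 10, N = 1.
    M, rate = sphere_packing(100, 10.0, 1.0)
    print(M, rate)  # rate ~ 1.73 bits per transmission, independent of n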
R < C → Achievable
• Codebook. Let X_i(w), i = 1, 2, . . . , n, w = 1, 2, . . . , 2^{nR}, be i.i.d. ∼ N(0, P − ε). For large n,

  (1/n) ∑ X_i^2 → P − ε.

• Encoding. The codebook is revealed to both the sender and the receiver. To send the message index w, the transmitter sends the wth codeword X^n(w) in the codebook.
• Decoding. The receiver searches for the codeword that is jointly typical with the received vector. If there is one and only one such codeword X^n(w), the receiver declares Ŵ = w. Otherwise, the receiver declares an error. If the power constraint is not satisfied, the receiver also declares an error. (A small simulation sketch follows this list.)
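The following is an illustrative sketch only, not part of the slides: a random Gaussian codebook with a nearest-neighbor decoder standing in for joint-typicality decoding. All numerical values are assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    n, P, N = 200, 10.0, 1.0   # assumed blocklength, power constraint, noise variance
    M = 256                    # a rate-R code would need M = 2^(n*R) codewords; keep M small here

    # Random Gaussian codebook with per-letter variance slightly below P.
    codebook = rng.normal(0.0, np.sqrt(P - 0.1), size=(M, n))

    w = int(rng.integers(M))                               # message index
    y = codebook[w] + rng.normal(0.0, np.sqrt(N), size=n)  # channel output Y^n = X^n(w) + Z^n

    # Nearest-neighbor decoding, a stand-in for joint-typicality decoding.
    w_hat = int(np.argmin(np.sum((codebook - y) ** 2, axis=1)))
    print(w, w_hat, w == w_hat)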
R < C → Achievable
• Probability of error. Assume that codeword 1 was sent, so Y^n = X^n(1) + Z^n. Define the events

  E_0 = { (1/n) ∑_{j=1}^{n} X_j^2(1) > P }

  and

  E_i = { (X^n(i), Y^n) is in A_ε^(n) }.

  Then an error occurs if
  ◦ the power constraint is violated ⇒ E_0 occurs;
  ◦ the transmitted codeword and the received sequence are not jointly typical ⇒ E_1^c occurs;
  ◦ a wrong codeword is jointly typical with the received sequence ⇒ E_2 ∪ E_3 ∪ · · · ∪ E_{2^{nR}} occurs.
R < C → Achievable
Let W be uniformly distributed. We have

  P_e^(n) = (1/2^{nR}) ∑ λ_i = P(E) = Pr(E | W = 1)
          = P(E_0 ∪ E_1^c ∪ E_2 ∪ E_3 ∪ · · · ∪ E_{2^{nR}})
          ≤ P(E_0) + P(E_1^c) + ∑_{i=2}^{2^{nR}} P(E_i)
          ≤ ε + ε + ∑_{i=2}^{2^{nR}} 2^{−n(I(X;Y)−3ε)}
          ≤ 2ε + 2^{−n(I(X;Y)−R−3ε)} ≤ 3ε

for n sufficiently large and R < I(X;Y) − 3ε.
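A small numerical sketch of how the bound 2ε + 2^{−n(I(X;Y)−R−3ε)} shrinks with the blocklength; the values of I(X;Y), R, and ε are assumptions:

    import math

    def error_bound(n, I, R, eps):
        """Union-bound estimate 2*eps + 2^(-n*(I - R - 3*eps)) from the achievability proof."""
        return 2 * eps + 2.0 ** (-n * (I - R - 3 * eps))

    # Assumed values: I(X;Y) = 1.73 bits (P/N = 10), R = 1.5 bits, eps = 0.01.
    for n in (10, 100, 1000):
        print(n, error_bound(n, 1.73, 1.5, 0.01))  # approaches 2*eps as n grows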
R < C → Achievable, final part
• Since the average probability of error over codebooks is less than 3ε, there exists at least one codebook C* such that Pr(E | C*) < 3ε.
  ◦ C* can be found by an exhaustive search over all codes.
• Deleting the worst half of the codewords in C*, we obtain a code with low maximal probability of error. The codewords that violate the power constraint are certainly among those deleted (why?). Hence, we have constructed a code that achieves a rate arbitrarily close to C.
9.2 Converse to the Coding Theorem for Gaussian Channels
Achievable → R < C
We will prove that if P_e^(n) → 0, then R ≤ C = (1/2) log(1 + P/N). Let W be distributed uniformly. We have the Markov chain W → X^n → Y^n → Ŵ. By Fano's inequality,

  H(W | Ŵ) ≤ 1 + nR P_e^(n) = nε_n,   where ε_n = 1/n + R P_e^(n) → 0

as P_e^(n) → 0. Now,

  nR = H(W) = I(W; Ŵ) + H(W | Ŵ)
     ≤ I(W; Ŵ) + nε_n ≤ I(X^n; Y^n) + nε_n   (data-processing inequality)
     = h(Y^n) − h(Y^n | X^n) + nε_n = h(Y^n) − h(Z^n) + nε_n
     ≤ ∑_{i=1}^{n} h(Y_i) − h(Z^n) + nε_n = ∑_{i=1}^{n} h(Y_i) − ∑_{i=1}^{n} h(Z_i) + nε_n
Achievable → R < C
  nR ≤ ∑_{i=1}^{n} (h(Y_i) − h(Z_i)) + nε_n
     ≤ ∑_i ((1/2) log 2πe(P_i + N) − (1/2) log 2πeN) + nε_n
     = ∑_i (1/2) log(1 + P_i/N) + nε_n
     ≤ (n/2) log(1 + P/N) + nε_n,

where P_i is the average power of the ith component of the codewords. The last inequality follows from the concavity of the logarithm (Jensen's inequality) together with (1/n) ∑_i P_i ≤ P, which holds since every codeword satisfies the power constraint. Thus,

  R ≤ (1/2) log(1 + P/N) + ε_n,

and letting n → ∞ gives R ≤ C.
9.3 Bandlimited Channels
Capacity of Bandlimited Channels
• Suppose the output of a band-limited channel can be represented by

  Y(t) = (X(t) + Z(t)) ∗ h(t),

  where X(t) is the input signal, Z(t) is white Gaussian noise, and h(t) is the impulse response of the channel with bandwidth W.
• By the sampling theorem, the sampling rate is 2W samples per second. If the channel is used over the time interval [0, T], then 2WT samples are transmitted.
Capacity of Bandlimited Channels
• If the noise has power spectral density N_0/2 watts/Hz, the noise power is (N_0/2)(2W) = N_0 W, and the noise energy per sample is N_0 W T / (2WT) = N_0/2. If the signal power is P, the signal energy per sample is PT/(2WT) = P/(2W).
• The capacity is (1/2) log(1 + (P/2W)/(N_0/2)) bits per sample, or

  C = W log(1 + P/(N_0 W))  bits per second.

  (A numerical sketch follows below.)
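A quick numerical sketch of the bandlimited capacity formula; the bandwidth and SNR values are assumptions (roughly telephone-channel-like):

    import math

    def bandlimited_capacity(P, N0, W):
        """Capacity W * log2(1 + P/(N0*W)) of an ideal bandlimited AWGN channel, in bits/second."""
        return W * math.log2(1 + P / (N0 * W))

    # Assumed example: W = 3000 Hz and P/(N0*W) = 1000 (about 30 dB SNR).
    W, N0 = 3000.0, 1.0
    P = 1000.0 * N0 * W
    print(bandlimited_capacity(P, N0, W))  # ~29,900 bits/second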
9.4 Parallel Gaussian Channels
Parallel Gaussian Channels
• In this section we consider k independent Gaussian channels in parallel with a common power constraint. The objective is to distribute the total power among the channels so as to maximize the capacity. The channels are modeled as

  Y_j = X_j + Z_j,   j = 1, 2, . . . , k,

  with Z_j ∼ N(0, N_j). There is a common power constraint

  E[ ∑_{j=1}^{k} X_j^2 ] ≤ P.
Parallel Gaussian Channels
The information capacity is

  C = max_{f(x_1,...,x_k): ∑ E[X_i^2] ≤ P} I(X_1, X_2, . . . , X_k; Y_1, Y_2, . . . , Y_k).

Since Z_1, Z_2, . . . , Z_k are independent,

  I(X_1, X_2, . . . , X_k; Y_1, Y_2, . . . , Y_k)
    = h(Y_1, Y_2, . . . , Y_k) − h(Y_1, Y_2, . . . , Y_k | X_1, X_2, . . . , X_k)
    = h(Y_1, Y_2, . . . , Y_k) − h(Z_1, Z_2, . . . , Z_k | X_1, X_2, . . . , X_k)
    = h(Y_1, Y_2, . . . , Y_k) − h(Z_1, Z_2, . . . , Z_k)
    = h(Y_1, Y_2, . . . , Y_k) − ∑_i h(Z_i)
    ≤ ∑_i h(Y_i) − ∑_i h(Z_i) ≤ ∑_i (1/2) log(1 + P_i/N_i),

where P_i = E[X_i^2] and ∑ P_i = P.
Parallel Gaussian Channels
Therefore, we have a constrained optimization problem:

  max ∑_i (1/2) log(1 + P_i/N_i)   subject to   ∑_i P_i ≤ P,   P_i ≥ 0.

This can be solved by Lagrange multipliers together with the Kuhn-Tucker conditions:

  −(1/2) · (1/N_i)/(1 + P_i/N_i) − μ_i + λ = 0
  −P_i ≤ 0,   ∑_i P_i − P ≤ 0
  μ_i P_i = 0,   λ(∑_i P_i − P) = 0
  μ_i ≥ 0,   λ ≥ 0
Parallel Gaussian Channels
Case I: λ = 0. We have

  P_i + N_i = −1/(2μ_i),   i.e.,   P_i = −1/(2μ_i) − N_i.

This violates the condition −P_i ≤ 0, since N_i > 0 and μ_i ≥ 0.

Case II: λ ≠ 0. We have

  P_i + N_i = 1/(2(λ − μ_i)) =
    1/(2λ) = constant,   if P_i > 0 (which implies μ_i = 0),
    1/(2(λ − μ_i)),   if P_i = 0.

We can solve for λ from ∑_i P_i = ∑_i (1/(2λ) − N_i)^+ = P.
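A minimal water-filling sketch, not from the slides, that finds the water level 1/(2λ) by bisection; the noise variances and total power are assumed values:

    import numpy as np

    def water_filling(noise, P, tol=1e-9):
        """Allocate total power P over channels with noise variances `noise`
        so that P_i = (nu - N_i)^+ and sum(P_i) = P, where nu = 1/(2*lambda)."""
        noise = np.asarray(noise, dtype=float)
        lo, hi = noise.min(), noise.max() + P   # the water level nu lies in this interval
        while hi - lo > tol:
            nu = 0.5 * (lo + hi)
            if np.sum(np.maximum(nu - noise, 0.0)) > P:
                hi = nu
            else:
                lo = nu
        powers = np.maximum(lo - noise, 0.0)
        capacity = 0.5 * np.sum(np.log2(1.0 + powers / noise))  # bits per vector channel use
        return powers, capacity

    # Assumed example: noise variances 1, 2, 4 and total power P = 3 -> P_i = 2, 1, 0.
    print(water_filling([1.0, 2.0, 4.0], 3.0))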
Parallel Gaussian Channels

[Figure: water-filling power allocation across the parallel Gaussian channels.]
Nonlinear Optimization
For the problem

  min f(x_1, x_2, . . . , x_n)
  subject to g_j(x_1, x_2, . . . , x_n) ≤ 0,   j = 1, 2, . . . , m,

the necessary (Kuhn-Tucker) conditions for optimality are

  ∂f/∂x_i + ∑_j μ_j ∂g_j/∂x_i = 0,   i = 1, 2, . . . , n
  g_j(x_1, x_2, . . . , x_n) ≤ 0,   j = 1, 2, . . . , m
  μ_j g_j(x_1, x_2, . . . , x_n) = 0,   j = 1, 2, . . . , m
  μ_j ≥ 0,   j = 1, 2, . . . , m
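As a small sanity check, reusing the assumed numbers from the water-filling sketch above, the Kuhn-Tucker conditions can be verified numerically at the water-filling solution:

    import numpy as np

    # Assumed numbers: noise variances N = [1, 2, 4], total power P = 3, water level 1/(2*lam) = 3.
    N = np.array([1.0, 2.0, 4.0])
    P_i = np.array([2.0, 1.0, 0.0])
    lam = 1.0 / (2.0 * 3.0)                # multiplier for sum(P_i) - P <= 0
    mu = lam - 1.0 / (2.0 * (N + P_i))     # multipliers for -P_i <= 0, solved from stationarity

    print(np.all(mu >= 0))                                # dual feasibility
    print(np.allclose(mu * P_i, 0.0))                     # complementary slackness
    print(np.isclose(P_i.sum(), 3.0), np.all(P_i >= 0))   # primal feasibility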