Solutions for EE126 at UC Berkeley
EE126 Discussion 4: Solutions
Jerome Thai
February 20, 2014
1 Continuous random variables
Problem 1. Let X be uniformly distributed in [0, 1]. Assume that, given X = x, the random variable Y is exponentially distributed with rate x + 1.
(a) Calculate E[Y].
(b) Find MLE[X | Y = y].
(c) Find MAP[X | Y = y].
Solution. (a) We first note that E[Y] = E[E[Y | X]] by the tower property. Since, given X = x, Y is exponentially distributed with rate x + 1, its conditional mean is

E[Y | X = x] = 1/(x + 1)

From above,

E[Y] = E[1/(1 + X)] = ∫_0^1 1/(1 + x) dx = [ln(1 + x)]_0^1 = ln 2
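As a quick sanity check of part (a), a Monte Carlo sketch in Python (not part of the original solution; the sample size and seed are arbitrary choices):

```python
import math
import random

random.seed(0)

# Draw X ~ Uniform[0, 1], then Y ~ Exp(rate = X + 1), and average the Y samples.
n = 200_000
total = 0.0
for _ in range(n):
    x = random.random()
    total += random.expovariate(x + 1.0)  # exponential with rate x + 1

estimate = total / n
print(estimate, math.log(2))  # the estimate should be close to ln 2 ≈ 0.693
```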
(b, c) Since X is uniformly distributed, the prior on X is constant, so the MAP and MLE coincide. This means we just need to compute the MLE, i.e. the x that maximizes the likelihood of the observation Y = y:

MLE[X | Y = y] = argmax_{x ∈ [0, 1]} f(y | X = x) = argmax_{x ∈ [0, 1]} (1 + x) e^(-(1+x)y)

We can find the maximizer by differentiating with respect to x and setting the derivative equal to zero:

d/dx (1 + x) e^(-(1+x)y) = -y(1 + x) e^(-(1+x)y) + e^(-(1+x)y) = e^(-(1+x)y) (1 - (1 + x)y) = 0  =>  x = 1/y - 1

Since x is also constrained to lie in [0, 1], we clamp this maximizer. Thus:

MAP[X | Y = y] = 1          if y ≤ 1/2
                 1/y - 1    if 1/2 ≤ y ≤ 1
                 0          if y ≥ 1
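The clamped maximizer can be double-checked by brute force: a small Python sketch (the grid size and the test values of y are arbitrary) that maximizes the same likelihood (1 + x) e^(-(1+x)y) over a fine grid on [0, 1]:

```python
import math

def likelihood(x, y):
    # density f(y | x) for Y ~ Exp(rate = 1 + x)
    return (1 + x) * math.exp(-(1 + x) * y)

def map_estimate(y, grid_size=100_001):
    # brute-force maximization of the likelihood over x in [0, 1]
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(grid, key=lambda x: likelihood(x, y))

for y in (0.3, 0.7, 2.0):
    closed_form = min(1.0, max(0.0, 1.0 / y - 1.0))  # the clamped 1/y - 1
    print(y, map_estimate(y), closed_form)
```

For y = 0.3 the likelihood is increasing on [0, 1] and the grid search returns 1; for y = 2 it is decreasing and returns 0; for y = 0.7 it returns a point near 1/0.7 - 1.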
Problem 2. Let X, Y be two independent Exp(1) random variables. Calculate E[X | X > y] (two different ways).
Solution. First, by the memoryless property of the exponential distribution, given {X > y}, X - y is again exponentially distributed with rate 1. Thus

E[X | X > y] = y + 1

Second, we can recover this result by direct calculation. By the definition of conditional probability,

P(X ∈ [x, x + dx] and X > y) = P(X ∈ [x, x + dx] | X > y) P(X > y)

hence

∫_{x ≥ 0} x P(X ∈ [x, x + dx] and X > y) = ∫_{x ≥ 0} x P(X ∈ [x, x + dx] | X > y) P(X > y)

The integrand on the left-hand side is equal to 0 for x < y, so

∫_{x ≥ y} x P(X ∈ [x, x + dx]) = P(X > y) ∫_{x ≥ 0} x P(X ∈ [x, x + dx] | X > y)

in other words

∫_y^∞ x e^(-x) dx = E[X | X > y] P(X > y)

hence

E[X | X > y] = (∫_y^∞ x e^(-x) dx) / P(X > y) = (∫_y^∞ x e^(-x) dx) / e^(-y)

Integrating by parts,

∫_y^∞ x e^(-x) dx = [-x e^(-x)]_y^∞ + ∫_y^∞ e^(-x) dx = y e^(-y) + [-e^(-x)]_y^∞ = e^(-y)(1 + y)

hence E[X | X > y] = y + 1.
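A third, empirical way to see the same answer: a Monte Carlo sketch in Python (the sample size, seed, and test values of y are arbitrary choices) that keeps only the samples with X > y and averages them:

```python
import random

random.seed(1)

# Sample X ~ Exp(1), keep only the samples with X > y, and compare
# their average to y + 1 (the memoryless-property prediction).
def conditional_mean(y, n=400_000):
    kept = [x for x in (random.expovariate(1.0) for _ in range(n)) if x > y]
    return sum(kept) / len(kept)

for y in (0.5, 1.0, 2.0):
    print(y, conditional_mean(y))  # should be close to y + 1
```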
2 Markov chains
Problem 3. a) Give an example of a Markov chain Xn on {0, 1, 2, 3} and a function of the Markov chain that is not a Markov chain.
b) Give an example of a Markov chain Xn on {0, 1, 2, 3} and a function of that Markov chain that is not constant, not identical to Xn, and is a Markov chain.
Solution. a) Let Xn be cyclic on {0, 1, 2, 3}, i.e., moving from 0 to 1 to 2 to 3 to 0, etc., with probability 1. Assume that X0 is uniform on {0, 1, 2, 3}. Let f(0) = 0, f(1) = 1, f(2) = f(3) = 2. Then f(Xn) is not a Markov chain. Indeed,
P [f(X2) = 2 | f(X1) = 2, f(X0) = 2] = 0
whereas

P[f(X2) = 2 | f(X1) = 2] > 0
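Because the cyclic chain is deterministic and X0 is uniform, both conditional probabilities can be computed by exact enumeration over the four starting states; a small Python sketch (the helper names are ours):

```python
from fractions import Fraction

f = {0: 0, 1: 1, 2: 2, 3: 2}

# Enumerate the deterministic cyclic chain X0 -> X0+1 -> X0+2 (mod 4),
# with X0 uniform on {0, 1, 2, 3}.
paths = [(x0, (x0 + 1) % 4, (x0 + 2) % 4) for x0 in range(4)]

def cond_prob(condition):
    # P[f(X2) = 2 | condition on the path], with X0 uniform
    matching = [p for p in paths if condition(p)]
    hits = [p for p in matching if f[p[2]] == 2]
    return Fraction(len(hits), len(matching))

# P[f(X2) = 2 | f(X1) = 2, f(X0) = 2]
p_two_step = cond_prob(lambda p: f[p[0]] == 2 and f[p[1]] == 2)
# P[f(X2) = 2 | f(X1) = 2]
p_one_step = cond_prob(lambda p: f[p[1]] == 2)
print(p_two_step, p_one_step)  # 0 and 1/2
```

The only path with f(X0) = f(X1) = 2 is 2 -> 3 -> 0, which forces f(X2) = 0; conditioning only on f(X1) = 2 leaves two paths, one of which gives f(X2) = 2.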
b) Any one-to-one function will do. For a non-trivial example where the function is many-to-one, we use symmetry. Let Xn be the Markov chain with the state transition diagram shown in Figure 1. Then f(Xn) with f(0) = 0, f(1) = 1, f(2) = 1, f(3) = 0 is a Markov chain. The main idea is that the future of f(Xn) looks the same whether Xn = 0 or Xn = 3, and also whether Xn = 1 or Xn = 2, by symmetry.
Problem 4. You roll a die until the sum of the last two rolls yields 9. What is the average numberof rolls?
Figure 1: MC for problem 3.
Solution. We can follow the same approach as done in class. For the sum of two rolls to be 9, both rolls must be at least 3. Define the following two quantities:

1) Let α be the average remaining number of rolls given you have just rolled 1 or 2.
2) Let β be the average remaining number of rolls given you have just rolled 3-6.

From state β, the next roll completes a sum of 9 with probability 1/6 (exactly one face does it). The first-step equations are

α = (1/3)(α + 1) + (2/3)(β + 1)
β = (1/6)(1) + (1/3)(α + 1) + (1/2)(β + 1)

Solving this gives α = 10.5, β = 9. Since the first roll leads to state α with probability 1/3 and to state β with probability 2/3, the overall average is 1 + (1/3)α + (2/3)β = 10.5. Thus it takes on average 10.5 rolls for the sum to equal 9.
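The first-step analysis above can be checked by direct simulation; a Python sketch (the number of trials and seed are arbitrary):

```python
import random

random.seed(2)

# Roll a die until the last two rolls sum to 9; return the number of rolls.
def rolls_until_nine():
    prev = random.randint(1, 6)
    count = 1
    while True:
        cur = random.randint(1, 6)
        count += 1
        if prev + cur == 9:
            return count
        prev = cur

trials = 200_000
avg = sum(rolls_until_nine() for _ in range(trials)) / trials
print(avg)  # should be close to 10.5
```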
Problem 5. Consider the numbers 1, 2, ..., 12 written around a ring as they usually are on a clock. Consider the Markov chain with state space {1, 2, ..., 12} that at any time jumps with equal probability to one of the two adjacent numbers. What is the expected number of steps that the Markov chain will take to return to its original position?
Solution. Let {Xn, n ≥ 0} denote the Markov chain of interest. By symmetry, we may assume without loss of generality that we start in state 1. Let h(x) := E[T1 | X0 = x], where

T1 := inf{n ≥ 1 : Xn = 1}

By symmetry we must have h(x) = h(14 - x) for all x ∈ {2, ..., 12}. Thus it suffices to write the following first-step equations (whose validity follows from the Markov property):
h(1) = 1 + h(2)/2 + h(12)/2 = 1 + h(2)
h(2) = 1 + (1/2) × 0 + h(3)/2 = 1 + h(3)/2
h(3) = 1 + h(2)/2 + h(4)/2
h(4) = 1 + h(3)/2 + h(5)/2
h(5) = 1 + h(4)/2 + h(6)/2
h(6) = 1 + h(5)/2 + h(7)/2
h(7) = 1 + h(6)/2 + h(8)/2 = 1 + h(6)
Note that we have substituted h(12) by h(2) and h(8) by h(6), so these equations only involve thevariables h(1), h(2), · · · , h(7). We can rewrite the last six equations as
h(2) = 2 + h(3) - h(2)
h(3) - h(2) = 2 + h(4) - h(3)
h(4) - h(3) = 2 + h(5) - h(4)
h(5) - h(4) = 2 + h(6) - h(5)
h(6) - h(5) = 2 + h(7) - h(6)
h(7) - h(6) = 1
This form is set up for easy successive substitution, starting from the bottom equation and working upwards: h(7) - h(6) = 1, h(6) - h(5) = 3, h(5) - h(4) = 5, h(4) - h(3) = 7, h(3) - h(2) = 9, and finally h(2) = 2 + 9 = 11. Substituting this into h(1) = 1 + h(2) gives h(1) = 12.
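The answer h(1) = 12 can also be checked by simulating the walk on the clock face; a Python sketch (the neighbor arithmetic, trial count, and seed are our choices):

```python
import random

random.seed(3)

# Symmetric random walk on the clock face {1, ..., 12}: step to one of the
# two adjacent numbers with equal probability; measure the return time.
def return_time(start=1):
    pos = start
    steps = 0
    while True:
        # clockwise neighbor: pos % 12 + 1; counterclockwise: (pos - 2) % 12 + 1
        pos = (pos % 12) + 1 if random.random() < 0.5 else ((pos - 2) % 12) + 1
        steps += 1
        if pos == start:
            return steps

trials = 100_000
avg = sum(return_time() for _ in range(trials)) / trials
print(avg)  # should be close to 12
```

This also matches the general fact that for an irreducible chain the mean return time to a state is 1/π(state), and the stationary distribution here is uniform on 12 states.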
3 Confidence intervals
Note 1 (Chebyshev's inequality). Let X be a random variable with finite expected value µ and finite non-zero variance σ^2. Then for any real number a > 0,

P(|X - µ| > a) ≤ σ^2/a^2
Problem 6. In order to estimate the probability of heads in a coin flip, p, you flip a coin n times and count the number of heads, Sn. You use the estimator p̂ = Sn/n. You choose the sample size n to have the guarantee

P(|Sn/n - p| ≥ ε) ≤ δ

How does the value of n suggested by the Chebyshev inequality change when ε is reduced to half of its original value? How does it change when δ is reduced to half of its original value?
Solution. The random variable Sn/n has mean p and variance np(1 - p)/n^2 = p(1 - p)/n, hence

P(|Sn/n - p| ≥ ε) ≤ p(1 - p)/(nε^2)

by the Chebyshev inequality. Hence we want

δ = p(1 - p)/(nε^2),  or  n = p(1 - p)/(δε^2)
When ε is reduced to half of its original value, we need four times as many flips to have the same guarantee, and when δ is reduced to half of its original value, we need twice as many flips.
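The scaling of n = p(1 - p)/(δε^2) can be made concrete with a tiny Python sketch (the numeric values of ε and δ are arbitrary; p = 1/2 is the worst case since p(1 - p) ≤ 1/4):

```python
def chebyshev_n(eps, delta, p=0.5):
    # sample size suggested by Chebyshev: n = p(1 - p) / (delta * eps^2)
    return p * (1 - p) / (delta * eps ** 2)

n0 = chebyshev_n(0.1, 0.05)
print(n0)                             # about 500 flips
print(chebyshev_n(0.05, 0.05) / n0)   # halving eps quadruples n
print(chebyshev_n(0.1, 0.025) / n0)   # halving delta doubles n
```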