

EE126 Discussion 4: Solutions

Jerome Thai

February 20, 2014

1 Continuous random variables

Problem 1. Let X be uniformly distributed in [0, 1]. Assume that, given X = x, the random variable Y is exponentially distributed with rate x + 1.
(a) Calculate E[Y].
(b) Find MLE[X | Y = y].
(c) Find MAP[X | Y = y].

Solution. (a) We first note that E[Y] = E[E[Y | X]]. Given X = x, Y is Exp(x + 1), whose mean is the reciprocal of its rate:

E[Y | X = x] = 1/(x + 1)

From above,

E[Y] = E[E[Y | X]] = E[(1 + X)^(−1)] = ∫_0^1 (1 + x)^(−1) dx = [ln(1 + x)]_0^1 = ln 2
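As a sanity check (an illustrative Monte Carlo sketch we added, not part of the original solution), we can sample X uniformly, sample Y exponentially with rate 1 + X, and compare the empirical mean of Y to ln 2:

```python
import math

import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# X ~ Uniform[0, 1]; given X = x, Y ~ Exp(rate = 1 + x),
# which is scale = 1 / (1 + x) in NumPy's parameterization.
x = rng.uniform(0.0, 1.0, size=n)
y = rng.exponential(scale=1.0 / (1.0 + x))

print(y.mean(), math.log(2))  # both should be close to 0.693
```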

(b, c) As X is uniformly distributed, the MAP and MLE estimates coincide. This means we just need to compute the MLE, i.e. the x that maximizes the likelihood of Y = y:

MLE[X | Y = y] = arg max_x f(y | X = x) = arg max_x (1 + x) e^(−(1+x)y)

We can find this by taking the derivative with respect to x and setting it equal to zero:

d/dx [(1 + x) e^(−(x+1)y)] = −y(1 + x) e^(−(x+1)y) + e^(−(x+1)y) = e^(−(x+1)y) (1 − (x + 1)y) = 0 =⇒ x = 1/y − 1

Since x is also restricted to the range [0, 1], we clamp this value. Thus:

MAP[X | Y = y] =
  1,         if y ≤ 1/2
  1/y − 1,   if 1/2 ≤ y ≤ 1
  0,         otherwise
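To check the clamping, a small grid search (an illustrative sketch; the helper name map_estimate is ours) can maximize the likelihood (1 + x) e^(−(1+x)y) over x ∈ [0, 1] and compare against the closed form:

```python
import numpy as np

def map_estimate(y):
    """Closed-form MAP: clamp 1/y - 1 to the interval [0, 1]."""
    return min(1.0, max(0.0, 1.0 / y - 1.0))

xs = np.linspace(0.0, 1.0, 100_001)
for y in [0.3, 0.7, 1.5]:
    likelihood = (1 + xs) * np.exp(-(1 + xs) * y)
    x_grid = xs[np.argmax(likelihood)]  # brute-force maximizer on the grid
    assert abs(x_grid - map_estimate(y)) < 1e-3
print("grid search matches the piecewise formula")
```

The three test values of y exercise the three branches: y = 0.3 clamps to 1, y = 0.7 lands in the interior, and y = 1.5 clamps to 0.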

Problem 2. Let X, Y be two independent Exp(1) random variables. Calculate E[X | X > y] (two different ways).

Solution. We can note that by the memoryless property of the exponential distribution, given {X > Y = y}, X − Y is exponentially distributed with rate 1. Thus,

E[X | X > y] = y + 1

We can find this result by direct calculation. Using Bayes' rule:

P(X ∈ [x, x + dx] and X > y) = P(X ∈ [x, x + dx] | X > y) P(X > y)

hence

∫_{x ≥ 0} x P(X ∈ [x, x + dx] and X > y) = ∫_{x ≥ 0} x P(X ∈ [x, x + dx] | X > y) P(X > y)

The integrand on the left-hand side is equal to 0 for x < y:

∫_{x ≥ y} x P(X ∈ [x, x + dx] and X > y) = P(X > y) ∫_{x ≥ 0} x P(X ∈ [x, x + dx] | X > y)

in other words

∫_y^{+∞} x e^{−x} dx = E[X | X > y] P(X > y)

hence:

E[X | X > y] = (∫_y^{+∞} x e^{−x} dx) / P(X > y) = (∫_y^{+∞} x e^{−x} dx) / e^{−y}

with, integrating by parts,

∫_y^{+∞} x e^{−x} dx = [−x e^{−x}]_y^{+∞} + ∫_y^{+∞} e^{−x} dx = y e^{−y} + [−e^{−x}]_y^{+∞} = e^{−y}(1 + y)

hence E[X | X > y] = y + 1
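Both derivations can be sanity-checked numerically (an illustrative Monte Carlo sketch we added, not part of the original solution): sample Exp(1) variates and average those exceeding y.

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.exponential(1.0, size=2_000_000)

for y in [0.5, 1.0, 2.0]:
    # Conditional mean of X given X > y; should be close to y + 1
    # by the memoryless property.
    cond_mean = samples[samples > y].mean()
    print(y, cond_mean)
```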

2 Markov chains

Problem 3. a) Give an example of a Markov chain Xn on {0, 1, 2, 3} and a function of the Markov chain that is not a Markov chain.
b) Give an example of a Markov chain Xn on {0, 1, 2, 3} and a function of that Markov chain that is not constant, not identical to Xn, and that is a Markov chain.

Solution. a) Let Xn be cyclic on {0, 1, 2, 3}, i.e., moving from 0 to 1 to 2 to 3 to 0, etc., with probability 1. Assume that X0 is uniform on {0, 1, 2, 3}. Let f(0) = 0, f(1) = 1, f(2) = f(3) = 2. Then f(Xn) is not a Markov chain. Indeed,

P[f(X2) = 2 | f(X1) = 2, f(X0) = 2] = 0

whereas

P[f(X2) = 2 | f(X1) = 2] > 0
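Because the chain is deterministic with a uniform start, these two conditional probabilities can be computed exactly by enumerating the four equally likely trajectories (a small sketch we added for illustration; names are ours):

```python
from fractions import Fraction

f = {0: 0, 1: 1, 2: 2, 3: 2}
# Cyclic chain: X_{n+1} = (X_n + 1) mod 4, with X_0 uniform on {0, 1, 2, 3}.
paths = [(x0, (x0 + 1) % 4, (x0 + 2) % 4) for x0 in range(4)]  # each has prob 1/4

def cond_prob(cond, event):
    """P(event | cond) over the four equally likely trajectories."""
    matching = [p for p in paths if cond(p)]
    return Fraction(sum(1 for p in matching if event(p)), len(matching))

p_two_step = cond_prob(lambda p: f[p[0]] == 2 and f[p[1]] == 2,
                       lambda p: f[p[2]] == 2)
p_one_step = cond_prob(lambda p: f[p[1]] == 2,
                       lambda p: f[p[2]] == 2)
print(p_two_step, p_one_step)  # 0 and 1/2, so f(X_n) is not Markov
```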

b) Any one-to-one function will do. For a non-trivial example where the function is many-to-one, we use symmetry. Let Xn be the MC with the state transition diagram shown below (Figure 1). Then f(Xn) with f(0) = 0, f(1) = 1, f(2) = 1, f(3) = 0 is a MC. The main idea is that the future of f(Xn) looks the same whether Xn = 0 or Xn = 3, and also whether Xn = 1 or Xn = 2, by symmetry.

Figure 1: MC for problem 3. (Transition diagram not reproduced in this text version.)

Problem 4. You roll a die until the sum of the last two rolls yields 9. What is the average number of rolls?

Solution. We can follow the same approach as done in class. For the sum of the last two rolls to be 9, both rolls must be greater than or equal to 3. Let's define the following two quantities:

1) Let α be the average remaining number of rolls given you have just rolled 1 or 2.
2) Let β be the average remaining number of rolls given you have just rolled 3–6.

Once we are in state β, each subsequent roll completes the sum with probability 1/6. The first step equations are:

α = (1/3)(α + 1) + (2/3)(β + 1)

β = (1/6)(1) + (1/3)(α + 1) + (1/2)(β + 1)

Solving this gives α = 10.5, β = 9. Starting fresh, the first roll leads to state α with probability 1/3 and to state β with probability 2/3, so the expected total is 1 + (1/3)α + (2/3)β = 10.5. Thus it takes on average 10.5 rolls for the sum to equal 9.
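A quick simulation (an illustrative sketch we added, not part of the original solution) confirms the answer of 10.5:

```python
import random

random.seed(0)

def rolls_until_sum_is_9():
    """Roll a fair die until the last two rolls sum to 9; return the roll count."""
    prev = random.randint(1, 6)
    count = 1
    while True:
        cur = random.randint(1, 6)
        count += 1
        if prev + cur == 9:
            return count
        prev = cur

trials = 200_000
avg = sum(rolls_until_sum_is_9() for _ in range(trials)) / trials
print(avg)  # should be close to 10.5
```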

Problem 5. Consider the numbers 1, 2, · · · , 12 written around a ring as they usually are on a clock. Consider the Markov chain with state space {1, 2, · · · , 12} that at any time jumps with equal probability to one of the two adjacent numbers. What is the expected number of steps that the Markov chain will take to return to its original position?

Solution. Let {Xn, n ≥ 0} denote the Markov chain of interest. By symmetry, we may assume without loss of generality that we start in state 1. Let h(x) := E[T1 | X0 = x], where

T1 := inf{n ≥ 1 : Xn = 1}

By symmetry we must have h(x) = h(14 − x) for all x ∈ {2, 3, · · · , 12}. Thus it suffices to write the following first step equations (whose validity can be seen from the Markov property):

h(1) = 1 + h(2)/2 + h(12)/2 = 1 + h(2)
h(2) = 1 + (1/2) × 0 + h(3)/2 = 1 + h(3)/2
h(3) = 1 + h(2)/2 + h(4)/2
h(4) = 1 + h(3)/2 + h(5)/2
h(5) = 1 + h(4)/2 + h(6)/2
h(6) = 1 + h(5)/2 + h(7)/2
h(7) = 1 + h(6)/2 + h(8)/2 = 1 + h(6)

Note that we have substituted h(12) by h(2) and h(8) by h(6), so these equations only involve the variables h(1), h(2), · · · , h(7). We can rewrite the last six equations as

h(2) = 2 + h(3) − h(2)
h(3) − h(2) = 2 + h(4) − h(3)
h(4) − h(3) = 2 + h(5) − h(4)
h(5) − h(4) = 2 + h(6) − h(5)
h(6) − h(5) = 2 + h(7) − h(6)
h(7) − h(6) = 1


This form is set up for easy successive substitution starting from the bottom equation and working upwards, which allows us to conclude that h(2) = 11. Substituting this in the first of these equations gives h(1) = 12.
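The first step equations can also be solved mechanically (a sketch we added; the variable names are ours) as a 7 × 7 linear system in h(1), …, h(7):

```python
import numpy as np

# Unknowns h(1), ..., h(7); h(12) is replaced by h(2) and h(8) by h(6).
# Each row encodes one first step equation, rearranged into the form A h = b.
A = np.array([
    [1.0, -1.0,  0.0,  0.0,  0.0,  0.0,  0.0],  # h1 - h2 = 1
    [0.0,  1.0, -0.5,  0.0,  0.0,  0.0,  0.0],  # h2 - h3/2 = 1
    [0.0, -0.5,  1.0, -0.5,  0.0,  0.0,  0.0],  # h3 - h2/2 - h4/2 = 1
    [0.0,  0.0, -0.5,  1.0, -0.5,  0.0,  0.0],  # h4 - h3/2 - h5/2 = 1
    [0.0,  0.0,  0.0, -0.5,  1.0, -0.5,  0.0],  # h5 - h4/2 - h6/2 = 1
    [0.0,  0.0,  0.0,  0.0, -0.5,  1.0, -0.5],  # h6 - h5/2 - h7/2 = 1
    [0.0,  0.0,  0.0,  0.0,  0.0, -1.0,  1.0],  # h7 - h6 = 1
])
b = np.ones(7)
h = np.linalg.solve(A, b)
print(h)  # h(1) = 12 and h(2) = 11, matching the hand calculation
```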

3 Confidence intervals

Note 1 (Chebyshev inequality). Let X be a random variable with finite expected value µ and finite non-zero variance σ². Then for any real number a > 0,

P(|X − µ| > a) ≤ σ²/a²

Problem 6. In order to estimate the probability of heads in a coin flip, p, you flip a coin n times and count the number of heads, Sn. You use the estimator p̂ = Sn/n. You choose the sample size n to have the guarantee

P(|Sn/n − p| ≥ ε) ≤ δ

Determine how the value of n suggested by the Chebyshev inequality changes when ε is reduced to half of its original value. How does it change when δ is reduced to half of its original value?

Solution. The random variable Sn/n has mean p and variance np(1 − p)/n² = p(1 − p)/n, hence

P(|Sn/n − p| ≥ ε) ≤ p(1 − p)/(nε²)

by the Chebyshev inequality. Hence we want:

δ = p(1 − p)/(nε²), or n = p(1 − p)/(δε²)

When ε is reduced to half of its original value, we need four times more flips to have the same guarantee, and when δ is reduced to half of its original value, we need two times more flips.
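The scaling can be made concrete with a tiny helper (the function name and example values are ours), using the worst case p(1 − p) ≤ 1/4:

```python
def chebyshev_n(eps, delta, p=0.5):
    """Sample size suggested by Chebyshev: n = p(1 - p) / (delta * eps^2).

    The default p = 0.5 gives the worst case, since p(1 - p) <= 1/4.
    """
    return p * (1 - p) / (delta * eps ** 2)

n0 = chebyshev_n(0.1, 0.05)
print(n0)                            # about 500 flips
print(chebyshev_n(0.05, 0.05) / n0)  # about 4: halving eps quadruples n
print(chebyshev_n(0.1, 0.025) / n0)  # about 2: halving delta doubles n
```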
