18.175: Lecture 11 Independent sums and large deviations Scott Sheffield MIT 18.175 Lecture 11 1

18.175: Lecture 11

Independent sums and large deviations

Scott Sheffield

MIT

18.175 Lecture 11 1

Outline

Recollections

Large deviations

Recall Borel-Cantelli lemmas

First Borel-Cantelli lemma: If $\sum_{n=1}^{\infty} P(A_n) < \infty$ then $P(A_n \text{ i.o.}) = 0$.

Second Borel-Cantelli lemma: If the $A_n$ are independent, then $\sum_{n=1}^{\infty} P(A_n) = \infty$ implies $P(A_n \text{ i.o.}) = 1$.
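
As a quick sanity check of the two lemmas, the sketch below computes, for independent events, the exact probability that none of the events with index between m and N occurs, which is the product of the factors 1 - P(A_k). The choices P(A_k) = 1/k and P(A_k) = 1/k^2 and the helper name `prob_none` are illustrative assumptions, not part of the lecture.

```python
from fractions import Fraction

def prob_none(p, m, N):
    """Exact P(no A_k occurs for m <= k <= N) for independent events with P(A_k) = p(k)."""
    prod = Fraction(1)
    for k in range(m, N + 1):
        prod *= 1 - p(k)
    return prod

# Divergent case: sum of 1/k is infinite. The product telescopes to (m - 1) / N.
divergent = prob_none(lambda k: Fraction(1, k), 2, 1000)

# Convergent case: sum of 1/k^2 is finite. The product telescopes to
# ((m - 1) / m) * ((N + 1) / N).
convergent = prob_none(lambda k: Fraction(1, k * k), 50, 1000)

print(float(divergent), float(convergent))
```

In the divergent case the probability of seeing no further events tends to 0 as N grows, for every fixed m, which matches P(A_n i.o.) = 1. In the convergent case the limit is (m - 1)/m, which tends to 1 as m grows, matching P(A_n i.o.) = 0.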

Kolmogorov zero-one law

Consider a sequence of random variables $X_n$ on some probability space. Write $\mathcal{F}'_n = \sigma(X_n, X_{n+1}, \ldots)$ and $\mathcal{T} = \cap_n \mathcal{F}'_n$.

$\mathcal{T}$ is called the tail σ-algebra. It contains the information you can observe by looking only at stuff arbitrarily far into the future. Intuitively, membership in a tail event doesn't change when finitely many $X_n$ are changed.

The event that the $X_n$ converge to a limit is an example of a tail event. Other examples?

Theorem: If $X_1, X_2, \ldots$ are independent and $A \in \mathcal{T}$ then $P(A) \in \{0, 1\}$.

Kolmogorov maximal inequality

Theorem: Suppose the $X_i$ are independent with mean zero and finite variances, and $S_n = \sum_{i=1}^{n} X_i$. Then

$$P\left(\max_{1 \le k \le n} |S_k| \ge x\right) \le x^{-2} \mathrm{Var}(S_n) = x^{-2} E|S_n|^2.$$

Main idea of proof: Consider the first time $k$ at which $|S_k| \ge x$. Bound below the expected value of $S_n^2$ on that event.
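
The inequality can be checked exactly in a small case. The sketch below assumes the $X_i$ are fair ±1 steps (an illustrative choice, so that Var(S_n) = n) and enumerates all 2^n sign sequences; the function name `max_exceed_prob` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def max_exceed_prob(n, x):
    """Exact P(max_{1<=k<=n} |S_k| >= x) when S_k is a sum of k fair +/-1 steps."""
    hits = 0
    for steps in product((-1, 1), repeat=n):
        s, biggest = 0, 0
        for step in steps:
            s += step
            biggest = max(biggest, abs(s))
        if biggest >= x:
            hits += 1
    return Fraction(hits, 2 ** n)

n = 10
for x in (2, 4, 6):
    lhs = max_exceed_prob(n, x)
    rhs = Fraction(n, x * x)  # x^{-2} Var(S_n), with Var(S_n) = n for +/-1 steps
    print(x, float(lhs), float(rhs))
```

For small x the bound exceeds 1 and says nothing; it becomes informative once x is larger than the standard deviation of S_n.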

Kolmogorov three-series theorem

Theorem: Let $X_1, X_2, \ldots$ be independent and fix $A > 0$. Write $Y_i = X_i 1_{(|X_i| \le A)}$. Then $\sum X_i$ converges a.s. if and only if the following are all true:

$\sum_{n=1}^{\infty} P(|X_n| > A) < \infty$

$\sum_{n=1}^{\infty} E Y_n$ converges

$\sum_{n=1}^{\infty} \mathrm{Var}(Y_n) < \infty$

Main ideas behind the proof: Kolmogorov zero-one law implies that $\sum X_i$ converges with probability $p \in \{0, 1\}$. We just have to show that $p = 1$ when all hypotheses are satisfied (sufficiency of conditions) and $p = 0$ if any one of them fails (necessity).

To prove sufficiency, apply Borel-Cantelli to see that the probability that $X_n \ne Y_n$ i.o. is zero. Subtract means from $Y_n$, reduce to the case that each $Y_n$ has mean zero. Apply Kolmogorov maximal inequality.
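
To see the three conditions at work, the sketch below evaluates partial sums of the three series for two illustrative families (assumptions, not examples from the lecture): random signs scaled by 1/n, where all three series converge, and random signs scaled by 1/sqrt(n), where the variance series is the harmonic series. The helper names are hypothetical, and partial sums only suggest convergence or divergence; they do not prove it.

```python
import math

def three_series_terms(dist, A):
    """For a finite distribution [(value, prob), ...] return
    (P(|X| > A), E[Y], Var(Y)) where Y = X * 1(|X| <= A)."""
    tail = sum(p for v, p in dist if abs(v) > A)
    mean = sum(v * p for v, p in dist if abs(v) <= A)
    second = sum(v * v * p for v, p in dist if abs(v) <= A)
    return tail, mean, second - mean * mean

def partial_sums(make_dist, A, N):
    """Partial sums up to N of the three series when X_n has distribution make_dist(n)."""
    totals = [0.0, 0.0, 0.0]
    for n in range(1, N + 1):
        for i, term in enumerate(three_series_terms(make_dist(n), A)):
            totals[i] += term
    return totals

def harmonic_signs(n):
    # X_n = +/- 1/n with equal probability
    return [(1 / n, 0.5), (-1 / n, 0.5)]

def sqrt_signs(n):
    # X_n = +/- 1/sqrt(n) with equal probability
    return [(1 / math.sqrt(n), 0.5), (-1 / math.sqrt(n), 0.5)]

for N in (100, 10_000):
    print(N, partial_sums(harmonic_signs, 1.0, N), partial_sums(sqrt_signs, 1.0, N))
```

For the 1/n family the variance partial sums level off near π²/6, so the theorem gives almost sure convergence of the series. For the 1/sqrt(n) family they keep growing like log N, so the third condition fails and the series diverges almost surely.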

Recall: moment generating functions

Let X be a random variable.

The moment generating function of $X$ is defined by $M(t) = M_X(t) := E[e^{tX}]$.

When $X$ is discrete, can write $M(t) = \sum_x e^{tx} p_X(x)$. So $M(t)$ is a weighted average of countably many exponential functions.

When $X$ is continuous, can write $M(t) = \int_{-\infty}^{\infty} e^{tx} f(x)\,dx$. So $M(t)$ is a weighted average of a continuum of exponential functions.

We always have M(0) = 1.

If $b > 0$ and $t > 0$ then $E[e^{tX}] \ge E[e^{t \min\{X, b\}}] \ge P\{X \ge b\} e^{tb}$.

If $X$ takes both positive and negative values with positive probability then $M(t)$ grows at least exponentially fast in $|t|$ as $|t| \to \infty$.
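
A minimal sketch of the discrete formula and the lower bound above. The distribution (uniform on {-1, 0, 2}) and the function name `mgf` are assumptions chosen for illustration.

```python
import math

def mgf(pmf, t):
    """M(t) = sum_x e^{tx} p_X(x) for a discrete pmf given as {value: probability}."""
    return sum(math.exp(t * x) * p for x, p in pmf.items())

# Hypothetical example: X uniform on {-1, 0, 2}.
pmf = {-1: 1 / 3, 0: 1 / 3, 2: 1 / 3}

print(mgf(pmf, 0.0))  # M(0) is the total probability mass

b = 2
tail = sum(p for x, p in pmf.items() if x >= b)  # P{X >= b}
for t in (0.5, 1.0, 2.0):
    # Lower bound from the slide: M(t) >= P{X >= b} e^{tb} for b > 0, t > 0.
    print(t, mgf(pmf, t), tail * math.exp(t * b))
```

Rearranged, the same inequality reads P{X ≥ b} ≤ e^{-tb} M(t), which is the form used later to bound tail probabilities.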

Recall: moment generating functions for i.i.d. sums

We showed that if $Z = X + Y$ and $X$ and $Y$ are independent, then $M_Z(t) = M_X(t) M_Y(t)$.

If $X_1, \ldots, X_n$ are i.i.d. copies of $X$ and $Z = X_1 + \ldots + X_n$ then what is $M_Z$?

Answer: $M_X^n$. Follows by repeatedly applying formula above.

This is a big reason for studying moment generating functions. It helps us understand what happens when we sum up a lot of independent copies of the same random variable.
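
The product rule can be checked numerically by building the distribution of the sum directly. The sketch below assumes X is a fair six-sided die roll and n = 5 (illustrative choices); `mgf` and `convolve` are hypothetical helper names.

```python
import math

def mgf(pmf, t):
    """M(t) = sum_x e^{tx} p(x) for a discrete pmf given as {value: probability}."""
    return sum(math.exp(t * x) * p for x, p in pmf.items())

def convolve(pmf_a, pmf_b):
    """pmf of A + B when A and B are independent."""
    out = {}
    for a, pa in pmf_a.items():
        for b, pb in pmf_b.items():
            out[a + b] = out.get(a + b, 0.0) + pa * pb
    return out

die = {face: 1 / 6 for face in range(1, 7)}
n = 5

# Distribution of Z = X_1 + ... + X_n, built by repeated convolution.
pmf_sum = die
for _ in range(n - 1):
    pmf_sum = convolve(pmf_sum, die)

for t in (-0.3, 0.1, 0.5):
    # The two columns should agree up to floating-point rounding.
    print(t, mgf(pmf_sum, t), mgf(die, t) ** n)
```

The convolution costs more and more as n grows, while raising M_X(t) to the n-th power is a single operation, which is the practical content of the remark above.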

Large deviations

Consider i.i.d. random variables $X_i$. Want to show that if $\phi(\theta) := M_{X_i}(\theta) = E \exp(\theta X_i)$ is less than infinity for some $\theta > 0$, then $P(S_n \ge na) \to 0$ exponentially fast when $a > E[X_i]$.

Kind of a quantitative form of the weak law of large numbers. The empirical average $A_n$ is very unlikely to be $\epsilon$ away from its expected value (where "very" means with probability less than some exponentially decaying function of $n$).

Write $\gamma(a) = \lim_{n \to \infty} \frac{1}{n} \log P(S_n \ge na)$. It gives the "rate" of exponential decay as a function of $a$.
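
For a concrete feel for the rate, the sketch below takes the $X_i$ to be fair coin flips with values 0 and 1 (an illustrative assumption) and computes $\frac{1}{n} \log P(S_n \ge na)$ exactly from binomial coefficients. It compares this with the exponent obtained by combining the bound $P\{X \ge b\} \le e^{-tb} M(t)$ with $M_{S_n} = \phi^n$, namely $-\sup_\theta (\theta a - \log \phi(\theta))$, which for Bernoulli(p) has the closed form used in `chernoff_exponent`. That this upper bound is also the limit is Cramér's theorem, a standard result not stated in the text above. Function names are hypothetical.

```python
import math

def log_tail_rate(n, a):
    """(1/n) log P(S_n >= n a) for S_n a sum of n fair coin flips (values 0 or 1)."""
    k_min = math.ceil(n * a)
    tail_count = sum(math.comb(n, k) for k in range(k_min, n + 1))
    return (math.log(tail_count) - n * math.log(2)) / n

def chernoff_exponent(a, p=0.5):
    """sup over theta of (theta a - log phi(theta)) for Bernoulli(p), with p < a < 1."""
    return a * math.log(a / p) + (1 - a) * math.log((1 - a) / (1 - p))

a = 0.75
for n in (20, 100, 400):
    # First column stays below the second and approaches it as n grows.
    print(n, log_tail_rate(n, a), -chernoff_exponent(a))
```

The gap between the two columns shrinks roughly like (log n)/n, so the exponential rate dominates the tail probability for large n.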

MIT OpenCourseWare
http://ocw.mit.edu

18.175 Theory of Probability
Spring 2014

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms .