
6 Elementary Notions in Probability Theory

In this chapter we introduce the first notions of the theory of probability, begin the discussion of conditioning, and make a detailed study of the concept of stochastic independence.

17 Events and Random Variables

In this section we introduce the primary notions of event, probability and random variable.

The σ-algebra of events 17.1. The starting point for probability theory is a set X whose elements represent the possible outcomes of an experiment. A basic notion in probability theory is that of event. The events are subsets of X. Thus the family A of events is included in P(X). We are not interested in the nature of an event A ∈ A, but we are rather concerned with its occurrence or nonoccurrence. A contains two special events: the impossible event, denoted by ∅, and the sure event, denoted by X. If A and B are events such that A ⊂ B, then we say that A implies B. With the help of the logical operations expressed by the terms "not", "or" and "and" we can form new events. Thus to each event A ∈ A there corresponds the contrary event A^c ∈ A. If A, B ∈ A, then A ∪ B ∈ A is the event A or B, and A ∩ B ∈ A is the event A and B. If A ∩ B = ∅, then we say that A and B are disjoint events. For practical and theoretical reasons we shall assume that the family A of events is a σ-algebra.

Probabilities of events 17.2. Suppose that in n repetitions of an experiment the event A ∈ A occurs n_A times, so that the frequency of occurrence of A is n_A/n. If n_A/n approximates a number P(A) as n → ∞, then P(A) represents the probability of the event A. Obviously, 0 ≤ P(A) ≤ 1. Also, we have P(∅) = 0 and P(X) = 1. If A and B are disjoint events, then it is clear that P(A ∪ B) = P(A) + P(B). In order to get a rich mathematical theory we shall assume that P is σ-additive, and so (X, A, P) is a probability space. We shall say that a property holds P-almost surely [P-a.s.] if it holds P-almost everywhere. When confusion is impossible, we shall write simply a.s. instead of P-a.s. Also a.s. convergence will mean P-a.s. convergence. If A, B ∈ A with 1_A ≤ 1_B a.s. or 1_A = 1_B a.s., we shall sometimes write A ⊂ B a.s. or A = B a.s., respectively. (With the notation of (9.36), observe that A ⊂ B a.s. if and only if A ≤ B, and A = B a.s. if and only if A ∼ B.)
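
The frequency interpretation above is easy to illustrate numerically. The following minimal Python sketch (the fair-die event and the sample sizes are illustrative assumptions, not part of the text) estimates P(A) for the event A = "a fair die shows an even face" and shows the frequency n_A/n settling near 1/2 as n grows:

import random

random.seed(0)  # fixed seed for a reproducible illustration
for n in (100, 10_000, 1_000_000):
    # n_A counts occurrences of the event A = "the die shows an even face"
    n_A = sum(1 for _ in range(n) if random.randint(1, 6) % 2 == 0)
    print(n, n_A / n)  # the frequency n_A/n settles near P(A) = 1/2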



The next notions play a central role in probability theory.

Definitions 17.3. Let (X, A) be as in (17.1). If to each possible outcome x ∈ X we assign a number f(x) ∈ R such that the function f : X → R is (A, B(R))-measurable, then we say that f is a random variable on X. An R̄-valued random variable on X is an (A, B(R̄))-measurable function from X into R̄. We drop "on X" when confusion is impossible.

Let (Y, B) be a measurable space, and let g : X → Y be an (A, B)-measurable function. In the context of probability theory g is called a random element. If (Y, B) = (∏_{i∈I} Y_i, ⊗_{i∈I} B_i) for some nonempty family {(Y_i, B_i) : i ∈ I} of measurable spaces, then g is called a random vector.

Definitions 17.4. Let (X, A, P) be a probability space, and let f be a random variable on X. If ∫_X f dP exists, then the number E f = ∫_X f dP is called the expectation of f. (Sometimes we will write E[f] instead of E f.) If g : X → C is such that |g| ∈ L¹, the number Eg = E[Re g] + i E[Im g] is called the expectation of g.

The following quantities are of special interest in probability theory.

Definitions 17.5. Let (X, A, P) and f be as in (17.4), and let k > 0. If f^k is defined and E f^k exists, then E f^k is called the k-th moment of f. The number E|f|^k is called the k-th absolute moment of f. In accordance with (13.5), if E|f|^k < ∞, then E|f|^j < ∞ whenever 0 < j < k. If E f ∈ R, then the number Var f = E(f − E f)² = E f² − (E f)² is called the variance of f, and the number σ[f] = (Var f)^{1/2} is called the standard deviation of f. (Sometimes we will write Var[f] instead of Var f.)
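
As a quick numerical check of (17.5), the Python sketch below (the exponential sample and its size are illustrative assumptions) computes the two expressions for the variance, Var f = E(f − E f)² = E f² − (E f)², on an empirical distribution, with expectations taken as averages over the sample points:

import numpy as np

rng = np.random.default_rng(0)
f = rng.exponential(scale=2.0, size=10**6)  # sample playing the role of f

Ef = f.mean()                # empirical E f
Ef2 = (f**2).mean()          # empirical second moment E f^2
var = ((f - Ef)**2).mean()   # empirical E(f - E f)^2
print(var, Ef2 - Ef**2)      # the two expressions for Var f agree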

Definitions 17.6. Let (X, A, P) be a probability space, and let f and g be random variables on X such that E f ∈ R, Eg ∈ R and E[fg] ∈ R. The number Cov[f, g] = E[(f − E f)(g − Eg)] = E[fg] − E[f]E[g] is called the covariance of f and g. If σ[f]² = Var f ∈ ]0, ∞[ and σ[g]² = Var g ∈ ]0, ∞[, the number ρ[f, g] = Cov[f, g]/σ[f]σ[g] is called the correlation coefficient between f and g. According to Schwarz's inequality, we have −1 ≤ ρ[f, g] ≤ 1.
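
A small sketch in the same spirit for (17.6) (the linear model g = 0.6f + noise is an illustrative assumption): it computes Cov[f, g] via both expressions and confirms −1 ≤ ρ[f, g] ≤ 1.

import numpy as np

rng = np.random.default_rng(1)
f = rng.normal(size=10**6)
g = 0.6 * f + 0.8 * rng.normal(size=10**6)  # correlated with f by construction

cov = ((f - f.mean()) * (g - g.mean())).mean()  # E[(f - Ef)(g - Eg)]
cov2 = (f * g).mean() - f.mean() * g.mean()     # E[fg] - E[f]E[g]
rho = cov / (f.std() * g.std())                 # correlation coefficient
print(cov, cov2)  # the two covariance formulas agree
print(rho)        # close to 0.6 here, and always in [-1, 1]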

Exercise 17.7. Let f be a random variable such that E f ∈ R. Show that Var f = 0 if and only if f = c a.s. for some c ∈ R.

Exercise 17.8. Let f be a random variable such that m = E f ∈ R and σ² = Var f ∈ ]0, ∞[. Show that P(|f − m| ≥ aσ) ≤ a⁻² whenever 0 < a < ∞.
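
Exercise 17.8 is Chebyshev's inequality. A minimal numerical check (the exponential sample is an illustrative choice) compares the tail probability P(|f − m| ≥ aσ) with the bound a⁻²:

import numpy as np

rng = np.random.default_rng(2)
f = rng.exponential(size=10**6)
m, s = f.mean(), f.std()
for a in (1.5, 2.0, 3.0):
    tail = np.mean(np.abs(f - m) >= a * s)  # P(|f - m| >= a*sigma)
    print(a, tail, 1 / a**2)                # tail stays below the bound 1/a^2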

Exercise 17.9. For i = 1, …, n, let f_i be a random variable such that E f_i² < ∞, and let a_i ∈ R. Prove that Var[∑_{i=1}^n a_i f_i] = ∑_{i=1}^n a_i² Var f_i + 2 ∑_{1≤i<j≤n} a_i a_j Cov[f_i, f_j] = aCa^T, where a = (a_1, …, a_n), a^T denotes its transpose, and C is the matrix with entries c_{ij} = Cov[f_i, f_j], 1 ≤ i, j ≤ n.
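
The identity in Exercise 17.9 can be tested numerically; the sketch below (random coefficients and a mildly dependent Gaussian sample are illustrative assumptions) compares the sample variance of ∑ a_i f_i with aCa^T built from the sample covariance matrix:

import numpy as np

rng = np.random.default_rng(3)
n = 4
a = rng.normal(size=n)            # coefficients a_1, ..., a_n
f = rng.normal(size=(n, 10**6))
f[1] += 0.5 * f[0]                # introduce some dependence between f_1, f_2

C = np.cov(f)                     # c_ij = Cov[f_i, f_j]
lhs = np.var(a @ f)               # Var[sum_i a_i f_i]
rhs = a @ C @ a                   # a C a^T
print(lhs, rhs)                   # agree up to sampling error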

Exercise 17.10. Let (X, A, P) be a probability space, let f be a random variable on X, and let ϕ : [0, ∞[ → [0, ∞[ be a differentiable function such that ϕ(0) = 0 and the derivative ϕ′ ≥ 0. Assume that either ϕ′ is nonincreasing, or ϕ′ is nondecreasing and ϕ′(x + 1) ≤ Cϕ′(x), x ≥ x_0, for some x_0 > 0 and C > 0. Show that E[ϕ ∘ |f|] < ∞ if and only if ∑_{n≥1} ϕ′(n)P({|f| ≥ n}) < ∞; in particular, E|f|^p < ∞ for some p > 0 if and only if ∑_{n≥1} n^{p−1} P({|f| ≥ n}) < ∞. [Use (15.23.a).]


Exercise 17.11. Let f and g be random variables such that Var f = Var g ∈ ]0, ∞[. Prove that Cov[f + g, f − g] = 0.

Exercise 17.12 (Liapounov). Notation is as in (6.99). Set α_k = E|f|^k, k > 0. If 0 < p ≤ q ≤ r, then α_q^{r−p} ≤ α_p^{r−q} α_r^{q−p}.

Exercise 17.13. Let f and g be random variables, and assume that P({f > x} ∩ {g > y}) ≥ P({f > x})P({g > y}) for any x, y ∈ R. Show that P({−f > x} ∩ {−g > y}) ≥ P({−f > x})P({−g > y}).

Exercise 17.14. Let (X, A, P) be a probability space, let {A_n : n ∈ N} ⊂ A be such that P(A_n) → 1, and let f be a P-integrable random variable on X with E f = 0. Prove that E[f 1_{A_n}] → 0.

Exercise 17.15. Let f be a nonnegative random variable such that Var f ∈ ]0, ∞[. Show that E f ≤ (Var f)/Var √f. [Use Liapounov's inequality (17.12).]

Exercise 17.16. Let f be a random variable, let ψ(x) = x log x, x ≥ 1, and let χ(x), x ≥ 0, denote the inverse function of ψ. Prove the following.

(a) χ(x) ∼ x/log x as x → ∞. (For the meaning of ∼, see (7.2.16).)
(b) log χ(x) ∼ log x as x → ∞.
(c) For p ≥ 0, E|f|^{p+1} < ∞ if and only if ∑_{n≥1} n^p (log n)^{p+1} P({|f| ≥ n log n}) < ∞; in particular, E f² < ∞ if and only if ∑_{n≥1} log n · P({|f| ≥ √(n log n)}) < ∞. [Hint. Apply (17.10) with ϕ(x) = x^{p+1}(log x)^{p+1}, x ≥ x_0, to the random variable χ ∘ |f|, taking (a) and (b) into account.]

Exercise 17.17. Let (X, A, P) be a probability space, and let {A_n : n ∈ N} ⊂ A be such that A_1 = A_2 = ··· a.s. Show that A_1 = ∪_{n∈N} A_n = ∩_{n∈N} A_n = lim inf_n A_n = lim sup_n A_n a.s.

Exercise 17.18. Notation is as in (6.99), assume that Y_i is finite for any i ∈ I, and let (X, A, P) be a probability space. For i ∈ I, consider the measurable space (Y_i, P(Y_i)), and let f_i : X → Y_i and g_i : X → Y_i be random elements.
(a) Prove that d_{α,H} ∘ (f, g) is measurable, where f = (f_i)_{i∈I} and g = (g_i)_{i∈I}.
(b) Verify that E[d_{α,H} ∘ (f, g)] = ∑_{i∈I} α_i P(f_i ≠ g_i).

18 Conditioning and Independence

The concepts of conditioning and independence are peculiar to probability theory. In this section we introduce the notion of conditional probability, and we study in detail the concept of independence.

From now on, unless otherwise stated, (X, A, P) is a fixed probability space, an event is an event in A, and a random variable is a random variable on X.

Definition 18.1. Let B ∈ A be such that P(B) > 0. For A ∈ A, define P_B(A) = P(A ∩ B)/P(B). Then P_B is a probability on A called the conditional probability given B. We shall also use the notation P(A|B) = P_B(A).


Theorem 18.2. The following assertions hold.
(i) If B_1, B_2 ∈ A and P(B_1 ∩ B_2) > 0, then (P_{B_1})_{B_2} = P_{B_1 ∩ B_2}.
(ii) If A, B ∈ A and P(B) > 0, then P(A|B) = P(A)/P(B) whenever A ⊂ B, and P(A|B) = 1 whenever A ⊃ B.
(iii) If A, B ∈ A and P(A)P(B) > 0, then P(A)P(B|A) = P(B)P(A|B).
(iv) If A_1, …, A_n, A_{n+1} ∈ A and P(A_2 ∩ ··· ∩ A_{n+1}) > 0, then P(A_1 ∩ ··· ∩ A_n|A_{n+1}) = ∏_{i=1}^n P(A_i|A_{i+1} ∩ ··· ∩ A_{n+1}) (product formula).
(v) If {B_i : i ∈ I} ⊂ A is a countable partition of X such that P(B_i) > 0, i ∈ I, then P(A) = ∑_{i∈I} P(A|B_i)P(B_i) for any A ∈ A (total probability formula).
(vi) If A ∈ A and P(A) > 0, and {B_i : i ∈ I} ⊂ A is a countable partition of X such that P(B_i) > 0, i ∈ I, then P(B_j|A) = P(A|B_j)P(B_j)/∑_{i∈I} P(A|B_i)P(B_i), j ∈ I (Bayes formula).
(vii) If B ∈ A and P(B) > 0, and f is a random variable on X, then

∫_X f dP_B = (1/P(B)) ∫_B f dP,    (1)

in the sense that if one of the integrals exists, so does the other, and the two integrals are equal.

Proof. The proof of (i)–(vi) is easy and is left to the reader. If f = 1_A, where A ∈ A, then (1) becomes P_B(A) = P(A ∩ B)/P(B), which is true. (vii) is further proved by making use of the indicator function method.
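
Parts (v) and (vi) of (18.2) are easy to exercise numerically. A minimal sketch (the two-block partition and its numbers are illustrative assumptions): given a partition {B_1, B_2} and the conditional probabilities P(A|B_i), it computes P(A) by the total probability formula and P(B_j|A) by the Bayes formula.

# Partition of X: B1, B2 with probabilities P(B1), P(B2)
P_B = [0.3, 0.7]
# Conditional probabilities P(A|B1), P(A|B2) (illustrative numbers)
P_A_given_B = [0.9, 0.2]

# Total probability formula: P(A) = sum_i P(A|B_i) P(B_i)
P_A = sum(pa * pb for pa, pb in zip(P_A_given_B, P_B))

# Bayes formula: P(B_j|A) = P(A|B_j) P(B_j) / P(A)
P_B_given_A = [pa * pb / P_A for pa, pb in zip(P_A_given_B, P_B)]
print(P_A)          # 0.9*0.3 + 0.2*0.7 = 0.41
print(P_B_given_A)  # [0.27/0.41, 0.14/0.41], which sums to 1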

Now we introduce the concept of stochastic independence.

Definition 18.3. Let {A_i : i ∈ I} be a set of σ-algebras such that A_i ⊂ A for each i ∈ I. We say that A_i, i ∈ I, are independent [P-independent] if, for every finite set {i_1, …, i_n} ⊂ I and A_1 ∈ A_{i_1}, …, A_n ∈ A_{i_n}, we have P(A_1 ∩ ··· ∩ A_n) = P(A_1)···P(A_n).

Remark 18.4. Let {A_i : i ∈ I} be as in (18.3).
(a) If A_i, i ∈ I, are independent, then A_i, i ∈ J, are independent whenever J ⊂ I.
(b) If A_i, i ∈ J, are independent for any finite set J ⊂ I, then A_i, i ∈ I, are independent.
(c) Let {B_i : i ∈ I} be a set of σ-algebras such that B_i ⊂ A_i for any i ∈ I. If A_i, i ∈ I, are independent, then B_i, i ∈ I, are independent.

Definition 18.5. Let {f_i : i ∈ I} be a set of random variables on X. We say that f_i, i ∈ I, are independent [P-independent] if f_i⁻¹(B(R)), i ∈ I, are independent σ-algebras.

Remark 18.6. Let {f_i : i ∈ I} be as in (18.5). For i ∈ I, let g_i : R → R be a measurable function. Then, in accordance with (18.4.c), g_i ∘ f_i, i ∈ I, are independent.

Definition 18.7. Let {A_i : i ∈ I} ⊂ A. We say that A_i, i ∈ I, are independent [P-independent] if 1_{A_i}, i ∈ I, are independent random variables.

Remarks 18.8. (a) Let {A_i : i ∈ I} be as in (18.7). Then A_i, i ∈ I, are independent if and only if σ({A_i}), i ∈ I, are independent σ-algebras.


(b) Independence may be lost under conditioning. On the other hand, dependent objects may gain independence under conditioning. To see this, let X = {1, 2, 3, 4, 5}, A = P(X), and set P({1}) = P({2}) = P({3}) = 1/4 and P({4}) = P({5}) = 1/8. Put A = {1, 2}, B = {2, 3}, C = {2, 4}, and D = {4, 5}. Then P(A ∩ B) = P(A)P(B) = 1/4, while 2/3 = P(A ∩ B|C) ≠ P(A|C)P(B|C) = 4/9. On the other hand, P(C ∩ D) = 1/8 ≠ 3/32 = P(C)P(D), while P(C ∩ D|C ∩ D) = P(C|C ∩ D)P(D|C ∩ D) = 1.
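
The arithmetic in (18.8.b) can be verified mechanically; the following sketch simply enumerates the five-point space:

# The five-point space of (18.8.b) with its probabilities
P = {1: 1/4, 2: 1/4, 3: 1/4, 4: 1/8, 5: 1/8}
def prob(E): return sum(P[x] for x in E)
def cond(E, C): return prob(E & C) / prob(C)  # P(E|C)

A, B, C, D = {1, 2}, {2, 3}, {2, 4}, {4, 5}
print(prob(A & B), prob(A) * prob(B))           # 0.25 == 0.25: A, B independent
print(cond(A & B, C), cond(A, C) * cond(B, C))  # 2/3 vs 4/9: independence lost
print(prob(C & D), prob(C) * prob(D))           # 1/8 vs 3/32: C, D dependent
print(cond(C & D, C & D), cond(C, C & D) * cond(D, C & D))  # 1 == 1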

Lemma 18.9. For i = 1, …, n, let M_i ⊂ A be such that X ∈ M_i and A, B ∈ M_i imply A ∩ B ∈ M_i. If P(A_1 ∩ ··· ∩ A_n) = P(A_1)···P(A_n) for any A_1 ∈ M_1, …, A_n ∈ M_n, then

(i) P(A_1 ∩ ··· ∩ A_n) = P(A_1)···P(A_n)

for any A_1 ∈ M_1, …, A_{n−1} ∈ M_{n−1}, A_n ∈ σ(M_n).

Proof. Putting A_n = X we get P(A_1 ∩ ··· ∩ A_{n−1}) = P(A_1)···P(A_{n−1}) for any A_1 ∈ M_1, …, A_{n−1} ∈ M_{n−1}. If P(A_1 ∩ ··· ∩ A_{n−1}) = P(A_1)···P(A_{n−1}) = 0, then P(A_1 ∩ ··· ∩ A_{n−1} ∩ A) = P(A_1)···P(A_{n−1})P(A) for every A ∈ A, and so (i) holds in this case. Assume now that P(A_1 ∩ ··· ∩ A_{n−1}) = P(A_1)···P(A_{n−1}) ≠ 0, and consider the conditional probability P_{A_1 ∩ ··· ∩ A_{n−1}}. For each A ∈ M_n, we have P_{A_1 ∩ ··· ∩ A_{n−1}}(A) = P(A_1 ∩ ··· ∩ A_{n−1} ∩ A)/P(A_1 ∩ ··· ∩ A_{n−1}) = P(A_1)···P(A_{n−1})P(A)/(P(A_1)···P(A_{n−1})) = P(A). Therefore, in view of (10.8), P_{A_1 ∩ ··· ∩ A_{n−1}} and P coincide on σ(M_n). Thus we have P_{A_1 ∩ ··· ∩ A_{n−1}}(A) = P(A) for any A ∈ σ(M_n), and so (i) obtains.

Theorem 18.10. For i ∈ I, let M_i ⊂ A be such that A, B ∈ M_i implies A ∩ B ∈ M_i. If for any finite set {i_1, …, i_n} ⊂ I and A_1 ∈ M_{i_1}, …, A_n ∈ M_{i_n}, we have P(A_1 ∩ ··· ∩ A_n) = P(A_1)···P(A_n), then σ(M_i), i ∈ I, are independent.

Proof. We may and do assume that X ∈ M_i for each i ∈ I, because replacing M_i by M_i ∪ {X} leaves the hypothesis of the theorem unchanged. In accordance with (18.4.b), it suffices to consider only the case that I is finite. Hence, without loss of generality, we may take I = {1, …, n}. Then applying (18.9) n times, it follows that σ(M_1), …, σ(M_n) are independent.

Corollary 18.11. Let {A_i : i ∈ I} be as in (18.7). Then A_i, i ∈ I, are independent if and only if, for every finite set {i_1, …, i_n} ⊂ I, we have P(A_{i_1} ∩ ··· ∩ A_{i_n}) = P(A_{i_1})···P(A_{i_n}).

Proof. The result is an immediate consequence of (18.8) and (18.10).

Notation 18.12. For i ∈ I, let A_i be a σ-algebra of subsets of X. We will write ∨_{i∈I} A_i instead of σ(∪_{i∈I} A_i) whenever convenient. If I = {n ∈ Z : n ≥ m} for some m ∈ Z, then we will write ∨_{n≥m} A_n in place of ∨_{i∈I} A_i; also, if I = {1, …, n}, then we will write ∨_{i=1}^n A_i or A_1 ∨ ··· ∨ A_n instead of ∨_{i∈I} A_i. If {I_a : a ∈ A} is a partition of I such that I_a ≠ ∅, a ∈ A, then, according to (7.23), ∨_{a∈A}(∨_{i∈I_a} A_i) = ∨_{i∈I} A_i.

Theorem 18.13. Let (X, A) be a measurable space, let {A_i : i ∈ I} be a set of σ-algebras such that A_i ⊂ A, i ∈ I, and let P and Q be probabilities on A that coincide on each A_i. If A_i, i ∈ I, are both P-independent and Q-independent, then P and Q coincide on ∨_{i∈I} A_i.

Proof. Set M = {∩_{i∈I} A_i : A_i ∈ A_i, i ∈ I, and {i ∈ I : A_i ≠ X} is finite}. Evidently, A, B ∈ M implies A ∩ B ∈ M. As ∪_{i∈I} A_i ⊂ M ⊂ ∨_{i∈I} A_i, and P(A) = Q(A) for any A ∈ M, the result follows at once from (10.8).

The next theorem expresses the disassociativity property of independence.

Theorem 18.14. Let {B_a : a ∈ A} be a set of σ-algebras such that B_a ⊂ A, a ∈ A, and B_a, a ∈ A, are independent. If for each a ∈ A there exists a set of σ-algebras {A_i : i ∈ I_a} such that A_i ⊂ B_a, i ∈ I_a, and A_i, i ∈ I_a, are independent, then A_i, i ∈ ∪_{a∈A} I_a, are independent.

Proof. Exercise.

The following theorem expresses the associativity property of independence.

Theorem 18.15. Let {A_i : i ∈ I} be a set of σ-algebras such that A_i ⊂ A, i ∈ I, and A_i, i ∈ I, are independent. If {I_a : a ∈ A} is a partition of I such that I_a ≠ ∅, a ∈ A, then ∨_{i∈I_a} A_i, a ∈ A, are independent.

Proof. For a ∈ A, put M_a = {∩_{i∈I_a} A_i : A_i ∈ A_i, i ∈ I_a, and {i ∈ I_a : A_i ≠ X} is finite}. Obviously, A, B ∈ M_a implies A ∩ B ∈ M_a, and σ(M_a) = ∨_{i∈I_a} A_i. Since A_i, i ∈ I, are independent, for any finite set {a_1, …, a_n} ⊂ A and B_1 ∈ M_{a_1}, …, B_n ∈ M_{a_n}, we have P(B_1 ∩ ··· ∩ B_n) = P(B_1)···P(B_n). Hence, according to (18.10), the σ-algebras σ(M_a), a ∈ A, are independent.

Example 18.16. If f_1, f_2, f_3, f_4, f_5 are independent random variables, then g_1 = f_1² + f_2² + f_3² and g_2 = f_4 f_5 are independent. Actually, by (18.15), σ(f_1, f_2, f_3) and σ(f_4, f_5) are independent. In view of (8.10), we have g_1⁻¹(B(R)) ⊂ σ(f_1, f_2, f_3) and g_2⁻¹(B(R)) ⊂ σ(f_4, f_5). Thus (18.4.c) shows that g_1 and g_2 are independent.
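
A quick simulation in the spirit of (18.16) follows (standard normal inputs are an illustrative assumption; a near-zero sample correlation is only a necessary symptom of independence, not a proof of it):

import numpy as np

rng = np.random.default_rng(4)
f = rng.normal(size=(5, 10**6))   # independent f_1, ..., f_5
g1 = f[0]**2 + f[1]**2 + f[2]**2
g2 = f[3] * f[4]
# Independent variables are uncorrelated; the sample correlation is near 0
print(np.corrcoef(g1, g2)[0, 1])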

Definition 18.17. Let (X, A, P) be a probability space. Define O_P = {A ∈ A : P(A) = 0 or 1}. Clearly, O_P is a σ-algebra.

Theorem 18.18. The following assertions hold.
(i) A and O_P are independent.
(ii) Let B ⊂ A be a σ-algebra. Then B ⊂ O_P if and only if B is independent of B.
(iii) Let {A_i : i ∈ I} be a set of σ-algebras such that A_i ⊂ A, i ∈ I. If A_i, i ∈ I, are independent, then A_i ∨ O_P, i ∈ I, are independent.

Proof. For A ∈ A and B ∈ O_P, we have P(A ∩ B) = P(A)P(B). This proves (i). Let B ⊂ A be a σ-algebra. If B ⊂ O_P, then B is independent of B by (i) and (18.4.c). Conversely, if B is independent of B, then, for any B ∈ B, we have P(B) = P(B ∩ B) = P(B)², and so P(B) = 0 or 1. This shows that B ⊂ O_P. Thus (ii) is proved. To prove (iii), it suffices to consider only the case when I is finite. So, without loss of generality, we may take I = {1, …, n}. Since A_i ⊂ A, i ∈ I, and A and O_P are independent, (18.14) shows that A_1, …, A_n, O_P are independent σ-algebras. Therefore, by virtue of (18.15), it follows that A_1, …, A_{n−1}, A_n ∨ O_P are independent. By repeating this reasoning n − 1 times we see that A_1 ∨ O_P, …, A_n ∨ O_P are independent.


Theorem 18.19. Let f be a random variable. Then f is O_P-measurable if and only if f = c a.s. for some c ∈ R.

Proof. Assume first that f is O_P-measurable, and consider the function F(x) = P({f ≤ x}), x ∈ R. Then F(x) = 0 or 1 for every x ∈ R. Let c = inf{x : F(x) = 1}. Since F(R) = {0, 1}, we have c ∈ R. As F is nondecreasing, we get F(x) = 0 whenever x < c and F(x) = 1 whenever x > c. Since {f = c} = ∩_{n∈N} {c − 1/n < f ≤ c + 1/n} = ∩_{n∈N} ({f ≤ c + 1/n} − {f ≤ c − 1/n}), we have P({f = c}) = lim_n (P({f ≤ c + 1/n}) − P({f ≤ c − 1/n})) = lim_n (F(c + 1/n) − F(c − 1/n)) = 1, and so f = c a.s.

Assume now that there is c ∈ R such that f = c a.s., and let B ∈ B(R). If c ∈ B, we have {f = c} ⊂ f⁻¹(B), and so P(f⁻¹(B)) = 1, that is, f⁻¹(B) ∈ O_P. Hence, if c ∈ B^c, then f⁻¹(B^c) ∈ O_P, and so f⁻¹(B) = (f⁻¹(B^c))^c ∈ O_P. Thus f is O_P-measurable.

The next result, due to A. N. Kolmogorov, has important applications.

Zero-one law 18.20. Let {A_n : n ∈ N} be a sequence of σ-algebras such that A_n ⊂ A, n ∈ N. If A_n, n ∈ N, are independent, then ∩_{n≥1}(∨_{k≥n} A_k) ⊂ O_P.

Proof. Set T = ∩_{n≥1}(∨_{k≥n} A_k). For n ∈ N, (18.15) shows that A_1, …, A_n, ∨_{k>n} A_k are independent σ-algebras. Therefore, in view of (18.4.c), we see that A_1, …, A_n, T are independent. Consequently, using (18.4.a) and (18.4.b), it follows that A_1, …, A_n, …, T are independent σ-algebras. Applying again (18.15), we see that ∨_{k≥1} A_k and T are independent. Thus, by virtue of (18.4.c), T is independent of T. Finally, (18.18.ii) shows that T ⊂ O_P.

Example 18.21. Let f_n, n ∈ N, be independent random variables, and let A = {x ∈ X : ∑_{k≥1} f_k(x) is convergent}. Then P(A) = 0 or 1. Actually, for each n ∈ N, A = {x ∈ X : ∑_{k≥n} f_k(x) is convergent} ∈ σ(f_k, k ≥ n) (8.30). Therefore, A ∈ ∩_{n≥1} σ(f_k, k ≥ n) ⊂ O_P. This may be rephrased as A ∈ T ⊂ O_P, where T is the tail σ-algebra of {f_n : n ∈ N} (7.72).

Lemma 18.22. Let f and g be nonnegative independent random variables. Then E[fg] = E[f]E[g].

Proof. If f = ∑_{i=1}^n a_i 1_{A_i} and g = ∑_{j=1}^m b_j 1_{B_j} are simple functions, where {f = a_i} = A_i, 1 ≤ i ≤ n, and {g = b_j} = B_j, 1 ≤ j ≤ m, then E[fg] = E[∑_{i=1}^n ∑_{j=1}^m a_i b_j 1_{A_i ∩ B_j}] = ∑_{i=1}^n ∑_{j=1}^m a_i b_j P(A_i)P(B_j) = (∑_{i=1}^n a_i P(A_i))(∑_{j=1}^m b_j P(B_j)) = E[f]E[g]. If f and g are arbitrary, then, for every n ∈ N, define f_n = ∑_{k=1}^{n2^n} (k − 1)2^{−n} 1_{{(k−1)2^{−n} ≤ f < k2^{−n}}} + n 1_{{f ≥ n}} and g_n = ∑_{k=1}^{n2^n} (k − 1)2^{−n} 1_{{(k−1)2^{−n} ≤ g < k2^{−n}}} + n 1_{{g ≥ n}}. Since f and g are independent, it follows that f_n and g_n are independent for any n ∈ N. Therefore,

E[f_n g_n] = E[f_n]E[g_n], n ∈ N.    (1)

Letting n → ∞ in (1), (8.15) and the monotone convergence theorem (11.22) show that E[fg] = E[f]E[g].
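
A numerical check of (18.22) follows (nonnegative exponential and uniform samples are illustrative choices):

import numpy as np

rng = np.random.default_rng(5)
f = rng.exponential(size=10**6)             # nonnegative, independent of g
g = rng.uniform(0, 3, size=10**6)
print((f * g).mean(), f.mean() * g.mean())  # E[fg] and E[f]E[g] agree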


Theorem 18.23. Let f and g be independent random variables. Then the following statements hold.
(i) If f and g are P-integrable, then fg is P-integrable, and E[fg] = E[f]E[g].
(ii) If fg is P-integrable, and P({f = 0}) < 1, then g is P-integrable.

Proof. As f and g are independent, (18.6) shows that f* and g* are independent, where f* = f⁺ or f⁻ or |f|, and g* = g⁺ or g⁻ or |g|. If f and g are P-integrable, then, by (18.22), we have E[|fg|] = E[|f|]E[|g|] < ∞, and so fg is P-integrable. Moreover, according to (18.22), we get E[fg] = E[(f⁺ − f⁻)(g⁺ − g⁻)] = E[f⁺]E[g⁺] − E[f⁺]E[g⁻] − E[f⁻]E[g⁺] + E[f⁻]E[g⁻] = E[f]E[g]. This proves (i). If fg is P-integrable, then E[|f|]E[|g|] = E[|fg|] < ∞. Hence, if P({f = 0}) < 1, then E[|f|] > 0, and so E[|g|] < ∞. This proves (ii).

The following result is extremely important.

Borel-Cantelli lemma 18.24. Let {A_n : n ∈ N} ⊂ A. Then the following statements hold.
(i) If ∑_{n∈N} P(A_n) < ∞, then P(lim sup_n A_n) = 0.
(ii) If A_n, n ∈ N, are independent, and ∑_{n∈N} P(A_n) = ∞, then P(lim sup_n A_n) = 1.

Proof. If ∑_{n∈N} P(A_n) < ∞, then P(lim sup_n A_n) = P(∩_{n≥1}(∪_{k≥n} A_k)) = lim_n P(∪_{k≥n} A_k) ≤ lim_n ∑_{k≥n} P(A_k) = 0. This proves (i). If A_n, n ∈ N, are independent, and ∑_{n∈N} P(A_n) = ∞, then, for each n ∈ N, we have P(∩_{k≥n} A_k^c) = lim_m P(∩_{k=n}^m A_k^c) = lim_m ∏_{k=n}^m P(A_k^c) = lim_m ∏_{k=n}^m (1 − P(A_k)) ≤ lim_m ∏_{k=n}^m exp(−P(A_k)) = lim_m exp(−∑_{k=n}^m P(A_k)) = exp(−∑_{k≥n} P(A_k)) = 0. Therefore, P(∪_{k≥n} A_k) = 1, n ∈ N, and so P(lim sup_n A_n) = 1. This proves (ii).

Remark 18.25. The condition that A_n, n ∈ N, are independent in (18.24.ii) is important. To see this, let X = [0, 1], let P be the Lebesgue measure on B([0, 1]), and let A_n = [0, 1/n]. Then ∑_{n∈N} P(A_n) = ∞, but P(lim sup_n A_n) = 0. However, the independence condition can be relaxed to pairwise independence (see (18.46)).
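
The key computation in the proof of (18.24.ii), P(∪_{k=n}^m A_k) = 1 − ∏_{k=n}^m (1 − P(A_k)) for independent events, can be tabulated directly. The sketch below (the choices p_k = 1/k and p_k = 1/k² are illustrative) shows the union probability tending to 1 when ∑ p_k diverges and staying bounded away from 1 when it converges:

import math

def union_prob(p, n, m):
    # P(A_n ∪ ... ∪ A_m) = 1 - prod_k (1 - p_k) for independent A_k
    return 1 - math.prod(1 - p(k) for k in range(n, m + 1))

for m in (10**2, 10**4, 10**6):
    print(m,
          union_prob(lambda k: 1 / k, 2, m),     # -> 1 (divergent series)
          union_prob(lambda k: 1 / k**2, 2, m))  # stays near 1/2 (convergent)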

The next definition generalizes (18.5).

Definition 18.26. Let (X, A, P) be a probability space, and let {(Y_i, B_i) : i ∈ I} be a nonempty family of measurable spaces. For i ∈ I, let f_i : X → Y_i be a random element. We say that f_i, i ∈ I, are independent [P-independent] if f_i⁻¹(B_i), i ∈ I, are independent σ-algebras.

Example 18.27. Let {(X_i, A_i, P_i) : i ∈ I} be a nonempty set of probability spaces, and let P be the product of the probabilities P_i, i ∈ I. For i ∈ I, let π_i stand for the projection from ∏_{i∈I} X_i onto X_i. Then π_i, i ∈ I, are P-independent. Indeed, for every finite set J = {i_1, …, i_n} ⊂ I and A_1 ∈ A_{i_1}, …, A_n ∈ A_{i_n}, by virtue of (16.11), we have P(π_{i_1}⁻¹(A_1) ∩ ··· ∩ π_{i_n}⁻¹(A_n)) = P(π_J⁻¹(A_1 × ··· × A_n)) = P_{i_1}(A_1)···P_{i_n}(A_n) = P(π_{i_1}⁻¹(A_1))···P(π_{i_n}⁻¹(A_n)).

Remark 18.28. Notation is as in (18.26). Let {I_a : a ∈ A} be a partition of I such that I_a ≠ ∅, a ∈ A.


(a) If (f_i)_{i∈I_a}, a ∈ A, are independent and, for each a ∈ A, f_i, i ∈ I_a, are independent, then, in accordance with (7.19) and (18.14), f_i, i ∈ I, are independent.
(b) If f_i, i ∈ I, are independent, then (f_i)_{i∈I_a}, a ∈ A, are independent by (7.19) and (18.15).
(c) For i ∈ I, let (Z_i, C_i) be a measurable space, and let g_i : Y_i → Z_i be a measurable function. If f_i, i ∈ I, are independent, then g_i ∘ f_i, i ∈ I, are independent by (18.4.c).

Theorem 18.29. Let (X, A, P), {(Y_i, B_i) : i ∈ I} and {f_i : i ∈ I} be as in (18.26), and let f = (f_i)_{i∈I}. Then f_i, i ∈ I, are independent if and only if Pf⁻¹ is the product of the probabilities Pf_i⁻¹, i ∈ I.

Proof. Let Q denote the product of the probabilities Pf_i⁻¹, i ∈ I. Then Pf⁻¹ and Q are probabilities on ⊗_{i∈I} B_i. Assume that f_i, i ∈ I, are independent, and put B = {∏_{i∈I} B_i : B_i ∈ B_i, i ∈ I, and {i ∈ I : B_i ≠ Y_i} is finite}. Obviously, A, B ∈ B implies A ∩ B ∈ B. Let ∏_{i∈I} B_i ∈ B, and set J = {i ∈ I : B_i ≠ Y_i}. Then we have (Pf⁻¹)(∏_{i∈I} B_i) = P(∩_{i∈I} f_i⁻¹(B_i)) = P(∩_{i∈J} f_i⁻¹(B_i)) = ∏_{i∈J} P(f_i⁻¹(B_i)) = (Qπ_J⁻¹)(∏_{i∈J} B_i) = Q(∏_{i∈I} B_i). Therefore, using (7.17.a) and (10.8), it follows that Pf⁻¹ = Q.

Assume now that Pf⁻¹ = Q. Then, for any finite set J = {i_1, …, i_n} ⊂ I and B_1 ∈ B_{i_1}, …, B_n ∈ B_{i_n}, we have P(f_{i_1}⁻¹(B_1) ∩ ··· ∩ f_{i_n}⁻¹(B_n)) = P(f⁻¹(∏_{i∈I} B_i)) = Q(∏_{i∈I} B_i), where B_i = Y_i if i ∉ J. Since Q(∏_{i∈I} B_i) = (Qπ_J⁻¹)(B_1 × ··· × B_n) = P(f_{i_1}⁻¹(B_1))···P(f_{i_n}⁻¹(B_n)), it follows that f_i, i ∈ I, are independent.

Exercise 18.30. Let (X, A, P) be a complete probability space, and let B ∈ A be such that P(B) > 0. Show that (B, B ∩ A, P_B) is a complete probability space.

Exercise 18.31. For i = 1, …, n, let A_i ⊂ A be a countable partition of X. Prove that σ(A_1), …, σ(A_n) are independent if and only if P(A_1 ∩ ··· ∩ A_n) = P(A_1)···P(A_n) whenever A_1 ∈ A_1, …, A_n ∈ A_n.

Exercise 18.32. Let {A_i : i ∈ I} be as in (18.15). Let J ⊂ I, and let B ∈ ∨_{j∈J} A_j be such that P(B) > 0. Prove the following.
(a) The restriction of P to A_i coincides with the restriction of P_B to A_i for any i ∈ I − J.
(b) A_i, i ∈ I − J, are independent with respect to P_B.

Exercise 18.33. Let f be an R̄-valued random variable on X. Show that f is O_P-measurable if and only if f is constant a.s.

Exercise 18.34. Let f_n, n ∈ N, be independent random variables. Prove that lim inf_n f_n and lim sup_n f_n are constants a.s.

Exercise 18.35. Let A_n, n ∈ N, be independent events. Show that P(lim inf_n A_n) ∈ {0, 1} and P(lim sup_n A_n) ∈ {0, 1}.

Exercise 18.36. Let f_n, n ∈ N, be as in (18.34), and let A = {x : {f_n(x) : n ∈ N} has a limit}. Prove the following.


(a) P(A) = 0 or 1.

(b) If P(A) = 1, then there is c ∈ R such that f_n → c a.s.

Exercise 18.37. Let f_n, n ∈ N, be as in (18.34), let {b_n : n ∈ N} ⊂ ]0, ∞[ be such that b_n → ∞, and put g_n = (f_1 + ··· + f_n)/b_n. Let A = {x : {g_n(x) : n ∈ N} has a limit}. Prove the following.

(a) P(A) = 0 or 1.

(b) If P(A) = 1, then there is c ∈ R such that g_n → c a.s.

(c) lim inf_n g_n and lim sup_n g_n are constants a.s.

Exercise 18.38. Let {A_n : n ∈ N} be a sequence of σ-algebras such that A_{n+1} ⊂ A_n ⊂ A, n ∈ N. Show that ∩_{n∈N}(A_n ∨ O_P) = (∩_{n∈N} A_n) ∨ O_P.

Exercise 18.39. Let A_i, 1 ≤ i ≤ n, be independent events, and let f_i, 1 ≤ i ≤ n, be independent nonnegative random variables. Prove the following.
(a) P(∪_{i=1}^n A_i) ≥ (∑_{i=1}^n P(A_i))/(1 + ∑_{i=1}^n P(A_i)). [Make use of the inequalities x/(1 + x) ≤ 1 − e^{−x} ≤ x, x ≥ 0.]
(b) P(sup_{1≤i≤n} f_i ≥ x) ≥ (∑_{i=1}^n P(f_i ≥ x))/(1 + ∑_{i=1}^n P(f_i ≥ x)), x ≥ 0.
(c) If P(sup_{1≤i≤n} f_i ≥ x) ≤ 1/2, then ∑_{i=1}^n P(f_i ≥ x) ≤ 2P(sup_{1≤i≤n} f_i ≥ x).

Exercise: Pairwise independence 18.40. For i ∈ I, let A_i ⊂ A be a σ-algebra, let f_i be a random variable, and let A_i ∈ A. If, for i ≠ j in I, the σ-algebra A_i is independent of A_j, f_i is independent of f_j, and the event A_i is independent of A_j, then we say that A_i, i ∈ I, f_i, i ∈ I, and A_i, i ∈ I, are, respectively, pairwise independent.
(a) If f_1, …, f_n are pairwise independent and E f_i² < ∞, 1 ≤ i ≤ n, show that Var[f_1 + ··· + f_n] = Var f_1 + ··· + Var f_n.
(b) Prove that A_i, i ∈ I, are pairwise independent if and only if, for any i ≠ j in I, we have P(A_i ∩ A_j) = P(A_i)P(A_j).

Exercise 18.41 (Davis). Let A_n, n ≥ 1, be pairwise independent events. Prove the following.
(a) P(∪_{n=1}^m A_n) ≥ ∑_{n=1}^m P(A_n) − ∑_{n=1}^{m−1} P(A_n) ∑_{k=n+1}^m P(A_k), m ≥ 2.
(b) If ∑_{n=1}^∞ P(A_n) < ∞, then P(∪_{n=1}^∞ A_n) ≥ ∑_{n=1}^∞ P(A_n) − ∑_{n=1}^∞ P(A_n) ∑_{k=n+1}^∞ P(A_k).

Exercise 18.42. Let f : X → C and g : X → C be measurable functions such that |f|, |g| ∈ L¹(X, A, P). Assume that f and g are independent. Prove the following.
(a) Any function in the set {Re f, Im f, |f|, f} is independent of any function in the set {Re g, Im g, |g|, g}. [Use (8.2.c) and (18.28.c).]
(b) |fg| ∈ L¹(X, A, P) and E[fg] = E[f]E[g]. [Use (18.23) and (11.35).]

Exercise 18.43. Let a_n and P_n, n ≥ 1, and λ be as in (16.24). Prove the following.
(a) λa_n⁻¹ = P_n, n ≥ 1.
(b) a_n, n ≥ 1, are independent with respect to λ.

Exercise 18.44. Let f_n, n ∈ N, be random variables such that ∑_{n∈N} P(|f_n| ≥ ε) < ∞ for every ε > 0. Use (18.24.i) to show that f_n → 0 a.s.


Exercise 18.45. Let f_n, n ∈ N, be as in (18.34), and let f be a random variable such that f_n → f in probability. Prove that f = c a.s. for some c ∈ R.

Exercise 18.46. Let A_n, n ≥ 1, be pairwise independent events such that ∑_{n≥1} P(A_n) = ∞. Show that P(lim sup_n A_n) = 1. The following steps may be useful.
(a) For n ≥ 1, set p_n = P(A_n) and I_n = ∑_{i=1}^n 1_{A_i}. Then σ[I_n]² = ∑_{i=1}^n p_i − ∑_{i=1}^n p_i². [Use (18.40.a).]
(b) E[I_n]/σ[I_n] → ∞. [Use (a).]
(c) For every a > 0, P(I_n − E[I_n] > −aσ[I_n]) ≥ 1 − a⁻², n ≥ 1. [P(I_n − E[I_n] > −aσ[I_n]) ≥ P(|I_n − E[I_n]| < aσ[I_n]). Then use (17.8).]
(d) For a > 0, P(I_n > E[I_n]/2) ≥ 1 − a⁻² whenever n is large enough (n ≥ n(a)). [Use (c) and (b).]
(e) P(lim sup_n A_n) = P(∑_{i≥1} 1_{A_i} = ∞) = P(lim_n I_n = ∞) ≥ 1 − a⁻², a > 0. [Use (d).]

Exercise 18.47. Let A, B ∈ A be such that 0 < P(A), P(B) < 1. Prove that A, B and A△B are pairwise independent if and only if P(A) = P(B) = 2P(A ∩ B) = 1/2.

Exercise 18.48. Assume that P is nonatomic, and that P(A) > 0 for some A ∈ A. Show that there are events A_i ⊂ A, 1 ≤ i ≤ n, such that P(A_1 ∩ ··· ∩ A_n) = P(A_1)···P(A_n) ≠ 0. [Use (9.42.b).]

Exercise 18.49. Let P and Q be probabilities on A such that P is nonatomic, and assume that P(A ∩ B) = P(A)P(B) whenever Q(A ∩ B) = Q(A)Q(B) for A, B ∈ A. Prove that P ≪ Q. [Hint. If Q(A) = 0, then P(A) = 0 or 1. To rule out the latter, use (9.42.b).]

Exercise 18.50. Let P and Q be as in (18.49), and assume that P(A ∩ B) = P(A)P(B) is equivalent to Q(A ∩ B) = Q(A)Q(B) for A, B ∈ A. Show that P = Q. [Hints. Let f = dP/dQ (18.49). Use (18.48) and (18.49) to prove that Q({1 + δ < f ≤ 1 + 2δ}) = 0 for any δ > 0. Then apply (12.26).]

Exercise 18.51. Let f and g be independent random variables such that E|f + g|^p < ∞ for some p ∈ ]0, ∞[. Prove that E|f|^p, E|g|^p < ∞. [Hints. Choose x_0 > 0 so that P({|g| < x}) ≥ 1/2 for x ≥ x_0. Then P({|f| ≥ 2x}) ≤ 2P({|f| ≥ 2x} ∩ {|g| < x}) ≤ 2P({|f + g| ≥ x}), x ≥ x_0. Next use (15.23.a).]

Exercise 18.52 (Hewitt-Savage). Notation is as in (7.72). Let Q be a probability on B, and put P = ⊗_{m≥1} Q_m, where Q_m = Q, m ≥ 1. Show that E ⊂ O_P. The following steps may be helpful.
(a) P p̃ = P for any p ∈ ∪_{n≥1} P_n. [Hint. Let p⁻¹ : N → N be the inverse function of p, and let B_m ∈ B, m ≥ 1. Then (P p̃)(∏_{m≥1} B_m) = P(∏_{m≥1} B_{p⁻¹(m)}) = ∏_{m≥1} Q(B_{p⁻¹(m)}) = ∏_{m≥1} Q(B_m) = P(∏_{m≥1} B_m). Apply next (10.8), taking into account that {∏_{m≥1} B_m : B_m ∈ B for m ≥ 1} is closed under intersection and generates T_1.]
(b) For n ≥ 1, let B_n denote the σ-algebra of subsets of Y^∞ generated by π_m, 1 ≤ m ≤ n. For ε > 0 and E ∈ E, apply (7.21) and (10.33) to choose n_ε ≥ 1 and E_ε ∈ B_{n_ε} so that P(E△E_ε) < ε. For B_1, …, B_n ∈ B, write B(n) = B_1 × ··· × B_n × Y × Y × ···. Then

|P(E ∩ B(n)) − P(p̃(E_ε) ∩ B(n))| < ε, p ∈ ∪_{n≥1} P_n.    (1)

[Hint. P(E△p̃(E_ε)) = P(p̃(E)△p̃(E_ε)) = P(E△E_ε) < ε by (a). Now remember (9.6.iii).]
(c) For n ≥ 1, let k ≥ (n_ε ∨ n), and define p* ∈ P_{2k+1} by p*(m) = m + k if 1 ≤ m ≤ k, p*(m) = m − k if k + 1 ≤ m ≤ 2k, and p*(m) = m if m ≥ 2k + 1. Then

|P(E)P(B(n)) − P(p̃*(E_ε) ∩ B(n))| < ε.    (2)

[Hint. p̃*(E_ε) ∈ T_{k+1}, B(n) ∈ B_k, and T_{k+1} and B_k are independent.]
(d) P(E ∩ B) = P(E)P(B) for all B ∈ T_1. [Hint. By (1) and (2), P(E ∩ B(n)) = P(E)P(B(n)). Then use (10.8), as the family of all B(n) with n ≥ 1 is closed under intersection and generates T_1.]

(e) P(E) = 0 or 1.

Exercise 18.53. Let (X, A, P) be a probability space, let (Y, B) be a measurable space, let f_n : X → Y, n ≥ 1, be random elements, and set f = (f_1, f_2, …). Suppose that f_n, n ≥ 1, are independent, and that Pf_n⁻¹ = Q for some probability Q on B and any n ≥ 1. Use (18.29) and (18.52) to prove that f⁻¹(E) ⊂ O_P.

Exercise 18.54 (Feller-Chung). For {A_n : n ≥ 1} ⊂ A and {B_n : n ≥ 1} ⊂ A, prove the following.
(a) If B_n is independent of A_n A_{n−1}^c ··· A_0^c for all n ≥ 1, where A_0 = ∅, then

P(∪_{i=1}^n A_i B_i) ≥ (inf_{1≤i≤n} P(B_i)) P(∪_{i=1}^n A_i), n ≥ 1,    (1)

and

P(∪_{i≥1} A_i B_i) ≥ (inf_{i≥1} P(B_i)) P(∪_{i≥1} A_i).    (2)

(b) If B_n is independent of each event A_n, A_n A_{n+1}^c, A_n A_{n+1}^c A_{n+2}^c, …, for any n ≥ 1, then (1) and (2) also hold.

Exercise 18.55. Let f_n, n ≥ 1, be pairwise independent random variables, and assume that Pf_n⁻¹ = Pf_1⁻¹, n ≥ 1. If E|f_1|^p < ∞ for some p ∈ [1, 2[ and E f_1 = 0, then E|S_n|/n^{1/p} → 0, where S_n = f_1 + ··· + f_n. The following steps may be helpful.
(a) For n ≥ 1, put f_{n,i} = f_i 1_{{|f_i| < n^{1/p}}}, 1 ≤ i ≤ n, and U_n = f_{n,1} + ··· + f_{n,n}. Then EU_n/n^{1/p} → 0 and E|S_n − U_n|/n^{1/p} → 0.
(b) Var[U_n/n^{1/p}] → 0. [Use the dominated convergence theorem (11.25.ii).]
(c) E|U_n|/n^{1/p} → 0. [Write E|U_n|/n^{1/p} ≤ √(Var[U_n/n^{1/p}]) + |EU_n/n^{1/p}|.]

Exercise 18.56. For n ≥ 1, let f_n and S_n be as in (18.55). If E[f_1²/log(2 + |f_1|)] < ∞ and E f_1 = 0, then E|S_n|/√(n log n) → 0. [Mimic the solution to (18.55).]

Exercise 18.57. For n ≥ 2, let f_1, …, f_n be random variables such that σ(f_1, …, f_{n−1}) and σ(f_n) are independent, E|f_n|^p < ∞ for some p ≥ 1 and E f_n = 0, and let ϕ : R^{n−1} → R be a measurable function. Show that E[|ϕ ∘ (f_1, …, f_{n−1}) + f_n|^p 1_A] ≥ E[|ϕ ∘ (f_1, …, f_{n−1})|^p 1_A] for A ∈ σ(f_1, …, f_{n−1}); in particular, E[|f_1 + ··· + f_{n−1} + f_n|^p 1_A] ≥ E[|f_1 + ··· + f_{n−1}|^p 1_A], A ∈ σ(f_1, …, f_{n−1}). [Hints. Let A = (f_1, …, f_{n−1})⁻¹(B) with B ∈ B(R^{n−1}). By (11.29), (18.29) and (15.8), write

E[|ϕ ∘ (f_1, …, f_{n−1}) + f_n|^p 1_A]
= ∫_{B×R} |ϕ(x_1, …, x_{n−1}) + x_n|^p d(P(f_1, …, f_n)⁻¹)(x_1, …, x_n)
= ∫_B (∫_R |ϕ(x_1, …, x_{n−1}) + x_n|^p d(Pf_n⁻¹)(x_n)) d(P(f_1, …, f_{n−1})⁻¹)(x_1, …, x_{n−1})
= ∫_B E|ϕ(x_1, …, x_{n−1}) + f_n|^p d(P(f_1, …, f_{n−1})⁻¹)(x_1, …, x_{n−1})
≥ ∫_B |ϕ(x_1, …, x_{n−1})|^p d(P(f_1, …, f_{n−1})⁻¹)(x_1, …, x_{n−1})
= E[|ϕ ∘ (f_1, …, f_{n−1})|^p 1_A],

where the last inequality comes from |x|^p = |E[x + f_n]|^p ≤ E|x + f_n|^p, x ∈ R.]

Exercise 18.58. Let {g_n : n ≥ 1} and {h_n : n ≥ 1} be sequences of random variables such that either σ(h_n) is independent of σ(g_1, …, g_n) for all n ≥ 1, or σ(h_n) is independent of σ(g_k, k ≥ n) for any n ≥ 1. Prove the following.
(a) For ε_n, δ_n ∈ R, n ≥ 1, P(∪_{k≥n} {g_k + h_k > ε_k}) ≥ (inf_{k≥n} P(h_k ≥ −δ_k)) P(∪_{k≥n} {g_k > ε_k + δ_k}), n ≥ 1. [Hint. For k ≥ n ≥ 1, write A_k = {g_k > ε_k + δ_k} and B_k = {h_k ≥ −δ_k}. Then use (2) of (18.54).]
(b) For ε, δ ∈ R, P(lim sup_n (g_n + h_n) ≥ ε) ≥ (lim inf_n P(h_n ≥ −δ)) P(lim sup_n g_n > ε + δ). [Use (5.63.b) and (a).]
(c) For ε, δ ∈ R, P(lim inf_n (g_n + h_n) ≤ −ε) ≥ (lim inf_n P(h_n ≤ δ)) P(lim inf_n g_n < −ε − δ). [Use (b) and (5.61.c).]

Exercise 18.59 (Katona). Let f and g be independent random variables such that Pf⁻¹ = Pg⁻¹. Prove that P(|f + g| ≥ x) ≥ (1/2)P(|f| ≥ x)², x ≥ 0.

Exercise 18.60. For 1 ≤ i ≤ m, let f_{in}, n ≥ 1, be independent random variables. Assume that σ(f_{in}, n ≥ 1), 1 ≤ i ≤ m, are independent. For n ≥ 1, let ϕ_n : R^m → R be a measurable function. Use (18.14), (18.28.b) and (18.28.c) to show that ϕ_n ∘ (f_{1n}, …, f_{mn}), n ≥ 1, are independent.

Exercise 18.61. Let (X, A, P) be a probability space, let f : X → X be a measurable function such that Pf⁻¹ = P, and let B ⊂ A be a σ-algebra that is independent of f⁻¹(A). Put f_0 = i_X and f_{n+1} = f_n ∘ f, n ≥ 0. Prove the following.
(a) f_n⁻¹(B) is independent of f_{n+1}⁻¹(A) for any n ≥ 0.
(b) For n ≥ 0, f_0⁻¹(B), …, f_n⁻¹(B), f_{n+1}⁻¹(A) are independent. [Hint. If f_0⁻¹(B), …, f_n⁻¹(B), f_{n+1}⁻¹(A) are independent, then f_1⁻¹(B), …, f_{n+1}⁻¹(B), f_{n+2}⁻¹(A) are independent. Since f_1⁻¹(B) ∨ ··· ∨ f_{n+1}⁻¹(B) ∨ f_{n+2}⁻¹(A) ⊂ f⁻¹(A), use (18.14) to infer that f_0⁻¹(B), f_1⁻¹(B), …, f_{n+1}⁻¹(B), f_{n+2}⁻¹(A) are independent.]
(c) f_n⁻¹(B), n ≥ 0, are independent.