189
MATH2901 Revision Seminar Part I: Probability Theory Jeffrey Yang UNSW Society of Statistics/UNSW Mathematics Society August 2020 Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society) MATH2901 Revision Seminar August 2020 1 / 116

MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

MATH2901 Revision SeminarPart I: Probability Theory

Jeffrey Yang

UNSW Society of Statistics/UNSW Mathematics Society

August 2020

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 1 / 116

Page 2: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Note

The theoretical content was mostly sourced from Libo’s notes.

Most of the examples were taken from the slides Rui Tong (2018 StatSocTeam) made for that year’s revision seminar.

All examples are presented at the end so that it isn’t as obvious whichtechniques/methods should be used.

It is recommended that you refer to the official lecture notes when quotingdefinitions/results.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 2 / 116

Page 3: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Table of Contents

1 Basic Probability Theory

2 Review of Random Variables

3 Computations for Common Distributions

4 Inequalities

5 Limit of Random Variables

6 Tips and Assorted Examples

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 3 / 116

Page 4: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Basic Probability Theory

1 Axioms and Basic Results

2 Conditional Probability and Independence

3 Basic Laws

4 Bayes Formula

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 4 / 116

Page 5: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Introduction to Probability Theory

Probability theory is about modelling and analysing random experiments.We can do this mathematically by specifying an appropriate probabilityspace consisting of three components:

1 a sample space, Ω, which is the set of all possible outcomes.

2 an event space, F , which is the set of all events ”of interest”. Here,we can understand F as being a set of subsets of Ω which satisfiescertain conditions.

3 a probability function, P : F → [0, 1], which assigns each event inthe event space a probability.

Of course, for the model to be meaningful, P should satisfy certain axioms.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 5 / 116

Page 6: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Axioms

Definition (Probability Space)

A probability space is the triple (Ω,F ,P). Here, P satisfies the axioms

1 P(A) ≥ 0 for all A ∈ F2 P(Ω) = 1

3 P (⋃∞

i=1 Ai ) =∑∞

i=1 P(Ai ) for mutually exclusive events A1,A2, ... ∈ F

Exercise for the reader: Prove the axioms.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 6 / 116

Page 7: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Elementary Results

From the axioms, we are able to derive the following fundamental results:

1 If A1,A2, ...,Ak are mutually exclusive,

P

(k⋃

i=1

Ai

)=

k∑i=1

P(Ai ).

2 P(∅) = 0.

3 For any A ⊆ Ω, 0 ≤ P(A) ≤ 1 and P(A) = 1− P(A).

4 If B ⊂ A, then P(B) ≤ P(A). Hence, if B occurs → A occurs, thenP(B) ≤ P(A).

These results can be used without proof.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 7 / 116

Page 8: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Conditional Probability

Definition (Conditional Probability)

The conditional probability that an event A occurs, given that an event Bhas occurred is

P(A|B) =P(A ∩ B)

P(B).

Here, we require P(B) 6= 0.

Given that B has occurred, the total probability for the possible results ofan experiment equals P(B). Visually, we see that the only outcomes in Athat are now possible are those in A ∩ B.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 8 / 116

Page 9: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Independence

Definition (Independent Events)

Events A and B are independent if P(A ∩ B) = P(A)P(B).

Recall that for any two events A and B , we have P(A ∩ B) = P(A|B)P(B).Thus, A and B are independent if and and only P(A|B) = P(A) (orequivalently, P(B|A) = P(B)). Intuitively, this means that knowing event Ahas occurred does not give us any information on the probability of event Boccurring (and vice versa).

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 9 / 116

Page 10: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Independence

Definition (Pairwise Independent Events)

For a countable sequence of events Ai, the events are pairwiseindependent if

P(Ai ∩ Aj) = P(Ai )P(Aj)

for all i 6= j

Definition ((Mutually) Independent Events)

For a countable sequence of events Ai, the events are (mutually)independent if for any collection Ai1 ,Ai2 , ...,Ain , we have

P(Ai1 ∩ ... ∩ Ain) = P(Ai1)...P(Ain).

Independence implies pairwise independence but not vice versa. Can youthink of an example of where pairwise independence does not implyindependence?

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 10 / 116

Page 11: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

The Multiplicative Law

Definition (The Multiplicative Law)

For events A1,A2, we have

P(A1 ∩ A2) = P(A2|A1)P(A1).

For events A1,A2,A3, we have

P(A1 ∩ A2 ∩ A3) = P(A3 ∩ A2 ∩ A1)

= P(A3|A2 ∩ A1)P(A2 ∩ A1)

= P(A3|A1 ∩ A2)P(A2|A1)P(A1).

We trust that the reader is able to generalise this to cases involving agreater number of events.

The Multiplicative Law is highly useful when dealing with asequence of dependent trials.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 11 / 116

Page 12: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

The Additive Law

Definition (The Additive Law)

For events A and B, we have

P(A ∪ B) = P(A) + P(B)− P(A ∩ B).

You might have noticed that the Additive Law resembles theInclusion-exclusion principle. The following quote from Wikipedia shedssome light on this:

”As finite probabilities are computed as counts relative to the cardinality ofthe probability space, the formulas for the principle of inclusion–exclusionremain valid when the cardinalities of the sets are replaced by finiteprobabilities.”

https://en.wikipedia.org/wiki/Inclusion-exclusion principle

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 12 / 116

Page 13: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

The Law of Total Probability

Definition (The Law of Total Probability)

Suppose A1,A2, ...,Ak are mutually exclusive (Ai ∩ Aj = ∅ for all i 6= j)

and exhaustive (⋃k

i=1 Ai = Ω = sample space); that is, A1, ...,Ak forms apartition of Ω. Then, for any event B, we have

P(B) =k∑

i=1

P(B|Ai )P(Ai ).

The Law of Total Probability relates marginal probabilities to conditionalprobabilities and is often used in calculations involving Bayes’ Formula.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 13 / 116

Page 14: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Bayes’ Formula/Bayes’ Theorem/Bayes’ Law

Definition (Bayes’ Theorem)

For a partition A1,A2, ...,Ak and an event B,

P(Aj |B) =P(B|Aj)P(Aj)∑ki=1 P(B|Ai )P(Ai )

=P(B|Aj)P(Aj)

P(B)

In essence, Bayes’ Theorem allows us to reverse the order of conditioningprovided that we know the marginal probabilities of event A and event B.When dealing with problems involving Bayes’ Theorem, it is recommendedthat one draws a tree diagram.

Here, we shall adopt the frequentist interpretation of probability whereprobability measures a proportion of outcomes – as opposed to the Bayesianinterpretation where probability measures a degree of belief.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 14 / 116

Page 15: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Table of Contents

1 Basic Probability Theory

2 Review of Random Variables

3 Computations for Common Distributions

4 Inequalities

5 Limit of Random Variables

6 Tips and Assorted Examples

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 15 / 116

Page 16: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Review of Random Variables

1 Discrete Random Variables

2 Continuous Random Variables

3 Cumulative Distribution Function

4 Expectation and Moments

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 16 / 116

Page 17: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Discrete Random Variables

Definition (Discrete Random Variable)

The random variable X is discrete if there are countably many values x forwhich P(X = x) > 0.

The probability structure of X is typically described by its probability(mass) function.

Definition (Probability (Mass) Function)

The probability function of the discrete random variables X is given by

fX (x) = P(X = x).

The following two properties are important and apply to all discrete randomvariables:

1 fX (x) ≥ 0 for all x ∈ R2∑

all x fX (x) = 1

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 17 / 116

Page 18: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Continuous Random Variables

A continuous random variable has a continuum of possible values.Continuous random variables do not have a probability (mass) function, buthave the analogous (probability) density function.

Definition ((Probability) Density Function)

The density function of a continuous random variable is a real-valuedfunction fX on R with the property∫

AfX (x)dx = P(X ∈ A)

for any (measurable) set A ⊆ R.

The following two properties are important and apply to all continuousrandom variables:

1 fX (x) ≥ 0 for all x ∈ R.2∫∞−∞ fX (x)dx = 1.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 18 / 116

Page 19: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Cumulative Distribution Function (cdf)

Definition (Cumulative Distribution Function)

The cumulative distribution function (cdf) of a random variable X isdefined by

FX (x) = P(X ≤ x).

Here, X can be either continuous or discrete.

In the case where X is a continuous random variable, we have the followingimportant results:

1 FX =∫∞−∞ fX (t)dt.

2 fX (x) = F ′X (x).

3 P(a ≤ X ≤ b) =∫ ba fX (x)dx = area under fX between a and b.

Thus, if we know one of FX or fX , we are able to derive the other. Moreover,once we know these, we are able to derive any probability/property of X .

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 19 / 116

Page 20: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Important Remarks on Continuous Random Variables

Suppose X is a continuous random variable. Then we have

P(X = a) = 0 for any a ∈ R.

Hence, it is only meaningful to talk about the probability of X lying in somesubinterval(s) of R. Consequently, we have

P(a < X < b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a ≤ X ≤ b).

Thus, we typically do not have to worry about whether an interval containsits boundary points or not.

This is NOT the case for discrete random variables.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 20 / 116

Page 21: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Expectation

Definition (Expected Value of a Discrete Random Variable)

The expected value or mean of a discrete random variable X is

EX = E[X ]def=∑all x

x · P(X = x) =∑all x

x · fX (x),

where fX is the probability function of X .

Definition (Expected Value of a Continuous Random Variable)

The expected value of mean of a continuous random variable X is

E[X ] =

∫ ∞−∞

x · fX (x)dx ,

where fX is the density function of X .

In either case, E[X ] can be interpreted as the long run average of X .Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 21 / 116

Page 22: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Expectation of Transformed Random Variables

Often, we are interested in a transformation of a random variable. Inparticular, we often examine the rth moment of X about some constant a,E[(X − a)r ]. The following results provide a method of calculating theexpectation of a transformation of a random variable.

Result (Transformation of a Discrete Random Variable)

Let X be a discrete random variable and let g be a function of X . Then

Eg(X ) = E[g(X )] =∑all x

g(x) · fX (x).

Result (Transformation of a Continuous Random Variable)

Let X be a continuous random variable and let g be a function of X . Then

Eg(X ) = E[g(X )] =

∫ ∞−∞

g(X )fX (x)dx .

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 22 / 116

Page 23: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Properties of Expectation

Result (Linearity of Expectation)

Let X ,Y be random variables and a, b be constants. Then

E[aX + bY ] = aE[X ] + bE[Y ].

Result (Expected Value of a Constant)

For a constant c ,E[c] = c .

In general, for dependent random variables X ,Y ,

E[XY ] 6= E[X ]E[Y ].

Also, if g is a transformation of X , then typically,

E[g(X )] 6= g(E[X ]).

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 23 / 116

Page 24: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Variance and Standard Deviation

Definition (Variance)

Let µ = E[X ]. Then the variance of X , denoted Var[X ], is defined as

Var[X ] = E[(X − µ)2].

Observe that this is the second moment of X about µ.

Definition (Standard Deviation)

The standard deviation of X is the square root of its variance:

σ =√

Var[X ].

Both variance and standard deviation are measures of the spread of arandom variable. Standard deviations are in the same unit as X and thus,can be more readily interpreted. However, variances are easier to work withtheoretically.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 24 / 116

Page 25: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Properties of Variance

Result (Alternative Formula for Variance)

Let X be a random variable and let µ = E[X ]. Then

Var[X ] = E[X 2]− µ2.

The variance will often be calculated using this formula.

Result (Nonlinearity of Variance)

Let X be a random variable and let a be a constant. Then

Var[X + a] = Var[X ]

Var[aX ] = a2 Var[X ].

Pay attention to the nonlinearity of variance. A common mistake is treatingvariance as being linear.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 25 / 116

Page 26: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Moment Generating Functions

Definition (Moment Generating Function)

The moment generating function (mgf) of a random variable X is

µX (u) = E[euX ].

We say that the moment generating function of X exists if mX (u) is finitein some interval containing zero.

The following result regarding the r th moment of X , E[X r ], shows why themoment generating function is called as such.

Result (r th Moment of a random variable)

Let X be a random variable. Then in general, we have

E[X r ] = m(r)X (0)

for r = 0, 1, 2, ....

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 26 / 116

Page 27: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Properties of Moment Generating Functions

Result (Uniqueness)

Let X and Y be two random variables all of whose moments exist. If

mX (u) = mY (u)

for all u in a neighbourhood of 0 (i.e., for all |µ| < ε for some ε < 0), then

FX (x) = FY (y) for all x ∈ R.

That is, the moment generating function of a random variable is unique.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 27 / 116

Page 28: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Properties of Moment Generating Functions

Result (Convergence)

Let Xn : n = 1, 2, ... be a sequence of random variables, each withmoment generating function mXn(u). Furthermore, suppose that

limn→∞

mXn(u) for all u in a neighbourhood of 0

and mX (u) is a moment generating function of a random variable X . Then

limn→∞

FXn(x) = FX (x) for all x ∈ R.

That is, convergence of moment generating functions implies convergenceof cumulative distribution functions.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 28 / 116

Page 29: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Location and Scale Families of Densities

Result (Location Family of Densities)

A location family of densities based on the random variable U is thefamily of densities fX (x) where X = U + c for all possible c . Here, fX (x) isgiven by:

fX (x) = fU(x − c).

Result (Scale Family of Densities)

A scale family of densities based on the random variable U is the family ofdensities fX (x) where X = cU for all possible c . fX (x) is given by:

fX (x) = c−1fU

(xc

).

In essence, the density function of a continuous random variable may belongto a family of density functions that all have a similar form. This relates tothe concept of parameters in statistics.Proving the above results is not difficult and is a good exercise.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 29 / 116

Page 30: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Table of Contents

1 Basic Probability Theory

2 Review of Random Variables

3 Computations for Common Distributions

4 Inequalities

5 Limit of Random Variables

6 Tips and Assorted Examples

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 30 / 116

Page 31: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Computations for Common Distributions

1 Common distributions summary

2 Computing the MGF of random variables

3 Computing the moments directly

4 Computing the distribution of simple transformed random variables

5 Bivariate density computations

6 Computing the sum of random variables using the convolutionformula, bivariate transformation and MGFs

7 Identifying distributions from the MGF

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 31 / 116

Page 32: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Bernoulli Distribution

Definition (Bernoulli Distribution)

For a Bernoulli trial, define the random variable

X =

1 if the trial results in success

0 otherwise

Then X is said to have a Bernoulli distribution.

Result (Probability Function of X )

If X is a Bernoulli random variable defined according to a Bernoulli trialwith success probability 0 < p < 1 then the probability function of X is

fX (x) =

p if x = 1

1− p if x = 0.

An equivalent way of writing this is fX (x) = px(1− p)x , x = 0, 1.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 32 / 116

Page 33: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Binomial Distribution

Definition (Binomial Random Variable)

Consider a sequence of n independent Bernoulli trials, each with successprobability p. If

X = total number of successes

then X is a Binomial random variable with parameters n and p. Acommon shorthand is

X ∼ Bin(n, p).

Here, ”∼” has the meaning ”is distributed as” or ”has distribution”.

Results

If X ∼ Bin(n, p) then

1 fX (x) =(nx

)px(1− p)n−x , x = 0, ..., n,

2 E[X ] = np,

3 Var(X ) = np(1− p).

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 33 / 116

Page 34: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Geometric Distribution

Definition (Geometric Distribution)

IfX = number of trials until first success,

then X is said to have a geometric distribution with parameter p (theprobably of success on each trial).

Results

If X ∼ Geom(p), then

1 fX (x ; p) = p(1− p)x−1, x = 1, 2, ...

2 E[X ] = 1p ,

3 Var(X ) = 1−pp2 .

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 34 / 116

Page 35: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Poisson Distribution

Definition (Poisson Distribution)

The random variable X has a Poisson distribution with parameter λ > 0if its probability function is

fX (x ;λ) = P(X = x) =e−λλx

x!, x = 0, 1, 2, ...

A common abbreviation is X ∼ Poisson(λ).

The Poisson distribution is a model of the occurrence of point events in acontinuum. The number of points occurring in a time interval t is a randomvariable with a Poisson(λt) distribution

Results

If X ∼ Poisson(λ)

1 E(X ) = λ,

2 Var(X ) = λ.Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 35 / 116

Page 36: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Exponential Distribution

The exponential distribution is useful for describing the probability structureof positive random variables. The exponential distribution is closely relatedto the Poisson distribution; the time until the next event has an exponentialdistribution with parameter β = 1

λ .

Definition

A random variable X is said to have an exponential distribution withparameter β > 0 if X has density function:

fX (x ;β) =1

βe−x/β, x > 0.

Results1 E(X ) = β,

2 Var(X ) = β2,

3 Memoryless property: P(X > s + t|X > s) = P(X > t).

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 36 / 116

Page 37: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Uniform Distribution

Definition (Uniform Distribution)

A continuous random variable X that can take values in the interval (a, b)with equal likelihood is said to have a uniform distribution on (a,b). Acommon shorthand is

X ∼ Unif(a, b).

Results

If X ∼ Unif(a, b), then

1 fX (x ; a, b) = 1b−a , a < x < b,

2 E(X ) = a+b2 ,

3 Var(X ) = (b−a)2

12 .

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 37 / 116

Page 38: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Gamma Function

The Gamma Function extends the factorial function to the real numbers.

Definition (Gamma Function)

The Gamma function at x ∈ R is given by

Γ(x) =

∫ ∞0

tx−1e−tdt.

Results1 Γ(x) = (x − 1)Γ(x − 1),

2 Γ(n) = (n − 1)!,, n = 1, 2, 3...

3 Γ( 12 ) =

√π,

4∫∞

0 xme−xdx = m! for m = 0, 1, 2, ....

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 38 / 116

Page 39: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Beta Function

Definition (Beta Function)

The Beta function at x , y ∈ R is given by

B(x , y) =

∫ 1

0tx−1(1− t)y−1dt.

Result

For all x , y ∈ R,

B(x , y) =Γ(x)Γ(y)

Γ(x + y).

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 39 / 116

Page 40: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Phi Function

Definition (Φ)

For all x ∈ R,

Φ(x) =1√2π

∫ x

−∞e−t

2/2dt.

We cannot simplify the above expression for Φ(x) as there is no closed-formanti-derivative.Φ gives the cumulative distribution function of the standard normaldistribution.

Results1 limx→−∞Φ(x) = 0,

2 limx→∞Φ(x) = 1,

3 Φ(0) = 12 ,

4 Φ is monotonically increasing over R.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 40 / 116

Page 41: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Normal Distribution

Definition (Normal Distribution)

The random variable X is said to have a normal distribution withparameters µ and σ2 (where −∞ < µ <∞ and σ2 > 0) if X has densityfunction

fX (x ;µ, σ) =1

σ√

2πe−(x−µ)2

2σ2 ,−∞ < x <∞.

A common shorthand isX ∼ N(µ, σ2).

Results

If X ∼ N(µ, σ2),

1 E(X ) = µ,

2 Var(X ) = σ2,

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 41 / 116

Page 42: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Computing Normal Distribution Probabilities

Result

If Z ∼ N(0, 1) then

P(Z ≤ x) = FZ (x) = Φ(x).

In other words, the Φ function is the cumulative distribution function of theN(0, 1) random variable.

Result

If X ∼ N(µ, σ2) then

Z =X − µσ

∼ N(0, 1).

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 42 / 116

Page 43: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Gamma Distribution

Definition (Gamma Distribution)

A random variable X is said to have a Gamma distribution withparameters α and β (where α, β > 0) if X has density function:

fX (x ;α, β) =e−α/βxα−1

Γ(α)βα, x > 0

A common shorthand is:

X ∼ Gamma(α, β).

Results

If X ∼ Gamma(α, β) then

1 E(X ) = αβ,

2 Var(X ) = αβ2.

Moreover, Y has an Exponential distribution iff Y ∼ Gamma(1, β).Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 43 / 116

Page 44: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Beta Distribution

Definition (Beta Distribution)

A random variable X is said to have a Beta distribution with parametersaα, β > 0 if its density function is

fX (x ;α, β) =xα−1(1− x)β−1

B(α, β), 0 < x < 1.

The Beta distribution generalises the Unif(0, 1) distribution which can bethought of as a Beta distribution with a = b = 1.

Results

If X has a distribution with parameters α and β, then

1 E(X ) = αα+β ,

2 Var(X ) = αβ(α+β+1)(α+β)2 .

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 44 / 116

Page 45: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Joint Probability Function and Density Function

Definition (Joint Probability Function)

If X and Y are discrete random variables, then the joint probabilityfunction of X and Y is

fX ,Y (x , y) = P(X = x ,Y = y),

the probability that X = x and Y = y .

Definition (Joint Density Function)

The joint density function of continuous random variables X and Y is abivariate function fX ,Y with the property∫ ∫

AfX ,Y (x , y)dxdy = P((X ,Y ) ∈ A)

for any (measurable) subset A of R2.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 45 / 116

Page 46: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Joint Cumulative Distribution Function

Definition (Joint Cumulative Distribution Function)

The joint cdf of X and Y is

FX ,Y (x , y) = P(X ≤ x ,Y ≤ y)

=

∑u≤x

∑v≤y P(X = u,Y = v) (X discrete)∫ y

−∞∫ x−∞ fX ,Y (u, v)dudv (X continuous).

Result1 If X and Y are discrete radom variables, then∑

all x

∑all y fX ,Y (x , y) = 1.

2 If X and Y are continuous random variables then∫∞−∞

∫∞−∞ fX ,Y (x , y)dxdy = 1.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 46 / 116

Page 47: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Expectation of Joint Functions

Result (Expectation of Joint Functions)

If g is any function of g(X ,Y ),

E[g(X ,Y )] =

=∑

all x

∑all y g(x , y)P(X = x ,Y = y) discrete

=∫∞−∞

∫∞−∞ g(x , y)fX ,Y (x , y)dxdy continuous

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 47 / 116

Page 48: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Marginal Probability Function

Result (Marginal Probability Function)

If X and Y are discrete, then fX (x) and fY (y) can be calculated fromfX ,Y (x , y) as follows:

fX (x) =∑all y

fX ,Y (x , y)

fY (y) =∑all x

fX ,Y (x , y).

fX (x) is sometimes referred to as the marginal probability function of X .

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 48 / 116

Page 49: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Marginal Density Function

Result (Marginal Density Function)

If X and Y are continuous, then fX (x) and fY (y) can be calculated fromfX ,Y (x , y) as follows:

fX (x) =

∫ ∞−∞

fX ,Y (x , y)dy

fY (y) =

∫ ∞−∞

fX ,Y (x , y)dx .

fX (x) is sometimes referred to as the marginal density function of X .

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 49 / 116

Page 50: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Conditional Probability Function

Definition (Conditional Probability Function)

If X and Y are discrete, the conditional probability function of X givenY = y is

fX |Y (x |y) = P(X = x |Y = y) =P(X = x ,Y = y)

P(Y = y)=

fX ,Y (x , y)

fY (y).

Similarly,

fY |X (y |x) = P(Y = y |X = x) =fX ,Y (x , y)

fX (x).

Result

P(Y ∈ A|X = x) =∑y∈A

fY |X (y |X = x).

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 50 / 116

Page 51: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Conditional Density Function

Definition (Conditional Density Function)

If X and Y are continuous, the conditional density function of X givenY = y is

fX |Y (x |Y = y) =fX ,Y (x , y)

fY (y).

Similarly,

fY |X (y |X = x) =fX ,Y (x , y)

fX (x).

Shorthand: fY |X (y |x).

Result

P(a ≤ Y ≤ b|X = x) =

∫ b

afY |X (y |x)dy .

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 51 / 116

Page 52: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Conditional Expected Value and Variance

Result (Conditional Expected Value)

The conditional expected value of X given Y = y is

E(X |Y = y) =

∑all x xP(X = x |Y = y) if X is discrete∫∞−∞ xfX |Y (x |y)dx if X is continuous

Result (Conditional Variance)

The conditional variance of X given Y = y is

Var(X |Y = y) = E(X 2|Y = y)− [E(X |Y = y)]2

where

E(X 2|Y = y) =

∑all x x

2P(X = x |Y = y)∫∞−∞ x2fX |Y (x |y)dx .

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 52 / 116

Page 53: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Independent Random Variables

Definition (Independent)

Random variables X and Y are independent if and only if for all x , y

fX ,Y (x , y) = fX (x)fY (y).

Result

Random variables X and Y are independent if and only if for all x , y

fY |X (x , y) = fY (y)

.

Result

If X and Y are independent

FX ,Y (x , y) = FX (x) · FY (y).

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 53 / 116

Page 54: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Covariance

Definition (Covariance

The covariance of X and Y is

Cov(X ,Y ) = E[(X − µX )(Y − µY )].

Cov(X ,Y ) measures how much X and Y vary about their means and alsohow much they vary together linearly.

Results1 Cov(X ,X ) = Var(X )

2 Cov(X ,Y ) = E(XY )− µXµY3 If X and Y are independent, then Cov(X ,Y ) = 0.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 54 / 116

Page 55: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Correlation

Definition (Correlation)

The correlation between X and Y is

Corr(X ,Y ) =Cov(X ,Y )√

Var(X ) · Var(Y ).

Corr(X ,Y ) measures the strength of the linear association between X andY . Independent random variables are uncorrelated, but uncorrelatedvariables are not necessarily independent (Consider the case whereE(X ) = 0 and Y = X 2).

Results1 |Corr(X ,Y )| ≤ 1

2 |Corr(X ,Y )| = 1 if and only if P(Y = a + bX ) = 1 for some constantsa, b.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 55 / 116

Page 56: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Transformations

Result

For discrete X ,

fY (y) = P(Y = y) = P[h(X ) = y ] =∑

x :h(x)=y

P(X = x).

Result

For continuous X , if h is monotonic over the set x : fX (x) > 0 then

fY (y) = fX (x)

∣∣∣∣dxdy∣∣∣∣

= fX (h−1(y))

∣∣∣∣dxdy∣∣∣∣

for y such that fX (h−1(y)) > 0.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 56 / 116

Page 57: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Linear Transformation

Result

For a continuous random variable X , if Y = aX + b is a lineartransformation of X with a 6= 0, then

fY (y) =1

|a|fX

(y − b

a

)for all y such that fX ( y−ba ) > 0.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 57 / 116

Page 58: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Leading into Bivariate Transformations...

If X and Y have joint density fX ,Y (x , y) and U is a function of X and Y ,we can find the density of U by calculating FU(u) = P(U ≤ u) anddifferentiating.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 58 / 116

Page 59: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Bivariate Transformations

Result

If U and V are functions of continuous random variables X and Y , then

fU,V (u, v) = fX ,Y (x , y) · |J|

where

J =

∣∣∣∣∣∣∂x∂u

∂x∂v

∂y∂u

∂y∂v

∣∣∣∣∣∣is a determinant called the Jacobian of the transformation.The full specification of fU,V (u, v) requires that the range of (u, v) valuescorresponding to those (x , y) for which fX ,Y (x , y) > 0 is determined.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 59 / 116

Page 60: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Bivariate Transformations

To find fU(u) by bivariate transformation:

1 Define some bivariation transformation (U,V ).

2 Find fU,V (u, v).

3 We want the marginal distribution of U. So now find∫∞−∞ fU,V (u, v).

Using a bivariate transformation to find the distribution of U is often moreconvenient that deriving it via the cumulative distribution function. Usingthe cdf requires double integration, which we can avoid when we use abivariate transformation.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 60 / 116

Page 61: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Sum of Independent Random Variables - ProbabilityFunction/Density Function Approach

Result (Discrete Convolution Formula)

Suppose that X and Y are independent random variables raking onlynon-negative integer values, and let Z = X + Y . Then

fX (z) =z∑

y=0

fX (z − y)fY (y), z = 0, 1, ...

Result (Continuous Convolution Formula)

Suppose X and Y are independent continuous random variables withX ∼ fX (x) and Y ∼ fY (y). Then Z = X + Y has density

fZ (z) =

∫all possible y

fX (z − y)fY (y)dy .

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 61 / 116

Page 62: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Sum of Gammmas

Result

If X1,X2, ...,Xn are independent with Xi ∼ Gamma(αi , β), then

n∑i=1

Xi ∼ Gamma(n∑

i=1

αi , β).

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 62 / 116

Page 63: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Sum of Independent Random Variables - MomentGenerating Function Approach

Result

Suppose that X and Y are independent random variables with momentgenerating functions mX and mY . Then

mX+Y (u) = mX (u)mY (u).

This generalises to n independent random variables.

Result

If X ∼ N(µX , σ2X ) and Y ∼ N(σY , σ

2Y ) are independent then

X + Y ∼ N(µX + µY , σ2X + σ2

Y ).

This also generalises.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 63 / 116

Page 64: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Table of Contents

1 Basic Probability Theory

2 Review of Random Variables

3 Computations for Common Distributions

4 Inequalities

5 Limit of Random Variables

6 Tips and Assorted Examples

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 64 / 116

Page 65: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Inequalities

1 Jensen’s Inequality

2 Markov’s inequality

3 Chebyshev’s inequality

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 65 / 116

Page 66: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Jensen’s Inequality

Result (Jensen’s Inequality)

If h is a convex function (i.e., concave up) and X is a random variable, then

E[h(X )] ≥ h(E[X ]).

There are several formulations of Jensen’s inequality and the aboveformulation is the one most relevant for probability theory. Studentsstudying MATH2701 next term will be delighted to re-encounter Jensen’sinequality albeit in a different formulation.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 66 / 116

Page 67: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Markov’s Inequality

Result (Markov’s Inequality)

Let X be a nonnegative random variable and a > 0. Then

P(X ≥ a) ≤ E[X ]

a.

That is, the probability that X is at least a is at most the expectation of Xdivided by a.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 67 / 116

Page 68: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Chebychev’s Inequality

Result (Chebychev’s Inequality)

If X is any random variable with E[X ] = µ, Var(X ) = σ2 then

P(|X − µ| > kσ) ≤ 1

k2.

That is, the probability that X is more than k standard deviations from itsmean is less than 1

k2 .

The significance of Chebychev’s Inequality is that we can make specificprobabilistic statements about a random variable given only its mean andstandard deviation – observe that we have not made any assumptions onthe distribution of X .Interestingly, Chebychev’s Inequality can be easily derived as a corollary ofMarkov’s Inequality (Hint: consider the random variable (X − E[X ])2)

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 68 / 116

Page 69: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Table of Contents

1 Basic Probability Theory

2 Review of Random Variables

3 Computations for Common Distributions

4 Inequalities

5 Limit of Random Variables

6 Tips and Assorted Examples

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 69 / 116

Page 70: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Limit of Random Variables

1 Almost Sure Convergence

2 Convergence in Probability

3 Convergence in Distribution

4 Central Limit Theorem

5 Law of Large Numbers

6 The Statement and Application of the Delta Method

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 70 / 116

Page 71: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Almost Sure Convergence

Definition (Almost Sure Convergence)

The sequence of numerical random variables X1,X2, ... is said to convergealmost surely to a numerical random variable X , denoted Xn

a.s.→ X if

P(ω : lim

n→∞Xn(ω) = X (ω)

)= 1.

Within the context of probability theory, an event is said to happen almostsurely if it happens with probability 1. Note that it is possible for the set ofexceptions to be non-empty as long as it has probability 0.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 71 / 116

Page 72: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Almost Sure Convergence

The previous definition of almost sure convergence can be difficult to workwith so we will make use of an alternative definition.

Result (Alternative Definition of Almost Sure Convergence)

Xna.s.→ X

if and only if for every ε > 0,

limn→∞

P

(supk≥n|Xk − X | > ε

)= 0.

Almost Sure Convergence is the mode of convergence used in the StrongLaw of Large Numbers.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 72 / 116

Page 73: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Convergence in Probability

The main idea behind Convergence in Probability is that the probability ofan “unusual” event becomes smaller as the sequence progresses.

Definition (Convergence in Probability)

The sequence of random variables X1,X2, ... converges in probability to arandom variable X if, for all ε > 0,

limn→∞

P(|Xn − X | > ε) = 0.

This is usually written as

XnP→ X .

For context, an estimator is called consistent if it converges in probabilityto the quantity being estimated. Furthermore, convergence in probability isthe type of convergence established by the Weak Law of Large Numbers.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 73 / 116

Page 74: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Relationship between Almost Sure Convergence andConvergence in Probability

Result (Almost Sure Convergence implies Convergence in Probability)

Xna.s.→ X =⇒ Xn

P→ X

andXn

a.s.→ 0 ⇐⇒ supk≥n|Xk |

P→ 0.

To understand why almost sure convergence is stronger than convergence inprobability, almost sure convergence depends on a joint distribution whereasconvergence in probability depends only on a marginal distribution.Wise words of wisdom: Almost sure convergence means no noodle leavesthe strip (for large enough n), convergence in probability means theproportion of noodles leaving the strip goes to 0 (as n→∞).

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 74 / 116

Page 75: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Convergence in Distribution

Convergence in Distribution is concerned with whether the distributions ofXi converges to the distribution of some random variable X . In other words,we increasingly expect to see the next outcome in a sequence of randomexperiments become better modelled by a given probability distribution.

Definition (Convergence in Distribution)

Let X1,X2, ... be a sequence of random variables. We say that Xn

converges in distribution to X if

limn→∞

FXn(x) = FX (x)

for all x where FX is continuous. A common shorthand is Xnd→ X .

We say that FX is the limiting distribution of Xn.

Convergence in distribution often arises in applications of the central limittheorem.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 75 / 116

Page 76: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

More on Convergence in Distribution

Convergence in distribution allows us to make approximate probabilitystatements about Xn, for large n, if we can derive the limiting distributionFX (n).

Result (Establishing Convergence in Distribution using Moments)

Let Xnn∈N be a sequence of random variables, each with mgf MXi(t).

Furthermore, suppose that

limn→∞

MXn(t) = MX (t).

If MX (t) is a moment generating function then there is a unique FX (whichgives a random variable X ) whose moments are determined by MX (t) andfor all points of continuity FX (x) we have

limn→∞

FXn(x) = FX (x).

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 76 / 116

Page 77: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Relationship between Convergence in Probability andConvergence in Distribution

Result (Convergence in Probability implies Convergence in Distribution)

XnP→ X =⇒ Xn

d→ X .

Convergence in probability is concerned with the convergence of the actualvalues (the xi ’s) whereas convergence in distribution is concerned with theconvergence of the distributions (the FXi

(x)’s).

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 77 / 116

Page 78: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Weak Law of Large Numbers

Result (Weak Law of Large Numbers)

Suppose X1,X2, ... are independent, each with mean µ and variance0 < σ2 <∞. If

Xn =1

n

n∑i=1

Xi , then XnP→ µ.

The Weak Law of Large Numbers describes how the sample averageconverges to the distributional average as the sample size increases.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 78 / 116

Page 79: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Slutsky’s Theorem

Result (Slutsky’s Theorem)

Let X1,X2, ... be a sequence of random variables such that Xnd→ X for

some distribution X and let Y1,Y2, ... be another sequence of random

variables such that YnP→ c for some constant c . Then

1 Xn + Ynd→ X + c

2 XnYnd→ cX .

Slutsky’s Theorem extends some properties of algebraic operations onconvergent sequences of real numbers to sequences of random variables andis useful for establishing convergence in distribution results.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 79 / 116

Page 80: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Strong Law of Large Numbers

The Weak Law corresponds to convergence in probability while the StrongLaw corresponds to almost sure convergence.

Result (Strong Law of Large Numbers)

Let X1,X2, ... be independent with common mean E[X ] = µ and varianceVar(X ) = σ2 <∞, then

Xna.s.→ µ.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 80 / 116

Page 81: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Central Limit Theorem

Result (Central Limit Theorem)

Suppose X1,X2, ... are independent and identically distributed randomvariables with common mean µ = E(Xi ) and common variance

σ2 = Var(Xi ) <∞. For each n ≥ 1 let Xn =∑n

i=1 Xi

n . Then

Xn − µσ/√n

d→ Z

where Z ∼ N(0, 1). It is common to write

Xn − µσ/√n

d→ N(0, 1).

Note that E(Xn = µ and Var(Xn) = σ2

n so that Central Limit Theoremstates that the limiting distribution of any standardised average ofindependent random variables is the standard Normal distribution.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 81 / 116

Page 82: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Alternative Forms of the Central Limit Theorem

Sometimes, probabilities involving related quantities such as the sum∑ni=1 Xi are required. Since

∑ni=1 Xi = nX , the Central Limit Theorem

also applies to the sum of a sequence of random variables.

Results

1√n(X − µ)

d→ N(0, σ2)

2

∑i Xi−nµσ√n

d→ N(0, 1)

3

∑i Xi−nµ√

n

d→ N(0, σ2)

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 82 / 116

Page 83: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Applications of the Central Limit Theorem

Central Limit Theorem for Binomial Distribution

Suppose X ∼ Bin(n, p). Then

X − np√np(1− p)

d→ N(0, 1)

Normal Approximation to the Poisson Distribution

Suppose X ∼ Poisson(λ). Then

limλ→∞

P(X − λ√

λ≤ x

)= P(Z ≤ x)

where Z ∼ N(0, 1).

Continuity correction: add 12 to the numerator.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 83 / 116

Page 84: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

The Delta Method

The Delta Method

Let Y1,Y2, ... be a sequence of random variables such that

√n(Yn − θ)

σ

d→ N(0, 1).

Suppose the function g is differentiable in the neighbourhood of θ andg ′(θ) 6= 0. Then

√n(g(Yn)− g(θ))

d→ N(0, σ2[g ′(θ)]2).

Alternatively,g(Yn)− g(θ)

g ′(θ)/√n

d→ N(0, 1).

There are several ways of stating the Delta method.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 84 / 116

Page 85: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Table of Contents

1 Basic Probability Theory

2 Review of Random Variables

3 Computations for Common Distributions

4 Inequalities

5 Limit of Random Variables

6 Tips and Assorted Examples

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 85 / 116

Page 86: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

General Tips

1 When finding a pdf/cdf, always specify the domain over which it isdefined.

2 Make sure that the relevant conditions are satisfied before applyingsome result/theorem (e.g., are the random variables i.i.d? Does asequence of random variables have the right mode of convergence?)

3 Does the pdf in question belongs to a known family? If so, you cantypically make use of some known results.

4a.s.→ =⇒ P→ =⇒ d→

5 Know how to do calculations involving bivariate distributions

6 Know how to apply (bivariate) transformations

7 Know how to sum independent random variables using the convolutionformula approach and the moment generating function approach.

8 Know the Large Numbers Laws, the CLT and the Delta method.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 86 / 116

Page 87: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 1

Example (MATH1251) (contd.)

99% of the people with the disease receive a positive test. 98% of thosewithout receive a negative test. If 2% of the population have the disease,determine the probability of someone having the disease given they receiveda positive test.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 87 / 116

Page 88: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 1

We require P(D | T ) =P(T | D)P(D)

P(T ).

P(T ) = P(T | D)P(D) + P(T | Dc)P(Dc)

= P(T | D)P(D) +(1− P(T c | Dc)

)P(Dc)

= 0.99× 0.02 + (1− 0.98)× 0.98 = 0.0394

∴ P(D | T ) =0.99× 0.02

0.0394≈ 0.5025

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 87 / 116

Page 89: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 1

A lot of people get stuck with Bayes’ law, especially when used with otherresults. Use a tree diagram!

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 87 / 116

Page 90: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 2

Example

Given the distribution of X below, compute its expectation and standarddeviation.

x 0 3 9 27

P(X = x) 0.3 0.1 0.5 0.1

E[X ] = 7.5

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 88 / 116

Page 91: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 2

Example

Given the distribution of X below, compute its expectation and standarddeviation.

x 0 3 9 27

P(X = x) 0.3 0.1 0.5 0.1

E[X ] =∑all x

x P(X = x)

= 0× 0.3 + 3× 0.1 + 9× 0.5 + 27× 0.1

= 7.5

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 88 / 116

Page 92: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 2

Example

Given the distribution of X below, compute its expectation and standarddeviation.

x 0 3 9 27

P(X = x) 0.3 0.1 0.5 0.1

E[X ] = 7.5

E[X 2] = 02 × 0.3 + 32 × 0.1 + 92 × 0.5 + 272 × 0.1

= 114.3

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 88 / 116

Page 93: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 2

Example

Given the distribution of X below, compute its expectation and standarddeviation.

x 0 3 9 27

P(X = x) 0.3 0.1 0.5 0.1

E[X ] = 7.5

E[X 2] = 114.3

σX =

√E[X 2]− (E[X ])2 =

√114.3− 7.52 =

√58.05 ≈ 7.619

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 88 / 116

Page 94: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 2

Example (2901 oriented)

Let X ∼ Geom(p). Prove that E[X ] = 1p .

E[X ] =∞∑x=1

xp(1− p)x−1

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 88 / 116

Page 95: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 2

Example (2901 oriented)

Let X ∼ Geom(p). Prove that E[X ] = 1p .

Recall: P(X = x) = p(1− p)x−1 for x = 1, 2, . . .

E[X ] =∑all x

x P(X = x) =∞∑x=1

xp(1− p)x−1

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 88 / 116

Page 96: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 2

Example (2901 oriented)

Let X ∼ Geom(p). Prove that E[X ] = 1p .

E[X ] =∞∑x=1

xp(1− p)x−1

=∞∑y=0

(y + 1)p(1− p)y (y = x − 1)

= (1− p)

∞∑y=0

(y + 1)p(1− p)y−1

= (1− p)∞∑y=0

yp(1− p)y−1 + (1− p)∞∑y=0

p(1− p)y−1

= (1− p)∞∑y=1

yp(1− p)y−1 + (1− p)∞∑y=1

p(1− p)y−1

+ p(1− p)−1 (evaluating at y = 0)

= (1− p)E[X ] + (1− p)

(1 + p(1− p)−1

)

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 88 / 116

Page 97: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 2

E[X ] =∞∑x=1

xp(1− p)x−1

=∞∑y=0

(y + 1)p(1− p)y (y = x − 1)

= (1− p)

∞∑y=0

(y + 1)p(1− p)y−1

= (1− p)

∞∑y=0

yp(1− p)y−1 + (1− p)∞∑y=0

p(1− p)y−1

= (1− p)∞∑y=1

yp(1− p)y−1 + (1− p)∞∑y=1

p(1− p)y−1

+ p(1− p)−1 (evaluating at y = 0)

= (1− p)E[X ] + (1− p)

(1 + p(1− p)−1

)

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 88 / 116

Page 98: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 2

E[X ] =∞∑x=1

xp(1− p)x−1

= (1− p)∞∑y=0

yp(1− p)y−1 + (1− p)∞∑y=0

p(1− p)y−1

= (1− p)∞∑y=1

yp(1− p)y−1 + (1− p)∞∑y=1

p(1− p)y−1

+ p(1− p)−1 (evaluating at y = 0)

= (1− p)E[X ] + (1− p)

(1 + p(1− p)−1

)

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 88 / 116

Page 99: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 3

Example (2901 oriented)

Let X ∼ Geom(p). Prove that E[X ] = 1p .

∴ pE[X ] =

((1− p) + p

)E[X ] =

1

p

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 89 / 116

Page 100: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 3

In general, can be done with the aid of Taylor series or binomial theorem.But preferably just do this:

Method (Deriving Expected Value from definition) (2901)

Keep rearranging the expression until you make the entire density, or E[X ],appear again.

Discrete case - Use a change of summation index at some point

Continuous case - Use integration by parts (or occasionally integrationby substitution)

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 89 / 116

Page 101: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 4

Example

Let fX (x) = 2θ2 x for 0 < x < θ. Compute the MGF and (2901) assert its

existence.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 90 / 116

Page 102: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 4

Example

Let fX (x) = 2θ2 x for 0 < x < θ. Compute the MGF and (2901) assert its

existence.

Integrate by parts

mX (u) = E[euX ] =2

θ2

∫ θ

0xeux dx

=2

θ2

(xeux

u

∣∣∣∣θ0

−∫ θ

0

eux

udx

)

=2θeuθ

uθ2− 2

θ2

(eux

u2

∣∣∣∣θ0

)

=2(uθeuθ − euθ + 1)

u2θ2

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 90 / 116

Page 103: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 4

Example

Let fX (x) = 2θ2 x for 0 < x < θ. Compute the MGF and (2901) assert its

existence.

Slowly tidy everything up

mX (u) = E[euX ] =2

θ2

∫ θ

0xeux dx

=2

θ2

(xeux

u

∣∣∣∣θ0

−∫ θ

0

eux

udx

)

=2θeuθ

uθ2− 2

θ2

(eux

u2

∣∣∣∣θ0

)

=2(uθeuθ − euθ + 1)

u2θ2

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 90 / 116

Page 104: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 4

Example

Let fX (x) = 2θ2 x for 0 < x < θ. Compute the MGF and (2901) assert its

existence.

Slowly tidy everything up

mX (u) = E[euX ] =2

θ2

∫ θ

0xeux dx

=2

θ2

(xeux

u

∣∣∣∣θ0

−∫ θ

0

eux

udx

)

=2θeuθ

uθ2− 2

θ2

(eux

u2

∣∣∣∣θ0

)

=2(uθeuθ − euθ + 1)

u2θ2

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 90 / 116

Page 105: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 4

Example

Let fX (x) = 2θ2 x for 0 < x < θ. Compute the MGF and (2901) assert its

existence.

mX (u) =2(uθeuθ − euθ + 1)

u2θ2

GeoGebra simulation

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 90 / 116

Page 106: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 4

Example

Let fX (x) = 2θ2 x for 0 < x < θ. Compute the MGF and (2901) assert its

existence.

Idea: Can check that the limit as u → 0 is finite. The finiteness of the limitimplies the required result.

limu→0

2(uθeuθ − euθ + 1)

u2θ2

LH= lim

u→0

2(θeuθ + uθ2euθ − θeuθ

)2uθ2

= limu→0

euθ

= 1

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 90 / 116

Page 107: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 5

Example

Use the MGF of X ∼ Bin(n, p) to prove that E[X ] = np.

E[X ] = limu→0

d

du(1− p + peu)n

= limu→0

n(1− p + peu)n−1 · peu

= n(1− p + p)n−1 · p= np

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 91 / 116

Page 108: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 5

Example

Use the MGF of X ∼ Bin(n, p) to prove that E[X ] = np.

E[X ] = limu→0

d

du(1− p + peu)n

= limu→0

n(1− p + peu)n−1 · peu

= n(1− p + p)n−1 · p= np

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 91 / 116

Page 109: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 6

Example

A busy switchboard receives 150 calls an hour on average. Assume thatcalls are independent from each other and can be modelled with a Poissondistribution. Find the probability of

1 Exactly 3 calls in a given minute

2 At least 10 calls in a given 5 minute period.

Naive:X ∼ Poisson(150).

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 92 / 116

Page 110: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 6

Example

A busy switchboard receives 150 calls an hour on average. Assume thatcalls are independent from each other and can be modelled with a Poissondistribution. Find the probability of

1 Exactly 3 calls in a given minute

2 At least 10 calls in a given 5 minute period.

In Q1, take X ∼ Poisson(150/60) = Poisson(2.5). Then,

P(X = 3) = e−2.5 2.53

3!≈ 0.2138

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 92 / 116

Page 111: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 6

Example

A busy switchboard receives 150 calls an hour on average. Assume thatcalls are independent from each other and can be modelled with a Poissondistribution. Find the probability of

1 Exactly 3 calls in a given minute

2 At least 10 calls in a given 5 minute period.

In Q2, take Y ∼ Poisson(2.5× 5) = Poisson(12.5). Then,

P(Y ≥ 10) = 1− P(Y ≤ 9)

= 1− e−12.5

(12.50

0!+ · · ·+ 12.59

9!

)

= 1 - ppois(9,lambda=12.5,lower=TRUE)

≈ 0.7985689

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 92 / 116

Page 112: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 6

Example

A busy switchboard receives 150 calls an hour on average. Assume thatcalls are independent from each other and can be modelled with a Poissondistribution. Find the probability of

1 Exactly 3 calls in a given minute

2 At least 10 calls in a given 5 minute period.

In Q2, take Y ∼ Poisson(2.5× 5) = Poisson(12.5). Then,

P(Y ≥ 10) = 1− P(Y ≤ 9)

= 1 - ppois(9,lambda=12.5,lower=TRUE)

≈ 0.7985689

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 92 / 116

Page 113: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 7

Example (2901 course pack)

If, on average, 5 servers go offline during the day, what is the chance thatno servers will go offline in the next hour?

The number of servers going offline in a day is X ∼ Poisson(5).

So the time taken for the next server to go offline is T ∼ Exp(0.2),measured in days.

P(T >

1

24

)=

∫ ∞1/24

5e−5t dt

= e−5/24

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 93 / 116

Page 114: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 7

Example (2901 course pack)

If, on average, 5 servers go offline during the day, what is the chance thatno servers will go offline in the next hour?

The number of servers going offline in a day is X ∼ Poisson(5).

So the time taken for the next server to go offline is T ∼ Exp(0.2),measured in days.

P(T >

1

24

)=

∫ ∞1/24

5e−5t dt

= e−5/24

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 93 / 116

Page 115: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 7

Example (2901 course pack)

If, on average, 5 servers go offline during the day, what is the chance thatno servers will go offline in the next hour?

The number of servers going offline in a day is X ∼ Poisson(5).

So the time taken for the next server to go offline is T ∼ Exp(0.2),measured in days.

∴ We require P(T >

1

24

)

P(T >

1

24

)=

∫ ∞1/24

5e−5t dt

= e−5/24

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 93 / 116

Page 116: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 7

Example (2901 course pack)

If, on average, 5 servers go offline during the day, what is the chance thatno servers will go offline in the next hour?

The number of servers going offline in a day is X ∼ Poisson(5).

So the time taken for the next server to go offline is T ∼ Exp(0.2),measured in days.

P(T >

1

24

)=

∫ ∞1/24

5e−5t dt

= e−5/24

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 93 / 116

Page 117: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 8

Formula (Transforming a Discrete r.v.)

P(h(X ) = y

)=

∑x :h(x)=y

P(X = x)

Um, ye wat?

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 94 / 116

Page 118: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 8

Example

A random variable has the following distribution:

x -1 0 1 2

P(X = x) 0.38 0.21 0.14 0.27

Determine the distribution of Y = X 3 and Z = X 2.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 94 / 116

Page 119: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 8

Example

A random variable has the following distribution:

x -1 0 1 2

P(X = x) 0.38 0.21 0.14 0.27

Determine the distribution of Y = X 3 and Z = X 2.

If X can take the values −1, 0, 1, 2,then Y = X 3 takes the values −1, 0, 1, 8.

P(Y = −1) = P(X 3 = −1) = P(X = −1) = 0.38

Similarly, P(Y = 0) = 0.21, P(Y = 1) = 0.14, P(Y = 8) = 0.27.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 94 / 116

Page 120: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 8

Example

A random variable has the following distribution:

x -1 0 1 2

P(X = x) 0.38 0.21 0.14 0.27

Determine the distribution of Y = X 3 and Z = X 2.

On the other hand, X 2 can only take the values of 0, 1, 4.

P(Z = 0) = P(X 2 = 0) = P(X = 0) = 0.21

P(Z = 1) = P(X 2 = 1) = P(X = ±1) = 0.38 + 0.14 = 0.62

...and P(Z = 4) is still equal to 0.27.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 94 / 116

Page 121: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 8

Example

A random variable has the following distribution:

x -1 0 1 2

P(X = x) 0.38 0.21 0.14 0.27

Determine the distribution of Y = X 3 and Z = X 2.

On the other hand, X 2 can only take the values of 0, 1, 4.

P(Z = 0) = P(X 2 = 0) = P(X = 0) = 0.21

P(Z = 1) = P(X 2 = 1) = P(X = ±1) = 0.38 + 0.14 = 0.62

...and P(Z = 4) is still equal to 0.27.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 94 / 116

Page 122: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 8

Just to think about... (2901 oriented)

If X ∼ Poisson(λ), what must be the distribution of Y = X 2

P(Y = y) =

e−λ λ

√y

(√y)! if y = 0, 1, 4, 9, . . .

0 otherwise

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 94 / 116

Page 123: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 9

Method 1 (Continuous random variable transform theorem)

Consider the transform y = h(x). If h is monotonic wherever fX (x) isnon-zero, then the density of Y = h(X ) is

fY (y) = fX(h−1(y)

) ∣∣∣∣dxdy∣∣∣∣

Example

Let X ∼ Exp(λ). What is the density of Y = X 2?

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 95 / 116

Page 124: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 9

Example

Let X ∼ Exp(λ). What is the density of Y = X 2?

fX (x) = 1λe−x/λ for all x > 0.

h(x) = x2 is invertible for all x > 0, with h−1(y) =√y .

x =√y , so dx

dy = 12√y

∴ fY (y) = fX (√y)

∣∣∣∣ 1

2√y

∣∣∣∣

=1

λe−√y/λ

∣∣∣∣ 1

2√y

∣∣∣∣=

1

2λ√ye−√y/λ

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 95 / 116

Page 125: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 9

Example

Let X ∼ Exp(λ). What is the density of Y = X 2?

fX (x) = 1λe−x/λ for all x > 0.

h(x) = x2 is invertible for all x > 0, with h−1(y) =√y .

x =√y , so dx

dy = 12√y

∴ fY (y) = fX (√y)

∣∣∣∣ 1

2√y

∣∣∣∣=

1

λe−√y/λ

∣∣∣∣ 1

2√y

∣∣∣∣=

1

2λ√ye−√y/λ

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 95 / 116

Page 126: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 10

Example

Let X ∼ Exp(λ). What is the density of Y = X 2?

fY (y) =1

2λ√ye−√y/λ

Since x > 0 and y = x2, y > 0 as well.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 96 / 116

Page 127: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 10

Example

Let X ∼ Unif(−10, 10). What is the density of Y = X 2?

fY (y) =1

20√y

Since −10 < x < 10 and y = x2, we must have 0 < y < 100.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 96 / 116

Page 128: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 11

Example

The joint probability distribution of X and Y is

y

0 1 2

0 1/16 1/8 1/8x 1 1/8 1/16 0

2 3/16 1/4 1/16

Determine P(X = 0,Y = 1), P(X ≥ 1,Y < 1) and P(X − Y = 1)

P(X = 0,Y = 1) =1

8

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 97 / 116

Page 129: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 11

Example

The joint probability distribution of X and Y is

y

0 1 2

0 1/16 1/8 1/8x 1 1/8 1/16 0

2 3/16 1/4 1/16

Determine P(X = 0,Y = 1), P(X ≥ 1,Y < 1) and P(X − Y = 1)

P(X ≥ 1,Y < 1) = P(X = 1,Y = 0) + P(X = 2,Y = 0)

=1

8+

3

16=

5

16

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 97 / 116

Page 130: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 11

Example

The joint probability distribution of X and Y is

y

0 1 2

0 1/16 1/8 1/8x 1 1/8 1/16 0

2 3/16 1/4 1/16

Determine P(X = 0,Y = 1), P(X ≥ 1,Y < 1) and P(X − Y = 1)

P(X − Y = 1) = P(X = 2,Y = 1) + P(X = 1,Y = 0)

=1

4+

1

8=

3

8

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 97 / 116

Page 131: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 12

Joint continuous distributions

Unless you know how to use indicator functions really well (2901), sketchthe region!

Example

fX ,Y (x , y) =1

x2y2x ≥ 1, y ≥ 1

is the joint density of the continuous r.v.s X and Y . Find P(X < 2,Y ≥ 4)and P(X ≤ Y 2).

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 98 / 116

Page 132: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 12

Example

fX ,Y (x , y) =1

x2y2x ≥ 1, y ≥ 1

is the joint density of the continuous r.v.s X and Y . Find P(X < 2,Y ≥ 4)and P(X ≤ Y 2).

P(X < 2,Y ≥ 4) =

∫ 2

1

∫ ∞4

1

x2y2dy dx

=

∫ 2

1

1

4x2dx

=1

8

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 98 / 116

Page 133: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 12

Example

fX ,Y (x , y) =1

x2y2x ≥ 1, y ≥ 1

is the joint density of the continuous r.v.s X and Y . Find P(X < 2,Y ≥ 4)and P(X ≤ Y 2).

P(X ≤ Y 2) =

∫ ∞1

∫ x2

1

1

x2y2dy dx

=

∫ ∞1

(1

x2− 1

x4

)dx

=2

3

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 98 / 116

Page 134: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 13

Example

Find E[Y 2 lnX ] for the following distribution

y

1 2

x 1 1/10 1/52 3/10 2/5

E[Y 2 lnX ] = 12 ln 1P(X = 1,Y = 1) + 22 ln 1P(X = 1,Y = 2)

+ 12 ln 2P(X = 2,Y = 1) + 22 ln 2P(X = 2,Y = 2)

=

(3

10+ 2× 2

5

)ln 2 =

11 ln 2

10

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 99 / 116

Page 135: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 13

Example

Find E[Y 2 lnX ] for the following distribution

y

1 2

x 1 1/10 1/52 3/10 2/5

E[Y 2 lnX ] = 12 ln 1P(X = 1,Y = 1) + 22 ln 1P(X = 1,Y = 2)

+ 12 ln 2P(X = 2,Y = 1) + 22 ln 2P(X = 2,Y = 2)

=

(3

10+ 2× 2

5

)ln 2 =

11 ln 2

10

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 99 / 116

Page 136: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 13

Example

Find E[Y 2 lnX ] for the following distribution

y

1 2

x 1 1/10 1/52 3/10 2/5

E[Y 2 lnX ] = 12 ln 1P(X = 1,Y = 1) + 22 ln 1P(X = 1,Y = 2)

+ 12 ln 2P(X = 2,Y = 1) + 22 ln 2P(X = 2,Y = 2)

=

(3

10+ 2× 2

5

)ln 2 =

11 ln 2

10

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 99 / 116

Page 137: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 14

Problem

Examine the existence of E[XY ] for the earlier example:

fX ,Y (x , y) =1

x2y2for x , y ≥ 1.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 100 / 116

Page 138: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 15

Definition (Cumulative Distribution Function)

The CDF FX ,Y (x , y) is the function given by

FX ,Y (x , y) = P(X ≤ x ,Y ≤ y)

Finding a CDF (Continuous case)

FX ,Y (x , y) =

∫ x

−∞

∫ y

−∞fX ,Y (u, v) du dv

Example

For the earlier example, FX ,Y (x , y) = 0 if x < 1 or y < 1. Else:

FX ,Y (x , y) =

∫ x

1

∫ y

1

1

u2v2du dv =

(1− 1

x

)(1− 1

y

)Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 101 / 116

Page 139: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 16

Recall that P(A ∩ B) = P(A)P(B).

Definition (Independence of random variables)

Two random variables are independent when:

P(X = x ,Y = y) = P(X = x)P(Y = y) (discrete case)

fX ,Y (x , y) = fX (x)fY (y) (continuous case)

Example

Test if X and Y are independent, for

fX ,Y (x , y) =1

x2y2x , y ≥ 1.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 102 / 116

Page 140: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 16

Example

Test if X and Y are independent, for

fX ,Y (x , y) =1

x2y2x , y ≥ 1.

fX (x) =

∫ ∞1

1

x2y2dy

=1

x2x ≥ 1

Similarly fY (y) =1

y2y ≥ 1.

Therefore since fX ,Y (x , y) = fX (x)fY (y), X and Y are independent.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 102 / 116

Page 141: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 17

Example

Determine P(X = x | Y = 2), i.e. fX |Y (x | 2), for

y

1 2

x 1 1/10 1/52 3/10 2/5

P(Y = 2) = P(X = 1,Y = 2) + P(X = 2,Y = 2)

=1

5+

2

5

=3

5.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 103 / 116

Page 142: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 17

Example

Determine P(X = x | Y = 2), i.e. fX |Y (x | 2), for

y

1 2

x 1 1/10 1/52 3/10 2/5

P(Y = 2) =3

5

P(X = 1 | Y = 2) =P(X = 1,Y = 2)

P(Y = 2)=

1

3

P(X = 2 | Y = 2) =P(X = 2,Y = 2)

P(Y = 2)=

2

3

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 103 / 116

Page 143: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 18

Lemma (Independence of random variables)

Two random variables are independent if and only if

fY |X (y | x) = fY (y)

orfX |Y (x | y) = fX (x)

Investigation

For the earlier example with fX ,Y (x , y) = x−2y−2 for x ≥ 1, y ≥ 1, provethe independence of X and Y using this lemma instead.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 104 / 116

Page 144: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 19

Definition (Conditional Expectation)

E[X | Y = y ] =

∑all x

xP(X = x | Y = y) discrete case∫ ∞−∞

x fX |Y (x | y) dx continuous case

Definition (Conditional Variance)

Var(X | Y = y) = E[X 2 | Y = y ]−(E[X | Y = y ]

)2

(And similarly for Y . Basically, just add the condition to the originalformula.)

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 105 / 116

Page 145: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 19

Example

Find E[X | Y = 2] and Var(X | Y = 2) for

y

1 2

x 1 1/10 1/52 3/10 2/5

E[X | Y = 2] = 1 · P(X = 1 | Y = 2) + 2 · P(X = 2 | Y = 2)

= 1× 1

3+ 2× 2

3

=5

3.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 105 / 116

Page 146: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 19

Example

Find E[X | Y = 2] and Var(X | Y = 2) for

y

1 2

x 1 1/10 1/52 3/10 2/5

E[X 2 | Y = 2] = 12 · P(X = 1 | Y = 2) + 22 · P(X = 2 | Y = 2)

= 12 × 1

3+ 22 × 2

3= 3.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 105 / 116

Page 147: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 19

Example

Find E[X | Y = 2] and Var(X | Y = 2) for

y

1 2

x 1 1/10 1/52 3/10 2/5

Var(X 2 | Y = 2) = 3−(

5

3

)2

=2

9

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 105 / 116

Page 148: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 20

Example

Let fX ,Y (x , y) = xy for x ∈ [0, 1], y ∈ [0, 2]. Determine their covariance inthe old fashioned way.

Step 1: Determine the marginal densities

fX (x) =

∫ 2

0xy dy = 2x (0 ≤ x ≤ 1)

fY (y) =

∫ 1

0xy dx =

y

2(0 ≤ y ≤ 2)

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 106 / 116

Page 149: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 20

Example

Let fX ,Y (x , y) = xy for x ∈ [0, 1], y ∈ [0, 2]. Determine their covariance inthe old fashioned way.

Step 2: Find the marginal expectations E[X ] and E[Y ]

E[X ] =

∫ 1

02x2 dx =

2

3

E[Y ] =

∫ 2

0

y2

2dy =

4

3

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 106 / 116

Page 150: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 20

Example

Let fX ,Y (x , y) = xy for x ∈ [0, 1], y ∈ [0, 2]. Determine their covariance inthe old fashioned way.

Step 3: Find E[XY ]

E[XY ] =

∫ 1

0

∫ 2

0xy dy dx = · · · =

8

9

Step 4: Plug in:

Cov(X ,Y ) = E[XY ]− E[X ]E[Y ] =8

9− 2

3× 4

3= 0.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 106 / 116

Page 151: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 20

Example

Let fX ,Y (x , y) = xy for x ∈ [0, 1], y ∈ [0, 2].Determine their covariance inthe old fashioned way.

That was a horrible idea.

Can prove that X and Y are independent

Can use the Fubini-Tonelli theorem to just check that E[XY ] equalsE[X ]E[Y ]

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 106 / 116

Page 152: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 21

Example (2901)

Let Z ∼ N (0, 1) and W satisfy P(X = 1) = P(X = −1) = 12 . Suppose

that W and Z are independent and define X := WZ .

Show that Cov(X ,Z ) = 0.

Noting that E[Z ] = 0,

Cov(X ,Z ) = E[XZ ]− E[X ]E[Z ] = E[XZ ]

Subbing in X = WZ and using independence gives

Cov(X ,Z ) = E[WZ 2] = E[W ]E[Z 2]

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 107 / 116

Page 153: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 21

Example (2901)

Let Z ∼ N (0, 1) and W satisfy P(X = 1) = P(X = −1) = 12 . Suppose

that W and Z are independent and define X := WZ .

Show that Cov(X ,Z ) = 0.

Noting that E[Z ] = 0,

Cov(X ,Z ) = E[XZ ]− E[X ]E[Z ] = E[XZ ]

Subbing in X = WZ and using independence gives

Cov(X ,Z ) = E[WZ 2] = E[W ]E[Z 2]

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 107 / 116

Page 154: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 21

Example (2901)

Let Z ∼ N (0, 1) and W satisfy P(X = 1) = P(X = −1) = 12 . Suppose

that W and Z are independent and define X := WZ .

Show that Cov(X ,Z ) = 0.

Observe that

E[W ] = 1P(X = 1)− 1P(X = −1) = 0.

Hence Cov(X ,Z ) = E[W ]E[Z 2] = 0.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 107 / 116

Page 155: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 22

Theorem (Bivariate Transform Formula)

Suppose X and Y have joint density function fX ,Y and let U and V betransforms on these random variables. Then the joint density of U,V is

fU,V (u, v) = fX ,Y (x , y) | det(J)|

where J is the Jacobian matrix

J =

(∂x∂u

∂x∂v

∂y∂u

∂y∂v

)Remember: x above y and u left of v

Example (Course pack)

Let X and Y be i.i.d. Exp(4) r.v.s. Find the joint density of U and V if

U = 12 (X − Y ) and V = Y .

We know that y > 0. Since v = y , it immediately follows that v > 0.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 108 / 116

Page 156: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 22

Example (Course pack)

Let X and Y be i.i.d. Exp(4) r.v.s. Find the joint density of U and V if

U =1

2(X − Y ) and V = Y .

We have y = v and

u =1

2(x − v) =⇒ x = 2u + v .

∴ J =

(2 10 1

)and det(J) = 2.

We know that y > 0. Since v = y , it immediately follows that v > 0.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 108 / 116

Page 157: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 22

Example (Course pack)

Let X and Y be i.i.d. Exp(4) r.v.s. Find the joint density of U and V if

U =1

2(X − Y ) and V = Y .

fX ,Y (x , y) =1

16e−(x+y)/4

Since y = v and x = 2u + v , we get x + y = 2u + 2v . Therefore

fU,V (u, v) =1

8e−(u+v)/2.

We know that y > 0. Since v = y , it immediately follows that v > 0.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 108 / 116

Page 158: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 22

Example (Course pack)

Let X and Y be i.i.d. Exp(4) r.v.s. Find the joint density of U and V if

U =1

2(X − Y ) and V = Y .

We know that y > 0. Since v = y , it immediately follows that v > 0.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 108 / 116

Page 159: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 22

Example (Course pack)

Let X and Y be i.i.d. Exp(4) r.v.s. Find the joint density of U and V if

U =1

2(X − Y ) and V = Y .

We know that y > 0. Since v = y , it immediately follows that v > 0.However, x > 0 and x = 2u + v . Therefore:

2u + v > 0

u > −v

2

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 108 / 116

Page 160: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 23

Example

Let X and Y be i.i.d. Geom(p). Use convolutions to find the probabilityfunction of Z := X + Y .

The probability functions are P(X = x) = p(1− p)x for x = 1, 2, 3, . . .,and P(Y = y) = p(1− p)y for y = 1, 2, 3, . . .. Therefore:

P(X = z − y) = p(1− p)z−y

for z − y = 1, 2, 3, . . .,

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 109 / 116

Page 161: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 23

Example

Let X and Y be i.i.d. Geom(p). Use convolutions to find the probabilityfunction of Z := X + Y .

The probability functions are P(X = x) = p(1− p)x for x = 1, 2, 3, . . .,and P(Y = y) = p(1− p)y for y = 1, 2, 3, . . .. Therefore:

P(X = z − y) = p(1− p)z−y

for z − y = 1, 2, 3, . . ., i.e.

y − z = . . . ,−3,−2,−1 ⇐⇒ y = . . . , z − 3, z − 2, z − 1

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 109 / 116

Page 162: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 23

Example

Let X and Y be i.i.d. Geom(p). Use convolutions to find the probabilityfunction of Z := X + Y .

Hence P(X = z − y)P(Y = y) = p(1− p)z−yp(1− p)y = p2(1− p)z , when

y = 0, 1, 2, . . .

and y = . . . , z − 3, z − 2, z − 1.

Therefore, y = 0, 1, 2, . . . , z − 3, z − 2, z − 1.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 109 / 116

Page 163: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 23

Example

Let X and Y be i.i.d. Geom(p). Use convolutions to find the probabilityfunction of Z := X + Y .

∴ P(Z = z) =z−1∑y=0

p2(1− p)z

= zp2(1− p)z (sum only depends on y !)

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 109 / 116

Page 164: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 23

Example

Let X and Y be i.i.d. Geom(p). Use convolutions to find the probabilityfunction of Z := X + Y .

∴ P(Z = z) =z−1∑y=0

p2(1− p)z

= zp2(1− p)z (sum only depends on y !)

Since x = 1, 2, . . . and y = 1, 2, . . ., i.e. x and y are natural numbersgreater than or equal to 1, z = x + y = 2, 3, 4, . . .

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 109 / 116

Page 165: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 24

Example

Let X and Y be i.i.d. Exp(1). Prove that Z := X + Y follows aGamma(2, 1) distribution using a convolution.

The densities are fX (x) = e−x for x > 0, and fY (y) = e−y for y > 0.Therefore:

fX (z − y) = e−z+y , for z − y > 0 , i.e. y < z

Hence fX (z − y)fY (y) = e−z when y < z and y > 0. i.e.

fX (z − y)fY (y) = e−z for 0 < y < z

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 110 / 116

Page 166: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 24

Example

Let X and Y be i.i.d. Exp(1). Prove that Z := X + Y follows aGamma(2, 1) distribution using a convolution.

The densities are fX (x) = e−x for x > 0, and fY (y) = e−y for y > 0.Therefore:

fX (z − y) = e−z+y , for z − y > 0 , i.e. y < z

Hence fX (z − y)fY (y) = e−z when y < z and y > 0. i.e.

fX (z − y)fY (y) = e−z for 0 < y < z

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 110 / 116

Page 167: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 24

Example

Let X and Y be i.i.d. Exp(1). Prove that Z := X + Y follows aGamma(2, 1) distribution using a convolution.

∴ fZ (z) =

∫ z

0e−z dy

= e−zz

=e−z/1z2−1

Γ(2)12

Since x > 0 and y > 0, z = x + y > 0. Thus Z has the density of aGamma(2,1) random variable.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 110 / 116

Page 168: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 24

Example

Let X and Y be i.i.d. Exp(1). Prove that Z := X + Y follows aGamma(2, 1) distribution using a convolution.

∴ fZ (z) =

∫ z

0e−z dy

= e−zz

=e−z/1z2−1

Γ(2)12

Since x > 0 and y > 0, z = x + y > 0. Thus Z has the density of aGamma(2,1) random variable.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 110 / 116

Page 169: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 25

Theorem (MGF of a sum)

If X and Y are independent random variables, then

mX+Y (u) = mX (u)mY (u)

Example

Let X and Y be i.i.d. Exp(1). Prove that Z := X + Y follows aGamma(2, 1) distribution from quoting MGFs.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 111 / 116

Page 170: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 25

Example

Let X and Y be i.i.d. Exp(1). Prove that Z := X + Y follows aGamma(2, 1) distribution from quoting MGFs.

mX (u) = 11−u and mY (u) = 1

1−u . So clearly

mZ (u) = mX (u)mY (u) =

(1

1− u

)2

,

which is the MGF of a Gamma(2,1) distribution. Hence Z follows thisdistribution as well.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 111 / 116

Page 171: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 26

Example

Let X1, . . . ,Xn be a sequence of i.i.d. Unif(0, 1) random variables. Define

Yn = nminU1, . . . ,Un. Prove that Ynd→ Y , where Y ∼ Exp(1).

FYn(y) = P(Yn ≤ y) = P(nminU1, . . . ,Un ≤ y)

= P(

minU1, . . . ,Un ≤y

n

)

= 1− P(

minU1, . . . ,Un ≥y

n

)

= 1− P(U1 >

y

n, . . . ,Un >

y

n

)

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 112 / 116

Page 172: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 26

Example

Let X1, . . . ,Xn be a sequence of i.i.d. Unif(0, 1) random variables. Define

Yn = nminU1, . . . ,Un. Prove that Ynd→ Y , where Y ∼ Exp(1).

FYn(y) = P(Yn ≤ y) = P(nminU1, . . . ,Un ≤ y)

= P(

minU1, . . . ,Un ≤y

n

)= 1− P

(minU1, . . . ,Un ≥

y

n

)

= 1− P(U1 >

y

n, . . . ,Un >

y

n

)

In general, if minx1, . . . , xn ≤ x , then not every xi ≤ x .

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 112 / 116

Page 173: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 26

Example

Let X1, . . . ,Xn be a sequence of i.i.d. Unif(0, 1) random variables. Define

Yn = nminU1, . . . ,Un. Prove that Ynd→ Y , where Y ∼ Exp(1).

FYn(y) = P(Yn ≤ y) = P(nminU1, . . . ,Un ≤ y)

= P(

minU1, . . . ,Un ≤y

n

)= 1− P

(minU1, . . . ,Un ≥

y

n

)= 1− P

(U1 >

y

n, . . . ,Un >

y

n

)But it is true that if minU1, . . . ,Un ≥ x , then every xi ≥ x .

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 112 / 116

Page 174: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 26

Example

Let X1, . . . ,Xn be a sequence of i.i.d. Unif(0, 1) random variables. Define

Yn = nminU1, . . . ,Un. Prove that Ynd→ Y , where Y ∼ Exp(1).

FYn(y) = 1− P(U1 >

y

n, . . . ,Un >

y

n

)= 1− P

(U1 >

y

n

). . .P

(Un >

y

n

)(independence)

= 1−[P(U1 >

y

n

)]n(id. distributed)

= 1−

[∫ 1

y/n1 dt

]n= 1−

(1− y

n

)n

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 112 / 116

Page 175: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 26

Example

Let X1, . . . ,Xn be a sequence of i.i.d. Unif(0, 1) random variables. Define

Yn = nminU1, . . . ,Un. Prove that Ynd→ Y , where Y ∼ Exp(1).

FYn(y) = 1− P(U1 >

y

n, . . . ,Un >

y

n

)= 1− P

(U1 >

y

n

). . .P

(Un >

y

n

)(independence)

= 1−[P(U1 >

y

n

)]n(id. distributed)

= 1−

[∫ 1

y/n1 dt

]n= 1−

(1− y

n

)n

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 112 / 116

Page 176: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 26

Example

Let X1, . . . ,Xn be a sequence of i.i.d. Unif(0, 1) random variables. Define

Yn = nminU1, . . . ,Un. Prove that Ynd→ Y , where Y ∼ Exp(1).

∴ limn→∞

FYn(y) = 1− e−y = FY (y)

Hence Ynd→ Y .

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 112 / 116

Page 177: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 27

Example (Libo’s notes)

Australians have average weight about 68 kg and variance about 16 kg2.Suppose 40 random Australians are chosen. What is the (approximate)probability that the average weight of these Australians is over 80?

Let X1, . . . ,X40 be the weights of the Australians.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 113 / 116

Page 178: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 27

Example (Libo’s notes)

Australians have average weight about 68 kg and variance about 16 kg2.Suppose 40 random Australians are chosen. What is the (approximate)probability that the average weight of these Australians is over 80?

Let X1, . . . ,X40 be the weights of the Australians. Then n = 40, µ = 68and σ = 4, so by the CLT:

X − 68

4/√

40

d→ Z

where Z ∼ N (0, 1).

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 113 / 116

Page 179: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 27

Example (Libo’s notes)

Australians have average weight about 68 kg and variance about 16 kg2.Suppose 40 random Australians are chosen. What is the (approximate)probability that the average weight of these Australians is over 80?

∴ P(X40 > 80) = P(X40 − 68

4/√

40>

80− 68

4/√

40

)≈ P

(Z >

80− 68

4/√

40

)

= P(Z > 3√

40)

= 1-pnorm(3*sqrt(40))

or pnorm(3*sqrt(40), lower.tail=FALSE)

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 113 / 116

Page 180: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 27

Example (Libo’s notes)

Australians have average weight about 68 kg and variance about 16 kg2.Suppose 40 random Australians are chosen. What is the (approximate)probability that the average weight of these Australians is over 80?

∴ P(X40 > 80) = P(X40 − 68

4/√

40>

80− 68

4/√

40

)≈ P

(Z >

80− 68

4/√

40

)= P(Z > 3

√40)

= 1-pnorm(3*sqrt(40))

or pnorm(3*sqrt(40), lower.tail=FALSE)

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 113 / 116

Page 181: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 28

Lemma (Normal Approximation to Binomial)

Let X ∼ Bin(n, p), which is a sum of n independent Ber(p) r.v.s. Then

X − np√np(1− p)

d→ N (0, 1)

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 114 / 116

Page 182: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 28

Example

An unfortunate soul decided to sit his exam despite having a migraine andthe flu. Fortunately, it was not a university exam, and the paper involvedonly 200 multiple choice questions with 5 options. Therefore, he randomlyguesses every answer. What is the (approximate) probability he fails?

Let X be how many he gets correct. Then X ∼ Bin(200, 1

5

).

We may approximate X with Y ∼ N (40, 32). Then,

P(X < 100) ≈ P(Y < 100)

= P(Y − 40√

32<

100− 40√32

)= P

(Z <

60√32

)= P(Z < 10.6066)

Oh my...

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 114 / 116

Page 183: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 28

Example

An unfortunate soul decided to sit his exam despite having a migraine andthe flu. Fortunately, it was not a university exam, and the paper involvedonly 200 multiple choice questions with 5 options. Therefore, he randomlyguesses every answer. What is the (approximate) probability he fails?

Let X be how many he gets correct. Then X ∼ Bin(200, 1

5

).

We may approximate X with Y ∼ N (40, 32). Then,

P(X < 100) ≈ P(Y < 100)

= P(Y − 40√

32<

100− 40√32

)= P

(Z <

60√32

)= P(Z < 10.6066)

Oh my...

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 114 / 116

Page 184: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 28

Example

An unfortunate soul decided to sit his exam despite having a migraine andthe flu. Fortunately, it was not a university exam, and the paper involvedonly 200 multiple choice questions with 5 options. Therefore, he randomlyguesses every answer. What is the (approximate) probability he fails?

Let X be how many he gets correct. Then X ∼ Bin(200, 1

5

).

We may approximate X with Y ∼ N (40, 32). Then,

P(X < 100) ≈ P(Y < 100)

= P(Y − 40√

32<

100− 40√32

)= P

(Z <

60√32

)= P(Z < 10.6066)

Oh my...

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 114 / 116

Page 185: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 28

Example

An unfortunate soul decided to sit his exam despite having a migraine andthe flu. Fortunately, it was not a university exam, and the paper involvedonly 200 multiple choice questions with 5 options. Therefore, he randomlyguesses every answer. What is the (approximate) probability he fails?

Let X be how many he gets correct. Then X ∼ Bin(200, 1

5

).

We may approximate X with Y ∼ N (40, 32). Then,

P(X < 100) ≈ P(Y < 100)

= P(Y − 40√

32<

100− 40√32

)= P

(Z <

60√32

)= P(Z < 10.6066) Oh my...

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 114 / 116

Page 186: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 29

Example (Libo’s notes)

Let X1,X2, ... be a sequence of i.i.d random variables with mean 2 andvariance 7. Obtain a large sample approximation for the distribution of(Xn)3.

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 115 / 116

Page 187: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 29

The CLT gives√n(Xn − 2)

d→ N(0, 7)

Applying the Delta Method with g(x) = x3 leads to g ′(x) = 3x2 and then

√n[(X 3

n )− 23]d→ N(0, 7 · (3 · 22)2).

Simplifying, we have

√n[((X )3

n − 8]d→ N(0, 1008).

Thus, for large n, the approximate distribution of (Xn)3 is N(8, 1008n ).

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 115 / 116

Page 188: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 30

Example (More 2801 focused)

The Riemann zeta function is defined for complex s with real part greaterthan 1 by the absolutely convergent infinite series

ζ(s) =∞∑n=1

1

ns.

Prove that the real part of every non-trivial zero of the Riemann zetafunction is 1

2 .

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 116 / 116

Page 189: MATH2901 Revision Seminar - mathsoc.unsw.edu.au€¦ · Table of Contents 1 Basic Probability Theory 2 Review of Random Variables 3 Computations for Common Distributions 4 Inequalities

Example 30

Jeffrey Yang (UNSW Society of Statistics/UNSW Mathematics Society)MATH2901 Revision Seminar August 2020 116 / 116