06 - Collections of Random Variables

Outline
• Random Samples
• What is a Statistic
• Sample Mean
• Sample Variance
• Expectations in Samples
• Important Inequalities
• General Idea of Asymptotics
• Types of Convergence
• LLN
• CLT
• Delta Method
• References
Collections of Random Variables
Brian Vegetabile, Dustin Pluta
2018 Statistics Bootcamp, Department of Statistics
University of California, Irvine
September 14th, 2018
Considering Multiple Random Variables I
• While random variables are interesting in themselves, most of statistics revolves around collections of random variables.
• We need the tools to discuss the relationships between the random variables in a sample.
Considering Multiple Random Variables II
• Ultimately we will want to make use of theorems and properties of random samples to perform inference on parameters of the distributions underlying a sample.
• For example, a statistic such as the sample mean will be used to make inferences about the population mean.
• We often rely on approximations that are justified by “large” samples.
Defining a Random Sample I
• We begin by defining a random sample.

Definition (Random Sample)
Consider n random variables X1, X2, . . . , Xn. These random variables form a random sample if each Xi is independent of all the others and the marginal pmf or pdf of each is the same function f. Such random variables are said to be independent and identically distributed (iid).

Definition (Sample Size)
The number of random variables, n, in a random sample is referred to as the sample size.
Defining a Random Sample II
• One way to think about a random sample is that there is some large or infinite ‘population’ from which each random variable is selected.
• In this population, each random variable is generated using the same density or mass function f.
• The sample size n is the number of random variables selected from that population.
Examples of Random Samples
• Consider 100 flips of a fair coin. Each coin flip can be considered as a random variable from a Bernoulli distribution with success probability p = 0.5.
• Consider the height of students in high schools around the country. It may be reasonable to assume that the heights of these students come from a population where height is represented as a normal distribution centered at some average µ with variance σ².
• Consider measuring 250 failure times for light bulbs from a single production line. The time to failure may be assumed to come from a population of light bulbs where failure time can be modeled as an exponential distribution with rate parameter λ.

The main idea is ‘repeated observations’ of the same phenomenon.
Joint Distribution of a Sample
• Based upon the definition of a random sample, we can construct the joint distribution, which represents the probability distribution of the sample.
• Assuming a parameterized probability function f(xi|θ) and letting x = (x1, . . . , xn), then

  f(x|θ) = f(x1|θ) f(x2|θ) · · · f(xn|θ) = ∏_{i=1}^n f(xi|θ)

• Unfortunately, this joint distribution will not always be nice to work with.
• Additionally, we may actually be interested in the distribution of a function of the random variables.
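The product factorization above translates directly into code. A minimal Python sketch (the deck's demos are in R, so Python is an assumption of convenience; the exponential density, λ value, and sample values below are illustrative choices, using the mean parameterization f(x|λ) = (1/λ)exp(−x/λ) that appears later in the slides):

```python
import math

def exp_pdf(x, lam):
    # Exponential density with mean lam: f(x | lam) = (1/lam) * exp(-x / lam)
    return math.exp(-x / lam) / lam

def joint_pdf(xs, lam):
    # Joint density of an iid sample: the product of the marginal densities
    p = 1.0
    for x in xs:
        p *= exp_pdf(x, lam)
    return p

sample = [0.5, 1.2, 2.0]
print(joint_pdf(sample, lam=1.0))  # equals exp(-(0.5 + 1.2 + 2.0))
```

With lam = 1 each factor is e^{−x}, so the joint density collapses to e^{−∑x_i}, mirroring the factorized formula.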
Defining Statistics & Sampling Distributions I
• Similar to the definition of a random variable, we provide a few definitions.

Definition (Statistic (DeGroot))
Suppose that the observable random variables of interest are X1, . . . , Xn. Let r be an arbitrary real-valued function of n real variables. Then the random variable T = r(X1, . . . , Xn) is called a statistic.

Definition (Statistic (Dudewicz))
Any function of the random variables that are being observed, say tn(X1, X2, . . . , Xn), is called a statistic. Further, since X1, X2, . . . , Xn are random variables, it is itself a random variable.
Defining Statistics & Sampling Distributions II
Definition (Statistic (Casella))
Let X1, X2, . . . , Xn be a random sample of size n from a population and let T(x1, . . . , xn) be a real-valued or vector-valued function whose domain includes the sample space of (X1, X2, . . . , Xn). Then the random variable or random vector Y = T(X1, . . . , Xn) is called a statistic. The probability distribution of a statistic Y is called the sampling distribution of Y.
Defining Statistics & Sampling Distributions III
• The major takeaway from all of these definitions is that a statistic is a function of random variables.
• Since it is a function of random variables, it is also a random variable and therefore has a distribution!
• Often we will specifically be interested in the distribution of the statistic.
• Since these distributions will often be unknown, we will look to find approximations for them.
Sums of Random Variables & the Sample Mean
• A primary statistic involves the sum of the random variables, and functions of these sums.
• That is,

  T(X1, X2, . . . , Xn) = X1 + X2 + · · · + Xn = ∑_{i=1}^n Xi

• The sample mean is a function of a sum of random variables:

  X̄ = (1/n) ∑_{i=1}^n Xi
Sample Variance
• Another statistic is the sample variance, defined as

  S² = (1/(n − 1)) ∑_{i=1}^n (Xi − X̄)²

• The sample standard deviation is defined as S = √S².
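The two statistics above are simple to compute directly. A minimal Python sketch (the function names and example data are our own, not from the slides):

```python
import math

def sample_mean(xs):
    # Xbar = (1/n) * sum of the observations
    return sum(xs) / len(xs)

def sample_variance(xs):
    # S^2 with the n - 1 denominator, as defined on the slide
    xbar = sample_mean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)

data = [2.0, 4.0, 6.0]
print(sample_mean(data))                  # 4.0
print(sample_variance(data))              # 4.0
print(math.sqrt(sample_variance(data)))   # S = 2.0
```

For data [2, 4, 6]: X̄ = 4 and S² = ((−2)² + 0² + 2²)/2 = 4, so S = 2.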
Expectations of Sums of Random Variables I
• Claim: Let X1, X2, . . . , Xn be a random sample and let g(x) be a function such that E(g(X1)) and Var(g(X1)) exist. Then

  E(∑_{i=1}^n g(Xi)) = n E(g(X1))

and

  Var(∑_{i=1}^n g(Xi)) = n Var(g(X1))
Expectations of Sums of Random Variables II
• Claim: Let X1, X2, . . . , Xn be a random sample and let g(x) be a function such that E(g(X1)) exists. Then E(∑_{i=1}^n g(Xi)) = n E(g(X1)).
• (Demonstrate the full proof.)
• Short “proof”:

  E(∑_{i=1}^n g(Xi)) = ∑_{i=1}^n E(g(Xi)) = n E(g(X1))
Expectations of Sums of Random Variables III
• Now consider the sample mean, and assume that the mean of the population is µ and the variance of the population is σ².
• This implies that

  E(X̄) = E((1/n) ∑_{i=1}^n Xi)
        = (1/n) E(∑_{i=1}^n Xi)
        = (1/n) · n E(X1)
        = E(X1) = µ
Expectations of Sums of Random Variables IV
• Additionally, we can find the variance of the sample mean:

  Var(X̄) = Var((1/n) ∑_{i=1}^n Xi)
         = (1/n²) Var(∑_{i=1}^n Xi)
         = (1/n²) · n Var(X1)
         = (1/n) Var(X1) = σ²/n

• These two results imply that, regardless of the distribution of the statistic itself, we know specific properties of that distribution!
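The two properties E(X̄) = µ and Var(X̄) = σ²/n can be checked by simulation. A Python sketch under an illustrative Uniform(0, 1) population (µ = 1/2, σ² = 1/12; the sample size, seed, and replication count are arbitrary choices, not from the slides):

```python
import random
import statistics

random.seed(1)
n, reps = 25, 20000
# Simulate many sample means from a Uniform(0, 1) population
means = [statistics.fmean(random.random() for _ in range(n)) for _ in range(reps)]
print(statistics.fmean(means))      # close to mu = 0.5
print(statistics.variance(means))   # close to sigma^2 / n = (1/12)/25 ~ 0.00333
```

The empirical mean of the sample means sits near µ, and their empirical variance near σ²/n, exactly as the derivations predict.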
MGF’s for Sample Means I
• We can also calculate the MGF of the sample mean. Specifically, define Y = (1/n) ∑_{i=1}^n Xi, where the Xi are from a random sample (iid)...

  M_Y(t) = E(e^{tY}) = E(e^{(t/n) ∑_{i=1}^n Xi})
MGF’s for Sample Means II
  M_Y(t) = E(e^{(t/n) ∑_{i=1}^n Xi})
         = ∫ · · · ∫ e^{(t/n) ∑_{i=1}^n xi} f(x1, . . . , xn) dx1 . . . dxn
         = ∫ · · · ∫ ∏_{i=1}^n e^{(t/n) xi} ∏_{i=1}^n f(xi) dx1 . . . dxn
         = [M_X(t/n)]^n
Example MGFs of the Sample Mean of Exponential Random Variables I
• The last result is mainly useful if we already know the distribution of the underlying observations in the sample.
• Consider a random sample of size n, X1, X2, . . . , Xn, from an exponential distribution (parameterized here by its mean λ, matching the density used later) with MGF

  M_X(t) = 1/(1 − λt)
Example MGFs of the Sample Mean of Exponential Random Variables II
therefore

  M_X(t/n) = 1/(1 − (λ/n)t)

and

  M_X̄(t) = (1/(1 − (λ/n)t))^n

• Looking this up, we see this is the MGF of a Gamma(n, λ/n) distribution.
• We’ll compare this with some approximations later.
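The Gamma(n, λ/n) result can be checked by simulation: a Gamma(n, λ/n) distribution (shape n, scale λ/n) has mean λ and variance λ²/n. A Python sketch (λ = 2, n = 10, the seed, and the replication count are illustrative choices; note random.expovariate takes the rate 1/λ):

```python
import random
import statistics

random.seed(2)
lam, n, reps = 2.0, 10, 20000
# Sample means of n iid exponentials with mean lam
means = [statistics.fmean(random.expovariate(1 / lam) for _ in range(n))
         for _ in range(reps)]
# Gamma(n, lam/n) has mean lam and variance lam^2 / n
print(statistics.fmean(means))      # close to 2.0
print(statistics.variance(means))   # close to 4/10 = 0.4
```

The simulated mean and variance of X̄ match the moments of the Gamma(n, λ/n) distribution identified from the MGF.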
Bounding Probabilities on Statistics I
• The true distribution of a statistic is often difficult to obtain. Often there are simple methods to get bounds on probability statements regarding the statistic.
• Consider Markov's Inequality:

Theorem (Markov's Inequality)
Suppose that X is a random variable such that P(X ≥ 0) = 1. Then for every real number t > 0,

  Pr(X ≥ t) ≤ E(X)/t

• DeGroot and Schervish state that the Markov inequality is primarily of interest for large values of t.
Bounding Probabilities on Statistics II
• Related to Markov's Inequality, and often more useful, is Chebyshev's Inequality.

Theorem (Chebyshev's Inequality)
Let X be a random variable for which Var(X) exists. Then for every real number t > 0,

  Pr(|X − E(X)| ≥ t) ≤ Var(X)/t²

• The proof follows from Markov's Inequality by considering the random variable Y = (X − E(X))².
Bounding Probabilities on Statistics III
• If we consider the sample mean and apply Chebyshev's Inequality, it follows that

  Pr(|X̄ − µ| ≥ t) ≤ σ²/(nt²)

• This can be a very useful inequality for bounding probabilities, or helping to choose sample sizes.
Application: Utilizing Chebyshev’s Inequality
• From DeGroot and Schervish: consider a random variable X with variance σ², and take t = 3σ. Then by Chebyshev's Inequality,

  Pr(|X − E(X)| ≥ 3σ) ≤ σ²/(3σ)² = 1/9 ≈ 0.11

• This implies that the probability that a random variable will differ from its mean by 3 or more standard deviations is at most 1/9 ≈ 0.11.
Application: Chebyshev’s and the Sample Mean I
DeGroot and Schervish, Example 6.2.1 - Determining the Required Number of Observations
• Suppose that a random sample is to be taken from a distribution for which the value of the mean µ is not known, but for which it is known that the standard deviation σ is 2 units or less. We shall determine how large the sample size must be in order to make the probability at least 0.99 that |X̄ − µ| will be less than 1 unit.
Application: Chebyshev’s and the Sample Mean II
• Since σ² ≤ 2² = 4, it follows that for every sample of size n,

  Pr(|X̄ − µ| ≥ 1) ≤ 4/n

Further, the problem statement requires Pr(|X̄ − µ| < 1) ≥ 0.99, equivalently Pr(|X̄ − µ| ≥ 1) ≤ 0.01. Thus we need 4/n ≤ 0.01, which implies that n = 400 observations suffice.
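The sample-size calculation above can be sketched in code, together with an empirical check of the bound. A Python sketch (the N(0, σ²) population and seed are illustrative assumptions; Chebyshev's bound itself holds for any population with σ ≤ 2):

```python
import math
import random
import statistics

# Chebyshev: Pr(|Xbar - mu| >= eps) <= sigma^2 / (n * eps^2); want this <= 0.01
sigma, eps, target = 2.0, 1.0, 0.01
n = math.ceil(sigma ** 2 / (eps ** 2 * target))
print(n)  # 400

# Empirical check with an illustrative N(0, sigma^2) population
random.seed(3)
reps = 2000
hits = sum(abs(statistics.fmean(random.gauss(0, sigma) for _ in range(n))) >= eps
           for _ in range(reps))
print(hits / reps)  # far below 0.01 here: Chebyshev is a conservative bound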
General Idea of Asymptotics
• While the true distribution of a given statistic may be unknown, there are often ways to approximate the distribution ‘as n grows large’.
• That is, we may be able to make some statements about the distribution of the statistic in the limit.
• This is where the Law of Large Numbers, the Central Limit Theorem, and the Delta Method come into play.
• We first review a few types of convergence.
Types of Convergence
• We’ll be interested in what happens to the distribution of the statistic as the sample size grows to infinity.
• There are three main types of convergence we’ll outline today:
  • Convergence in Law (or Distribution)
  • Convergence in Probability
  • Almost-Sure Convergence
Convergence in Distribution
• We define our first mode of convergence: convergence in distribution, or convergence in law.

Definition (Convergence in Distribution)
A sequence of random variables X1, X2, . . . converges in distribution to a random variable X if

  lim_{n→∞} F_{Xn}(x) = F_X(x)

for all points x where F_X(x) is continuous. Denoted Xn →L X or Xn →D X.

• We see here that we are first talking about distribution functions converging to another distribution function.
• This is fundamentally different from the next few types of convergence.
Convergence in Probability and Almost Surely I
• A somewhat ‘weak’ mode of convergence is outlined below.

Definition (Convergence in Probability)
A sequence of random variables X1, X2, . . . converges in probability to a random variable X if, for every ε > 0,

  lim_{n→∞} P(|Xn − X| ≥ ε) = 0

denoted Xn →P X.

• Notice that the form of this looks very similar to some of the inequalities that we have demonstrated...
Convergence in Probability and Almost Surely II
• A stronger version of convergence is almost-sure convergence.

Definition (Almost-Sure Convergence)
A sequence of random variables X1, X2, . . . converges almost surely to a random variable X if, for every ε > 0,

  P(lim_{n→∞} |Xn − X| < ε) = 1

denoted Xn →a.s. X.

• This is sometimes referred to as convergence with probability 1.
Example - Converges Almost Surely
Dudewicz and Mishra, Example 6.2.6
• Let Xn be a sequence of random variables defined by

  Xn = 0 with probability 1 − (1/2)^n
  Xn = 1 with probability (1/2)^n

for n = 1, 2, 3, . . . . Then it can be shown that P(lim_{n→∞} Xn = 0) = 1; hence Xn →a.s. 0.
Example - Converges in Probability I
Dudewicz and Mishra, Example 6.2.6
• Let Xn be a sequence of random variables defined by

  Xn = 0 with probability 1 − (1/2)^n
  Xn = 1 with probability (1/2)^n

for n = 1, 2, 3, . . . . To show convergence in probability, we can appeal to Markov's Inequality...
Example - Converges in Probability II
• We see that E(Xn) = (1/2)^n and E(Xn²) = (1/2)^n.
• Therefore Var(Xn) = (2^n − 1)/2^{2n}.
• Applying Markov's Inequality to Xn², we have for every ε > 0

  Pr(|Xn| ≥ ε) = Pr(Xn² ≥ ε²) ≤ E(Xn²)/ε² = 1/(2^n ε²)

and therefore

  lim_{n→∞} P(|Xn| ≥ ε) = 0

• Therefore the sequence Xn converges in probability to a random variable X that is “degenerate at zero” (takes the value 0 with probability 1).
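The example above can be illustrated by simulation: estimate P(Xn = 1) = (1/2)^n for several n and watch it vanish. A Python sketch (the seed, replication count, and choice of n values are arbitrary):

```python
import random

random.seed(4)
reps = 100000
freqs = {}
# X_n = 1 with probability (1/2)^n, else 0; estimate P(X_n = 1) empirically
for n in (1, 3, 6, 10):
    freqs[n] = sum(random.random() < 0.5 ** n for _ in range(reps)) / reps
    print(n, freqs[n], 0.5 ** n)  # empirical frequency tracks (1/2)^n
```

As n grows, the empirical frequency of Xn = 1 shrinks toward zero, matching the convergence-in-probability claim.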
Weak Law of Large Numbers I
• Understanding convergence principles will allow us to understand properties of the sampling distribution as the sample size grows.
• The first result of major importance is the Weak Law of Large Numbers.

Theorem (Weak Law of Large Numbers)
Suppose that X1, . . . , Xn form a random sample from a distribution for which the mean is µ and the variance σ² exists. Let X̄n denote the sample mean. Then

  X̄n →P µ
Weak Law of Large Numbers II
• Proof: By Chebyshev's Inequality,

  Pr(|X̄n − µ| < ε) ≥ 1 − σ²/(nε²)

Hence

  lim_{n→∞} Pr(|X̄n − µ| < ε) = 1

showing the result.
• This result says that with high probability X̄n is close to the value µ if the sample size is large.
• This also suggests that if a large sample is taken from an unknown distribution, the sample mean will be a good approximation of the population mean with high probability.
Weak Law of Large Numbers R Demo
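The slides leave this slot for an R demo. A minimal Python sketch of the same idea, the running mean of iid draws settling at µ (the Exponential population with mean 2, the seed, and the checkpoints are illustrative assumptions, not from the slides):

```python
import random

random.seed(5)
# WLLN sketch: the running mean of iid draws converges in probability to mu.
# Illustrative population: Exponential with mean mu = 2.
mu = 2.0
total = 0.0
checkpoints = {}
for i in range(1, 100001):
    total += random.expovariate(1 / mu)
    if i in (10, 100, 1000, 10000, 100000):
        checkpoints[i] = total / i
for n in sorted(checkpoints):
    print(n, round(checkpoints[n], 4))  # Xbar_n drifts toward mu = 2
```

Early checkpoints can wander, but by n = 100000 the running mean sits very close to µ, as the WLLN predicts.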
Central Limit Theorem I
• The WLLN is a great start for understanding the distribution of the sample mean, but fortunately we can actually do better!

Theorem (Central Limit Theorem)
Let X1, X2, . . . be a sequence of independent and identically distributed random variables with mean µ and finite variance σ². Then

  √n (X̄n − µ) →L N(0, σ²)
Central Limit Theorem II
• Sketch of proof: Use moment generating functions or characteristic functions to find the MGF of Y = √n (X̄n − µ). Expand this function in a Taylor series and show that it converges to the MGF of a normal distribution with variance σ².
• There are additional versions of the Central Limit Theorem that relax some of the assumptions; they will be introduced throughout the year.
Example - Distribution of the Sample Mean of Exponential Random Variables I
• The Central Limit Theorem may be one of the most important results in all of statistics.
• Consider the random sample X1, X2, . . . , Xn where Xi ∼ Exponential(λ). That is,

  f(x|λ) = (1/λ) exp{−x/λ}

Find the distribution of the sample mean from such a sample.
Example - Distribution of the Sample Mean of Exponential Random Variables II
• Appealing to the Central Limit Theorem, we know E(Xi) = λ exists, and further that the variance Var(Xi) = λ² is finite.
• This implies that

  √n (X̄n − λ) →L N(0, λ²)

• Thus we can say that

  X̄n ∼̇ N(λ, λ²/n)
Central Limit Theorem Example
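The slides leave this slot for an example. Here is a Python sketch of the CLT for the exponential sample mean above: the standardized sample means √n(X̄ − λ)/λ should land in (−1.96, 1.96) roughly 95% of the time (λ = 2, n = 50, the seed, and the replication count are illustrative choices):

```python
import math
import random
import statistics

random.seed(6)
lam, n, reps = 2.0, 50, 20000
inside = 0
for _ in range(reps):
    xbar = statistics.fmean(random.expovariate(1 / lam) for _ in range(n))
    z = math.sqrt(n) * (xbar - lam) / lam   # CLT: approximately N(0, 1)
    inside += abs(z) <= 1.96
print(inside / reps)  # close to 0.95
```

Even though each Xi is heavily skewed, the standardized mean already behaves nearly normally at n = 50.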
Normal Approximation to the Binomial Distribution I
• In defining the binomial distribution, we stated that it could be thought of as the sum of independent Bernoulli trials with success probability p.
• We can attempt to approximate the binomial distribution using the Central Limit Theorem...
• First, for each Bernoulli trial, E(X) = p and Var(X) = p(1 − p); therefore

  √n (X̄n − p) →L N(0, p(1 − p))

based on the CLT.
Normal Approximation to the Binomial Distribution II
• This implies that

  √n ((1/n) ∑_{i=1}^n Xi − p) →L N(0, p(1 − p))

Factoring out the 1/n, this becomes

  (√n/n) (∑_{i=1}^n Xi − np) →L N(0, p(1 − p))
Normal Approximation to the Binomial Distribution III
• Now defining Y = ∑_{i=1}^n Xi (a binomial random variable), we have

  (1/√n) (Y − np) →L N(0, p(1 − p))

• Rearranging, this implies that

  Y ∼̇ N(np, np(1 − p))
Normal Approximation to the Binomial Distribution IV
• Why is such an approximation useful?
• Recall the mass function of the binomial distribution:

  f(x|p) = (n choose x) p^x (1 − p)^{n−x}

• (n choose x) = n!/(x!(n − x)!), which can become computationally challenging for large n, whereas the density of the normal distribution is relatively easy to calculate computationally.
• Therefore these approximations can become very useful throughout statistics.
Normal Approximation to Binomial R Example
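The slides leave this slot for an R example. A Python sketch comparing the exact Binomial(n, p) pmf with the N(np, np(1 − p)) density at a few points (n = 100 and p = 0.5 are illustrative choices):

```python
import math

def binom_pmf(x, n, p):
    # Exact binomial mass function: (n choose x) p^x (1-p)^(n-x)
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

def normal_pdf(x, mean, var):
    # Normal density with the matching mean and variance
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

n, p = 100, 0.5
for x in (40, 50, 60):
    print(x, binom_pmf(x, n, p), normal_pdf(x, n * p, n * p * (1 - p)))
```

At each x the two values agree to about three decimal places, which is why the normal density is a practical stand-in when the factorials in the binomial coefficient become unwieldy.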
The Delta Method I
• The Central Limit Theorem is powerful in that it allows us to talk about the distribution of the sample mean.
• What if we're interested in more complicated functions of the sample mean?
• This is where the Delta Method comes into play.
• We'll provide an informal derivation of the delta method.
The Delta Method II
• Consider X1, X2, . . . , Xn, which form a random sample from a distribution that has a finite mean µ and finite variance σ².
• By the CLT, √n (X̄n − µ) →L N(0, σ²).
• Now suppose there is some function g and we would like to approximate the distribution of g(X̄n).
The Delta Method III
• The delta method works by taking a Taylor series expansion of g(X̄n) at the mean of the distribution, that is,

  g(X̄n) ≈ g(µ) + g′(µ)(X̄n − µ) + . . .

and ignoring the higher-order terms.
The Delta Method IV
• Therefore

  g(X̄n) − g(µ) ≈ g′(µ)(X̄n − µ)
  √n (g(X̄n) − g(µ)) ≈ g′(µ) √n (X̄n − µ)

• We know √n (X̄n − µ) →L N(0, σ²), which implies that

  √n (g(X̄n) − g(µ)) →L N(0, (g′(µ))² σ²)
Example - Delta Method I
Ferguson, Chapter 7, Example 1
• Consider a random sample with mean µ and variance σ², so that by the CLT, √n (X̄n − µ) →L N(0, σ²). What is the distribution of X̄n²?
Example - Delta Method II
• Here g(X̄n) = X̄n², so g′(x) = 2x and thus g′(µ) = 2µ. Utilizing the delta method formula, we have that

  √n (X̄n² − µ²) →L N(0, 4µ²σ²)

• Notice that if µ = 0 the limit becomes a degenerate random variable, and thus this approximation may not be useful...
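The delta method result can be checked by simulation: Var(X̄n²) should be close to 4µ²σ²/n, and E(X̄n²) close to µ². A Python sketch under an illustrative Exponential population with mean 2 (so µ = 2, σ² = 4, and 4µ²σ²/n = 0.64 with n = 100; seed and replication count are arbitrary):

```python
import random
import statistics

random.seed(7)
mu, n, reps = 2.0, 100, 20000
# Squared sample means of n iid exponentials with mean mu
sq_means = [statistics.fmean(random.expovariate(1 / mu) for _ in range(n)) ** 2
            for _ in range(reps)]
print(statistics.fmean(sq_means))      # close to mu^2 = 4
print(statistics.variance(sq_means))   # close to 4 mu^2 sigma^2 / n = 0.64
```

The small upward bias in the mean (roughly σ²/n) comes from the higher-order terms the delta method discards; it vanishes as n grows.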
References
Stephen Abbott. Understanding Analysis. Springer, 2001.
George Casella and Roger L. Berger. Statistical Inference. Second edition, 2002.
E. J. Dudewicz and S. N. Mishra. Modern Mathematical Statistics. John Wiley & Sons, Inc., West Sussex, 1988.
Sheldon Ross. A First Course in Probability. Pearson, ninth edition, 2014.
Morris H. DeGroot and Mark J. Schervish. Probability and Statistics. 2014.