RS – 4 - Jointly distributed RV (b)
Chapter 4: Jointly Distributed Random Variables
Multivariate distributions

Joint probability function
• For discrete RVs, the joint probability function is
$$p(x,y) = P[X = x, Y = y]$$
• Marginal distributions
$$p_X(x) = \sum_y p(x,y), \qquad p_Y(y) = \sum_x p(x,y)$$
• Conditional distributions
$$p_{Y|X}(y|x) = \frac{p(x,y)}{p_X(x)}, \qquad p_{X|Y}(x|y) = \frac{p(x,y)}{p_Y(y)}$$
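Joint probability function
• For continuous RVs, the joint density function f(x,y) satisfies
$$P[(X,Y) \in A] = \iint_A f(x,y)\,dx\,dy$$
(f(x,y) is a density, not the probability P[X = x, Y = y], which is zero for continuous RVs).
• Marginal distributions
$$f_X(x) = \int_{-\infty}^{\infty} f(x,y)\,dy, \qquad f_Y(y) = \int_{-\infty}^{\infty} f(x,y)\,dx$$
• Conditional distributions
$$f_{Y|X}(y|x) = \frac{f(x,y)}{f_X(x)}, \qquad f_{X|Y}(x|y) = \frac{f(x,y)}{f_Y(y)}$$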
Independence
Definition: Independence
Two random variables X and Y are defined to be independent if
$$p(x,y) = p_X(x)\,p_Y(y) \quad \text{if X and Y are discrete}$$
$$f(x,y) = f_X(x)\,f_Y(y) \quad \text{if X and Y are continuous}$$
Note:
$$p_{Y|X}(y|x) = \frac{p(x,y)}{p_X(x)} = \frac{p_X(x)\,p_Y(y)}{p_X(x)} = p_Y(y)$$
$$p_{X|Y}(x|y) = \frac{p(x,y)}{p_Y(y)} = \frac{p_X(x)\,p_Y(y)}{p_Y(y)} = p_X(x)$$
Thus, in the case of independence
marginal distributions ≡ conditional distributions
The Multiplicative Rule for densities
If X and Y are discrete:
$$p(x,y) = p_X(x)\,p_{Y|X}(y|x) = p_Y(y)\,p_{X|Y}(x|y)$$
$$p(x,y) = p_X(x)\,p_Y(y) \quad \text{if X and Y are independent}$$
If X and Y are continuous:
$$f(x,y) = f_X(x)\,f_{Y|X}(y|x) = f_Y(y)\,f_{X|Y}(x|y)$$
$$f(x,y) = f_X(x)\,f_Y(y) \quad \text{if X and Y are independent}$$
Proof:
This follows from the definition for conditional densities:
$$p_{Y|X}(y|x) = \frac{p(x,y)}{p_X(x)} \quad \text{and} \quad p_{X|Y}(x|y) = \frac{p(x,y)}{p_Y(y)}$$
Hence
$$p(x,y) = p_X(x)\,p_{Y|X}(y|x) \quad \text{and} \quad p(x,y) = p_Y(y)\,p_{X|Y}(x|y)$$
The same is true for continuous random variables.
Finding CDFs (or pdfs) is difficult - Example
Suppose that a rectangle is constructed by first choosing its length, X, and then choosing its width, Y.
Its length X is selected from an exponential distribution with mean 1/λ = 5. Once the length has been chosen, its width, Y, is selected from a uniform distribution from 0 to half its length.
Find the probability that its area A = XY is less than 4.
Solution:
$$f_X(x) = \tfrac{1}{5} e^{-x/5} \quad \text{for } x \ge 0$$
$$f_{Y|X}(y|x) = \frac{1}{x/2} = \frac{2}{x} \quad \text{if } 0 \le y \le x/2$$
$$f(x,y) = f_X(x)\,f_{Y|X}(y|x) = \tfrac{1}{5} e^{-x/5} \cdot \frac{2}{x} = \frac{2}{5x} e^{-x/5} \quad \text{if } 0 \le y \le x/2,\ x \ge 0$$
The curves xy = 4 and y = x/2 intersect where
$$x \cdot \frac{x}{2} = 4 \;\Rightarrow\; x^2 = 8 \;\Rightarrow\; x = 2\sqrt{2}, \quad y = \frac{x}{2} = \sqrt{2}$$
that is, at the point $(2\sqrt{2}, \sqrt{2})$. Then,
$$P[XY < 4] = \int_0^{2\sqrt{2}} \int_0^{x/2} f(x,y)\,dy\,dx + \int_{2\sqrt{2}}^{\infty} \int_0^{4/x} f(x,y)\,dy\,dx$$
$$= \int_0^{2\sqrt{2}} \int_0^{x/2} \frac{2}{5x} e^{-x/5}\,dy\,dx + \int_{2\sqrt{2}}^{\infty} \int_0^{4/x} \frac{2}{5x} e^{-x/5}\,dy\,dx$$
$$= \int_0^{2\sqrt{2}} \frac{2}{5x} e^{-x/5} \cdot \frac{x}{2}\,dx + \int_{2\sqrt{2}}^{\infty} \frac{2}{5x} e^{-x/5} \cdot \frac{4}{x}\,dx$$
$$= \int_0^{2\sqrt{2}} \tfrac{1}{5} e^{-x/5}\,dx + \frac{8}{5} \int_{2\sqrt{2}}^{\infty} x^{-2} e^{-x/5}\,dx$$
The first part can be evaluated in closed form (it equals $1 - e^{-2\sqrt{2}/5}$). The second part may require numerical evaluation.
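The numerical evaluation mentioned in the rectangle example can be sketched as follows. This is an illustrative sketch, not part of the original slides: the function names, the truncation of the infinite upper limit at $2\sqrt{2} + 200$, and the use of Simpson's rule are all assumptions made here. A Monte Carlo draw of (X, Y) is included as an independent cross-check.

```python
import math
import random

def integrate_simpson(func, a, b, n=20000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def prob_area_below_4():
    """P[XY < 4] from the two integrals in the derivation."""
    c = 2 * math.sqrt(2)                  # x where xy = 4 meets y = x/2
    first = 1 - math.exp(-c / 5)          # closed form of the first integral
    # Second integral; the upper limit is truncated (assumption: the
    # integrand is negligible beyond c + 200).
    second = 1.6 * integrate_simpson(
        lambda x: math.exp(-x / 5) / x**2, c, c + 200)
    return first + second

def prob_area_below_4_mc(n_draws=200_000, seed=0):
    """Monte Carlo check: draw X, then Y given X, and count XY < 4."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        x = rng.expovariate(1 / 5)        # exponential with mean 5
        y = rng.uniform(0, x / 2)         # uniform on [0, x/2]
        hits += x * y < 4
    return hits / n_draws
```

Calling `prob_area_below_4()` and `prob_area_below_4_mc()` should give values that agree up to simulation error.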
Functions of Random Variables
Methods for determining the distribution of functions of Random Variables
Given some random variable X, we want to study some function Y = h(X).
X is a transformation from (Ω, Σ, P) into the probability space (χ, A_X, P_X). Now, Y = h(X) is a transformation from (χ, A_X, P_X) into the probability space (Υ, A_Y, P_Y).
That is, we need to discover how the probability measure P_Y relates to the measure P_X.
For some F ∈ A_Y, we have that
$$P_Y[F] = P_X\{x \in \chi : h(x) \in F\} = P\{\omega \in \Omega : h(X(\omega)) \in F\}$$
$$= P\{\omega \in \Omega : X(\omega) \in h^{-1}(F)\} = P_X[h^{-1}(F)]$$
With non-transformed variables, we step "backwards" from the values of X to the set of events in Ω. In the transformed case, we take two steps backwards: 1) once from the range of the transformation back to the values of the X, and 2) again back to the set of events in Ω.
Potential problem: The inverse transformation $h^{-1}$, which we need to work with, may not yield unique results if h(x) is not monotonic.
Methods to get PY:
1. Distribution function method
2. Transformation method
3. Moment generating function method
Method 1: Distribution function method
Let X, Y, Z, … have joint density f(x, y, z, …).
Let W = h(X, Y, Z, …).
First step: Find the distribution function of W
G(w) = P[W ≤ w] = P[h(X, Y, Z, …) ≤ w]
Second step: Find the density function of W
g(w) = G'(w).
Distribution function method: Example 1 - Chi-square (χ²)
Let X have a normal distribution with mean 0 and variance 1, i.e., a standard normal distribution. That is:
$$f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$$
Let W = X². Find the distribution of W.
First step: Find the distribution function of W
$$G(w) = P[W \le w] = P[X^2 \le w] = P[-\sqrt{w} \le X \le \sqrt{w}] \quad \text{if } w \ge 0$$
$$= \int_{-\sqrt{w}}^{\sqrt{w}} \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx = F(\sqrt{w}) - F(-\sqrt{w})$$
where
$$F'(x) = f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$$
Second step: Find the density function of W
$$g(w) = G'(w) = F'(\sqrt{w})\,\frac{d\sqrt{w}}{dw} - F'(-\sqrt{w})\,\frac{d(-\sqrt{w})}{dw}$$
$$= f(\sqrt{w})\,\tfrac{1}{2} w^{-1/2} + f(-\sqrt{w})\,\tfrac{1}{2} w^{-1/2}$$
$$= \frac{1}{\sqrt{2\pi}} e^{-w/2}\,\tfrac{1}{2} w^{-1/2} + \frac{1}{\sqrt{2\pi}} e^{-w/2}\,\tfrac{1}{2} w^{-1/2}$$
$$= \frac{1}{\sqrt{2\pi}} w^{-1/2} e^{-w/2} \quad \text{if } w \ge 0.$$
Thus, if X ~ N(0,1), W = X² has density
$$g(w) = \frac{1}{\sqrt{2\pi}} w^{-1/2} e^{-w/2} \quad \text{if } w \ge 0.$$
This distribution is the Gamma distribution with α = ½ and λ = ½. This distribution is the χ² distribution with 1 degree of freedom.
• Using the additive properties of a gamma distribution, the sum of T independent $\chi^2_1$ RVs produces a $\chi^2_T$ distributed RV. Alternatively, the sum of T independent squared N(0,1) RVs produces a $\chi^2_T$ distributed RV.
• Note: If $X_1, \ldots, X_T$ are independent $N(\mu_i, \sigma_i^2)$ RVs, then $\sum_i (X_i/\sigma_i)^2$ follows a non-central χ² distribution with T degrees of freedom and non-centrality parameter $\sum_i (\mu_i/\sigma_i)^2$.
This distribution is common in the power analysis of statistical tests in which the null distribution is (asymptotically) a χ².
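As a quick check of the chi-square result, the sketch below compares the derived distribution function $G(w) = F(\sqrt{w}) - F(-\sqrt{w})$ with the empirical frequency of X² ≤ w in simulated standard normal draws. The function names, sample size, and seed are choices made here for illustration. The identity $F(\sqrt{w}) - F(-\sqrt{w}) = \mathrm{erf}(\sqrt{w/2})$ is used to evaluate G with the standard library only.

```python
import math
import random

def chi2_1_cdf(w):
    """G(w) = F(sqrt(w)) - F(-sqrt(w)), with F the standard normal CDF."""
    if w <= 0:
        return 0.0
    return math.erf(math.sqrt(w / 2))

def chi2_1_pdf(w):
    """g(w) = w^(-1/2) exp(-w/2) / sqrt(2 pi) for w > 0."""
    if w <= 0:
        return 0.0
    return math.exp(-w / 2) / math.sqrt(2 * math.pi * w)

def empirical_cdf_of_square(w, n_draws=200_000, seed=0):
    """Share of simulated X^2 values (X ~ N(0,1)) that fall at or below w."""
    rng = random.Random(seed)
    hits = sum(rng.gauss(0, 1) ** 2 <= w for _ in range(n_draws))
    return hits / n_draws
```

The density can also be checked against a finite-difference derivative of `chi2_1_cdf`, which mirrors the second step of the method.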
Distribution function method: Example 2 - Sum of Exponentials
Suppose that X and Y are independent random variables, each having an exponential distribution with parameter λ (E(X) = 1/λ).
We say X & Y are i.i.d. exponential RVs.
Find the distribution of W = X + Y.
$$f_1(x) = \lambda e^{-\lambda x} \quad \text{for } x \ge 0$$
$$f_2(y) = \lambda e^{-\lambda y} \quad \text{for } y \ge 0$$
$$f(x,y) = f_1(x)\,f_2(y) = \lambda^2 e^{-\lambda(x+y)} \quad \text{for } x \ge 0,\ y \ge 0$$
First step: Find the distribution function of W = X + Y
$$G(w) = P[W \le w] = P[X + Y \le w] = \int_0^w \int_0^{w-x} f_1(x)\,f_2(y)\,dy\,dx$$
$$= \int_0^w \int_0^{w-x} \lambda^2 e^{-\lambda(x+y)}\,dy\,dx = \lambda^2 \int_0^w e^{-\lambda x} \left[ \int_0^{w-x} e^{-\lambda y}\,dy \right] dx$$
$$= \lambda^2 \int_0^w e^{-\lambda x} \left[ \frac{e^{-\lambda y}}{-\lambda} \right]_0^{w-x} dx = \lambda^2 \int_0^w e^{-\lambda x}\,\frac{e^{0} - e^{-\lambda(w-x)}}{\lambda}\,dx$$
$$= \lambda \int_0^w \left( e^{-\lambda x} - e^{-\lambda w} \right) dx = \lambda \left[ \frac{e^{-\lambda x}}{-\lambda} - x e^{-\lambda w} \right]_0^w$$
$$= \lambda \left[ \frac{e^{0} - e^{-\lambda w}}{\lambda} - w e^{-\lambda w} \right] = 1 - e^{-\lambda w} - \lambda w e^{-\lambda w}$$
Second step: Find the density function of W
$$g(w) = G'(w) = \frac{d}{dw}\left[ 1 - e^{-\lambda w} - \lambda w e^{-\lambda w} \right]$$
$$= \lambda e^{-\lambda w} - \lambda \left[ e^{-\lambda w}\,\frac{dw}{dw} + w\,\frac{d e^{-\lambda w}}{dw} \right]$$
$$= \lambda e^{-\lambda w} - \lambda e^{-\lambda w} + \lambda^2 w e^{-\lambda w} = \lambda^2 w e^{-\lambda w} \quad \text{for } w \ge 0$$
Then, if X & Y ~ i.i.d. Γ(1, λ), W = X + Y has density
$$g(w) = \lambda^2 w e^{-\lambda w} \quad \text{for } w \ge 0$$
This distribution can be recognized to be the Gamma distribution with parameters α = 2 and λ, or Γ(2, λ).
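The sum-of-exponentials result can be checked by simulation. The sketch below is illustrative only (function names, sample size, and seed are assumptions made here): it evaluates the derived distribution function $G(w) = 1 - e^{-\lambda w} - \lambda w e^{-\lambda w}$ and compares it with the share of simulated sums X + Y that fall at or below w.

```python
import math
import random

def sum_exp_cdf(w, lam):
    """G(w) = 1 - exp(-lam w) - lam w exp(-lam w) for w >= 0."""
    if w < 0:
        return 0.0
    return 1 - math.exp(-lam * w) - lam * w * math.exp(-lam * w)

def sum_exp_pdf(w, lam):
    """g(w) = lam^2 w exp(-lam w) for w >= 0 (Gamma with alpha = 2)."""
    if w < 0:
        return 0.0
    return lam**2 * w * math.exp(-lam * w)

def empirical_sum_cdf(w, lam, n_draws=200_000, seed=0):
    """Share of simulated X + Y (i.i.d. exponential, rate lam) at or below w."""
    rng = random.Random(seed)
    hits = sum(rng.expovariate(lam) + rng.expovariate(lam) <= w
               for _ in range(n_draws))
    return hits / n_draws
```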
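Distribution function method: Example 3 - Student's t distribution
Let Z and U be two independent random variables with:
1. Z having a standard normal distribution, i.e., Z ~ N(0,1)
2. U having a χ² distribution with ν degrees of freedom
Find the distribution of
$$t = \frac{Z}{\sqrt{U/\nu}}$$
The densities of Z and U are
$$f(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}, \qquad h(u) = \frac{(1/2)^{\nu/2}}{\Gamma(\nu/2)}\,u^{\nu/2 - 1} e^{-u/2}$$
Therefore, the joint density of Z and U is:
$$f(z,u) = f(z)\,h(u) = \frac{(1/2)^{\nu/2}}{\sqrt{2\pi}\,\Gamma(\nu/2)}\,u^{\nu/2 - 1} e^{-(z^2 + u)/2}$$
The distribution function of T is:
$$G(t) = P[T \le t] = P\left[\frac{Z}{\sqrt{U/\nu}} \le t\right] = P\left[Z \le \frac{t}{\sqrt{\nu}}\sqrt{U}\right]$$
$$= \int_0^\infty \int_{-\infty}^{t\sqrt{u/\nu}} \frac{(1/2)^{\nu/2}}{\sqrt{2\pi}\,\Gamma(\nu/2)}\,u^{\nu/2 - 1} e^{-(z^2 + u)/2}\,dz\,du$$
Then:
$$g(t) = G'(t) = \frac{d}{dt} \int_0^\infty \int_{-\infty}^{t\sqrt{u/\nu}} \frac{(1/2)^{\nu/2}}{\sqrt{2\pi}\,\Gamma(\nu/2)}\,u^{\nu/2 - 1} e^{-(z^2 + u)/2}\,dz\,du$$
Using the FTC:
$$F(x) = \int_a^x f(t)\,dt \;\Rightarrow\; F'(x) = f(x)$$
and using:
$$\frac{d}{dt} \int_a^b F(x,t)\,dx = \int_a^b \frac{d}{dt} F(x,t)\,dx$$
Then
$$g(t) = \int_0^\infty \frac{(1/2)^{\nu/2}}{\sqrt{2\pi}\,\Gamma(\nu/2)}\,u^{\nu/2 - 1} e^{-u/2}\,e^{-t^2 u/(2\nu)}\,\sqrt{\frac{u}{\nu}}\,du$$
Hence
$$g(t) = \frac{(1/2)^{\nu/2}}{\sqrt{2\pi\nu}\,\Gamma(\nu/2)} \int_0^\infty u^{\frac{\nu+1}{2} - 1} e^{-\frac{1}{2}\left(1 + \frac{t^2}{\nu}\right)u}\,du$$
Using
$$\int_0^\infty x^{\alpha - 1} e^{-\lambda x}\,dx = \frac{\Gamma(\alpha)}{\lambda^{\alpha}}, \quad \text{i.e.,} \quad \frac{\lambda^{\alpha}}{\Gamma(\alpha)} \int_0^\infty x^{\alpha - 1} e^{-\lambda x}\,dx = 1$$
with $\alpha = (\nu+1)/2$ and $\lambda = \tfrac{1}{2}(1 + t^2/\nu)$:
$$\int_0^\infty u^{\frac{\nu+1}{2} - 1} e^{-\frac{1}{2}\left(1 + \frac{t^2}{\nu}\right)u}\,du = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\left[\frac{1}{2}\left(1 + \frac{t^2}{\nu}\right)\right]^{\frac{\nu+1}{2}}}$$
Thus,
$$g(t) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\pi\nu}\,\Gamma\left(\frac{\nu}{2}\right)} \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}}$$
or
$$g(t) = K \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}}, \quad \text{where} \quad K = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\pi\nu}\,\Gamma\left(\frac{\nu}{2}\right)}$$
This is the Student's t distribution. It is denoted as t ~ t_ν, where ν is referred to as the degrees of freedom (df).
William Gosset (1876 - 1937)
[Figure: densities of the t distribution and the standard normal distribution]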
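Example of a t-distributed variable
Let
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{\sqrt{n}(\bar{x} - \mu)}{\sigma} \sim N(0,1)$$
Let
$$U = \frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}$$
Assume that Z and U are independent (later, we will learn how to check this assumption). Then,
$$t = \frac{\sqrt{n}(\bar{x} - \mu)/\sigma}{\sqrt{\dfrac{(n-1)s^2/\sigma^2}{n-1}}} = \frac{\sqrt{n}(\bar{x} - \mu)}{s} = \frac{\bar{x} - \mu}{s/\sqrt{n}} \sim t_{n-1}$$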
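To illustrate the Student's t result, the sketch below builds T = Z/√(U/ν) directly from its definition, with U simulated as a sum of ν squared standard normals, and compares the simulated probability of a ≤ T ≤ b with the integral of the derived density g(t). The function names, the Simpson's rule integration, and the choice ν = 5 in the usage note are assumptions made here for illustration.

```python
import math
import random

def t_pdf(t, nu):
    """g(t) = K (1 + t^2/nu)^(-(nu+1)/2), K as derived in the example."""
    k = math.gamma((nu + 1) / 2) / (math.sqrt(math.pi * nu) * math.gamma(nu / 2))
    return k * (1 + t * t / nu) ** (-(nu + 1) / 2)

def t_prob(a, b, nu, n=2000):
    """P[a <= T <= b] by composite Simpson's rule (n even) applied to g(t)."""
    h = (b - a) / n
    total = t_pdf(a, nu) + t_pdf(b, nu)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(a + i * h, nu)
    return total * h / 3

def simulate_t_prob(a, b, nu, n_draws=100_000, seed=0):
    """Simulate T = Z / sqrt(U/nu) and return the share with a <= T <= b."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        z = rng.gauss(0, 1)
        u = sum(rng.gauss(0, 1) ** 2 for _ in range(nu))  # chi-square, nu df
        hits += a <= z / math.sqrt(u / nu) <= b
    return hits / n_draws
```

For example, `t_prob(-2, 2, 5)` and `simulate_t_prob(-2, 2, 5)` should agree up to simulation error.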
Distribution function method: Example 4 - Distribution of the Max and Min Statistics
Let x1, x2, … , xn denote a sample of size n from the density f(x).
Find the distribution of M = max(xi).
Repeat this computation for m = min(xi).
Assume that the density is the uniform density from 0 to θ. Thus,
$$f(x) = \begin{cases} \dfrac{1}{\theta} & 0 \le x \le \theta \\ 0 & \text{elsewhere} \end{cases}$$
$$F(x) = P[X \le x] = \begin{cases} 0 & x < 0 \\ \dfrac{x}{\theta} & 0 \le x \le \theta \\ 1 & x > \theta \end{cases}$$
Finding the distribution function of M.
$$G(t) = P[M \le t] = P[\max(x_i) \le t] = P[x_1 \le t, \ldots, x_n \le t] = [F(t)]^n = \begin{cases} 0 & t < 0 \\ \left(\dfrac{t}{\theta}\right)^n & 0 \le t \le \theta \\ 1 & t > \theta \end{cases}$$
Differentiating, we find the density function of M.
$$g(t) = G'(t) = \begin{cases} \dfrac{n\,t^{n-1}}{\theta^n} & 0 \le t \le \theta \\ 0 & \text{otherwise} \end{cases}$$
[Figure: the uniform density f(x) and the density g(t) of the maximum]
Finding the distribution function of m.
$$G(t) = P[m \le t] = P[\min(x_i) \le t] = 1 - P[x_1 > t, \ldots, x_n > t] = 1 - [1 - F(t)]^n = \begin{cases} 0 & t < 0 \\ 1 - \left(1 - \dfrac{t}{\theta}\right)^n & 0 \le t \le \theta \\ 1 & t > \theta \end{cases}$$
Differentiating, we find the density function of m.
$$g(t) = G'(t) = \begin{cases} \dfrac{n}{\theta}\left(1 - \dfrac{t}{\theta}\right)^{n-1} & 0 \le t \le \theta \\ 0 & \text{otherwise} \end{cases}$$
[Figure: the uniform density f(x) and the density g(t) of the minimum]
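The distribution functions of the maximum and the minimum of a uniform sample can be checked with a short simulation. This is a sketch under assumed names and settings (sample size n, upper bound theta, number of replications, and seed are illustrative choices, not values from the slides).

```python
import random

def max_cdf(t, n, theta):
    """G(t) = (t/theta)^n on [0, theta] for M = max(x_i)."""
    if t < 0:
        return 0.0
    if t > theta:
        return 1.0
    return (t / theta) ** n

def min_cdf(t, n, theta):
    """G(t) = 1 - (1 - t/theta)^n on [0, theta] for m = min(x_i)."""
    if t < 0:
        return 0.0
    if t > theta:
        return 1.0
    return 1 - (1 - t / theta) ** n

def empirical_cdfs(t, n, theta, n_reps=100_000, seed=0):
    """Return the shares of simulated samples with max <= t and with min <= t."""
    rng = random.Random(seed)
    max_hits = min_hits = 0
    for _ in range(n_reps):
        sample = [rng.uniform(0, theta) for _ in range(n)]
        max_hits += max(sample) <= t
        min_hits += min(sample) <= t
    return max_hits / n_reps, min_hits / n_reps
```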
The Probability Integral Transformation
This transformation allows one to convert observations that come from a uniform distribution from 0 to 1 to observations that come from an arbitrary distribution.
Let U denote an observation having a uniform distribution [0, 1].
Let f(x) denote an arbitrary pdf and F(x) its corresponding CDF. Let
$$X = F^{-1}(U)$$
We want to find the distribution of X.
The density of U is
$$g(u) = \begin{cases} 1 & 0 \le u \le 1 \\ 0 & \text{elsewhere} \end{cases}$$
Find the distribution of X.
$$G(x) = P[X \le x] = P[F^{-1}(U) \le x] = P[U \le F(x)] = F(x)$$
Hence:
$$g(x) = G'(x) = F'(x) = f(x)$$
Thus, if U ~ Uniform(0,1), then
$X = F^{-1}(U)$ has density f(x).
• The goal of some estimation methods is to simulate an expectation, say E[h(Z)]. To do this, we need to simulate Z from its distribution. The probability integral transformation is very handy for this task.
Example: Exponential distribution
Let U ~ Uniform(0,1).
Let F(x) = 1 – exp(- λx) –i.e., the exponential distribution.
Then,
– log(1 – U )/ λ ~ F (exponential distribution)
Example: If F is the standard normal CDF, $F^{-1}$ has no closed-form solution. Most computer programs have a routine to approximate $F^{-1}$ for the standard normal distribution. We can use this to simulate other distributions, for example, truncated normals.
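The exponential example of the probability integral transformation can be sketched in a few lines. The function names, sample size, and seed below are assumptions made for illustration; the only ingredient taken from the text is the transformation X = −log(1 − U)/λ applied to uniform draws.

```python
import math
import random

def sample_exponential(lam, rng):
    """Draw X = F^{-1}(U) = -log(1 - U)/lam with U ~ Uniform[0, 1)."""
    u = rng.random()
    return -math.log(1 - u) / lam

def simulate_exponential(lam, n_draws=100_000, seed=0):
    """Return a list of exponential draws built from uniform draws."""
    rng = random.Random(seed)
    return [sample_exponential(lam, rng) for _ in range(n_draws)]

def exponential_cdf(x, lam):
    """F(x) = 1 - exp(-lam x) for x >= 0."""
    return 1 - math.exp(-lam * x) if x >= 0 else 0.0
```

The sample mean of the draws should be close to 1/λ, and the share of draws at or below x should be close to F(x).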
Method 2: The Transformation Method
Theorem
Let X denote a RV with pdf f(x) and U = h(X).
Assume that h(x) is either strictly increasing or strictly decreasing. Then the probability density of U is:
$$g(u) = f\left(h^{-1}(u)\right) \left| \frac{d h^{-1}(u)}{du} \right| = f(x) \left| \frac{dx}{du} \right|$$
Proof
Use the distribution function method.
Step 1: Find the distribution function, G(u)
Step 2: Differentiate G(u) to find the probability density function g(u)
$$G(u) = P[U \le u] = P[h(X) \le u] = \begin{cases} P[X \le h^{-1}(u)] & h \text{ strictly increasing} \\ P[X \ge h^{-1}(u)] & h \text{ strictly decreasing} \end{cases}$$
$$= \begin{cases} F\left(h^{-1}(u)\right) & h \text{ strictly increasing} \\ 1 - F\left(h^{-1}(u)\right) & h \text{ strictly decreasing} \end{cases}$$
Thus,
$$g(u) = G'(u) = \begin{cases} F'\left(h^{-1}(u)\right) \dfrac{d h^{-1}(u)}{du} & h \text{ strictly increasing} \\ -F'\left(h^{-1}(u)\right) \dfrac{d h^{-1}(u)}{du} & h \text{ strictly decreasing} \end{cases}$$
Or,
$$g(u) = f\left(h^{-1}(u)\right) \left| \frac{d h^{-1}(u)}{du} \right| = f(x) \left| \frac{dx}{du} \right|$$
Example 1: Log Normal Distribution
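Suppose that X ~ N(μ, σ²). Find the distribution of U = h(X) = e^X.
Solution: Recall the transformation formula:
$$g(u) = f\left(h^{-1}(u)\right) \left| \frac{d h^{-1}(u)}{du} \right| = f(x) \left| \frac{dx}{du} \right|$$
Here,
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$
$$h^{-1}(u) = \ln(u) \quad \text{and} \quad \frac{d h^{-1}(u)}{du} = \frac{d \ln(u)}{du} = \frac{1}{u}$$
Replacing in the transformation formula:
$$g(u) = \frac{1}{\sqrt{2\pi}\,\sigma}\,\frac{1}{u}\,e^{-\frac{(\ln(u) - \mu)^2}{2\sigma^2}} \quad \text{for } u > 0$$
Since the distribution of log(U) is normal, the distribution of U is called the log-normal distribution.
Note: It is easy to obtain the mean of a log-normal variable.
$$E[U] = \int_0^\infty u\,g(u)\,du = \int_0^\infty \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(\ln(u) - \mu)^2}{2\sigma^2}}\,du$$
Substitution: y = (ln(u) − μ)/σ, so that $u = e^{y\sigma + \mu}$,
σ dy = (1/u) du, and $du = \sigma e^{y\sigma + \mu}\,dy$.
Then,
$$E[U] = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac{y^2}{2}}\,e^{y\sigma + \mu}\,dy = e^{\mu} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac{y^2 - 2\sigma y}{2}}\,dy$$
$$= e^{\mu + \frac{\sigma^2}{2}} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-\frac{(y - \sigma)^2}{2}}\,dy = e^{\mu + \frac{\sigma^2}{2}}$$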
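The log-normal mean $e^{\mu + \sigma^2/2}$ can be checked by simulating U = e^X with X normal. The sketch below is illustrative; the function names, the parameter values used in the check, and the seed are assumptions made here.

```python
import math
import random

def lognormal_pdf(u, mu, sigma):
    """g(u) from the transformation formula, for u > 0."""
    if u <= 0:
        return 0.0
    z = (math.log(u) - mu) / sigma
    return math.exp(-z * z / 2) / (math.sqrt(2 * math.pi) * sigma * u)

def lognormal_mean(mu, sigma):
    """E[U] = exp(mu + sigma^2 / 2)."""
    return math.exp(mu + sigma**2 / 2)

def simulated_lognormal_mean(mu, sigma, n_draws=200_000, seed=0):
    """Average of exp(X) over draws X ~ N(mu, sigma^2)."""
    rng = random.Random(seed)
    total = sum(math.exp(rng.gauss(mu, sigma)) for _ in range(n_draws))
    return total / n_draws
```

Note that the mean of U is larger than $e^{\mu}$, the value obtained by plugging the mean of X into h.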
Example: Under the Black-Scholes model, in a short period of time of length Δt, the return on the stock is normally distributed.
Consider a stock whose price is S:
$$\frac{\Delta S}{S} \sim N\left(\mu\,\Delta t,\ \sigma\sqrt{\Delta t}\right)$$
where μ is the expected return and σ is the volatility (the second argument of N here is a standard deviation). It follows from this assumption that (we derived this result in Chapter 14)
$$\ln S_T - \ln S_t \sim N\left[\left(\mu - \frac{\sigma^2}{2}\right)(T - t),\ \sigma\sqrt{T - t}\right]$$
or
$$\ln S_T \sim N\left[\ln S_t + \left(\mu - \frac{\sigma^2}{2}\right)(T - t),\ \sigma\sqrt{T - t}\right]$$
Since the logarithm of S_T is normal, S_T is log-normally distributed.
[Figure: Example 1: Log Normal Distribution - Graph]
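Example 2: Inverse Gamma Distribution
Suppose that X has a Gamma distribution with parameters α and λ. Find the distribution of U = h(X) = 1/X.
Solution: Recall the transformation formula:
$$g(u) = f\left(h^{-1}(u)\right) \left| \frac{d h^{-1}(u)}{du} \right| = f(x) \left| \frac{dx}{du} \right|$$
Here,
$$f(x; \alpha, \lambda) = \frac{\lambda^{\alpha}}{\Gamma(\alpha)}\,x^{\alpha - 1} e^{-\lambda x}, \quad x > 0$$
$$h^{-1}(u) = 1/u \quad \text{and} \quad \frac{d h^{-1}(u)}{du} = -\frac{1}{u^2}, \quad u > 0$$
Replacing in the transformation formula:
$$g(u) = \frac{\lambda^{\alpha}}{\Gamma(\alpha)} \left(\frac{1}{u}\right)^{\alpha - 1} e^{-\lambda/u} \left| -\frac{1}{u^2} \right| = \frac{\lambda^{\alpha}}{\Gamma(\alpha)}\,u^{-(\alpha + 1)} e^{-\lambda/u}, \quad u > 0$$
Since the distribution of X is gamma, the distribution of U is called the inverse-gamma distribution.
It's straightforward to calculate its mean (for α > 1):
$$E[U] = \int_0^\infty u\,g(u)\,du = \frac{\lambda^{\alpha}}{\Gamma(\alpha)} \int_0^\infty u^{-\alpha} e^{-\lambda/u}\,du = \frac{\lambda^{\alpha}}{\Gamma(\alpha)} \cdot \frac{\Gamma(\alpha - 1)}{\lambda^{\alpha - 1}} = \frac{\lambda}{\alpha - 1}$$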
• In Bayesian econometrics, it is common to assume that σ² follows an inverse gamma distribution (and σ⁻² follows a gamma!). Usual values for (α, λ): α = T/2 and λ = 1/(2η²) = Φ/2, where η² is related to the variance of the T N(0, η²) variables we are implicitly adding.
[Figure: Example 2: Inverse Gamma Distribution - Graph]
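The inverse-gamma mean λ/(α − 1) can be checked by simulating U = 1/X with X gamma distributed. This sketch is illustrative: the function names, parameter values, and seed are assumptions made here. One detail to watch is that Python's `random.gammavariate(alpha, beta)` takes a scale parameter `beta`, so the rate λ used in the text corresponds to `beta = 1/λ`.

```python
import math
import random

def inverse_gamma_pdf(u, alpha, lam):
    """g(u) = lam^alpha / Gamma(alpha) * u^-(alpha+1) * exp(-lam/u), u > 0."""
    if u <= 0:
        return 0.0
    return lam**alpha / math.gamma(alpha) * u ** (-(alpha + 1)) * math.exp(-lam / u)

def inverse_gamma_mean(alpha, lam):
    """E[U] = lam / (alpha - 1), which requires alpha > 1."""
    if alpha <= 1:
        raise ValueError("the mean exists only for alpha > 1")
    return lam / (alpha - 1)

def simulated_inverse_gamma_mean(alpha, lam, n_draws=200_000, seed=0):
    """Average of 1/X over draws X ~ Gamma(alpha, rate lam)."""
    rng = random.Random(seed)
    # gammavariate expects a scale parameter, hence 1/lam.
    total = sum(1 / rng.gammavariate(alpha, 1 / lam) for _ in range(n_draws))
    return total / n_draws
```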
Method 3: Use of moment generating functions
Review: MGF - Definition
Let X denote a random variable with probability density function f(x) if continuous (probability mass function p(x) if discrete). Then
m_X(t) = the moment generating function of X
$$m_X(t) = E\left[e^{tX}\right] = \begin{cases} \displaystyle\int_{-\infty}^{\infty} e^{tx} f(x)\,dx & \text{if X is continuous} \\ \displaystyle\sum_x e^{tx} p(x) & \text{if X is discrete} \end{cases}$$
The distribution of a random variable X is described by either
1. The density function f(x) if X continuous (probability mass function p(x) if X discrete), or
2. The cumulative distribution function F(x), or
3. The moment generating function m_X(t)
Review: MGF - Properties
1. $m_X(0) = 1$
2. $m_X^{(k)}(0)$ = k-th derivative of $m_X(t)$ at t = 0, and
$$m_X^{(k)}(0) = \mu_k = E\left[X^k\right] = \begin{cases} \displaystyle\int x^k f(x)\,dx & \text{X continuous} \\ \displaystyle\sum_x x^k p(x) & \text{X discrete} \end{cases}$$
3.
$$m_X(t) = 1 + \mu_1 t + \frac{\mu_2}{2!} t^2 + \frac{\mu_3}{3!} t^3 + \cdots + \frac{\mu_k}{k!} t^k + \cdots$$
4. Let X be a RV with MGF $m_X(t)$. Let Y = bX + a.
Then, $m_Y(t) = m_{bX+a}(t) = E\left(e^{(bX+a)t}\right) = e^{at}\,m_X(bt)$
5. Let X and Y be two independent random variables with moment generating functions $m_X(t)$ and $m_Y(t)$.
Then $m_{X+Y}(t) = m_X(t)\,m_Y(t)$
6. Let X and Y be two random variables with moment generating functions $m_X(t)$ and $m_Y(t)$ and distribution functions $F_X(x)$ and $F_Y(y)$, respectively.
If $m_X(t) = m_Y(t)$ for all t in an interval about 0, then $F_X(x) = F_Y(x)$.
This ensures that the distribution of a random variable can be identified by its moment generating function.
7. $m_{X+b}(t) = e^{bt}\,m_X(t)$
Using moment generating functions to find the distribution of functions of Random Variables
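Property 2 (moments as derivatives of the MGF at t = 0) can be illustrated numerically. The sketch below uses the MGF of an exponential RV with rate λ, $m(t) = \lambda/(\lambda - t)$, and approximates the first two derivatives at zero by central differences. The function names and the step size h are assumptions made here; the exact moments of the exponential are E[X] = 1/λ and E[X²] = 2/λ².

```python
def mgf_exponential(t, lam):
    """MGF of an exponential RV with rate lam, valid for t < lam."""
    return lam / (lam - t)

def moments_from_mgf(mgf, h=1e-3):
    """Approximate E[X] and E[X^2] by central differences of the MGF at 0."""
    first = (mgf(h) - mgf(-h)) / (2 * h)
    second = (mgf(h) - 2 * mgf(0.0) + mgf(-h)) / h**2
    return first, second

def exponential_moments(lam):
    """Return (E[X], E[X^2]) recovered from the MGF for rate lam."""
    return moments_from_mgf(lambda t: mgf_exponential(t, lam))
```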
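Example: Sum of 2 independent exponentials
Suppose that X and Y are two independent exponentially distributed random variables with pdfs given by
$$f(x) = \lambda e^{-\lambda x}, \qquad f(y) = \lambda e^{-\lambda y}$$
Find the distribution of X + Y.
Solution:
$$m_X(t) = \frac{\lambda}{\lambda - t} = \left(1 - \frac{t}{\lambda}\right)^{-1}, \qquad m_Y(t) = \frac{\lambda}{\lambda - t} = \left(1 - \frac{t}{\lambda}\right)^{-1}$$
$$m_{X+Y}(t) = m_X(t)\,m_Y(t) = \left(1 - \frac{t}{\lambda}\right)^{-1} \left(1 - \frac{t}{\lambda}\right)^{-1} = \left(1 - \frac{t}{\lambda}\right)^{-2}$$
= MGF of the gamma distribution with α = 2.
Example: Affine Transformation of a Normal
• Suppose that X ~ N(μ, σ²). What is the distribution of Y = aX + b?
• Solution:
$$m_X(t) = e^{\mu t + \frac{\sigma^2 t^2}{2}}$$
$$m_{aX+b}(t) = e^{bt}\,m_X(at) = e^{bt}\,e^{\mu a t + \frac{\sigma^2 (at)^2}{2}} = e^{(a\mu + b)t + \frac{a^2 \sigma^2 t^2}{2}}$$
= MGF of the normal distribution with mean aμ + b and variance a²σ².
Thus, Y = aX + b ~ N(aμ + b, a²σ²).
Special Case: the z transformation
$$Z = \frac{X - \mu}{\sigma} = \frac{1}{\sigma} X + \left(-\frac{\mu}{\sigma}\right) = aX + b$$
$$\mu_Z = a\mu + b = \frac{1}{\sigma}\mu - \frac{\mu}{\sigma} = 0$$
$$\sigma_Z^2 = a^2 \sigma^2 = \left(\frac{1}{\sigma}\right)^2 \sigma^2 = 1$$
Thus Z has a standard normal distribution.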
Example 1: Distribution of X+Y
Suppose that X and Y are independent. Let X ~ N(μ_X, σ_X²) and Y ~ N(μ_Y, σ_Y²). Find the distribution of S = X + Y.
Solution:
$$m_X(t) = e^{\mu_X t + \frac{\sigma_X^2 t^2}{2}}, \qquad m_Y(t) = e^{\mu_Y t + \frac{\sigma_Y^2 t^2}{2}}$$
Now,
$$m_{X+Y}(t) = m_X(t)\,m_Y(t) = e^{\mu_X t + \frac{\sigma_X^2 t^2}{2}}\,e^{\mu_Y t + \frac{\sigma_Y^2 t^2}{2}} = e^{(\mu_X + \mu_Y)t + \frac{(\sigma_X^2 + \sigma_Y^2)t^2}{2}}$$
Thus, S = X + Y ~ N(μ_X + μ_Y, σ_X² + σ_Y²).
Example 2: Distribution of aX+bY
Suppose that X and Y are independent. Let X ~ N(μ_X, σ_X²) and Y ~ N(μ_Y, σ_Y²). Find the distribution of L = aX + bY.
Solution:
$$m_X(t) = e^{\mu_X t + \frac{\sigma_X^2 t^2}{2}}, \qquad m_Y(t) = e^{\mu_Y t + \frac{\sigma_Y^2 t^2}{2}}$$
Now,
$$m_{aX+bY}(t) = m_{aX}(t)\,m_{bY}(t) = m_X(at)\,m_Y(bt) = e^{\mu_X a t + \frac{\sigma_X^2 (at)^2}{2}}\,e^{\mu_Y b t + \frac{\sigma_Y^2 (bt)^2}{2}}$$
$$m_{aX+bY}(t) = e^{(a\mu_X + b\mu_Y)t + \frac{(a^2\sigma_X^2 + b^2\sigma_Y^2)t^2}{2}}$$
Thus, L = aX + bY ~ N(aμ_X + bμ_Y, a²σ_X² + b²σ_Y²).
Special Case:
a = +1 & b = -1.
Thus L = X - Y has a normal distribution with mean μ_X - μ_Y and variance:
$$(+1)^2 \sigma_X^2 + (-1)^2 \sigma_Y^2 = \sigma_X^2 + \sigma_Y^2$$
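The mean and variance of a linear combination L = aX + bY of independent normals can be checked by simulation. The sketch below is illustrative; the function names, parameter values, and seed are assumptions made here. It checks only the first two moments, not normality itself.

```python
import random
import statistics

def linear_combination_params(a, b, mu_x, sd_x, mu_y, sd_y):
    """Mean and variance of L = aX + bY for independent normal X and Y."""
    mean = a * mu_x + b * mu_y
    variance = a**2 * sd_x**2 + b**2 * sd_y**2
    return mean, variance

def simulate_linear_combination(a, b, mu_x, sd_x, mu_y, sd_y,
                                n_draws=200_000, seed=0):
    """Return the sample mean and sample variance of simulated aX + bY."""
    rng = random.Random(seed)
    draws = [a * rng.gauss(mu_x, sd_x) + b * rng.gauss(mu_y, sd_y)
             for _ in range(n_draws)]
    return statistics.fmean(draws), statistics.pvariance(draws)
```

With a = 1 and b = -1, the variance of the difference is the sum of the two variances, not their difference.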
Example 3: (Extension to n independent RV's)
Suppose that X1, X2, …, Xn are independent, each having a normal distribution with means μ_i and standard deviations σ_i (for i = 1, 2, … , n).
Find the distribution of L = a1X1 + a2X2 + … + anXn.
Solution:
$$m_{X_i}(t) = e^{\mu_i t + \frac{\sigma_i^2 t^2}{2}} \quad \text{(for } i = 1, 2, \ldots, n\text{)}$$
Now
$$m_{a_1 X_1 + \cdots + a_n X_n}(t) = m_{a_1 X_1}(t) \cdots m_{a_n X_n}(t) = m_{X_1}(a_1 t) \cdots m_{X_n}(a_n t)$$
$$= e^{\mu_1 a_1 t + \frac{\sigma_1^2 a_1^2 t^2}{2}} \cdots e^{\mu_n a_n t + \frac{\sigma_n^2 a_n^2 t^2}{2}}$$
or
$$m_{a_1 X_1 + \cdots + a_n X_n}(t) = e^{(a_1\mu_1 + \cdots + a_n\mu_n)t + \frac{(a_1^2\sigma_1^2 + \cdots + a_n^2\sigma_n^2)t^2}{2}}$$
= the MGF of the normal distribution
with mean: $a_1\mu_1 + \cdots + a_n\mu_n$
and variance: $a_1^2\sigma_1^2 + \cdots + a_n^2\sigma_n^2$
Thus,
$$Y = a_1 X_1 + \cdots + a_n X_n \sim N\left(a_1\mu_1 + \cdots + a_n\mu_n,\ a_1^2\sigma_1^2 + \cdots + a_n^2\sigma_n^2\right)$$
Special case:
a1 = a2 = … = an = 1/n
μ1 = μ2 = … = μn = μ
σ1 = σ2 = … = σn = σ
In this case X1, X2, …, Xn is a sample from a normal distribution with mean μ and standard deviation σ, and
$$L = \frac{1}{n}\left(X_1 + X_2 + \cdots + X_n\right) = \bar{x} = \text{the sample mean}$$
Thus
$$\bar{x} = a_1 x_1 + \cdots + a_n x_n = \tfrac{1}{n} x_1 + \cdots + \tfrac{1}{n} x_n$$
has a normal distribution with mean
$$\mu_{\bar{x}} = a_1\mu_1 + \cdots + a_n\mu_n = \tfrac{1}{n}\mu + \cdots + \tfrac{1}{n}\mu = \mu$$
and variance
$$\sigma_{\bar{x}}^2 = a_1^2\sigma_1^2 + \cdots + a_n^2\sigma_n^2 = \left(\tfrac{1}{n}\right)^2 \sigma^2 + \cdots + \left(\tfrac{1}{n}\right)^2 \sigma^2 = n\,\frac{\sigma^2}{n^2} = \frac{\sigma^2}{n}$$
Summary
Let x1, x2, …, xn be a sample from a normal distribution with mean μ and standard deviation σ. Then,
$$\bar{x} \sim N\left(\mu,\ \sigma^2/n\right)$$
[Figure: density of the population and density of the distribution of $\bar{x}$]
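The summary result for the sample mean can be illustrated with a small simulation that draws many samples of size n from a normal population and looks at the mean and variance of the resulting sample means. The function name, parameter values, number of replications, and seed are assumptions made here for illustration.

```python
import random
import statistics

def simulate_sample_means(mu, sigma, n, n_reps=20_000, seed=0):
    """Return the mean and variance of n_reps simulated sample means."""
    rng = random.Random(seed)
    means = [statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
             for _ in range(n_reps)]
    return statistics.fmean(means), statistics.pvariance(means)

def sample_mean_params(mu, sigma, n):
    """Theoretical mean and variance of the sample mean: (mu, sigma^2 / n)."""
    return mu, sigma**2 / n
```

With sigma = 5 and n = 25, for instance, the variance of the sample mean is 25/25 = 1, far below the population variance of 25.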
Application: Simple Regression Framework
Let the random variable Y_i be determined by:
Y_i = α + βX_i + ε_i, i = 1, 2, ..., n,
where the X_i's are (exogenous or pre-determined) numbers, and α and β are constants. Let the random variable ε_i follow a normal distribution with zero mean and constant variance. That is,
ε_i ~ N(0, σ²), i = 1, 2, ..., n.
Then,
E(Y_i) = E(α + βX_i + ε_i) = α + βX_i
Var(Y_i) = Var(α + βX_i + ε_i) = Var(ε_i) = σ².
Since Y_i is a linear function of a normal RV, Y_i ~ N(α + βX_i, σ²).
Central and Non-central Distributions
• Noncentrality parameters are parameters of families of probability distributions which are related to other central families of distributions. If the noncentrality parameter of a distribution is zero, the distribution is identical to a distribution in the central family.
For example, the standard Student's t-distribution is the central family of distributions for the Noncentral t-distribution family.
• Noncentrality parameters are used in the following distributions:
- t-distribution
- F-distribution
- χ² distribution
- χ distribution
- Beta distribution
• In general, noncentrality parameters occur in distributions that are transformations of a normal distribution. The central versions are derived from normal distributions with mean zero; the noncentral versions generalize to arbitrary means, μ ≠ 0.
Example: The standard (= central) χ² distribution is the distribution of a sum of squared independent N(0, 1) RVs. The noncentral χ² distribution generalizes this to N(μ, σ²) RVs.
• There are extended versions of these distributions with two noncentrality parameters: the doubly noncentral beta distribution, the doubly noncentral F-distribution and the doubly noncentral t-distribution.
• There are distributions with two noncentrality parameters. This happens for distributions that are defined as the quotient of two independent distributions.
• When both source distributions are central, the result is a central distribution. When one is noncentral, a (singly) noncentral distribution results, while if both are noncentral, the result is a doubly noncentral distribution.
Example: A t-distribution is defined (ignoring constants) as the ratio of a N(0,1) RV and the square root of an independent χ2 RV. Both RV can have noncentral parameters. This produces a doubly noncentral t-distribution.
The Central Limit Theorem (CLT): Preliminaries
The proof of the CLT is very simple using moment generating functions. We will rely on the following result:
Let m1(t), m2(t), … be a sequence of moment generating functions corresponding to the sequence of distribution functions F1(x), F2(x), …
Let m(t) be a moment generating function corresponding to the distribution function F(x).
Then, if
$$\lim_{i \to \infty} m_i(t) = m(t) \quad \text{for all t in an interval about 0,}$$
then
$$\lim_{i \to \infty} F_i(x) = F(x) \quad \text{for all x.}$$
The Central Limit Theorem (CLT)
Let x1, x2, …, xn be a sequence of independent and identically distributed RVs with finite mean μ and finite variance σ². Then, as n increases, $\bar{x}$, the sample mean, approaches the normal distribution with mean μ and variance σ²/n.
This theorem is sometimes stated as
$$\frac{\sqrt{n}\,(\bar{x} - \mu)}{\sigma} \xrightarrow{d} N(0,1)$$
where $\xrightarrow{d}$ means "the limiting distribution (asymptotic distribution) is" (or convergence in distribution).
Note: This CLT is sometimes referred to as the Lindeberg-Lévy CLT.
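Proof: (use moment generating functions)
Let x1, x2, … be a sequence of independent random variables from a distribution with moment generating function m(t) and CDF F(x).
Let S_n = x1 + x2 + … + xn. Then
$$m_{S_n}(t) = m_{x_1 + x_2 + \cdots + x_n}(t) = m_{x_1}(t)\,m_{x_2}(t) \cdots m_{x_n}(t) = [m(t)]^n$$
Now
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{S_n}{n}$$
or
$$m_{\bar{x}}(t) = m_{\frac{1}{n} S_n}(t) = m_{S_n}\left(\frac{t}{n}\right) = \left[m\left(\frac{t}{n}\right)\right]^n$$
Let
$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{\sqrt{n}}{\sigma}\,\bar{x} - \frac{\sqrt{n}\,\mu}{\sigma}$$
then
$$m_z(t) = e^{-\frac{\sqrt{n}\,\mu}{\sigma} t}\,m_{\bar{x}}\left(\frac{\sqrt{n}\,t}{\sigma}\right) = e^{-\frac{\sqrt{n}\,\mu}{\sigma} t} \left[m\left(\frac{t}{\sigma\sqrt{n}}\right)\right]^n$$
and
$$\ln m_z(t) = -\frac{\sqrt{n}\,\mu}{\sigma}\,t + n \ln m\left(\frac{t}{\sigma\sqrt{n}}\right)$$
Let
$$u = \frac{t}{\sigma\sqrt{n}} \quad \text{or} \quad \sqrt{n} = \frac{t}{\sigma u} \quad \text{and} \quad n = \frac{t^2}{\sigma^2 u^2}$$
Then
$$\ln m_z(t) = -\frac{\mu t^2}{\sigma^2 u} + \frac{t^2}{\sigma^2 u^2} \ln m(u) = \frac{t^2}{\sigma^2} \cdot \frac{\ln m(u) - \mu u}{u^2}$$
Now
$$\lim_{n \to \infty} \ln m_z(t) = \lim_{u \to 0} \ln m_z(t) = \frac{t^2}{\sigma^2} \lim_{u \to 0} \frac{\ln m(u) - \mu u}{u^2}$$
$$= \frac{t^2}{\sigma^2} \lim_{u \to 0} \frac{\dfrac{m'(u)}{m(u)} - \mu}{2u} \quad \text{using L'Hôpital's rule}$$
$$= \frac{t^2}{\sigma^2} \lim_{u \to 0} \frac{\dfrac{m''(u)\,m(u) - [m'(u)]^2}{[m(u)]^2}}{2} \quad \text{using L'Hôpital's rule again}$$
$$= \frac{t^2}{\sigma^2} \cdot \frac{m''(0) - [m'(0)]^2}{2} = \frac{t^2}{2\sigma^2} \left( E\left[x_i^2\right] - \left(E[x_i]\right)^2 \right) = \frac{t^2}{2}$$
Thus
$$\lim_{n \to \infty} \ln m_z(t) = \frac{t^2}{2} \quad \text{and} \quad \lim_{n \to \infty} m_z(t) = e^{\frac{t^2}{2}}$$
Now
$$m(t) = e^{\frac{t^2}{2}}$$
• This is the moment generating function of the standard normal distribution.
• Thus, the limiting distribution of z is the standard normal distribution, i.e.,
$$\lim_{n \to \infty} F_z(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} e^{-\frac{u^2}{2}}\,du$$
Q.E.D.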
CLT: Asymptotic Distribution
The CLT states that the asymptotic distribution of the sample mean is a normal distribution.
Q: What is meant by "asymptotic distribution"?
In general, the "asymptotic distribution of X_n" means a non-constant RV X, along with real-number sequences a_n and b_n, such that a_n(X_n − b_n) has as limiting distribution the distribution of X.
In this case, the distribution of X might be referred to as the asymptotic or limiting distribution of either X_n or of a_n(X_n − b_n), depending on the context.
The real number sequence a_n plays the role of a stabilizing transformation to make sure the transformed RV, i.e., a_n(X_n − b_n), does not have zero variance as n → ∞.
Example: $\bar{x}$ has zero variance as n → ∞. But $n^{1/2}\,\bar{x}$ has a finite variance, σ².
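The CLT and the role of the stabilizing sequence can be illustrated by simulation. The sketch below draws samples from a Uniform(0, 1) population (an arbitrary non-normal choice made here, with μ = 1/2 and σ² = 1/12), forms $z = \sqrt{n}(\bar{x} - \mu)/\sigma$, and compares the share of z values at or below a point with the standard normal CDF. Function names, sample size, number of replications, and seed are all assumptions for illustration.

```python
import math
import random
import statistics

def normal_cdf(x):
    """Standard normal CDF written with the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def standardized_means(n, n_reps=20_000, seed=0):
    """Simulate z = sqrt(n) (xbar - mu) / sigma for Uniform(0, 1) samples."""
    rng = random.Random(seed)
    mu, sigma = 0.5, math.sqrt(1 / 12)
    zs = []
    for _ in range(n_reps):
        xbar = statistics.fmean(rng.random() for _ in range(n))
        zs.append(math.sqrt(n) * (xbar - mu) / sigma)
    return zs

def empirical_cdf(values, x):
    """Share of values at or below x."""
    return sum(v <= x for v in values) / len(values)
```

The variance of the simulated z values stays close to 1 as n grows, while the variance of the unscaled sample mean shrinks like σ²/n.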
CLT: Remarks
• The CLT gives only an asymptotic distribution. As an approximation for a finite number of observations, it is reasonable only close to the mean; a very large number of observations is required before the approximation extends into the tails.
• The CLT also applies in the case of sequences that are not identically distributed. Extra conditions need to be imposed on the RVs.
• Lindeberg found a condition on the sequence X_n which guarantees that S_n is asymptotically normally distributed. W. Feller showed that Lindeberg's condition is necessary as well (if the condition does not hold, then the sum S_n is not asymptotically normally distributed).
• A sufficient condition that is stronger (but easier to state) than Lindeberg's condition is that there exists a constant A such that |X_n| < A for all n.
Jarl W. Lindeberg (1876 – 1932) Vilibald S. (Willy) Feller (1906-1970)
Paul Pierre Lévy (1886–1971)