Random Samples
$X_1, \dots, X_n$ — i.i.d. variables (independent, identically distributed): a random sample of observations independently selected from the same population, or resulting from analogous statistical experiments.

Definition. A statistic is any function of observations from a random sample that does not involve population parameters. Distributions of statistics are called sampling distributions.

Examples: $\bar{X} = \frac{1}{n}(X_1 + \dots + X_n)$, $\quad S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$, $\quad \min(X_1, \dots, X_n)$.

Statistics are random variables.

Theorem. Let $X_1, \dots, X_n$ be a random sample from a distribution with $E(X_i) = \mu$ and $\mathrm{Var}(X_i) = \sigma^2$. Then
$E(\bar{X}) = \mu$, $\quad \mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n}$, $\quad$ and $\quad E(X_1 + \dots + X_n) = n\mu$, $\quad \mathrm{Var}(X_1 + \dots + X_n) = n\sigma^2$.
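The theorem can be checked numerically. The following is a small Monte Carlo sketch, not from the slides; the choices $\mu = 2$, $\sigma = 3$, $n = 10$ are arbitrary.

```python
import random

# Monte Carlo check (not from the slides) of E(Xbar) = mu and
# Var(Xbar) = sigma^2 / n for samples from N(mu, sigma^2).
random.seed(1)
mu, sigma, n, reps = 2.0, 3.0, 10, 20000

xbars = []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbars.append(sum(sample) / n)

mean_xbar = sum(xbars) / reps
var_xbar = sum((x - mean_xbar) ** 2 for x in xbars) / (reps - 1)
print(round(mean_xbar, 2), round(var_xbar, 2))  # near mu = 2 and sigma^2/n = 0.9
```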
Distributions of selected statistics in random samples from normal populations
Theorem. Let $X_1, \dots, X_n$ be a random sample from a $N(\mu, \sigma^2)$ distribution. The statistics $\bar{X}$ and $U = \frac{(n-1)S^2}{\sigma^2} = \frac{1}{\sigma^2}\sum_i (X_i - \bar{X})^2$ have distributions $N\!\left(\mu, \frac{\sigma^2}{n}\right)$ and $\chi^2_{n-1}$, respectively.

Theorem. If $X_1, \dots, X_n$ is a random sample from a $N(\mu, \sigma^2)$ distribution, then the variables $\bar{X}$ and $U$ are independent. Conversely, if the variables $\bar{X}$ and $U$ are independent, then the sample $X_1, \dots, X_n$ is selected from a normally distributed population.
Definition. If independent variables $Z$ and $U$ have distributions $N(0,1)$ and $\chi^2_\nu$, respectively, then the variable $X = \frac{Z}{\sqrt{U/\nu}}$ has Student's $t$ distribution with $\nu$ degrees of freedom. The density of $X$ is
$f(x; \nu) = \frac{\Gamma((\nu+1)/2)}{\Gamma(\nu/2)\sqrt{\nu\pi}}\left(1 + \frac{x^2}{\nu}\right)^{-(\nu+1)/2}.$

Theorem. Student's $t_n$ distribution approaches $N(0,1)$ as the number $n$ of degrees of freedom increases.

Definition. If independent variables $U$ and $V$ have distributions $\chi^2_{\nu_1}$ and $\chi^2_{\nu_2}$, respectively, then $X = \frac{U/\nu_1}{V/\nu_2}$ has the $F$ distribution with $\nu_1$ and $\nu_2$ degrees of freedom, $F(\nu_1, \nu_2)$.
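The definition of the $t$ distribution can be illustrated directly by simulation; this sketch (not from the slides) builds $t_\nu$ from $Z/\sqrt{U/\nu}$, realizing the $\chi^2_\nu$ variable as a sum of $\nu$ squared standard normals.

```python
import math
import random

# Illustration (not from the slides): Student's t_nu built from its
# definition X = Z / sqrt(U/nu), Z ~ N(0,1), U ~ chi^2_nu (a sum of nu
# squared independent N(0,1) variables).
random.seed(2)
nu, reps = 5, 50000

ts = []
for _ in range(reps):
    z = random.gauss(0.0, 1.0)
    u = sum(random.gauss(0.0, 1.0) ** 2 for _ in range(nu))
    ts.append(z / math.sqrt(u / nu))

m = sum(ts) / reps
v = sum((t - m) ** 2 for t in ts) / reps
print(round(m, 2), round(v, 2))  # mean near 0, variance near nu/(nu-2) = 5/3
```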
Problem 10.2.2, p. 318. $X \sim F(\nu_1, \nu_2)$. Since $P(X < x) = \alpha = P\!\left(\frac{1}{X} > \frac{1}{x}\right)$ and $\frac{1}{X} \sim F(\nu_2, \nu_1)$, we get $P\!\left(\frac{1}{X} < \frac{1}{x}\right) = 1 - \alpha$.

Problem 10.2.3, p. 318. $X \sim N(0,1)$, $Y \sim N(1,1)$, $W \sim N(2,4)$.
$P\!\left(\frac{X^2 + (Y-1)^2}{X^2 + (Y-1)^2 + (W-2)^2/4} > k\right) = P\!\left(\frac{1}{1 + a} > k\right), \quad \text{where } a = \frac{(W-2)^2/4}{X^2 + (Y-1)^2}.$
Since $\frac{1}{1+a} > k$ is equivalent to $1 > k + ak$, i.e., to $a < \frac{1-k}{k}$, and $2a \sim F(1,2)$ (a $\chi^2_1$ variable over an independent $\chi^2_2$ variable, each divided by its degrees of freedom), the probability can be computed from the $F(1,2)$ distribution.

Order statistics. For the minimum $X_{1:n} = \min(X_1, \dots, X_n)$, the cdf is
$G_1(t) = P(X_{1:n} \le t) = 1 - P(X_1 > t, \dots, X_n > t) = 1 - [P(X_i > t)]^n = 1 - [1 - F(t)]^n.$
$g_1(t) = \frac{d}{dt}G_1(t) = -n[1 - F(t)]^{n-1}\cdot(-1)\frac{d}{dt}F(t) = n f(t)[1 - F(t)]^{n-1}.$

More generally, $G_k(t) = \sum_{r=k}^{n}\binom{n}{r}[F(t)]^r[1 - F(t)]^{n-r}$; for $k = n$ this reduces to
$G_n(t) = \binom{n}{n}[F(t)]^n[1 - F(t)]^0 = [F(t)]^n.$

Also directly, $G_n(t) = P(X_{n:n} \le t) = P(X_1 \le t, \dots, X_n \le t) = [P(X_i \le t)]^n = [F(t)]^n$. Finally, $g_n(t) = n f(t)[F(t)]^{n-1}$, and in general
$g_k(t) = k\binom{n}{k} f(t)[F(t)]^{k-1}[1 - F(t)]^{n-k}.$
Example. $X_i \sim \mathrm{EXP}(1)$: $f(x) = e^{-x}$ for $x > 0$ and $0$ otherwise, $F(t) = 1 - e^{-t}$ for $t > 0$ and $0$ otherwise. Then
$G_1(t) = 1 - [1 - (1 - e^{-t})]^n = 1 - e^{-nt}, \quad g_1(t) = n e^{-nt}, \quad G_n(t) = (1 - e^{-t})^n, \quad \text{and} \quad g_n(t) = n e^{-t}(1 - e^{-t})^{n-1}.$
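The cdf $G_1(t) = 1 - e^{-nt}$ says the minimum of $n$ EXP(1) variables is EXP($n$), so $E(X_{1:n}) = 1/n$; this simulation sketch (not from the slides) checks that for $n = 5$.

```python
import random

# Simulation check (not from the slides): for X_i ~ EXP(1), the result
# G_1(t) = 1 - e^{-nt} means X_{1:n} ~ EXP(n), so E(X_{1:n}) = 1/n.
random.seed(3)
n, reps = 5, 50000
mins = [min(random.expovariate(1.0) for _ in range(n)) for _ in range(reps)]
mean_min = sum(mins) / reps
print(round(mean_min, 3))  # near 1/n = 0.2
```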
Example. Let $X_1, \dots, X_n$ be a random sample selected from a $U[0, \theta]$ distribution. Find:
(i) the densities $g_1(t)$ and $g_n(t)$ of $X_{1:n}$ and $X_{n:n}$, respectively;
(ii) the cdfs $G_1(t)$ and $G_n(t)$;
(iii) the means and variances;
(iv) the covariance of $X_{1:n}$ and $X_{n:n}$.
(i) $X_i \sim U[0, \theta]$: $f(x) = \frac{1}{\theta}$ for $0 < x < \theta$ and $0$ otherwise, and
$F(x) = 0$ for $x < 0$, $\quad F(x) = \frac{x}{\theta}$ for $0 \le x < \theta$, $\quad F(x) = 1$ for $x \ge \theta.$
$g_1(t) = \frac{n}{\theta}\left(1 - \frac{t}{\theta}\right)^{n-1}, \qquad g_n(t) = \frac{n}{\theta}\left(\frac{t}{\theta}\right)^{n-1}.$
(ii) $G_1(t) = 1 - \left[1 - \frac{t}{\theta}\right]^n, \qquad G_n(t) = \left(\frac{t}{\theta}\right)^n.$
(iii) $E(X_{1:n}) = \int_0^\theta t\,\frac{n}{\theta}\left(1 - \frac{t}{\theta}\right)^{n-1} dt$. Substituting $\frac{t}{\theta} = w$, so that $dt = \theta\, dw$,
$= \int_0^1 \theta w\, n(1 - w)^{n-1}\, dw = n\theta \int_0^1 w^{2-1}(1 - w)^{n-1}\, dw \quad [\text{kernel of } \mathrm{BETA}(2, n)]$
$= n\theta\,\frac{\Gamma(2)\Gamma(n)}{\Gamma(n+2)}\underbrace{\int_0^1 \frac{\Gamma(n+2)}{\Gamma(2)\Gamma(n)}\, w^{2-1}(1 - w)^{n-1}\, dw}_{=1} = n\theta\,\frac{(n-1)!}{(n+1)!} = \frac{\theta}{n+1}.$

$E(X_{n:n}) = \int_0^\theta t\,\frac{n}{\theta}\left(\frac{t}{\theta}\right)^{n-1} dt = \frac{n}{\theta^n}\int_0^\theta t^n\, dt = \frac{n}{\theta^n}\cdot\frac{t^{n+1}}{n+1}\Big|_0^\theta = \frac{n\theta^{n+1}}{(n+1)\theta^n} = \frac{n\theta}{n+1}.$
$E(X_{1:n}^2) = \int_0^\theta t^2 g_1(t)\, dt = \int_0^\theta t^2\,\frac{n}{\theta}\left(1 - \frac{t}{\theta}\right)^{n-1} dt.$ Substituting $1 - \frac{t}{\theta} = u$, $t = \theta(1-u)$, $dt = -\theta\, du$,
$= \int_0^1 \theta^2(1-u)^2\, n u^{n-1}\, du = n\theta^2 \int_0^1 u^{n-1}(1-u)^2\, du \quad [\text{kernel of } \mathrm{BETA}(n, 3)]$
$= n\theta^2\,\frac{\Gamma(n)\Gamma(3)}{\Gamma(n+3)}\underbrace{\int_0^1 \frac{\Gamma(n+3)}{\Gamma(n)\Gamma(3)}\, u^{n-1}(1-u)^2\, du}_{=1} = n\theta^2\,\frac{(n-1)!\cdot 2}{(n+2)!} = \frac{2\theta^2\, n!}{(n+2)!} = \frac{2\theta^2}{(n+1)(n+2)}.$
Finally, $\mathrm{Var}(X_{1:n}) = E(X_{1:n}^2) - (E(X_{1:n}))^2 = \frac{2\theta^2}{(n+1)(n+2)} - \left(\frac{\theta}{n+1}\right)^2$
$= \theta^2\left(\frac{2}{(n+1)(n+2)} - \frac{1}{(n+1)^2}\right) = \frac{\theta^2}{(n+1)^2(n+2)}\,[2(n+1) - (n+2)] = \frac{n\theta^2}{(n+1)^2(n+2)}.$

$\mathrm{Var}(X_{n:n})$:
$E(X_{n:n}^2) = \int_0^\theta t^2 g_n(t)\, dt = \int_0^\theta t^2\,\frac{n}{\theta}\left(\frac{t}{\theta}\right)^{n-1} dt = \frac{n}{\theta^n}\int_0^\theta t^{n+1}\, dt = \frac{n}{\theta^n}\cdot\frac{t^{n+2}}{n+2}\Big|_0^\theta = \frac{n\theta^2}{n+2}.$

Now, $\mathrm{Var}(X_{n:n}) = E(X_{n:n}^2) - (E(X_{n:n}))^2 = \frac{n\theta^2}{n+2} - \left(\frac{n\theta}{n+1}\right)^2 = n\theta^2\left(\frac{1}{n+2} - \frac{n}{(n+1)^2}\right)$
$= \frac{n\theta^2}{(n+2)(n+1)^2}\underbrace{[(n+1)^2 - n(n+2)]}_{=1} = \frac{n\theta^2}{(n+2)(n+1)^2}.$ Notice that $\mathrm{Var}(X_{1:n}) = \mathrm{Var}(X_{n:n})$.
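The $U[0, \theta]$ formulas above are easy to check by simulation; this sketch (not from the slides) uses the arbitrary choices $\theta = 2$, $n = 4$.

```python
import random

# Monte Carlo check of E(X_{1:n}) = theta/(n+1), E(X_{n:n}) = n*theta/(n+1),
# and the common variance n*theta^2/((n+1)^2 (n+2)) for U[0, theta] samples.
random.seed(4)
theta, n, reps = 2.0, 4, 100000

lo, hi = [], []
for _ in range(reps):
    s = [random.uniform(0.0, theta) for _ in range(n)]
    lo.append(min(s))
    hi.append(max(s))

m_lo = sum(lo) / reps
m_hi = sum(hi) / reps
v_lo = sum((x - m_lo) ** 2 for x in lo) / reps
print(round(m_lo, 2), round(m_hi, 2), round(v_lo, 3))
# theory: theta/5 = 0.4, 4*theta/5 = 1.6, and 4*theta^2/(25*6) ~ 0.107
```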
(iv) The joint density of the $k$th and $l$th order statistics ($l > k$) is given by the formula
$g_{k,l}(s,t) = \frac{n!}{(k-1)!(l-k-1)!(n-l)!}\,[F(s)]^{k-1}[F(t) - F(s)]^{l-k-1}[1 - F(t)]^{n-l}\, f(s) f(t)$
for $s \le t$, and $0$ otherwise. For $k = 1$ and $l = n$:
$g_{1,n}(s,t) = \frac{n!}{0!\,(n-2)!\,0!}\,[F(s)]^0[F(t) - F(s)]^{n-2}[1 - F(t)]^0\, f(s) f(t) = n(n-1)[F(t) - F(s)]^{n-2} f(s) f(t).$

For $X_1, \dots, X_n$ selected from the $U[0, \theta]$ distribution,
$g_{1,n}(s,t) = n(n-1)\left[\frac{t}{\theta} - \frac{s}{\theta}\right]^{n-2}\frac{1}{\theta}\cdot\frac{1}{\theta} = \frac{n(n-1)}{\theta^n}(t - s)^{n-2}, \quad 0 < s < t < \theta.$
$E(X_{1:n}X_{n:n}) = \int_0^\theta\!\int_0^t st\, g_{1,n}(s,t)\, ds\, dt = \int_0^\theta\!\int_0^t st\,\frac{n(n-1)}{\theta^n}(t-s)^{n-2}\, ds\, dt$
$= \int_0^\theta t\,\frac{n(n-1)}{\theta^n}\left(\int_0^t [t - (t-s)](t-s)^{n-2}\, ds\right) dt.$
The inner integral is
$\int_0^t \left[t(t-s)^{n-2} - (t-s)^{n-1}\right] ds = \left(-\frac{t(t-s)^{n-1}}{n-1} + \frac{(t-s)^n}{n}\right)\Big|_0^t = \frac{t^n}{n-1} - \frac{t^n}{n} = t^n\left(\frac{1}{n-1} - \frac{1}{n}\right) = \frac{t^n}{n(n-1)}.$
Now
$E(X_{1:n}X_{n:n}) = \int_0^\theta t\,\frac{n(n-1)}{\theta^n}\cdot\frac{t^n}{n(n-1)}\, dt = \frac{1}{\theta^n}\int_0^\theta t^{n+1}\, dt = \frac{1}{\theta^n}\cdot\frac{t^{n+2}}{n+2}\Big|_0^\theta = \frac{\theta^2}{n+2},$
and finally
$\mathrm{Cov}(X_{1:n}, X_{n:n}) = \frac{\theta^2}{n+2} - \frac{\theta}{n+1}\cdot\frac{n\theta}{n+1} = \frac{\theta^2}{(n+1)^2(n+2)}\underbrace{[(n+1)^2 - n(n+2)]}_{=1} = \frac{\theta^2}{(n+1)^2(n+2)}.$
Distribution of the sample range $R = X_{n:n} - X_{1:n}$. Write $R = X_{n:n} - X_{1:n} = T - S$ and let the companion variable be $W = T$. The solutions are $t = w$ and $s = t - r = w - r$. Since
$J = \begin{vmatrix} \frac{\partial s}{\partial r} & \frac{\partial s}{\partial w} \\[2pt] \frac{\partial t}{\partial r} & \frac{\partial t}{\partial w} \end{vmatrix} = \begin{vmatrix} -1 & 1 \\ 0 & 1 \end{vmatrix} = -1, \quad |J| = 1.$
Now $g(w, r) = g_{1,n}(s(w,r), t(w,r)) = n(n-1)[F(w) - F(w-r)]^{n-2} f(w-r) f(w)$, and the density of $R$ can be obtained as $h_R(r) = \int g(w, r)\, dw.$
Example. $X_1, \dots, X_n \sim \mathrm{EXP}(1)$: $f(x) = e^{-x}$ for $x > 0$ and $0$ otherwise, $F(x) = 1 - e^{-x}$ for $x > 0$ and $0$ otherwise. Then, for $r > 0$,
$h_R(r) = \int_r^\infty n(n-1)\left[e^{-w+r} - e^{-w}\right]^{n-2} e^{-(w-r)} e^{-w}\, dw = e^r(e^r - 1)^{n-2}\, n(n-1)\int_r^\infty e^{-w(n-2) - 2w}\, dw$
$= e^r(e^r - 1)^{n-2}\, n(n-1)\int_r^\infty e^{-nw}\, dw = e^r(e^r - 1)^{n-2}\, n(n-1)\cdot\frac{1}{n}\, e^{-nr},$
and finally
$h_R(r) = (n-1)\, e^r(e^r - 1)^{n-2} e^{-nr} = (n-1)\, e^{-r}\left[e^{-r}(e^r - 1)\right]^{n-2} = (n-1)\, e^{-r}(1 - e^{-r})^{n-2}, \quad r > 0.$
Example. Let $X_1, \dots, X_n$ be a random sample selected from a $U[0, \theta]$ distribution. Determine the sample size $n$ needed for the expected sample range $E(X_{n:n} - X_{1:n})$ to be at least $0.75\,\theta$.
$E(R) = E(X_{n:n} - X_{1:n}) = \frac{n\theta}{n+1} - \frac{\theta}{n+1} = \frac{(n-1)\theta}{n+1}.$ If now $\frac{n-1}{n+1} \ge 0.75$, then $4(n-1) \ge 3(n+1)$ and $n \ge 7$.

$\mathrm{Var}(R) = \mathrm{Var}(X_{n:n} - X_{1:n}) = \mathrm{Var}(X_{n:n}) + \mathrm{Var}(X_{1:n}) - 2\,\mathrm{Cov}(X_{1:n}, X_{n:n}) = \frac{2n\theta^2}{(n+1)^2(n+2)} - \frac{2\theta^2}{(n+1)^2(n+2)} = \frac{2\theta^2(n-1)}{(n+1)^2(n+2)}.$
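Both range formulas can be checked by simulation; this sketch (not from the slides) uses $n = 7$, the sample size found above, with $\theta = 1$.

```python
import random

# Simulation check of E(R) = (n-1)theta/(n+1) and
# Var(R) = 2(n-1)theta^2/((n+1)^2 (n+2)) for the U[0, theta] sample range.
random.seed(5)
theta, n, reps = 1.0, 7, 100000

ranges = []
for _ in range(reps):
    s = [random.uniform(0.0, theta) for _ in range(n)]
    ranges.append(max(s) - min(s))

m = sum(ranges) / reps
v = sum((r - m) ** 2 for r in ranges) / reps
print(round(m, 3), round(v, 4))  # theory: 6/8 = 0.75 and 12/576 ~ 0.0208
```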
Determine $\mathrm{Var}(X_{l:n} - X_{k:n})$, $l > k$, in a sample from $U[0, \theta]$.
$\mathrm{Var}(X_{l:n} - X_{k:n}) = \mathrm{Var}(X_{k:n}) + \mathrm{Var}(X_{l:n}) - 2\,\mathrm{Cov}(X_{k:n}, X_{l:n}).$
Needed: (i) $E(X_{k:n})$, (ii) $E(X_{k:n}^2)$, (iii) $\mathrm{Var}(X_{k:n})$, and (iv) $\mathrm{Cov}(X_{k:n}, X_{l:n})$.

$g_k(t) = \frac{n!}{(k-1)!(n-k)!}\,[F(t)]^{k-1}[1 - F(t)]^{n-k} f(t) = \frac{n!}{(k-1)!(n-k)!}\left(\frac{t}{\theta}\right)^{k-1}\left[1 - \frac{t}{\theta}\right]^{n-k}\frac{1}{\theta}.$
(i) $E(X_{k:n}) = \int_0^\theta t\, g_k(t)\, dt = \int_0^\theta t\,\frac{n!}{(k-1)!(n-k)!}\left(\frac{t}{\theta}\right)^{k-1}\left[1 - \frac{t}{\theta}\right]^{n-k}\frac{1}{\theta}\, dt.$ With $w = t/\theta$,
$= \frac{n!}{(k-1)!(n-k)!}\int_0^1 \theta w\cdot w^{k-1}(1-w)^{n-k}\, dw = \frac{n!\,\theta}{(k-1)!(n-k)!}\underbrace{\int_0^1 w^k(1-w)^{n-k}\, dw}_{\text{kernel of } \mathrm{BETA}(k+1,\, n-k+1)}$
$= \frac{\Gamma(k+1)\Gamma(n-k+1)}{\Gamma(n+2)}\cdot\frac{n!\,\theta}{(k-1)!(n-k)!} = \frac{k!\,(n-k)!}{(n+1)!}\cdot\frac{n!}{(k-1)!(n-k)!}\,\theta = \frac{k\theta}{n+1}, \quad \text{and} \quad E(X_{l:n}) = \frac{l\theta}{n+1}.$

(ii) Similarly,
$E(X_{k:n}^2) = \int_0^\theta t^2 g_k(t)\, dt = \int_0^\theta t^2\,\frac{n!}{(k-1)!(n-k)!}\left(\frac{t}{\theta}\right)^{k-1}\left[1 - \frac{t}{\theta}\right]^{n-k}\frac{1}{\theta}\, dt$
$= \frac{\theta^2\, n!}{(k-1)!(n-k)!}\underbrace{\int_0^1 w^{k+1}(1-w)^{n-k}\, dw}_{\text{kernel of } \mathrm{BETA}(k+2,\, n-k+1)} = \frac{\Gamma(k+2)\Gamma(n-k+1)}{\Gamma(n+3)}\cdot\frac{\theta^2\, n!}{(k-1)!(n-k)!} = \frac{\theta^2\, k(k+1)}{(n+1)(n+2)}.$
(iii) $\mathrm{Var}(X_{k:n}) = E(X_{k:n}^2) - [E(X_{k:n})]^2 = \frac{\theta^2\, k(k+1)}{(n+1)(n+2)} - \left(\frac{k\theta}{n+1}\right)^2 = \theta^2\,\frac{k(k+1)(n+1) - k^2(n+2)}{(n+1)^2(n+2)} = \theta^2\,\frac{k(n+1-k)}{(n+1)^2(n+2)}.$
(iv) $g_{k,l}(s,t) = \frac{n!}{(k-1)!(l-k-1)!(n-l)!}\,[F(s)]^{k-1}[F(t) - F(s)]^{l-k-1}[1 - F(t)]^{n-l} f(s) f(t).$ Now
$E(X_{k:n}X_{l:n}) = \int_0^\theta\!\int_0^t st\, g_{k,l}(s,t)\, ds\, dt = \frac{n!}{(k-1)!(l-k-1)!(n-l)!}\int_0^\theta\!\int_0^t st\left(\frac{s}{\theta}\right)^{k-1}\left[\frac{t-s}{\theta}\right]^{l-k-1}\left[1 - \frac{t}{\theta}\right]^{n-l}\frac{1}{\theta^2}\, ds\, dt = \;???$

Easier way: let $Y_1, \dots, Y_n$ be a random sample from $U[0,1]$. Then $X_i = \theta Y_i$, $X_{i:n} = \theta Y_{i:n}$, $E(X_{i:n}) = \theta\, E(Y_{i:n})$, and so on. Also $E(X_{k:n}X_{l:n}) = \theta^2 E(Y_{k:n}Y_{l:n}).$
For $\gamma = \frac{n!}{(k-1)!(l-k-1)!(n-l)!}$,
$E(Y_{k:n}Y_{l:n}) = \int_0^1\!\int_0^t st\, g_{k,l}(s,t)\, ds\, dt = \gamma\int_0^1\!\int_0^t st\cdot s^{k-1}[t-s]^{l-k-1}[1-t]^{n-l}\, ds\, dt$
$= \gamma\int_0^1\!\int_0^t [1 - (1-t)]\, s^k[t-s]^{l-k-1}[1-t]^{n-l}\, ds\, dt$
$= \gamma\int_0^1\!\int_0^t s^k[t-s]^{l-k-1}[1-t]^{n-l}\, ds\, dt \;-\; \gamma\int_0^1\!\int_0^t s^k[t-s]^{l-k-1}[1-t]^{n-l+1}\, ds\, dt = A - B,$
and since $\gamma_1 = \frac{(n+1)!}{k!(l-k-1)!(n-l)!}$,
$A = \frac{\gamma}{\gamma_1}\underbrace{\int_0^1\!\int_0^t \gamma_1\, s^k[t-s]^{l-k-1}[1-t]^{n-l}\, ds\, dt}_{\text{integral of } g_{k+1,l+1}(s,t) \text{ for sample size } n+1, \text{ equal to } 1} = \gamma\,\frac{k!(l-k-1)!(n-l)!}{(n+1)!}.$
Similarly, since $\gamma_2 = \frac{(n+2)!}{k!(l-k-1)!(n-l+1)!}$,
$B = \frac{\gamma}{\gamma_2}\underbrace{\int_0^1\!\int_0^t \gamma_2\, s^k[t-s]^{l-k-1}[1-t]^{n-l+1}\, ds\, dt}_{\text{integral of } g_{k+1,l+1}(s,t) \text{ for sample size } n+2, \text{ equal to } 1} = \gamma\,\frac{k!(l-k-1)!(n-l+1)!}{(n+2)!}.$

Next,
$E(Y_{k:n}Y_{l:n}) = A - B = \gamma\left(\frac{k!(l-k-1)!(n-l)!}{(n+1)!} - \frac{k!(l-k-1)!(n-l+1)!}{(n+2)!}\right)$
$= \frac{n!}{(k-1)!(l-k-1)!(n-l)!}\cdot\frac{k!(l-k-1)!(n-l)!}{(n+2)!}\,[(n+2) - (n-l+1)] = \frac{k(l+1)}{(n+1)(n+2)}.$

Consequently, $E(X_{k:n}X_{l:n}) = \frac{\theta^2\, k(l+1)}{(n+1)(n+2)}$, and
$\mathrm{Cov}(X_{k:n}, X_{l:n}) = E(X_{k:n}X_{l:n}) - E(X_{k:n})E(X_{l:n}) = \frac{\theta^2\, k(l+1)}{(n+1)(n+2)} - \frac{k\theta}{n+1}\cdot\frac{l\theta}{n+1} = \theta^2\,\frac{k(l+1)(n+1) - kl(n+2)}{(n+1)^2(n+2)} = \theta^2\,\frac{k(n+1-l)}{(n+1)^2(n+2)}.$
Finally,
$\mathrm{Var}(X_{l:n} - X_{k:n}) = \mathrm{Var}(X_{k:n}) + \mathrm{Var}(X_{l:n}) - 2\,\mathrm{Cov}(X_{k:n}, X_{l:n})$
$= \theta^2\,\frac{k(n+1-k)}{(n+1)^2(n+2)} + \theta^2\,\frac{l(n+1-l)}{(n+1)^2(n+2)} - 2\theta^2\,\frac{k(n+1-l)}{(n+1)^2(n+2)} = \theta^2\,\frac{(l-k)(n+1-l+k)}{(n+1)^2(n+2)}.$
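The closed form for $\mathrm{Var}(X_{l:n} - X_{k:n})$ can be verified by simulation; this sketch (not from the slides) uses the arbitrary choices $\theta = 3$, $n = 6$, $k = 2$, $l = 5$.

```python
import random

# Monte Carlo check of Var(X_{l:n} - X_{k:n}) =
# theta^2 (l-k)(n+1-l+k) / ((n+1)^2 (n+2)) for a U[0, theta] sample.
random.seed(6)
theta, n, k, l, reps = 3.0, 6, 2, 5, 100000

diffs = []
for _ in range(reps):
    s = sorted(random.uniform(0.0, theta) for _ in range(n))
    diffs.append(s[l - 1] - s[k - 1])

m = sum(diffs) / reps
v = sum((d - m) ** 2 for d in diffs) / reps
theory = theta ** 2 * (l - k) * (n + 1 - l + k) / ((n + 1) ** 2 * (n + 2))
print(round(v, 3), round(theory, 3))  # the two values should agree
```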
Joint distribution of $X_{1:n}, \dots, X_{n:n}$: the density of the joint distribution is $g(y_1, \dots, y_n) = n!\, f(y_1)\cdots f(y_n)$ for $y_1 < \dots < y_n$ and $0$ otherwise.
HW 10.3.2 p. 325
Problem 10.3.6, p. 325. The density of $X_i \sim \mathrm{BETA}(2,1)$ is $f(x) = \frac{\Gamma(3)}{\Gamma(2)\Gamma(1)}\, x^{2-1}(1-x)^{1-1} = 2x$ for $0 < x < 1$ and $0$ otherwise. The joint density (for $0 < y_1 < y_2 < y_3 < y_4 < y_5 < 1$) is
$g(y_1, y_2, y_3, y_4, y_5) = 5!\cdot 2y_1\cdot 2y_2\cdot 2y_3\cdot 2y_4\cdot 2y_5 = 5!\, 2^5\, y_1 y_2 y_3 y_4 y_5.$

(i) $g_{1,2,4}(y_1, y_2, y_4) = 5!\, 2^5 \int_{y_4}^1\!\int_{y_2}^{y_4} y_1 y_2 y_3 y_4 y_5\, dy_3\, dy_5.$ The inner integral is
$\int_{y_2}^{y_4} y_1 y_2 y_3 y_4 y_5\, dy_3 = y_1 y_2 y_4 y_5 \underbrace{\int_{y_2}^{y_4} y_3\, dy_3}_{0.5(y_4^2 - y_2^2)} = 0.5\, y_1 y_2 y_4 y_5 (y_4^2 - y_2^2).$
Now
$g_{1,2,4}(y_1, y_2, y_4) = 5!\, 2^4 \int_{y_4}^1 y_1 y_2 y_4 y_5 (y_4^2 - y_2^2)\, dy_5 = 5!\, 2^4\, y_1 y_2 y_4 (y_4^2 - y_2^2)\underbrace{\int_{y_4}^1 y_5\, dy_5}_{0.5(1 - y_4^2)} = 5!\, 2^3\, y_1 y_2 y_4 (y_4^2 - y_2^2)(1 - y_4^2).$
(ii) $E(X_{2:5} \mid X_{4:5} = t) = E(S \mid T = t) = \int_0^t s\, f(s \mid t)\, ds = \int_0^t s\,\frac{f(s,t)}{f_T(t)}\, ds.$ Since $f(x) = 2x$ and $F(x) = x^2$, the joint density is ($n = 5$, $k = 2$, $l = 4$)
$f_{2,4}(s,t) = \frac{5!}{(2-1)!(4-2-1)!(5-4)!}\,(s^2)^1(t^2 - s^2)^{4-2-1}(1 - t^2)^1\cdot 2s\cdot 2t = 4\cdot 5!\, s^3 t (t^2 - s^2)(1 - t^2) \quad \text{for } 0 < s < t < 1,$
and $f_4(t) = \frac{5!}{(4-1)!(5-4)!}\,(t^2)^{4-1}(1 - t^2)^{5-4}\cdot 2t = 40\, t^7(1 - t^2).$

Now, $f(s \mid t) = \frac{4\cdot 5!\, s^3 t (t^2 - s^2)(1 - t^2)}{40\, t^7(1 - t^2)} = \frac{12\, s^3(t^2 - s^2)}{t^6}$, and
$E(X_{2:5} \mid X_{4:5} = t) = \int_0^t s\,\frac{12\, s^3(t^2 - s^2)}{t^6}\, ds = \int_0^t \frac{12\, s^4}{t^4}\, ds - \int_0^t \frac{12\, s^6}{t^6}\, ds = \frac{12}{5}\frac{s^5}{t^4}\Big|_0^t - \frac{12}{7}\frac{s^7}{t^6}\Big|_0^t = 12\left(\frac{1}{5} - \frac{1}{7}\right) t = \frac{24}{35}\, t.$

HW 10.3.6, p. 325: find $E(X_{3:5} \mid X_{4:5})$.
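The conditional mean $\frac{24}{35}t$ can be checked with a crude conditional Monte Carlo (not from the slides): since $F(x) = x^2$, one BETA(2,1) observation is $\sqrt{U}$ with $U \sim U[0,1]$, and the conditioning is approximated by keeping samples whose fourth order statistic lands in a narrow bin around $t_0 = 0.8$.

```python
import math
import random

# Crude conditional Monte Carlo check of E(X_{2:5} | X_{4:5} = t) = 24t/35
# for X_i ~ BETA(2,1); conditioning is approximated by a narrow bin.
random.seed(7)
t0, half = 0.8, 0.01

kept = []
for _ in range(200000):
    s = sorted(math.sqrt(random.random()) for _ in range(5))
    if abs(s[3] - t0) < half:
        kept.append(s[1])

approx = sum(kept) / len(kept)
print(round(approx, 3))  # theory: 24 * 0.8 / 35 ~ 0.549
```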
(iii) $Y = \frac{X_{2:5}}{X_{1:5}} = \frac{T}{S}$. Let $W = S$ be a companion variable, so that $t = yw$ and $s = w$. Since $0 < s < t < 1$, we have $0 < w < yw < 1$, which means $w > 0$, $y > 1$, and $w < \frac{1}{y}$.
$J = \begin{vmatrix} \frac{\partial s}{\partial w} & \frac{\partial s}{\partial y} \\[2pt] \frac{\partial t}{\partial w} & \frac{\partial t}{\partial y} \end{vmatrix} = \begin{vmatrix} 1 & 0 \\ y & w \end{vmatrix} = w, \quad |J| = w.$
$f_{1,2}(s,t) = \frac{5!}{(1-1)!(2-1-1)!(5-2)!}\,[F(s)]^{1-1}[F(t) - F(s)]^{2-1-1}[1 - F(t)]^{5-2} f(s) f(t) = 20(1 - t^2)^3\cdot 2s\cdot 2t = 80\, st(1 - t^2)^3.$
Now $g(w, y) = 80\, w(wy)(1 - y^2 w^2)^3\, w = 80\, w^3 y (1 - y^2 w^2)^3$, and
$g_Y(y) = \int_0^{1/y} g(w, y)\, dw = 80\, y \int_0^{1/y} w^3(1 - y^2 w^2)^3\, dw.$
Let $1 - y^2 w^2 = z$. Then $w^2 = \frac{1-z}{y^2}$, $-2wy^2\, dw = dz$, and
$\int_0^{1/y} w^3(1 - y^2 w^2)^3\, dw = \int_1^0 w^3 z^3\left(-\frac{1}{2wy^2}\right) dz = \frac{1}{2y^4}\int_0^1 (1 - z)z^3\, dz = \frac{1}{2y^4}\int_0^1 (z^3 - z^4)\, dz = \frac{1}{2y^4}\left(\frac{z^4}{4} - \frac{z^5}{5}\right)\Big|_0^1 = \frac{1}{40\, y^4},$
and $g_Y(y) = 80\, y\cdot\frac{1}{40\, y^4} = \frac{2}{y^3}$ for $y > 1$.
Generating Random Samples

When quantitative problems are too complex to be studied theoretically, one can try to use simulations to obtain approximate solutions.
Generating the $U[0,1]$ distribution to obtain other discrete distributions such as, e.g., the Bernoulli, binomial, geometric, negative binomial, and Poisson.

Example 1. $P(X = 1) = p = 1 - P(X = 0)$. Let $p = 0.3$. Select any subset of $[0,1]$ of length $0.3$ $(= p)$, for example $[0.2, 0.5]$, $[0.7, 1]$, or $[0, 0.1] \cup [0.8, 1]$.
Let $[0, 0.3]$ and $(0.3, 1]$ represent a success (S) and a failure (F), respectively. Five values are generated from a $U[0,1]$ distribution:
0.2117, 0.1385, 0.7009, 0.6990, 0.6903
S, S, F, F, F — a random sample from the BIN(1, 0.3) distribution.
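The scheme above is one line of code; this sketch replays the slide's five uniform values.

```python
# The slide's Bernoulli scheme: U in [0, 0.3] counts as a success (1).
p = 0.3
us = [0.2117, 0.1385, 0.7009, 0.6990, 0.6903]  # the five U[0,1] values above
sample = [1 if u <= p else 0 for u in us]
print(sample)  # [1, 1, 0, 0, 0], i.e. S S F F F
```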
Example 2. $\mathrm{BIN}(n, p) = \mathrm{BIN}(6, 0.4)$. Let $[0, 0.6] \to$ F and $(0.6, 1] \to$ S (one of the possible choices). One observation requires $n$ generations from $U[0,1]$; $k$ observations require $n \cdot k$ generations. For two observations from the BIN(6, 0.4) distribution one needs 12 generations from $U[0,1]$:

0.4972 F    0.5957 F
0.8125 S    0.4801 F
0.3133 F    0.2223 F
0.2025 F    0.1718 F
0.9335 S    0.2292 F
0.0114 F    0.9815 S

x1 = 2      x2 = 1

Random sample of size 2 generated from the BIN(6, 0.4) distribution: $x_1 = 2$, $x_2 = 1$.
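In code, each binomial observation counts successes among $n$ uniforms; this sketch first reproduces the slide's two observations and then checks the long-run mean $np$.

```python
import random

# The slide's BIN(6, 0.4) scheme: (0.6, 1] is mapped to success.
def binomial_obs(us):
    return sum(1 for u in us if u > 0.6)

col1 = [0.4972, 0.8125, 0.3133, 0.2025, 0.9335, 0.0114]
col2 = [0.5957, 0.4801, 0.2223, 0.1718, 0.2292, 0.9815]
x1, x2 = binomial_obs(col1), binomial_obs(col2)
print(x1, x2)  # 2 and 1, as on the slide

rng = random.Random(8)
sample = [binomial_obs([rng.random() for _ in range(6)]) for _ in range(10000)]
mean_val = sum(sample) / len(sample)
print(round(mean_val, 2))  # near n*p = 2.4
```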
Example 3. Generate a random sample of size 6 from the POI(2) distribution.

x    P(X = x)   F_X(x) = P(X <= x)
0    0.1353     0.1353
1    0.2707     0.4060
2    0.2707     0.6767
3    0.1804     0.8571
4    0.0902     0.9473
5    0.0361     0.9834

$X_k = i$ if, for the $k$th observation, $F_X(i-1) \le U_k < F_X(i)$.

U_i   0.0909  0.1850  0.1243  0.2991  0.4290  0.9272
X_i   0       1       0       1       2       4

The random sample of size 6 selected from the POI(2) distribution is: 0, 1, 0, 1, 2, 4.
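The discrete inverse-cdf rule above can be implemented by accumulating the Poisson cdf term by term; this sketch replays the slide's six uniforms.

```python
import math

# Discrete inverse-cdf rule from the slide: X = i when F(i-1) <= u < F(i),
# with the POI(lam) cdf accumulated via p_{i+1} = p_i * lam/(i+1).
def poisson_from_u(lam, u):
    i, p = 0, math.exp(-lam)
    cdf = p
    while u >= cdf:
        i += 1
        p *= lam / i
        cdf += p
    return i

us = [0.0909, 0.1850, 0.1243, 0.2991, 0.4290, 0.9272]
sample = [poisson_from_u(2.0, u) for u in us]
print(sample)  # [0, 1, 0, 1, 2, 4], as on the slide
```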
Theorem. If a random variable $X$ has a continuous and strictly increasing cdf $F_X$, then $F_X(X)$ has the $U[0,1]$ distribution.

Therefore, if $Y \sim U[0,1]$, then $F_X^{-1}(Y)$ has the same distribution as the random variable $X$. So to generate observations from the distribution of $X$, one first generates observations from $U[0,1]$ and then transforms them by $F_X^{-1}$.
Problem 10.4.3, p. 329. The density and the cdf are
$f(x) = \begin{cases} e^{2x} & \text{for } x < 0 \\ e^{-2x} & \text{for } x > 0 \end{cases} \qquad \text{and} \qquad F(x) = \begin{cases} \frac{1}{2}e^{2x} & \text{for } x \le 0 \\ 1 - \frac{1}{2}e^{-2x} & \text{for } x > 0. \end{cases}$
Since $F(0) = \frac{1}{2}$,
$F^{-1}(y) = \begin{cases} \frac{1}{2}\log 2y & \text{for } y \le \frac{1}{2} \\ -\frac{1}{2}\log 2(1-y) & \text{for } y > \frac{1}{2}. \end{cases}$
$y_1 = 0.744921 \;\Rightarrow\; x_1 = -\tfrac{1}{2}\log 2(1 - 0.744921) = 0.336517,$
$y_2 = 0.464001 \;\Rightarrow\; x_2 = \tfrac{1}{2}\log(2\cdot 0.464001) = -0.03736.$

HW 10.4.2, p. 329.
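The inverse cdf of Problem 10.4.3 is easy to code; this sketch reproduces the two transformed values.

```python
import math

# Inverse cdf from Problem 10.4.3: F^{-1}(y) = (1/2) log(2y) for y <= 1/2,
# and -(1/2) log(2(1-y)) for y > 1/2.
def finv(y):
    if y <= 0.5:
        return 0.5 * math.log(2.0 * y)
    return -0.5 * math.log(2.0 * (1.0 - y))

x1 = finv(0.744921)
x2 = finv(0.464001)
print(round(x1, 4), round(x2, 4))  # approximately 0.3365 and -0.0374
```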
Accept/Reject Algorithm

When the distribution of a variable $X$ is such that the cdf $F$ and/or $F^{-1}$ does not have a closed form, one possible method of generating a random sample from the distribution of $X$ is the so-called accept/reject algorithm:

Let $U \sim U[0,1]$, and let the variable $Y$, with density $g$, have some distribution that is easy to generate. Variables $U$ and $Y$ are independent.

Additionally, let $c$ be a constant such that $f(y) \le c\, g(y)$ for any value $y$ of $Y$; in other words, $c = \sup_y \frac{f(y)}{g(y)}$.

Finally, set $X = Y$ if $U < \frac{f(Y)}{c\, g(Y)}$.
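The algorithm above can be sketched generically; the BETA(2,1) usage check (density $f(x) = 2x$ on $(0,1)$, uniform proposal, $c = 2$) is an illustration not taken from the slides.

```python
import random

# Generic accept/reject sketch: f is the target density, g the proposal
# density with sampler draw_g, and c >= sup_y f(y)/g(y).
def accept_reject(f, g, draw_g, c, rng):
    while True:
        y = draw_g(rng)
        if rng.random() < f(y) / (c * g(y)):
            return y

# Usage check: BETA(2,1) (f(x) = 2x on (0,1)) from a U[0,1] proposal, c = 2;
# the sample mean should approach E(X) = 2/3.
rng = random.Random(9)
sample = [
    accept_reject(lambda x: 2.0 * x, lambda x: 1.0, lambda r: r.random(), 2.0, rng)
    for _ in range(20000)
]
mean_val = sum(sample) / len(sample)
print(round(mean_val, 2))  # near 2/3
```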
Justification: it will be shown that $F_X(y) = F_Y\!\left(y \,\Big|\, U \le \frac{f(Y)}{c\,g(Y)}\right)$.

$F_Y\!\left(y \,\Big|\, U \le \frac{f(Y)}{c\,g(Y)}\right) = \frac{P\!\left(Y \le y,\; U \le f(Y)/[c\,g(Y)]\right)}{P\!\left(U \le f(Y)/[c\,g(Y)]\right)}.$
The denominator is
$P\!\left(U \le \frac{f(Y)}{c\,g(Y)}\right) = \int \frac{f(y)}{c\,g(y)}\, g(y)\, dy = \frac{1}{c}\int f(y)\, dy = \frac{1}{c},$
and the numerator is
$\int_{-\infty}^{y}\!\int_0^{f(t)/[c\,g(t)]} g(t)\, du\, dt = \int_{-\infty}^{y} g(t)\left(\int_0^{f(t)/[c\,g(t)]} du\right) dt = \int_{-\infty}^{y} g(t)\,\frac{f(t)}{c\,g(t)}\, dt = \frac{1}{c}\int_{-\infty}^{y} f(t)\, dt.$
Therefore the ratio equals $c\cdot\frac{1}{c}\int_{-\infty}^{y} f(t)\, dt = F_X(y).$
Problem 10.4.6, p. 329. Use the accept/reject algorithm to generate a sample from the $N(0,1)$ distribution. Here $f_X$ is the density of $N(0,1)$, and $F_X$ does not have a closed form. $Y$ has the double exponential (Laplace) distribution with density $g(y) = 1.5\, e^{-3|y|}$.
$c = \sup_y \frac{f(y)}{g(y)}, \qquad \frac{f(y)}{g(y)} = \frac{(2\pi)^{-1/2} e^{-y^2/2}}{1.5\, e^{-3|y|}} = \frac{1}{1.5\sqrt{2\pi}}\, e^{-y^2/2 + 3|y|}.$
The function is even, so it is enough to consider $y > 0$.

For $y > 0$: to maximize $e^{-y^2/2 + 3y}$, note that $\frac{d}{dy}\, e^{-y^2/2 + 3y} = e^{-y^2/2 + 3y}(-y + 3)$, which is equal to $0$ if $y = 3$. Hence
$\sup_y \frac{f(y)}{g(y)} = \frac{f(3)}{g(3)} = \frac{1}{1.5\sqrt{2\pi}}\, e^{-4.5 + 9} = \frac{e^{4.5}}{1.5\sqrt{2\pi}} = 23.941 = c.$

$X = Y$ if $U < \frac{f(Y)}{c\, g(Y)}$:

U1        U2        Y         f(Y)/[c g(Y)]   X
0.222950  0.516174   0.01096  0.0115          none
0.847152  0.466449  -0.02315  0.0119          none
0.614370  0.001058  -2.05270  0.6385          -2.0527

$x_1 = -2.0527.$
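Problem 10.4.6 can be coded directly: the proposal $Y$ is drawn from $g$ by inverting its cdf (as the slide's table does with $U_2$), and a candidate is accepted when $U_1 < f(Y)/[c\,g(Y)]$.

```python
import math
import random

# Accept/reject for N(0,1) with Laplace proposal g(y) = 1.5 e^{-3|y|},
# c = e^{4.5} / (1.5 sqrt(2 pi)).
c = math.exp(4.5) / (1.5 * math.sqrt(2.0 * math.pi))

def laplace_obs(rng):
    # Inverse cdf of g: (1/3) log(2u) for u <= 1/2, else -(1/3) log(2(1-u)).
    u = rng.random()
    if u <= 0.5:
        return (1.0 / 3.0) * math.log(2.0 * u)
    return -(1.0 / 3.0) * math.log(2.0 * (1.0 - u))

def std_normal_obs(rng):
    while True:
        y = laplace_obs(rng)
        f = math.exp(-y * y / 2.0) / math.sqrt(2.0 * math.pi)
        g = 1.5 * math.exp(-3.0 * abs(y))
        if rng.random() < f / (c * g):
            return y

rng = random.Random(10)
sample = [std_normal_obs(rng) for _ in range(5000)]
m = sum(sample) / len(sample)
print(round(c, 3), round(m, 2))  # c ~ 23.941; sample mean near 0
```

Note that with this rate-3 Laplace the acceptance probability is only $1/c \approx 4\%$, which matches the two rejections in the slide's table; the algorithm is still correct, just inefficient.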
Example. Another accept/reject-type algorithm will be used to generate the $\mathrm{BETA}(\alpha, \beta+1)$ distribution.

Let $U_1, U_2 \sim U[0,1]$ be independent, and let $\alpha > 0$, $\beta > 0$. Set $V_1 = U_1^{1/\alpha}$, $V_2 = U_2^{1/\beta}$, and $X = V_1$ if $V_1 + V_2 \le 1$.

Determine the distribution of $X$:
$F_X(a) = P(V_1 \le a \mid V_1 + V_2 \le 1) = \frac{P(V_1 \le a,\; V_1 + V_2 \le 1)}{P(V_1 + V_2 \le 1)} = \frac{N}{D}.$
$F_{V_1}(v) = P(V_1 \le v) = P(U_1^{1/\alpha} \le v) = P(U_1 \le v^\alpha) = v^\alpha,$ so $f_{V_1}(v) = \alpha v^{\alpha-1}$, and $f(v_1, v_2) = \alpha v_1^{\alpha-1}\,\beta v_2^{\beta-1}$ (the variables $V_1$ and $V_2$ are independent since $U_1$ and $U_2$ are assumed independent).
$D = P(V_1 + V_2 \le 1) = \int_0^1\!\int_0^{1-v_1} \alpha v_1^{\alpha-1}\,\beta v_2^{\beta-1}\, dv_2\, dv_1 = \int_0^1 \alpha v_1^{\alpha-1}\underbrace{\int_0^{1-v_1} \beta v_2^{\beta-1}\, dv_2}_{(1-v_1)^\beta}\, dv_1 = \int_0^1 \alpha v_1^{\alpha-1}(1 - v_1)^\beta\, dv_1$
$= \alpha\,\frac{\Gamma(\alpha)\Gamma(\beta+1)}{\Gamma(\alpha+\beta+1)}\underbrace{\int_0^1 \frac{\Gamma(\alpha+\beta+1)}{\Gamma(\alpha)\Gamma(\beta+1)}\, v_1^{\alpha-1}(1 - v_1)^\beta\, dv_1}_{=1} = \frac{\Gamma(\alpha+1)\Gamma(\beta+1)}{\Gamma(\alpha+\beta+1)},$
and
$N = \int_0^a\!\int_0^{1-v_1} \alpha v_1^{\alpha-1}\,\beta v_2^{\beta-1}\, dv_2\, dv_1 = \int_0^a \alpha (1 - v_1)^\beta v_1^{\alpha-1}\, dv_1$
$= \alpha\,\frac{\Gamma(\alpha)\Gamma(\beta+1)}{\Gamma(\alpha+\beta+1)}\int_0^a \frac{\Gamma(\alpha+\beta+1)}{\Gamma(\alpha)\Gamma(\beta+1)}\,(1 - v_1)^\beta v_1^{\alpha-1}\, dv_1 = \frac{\Gamma(\alpha+1)\Gamma(\beta+1)}{\Gamma(\alpha+\beta+1)}\, F_{\mathrm{BETA}(\alpha,\,\beta+1)}(a).$
Now $F_X(a) = \frac{N}{D} = F_{\mathrm{BETA}(\alpha,\,\beta+1)}(a)$, so $X \sim \mathrm{BETA}(\alpha, \beta+1)$.

Generate one observation from the BETA(0.738, 1.449) distribution: $X \sim \mathrm{BETA}(0.738, 1.449)$, so $\alpha = 0.738$ and $\beta = 0.449$.

Generate $u_1, u_2$: $0.996484$, $0.066042$.
$v_1 = 0.996484^{1/0.738} = 0.99523, \qquad v_2 = 0.066042^{1/0.449} = 0.002352.$
$v_1 + v_2 = 0.99758 \le 1$, and therefore $x = v_1 = 0.99523.$
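The generator above in code; the sketch first replays the slide's single observation (up to rounding) and then, as an extra check not on the slides, verifies the long-run mean $\alpha/(\alpha + \beta + 1)$.

```python
import random

# Generator from the slides: V1 = U1^(1/alpha), V2 = U2^(1/beta),
# accept X = V1 when V1 + V2 <= 1; then X ~ BETA(alpha, beta + 1).
def beta_obs(alpha, beta, rng):
    while True:
        v1 = rng.random() ** (1.0 / alpha)
        v2 = rng.random() ** (1.0 / beta)
        if v1 + v2 <= 1.0:
            return v1

# Replay the slide's single observation from BETA(0.738, 1.449):
alpha, beta = 0.738, 0.449
v1 = 0.996484 ** (1.0 / alpha)
v2 = 0.066042 ** (1.0 / beta)
accepted = v1 + v2 <= 1.0
print(round(v1, 5), round(v2, 6), accepted)  # ~0.99524, ~0.002352, True

# Long-run mean should approach alpha / (alpha + beta + 1) ~ 0.337:
rng = random.Random(11)
sample = [beta_obs(alpha, beta, rng) for _ in range(20000)]
mean_val = sum(sample) / len(sample)
print(round(mean_val, 2))
```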
Recommended