
  • Random Samples

    $X_1,\dots,X_n$ - i.i.d. variables (independent, identically distributed), or a random sample: observations independently selected from the same population, or resulting from analogous statistical experiments.

    Definition A statistic is any function of observations from a random sample that does not involve population parameters. Distributions of statistics are called sampling distributions.

    Examples: $\bar X = \frac{1}{n}(X_1 + \dots + X_n)$, $\quad S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar X)^2$, $\quad \min(X_1,\dots,X_n)$.

    Statistics are random variables.

    Theorem Let $X_1,\dots,X_n$ be a random sample from a distribution with $E(X_i) = \mu$ and $\mathrm{Var}(X_i) = \sigma^2$. Then $E(\bar X) = \mu$, $\mathrm{Var}(\bar X) = \sigma^2/n$, and $E(X_1 + \dots + X_n) = n\mu$, $\mathrm{Var}(X_1 + \dots + X_n) = n\sigma^2$.
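    The theorem can be checked by a quick Monte Carlo experiment. The sketch below (Python with NumPy; the exponential population, the sample size and the seed are illustrative choices, not part of the notes) estimates $E(\bar X)$ and $\mathrm{Var}(\bar X)$ from repeated samples and compares them with $\mu$ and $\sigma^2/n$.

    import numpy as np

    rng = np.random.default_rng(0)
    mu, sigma2, n, reps = 0.5, 0.25, 20, 200_000   # exponential population with mean 1/2, variance 1/4

    samples = rng.exponential(scale=mu, size=(reps, n))  # reps samples of size n, one per row
    xbar = samples.mean(axis=1)                          # sample means

    print(xbar.mean(), mu)                 # estimate of E(X-bar) vs mu
    print(xbar.var(ddof=1), sigma2 / n)    # estimate of Var(X-bar) vs sigma^2/n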


    Distributions of selected statistics in random samples from normal populations

    Theorem Let $X_1,\dots,X_n$ be a random sample from a $N(\mu,\sigma^2)$ distribution. Statistics $\bar X$ and $U = \frac{n-1}{\sigma^2}S^2 = \frac{1}{\sigma^2}\sum_i (X_i - \bar X)^2$ have distributions $N(\mu,\frac{\sigma^2}{n})$ and $\chi^2_{n-1}$, respectively.

    Theorem If $X_1,\dots,X_n$ is a random sample from a $N(\mu,\sigma^2)$ distribution, then the variables $\bar X$ and $U$ are independent. Also, if the variables $\bar X$ and $U$ are independent, then the sample $X_1,\dots,X_n$ is selected from a normally distributed population.
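    As an illustration (not part of the notes; all parameter values are arbitrary), the sketch below forms $U = (n-1)S^2/\sigma^2$ from many normal samples and compares it with $\chi^2_{n-1}$ via a Kolmogorov-Smirnov test; it also reports the sample correlation between $\bar X$ and $S^2$, which should be near zero in line with the independence theorem.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    mu, sigma, n, reps = 2.0, 3.0, 10, 100_000

    x = rng.normal(mu, sigma, size=(reps, n))
    xbar = x.mean(axis=1)
    s2 = x.var(axis=1, ddof=1)
    u = (n - 1) * s2 / sigma**2

    print(stats.kstest(u, stats.chi2(df=n - 1).cdf))  # should not reject chi^2_{n-1}
    print(np.corrcoef(xbar, s2)[0, 1])                # approximately 0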


  • Definition If independent variables $Z$ and $U$ have distributions $N(0,1)$ and $\chi^2_\nu$, respectively, then the variable $X = \dfrac{Z}{\sqrt{U/\nu}}$ has Student's $t$ distribution with $\nu$ degrees of freedom. The density of the variable $X$ is $f(x;\nu) = \dfrac{\Gamma\left((\nu+1)/2\right)}{\Gamma(\nu/2)\sqrt{\nu\pi}}\left(1 + \dfrac{x^2}{\nu}\right)^{-(\nu+1)/2}$.

    Theorem Student's $t_n$ distribution approaches $N(0,1)$ when the number $n$ of degrees of freedom increases.

    Definition If independent variables $U$ and $V$ have distributions $\chi^2_{\nu_1}$ and $\chi^2_{\nu_2}$, respectively, then $X = \dfrac{U/\nu_1}{V/\nu_2}$ has the $F$ distribution with $\nu_1$ and $\nu_2$ degrees of freedom, $F(\nu_1,\nu_2)$.
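    Both definitions can be checked by construction. The sketch below (illustrative; the degrees of freedom are arbitrary) builds $Z/\sqrt{U/\nu}$ and $(U/\nu_1)/(V/\nu_2)$ from independent normal and chi-square draws and compares them with SciPy's $t$ and $F$ distributions.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    reps, nu, nu1, nu2 = 200_000, 5, 3, 8

    z = rng.standard_normal(reps)
    u = rng.chisquare(nu, reps)
    t_sample = z / np.sqrt(u / nu)                 # Z / sqrt(U/nu)

    u1 = rng.chisquare(nu1, reps)
    v2 = rng.chisquare(nu2, reps)
    f_sample = (u1 / nu1) / (v2 / nu2)             # (U/nu1) / (V/nu2)

    print(stats.kstest(t_sample, stats.t(df=nu).cdf))
    print(stats.kstest(f_sample, stats.f(dfn=nu1, dfd=nu2).cdf))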


    Problem 10.2.2 p. 318 $X \sim F(\nu_1,\nu_2)$. Since $P(X < x) = \alpha = P\left(\frac{1}{X} > \frac{1}{x}\right)$ and $\frac{1}{X} \sim F(\nu_2,\nu_1)$, we get $P\left(\frac{1}{X} < \frac{1}{x}\right) = 1 - \alpha$.

    Problem 10.2.3 p. 318 $X \sim N(0,1)$, $Y \sim N(1,1)$, $W \sim N(2,4)$.
    $P\left(\frac{X^2+(Y-1)^2}{X^2+(Y-1)^2+(W-2)^2/4} > k\right) = P\left(\frac{1}{1 + \frac{(W-2)^2/4}{X^2+(Y-1)^2}} > k\right)$. Since $\frac{1}{1+a} > k$ is equivalent to $1 > k + ak$ and to $a < \frac{1-k}{k}$, and since $(W-2)^2/4 \sim \chi^2_1$ and $X^2+(Y-1)^2 \sim \chi^2_2$, so that $\frac{(W-2)^2/4}{X^2+(Y-1)^2} = \frac{1}{2}\cdot\frac{(W-2)^2/4}{[X^2+(Y-1)^2]/2}$ is one half of an $F(1,2)$ variable, the probability equals $P\left(F(1,2) < \frac{2(1-k)}{k}\right)$.

    Distributions of the minimum $X_{1:n} = \min(X_1,\dots,X_n)$ and the maximum $X_{n:n} = \max(X_1,\dots,X_n)$ in a random sample with cdf $F$ and density $f$:

    $G_1(t) = P(X_{1:n} \le t) = 1 - P(X_{1:n} > t) = 1 - P(X_1 > t,\dots,X_n > t) = 1 - [P(X_i > t)]^n = 1 - [1 - F(t)]^n$.

    $g_1(t) = \frac{d}{dt}G_1(t) = -n[1-F(t)]^{n-1}\left(-\frac{d}{dt}F(t)\right) = n f(t)[1-F(t)]^{n-1}$.

    For $k = n$, the general formula for the cdf of the $k$th order statistic gives $G_n(t) = \sum_{r=n}^{n}\binom{n}{r}[F(t)]^r[1-F(t)]^{n-r} = \binom{n}{n}[F(t)]^n[1-F(t)]^0 = [F(t)]^n$.

    Also, $G_n(t) = P(X_{n:n} \le t) = P(X_1 \le t,\dots,X_n \le t) = [P(X_i \le t)]^n = [F(t)]^n$. Finally, $g_n(t) = n f(t)[F(t)]^{n-1}$.

    In general, $g_k(t) = k\binom{n}{k}\,f(t)[F(t)]^{k-1}[1-F(t)]^{n-k}$.

    Example $X_i \sim \mathrm{EXP}(1)$: $f(x) = e^{-x}$ for $x > 0$ and $0$ otherwise, $F(t) = 1 - e^{-t}$ for $t > 0$ and $0$ otherwise. Then $G_1(t) = 1 - [1 - (1 - e^{-t})]^n = 1 - e^{-nt}$, $g_1(t) = n e^{-nt}$, $G_n(t) = (1 - e^{-t})^n$, and $g_n(t) = n e^{-t}(1 - e^{-t})^{n-1}$.
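    A simulation check of these formulas (illustrative only; the sample size is arbitrary): draw many EXP(1) samples, take the minimum and the maximum of each, and compare their empirical cdfs with $G_1(t) = 1 - e^{-nt}$ and $G_n(t) = (1 - e^{-t})^n$.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    n, reps = 5, 100_000

    x = rng.exponential(scale=1.0, size=(reps, n))
    mins, maxs = x.min(axis=1), x.max(axis=1)

    G1 = lambda t: 1 - np.exp(-n * t)       # cdf of X_{1:n}
    Gn = lambda t: (1 - np.exp(-t)) ** n    # cdf of X_{n:n}

    print(stats.kstest(mins, G1))           # large p-values expected
    print(stats.kstest(maxs, Gn))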

    Example Let $X_1,\dots,X_n$ be a random sample selected from the $U[0,\theta]$ distribution. Find: (i) the densities $g_1(t)$ and $g_n(t)$ of $X_{1:n}$ and $X_{n:n}$, respectively; (ii) the cdfs $G_1(t)$ and $G_n(t)$; (iii) the means and variances; (iv) the covariance of $X_{1:n}$ and $X_{n:n}$.


  • (i) $X_i \sim U[0,\theta]$: $f(x) = \frac{1}{\theta}$ for $0 < x < \theta$ and $0$ otherwise,

    $F(x) = \begin{cases} 0 & \text{for } x < 0 \\ x/\theta & \text{for } 0 < x < \theta \\ 1 & \text{for } x > \theta \end{cases}$

    $g_1(t) = \frac{n}{\theta}\left(1 - \frac{t}{\theta}\right)^{n-1}$, $\quad g_n(t) = \frac{n}{\theta}\left(\frac{t}{\theta}\right)^{n-1}$.

    (ii) $G_1(t) = 1 - \left[1 - \frac{t}{\theta}\right]^n$, $\quad G_n(t) = \left(\frac{t}{\theta}\right)^n$.


    (iii) $E(X_{1:n}) = \int_0^\theta t\,\frac{n}{\theta}\left(1 - \frac{t}{\theta}\right)^{n-1}dt$. Substituting $\frac{t}{\theta} = w$, so that $dt = \theta\,dw$,

    $= \int_0^1 \theta w\, n (1 - w)^{n-1}\,dw = n\theta \int_0^1 \underbrace{w^{2-1}(1-w)^{n-1}}_{\mathrm{BETA}(2,n)}\,dw = n\theta\,\frac{(n-1)!}{(n+1)!}\underbrace{\int_0^1 \frac{\Gamma(n+2)}{\Gamma(2)\Gamma(n)}\,w^{2-1}(1-w)^{n-1}\,dw}_{=1} = \frac{\theta\,n!}{(n+1)!} = \frac{\theta}{n+1}$.

    $E(X_{n:n}) = \int_0^\theta t\,\frac{n}{\theta}\left(\frac{t}{\theta}\right)^{n-1}dt = \frac{n}{\theta^n}\int_0^\theta t^n\,dt = \frac{n}{\theta^n}\,\frac{t^{n+1}}{n+1}\Big|_0^\theta = \frac{n}{\theta^n}\,\frac{\theta^{n+1}}{n+1} = \frac{n\theta}{n+1}$.
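    A quick Monte Carlo confirmation of these two means (illustrative; $\theta$, $n$ and the seed are arbitrary):

    import numpy as np

    rng = np.random.default_rng(4)
    theta, n, reps = 2.0, 6, 200_000

    x = rng.uniform(0, theta, size=(reps, n))
    print(x.min(axis=1).mean(), theta / (n + 1))      # E(X_{1:n}) vs theta/(n+1)
    print(x.max(axis=1).mean(), n * theta / (n + 1))  # E(X_{n:n}) vs n*theta/(n+1)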


  • $E(X_{1:n}^2) = \int_0^\theta t^2 g_1(t)\,dt = \int_0^\theta t^2\,\frac{n}{\theta}\left(1 - \frac{t}{\theta}\right)^{n-1}dt$. Substituting $1 - \frac{t}{\theta} = u$, $t = \theta(1-u)$, $dt = -\theta\,du$,

    $= \int_0^1 \theta^2(1-u)^2\,\frac{n}{\theta}\,u^{n-1}\,\theta\,du = n\theta^2\int_0^1 \underbrace{u^{n-1}(1-u)^2}_{\mathrm{BETA}(n,3)}\,du = n\theta^2\,\frac{\Gamma(n)\Gamma(3)}{\Gamma(n+3)}\underbrace{\int_0^1 \frac{\Gamma(n+3)}{\Gamma(n)\Gamma(3)}\,u^{n-1}(1-u)^2\,du}_{=1} = \theta^2\,\frac{2\,n(n-1)!}{(n+2)!} = \frac{2\theta^2\,n!}{(n+2)!} = \frac{2\theta^2}{(n+1)(n+2)}$.

    Finally, $\mathrm{Var}(X_{1:n}) = E(X_{1:n}^2) - (E(X_{1:n}))^2 = \frac{2\theta^2}{(n+1)(n+2)} - \left(\frac{\theta}{n+1}\right)^2 = \theta^2\left(\frac{2}{(n+1)(n+2)} - \frac{1}{(n+1)^2}\right) = \frac{\theta^2}{(n+1)^2(n+2)}\,[2(n+1) - (n+2)] = \frac{n\theta^2}{(n+1)^2(n+2)}$.

    $\mathrm{Var}(X_{n:n})$: $E(X_{n:n}^2) = \int_0^\theta t^2 g_n(t)\,dt = \int_0^\theta t^2\,\frac{n}{\theta}\left(\frac{t}{\theta}\right)^{n-1}dt = \frac{n}{\theta^n}\int_0^\theta t^{n+1}\,dt = \frac{n}{\theta^n}\,\frac{t^{n+2}}{n+2}\Big|_0^\theta = \frac{n\theta^2}{n+2}$.

    Now, $\mathrm{Var}(X_{n:n}) = E(X_{n:n}^2) - (E(X_{n:n}))^2 = \frac{n\theta^2}{n+2} - \left(\frac{n\theta}{n+1}\right)^2 = n\theta^2\left(\frac{1}{n+2} - \frac{n}{(n+1)^2}\right) = \frac{n\theta^2}{(n+2)(n+1)^2}\underbrace{[(n+1)^2 - n(n+2)]}_{=1} = \frac{n\theta^2}{(n+2)(n+1)^2}$. Notice that $\mathrm{Var}(X_{1:n}) = \mathrm{Var}(X_{n:n})$.


  • (iv) The joint density of the $k$th and $l$th order statistics ($l > k$) is given by the formula $g_{k,l}(s,t) = \frac{n!}{(k-1)!(l-k-1)!(n-l)!}\,[F(s)]^{k-1}[F(t)-F(s)]^{l-k-1}[1-F(t)]^{n-l}\,f(s)f(t)$ for $s \le t$, and $0$ otherwise. For $k = 1$ and $l = n$:
    $g_{1,n}(s,t) = \frac{n!}{0!\,(n-2)!\,0!}\,[F(s)]^0[F(t)-F(s)]^{n-2}[1-F(t)]^0\,f(s)f(t) = n(n-1)[F(t)-F(s)]^{n-2}f(s)f(t)$.

    For $X_1,\dots,X_n$ selected from the $U[0,\theta]$ distribution,
    $g_{1,n}(s,t) = n(n-1)\left[\frac{t}{\theta} - \frac{s}{\theta}\right]^{n-2}\frac{1}{\theta}\cdot\frac{1}{\theta} = \frac{n(n-1)}{\theta^n}\,(t-s)^{n-2}$, $\quad 0 < s < t < \theta$.

    $E(X_{1:n}X_{n:n}) = \int_0^\theta\!\int_0^t st\,g_{1,n}(s,t)\,ds\,dt = \int_0^\theta\!\int_0^t st\,\frac{n(n-1)}{\theta^n}\,(t-s)^{n-2}\,ds\,dt = \int_0^\theta t\,\frac{n(n-1)}{\theta^n}\underbrace{\int_0^t [t-(t-s)]\,(t-s)^{n-2}\,ds}_{=\,\star}\,dt$, where

    $\star = \int_0^t \left[t(t-s)^{n-2} - (t-s)^{n-1}\right]ds = \left(-\frac{t(t-s)^{n-1}}{n-1} + \frac{(t-s)^n}{n}\right)\Big|_0^t = \frac{t^n}{n-1} - \frac{t^n}{n} = t^n\left(\frac{1}{n-1} - \frac{1}{n}\right) = \frac{t^n}{n(n-1)}$.

    Now $E(X_{1:n}X_{n:n}) = \int_0^\theta t\,\frac{n(n-1)}{\theta^n}\,\frac{t^n}{n(n-1)}\,dt = \frac{1}{\theta^n}\int_0^\theta t^{n+1}\,dt = \frac{1}{\theta^n}\,\frac{\theta^{n+2}}{n+2} = \frac{\theta^2}{n+2}$, and finally,
    $\mathrm{Cov}(X_{1:n},X_{n:n}) = \frac{\theta^2}{n+2} - \frac{\theta}{n+1}\cdot\frac{n\theta}{n+1} = \frac{\theta^2}{(n+1)^2(n+2)}\underbrace{[(n+1)^2 - n(n+2)]}_{=1} = \frac{\theta^2}{(n+1)^2(n+2)}$.
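    An illustrative Monte Carlo check of the variance and covariance formulas just derived (all numerical values are arbitrary):

    import numpy as np

    rng = np.random.default_rng(5)
    theta, n, reps = 2.0, 6, 300_000

    x = rng.uniform(0, theta, size=(reps, n))
    mn, mx = x.min(axis=1), x.max(axis=1)

    denom = (n + 1) ** 2 * (n + 2)
    print(mn.var(), n * theta**2 / denom)          # Var(X_{1:n})
    print(mx.var(), n * theta**2 / denom)          # Var(X_{n:n})
    print(np.cov(mn, mx)[0, 1], theta**2 / denom)  # Cov(X_{1:n}, X_{n:n})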


  • Distribution of the sample range $R = X_{n:n} - X_{1:n}$

    Write $R = X_{n:n} - X_{1:n} = T - S$ and let the companion variable be $W = T$. The solutions are $t = w$ and $s = t - r = w - r$. Since $J = \begin{vmatrix} 1 & -1 \\ 1 & 0 \end{vmatrix} = 1$, $|J| = 1$. Now $g(w,r) = g_{1,n}(s(w,r),\,t(w,r)) = n(n-1)[F(w) - F(w-r)]^{n-2}\,f(w-r)\,f(w)$, and the density of $R$ can be obtained as $h_R(r) = \int g(w,r)\,dw$.

    Example $X_1,\dots,X_n \sim \mathrm{EXP}(1)$: $f(x) = e^{-x}$ for $x > 0$ and $0$ otherwise, $F(x) = 1 - e^{-x}$ for $x > 0$ and $0$ otherwise. Then
    $h_R(r) = \int_r^\infty n(n-1)\left[e^{-w+r} - e^{-w}\right]^{n-2} e^{-(w-r)}\,e^{-w}\,dw = e^{r}(e^{r}-1)^{n-2}\,n(n-1)\int_r^\infty e^{-w(n-2)-2w}\,dw$, where
    $\int_r^\infty e^{-nw}\,dw = -\frac{1}{n}e^{-nw}\Big|_r^\infty = \frac{1}{n}e^{-nr}$, and finally
    $h_R(r) = (n-1)\,e^{r}(e^{r}-1)^{n-2}e^{-nr} = (n-1)\,e^{-r(n-1)}(e^{r}-1)^{n-2} = (n-1)\,e^{-r}\left(e^{-r}(e^{r}-1)\right)^{n-2} = (n-1)\,e^{-r}(1-e^{-r})^{n-2}$, $\quad r > 0$.
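    As an illustrative check (the sample size is arbitrary), integrating $h_R$ gives the cdf $H_R(r) = (1 - e^{-r})^{n-1}$; the sketch below simulates the range of EXP(1) samples and compares its empirical distribution with this cdf.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(6)
    n, reps = 5, 100_000

    x = rng.exponential(1.0, size=(reps, n))
    r = x.max(axis=1) - x.min(axis=1)

    H_R = lambda t: (1 - np.exp(-t)) ** (n - 1)  # cdf obtained by integrating h_R
    print(stats.kstest(r, H_R))                  # large p-value expected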

    Example Let $X_1,\dots,X_n$ be a random sample selected from a $U[0,\theta]$ distribution. Determine the sample size $n$ needed for the expected sample range $E(X_{n:n} - X_{1:n})$ to be at least $0.75\,\theta$.
    $E(R) = E(X_{n:n} - X_{1:n}) = \frac{n\theta}{n+1} - \frac{\theta}{n+1} = \frac{(n-1)\theta}{n+1}$. If now $\frac{n-1}{n+1} \ge 0.75$, then $4(n-1) \ge 3(n+1)$ and $n \ge 7$.

    $\mathrm{Var}(R) = \mathrm{Var}(X_{n:n} - X_{1:n}) = \mathrm{Var}(X_{n:n}) + \mathrm{Var}(X_{1:n}) - 2\,\mathrm{Cov}(X_{1:n},X_{n:n}) = \frac{2n\theta^2}{(n+1)^2(n+2)} - \frac{2\theta^2}{(n+1)^2(n+2)} = \frac{2\theta^2(n-1)}{(n+1)^2(n+2)}$.


  • Determine $\mathrm{Var}(X_{l:n} - X_{k:n})$, $l > k$, in a sample from $U[0,\theta]$.
    $\mathrm{Var}(X_{l:n} - X_{k:n}) = \mathrm{Var}(X_{k:n}) + \mathrm{Var}(X_{l:n}) - 2\,\mathrm{Cov}(X_{k:n},X_{l:n})$.
    Needed: (i) $E(X_{k:n})$, (ii) $E(X_{k:n}^2)$, (iii) $\mathrm{Var}(X_{k:n})$, and (iv) $\mathrm{Cov}(X_{k:n},X_{l:n})$.

    $g_k(t) = \frac{n!}{(k-1)!(n-k)!}\,[F(t)]^{k-1}[1-F(t)]^{n-k}\,f(t) = \frac{n!}{(k-1)!(n-k)!}\left(\frac{t}{\theta}\right)^{k-1}\left[1 - \frac{t}{\theta}\right]^{n-k}\frac{1}{\theta}$.

    (i) $E(X_{k:n}) = \int_0^\theta t\,g_k(t)\,dt = \int_0^\theta t\,\frac{n!}{(k-1)!(n-k)!}\left(\frac{t}{\theta}\right)^{k-1}\left[1 - \frac{t}{\theta}\right]^{n-k}\frac{1}{\theta}\,dt = \frac{n!}{(k-1)!(n-k)!}\int_0^1 \theta w\cdot w^{k-1}(1-w)^{n-k}\,dw$ (substituting $t = \theta w$)

    $= \frac{\theta\,n!}{(k-1)!(n-k)!}\int_0^1 \underbrace{w^{k}(1-w)^{n-k}}_{\mathrm{BETA}(k+1,\,n-k+1)}\,dw = \frac{\Gamma(k+1)\Gamma(n-k+1)}{\Gamma(n+2)}\,\frac{\theta\,n!}{(k-1)!(n-k)!} = \frac{k!(n-k)!}{(n+1)!}\,\frac{\theta\,n!}{(k-1)!(n-k)!} = \frac{k\theta}{n+1}$, and $E(X_{l:n}) = \frac{l\theta}{n+1}$.

    (ii) Similarly,
    $E(X_{k:n}^2) = \int_0^\theta t^2 g_k(t)\,dt = \int_0^\theta t^2\,\frac{n!}{(k-1)!(n-k)!}\left(\frac{t}{\theta}\right)^{k-1}\left[1 - \frac{t}{\theta}\right]^{n-k}\frac{1}{\theta}\,dt = \frac{\theta^2\,n!}{(k-1)!(n-k)!}\int_0^1 \underbrace{w^{k+1}(1-w)^{n-k}}_{\mathrm{BETA}(k+2,\,n-k+1)}\,dw = \frac{\Gamma(k+2)\Gamma(n-k+1)}{\Gamma(n+3)}\,\frac{\theta^2\,n!}{(k-1)!(n-k)!} = \frac{\theta^2\,k(k+1)}{(n+1)(n+2)}$.

  • (iii) $\mathrm{Var}(X_{k:n}) = E(X_{k:n}^2) - [E(X_{k:n})]^2 = \frac{\theta^2\,k(k+1)}{(n+1)(n+2)} - \left(\frac{k\theta}{n+1}\right)^2 = \theta^2\,\frac{k(k+1)(n+1) - k^2(n+2)}{(n+1)^2(n+2)} = \theta^2\,\frac{k(n+1-k)}{(n+1)^2(n+2)}$.

    (iv) $g_{k,l}(s,t) = \frac{n!}{(k-1)!(l-k-1)!(n-l)!}\,[F(s)]^{k-1}[F(t)-F(s)]^{l-k-1}[1-F(t)]^{n-l}\,f(s)f(t)$. Now
    $E(X_{k:n}X_{l:n}) = \int_0^\theta\!\int_0^t st\,g_{k,l}(s,t)\,ds\,dt = \int_0^\theta\!\int_0^t st\,\frac{n!}{(k-1)!(l-k-1)!(n-l)!}\left(\frac{s}{\theta}\right)^{k-1}\left[\frac{t}{\theta} - \frac{s}{\theta}\right]^{l-k-1}\left[1 - \frac{t}{\theta}\right]^{n-l}\frac{1}{\theta^2}\,ds\,dt = \;???$

    Easier way: Let $Y_1,\dots,Y_n$ be a random sample from $U[0,1]$. Then $X_i = \theta Y_i$, $X_{i:n} = \theta Y_{i:n}$, $E(X_{i:n}) = \theta E(Y_{i:n})$, and so on. Also $E(X_{k:n}X_{l:n}) = \theta^2 E(Y_{k:n}Y_{l:n})$.

    For $\gamma = \frac{n!}{(k-1)!(l-k-1)!(n-l)!}$,
    $E(Y_{k:n}Y_{l:n}) = \int_0^1\!\int_0^t st\,g_{k,l}(s,t)\,ds\,dt = \gamma\int_0^1\!\int_0^t st\,s^{k-1}[t-s]^{l-k-1}[1-t]^{n-l}\,ds\,dt$

    $= \gamma\int_0^1\!\int_0^t [1-(1-t)]\,s^{k}[t-s]^{l-k-1}[1-t]^{n-l}\,ds\,dt$

    $= \gamma\int_0^1\!\int_0^t s^{k}[t-s]^{l-k-1}[1-t]^{n-l}\,ds\,dt \;-\; \gamma\int_0^1\!\int_0^t s^{k}[t-s]^{l-k-1}[1-t]^{n-l+1}\,ds\,dt = A - B$,

    and since $\gamma_1 = \frac{(n+1)!}{k!(l-k-1)!(n-l)!}$,

    $A = \frac{\gamma}{\gamma_1}\int_0^1\!\int_0^t \underbrace{\gamma_1\,s^{k}[t-s]^{l-k-1}[1-t]^{n-l}}_{g_{k+1,l+1}(s,t)}\,ds\,dt = \gamma\,\frac{k!(l-k-1)!(n-l)!}{(n+1)!}$.

  • Similarly, $B = \frac{\gamma}{\gamma_2}\int_0^1\!\int_0^t \underbrace{\gamma_2\,s^{k}[t-s]^{l-k-1}[1-t]^{n-l+1}}_{g_{k+1,l+1}(s,t)}\,ds\,dt = \gamma\,\frac{k!(l-k-1)!(n-l+1)!}{(n+2)!}$, since $\gamma_2 = \frac{(n+2)!}{k!(l-k-1)!(n-l+1)!}$.

    Next,
    $E(Y_{k:n}Y_{l:n}) = A - B = \gamma\left(\frac{k!(l-k-1)!(n-l)!}{(n+1)!} - \frac{k!(l-k-1)!(n-l+1)!}{(n+2)!}\right) = \frac{n!}{(k-1)!(l-k-1)!(n-l)!}\cdot\frac{k!(l-k-1)!(n-l)!}{(n+2)!}\,[(n+2) - (n-l+1)] = \frac{k(l+1)}{(n+1)(n+2)}$.

    Consequently, $E(X_{k:n}X_{l:n}) = \frac{\theta^2\,k(l+1)}{(n+1)(n+2)}$,

    and $\mathrm{Cov}(X_{k:n},X_{l:n}) = E(X_{k:n}X_{l:n}) - E(X_{k:n})E(X_{l:n}) = \frac{\theta^2\,k(l+1)}{(n+1)(n+2)} - \frac{k\theta}{n+1}\cdot\frac{l\theta}{n+1} = \theta^2\,\frac{k(l+1)(n+1) - kl(n+2)}{(n+1)^2(n+2)} = \theta^2\,\frac{k(n+1-l)}{(n+1)^2(n+2)}$.

    Finally, $\mathrm{Var}(X_{l:n} - X_{k:n}) = \mathrm{Var}(X_{k:n}) + \mathrm{Var}(X_{l:n}) - 2\,\mathrm{Cov}(X_{k:n},X_{l:n}) = \theta^2\,\frac{k(n+1-k)}{(n+1)^2(n+2)} + \theta^2\,\frac{l(n+1-l)}{(n+1)^2(n+2)} - 2\,\theta^2\,\frac{k(n+1-l)}{(n+1)^2(n+2)} = \theta^2\,\frac{(l-k)(n+1-l+k)}{(n+1)^2(n+2)}$.
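    An illustrative Monte Carlo check of this formula (the values of $\theta$, $n$, $k$, $l$ are arbitrary):

    import numpy as np

    rng = np.random.default_rng(7)
    theta, n, k, l, reps = 3.0, 8, 2, 6, 300_000

    x = np.sort(rng.uniform(0, theta, size=(reps, n)), axis=1)
    diff = x[:, l - 1] - x[:, k - 1]               # X_{l:n} - X_{k:n}

    formula = theta**2 * (l - k) * (n + 1 - l + k) / ((n + 1) ** 2 * (n + 2))
    print(diff.var(), formula)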

    Joint distribution of $X_{1:n},\dots,X_{n:n}$: the density of the joint distribution is $g(y_1,\dots,y_n) = n!\,f(y_1)\cdots f(y_n)$ for $y_1 < \dots < y_n$ and $0$ otherwise.

    HW 10.3.2 p. 325

  • Problem 10.3.6 p. 325 The density of $X_i \sim \mathrm{BETA}(2,1)$ is $f(x) = \frac{\Gamma(3)}{\Gamma(2)\Gamma(1)}\,x^{2-1}(1-x)^{1-1} = 2x$ for $0 < x < 1$ and $0$ otherwise. The joint density (for $0 < y_1 < y_2 < y_3 < y_4 < y_5 < 1$) is
    $g(y_1,y_2,y_3,y_4,y_5) = 5!\cdot 2y_1\cdot 2y_2\cdot 2y_3\cdot 2y_4\cdot 2y_5 = 5!\,2^5\,y_1y_2y_3y_4y_5$.

    (i) $g_{1,2,4}(y_1,y_2,y_4) = 5!\,2^5\int_{y_4}^1\!\int_{y_2}^{y_4} y_1y_2y_3y_4y_5\,dy_3\,dy_5$, where
    $\int_{y_2}^{y_4} y_1y_2y_3y_4y_5\,dy_3 = y_1y_2y_4y_5\underbrace{\int_{y_2}^{y_4} y_3\,dy_3}_{=\,0.5(y_4^2 - y_2^2)} = 0.5\,y_1y_2y_4y_5\,(y_4^2 - y_2^2)$.

    Now $g_{1,2,4}(y_1,y_2,y_4) = 5!\,2^4\int_{y_4}^1 y_1y_2y_4y_5\,(y_4^2 - y_2^2)\,dy_5 = 5!\,2^4\,y_1y_2y_4\,(y_4^2 - y_2^2)\underbrace{\int_{y_4}^1 y_5\,dy_5}_{=\,0.5(1 - y_4^2)} = 5!\,2^3\,y_1y_2y_4\,(y_4^2 - y_2^2)(1 - y_4^2)$.

  • (ii) $E(X_{2:5}\mid X_{4:5}) = E(S\mid T) = \int_0^t s\,f(s\mid t)\,ds = \int_0^t s\,\frac{f(s,t)}{f_T(t)}\,ds$. Since $f(x) = 2x$ and $F(x) = x^2$, the joint density is ($n = 5$, $k = 2$, $l = 4$)
    $f_{2,4}(s,t) = \frac{5!}{(2-1)!(4-2-1)!(5-4)!}\,(s^2)^1(t^2 - s^2)^{4-2-1}(1-t^2)^1\,2s\,2t = 4\cdot 5!\,s^3 t\,(t^2 - s^2)(1 - t^2)$ for $0 < s < t < 1$,
    and $f_4(t) = \frac{5!}{(4-1)!(5-4)!}\,(t^2)^{4-1}(1-t^2)^{5-4}\,2t = 40\,t^7(1 - t^2)$.

    Now, $f(s\mid t) = \frac{4\cdot 5!\,s^3 t\,(t^2 - s^2)(1-t^2)}{40\,t^7(1-t^2)} = 12\,s^3 t^{-6}(t^2 - s^2)$, and
    $E(X_{2:5}\mid X_{4:5}) = \int_0^t s\cdot 12\,s^3 t^{-6}(t^2 - s^2)\,ds = \int_0^t 12\,s^4 t^{-4}\,ds - \int_0^t 12\,s^6 t^{-6}\,ds = \frac{12}{5}\,s^5 t^{-4}\Big|_0^t - \frac{12}{7}\,s^7 t^{-6}\Big|_0^t = 12\left(\frac{1}{5} - \frac{1}{7}\right)t = \frac{24}{35}\,t$.

    HW: 10.3.6 p. 325 - find $E(X_{3:5}\mid X_{4:5})$.

    (iii) $Y = \frac{X_{2:5}}{X_{1:5}} = \frac{T}{S}$. Let $W = S$ be a companion variable, so that $t = yw$ and $s = w$. Since $0 < s < t < 1$, we have $0 < w < yw < 1$, and that means that $w > 0$, $y > 1$, and $w < \frac{1}{y}$. $J = \begin{vmatrix} 1 & 0 \\ y & w \end{vmatrix} = w$, $|J| = w$.
    $f_{1,2}(s,t) = \frac{5!}{(1-1)!(2-1-1)!(5-2)!}\,[F(s)]^{1-1}[F(t)-F(s)]^{2-1-1}[1-F(t)]^{5-2}\,f(s)f(t) = 20(1-t^2)^3\,2s\,2t = 80\,st(1-t^2)^3$.
    Now $g(w,y) = 80\,w(wy)(1 - y^2w^2)^3\,w = 80\,w^3 y\,(1 - y^2w^2)^3$, and
    $g_Y(y) = \int_0^{1/y} g(w,y)\,dw = \int_0^{1/y} 80\,w^3 y\,(1 - y^2w^2)^3\,dw = 80y\underbrace{\int_0^{1/y} w^3(1 - y^2w^2)^3\,dw}_{=\,\star}$.

    Let $1 - y^2w^2 = z$. Then $w^2 = \frac{1-z}{y^2}$, $-2wy^2\,dw = dz$, and
    $\star = \int_1^0 w^3 z^3\left(-\frac{1}{2wy^2}\right)dz = \frac{1}{2y^4}\int_0^1 (1-z)z^3\,dz = \frac{1}{2y^4}\int_0^1 (z^3 - z^4)\,dz = \frac{1}{2y^4}\left(\frac{z^4}{4} - \frac{z^5}{5}\right)\Big|_0^1 = \frac{1}{40y^4}$,
    and $g_Y(y) = 80y\cdot\frac{1}{40y^4} = \frac{2}{y^3}$ for $y > 1$.
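    As an illustrative check, the cdf implied by $g_Y$ is $G_Y(y) = 1 - 1/y^2$ for $y > 1$; the sketch below simulates the ratio $X_{2:5}/X_{1:5}$ in BETA(2,1) samples of size 5 and compares its empirical distribution with this cdf.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(8)
    reps = 100_000

    x = np.sort(rng.beta(2, 1, size=(reps, 5)), axis=1)
    y = x[:, 1] / x[:, 0]                                # X_{2:5} / X_{1:5}

    G_Y = lambda t: np.where(t > 1, 1 - 1 / t**2, 0.0)   # cdf implied by g_Y(y) = 2/y^3
    print(stats.kstest(y, G_Y))                          # large p-value expected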

  • Generating Random Samples

    When quantitative problems are too complex to be studied theoretically, one can try to use simulations to obtain approximate solutions.

    Generating from the $U[0,1]$ distribution to obtain other, discrete, distributions such as the Bernoulli, binomial, geometric, negative binomial, and Poisson distributions.

    Example 1 $P(X = 1) = p = 1 - P(X = 0)$. Let $p = 0.3$. Select any subset of $[0,1]$ of length $0.3$ ($p$). For example: $[0.2, 0.5]$, or $[0.7, 1]$, or $[0, 0.1]\cup[0.8, 1]$.
    Let $[0, 0.3]$ and $(0.3, 1]$ represent a success (S) and a failure (F), respectively. Five values are generated from a $U[0,1]$ distribution:
    0.2117, 0.1385, 0.7009, 0.6990, 0.6903 $\;\rightarrow\;$ S S F F F, a random sample from the $\mathrm{BIN}(1, 0.3)$ distribution.


    Example 2 $\mathrm{BIN}(n,p)$: take $\mathrm{BIN}(6, 0.4)$.
    Let $[0, 0.6] \rightarrow$ F and $(0.6, 1] \rightarrow$ S (one of the possible choices).
    One observation requires $n$ generations from $U[0,1]$; $k$ observations require $n\cdot k$ generations from $U[0,1]$. For two observations from the $\mathrm{BIN}(6, 0.4)$ distribution one needs 12 generations from $U[0,1]$:

    0.4972 F    0.5957 F
    0.8125 S    0.4801 F
    0.3133 F    0.2223 F
    0.2025 F    0.1718 F
    0.9335 S    0.2292 F
    0.0114 F    0.9815 S

    $X_1 = 2$    $X_2 = 1$

    Random sample of size 2 generated from the $\mathrm{BIN}(6, 0.4)$ distribution: $x_1 = 2$, $x_2 = 1$.
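    A minimal Python sketch of this procedure (the seed is arbitrary): each BIN(6, 0.4) observation counts how many of 6 uniforms fall in the "success" set $(0.6, 1]$.

    import numpy as np

    rng = np.random.default_rng(9)
    n, p, n_obs = 6, 0.4, 2

    u = rng.uniform(size=(n_obs, n))     # n uniforms per binomial observation
    x = (u > 1 - p).sum(axis=1)          # count the uniforms falling in (0.6, 1]
    print(x)                             # two BIN(6, 0.4) observations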


  • Example 3 Generate a random sample of size 6 from the POI(2) distribution.

    x    P(X = x)   F_X(x) = P(X <= x)
    0    0.1353     0.1353
    1    0.2707     0.4060
    2    0.2707     0.6767
    3    0.1804     0.8571
    4    0.0902     0.9473
    5    0.0361     0.9834

    $X_k = i$ if for the $k$th observation $U_k$: $F_X(i-1) \le U_k < F_X(i)$.

    U_i:  0.0909   0.1850   0.1243   0.2991   0.4290   0.9272
    X_i:  0        1        0        1        2        4

    The random sample of size 6 selected from the POI(2) distribution is: 0, 1, 0, 1, 2, 4.
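    A sketch of this inverse-cdf lookup (illustrative; the cutoffs are computed from the Poisson pmf rather than typed in, and the support is truncated far out in the tail):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(10)
    lam, size = 2.0, 6

    u = rng.uniform(size=size)
    grid = np.arange(0, 50)                        # truncated support of POI(lam)
    cdf = stats.poisson(lam).cdf(grid)
    x = np.searchsorted(cdf, u, side='right')      # X = i iff F(i-1) <= U < F(i)
    print(x)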


    Theorem If a random variable $X$ has a continuous and strictly increasing cdf $F_X$, then $F_X(X)$ has the $U[0,1]$ distribution.

    Therefore, if $Y \sim U[0,1]$, then $F_X^{-1}(Y)$ has the same distribution as the random variable $X$. Hence, to generate observations from the distribution of $X$, one generates observations from $U[0,1]$ first and then transforms them by $F_X^{-1}$.

    Problem 10.4.3 p. 329 The density and the cdf are

    $f(x) = \begin{cases} e^{2x} & \text{for } x < 0 \\ e^{-2x} & \text{for } x > 0 \end{cases}$  and  $F(x) = \begin{cases} \frac{1}{2}e^{2x} & \text{for } x \le 0 \\ 1 - \frac{1}{2}e^{-2x} & \text{for } x > 0. \end{cases}$

    Since $F(0) = \frac{1}{2}$, $\quad F^{-1}(y) = \begin{cases} \frac{1}{2}\log 2y & \text{for } y \le \frac{1}{2} \\ -\frac{1}{2}\log 2(1-y) & \text{for } y > \frac{1}{2}. \end{cases}$

    $y_1 = 0.744921 \;\Rightarrow\; x_1 = -\frac{1}{2}\log 2(1 - 0.744921) = 0.336517$
    $y_2 = 0.464001 \;\Rightarrow\; x_2 = \frac{1}{2}\log(2\cdot 0.464001) = -0.03736$.
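    A short sketch of this inverse-transform step (illustrative; it reproduces the two values above from $y_1$ and $y_2$):

    import numpy as np

    def F_inv(y):
        """Inverse cdf of f(x) = e^{2x} (x < 0), e^{-2x} (x > 0)."""
        y = np.asarray(y, dtype=float)
        return np.where(y <= 0.5, 0.5 * np.log(2 * y), -0.5 * np.log(2 * (1 - y)))

    y = np.array([0.744921, 0.464001])   # two U[0,1] values from the notes
    print(F_inv(y))                      # approximately [0.336517, -0.037361]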

    HW 10.4.2 p.329


  • Accept/Reject Algorithm

    When the distribution of a variable $X$ is such that the cdf $F$ and/or $F^{-1}$ do not have a closed form, one of the possible methods of generating a random sample from the distribution of $X$ is the so-called accept/reject algorithm:

    Let $U \sim U[0,1]$, and let the variable $Y$, with density $g$, have some distribution that is easy to generate. Variables $U$ and $Y$ are independent.

    Additionally, let $c$ be a constant such that $f(y) \le c\,g(y)$ for any value $y$ of $Y$; in other words, $c = \sup_y \frac{f(y)}{g(y)}$.

    Finally, set $X = Y$ if $U < \frac{f(Y)}{c\,g(Y)}$.

    Justification: It will be shown that $F_X(y) = F_Y\left(y \mid U \le \frac{f(Y)}{c\,g(Y)}\right)$.

    $F_Y\left(y \mid U \le \frac{f(Y)}{c\,g(Y)}\right) = \frac{P\left(Y \le y,\ U \le \frac{f(Y)}{c\,g(Y)}\right)}{P\left(U \le \frac{f(Y)}{c\,g(Y)}\right)}$, where $P\left(U \le \frac{f(Y)}{c\,g(Y)}\right) = \int \frac{f(y)}{c\,g(y)}\,g(y)\,dy = \frac{1}{c}\int f(y)\,dy = \frac{1}{c}$, so the ratio equals
    $c\int_{-\infty}^{y}\!\int_0^{f(t)/[c\,g(t)]} g(t)\,du\,dt = c\int_{-\infty}^{y} g(t)\left(\int_0^{f(t)/[c\,g(t)]} du\right)dt = c\int_{-\infty}^{y} g(t)\,\frac{f(t)}{c\,g(t)}\,dt = \frac{c}{c}\int_{-\infty}^{y} f(t)\,dt = F_X(y)$.

    Problem 10.4.6 p. 329 Use the accept/reject algorithm to generate a sample from the $N(0,1)$ distribution. $f_X$ is the density of $N(0,1)$; $F_X$ does not have a closed form. $Y$ has the double exponential (Laplace) distribution with density $g(y) = 1.5\,e^{-3|y|}$.

  • $c = \sup_y \frac{f(y)}{g(y)}$, $\quad \frac{f(y)}{g(y)} = \frac{(2\pi)^{-1/2}\,e^{-y^2/2}}{1.5\,e^{-3|y|}} = \frac{1}{1.5\sqrt{2\pi}}\,e^{-y^2/2 + 3|y|}$;

    the function is even, so it is enough to consider $y > 0$.

    For $y > 0$: maximize $e^{-y^2/2 + 3y}$. $\ \frac{d}{dy}\,e^{-y^2/2 + 3y} = e^{-y^2/2 + 3y}\,(-y + 3)$, which is equal to $0$ if $y = 3$.

    $\sup_y \frac{f(y)}{g(y)} = \frac{f(3)}{g(3)} = \frac{1}{1.5\sqrt{2\pi}}\,e^{-4.5 + 9} = \frac{e^{4.5}}{1.5\sqrt{2\pi}} = 23.941 = c$.

    $X = Y$ if $U < \frac{f(Y)}{c\,g(Y)}$.

    $U_1$      $U_2$       $Y$         $f(Y)/[c\,g(Y)]$   $X$
    0.22295    0.516174    0.01096     0.0115             none
    0.847152   0.466449    -0.02315    0.0119             none
    0.614370   0.001058    -2.05270    0.6385             -2.0527

    (Here $U_2$ generates $Y$ from the Laplace distribution by the inverse-cdf method, and $U_1$ plays the role of $U$ in the acceptance test.)

    $x_1 = -2.0527$.
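    A runnable sketch of this accept/reject scheme (illustrative; the Laplace proposal is generated by the inverse-cdf method, and $c = e^{4.5}/(1.5\sqrt{2\pi})$ as computed above):

    import numpy as np

    rng = np.random.default_rng(11)
    c = np.exp(4.5) / (1.5 * np.sqrt(2 * np.pi))     # sup f/g, approximately 23.941

    def laplace_inv(u, rate=3.0):
        """Inverse cdf of g(y) = (rate/2) e^{-rate |y|}."""
        return np.where(u < 0.5, np.log(2 * u) / rate, -np.log(2 * (1 - u)) / rate)

    def sample_normal(size):
        out = []
        while len(out) < size:
            u1, u2 = rng.uniform(size=2)
            y = float(laplace_inv(u2))
            ratio = np.exp(-y**2 / 2 + 3 * abs(y)) / (1.5 * np.sqrt(2 * np.pi)) / c
            if u1 < ratio:                           # accept Y as an N(0,1) observation
                out.append(y)
        return np.array(out)

    print(sample_normal(5))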


    Example Another accept/reject algorithm will be used to generate the $\mathrm{BETA}(\alpha, \beta+1)$ distribution.

    Let $U_1, U_2 \sim U[0,1]$ be independent and let $\alpha > 0$, $\beta > 0$. Put $V_1 = U_1^{1/\alpha}$, $V_2 = U_2^{1/\beta}$. Set $X = V_1$ if $V_1 + V_2 \le 1$.

    Determine the distribution of $X$:

    $F_X(a) = P(V_1 \le a \mid V_1 + V_2 \le 1) = \frac{P(V_1 \le a,\ V_1 + V_2 \le 1)}{P(V_1 + V_2 \le 1)} = \frac{N}{D}$.

    $F_{V_1}(v) = P(V_1 \le v) = P(U_1^{1/\alpha} \le v) = P(U_1 \le v^\alpha) = v^\alpha$, so $f_{V_1}(v) = \alpha v^{\alpha-1}$ and $f(v_1, v_2) = \alpha v_1^{\alpha-1}\,\beta v_2^{\beta-1}$ (the variables $V_1$ and $V_2$ are independent since $U_1$ and $U_2$ are assumed independent).

  • $D = P(V_1 + V_2 \le 1) = \int_0^1\!\int_0^{1-v_1} \alpha v_1^{\alpha-1}\,\beta v_2^{\beta-1}\,dv_2\,dv_1 = \int_0^1 \alpha v_1^{\alpha-1}\underbrace{\int_0^{1-v_1} \beta v_2^{\beta-1}\,dv_2}_{=\,(1-v_1)^\beta}\,dv_1 = \int_0^1 \alpha v_1^{\alpha-1}(1 - v_1)^\beta\,dv_1$

    $= \alpha\,\frac{\Gamma(\alpha)\Gamma(\beta+1)}{\Gamma(\alpha+\beta+1)}\underbrace{\int_0^1 \frac{\Gamma(\alpha+\beta+1)}{\Gamma(\alpha)\Gamma(\beta+1)}\,v_1^{\alpha-1}(1-v_1)^\beta\,dv_1}_{=\,1} = \frac{\Gamma(\alpha+1)\Gamma(\beta+1)}{\Gamma(\alpha+\beta+1)}$,

    and

    $N = \int_0^a\!\int_0^{1-v_1} \alpha v_1^{\alpha-1}\,\beta v_2^{\beta-1}\,dv_2\,dv_1 = \int_0^a \alpha\,(1-v_1)^\beta\,v_1^{\alpha-1}\,dv_1 = \alpha\,\frac{\Gamma(\alpha)\Gamma(\beta+1)}{\Gamma(\alpha+\beta+1)}\int_0^a \frac{\Gamma(\alpha+\beta+1)}{\Gamma(\alpha)\Gamma(\beta+1)}\,(1-v_1)^\beta\,v_1^{\alpha-1}\,dv_1 = \frac{\Gamma(\alpha+1)\Gamma(\beta+1)}{\Gamma(\alpha+\beta+1)}\,F_{\mathrm{BETA}(\alpha,\beta+1)}(a)$.

    Now $\frac{N}{D} = F_{\mathrm{BETA}(\alpha,\beta+1)}(a)$, so $X \sim \mathrm{BETA}(\alpha, \beta+1)$.

    Generate 1 observation from the $\mathrm{BETA}(0.738, 1.449)$ distribution.

    $X \sim \mathrm{BETA}(0.738, 1.449)$, so $\alpha = 0.738$, $\beta = 0.449$.

    Generate $u_1, u_2$: 0.996484, 0.066042.

    $v_1 = 0.996484^{1/0.738} = 0.99523$,
    $v_2 = 0.066042^{1/0.449} = 0.002352$.

    $v_1 + v_2 = 0.99758 \le 1$ and therefore $x = v_1 = 0.99523$.
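    A sketch of this generator (illustrative; proposals are repeated until the acceptance condition $V_1 + V_2 \le 1$ holds, and the pair $u_1, u_2$ from the notes is checked first):

    import numpy as np

    rng = np.random.default_rng(12)

    def beta_by_rejection(alpha, beta, rng):
        """One BETA(alpha, beta+1) observation via the V1 + V2 <= 1 acceptance rule."""
        while True:
            u1, u2 = rng.uniform(size=2)
            v1, v2 = u1 ** (1 / alpha), u2 ** (1 / beta)
            if v1 + v2 <= 1:
                return v1

    # the pair from the notes: u1 = 0.996484, u2 = 0.066042
    v1, v2 = 0.996484 ** (1 / 0.738), 0.066042 ** (1 / 0.449)
    print(v1, v2, v1 + v2 <= 1)                   # approximately 0.99523, 0.002352, True

    print(beta_by_rejection(0.738, 0.449, rng))   # one BETA(0.738, 1.449) observation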
