InferenceForMeans Confidence


  • 7/29/2019 InferenceForMeans Confidence


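    Confidence Regions

    Confidence regions are multivariate extensions of univariate confidence intervals.

    Recall the definition of a $100(1-\alpha)\%$ CI for a parameter $\theta$: for $X \sim f(x \mid \theta)$, $\theta \in \Theta$, the interval $(t_1(X), t_2(X))$ is a $100(1-\alpha)\%$ CI for $\theta$ if
    $$\Pr[t_1(X) \le \theta \le t_2(X)] = 1 - \alpha.$$

    If $\theta$ represents a univariate mean $\mu$, a $100(1-\alpha)\%$ CI for $\mu$ is given by
    $$\left[\bar{X} - t_{n-1}(\alpha/2)\sqrt{s^2/n},\ \ \bar{X} + t_{n-1}(\alpha/2)\sqrt{s^2/n}\right].$$

    Similarly, for a $p \times 1$ parameter vector $\theta \in \Theta$, the region $R(X)$ is a $100(1-\alpha)\%$ confidence region for $\theta$ if
    $$\Pr[R(X) \text{ will cover the true } \theta] = 1 - \alpha.$$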


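    Confidence Regions (contd)

    For the $p \times 1$ mean vector $\mu$, we know that before the sample is selected,
    $$\Pr\left[ n(\bar{X}-\mu)' S^{-1} (\bar{X}-\mu) \le \frac{(n-1)p}{n-p}\, F_{p,n-p}(\alpha) \right] = 1 - \alpha,$$
    meaning that $\bar{X}$ is within statistical distance $[(n-1)p\,F_{p,n-p}(\alpha)/(n-p)]^{1/2}$ from $\mu$ with probability $1-\alpha$, where distance is defined in terms of $nS^{-1}$.

    Once a sample is obtained and $\bar{x}$, $S$ are computed, the set of values $\mu$ satisfying
    $$n(\bar{x}-\mu)' S^{-1} (\bar{x}-\mu) \le \frac{(n-1)p}{n-p}\, F_{p,n-p}(\alpha)$$
    defines an ellipsoidal region $R(X)$ that is likely to cover $\mu$.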


    Confidence Regions (contd)

    To decide whether a hypothesized value $\mu_0$ is contained in the confidence region, we evaluate $n(\bar{x}-\mu_0)' S^{-1} (\bar{x}-\mu_0)$ and compare it to the scaled $F$ value above. If the squared distance from $\bar{x}$ to $\mu_0$ is larger than $(n-1)p\,F_{p,n-p}(\alpha)/(n-p)$, then $\mu_0$ is not in the confidence region.

    This is exactly equivalent to testing $H_0: \mu = \mu_0$ versus $H_1: \mu \ne \mu_0$ using Hotelling's $T^2$ statistic.

    Thus, the $100(1-\alpha)\%$ confidence region is composed of all values $\mu_0$ for which the $T^2$ test would NOT reject $H_0: \mu = \mu_0$ versus $H_1: \mu \ne \mu_0$ at level $\alpha$.


    Confidence Regions (contd)

    What can we say about the shape of the confidence region? It is a $p$-dimensional ellipsoid centered at the sample mean vector $\bar{x}$.

    Recall that if $(\lambda_i, e_i)$ is an eigenvalue-eigenvector pair of $S$, then letting $c^2 = (n-1)p\,F_{p,n-p}(\alpha)/(n-p)$, the $i$th axis of the confidence ellipsoid has half length
    $$c\,\sqrt{\frac{\lambda_i}{n}} = \sqrt{\frac{(n-1)p}{n-p}\, F_{p,n-p}(\alpha)}\ \sqrt{\frac{\lambda_i}{n}}$$
    along the $e_i$ direction.


    Confidence Regions (contd)

    Thus, beginning from the center of the ellipse at $\bar{x}$, the axes of the confidence ellipse are
    $$\pm\sqrt{\lambda_i}\ \sqrt{\frac{(n-1)p}{n(n-p)}\, F_{p,n-p}(\alpha)}\ e_i.$$

    Since the second factor is constant for all axes, the ratios of the $\sqrt{\lambda_i}$ reflect the relative elongations.

    Larger differences in the sample variances across the $p$ measurements (due to real causes or to differences in the scale of the measurements) will create larger ratios of eigenvalues (correlations are also involved).


    Example: Microwave Ovens

    Recall the microwave oven radiation data in Tables 4.1 and 4.5, where two radiation measurements, $x_1$ and $x_2$, were obtained from $n = 42$ ovens. Here, $x_j$ denotes the transformed (by Box-Cox) radiation measurements, using a power $\lambda = 0.25$.

    Sample statistics for those data are:
    $$\bar{x} = \begin{bmatrix} 0.564 \\ 0.603 \end{bmatrix}, \quad
      S = \begin{bmatrix} 0.014 & 0.012 \\ 0.012 & 0.015 \end{bmatrix}, \quad
      S^{-1} = \begin{bmatrix} 203.02 & -163.39 \\ -163.39 & 200.23 \end{bmatrix}.$$

    Eigenvalue and eigenvector pairs for $S$ are
    $$\lambda_1 = 0.026, \quad e_1' = [0.704,\ 0.710]$$
    $$\lambda_2 = 0.002, \quad e_2' = [-0.710,\ 0.704]$$
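As a rough check of the eigenvalue-eigenvector pairs, the sketch below applies the closed-form solution for a symmetric 2x2 matrix to the rounded entries of $S$ shown on the slide. The helper name `eig_sym_2x2` is hypothetical, and the formula assumes a nonzero off-diagonal entry. Because $S$ is printed to only three decimals, the results should agree with the slide values only approximately, and the sign of each eigenvector is arbitrary.

```python
import math

def eig_sym_2x2(s11, s12, s22):
    """Eigenpairs (largest eigenvalue first) of [[s11, s12], [s12, s22]].

    Hypothetical helper; assumes s12 != 0 so the eigenvector formula is valid.
    """
    mean = (s11 + s22) / 2.0
    radius = math.hypot((s11 - s22) / 2.0, s12)
    pairs = []
    for lam in (mean + radius, mean - radius):
        # v = (s12, lam - s11) solves (S - lam*I) v = 0 when s12 != 0.
        vx, vy = s12, lam - s11
        norm = math.hypot(vx, vy)
        pairs.append((lam, (vx / norm, vy / norm)))
    return pairs

# Rounded sample covariance entries from the slide.
(lam1, e1), (lam2, e2) = eig_sym_2x2(0.014, 0.012, 0.015)
print(f"lambda1 = {lam1:.4f}, e1 = ({e1[0]:.3f}, {e1[1]:.3f})")
print(f"lambda2 = {lam2:.4f}, e2 = ({e2[0]:.3f}, {e2[1]:.3f})")
```

The second eigenvector may come out with the opposite sign from the slide; both describe the same axis of the ellipse.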


    Example: Microwave Ovens (contd)

    The 95% CR for $\mu$ is given by all values $(\mu_1, \mu_2)$ that satisfy:
    $$42\,[\,0.564-\mu_1,\ \ 0.603-\mu_2\,]
      \begin{bmatrix} 203.02 & -163.39 \\ -163.39 & 200.23 \end{bmatrix}
      \begin{bmatrix} 0.564-\mu_1 \\ 0.603-\mu_2 \end{bmatrix} \le 6.62,$$
    where
    $$\frac{2(41)}{40}\, F_{2,40}(0.05) = \frac{2(41)}{40} \times 3.23 = 6.62.$$

    Is $\mu_0 = [0.562,\ 0.589]'$ a plausible value for $\mu$? To check, plug $\mu_0$ into the expression above and see if it satisfies the inequality. In this case, we get 1.30, which is less than 6.62, and conclude that $\mu_0$ is plausible at the 95% level.
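The membership check can be reproduced with a few lines of arithmetic. This is a minimal sketch that takes $\bar{x}$, $S^{-1}$ and the critical value $F_{2,40}(0.05) = 3.23$ exactly as quoted on the slides; the helper `quad_form_2x2` is a hypothetical name introduced here for illustration.

```python
def quad_form_2x2(d, m):
    """Return d' M d for a length-2 vector d and a 2x2 matrix M."""
    return (d[0] * (m[0][0] * d[0] + m[0][1] * d[1])
            + d[1] * (m[1][0] * d[0] + m[1][1] * d[1]))

n, p = 42, 2
xbar = (0.564, 0.603)
S_inv = ((203.02, -163.39), (-163.39, 200.23))
f_crit = 3.23                      # F_{2,40}(0.05), as quoted on the slide
mu0 = (0.562, 0.589)               # hypothesized mean vector

d = (xbar[0] - mu0[0], xbar[1] - mu0[1])
t2 = n * quad_form_2x2(d, S_inv)   # squared statistical distance, n d' S^{-1} d
c2 = p * (n - 1) / (n - p) * f_crit
inside = t2 <= c2                  # True means mu0 lies in the 95% region

print(f"T^2 = {t2:.2f}, critical value = {c2:.2f}, inside region: {inside}")
```

Swapping in a different `mu0` gives the corresponding $T^2$ test decision, since the region is exactly the set of values the test does not reject.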


    Example: Microwave Ovens (contd)

    The joint confidence ellipsoid is centered at $\bar{x} = [0.564,\ 0.603]'$ and the half lengths of the two axes are
    $$\sqrt{0.026}\ \sqrt{\frac{2(41)}{42(40)} \times 3.23} = 0.064, \qquad
      \sqrt{0.002}\ \sqrt{\frac{2(41)}{42(40)} \times 3.23} = 0.018.$$

    The axes lie in the directions of the two eigenvectors when $\bar{x}$ is taken as the origin.

    The ratio
    $$\frac{\sqrt{0.026}}{\sqrt{0.002}} = 3.6$$
    indicates that the major axis is 3.6 times longer than the minor axis.
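The half lengths follow directly from the eigenvalues and the common scale factor. The sketch below uses the rounded eigenvalues and the critical value 3.23 quoted on the slides; variable names are illustrative only.

```python
import math

n, p = 42, 2
f_crit = 3.23                      # F_{2,40}(0.05), as quoted on the slides
eigenvalues = (0.026, 0.002)       # rounded eigenvalues of S from the slides

# Factor shared by every axis: sqrt(p(n-1) / (n(n-p)) * F).
scale = math.sqrt(p * (n - 1) / (n * (n - p)) * f_crit)

half_lengths = [math.sqrt(lam) * scale for lam in eigenvalues]
ratio = half_lengths[0] / half_lengths[1]

print(f"half lengths: {half_lengths[0]:.3f}, {half_lengths[1]:.3f}")
print(f"major/minor ratio: {ratio:.1f}")
```

The shared factor cancels in the ratio, which is why only the square roots of the eigenvalues matter for the relative elongation.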


    Simultaneous Confidence Statements

    Often we are interested in drawing inference about each $\mu_j$. One possibility is to construct ordinary confidence intervals
    $$\bar{x}_j \pm t_{n-1}(\alpha/2)\,\sqrt{\frac{s_{jj}}{n}}$$
    for each $j$. One problem is that the combined set of individual intervals results in a simultaneous confidence level that is less than the nominal $1-\alpha$.

    There are various ways of constructing a collection of individual confidence intervals so that the joint confidence level for the family of parameters remains at $1-\alpha$.

    Intuitively, CIs that protect against erosion of the confidence level will be wider than the individual $(1-\alpha) \times 100\%$ CIs.


    Simultaneous Confidence Statements (contd)

    Suppose that we have $p$ variables. The population mean of the first variable, $\mu_1$, can be written as $a_1'\mu$ with $a_1' = [1\ 0\ \cdots\ 0]$, and in general, $\mu_j = a_j'\mu$, where $a_j'$ is the $1 \times p$ row vector with a one in the $j$th position and zeros in all other positions.

    Given a sample $x_1, x_2, \ldots, x_n$ of $p$-dimensional vectors, an estimator of $\mu_j$ is $a_j'\bar{x}$, with an estimated variance of $a_j' S a_j / n$.

    Then, an ordinary $(1-\alpha) \times 100\%$ CI for $\mu_j$ can be written as
    $$a_j'\bar{x} \pm t_{n-1}(\alpha/2)\,\sqrt{\frac{a_j' S a_j}{n}}.$$


    Simultaneous Confidence Statements (contd)

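    An alternative way to interpret the ordinary $(1-\alpha) \times 100\%$ confidence interval is as follows: the CI is the set of values of $a_j'\mu$ for which
    $$|t| = \left| \frac{\sqrt{n}\,(a_j'\bar{x} - a_j'\mu)}{\sqrt{a_j' S a_j}} \right| \le t_{n-1}(\alpha/2),$$
    or, equivalently,
    $$t^2 = \frac{n\,(a_j'\bar{x} - a_j'\mu)^2}{a_j' S a_j} = \frac{n\,(a_j'(\bar{x} - \mu))^2}{a_j' S a_j} \le t^2_{n-1}(\alpha/2).$$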


    Simultaneous Confidence Statements (contd)

    Intuitively, if we wish to construct a set of intervals for many different vectors $a$ and have confidence level $1-\alpha$ that all intervals will cover the true $a'\mu$, we will need a larger critical value on the right-hand side of the inequality.

    What is the maximum value that the statistic $t^2$ can reach for some vector $a$?
    $$\max_a t^2 = \max_a \frac{n\,(a'(\bar{x} - \mu))^2}{a' S a} = n(\bar{x} - \mu)' S^{-1} (\bar{x} - \mu) = T^2,$$
    using the maximization lemma (2.50) on page 80 of your textbook (you checked this on an assignment).

    The maximum $T^2$ is achieved when $a$ is proportional to $S^{-1}(\bar{x} - \mu)$.
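For reference, the maximization step can be sketched with the Cauchy-Schwarz inequality. The shorthand $d = \bar{x} - \mu$ is introduced here only for compactness, and $S$ is assumed positive definite so that $S^{1/2}$ and $S^{-1/2}$ exist; this is one standard route to the result and may differ in presentation from the textbook's lemma.

```latex
\begin{aligned}
(a'd)^2 &= \bigl( (S^{1/2}a)'\,(S^{-1/2}d) \bigr)^2 \\
        &\le \bigl( a' S a \bigr)\,\bigl( d' S^{-1} d \bigr)
          \qquad \text{(Cauchy--Schwarz)}, \\[4pt]
\text{so}\quad
\frac{n\,(a'd)^2}{a' S a} &\le n\, d' S^{-1} d = T^2
          \qquad \text{for every } a \ne 0,
\end{aligned}
```

with equality exactly when $S^{1/2}a$ is proportional to $S^{-1/2}d$, that is, when $a$ is proportional to $S^{-1}d$.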


    Simultaneous Confidence Statements (contd)

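    Let $X_1, \ldots, X_n$ be a sample from $N_p(\mu, \Sigma)$. Then simultaneously for all $a$, the intervals given by
    $$a'\bar{X} \pm \sqrt{\frac{p(n-1)}{n-p}\, F_{p,n-p}(\alpha)}\ \sqrt{\frac{a' S a}{n}}$$
    will cover $a'\mu$ with probability of at least $1-\alpha$.

    Proof: recall that
    $$T^2 = n(\bar{x} - \mu)' S^{-1} (\bar{x} - \mu) \le c^2 \ \Longrightarrow\ \frac{n\,(a'(\bar{x} - \mu))^2}{a' S a} \le c^2$$
    for every $a$.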


    Simultaneous Confidence Statements (contd)

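    Equivalently:
    $$a'\bar{x} - c\,\sqrt{a' S a / n} \ \le\ a'\mu \ \le\ a'\bar{x} + c\,\sqrt{a' S a / n} \quad \text{for all } a.$$

    Choosing $c^2 = p(n-1)\,F_{p,n-p}(\alpha)/(n-p)$ results in intervals that contain $a'\mu$ with probability no smaller than
    $$1 - \alpha = \Pr(T^2 \le c^2).$$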


    Simultaneous Confidence Statements (contd)

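    The intervals we just defined are called $T^2$ intervals because their length is determined by the sampling distribution of $T^2$.

    For $a$ the vector with zeros everywhere and 1 in the $j$th position, the $T^2$ interval is
    $$\bar{x}_j - \sqrt{\frac{p(n-1)}{n-p}\, F_{p,n-p}(\alpha)}\ \sqrt{\frac{s_{jj}}{n}} \ \le\ \mu_j \ \le\ \bar{x}_j + \sqrt{\frac{p(n-1)}{n-p}\, F_{p,n-p}(\alpha)}\ \sqrt{\frac{s_{jj}}{n}}.$$

    Note that for $a$ the vector with zeros everywhere except 1 in the $j$th position and $-1$ in the $k$th position, the interval would correspond to $\mu_j - \mu_k$. In this case,
    $$a'\bar{x} = \bar{x}_j - \bar{x}_k, \quad \text{and} \quad a' S a = s_{jj} - 2 s_{jk} + s_{kk}.$$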


    Example: Microwave Ovens (contd)

    Earlier we obtained a simultaneous 95% confidence ellipsoid for $\mu_1$ and $\mu_2$, the means of the fourth root of radiation with door closed and door open.

    We now compute 95% $T^2$ intervals for the two means. First note that
    $$\sqrt{\frac{p(n-1)}{n(n-p)}\, F_{p,n-p}(0.05)} = \sqrt{\frac{2(41)}{42(40)} \times 3.23} = 0.397$$
    is common to both intervals.


    Example: Microwave Ovens (contd)

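    For $\mu_1$, $\mu_2$:
    $$\bar{x}_1 \pm 0.397\sqrt{s_{11}} \ \rightarrow\ 0.564 \pm (0.397 \times 0.12) \ \rightarrow\ 0.564 \pm 0.0476$$
    $$\bar{x}_2 \pm 0.397\sqrt{s_{22}} \ \rightarrow\ 0.603 \pm (0.397 \times 0.121) \ \rightarrow\ 0.603 \pm 0.048.$$

    For the difference between doors closed and open:
    $$\bar{x}_1 - \bar{x}_2 \pm 0.397\sqrt{s_{11} - 2 s_{12} + s_{22}} \ \rightarrow\ -0.039 \pm (0.397 \times 0.0748) \ \rightarrow\ [-0.069,\ -0.009],$$
    suggesting that closing the door significantly reduces the (fourth root) radiation emitted by the ovens.

    The $T^2$ intervals are shadows or projections of the confidence ellipse onto the component axes.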
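The same intervals can be obtained from one general routine for any linear combination $a'\mu$. The sketch below is only illustrative: `t2_interval` is a hypothetical helper, and it uses the three-decimal entries of $S$ printed on the earlier slide, so its endpoints differ slightly from the slide's values, which appear to be based on unrounded variances.

```python
import math

def t2_interval(a, xbar, S, n, p, f_crit):
    """T^2 simultaneous interval for a'mu: a'xbar +/- c * sqrt(a'Sa / n)."""
    k = len(a)
    center = sum(a[i] * xbar[i] for i in range(k))
    a_S_a = sum(a[i] * S[i][j] * a[j] for i in range(k) for j in range(k))
    c = math.sqrt(p * (n - 1) / (n - p) * f_crit)
    half = c * math.sqrt(a_S_a / n)
    return center - half, center + half

n, p, f_crit = 42, 2, 3.23                 # F_{2,40}(0.05) as quoted on the slides
xbar = (0.564, 0.603)
S = ((0.014, 0.012), (0.012, 0.015))       # rounded entries from the slides

results = {}
for label, a in (("mu1", (1, 0)), ("mu2", (0, 1)), ("mu1 - mu2", (1, -1))):
    results[label] = t2_interval(a, xbar, S, n, p, f_crit)
    lo, hi = results[label]
    print(f"{label}: [{lo:.3f}, {hi:.3f}]")
```

Passing a contrast such as `(1, -1)` reproduces the $s_{11} - 2s_{12} + s_{22}$ term automatically, because the routine evaluates the quadratic form $a'Sa$ directly.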


    Comparison of simultaneous and ordinary t intervals

    The ordinary one-at-a-time $t$ intervals each have coverage probability $1-\alpha$, but the joint coverage probability of $p$ intervals is not known.

    In the special case where the covariance matrix is diagonal, the joint coverage probability of $p$ ordinary $t$ intervals is $(1-\alpha)^p$.

    Clearly, to guarantee $1-\alpha$ joint coverage probability, the $t$ intervals need to be made wider.

    How much wider depends on $p$, $n$ and $\alpha$.
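To see how quickly joint coverage erodes in the diagonal-covariance case, the short sketch below evaluates $(1-\alpha)^p$ for a few values of $p$. The function name and the chosen values of $p$ are illustrative assumptions, not part of the slides.

```python
def joint_coverage_independent(alpha, p):
    """Joint coverage of p independent intervals, each with coverage 1 - alpha."""
    return (1.0 - alpha) ** p

alpha = 0.05
for p in (1, 2, 4, 6, 10):
    cov = joint_coverage_independent(alpha, p)
    print(f"p = {p:2d}: joint coverage = {cov:.4f}")
```

Even with only a handful of components, the joint level falls well below the nominal 95%, which motivates the wider simultaneous intervals.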


    Comparison of confidence intervals (contd)

    The multipliers of $(s_{jj}/n)^{1/2}$ in the simultaneous intervals and in the $t$ intervals are, respectively,
    $$\sqrt{\frac{p(n-1)}{n-p}\, F_{p,n-p}(\alpha)} \quad \text{and} \quad t_{n-1}(\alpha/2).$$

    For example, for $\alpha = 0.05$, $n = 15$ and $p = 4$, the simultaneous intervals are
    $$\frac{4.14 - 2.145}{2.145} \times 100\% = 93\%$$
    wider.
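The two multipliers in this example can be recomputed as follows. The value $t_{14}(0.025) = 2.145$ is the one used on the slide; the value $F_{4,11}(0.05) = 3.36$ is not stated in the slides and is taken here from a standard $F$ table as an assumption.

```python
import math

alpha, n, p = 0.05, 15, 4
f_crit = 3.36    # F_{4,11}(0.05): assumed standard table value, not in the slides
t_crit = 2.145   # t_{14}(0.025), as used on the slide

t2_multiplier = math.sqrt(p * (n - 1) / (n - p) * f_crit)
pct_wider = 100.0 * (t2_multiplier - t_crit) / t_crit

print(f"T^2 multiplier = {t2_multiplier:.2f}, t multiplier = {t_crit}")
print(f"simultaneous intervals are about {pct_wider:.0f}% wider")
```

Repeating the calculation with other table values shows the gap between the two multipliers growing as $p$ increases for fixed $n$.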


    Comparison of confidence intervals (contd)

    A one-at-a-time $t$ interval is the correct choice if we are interested in only one of the components of $\mu$.

    While simultaneous $T^2$ intervals have the correct joint coverage probability, they tend to be too conservative if we are only interested in the $p$ components of $\mu$ (as opposed to all possible linear combinations of the components).

    Note that for $p = 2$, the two $T^2$ intervals define a rectangle that contains the ellipse (with 95% coverage probability) and more.

    Thus, the rectangle formed by the two $T^2$ intervals has more than $1-\alpha$ coverage probability.


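    The Bonferroni method for multiple comparisons

    The Bonferroni method is useful when we wish to make a small number $m$ of comparisons for linear combinations $a_1'\mu, \ldots, a_m'\mu$.

    Let $C_i$ denote a confidence statement about $a_i'\mu$ such that $\Pr(C_i \text{ true}) = 1 - \alpha_i$. Then
    $$\begin{aligned}
      \Pr(\text{all } C_i \text{ true}) &= 1 - \Pr(\text{at least one } C_i \text{ false}) \\
        &\ge 1 - \sum_i \Pr(C_i \text{ false}) \\
        &= 1 - \sum_i \bigl(1 - \Pr(C_i \text{ true})\bigr) \\
        &= 1 - (\alpha_1 + \alpha_2 + \cdots + \alpha_m).
    \end{aligned}$$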


    The Bonferroni method for multiple comparisons (contd)

    Consider, for example, $m$ individual $t$ intervals for $\mu_1, \ldots, \mu_m$, with $\alpha_i = \alpha/m$. From the Bonferroni inequality, we have that:
    $$\Pr\left[ \bar{X}_i \pm t_{n-1}\!\left(\frac{\alpha}{2m}\right) \sqrt{\frac{s_{ii}}{n}} \ \text{contains } \mu_i, \text{ for all } i \right] \ \ge\ 1 - \sum_{i=1}^{m} \frac{\alpha}{m} = 1 - \alpha.$$

    In general, to make confidence statements about $p$ means, we divide the significance level by the number of intervals we want to construct, $p$.

    Microwave ovens: see $T^2$ and Bonferroni intervals in the next figure.
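As a numerical companion, the sketch below computes Bonferroni intervals for the two microwave means and compares their multiplier with the $T^2$ multiplier. The critical value $t_{41}(0.0125) = 2.327$ does not appear in the slides and is an assumed table value; the variances are the rounded diagonal entries of $S$ from the earlier slide, and `bonferroni_lower_bound` is a hypothetical helper.

```python
import math

def bonferroni_lower_bound(alphas):
    """Lower bound on joint coverage from the Bonferroni inequality."""
    return 1.0 - sum(alphas)

n, m, alpha = 42, 2, 0.05
t_crit = 2.327                  # t_{41}(alpha / (2m)) = t_{41}(0.0125): assumed table value
xbar = (0.564, 0.603)
s_diag = (0.014, 0.015)         # rounded s_11, s_22 from the slides

intervals = []
for xb, sjj in zip(xbar, s_diag):
    half = t_crit * math.sqrt(sjj / n)
    intervals.append((xb - half, xb + half))

# T^2 multiplier for the same data, using F_{2,40}(0.05) = 3.23 from the slides.
t2_multiplier = math.sqrt(2 * 41 / 40 * 3.23)

print("guaranteed joint coverage:", bonferroni_lower_bound([alpha / m] * m))
for j, (lo, hi) in enumerate(intervals, start=1):
    print(f"mu{j}: [{lo:.3f}, {hi:.3f}]")
print(f"Bonferroni multiplier {t_crit} vs T^2 multiplier {t2_multiplier:.3f}")
```

Under these assumed inputs the Bonferroni multiplier is smaller than the $T^2$ multiplier, so the Bonferroni intervals for the two component means come out shorter.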


    [Figure: $T^2$ and Bonferroni confidence intervals]