Confidence Interval (16 3) Biostatistics


  • 8/8/2019 Confidence Interval (16 3) Biostatistics


    Estimation of Population Means: Point Estimation and Confidence Interval


    Statistics

    - Descriptive
    - Inferential
        - Estimation
            - Point estimate
            - Interval estimate (confidence interval)
        - Hypothesis testing


    Types of Estimators

    Point Estimator

    - It gives a single value as an estimate of the parameter of interest.

    Interval Estimator

    - It specifies a range of values for the parameter and our confidence that the parameter value is in that range.


    Point Estimation

    A point estimate of a population parameter is the sample statistic computed from a random sample drawn from the population under study.

    Certain sample statistics are good point estimators for certain parameters:

    - The sample mean $\bar{X}$ estimates the population mean $\mu$.
    - The sample standard deviation $s$ estimates the population standard deviation $\sigma$.

    The sample mean is a statistic that varies from sample to sample. If the investigator had repeated the experiment, he would have found a range of sample means, any one of which would be a point estimate of the population mean.


    A point estimate is a single number. How much uncertainty is associated with a point estimate of a population parameter?

    The point estimate method fails to indicate how close the estimate is to the population parameter. This flaw can be remedied by using a confidence interval (CI) estimate.

    An interval estimate provides more information about a population characteristic than a point estimate does. It provides a confidence level for the estimate. Such interval estimates are called confidence intervals.

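    [Figure: a confidence interval drawn as a line segment, with the lower confidence limit at the left end, the point estimate at the center, and the upper confidence limit at the right end; the distance between the two limits is the width of the confidence interval.]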


    Interval Estimation

    It is the interval of numbers within which we have a specified degree of assurance that the value of the parameter can be found.

    The level of confidence is the probability that the method produces an interval that includes the unknown parameter.

    It gives information about closeness to the unknown population parameter.

    It is stated in terms of a level of confidence. (We can never be 100% confident.)


    Confidence interval for population parameter

    A confidence interval is a formula that tells us how to use sample data to calculate an interval that estimates a population parameter, e.g. the population mean ($\mu$).

    The confidence level is the confidence coefficient expressed as a percentage, i.e. $100(1 - \alpha)\%$.


    Empirical Rule Definition

    For data sets having a normal, bell-shaped distribution, the following properties apply:

    About 68% of all values fall within 1 standard deviation of the mean.

    About 95% of all values fall within 2 standard deviations of the mean.

    About 99.7% of all values fall within 3 standard deviations of the mean.
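
    The three percentages above can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `proportion_within` is an illustrative assumption, not part of the original slides.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, standard deviation 1).
std_normal = NormalDist(mu=0.0, sigma=1.0)


def proportion_within(k: float) -> float:
    """Proportion of a normal distribution lying within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {proportion_within(k):.4f}")
```

    The printed proportions come out close to 0.68, 0.95 and 0.997, which is why the rule is stated with "about".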


    Confidence Interval

    The general formula for all confidence intervals is:

    Point Estimate ± (Critical Value) × (Standard Error)

    Using the Empirical Rule for the normal distribution, we know that the interval $\mu \pm 2\sigma/\sqrt{n}$, or more precisely $\mu \pm 1.96\,\sigma/\sqrt{n}$, includes 95% of the sample means $\bar{X}$ in repeated sampling. Equivalently, the interval $\bar{X} \pm 1.96\,\sigma/\sqrt{n}$ contains $\mu$ for 95% of such samples.


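    Consider a 95% confidence interval: $1 - \alpha = 0.95$, so $\alpha = 0.05$ and $\alpha/2 = 0.025$.

    [Figure: standard normal curve centered at $Z = 0$, with area 0.475 on each side of the center between $Z = -1.96$ and $Z = 1.96$, and area 0.025 in each tail. On the matching axis for the estimate, the lower confidence limit, the point estimate, and the upper confidence limit are marked.]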


    Confidence Intervals

    Formula:

    $\bar{X} - Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$

    Steps:

    1. Calculate the sample statistic to use as an estimate of the population parameter.

    2. Calculate the lower limit (LL) and the upper limit (UL) of the confidence interval.
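
    The two steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: $\sigma$ is treated as known, the function name `z_confidence_interval` is hypothetical, and the `readings` list is made-up data that does not come from the slides.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_confidence_interval(sample, sigma, confidence=0.95):
    """Return (LL, UL) for the population mean, assuming sigma is known."""
    n = len(sample)
    x_bar = mean(sample)                          # Step 1: point estimate
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # critical value Z_{alpha/2}
    margin = z * sigma / sqrt(n)                  # critical value * standard error
    return x_bar - margin, x_bar + margin         # Step 2: lower and upper limits


# Hypothetical measurements, invented purely for illustration.
readings = [46.1, 48.3, 47.9, 45.7, 49.0, 47.2, 46.8, 48.5, 47.4]
lower, upper = z_confidence_interval(readings, sigma=1.2)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

    The critical value is computed from the normal distribution rather than hard-coded, so the same function serves any confidence level.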


    Determination of $\sigma$

    In order to construct an interval estimate, it is necessary to obtain some estimate of $\sigma$, the variability of the population from which the sample is drawn.

    This is required to obtain an estimate of the standard error of the sample mean:

    $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

    Generally, the sample standard deviation $s$ is used as an estimate of $\sigma$.

    For a small sample, where $n < 30$, the t-distribution should be used, again using $s$ as an estimate of $\sigma$.
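
    As a small sketch of the estimate described above, the standard error can be computed from a sample by substituting $s$ for $\sigma$. The function name `estimated_standard_error` and the `values` list are illustrative assumptions. The t-distribution critical value needed for small samples is not shown, because the Python standard library does not provide it.

```python
from math import sqrt
from statistics import stdev


def estimated_standard_error(sample):
    """Estimate the standard error of the mean as s / sqrt(n).

    s is the sample standard deviation (n - 1 in the denominator),
    used here as the estimate of the population sigma.
    """
    return stdev(sample) / sqrt(len(sample))


# Hypothetical values, invented purely for illustration.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
se = estimated_standard_error(values)
print(f"estimated standard error: {se:.4f}")
```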


    Level of Confidence

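    The level of confidence is the probability that the interval-building procedure captures the unknown population parameter: out of 100 repeated trials, about $100(1 - \alpha)$ of the intervals would contain it. It is denoted $100(1 - \alpha)\%$, e.g. 90%, 95%, 99%.

    $\alpha$ is the probability that the parameter is not within the interval: out of 100 repeated trials, about $100\alpha$ of the intervals would miss it.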


    Selecting a confidence level

    There is no one confidence level that is appropriate for all circumstances.

    A greater confidence level means greater certainty that the interval estimate of $\mu$ actually contains $\mu$. But for a 99% or 99.9% confidence level, the interval may be very wide.

    Smaller confidence levels (e.g. 80% or 90%) produce smaller margins of error and seemingly more precise interval estimates, but they are less likely to contain $\mu$.

    By tradition, the default level is 95%.


    Interpretation

    The interpretation of the confidence interval is very important. Basically, it means that upon taking a sample of size $n$ repeatedly and constructing the interval $\bar{X} \pm 1.96\,\sigma/\sqrt{n}$ each time, we would expect about 95% of the resulting intervals to contain the population mean $\mu$.
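
    The repeated-sampling interpretation can be illustrated with a small simulation. This is a sketch under assumptions that are not in the slides: the population is taken to be normal with arbitrarily chosen $\mu = 50$ and $\sigma = 10$, and the function name `coverage` is hypothetical.

```python
import random
from math import sqrt


def coverage(mu, sigma, n, trials, z=1.96, seed=0):
    """Fraction of simulated intervals x_bar +/- z*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)              # fixed seed so the run is repeatable
    half_width = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / trials


# Population values chosen arbitrarily for illustration.
rate = coverage(mu=50.0, sigma=10.0, n=25, trials=10_000)
print(f"share of intervals containing mu: {rate:.3f}")
```

    The share should land near 0.95. Note that $\mu$ stays fixed throughout, and it is the interval that changes from sample to sample.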


    Interpretation of a Confidence Interval for Population Mean ($\mu$)

    We can be $100(1 - \alpha)\%$ confident that $\mu$ lies between the lower and upper bounds of the confidence interval.

    Put another way, for a 95% interval: upon taking a sample of size $n$ repeatedly and constructing the interval $\bar{X} \pm 1.96\,\sigma/\sqrt{n}$ each time, we would expect about 95% of the resulting intervals to contain the population mean $\mu$.

    The bounds are called the lower and upper $100(1 - \alpha)\%$ confidence limits.


    Commonly used values of $Z_{\alpha/2}$

    Confidence level 100(1 - α)%    α       α/2      Z_{α/2}
    90%                             0.10    0.05     1.645
    95%                             0.05    0.025    1.96
    99%                             0.01    0.005    2.575
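
    The tabulated critical values can be reproduced from the standard normal quantile function. A minimal sketch using the standard-library `statistics.NormalDist.inv_cdf`; the helper name `z_critical` is an illustrative assumption.

```python
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Two-sided critical value Z_{alpha/2} for a given confidence level."""
    alpha = 1.0 - confidence
    # Z_{alpha/2} leaves area alpha/2 in the upper tail of the standard normal.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    alpha = 1.0 - level
    print(f"{level:.0%}: alpha = {alpha:.2f}, alpha/2 = {alpha / 2:.3f}, "
          f"Z = {z_critical(level):.3f}")
```

    For the 99% level the computed value is about 2.576. The 2.575 shown in the table is a common rounding of the same quantity.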


    Example 2

    If we wish to estimate the mean VO2 uptake for a population of joggers based on a random sample of 100 joggers, we could use the 95% confidence interval for $\mu$. From our random sample of 100 joggers we know that $\bar{X} = 47.5$ ml/kg and $S = 4.8$ ml/kg. A 95% confidence interval (C.I.) for $\mu$ is

    $\bar{X} \pm 1.96\,S/\sqrt{n}$, or $47.5 \pm 1.96\,(4.8)/10$

    $47.5 \pm 0.94$, or $(46.56, 48.44)$

    The values 46.56 and 48.44 are the lower and upper 95% confidence limits.

    Interpretation: in the long run, 95% of the intervals constructed in this way will contain the population mean $\mu$.


    Example 3

    If we wish to estimate the mean VO2 uptake for a population of joggers based on a random sample of 100 joggers, we could use the 99% confidence interval for $\mu$. From our random sample of 100 joggers we know that $\bar{X} = 47.5$ ml/kg and $S = 4.8$ ml/kg. A 99% confidence interval (C.I.) for $\mu$ is

    $\bar{X} \pm 2.575\,S/\sqrt{n}$, or $47.5 \pm 2.575\,(4.8)/10$

    $47.5 \pm 1.24$, or $(46.26, 48.74)$

    The values 46.26 and 48.74 are the lower and upper 99% confidence limits.

    Interpretation: in the long run, 99% of the intervals constructed in this way will contain the population mean $\mu$.
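
    The arithmetic in Examples 2 and 3 can be verified with a short script. This sketch uses the summary statistics and critical values given in the examples; the function name `ci_from_summary` is an illustrative assumption.

```python
from math import sqrt


def ci_from_summary(x_bar, s, n, z):
    """Confidence limits x_bar +/- z * s / sqrt(n) from summary statistics."""
    margin = z * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Values from the jogger examples: mean 47.5, S = 4.8, n = 100.
ci95 = ci_from_summary(47.5, 4.8, 100, z=1.96)
ci99 = ci_from_summary(47.5, 4.8, 100, z=2.575)

print("95% CI:", tuple(round(limit, 2) for limit in ci95))
print("99% CI:", tuple(round(limit, 2) for limit in ci99))
```

    Both intervals are centered on the same point estimate, and the 99% interval is the wider of the two.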


    Width of a Confidence Interval

    The width of any confidence interval is the difference between the upper confidence limit and the lower confidence limit.

    The width of a confidence interval reflects the precision of the estimate: the narrower the interval, the more precise the estimate.


    Factors Affecting Interval Width

    - Data variation, measured by $\sigma$
    - Sample size $n$
    - Level of confidence $(1 - \alpha)$

    Confidence Interval Estimate

    $\bar{X} - Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$

    Narrow widths and high confidence levels are desirable, but these two things affect each other.


    Why Narrow Confidence Intervals Are Important

    Narrow confidence intervals are of the greatest value in making estimates, because they allow us to estimate an unknown parameter with little room for error.

    A confidence interval can be narrowed by:

    - Increasing the sample size.
    - Reducing the confidence level $100(1 - \alpha)\%$.
    - Reducing the sources of variability in the observations, thus producing less variance.
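
    Each of the three ways of narrowing an interval can be seen in the width formula $2\,Z_{\alpha/2}\,\sigma/\sqrt{n}$. The following is a minimal sketch: the function name `ci_width` and the particular values of $\sigma$, $n$ and the confidence level are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def ci_width(sigma: float, n: int, confidence: float) -> float:
    """Width (upper limit minus lower limit) of a z-based interval for the mean."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return 2.0 * z * sigma / sqrt(n)


baseline = ci_width(sigma=4.8, n=100, confidence=0.95)
larger_sample = ci_width(sigma=4.8, n=400, confidence=0.95)     # 4x the sample size
lower_confidence = ci_width(sigma=4.8, n=100, confidence=0.90)  # reduced confidence level
less_variation = ci_width(sigma=2.4, n=100, confidence=0.95)    # half the variability

print(f"baseline:         {baseline:.3f}")
print(f"larger sample:    {larger_sample:.3f}")
print(f"lower confidence: {lower_confidence:.3f}")
print(f"less variation:   {less_variation:.3f}")
```

    Because the width depends on $1/\sqrt{n}$, quadrupling the sample size only halves the width, whereas halving $\sigma$ halves it directly.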


    Cautions about interval estimates

    There are many assumptions involved in interval estimation:

    - The sample is randomly selected from a population.
    - The sample size is sufficiently large.
    - The population standard deviation is known, or $s$ is a good estimate of $\sigma$.
    - The selection of a confidence level is an arbitrary process.
    - The population is not too skewed.

    As a result, interval estimates are not precise, but are estimates or approximations.

    Larger $n$, repeated sampling, comparisons with other studies, and careful sampling and survey design and practice can improve the quality of the estimates.


    95%Confidence Interval

    The 95% confidence interval is the most frequently reported. Note that when you see certain interval estimates reported on TV (for example, some business or medical statistics), the confidence level is not mentioned, but it is understood that it is based on a 95% confidence level.

    68% CI    More error      Narrow CI
    95% CI    Medium error    Narrow CI
    99% CI    Less error      Wider CI