
PARAMETRIC STATISTICAL INFERENCE

INFERENCE:

• Methodologies that allow us to draw conclusions about population parameters from sample statistics

TYPES OF INFERENCE:

1. Estimation

2. Hypothesis testing

• Methods based on statistical relationships between samples and populations

• POINT ESTIMATION: estimation of a parameter from a sample statistic

– For the mean, standard deviation, etc.

• INTERVAL ESTIMATION: using a sample to identify an interval within which the population parameter is thought to lie, with a certain probability


ESTIMATION OF POPULATION MEAN

• Sample mean value is only an estimate of the parameter mean value

– Parameter value is not known

• Due to sampling variability, no two samples will produce exactly the same outcome, or sample mean

• Can we estimate how this sample mean value would vary if we took many large samples from the same population?

Remember:

• Sample mean values from large samples have a normal distribution

• The mean of the sampling distribution is the same as the unknown parameter $\mu$

• The standard deviation of $\bar{x}$ for an SRS of size n is ?
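The question of how the sample mean varies from sample to sample can be explored numerically. The following is a minimal sketch, assuming a hypothetical normal population (mean 50, standard deviation 10, sample size 40; none of these values come from the slides). It draws many SRSs, records each sample mean, and compares the spread of those means with $\sigma/\sqrt{n}$, the standard textbook result for the standard deviation of $\bar{x}$.

```python
import math
import random
import statistics


def simulate_sample_means(mu, sigma, n, num_samples, seed=0):
    """Draw num_samples simple random samples of size n and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(num_samples)
    ]


# Hypothetical population values, chosen only for illustration.
MU, SIGMA, N = 50.0, 10.0, 40

means = simulate_sample_means(MU, SIGMA, N, num_samples=5000)
observed_center = statistics.fmean(means)      # should sit close to MU
observed_spread = statistics.stdev(means)      # spread of the sample means
theoretical_spread = SIGMA / math.sqrt(N)      # sigma / sqrt(n)

print(f"mean of sample means: {observed_center:.3f} (population mean {MU})")
print(f"SD of sample means:   {observed_spread:.3f}")
print(f"sigma / sqrt(n):      {theoretical_spread:.3f}")
```

With these settings the simulated spread should land close to $\sigma/\sqrt{n}$, which is why larger samples give more stable sample means.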


PARAMETRIC STATISTICAL INFERENCE: ESTIMATION

• Example: A random sample of 350 male college students was asked for the number of units they were taking. The mean was 12.3 units, with a standard deviation of 2.50 units.

• What can we say about the mean number of units of all male students at the university? How will the estimated value of the parameter vary from one sample to another with a certain confidence, like 95%?

 Assume that = ?. s = ?

  


PARAMETRIC STATISTICAL INFERENCE: ESTIMATION

Statistical confidence

Remember: the 68-95-99.7 rule

• In 95% of all samples, the sample mean $\bar{x}$ will lie within 2 standard deviations of the population mean $\mu$.

• Since s = 2.50 (taken here as the standard deviation of $\bar{x}$), we can say that in 95% of samples, $\mu$ will lie within 5.0 units of the observed sample mean $\bar{x}$

• In 95% of all samples,

$$\bar{x} - 5.0 \le \mu \le \bar{x} + 5.0$$

• Thus, the parameter $\mu$ will lie between 7.3 and 17.3, in 95% of samples
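The arithmetic on this slide can be checked with a short sketch. It reproduces the interval 7.3 to 17.3 by placing 2 × s on either side of the sample mean, as the slide does. As an additional comparison that is an assumption of this sketch and not part of the slide, it also computes the interval obtained when the spread of $\bar{x}$ is estimated by $s/\sqrt{n}$, the usual standard error for an SRS of size n.

```python
import math

n = 350        # sample size, from the example
xbar = 12.3    # sample mean, in units
s = 2.50       # sample standard deviation, in units

# Interval as computed on the slide: 2 x s on either side of x-bar.
slide_interval = (xbar - 2 * s, xbar + 2 * s)

# Assumed comparison (not on the slide): spread of x-bar estimated by s / sqrt(n).
standard_error = s / math.sqrt(n)
se_interval = (xbar - 2 * standard_error, xbar + 2 * standard_error)

print(f"slide interval:           {slide_interval[0]:.1f} to {slide_interval[1]:.1f}")
print(f"standard error s/sqrt(n): {standard_error:.3f}")
print(f"interval using s/sqrt(n): {se_interval[0]:.2f} to {se_interval[1]:.2f}")
```

With n = 350 the standard error is far smaller than s itself, so the second interval is much narrower than the first. Which spread is appropriate depends on whether 2.50 describes individual students or the sample mean.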


PARAMETRIC STATISTICAL INFERENCE: ESTIMATION

Rephrasing:

1. We are 95% confident that the interval 7.3-17.3 contains $\mu$

• We have just assigned statistical confidence to our estimation of the parameter

• We call this estimated interval a CONFIDENCE INTERVAL for the mean value


PARAMETRIC STATISTICAL INFERENCE: ESTIMATION

• But there is still some chance that the true parameter value will not lie in the identified interval

• e.g. the SRS chosen was one of the few samples for which $\bar{x}$ is not within 5.0 units of the true mean. 5% of samples will give these incorrect results


PARAMETRIC STATISTICAL INFERENCE: ESTIMATION

CONFIDENCE INTERVAL – formal definition

• A level C confidence interval for a parameter is defined as

$$\text{estimate} \pm \text{margin of error}$$

and gives the interval that will capture the true parameter value in repeated samples with probability C

• Confidence levels usually range between 90% and 99.9%
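The phrase "capture the true parameter value in repeated samples" can be illustrated by simulation. The sketch below assumes a hypothetical normal population with known standard deviation (mean 100, SD 15, n = 25; values chosen only for illustration). It builds a 95% interval of the form estimate ± margin of error from each of many samples and counts how often the interval contains the true mean.

```python
import math
import random
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, level, num_samples, seed=1):
    """Fraction of level-C intervals (known sigma) that contain the true mean mu."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z_star * sigma / math.sqrt(n)
    hits = 0
    for _ in range(num_samples):
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / num_samples


# Hypothetical population values, chosen only for illustration.
rate = coverage(mu=100.0, sigma=15.0, n=25, level=0.95, num_samples=4000)
print(f"share of intervals containing mu: {rate:.3f}")
```

The share should come out near 0.95. The confidence level describes the long-run behaviour of the procedure, not any single interval.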


PARAMETRIC STATISTICAL INFERENCE: ESTIMATION

BUILDING CONFIDENCE INTERVALS

If we know the parameters $\mu$ and $\sigma$, we can standardize the sample mean $\bar{x}$. The result is the ONE-SAMPLE Z STATISTIC

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

The z statistic tells us how far the observed $\bar{x}$ is from $\mu$, in units of standard deviations of $\bar{x}$. Because $\bar{x}$ has a normal distribution, z has the standard normal distribution N(0,1).


PARAMETRIC STATISTICAL INFERENCE: ESTIMATION

    Constructing confidence intervals

When we construct a 95% confidence interval, we are looking for two values for which there is a 95% chance that the population mean is between them. So,

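$$P(\text{Low} < \mu < \text{High}) = 0.95$$

Thus,

$$
\begin{aligned}
0.95 &= P(-1.96 < z < 1.96) \\
&= P\left(-1.96 < \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} < 1.96\right) \\
&= P\left(-1.96\,\frac{\sigma}{\sqrt{n}} < \bar{x} - \mu < 1.96\,\frac{\sigma}{\sqrt{n}}\right) \\
&= P\left(-\bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}} < -\mu < -\bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}}\right) \\
0.95 &= P\left(\bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}}\right)
\end{aligned}
$$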


PARAMETRIC STATISTICAL INFERENCE: ESTIMATION

Draw an SRS of size n from a population having unknown mean $\mu$ and known standard deviation $\sigma$. A level C confidence interval for $\mu$ is

$$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

that is,

$$\bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

where $\alpha = 1 - C$ represents the probability that the interval will not capture the true parameter value in repeated samples (the significance level), and C is the confidence level.

This interval is exact when the population distribution is normal and is approximately correct for large n in other cases.


Figure 6.5 and Figure 6.6: Confidence intervals and confidence levels on the standardized normal curve N(0,1)

• $z^* = z_{\alpha/2}$

• C = chosen confidence level: the probability that a parameter will lie within a given interval

• (1 − C)/2 = $\alpha/2$ = probability in each tail, i.e. that the parameter lies above the upper confidence limit (and likewise below the lower confidence limit)
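The relationship between C, $\alpha/2$ and $z^*$ can be tabulated directly. This is a small sketch using the standard normal quantile function from Python's standard library; the helper name `z_critical` is hypothetical and the four confidence levels are chosen only as common examples.

```python
from statistics import NormalDist


def z_critical(level):
    """Return z_{alpha/2}: the point leaving area (1 - level)/2 in the upper tail of N(0,1)."""
    alpha = 1 - level
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99, 0.999):
    tail = (1 - level) / 2
    print(f"C = {level:.3f}   alpha/2 = {tail:.4f}   z* = {z_critical(level):.3f}")
```

For C = 0.95 this gives the familiar value of about 1.96, and for C = 0.99 about 2.576.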


PARAMETRIC STATISTICAL INFERENCE: ESTIMATION

  Example:

• A manufacturer of pharmaceutical products analyzes a specimen from each batch of a product to verify the concentration of the active ingredient. The chemical analysis is not perfectly precise. Repeated measurements on the same specimen give slightly different results. The results of repeated measurements follow a normal distribution. The analysis procedure has no bias, so the mean of the population of all measurements is the true concentration in the specimen. The standard deviation of this distribution is known to be 0.0068 g/l. Three analyses of one specimen give the following concentrations 

0.8403 0.8363 0.8447

• Calculate the 99% confidence interval for the true concentration.

$$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
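One way to carry out this calculation is sketched below. It applies $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ to the three measurements with the known standard deviation of 0.0068 g/l. The variable names are illustrative, and the critical value is taken from the standard normal quantile function instead of a printed table.

```python
import math
from statistics import NormalDist, fmean

measurements = [0.8403, 0.8363, 0.8447]   # g/l, from the example
sigma = 0.0068                             # known SD of the analysis, g/l
level = 0.99                               # confidence level C

xbar = fmean(measurements)
z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
margin = z_star * sigma / math.sqrt(len(measurements))
low, high = xbar - margin, xbar + margin

print(f"x-bar = {xbar:.4f} g/l, z* = {z_star:.3f}, margin = {margin:.4f} g/l")
print(f"99% CI: {low:.4f} to {high:.4f} g/l")
```

The result agrees with the interval of 0.8303 to 0.8505 g/l quoted for n = 3 later in the deck.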


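PARAMETRIC STATISTICAL INFERENCE: ESTIMATION

INTERVAL ESTIMATION OF $\mu$ WITH UNKNOWN $\sigma$

• $\sigma$ replaced with estimate s – introduces more uncertainty

• STUDENT’S T-DISTRIBUTION, not the standard normal curve

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

$$\bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$$

$$\bar{x} - t_{\alpha/2}\,\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2}\,\frac{s}{\sqrt{n}}$$

Compare with the interval for known $\sigma$:

$$\bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$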


PARAMETRIC STATISTICAL INFERENCE: ESTIMATION

INTERVAL ESTIMATION OF $\mu$ WITH UNKNOWN $\sigma$

• Intervals derived from the t-distribution are wider than those found with the z-distribution

• For large samples (n ≥ 30), it makes little practical difference which distribution we use to estimate the confidence interval
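To see how much wider a t-based interval can be for a small sample, the sketch below reuses the three concentration measurements from the deck's pharmaceutical example. It is an illustration under stated assumptions: it pretends $\sigma$ is unknown and estimates it with the sample standard deviation s, and it hard-codes the t critical value 9.925 (upper-tail area 0.005, 2 degrees of freedom) as read from a standard t table, because Python's standard library has no t quantile function.

```python
import math
from statistics import NormalDist, fmean, stdev

measurements = [0.8403, 0.8363, 0.8447]   # g/l, from the deck's example
n = len(measurements)
xbar = fmean(measurements)
s = stdev(measurements)                    # sample SD, divides by n - 1

# Assumption: critical value copied from a printed t table
# (upper-tail area 0.005, n - 1 = 2 degrees of freedom).
T_CRIT_99_DF2 = 9.925
z_star = NormalDist().inv_cdf(0.995)       # 99% level, standard normal

t_margin = T_CRIT_99_DF2 * s / math.sqrt(n)    # sigma unknown, uses s
z_margin = z_star * 0.0068 / math.sqrt(n)      # sigma known (0.0068 g/l)

print(f"s = {s:.4f} g/l")
print(f"t interval: {xbar - t_margin:.4f} to {xbar + t_margin:.4f} g/l")
print(f"z interval: {xbar - z_margin:.4f} to {xbar + z_margin:.4f} g/l")
```

With only three measurements the t margin is more than twice the z margin, even though s is smaller than the known $\sigma$ here. The gap shrinks as n grows because the t critical value approaches the z value.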


PARAMETRIC STATISTICAL INFERENCE: ESTIMATION

HOW CONFIDENCE INTERVALS BEHAVE

• Ideal situation – high confidence and small margin of error

• Margin of error (E) = $z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$

• The smaller the margin of error, the more precise our estimation of $\mu$


PARAMETRIC STATISTICAL INFERENCE: ESTIMATION

Properties of error

1. Error increases with smaller sample size

– For any confidence level, large samples reduce the margin of error

2. Error increases with larger standard deviation

– As variation among the individuals in the population increases, so does the error of our estimate

3. Error increases with larger z values

– Tradeoff between confidence level and margin of error
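The three properties above follow directly from the margin of error formula $E = z_{\alpha/2}\,\sigma/\sqrt{n}$. A minimal sketch, assuming hypothetical baseline values ($\sigma$ = 10, n = 25, C = 0.95) and changing one input at a time:

```python
import math
from statistics import NormalDist


def margin_of_error(sigma, n, level):
    """E = z_{alpha/2} * sigma / sqrt(n) for a level-C interval with known sigma."""
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return z_star * sigma / math.sqrt(n)


# Hypothetical baseline, chosen only for illustration.
baseline = margin_of_error(sigma=10.0, n=25, level=0.95)
larger_n = margin_of_error(sigma=10.0, n=100, level=0.95)      # 4x the sample size
larger_sigma = margin_of_error(sigma=20.0, n=25, level=0.95)   # 2x the variability
higher_level = margin_of_error(sigma=10.0, n=25, level=0.99)   # more confidence

print(f"baseline:            {baseline:.2f}")
print(f"n = 100:             {larger_n:.2f}")
print(f"sigma = 20:          {larger_sigma:.2f}")
print(f"99% confidence:      {higher_level:.2f}")
```

Because n enters through a square root, quadrupling the sample size only halves the margin of error, whereas doubling $\sigma$ doubles it.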


Figure 8-10 and Figure 8-11

• Interval width (error) increases with increased confidence level

• Higher confidence levels have higher z values

• Error is high in small samples


PARAMETRIC STATISTICAL INFERENCE: ESTIMATION

Example: Calculate the 99% confidence interval for a sample size of 1, with $\bar{x}$ = 0.8404 g/l and $\sigma$ = 0.0068 g/l

• The 99% confidence interval for n = 3 was 0.8303 to 0.8505 g/l

• How do these compare in relation to the mean? Which one has the larger margin of error?
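The comparison can be checked with a short sketch that applies $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ for n = 1 and n = 3, using the values given in the example. The dictionary name `intervals` is illustrative only.

```python
import math
from statistics import NormalDist

xbar, sigma, level = 0.8404, 0.0068, 0.99   # g/l, from the example
z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)

intervals = {}
for n in (1, 3):
    margin = z_star * sigma / math.sqrt(n)
    intervals[n] = (xbar - margin, xbar + margin, margin)
    low, high, _ = intervals[n]
    print(f"n = {n}: {low:.4f} to {high:.4f} g/l (margin {margin:.4f})")
```

Both intervals are centred on the same mean, but the single-measurement interval is wider by a factor of $\sqrt{3}$.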


CHOOSING SAMPLE SIZE

• Sometimes we wish to estimate our mean within a certain margin of error.

• Sometimes we wish to determine the sample size needed to achieve a given margin of error

• Here is how… Remember:

Margin of error (E) = $z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$

To obtain a desired value of E, for a given confidence level, you need to figure out n. From the above,

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$$

• It is the sample size that determines the margin of error

• Required sample size depends on the desired level of confidence


CHOOSING SAMPLE SIZE

Example: Management asks the pharmaceutical laboratory to produce results accurate to within 0.005 g/l with 95% confidence. How many measurements must be averaged to comply with this request?

• m = 0.005 g/l (the desired margin of error)

• For 95% confidence level, z = ?

• $\sigma$ = 0.0068 g/l

 


CHOOSING SAMPLE SIZE

Example: Management asks the pharmaceutical laboratory to produce results accurate to within 0.005 g/l with 95% confidence. How many measurements must be averaged to comply with this request?

• m = 0.005 g/l (the desired margin of error)

• For 95% confidence level, z = 1.960

• $\sigma$ = 0.0068 g/l

$$n = \left(\frac{z\,\sigma}{m}\right)^2 = \left(\frac{1.96 \times 0.0068}{0.005}\right)^2 = 7.1$$

• Is n = 7 or n = 8? Choose the one that will give a smaller margin of error. How should we always round to meet the requirements?
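The rounding question can be settled by computing the margin of error for both candidate sample sizes. The sketch below is one way to do it; the function name `required_sample_size` is hypothetical, and the z value comes from the standard normal quantile function instead of the rounded 1.960.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma, margin, level):
    """Return (raw n, n rounded up) so that z * sigma / sqrt(n) <= margin."""
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    raw_n = (z_star * sigma / margin) ** 2
    return raw_n, math.ceil(raw_n)


sigma, m, level = 0.0068, 0.005, 0.95       # g/l, g/l, confidence level
raw_n, n = required_sample_size(sigma, m, level)
z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)

margin_7 = z_star * sigma / math.sqrt(7)
margin_8 = z_star * sigma / math.sqrt(8)

print(f"raw n = {raw_n:.2f}, rounded up n = {n}")
print(f"margin with n = 7: {margin_7:.5f} g/l")
print(f"margin with n = 8: {margin_8:.5f} g/l")
```

With n = 7 the margin is still slightly above 0.005 g/l, while n = 8 brings it below the target, so the sample size is rounded up.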


SUMMARY

All formulas for inference are only correct under certain conditions

o Most inference methods have several assumptions attached to them that must be met if the outcomes produced by them are to be reliable.

The confidence interval formula has the following assumptions:

1. The data must come from a simple random sample.

– Different methods exist for stratified and multistage samples

– Undercoverage and non-response can add error

2. $\bar{x}$ must be a normally distributed random variable

3. There must be no outliers. Is the formula sensitive to outliers?

4. If the sample size is small (<15) and/or $\sigma$ is not known, but the distribution of x is still normal, the t-distribution must be used to compute the interval

5. When $\sigma$ is known, use the z-distribution. For large sample sizes we can assume that $\sigma$ = s and use either the z or t distribution