STA 291 Fall 2009

Lecture 14
Dustin Lueker


Statistical Inference: Estimation

Inferential statistical methods provide predictions about characteristics of a population, based on information in a sample from that population
◦ Quantitative variables
  Usually estimate the population mean
  e.g. mean household income
◦ Qualitative variables
  Usually estimate population proportions
  e.g. proportion of people voting for candidate A


Two Types of Estimators

Point Estimate
◦ A single number that is the best guess for the parameter
  The sample mean is usually a good guess for the population mean
Interval Estimate
◦ Point estimate with an error bound
  A range of numbers around the point estimate
  Gives an idea about the precision of the estimator
  e.g. the proportion of people voting for A is between 67% and 73%



Point Estimator

A point estimator of a parameter is a sample statistic that predicts the value of that parameter
A good estimator is
◦ Unbiased
  Centered around the true parameter
◦ Consistent
  Gets closer to the true parameter as the sample size gets larger
◦ Efficient
  Has a standard error that is as small as possible (makes use of all available information)
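
Consistency can be illustrated with a small simulation. The sketch below is only an illustration under assumed values: the normal population (mean 50, standard deviation 10), the sample sizes, and the helper name `mean_abs_error` are hypothetical and not part of the lecture. It measures how far the sample mean typically lands from the true mean as n grows.

```python
import random
import statistics

def mean_abs_error(n, rng, mu=50.0, sigma=10.0, reps=200):
    """Average distance between the sample mean and mu over `reps` samples of size n."""
    errors = []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        errors.append(abs(statistics.fmean(sample) - mu))
    return statistics.fmean(errors)

rng = random.Random(291)  # fixed seed so the run is reproducible

# Typical error of the sample mean for increasing sample sizes.
errors_by_n = {n: mean_abs_error(n, rng) for n in (10, 100, 1000)}
for n, err in errors_by_n.items():
    print(f"n = {n:5d}: typical error of the sample mean = {err:.3f}")
```

The printed errors should shrink as n increases, which is what "consistent" means for the sample mean.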



Unbiased

An estimator is unbiased if its sampling distribution is centered around the true parameter
◦ For example, we know that the mean of the sampling distribution of $\bar{x}$ equals μ, which is the true population mean
  So, $\bar{x}$ is an unbiased estimator of μ
Note: For any particular sample, the sample mean $\bar{x}$ may be smaller or greater than the population mean
  Unbiased means that there is no systematic underestimation or overestimation
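
The note above can be checked with a simulation. This is a minimal sketch under assumptions that are not in the lecture: a skewed (exponential) population with true mean 2, samples of size 10, and 5000 repetitions. Individual sample means land on both sides of μ, but their average sits very close to μ.

```python
import random
import statistics

rng = random.Random(14)   # fixed seed so the run is reproducible
mu = 2.0                  # true population mean (assumed for illustration)
n, reps = 10, 5000

# Draw many samples of size n and record each sample mean.
sample_means = []
for _ in range(reps):
    sample = [rng.expovariate(1 / mu) for _ in range(n)]
    sample_means.append(statistics.fmean(sample))

below = sum(m < mu for m in sample_means)   # samples that underestimate mu
above = reps - below                        # samples that overestimate mu
average_of_means = statistics.fmean(sample_means)

print(f"sample means below mu: {below}, above mu: {above}")
print(f"average of all sample means: {average_of_means:.3f} (true mean {mu})")
```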



Biased

A biased estimator systematically underestimates or overestimates the population parameter
◦ The definition of the sample variance and sample standard deviation uses n−1 instead of n, because this makes the variance estimator unbiased
◦ With n in the denominator, it would systematically underestimate the variance
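
The effect of the denominator can be verified exactly on a tiny example. The sketch below assumes a hypothetical population (the six faces of a fair die) and samples of size 3 drawn with replacement. It enumerates every possible sample, so the averages are exact expected values rather than simulation estimates.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4, 5, 6]   # hypothetical population: faces of a fair die
n = 3                             # sample size

N = len(population)
mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N   # true population variance

def sum_sq_dev(sample):
    """Sum of squared deviations from the sample mean."""
    m = Fraction(sum(sample), len(sample))
    return sum((x - m) ** 2 for x in sample)

# All equally likely ordered samples of size n, drawn with replacement.
samples = list(product(population, repeat=n))

expected_n_minus_1 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)
expected_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)

print("population variance:      ", sigma2)
print("average with n-1 in denom:", expected_n_minus_1)
print("average with n in denom:  ", expected_n)
```

With n−1 the average over all samples equals the population variance (35/12 here), while with n it comes out smaller by the factor (n−1)/n, giving 35/18.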



Efficient

An estimator is efficient if its standard error is small compared to other estimators
◦ Such an estimator has high precision
A good estimator has small standard error and small bias (or no bias at all)
◦ The following pictures represent different estimators with different bias and efficiency
◦ Assume that the true population parameter is the point (0,0) in the middle of the picture



Bias and Efficiency

Note that even an unbiased and efficient estimator does not always hit the population parameter exactly.
But in the long run, it is the best estimator.
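
Efficiency can be made concrete by comparing two unbiased estimators of the same parameter. The following sketch is an illustration only; it assumes a standard normal population, samples of size 25, and the sample median as the competing estimator, none of which come from the lecture. Both estimators are centered on the true value 0, but the sample mean has the smaller standard error.

```python
import random
import statistics

rng = random.Random(2009)   # fixed seed so the run is reproducible
mu, sigma = 0.0, 1.0        # assumed normal population
n, reps = 25, 4000

means, medians = [], []
for _ in range(reps):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

# The standard deviation of an estimator across samples is its standard error.
se_mean = statistics.stdev(means)
se_median = statistics.stdev(medians)

print(f"sample mean:   center {statistics.fmean(means):+.3f}, standard error {se_mean:.3f}")
print(f"sample median: center {statistics.fmean(medians):+.3f}, standard error {se_median:.3f}")
```

For this assumed population the sample mean is the more efficient of the two, even though any single sample mean can still miss the true value.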



Confidence Interval

An inferential statement about a parameter should always indicate the accuracy of the estimate
◦ How close is the estimate likely to fall to the true parameter value? Within 1 unit? 2 units? 10 units?
◦ This can be determined using the sampling distribution of the estimator/sample statistic
◦ In particular, we need the standard error to make a statement about the accuracy of the estimator



Confidence Interval

Range of numbers that is likely to cover (or capture) the true parameter
Probability that the confidence interval captures the true parameter is called the confidence coefficient or, more commonly, the confidence level
◦ Confidence level is a chosen number close to 1, usually 0.90, 0.95 or 0.99
◦ Level of significance = α = 1 − confidence level



Confidence Interval

To calculate the confidence interval, we use the Central Limit Theorem (requires np ≥ 5 and nq ≥ 5, where q = 1 − p)
Also, we need a $Z_{\alpha/2}$ that is determined by the confidence level
Formula for 100(1-α)% confidence interval for p

$$\hat{p} \pm Z_{\alpha/2} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
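
The formula translates directly into a few lines of code. This is a minimal sketch, not a reference implementation: the function name `proportion_ci` and the poll numbers are hypothetical, and the sample-size check applies the np ≥ 5, nq ≥ 5 rule of thumb with the sample proportion standing in for the unknown p.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(p_hat, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Rule of thumb, checked with p_hat in place of the unknown p (an assumption).
    if n * p_hat < 5 or n * (1 - p_hat) < 5:
        raise ValueError("sample too small for the normal approximation")
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)      # Z_{alpha/2}
    margin = z * sqrt(p_hat * (1 - p_hat) / n)   # Z_{alpha/2} times the standard error
    return p_hat - margin, p_hat + margin

# Hypothetical poll: 700 of 1000 respondents favor candidate A.
low, high = proportion_ci(0.70, 1000)
print(f"95% confidence interval: ({low:.3f}, {high:.3f})")
```

For these made-up numbers the interval runs from roughly 67% to 73%.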

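Common Confidence Intervals

90% confidence interval
◦ Confidence level of 0.90
  α = 0.10, $Z_{\alpha/2}$ = 1.645
95% confidence interval
◦ Confidence level of 0.95
  α = 0.05, $Z_{\alpha/2}$ = 1.96
99% confidence interval
◦ Confidence level of 0.99
  α = 0.01, $Z_{\alpha/2}$ = 2.576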
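
These critical values do not have to be memorized or read from a table; they are quantiles of the standard normal distribution. As a small sketch using Python's standard library (the variable names are only illustrative), each value is the point with area 1 − α/2 to its left:

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

z_values = {}
for confidence in (0.90, 0.95, 0.99):
    alpha = 1 - confidence
    # Z_{alpha/2} leaves area alpha/2 in the upper tail.
    z_values[confidence] = standard_normal.inv_cdf(1 - alpha / 2)
    print(f"confidence {confidence:.2f}: alpha = {alpha:.2f}, Z = {z_values[confidence]:.3f}")
```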


Example

Compute a 95% confidence interval for p if a sample of 50 people yielded a sample proportion of 0.41
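
One way to work this example, assuming the large-sample interval for a proportion with $Z_{\alpha/2} = 1.96$ (the intermediate values are rounded, so the last digit may differ slightly from a calculator result):

```latex
\begin{align*}
\hat{p} &= 0.41, \qquad n = 50, \qquad Z_{\alpha/2} = 1.96 \\
n\hat{p} &= 20.5 \ge 5, \qquad n(1-\hat{p}) = 29.5 \ge 5 \\
\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} &= \sqrt{\frac{0.41 \times 0.59}{50}} = \sqrt{0.004838} \approx 0.0696 \\
Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} &\approx 1.96 \times 0.0696 \approx 0.136 \\
\hat{p} \pm 0.136 &\;\Rightarrow\; (0.274,\ 0.546)
\end{align*}
```

So we are 95% confident that the population proportion is between about 0.27 and 0.55.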


Interpreting Confidence Intervals

“Probability” means that in the long run 100(1-α)% of the intervals will contain the parameter
◦ If repeated samples were taken and a confidence interval calculated from each, then about 100(1-α)% of the intervals would contain the parameter
For one sample, we do not know whether the confidence interval contains the parameter
The 100(1-α)% probability only refers to the method that is being used
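
The long-run interpretation can be illustrated by simulation. The sketch below is hypothetical throughout: it assumes a true proportion of 0.4 (which would be unknown in practice), samples of size 200, and 2000 repeated samples. It builds a 95% interval from each sample and counts how many intervals contain the true proportion.

```python
import random
from math import sqrt
from statistics import NormalDist

rng = random.Random(291)   # fixed seed so the run is reproducible
true_p = 0.4               # assumed for illustration; unknown in real applications
n, reps = 200, 2000
confidence = 0.95
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

covered = 0
for _ in range(reps):
    # One sample: n independent yes/no responses with success probability true_p.
    successes = sum(rng.random() < true_p for _ in range(n))
    p_hat = successes / n
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    if p_hat - margin <= true_p <= p_hat + margin:
        covered += 1

coverage = covered / reps
print(f"{covered} of {reps} intervals contain the true proportion ({coverage:.1%})")
```

The fraction of intervals that capture the parameter should come out close to 95%, while each individual interval either contains the true proportion or does not.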


Incorrect statement
◦ With 95% probability, the population mean will fall in the interval from 3.5 to 5.2
To avoid the misleading word “probability” we say that we are “confident”
◦ We are 95% confident that the true population mean will fall between 3.5 and 5.2





Confidence Intervals

Changing our confidence level will change our confidence interval
◦ Increasing our confidence level will increase the length of the confidence interval
  A confidence level of 100% would require a confidence interval of infinite length
  Not informative
There is a tradeoff between length and accuracy
◦ Ideally we would like a short interval with high accuracy (high confidence level)
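
The tradeoff can be seen numerically. As a sketch, the code below reuses the sample proportion 0.41 and sample size 50 from the lecture example; the list of confidence levels is chosen only for illustration. It computes the interval length at each level.

```python
from math import sqrt
from statistics import NormalDist

p_hat, n = 0.41, 50   # values from the lecture example
standard_error = sqrt(p_hat * (1 - p_hat) / n)

widths = []
for confidence in (0.80, 0.90, 0.95, 0.99, 0.999):
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    width = 2 * z * standard_error   # upper limit minus lower limit
    widths.append(width)
    print(f"confidence {confidence:.3f}: interval length = {width:.3f}")
```

The length grows with every increase in the confidence level, and it grows without bound as the level approaches 100%.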

