Math 3680 Lecture #15 Confidence Intervals

Review: Suppose that $E(X) = \mu$ and $\mathrm{SD}(X) = \sigma$. Recall the following two facts about the average $\bar{X}$ of n observations drawn with replacement:

$$E(\bar{X}) = \mu, \qquad \mathrm{SD}(\bar{X}) = \frac{\sigma}{\sqrt{n}}$$

Estimation

Example: A university has 25,000 registered students. In a survey of 318 students, the average age of the sample is found to be 22.4, with a sample SD of 4.5 years. Estimate the average age of all 25,000 students, and attach a standard error to this estimate.

Wrong Answer: The average age of the student body is exactly 22.4 years. What is wrong with this simplistic analysis?

Answer: Of course, we estimate the average of the population to be 22.4 years – but this estimate will not be exact. To determine the magnitude of the error, we need to find the SE, and that means a box model.

Box model: 25,000 tickets (average = ??, SD = ??), 318 draws.

Bootstrap Estimation: Although the SD of the box is unknown, we can estimate it by the SD of the sample:

SD of box ≈ 4.5

SE of the sample average ≈

$$\frac{4.5}{\sqrt{318}} \sqrt{\frac{25000 - 318}{25000 - 1}} \approx 0.251$$

(Why?)
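The standard error for this example can be checked numerically. The sketch below assumes the usual recipe for draws made without replacement: the sample SD divided by the square root of the number of draws, times the correction factor sqrt((N - n)/(N - 1)). The function name and its parameters are illustrative, not part of the lecture.

```python
import math


def se_of_sample_average(sd, n, population_size):
    """SE of the average of n draws made without replacement."""
    # SE as if the draws were made with replacement.
    se_with_replacement = sd / math.sqrt(n)
    # Correction factor for drawing without replacement from a finite box.
    correction = math.sqrt((population_size - n) / (population_size - 1))
    return se_with_replacement * correction


se = se_of_sample_average(sd=4.5, n=318, population_size=25000)
print(round(se, 3))  # 0.251
```

With 318 draws out of 25,000 the correction factor is close to 1, so it changes the SE only in the third decimal place.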

Conclusion: The average age is about 22.4 years, give or take 0.251 years or so.

Confidence Intervals: Large samples or known $\sigma$

[Figure: standard normal curve; the area between z = -0.994458 and z = 0.994458 is 68%.]

We say that the range

$22.4 \pm 0.251$ years = 22.149 to 22.651 years

is a 68% confidence interval for the average age of the population.

[Figure: standard normal curve; the area between z = -1.95996 and z = 1.95996 is 95%.]

We say that the range

$22.4 \pm (1.96)(0.251)$ years = 21.909 to 22.891 years

is a 95% confidence interval for the average age of the population.

[Figure: standard normal curve; the area between z = -2.96774 and z = 2.96774 is 99.7%.]

We say that the range

$22.4 \pm (2.968)(0.251)$ years = 21.656 to 23.144 years

is a 99.7% confidence interval for the average age of the population.
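The three multipliers (about 0.994, 1.960 and 2.968) are quantiles of the standard normal curve, so they can be looked up in software rather than in a table. The sketch below uses `NormalDist` from Python's standard `statistics` module. The helper name `z_interval` is illustrative, and the SE is entered as 0.25074, which is the 0.251 above carried to two more digits (an assumption about how the slide's endpoints were computed).

```python
from statistics import NormalDist


def z_interval(center, se, confidence):
    """Return (z, low, high) for a two-sided normal-curve interval."""
    tail = (1 - confidence) / 2             # area left in each tail
    z = NormalDist().inv_cdf(1 - tail)      # cutoff with that tail area
    return z, center - z * se, center + z * se


se = 0.25074  # assumed: the SE of about 0.251, before rounding
for level in (0.68, 0.95, 0.997):
    z, low, high = z_interval(22.4, se, level)
    print(f"{level:.1%}: z = {z:.5f}, interval = {low:.3f} to {high:.3f}")
```

The 68% endpoints printed here differ slightly from the ones quoted above, which use a multiplier of exactly 1 instead of 0.994.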

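[Figure: standard normal curve; the central area between $-z_{1-\alpha}$ and $z_{1-\alpha}$ is $1 - 2\alpha$.]

In general, we say that the range

$$\bar{X} - z_{1-\alpha}\frac{\sigma}{\sqrt{n}} \quad\text{to}\quad \bar{X} + z_{1-\alpha}\frac{\sigma}{\sqrt{n}}$$

is a $1 - 2\alpha$ confidence interval for the population average $\mu$.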

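Logic:

$$
\begin{aligned}
P\left(\bar{X} - z_{1-\alpha}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{1-\alpha}\frac{\sigma}{\sqrt{n}}\right)
&= P\left(-z_{1-\alpha}\frac{\sigma}{\sqrt{n}} < \mu - \bar{X} < z_{1-\alpha}\frac{\sigma}{\sqrt{n}}\right) \\
&= P\left(-z_{1-\alpha}\frac{\sigma}{\sqrt{n}} < \bar{X} - \mu < z_{1-\alpha}\frac{\sigma}{\sqrt{n}}\right) \\
&= P\left(-z_{1-\alpha} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{1-\alpha}\right) \\
&\approx P\left(-z_{1-\alpha} < Z < z_{1-\alpha}\right) \\
&= 1 - 2\alpha
\end{aligned}
$$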

Observations:

1) We are NOT saying that 95% of the students are between 21.9 and 22.9 years old – this is patently ridiculous, of course.

2) We are NOT saying that there is a 95% chance that the average age is between 21.9 and 22.9 years. The population average is constant – it is either in this range or it is not.

Observations:

3) The true interpretation is as follows: If several people run this experiment and they all find a 95%-confidence interval, then the true population parameter will lie in about 95% of these intervals.
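This repeated-experiment interpretation can be illustrated with a small simulation. The sketch below invents a population (normally distributed ages with mean 22.4 and SD 4.5, purely an assumption for illustration), draws many samples of 318, builds a 95% interval from each, and counts how often the interval covers the true mean. The function name and its parameters are hypothetical.

```python
import math
import random
from statistics import NormalDist


def coverage(true_mean, true_sd, n, confidence, trials, seed=0):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    for _ in range(trials):
        # One "experiment": n draws from the assumed population.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        center = sum(sample) / n
        # Sample SD (divisor n - 1) stands in for the unknown population SD.
        s = math.sqrt(sum((x - center) ** 2 for x in sample) / (n - 1))
        margin = z * s / math.sqrt(n)
        if center - margin <= true_mean <= center + margin:
            hits += 1
    return hits / trials


rate = coverage(true_mean=22.4, true_sd=4.5, n=318, confidence=0.95, trials=1000)
print(rate)  # expected to land near 0.95
```

Each individual interval either covers 22.4 or it does not; the 95% describes the long-run fraction of intervals that do.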

[Figure: 100 different 95% confidence intervals, plotted on an axis running from 21.5 to 23.5.]

[Figure: 100 different 68% confidence intervals, plotted on an axis running from 21.75 to 23.25.]

[Figure: 100 different 95% confidence intervals with n = 4 × 318 = 1272, plotted on an axis running from 21.5 to 23.5.]

Observations:

4) In the previous problem, we replaced the population SD $\sigma$ with the sample SD s. (When did we do this?) As it turns out, this makes little practical difference for large samples.

More on this later when we consider small samples.

Observations:

5) The normal approximation has been used. As discussed earlier, a large number of draws is required for this assumption to hold.

6) Remember: There is no such thing as a 100% confidence interval. In practice, scientists often use 95% as a balance between a high confidence level and a narrow confidence interval.

Example: In a simple random sample of 680 households (in a city of millions), the average number of TV sets is 1.86, with an SD of 0.80. Find a 95% confidence interval for the average number of TV sets per household in the city.
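One way to set up this computation is sketched below. It assumes the large-sample recipe from the earlier slides (sample SD in place of the population SD, normal multiplier) and skips the correction factor because 680 draws are negligible next to a city of millions. The variable names are illustrative.

```python
import math
from statistics import NormalDist

n, sample_avg, sample_sd = 680, 1.86, 0.80

z = NormalDist().inv_cdf(0.975)    # two-sided 95% multiplier, about 1.96
se = sample_sd / math.sqrt(n)      # SE of the sample average
margin = z * se

print(f"{sample_avg} ± {margin:.2f}")
print(f"{sample_avg - margin:.2f} to {sample_avg + margin:.2f}")
```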

True or false:

(i) $1.86 \pm 0.06$ is a 95%-confidence interval for this population average.

(ii) $1.86 \pm 0.06$ is a 95%-confidence interval for this sample average.

(iii) There is a 95% chance for the population average to be in the range $1.86 \pm 0.06$.

Example: The table below shows platelet counts among 120 geriatric patients. Find a 95% confidence interval for the average platelet count among geriatric patients.

132 117 176 126 142 120
127 125 198 208 105 146
214 194 131 208 101 139
184 163 129 138 110 247
181 181 125 123 117 176
211 108 254 244 139 179
190 212 228 139 147 170
139 129 174 108 106 141
112 126 125 142 115 147
105 256 142 175 131 119
174 106 194 181 196 232
143 142 104 184 112 141
135 110 107 137 111 112
185 114 188 106 102 104
120 143 179 178 124 242
235 129 198 150 180 187
142 125 184 238 111 129
129 203 115 101 178 133
134 168 229 169 148 185
154 162 103 105 125 151
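With 120 values, the sample average and SD are tedious by hand, so a short script helps. The sketch below assumes the counts form 20 rows of 6 three-digit values (the row breaks are inferred from the flattened listing) and applies the large-sample recipe with the sample SD standing in for the population SD.

```python
import math
from statistics import NormalDist, mean, stdev

# Platelet counts, assuming 20 rows of 6 values each.
counts = [int(x) for x in """
132 117 176 126 142 120
127 125 198 208 105 146
214 194 131 208 101 139
184 163 129 138 110 247
181 181 125 123 117 176
211 108 254 244 139 179
190 212 228 139 147 170
139 129 174 108 106 141
112 126 125 142 115 147
105 256 142 175 131 119
174 106 194 181 196 232
143 142 104 184 112 141
135 110 107 137 111 112
185 114 188 106 102 104
120 143 179 178 124 242
235 129 198 150 180 187
142 125 184 238 111 129
129 203 115 101 178 133
134 168 229 169 148 185
154 162 103 105 125 151
""".split()]

n = len(counts)
center = mean(counts)
se = stdev(counts) / math.sqrt(n)   # sample SD (divisor n - 1) over sqrt(n)
z = NormalDist().inv_cdf(0.975)     # two-sided 95% multiplier
low, high = center - z * se, center + z * se

print(f"n = {n}, average = {center:.1f}")
print(f"95% interval: {low:.1f} to {high:.1f}")
```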

Fill in the blanks with either box or draws.

Probabilities are used when reasoning from the __________ to the _____________.

Confidence levels are used when reasoning from the ____________ to the ______________.

Fill in the blank with either observed or expected.

The chance error is in the _______________ value.

Fill in the blank with either sample or population.

 The confidence level is for the ______________ average.

Confidence Intervals: Projecting Sample Size

Example: In a preliminary simple random sample of 680 households (in a city of millions), the average number of TV sets in the sample households is 1.86, with an SD of 0.80.

Suppose that it’s desired to construct a 90% confidence interval which has a margin of error of 0.03. How large a sample would be necessary?

Solution:

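[Figure: standard normal curve with cutoffs at z = -1.64485 and z = 1.64485.]

$$
\begin{aligned}
z_{0.95} \cdot \frac{0.8}{\sqrt{n}} &= 0.03 \\
1.645 \cdot \frac{0.8}{\sqrt{n}} &= 0.03 \\
\sqrt{n} &\approx 43.867 \\
n &\approx 1924.3
\end{aligned}
$$

So, the sample size should be at least 1925.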
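The same calculation can be sketched in code: solve z · SD / sqrt(n) = margin for n and round up. The function name is illustrative. The multiplier is passed in explicitly because the result is sensitive to rounding: the slide's 1.645 gives 1925, while an unrounded quantile can give a value that differs by one.

```python
import math
from statistics import NormalDist


def required_sample_size(z, sd, margin):
    """Smallest n with z * sd / sqrt(n) <= margin."""
    return math.ceil((z * sd / margin) ** 2)


# Multiplier as rounded on the slide.
n_slide = required_sample_size(z=1.645, sd=0.80, margin=0.03)

# Same calculation with the unrounded 95th percentile of the normal curve.
z_exact = NormalDist().inv_cdf(0.95)
n_exact = required_sample_size(z=z_exact, sd=0.80, margin=0.03)

print(n_slide, n_exact)
```

Because the SD of 0.80 is itself an estimate from the preliminary sample, a difference of one household is well inside the uncertainty of the projection.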

Confidence Intervals: Small samples

Example: A biological research team measures the weights of 14 chipmunks, randomly chosen. Find a 90% confidence interval for the average weight of chipmunks.

7.6 8.66 9.41 8.45 8.08 8.86 7.48
8.2 9.24 9.34 9.58 10.1 8.55 9.15

Note: The previous calculations used the fact that

$$\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

approximately follows the normal curve for large values of n. In this problem, we cannot use this approximation.

However, for both small and large samples, we can use the fact that

$$\frac{\bar{X} - \mu}{S/\sqrt{n}}$$

approximately follows the Student’s t-distribution with n - 1 degrees of freedom.

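[Figure: Student's t curve with n - 1 degrees of freedom; the central area between $t_{n-1,\alpha}$ and $t_{n-1,1-\alpha}$ is $1 - 2\alpha$.]

In general, we say that the range

$$\bar{X} - t_{n-1,1-\alpha}\frac{s}{\sqrt{n}} \quad\text{to}\quad \bar{X} + t_{n-1,1-\alpha}\frac{s}{\sqrt{n}}$$

is a $1 - 2\alpha$ confidence interval for the population average $\mu$.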

[Figure: Student's t curve with 13 degrees of freedom; the area between t = -1.77093 and t = 1.77093 is 90%.]

Excel: TINV(0.1, 13)

Therefore, the 90% confidence interval is

$$8.76 \pm 1.77093 \cdot \frac{0.76}{\sqrt{14}},$$

or 8.40 to 9.12 ounces.
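The chipmunk interval can be checked from the raw weights. The sketch below computes the sample average and sample SD with Python's standard `statistics` module and takes the t multiplier 1.77093 directly from the slide (the Excel `TINV(0.1, 13)` value), since the standard library has no t-distribution quantile function.

```python
import math
from statistics import mean, stdev

weights = [7.6, 8.66, 9.41, 8.45, 8.08, 8.86, 7.48,
           8.2, 9.24, 9.34, 9.58, 10.1, 8.55, 9.15]

t_value = 1.77093            # from the slide: 90% two-sided, 13 degrees of freedom
n = len(weights)
center = mean(weights)       # sample average, about 8.76
s = stdev(weights)           # sample SD with divisor n - 1, about 0.76
margin = t_value * s / math.sqrt(n)
low, high = center - margin, center + margin

print(f"{low:.2f} to {high:.2f} ounces")
```

Using the normal multiplier 1.645 instead of 1.77093 would give a narrower interval, which understates the uncertainty for a sample this small.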

Note: Be sure you look up the correct number on the table in the back of the book. The numbers at the bottom of Table 4 specify the two-sided confidence levels.

Example: Duracell tests 12 batteries in flashlights. They determine that the average life of the batteries in this sample is 3.58 hours, with a sample SD of 1.58 hours. Find a 95% confidence interval for the average life of a Duracell battery in a flashlight.

Repeat if 100 batteries were tested (with the same sample mean and SD as above).
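A sketch of how the two cases compare is given below. The t multipliers are assumptions read from a standard t table for two-sided 95% confidence (2.201 for 11 degrees of freedom, 1.984 for 99 degrees of freedom); they should be checked against the table used in the course. The helper name `t_interval` is illustrative.

```python
import math


def t_interval(center, s, n, t_value):
    """Interval center ± t_value * s / sqrt(n)."""
    margin = t_value * s / math.sqrt(n)
    return center - margin, center + margin


# Assumed table values for two-sided 95% confidence:
# 11 degrees of freedom -> 2.201, 99 degrees of freedom -> 1.984.
low12, high12 = t_interval(3.58, 1.58, 12, 2.201)
low100, high100 = t_interval(3.58, 1.58, 100, 1.984)

print(f"n = 12:  {low12:.2f} to {high12:.2f} hours")
print(f"n = 100: {low100:.2f} to {high100:.2f} hours")
```

The larger sample narrows the interval in two ways: the SE shrinks with sqrt(n), and the t multiplier moves closer to the normal value of 1.96.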

Note: In previous lectures, we considered another technique of inferring information about the box from the draws – namely, hypothesis testing.

Confidence intervals provide a method of estimating the average of the box. Hypothesis testing checks if the difference between the supposed box average and the sample average is either real or due to chance.