randall-richardson
Section 8.1 - Estimating a Proportion with Confidence
Objectives:
1. To find a confidence interval graphically
2. To understand a confidence interval as consisting of those population proportions for which the result from the sample is reasonably likely
3. To always check the three conditions before constructing a confidence interval
4. To construct a confidence interval using the formula
5. To interpret a confidence interval and the meaning of “confidence”
6. To compute the required sample size for a given margin of error
General idea:
Consider the population of the U.S. Suppose you are interested in the proportion of redheads in the population.
Since the proportion of redheads is probably unknown, you will have to estimate it. What should you do?
– Take a sample. (The size will depend on how much time and money you have.)
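– Compute the sample proportion. (The sample proportion is an unbiased estimator of the population proportion, and the Central Limit Theorem tells you that its sampling distribution is approximately normal for large samples, among other “desirable” properties.) This is your best guess.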
– Are you “sure”? What do you mean by sure? How “sure” do you need to be?
Introduction
A Pew Research Center survey found that 55% of singles ages 18-29 say they aren’t in a committed relationship and are not actively looking. This percentage is based on interviews with 1068 singles. The survey reported a margin of error of 3%.
The researchers also say that they are 95% confident that the error in the percentage (55%) is less than 3% either way. That is, they are 95% confident that if they were to ask all young singles in the U.S., between 52% and 58% would report that they aren’t in a committed relationship and are not actively looking.
What do they mean by this?
Reasonably Likely Events
About 95% of all sample proportions $\hat{p}$ will fall within about two standard errors of the population proportion p, that is, within the interval
$$p \pm 1.96\sqrt{\frac{p(1-p)}{n}}$$
The sample proportions in this interval are called reasonably likely.
This rule works well only under the condition that both $np \ge 10$ and $n(1-p) \ge 10$.
Reasonably Likely Events and Rare Events
Reasonably likely events are those in the middle 95% of the distribution of all possible outcomes. The outcomes in the upper 2.5% and lower 2.5% of the distribution are rare events - they happen, but rarely.
[Figure: standard normal density curve, x from -3 to 3. The middle 95%, between -1.96 and 1.96, is reasonably likely; the upper 2.5% and the lower 2.5% are rare.]
Example: Reasonably Likely Results from Coin Flips
Suppose you flip a fair coin 100 times.
What are the reasonably likely values of the sample proportion $\hat{p}$?
What numbers of heads are reasonably likely?
Suppose you flip a fair coin 100 times. What are the reasonably likely values of the sample proportion $\hat{p}$?
Check conditions: $np = (100)(0.50) = 50 \ge 10$ and $n(1-p) = (100)(0.50) = 50 \ge 10$.
95% of all sample proportions $\hat{p}$ should fall in the interval
$$p \pm 1.96\sqrt{\frac{p(1-p)}{n}} = 0.50 \pm 1.96\sqrt{\frac{(0.5)(0.5)}{100}} = 0.50 \pm 1.96(0.05) \approx 0.50 \pm 0.10,$$
that is, $[0.4, 0.6]$.
Suppose you flip a fair coin 100 times. What numbers of heads are reasonably likely?
In about 95% of the samples, the number of successes x will be in the interval
$$np \pm 1.96\sqrt{np(1-p)} = 50 \pm 1.96\sqrt{(100)(0.5)(0.5)} = 50 \pm 1.96(5) \approx 50 \pm 10,$$
that is, $[40, 60]$.
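The coin-flip calculations can be checked numerically. The following is a minimal Python sketch, not part of the original slides; the function names `reasonably_likely_proportions` and `reasonably_likely_counts` are illustrative choices.

```python
import math

def reasonably_likely_proportions(p, n, z=1.96):
    """Middle-95% range of sample proportions for a population proportion p."""
    # The normal approximation is only trusted when both expected counts are at least 10.
    if n * p < 10 or n * (1 - p) < 10:
        raise ValueError("need np >= 10 and n(1 - p) >= 10")
    se = math.sqrt(p * (1 - p) / n)  # standard error of the sample proportion
    return p - z * se, p + z * se

def reasonably_likely_counts(p, n, z=1.96):
    """Middle-95% range for the number of successes in n trials."""
    low, high = reasonably_likely_proportions(p, n, z)
    return n * low, n * high

print(reasonably_likely_proportions(0.5, 100))  # close to (0.4, 0.6)
print(reasonably_likely_counts(0.5, 100))       # close to (40, 60)
```

Without rounding 1.96(0.05) up to 0.10, the endpoints come out slightly inside 0.4 and 0.6 (and inside 40 and 60).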
Introduction, continued.
The Pew Research Center doesn’t know the value of p
(the percentage of young singles not in a relationship).
For each possible value of p, Pew can compute how close
to p most sample proportions will be.
By knowing the variability expected in random samples, Pew can estimate how close $\hat{p}$ should be to p.
The Meaning of a Confidence Interval
Suppose you take repeated random samples of size 40 from a population with 60% successes. What proportion of successes would be reasonably likely in your sample?
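Check conditions: $np = (40)(0.60) = 24 \ge 10$ and $n(1-p) = (40)(0.40) = 16 \ge 10$.
$$\mu_{\hat{p}} = p = 0.60, \qquad \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{(0.60)(0.40)}{40}} \approx 0.077$$
Reasonably likely = middle 95%:
$$\mu_{\hat{p}} \pm 1.96\,\sigma_{\hat{p}} = 0.60 \pm 1.96(0.077) = 0.60 \pm 0.151,$$
that is, $[0.449, 0.751]$.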
Reasonably likely sample proportions for n = 40

p     1 - p   σ       ME = 1.96σ   Interval = p ± ME
0.9   0.1     0.047   0.092        [0.808, 0.992]
0.8   0.2     0.063   0.123        [0.677, 0.923]
0.7   0.3     0.072   0.141        [0.559, 0.841]
0.6   0.4     0.077   0.151        [0.449, 0.751]
0.5   0.5     0.079   0.155        [0.345, 0.655]
0.4   0.6     0.077   0.151        [0.249, 0.551]
0.3   0.7     0.072   0.141        [0.159, 0.441]
0.2   0.8     0.063   0.123        [0.077, 0.323]
0.1   0.9     0.047   0.092        [0.008, 0.192]
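A table like this can be generated in a few lines. The sketch below is an illustration only (the helper name `reasonably_likely_row` is made up here). It keeps full precision throughout, so some entries may differ from the slide in the last digit if the slide rounded σ before multiplying by 1.96.

```python
import math

def reasonably_likely_row(p, n=40, z=1.96):
    """Return (p, 1 - p, sigma, ME, low, high) for one population proportion."""
    sigma = math.sqrt(p * (1 - p) / n)  # standard error of the sample proportion
    me = z * sigma                      # half-width of the middle-95% range
    return p, 1 - p, sigma, me, p - me, p + me

# Population proportions 0.9, 0.8, ..., 0.1, as in the table.
for tenth in range(9, 0, -1):
    p, q, sigma, me, low, high = reasonably_likely_row(tenth / 10)
    print(f"{p:.1f}  {q:.1f}  {sigma:.3f}  {me:.3f}  [{low:.3f}, {high:.3f}]")
```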
[Figure: reasonably likely sample proportions for samples of size n = 40. Horizontal axis: proportion of successes in the sample. Vertical axis: proportion of successes in the population.]
Suppose that in an experiment, 75%, or 30 out of the 40 trials, resulted in success.
Is it plausible that the true proportion is 50%?
Is it plausible that the true proportion is 80%?
What values are plausible for the population proportion?
Plausible population proportions are p = 0.6, p = 0.7, and p = 0.8.
[Figure: the chart of reasonably likely sample proportions for n = 40 (sample proportion on the horizontal axis, population proportion on the vertical axis).]
The sample proportion 0.75 (represented by the red vertical line) intersects the reasonably likely range of values for p = 0.80 (from 0.677 to 0.923, represented by the orange line segment).
If the population proportion is 0.80, you are reasonably likely to get 30 successes in 40 trials, or 75%.
The sample proportion 0.75 (represented by the red vertical line) does not intersect the reasonably likely range of values for p = 0.50 (from 0.345 to 0.655, represented by the orange line segment).
If the population proportion is 0.50, you are not likely to get 30 successes in 40 trials, or 75%.
Plausible population percentages are from about p = 0.6 to about p = 0.85.
These plausible percentages for the population proportion are called the 95% confidence interval for p.
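The graphical method can be imitated with a simple grid search: try many candidate population proportions and keep those for which the observed sample proportion is reasonably likely. This is a sketch under assumptions; the function name `plausible_proportions` and the grid step of 0.01 are choices made here, not part of the slides.

```python
import math

def plausible_proportions(p_hat, n, z=1.96, step=0.01):
    """Population proportions p for which the sample proportion p_hat is reasonably likely."""
    plausible = []
    steps = round(1 / step)
    for i in range(1, steps):
        p = i / steps
        me = z * math.sqrt(p * (1 - p) / n)  # half-width of the reasonably likely range for this p
        if abs(p_hat - p) <= me:
            plausible.append(p)
    return plausible

# 30 successes in 40 trials, as in the example.
values = plausible_proportions(30 / 40, 40)
print(min(values), max(values))
```

On this grid the smallest and largest plausible values agree with the range read off the chart, about 0.6 to about 0.85.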
A 95% confidence interval consists of those population proportions p for which the sample proportion $\hat{p}$ is reasonably likely.
Note that the population proportion p is the unknown
parameter.
A Confidence Interval for a Population Proportion
Recall our formula for the "reasonably likely" interval, which represents the middle 95% of the sampling distribution:
$$\mu_{\hat{p}} \pm 1.96\,\sigma_{\hat{p}} = p \pm 1.96\sqrt{\frac{p(1-p)}{n}}$$
Where did the "1.96" come from?
It is the z-score corresponding to a cumulative probability of 0.9750.
The population proportion p is an unknown parameter.
In fact, estimating p is the whole point of what we are doing.
The idea is to estimate p by a range (interval) of values instead of by a single value (point) $\hat{p}$.
Let's see if we can modify the formula
$$p \pm 1.96\sqrt{\frac{p(1-p)}{n}}$$
Since we don't know p, let's use the next best thing, $\hat{p}$:
$$\hat{p} \pm 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
Instead of using 1.96, which is the z-score that corresponds to the middle 95%, let's just put in a variable, $z^*$, which will depend on how confident we want to be.
$$\hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
The formula for a confidence interval for the proportion of successes p in the population is based on three components: the sample proportion $\hat{p}$, the standard error $\sigma_{\hat{p}}$, and the critical value $z^*$, which is set by the confidence level.
$$\hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
Here n is the sample size and $\hat{p}$ is the proportion of successes in the sample.
The value of $z^*$ depends on how confident you want to be that the confidence interval will contain p.
90% CI ⇒ z* = 1.645
95% CI ⇒ z* = 1.96
99% CI ⇒ z* = 2.576
Where do these values come from?
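One way to see where the critical values come from is to ask the standard normal distribution for the z-score that leaves half of the excluded probability in each tail. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `critical_value` is an illustrative choice.

```python
from statistics import NormalDist

def critical_value(confidence):
    """z* such that the middle `confidence` share of a standard normal lies between -z* and z*."""
    tail = (1 - confidence) / 2              # probability left in each tail
    return NormalDist().inv_cdf(1 - tail)    # e.g. cumulative probability 0.975 for 95%

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} CI: z* = {critical_value(level):.3f}")
```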
Once again, what is it that we are trying to do?
We wish to find out the value of an unknown population parameter - the proportion of successes.
The best single estimate of the population proportion is the sample proportion from a random sample; the Central Limit Theorem describes how such estimates vary from sample to sample. (Bigger samples are better, etc.)
In some applications, it is useful to consider a range or interval of values, instead of just one. Depending on how “confident” we want or need to be, we can construct a confidence interval - a range of likely values for the population proportion.
A confidence interval for the proportion of successes p in the population is given by the formula
$$\hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
This confidence interval is reasonably accurate for
(1) Simple random samples from binomial populations.
(2) $n\hat{p} \ge 10$ and $n(1-\hat{p}) \ge 10$.
(3) Populations that are at least 10 times the size of the sample: $N \ge 10 \cdot n$.
For the confidence interval
$$\hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},$$
the expression
$$ME = z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
is called the margin of error.
The margin of error is one-half the width of the confidence interval.
The point estimate $\hat{p}$ is located in the center of the confidence interval.
Example: Safety Violations
Suppose you have a random sample of 40 buses from a large city and find that 24 buses have a safety violation. Find the 90% confidence interval for the proportion of all buses that have a safety violation.
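90% confidence interval:
$$\hat{p} = \frac{24}{40} = 0.60, \qquad z^*_{90\%} = 1.645$$
$$\hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.60 \pm 1.645\sqrt{\frac{(0.60)(0.40)}{40}} = 0.60 \pm 0.127,$$
that is, $[0.473, 0.727]$.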
Using the TI-83/84:
STAT TESTS 1-PropZInt ENTER
1-PropZInt
x: 24
n: 40
C-Level: .90
Calculate [ENTER]
1-PropZInt
(.47259, .72741)
p̂ = .6
n = 40
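For readers without a TI-83/84, the same interval can be computed with a short Python function. This is a minimal sketch, not the calculator's actual routine; the name `proportion_confidence_interval` is hypothetical, and the check covers only the success/failure counts (the sampling and population-size conditions cannot be verified from x and n alone).

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(x, n, confidence=0.95):
    """Confidence interval for a proportion from x successes in n trials."""
    # Condition: at least 10 successes and 10 failures in the sample.
    if x < 10 or n - x < 10:
        raise ValueError("need at least 10 successes and 10 failures")
    p_hat = x / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    me = z_star * math.sqrt(p_hat * (1 - p_hat) / n)  # margin of error
    return p_hat - me, p_hat + me

low, high = proportion_confidence_interval(24, 40, confidence=0.90)
print(f"({low:.5f}, {high:.5f})")
```

For the bus data this agrees with the calculator output to the five decimal places shown.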
The Capture Rate
Sometimes a confidence interval “captures” the true population proportion and sometimes it doesn’t.
The capture rate of a method of constructing confidence intervals is the proportion of confidence intervals that contain the population parameter (proportion) in repeated usage of the method.
If a polling company uses 95% confidence intervals in a large number of different surveys, the population proportion p should be in about 95% of them.
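The capture rate can be estimated by simulation: draw many random samples from a population whose proportion is known, build an interval from each, and count how often the interval contains the true value. The sketch below is an illustration under assumed settings; the population proportion 0.6, the sample size 400, the number of trials, and the function name `capture_rate` are all choices made here.

```python
import math
import random

def capture_rate(p, n, trials=2000, z_star=1.96, seed=0):
    """Share of simulated confidence intervals that contain the true proportion p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    captured = 0
    for _ in range(trials):
        x = sum(rng.random() < p for _ in range(n))  # successes in one sample
        p_hat = x / n
        me = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - me <= p <= p_hat + me:
            captured += 1
    return captured / trials

print(capture_rate(p=0.6, n=400))
```

With settings like these the printed rate should land near 0.95, though it varies with the seed and the number of trials.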
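Correct statement:
"I am 95% confident that the interval $\hat{p} \pm z^*\sqrt{\hat{p}(1-\hat{p})/n}$ contains the true value of the population proportion p."
Incorrect statement:
"I am 95% confident that the true value of the population proportion p will fall in the interval $\hat{p} \pm z^*\sqrt{\hat{p}(1-\hat{p})/n}$."
The difference is that p is a fixed number; it is the interval that changes from sample to sample.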
Margin of Error and Sample Size
95% confidence intervals for large sample sizes are narrower than those for small sample sizes:
$$\hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
n increases ⇒ SE decreases ⇒ ME decreases ⇒ CI width decreases
Example: The Effect of Sample Size on the Margin of Error
Suppose you take a random sample and get $\hat{p} = 0.7$.
(a) If n = 100, find the 95% confidence interval for p and state the margin of error.
$$\hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.70 \pm 1.96\sqrt{\frac{(0.7)(0.3)}{100}} = 0.70 \pm 0.0898,$$
that is, $[0.6102, 0.7898]$.
The margin of error is 0.0898.
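Suppose you take a random sample and get $\hat{p} = 0.7$.
(b) What happens to the confidence interval and margin of error if you quadruple the sample size, to n = 400?
$$\hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.70 \pm 1.96\sqrt{\frac{(0.7)(0.3)}{400}} = 0.70 \pm 0.0449,$$
that is, $[0.6551, 0.7449]$.
The margin of error is 0.0449. Quadrupling the sample size cuts the margin of error in half.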
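The effect of sample size on the margin of error is easy to tabulate. The short sketch below is illustrative only (the function name `margin_of_error` is a choice made here); it compares n = 100 with n = 400 for a sample proportion of 0.7.

```python
import math

def margin_of_error(p_hat, n, z_star=1.96):
    """Margin of error z* * sqrt(p_hat * (1 - p_hat) / n)."""
    return z_star * math.sqrt(p_hat * (1 - p_hat) / n)

me_100 = margin_of_error(0.7, 100)
me_400 = margin_of_error(0.7, 400)
print(round(me_100, 4), round(me_400, 4))
print(round(me_100 / me_400, 2))  # ratio of the two margins of error
```

Because n sits under a square root, multiplying the sample size by 4 divides the margin of error by 2.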
What Sample Size Should You Use?
To find a formula for the sample size, take the formula for the margin of error and solve for the sample size n:
$$ME = z^*\sqrt{\frac{p(1-p)}{n}}$$
$$ME^2 = (z^*)^2\left(\frac{p(1-p)}{n}\right)$$
$$n = (z^*)^2\left(\frac{p(1-p)}{ME^2}\right)$$
To use the formula for the sample size, you need to know
(1) what margin of error is acceptable
(2) the confidence level (use 95% unless otherwise specified)
(3) the value of p (use 0.5 if no other information is available)
$$n = (z^*)^2\left(\frac{p(1-p)}{ME^2}\right)$$
Example: What sample size should you use for a survey if you want the margin of error to be at most 3% with 95% confidence but you have no estimate of p?
$$n = (z^*)^2\left(\frac{p(1-p)}{ME^2}\right) = 1.96^2\left(\frac{0.5 \cdot 0.5}{0.03^2}\right) = 1067.111,$$
which rounds up to n = 1068.
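The sample-size formula translates directly into code. This is a small sketch with a hypothetical function name, `required_sample_size`; it defaults to z* = 1.96 and p = 0.5, the choices the slides recommend when nothing else is specified, and always rounds up so that the margin of error is not exceeded.

```python
import math

def required_sample_size(margin_of_error, z_star=1.96, p=0.5):
    """Smallest whole n with z* * sqrt(p(1 - p)/n) at or below the given margin of error."""
    n = z_star ** 2 * p * (1 - p) / margin_of_error ** 2
    return math.ceil(n)  # round up: a fractional respondent is not possible

print(required_sample_size(0.03))  # 3% margin of error, 95% confidence, no estimate of p
```

For a 3% margin of error this gives 1068, the number of singles interviewed in the Pew survey from the introduction.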