Mar. 22 Statistic for the day: Percent of Americans 18 or older who believe Martha Stewart’s sentence should include jail time: 53%
Assignment: Read Chapter 18. Exercises p. 329: 1, 2, 5, 6, 8, 10
Source: gallup.com
These slides were created by Tom Hettmansperger and in some cases modified by David Hunter
Do you believe Martha Stewart got a fair trial?
Do you believe Martha Stewart’s sentence should include jail time?
Gallup Poll
Sample percentages: categorical variable
             Yes    No     No opinion
Fair trial   66%    27%    7%
Jail time    53%    40%    7%
The Gallup Poll was based on 1005 telephone interviews.
Based on the sample of 1005 we estimate that 53% of the population of millions believes that Martha Stewart’s sentence should include jail time.
If we take a new sample of 1005 we will get a new sample percentage. It will generally not be exactly 53%.
If we take lots of samples of 1005 we will get lots of sample percentages.
Next we look at the histogram for the percentages.
200 percentages based on 200 samples of 1005 each.
Mean = 53% (or .53)
Standard deviation = (57% − 50%) / 4 = 1.75% (or .0175)
[Figure: Histogram of PERCENT, with Normal Curve. Horizontal axis: PERCENT (49 to 59); vertical axis: Frequency (0 to 20).]
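The repeated-sampling idea behind this histogram can be tried on a computer. The sketch below is only an illustration, not how the slide's histogram was produced: it assumes the true population proportion is exactly .53 (in reality we only know the sample value), and the names `TRUE_P` and `sample_proportion` are made up for this example.

```python
import random
import statistics

random.seed(0)  # fixed seed so repeated runs give the same numbers

TRUE_P = 0.53        # ASSUMPTION: population proportion set equal to the poll result
SAMPLE_SIZE = 1005   # interviews per poll, as in the Gallup example
NUM_SAMPLES = 200    # number of repeated polls, as in the histogram


def sample_proportion(p, n):
    """Simulate one poll of n people and return the fraction answering yes."""
    yes_count = sum(1 for _ in range(n) if random.random() < p)
    return yes_count / n


proportions = [sample_proportion(TRUE_P, SAMPLE_SIZE) for _ in range(NUM_SAMPLES)]
mean_prop = statistics.mean(proportions)
sd_prop = statistics.stdev(proportions)
print(f"mean = {mean_prop:.3f}, standard deviation = {sd_prop:.4f}")
```

The mean of the simulated percentages should land close to .53, and their standard deviation should be in the same neighborhood as the value read off the histogram.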
How do we measure and assess the uncertainty in the sample percentage?
$$\text{margin of error} = \frac{1}{\sqrt{\text{sample size}}}$$
So in our example, where the sample size is 1005, the MARGIN OF ERROR is:
$$\frac{1}{\sqrt{\text{sample size}}} = \frac{1}{\sqrt{1005}} = \frac{1}{31.7} = .032$$
or 3.2%. And we report 53% ± 3.2%.
We defined the margin of error to be 2 standard deviations. We estimated the standard deviation from the histogram to be .0175. This nearly agrees, since 2 × .0175 = .035. Pretty close!
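As a quick check of the arithmetic, here is a small sketch of the simple margin-of-error rule (1 divided by the square root of the sample size). The function name `simple_margin_of_error` is made up for this example, and the extra sample sizes in the loop are there only to show how the margin shrinks as the sample grows.

```python
import math


def simple_margin_of_error(sample_size):
    """Simple rule from the slides: 1 divided by the square root of the sample size."""
    return 1 / math.sqrt(sample_size)


poll_moe = simple_margin_of_error(1005)
print(f"n = 1005: margin of error = {poll_moe:.3f}")
print(f"report 53% plus or minus {100 * poll_moe:.1f}%")

# Illustrative sample sizes (not from the poll) to show the effect of a larger n.
for n in (100, 400, 1600):
    print(f"n = {n}: margin of error = {simple_margin_of_error(n):.3f}")
```

Notice that the population size never enters the calculation; only the sample size does.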
Summary: Gallup Poll
• We have a simple random sample from the population of telephone owners.
• The sample size used was 1005.
• We find the percentage from our sample.
• The MARGIN OF ERROR is 1 divided by the square root of the sample size.
• For 1005 the MARGIN OF ERROR is .032.
• Hence we report: PERCENTAGE ± .032
• The margin of error does not depend on the population size, only on the sample size!
Goals
• To refine the idea of standard deviation (for later use in a refined margin of error).
• We also want to relate this to the normal curve.
In the past we:
1. used a sample to get a sample proportion
2. used a formula to get the margin of error
3. reported the sample proportion ± the margin of error
Now we want a formula for the standard deviation. Then we will use the new standard deviation formula to calculate a new margin of error.
Formula for estimating the standard deviation of a sample proportion (don’t need histogram):
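$$\text{standard deviation} = \sqrt{\frac{\text{sample proportion} \times (1 - \text{sample proportion})}{\text{sample size}}}$$
In our example:
$$\sqrt{\frac{.53 \times (1 - .53)}{1005}} = .016$$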
If we happen to know the true population proportion we use it instead of the sample proportion.
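The formula is easy to evaluate directly. The sketch below plugs in the poll's numbers; the function name `sd_of_sample_proportion` is made up for this example.

```python
import math


def sd_of_sample_proportion(proportion, sample_size):
    """Standard deviation of a sample proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(proportion * (1 - proportion) / sample_size)


poll_sd = sd_of_sample_proportion(0.53, 1005)
print(f"standard deviation = {poll_sd:.3f}")
```

If the true population proportion were known, it would be passed in place of the sample value .53.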
[Figure: Histogram with Normal Curve of 1000 percentages, each based on a sample of 1005. Horizontal axis: sample proportion (0.466 to 0.594); vertical axis: Frequency (0 to 90). Annotations: std dev = .016; the central part of the curve spans 4 standard deviations.]
Summary:
1. We take a sample of 1005 phone interviews.
2. We estimate the percent of the American public that thinks that Martha Stewart should go to jail: 53%
3. To assess the uncertainty in the 53% sample figure, we think of a normal curve of percentages centered at .530 with standard deviation of .016.
4. So the normal curve has 95% of its distribution between .530 – 2 × .016 and .530 + 2 × .016, or between .498 and .562.
Estimate 53% (.53), with 50% to 56% (.50 to .56) the reasonable interval of values.
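The four summary steps can be strung together in a few lines. This is a sketch under the same assumptions as the slides (normal curve, 2 standard deviations covering about 95%); the function name `two_sd_interval` is made up for this example.

```python
import math


def two_sd_interval(proportion, sample_size):
    """Return (low, high): the sample proportion plus or minus 2 standard deviations."""
    sd = math.sqrt(proportion * (1 - proportion) / sample_size)
    return proportion - 2 * sd, proportion + 2 * sd


low, high = two_sd_interval(0.53, 1005)
print(f"reasonable interval: {low:.2f} to {high:.2f}")
```

Rounded to two decimals, the endpoints match the 50% to 56% interval reported above.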
What to expect from sample proportions
Facts: fingerprints may be influenced by prenatal hormones.
Most people have more ridges on right hand than left.
People who have more on the left hand are said to have leftward asymmetry.
Women are more likely to have this trait than men.
The proportion of all men who have this trait is about 15%.
In a study of 186 heterosexual and 66 homosexual men, 26 (14%) of the heterosexual men and 20 (30%) of the homosexual men showed the trait.
(Reference: Hall, J. A. Y. and Kimura, D. "Dermatoglyphic Asymmetry and Sexual Orientation in Men", Behavioral Neuroscience, Vol. 108, No. 6, 1203-1206, Dec. 1994.)
Is it unusual to take a sample of 66 men and observe a sample proportion of 30%?
We now know what the distribution of sample proportions based on a sample of 66 should look like. We will suppose that the true proportion in the population of men is 15%.
$$\text{Standard deviation} = \sqrt{\frac{.15 \times (1 - .15)}{66}} = .044$$
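[Figure: Histogram of proportions, with Normal Curve; n = 66, true proportion = .15, standard deviation = .044. Horizontal axis: sample proportion (0.0 to 0.3); vertical axis: Frequency (0 to 15). Marked values: 0.062, 0.15, and 0.238 (2 std devs on each side of .15, 4 standard deviations in all); the sample proportion for homosexual men is marked near 0.3.]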
The sample proportion for homosexual men (30%) is too large to come from the expected distribution of sample proportions.
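One way to make "too large" concrete is to count how many standard deviations each observed proportion lies from the assumed 15%. The sketch below does this for both groups in the study, using the slides' rule of thumb that values more than 2 standard deviations away are unusual; the function names are made up for this example.

```python
import math

TRUE_P = 0.15  # proportion of all men with the trait, as supposed in the slides


def sd_of_sample_proportion(p, n):
    """Standard deviation of a sample proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)


def sds_from_expected(observed, p, n):
    """Number of standard deviations between the observed proportion and p."""
    return (observed - p) / sd_of_sample_proportion(p, n)


z_homosexual = sds_from_expected(20 / 66, TRUE_P, 66)
z_heterosexual = sds_from_expected(26 / 186, TRUE_P, 186)

for label, z in (("homosexual men", z_homosexual), ("heterosexual men", z_heterosexual)):
    verdict = "unusual" if abs(z) > 2 else "not unusual"
    print(f"{label}: {z:.2f} standard deviations from .15 ({verdict})")
```

The homosexual men's proportion falls roughly 3.5 standard deviations above .15, while the heterosexual men's proportion is well within 2 standard deviations.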
Sample means: measurement variables
Suppose we want to estimate the mean weight at PSU.
Data from Stat 100 survey. Sample size 237.
Mean value is 152.5 pounds.
Standard deviation is about (240 – 100)/4 = 35
[Figure: Histogram of Weight, with Normal Curve. Horizontal axis: Weight (100 to 300); vertical axis: Frequency (0 to 40).]
What is the uncertainty in the mean?
We need a margin of error for the mean.
Suppose we take another sample of 237.
What will the mean be?
Will it be 152.5 again?
Probably not.
Consider what happens if we take 1000 samples each of size 237 and compute 1000 means.
Standard deviation is about (157 – 148)/4 = 9/4 = 2.25
[Figure: Histogram of 1000 means with normal curve, based on samples of size 237. Horizontal axis: Weight (145 to 160); vertical axis: Frequency (0 to 100).]
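The resampling experiment can be imitated with a short simulation. This is only a sketch: the survey data are not available here, so it ASSUMES weights in the population follow a normal curve with mean 152.5 and standard deviation 35 (the sample values), which real weights only roughly do. The names below are made up for this example.

```python
import random
import statistics

random.seed(1)  # fixed seed so repeated runs give the same numbers

POP_MEAN = 152.5    # ASSUMPTION: sample mean used as the population mean
POP_SD = 35         # ASSUMPTION: sample standard deviation used for the population
SAMPLE_SIZE = 237
NUM_SAMPLES = 1000


def one_sample_mean():
    """Draw one simulated sample of weights and return its mean."""
    weights = [random.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)]
    return sum(weights) / len(weights)


means = [one_sample_mean() for _ in range(NUM_SAMPLES)]
sd_of_means = statistics.stdev(means)
print(f"mean of the means = {statistics.mean(means):.1f}")
print(f"standard deviation of the means = {sd_of_means:.2f}")
```

The standard deviation of the simulated means should come out near the 2.25 estimated from the histogram.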
Formula for estimating the standard deviation of the sample mean (don’t need histogram)
Just like in the case of proportions, we would like to have a simple formula to find the standard deviation of the mean without having to resample a lot of times.
Suppose we have the standard deviation of the original sample. Then the standard deviation of the sample mean is:
$$\frac{\text{standard deviation of the data}}{\sqrt{\text{sample size}}}$$
So in our example of weights:
The standard deviation of the sample is about 35.
Hence by our formula:
Standard deviation of the mean is 35 divided by the square root of 237: 35/15.4 = 2.3
(Recall we estimated it to be 2.25)
So the margin of error of the sample mean is 2 × 2.3 = 4.6
Report 152.5 ± 4.6 or 147.9 to 157.1
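The same calculation in code, as a small sketch; the function name `sd_of_sample_mean` is made up for this example. Because the code does not round the standard deviation to 2.3 before doubling it, its interval endpoints can differ from the slide's in the last digit.

```python
import math


def sd_of_sample_mean(data_sd, sample_size):
    """Standard deviation of a sample mean: data SD divided by sqrt(sample size)."""
    return data_sd / math.sqrt(sample_size)


sample_mean = 152.5
mean_sd = sd_of_sample_mean(35, 237)
margin_of_error = 2 * mean_sd  # 2 standard deviations, as defined in the slides

print(f"standard deviation of the mean = {mean_sd:.1f}")
print(f"margin of error = {margin_of_error:.2f}")
print(f"interval: {sample_mean - margin_of_error:.2f} to {sample_mean + margin_of_error:.2f}")
```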
Example: SAT scores
Suppose nationally we know that the SAT has a mean of 425 points and a standard deviation of 120 points.
Draw by hand a picture of what you expect the distribution of sample means based on samples of size 100 to look like.
Sample means have a normal distribution with mean 425 and standard deviation 120/10 = 12
So draw a bell shaped curve, centered at 425, with 95% of the bell between 425 – 24 = 401 and 425 + 24 = 449
[Figure: Normal Curve of SAT means based on samples of 100; mean = 425, std dev = 12. Horizontal axis: sample mean (390 to 460); vertical axis: Frequency (0 to 15). The central 4 std devs around 425 are marked.]
A sample of 100 SATs with a mean of 460 would be very unusual. A sample of 100 with a mean of 440 would not be unusual.
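To see why 460 is unusual and 440 is not, count how many standard deviations each sample mean lies from 425. The sketch below uses the slides' rule of thumb that means more than 2 standard deviations away are unusual; the function name `sds_from_population_mean` is made up for this example.

```python
import math


def sds_from_population_mean(sample_mean, pop_mean, pop_sd, sample_size):
    """Number of standard deviations (of the sample mean) between sample_mean and pop_mean."""
    sd_of_mean = pop_sd / math.sqrt(sample_size)
    return (sample_mean - pop_mean) / sd_of_mean


for observed in (460, 440):
    z = sds_from_population_mean(observed, 425, 120, 100)
    verdict = "unusual" if abs(z) > 2 else "not unusual"
    print(f"sample mean {observed}: {z:.2f} standard deviations from 425 ({verdict})")
```

A mean of 460 sits almost 3 standard deviations above 425, outside the 401 to 449 range, while 440 is only 1.25 standard deviations away.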