griffin-ellis
SP 225 Lecture 10
The Central Limit Theorem
Differences Between Statistics and Parameters
Population: All People
Parameter: 5 of 15 or 33% wear glasses
Sample: 3 Randomly Selected People
Statistic: 0 of 3 or 0% wear glasses
Methods for Better Samples
Random sampling makes samples representative
Book term: probability sample, or EPSEM (equal probability of selection method)
EPSEM Technique
Begin with a list of all population members
Generate random numbers to identify members of the list to be selected in the sample
Do everything possible to get selected members to participate in the survey
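The first two EPSEM steps can be sketched in a few lines of Python. This is only an illustration: the member names, the sample size of 3, and the seed are made up, and the helper name epsem_sample is hypothetical.

```python
import random

# Step 1: a list of all population members (hypothetical names).
population = [f"person_{i:02d}" for i in range(1, 16)]


def epsem_sample(frame, n, seed=None):
    """Draw n members without replacement so that every member
    has the same chance of being selected."""
    rng = random.Random(seed)
    return rng.sample(frame, n)


# Step 2: use random numbers to pick the members for the sample.
selected = epsem_sample(population, 3, seed=225)
print(selected)
```

The third step, getting the selected members to participate, is fieldwork and cannot be coded away: if selected members drop out, the sample is no longer an equal-probability sample.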
Sampling Distributions
The shape, measure of center and measure of variation associated with many sample statistics
Unique and different from the population distribution
Three Separate Distributions
Sample distribution: empirical and known; intended to help learn about the population
Population distribution: empirical and unknown; properties estimated using statistics
Sampling distribution: theoretical or non-empirical; properties well-known and based on probabilities
Population Distribution: Number of Siblings
[Figure: histogram of number of siblings; x-axis 0 to 6, frequency axis 0 to 16]
Population Distribution
[Figure: bar chart of frequency by dice face, Roll 1 through Roll 6]
Sample Distribution
600 Rolls of the Die
[Figure: bar chart of frequency by dice face, Roll 1 through Roll 6; frequency axis 0 to 120]
Sampling Distribution of Mean
[Figure: histogram of C1; x-axis 3.35 to 3.65, frequency axis 0 to 9]
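The three distributions for the die can be imitated with a short simulation. This is a sketch under assumptions, not the data behind the slides: the die is taken to be fair, the seed is arbitrary, and the choice of 200 repeated samples is made up for illustration.

```python
import random
from collections import Counter
from statistics import mean

rng = random.Random(10)  # arbitrary seed so the run is repeatable
faces = [1, 2, 3, 4, 5, 6]

# Population distribution: a fair die, every face equally likely.
population_mean = mean(faces)  # 3.5

# Sample distribution: the faces observed in one sample of 600 rolls.
rolls = [rng.choice(faces) for _ in range(600)]
sample_counts = Counter(rolls)

# Sampling distribution of the mean: the means of many samples of 600 rolls.
sample_means = [
    mean([rng.choice(faces) for _ in range(600)])
    for _ in range(200)  # assumed number of repeated samples
]

print("counts in one sample:", sorted(sample_counts.items()))
print("smallest and largest sample mean:", min(sample_means), max(sample_means))
```

The counts in a single sample are roughly flat across the six faces, while the sample means pile up in a narrow band around 3.5.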
Central Limit Theorem
For normally distributed populations
If repeated samples of size N are drawn from a normal population with mean µ and standard deviation σ, then the sampling distribution of sample means will be normal with a mean of µ and a standard deviation of σ/√N
Central Limit Theorem
For any population
If repeated samples of size N are drawn from any population with mean µ and standard deviation σ, then, as N becomes large, the sampling distribution of sample means will approach a normal distribution with a mean of µ and a standard deviation of σ/√N
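The σ/√N claim can be checked numerically for a population that is clearly not normal. The sketch below assumes an exponential population with µ = 1 and σ = 1; the population, the sample sizes, and the 2000 repetitions are illustrative choices, not part of the lecture.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(0)  # arbitrary seed

# Assumed population: exponential with rate 1, so mu = 1 and sigma = 1.
mu, sigma = 1.0, 1.0

results = {}
for n in (4, 25, 100):
    # Draw 2000 samples of size n and keep each sample's mean.
    means = [mean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(2000)]
    results[n] = (mean(means), stdev(means), sigma / sqrt(n))

for n, (center, spread, predicted) in results.items():
    print(f"N={n}: mean of means={center:.3f}, "
          f"SD of means={spread:.3f}, sigma/sqrt(N)={predicted:.3f}")
```

In each row the mean of the sample means should sit near µ and their standard deviation near σ/√N, even though the population itself is strongly skewed.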
The Probability Distribution
We can calculate probabilities for sample means using the sampling distribution for sample means
Similar to calculating probabilities for an individual using a population distribution
Use the standard deviation of the sampling distribution (σ/√N) instead of the standard deviation of the population (σ)
Empirical Rule for Data with a Bell-Shaped Distribution (3)
Example Problem (1)
The average GPA at a particular school is µ = 2.89 with a standard deviation σ = 0.63. A random sample of 25 students is collected. Find the probability that the average GPA for this sample is greater than 3.0.
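One way to work this problem is sketched below with Python's standard library. The problem does not say the GPA population is normal, so the sketch assumes the Central Limit Theorem makes the normal curve a reasonable approximation for samples of 25; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 2.89, 0.63, 25

# Standard deviation of the sampling distribution of the mean.
se = sigma / sqrt(n)          # 0.63 / 5 = 0.126

# Z score of a sample mean of 3.0 within the sampling distribution.
z = (3.0 - mu) / se           # 0.11 / 0.126, about 0.87

# Probability that the sample mean exceeds 3.0 (area to the right of z).
p_greater = 1 - NormalDist(mu, se).cdf(3.0)
print(round(z, 2), round(p_greater, 2))
```

Under these assumptions the probability comes out to roughly 0.19.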
Example Problem (2)
The time it takes students in a cooking school to learn to prepare seafood gumbo is a random variable with a normal distribution where the average is 3.2 hours with a standard deviation of 1.8 hours. Find the probability that the average time it will take a class of 36 students to learn to prepare seafood gumbo is less than 3.4 hours.
Find the probability that it takes one student between 3 and 4 hours to learn to prepare seafood gumbo.
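The two questions differ only in which standard deviation is used: σ/√N for the class average, σ for one student. A minimal sketch using the numbers from the problem (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 3.2, 1.8, 36

# Class of 36: use the sampling distribution of the mean, SD = sigma / sqrt(N).
se = sigma / sqrt(n)                                  # 1.8 / 6 = 0.3
p_class_under_3_4 = NormalDist(mu, se).cdf(3.4)

# One student: use the population distribution itself, SD = sigma.
population = NormalDist(mu, sigma)
p_one_between_3_and_4 = population.cdf(4.0) - population.cdf(3.0)

print(round(p_class_under_3_4, 2), round(p_one_between_3_and_4, 2))
```

The class-average probability is about 0.75, while the single-student probability is about 0.22. A range of one full hour captures less probability for an individual than a bound 0.2 hours above the mean does for the class average, because individual times vary far more than sample means do.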
Confidence Interval
Mathematical statement that says that the parameter lies within a certain range of values
The average employee of XYZ Automotive has been employed between 8 and 12 years
95% confident that the mean length of employment at XYZ Automotive is between 8 and 12 years
Probability Distribution for Population Mean
95% confident that the mean length of employment at XYZ Automotive is between 8 and 12 years
(sample)
Confidence Level
Percent of confidence intervals that contain the population mean over the long run
Probability this confidence interval contains the population mean
95% confident that the mean length of employment at XYZ Automotive is between 8 and 12 years
99% confident that the mean length of employment at XYZ Automotive is between 8 and 12 years
Confidence Interval Formula
c.i. = X̄ ± Z(σ/√N)
Confidence Interval Estimate
c.i. = X̄ ± Z(s/√(N−1))
Confidence Interval Example
A random sample of 100 television programs contained an average of 2.37 acts of physical violence per program. The standard deviation of the number of acts of violence on television is 3. At the 99% level, what is your estimate of the mean number of acts of violence for all television programs?
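A possible worked solution is sketched below. It assumes the stated standard deviation of 3 is the population value σ, so the interval is the sample mean plus or minus Z times σ/√N; if 3 were instead the sample standard deviation s, the estimate form with √(N−1) would apply and give a very slightly wider interval. The Z score is taken from the standard normal curve for a 99% level.

```python
from math import sqrt
from statistics import NormalDist

x_bar, sigma, n = 2.37, 3.0, 100
confidence_level = 0.99
alpha = 1 - confidence_level

# Z score that leaves alpha/2 in each tail of the standard normal curve.
z = NormalDist().inv_cdf(1 - alpha / 2)   # about 2.58

# Margin of error and interval, treating 3 as the population sigma (assumption).
margin = z * sigma / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(round(z, 2), round(lower, 2), round(upper, 2))
```

Under that assumption the 99% interval runs from about 1.60 to about 3.14 acts of violence per program.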
Alpha
Percent of confidence intervals that DO NOT contain the population mean over the long run
Probability this confidence interval DOES NOT contain the population mean
Complement of confidence level
Efficiency
Extent to which the confidence interval clusters around the mean
Width of the confidence interval
Determined by population standard deviation and sample size
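How the width depends on the standard deviation and the sample size can be seen by computing it directly as 2 × Z × σ/√N. The helper name ci_width and the example values (σ = 3; N = 25, 100, 400) are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def ci_width(sigma, n, level=0.95):
    """Full width of a Z-based confidence interval for the mean."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return 2 * z * sigma / sqrt(n)


for n in (25, 100, 400):
    print(n, round(ci_width(3.0, n), 3))
```

Each fourfold increase in N halves the width, and a larger σ or a higher confidence level widens the interval.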