View
229
Download
0
Embed Size (px)
Citation preview
7/29/2019 Sampling Distribution[1]
1/35
1
Sampling Methods and
Sampling Distr ibutionsChapter
7/29/2019 Sampling Distribution[1]
2/35
2EXPLAIN WHY SAMPLES ARE USED.
DEFINE AND CONSTRUCT A SAMPLINGDISTRIBUTION OF SAMPLE MEANS.
EXPLAIN THE CENTRAL LIMIT THEOREM
CALCULATE CONFIDENCE INTERVALSFOR MEANS AND PROPORTIONS.
DETERMINE HOW LARGE A SAMPLESHOULD BE FOR BOTH MEANS ANDPROPORTIONS.
GOALS
7/29/2019 Sampling Distribution[1]
3/35
3 The destructive nature of certain tests.
The physical impossibility of checking allitems in the population.
The cost of studying all the items in apopulation is often prohibitive.
The adequacy of sample results.
To contact the whole population wouldoften be time-consuming.
WHY SAMPLE THE POPULATION?
7/29/2019 Sampling Distribution[1]
4/35
4What is a Probabi l i ty Sample?
A sample selected in such a way that eachitem or person in the population beingstudied has a known (nonzero) likelihood of
being included in the sample. Simple Random Sample:A sample
formulated so that each item or person in
the population has the same chance ofbeing included.
PROBABILITY SAMPLING
7/29/2019 Sampling Distribution[1]
5/35
5
Given a list of elements, select a randomsubset.
Tools
Uniform Random Number Generator Sort Function
How?
SIMPLE RANDOM SAMPLING VIA EXCEL
7/29/2019 Sampling Distribution[1]
6/35
6
Systemat ic Random Sampl ing:The items orindividuals of the population are arrangedin some way-alphabetically or by someother method. A random starting point is
selected, and then every kth member of thepopulation is selected for the sample.
Strat i f ied Random Sampl ing :A population
is first divided into subgroups, calledstrata, and a sample is selected from eachstratum.
PROBABILITY SAMPLING (continued)
7/29/2019 Sampling Distribution[1]
7/35
7
Sampl ing Error :The difference between asample statistic and its correspondingparameter.
For example
PROBABILITY SAMPLING (continued)
7/29/2019 Sampling Distribution[1]
8/35
8
A probability distribution consisting of a listof all possible sample means of a givensample size selected from a population, andthe probability of occurrence associated
with each sample mean.
EXAMPLE :A law firm has five partners. Attheir weekly partners meeting each
reported the number of hours they chargedclients for their professional services lastweek. The results are given on the next
slide.
SAMPLING DISTRIBUTION OF
THE SAMPLE MEANS
7/29/2019 Sampling Distribution[1]
9/35
9EXAMPLE (continued)
Two partners are randomly selected. Howmany different samples are possible?
7/29/2019 Sampling Distribution[1]
10/35
10 This is the combination of 5 objects taken 2
at a time. That is, 5C2 = (5!)/[(2!)(3!)] = 10. List the possible samples of size 2 and
compute the mean.
EXAMPLE (continued)
7/29/2019 Sampling Distribution[1]
11/35
11Organize the sample means into a sampling
distribution. The sampling distribution isshown below.
EXAMPLE (continued)
7/29/2019 Sampling Distribution[1]
12/35
12Compute the mean of the sample means
and compare it with the population mean. The population mean,
m = (22 + 26 + 30 + 26 + 22)/5 = 25.2. The mean of the sample means = [(22)(1) +
(24)(4) + (26)(3) + (28)(2)]/10 = 25.2.
Observe that the mean o f the sample means
is equal to the popu lat ion mean.
EXAMPLE (cont inued)
7/29/2019 Sampling Distribution[1]
13/35
13 For a population with a mean m and a
variance s2, the sampling distribution of themeans of all possible samples of size ngenerated from the population will be
approximately normally distributed - withthe mean of the sampling distribution equalto m and the variance equal to s2/n-assuming that the sample size issufficiently large.
CENTRAL LIMIT THEOREM
7/29/2019 Sampling Distribution[1]
14/35
14POINT ESTIMATESOne value (called a po in t) that is used to
estimate a population parameter. Examples of point estimates are the samplemean, the sample standard deviat ion, the
sample var iance, the sample propo rt ion etc. EXAMPLE: The number of defective items
produced by a machine was recorded for fiverandomly selected hours during a 40-hourwork week. The observed number ofdefectives were 12, 4, 7, 14, and 10. So thesample mean is 9.4. Thus a point estimate for
the weekly mean number of defectives is 9.4
7/29/2019 Sampling Distribution[1]
15/35
15INTERVAL ESTIMATESAn In terval Estim atestates the range within
which a population parameter probably lies. The interval within which a population
parameter is expected to occur is called a
conf idence interval. The two confidence intervals that are used
extensively are the 95% and the 99%.
A 95%confidence interval means that about95% of the similarly constructed intervalswill contain the parameterbeing estimated.
7/29/2019 Sampling Distribution[1]
16/35
16INTERVAL ESTIMATES (con tinued)Another interpretation of the 95%
confidence interval is that 95% of thesample means for a specified sample sizewill lie within 1.96 standard deviations of
the hypothesized population mean. For the 99% confidence interval, 99% of the
sample means for a specified sample size
will lie within 2.58 standard deviations ofthe hypothesized population mean.
7/29/2019 Sampling Distribution[1]
17/35
17
-2.58 -1.96 1.96 2.58
95%
99%
0
STANDARD ERROR OF THE SAMPLE
7/29/2019 Sampling Distribution[1]
18/35
18STANDARD ERROR OF THE SAMPLE
MEANS
This is the standard deviation of thesampling distribution of the sample means.
The standard errorof the sample means iscomputed by:
is the symbol for the standard error ofthe sample means.
sis the standard deviation of the population.n is the size of the sample.
ss
xn
sx
STANDARD ERROR OF THE SAMPLE
7/29/2019 Sampling Distribution[1]
19/35
19
Ifsis not knownand n= 30 ormore(considered a large sample), the standarddeviation of the sample, designated by s, isused to approximate the population
standard deviation, s. The formula for thestandard error then becomes:
What happens as n gets larger?
s
s
n
x
STANDARD ERROR OF THE SAMPLE
MEANS (continued)
95% AND THE 99% CONFIDENCE
7/29/2019 Sampling Distribution[1]
20/35
2095% AND THE 99% CONFIDENCE
INTERVALS (CI) FOR m
The 95% and the 99% confidence intervalsform are constructed as follows when n30.
95% CI for the population meanmis givenby
99% CI formis given by
Xs
n196.
Xs
n
258.
CONSTRUCTING A GENERAL
7/29/2019 Sampling Distribution[1]
21/35
21
In general, a confidence interval for themean is computed by:
The Z value is obtained from the standard
no rmal table in Append ix D (look-upconfidence/2).
CONSTRUCTING A GENERAL
CONFIDENCE INTERVALS (CI) FORm
X Zs
n
7/29/2019 Sampling Distribution[1]
22/35
22 The Dean of Students at Penta Tech wants to
estimate the mean number of hours workedper week by students. A sample of 49students showed a mean of 24 hours with astandard deviation of 4 hours.
What is the point estimate of the mean numberof hours worked per week by students?
The point estimate is 24 hours (sample mean).
What is the 95% confidence interval for theaverage number of hours worked per week bythe students?
EXAMPLE
7/29/2019 Sampling Distribution[1]
23/35
23Using formula, we have 24 1.96(4/7) or we
have 22.88 to 25.12.What are the 95% conf idence l imi ts?
The endpoints of the confidence interval
are the confidence limits. The lowerconf idence l im i tis 22.88 and the upperconf idence l im i tis 25.12.
What degree of conf idenceis being used? The degree of confidence (level of
confidence) is 0.95
EXAMPLE (continued)
7/29/2019 Sampling Distribution[1]
24/35
24 Interpret the findings.
If we had time to select 100 samples of size 49from the population of the number of hours worked
per week by students at Penta Tech and compute
the sample means and 95% confidence intervals,
the population mean of the number of hours
worked by the students per week would be found in
about 95 out of the 100 confidence intervals.
Either a confidence interval contains the populationmean or it does not. About 5 out of the 100
confidence intervals would not contain the
population mean.
EXAMPLE(cont inued)
CONFIDENCE INTERVAL FOR A
7/29/2019 Sampling Distribution[1]
25/35
25CONFIDENCE INTERVAL FOR APOPULATION PROPORTION
The confidence interval for a populationproportion:
where is the standard error of the
proportion:
p z p s
sp
p p
n
( )1
sp
Th fid i t l i t t d b
7/29/2019 Sampling Distribution[1]
26/35
26 The confidence interval is constructed by:
where:is the sample proportion.
z is the zvalue for the degree of
confidence selected.n is the sample size.
p z p pn
( )1
p
7/29/2019 Sampling Distribution[1]
27/35
27EXAMPLE Chris Cooper, a financial planner, is studying
the retirement plans of young executives. Asample of 500 young executives who ownedtheir own home revealed that 175 planned tosell their homes and retire to Arizona. Develop
a 98% confidence interval for the proportion ofexecutives that plan to sell and move toArizona.
Here n= 500, = 175/500 = 0.35, and z=2.33 the 98% CI is 0.35 2.33 or
0.35 0.0497. Interpret?
p( . )( . )035 065
500
FINITE POPULATION
7/29/2019 Sampling Distribution[1]
28/35
28FINITE-POPULATIONCORRECTION FACTOR
A population that has a fixed upper boundis said to be f in i te.
For a f in i te popu lat ion, where the totalnumber of objects is Nand the size of thesample is n, the following adjustment ismade to the standard errors of the samplemeans and the proportion.
Standard error o f the sample means :
s
s
x
n
N n
n
1
FINITE POPULATION
7/29/2019 Sampling Distribution[1]
29/35
29
Standard error of the sample proport ions :
Note: If n /N < 0.05, the fini te-popu lation
co rrect ion factor can be igno red.
sp
p p
n
N n
N
( )1
1
FINITE-POPULATION
CORRECTION FACTOR (con tinued)
EXAMPLE
7/29/2019 Sampling Distribution[1]
30/35
30 The Dean of Students at Penta Tech wants to
estimate the mean number of hours workedper week by students. A sample of 49students showed a mean of 24 hours with astandard deviation of 4 hours. Construct a
95% confidence interval for the mean numberof hours worked per week by the students ifthere are only 500 students on campus.
Now n /N= 49/500 = 0.098 > 0.05, so we have touse the finite population correction factor.
= [22.9352,25.1065]
EXAMPLE
24 1964
49
500 49
500 1
.
7/29/2019 Sampling Distribution[1]
31/35
31SELECTING A SAMPLE SIZE There are 3 factors that determine the size
of a sample, none of which has any d irectrelat ionship to the size of the popu lat ion .
They are:
1. The degree of confidence selected.2. The maximum allowable error.
3. The variation of the population.
7/29/2019 Sampling Distribution[1]
32/35
32SAMPLE SIZE FOR THE MEANA convenient computational formula for
determining n is:
where:
E is the allowable error.
z is the z score associated with the degreeof confidence selected.
s is the sample deviation of the pilotsurvey.
n Z S
E
2
7/29/2019 Sampling Distribution[1]
33/35
33A consumer group would like to estimate
the meanmonthly electric bill for a singlefamily house in July. Based on similarstudies the standard deviation is estimated
to be $20.00. A 99% level of confidence isdesired, with an accuracy of $5.00. Howlarge a sample is required?
n= [(2.58)(20)/5]
2
= 106.5024 107.
EXAMPLE
34
7/29/2019 Sampling Distribution[1]
34/35
34SAMPLE SIZE FOR PROPORTIONS The formula for determining the sample
size in the case of a proportion is:
is the estimated proportion, based on pastexperience or a pilot survey.
z is the zvalue associated with the degree ofconfidence selected.
E is the maximum allowable error the researcher will
tolerate.
p
n p p Z
E
( )12
35EXAMPLE
7/29/2019 Sampling Distribution[1]
35/35
35 The American Kennel Club wanted to
estimate the propor t ionof children thathave a dog as a pet. If the club wanted theestimate to be within 3% of the population
proportion, how many children would theyneed to contact? Assume a 95% level ofconfidence and that the Club estimated that30% of the children have a dog as a pet.
n= (0.30)(0.70)(1.96/0.03)2 = 896.3733 897.
EXAMPLE