EF 507 QUANTITATIVE METHODS FOR ECONOMICS AND FINANCE FALL 2008 Chapter 9 Estimation: Additional Topics. Estimation: Additional Topics. Chapter Topics. Population Means, Dependent Samples. Population Means, Independent Samples. Population Proportions. Population Variance. - PowerPoint PPT Presentation
1/49
EF 507
QUANTITATIVE METHODS FOR ECONOMICS AND FINANCE
FALL 2008
Chapter 9
Estimation: Additional Topics
2/49
Estimation: Additional Topics
Chapter Topics
Population Means, Dependent Samples. Example: same group before vs. after treatment
Population Means, Independent Samples. Example: Group 1 vs. independent Group 2
Population Proportions. Example: Proportion 1 vs. Proportion 2
Population Variance. Example: variance of a normal distribution
3/49
1. Dependent Samples
Tests means of 2 related populations
Paired or matched samples
Repeated measures (before/after)
Use the difference between paired values: $d_i = x_i - y_i$
Eliminates variation among subjects
Assumptions:
Both populations are normally distributed
4/49
Mean Difference
The ith paired difference is $d_i$, where $d_i = x_i - y_i$
The point estimate for the population mean paired difference is $\bar{d}$:
$$\bar{d} = \frac{\sum_{i=1}^{n} d_i}{n}$$
The sample standard deviation is:
$$S_d = \sqrt{\frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n - 1}}$$
n is the number of matched pairs in the sample
5/49
Confidence Interval for Mean Difference
The confidence interval for the mean difference between the paired populations, $\mu_d$, is
$$\bar{d} - t_{n-1,\alpha/2}\frac{S_d}{\sqrt{n}} < \mu_d < \bar{d} + t_{n-1,\alpha/2}\frac{S_d}{\sqrt{n}}$$
Where n = the sample size (number of matched pairs in the paired sample)
6/49
Confidence Interval for Mean Difference (continued)
The margin of error is
$$ME = t_{n-1,\alpha/2}\frac{S_d}{\sqrt{n}}$$
$t_{n-1,\alpha/2}$ is the value from the Student's t distribution with (n – 1) degrees of freedom for which
$$P(t_{n-1} > t_{n-1,\alpha/2}) = \frac{\alpha}{2}$$
7/49
Paired Samples Example
Six people sign up for a weight loss program. You collect the following data:
Weight:
Person   Before (x)   After (y)   Difference, di
1        136          125         11
2        205          195         10
3        157          150          7
4        138          140         -2
5        175          165         10
6        166          160          6
Sum                               42
$$\bar{d} = \frac{\sum d_i}{n} = \frac{42}{6} = 7.0$$
$$S_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}} = 4.82$$
8/49
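Paired Samples Example (continued)
For a 95% confidence level, the appropriate t value is $t_{n-1,\alpha/2} = t_{5,0.025} = 2.571$
The 95% confidence interval for the difference between means, $\mu_d$, is
$$\bar{d} - t_{n-1,\alpha/2}\frac{S_d}{\sqrt{n}} < \mu_d < \bar{d} + t_{n-1,\alpha/2}\frac{S_d}{\sqrt{n}}$$
$$7 - (2.571)\frac{4.82}{\sqrt{6}} < \mu_d < 7 + (2.571)\frac{4.82}{\sqrt{6}}$$
$$1.94 < \mu_d < 12.06$$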
Since this interval (1.94 to 12.06) does not contain zero, we can be 95% confident, even with this limited data, that the mean weight change is positive, i.e. that the weight loss program helps people lose weight
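The paired-sample calculation can be checked with a short script. This is a minimal sketch using only the Python standard library. The critical value 2.571 is the table value quoted on the slide and is hard-coded rather than computed, and the variable names are illustrative.

```python
import math
import statistics

# Weights from the slide's table (before and after the program).
before = [136, 205, 157, 138, 175, 166]
after = [125, 195, 150, 140, 165, 160]

# Paired differences d_i = x_i - y_i.
diffs = [x - y for x, y in zip(before, after)]
n = len(diffs)

d_bar = statistics.mean(diffs)   # point estimate of mu_d
s_d = statistics.stdev(diffs)    # sample standard deviation (n - 1 divisor)

# Table value t_{5, 0.025} quoted on the slide (assumed, not computed here).
t_crit = 2.571

margin = t_crit * s_d / math.sqrt(n)
lower, upper = d_bar - margin, d_bar + margin

print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.2f}")
print(f"95% CI for mu_d: ({lower:.2f}, {upper:.2f})")
```

Both limits come out positive (about 1.94 and 12.06), so zero lies outside the interval.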
9/49
2. Difference Between Two Means
Population means, independent samples
Goal: Form a confidence interval for the difference between two population means, μx – μy
Different data sources: unrelated, independent
The sample selected from one population has no effect on the sample selected from the other population
The point estimate is the difference between the two sample means: $\bar{x} - \bar{y}$
10/49
Difference Between Two Means (continued)
Population means, independent samples
σx² and σy² known: confidence interval uses $z_{\alpha/2}$
σx² and σy² unknown (assumed equal or assumed unequal): confidence interval uses a value from the Student's t distribution
11/49
2. σx² and σy² Known
Population means, independent samples
Assumptions:
Samples are randomly and independently drawn
Both population distributions are normal
Population variances are known
12/49
σx² and σy² Known (continued)
Population means, independent samples
When σx² and σy² are known and both populations are normal, the variance of $\bar{X} - \bar{Y}$ is
$$\sigma^2_{\bar{X} - \bar{Y}} = \frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}$$
…and the random variable
$$Z = \frac{(\bar{x} - \bar{y}) - (\mu_X - \mu_Y)}{\sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}}}$$
has a standard normal distribution
13/49
Confidence Interval, σx² and σy² Known
Population means, independent samples
The confidence interval for μx – μy is:
$$(\bar{x} - \bar{y}) - z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}} < \mu_X - \mu_Y < (\bar{x} - \bar{y}) + z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}}$$
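The slides give no numerical example for the known-variance case, so the following is only a sketch of how the interval could be computed. The function name and all input numbers are hypothetical. The critical value comes from `statistics.NormalDist` in the Python standard library.

```python
import math
from statistics import NormalDist


def z_interval_diff_means(x_bar, y_bar, var_x, var_y, n_x, n_y, confidence=0.95):
    """Confidence interval for mu_X - mu_Y when both population variances are known."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # z_{alpha/2}
    se = math.sqrt(var_x / n_x + var_y / n_y)     # std. dev. of X-bar minus Y-bar
    diff = x_bar - y_bar                          # point estimate
    return diff - z * se, diff + z * se


# Hypothetical inputs for illustration only (not from the slides).
lower, upper = z_interval_diff_means(
    x_bar=50.0, y_bar=45.0, var_x=16.0, var_y=25.0, n_x=40, n_y=50
)
print(f"95% CI for mu_X - mu_Y: ({lower:.2f}, {upper:.2f})")
```

The interval is symmetric about the point estimate, and its half-width is the margin of error.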
14/49
3.a) σx² and σy² Unknown, Assumed Equal
Population means, independent samples
Assumptions:
Samples are randomly and independently drawn
Populations are normally distributed
Population variances are unknown but assumed equal
15/49
σx² and σy² Unknown, Assumed Equal (continued)
Population means, independent samples
Forming interval estimates:
The population variances are assumed equal, so use the two sample standard deviations and pool them to estimate σ
Use a t value with (nx + ny – 2) degrees of freedom
16/49
σx² and σy² Unknown, Assumed Equal (continued)
Population means, independent samples
The pooled variance is
$$s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2}$$
17/49
Confidence Interval, σx² and σy² Unknown, Equal
The confidence interval for μx – μy is:
$$(\bar{x} - \bar{y}) - t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}} < \mu_X - \mu_Y < (\bar{x} - \bar{y}) + t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}}$$
Where
$$s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2}$$
18/49
Pooled Variance Example
You are testing two computer processors for speed. Form a confidence interval for the difference in CPU speed. You collect the following speed data (in MHz):
                  CPUx     CPUy
Number Tested     17       14
Sample mean       3,004    2,538
Sample std dev    74       56
Assume both populations are normal with equal variances, and use 95% confidence
19/49
Calculating the Pooled Variance
The pooled variance is:
$$s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{(n_x - 1) + (n_y - 1)} = \frac{(17 - 1)74^2 + (14 - 1)56^2}{(17 - 1) + (14 - 1)} = 4427.03$$
The t value for a 95% confidence interval is:
$$t_{n_x+n_y-2,\,\alpha/2} = t_{29,\,0.025} = 2.045$$
20/49
Calculating the Confidence Limits
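The 95% confidence interval is
$$(\bar{x} - \bar{y}) - t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}} < \mu_X - \mu_Y < (\bar{x} - \bar{y}) + t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}}$$
$$(3004 - 2538) - (2.045)\sqrt{\frac{4427.03}{17} + \frac{4427.03}{14}} < \mu_X - \mu_Y < (3004 - 2538) + (2.045)\sqrt{\frac{4427.03}{17} + \frac{4427.03}{14}}$$
$$416.89 < \mu_X - \mu_Y < 515.11$$
We are 95% confident that the mean difference in CPU speed is between 416.89 and 515.11 MHz.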
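The pooled-variance interval for the CPU example can be reproduced from the summary statistics alone. This is a minimal sketch. The critical value $t_{29,\,0.025} = 2.045$ is the table value stated on the earlier slide and is hard-coded, and the variable names are illustrative.

```python
import math

# Summary statistics from the CPU example (speeds in MHz).
n_x, n_y = 17, 14
mean_x, mean_y = 3004.0, 2538.0
s_x, s_y = 74.0, 56.0

# Pooled variance with n_x + n_y - 2 degrees of freedom.
df = n_x + n_y - 2
pooled_var = ((n_x - 1) * s_x**2 + (n_y - 1) * s_y**2) / df

# Table value t_{29, 0.025} (assumed from the t table, not computed here).
t_crit = 2.045

se = math.sqrt(pooled_var / n_x + pooled_var / n_y)
diff = mean_x - mean_y
lower, upper = diff - t_crit * se, diff + t_crit * se

print(f"pooled variance = {pooled_var:.2f}, df = {df}")
print(f"95% CI for mu_X - mu_Y: ({lower:.2f}, {upper:.2f}) MHz")
```

With $t = 2.045$ the limits evaluate to roughly 416.89 and 515.11 MHz.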
21/49
3.b) σx² and σy² Unknown, Assumed Unequal
Population means, independent samples
Assumptions:
Samples are randomly and independently drawn
Populations are normally distributed
Population variances are unknown and assumed unequal
22/49
σx² and σy² Unknown, Assumed Unequal (continued)
Population means, independent samples
Forming interval estimates:
The population variances are assumed unequal, so a pooled variance is not appropriate
Use a t value with v degrees of freedom, where
$$v = \frac{\left(\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}\right)^2}{\left(\frac{s_x^2}{n_x}\right)^2 / (n_x - 1) + \left(\frac{s_y^2}{n_y}\right)^2 / (n_y - 1)}$$
23/49
Confidence Interval, σx² and σy² Unknown, Unequal
The confidence interval for μx – μy is:
$$(\bar{x} - \bar{y}) - t_{v,\alpha/2}\sqrt{\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}} < \mu_X - \mu_Y < (\bar{x} - \bar{y}) + t_{v,\alpha/2}\sqrt{\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}}$$
Where
$$v = \frac{\left(\frac{s_x^2}{n_x} + \frac{s_y^2}{n_y}\right)^2}{\left(\frac{s_x^2}{n_x}\right)^2 / (n_x - 1) + \left(\frac{s_y^2}{n_y}\right)^2 / (n_y - 1)}$$
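The degrees-of-freedom formula is easier to read as code. The sketch below reuses the CPU summary statistics from the pooled-variance example purely as an illustration. The slides assume equal variances for that data, so applying the unequal-variance formula to it is an assumption made here, and the function name is hypothetical.

```python
import math


def unequal_variance_df(s_x, n_x, s_y, n_y):
    """Degrees of freedom v for the unequal-variance t interval."""
    a = s_x**2 / n_x   # estimated variance of the X sample mean
    b = s_y**2 / n_y   # estimated variance of the Y sample mean
    return (a + b) ** 2 / (a**2 / (n_x - 1) + b**2 / (n_y - 1))


# CPU summary statistics, reused only for illustration.
v = unequal_variance_df(s_x=74.0, n_x=17, s_y=56.0, n_y=14)
se = math.sqrt(74.0**2 / 17 + 56.0**2 / 14)

print(f"v = {v:.2f} degrees of freedom, standard error = {se:.2f}")
```

Here v comes out slightly below the pooled value of nx + ny – 2 = 29. Because v is generally not an integer, a table lookup needs a rounding convention, which the slides do not specify.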
24/49
Two Population Proportions
Population proportions
Goal: Form a confidence interval for the difference between two population proportions, Px – Py
Assumptions:
Both sample sizes are large (generally at least 40 observations in each sample)
The point estimate for the difference is $\hat{p}_x - \hat{p}_y$
25/49
Two Population Proportions (continued)
Population proportions
The random variable
$$Z = \frac{(\hat{p}_x - \hat{p}_y) - (P_x - P_y)}{\sqrt{\frac{\hat{p}_x(1 - \hat{p}_x)}{n_x} + \frac{\hat{p}_y(1 - \hat{p}_y)}{n_y}}}$$
is approximately normally distributed
26/49
Confidence Interval for Two Population Proportions
Population proportions
The confidence limits for Px – Py are:
$$(\hat{p}_x - \hat{p}_y) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_x(1 - \hat{p}_x)}{n_x} + \frac{\hat{p}_y(1 - \hat{p}_y)}{n_y}}$$
27/49
Example: Two Population Proportions
Form a 90% confidence interval for the difference between the proportion of men and the proportion of women who have college degrees.
In a random sample, 26 of 50 men and 28 of 40 women had an earned college degree
28/49
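Example: Two Population Proportions (continued)
Men: $\hat{p}_x = \frac{26}{50} = 0.52$
Women: $\hat{p}_y = \frac{28}{40} = 0.70$
$$\sqrt{\frac{\hat{p}_x(1 - \hat{p}_x)}{n_x} + \frac{\hat{p}_y(1 - \hat{p}_y)}{n_y}} = \sqrt{\frac{0.52(0.48)}{50} + \frac{0.70(0.30)}{40}} = 0.1012$$
For 90% confidence, $z_{\alpha/2} = 1.645$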
29/49
Example: Two Population Proportions (continued)
The confidence limits are:
$$(\hat{p}_x - \hat{p}_y) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_x(1 - \hat{p}_x)}{n_x} + \frac{\hat{p}_y(1 - \hat{p}_y)}{n_y}} = (0.52 - 0.70) \pm 1.645(0.1012)$$
so the confidence interval is
-0.3465 < Px – Py < -0.0135
Since this interval does not contain zero, we are 90% confident that the two proportions are not equal
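The proportions example can be verified with a few lines of standard-library Python. This is a sketch; the variable names are illustrative, and the critical value is computed with `statistics.NormalDist` rather than read from a table.

```python
import math
from statistics import NormalDist

# Sample counts from the example: degree holders out of sample size.
x_men, n_men = 26, 50
x_women, n_women = 28, 40

p_x = x_men / n_men        # sample proportion, men
p_y = x_women / n_women    # sample proportion, women

# Standard error of the difference in sample proportions.
se = math.sqrt(p_x * (1 - p_x) / n_men + p_y * (1 - p_y) / n_women)

# 90% confidence leaves alpha/2 = 0.05 in each tail.
z = NormalDist().inv_cdf(0.95)

diff = p_x - p_y
lower, upper = diff - z * se, diff + z * se

print(f"standard error = {se:.4f}, z = {z:.3f}")
print(f"90% CI for P_x - P_y: ({lower:.4f}, {upper:.4f})")
```

Both limits are negative, which matches the conclusion that the interval excludes zero.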
30/49
Confidence Intervals for the Population Variance
Population Variance
Goal: Form a confidence interval for the population variance, σ²
The confidence interval is based on the sample variance, s²
Assumed: the population is normally distributed
31/49
Confidence Intervals for the Population Variance (continued)
Population Variance
The random variable
$$\chi^2_{n-1} = \frac{(n - 1)s^2}{\sigma^2}$$
follows a chi-square distribution with (n – 1) degrees of freedom
The chi-square value $\chi^2_{n-1,\alpha}$ denotes the number for which
$$P(\chi^2_{n-1} > \chi^2_{n-1,\alpha}) = \alpha$$
32/49
Confidence Intervals for the Population Variance (continued)
Population Variance
The 100(1 – α)% confidence interval for the population variance is
$$\frac{(n - 1)s^2}{\chi^2_{n-1,\,\alpha/2}} < \sigma^2 < \frac{(n - 1)s^2}{\chi^2_{n-1,\,1-\alpha/2}}$$
33/49
Example
You are testing the speed of a computer processor. You collect the following data (in MHz):
                  CPUx
Sample size       17
Sample mean       3,004
Sample std dev    74
Assume the population is normal. Determine the 95% confidence interval for σx²
34/49
Finding the Chi-square Values
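n = 17 so the chi-square distribution has (n – 1) = 16 degrees of freedom
α = 0.05, so use the chi-square values with area 0.025 in each tail:
$$\chi^2_{n-1,\,\alpha/2} = \chi^2_{16,\,0.025} = 28.85$$
$$\chi^2_{n-1,\,1-\alpha/2} = \chi^2_{16,\,0.975} = 6.91$$
[Figure: chi-square density with 16 degrees of freedom, with probability α/2 = 0.025 in the lower tail below 6.91 and probability α/2 = 0.025 in the upper tail above 28.85]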
35/49
Calculating the Confidence Limits
The 95% confidence interval is
$$\frac{(n - 1)s^2}{\chi^2_{n-1,\,\alpha/2}} < \sigma^2 < \frac{(n - 1)s^2}{\chi^2_{n-1,\,1-\alpha/2}}$$
$$\frac{(17 - 1)(74)^2}{28.85} < \sigma^2 < \frac{(17 - 1)(74)^2}{6.91}$$
$$3,037 < \sigma^2 < 12,683$$
Converting to standard deviation, we are 95% confident that the population standard deviation of CPU speed is between 55.1 and 112.6 MHz
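The variance interval can be sketched in code as follows. The chi-square critical values 28.85 and 6.91 are the rounded table values from the slides and are hard-coded, since the Python standard library has no chi-square quantile function. Variable names are illustrative.

```python
import math

n = 17       # sample size
s = 74.0     # sample standard deviation (MHz)

# Table values for 16 degrees of freedom, as quoted on the slides.
chi2_upper_tail = 28.85   # chi^2_{16, 0.025}
chi2_lower_tail = 6.91    # chi^2_{16, 0.975}

sum_sq = (n - 1) * s**2   # (n - 1) s^2

# Dividing by the larger critical value gives the lower limit, and vice versa.
var_lower = sum_sq / chi2_upper_tail
var_upper = sum_sq / chi2_lower_tail

sd_lower, sd_upper = math.sqrt(var_lower), math.sqrt(var_upper)

print(f"95% CI for variance: ({var_lower:.0f}, {var_upper:.0f})")
print(f"95% CI for std dev:  ({sd_lower:.1f}, {sd_upper:.1f}) MHz")
```

With the two-decimal table value 6.91 the upper variance limit evaluates to about 12,680, a few units below the 12,683 on the slide. The difference is consistent with the slide using a less heavily rounded critical value, and the standard deviation limits still round to 55.1 and 112.6.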
36/49
Sample PHStat Output
37/49
Sample PHStat Output (continued)
[Figure: PHStat screenshot with Input and Output panels]
38/49
Sample Size Determination
Determining Sample Size: For the Mean, For the Proportion
Margin of Error
The required sample size can be found to reach a desired margin of error (ME) with a specified level of confidence (1 – α)
The margin of error is also called sampling error:
the amount of imprecision in the estimate of the population parameter
the amount added and subtracted to the point estimate to form the confidence interval
40/49
Sample Size Determination
Determining Sample Size: For the Mean
$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
Margin of Error (sampling error):
$$ME = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
41/49
Sample Size Determination (continued)
Determining Sample Size: For the Mean
$$ME = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
Now solve for n to get
$$n = \frac{z_{\alpha/2}^2\,\sigma^2}{ME^2}$$
42/49
Sample Size Determination (continued)
To determine the required sample size for the mean, you must know:
The desired level of confidence (1 – α), which determines the $z_{\alpha/2}$ value
The acceptable margin of error (sampling error), ME
The standard deviation, σ
$$n = \frac{z_{\alpha/2}^2\,\sigma^2}{ME^2}$$
43/49
Required Sample Size Example
If σ = 45, what sample size is needed to estimate the mean within ± 5 with 90% confidence?
$$n = \frac{z_{\alpha/2}^2\,\sigma^2}{ME^2} = \frac{(1.645)^2(45)^2}{5^2} = 219.19$$
So the required sample size is n = 220 (always round up)
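The sample-size rule for the mean, including the round-up step, can be written as a small helper. This is a sketch; the function name is hypothetical, and the critical value is computed with `statistics.NormalDist` instead of being read from a table.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, margin_of_error, confidence):
    """Smallest n for which z_{alpha/2} * sigma / sqrt(n) <= margin_of_error."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # Always round up: rounding down would exceed the allowed margin of error.
    return math.ceil((z * sigma / margin_of_error) ** 2)


n = sample_size_for_mean(sigma=45, margin_of_error=5, confidence=0.90)
print(n)
```

For σ = 45, ME = 5 and 90% confidence this returns 220, in line with the worked example.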
44/49
Sample Size Determination
Determining Sample Size: For the Proportion
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
Margin of Error (sampling error):
$$ME = z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
45/49
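Sample Size Determination (continued)
Determining Sample Size: For the Proportion
$$ME = z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
$\hat{p}(1 - \hat{p})$ cannot be larger than 0.25, which it reaches when $\hat{p} = 0.5$
Substitute 0.25 for $\hat{p}(1 - \hat{p})$ and solve for n to get
$$n = \frac{0.25\,z_{\alpha/2}^2}{ME^2}$$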
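The 0.25 bound and the resulting sample-size formula follow from two short steps. The worked derivation below is a sketch added for clarity; it uses only the symbols already defined on the slides.

```latex
% Step 1: complete the square to bound \hat{p}(1 - \hat{p}).
\hat{p}(1 - \hat{p}) = \hat{p} - \hat{p}^2
  = \tfrac{1}{4} - \left(\hat{p} - \tfrac{1}{2}\right)^2 \le \tfrac{1}{4},
  \quad \text{with equality when } \hat{p} = \tfrac{1}{2}.

% Step 2: substitute the bound into the margin of error and solve for n.
ME = z_{\alpha/2}\sqrt{\frac{0.25}{n}}
  \;\Longrightarrow\; ME^2 = \frac{0.25\, z_{\alpha/2}^2}{n}
  \;\Longrightarrow\; n = \frac{0.25\, z_{\alpha/2}^2}{ME^2}.
```

Because 0.25 is the largest value the product can take, a sample size computed this way is at least as large as the one any other value of the proportion would require.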
46/49
Sample Size Determination (continued)
The sample and population proportions, $\hat{p}$ and P, are generally not known (since no sample has been taken yet)
P(1 – P) = 0.25 gives the largest possible margin of error, so the resulting sample size is guaranteed to achieve the desired margin of error at the chosen level of confidence
To determine the required sample size for the proportion, you must know:
The desired level of confidence (1 – α), which determines the critical $z_{\alpha/2}$ value
The acceptable sampling error (margin of error), ME
Estimate P(1 – P) = 0.25
47/49
Required Sample Size Example
How large a sample would be necessary to estimate the true proportion defective in a large population within ±3%, with 95% confidence?
48/49
Required Sample Size Example (continued)
Solution:
For 95% confidence, use $z_{0.025} = 1.96$
ME = 0.03
Estimate P(1 – P) = 0.25
$$n = \frac{0.25\,z_{\alpha/2}^2}{ME^2} = \frac{(0.25)(1.96)^2}{(0.03)^2} = 1,067.11$$
So use n = 1,068
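The conservative sample-size rule for a proportion can be sketched the same way. The function name and its `p` parameter are hypothetical; `p` defaults to 0.5, the worst case used on the slides.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin_of_error, confidence, p=0.5):
    """Smallest n meeting the margin of error, using p(1 - p) as the variance term."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # p = 0.5 gives p(1 - p) = 0.25, the largest possible value.
    return math.ceil(p * (1 - p) * z**2 / margin_of_error**2)


n = sample_size_for_proportion(margin_of_error=0.03, confidence=0.95)
print(n)
```

For ME = 0.03 and 95% confidence this returns 1,068, matching the solution above.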
49/49
PHStat Sample Size Options