alexandra-blair
1
Math 4030 – 10b
Inferences Concerning Proportions
2
Population proportion p means:
• 100p% of the subjects in the population have the property of interest;
• if we randomly select one subject from the population, the probability is p that the subject has the property of interest;
• if we take a sample of size n, of which X subjects have the property of interest, then the sample proportion is $X/n$.
3
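Distribution of sample proportion X/n:
$$X \sim \text{Binomial}(n, p) \approx N\big(np,\ np(1-p)\big)$$
$$\frac{X}{n} \approx N\!\left(p,\ \frac{p(1-p)}{n}\right)$$
for n ≥ 30 (the second parameter of N is the variance).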
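Confidence Interval for p (Sec. 10.1):
$$\frac{x}{n} - z_{\alpha/2}\sqrt{\frac{\frac{x}{n}\left(1-\frac{x}{n}\right)}{n}} \;<\; p \;<\; \frac{x}{n} + z_{\alpha/2}\sqrt{\frac{\frac{x}{n}\left(1-\frac{x}{n}\right)}{n}}$$
Maximum error of estimate for p: the half-width $z_{\alpha/2}\sqrt{\frac{x}{n}\left(1-\frac{x}{n}\right)/n}$ of this interval.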
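The interval above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `proportion_ci` and the counts (36 of 100) are hypothetical, and the quantile $z_{\alpha/2}$ is taken from the standard library's `NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(x, n, conf=0.95):
    """Large-sample confidence interval for a population proportion p.

    Returns (lower, upper, E), where E is the maximum error of estimate.
    """
    p_hat = x / n                                  # sample proportion x/n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    E = z * sqrt(p_hat * (1 - p_hat) / n)          # half-width of the interval
    return p_hat - E, p_hat + E, E


# Hypothetical data: 36 of 100 sampled subjects have the property.
lower, upper, E = proportion_ci(x=36, n=100)
print(f"{lower:.3f} < p < {upper:.3f}   (E = {E:.3f})")
```

The large-sample condition (n ≥ 30) is assumed rather than checked by the function.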
4
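Sample size calculation:
$$E = z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \quad\Longrightarrow\quad n = p(1-p)\left(\frac{z_{\alpha/2}}{E}\right)^2$$
What value should be used for p?
• Use p from a similar population;
• Use ¼ as the maximum of p(1 − p);
• If α = 0.05, we may use n = 1/E², since $z_{0.025} = 1.96 \approx 2$ and $\tfrac{1}{4}(2/E)^2 = 1/E^2$.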
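As a rough sketch of the sample size calculation, the following Python snippet evaluates $n = p(1-p)(z_{\alpha/2}/E)^2$ and rounds up. The function name `sample_size`, the target error E = 0.05, and the prior guess p = 0.2 are illustrative assumptions, not values from the slides.

```python
from math import ceil
from statistics import NormalDist


def sample_size(E, p=None, conf=0.95):
    """Smallest n whose maximum error of estimate is at most E.

    If no prior value of p is available, use the worst case p(1 - p) = 1/4.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    pq = 0.25 if p is None else p * (1 - p)
    return ceil(pq * (z / E) ** 2)


E = 0.05                                  # hypothetical target error
n_conservative = sample_size(E)           # uses p(1 - p) = 1/4
n_with_prior = sample_size(E, p=0.2)      # hypothetical p from a similar population
rule_of_thumb = 1 / E ** 2                # the n = 1/E^2 shortcut for alpha = 0.05

print(n_conservative, n_with_prior, f"{rule_of_thumb:.0f}")
```

The shortcut n = 1/E² comes out slightly larger than the conservative value because it rounds 1.96 up to 2.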
5
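For Hypothesis Testing (Sec. 10.2)
$$\frac{X}{n} \approx N\!\left(p,\ \frac{p(1-p)}{n}\right)$$
Under $H_0: p = p_0$,
$$Z = \frac{X - np_0}{\sqrt{np_0(1-p_0)}} = \frac{\frac{X}{n} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}} \sim N(0, 1)$$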
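A minimal Python sketch of this one-sample test statistic follows. The function name `one_proportion_z` and the data (48 successes in 100 trials, tested against $p_0 = 0.6$) are hypothetical; the two-sided p-value is included only as an example of how the statistic might be used.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(x, n, p0):
    """Z statistic for H0: p = p0, using the normal approximation."""
    return (x - n * p0) / sqrt(n * p0 * (1 - p0))


# Hypothetical data: 48 successes in 100 trials, testing p0 = 0.6.
z = one_proportion_z(x=48, n=100, p0=0.6)

# Two-sided p-value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(f"Z = {z:.3f}, two-sided p-value = {p_value:.4f}")
```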
6
Compare 2 proportions:
A new method is under development for making disks of a superconducting material. 50 disks are made by each method (new and old), and they are checked for superconductivity when cooled with liquid nitrogen.

                  Old Method (1)   New Method (2)   Total
Superconductors         31               42           73
Failures                19                8           27
Total                   50               50          100

We want to claim that the new method is an improvement.
7
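Sample proportions: $\hat{p}_1 = \frac{X_1}{n_1}$, $\hat{p}_2 = \frac{X_2}{n_2}$.
Distribution of Sample Proportion Difference:
Distribution under the assumption $H_0: p_1 = p_2 = p$:
$$\hat{p}_2 - \hat{p}_1 \sim N\!\left(0,\ p(1-p)\left(\frac{1}{n_1}+\frac{1}{n_2}\right)\right)$$
or
$$\frac{\hat{p}_2 - \hat{p}_1}{\sqrt{p(1-p)\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}} \sim N(0, 1)$$
If p is unknown, use
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}.$$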
8
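Hypothesis Testing:
Null hypothesis $H_0: p_{NEW} - p_{OLD} = 0$
Alternative hypothesis $H_1: p_{NEW} - p_{OLD} > 0$
Level of significance: $\alpha = 0.05$ (right-tailed test)
Critical value and critical region: for a large sample, we use the z-test; $z_{0.05} = 1.645$, so the critical region is $Z > 1.645$.
Sample statistic calculation:
$$\hat{p} = \frac{73}{100} = 0.73$$
$$Z = \frac{\hat{p}_{NEW} - \hat{p}_{OLD}}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_{NEW}}+\frac{1}{n_{OLD}}\right)}} = \frac{0.84 - 0.62}{\sqrt{0.73(1-0.73)\left(\frac{1}{50}+\frac{1}{50}\right)}} = 2.478$$
Conclusion: Reject the null hypothesis, …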
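The test statistic for the superconductor example can be checked with a short Python sketch. The counts (31 of 50 for the old method, 42 of 50 for the new) come from the table in the slides; the function name `two_proportion_z` is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x_old, n_old, x_new, n_new):
    """Z statistic for H0: p_new - p_old = 0, using the pooled proportion."""
    p_old, p_new = x_old / n_old, x_new / n_new
    p_pool = (x_old + x_new) / (n_old + n_new)                 # 73/100 = 0.73
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_old + 1 / n_new))
    return (p_new - p_old) / se


z = two_proportion_z(x_old=31, n_old=50, x_new=42, n_new=50)
z_crit = NormalDist().inv_cdf(0.95)        # right-tailed critical value z_{0.05}

print(f"Z = {z:.3f}, critical value = {z_crit:.3f}")
print("reject H0" if z > z_crit else "fail to reject H0")
```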
9
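Confidence interval for the difference:
$$\hat{p}_{NEW} - \hat{p}_{OLD} = \frac{42}{50} - \frac{31}{50} = 0.84 - 0.62 = 0.22$$
$$E = z_{\alpha/2}\sqrt{\frac{\hat{p}_{NEW}(1-\hat{p}_{NEW})}{n_{NEW}} + \frac{\hat{p}_{OLD}(1-\hat{p}_{OLD})}{n_{OLD}}} = 1.96\sqrt{\frac{0.84(1-0.84)}{50} + \frac{0.62(1-0.62)}{50}} = 0.17$$
$$0.05 < p_{NEW} - p_{OLD} < 0.39$$
With 95% confidence, the improvement is more than 0.05 and up to 0.39.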
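The same interval can be reproduced numerically. This sketch uses the counts from the slides (42 of 50 new, 31 of 50 old); the variable names are illustrative, and `NormalDist` supplies $z_{\alpha/2}$ in place of the rounded 1.96.

```python
from math import sqrt
from statistics import NormalDist

x_new, n_new = 42, 50
x_old, n_old = 31, 50

p_new, p_old = x_new / n_new, x_old / n_old
diff = p_new - p_old                                   # 0.84 - 0.62

# Unpooled standard error: each sample contributes its own variance term.
se = sqrt(p_new * (1 - p_new) / n_new + p_old * (1 - p_old) / n_old)
E = NormalDist().inv_cdf(0.975) * se                   # maximum error, 95% level

lower, upper = diff - E, diff + E
print(f"{lower:.2f} < p_new - p_old < {upper:.2f}   (E = {E:.2f})")
```

Unlike the test statistic, the interval does not pool the two samples, because it is not computed under the assumption that the two proportions are equal.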
10
Compare Several Proportions (Sec. 10.3):
From k independent samples from k populations, we have

             Sample 1   Sample 2   …   Sample k   Total
Successes       x1         x2      …      xk        x
Failures      n1-x1      n2-x2     …    nk-xk     n - x
Total           n1         n2      …      nk        n
11
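Sampling distribution if $p_1, p_2, \ldots, p_k$ are the k population proportions:
$$Z_j = \frac{X_j - n_j p_j}{\sqrt{n_j p_j (1-p_j)}}$$
is approximately N(0, 1) for each j when the samples are large (normal approximation to the binomial).
Combined:
$$\chi^2 = \sum_{j=1}^{k} \frac{(X_j - n_j p_j)^2}{n_j p_j (1-p_j)}$$
has approximately a chi-square distribution. With the $p_j$ known, df = k; when a common p is estimated by the pooled proportion, df = k – 1.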
12
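For each j,
$$\begin{aligned}
\frac{(X_j - n_j p_j)^2}{n_j p_j (1-p_j)}
&= \frac{(1-p_j)(X_j - n_j p_j)^2}{n_j p_j (1-p_j)} + \frac{p_j (X_j - n_j p_j)^2}{n_j p_j (1-p_j)} \\
&= \frac{(X_j - n_j p_j)^2}{n_j p_j} + \frac{(X_j - n_j p_j)^2}{n_j (1-p_j)} \\
&= \frac{(X_j - n_j p_j)^2}{n_j p_j} + \frac{\big[(n_j - X_j) - n_j (1-p_j)\big]^2}{n_j (1-p_j)}
\end{aligned}$$
Hence
$$\sum_{j=1}^{k} \frac{(X_j - n_j p_j)^2}{n_j p_j (1-p_j)} = \sum_{i=1}^{2}\sum_{j=1}^{k} \frac{(o_{ij} - e_{ij})^2}{e_{ij}}$$
where $o_{ij}$ is the observed frequency and $e_{ij}$ is the expected frequency.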
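The identity between the binomial form and the observed/expected form can be spot-checked numerically. This is only an illustrative check on one arbitrary triple (x, n, p), not a proof; the function names are hypothetical.

```python
from math import isclose


def binomial_form(x, n, p):
    """(X - np)^2 / (np(1 - p)) for a single sample."""
    return (x - n * p) ** 2 / (n * p * (1 - p))


def two_cell_form(x, n, p):
    """Sum of (o - e)^2 / e over the success cell and the failure cell."""
    observed = (x, n - x)
    expected = (n * p, n * (1 - p))
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Arbitrary illustrative values.
x, n, p = 21, 30, 0.65
left = binomial_form(x, n, p)
right = two_cell_form(x, n, p)
print(left, right, isclose(left, right))
```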
13
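Hypothesis Testing:
Null hypothesis $H_0: p_1 = p_2 = \cdots = p_k$
Alternative hypothesis $H_1: p_1, p_2, \ldots, p_k$ are not all the same.
Sample statistic:
$$\chi^2 = \sum_{i=1}^{2}\sum_{j=1}^{k} \frac{(o_{ij} - e_{ij})^2}{e_{ij}}$$
with df = k – 1, where
Pooled proportion:
$$\hat{p} = \frac{x}{n} = \frac{x_1 + x_2 + \cdots + x_k}{n_1 + n_2 + \cdots + n_k}$$
Expected Cell Frequency:
$$e_{1j} = n_j \hat{p}, \qquad e_{2j} = n_j (1 - \hat{p})$$
Observed Cell Frequency:
$$o_{1j} = x_j, \qquad o_{2j} = n_j - x_j$$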
14
Example. Four methods are under development for making disks of a superconducting material. 30, 40, 60, and 70 disks are made by the 4 methods, respectively, and they are checked for superconductivity when cooled with liquid nitrogen.

                  Method 1   Method 2   Method 3   Method 4   Total
Superconductors      21         32         32         45       130
Failures              9          8         28         25        70
Total                30         40         60         70       200
15
First we need to know whether the 4 methods differ at all.
Null hypothesis: $p_1 = p_2 = p_3 = p_4$
Alternative hypothesis: $p_1, p_2, p_3, p_4$ are not all equal.
Level of significance: $\alpha = 0.05$
Critical region: With df = 4 – 1 = 3, we have $\chi^2_{0.05}(3) = 7.815$. Critical region is: (7.815, ∞).
Statistic from sample: We need to calculate the expected frequencies.
16
Observed frequencies, with expected frequencies in parentheses:

                   Method 1    Method 2    Method 3    Method 4    Total
Superconductors   21 (19.5)    32 (26)     32 (39)    45 (45.5)     130
Failures           9 (10.5)     8 (14)     28 (21)    25 (24.5)      70
Total                 30          40          60          70        200

$\chi^2 = 7.891$
Conclusion: Since the sample statistic falls in the critical region, we reject the null hypothesis. The four methods are not all the same.
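The expected frequencies and the chi-square statistic for this example can be recomputed with a short Python sketch. The counts come from the table in the slides; the critical value 7.815 is copied from the slides rather than computed, since the standard library has no chi-square quantile function. Variable names are illustrative.

```python
successes = [21, 32, 32, 45]      # superconductors for Methods 1-4
sizes = [30, 40, 60, 70]          # disks made by each method

p_pool = sum(successes) / sum(sizes)          # pooled proportion 130/200

chi2 = 0.0
expected = []                                  # (success, failure) expected counts
for x, n in zip(successes, sizes):
    e_success, e_failure = n * p_pool, n * (1 - p_pool)
    expected.append((e_success, e_failure))
    chi2 += (x - e_success) ** 2 / e_success
    chi2 += ((n - x) - e_failure) ** 2 / e_failure

CRITICAL_VALUE = 7.815            # chi-square, alpha = 0.05, df = 3 (from the slides)

print(f"pooled proportion = {p_pool:.2f}")
print(f"chi-square = {chi2:.3f}")
print("reject H0" if chi2 > CRITICAL_VALUE else "fail to reject H0")
```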
How do these methods differ?
17
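$$\frac{x_j}{n_j} - z_{\alpha/2}\sqrt{\frac{\frac{x_j}{n_j}\left(1-\frac{x_j}{n_j}\right)}{n_j}} \;<\; p_j \;<\; \frac{x_j}{n_j} + z_{\alpha/2}\sqrt{\frac{\frac{x_j}{n_j}\left(1-\frac{x_j}{n_j}\right)}{n_j}}$$
gives a confidence interval for each of the 4 population (method) proportions. Using Excel, we find

                      M1     M2     M3     M4
Sample Size           30     40     60     70
Sample Proportion    0.70   0.80   0.53   0.64
E                    0.16   0.12   0.13   0.11
95% CI - L           0.54   0.68   0.41   0.53
95% CI - R           0.86   0.92   0.66   0.76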
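The slides compute these intervals in Excel; as an alternative sketch, the same table can be reproduced in Python from the counts in the example. The dictionary layout and variable names are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

successes = {"M1": 21, "M2": 32, "M3": 32, "M4": 45}
sizes = {"M1": 30, "M2": 40, "M3": 60, "M4": 70}

z = NormalDist().inv_cdf(0.975)               # z_{alpha/2} for a 95% interval

intervals = {}
for method, x in successes.items():
    n = sizes[method]
    p_hat = x / n                              # sample proportion
    E = z * sqrt(p_hat * (1 - p_hat) / n)      # maximum error of estimate
    intervals[method] = (p_hat, E, p_hat - E, p_hat + E)
    print(f"{method}: p_hat = {p_hat:.2f}, E = {E:.2f}, "
          f"95% CI = ({p_hat - E:.2f}, {p_hat + E:.2f})")
```

In these results the intervals for Methods 2 and 3 are the only pair that do not overlap, which suggests where the difference detected by the chi-square test may lie.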
18
[Figure: Methods 1 to 4 plotted against p on a horizontal axis from 0.4 to 0.9.]