Comparing Two Proportions
Example: caries incidence
Clinical trial with caries intervention on infants
                N    developed caries by age two
controls        36   27.8%
intervention    68   8.8%
Is this strong evidence of the effectiveness of the
experimental intervention?
Comparison of two proportions
- two independent samples
These are called “two-sample” tests.
Our goal is usually to estimate p1 – p2, the
corresponding confidence intervals, and to perform
hypothesis tests on:
H0: p1 – p2 = 0.
The obvious statistic to compare the two population
proportions is $\hat{p}_1 - \hat{p}_2$, where $\hat{p}_i$ = number of
successes in group i divided by the sample size in group i.
Probability theory tells us that:
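1. $\hat{p}_1 - \hat{p}_2$ is the best estimate of p1 – p2
2. the standard error is $\sqrt{p_1(1-p_1)/n_1 + p_2(1-p_2)/n_2}$
3. If n1p1(1-p1) > 5 and n2p2(1-p2) > 5, then
$$\hat{p}_1 - \hat{p}_2 \sim N\left(p_1 - p_2,\ \sqrt{p_1(1-p_1)/n_1 + p_2(1-p_2)/n_2}\right)$$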
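The large-sample condition in item 3 can be checked numerically. The sketch below plugs the sample proportions in for the unknown p1 and p2 (an assumption, since the condition is stated for the true proportions) and uses the caries counts given later in the example (10 of 36 controls, 6 of 68 in the intervention group). The function name and the `threshold` parameter are illustrative.

```python
def normal_approx_ok(successes, n, threshold=5):
    """Plug-in check of n * p * (1 - p) > threshold using the sample proportion."""
    p_hat = successes / n
    return n * p_hat * (1 - p_hat) > threshold

# Caries example: 10 of 36 controls and 6 of 68 intervention infants developed caries.
control_ok = normal_approx_ok(10, 36)       # 36 * 0.278 * 0.722 is about 7.2
intervention_ok = normal_approx_ok(6, 68)   # 68 * 0.088 * 0.912 is about 5.5
print(control_ok, intervention_ok)
```

Both groups pass the check, although the intervention group is close to the threshold.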
Large-sample confidence interval for p1 – p2
$$\hat{p}_1 - \hat{p}_2 \pm Z_{1-\alpha/2}\sqrt{\hat{p}_1(1-\hat{p}_1)/n_1 + \hat{p}_2(1-\hat{p}_2)/n_2}$$
Large-sample Z-test of
H0: p1 – p2 = 0 vs. H1: p1 – p2 ≠ 0
Test statistic:
$$Z = \frac{\hat{p}_1 - \hat{p}_2}{SE_{H_0}(\hat{p}_1 - \hat{p}_2)}$$
where $SE_{H_0}(\hat{p}_1 - \hat{p}_2)$ denotes the standard error
estimated under H0: p1 - p2 = 0 (p1 = p2).
Estimate the common p using
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2},$$
where x1 and x2 are the number of successes in
groups 1 and 2, respectively.
Then
$$SE_{H_0}(\hat{p}_1 - \hat{p}_2) = \sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
Compare Z to a standard Normal distribution.
Example: Caries incidence
                     caries by age two
                N    number   percent
controls        36   10       27.8%
intervention    68    6        8.8%
95% confidence interval:
$\hat{p}_1 - \hat{p}_2$ = 0.278 - 0.088 = 0.19
$$SE = \sqrt{0.278(1-0.278)/36 + 0.088(1-0.088)/68} = 0.082$$
95% confidence interval for p1 - p2:
$$0.19 \pm 1.96 \times 0.082 = (0.029,\ 0.351)$$
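The interval above can be reproduced with a few lines of code. This is a minimal sketch using only the Python standard library. The function name `diff_proportion_ci` is illustrative, and the counts 10/36 and 6/68 are taken from the example table. Because the code works from unrounded proportions, the limits may differ from the hand calculation in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

def diff_proportion_ci(x1, n1, x2, n2, level=0.95):
    """Large-sample confidence interval for p1 - p2 from success counts."""
    p1, p2 = x1 / n1, x2 / n2
    # Unpooled standard error: each group keeps its own estimated proportion.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Two-sided critical value (about 1.96 for a 95% interval).
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    diff = p1 - p2
    return diff, se, (diff - z * se, diff + z * se)

diff, se, (lower, upper) = diff_proportion_ci(10, 36, 6, 68)
print(f"difference = {diff:.3f}, SE = {se:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```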
Test: H0: p1 – p2 = 0 vs. H1: p1 – p2 ≠ 0
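$$\hat{p} = \frac{10 + 6}{36 + 68} = 0.154$$
$$SE_{H_0}(\hat{p}_1 - \hat{p}_2) = \sqrt{0.154(1-0.154)\left(\frac{1}{36} + \frac{1}{68}\right)} = 0.074$$
$$Z = \frac{0.19}{0.074} = 2.57$$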
P-value = 2×P(Z > 2.57) = 0.010. Reject at α=.05 level.
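As a cross-check, the same test can be sketched in code. The function name is illustrative and only the standard library is used. Working from unrounded proportions gives Z of about 2.55 rather than the 2.57 obtained from the rounded values 0.19 and 0.074; the conclusion at the α = .05 level is the same.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided large-sample Z-test of H0: p1 - p2 = 0."""
    p1, p2 = x1 / n1, x2 / n2
    # Under H0 the two groups share one proportion, estimated by pooling.
    pooled = (x1 + x2) / (n1 + n2)
    se0 = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se0
    # Two-sided p-value from the standard Normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p_value = two_proportion_z_test(10, 36, 6, 68)
print(f"Z = {z:.2f}, two-sided p-value = {p_value:.3f}")
```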
Chi-squared Test ( χ2 test)
The Chi-square test generalizes the two-sample Z-test to
situations with more than two proportions.
Example: perio by gender (NHANES I data):
Evaluate whether periodontitis is independent of
gender by seeing if the proportion of males in each
group defined by periodontal status is the same.
χ2 test utilizes “contingency” tables
The null hypothesis is that all proportions are equal
H0: p1 = p2 = p3.
Observed Data (count)
                     periodontal status
GENDER     healthy   gingivitis   perio    Total
male          1143          929     937     3009
female        2607         1490     921     5018
Total         3750         2419    1858     8027
Expected frequencies (under assumption of equal proportions)
                              periodontal status
          healthy               gingivitis            perio                 Total
male      3750 × (3009/8027)    2419 × (3009/8027)    1858 × (3009/8027)    3009
          = 1406                = 907                 = 697
female    3750 × (5018/8027)    2419 × (5018/8027)    1858 × (5018/8027)    5018
          = 2344                = 1512                = 1161
Total     3750                  2419                  1858                  8027
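The expected frequencies follow one rule: column total × row total / grand total. A short sketch of that calculation on the observed counts (variable names are illustrative):

```python
# Observed counts; columns are healthy, gingivitis, perio.
observed = [
    [1143, 929, 937],    # male
    [2607, 1490, 921],   # female
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for a cell = row total * column total / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

for label, row in zip(["male", "female"], expected):
    print(label, [round(e, 1) for e in row])
```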
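Chi-squared statistic:
$$X^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$
$$= \frac{(1143-1406)^2}{1406} + \frac{(929-907)^2}{907} + \frac{(937-697)^2}{697} + \frac{(2607-2344)^2}{2344} + \frac{(1490-1512)^2}{1512} + \frac{(921-1161)^2}{1161} = 212$$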
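The sum above can be sketched as a small function that works for any R × C table of counts. The name `pearson_chi_square` is illustrative. Because it uses unrounded expected counts, it returns about 212.27 rather than the 212 obtained from rounded values.

```python
def pearson_chi_square(observed):
    """Return (X^2, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for row, r in zip(observed, row_totals):
        for obs, c in zip(row, col_totals):
            exp = r * c / n                   # expected count under H0
            stat += (obs - exp) ** 2 / exp    # contribution of this cell
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

stat, df = pearson_chi_square([[1143, 929, 937], [2607, 1490, 921]])
print(f"X2 = {stat:.3f} on {df} degrees of freedom")
```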
Large (positive) values of X2 indicate evidence
against the null hypothesis.
If H0 is true, then a χ2 statistic from a contingency
table with R rows and C columns should have a
Chi-square distribution with (R-1) × (C-1)
degrees of freedom.
The P-value is the probability that a χ2 distribution with
(R-1) × (C-1) degrees of freedom
is greater than the observed statistic.
Note that all the probability in the p-value (and
rejection region) is on one side, since only large
values of X2 would contradict H0.
Our statistic, 212, was larger than 15.20, the
99.95th percentile of a χ2 distribution with 2 degrees of freedom, so p < 0.0005.
Table 6 in the coursepack has χ2 percentiles.
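For 2 degrees of freedom the Chi-square upper-tail probability has the closed form exp(-x/2), so the percentile quoted above can be reproduced without a table. The sketch below is valid only for df = 2, and the function names are illustrative.

```python
from math import exp, log

def chi2_upper_tail_df2(x):
    """P(chi-square with 2 df > x); this closed form holds only for df = 2."""
    return exp(-x / 2)

def chi2_percentile_df2(q):
    """Value exceeded with probability 1 - q by a chi-square with 2 df."""
    return -2 * log(1 - q)

cutoff = chi2_percentile_df2(0.9995)
p_value = chi2_upper_tail_df2(212)
print(f"99.95th percentile = {cutoff:.2f}")
print(f"p-value for X2 = 212: {p_value:.3g}")
```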
SPSS output for Chi-square test
GENDER * periodontal status Crosstabulation
                                    periodontal status
GENDER                      healthy   gingivitis    perio     Total
male      Count                1143          929      937      3009
          Expected Count     1405.7        906.8    696.5    3009.0
female    Count                2607         1490      921      5018
          Expected Count     2344.3       1512.2   1161.5    5018.0
Total     Count                3750         2419     1858      8027
          Expected Count     3750.0       2419.0   1858.0    8027.0

Chi-Square Tests
                                Value        df   Asymp. Sig. (2-sided)
Pearson Chi-Square              212.271(a)   2    .000
Likelihood Ratio                210.264      2    .000
Linear-by-Linear Association    209.324      1    .000
N of Valid Cases                8027
a. 0 cells (.0%) have expected count less than 5. The minimum
expected count is 696.49.
Notes on Chi-squared test:
1. Chi-square test p-values rely on Normal
approximations, so they are not valid for small
samples (any expected frequencies < 5).
2. Reject H0 at significance level α if the Chi-
square statistic is greater than the 100(1- α)th
percentile of the Chi-square distribution (i.e. not
α/2).
3. The null hypothesis for the Chi-square test can
be equivalently formulated as “X1 is
independent of X2”, where X1 and X2 are the
two categorical variables being compared
(gender and perio status in our example).
4. When comparing two proportions the Chi-
square test is equivalent to Z-test for two
proportions.
5. The Z-test for two proportions can be
formulated as a one-sided test, but the Chi-
square test cannot.
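Note 4 can be checked numerically on the caries data (10 of 36 controls, 6 of 68 intervention): the square of the pooled two-sample Z statistic equals the Pearson Chi-square statistic of the 2×2 table. A minimal sketch, with illustrative function names:

```python
from math import sqrt

def z_statistic(x1, n1, x2, n2):
    """Two-sample Z statistic using the pooled standard error under H0."""
    pooled = (x1 + x2) / (n1 + n2)
    se0 = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se0

def chi_square_statistic(observed):
    """Pearson X^2: sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return sum(
        (obs - r * c / n) ** 2 / (r * c / n)
        for row, r in zip(observed, row_totals)
        for obs, c in zip(row, col_totals)
    )

z = z_statistic(10, 36, 6, 68)
x2 = chi_square_statistic([[10, 26], [6, 62]])   # rows: control, intervention
print(round(z ** 2, 3), round(x2, 3))
```

The sign of Z is lost when squaring, which is one way to see why the Chi-square test cannot be made one-sided (note 5).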
Does Normality assumption hold?
Fisher’s Exact Test
Does not rely on Normality assumption.
Uses “exact” distribution instead of a Normal
approximation.
Use in place of χ2 test when any expected cell
frequency is less than 5.
Example: caries incidence
                  Caries
                yes   no   total
control          10   26      36
intervention      6   62      68
total            16   88     104
The null-hypothesis of Fisher’s Exact test is that
there is no relationship between the two
characteristics. Thus, every possible arrangement of
observations in the respective cells is equally likely
(but assuming row and column totals don’t change).
The p-value is computed by counting the possible
arrangements of observations that produce tables
at least as extreme as the observed table and then
dividing this by the total number of possible
arrangements of the observations.
Example: Caries incidence
Observed table
                  Caries
                yes   no   total
control          10   26      36
intervention      6   62      68
total            16   88     104
$\hat{p}_c - \hat{p}_i$ = 0.19; probability of table under H0: $P_{H_0}$ = 0.01048

Tables more extreme (result in greater difference in proportions).
Row totals (36, 68) and column totals (16, 88) are the same in every table.
control (yes, no)   intervention (yes, no)   $\hat{p}_c - \hat{p}_i$   $P_{H_0}$
11, 25               5, 63                    0.23                     0.00236
12, 24               4, 64                    0.27                     0.00038
13, 23               3, 65                    0.32                     0.00004
14, 22               2, 66                    0.36                     0.00000
15, 21               1, 67                    0.40                     0.00000
16, 20               0, 68                    0.44                     0.00000
 0, 36              16, 52                   -0.24                     0.00055
 1, 35              15, 53                   -0.19                     0.00602
Total probability of all as or more extreme tables = 0.01985
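The enumeration above can be sketched in code. With the margins fixed, a table is determined by its top-left cell, and its probability under H0 is hypergeometric. The function names are illustrative. "Extreme" is defined here by the absolute difference in proportions, as in the tables above; statistical software often ranks tables by their probability instead, which selects the same tables for these data.

```python
from math import comb

def table_probability(a, row1, row2, col1):
    """Probability under H0 of the 2x2 table with top-left cell a and fixed margins."""
    return comb(row1, a) * comb(row2, col1 - a) / comb(row1 + row2, col1)

def fisher_exact_two_sided(a, b, c, d):
    """Sum the probabilities of all tables at least as extreme as (a, b, c, d)."""
    row1, row2, col1 = a + b, c + d, a + c
    observed_diff = abs(a / row1 - c / row2)
    total = 0.0
    # The top-left cell can range over every value compatible with the margins.
    for k in range(max(0, col1 - row2), min(row1, col1) + 1):
        diff = abs(k / row1 - (col1 - k) / row2)
        if diff >= observed_diff - 1e-12:   # tolerance guards against float ties
            total += table_probability(k, row1, row2, col1)
    return total

p_observed = table_probability(10, 36, 68, 16)
p_value = fisher_exact_two_sided(10, 26, 6, 62)
print(f"P(observed table) = {p_observed:.5f}, two-sided p-value = {p_value:.5f}")
```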
Formula for Probability of Table in Fisher’s Exact Test
Table
a  b
c  d
Probability
$$P = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{n!\,a!\,b!\,c!\,d!}$$
where n = a + b + c + d.
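The factorial formula can be evaluated directly with exact integer arithmetic. The sketch below (function name illustrative) applies it to the observed caries table, where a = 10, b = 26, c = 6, d = 62, and compares it with the equivalent form written with binomial coefficients.

```python
from math import comb, factorial

def table_probability_factorial(a, b, c, d):
    """Probability of a 2x2 table under H0 from the factorial formula."""
    n = a + b + c + d
    numerator = (factorial(a + b) * factorial(c + d)
                 * factorial(a + c) * factorial(b + d))
    denominator = (factorial(n) * factorial(a) * factorial(b)
                   * factorial(c) * factorial(d))
    # Both terms are exact integers; only the final ratio is a float.
    return numerator / denominator

prob = table_probability_factorial(10, 26, 6, 62)
# Same quantity written with binomial coefficients (hypergeometric form).
prob_comb = comb(36, 10) * comb(68, 6) / comb(104, 16)
print(f"{prob:.5f} {prob_comb:.5f}")
```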
SPSS output
treatment group * caries at age two Crosstabulation
Count
                     caries at age two
treatment group      no    yes    Total
controls             26     10       36
intervention         62      6       68
Total                88     16      104

Chi-Square Tests
                                Value      df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                                (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square              6.496(b)   1    .011
Continuity Correction(a)        5.122      1    .024
Likelihood Ratio                6.171      1    .013
Fisher's Exact Test                                           .020         .013
Linear-by-Linear Association    6.434      1    .011
McNemar Test                                                  .000(c)
N of Valid Cases                104
a. Computed only for a 2x2 table
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is
5.54.
c. Binomial distribution used.
McNemar’s Test for Proportions (Paired Data)
Use for comparing proportions from paired data
Example: Change in plaque index
Fifty-three study participants assessed twice for plaque
index (PI), at baseline and 4 weeks later. We wish to assess
whether the proportion of patients with high PI changes.
                   PI at 4 weeks
PI at baseline     low    high
low                 29       1
high                13      10
Incorrect methods:
1. Comparing $\hat{p}_1 = 23/53$ with $\hat{p}_2 = 11/53$ using the Z-test:
This will not give a valid p-value because it does not
compare independent samples. The same 53 people
are used in each proportion.
2. Performing a Chi-square or Fisher’s Exact test on the
above 2×2 table: These would test whether the
proportion of high’s at baseline is related to the
proportion of high’s at 4 weeks. They would not test
whether or not the proportions are different.
McNemar’s Test assesses the null hypothesis
H0: P(PI high at baseline) = P(PI high at 4 weeks),
by noting that it is equivalent to:
H0: P(PI changes high to low) = P(PI changes low to high),
for all discordant pairs.
The discordant pairs are those that have different
values for the two observations. Note that each entry
in the table is the number of pairs.
The latter H0 can be evaluated using a one-sample
test for proportions with,
H0: p = 0.50, vs. H1: p ≠ 0.50,
where p = proportion of discordant pairs that increase (change from low to high).
If n ≥ 20 (where n is the number of discordant pairs), the
Z-test for proportions (chapter 9.3) can be used.
If n < 20 (as in the current example, n = 14), use the
binomial distribution to compute the exact p-value.
Let X = number of discordant pairs that increase,
which, under H0, is binomial(n = 14, p = 0.5).
The two-sided p-value is the probability that we
would see a sample of the discordant pairs at least
as unbalanced as 13 vs 1, which is
P(X ≤ 1) + P(X ≥ 13)
= P(X=0) + P(X=1) + P(X=13) + P(X=14)
= 0.0001 + 0.0009 + 0.0009 + 0.0001
= 0.0020 (0.0018 if the individual terms are not rounded first)
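The exact calculation can be sketched with the binomial(n, 0.5) distribution of the discordant pairs. The function name and argument names are illustrative; the counts 1 (low to high) and 13 (high to low) come from the table. Without rounding the individual terms the result is about 0.0018, which agrees with the .002 in the SPSS output.

```python
from math import comb

def mcnemar_exact(low_to_high, high_to_low):
    """Two-sided exact McNemar p-value based on the discordant pairs only."""
    n = low_to_high + high_to_low
    k = min(low_to_high, high_to_low)
    # P(X <= k) for X ~ binomial(n, 0.5); the distribution is symmetric,
    # so the two-sided p-value is twice this tail (capped at 1).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

p_value = mcnemar_exact(low_to_high=1, high_to_low=13)
print(f"exact two-sided p-value = {p_value:.4f}")
```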
SPSS output
baseline PI * four week PI Crosstabulation
Count
                 four week PI
baseline PI      low    high    Total
low               29       1       30
high              13      10       23
Total             42      11       53

Chi-Square Tests
                                Value       df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                                 (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square              12.757(b)   1    .000
Continuity Correction(a)        10.433      1    .001
Likelihood Ratio                13.872      1    .000
Fisher's Exact Test                                            .000         .000
Linear-by-Linear Association    12.516      1    .000
McNemar Test                                                   .002(c)
N of Valid Cases                53
a. Computed only for a 2x2 table
b. 1 cells (25.0%) have expected count less than 5. The minimum expected count is
4.77.
c. Binomial distribution used.
Analysis of Categorical Data Summary
Proportions from two independent samples
  Large samples – Z-test for proportions
  Small samples – Fisher’s Exact Test
Proportions from > 2 independent samples
  Chi-square test
Proportions from paired data
  McNemar’s Test