Upload
lenhan
View
222
Download
0
Embed Size (px)
Citation preview
The Chi-Square Statistic
Nonparametric Tests for Goodness of Fit and
Independence
01:830:200:10-13 Spring 2013
Chi-Square Tests
Nominal Data and the Chi-Square Test
• Early in the semester, we made a distinction between two
types of data:
– Measurement data, which represents individual items or events in terms
of measured scores
– Categorical data, which represents frequencies or counts of
observations in different categories
• Thus far, our focus has been almost exclusively on testing
hypotheses about measurement data. The chi-square tests
introduced in this lecture are used to test hypotheses about
categorical data
01:830:200:10-13 Spring 2013
Chi-Square Tests
The Chi-Square Test for Goodness-of-Fit
• The chi-square test for goodness-of-fit uses frequency
data from a sample to test hypotheses about the shape or
proportions of a population.
• Each individual in the sample is classified into one category
on the scale of measurement.
• The data, called observed frequencies, simply count how
many individuals from the sample are in each category.
01:830:200:10-13 Spring 2013
Chi-Square Tests
The Chi-Square Test for Goodness-of-Fit
• The null hypothesis specifies the proportion of the population
that should be in each category.
• The proportions from the null hypothesis are used to compute
expected frequencies that describe how the sample would
appear if it were in perfect agreement with the null hypothesis.
01:830:200:10-13 Spring 2013
Chi-Square Tests
• For the goodness of fit test, the expected frequency for
each category is obtained by
(p is the proportion from the null hypothesis and N is the size of the
sample)
Ef Np
The Chi-Square Test for Goodness-of-Fit
01:830:200:10-13 Spring 2013
Chi-Square Tests
Example 1: Uniform Expected Frequencies
For 3 possible categories, if the player is throwing symbols randomly, then
we should expect N/3 occurrences per category.
For N = 75, we would expect 25 occurrences per category.
Number of symbols thrown in a game of
rock/paper/scissors
Symbol Rock Paper Scissors Total
Observed 30 21 24 75
Expected 25 25 25
01:830:200:10-13 Spring 2013
Chi-Square Tests
Example 2: Non-uniform Expected Frequencies
Official M&M's Color
Distribution
Color p
brown 0.3
red 0.2
blue 0.1
orange 0.1
green 0.1
yellow 0.2
Color Brown Red Blue Orange Green Yellow Total
Observed 14 6 7 4 9 13 53
Expected 15.9 10.6 5.3 5.3 5.3 10.6
Number of M&Ms of each color in a sample bag
01:830:200:10-13 Spring 2013
Chi-Square Tests
The Chi-Square Test for Independence
• The second chi-square test, the chi-square test for
independence, can be used and interpreted in two different
ways:
1. Testing hypotheses about the relationship between two variables in a
population
H0 : There is no relationship between factor A and factor B
(This interpretation is analogous to correlation)
2. Testing hypotheses about differences between proportions for two or
more populations.
H0 : There is no difference between the distribution of factor A under different
levels of factor B
(This interpretation is analogous to interaction in factorial ANOVAs)
01:830:200:10-13 Spring 2013
Chi-Square Tests
The Chi-Square Test for Independence
• The data for a chi-square test for independence are usually
organized in a matrix with the categories for one variable
defining the rows and the categories of the second variable
defining the columns.
• These matrices are called contingency tables
– They were introduced earlier in the semester, during the Basic
Probability lecture
01:830:200:10-13 Spring 2013
Chi-Square Tests
The Chi-Square Test for Independence
Outcome
Treatment Success Relapse Total
Drug 13 36 49
Placebo 14 30 44
Total 27 66 93
Frequency of successes and relapses for anorexic
patients treated with Prozac or a placebo
01:830:200:10-13 Spring 2013
Chi-Square Tests
The Chi-Square Test for Independence
• The data, called observed frequencies, simply show how
many individuals from the sample fall into each cell (i.e.,
combination of factor levels) of the matrix.
• The null hypothesis for this test states that there is no
relationship between the two variables
– In other words, the two variables are independent.
01:830:200:10-13 Spring 2013
Chi-Square Tests
The Chi-Square Statistic
• Both chi-square tests use the same statistic. The calculation of the chi-square statistic requires two steps:
1. The null hypothesis is used to construct an idealized sample
distribution of expected frequencies that describes how the sample would look if the data were in perfect agreement with the null hypothesis.
2. A chi-square statistic is computed to measure the amount of discrepancy between the ideal sample (expected frequencies from H0) and the actual sample data (the observed frequencies = fo).
• A large discrepancy results in a large value for chi-square and indicates that the data do not fit the null hypothesis and the hypothesis should be rejected.
01:830:200:10-13 Spring 2013
Chi-Square Tests
Computing Expected Frequencies
For the goodness of fit test, the expected frequency is computed as
For the test for independence, the expected frequency for each cell in the matrix is computed as
C R
E C R
ff N p
fp
N
Ef Np
01:830:200:10-13 Spring 2013
Chi-Square Tests
Outcome
Treatment Success Relapse Total
Drug P(success,drug) P(relapse,drug) P(drug)
Placebo P(success,placebo) P(relapse,placebo) P(placebo)
Total P(success) P(relapse) 1
To understand the intuition behind this, consider the joint and marginal
probabilities underlying the frequency distribution:
Remember from earlier in the semester that if two factors (treatment &
outcome) are independent, then:
outcome, treatment outcome treatmentP P P
01:830:200:10-13 Spring 2013
Chi-Square Tests
Computing the Chi-Square Statistic
The calculation of chi-square is the same for all chi-square tests:
However, computation of the degrees of freedom differs
For the goodness-of-fit test:
For the test of independence:
2
2 O E
E
f f
f
# 1df cells
# 1 # 1df rows cols
01:830:200:10-13 Spring 2013
Chi-Square Tests
Example
2
2
2 2 2 2 2 2
5
14 15.9 6 10.6 7 5.3 4 5.3 9 5.3 13 10.6
15.9 10.6 5.3 5.3 5.3 10.6
0.23 2.00 0.55 0.32 2 6.21.58 0.54
O E
E
f f
f
Official M&M's Color
Distribution
Color p
brown 0.3
red 0.2
blue 0.1
orange 0.1
green 0.1
yellow 0.2
Color Brown Red Blue Orange Green Yellow Total
Observed 14 6 7 4 9 13 53
Expected 15.9 10.6 5.3 5.3 5.3 10.6
Number of M&Ms of each color in a sample bag
01:830:200:10-13 Spring 2013
Chi-Square Tests
χ2 Distribution Upper Tail Probability
df 0.1 0.05 0.025 0.01
1 2.71 3.84 5.02 6.63
2 4.61 5.99 7.38 9.21
3 6.25 7.81 9.35 11.34
4 7.78 9.49 11.14 13.28
5 9.24 11.07 12.83 15.09
6 10.64 12.59 14.45 16.81
7 12.02 14.07 16.01 18.48
8 13.36 15.51 17.53 20.09
9 14.68 16.92 19.02 21.67
10 15.99 18.31 20.48 23.21
11 17.28 19.68 21.92 24.72
12 18.55 21.03 23.34 26.22
13 19.81 22.36 24.74 27.69
14 21.06 23.68 26.12 29.14
15 22.31 25.00 27.49 30.58
16 23.54 26.30 28.85 32.00
17 24.77 27.59 30.19 33.41
18 25.99 28.87 31.53 34.81
19 27.20 30.14 32.85 36.19
20 28.41 31.41 34.17 37.57
30 40.26 43.77 46.98 50.89
40 51.81 55.76 59.34 63.69
50 63.17 67.50 71.42 76.15
60 74.40 79.08 83.30 88.38
70 85.53 90.53 95.02 100.43
80 96.58 101.88 106.63 112.33
90 107.57 113.15 118.14 124.12
100 118.50 124.34 129.56 135.81
01:830:200:10-13 Spring 2013
Chi-Square Tests
Example
2
2
2 2 2 2 2 2
5
14 15.9 6 10.6 7 5.3 4 5.3 9 5.3 13 10.6
15.9 10.6 5.3 5.3 5.3 10.6
0.23 2.00 0.55 0.32 2 6.21.58 0.54
O E
E
f f
f
Official M&M's Color
Distribution
Color p
brown 0.3
red 0.2
blue 0.1
orange 0.1
green 0.1
yellow 0.2
Color Brown Red Blue Orange Green Yellow Total
Observed 14 6 7 4 9 13 53
Expected 15.9 10.6 5.3 5.3 5.3 10.6
Number of M&Ms of each color in a sample bag
06.21 11.07; accept H
The frequency distribution of colors in our bag is not significantly different from the
distribution expected based on M&M’s published proportions
01:830:200:10-13 Spring 2013
Chi-Square Tests
Full Example (taken from earlier lecture)
Race Yes No Total
Black 95 425 520
Nonblack 19 128 147
Total 114 553 667
Death Sentence
Event Frequencies
Event Probabilities (f/N)
• Are race of the defendant and
application of the death sentence
independent?
0.142
0.78
black,death
black death 0.171 0.133
black,death black deat
0
h
P
P P
P P P
Race Yes No Total
Black 0.142 0.637 0.780
Nonblack 0.028 0.192 0.220
Total 0.171 0.829 1.000
Death Sentence
01:830:200:10-13 Spring 2013
Chi-Square Tests
Full Example:
Race Yes No Total
Black 95 425 520
Nonblack 19 128 147
Total 114 553 667
Death Sentence
Observed Event Frequencies
114black,yes
553black,no
147 114nonblack,yes
52088.88
667
520431.12
667
25.12667
12114
.88667
7 553nonblack,no
E
E
E
E
f
f
f
f
01:830:200:10-13 Spring 2013
Chi-Square Tests
Race Yes No Total
Black 95 425 520
Nonblack 19 128 147
Total 114 553 667
Death Sentence
Observed Event Frequencies
Race Yes No Total
Black 88.88 431.12 520
Nonblack 25.12 121.88 147
Total 114 553 667
Death Sentence
Expected Event Frequencies
2
2
2 2 2 2
1
95 88.88 425 431.12 19 25.12 128 121.88
88.88 431.12 25.12 121.88
0.42 0.09 1.49 0 231 1. .3
O E
E
f f
f
01:830:200:10-13 Spring 2013
Chi-Square Tests
χ2 Distribution Upper Tail Probability
df 0.1 0.05 0.025 0.01
1 2.71 3.84 5.02 6.63
2 4.61 5.99 7.38 9.21
3 6.25 7.81 9.35 11.34
4 7.78 9.49 11.14 13.28
5 9.24 11.07 12.83 15.09
6 10.64 12.59 14.45 16.81
7 12.02 14.07 16.01 18.48
8 13.36 15.51 17.53 20.09
9 14.68 16.92 19.02 21.67
10 15.99 18.31 20.48 23.21
11 17.28 19.68 21.92 24.72
12 18.55 21.03 23.34 26.22
13 19.81 22.36 24.74 27.69
14 21.06 23.68 26.12 29.14
15 22.31 25.00 27.49 30.58
16 23.54 26.30 28.85 32.00
17 24.77 27.59 30.19 33.41
18 25.99 28.87 31.53 34.81
19 27.20 30.14 32.85 36.19
20 28.41 31.41 34.17 37.57
30 40.26 43.77 46.98 50.89
40 51.81 55.76 59.34 63.69
50 63.17 67.50 71.42 76.15
60 74.40 79.08 83.30 88.38
70 85.53 90.53 95.02 100.43
80 96.58 101.88 106.63 112.33
90 107.57 113.15 118.14 124.12
100 118.50 124.34 129.56 135.81
01:830:200:10-13 Spring 2013
Chi-Square Tests
Race Yes No Total
Black 95 425 520
Nonblack 19 128 147
Total 114 553 667
Death Sentence
Observed Event Frequencies
Race Yes No Total
Black 88.88 431.12 520
Nonblack 25.12 121.88 147
Total 114 553 667
Death Sentence
Expected Event Frequencies
2
2
2 2 2 2
1
95 88.88 425 431.12 19 25.12 128 121.88
88.88 431.12 25.12 121.88
0.42 0.09 1.49 0 231 1. .3
O E
E
f f
f
02.31 3.84; accept H
These data do not indicate a significant relationship between race and sentencing
in death penalty trials. Or,
These data are not sufficient to conclude that black defendents are sentenced at a
different rate than nonblack defendants in death penalty trials.
01:830:200:10-13 Spring 2013
Chi-Square Tests
The Chi-Square Test for Independence: Effect Size
• When both variables in the chi-square test for independence
consist of exactly two categories (the data form a 2 x 2
matrix), it is possible to re-code the categories as 0 and 1 for
each variable and then compute a correlation known as a phi-
coefficient that measures the strength of the relationship.
• We discussed the phi-coefficient earlier in the lecture on
correlation. In the 2x2 case, the phi-coefficient is computed in
exactly the same manner as Pearson’s r.
• The value of the phi-coefficient, or the squared value which is
equivalent to an r2, is used to measure the effect size.
01:830:200:10-13 Spring 2013
Chi-Square Tests
The Chi-Square Test for Independence: Effect Size
• In the more general case (i.e., for tables larger than 2x2), the
square of the phi-coefficient can be computed as:
• The value of the phi-coefficient, or the squared value, which is
analogous to an r2, is used to measure the effect size.
22
N
01:830:200:10-13 Spring 2013
Chi-Square Tests
Example: Effect Size
22
2.310.0034
667
0.06
N
01:830:200:10-13 Spring 2013
Chi-Square Tests
χ2 Distribution Upper Tail Probability
df 0.1 0.05 0.025 0.01
1 2.71 3.84 5.02 6.63
2 4.61 5.99 7.38 9.21
3 6.25 7.81 9.35 11.34
4 7.78 9.49 11.14 13.28
5 9.24 11.07 12.83 15.09
6 10.64 12.59 14.45 16.81
7 12.02 14.07 16.01 18.48
8 13.36 15.51 17.53 20.09
9 14.68 16.92 19.02 21.67
10 15.99 18.31 20.48 23.21
11 17.28 19.68 21.92 24.72
12 18.55 21.03 23.34 26.22
13 19.81 22.36 24.74 27.69
14 21.06 23.68 26.12 29.14
15 22.31 25.00 27.49 30.58
16 23.54 26.30 28.85 32.00
17 24.77 27.59 30.19 33.41
18 25.99 28.87 31.53 34.81
19 27.20 30.14 32.85 36.19
20 28.41 31.41 34.17 37.57
30 40.26 43.77 46.98 50.89
40 51.81 55.76 59.34 63.69
50 63.17 67.50 71.42 76.15
60 74.40 79.08 83.30 88.38
70 85.53 90.53 95.02 100.43
80 96.58 101.88 106.63 112.33
90 107.57 113.15 118.14 124.12
100 118.50 124.34 129.56 135.81