norman-marsh
Copyright © 2010, 2007, 2004 Pearson Education, Inc. 11.1 - 1
Chapter 11
Goodness-of-Fit Test (Section 11.2)
Preview
We focus on the analysis of categorical (qualitative or attribute) data that can be separated into different categories (often called cells).
Hypothesis test: do the observed counts agree with some claimed distribution?
Use the χ² (chi-square) distribution.
Example
There are four blood types: A, B, AB, O.
A sample of 100 patients had their blood type identified. Determine what percentage of people have each blood type.

Blood type   Observed frequency
A            42
B            9
AB           6
O            43
Definition
A goodness-of-fit test is used to test the hypothesis that an observed frequency distribution fits (or conforms to) some claimed distribution.
Example
The claimed frequency distribution of blood types is as follows:
40% type A
11% type B
4% type AB
45% type O
Source: http://www.redcrossblood.org/learn-about-blood/blood-types
Goodness-of-Fit Test: How it works
We compare each category's observed frequency with the frequency we would expect under the claimed distribution.
To determine whether the fit is good, we look at how much the observed and expected frequencies differ.
The test uses the relative variance between the sample and the claimed distribution: each squared difference is measured relative to the expected frequency.
If the claimed distribution fits, this relative variance should be small.
Goodness-of-Fit Test: Requirements
1. The data are randomly selected.
2. The sample data consist of frequency counts for each of the different categories.
3. For each category, the expected frequency is at least 5.
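Requirement 3 can be checked mechanically before running the test. The following is a minimal sketch: the helper name `categories_below_minimum` is hypothetical, and the inputs are the sample size and claimed percentages from the blood type example.

```python
def categories_below_minimum(n, claimed_percent, minimum=5):
    """Return the categories whose expected frequency E = n * p is below `minimum`."""
    too_small = {}
    for category, percent in claimed_percent.items():
        expected = n * percent / 100  # E = n * p, with p given as a percentage
        if expected < minimum:
            too_small[category] = expected
    return too_small


# Claimed blood type distribution (percent) and sample size from the example.
claimed_percent = {"A": 40, "B": 11, "AB": 4, "O": 45}
flagged = categories_below_minimum(100, claimed_percent)
print(flagged)  # {'AB': 4.0}
```

With n = 100, type AB has an expected frequency of 4, slightly under the threshold of 5, so results for that example should be read with some caution.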
Goodness-of-Fit Test: Notation
O          Observed frequency of an outcome
E          Expected frequency of an outcome
k          Number of different categories (or outcomes)
n          Total number of trials
p1 … pk    Relative frequencies of each category
Goodness-of-Fit Test Statistic
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
This is the relative variance between the expected frequencies and the observed frequencies.
Follows a χ²-distribution (k - 1 degrees of freedom).
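As a rough illustration of the test statistic, the sum of (O - E)² / E over all categories can be computed in a few lines. The function name `chi_square_statistic` and the three-category counts are hypothetical and are not taken from the slides.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for three categories (not from the slides).
observed = [18, 22, 20]
expected = [20, 20, 20]

statistic = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1  # k - 1

print(statistic, degrees_of_freedom)
```

Each term is divided by its own expected frequency, so a difference of 2 counts matters more in a category where few observations are expected than in one where many are.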
Expected Frequencies
If expected frequencies are all equal:
$$E = \frac{n}{k}$$
If expected frequencies are not all equal:
$$E = np$$ (computed for each category with its own p)
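Both cases can be sketched as follows. The function names are hypothetical; the numbers come from the two examples in these slides (80 recorded weights with 10 possible last digits, and 100 patients with the claimed blood type percentages).

```python
def expected_equal(n, k):
    """Expected frequency when all k categories are equally likely: E = n / k."""
    return [n / k] * k


def expected_from_percent(n, percents):
    """Expected frequencies when categories differ: E = n * p for each category."""
    return [n * pct / 100 for pct in percents]


# Uniform case: 80 trials spread over 10 categories.
uniform = expected_equal(80, 10)
print(uniform)  # ten entries of 8.0

# Non-uniform case: 100 trials with claimed percentages for A, B, AB, O.
blood = expected_from_percent(100, [40, 11, 4, 45])
print(blood)  # [40.0, 11.0, 4.0, 45.0]
```

In both cases the expected frequencies add up to n, which is a quick sanity check on the arithmetic.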
Goodness-of-Fit Test: Step-by-step guide
1. Write what you know
2. State the claim
3. State the null and alternative hypotheses (H0 and H1)
4. Draw a diagram (Note: only use a right-tailed test for Goodness-of-Fit)
5. Calculate the test statistic and critical value
6. Make an initial conclusion (reject or fail to reject H0)
7. Make a final conclusion
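Steps 5 and 6 of the guide can be sketched in code. This is only an illustration under stated assumptions: the significance level of 0.05 is assumed (the slides do not fix one), the critical values are copied from a standard chi-square table, and the function name and the die-rolling data are hypothetical.

```python
# Right-tail critical values of the chi-square distribution for alpha = 0.05,
# copied from a standard chi-square table (keys are degrees of freedom).
CRITICAL_VALUES_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070,
                      6: 12.592, 7: 14.067, 8: 15.507, 9: 16.919, 10: 18.307}


def goodness_of_fit(observed, expected):
    """Steps 5 and 6: test statistic, critical value, initial conclusion."""
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    critical = CRITICAL_VALUES_05[df]
    # Right-tailed test: reject H0 only when the statistic exceeds the critical value.
    decision = "reject H0" if statistic > critical else "fail to reject H0"
    return statistic, critical, decision


# Hypothetical data: a die rolled 60 times (not from the slides).
# H0: all six faces are equally likely. H1: at least one face differs.
observed = [8, 12, 9, 11, 10, 10]
expected = [60 / 6] * 6

statistic, critical, decision = goodness_of_fit(observed, expected)
print(f"chi-square = {statistic:.3f}, critical value = {critical}, {decision}")
```

The test is right-tailed because only large values of the statistic indicate a poor fit; a value near zero means the observed and expected counts nearly coincide.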
Figure 11-2
Example 1
There are four blood types: A, B, AB, O.
A sample of 100 patients had their blood type identified. Determine if the claimed frequency distribution fits this data.

Blood type   Observed frequency   Claimed frequency
A            42                   40%
B            9                    11%
AB           6                    4%
O            43                   45%
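For readers without StatCrunch, the calculation for Example 1 can be approximated by hand. The sketch below uses the observed counts and claimed percentages from the table; the significance level of 0.05 and the corresponding table value 7.815 (3 degrees of freedom) are assumptions, since the slides do not state a level.

```python
categories = ["A", "B", "AB", "O"]
observed = [42, 9, 6, 43]
claimed_percent = [40, 11, 4, 45]
n = sum(observed)  # 100 patients

# E = n * p for each blood type.
expected = [n * pct / 100 for pct in claimed_percent]

# Contribution of each category to the statistic: (O - E)^2 / E.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
statistic = sum(contributions)

for name, o, e, c in zip(categories, observed, expected, contributions):
    print(f"{name:>2}  O={o:>2}  E={e:>4.1f}  (O-E)^2/E={c:.4f}")

CRITICAL_VALUE = 7.815  # chi-square table, 3 degrees of freedom, alpha = 0.05 (assumed)
print(f"chi-square = {statistic:.4f}")
print("reject H0" if statistic > CRITICAL_VALUE else "fail to reject H0")
```

Under the assumed level, the statistic (about 1.55) is well below the critical value, so this sample gives no evidence against the claimed distribution.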
Example 1: Using StatCrunch (see video tutorial for details)
Example 2
To check whether collected data have been accurately recorded and not made up, one can check whether the last digit (between 0 and 9) is uniformly distributed. A set of weights was recorded, and the frequency of each last digit was tallied. Test whether these data are uniformly distributed.
Example 2: Using StatCrunch

Last digit   Frequency
0            7
1            14
2            6
3            10
4            8
5            4
6            5
7            6
8            12
9            8
Example 2: Result
Total number of trials: n = 80
Test statistic: χ² = 11.25
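The total of 80 and the value 11.25 shown for Example 2 can be reproduced by hand. The sketch below uses the last-digit frequencies from the table; the significance level of 0.05 and the corresponding table value 16.919 (9 degrees of freedom) are assumptions, since the slides do not state a level.

```python
# Frequencies of last digits 0 through 9, as listed in Example 2.
observed = [7, 14, 6, 10, 8, 4, 5, 6, 12, 8]

n = sum(observed)   # total number of recorded weights
k = len(observed)   # ten possible last digits
expected = n / k    # uniform claim, so E = n / k for every digit

statistic = sum((o - expected) ** 2 / expected for o in observed)
df = k - 1

CRITICAL_VALUE = 16.919  # chi-square table, 9 degrees of freedom, alpha = 0.05 (assumed)

print(n, expected, statistic, df)
print("reject H0" if statistic > CRITICAL_VALUE else "fail to reject H0")
```

Under the assumed level, 11.25 does not exceed the critical value, so the data are consistent with uniformly distributed last digits.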