Goodness of Fit Test
Lecture 23
March 19, 2018
Four Stages of Statistics
• Data Collection ✓
• Displaying and Summarizing Data ✓
• Probability ✓
• Inference
• One Quantitative ✓
• One Categorical
• One-Sample Proportion Test ✓
• Goodness of Fit Test
• One Categorical and One Quantitative
• Two Categorical
• Two Quantitative
Chi-Squared Distribution
• Chi-Squared Distribution: continuous probability distribution with the following properties:
• Unimodal and right-skewed
• Always non-negative
• Mean equal to number of degrees of freedom
• Changes shape depending on degrees of freedom
• Becomes less right-skewed as df increase
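The properties above can be checked with a small simulation. This is only a sketch: it assumes Python's standard library, relies on the fact that a chi-squared distribution with df degrees of freedom is a gamma distribution with shape df/2 and scale 2, and the helper names (`chi_squared_draws`, `sample_skewness`) are made up for illustration.

```python
import random
import statistics

def chi_squared_draws(df, size, seed=0):
    """Draw chi-squared values; chi-squared(df) is a gamma distribution
    with shape df/2 and scale 2."""
    rng = random.Random(seed)
    return [rng.gammavariate(df / 2, 2) for _ in range(size)]

def sample_skewness(values):
    """Average cubed z-score; a positive value means right-skewed."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

summary = {}
for df in (2, 4, 20):
    draws = chi_squared_draws(df, size=20000)
    summary[df] = {
        "min": min(draws),
        "mean": statistics.fmean(draws),
        "skewness": sample_skewness(draws),
    }
    # The mean should land near df and the skewness should shrink as df grows.
    print(df, round(summary[df]["mean"], 2), round(summary[df]["skewness"], 2))
```

The exact numbers depend on the seed, so none are quoted here; the pattern to look for is a non-negative minimum, a mean close to df, and a positive skewness that gets smaller as df increases.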
Examples of Chi-Squared Distribution
Example of Chi-Squared Table
Example #1: Using Chi-Squared Table
• Question: What is the chi-squared statistic with 4 degrees of freedom corresponding to 10% of the area in the upper tail?
• Answer: ___________________
______
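A table lookup like the one in Example #1 can be cross-checked numerically. The sketch below is an illustration, not the lecture's method: it uses the closed-form upper-tail area that holds only when df is even, and finds the cutoff by bisection. The function names are hypothetical.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """Upper-tail area P(X > x) for a chi-squared variable with EVEN df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j          # builds (x/2)^j / j!
        total += term
    return math.exp(-half) * total

def chi2_critical_value(area, df, lo=0.0, hi=200.0, tol=1e-9):
    """Bisection: find the x whose upper-tail area equals `area`."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_upper_tail_even_df(mid, df) > area:
            lo = mid              # tail area still too large, move right
        else:
            hi = mid
    return (lo + hi) / 2.0

critical_value = chi2_critical_value(0.10, 4)
print(round(critical_value, 3))   # 7.779
```

The printed value should agree with the table entry for 4 degrees of freedom and an upper-tail area of .10.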
Motivation: Goodness of Fit Test
• Scenario: Birthday effect of 400 HS football players across four seasons
• Question: Are the birthdays evenly distributed across all seasons using $\alpha = .05$?
• Problem:
• Variable has _________________________________
• One-sample proportion test can only __________________ ____________________________
Spring Summer Fall Winter
Sample Prop. $\hat{p}_1 = .23$ $\hat{p}_2 = .20$ $\hat{p}_3 = .30$ $\hat{p}_4 = .27$
Hypothesized Prop. .25 .25 .25 .25
Goodness of Fit Test: Hypotheses
• Used For:
• Determining if collected data is consistent with a specified probability distribution
• Hypotheses:
• $H_0$: The data is consistent with the specified distribution.
• $H_a$: At least one $p_i$ differs from its hypothesized value
Example #2: Doing Hypothesis Test
• Question: Are the birthdays evenly distributed across all seasons using $\alpha = .05$?
• Hypothesis Test:
1. Hypotheses:
• $H_0$: ______________________________________________________________
• $H_a$: ______________________________________________________________
_____________________________
Spring Summer Fall Winter
Observed Counts 92 80 120 108
Hypothesized Prop. .25 .25 .25 .25
Goodness of Fit Test: Conditions
• Conditions:
• Expected counts for each category at least 5
• Expected Counts: number of observations we would expect in each category if $H_0$ is true
• Hypothesized proportion for each category: $p_i$
• Total sample size: $n$
• Expected count for category $i$: $n p_i$
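The expected-count rule can be sketched in a few lines of Python. The observed counts are the birthday data used in Example #2; the function names and the `minimum` parameter are illustrative assumptions.

```python
def expected_counts(hypothesized_props, n):
    """Expected count for category i is n * p_i."""
    return [n * p for p in hypothesized_props]

def condition_met(expected, minimum=5):
    """Condition: every expected count is at least `minimum`."""
    return all(e >= minimum for e in expected)

observed = [92, 80, 120, 108]             # Spring, Summer, Fall, Winter
hypothesized = [0.25, 0.25, 0.25, 0.25]   # evenly distributed under H0

n = sum(observed)                         # total sample size, 400
expected = expected_counts(hypothesized, n)

print(n, expected, condition_met(expected))
```

Note that the condition is checked on the expected counts, not on the observed counts.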
Example #2: Doing Hypothesis Test
• Question: Are the birthdays evenly distributed across all seasons using $\alpha = .05$?
• Hypothesis Test:
2. Conditions:
• All expected counts ______________________
• Goodness of fit test ______________________
Spring Summer Fall Winter
Observed Counts 92 80 120 108
Hypothesized Prop. .25 .25 .25 .25
Expected Counts
Goodness of Fit Test: Test Statistic
• Test Statistic:
$$\chi^2 = \sum_{i=1}^{k} \frac{(\text{Actual}_i - \text{Expected}_i)^2}{\text{Expected}_i}$$
• $k$: number of categories
• Follows chi-squared distribution with $k - 1$ df
• Idea: Test statistic compares observed and expected counts, scaled by the expected count in each category
• Each group makes a contribution to the test statistic
• If expected and actual are close, contribution is small
• If expected and actual are far apart, contribution is large
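To make the per-category contributions concrete, here is a minimal Python sketch using the birthday counts from Example #2. The function name is hypothetical; the arithmetic is just the formula applied term by term.

```python
def chi_squared_contributions(observed, expected):
    """Per-category contribution: (actual - expected)^2 / expected."""
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]

observed = [92, 80, 120, 108]      # Spring, Summer, Fall, Winter
expected = [100, 100, 100, 100]    # n * p_i = 400 * .25

contributions = chi_squared_contributions(observed, expected)
chi_squared = sum(contributions)
degrees_of_freedom = len(observed) - 1   # number of categories minus 1

print([round(c, 2) for c in contributions])        # [0.64, 4.0, 4.0, 0.64]
print(round(chi_squared, 2), degrees_of_freedom)   # 9.28 3
```

Summer and Fall, whose counts are farthest from 100, account for most of the statistic.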
Example #2: Doing Hypothesis Test
• Hypothesis Test:
3. Test Statistic:
$\chi^2$ = ___________________________________________________________
= ___________________________
= __________
Degrees of Freedom: df = __________________
Spring Summer Fall Winter
Observed Counts 92 80 120 108
Expected Counts 100 100 100 100
Goodness of Fit Test: Conclusion
• Decision: Reject $H_0$ for large values of the test statistic
• Large values imply that actual and expected counts are far apart
• Always an upper one-sided test
• Conclusion: If $H_0$ is rejected, at least one category differs from its hypothesized proportion
• Do not know which categories or how many
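The decision rule can be sketched for the birthday example, whose test statistic is 9.28 with 3 df. Two assumptions are made here: the critical value 7.815 is the standard table entry for 3 df with .05 in the upper tail, and the upper-tail formula below is a closed form that is valid for 3 df only. The function name is hypothetical.

```python
import math

def chi2_upper_tail_3df(x):
    """P(X > x) for chi-squared with 3 df (closed form valid for df = 3 only)."""
    z = x / 2.0
    return math.erfc(math.sqrt(z)) + 2.0 * math.sqrt(z / math.pi) * math.exp(-z)

alpha = 0.05
test_statistic = 9.28      # birthday example
critical_value = 7.815     # table value: 3 df, .05 in the upper tail

# Upper one-sided test: reject only when the statistic is large.
reject_by_critical_value = test_statistic > critical_value
p_value = chi2_upper_tail_3df(test_statistic)
reject_by_p_value = p_value < alpha

print(reject_by_critical_value, round(p_value, 4), reject_by_p_value)
# True 0.0258 True
```

Both routes give the same decision, since the p-value falls below $\alpha$ exactly when the statistic exceeds the critical value.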
Example #2: Doing Hypothesis Test
• Hypothesis Test:
4. Critical Value: _______________
5. Decision: ____________________ (________________________)
6. Conclusion: ____________________________________________ __________________________________________________________
[Figure: Test Statistic: _______________, Rejection Region]
Spring Summer Fall Winter
Observed Counts 92 80 120 108
Expected Counts 100 100 100 100
Goodness of Fit vs. Proportion Test
• In one-sample proportion test…
• Categorical variable has 2 categories
• If you hypothesize $p = .60$, hypothesized proportion for the other category is .40
• In goodness of fit test…
• Categorical variable has 3 or more categories
• If you hypothesize $p = .60$ for one category, all other categories add to .40, but we don’t know how that .40 is allocated among them
• Need to check accuracy of all categories simultaneously
Determining Which Categories Differ
• To determine which categories differ from their hypothesized proportions:
• Calculate a confidence interval for each category
• Determine if the interval contains the hypothesized proportion for each
Note: If $H_0$ is rejected, at least one interval will not contain its hypothesized proportion and will be the reason the null hypothesis was rejected.
Example #3: Which Categories Differ
• Scenario: Birthday effect of 400 HS football players across four seasons
• Question: Which categories’ proportions differed significantly from .25?
• Strategy:
• Calculate _________________________________________________ for each category
• Look for __________________________________ in the interval
Spring Summer Fall Winter
Sample Prop. .23 .20 .30 .27
Hypothesized Prop. .25 .25 .25 .25
Example #3: Which Categories Differ
• Scenario: Birthday effect of 400 HS football players across four seasons
Season Sample Prop. Confidence Interval
Spring .23 $.23 \pm 1.96\sqrt{\frac{.23(.77)}{400}}$ = ______________
Summer .20 $.20 \pm 1.96\sqrt{\frac{.20(.80)}{400}}$ = ______________
Fall .30 $.30 \pm 1.96\sqrt{\frac{.3(.7)}{400}}$ = ______________
Winter .27 $.27 \pm 1.96\sqrt{\frac{.27(.73)}{400}}$ = ______________
Example #3: Which Categories Differ
• Scenario: Birthday effect of 400 HS football players across four seasons
• Question: Which categories’ proportions differed significantly from .25?
• Answer: _____________________________________________
Season Sample Prop. Interval Sig. Difference?
Spring .23 (.189, .271) ______
Summer .20 (.161, .239) ______
Fall .30 (.255, .345) ______
Winter .27 (.226, .314) ______
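The intervals in the table can be reproduced with a short Python sketch. It assumes the usual 95% interval for one proportion, $\hat{p} \pm 1.96\sqrt{\hat{p}(1-\hat{p})/n}$ with $n = 400$; the function and variable names are illustrative only.

```python
import math

def proportion_interval(p_hat, n, z=1.96):
    """Confidence interval for one proportion: p_hat +/- z * standard error."""
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

n = 400
hypothesized = 0.25
sample_props = {"Spring": 0.23, "Summer": 0.20, "Fall": 0.30, "Winter": 0.27}

intervals = {}
differs = {}
for season, p_hat in sample_props.items():
    low, high = proportion_interval(p_hat, n)
    intervals[season] = (round(low, 3), round(high, 3))
    # A category differs significantly if .25 falls outside its interval.
    differs[season] = not (low <= hypothesized <= high)
    print(season, intervals[season], differs[season])
```

A category is flagged when .25 lies below the lower endpoint or above the upper endpoint of its interval.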
Example #4: Type I Error
• Results: We found a test statistic of $\chi^2 = 9.28$, which corresponds to a p-value of .0258. This led us to reject $H_0$ at the 5% level of significance.
• Question: What is the probability of making a Type I Error?
• Answer: ___________________________
• Question: What would have happened if we had made a Type I error?
• Answer: Conclude that _____________________________ when in reality ______________________________________ ________________________________________________________
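One way to see what a Type I error rate means is to simulate many samples for which the null hypothesis really is true and count how often the test rejects anyway. The sketch below is an illustration under stated assumptions: 400 birthdays drawn evenly from four seasons, the table critical value 7.815 for 3 df at the 5% level, and hypothetical function names.

```python
import random

def chi_squared_statistic(observed, expected):
    """Sum of (actual - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def simulate_type_one_error_rate(n=400, categories=4, critical_value=7.815,
                                 reps=2000, seed=0):
    """Fraction of samples, generated with H0 TRUE, in which H0 is rejected."""
    rng = random.Random(seed)
    expected = [n / categories] * categories
    rejections = 0
    for _ in range(reps):
        counts = [0] * categories
        for _player in range(n):
            counts[rng.randrange(categories)] += 1   # every season equally likely
        if chi_squared_statistic(counts, expected) > critical_value:
            rejections += 1
    return rejections / reps

type_one_error_rate = simulate_type_one_error_rate()
print(type_one_error_rate)
```

The simulated rejection rate should come out close to .05, the significance level, even though every simulated sample was generated with birthdays spread evenly across the seasons.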
Example #5: Interpreting Output
• Scenario: Mars Company claims that plain M&M’s have the following distribution of colors:
You open a bag and count each color.
• Question: Is Mars’ claim accurate at $\alpha = .05$?
• Hypothesis Test:
1. Hypotheses:
• $H_0$: __________________________________________________________________
_________________________________________________________________
• $H_a$: _______________________________________________________________
Color Blue Orange Green Yellow Brown Red
Prop. .24 .20 .16 .14 .13 .13
Example #5: Interpreting Output
If the bags are actually being filled according to ______________
_______________, then the probability of getting __________________
______________________________ than what was observed is _______.
Example #5: Interpreting Output
• Hypothesis Test:
2. Conditions: _____________________________________________
3. Test Statistic: ______________________
4. P-Value: _____________
5. Decision: _________________________ (___________________)
6. Conclusion: ____________________________________________ __________________________________________________________
_____
_______
Example #6: Confidence Intervals
• Question: What do you notice about all six confidence intervals?
• Answer: All of them ________________________________ ________________________
• Reason: ___________________________________________________
Summary
• Goodness of Fit Test: used to test proportions of a categorical variable with at least three categories
• Expected Counts: how many observations we would expect in each category if the null hypothesis is true
• Test Statistic:
$$\chi^2 = \sum_{i=1}^{k} \frac{(\text{Actual}_i - \text{Expected}_i)^2}{\text{Expected}_i}$$
• Large values are evidence against $H_0$