AP Statistics Chapter 26: Comparing Counts. Test for Goodness of Fit: To analyze categorical data,...

AP Statistics

Chapter 26: Comparing Counts

Test for Goodness of Fit:

To analyze categorical data, we construct two-way tables and examine the counts or percents of the explanatory and response variables.

Hershey Activity:

Kind        Milk Chocolate   Mr. Goodbar   Krackel   Special Dark
Claimed %   40%              21%           21%       18%

Proportion of Hershey Miniatures

KIND             Observed Count, O   Expected Count, E
Milk Chocolate   _____               _____
Mr. Goodbar      _____               _____
Krackel          _____               _____
Special Dark     _____               _____

We want to compare the observed counts to the expected counts.

The null hypothesis is that there is no difference between the observed and expected counts.

The alternative hypothesis is that there is a difference between the observed and expected counts.

\chi^2 = \sum \frac{(O - E)^2}{E}

is called the chi-square test statistic. It measures how well the observed counts fit the expected counts, assuming that the null hypothesis is true.
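As a rough illustration of the statistic, the sketch below sums (O - E)^2 / E over the four kinds of miniatures. The claimed percentages come from the table in these notes; the observed counts are hypothetical, made up only so the arithmetic can be followed, and the function name is likewise an assumption.

```python
# Claimed proportions from the Hershey table, in the order
# Milk Chocolate, Mr. Goodbar, Krackel, Special Dark.
claimed = [0.40, 0.21, 0.21, 0.18]

# HYPOTHETICAL observed counts for a bag of 100 miniatures (not real data).
observed = [35, 25, 22, 18]


def chi_square_statistic(observed, proportions):
    """Sum of (O - E)^2 / E, where E = sample size * claimed proportion."""
    n = sum(observed)
    expected = [n * p for p in proportions]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


chi2 = chi_square_statistic(observed, claimed)
print(f"chi-square statistic: {chi2:.3f}")
```

A statistic near zero means the observed counts sit close to the expected counts; larger values mean a worse fit.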

The distribution of the chi-square statistic is called the chi-square distribution, χ². This distribution is a density curve.

The total area under the curve is 1.

The curve begins at zero on the horizontal axis and is skewed right.

As the degrees of freedom increase, the shape of the curve becomes more symmetric.

Using the Hershey’s Miniatures chi-square statistic, find the probability of obtaining a χ² value at least this extreme assuming the null hypothesis is true.

This is known as the “Goodness of Fit Test.” Graph the chi-square distribution with (k – 1) = 3 degrees of freedom, where k = # of categories:

TI-83+: χ²pdf(X, 3)

We would expect to obtain a χ² value at least this extreme in about _____ out of every _____ samples, assuming the null hypothesis is true.
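For readers without a calculator at hand, the right-tail area can be sketched in Python with the standard library alone. The function below is an assumption of this note, not the calculator's own routine: it uses the known closed-form expressions for the chi-square upper tail with a whole-number degrees of freedom. The statistic passed to it at the end is a hypothetical value, not one computed from real Hershey data.

```python
import math


def chi_square_upper_tail(x, df):
    """Area to the right of x under the chi-square curve (integer df >= 1)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^i / i!.
        term = 1.0
        total = 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: start from the df = 1 tail, erfc(sqrt(x/2)), then add
    # exp(-x/2) * (x/2)^(i - 1/2) / Gamma(i + 1/2) for i = 1 .. (df - 1)/2.
    total = 0.0
    term = math.sqrt(half) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        if i > 1:
            term *= half / (i - 0.5)
        total += term
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


# HYPOTHETICAL statistic for 4 categories, so df = 4 - 1 = 3.
p_value = chi_square_upper_tail(1.43, 3)
print(f"P-value: {p_value:.3f}")
```

A P-value of, say, 0.25 would fill the blanks above as "about 25 out of every 100 samples."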

CONDITIONS: The Goodness of Fit Test may be used when:

the data come from an SRS.

we are working with counts.

all expected counts are at least 1 and most (at least 80%) of the expected counts are at least 5.
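The count condition can be checked mechanically. The sketch below applies it to the expected counts, which is an assumption about which counts the condition refers to; the function name is hypothetical. The SRS condition depends on how the data were collected and cannot be checked this way.

```python
def counts_condition_ok(expected):
    """True if every expected count is >= 1 and at least 80% are >= 5."""
    all_at_least_1 = all(e >= 1 for e in expected)
    share_at_least_5 = sum(e >= 5 for e in expected) / len(expected)
    return all_at_least_1 and share_at_least_5 >= 0.80


# Expected counts for a hypothetical bag of 100 miniatures under the claimed
# percentages (40%, 21%, 21%, 18%).
print(counts_condition_ok([40, 21, 21, 18]))
```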

*Following the Goodness of Fit Test, check which component, (O - E)^2 / E, made the greatest contribution to the chi-square statistic to see where the observed counts differ most from the expected counts.
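One way to carry out this follow-up is to keep the individual terms of the sum instead of only their total. The sketch below does that; the observed counts are hypothetical and the variable names are assumptions.

```python
categories = ["Milk Chocolate", "Mr. Goodbar", "Krackel", "Special Dark"]
claimed = [0.40, 0.21, 0.21, 0.18]   # claimed percentages from the notes
observed = [35, 25, 22, 18]          # HYPOTHETICAL counts, n = 100

n = sum(observed)
components = {}
for name, o, p in zip(categories, observed, claimed):
    e = n * p                          # expected count for this kind
    components[name] = (o - e) ** 2 / e

# The category whose term adds the most to the chi-square statistic.
largest = max(components, key=components.get)

for name, value in components.items():
    print(f"{name:15s} {value:.3f}")
print("largest contribution:", largest)
```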

Inference for Two Way Tables:

To compare two proportions, we use a 2-Proportion Z Test. If we want to compare three or more proportions, we need a new procedure.

The first step in the overall test for comparing several proportions is to arrange the data in a 2-way table.

Think of the counts as elements of a matrix with r rows and c columns. This is called an r × c table with (r)(c) cells.

Our null hypothesis is that there is no difference among the proportions. The alternative hypothesis is that there is some difference among the proportions.

We will use the chi-square test to measure how far the observed values are from the expected values.

To calculate the expected counts, multiply the row total by the column total, and divide by the table total:

\text{expected count} = \frac{\text{row total} \times \text{column total}}{\text{table total}}
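The expected-count rule can be sketched for a whole table at once. The 2 × 3 table below is hypothetical, and the function name is an assumption.

```python
def expected_counts(table):
    """Expected count per cell: row total * column total / table total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    table_total = sum(row_totals)
    return [[r * c / table_total for c in col_totals] for r in row_totals]


# HYPOTHETICAL 2 x 3 table of observed counts (rows = groups, columns = categories).
table = [
    [20, 30, 50],
    [30, 30, 40],
]

expected = expected_counts(table)
for row in expected:
    print(row)
```

Under the null hypothesis every row should follow the same column proportions, so the expected rows here come out identical because both row totals are equal.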

The chi-square statistic is the sum over all r × c cells in the table:

\chi^2 = \sum \frac{(O - E)^2}{E}

The degrees of freedom is (r – 1)(c – 1). The P-value is the area to the right of the statistic under the chi-square density curve.
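Putting the pieces together for a two-way table might look like the sketch below. The table is hypothetical and the function name is an assumption. To stay within the standard library, the P-value is computed only for the special case of 2 degrees of freedom, where the right-tail area is exactly exp(-χ²/2); other degrees of freedom need a table, a calculator, or a statistics library.

```python
import math


def chi_square_two_way(table):
    """Return (chi-square statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    table_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / table_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# HYPOTHETICAL 2 x 3 table, so df = (2 - 1)(3 - 1) = 2.
table = [
    [20, 30, 50],
    [30, 30, 40],
]

chi2, df = chi_square_two_way(table)

# Closed form valid for df = 2 only: area to the right of chi2 is exp(-chi2 / 2).
p_value = math.exp(-chi2 / 2) if df == 2 else None

print(f"chi-square = {chi2:.3f}, df = {df}, P-value = {p_value:.3f}")
```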

In Summary:

Use a chi-square goodness of fit test when you want to show that the distribution of counts in one categorical variable matches the distribution predicted by a model.

(one categorical variable, one-way table)

H0: The distribution of Hershey Miniatures is the same as the advertised distribution.

Use a chi-square test for homogeneity when you want to show that the distribution of counts for two or more groups is the same.

“Are the groups homogeneous?”; more than one population; split, then sample

(one categorical variable, two-way table)

H0: The distribution of X is the same as the distribution of Y.

Use a chi-square test for independence when you want to show that two categorical variables are independent for one group of individuals.

“Are the variables independent?”; one population; sample, then split

(two categorical variables, two-way table)

H0: X and Y are independent.

What Can Go Wrong?

Don’t use chi-square methods unless you have counts.

Just because numbers are in a two-way table doesn’t make them suitable for chi-square analysis.

Beware large samples. With a sufficiently large sample size, a chi-square test can reject the null hypothesis even when the deviations from it are too small to matter in practice.
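The effect of sample size can be seen by holding the observed proportions fixed and scaling the counts. In the sketch below, both samples are hypothetical and share identical proportions; only the sample size differs. The function name is an assumption.

```python
claimed = [0.40, 0.21, 0.21, 0.18]   # claimed percentages from the notes


def chi_square_statistic(observed, proportions):
    """Sum of (O - E)^2 / E, where E = sample size * claimed proportion."""
    n = sum(observed)
    expected = [n * p for p in proportions]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


small_sample = [35, 25, 22, 18]                    # HYPOTHETICAL, n = 100
large_sample = [c * 100 for c in small_sample]     # same proportions, n = 10,000

chi2_small = chi_square_statistic(small_sample, claimed)
chi2_large = chi_square_statistic(large_sample, claimed)

print(f"n = 100:    chi-square = {chi2_small:.2f}")
print(f"n = 10000:  chi-square = {chi2_large:.2f}")
```

With the proportions unchanged, the statistic grows in step with the sample size, so a deviation that looks unremarkable in a small sample can become highly significant in a large one.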

Don’t say that one variable “depends” on the other just because they’re not independent.

Association is not causation.

What have we learned?

We’ve learned how to test hypotheses about categorical variables.

All three methods we examined look at counts of data in categories and rely on chi-square models.

Goodness-of-fit tests compare the observed distribution of a single categorical variable to an expected distribution based on a theory or model.

Tests of homogeneity compare the distribution of several groups for the same categorical variable.

Tests of independence examine counts from a single group for evidence of an association between two categorical variables.


Mechanically, these tests are almost identical.

While the tests appear to be one-sided, conceptually they are many-sided, because there are many ways that the data can deviate significantly from what we hypothesize.

When we reject the null hypothesis, we know to examine standardized residuals to better understand the patterns in the data.
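As a sketch of that last step, the code below computes standardized residuals using the common definition (O - E) / sqrt(E); the notes do not spell out a formula, so treat this definition as an assumption. The observed counts are hypothetical.

```python
import math

categories = ["Milk Chocolate", "Mr. Goodbar", "Krackel", "Special Dark"]
claimed = [0.40, 0.21, 0.21, 0.18]   # claimed percentages from the notes
observed = [35, 25, 22, 18]          # HYPOTHETICAL counts, n = 100

n = sum(observed)
residuals = {}
for name, o, p in zip(categories, observed, claimed):
    e = n * p
    # Assumed definition: (observed - expected) / sqrt(expected).
    residuals[name] = (o - e) / math.sqrt(e)

for name, r in residuals.items():
    print(f"{name:15s} {r:+.2f}")
```

The sign shows whether a category was over- or under-represented, and the squares of these residuals add up to the chi-square statistic.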
