Copyright © 2010, 2007, 2004 Pearson Education, Inc. 11.1 - 1 Chapter 11 Goodness of Fit Test (section 11.2)



Preview

We focus on the analysis of categorical (qualitative or attribute) data that can be separated into different categories (often called cells).

Hypothesis test: Do the observed counts agree with some claimed distribution?

Use the χ2 (chi-square) distribution.


Example

There are four blood types: A, B, AB, O

A sample of 100 patients had their blood type identified. Determine what percentage of people have each blood type.

Blood type   Observed frequency
A            42
B            9
AB           6
O            43


Definition: A goodness-of-fit test is used to test the hypothesis that an observed frequency distribution fits (or conforms to) some claimed distribution.

Example: The claimed frequency distribution of blood types is as follows:

40% type A
11% type B
4% type AB
45% type O

Source: http://www.redcrossblood.org/learn-about-blood/blood-types


Goodness-of-Fit Test

How it works:

We compare each category's observed frequency with the frequency we would expect under the claimed distribution.

To determine whether it is a good fit, we look at how much they differ (i.e., how much they vary).

Basically, our test measures the variation of the observed frequencies relative to the expected frequencies.

If the claimed distribution fits, this variation should be small.


Goodness-of-Fit Test: Requirements

1. The data are randomly selected.

2. The sample data consist of frequency counts for each of the different categories.

3. For each category, the expected frequency is at least 5.


Goodness-of-Fit Test: Notation

O          Observed frequency of an outcome
E          Expected frequency of an outcome
k          Number of different categories (or outcomes)
n          Total number of trials
p1 … pk    Claimed relative frequencies (probabilities) of the categories


Goodness-of-Fit Test Statistic

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

This measures the variation of the observed frequencies relative to the expected frequencies.

When the claimed distribution is correct, the test statistic approximately follows a χ2 distribution with k - 1 degrees of freedom.
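As a minimal sketch of the test statistic, the following Python function sums (O - E)^2 / E over the categories. The function name `chi_square_statistic` is hypothetical (it does not come from the slides). The counts are the blood-type frequencies from the example, with expected counts taken as n = 100 times the claimed proportions.

```python
def chi_square_statistic(observed, expected):
    """Return the sum over categories of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Blood-type example: observed counts, and expected counts computed as
# n = 100 times the claimed proportions 40%, 11%, 4%, 45%.
observed = [42, 9, 6, 43]
expected = [40.0, 11.0, 4.0, 45.0]

statistic = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1  # k - 1

print(f"chi-square = {statistic:.3f}, df = {degrees_of_freedom}")
```

A small value of the statistic means the observed counts sit close to the expected counts; a large value points away from the claimed distribution.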


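Expected Frequencies

If expected frequencies are all equal:

$$E = \frac{n}{k}$$

If expected frequencies are not all equal:

$$E = np \quad \text{(for each category, using that category's claimed probability } p\text{)}$$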
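Both cases can be sketched in a few lines of Python. The helper names `expected_frequencies` and `small_expected_categories` are hypothetical, and the threshold of 5 is the one from the requirements list. The inputs are the two data sets used in the examples: 100 blood-type observations with unequal claimed proportions, and 80 last digits with ten equally likely categories.

```python
def expected_frequencies(n, probabilities=None, k=None):
    """Expected counts: n * p per category, or n / k when all are equal."""
    if probabilities is not None:
        return [n * p for p in probabilities]
    if k is None:
        raise ValueError("give either probabilities or k")
    return [n / k] * k


def small_expected_categories(labels, expected, minimum=5):
    """Return the labels whose expected count falls below the minimum."""
    return [label for label, e in zip(labels, expected) if e < minimum]


# Unequal case: blood types with claimed proportions, n = 100.
blood_labels = ["A", "B", "AB", "O"]
blood_expected = expected_frequencies(100, probabilities=[0.40, 0.11, 0.04, 0.45])

# Equal case: ten possible last digits, n = 80 recorded weights.
digit_expected = expected_frequencies(80, k=10)

print(blood_expected)
print(digit_expected)
print(small_expected_categories(blood_labels, blood_expected))
```

With n = 100 and a claimed 4% for type AB, the expected count is 4, just under the threshold of 5, so the χ2 approximation for the blood-type example should be read with some caution.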


Goodness-of-Fit Test: Step-by-step guide

1. Write what you know

2. State the claim

3. State the null and alternative hypotheses (H0 and H1)

4. Draw a diagram (Note: only use a right-tailed test for Goodness-of-Fit)

5. Calculate the test statistic and critical value

6. Make an initial conclusion (reject or fail to reject H0)

7. Make a final conclusion
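Steps 5 and 6 can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure used on the slides (which rely on StatCrunch). The function name `goodness_of_fit_test` is hypothetical, the significance level is assumed to be 0.05, and the die-roll counts are made up purely to exercise the code. The critical values are the standard right-tail χ2 table values for 1 to 10 degrees of freedom.

```python
# Right-tail critical values of the chi-square distribution at alpha = 0.05,
# keyed by degrees of freedom (standard table values, rounded to 3 places).
CRITICAL_VALUES_05 = {
    1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070,
    6: 12.592, 7: 14.067, 8: 15.507, 9: 16.919, 10: 18.307,
}


def goodness_of_fit_test(observed, probabilities):
    """Steps 5 and 6 at the 0.05 level: return (statistic, critical, reject)."""
    n = sum(observed)
    expected = [n * p for p in probabilities]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    critical = CRITICAL_VALUES_05[len(observed) - 1]  # df = k - 1
    # Right-tailed test: reject H0 only when the statistic exceeds the
    # critical value.
    return statistic, critical, statistic > critical


# Hypothetical illustration data (not from the slides): 60 rolls of a die,
# with H0 stating that all six faces are equally likely.
rolls = [8, 12, 9, 11, 6, 14]
statistic, critical, reject = goodness_of_fit_test(rolls, [1 / 6] * 6)

print(f"chi-square = {statistic:.3f}, critical value = {critical}")
print("reject H0" if reject else "fail to reject H0")
```

The comparison is one-sided because only large values of the statistic indicate a poor fit; a statistic near zero means the observed counts match the claim closely.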

Figure 11-2


Example 1

There are four blood types: A, B, AB, O

A sample of 100 patients had their blood type identified. Determine if the claimed frequency distribution fits this data.

Blood type   Observed frequency   Claimed frequency
A            42                   40%
B            9                    11%
AB           6                    4%
O            43                   45%
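Worked by hand, the test for this table might look like the sketch below. The significance level α = 0.05 is an assumption, since the slides do not state one. The hypotheses are H0: the blood types follow the claimed distribution, and H1: at least one proportion differs from its claimed value.

$$E_A = 100(0.40) = 40, \quad E_B = 100(0.11) = 11, \quad E_{AB} = 100(0.04) = 4, \quad E_O = 100(0.45) = 45$$

$$\chi^2 = \frac{(42 - 40)^2}{40} + \frac{(9 - 11)^2}{11} + \frac{(6 - 4)^2}{4} + \frac{(43 - 45)^2}{45} \approx 0.100 + 0.364 + 1.000 + 0.089 = 1.553$$

With k - 1 = 3 degrees of freedom, the right-tail critical value at α = 0.05 is 7.815. Since 1.553 < 7.815, we fail to reject H0: there is not sufficient evidence to reject the claim that the blood types follow the claimed distribution.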


Example 1: Using StatCrunch (see video tutorial for details)



Example 2

To check whether collected data have been accurately recorded and not made up, one can check whether the last digit of each value (between 0 and 9) is uniformly distributed. A set of weights was recorded, and the frequency of each last digit was tallied. Test whether the last digits are uniformly distributed.


Example 2: Using StatCrunch

Last digit   Frequency
0            7
1            14
2            6
3            10
4            8
5            4
6            5
7            6
8            12
9            8


Example 2

Total frequency: n = 80

Test statistic: χ2 = 11.25
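The figures for Example 2 can be checked with a short Python sketch. It assumes a significance level of 0.05, which the slides do not state, and it uses the standard table value 16.919 for 9 degrees of freedom. The variable names are illustrative only.

```python
# Last-digit frequencies from Example 2 (digits 0 through 9).
observed = [7, 14, 6, 10, 8, 4, 5, 6, 12, 8]

n = sum(observed)   # total number of recorded weights
k = len(observed)   # number of categories
expected = n / k    # uniform claim: every digit is expected equally often

# Contribution of each digit to the chi-square statistic: (O - E)^2 / E.
contributions = [(o - expected) ** 2 / expected for o in observed]
statistic = sum(contributions)

# Right-tail critical value for df = k - 1 = 9 at alpha = 0.05 (table value).
CRITICAL_VALUE = 16.919

for digit, (o, c) in enumerate(zip(observed, contributions)):
    print(f"digit {digit}: O = {o:2d}, contribution = {c:.3f}")
print(f"n = {n}, chi-square = {statistic:.2f}")
print("reject H0" if statistic > CRITICAL_VALUE else "fail to reject H0")
```

Under these assumptions the total is n = 80 and the statistic is 11.25. Because 11.25 is below 16.919, the claim that the last digits are uniformly distributed is not rejected at the 0.05 level. Digit 1 contributes the most to the statistic (4.5 of the 11.25).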