Copyright © 2013 Pearson Education, Inc. All rights reserved Chapter 10 Inferring Population Means


Chapter 10: Inferring Population Means


Learning Objectives

Understand when a goodness-of-fit test is needed and appropriate, and know how to perform the test and interpret results.

Distinguish between tests of homogeneity and tests of independence.

Understand when it is appropriate to use a chi-square statistic to test whether two categorical variables are associated; know how to perform this test and interpret the results.


10.1 The Basic Ingredients for Testing with Categorical Variables


Volunteers Representative of Student Body?

Ethnicities for UCLA Student Body

White  Asian  Hispanic  Other
34%    32%    13%       21%

Random Sample of 400 UCLA Volunteers

White  Asian  Hispanic  Other
150    120    45        85

Is the ethnic distribution of volunteers the same as the ethnic distribution for the student body?


What Would We Expect?

Ethnicities for UCLA Student Body

White  Asian  Hispanic  Other
34%    32%    13%       21%

Random Sample of 400 UCLA Volunteers

          White  Asian  Hispanic  Other
Observed  150    120    45        85
Expected  136    128    52        84

Expected count = population proportion × sample size:

White: 0.34 × 400 = 136
Asian: 0.32 × 400 = 128
Hispanic: 0.13 × 400 = 52
Other: 0.21 × 400 = 84
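The expected counts can be reproduced with a few lines of Python. This is only a minimal sketch: the variable names are illustrative, and the proportions and sample size are the ones shown on the slide.

```python
# Expected counts if the volunteers had the same ethnic distribution
# as the student body. Proportions and sample size come from the slide.
proportions = {"White": 0.34, "Asian": 0.32, "Hispanic": 0.13, "Other": 0.21}
sample_size = 400

# Expected count = population proportion x sample size.
# round(..., 6) only hides floating-point noise in the products.
expected = {group: round(p * sample_size, 6) for group, p in proportions.items()}

for group, count in expected.items():
    print(f"{group}: {count:g}")
```

The expected counts add up to the sample size, which is a quick sanity check on the proportions.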


Questions on Goodness of Fit

The observed counts are not the same as the expected counts.

Are they far enough from expected to conclude that the distribution of all UCLA volunteers differs from the student body distribution?

Random Sample of 400 UCLA Volunteers

          White  Asian  Hispanic  Other
Observed  150    120    45        85
Expected  136    128    52        84


χ² Test Statistic

χ² measures how far the observed counts are from the expected counts.

$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$

Random Sample of 400 UCLA Volunteers

          White  Asian  Hispanic  Other
Observed  150    120    45        85
Expected  136    128    52        84

White: (150 − 136)²/136 ≈ 1.441
Asian: (120 − 128)²/128 = 0.500
Hispanic: (45 − 52)²/52 ≈ 0.942
Other: (85 − 84)²/84 ≈ 0.012

χ² ≈ 1.441 + 0.500 + 0.942 + 0.012 = 2.895


Political Affiliation and Music Preference

Is Political Affiliation associated with Music Preference?

Survey of 250 People

              Democrat  Republican
Pop           70        52
Classic Rock  34        57
Other         21        16


Finding Expected Counts

If they are independent, then the expected number of Republicans who listen to Pop would be:

Survey of 250 People

              Democrat  Republican
Pop           85        52
Classic Rock  34        57
Other         21        16

Pop: 137/250 × 100% = 54.8%
Rep: 125/250 × 100% = 50%

Expected = 0.548 × 0.5 × 250 = 68.5


Finding Expected Counts

Same test statistic:

$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$

Computer is easier than by hand.
χ² ≈ 14.2
DF = (Rows – 1)(Columns – 1) = (3 – 1)(2 – 1) = 2
p-value = 0.0008

Survey of 250 People (expected counts in parentheses)

              Democrat    Republican
Pop           85 (69.75)  52 (68.50)
Classic Rock  34 (46.33)  57 (44.67)
Other         21 (23.93)  16 (23.07)


Using the χ² Test

All expected counts must be 5 or higher.
Data must be qualitative (categorical).
Can be used to test if an unknown distribution is the same as a known distribution.
Can be used to test if two variables are independent or associated.


10.2 The Chi-Square Test for Goodness of Fit


Chi-Square Test for Goodness of Fit

Used to see if an unknown distribution is different from a given distribution.

Always the same null and alternative hypotheses:
H0: The population distribution of the variable is the same as the proposed distribution.
Ha: The population distributions are different.

Uses a χ² test statistic.
The rest follows the standard procedure.


Chi-Square Test for Goodness of Fit

To find the expected count:
Percent of the population times the sample size.
For a uniform distribution: Sample Size / Number of Possibilities.

Uses a χ² test statistic.
The degrees of freedom (DF): number of categories – 1.
All expected counts must be 5 or higher.
The rest follows the standard procedure.


Goodness of Fit: Rolling a Die

You are playing a game that involves rolling a die and suspect that the die is not fair.

1. Hypothesize
H0: The die is fair (1, 2, 3, 4, 5, and 6 are equally likely to occur).
Ha: The die is not fair.

You roll it 300 times and record the outcomes.


Goodness of Fit: Rolling a Die

2. Prepare
Use α = 0.01 and the χ² statistic; all expected counts are greater than 5.
If all numbers are equally likely to occur, then we would expect to get 50 of each value: 300/6 = 50.

Outcome   1   2   3   4   5   6
Observed  35  45  69  52  43  56
Expected  50  50  50  50  50  50


3. Compute to Compare

Stat → Goodness-of-fit → Chi-Square test

4. Interpret

P-value = 0.0156 > 0.01 = α
Fail to reject H0.

There is insufficient evidence to support the claim that the die is not fair.
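The die example can also be checked without statistical software. The sketch below is an illustration rather than the slides' own method: `chi2_sf` is a hypothetical helper written here for the purpose, using the closed-form upper-tail probability of the chi-square distribution, which is available for whole-number degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with integer df.

    Hypothetical helper (not from the slides), using closed-form expressions.
    """
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-h) * sum_{k=0}^{df/2 - 1} h^k / k!
        return math.exp(-h) * sum(h**k / math.factorial(k) for k in range(df // 2))
    # Odd df: erfc(sqrt(h)) + exp(-h) * sum_{k=0}^{(df-3)/2} h^(k+1/2) / Gamma(k+3/2)
    tail = sum(h ** (k + 0.5) / math.gamma(k + 1.5) for k in range((df - 1) // 2))
    return math.erfc(math.sqrt(h)) + math.exp(-h) * tail

# Observed counts for faces 1..6 from the slide (300 rolls).
observed = [35, 45, 69, 52, 43, 56]
n = sum(observed)
expected = n / len(observed)          # fair die: 300 / 6 = 50 per face

chi2 = sum((o - expected) ** 2 / expected for o in observed)
df = len(observed) - 1                # number of categories - 1
p_value = chi2_sf(chi2, df)

alpha = 0.01
print(f"chi2 = {chi2:.2f}, df = {df}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

Note that the same data would lead to rejecting H0 at α = 0.05, so the conclusion depends on the significance level chosen in the Prepare step.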


Facts about Goodness of Fit

The test statistic χ² will always be non-negative.

If χ² is close to 0, then we will fail to reject H0.

If χ² is large, then we will reject H0.

Can conclude that the unknown distribution differs from the known.

Cannot conclude that the unknown distribution is the same as the known.


Counts Must Be Used

If proportions are given instead of counts:
Multiply each proportion by the sample size to obtain the count.

If percents are given instead of counts:
Convert the percents to decimals by dividing by 100. Then multiply each decimal by the sample size to obtain the count.


10.3 Chi-Square Tests for Associations between Categorical Variables


Test for Independence

One sample, two categorical variables.
Answers whether there is an association between two categorical variables.
Random, independent collection.
All expected counts must be 5 or higher.
H0: The two variables are independent.
Ha: There is an association between the two variables.


Is type of business associated with US region?
A random sample of 558 businesses was studied.

         Manufacturing  Retail  Financial
East     47             92      67
Central  23             40      18
North    19             28      14
South    39             40      47
West     25             43      16

1. Hypothesize
H0: Business type and region are independent.
Ha: Business type and region are associated.


2. Prepare

α = 0.05, χ² test for independence, all expected counts greater than 5.

Stat → Tables → Contingency → with summary


3. Compute to Compare

χ² ≈ 17.38
P-value = 0.0263

4. Interpret

P-value = 0.0263 < 0.05 = α
Reject H0.
Accept Ha.

There is statistically significant evidence to support the claim that business type and region are associated.
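The business-type example can be reproduced from the table alone. The following is a hedged sketch in plain Python, not the StatCrunch procedure itself: expected counts are computed as row total × column total / grand total, and the p-value uses a closed form that applies because the degrees of freedom (8) happen to be even.

```python
import math

# Observed counts from the slide: rows are regions (East, Central, North,
# South, West); columns are Manufacturing, Retail, Financial.
observed = [
    [47, 92, 67],
    [23, 40, 18],
    [19, 28, 14],
    [39, 40, 47],
    [25, 43, 16],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Under independence: expected = row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)   # (rows - 1)(columns - 1)

# Closed-form upper tail of the chi-square distribution, valid for even df:
# exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
half = chi2 / 2
p_value = math.exp(-half) * sum(half**k / math.factorial(k) for k in range(df // 2))

smallest_expected = min(min(row) for row in expected)
print(f"chi2 = {chi2:.2f}, df = {df}, p-value = {p_value:.4f}")
print(f"smallest expected count = {smallest_expected:.2f}")
```

Printing the smallest expected count makes the "all expected counts greater than 5" condition from the Prepare step easy to confirm.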


Test for Homogeneity

Two samples, one categorical variable.

Tests whether the distribution of the variable is the same in both populations.

Differs from Goodness of Fit in that there are two samples instead of one sample and one known population.

Differs from Test for Independence in that there are two samples and one variable instead of one sample and two variables.


Do freshmen and sophomores have different opinions about spending a year abroad?
Survey question: Is spending a year abroad a good idea?

            Strongly Agree  Agree  Disagree  Strongly Disagree
Freshmen    45              33     18        7
Sophomores  32              28     20        6

1. Hypothesize
H0: The distributions of opinions for freshmen and sophomores are the same.
Ha: The distributions of opinions for freshmen and sophomores are not the same.


2. Prepare

α = 0.05
χ² test for homogeneity
All expected counts are greater than 5.

3. Compute to Compare

Stat → Tables → Contingency → with summary

4. Interpret

P-value = 0.7367 > 0.05 = α
Fail to reject H0.
There is insufficient evidence to conclude that the distributions of opinions for freshmen and sophomores are not the same.


Comparing Test for Independence and Difference Between Proportions

For testing two variables, each with two possible outcomes, the test for independence will give the same result as a two-tailed test for the difference between proportions.

To show that one answer occurs with higher probability for one group than for another, only the one-tailed test for a difference between proportions can be used.
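The equivalence for 2 × 2 tables can be seen numerically: the square of the pooled two-proportion z statistic equals the χ² statistic (computed without a continuity correction), so the two-tailed p-values agree. The sketch below uses hypothetical counts that are not from the slides, chosen only to illustrate the point.

```python
import math

# Hypothetical 2x2 data (not from the slides): "yes" answers in two groups.
yes_a, n_a = 30, 100
yes_b, n_b = 45, 100

# Two-proportion z-test with a pooled standard error.
p_a, p_b = yes_a / n_a, yes_b / n_b
pooled = (yes_a + yes_b) / (n_a + n_b)
se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
z = (p_a - p_b) / se
p_two_tailed = math.erfc(abs(z) / math.sqrt(2))      # two-tailed normal p-value

# Chi-square test for independence on the same 2x2 table.
observed = [[yes_a, n_a - yes_a], [yes_b, n_b - yes_b]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)
chi2 = sum(
    (observed[i][j] - row_totals[i] * col_totals[j] / grand_total) ** 2
    / (row_totals[i] * col_totals[j] / grand_total)
    for i in range(2)
    for j in range(2)
)
p_chi2 = math.erfc(math.sqrt(chi2 / 2))              # chi-square upper tail, df = 1

print(f"z = {z:.4f}, z^2 = {z * z:.4f}, chi2 = {chi2:.4f}")
print(f"two-tailed p = {p_two_tailed:.4f}, chi-square p = {p_chi2:.4f}")
```

The sign of z (negative here, since the first group's proportion is lower) carries the direction that a one-tailed test needs; χ² squares that sign away, which is why it cannot support a one-tailed conclusion.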


10.4 Hypothesis Tests When Sample Sizes Are Small


Small Sample Sizes: Consolidation

Were hospitalization rates from the swine flu different for different ages?

With expected counts less than 5, the χ² test cannot be used.

Instead, consolidate into just young, middle and old.


Hospitalized, by Age Category

        Under 15  15 – 29  30 and Older  Totals
Yes     16        9        10            35
No      239       241      104           584
Totals  255       250      114


Now the sample sizes are large enough.

p-value = 0.12 is large.
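A short check of the consolidated table, offered as a sketch under the assumption that the test is a χ² test on the 2 × 3 table shown for the consolidated age groups. With 2 degrees of freedom the upper-tail probability has the simple form exp(−χ²/2), so no statistical library is needed.

```python
import math

# Consolidated swine flu table from the slide.
# Rows: hospitalized Yes / No; columns: Under 15, 15-29, 30 and older.
observed = [
    [16, 9, 10],
    [239, 241, 104],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
smallest_expected = min(min(row) for row in expected)

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)    # (2 - 1)(3 - 1) = 2

# For df = 2 the chi-square upper tail is exactly exp(-chi2 / 2).
p_value = math.exp(-chi2 / 2)

print(f"smallest expected count = {smallest_expected:.2f}")
print(f"chi2 = {chi2:.2f}, df = {df}, p-value = {p_value:.2f}")
```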


Fail to reject the null hypothesis. There is insufficient evidence to conclude that hospitalization rates from the swine flu were different for different ages.

Problems with this approach:
Grouping infants and young teens may not make sense.
Grouping middle-aged people with senior citizens may not make sense.


Fisher’s Exact Test

Used to compare two proportions (or more proportions with advanced techniques).
Can be used with small sample sizes.
Too advanced to carry out without technology such as StatCrunch.
For larger sample sizes, use a test for independence, homogeneity, or difference between proportions.
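To give a sense of what the technology does, here is a minimal sketch of Fisher's Exact Test for a 2 × 2 table. It is an illustration, not StatCrunch's implementation: `fisher_exact_two_sided` is a hypothetical helper, the table is made-up data, and the two-sided p-value follows one common convention (add up the probabilities of all tables, with the same margins, that are no more likely than the observed one).

```python
from math import comb

def fisher_exact_two_sided(table):
    """Two-sided Fisher exact p-value for a 2x2 table [[a, b], [c, d]].

    Hypothetical helper: sums hypergeometric probabilities of every table
    with the same margins that is at most as probable as the observed table.
    """
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    total_ways = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, with margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_ways

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # The small tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )

# Hypothetical small-sample table (not from the slides).
table = [[3, 1], [1, 3]]
p_value = fisher_exact_two_sided(table)
print(f"p-value = {p_value:.4f}")
```

Because the calculation enumerates every possible table rather than relying on the χ² approximation, it does not need the expected counts to be 5 or higher.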


Chapter 10 Case Study


Is Oil Amount Associated With Successful Popcorn?

Success means at least half the kernels popped in 75 seconds or less.

H0: The quality of popcorn and the amount of oil are independent.

Ha: The quality of popcorn and the amount of oil are associated.

All expected counts are at least 5.
p-value = 0.006 is very small.
Reject H0. Accept Ha.

There is statistically significant evidence to support the claim that oil amount and popcorn success are associated.


Chapter 10 Guided Exercise 1


Are Humans Like Random Number Generators?

38 students were asked to pick a “random” number from 1 to 5.

Test the hypothesis that humans are not like random number generators. Use a significance level of 0.05, and assume the data were collected from a random sample of students.

Integer    One  Two  Three  Four  Five
Frequency  3    5    14     11    5

1. Hypothesize

H0: Humans are like random number generators and produce numbers in equal quantities.

Ha: Humans do not produce numbers in equal quantities.

2. Prepare

Use the χ² statistic.
Why are all Expected = 7.6? Because 38/5 = 7.6.


3. Compute to Compare

$$\chi^2 = \frac{(3-7.6)^2}{7.6} + \frac{(5-7.6)^2}{7.6} + \frac{(11-7.6)^2}{7.6} + \frac{(14-7.6)^2}{7.6} + \frac{(5-7.6)^2}{7.6} \approx 11.47$$

p-value = 0.0217
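One way to build intuition for this p-value is to let an actual random number generator play the role of the 38 students. The sketch below is a simulation-based alternative, not the χ²-distribution calculation used on the slide: it repeatedly draws 38 picks from 1 to 5 with equal probability and records how often the simulated χ² is at least as large as the observed one. The seed and the number of simulations are arbitrary choices.

```python
import random

# Observed frequencies for the integers One..Five from the slide.
observed = [3, 5, 14, 11, 5]
n = sum(observed)                 # 38 students
k = len(observed)                 # 5 possible integers
expected = n / k                  # 7.6 under equal likelihood

def chi2_stat(counts):
    """Chi-square statistic against equal expected counts."""
    return sum((c - expected) ** 2 / expected for c in counts)

observed_stat = chi2_stat(observed)

rng = random.Random(2013)         # arbitrary seed, for repeatability
n_sims = 10_000                   # arbitrary number of simulated classes
at_least_as_extreme = 0
for _ in range(n_sims):
    counts = [0] * k
    for _ in range(n):
        counts[rng.randrange(k)] += 1     # one truly random pick
    # Small tolerance so exact ties with the observed value are counted.
    if chi2_stat(counts) >= observed_stat - 1e-9:
        at_least_as_extreme += 1

simulated_p = at_least_as_extreme / n_sims
print(f"observed chi2 = {observed_stat:.2f}")
print(f"simulated p-value = {simulated_p:.4f}")
```

The simulated value should land in the same neighborhood as the reported p-value of 0.0217, though it will vary somewhat with the seed and the number of simulations.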


4. Interpret

p-value = 0.0217 < 0.05 = α
Reject H0. Accept Ha.
Conclusion: Humans have been shown to be different from random number generators.


Chapter 10 Guided Exercise 2


Obesity and Relationship

In a study reported in the medical journal Obesity the research subjects were categorized in terms of whether or not they were obese and whether they were dating, cohabiting, or married.

Test the hypothesis that the variables Relationship Status and Obesity are associated, using a significance level of 0.05.


1. Hypothesize

Calculate the row, column and grand totals.

H0: Relationship status and obesity are independent.

Ha: Relationship status and obesity are associated.

           Dating  Cohabitating  Married  Total
Obese      81      103           147      331
Not Obese  359     326           277      962
Total      440     429           424      1293


2. Prepare

We choose the chi-square test for independence because the data were from one random sample in which the people were classified two different ways.

Find the smallest expected value and report it. Is it more than 5?
The smallest expected value is 108.5. Since it is much bigger than 5, the χ² test can be used.


3. Compute to Compare

χ² ≈ 30.83
p-value < 0.001

4. Interpret

p-value < 0.001 < 0.05 = α
Reject H0. Accept Ha.
There is statistically significant evidence to conclude that relationship status and obesity are associated.

Causality

Can we conclude from these data that living with someone is making some people obese and that marrying is making even more people obese?
No. We can only conclude that obesity and relationship status are associated.

Can we conclude that obesity affects your relationship status?
No. Cause and effect cannot be concluded based on just looking at the data. A controlled study would have to be done, if possible.

Percentages

Find and compare the percentages obese in the three relationship statuses.

In StatCrunch, select Column Percent.

We see that the percent obese (34.67%) for the married category is much higher than the percent obese for the dating category (18.41%). The obesity percent (24.01%) for cohabitating couples is in the middle.
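The column percentages can be recomputed directly from the counts in the relationship table. This is a small sketch with illustrative variable names; StatCrunch's Column Percent option reports the same kind of figures.

```python
# Counts from the relationship status table in this exercise.
obese = {"Dating": 81, "Cohabitating": 103, "Married": 147}
not_obese = {"Dating": 359, "Cohabitating": 326, "Married": 277}

# Column percent = obese count / column total * 100.
percent_obese = {
    status: round(100 * obese[status] / (obese[status] + not_obese[status]), 2)
    for status in obese
}

for status, pct in percent_obese.items():
    print(f"{status}: {pct:.2f}% obese")
```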
