Doing Statistics for Business: Data, Inference, and Decision Making
Marilyn K. Pelosi, Theresa M. Sandifer

Chapter 15 The Analysis of Qualitative Data



Chapter 15 Objectives

In this chapter you will learn to use chi-square tests for:

Testing whether a particular probability model fits a set of data (Goodness of Fit test)

Testing equality of proportions for more than 2 populations


Testing whether 2 qualitative variables are dependent or independent


The chi-square goodness of fit test checks to see how well a set of data fits the model for a particular probability distribution.


The observed frequencies are the actual number of observations that fall into each class in a frequency distribution or histogram.


The expected frequencies are the number of observations that should fall into each class in a frequency distribution under the hypothesized probability distribution.


A uniform distribution is one in which each outcome or class of outcomes is equally likely to occur.
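As a minimal sketch of these definitions, the Python snippet below computes expected frequencies under a uniform model and compares them with observed frequencies. The observed counts are invented for illustration and do not come from the chapter.

```python
# Hypothetical observed counts for 5 equally likely classes (illustrative only).
observed = [18, 22, 25, 15, 20]

n = sum(observed)   # total number of observations
k = len(observed)   # number of classes

# Under a uniform distribution every class has probability 1/k,
# so every expected frequency is n/k.
expected = [n / k] * k

# Chi-square statistic: sum over classes of (o - e)^2 / e.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(expected, round(chi_square, 3))
```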


TRY IT NOW! Seat-Belt Usage

Setting Up the Hypotheses for the Goodness of Fit Test

Analysts for insurance companies assume that the number of drivers who wear seat belts is a binomial random variable with probability of success π = 0.70. To test this assumption they decide to set up checkpoints and sample 10 drivers every 2 hours. Set up the hypotheses to perform an appropriate chi-square goodness of fit test.


TRY IT NOW! Seat-Belt Usage

Calculating the Expected Frequencies

The insurance analysts collect data for 1000 samples of 10 drivers and obtain the frequency distribution shown in the following table. Find the expected frequency distribution for the data if the distribution is really binomial with n = 10 and π = 0.70.


# Wearing         Observed              Expected
Seatbelts (x)     Frequency     p(x)    Frequency
     0                 0
     1                 0
     2                 1
     3                 6
     4                33
     5               116
     6               213
     7               275
     8               216
     9               119
    10                21
Total               1000
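One way to fill in the p(x) and Expected Frequency columns is sketched below in Python. It assumes the hypothesized binomial model with n = 10 and success probability 0.70, and 1000 samples; the variable names are illustrative.

```python
from math import comb

n_trials = 10     # drivers per sample
pi_hyp = 0.70     # hypothesized probability that a driver wears a seat belt
n_samples = 1000  # number of samples collected

# Binomial probability of x seat-belt wearers out of n_trials drivers.
probs = [
    comb(n_trials, x) * pi_hyp ** x * (1 - pi_hyp) ** (n_trials - x)
    for x in range(n_trials + 1)
]

# Expected frequency = number of samples times p(x).
expected = [n_samples * p for p in probs]

for x, (p, e) in enumerate(zip(probs, expected)):
    print(f"{x:2d}  p(x) = {p:.3f}  expected = {e:6.1f}")
```

Rounded to whole numbers, these expected frequencies agree with the values tabulated later in the chapter (for example 267 for x = 7).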


Create frequency histograms for both the observed and the expected frequency distributions. At this point, does it appear that the observed data conform to the binomial distribution with n = 10 and π = 0.70? Why or why not?


TRY IT NOW! Seat-Belt Usage

Calculating the Chi-Square Statistic

Fill in the following table to calculate the value of the chi-square statistic for the data obtained by the insurance analysts.

# Wearing     Observed              Expected
Seatbelts     Frequency     p(x)    Frequency    (o - e)    (o - e)^2/e
     0             0        0.000        0
     1             0        0.000        0
     2             1        0.001        1
     3             6        0.009        9
     4            33        0.037       37
     5           116        0.103      103
     6           213        0.200      200
     7           275        0.267      267
     8           216        0.233      233
     9           119        0.121      121
    10            21        0.028       28
Total           1000        1         1000
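The last column can be computed as sketched below. Because the rounded expected frequencies for x = 0 and x = 1 are 0, (o - e)^2/e cannot be evaluated for those rows directly. This sketch therefore pools neighboring classes until each pooled class has an expected frequency of at least 5. That pooling rule is an assumption of the example, not a step stated on the slide.

```python
# Observed and (rounded) expected frequencies for x = 0..10, from the table.
observed = [0, 0, 1, 6, 33, 116, 213, 275, 216, 119, 21]
expected = [0, 0, 1, 9, 37, 103, 200, 267, 233, 121, 28]

MIN_EXPECTED = 5  # rule-of-thumb threshold (assumption of this sketch)

# Pool neighboring classes from the left until each pooled class
# reaches an expected frequency of at least MIN_EXPECTED.
pooled = []
o_acc = e_acc = 0
for o, e in zip(observed, expected):
    o_acc += o
    e_acc += e
    if e_acc >= MIN_EXPECTED:
        pooled.append((o_acc, e_acc))
        o_acc = e_acc = 0
if o_acc > 0 or e_acc > 0:  # fold any leftover into the last pooled class
    last_o, last_e = pooled.pop()
    pooled.append((last_o + o_acc, last_e + e_acc))

# Chi-square statistic over the pooled classes.
chi_square = sum((o - e) ** 2 / e for o, e in pooled)
print(len(pooled), round(chi_square, 2))
```

With this rule the classes x = 0 through x = 3 are combined into one class, leaving 8 classes in total. A different pooling choice would give a somewhat different statistic.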


Figure 15.1 The Upper Tail of a Chi-Square Distribution (with k - p - 1 degrees of freedom)


TRY IT NOW! Seat-Belt Usage

Finding the Critical Value and Performing the Test

The insurance analysts decide that they want to test the goodness of fit hypothesis at the 0.01 level of significance.

How many degrees of freedom will the critical value for the test have?

Find the critical value for the test.
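The sketch below applies the k - p - 1 rule and checks a tabulated critical value against the chi-square upper-tail area. It assumes that k is the number of classes and p the number of parameters estimated from the data (0 here, since 0.70 is hypothesized rather than estimated). The helper `chi2_upper_tail` is an illustrative name; it uses standard closed forms for 1 and 2 degrees of freedom and steps up two degrees at a time.

```python
import math

def chi2_upper_tail(x, df):
    """Upper-tail area P(X > x) for a chi-square variable with integer df."""
    z = x / 2.0
    if df % 2 == 0:
        a, tail = 1.0, math.exp(-z)              # exact form for df = 2
    else:
        a, tail = 0.5, math.erfc(math.sqrt(z))   # exact form for df = 1
    # Each pass raises the degrees of freedom by 2 (incomplete-gamma recurrence).
    while a < df / 2.0 - 1e-9:
        tail += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return tail

k = 11  # classes x = 0, 1, ..., 10, before any pooling
p = 0   # parameters estimated from the data (assumed 0 in this sketch)
df = k - p - 1

critical_value = 23.209  # chi-square table value for df = 10, alpha = 0.01
print(df, round(chi2_upper_tail(critical_value, df), 4))
```

If classes with small expected frequencies are pooled, k shrinks and so do the degrees of freedom. With 8 classes, for instance, df = 7 and the tabulated 0.01 critical value is 18.475.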


Based on the chi-square test statistic and the critical value, what can you conclude about the distribution of the number of people in a sample of size 10 who wear seat belts?


TRY IT NOW! Technical Support

Setting Up the Chi-Square Test for Proportions

A company that sells computer software has three different locations set up to provide customers with technical support for their products. The support representatives keep a log for each call to technical support, and as part of that log, they record whether the problem was resolved successfully.


The company analysts are interested in knowing whether the percentage of calls that are successfully resolved is the same for each location. They randomly select logs from each location and collect data on the number of calls that result in a successful resolution of the problem. The data are summarized in the following table.


                            Location
Number of                1      2      3    Totals
Successful Calls       257    264    283      804
Unsuccessful Calls      43     86     97      226
Totals                 300    350    380     1030

Set up the hypotheses for the software company.

Calculate the proportion of successfully resolved calls for each location. Based solely on these numbers, do you think that the proportion of successfully resolved calls for all three locations is the same?
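The per-location proportions can be computed directly from the table of successful calls and totals, as in this short Python sketch (variable names are illustrative).

```python
successful = [257, 264, 283]  # successful calls at locations 1, 2, 3
totals = [300, 350, 380]      # total calls sampled at each location

# Sample proportion of successfully resolved calls per location.
proportions = [s / n for s, n in zip(successful, totals)]

for loc, prop in enumerate(proportions, start=1):
    print(f"Location {loc}: {prop:.3f}")
```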


TRY IT NOW! Technical Support

Calculating the Expected Frequencies

The data for the computer software company interested in its technical support locations are

                            Location
Number of                1      2      3    Totals
Successful Calls       257    264    283      804
Unsuccessful Calls      43     86     97      226
Totals                 300    350    380     1030


Calculate the overall percentage of calls that are resolved successfully, assuming that the three locations are the same.

Use the overall proportion of successful calls to find the expected frequency of successful calls for each location.
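A minimal Python sketch of these two steps follows. It pools all three locations to get the overall proportion, then scales each location's total by that proportion. The variable names are illustrative.

```python
successful = [257, 264, 283]  # successful calls at locations 1, 2, 3
totals = [300, 350, 380]      # total calls sampled at each location

# Overall proportion of successful calls if all locations are the same.
overall = sum(successful) / sum(totals)  # 804 / 1030

# Expected successful and unsuccessful calls for each location.
expected_success = [n * overall for n in totals]
expected_failure = [n - e for n, e in zip(totals, expected_success)]

print(round(overall, 4))
print([round(e, 2) for e in expected_success])
print([round(e, 2) for e in expected_failure])
```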


                 Non-Binge   Infrequent   Frequent      All

Never                77          39          31         147
                   58.25       40.22       48.54      147.0

Once or More          7          19          39          65
                   25.75       17.78       21.46       65.0

All                  84          58          70         212
                   84.00       58.00       70.00      212.0

Chi-Square = 40.484, DF = 2, P-Value = 0.00

Cell Contents -- Count
                 Exp Freq

Figure 15.2 Minitab Output from Chi-Square Analysis
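The expected frequencies and the chi-square value in this output can be reproduced with a few lines of Python. This is a sketch of the usual row total times column total over grand total calculation, not a description of how Minitab computes it internally.

```python
# Observed counts from the Minitab output (columns: Non-Binge, Infrequent, Frequent).
observed = [
    [77, 39, 31],  # Never
    [7, 19, 39],   # Once or More
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency for each cell: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (o - e)^2 / e over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(row_totals) - 1) * (len(col_totals) - 1)
print(round(chi_square, 3), df)
```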


TRY IT NOW! Technical Support

Performing the Chi-Square Test for Proportions

The computer software company with the different technical support locations wants to complete the test to determine whether the percentage of successfully resolved calls is the same at all three locations. It wants to test at the 0.01 level of significance.


Calculate the value of the chi-square test statistic and complete the test.

Is the proportion of successfully resolved calls the same at each location?
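One way to carry out this test is sketched below. It takes the degrees of freedom as (rows - 1)(columns - 1) = 2, which matches the DF = 2 reported for the 2-by-3 table in the Minitab output. For 2 degrees of freedom the chi-square upper-tail area is exp(-x/2), so the critical value can be computed without a table.

```python
import math

# Observed counts for locations 1, 2, 3.
observed = [
    [257, 264, 283],  # successful calls
    [43, 86, 97],     # unsuccessful calls
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected counts under equal proportions at all locations.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

alpha = 0.01
# With df = 2, P(X > x) = exp(-x / 2), so the critical value is -2 ln(alpha).
critical_value = -2 * math.log(alpha)
p_value = math.exp(-chi_square / 2)
reject = chi_square > critical_value

print(round(chi_square, 2), round(critical_value, 3), reject)
```

In this sketch the statistic exceeds the critical value, so the hypothesis of equal proportions would be rejected at the 0.01 level.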


TRY IT NOW! Drinking Survey

Setting Up the Contingency Table for a Test for Independence

The Public Health student who did the study on drinking also collected data on the number of times that each student drove while intoxicated in the last two weeks (coded). The contingency table for the usable responses is given below.


              Number of Times Drive While Intoxicated
Class         Not at All     Once     Twice or More     Total
Freshman          72           5            9             86
Sophomore         19           8            9             36
Junior            16           8            6             30
Senior             8           4            7             19
Total            115          25           31            171


The university is interested in knowing if the number of times a student drove while intoxicated is related to his or her class in school. It feels that this information will help target student audiences for programs on drinking and driving.


Set up the hypotheses that the university should test.

Calculate the expected frequencies for each cell and put them in the appropriate location in the table.

Are any of the expected frequencies less than 5? If so, can you suggest a logical way to combine categories to avoid this problem?
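The expected frequencies, and the cells that fall below 5, can be found with the following Python sketch. The list names are illustrative, and the threshold of 5 comes from the question just posed.

```python
classes = ["Freshman", "Sophomore", "Junior", "Senior"]
levels = ["Not at All", "Once", "Twice or More"]

# Observed counts: one row per class, one column per level.
observed = [
    [72, 5, 9],
    [19, 8, 9],
    [16, 8, 6],
    [8, 4, 7],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency under independence: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Collect the cells whose expected frequency is less than 5.
small_cells = [
    (classes[i], levels[j], round(expected[i][j], 2))
    for i in range(len(classes))
    for j in range(len(levels))
    if expected[i][j] < 5
]
print(small_cells)
```

All of the small cells fall in the Junior and Senior rows, so merging those two rows into a single upperclass category is one possible answer to the question. Other groupings could also be defended.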


TRY IT NOW! Drinking Survey

Performing the Chi-Square Test for Independence

The university that is looking at the relationship between class year and drinking and driving wants to perform the test at the 0.05 level of significance.

Calculate the value of the chi-square statistic for the data.

Find the critical value and perform the test.

Are class and drinking and driving independent?
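A sketch of the full test on the original 4-by-3 table follows, with no categories combined. It takes the degrees of freedom as (4 - 1)(3 - 1) = 6 and uses the tabulated 0.05 critical value for 6 degrees of freedom. The p-value line uses the closed-form upper-tail area that holds for exactly 6 degrees of freedom.

```python
import math

# Observed counts: rows Freshman..Senior, columns Not at All, Once, Twice or More.
observed = [
    [72, 5, 9],
    [19, 8, 9],
    [16, 8, 6],
    [8, 4, 7],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(row_totals) - 1) * (len(col_totals) - 1)
critical_value = 12.592  # chi-square table value for df = 6, alpha = 0.05

# Upper-tail area for df = 6: exp(-z) * (1 + z + z^2 / 2) with z = x / 2.
z = chi_square / 2
p_value = math.exp(-z) * (1 + z + z ** 2 / 2)

print(round(chi_square, 2), df, chi_square > critical_value, round(p_value, 4))
```

The statistic is well above the critical value, which points toward rejecting independence. Because some expected frequencies in this table are below 5, the result should be read with caution or recomputed after combining categories.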


The Chi-Square Test for Independence in Excel - Creating the Contingency Table

1. From the Data menu, select Pivot Table Report and follow the steps of the pivot table wizard to create the table.

2. At step 4 of 4, after you indicate where you want the table to be placed, do not select Finish. Instead, click Options.

3. Make sure that the checkbox next to For empty cells, show is checked and in the textbox next to it, type “0”. (Excel will not accept empty cells in the contingency table.)

4. Click OK and then Finish.


Figure 15.4 Pivot Table Options Dialog Box


Figure 15.5 Contingency Table for Binge Drinking Data


The Chi-Square Test for Independence in Kadd

Select Hypothesis Testing > Chi Square Test from the Kadd menu.

The dialog box opens.


The Chi-Square Test for Independence Dialog Box


The Chi-Square Test for Independence

Complete the Dialog Box:

1. The input range should be the location of the contingency table for the data. Make sure you do not select the column and row that contain the totals.

2. Indicate where you want the output located.

3. Click OK.


The Chi-Square Test for Independence

Chi-square test statistic = 15.3884
p-value = 0.0174
Number of rows = 4
Number of columns = 3

Actual frequencies
                                          Variable B
Variable A                       Non-binge   Infrequent   Frequent   Totals
Residence Hall or Dormitory          35          25          46        106
Fraternity or Sorority                0           1           0          1
Other University Housing              0           2           1          3
Off Campus House or Apartment        49          30          24        103
Totals                               84          58          71        213

Expected frequencies (table not recoverable from the slide)

Chi-square calculations
                                          Variable B
Variable A                       Non-binge   Infrequent   Frequent
Residence Hall or Dormitory        1.1071      0.5172      3.2201
Fraternity or Sorority             0.3944      1.9447      0.3333
Other University Housing           1.1831      1.7135      0.0000
Off Campus House or Apartment      1.7289      0.1360      3.1100
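The figures in this output can be checked with the Python sketch below. It recomputes the expected frequencies from the actual frequencies, then the per-cell chi-square contributions, the test statistic, and the p-value. It assumes 6 degrees of freedom for the 4-by-3 table and uses the closed-form upper-tail area for that case. This is a check on the arithmetic, not a description of how Kadd works internally.

```python
import math

# Actual frequencies (columns: Non-binge, Infrequent, Frequent).
observed = [
    [35, 25, 46],  # Residence Hall or Dormitory
    [0, 1, 0],     # Fraternity or Sorority
    [0, 2, 1],     # Other University Housing
    [49, 30, 24],  # Off Campus House or Apartment
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Per-cell contributions (o - e)^2 / e, as in "Chi-square calculations".
contributions = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]
chi_square = sum(sum(row) for row in contributions)

# Upper-tail area for df = 6: exp(-z) * (1 + z + z^2 / 2) with z = x / 2.
z = chi_square / 2
p_value = math.exp(-z) * (1 + z + z ** 2 / 2)

print(round(chi_square, 4), round(p_value, 4))
```

Several expected frequencies in this table are far below 5 (the Fraternity or Sorority row has only one observation), so the chapter's earlier question about combining categories applies here as well.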


Chapter 15 Summary

In this chapter you have learned:

The chi-square test involves comparing observed and expected frequencies for different classes of data.

The chi-square test is quite versatile and can be used to test:


Goodness of Fit

Equality of Proportions for more than 2 Populations

Independence of Qualitative Variables

The results of a chi-square test do not solve a problem, but simply point out when further action is indicated and when it is not.