Chapter 12 Inferences on Categorical Data and Qualitative Data

Categorical Data:

One variable multiple outcomes (one row; multiple columns): stat - goodness of fit - chi square

Two variables multiple outcomes (two or more rows; multiple columns):

stat – tables – contingency

Quantitative Data:

3 or more means: Stat --> ANOVA --> One Way --> Select all columns to compare --> Compute

2 variables with 3 or more means: Stat --> ANOVA --> Two Way --> select response, row factor, column factor --> options: plot interactions, display means table, compute Tukey HSD --> Compute

(do not select ‘fit additive model’)

Ch 12.1 Goodness-of-Fit Test

For categorical data (one variable) with more than two outcomes

Objective A : Goodness-of-Fit Test

Note the conditions above. The chi-square statistic will be 0 when all observed counts are the same as their expected counts.

Example 1: The probabilities of getting an A, B, C or D for a science class at a particular

university are shown below. Determine the expected counts for each outcome if the sample

size is 700.

Grade                   A      B      C      D
Probability p_i         0.15   0.30   0.35   0.20
Expected count E_i      105    210    245    140

Note: one variable: grade

$E_i = n p_i$

$E_1 = 700 \times 0.15 = 105$

$E_2 = 700 \times 0.30 = 210$

$E_3 = 700 \times 0.35 = 245$

$E_4 = 700 \times 0.20 = 140$

Note: $\sum p_i = 0.15 + 0.30 + 0.35 + 0.20 = 1$

Check: 105 + 210 + 245 + 140 = 700

You will have to compute expected counts by hand for these hypothesis tests.
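The hand calculation in Example 1 can be double-checked with a few lines of Python. This is only a sketch; the variable names `probabilities` and `expected_counts` are illustrative and have nothing to do with StatCrunch.

```python
# Expected counts for a goodness-of-fit test: E_i = n * p_i
n = 700
probabilities = {"A": 0.15, "B": 0.30, "C": 0.35, "D": 0.20}

# The probabilities must sum to 1 (allowing for floating-point rounding).
assert abs(sum(probabilities.values()) - 1.0) < 1e-9

expected_counts = {grade: n * p for grade, p in probabilities.items()}

for grade, count in expected_counts.items():
    print(f"E({grade}) = {count:.1f}")

# The expected counts must add back up to the sample size.
print(f"Total = {sum(expected_counts.values()):.1f}")
```

The two checks mirror the notes: the probabilities sum to 1 and the expected counts sum to n.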

Example 2: (we will actually work this problem as in Example 3)

Suppose you toss a special rigged coin four times and count the number of heads. The count is binomial, since each flip has only two possible outcomes: heads or tails. There are five possible outcomes: 0H, 1H, 2H, 3H, or 4H. The seller of this special coin claims a 0.3 probability of landing on heads on each flip. The buyer (who is a statistician) is not sure whether he believes the seller. Therefore, he conducts his own experiment.

How were $E_1, E_2, \ldots$ calculated?

P(0 Heads) = P(T) and P(T) and P(T) and P(T) = (0.7)(0.7)(0.7)(0.7) = 0.2401

n = 240.1 + 411.6 + 264.6 + 75.6 + 8.1 = 1000 (This process was repeated 1000 times.)

$E_1 = p_1 n = 0.2401 \times 1000 = 240.1$ (about 240 of the 1000 repetitions are expected to result in zero heads)

We can get the rest from Statcrunch:

Stat --> Calculator --> Binomial --> Input the following --> Compute
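As an alternative to the StatCrunch binomial calculator, the five binomial probabilities and expected counts can be sketched in Python. The function name `binomial_pmf` is an illustrative helper, not a StatCrunch feature.

```python
from math import comb

n_flips = 4       # flips per repetition
p_heads = 0.3     # seller's claimed probability of heads
n_trials = 1000   # number of times the four-flip experiment was repeated

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Expected count for each outcome: E = (number of repetitions) * P(X = k)
expected = [n_trials * binomial_pmf(k, n_flips, p_heads) for k in range(n_flips + 1)]

for k, e in enumerate(expected):
    print(f"{k} heads: p = {binomial_pmf(k, n_flips, p_heads):.4f}, E = {e:.1f}")
```

The printed expected counts should match the ones used in the test statistic: 240.1, 411.6, 264.6, 75.6 and 8.1.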

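(a) Determine the $\chi^2$ test statistic.

$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{(260 - 240.1)^2}{240.1} + \frac{(400 - 411.6)^2}{411.6} + \frac{(280 - 264.6)^2}{264.6} + \frac{(50 - 75.6)^2}{75.6} + \frac{(10 - 8.1)^2}{8.1} = 11.987$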
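The sum above is easy to mistype on a calculator, so here is a small Python sketch of the same computation. The helper name `chi_square_statistic` is an assumption made for illustration.

```python
observed = [260, 400, 280, 50, 10]            # observed counts for 0H, 1H, 2H, 3H, 4H
expected = [240.1, 411.6, 264.6, 75.6, 8.1]   # expected counts under n = 4, p = 0.3

def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

chi2 = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1  # k - 1

print(f"chi-square = {chi2:.3f}, DF = {degrees_of_freedom}")
```

When every observed count equals its expected count, the function returns 0, which is the perfect-fit case mentioned at the start of this section.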

(b) Determine the degrees of freedom.

DF = k - 1 = 5 - 1 = 4

(k = 5 since there are 5 possible outcomes)

(c) Use StatCrunch to determine the P value.

Stat --> Calculator --> Chi-Square --> Standard -->

The hypothesis tests of this section (categorical data) are always right-tailed;

Input the following.

P value = 0.0174 which is unusual and 0.0174 < 0.05 so reject the null hypothesis Ho.
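The right-tail area can also be checked without StatCrunch. The sketch below relies on the fact that, for exactly 4 degrees of freedom, the chi-square right-tail area has the closed form exp(-x/2)(1 + x/2); it does not work for other degrees of freedom. The function name is hypothetical.

```python
from math import exp

def chi_square_right_tail_df4(x):
    """P(X >= x) for a chi-square variable with exactly 4 degrees of freedom.

    For DF = 4 the right-tail area has the closed form exp(-x/2) * (1 + x/2).
    """
    return exp(-x / 2) * (1 + x / 2)

p_value = chi_square_right_tail_df4(11.987)
print(f"P-value = {p_value:.4f}")
```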

(d) Conclusion

There is sufficient evidence to support the claim that the random variable X is not binomial with

n = 4, p = 0.3. Therefore, the statistician should not buy the coin.

Example 3 (for hw): Use StatCrunch to perform the hypothesis testing of Example 2

at the α = 0.01 level of significance.

(a) Setup

Ho: The random variable X is binomial with n = 4, p = 0.3.

H1: The random variable X is not binomial with n=4, p=0.3.

(b) P value

Input Observed Counts in column 1 and Expected Counts in column 2 --> Stat --> Goodness-of-

fit --> Chi-Square test --> Select Var1 for Observed and Var2 for Expected--> Compute

P value = 0.0174 (same as before)

This is low, but the significance level is now 0.01 and 0.0174 is not less than 0.01.

Therefore, the result is not unusual at the 0.01 significance level. We cannot reject the null hypothesis.

(c) Conclusion

There is not sufficient evidence to support the claim that the random variable X is not binomial with n = 4, p = 0.3. The statistician might want to buy the coin.

The significance level makes a difference!!!
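The decision rule used in Examples 2 and 3 can be written as a one-line comparison. This is a minimal sketch; the function name `decide` is made up for illustration.

```python
def decide(p_value, alpha):
    """Return the hypothesis-test decision for a given significance level."""
    return "reject Ho" if p_value < alpha else "do not reject Ho"

p_value = 0.0174  # P-value from Examples 2 and 3

for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: {decide(p_value, alpha)}")
```

The same P-value leads to opposite decisions at the two significance levels, which is the point of comparing Example 2 with Example 3.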

Example 4:

Note: one variable: color of M&M

Total = 53+66+38+96+88+59 = 400

a. Set up

Ho: p(brown) = 0.12, p(yellow) = 0.15, p(red) = 0.12, p(blue) = 0.23,

p(orange) = 0.23, p(green) = 0.15

H1: At least one of the proportions differs from the proportion claimed by the manufacturer.

Expected values by hand (what we would expect if what the company claims is

true):

E(brown) = 0.12*400 = 48, E(yellow) = 0.15*400 = 60, E(red) = 0.12*400 = 48,

E(blue) = 0.23*400 = 92, E(orange) = 0.23*400 = 92, E(green) = 0.15*400 = 60.

X           Brown   Yellow   Red   Blue   Orange   Green
Observed    53      66       38    96     88       59
Expected    48      60       48    92     92       60

b. p-value (from Statcrunch)

Input Observed Counts in column 1 and Expected Counts in column 2 (these are computed by

hand) --> Stat --> Goodness-of-fit --> Chi-Square test --> Select Var1 for Observed and select

Var2 for Expected --> Compute

P-value (0.613) which is not unusual and 0.613 is not less than 0.05. Cannot reject the null

hypothesis.

c. Conclusion:

There is not sufficient evidence to support the claim that the manufacturer's reported proportions are incorrect. Thus, the data are consistent with the proportions the company reports.
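For readers who want to reproduce the whole M&M test outside StatCrunch, the following Python sketch computes the expected counts, the test statistic, and the right-tail P-value using only the standard library. The helper `chi_square_right_tail` is an illustrative implementation based on a standard recurrence for the chi-square tail area; it is not how StatCrunch is documented to compute the value.

```python
from math import erfc, exp, gamma, sqrt

def chi_square_right_tail(x, df):
    """P(X >= x) for a chi-square variable with df degrees of freedom.

    Starts from a known base case (DF = 1 or DF = 2) and applies the
    recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / gamma(a + 1),
    where a = DF / 2 and y = x / 2.
    """
    y = x / 2
    if df % 2 == 0:
        a, q = 1.0, exp(-y)          # base case DF = 2
    else:
        a, q = 0.5, erfc(sqrt(y))    # base case DF = 1
    while a < df / 2:
        q += y**a * exp(-y) / gamma(a + 1)
        a += 1.0
    return q

colors = ["Brown", "Yellow", "Red", "Blue", "Orange", "Green"]
observed = [53, 66, 38, 96, 88, 59]
claimed = [0.12, 0.15, 0.12, 0.23, 0.23, 0.15]

n = sum(observed)
expected = [n * p for p in claimed]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1
p_value = chi_square_right_tail(chi2, df)

print(f"n = {n}, chi-square = {chi2:.3f}, DF = {df}, P-value = {p_value:.3f}")
```

The P-value should agree with the StatCrunch result of 0.613 to three decimal places.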

Ch 12.2 Tests for Independence

For categorical data that is summarized in contingency tables with at least two

variables and multiple outcomes

Objective A :Tests for Independence

Example 1 (we will follow example 2):

Note: The two variables are gender and life choice

(a) Compute the expected values of each cell under the assumption of

independence.

          Pro life        Pro choice   Total
Men       196 (actual)    199          395
Women     239             249          488
Total     435             448          883

Assuming an individual's opinion and gender are independent,

$E = \frac{(\text{row total})(\text{column total})}{\text{grand total}}$

$E_{11} = \frac{(395)(435)}{883} \approx 194.592$ (expected)

$E_{12} = \frac{(395)(448)}{883} \approx 200.408$

$E_{21} = \frac{(488)(435)}{883} \approx 240.408$

$E_{22} = \frac{(488)(448)}{883} \approx 247.592$

Summarize the observed counts and expected counts in a table where the expected counts are

expressed in a parenthesis. Note: Statcrunch can compute expected values for this section.

Gender          Pro Life        Pro Choice      Row Total
Men             196 (194.592)   199 (200.408)   395
Women           239 (240.408)   249 (247.592)   488
Column Total    435             448             883 (Grand Total)
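The expected counts in the table can be reproduced with a short Python sketch that applies (row total)(column total)/(grand total) to every cell and then checks the two requirements listed in part (b). The dictionary layout and variable names are illustrative choices only.

```python
observed = {
    "Men":   {"Pro Life": 196, "Pro Choice": 199},
    "Women": {"Pro Life": 239, "Pro Choice": 249},
}

row_totals = {row: sum(cells.values()) for row, cells in observed.items()}
column_totals = {}
for cells in observed.values():
    for column, count in cells.items():
        column_totals[column] = column_totals.get(column, 0) + count
grand_total = sum(row_totals.values())

# E = (row total)(column total) / (grand total)
expected = {
    row: {col: row_totals[row] * column_totals[col] / grand_total for col in cells}
    for row, cells in observed.items()
}

for row, cells in expected.items():
    print(row, {col: round(e, 3) for col, e in cells.items()})

# Requirements: every expected count >= 1, and at most 20% of them below 5.
all_expected = [e for cells in expected.values() for e in cells.values()]
meets_requirements = (
    min(all_expected) >= 1
    and sum(e < 5 for e in all_expected) / len(all_expected) <= 0.20
)
print("Requirements satisfied:", meets_requirements)
```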

(b) Verify that the requirements for performing a chi-square test of

independence are satisfied.

(1) All expected frequencies are greater than or equal to 1: true.

(2) No more than 20% of the expected frequencies are less than 5: true.

(c) Determine the $\chi^2$ test statistic.

$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{(196 - 194.592)^2}{194.592} + \frac{(199 - 200.408)^2}{200.408} + \frac{(239 - 240.408)^2}{240.408} + \frac{(249 - 247.592)^2}{247.592} \approx 0.0363$

(d) Determine the degrees of freedom.

DF = (r - 1)(c - 1) = (2 - 1)(2 - 1) = 1

(e) Test whether an individual's opinion regarding abortion is independent of gender at the α = 0.10 level of significance.

Set up:

Ho: An individual's opinion regarding abortion is independent of gender.

H1: An individual's opinion regarding abortion is dependent on gender.

Or

Ho: There is no association between gender and opinion on abortion.

H1: There is an association between gender and opinion on abortion.

P value: from StatCrunch:

Stat --> Calculator --> Chi-Square --> Standard -->

The hypothesis tests of this section are always right-tailed;

Input the following.

P-value = 0.849, which is not unusual, and 0.849 is not less than the significance level of 0.10. Cannot reject the null hypothesis.
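The test statistic, degrees of freedom, and P-value for this 2 x 2 table can be checked with the Python sketch below. It uses unrounded expected counts, so small differences from a hand calculation with rounded intermediate values are to be expected. The shortcut erfc(sqrt(chi2 / 2)) for the right-tail area is valid only when DF = 1.

```python
from math import erfc, sqrt

observed = [[196, 199],   # Men:   Pro Life, Pro Choice
            [239, 249]]   # Women: Pro Life, Pro Choice

row_totals = [sum(row) for row in observed]
column_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * column_totals[j] / grand_total  # expected count
        chi2 += (o - e) ** 2 / e

df = (len(row_totals) - 1) * (len(column_totals) - 1)

# For DF = 1 only, the right-tail area equals erfc(sqrt(chi2 / 2)).
p_value = erfc(sqrt(chi2 / 2))

print(f"chi-square = {chi2:.4f}, DF = {df}, P-value = {p_value:.4f}")
```

Because the P-value is far above 0.10, the decision does not depend on the last decimal places.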

Conclusion:

There is not enough evidence to support the claim that abortion opinion and gender are

dependent.

Or

There is not enough evidence to show that there is an association between gender and position on abortion.

Example 2 (for homework): Use StatCrunch to redo example 1 of testing whether an individual's

opinion regarding abortion is independent of gender at the α = 0.10 level of significance.

(a) Setup

Ho: An individual's opinion regarding abortion is independent of gender.

H1: An individual's opinion regarding abortion is dependent on gender.

Or

Ho: There is no association between gender and opinion on abortion.

H1: There is an association between gender and opinion on abortion.

(b) P value from Statcrunch

Input the data as shown.

Stat --> Tables --> Contingency --> With Summary --> Under Select Column(s),

click Pro Life and Ctrl click Pro choice --> Under Row Labels, select Gender -->

Under Display, select Expected count --> Compute

StatCrunch Results:

(c) Conclusion

P-value = 0.8489, essentially the same as in Example 1, so the conclusion is unchanged: we cannot reject the null hypothesis.

Example 3:

Note: two variables- wantedness and prenatal care

Test whether prenatal care and the wantedness of pregnancy are independent

at the α = 0.05 level of significance.

Note: df = (r-1)(c-1) = (3-1)(3-1) = 4

(a) Setup

Ho: Prenatal care and ‘wantedness’ of pregnancy are independent.

H1: Prenatal care and ‘wantedness’ of pregnancy are dependent.

Or

Ho: Prenatal care and ‘wantedness’ of pregnancy are not associated.

H1: Prenatal care and ‘wantedness’ of pregnancy are associated.

(b) P value

Input the data into Statcrunch (see below)

Stat --> Tables --> Contingency --> With Summary --> Under Select Column(s),

click Less than 3 months, 3 to 5 Months, and More Than 5 Months (use the

Ctrl) --> Under Row Labels, select Wantedness of Pregnancy --> Under

Display, select Expected count --> Compute

StatCrunch Results:

P-value (0.0003) which is unusual and 0.0003 is less than 0.05. Reject the null hypothesis.

(c) Conclusion

There is enough evidence to show that prenatal care and ‘wantedness’ of pregnancy are

dependent (associated).

Homework 12.2 #2 part b ‘conditional distribution’: edit from part 1 and select ‘row percent’

and enter answer as a decimal. Then part c: choose the graph based on these values.

12.2 #6 stat-tables-contingency-with data

Row- political aff.

Column-willing to pay higher taxes

Display-expected count

Compute

Careful: table gives no-yes but answer in MML is yes-no

Ch 12.3 Comparing Three or More Means (Quantitative data)

(Supplemental Materials)

- One-Way Analysis of Variance, ANOVA

Review:

The chi-square goodness-of-fit tests are always right-tailed because the numerator in the test statistic is squared, making every test statistic, other than a perfect fit, positive.

Objective A :One-Way ANOVA Test

Condition: $s_{\text{largest}} < 2 s_{\text{smallest}}$ (the largest sample standard deviation has to be less than twice the smallest)

Note: populations must be normally distributed (regardless of sample size, i.e., the "greater than 30" requirement does not apply here).

To perform a one-way ANOVA, the populations must have the same variance. However, the variance/SD

of the samples will vary.

There is small variability ‘within’ each sample compared to the variability between the means (notice

there is no overlap of the intervals).

Sample means (the same in both plots): $\bar{x}_1$ (blue) = 5, $\bar{x}_2$ (red) = 8, $\bar{x}_3$ (green) = 11

Here there is large variability ‘within’ each sample compared to the variability between the means. In

this case it is harder to determine if there is a significant difference between the means.

Notice that the F test statistic is the ratio of the ‘between’ variability to the ‘within’ variability.

The variability among the sample means is called between-sample variability and the variability of each

sample is the within-sample variability.
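To see how within-sample variability changes the ratio, the sketch below computes the one-way ANOVA F statistic (between-sample mean square divided by within-sample mean square) for two HYPOTHETICAL data sets. Both have sample means of 5, 8 and 11, as in the plots described above, but the data values themselves are made up for illustration.

```python
def one_way_f_statistic(groups):
    """Return F = (between-sample mean square) / (within-sample mean square)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total

    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# HYPOTHETICAL data: both sets have sample means 5, 8 and 11.
small_within = [[4, 5, 6], [7, 8, 9], [10, 11, 12]]
large_within = [[1, 5, 9], [4, 8, 12], [7, 11, 15]]

print("F (small within-sample variability):", one_way_f_statistic(small_within))
print("F (large within-sample variability):", one_way_f_statistic(large_within))
```

With identical sample means, the data set with less spread inside each group produces the much larger F, which is why it is easier to detect a difference among the means in that case.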

Example 1:

(a) Setup:

Ho: The mean reaction times are the same for all three groups.

H1: At least one group’s mean reaction time is different.

or

Ho: 𝜇1 = 𝜇2 = 𝜇3

H1: 𝜇1 ≠ 𝜇2 or 𝜇2 ≠ 𝜇3 or 𝜇1 ≠ 𝜇3

(b)

1. There are 3 simple random samples.

2. The 3 samples are independent of each other.

3. Normal probability plots indicate that the sample data come from a normal population.

4. If the largest sample standard deviation is no more than twice the smallest sample

standard deviation, we can assume the populations have the same variance.

We can use Statcrunch to find the standard deviations for each group. Input the data into the first three

columns as shown. For homework, see below.

Stat --> Summary Stats --> Columns --> Input the following --> Compute

Verify: largest SD < 2 × smallest SD

0.1100 < 2(0.0638)

0.1100 < 0.1276: true

We can assume the populations have the same variance.
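The rule-of-thumb check above amounts to one comparison. The following is a minimal Python sketch; the function name `equal_variance_rule` is hypothetical, and the second call uses made-up standard deviations to show a failing case.

```python
def equal_variance_rule(standard_deviations):
    """Rule of thumb: the largest sample SD must be less than twice the smallest."""
    return max(standard_deviations) < 2 * min(standard_deviations)

# Largest and smallest sample SDs from the reaction-time example.
print(equal_variance_rule([0.1100, 0.0638]))

# HYPOTHETICAL standard deviations that violate the rule (1.0 is not < 2 * 0.4).
print(equal_variance_rule([1.0, 0.7, 0.4]))
```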

(c)

P value from StatCrunch (use this for homework):

Stat --> ANOVA --> One Way --> Select all three columns to compare (Simple, Go/No Go, Choice) --> Compute

This allows you to get the SDs and the hypothesis test results.

P-value = 0.0681 which is not unusual and 0.0681 is not less than 0.05. Do not reject the null

hypothesis.

Conclusion: There is not sufficient evidence to support the claim that at least one mean reaction

time is different from the others.

(d) Draw boxplots of the three stimuli.

Graph --> Boxplot --> Select all three columns --> Check draw boxes horizontally and select the box to show the means --> Compute

d) Compute the 95 % CI for each group.

T stats – one sample – with data –select all columns – CI - under optional graphs and tables: confidence

interval plot

Simple (0.352, 0.486)   Go/No Go (0.354, 0.586)   Choice (0.455, 0.647)   Diagram of the CI’s:

By hand:

e) Does the output for the 95% CI support the conclusion from the hypothesis test?

All intervals overlap. Therefore, the intervals show no evidence of a difference among the three groups. This is the same conclusion as before.

f) Test the hypothesis at a 0.10 significance level.

In this case, the p value of 0.0681 < 0.10 so it is unusual at this significance level. Therefore, reject the

null.

Conclusion: There is enough evidence to support the claim that there is a difference in reaction time in at least one group. (However, we don’t necessarily know which group was the different one.)

g) Find the corresponding 90% CI and show how it supports the conclusion above.

T stats – one sample – with data :

Simple (0.367, 0.472)   Go/No Go (0.380, 0.561)   Choice (0.475, 0.626)   Diagram of the CI’s by hand:

Since the intervals for group 1 (Simple) and group 3 (Choice) do not overlap, there is a difference among the three groups. This is the same conclusion as above. Furthermore, we know that the two groups that were significantly different were groups 1 and 3.
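The overlap comparison can be automated with a small Python sketch that lists every pair of groups whose confidence intervals do not overlap. The interval endpoints are the ones reported above; the function name `non_overlapping_pairs` is illustrative. Comparing intervals this way is the informal check used in these notes, not a replacement for the ANOVA test itself.

```python
from itertools import combinations

def non_overlapping_pairs(intervals):
    """Return the pairs of group names whose confidence intervals do not overlap."""
    pairs = []
    for (name_a, (lo_a, hi_a)), (name_b, (lo_b, hi_b)) in combinations(intervals.items(), 2):
        if hi_a < lo_b or hi_b < lo_a:
            pairs.append((name_a, name_b))
    return pairs

ci_95 = {"Simple": (0.352, 0.486), "Go/No Go": (0.354, 0.586), "Choice": (0.455, 0.647)}
ci_90 = {"Simple": (0.367, 0.472), "Go/No Go": (0.380, 0.561), "Choice": (0.475, 0.626)}

print("95%:", non_overlapping_pairs(ci_95))
print("90%:", non_overlapping_pairs(ci_90))
```

At the 95% level the list is empty, while at the 90% level the only non-overlapping pair is Simple and Choice.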

Ch 12.4 Two - Way Analysis of Variance (Supplemental Materials)

Objective A : Two - Way ANOVA Test

An example of interaction effect is sleeping pills and alcohol. They are usually not fatal when

taken alone, but can be fatal when combined.

Example 1:

(a) Verify that the largest sample standard deviation is no more than twice the smallest standard

deviation. If this is true, we can assume the populations have the same variance.

Input the data in StatCrunch. Note this is a 2 x 3 factorial design since there are two rows and 3

columns in the table. Each cell has three replications.

Stat --> Summary Stats --> Columns --> Input the following, then click Compute

The largest SD is 2.6457513 and the smallest SD is 1.5275252.

Verify: largest SD < 2 × smallest SD

2.64 < 2(1.53)

2.64 < 3.06: true

We can assume the populations have the same variance.

(b) Test whether there is an interaction effect between the drug dosage and age.

Before we can test whether the drug increased the HDL (good) cholesterol levels among (a) different age groups and (b) different drug dosages, we first need to see if there is an ‘interaction’ between age and dosage.

Inputting the data is a bit tricky for a two-way ANOVA analysis.

Use cut and paste or type the data in to produce three columns.

Response (HDL), Row Factor (Age), Col Factor (Dosage) see below.
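The three-column layout can be hard to picture, so here is a Python sketch that converts a 2 x 3 table with three replications per cell into one row per observation. All labels and numbers below are HYPOTHETICAL placeholders; they are not the data from this example.

```python
# HYPOTHETICAL 2 x 3 table: two age groups, three dosages, three replications per cell.
# The labels and numbers are made up for illustration; they are not the example's data.
wide_table = {
    ("Age group 1", "Dose 1"): [2, 4, 3],
    ("Age group 1", "Dose 2"): [5, 6, 7],
    ("Age group 1", "Dose 3"): [8, 9, 10],
    ("Age group 2", "Dose 1"): [1, 3, 2],
    ("Age group 2", "Dose 2"): [4, 6, 5],
    ("Age group 2", "Dose 3"): [7, 9, 8],
}

# Long format: one row per observation -> (response, row factor, column factor).
long_rows = [
    (value, age, dose)
    for (age, dose), values in wide_table.items()
    for value in values
]

print("HDL", "Age", "Dosage", sep="\t")
for hdl, age, dose in long_rows:
    print(hdl, age, dose, sep="\t")

# Cell means are the values an interaction plot displays.
cell_means = {cell: sum(values) / len(values) for cell, values in wide_table.items()}
```

A 2 x 3 design with three replications yields 2 × 3 × 3 = 18 rows, each carrying its own age and dosage label.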

Stat --> ANOVA --> Two Way --> select response, row factor, column factor --> options: plot interactions, display means table, compute Tukey HSD --> Compute (do not select ‘fit additive model’)

Click on ‘>’ for the second and third page of the StatCrunch outputs:

The above graphs are called the interaction plots. We look at the level of parallelism among the lines.

Since the lines are roughly parallel, we conclude there is no interaction between age and drug dosages.

Therefore, we can proceed with the actual hypothesis test:

(c) If there is no interaction between age and drug dosages, determine whether there is sufficient evidence to conclude that the mean increase in HDL cholesterol is different (i) among each drug dosage group and (ii) for each age group. (Choose a significance level of 0.05)

Among each drug dosage group:

(i) the mean increase in HDL cholesterol among each drug dosage group is different

Returning to the first page in StatCrunch, the P-value for dosage is < 0.0001, which is unusual since it is less than 0.05. Reject Ho. There is sufficient evidence to support the claim that the mean increase in HDL cholesterol changes with dosage.

Click > to the third page of the StatCrunch outputs:

Note: The above graph indicates the mean of HDL increases as the drug dosage increases (for both age groups). Also, note that the difference between the two age groups is not as large (wide).

Among each age group:

(ii) the mean increase in HDL cholesterol among each age group is different

P-value for age is 0.0838, which is not unusual since 0.0838 is not less than 0.05. Do not reject Ho. There is not sufficient evidence to support the claim that the mean increase in HDL cholesterol differs among age groups.

12.4#5

Since P = 0.829, the results are not unusual. There is no evidence of interaction. If there is evidence of interaction, then a hypothesis test cannot be conducted to see if there is a difference in means from factor A or factor B.

Note: MS = mean square

12.4#7 To determine if there is a significant difference among levels at a 0.05 significance level, select the fourth box, “Compute Tukey HSD”, and check the p-values.

12.4#8 Do together. In order to get the p-value for interaction, do not select ‘fit additive model’.

e. Graph - QQ Plot

f. For the Tukey test, only check the first two outputs: Gender, Age
