
HUDM4122 Probability and Statistical Inference

April 27, 2015

HW10

Problem 1

• What is the critical value of t (e.g. p=0.05 for two-tailed test), for N=25?

• Correct answer: 2.06
– Let’s look at how to find this

• Common wrong answer: 1.71
– That’s a one-tailed test
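The table lookup can be checked numerically. The following is a minimal standard-library sketch, not the method used in class: the helper names `t_pdf`, `t_cdf`, and `t_critical` are made up for illustration, and it assumes that N=25 means df = N - 1 = 24.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, integrating the density on [0, x] with Simpson's rule."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(alpha, df, tails=2):
    """Value t with P(T > t) = alpha / tails, found by bisection."""
    target = 1 - alpha / tails
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

two_tailed = t_critical(0.05, df=24)            # the correct answer
one_tailed = t_critical(0.05, df=24, tails=1)   # the common wrong answer
print(f"two-tailed: {two_tailed:.2f}, one-tailed: {one_tailed:.2f}")
```

The only difference between the two calls is how much of the 5% goes into the upper tail: 2.5% for the two-tailed test, all 5% for the one-tailed test.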

Problem 3

• Chico is interested in how high stuffed animals can jump. A recent article in Stuffed Animal Quarterly suggests that the average stuffed animal can jump 4 inches, and that the standard deviation is 4 inches as well. Chico asks his 9 favorite stuffed animals to jump. He finds that they jump an average of 5 inches.

• Are the stuffed animals jumping statistically significantly higher than the average printed in Stuffed Animal Quarterly? Enter the p value. (Conduct a two-tailed t-test)

Problem 4

• Chico is interested in how high stuffed animals can jump. A recent article in Stuffed Animal Quarterly suggests that the average stuffed animal can jump 4 inches, and that the standard deviation is 4 inches as well. Chico asks his 9 favorite stuffed animals to jump. He finds that they jump an average of 5 inches.

• What is the lower bound on the 95% Confidence Interval? (Use the t distribution.)
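The arithmetic behind Problems 3 and 4 can be sketched as follows. This is an illustrative sketch rather than the official solution: the function name is made up, the stated standard deviation of 4 is used as the standard deviation in the standard error, and the critical value 2.306 (two-tailed, 5%, df = 8) is assumed to come from a standard t table.

```python
import math

def t_confidence_interval(mean, sd, n, t_crit):
    """Return (lower, upper) for mean +/- t_crit * sd / sqrt(n)."""
    margin = t_crit * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Values from Problems 3 and 4: sample mean 5, SD 4, n = 9, comparison value 4.
# 2.306 is the table value for df = 9 - 1 = 8 (an assumption about the table used).
t_stat = (5 - 4) / (4 / math.sqrt(9))
lower, upper = t_confidence_interval(mean=5, sd=4, n=9, t_crit=2.306)
print(f"t(8) = {t_stat:.2f}")
print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
```

The interval contains the published average of 4, which agrees with the small t statistic from the test in Problem 3.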

You Try It!

• Chico is interested in how high action figures can jump. A recent article in Action Figure Quarterly suggests that the average action figure can jump 2 inches, and that the standard deviation is 3 inches. Chico asks his 7 favorite action figures to jump. He finds that they jump an average of 4 inches.

• What is the 95% Confidence Interval? (Use the t distribution.)

Problem 5

8 students with a specific behavioral disorder participate in an intervention designed for their needs, and are observed afterwards. The clinical observation scale goes from 0 to 10, with any score below 3 considered evidence for appropriate behavior. The average clinical observation score in your sample is 1.2, with a standard deviation of 2 points. What is the upper bound on the 95% Confidence Interval? (Give two digits after the decimal)

Lots of negative answers; is this plausible?
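
As a rough check on plausibility, the upper bound can be worked out by hand. This sketch assumes df = 8 - 1 = 7 and takes the two-tailed 5% critical value t ≈ 2.365 from a standard t table:

$$1.2 + 2.365 * \frac{2}{\sqrt{8}} \approx 1.2 + 1.67 = 2.87$$

The upper bound sits above the sample mean of 1.2, so a negative value cannot be right. The lower bound, 1.2 - 1.67, is the one that falls below 0, which the 0 to 10 scale cannot actually produce.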

Problem 7

• You’re comparing the difference between Bob's Discount Math Curriculum and SaxonMath

• Bob's: average grade = 58, standard deviation = 7, sample size = 15
• SaxonMath: average grade = 62, standard deviation = 10.5, sample size = 20

Compute a two-tailed t-test to find out whether the difference between curricula is statistically significant. Assume pooled variance.

Problem 9

• You’re comparing the difference between Bob's Discount Math Curriculum and JuteMath.

• Bob's: average grade = 58, standard deviation = 7, sample size = 15
• JuteMath: average grade = 62, standard deviation = 16, sample size = 20

Compute a two-tailed t-test to find out whether the difference between curricula is statistically significant. Assume unpooled variance. (Give two digits after the decimal, rounded)
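The two test statistics in Problems 7 and 9 differ only in how the standard error is built. The sketch below uses made-up function names and computes the t statistics only. The degrees of freedom for the unpooled test depend on which convention the course uses, so they are deliberately left out.

```python
import math

def pooled_t(m1, s1, n1, m2, s2, n2):
    """Two-sample t statistic with pooled variance; returns (t, df)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

def unpooled_t(m1, s1, n1, m2, s2, n2):
    """Two-sample t statistic with separate variances (statistic only)."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (m1 - m2) / se

t7, df7 = pooled_t(58, 7, 15, 62, 10.5, 20)  # Problem 7: Bob's vs SaxonMath
t9 = unpooled_t(58, 7, 15, 62, 16, 20)       # Problem 9: Bob's vs JuteMath
print(f"Problem 7: t({df7}) = {t7:.2f}")
print(f"Problem 9: t = {t9:.2f}")
```

The sign is negative only because Bob's is listed first; for a two-tailed test the magnitude is what gets compared with the critical value.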

You Try It!

• You’re comparing the difference between Bob's Discount Math Curriculum and PictMath.

• Bob's: average grade = 54, standard deviation = 8, sample size = 12
• PictMath: average grade = 60, standard deviation = 12, sample size = 18

Compute a two-tailed t-test to find out whether the difference between curricula is statistically significant.

Problem 10

• You're comparing students' scores on the midterm and the final, to see if students did significantly worse on the final than the midterm. Compute a two-tailed paired t-test to answer this question. Give two digits after the decimal, rounded.

Midterm   Final
0.72      0.72
0.69      0.67
0.65      0.66
0.73      0.68
0.95      0.92
0.88      0.88
0.62      0.72
0.78      0.71

You Try It!

• You're comparing students' scores on the midterm and the final, as in Problem 10.

Midterm   Final
0.7       0.6
0.7       0.6
0.6       0.7
0.8       0.7
1.0       0.9
0.9       0.8
0.6       0.6
0.8       0.8

t(7)=1.87, p=0.10
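The paired t statistic for this data can be reproduced with a few lines. This is a minimal sketch with a made-up function name; it computes the statistic and degrees of freedom, and leaves the p value to a t table or Excel.

```python
import math

def paired_t(first, second):
    """Paired t statistic on the differences first - second; returns (t, df)."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    se = math.sqrt(var / n)
    return mean / se, n - 1

midterm = [0.7, 0.7, 0.6, 0.8, 1.0, 0.9, 0.6, 0.8]
final = [0.6, 0.6, 0.7, 0.7, 0.9, 0.8, 0.6, 0.8]
t_stat, df = paired_t(midterm, final)
print(f"t({df}) = {t_stat:.2f}")
```

The test runs on the eight differences, not on the two columns separately, which is why df is 8 - 1 = 7 rather than 14.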

Questions? Comments?

Chi-squared (χ2) distribution

So far…

• We have largely talked about
– Comparing quantitative variables

– Is a mean different than 0 (or another criterion value)
• Does a curriculum lead to learning?

– Are means different for two samples?
• Does curriculum A lead to more learning than curriculum B?

– Are means different for two variables from the same sample?
• Do individual learners do better on the pre-test than the post-test?

We have also

• Talked about comparing proportions

But what if…

• We want to compare two groups, in terms of a categorical variable?

Example

• One group of students uses Singapore Math
• Another group of students uses Bob’s Discount Math Curriculum

• The prevalence of different affective states is measured using BROMP field observations

• We compare this using a two-way table

We find

Affective State   Singapore Math   BDMC
BORED             2                5
FRUSTRATED        9                14
ENGAGED           20               12
CONFUSED          6                9
DELIGHTED         13               10

We want to know

• Is affect significantly different between Singapore Math and BDMC?

• $H_0$: There is no difference in the proportions of each affective state, between the two variables

• $H_a$: There is some difference in the proportions of each affective state, between the two variables

How can we test this?

• We compare the actual counts in the table

• To the expected counts we might expect to see for each variable, if $H_0$ were true

We can compute this

• $E_{i,j} = \frac{\text{row total}_i * \text{column total}_j}{n}$
• Or as the book writes it
• $E_{i,j} = \frac{r_i * c_j}{n}$

Why we treat this as expected value

• If there really was no difference between groups

• Then the overall percentage of cases where the student is bored will be the same between groups

• So we can take the percentage of cases where the student is bored

• Multiplied by the percentage of cases that are in the group overall

• Multiplied by the total number of cases

• And that’s the number of cases we would expect the group to be bored


• So we can take the percentage of cases where the student is bored: $\frac{\text{row total}_i}{n}$

• Multiplied by the percentage of cases that are in the group overall: $\frac{\text{column total}_j}{n}$

• Multiplied by the total number of cases $n$

$E_{i,j} = \frac{\text{row total}_i}{n} * \frac{\text{column total}_j}{n} * n = \frac{\text{row total}_i * \text{column total}_j}{n}$

Example

Affective State   Singapore Math   BDMC   Row Total
BORED             2                5      7
FRUSTRATED        9                14     23
ENGAGED           20               12     32
CONFUSED          6                9      15
DELIGHTED         13               10     23
Column Total      50               50     100

Actual

Affective State   Singapore Math   BDMC   Row Total
BORED             2                5      7
FRUSTRATED        9                14     23
ENGAGED           20               12     32
CONFUSED          6                9      15
DELIGHTED         13               10     23

Expected

Affective State   Singapore Math   BDMC   Row Total
BORED             3.5                     7
FRUSTRATED                                23
ENGAGED                                   32
CONFUSED                                  15
DELIGHTED                                 23

BORED-SM Expected = $\frac{7 * 50}{100}$ = 3.5

BORED-BDMC Expected = $\frac{7 * 50}{100}$ = 3.5

FRUSTRATED Expected = $\frac{23 * 50}{100}$ = 11.5

ENGAGED? You Try It.

CONFUSED? You Try It.

DELIGHTED? You Try It.

Actual

Affective State   Singapore Math   BDMC   Row Total
BORED             2                5      7
FRUSTRATED        9                14     23
ENGAGED           20               12     32
CONFUSED          6                9      15
DELIGHTED         13               10     23

Expected

Affective State   Singapore Math   BDMC   Row Total
BORED             3.5              3.5    7
FRUSTRATED        11.5             11.5   23
ENGAGED           16               16     32
CONFUSED          7.5              7.5    15
DELIGHTED         11.5             11.5   23

Now we can compare the two tables

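Comparing the observed and expected counts

• $\chi^2 = \sum \frac{(O_{i,j} - E_{i,j})^2}{E_{i,j}}$, where $O_{i,j}$ is the observed count and $E_{i,j}$ the expected count in each cell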

$\chi^2 = \frac{(2 - 3.5)^2}{3.5} + \frac{(5 - 3.5)^2}{3.5} + \frac{(9 - 11.5)^2}{11.5} + \dots$

$\chi^2 = \frac{2.25}{3.5} + \frac{2.25}{3.5} + \frac{6.25}{11.5} + \dots$

$\chi^2 = 5.36$
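The full sum over all ten cells can be checked with a short sketch. The function name is made up for illustration, and the expected counts are typed in from the table rather than recomputed.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all cells."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

# Rows: BORED, FRUSTRATED, ENGAGED, CONFUSED, DELIGHTED
observed = [[2, 5], [9, 14], [20, 12], [6, 9], [13, 10]]
expected = [[3.5, 3.5], [11.5, 11.5], [16, 16], [7.5, 7.5], [11.5, 11.5]]
chi2 = chi_square_statistic(observed, expected)
print(f"chi-squared = {chi2:.2f}")
```

Every term is a squared difference divided by a positive expected count, so the statistic can never be negative.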

How is χ2 distributed?

• It can’t be a Z or t distribution…

• Because all the values are squared, and greater than 0

For v degrees of freedom

Image from David Sabo’s webpage, http://commons.bcit.ca/math/faculty/david_sabo/apples/math2441/section8/onevariance/chisqtable/chisqtable.htm

When df>=30
Distribution is approximated by Z

Image from David Sabo’s webpage, http://commons.bcit.ca/math/faculty/david_sabo/apples/math2441/section8/onevariance/chisqtable/chisqtable.htm

Almost always used one-tail

Image from Philip Ender’s webpage, http://www.philender.com/courses/intro/notes3/chi.html

Messy to calculate two-tailed: asymmetric

df

• Not calculated as a function of n!

• Instead, calculated in terms of number of rows and columns

• $df = (r - 1)(c - 1)$

df: for this case

• r = 5
• c = 2

• df = (5-1)(2-1)
• df = 4




Use χ2 table

• Or =CHIDIST in Excel

• In this case, =CHIDIST(5.36, 4)

• Which gives p = 0.25

• So BDMC and Singapore Math are not significantly different in terms of affect

• Written χ2(df=4,N=100)=5.36, p=0.25
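For readers without Excel, the same upper-tail p value can be sketched in plain Python. The function name is made up, and the formula is a shortcut rather than a general routine: it relies on a closed form that only holds when df is even, which happens to cover df = 4.

```python
import math

def chi_square_p_value(chi2, df):
    """Upper-tail p value for a chi-squared statistic.

    Uses the closed form that holds only for even df:
    p = exp(-x/2) * sum over k < df/2 of (x/2)^k / k!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    half = chi2 / 2
    return math.exp(-half) * sum(half**k / math.factorial(k) for k in range(df // 2))

rows, cols = 5, 2
df = (rows - 1) * (cols - 1)
p = chi_square_p_value(5.36, df)
print(f"df = {df}, p = {p:.2f}")
```

For odd df a statistics library or the χ2 table is the safer route.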

Comments? Questions?

Note

• Columns do not have to have the same total value

You Try It!

Preferred Movie          Teenage Females   Teenage Males   Babies
Racecar Explosions       17                44              3
Horse Princess Diaries   41                13              1
Shiny Things             0                 2               71
Conan the Librarian      4                 12              2

χ2(df=6,N=210)=224, p<0.001


How do we know which category is different?

• After you do the overall χ2 test

• You can then look at individual categories versus all other categories, and do χ2 again

• But, warning – you have to do a post-hoc adjustment of α

• Out of scope for this class; see Chapter 5, Video 1 of Big Data and Education
http://www.columbia.edu/~rsb2162/bigdataeducation.html

When you can use χ2

• Not usable for very small amounts of data

• Magic number
– All expected cell counts should be > 5
– (Not everyone uses the same magic number here)

• If a cell count is under 5
– You can try combining columns or rows

Example: What is your favorite sport?

Sport         America   Ireland   Total
Baseball      25        3         28
Football      42        2         44
Soccer        19        43        62
Rugby         6         29        35
Hurling       0         31        31
Tiddlywinks   3         0         3
Calvinball    3         5         8
Total         98        113       211

Some numbers too small!

Sport         America   Ireland   Total
Tiddlywinks   3         0         3
Calvinball    3         5         8
Total         98        113       211

$\frac{98 * 3}{211} = 1.39$
$\frac{98 * 8}{211} = 3.72$
$\frac{113 * 3}{211} = 1.61$
$\frac{113 * 8}{211} = 4.28$

Solution: Combine Two Rows

Sport                            America   Ireland   Total
Sports Ryan Has Never Heard of   6         5         11
Total                            98        113       211

$\frac{98 * 11}{211} = 5.11$
$\frac{113 * 11}{211} = 5.89$
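The check-then-combine step can be sketched in code. The helper names are made up for illustration, and the threshold of 5 follows the magic number on the earlier slide; other sources use different cutoffs. Merging every flagged row is one simple strategy; in practice the rows to combine should also make sense together.

```python
def expected_counts(observed):
    """Expected count per cell: row total * column total / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def rows_with_small_cells(observed, threshold=5):
    """Indexes of rows with any expected count at or below the threshold."""
    return [
        i for i, row in enumerate(expected_counts(observed))
        if any(e <= threshold for e in row)
    ]

def combine_rows(observed, indexes):
    """Merge the given rows into one row appended at the end."""
    merged = [sum(observed[i][j] for i in indexes) for j in range(len(observed[0]))]
    kept = [row for i, row in enumerate(observed) if i not in indexes]
    return kept + [merged]

# Columns: America, Ireland
sports = ["Baseball", "Football", "Soccer", "Rugby", "Hurling", "Tiddlywinks", "Calvinball"]
observed = [[25, 3], [42, 2], [19, 43], [6, 29], [0, 31], [3, 0], [3, 5]]

small = rows_with_small_cells(observed)
print([sports[i] for i in small])
combined = combine_rows(observed, small)
print(combined[-1], rows_with_small_cells(combined))
```

Note that the check is on expected counts, not actual counts: Hurling has an actual count of 0 for America but is not flagged, because its expected counts are well above 5.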

Questions? Comments?


Many other uses of χ2

• Comparing tables to tables
– Looking at changes over time

• Testing whether data is consistent with a specific distribution

• Computing the confidence interval of a standard deviation
– This is discussed in the book in chapter 10.6
– It’s rare, and a bit conceptually messy
– It’s rare in part because it provides asymmetric confidence intervals

Final questions for the day?

Review sessions

• Please fill out doodle link by Friday
• I will set up review sessions Friday

• We will not be able to use class time for a review session
– If anyone can’t make either review session, we can set up a separate meeting

Upcoming Classes

• 4/29 F test
– HW 11 due

• 5/4 Test assumptions

• 5/6 ANOVA
– HW 12 due

• 5/11 FINAL EXAM