HUDM4122 Probability and Statistical Inference April 1, 2015


Continuing from last class

You Try It

• 49 students use another curriculum and take pre and post tests

• The students average a gain of 3 points

• The students get a standard deviation of 14

• Do the students learn from this curriculum?

• Use a two-tailed Z test to find out

$Z = \frac{\bar{x} - 0}{s / \sqrt{n}} = \frac{3}{14 / \sqrt{49}} = \frac{3}{14 / 7} = \frac{3}{2} = 1.5$

• Do the students learn from this curriculum?

1.5 < 1.96, so it is not statistically significant
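The comparison above can be sketched in a few lines of Python. This is a minimal sketch, not part of the original slides: the helper names `z_for_mean` and `critical_z` are made up for illustration, and the cutoff is computed from the standard normal distribution rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist

def z_for_mean(sample_mean, null_value, sd, n):
    """Z statistic for a sample mean against a hypothesized value."""
    return (sample_mean - null_value) / (sd / sqrt(n))

def critical_z(alpha=0.05):
    """Two-tailed cutoff: the Z that leaves alpha/2 in each tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)

z = z_for_mean(3, 0, 14, 49)   # gain of 3, standard deviation 14, 49 students
cutoff = critical_z(0.05)      # roughly 1.96
significant = abs(z) > cutoff
print(f"Z = {z:.2f}, cutoff = {cutoff:.2f}, significant: {significant}")
```

Changing `alpha` shows how the cutoff moves, for instance a stricter criterion such as 0.01 pushes the cutoff further out.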


Questions? Comments?

P-value

• As you’ve probably noticed, most papers don’t just report whether a result is statistically significant, they report a p-value as well

P-value

• The p-value is the smallest value of α for which the test is still statistically significant

• Or in other words, it’s the probability that you would have seen a result at least as extreme as the one you got, if the null hypothesis were true

To compute that probability

• Compute a Z for your data

• Take the values –Z and +Z

• Find the area to the left of the smaller Z on the Z distribution

• Find the area to the right of the bigger Z on the Z distribution

• Add those together

• That’s your p
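The steps above can be followed literally in code. This is a small sketch assuming Python's standard-library `statistics.NormalDist` stands in for the probability table; the function name `two_tailed_p` is made up for illustration.

```python
from statistics import NormalDist

def two_tailed_p(z):
    """Two-tailed p-value for a Z statistic, following the steps one by one."""
    std_normal = NormalDist()                  # mean 0, standard deviation 1
    smaller, bigger = -abs(z), abs(z)          # the values -Z and +Z
    left_area = std_normal.cdf(smaller)        # area to the left of the smaller Z
    right_area = 1 - std_normal.cdf(bigger)    # area to the right of the bigger Z
    return left_area + right_area              # add those together

print(round(two_tailed_p(-1.96), 3))
```

Because the Z distribution is symmetric, the two areas are always equal, so the same result can be had by doubling either one.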

Example

• Z = -1.96

• So take -1.96 and +1.96

• Area to the left of Z=-1.96 is 0.025

– See your probability table

• Area to the right of Z=+1.96 is 1-0.975 = 0.025

– See your probability table

• 0.025+0.025=0.05

• So for Z=-1.96, p =0.05

You try it

• Z = 1.53

You try it

• Z = -1.03

How you report it

• “The difference between the curricula was not statistically significant, Z=1.50, p=0.13”

• “The difference between the curricula was statistically significant, Z=5, p<0.001”

Reporting

• Customarily

– p=actual value for p>=0.05

– p<0.05 for 0.01<p<0.05

– p<0.01 for 0.001<p<0.01

– p<0.001 for p<0.001
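The customary reporting bands can be written as a small helper. This is a sketch only: the function names `format_p` and `report` are made up for illustration, two decimal places for the actual value is an assumption, and so is the choice to put a p-value that lands exactly on a boundary (0.01 or 0.001) into the larger band, since the list leaves those cases open.

```python
def format_p(p):
    """Turn a p-value into the string customarily reported."""
    if p >= 0.05:
        return f"p={p:.2f}"   # not significant: report the actual value
    if p >= 0.01:             # assumption: exactly 0.01 goes in this band
        return "p<0.05"
    if p >= 0.001:            # assumption: exactly 0.001 goes in this band
        return "p<0.01"
    return "p<0.001"

def report(z, p):
    """Combine the Z statistic and the formatted p-value."""
    return f"Z={z:.2f}, {format_p(p)}"

print(report(1.5, 0.13))
print(report(5, 0.0000006))
```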

MBB Say

This is non-standard; don’t do this

You can sometimes say “marginally significant” for 0.05<p<0.10; it depends on the journal

Comments? Questions?

Comparing a sample to a single value

• Let’s say you want to compare a sample to a single value that is not zero

• Let’s review the example from last time

• And then you will do one

Example from last time

• A TC professor is studying the grades on an exam taken by 49 students

• The students get an average (sampled) grade of 72

• The students get a standard deviation of 7

• Are the students doing statistically significantly better than the C cut-off line of 70?

$Z = \frac{72 - 70}{7 / \sqrt{49}} = \frac{2}{1} = 2$

2 > 1.96, so it is statistically significant

For Z=2, p = 0.023 + (1-0.977)=0.046
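The exam example can be checked with a short sketch that compares a sample mean to a non-zero reference value. The function name `one_sample_z_test` is made up for illustration, and the p-value comes from the standard normal distribution directly, so it can differ in the third decimal from a value built out of rounded table entries.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, reference, sd, n):
    """Return (Z, two-tailed p) for a sample mean against a reference value."""
    z = (sample_mean - reference) / (sd / sqrt(n))
    p = 2 * NormalDist().cdf(-abs(z))   # both tails, using symmetry
    return z, p

# Exam grades: mean 72, C cut-off 70, standard deviation 7, 49 students
z, p = one_sample_z_test(72, 70, 7, 49)
print(f"Z = {z:.2f}, p = {p:.3f}")
```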

You try it

• A fisherman is examining the size of the fish he catches to decide if it’s worth fishing in these here waters

• If the average catch size is 36” or under, those jerks in Albany will confiscate his catch

• He catches 64 fish in his first net

• The fish have an average size of 37”

• The fish have a standard deviation of 40”

• Should he fish in these here waters?

• What’s the p value?

$Z = \frac{37 - 36}{40 / \sqrt{64}} = \frac{1}{5} = 0.2$, not statistically significant

p = 0.84

So he shouldn’t fish in these here waters

Comments? Questions?

Two-group Z-test

• Combines our previously studied analysis for estimating the confidence interval of the difference between two groups

• With the process for computing statistical significance rather than confidence intervals

Two-sample Z-test

• A statistical test involving the Z distribution

• Which, yes, means that your samples should each have N>30

The test

• H0: The difference between the two population means is 0

• Ha: The difference between the two population means is not 0

• Calculate a Z value for the difference between sample means
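One common way to write that Z value is shown below. The slide's own formula did not survive, so this is the standard unpooled form, given here as an assumption; $\bar{x}_1, \bar{x}_2$ are the sample means, $s_1, s_2$ the sample standard deviations, and $n_1, n_2$ the sample sizes.

```latex
Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
```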

Significance Criterion

• For a two-tailed test, where α = 0.05

• We consider the test significant if $|Z| > 1.96$

For example

• You’re comparing the difference between Reasoning Mind and Reasoning Lime

• Reasoning Mind: average grade = 72, standard deviation = 6, sample size = 36

• Reasoning Lime: average grade = 60, standard deviation = 30, sample size = 36

Hypotheses

• Null hypothesis: There is no difference between Reasoning Mind and Reasoning Lime

• Alternative hypothesis: There is a difference between Reasoning Mind and Reasoning Lime

$Z = \frac{72 - 60}{\sqrt{6^2/36 + 30^2/36}} = \frac{12}{\sqrt{26}} \approx 2.35$, so p=0.02 and it is statistically significant
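The Reasoning Mind versus Reasoning Lime comparison can be reproduced with a short sketch. The function name `two_sample_z_test` is made up for illustration, and it assumes the usual unpooled standard error, the square root of the sum of each group's variance divided by its sample size.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z_test(mean1, sd1, n1, mean2, sd2, n2):
    """Return (Z, two-tailed p) for the difference between two sample means."""
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    z = (mean1 - mean2) / standard_error
    p = 2 * NormalDist().cdf(-abs(z))   # both tails, using symmetry
    return z, p

# Reasoning Mind: mean 72, sd 6, n 36; Reasoning Lime: mean 60, sd 30, n 36
z, p = two_sample_z_test(72, 6, 36, 60, 30, 36)
print(f"Z = {z:.2f}, p = {p:.2f}")
```

The same function can be reused for any two groups by swapping in their means, standard deviations, and sample sizes.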

You try it

• Our friend the fisherman is fishing in two rivers and wants to know if the fish are bigger in one river than another

• Salmon River: average size = 42”, standard deviation = 20”, sample size = 100

• Hudson River: average size = 49”, standard deviation = 30”, sample size = 100

Types of Errors

• Statistician Terminology

• Data Miner Terminology

“Type I Error”

• False Positive

• Rejecting the Null Hypothesis when the Null Hypothesis is true

• Saying the result is statistically significant when there’s nothing there

“Type II Error”

• False Negative

• Failing to reject the Null Hypothesis when the Null Hypothesis is false

• Saying the result is not statistically significant when there’s actually something there

In the traditional statistical significance paradigm

• You control α (the Type I error rate)

• You are unable to control β (the Type II error rate)
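What it means to control α can be illustrated with a small simulation. This is a sketch under invented assumptions, none of which come from the slides: normally distributed data with a true mean of 0 and standard deviation 14, 49 observations per experiment, 2000 experiments, and a fixed random seed. Since the null hypothesis is true in every experiment, each significant result is a Type I error.

```python
import random
from math import sqrt
from statistics import mean, stdev

def false_positive_rate(trials=2000, n=49, cutoff=1.96, seed=0):
    """Share of experiments called significant when the null hypothesis is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # The true mean is 0, so the null hypothesis holds by construction.
        sample = [rng.gauss(0, 14) for _ in range(n)]
        z = mean(sample) / (stdev(sample) / sqrt(n))
        if abs(z) > cutoff:
            rejections += 1   # a Type I error
    return rejections / trials

rate = false_positive_rate()
print(f"Share of null-is-true experiments called significant: {rate:.3f}")
```

The share should land near 0.05, the α built into the 1.96 cutoff. It tends to run slightly above that, because the standard deviation is estimated from each sample rather than known.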

Type I or Type II error?

• Reasoning Mind is better than Reasoning Lime, but your stat test got p=0.13

• Dreambox is not better than Bob’s Discount Math Curriculum, but your stat test got p=0.03

• Columbia University is better than Columbia College of Hollywood CA, but your stat test got p=0.17

Upcoming Classes

• 4/15 Statistical power

– HW8 due

• 4/20 Independent-samples t-test

– HW9 due

• 4/22 Paired-samples t-test

• 4/23 Special session on SPSS
