Statistics: Unlocking the Power of Data Hypothesis Testing: Cautions STAT 250 Dr. Kari Lock Morgan SECTION 4.3, 4.5 Type I and II errors (4.3) Statistical versus practical significance (4.5) Multiple testing (4.5)


Statistics: Unlocking the Power of Data Lock5

Hypothesis Testing: Cautions

STAT 250

Dr. Kari Lock Morgan

SECTION 4.3, 4.5

• Type I and II errors (4.3)

• Statistical versus practical significance (4.5)

• Multiple testing (4.5)


Errors

There are four possibilities:

                         Decision
Truth       Reject H0        Do not reject H0
H0 true     TYPE I ERROR     no error
H0 false    no error         TYPE II ERROR

• A Type I Error is rejecting a true null (false positive)

• A Type II Error is not rejecting a false null (false negative)


Red Wine and Weight Loss

• In the test to see if resveratrol is associated with food intake, the p-value is 0.035, so H0 is rejected at α = 0.05.

o If resveratrol is actually not associated with food intake, a Type I Error has been made

• In the test to see if resveratrol is associated with locomotor activity, the p-value is 0.980, so H0 is not rejected.

o If resveratrol actually is associated with locomotor activity, a Type II Error has been made


Analogy to Law

A person is innocent until proven guilty.

Evidence must be beyond a reasonable doubt.

Types of mistakes in a verdict?

• Convict an innocent: Type I error

• Release a guilty: Type II error


Probability of Type I Error

If the null hypothesis is true:

• 5% of statistics will be in the most extreme 5%

• 5% of statistics will give p-values less than 0.05

• 5% of statistics will lead to rejecting H0 at α = 0.05

• If α = 0.05, there is a 5% chance of a Type I error

(Figure: distribution of statistics, assuming H0 true)


Probability of Type I Error

If the null hypothesis is true:

• 1% of statistics will be in the most extreme 1%

• 1% of statistics will give p-values less than 0.01

• 1% of statistics will lead to rejecting H0 at α = 0.01

• If α = 0.01, there is a 1% chance of a Type I error

(Figure: distribution of statistics, assuming H0 true)


Probability of Type I Error

• If the null hypothesis is true, the probability of making a Type I error (rejecting a true null) is the significance level, α
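The claim that the Type I error rate equals α can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the data are drawn from a standard normal distribution so that H0: μ = 0 is true, the test is a two-sided z-test with known standard deviation 1, and the sample size, number of trials, and seed are arbitrary choices that do not come from the slides.

```python
import random
from statistics import NormalDist, fmean


def type_one_error_rate(alpha, n=25, trials=4_000, seed=1):
    """Fraction of simulated tests that reject H0: mu = 0 when H0 is true."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    rejections = 0
    for _ in range(trials):
        # Data generated with mu = 0, so the null hypothesis really is true.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = fmean(sample) * n ** 0.5  # z-statistic, assuming known sigma = 1
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        if p_value < alpha:
            rejections += 1  # every rejection here is a Type I error
    return rejections / trials


rates = {alpha: type_one_error_rate(alpha) for alpha in (0.05, 0.01)}
for alpha, rate in rates.items():
    print(f"alpha = {alpha:.2f}: simulated Type I error rate = {rate:.3f}")
```

Each simulated rate should land close to its α, with some simulation noise.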


Probability of Type II Error

How can we reduce the probability of making a Type II Error (not rejecting a false null)?

a) Decrease the sample size

b) Increase the sample size


Larger sample size makes it easier to reject the null

H0: p = 0.5

Ha: p > 0.5

n = 10

n = 100
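To put numbers on the n = 10 versus n = 100 comparison, the sketch below computes the chance of rejecting H0: p = 0.5 in favor of Ha: p > 0.5 with an exact binomial test at α = 0.05. The true proportion of 0.7 is a hypothetical value chosen for illustration, and the function names are not from the slides.

```python
from math import comb


def upper_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def power(n, p_true, alpha=0.05, p_null=0.5):
    """Chance of rejecting H0: p = p_null (vs. Ha: p > p_null) when p = p_true."""
    # Smallest success count whose upper-tail probability under H0 is at most alpha.
    critical = next(k for k in range(n + 2) if upper_tail(n, p_null, k) <= alpha)
    return upper_tail(n, p_true, critical)


P_TRUE = 0.7  # hypothetical true proportion, for illustration only
for n in (10, 100):
    reject = power(n, P_TRUE)
    print(f"n = {n}: P(reject H0) = {reject:.3f}, P(Type II error) = {1 - reject:.3f}")
```

Under these assumptions the test with n = 10 rarely rejects the false null, while the test with n = 100 almost always does.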


Probability of Type II Error

How can we reduce the probability of making a Type II Error (not rejecting a false null)?

a) Decrease the significance level

b) Increase the significance level


Significance Level and Errors

• Reject H0

o Could be making a Type I error if H0 true

o α is the chance of a Type I error

• Do not reject H0

o Could be making a Type II error if Ha true

o α is related to the chance of making a Type II error

• Decrease α if Type I error is very bad

• Increase α if Type II error is very bad


Probability of Errors

• If the null is true, the probability of making a Type I error (rejecting a true null) is the significance level, α

• If the alternative is true, the probability of making a Type II error (not rejecting a false null) depends on the significance level and the sample size (among other things)

• α should be chosen depending on how bad it is to make a Type I or Type II error


Choosing α

By default, usually α = 0.05

If a Type I error (rejecting a true null) is much worse than a Type II error, we may choose a smaller α, like α = 0.01

If a Type II error (not rejecting a false null) is much worse than a Type I error, we may choose a larger α, like α = 0.10
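The trade-off behind this choice can be sketched for a one-sided z-test. The true effect of 0.5 standard deviations and the sample size of 25 are hypothetical values picked for illustration; only the three α levels come from the slides.

```python
from statistics import NormalDist


def type_two_error_rate(alpha, effect_size=0.5, n=25):
    """P(do not reject H0) for a one-sided z-test when Ha is true.

    effect_size is the true mean shift in standard-deviation units
    (a hypothetical value chosen for illustration).
    """
    std_normal = NormalDist()
    critical_z = std_normal.inv_cdf(1 - alpha)  # reject H0 when z > critical_z
    shift = effect_size * n ** 0.5              # mean of the z-statistic under Ha
    return std_normal.cdf(critical_z - shift)


betas = {alpha: type_two_error_rate(alpha) for alpha in (0.01, 0.05, 0.10)}
for alpha, beta in betas.items():
    print(f"alpha = {alpha:.2f}: P(Type II error) = {beta:.3f}")
```

With everything else held fixed, a smaller α gives a larger chance of a Type II error, and a larger α gives a smaller one.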


Significance Level

Come up with a hypothesis testing situation in which you may want to…

• Use a smaller significance level, like α = 0.01

• Use a larger significance level, like α = 0.10


Statistical vs Practical Significance

• With small sample sizes, even large differences or effects may not be significant

• With large sample sizes, even a very small difference or effect can be significant

• A statistically significant result is not always practically significant, especially with large sample sizes


Statistical vs Practical Significance

• Example: Suppose a weight loss program recruits 10,000 people for a randomized experiment.

• A difference in average weight loss of only 0.5 lbs could be found to be statistically significant

• Suppose the experiment lasted for a year. Is a loss of ½ a pound practically significant?
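A rough calculation shows how a 0.5 lb difference can reach statistical significance with 10,000 people. The sketch assumes the participants are split evenly into two groups, that weight loss has a standard deviation of 10 lbs in each group, and that a two-sided z-test is used. None of these details are given in the example, so treat them as hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(mean_diff, sd, n_per_group):
    """z-test p-value for a difference in means between two equal-sized groups."""
    standard_error = sqrt(2 * sd**2 / n_per_group)
    z = mean_diff / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


ASSUMED_SD = 10.0  # hypothetical SD of one-year weight loss, in lbs

p_large = two_sided_p_value(0.5, ASSUMED_SD, n_per_group=5_000)  # 10,000 people
p_small = two_sided_p_value(0.5, ASSUMED_SD, n_per_group=50)     # 100 people

print(f"10,000 people: p-value = {p_large:.4f}")
print(f"   100 people: p-value = {p_small:.4f}")
```

Under these assumptions the same half-pound difference is significant at α = 0.05 with 10,000 people but not with 100, even though its practical importance is identical in both cases.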


Diet and Sex of Baby

• Are certain foods in your diet associated with whether or not you conceive a boy or a girl?

• To study this, researchers asked women about their eating habits, including asking whether or not they ate 133 different foods regularly

• A significant difference was found for breakfast cereal (mothers of boys eat more), prompting the headline “Breakfast Cereal Boosts Chances of Conceiving Boys”.

http://www.newscientist.com/article/dn13754-breakfast-cereals-boost-chances-of-conceiving-boys.html


“Breakfast Cereal Boosts Chances of Conceiving Boys”

I used to eat breakfast cereal every morning and have two boys. Do you think this helped to boost my chances of having boys?

a) Yes

b) No

c) Impossible to tell


Hypothesis Tests

For each of the 133 foods studied, a hypothesis test was conducted for a difference between mothers who conceived boys and girls in the proportion who consume each food

If there are NO differences (all null hypotheses are true), about how many significant differences would be found using α = 0.05?

How might you explain the significant difference for breakfast cereal?
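One way to explore the first question is to simulate 133 tests in which every null hypothesis is true. The sketch below assumes the tests are independent and that each p-value is uniformly distributed between 0 and 1 under a true null, which holds for continuous test statistics. Both are simplifications, and the number of simulated studies and the seed are arbitrary.

```python
import random

ALPHA = 0.05
N_TESTS = 133  # number of foods tested in the study

# Expected number of "significant" foods if no food has any real effect.
expected_false_positives = ALPHA * N_TESTS


def count_false_positives(rng):
    """Run N_TESTS tests with true nulls and count p-values below ALPHA."""
    # Assumption: under a true null, each p-value is uniform on [0, 1].
    return sum(rng.random() < ALPHA for _ in range(N_TESTS))


rng = random.Random(0)
counts = [count_false_positives(rng) for _ in range(2_000)]
average_count = sum(counts) / len(counts)

print(f"expected false positives: {expected_false_positives:.2f}")
print(f"average over simulated studies: {average_count:.2f}")
```

Under these assumptions, several of the 133 foods would be expected to show a significant difference by chance alone, so a single significant food is weak evidence of a real effect.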


Multiple Testing

When multiple hypothesis tests are conducted, the chance that at least one test incorrectly rejects a true null hypothesis increases with the number of tests.

If the null hypotheses are all true, about a proportion α of the tests will yield statistically significant results just by random chance.
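How fast the chance of at least one false rejection grows can be sketched with a short calculation. It assumes the tests are independent, which the slides do not state, so read the numbers as a rough guide rather than an exact result. The function name is hypothetical.

```python
def family_wise_error_rate(n_tests, alpha=0.05):
    """P(at least one Type I error) across independent tests with all nulls true."""
    # Each test avoids a Type I error with probability 1 - alpha.
    return 1 - (1 - alpha) ** n_tests


for n_tests in (1, 5, 20, 133):
    rate = family_wise_error_rate(n_tests)
    print(f"{n_tests:>3} tests: P(at least one false rejection) = {rate:.3f}")
```

With one test the chance is α itself. With 20 independent tests it is already well above one half.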


www.causeweb.org

Author: JB Landers


Multiple Comparisons

• Consider a topic that is being investigated by research teams all over the world

⇒ Using α = 0.05, about 5% of teams are going to find something significant, even if the null hypothesis is true


Multiple Comparisons

• Consider a research team/company doing many hypothesis tests

⇒ Using α = 0.05, about 5% of tests are going to be significant, even if the null hypotheses are all true


Multiple Comparisons

• This is a serious problem

• The most important thing is to be aware of this issue, and not to trust claims that are obviously one of many tests (unless they specifically mention an adjustment for multiple testing)

• There are ways to account for this (e.g. Bonferroni’s Correction), but these are beyond the scope of this class


Publication Bias

• Publication bias refers to the fact that usually only the significant results get published

• The one study that turns out significant gets published, and no one knows about all the non-significant results

• This, combined with the problem of multiple comparisons, can yield very misleading results


Jelly Beans Cause Acne!

http://xkcd.com/882/


Multiple Testing and Publication Bias

THIS SHOULD SCARE YOU.

Why most published research findings are false.


Cuckoo Birds

• Cuckoo birds lay their eggs in the nests of other birds

• When the cuckoo baby hatches, it kicks out all the original eggs/babies

• If the cuckoo is lucky, the mother will raise the cuckoo as if it were her own

http://opinionator.blogs.nytimes.com/2010/06/01/cuckoo-cuckoo/

• Do cuckoo birds found in nests of different species differ in size?


Length of Cuckoo Eggs


Cuckoo Eggs

Bird            Sample Mean   Sample SD   Sample Size
Pied Wagtail    22.90         1.07        15
Pipit           22.50         0.97        60
Robin           22.58         0.68        16
Sparrow         23.12         1.07        14
Wren            21.13         0.74        15
Overall         22.46         1.07        120


p-values

                Pied Wagtail   Pipit     Robin     Sparrow   Wren
Pied Wagtail    -              0.21      0.34      0.59      0.0001
Pipit           0.21           -         0.71      0.07      0.00003
Robin           0.34           0.71      -         0.13      0.00006
Sparrow         0.59           0.07      0.13      -         0.00006
Wren            0.0001         0.00003   0.00006   0.00006   -
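These ten pairwise comparisons are themselves a case of multiple testing. As an illustration only, the sketch below applies the Bonferroni correction mentioned earlier, which compares each p-value to α divided by the number of tests. Using α = 0.05 and treating the ten pairs as the full set of tests are assumptions, not something the slides specify.

```python
ALPHA = 0.05

# Pairwise p-values copied from the table above (each pair listed once).
p_values = {
    ("Pied Wagtail", "Pipit"): 0.21,
    ("Pied Wagtail", "Robin"): 0.34,
    ("Pied Wagtail", "Sparrow"): 0.59,
    ("Pied Wagtail", "Wren"): 0.0001,
    ("Pipit", "Robin"): 0.71,
    ("Pipit", "Sparrow"): 0.07,
    ("Pipit", "Wren"): 0.00003,
    ("Robin", "Sparrow"): 0.13,
    ("Robin", "Wren"): 0.00006,
    ("Sparrow", "Wren"): 0.00006,
}

# Bonferroni: compare each p-value to alpha divided by the number of tests.
threshold = ALPHA / len(p_values)
significant = [pair for pair, p in p_values.items() if p < threshold]

print(f"Bonferroni threshold: {threshold:.4f}")
for first, second in significant:
    print(f"{first} vs {second}: significant after correction")
```

Only the four comparisons involving the Wren fall below the adjusted threshold, and those are the same four that fall below the unadjusted 0.05.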


Summary

• Two types of errors: rejecting a true null (Type I) and not rejecting a false null (Type II)

• Statistical vs practical significance

• Using α = 0.05, about 5% of all hypothesis tests will lead to rejecting the null, even if all nulls are true


To Do

Read Section 4.3, 4.5

Do HW 4.5 (due Friday, 3/20)