Ronald van den Berg, Department of Psychology, Uppsala University / Stockholm University. Bayesian statistics: an alternative to the t-test and ANOVA? (Bayesiansk statistik – ett alternativ till t-test och ANOVA?) Uppsala, 24 Oct 2019. NB: If you haven’t filled out the questionnaire yet, please do so! (For the link, see the tutorial announcement email.)

Bayesiansk statistik ett alternativ till t-test och ANOVA? · 6/17/2020  · Ronald van den Berg Department of Psychology Uppsala University / Stockholm University Bayesiansk statistik


Ronald van den Berg
Department of Psychology
Stockholm University

Bayesian statistics #1: Hypothesis testing

Somewhere in a digital cloud, 17 June 2020


Tutorial #1: hypothesis testing

Examples of hypothesis testing:

• Is drug D more effective than a placebo?

• Is there a correlation between age and mortality rate in disease Y?

• Does model A fit the data better than model B?

• Do my subjects have a non-zero guessing rate?

Tutorial #2 (next week): estimation

Examples of estimation:

• On what percentage of people is this drug effective?

• How strong is the correlation between age and mortality rate in disease Y?

• How much better does model A fit the data than model B?

• How frequently did subjects guess in my experiment?


Why use statistics?


Why do we need statistical tests?

Differences are probably due to random variation

Differences are probably due to an effect of group

[Figure: three panels, each showing performance for Group A, Group B, and Group C]


Task of statistics is to quantify this "probably"



Is there an effect of group on performance?

[Figure: performance for Group A, Group B, and Group C]

H0: There is no effect of group on performance
H1: There is an effect of group on performance

Frequentist approach: compute p(extremeness of the data | H0 is true)

Bayesian approach: compute p(data | H0 is true) / p(data | H1 is true)


Frequentist approach

Note

There are two major schools of frequentist statistics: Fisher's vs. Neyman and Pearson's

The presently standard approach to hypothesis testing is an inconsistent hybrid that every decent statistician would reject (Gigerenzer, 2004)


Hypothesis testing: Fisher's approach

1. Formulate a null hypothesis, H0

E.g.: “the drug has no effect on recovery speed”

2. Compute p, i.e., the probability of observing your data or more extreme data if H0 were true

3. A low p value implies that either something rare has occurred or H0 is not true


Reasoning: the lower p, the more certain we can be that H0 is false
-> sounds reasonable, but ultimately a flawed way to test hypotheses

- Power analysis has no place in this framework
- A high p does not mean that we should accept H0
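Step 2 of Fisher's approach can be illustrated with a small sketch. The code below estimates p with a permutation test, which is one of several ways to obtain "the probability of observing your data or more extreme data if H0 were true". The recovery times and the function name `permutation_p_value` are hypothetical and chosen only for illustration; they do not come from the tutorial.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate p: the probability, if group labels are irrelevant (H0),
    of a difference in means at least as extreme as the observed one."""
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    pooled = list(group_a) + list(group_b)
    n_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random relabeling, valid if H0 is true
        mean_a = sum(pooled[:n_a]) / n_a
        mean_b = sum(pooled[n_a:]) / n_b
        # Small tolerance guards against floating-point ties.
        if abs(mean_a - mean_b) >= observed - 1e-9:
            n_extreme += 1
    return n_extreme / n_permutations


# Hypothetical recovery times in days, invented for this example.
drug = [5.1, 4.8, 6.0, 5.5, 4.9, 5.2]
placebo = [6.3, 5.9, 6.8, 6.1, 7.0, 5.8]
p = permutation_p_value(drug, placebo)
print(f"p = {p:.4f}")
```

A low p here says only that such a difference would be rare if H0 were true; it does not say how probable H0 is.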


A p-roblem

Applying Fisher's approach to the case of Sally Clark

• 1996: Clark’s 1st son died a few weeks after birth (SIDS?)
• 1998: Clark’s 2nd son died a few weeks after birth (SIDS again????)
• 1999: Clark was found guilty of murder and given two life sentences

The conviction was partly based on the following statistical argument:

• H0: babies died from "Sudden Infant Death Syndrome" (SIDS) aka "crib death"
• SIDS occurrence rate is 1 in 8,500
• The chance of this happening twice is 1 in 73 million, i.e., p = 0.0000000137
• Therefore, H0 is rejected
• Therefore, she must be guilty (double murder)

What is wrong with this line of reasoning?

Even though H0 is unlikely, other hypotheses may be even more unlikely!!


What happens if we add "murder" as an explicit alternative hypothesis?

• H1: double murder

• Infant murder rate in UK: approximately 1 in 33,000(*)

• The chance of this happening twice is 1 in 1.1 billion, i.e., p = 0.000000000918

• SIDS is 15 times more likely than murder!

(*) Marks, M. N., & Kumar, R. (1993). Infanticide in England and Wales. Medicine, Science and the Law, 33(4), 329-339.

Evidence is best treated as a relative concept

“How improbable is H0?”

“How (im)probable is H0, relative to H1?”
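The arithmetic behind the comparison can be checked with a few lines. This sketch uses only the two rates quoted above and, like the slide's own numbers, treats the two deaths as independent events; the variable names are illustrative.

```python
sids_rate = 1 / 8_500      # SIDS occurrence rate quoted above
murder_rate = 1 / 33_000   # infant murder rate quoted above

# Probability of the event happening twice, assuming independence
# (the same assumption that underlies the numbers on the slide).
p_double_sids = sids_rate ** 2
p_double_murder = murder_rate ** 2

# Relative evidence: how much more probable is double SIDS than double murder?
ratio = p_double_sids / p_double_murder

print(f"double SIDS:   1 in {1 / p_double_sids:,.0f}")
print(f"double murder: 1 in {1 / p_double_murder:,.0f}")
print(f"ratio: {ratio:.1f}")
```

The ratio comes out at about 15, matching the "15 times more likely" statement. Neither probability is informative on its own; only their ratio is.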

How did it end for Clark?

• 1996: Clark's first son died suddenly within a few weeks of his birth
• 1998: Clark's second son died suddenly within a few weeks of his birth
• 1999: Clark was found guilty of murder and given two life sentences
• 2003: Clark was set free, but highly traumatized
• 2007: Clark died from alcohol poisoning


The same kind of flawed reasoning was part of Lucia de Berk’s conviction in the Netherlands


The deeper problem here:

• Some events are unlikely under any hypothesis
• Should we then reject them all and consider the event unexplainable?

Solution: lower the α value for rare events?

However: how to do this without knowing the cause of the event??


The Bayes factor


Introduction to the Bayes Factor

$$\frac{p(H_0 \mid D)}{p(H_1 \mid D)} = \frac{p(D \mid H_0)}{p(D \mid H_1)} \times \frac{p(H_0)}{p(H_1)}$$

p(H0 | D): probability of Hypothesis 0, given the data

p(H1 | D): probability of Hypothesis 1, given the data

$$\underbrace{\frac{p(H_0 \mid D)}{p(H_1 \mid D)}}_{\text{Posterior ratio}} = \underbrace{\frac{p(D \mid H_0)}{p(D \mid H_1)}}_{\text{Bayes factor}} \times \underbrace{\frac{p(H_0)}{p(H_1)}}_{\text{Prior ratio}}$$

The Bayes factor indicates how many times more likely the data are under H0 than under H1

The Bayes factor:
▪ Is by definition a relative measure
▪ Has easy, pleasant interpretation(s)
▪ Allows us to quantify evidence in favor of the null!
▪ Generalizes more easily than the frequentist approach?

Alternative interpretation:

BF indicates the change from prior odds to posterior odds brought about by the data
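The "change from prior odds to posterior odds" reading can be made concrete with a minimal sketch. The function names and the example numbers (a Bayes factor of 0.2 and equal prior odds) are hypothetical.

```python
def posterior_odds(bayes_factor_01, prior_odds_01=1.0):
    """Posterior ratio = Bayes factor x prior ratio (all for H0 relative to H1)."""
    return bayes_factor_01 * prior_odds_01


def odds_to_probability(odds):
    """Convert odds for H0 into p(H0 | D), assuming H0 and H1 are the only candidates."""
    return odds / (1.0 + odds)


# Hypothetical example: the data are 5 times more likely under H1 (BF01 = 0.2)
# and both hypotheses are considered equally plausible beforehand.
odds = posterior_odds(0.2, prior_odds_01=1.0)
print(f"posterior odds for H0: {odds:.2f}")
print(f"p(H0 | D) = {odds_to_probability(odds):.3f}")
```

With different prior odds the same Bayes factor leads to different posterior odds, which is why the Bayes factor is reported separately from the prior.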


Visual interpretation of the Bayes factor


Guideline for interpreting BF evidence strength (source: Wagenmakers et al. 2016)

The two approaches in 5 steps

Step 1
  Frequentist approach (Fisher): Formulate a single hypothesis H0
  Bayesian approach: Formulate two or more hypotheses (may or may not include “H0”)

Step 2
  Frequentist approach (Fisher): Decide on all study factors before measuring a single data point (sample size, what to do with outliers, etc.); revising these decisions later would invalidate the test
  Bayesian approach: Make some initial decisions, e.g. "collect data from 20 subjects" or "collect data until BF>10 or BF<1/10"; these may be revised later

Step 3
  Frequentist approach (Fisher): Gather data
  Bayesian approach: Gather data

Step 4
  Frequentist approach (Fisher): Compute p
  Bayesian approach: Compute Bayes factors

Step 5
  Frequentist approach (Fisher): If p < 0.05: reject H0. If p > 0.05: conclude nothing
  Bayesian approach: Interpret the Bayes factors as a continuous measure of evidence in favor of or against the hypothesis

Fisherian vs Bayesian statistics:

p value
• Evidence is absolute (about a single hypothesis)
• Can only reject hypotheses
• Tests are problem-specific
• Confusing for non-statisticians

Bayes factor
• Evidence is always relative (w.r.t. alternative hypotheses)
• Can reject and support hypotheses
• Tests are general
• Much less confusing


Why isn’t everyone a Bayesian???


Fisherian vs Bayesian statistics:

p value (“objective”)
• Evidence is absolute (about a single hypothesis)
• Can only reject hypotheses
• Tests are problem-specific?
• Confusing for non-statisticians

Bayes factor (“subjective”)
• Evidence is always relative (w.r.t. alternative hypotheses)
• Can reject and support hypotheses
• Tests are general?
• Less confusing?
• Computationally expensive
• Requires specification of priors

Different philosophies

Bayesians quantify degrees of belief
-> highly subjective

Frequentists quantify long-term frequencies
-> claimed to be fully objective


Example #1:

Correlation analysis

Correlation - example

Two common questions:
1. Is the correlation "real"?
2. What is a plausible estimate of the strength of the “true” correlation?

Frequentist approach:
• Assume that the data come from a bivariate normal distribution
• Compute p value to answer first question
• Compute confidence interval to answer second question

Correlation - example

Intuitive way to think about the p-value:
p ≈ probability of finding r_sample > 0.39 if r_population = 0

Formally, however:
1. Compute the t-statistic, $t^* = r_{\text{sample}} \sqrt{n-2} \, / \sqrt{1 - r_{\text{sample}}^2}$
2. Compute p = p(t* > t_observed | r_population = 0), where t_observed is the value of t* for the observed r_sample = 0.39 (a two-sided test compares absolute values)

Underlying logic:
If r_population = 0, then t* follows a t distribution with n-2 degrees of freedom
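The two formal steps can be sketched in plain Python. The sample size is not stated on the slides, so `n = 46` below is an assumption made only to have a runnable example; the tail probability of the t distribution is obtained by simple numerical integration rather than a statistics library, and all function names are illustrative.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t, df, n_steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] and symmetry."""
    h = t / n_steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def correlation_p_value(r, n):
    """Step 1: t-statistic for a sample correlation r; step 2: two-sided p."""
    df = n - 2
    t_stat = r * math.sqrt(df) / math.sqrt(1 - r * r)
    return t_stat, 2 * t_upper_tail(abs(t_stat), df)


# r = 0.39 is taken from the slide; n = 46 is an assumed sample size.
t_stat, p = correlation_p_value(0.39, 46)
print(f"t = {t_stat:.2f}, two-sided p = {p:.4f}")
```

Halving the two-sided value gives the one-sided p for a directional hypothesis.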

Correlation – frequentist results

H0: No correlation between height ratio and relative support

Frequentist results:
• p = 0.007
• CI = [.12; .62]

What have we learned from this analysis?

1. If the “true” (population-level) correlation were 0, we would have only a 0.7% chance of finding data at least as extreme as our sample

2. We can be 95% confident that the “true” correlation is between .12 and .62
   Wrong! This is a Bayesian interpretation of a frequentist concept!

Correlation analysis: a Bayesian approach

Bayesian correlation test

Same assumption: the data come from a bivariate normal distribution

Same question: is there any evidence for a correlation at population level?

Different way to quantify this evidence:
▪ Bayes factor instead of p value
▪ Credible interval instead of confidence interval

Bayesian correlation test

$$\underbrace{\frac{p(H_0 \mid D)}{p(H_1 \mid D)}}_{\text{Posterior ratio}} = \underbrace{\frac{p(D \mid H_0)}{p(D \mid H_1)}}_{\text{Bayes factor}} \times \underbrace{\frac{p(H_0)}{p(H_1)}}_{\text{Prior ratio}}$$

In the context of correlation analysis, we define:

H0: r = 0
H1: r ≠ 0

Hence, we want to compute

$$\mathrm{BF}_{01} = \frac{p(D \mid r = 0)}{p(D \mid r \neq 0)} = \frac{p(\mathbf{x}, \mathbf{y} \mid r = 0)}{p(\mathbf{x}, \mathbf{y} \mid r \neq 0)}$$

[Figure: scatter plot of the data points (xi, yi)]

$$\mathrm{BF}_{01} = \frac{p(\mathbf{x}, \mathbf{y} \mid r = 0)}{p(\mathbf{x}, \mathbf{y} \mid r \neq 0)} = \frac{\int p(\mathbf{x}, \mathbf{y} \mid r = 0, \boldsymbol{\theta}) \, p(\boldsymbol{\theta}) \, d\boldsymbol{\theta}}{\int p(\mathbf{x}, \mathbf{y} \mid r \neq 0, \boldsymbol{\theta}) \, p(\boldsymbol{\theta}) \, d\boldsymbol{\theta}}$$

θ: parameters of the assumed model

p(θ): prior over parameter values

We need to specify what we mean by "r ≠ 0"

With a prior p(r) over r under H1:

$$\mathrm{BF}_{01} = \frac{p(\mathbf{x}, \mathbf{y} \mid r = 0)}{p(\mathbf{x}, \mathbf{y} \mid r \neq 0)} = \frac{\int p(\mathbf{x}, \mathbf{y} \mid r = 0, \boldsymbol{\theta}) \, p(\boldsymbol{\theta}) \, d\boldsymbol{\theta}}{\iint p(\mathbf{x}, \mathbf{y} \mid r, \boldsymbol{\theta}) \, p(r) \, p(\boldsymbol{\theta}) \, dr \, d\boldsymbol{\theta}}$$
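The role of the prior in the denominator is easier to see in a simpler setting than the bivariate normal model. The sketch below is not the correlation test; it uses a hypothetical binomial example (k successes in n trials) with H0: θ = 0.5 and H1: θ ≠ 0.5, where "θ ≠ 0.5" is given meaning by a uniform prior on θ. All names and numbers are illustrative assumptions.

```python
import math


def binomial_likelihood(k, n, theta):
    """p(k successes in n trials | theta)."""
    return math.comb(n, k) * theta ** k * (1 - theta) ** (n - k)


def marginal_likelihood_h1(k, n, n_steps=10_000):
    """p(data | H1): the likelihood averaged over a uniform prior on theta,
    computed numerically with the midpoint rule."""
    h = 1.0 / n_steps
    return sum(binomial_likelihood(k, n, (i + 0.5) * h)
               for i in range(n_steps)) * h


def binomial_bf01(k, n, theta0=0.5):
    """BF01 = p(data | theta = theta0) / p(data | H1)."""
    return binomial_likelihood(k, n, theta0) / marginal_likelihood_h1(k, n)


# Hypothetical data: 15 successes in 20 trials.
bf01 = binomial_bf01(15, 20)
print(f"BF01 = {bf01:.3f}")
```

For a uniform prior the integral has the closed form 1/(n + 1), which the numerical result should reproduce. A different prior on θ would change the denominator and therefore the Bayes factor.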

How to proceed from here?

Naive approach:
1. Plug in the bivariate normal distribution
2. Specify a prior over r
3. Specify a prior over θ = {μ1, μ2, σ1, σ2}

Smarter approach: ask the internet

Wetzels & Wagenmakers' approach:
1. Assume a JZS prior on r [an “uninformative” prior]
2. Now the BF can be computed analytically and depends only on r_sample and n


Bayesian stats in action

JASP:
• Free
• Interface similar to SPSS
• Bayesian and frequentist tests
• Powered by BayesFactor for R

BayesFactor for R:
• Free
• Gives much more control over what you’re doing than JASP

Bayesian correlation test results

JASP result:

Frequentist approach:
• p = 0.007
• CI = [.12; .62] (CONFIDENCE interval)

Bayesian approach:
• BF10 = 6.33
• CI = [.11; .60] (CREDIBLE interval)

Bayesian correlation test results

Test #2: prior belief is that r is positive

Frequentist approach:
• p = 0.003
• CI = [.16; 1.0] (CONFIDENCE interval)

Bayesian approach:
• BF+0 = 12.61
• CI = [.11; .60] (CREDIBLE interval)

Bayesian correlation test results

Test #3: prior belief is that r is negative

Frequentist approach:
• p = 0.997
• CI = [-1; .58] (CONFIDENCE interval)

Bayesian approach:
• BF-0 = 0.052
• CI = [-.14; -.001] (CREDIBLE interval)


Example #2:

t-test

T-test: frequentist approach

[Figure: annual salary for the Male and Female groups; y-axis from $24,000 to $42,000]

H0: δ = 0 (no difference in salary between men and women)

Frequentist approach:
1. Compute t-statistic
2. Compute p value (based on t and n)

Result: p = 0.21

Interpretation: “Assuming H0 is true, we would find a test statistic at least as extreme as the one in our sample in 21% of samples drawn from this population”

Conclusion: none; a high p value does not imply that H0 is true

T-test: Bayesian approach

H0: δ = 0
H1: δ ≠ 0

$$\mathrm{BF}_{01} = \frac{p(D \mid H_0)}{p(D \mid H_1)} = \frac{p(D \mid \delta = 0)}{p(D \mid \delta \neq 0)}$$


Approach:
• Assume Cauchy prior on effect size
• Assume Jeffreys prior on variance, p(σ²) ∝ 1/σ²
• Compute BF as follows:

t = t statistic, N = #measurements, ν = #DoF = N-1
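The formula itself is not present in this text. The listed symbols (t, N, ν = N-1) match the one-sample JZS Bayes factor of Rouder et al. (2009), so the sketch below assumes that formula; this is an assumption and may differ from what the slide showed. The function name `jzs_bf01`, the integration grid, and the example values of t and N are illustrative.

```python
import math


def jzs_bf01(t, n, r=0.707, n_steps=4000):
    """BF01 for a one-sample t-test with a Cauchy(0, r) prior on effect size.

    Assumed form (Rouder et al., 2009), with the Cauchy prior written as a
    normal prior whose variance g follows an inverse chi-square(1) distribution.
    """
    nu = n - 1
    # Likelihood of t under H0, up to a constant shared with H1.
    numerator = (1 + t * t / nu) ** (-(nu + 1) / 2)

    def integrand(x):
        # Integrate over x = log(g) so that g spans many orders of magnitude.
        g = math.exp(x)
        a = 1 + n * r * r * g
        prior_g = (2 * math.pi) ** -0.5 * g ** -1.5 * math.exp(-1 / (2 * g))
        likelihood = a ** -0.5 * (1 + t * t / (a * nu)) ** (-(nu + 1) / 2)
        return likelihood * prior_g * g  # the extra g is the Jacobian dg/dx

    lo, hi = -30.0, 30.0
    h = (hi - lo) / n_steps
    total = 0.5 * (integrand(lo) + integrand(hi))  # trapezoid rule
    for i in range(1, n_steps):
        total += integrand(lo + i * h)
    return numerator / (total * h)


# Hypothetical inputs: t = 2.0 from N = 20 measurements, default width 0.707.
print(f"BF01 = {jzs_bf01(2.0, 20):.3f}")
```

Under this formula a t statistic of 0 always yields BF01 > 1, i.e. evidence in favor of the null, which a p value cannot express.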

T-test: Bayesian approach

Cauchy prior (like a normal, but with a sharper peak and fatter tails)

[Figure: Cauchy prior density over δ from -10 to 10, for the default width (b=0.707) and the max width in JASP (b=2.0)]
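The two curves in the figure can be reproduced from the Cauchy density. In this sketch the widths 0.707 and 2.0 are the values named on the slide; the comparison with a normal density of the same scale is added only to illustrate the "fatter tails" remark, and the function names are illustrative.

```python
import math


def cauchy_pdf(x, scale):
    """Density of a Cauchy distribution centered at 0 with the given width."""
    return 1 / (math.pi * scale * (1 + (x / scale) ** 2))


def normal_pdf(x, sd):
    """Density of a normal distribution centered at 0, for comparison."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


for b in (0.707, 2.0):
    print(f"b = {b}: density at 0 = {cauchy_pdf(0, b):.3f}, "
          f"density at 5 = {cauchy_pdf(5, b):.4f}")

# Far from 0 the Cauchy prior keeps much more mass than a normal of equal scale.
print(f"normal (sd = 0.707) at 5: {normal_pdf(5, 0.707):.2e}")
```

A wider prior spreads belief over larger effect sizes, so the same data count less strongly against H0.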

T-test: Bayesian approach

[Figure: two panels, "Default prior" and "Very wide prior"]


Example #3:

ANOVA & Regression


Bayesian approach to simple linear regression

[Figure: math score (y-axis, 0 to 100) plotted against LSD dose (tissue concentration; x-axis, 1 to 7)]

Data source: Wagner, Aghajanian, and Bing (1968). Correlation of Performance Test Scores with Tissue Concentration of Lysergic Acid Diethylamide in Human Subjects. Clinical Pharmacology and Therapeutics, Vol. 9, pp. 635-638.

Assumed model: y = α + βx + ε
α = intercept
β = slope
ε = random error (Gaussian)

Frequentist vs Bayesian approach
• Same assumed underlying model
• Same questions/hypotheses
• Different way of quantifying evidence
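
Because both approaches share the model y = α + βx + ε, its two parameters can be illustrated with an ordinary least-squares fit. This is only a sketch: the function name `fit_line` and the numbers in the usage example are made up for illustration and are not the Wagner et al. data.

```python
def fit_line(xs, ys):
    """Least-squares estimates (alpha, beta) for the model y = alpha + beta * x + error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                 # slope
    alpha = mean_y - beta * mean_x   # intercept
    return alpha, beta


if __name__ == "__main__":
    # Hypothetical dose/score pairs, chosen only to show a negative slope.
    doses = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    scores = [80.0, 71.0, 64.0, 50.0, 43.0, 31.0]
    alpha, beta = fit_line(doses, scores)
    print(f"intercept = {alpha:.2f}, slope = {beta:.2f}")
```

The hypothesis test on the following slides asks whether the slope β differs from zero; the two approaches differ in how they quantify the evidence for that, not in the fitted line.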

Page 71: Bayesiansk statistik ett alternativ till t-test och ANOVA? · 6/17/2020  · Ronald van den Berg Department of Psychology Uppsala University / Stockholm University Bayesiansk statistik

Bayesian approach to simple linear regression

The hypotheses are:

H0: β = 0
H1: β ≠ 0

Posterior ratio = Bayes factor × prior ratio:

$$\frac{p(H_0 \mid D)}{p(H_1 \mid D)} = \frac{p(D \mid H_0)}{p(D \mid H_1)} \times \frac{p(H_0)}{p(H_1)}$$

$$\mathrm{BF}_{01} = \frac{p(D \mid H_0)}{p(D \mid H_1)} = \frac{p(D \mid \beta = 0)}{p(D \mid \beta \neq 0)}$$

• Numerator, p(D | β = 0): computable
• Denominator, p(D | β ≠ 0): uncomputable unless we specify what we mean by "β ≠ 0" → Cauchy prior

Assumed model: y = α + βx + ε
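
A small numeric sketch of the relation between the posterior ratio, the Bayes factor, and the prior ratio. The function names and the example values (BF01 = 0.2, equal prior odds) are hypothetical and not taken from the slides.

```python
def posterior_odds(bf01, prior_odds=1.0):
    """Posterior odds of H0 over H1 = Bayes factor BF01 times prior odds."""
    return bf01 * prior_odds


def odds_to_probability(odds):
    """Convert odds for H0 into the probability of H0 (two hypotheses only)."""
    return odds / (1.0 + odds)


if __name__ == "__main__":
    bf01 = 0.2  # hypothetical: the data are 5 times more likely under H1 than under H0
    odds = posterior_odds(bf01, prior_odds=1.0)
    print(f"posterior odds H0:H1 = {odds:.2f}")
    print(f"posterior probability of H0 = {odds_to_probability(odds):.3f}")
```

With equal prior odds the posterior odds equal the Bayes factor, which is why the Bayes factor alone is often reported.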

Page 72: Bayesiansk statistik ett alternativ till t-test och ANOVA? · 6/17/2020  · Ronald van den Berg Department of Psychology Uppsala University / Stockholm University Bayesiansk statistik

Bayesian approach to simple linear regression

[Figure: Cauchy prior density on β (x-axis: β from -10 to 10; y-axis: probability). Curves: default width in JASP regression (b = 0.354) and max width in JASP (b = 2.0)]

Cauchy prior (like a normal distribution, but with a sharper peak and fatter tails)

Page 73: Bayesiansk statistik ett alternativ till t-test och ANOVA? · 6/17/2020  · Ronald van den Berg Department of Psychology Uppsala University / Stockholm University Bayesiansk statistik

Bayesian approach to simple linear regression

Page 74: Bayesiansk statistik ett alternativ till t-test och ANOVA? · 6/17/2020  · Ronald van den Berg Department of Psychology Uppsala University / Stockholm University Bayesiansk statistik

Bayesian approach to simple linear regression

Prior model evidence

Page 75: Bayesiansk statistik ett alternativ till t-test och ANOVA? · 6/17/2020  · Ronald van den Berg Department of Psychology Uppsala University / Stockholm University Bayesiansk statistik

Bayesian approach to simple linear regression

Posterior model evidence

Page 76: Bayesiansk statistik ett alternativ till t-test och ANOVA? · 6/17/2020  · Ronald van den Berg Department of Psychology Uppsala University / Stockholm University Bayesiansk statistik

Bayesian approach to simple linear regression

Change from prior to posterior odds (=Bayes factor of model Mx relative to all others)
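
The change from prior to posterior odds can be sketched as a one-line calculation. The function name `bf_model` and the probabilities in the usage example are hypothetical; the only assumption is that each model has a prior and a posterior probability and that "all others" means the complement of the model.

```python
def bf_model(prior_prob, posterior_prob):
    """Factor by which the odds of a model (against all others) change after seeing the data."""
    prior_odds = prior_prob / (1.0 - prior_prob)
    post_odds = posterior_prob / (1.0 - posterior_prob)
    return post_odds / prior_odds


if __name__ == "__main__":
    # Hypothetical: two models with equal prior probability; Mx ends up at 0.8.
    print(f"odds changed by a factor of {bf_model(0.5, 0.8):.1f}")
```

With only two candidate models, Mx and M0, this quantity coincides with the Bayes factor of Mx relative to M0.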

Page 77: Bayesiansk statistik ett alternativ till t-test och ANOVA? · 6/17/2020  · Ronald van den Berg Department of Psychology Uppsala University / Stockholm University Bayesiansk statistik

Bayesian approach to simple linear regression

Bayes factor of Mx relative to M0

Page 78: Bayesiansk statistik ett alternativ till t-test och ANOVA? · 6/17/2020  · Ronald van den Berg Department of Psychology Uppsala University / Stockholm University Bayesiansk statistik

Bayesian approach to simple linear regression

BF estimation error

Page 79: Bayesiansk statistik ett alternativ till t-test och ANOVA? · 6/17/2020  · Ronald van den Berg Department of Psychology Uppsala University / Stockholm University Bayesiansk statistik

Example with multiple regressors (aka covariates)

Page 80: Bayesiansk statistik ett alternativ till t-test och ANOVA? · 6/17/2020  · Ronald van den Berg Department of Psychology Uppsala University / Stockholm University Bayesiansk statistik

Example with multiple regressors

[Data table: one dependent variable and three covariates (#1, #2, #3)]

Assumed model: y = α + β₁x₁ + β₂x₂ + β₃x₃ + ε

(Source: R. Higgs (1971). "Race, Skills, and Earnings: American Immigrants in 1909", The Journal of Economic History)

Page 81: Bayesiansk statistik ett alternativ till t-test och ANOVA? · 6/17/2020  · Ronald van den Berg Department of Psychology Uppsala University / Stockholm University Bayesiansk statistik

Example with multiple regressors

Dependent variable: average weekly salary

Covariates: (1) English-speaking (%), (2) literate (%), (3) >5 years in US (%)

FREQUENTIST RESULT

Page 82: Bayesiansk statistik ett alternativ till t-test och ANOVA? · 6/17/2020  · Ronald van den Berg Department of Psychology Uppsala University / Stockholm University Bayesiansk statistik

Example with multiple regressors

Dependent variable: average weekly salary

Covariates: (1) English-speaking (%), (2) literate (%), (3) >5 years in US (%)

FREQUENTIST RESULT

BAYESIAN RESULT

Page 85: Bayesiansk statistik ett alternativ till t-test och ANOVA? · 6/17/2020  · Ronald van den Berg Department of Psychology Uppsala University / Stockholm University Bayesiansk statistik

Take-home points

#1

NHST (*) is a widespread but flawed approach

(*) NHST = Null Hypothesis Significance Testing

Page 86: Bayesiansk statistik ett alternativ till t-test och ANOVA? · 6/17/2020  · Ronald van den Berg Department of Psychology Uppsala University / Stockholm University Bayesiansk statistik

Take-home points

#2

Evidence is best treated as a relative concept

❑ The Bayes Factor is by definition a relative measure
❑ The p-value is an absolute measure

Page 87: Bayesiansk statistik ett alternativ till t-test och ANOVA? · 6/17/2020  · Ronald van den Berg Department of Psychology Uppsala University / Stockholm University Bayesiansk statistik

Take-home points

#3

Ideally we want to be able to both reject and accept hypotheses

❑ The Bayes Factor can quantify evidence in both directions
❑ The p-value can only reject
❑ Disregard of “null results” is a main driver behind the replication crisis

Page 88: Bayesiansk statistik ett alternativ till t-test och ANOVA? · 6/17/2020  · Ronald van den Berg Department of Psychology Uppsala University / Stockholm University Bayesiansk statistik

Take-home points

#4

Ideally we want statistical evidence to be conditioned only on data

❑ The Bayes Factor has this property
❑ The p-value depends on the data-collection stopping rule!
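
The stopping-rule dependence can be illustrated with a classic coin-flip sketch. All specifics here are assumptions for illustration: the data (9 heads and 3 tails), the one-sided test of H0: θ = 0.5, the two designs (stop after 12 flips versus stop at the 3rd tail), and the uniform prior on θ under H1.

```python
from math import comb, gamma

HEADS, TAILS = 9, 3  # hypothetical data: 9 heads and 3 tails


def p_value_fixed_n(heads, tails):
    """One-sided p-value if the plan was to flip exactly heads + tails times."""
    n = heads + tails
    return sum(comb(n, k) for k in range(heads, n + 1)) / 2**n


def p_value_stop_at_tails(heads, tails):
    """One-sided p-value if the plan was to flip until the tails-th tail appeared.

    At least `heads` heads before that tail means fewer than `tails` tails
    among the first heads + tails - 1 flips.
    """
    m = heads + tails - 1
    return sum(comb(m, k) for k in range(tails)) / 2**m


def bf01_uniform_prior(heads, tails):
    """BF01 for theta = 0.5 against a uniform prior on theta.

    The design-specific counting constant multiplies both marginal
    likelihoods, so it cancels and the stopping rule plays no role.
    """
    like_h0 = 0.5 ** (heads + tails)
    like_h1 = gamma(heads + 1) * gamma(tails + 1) / gamma(heads + tails + 2)
    return like_h0 / like_h1


if __name__ == "__main__":
    print(f"p (fixed N)        = {p_value_fixed_n(HEADS, TAILS):.4f}")
    print(f"p (stop at 3 tails) = {p_value_stop_at_tails(HEADS, TAILS):.4f}")
    print(f"BF01 (either design) = {bf01_uniform_prior(HEADS, TAILS):.4f}")
```

In this sketch the same data fall on opposite sides of the conventional 0.05 threshold depending on the sampling plan, while the Bayes factor is identical under both plans.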

Page 89: Bayesiansk statistik ett alternativ till t-test och ANOVA? · 6/17/2020  · Ronald van den Berg Department of Psychology Uppsala University / Stockholm University Bayesiansk statistik

Take-home points

#5

The Bayesian approach requires specifying priors

❑ Some see this as a curse
❑ Others see this as an opportunity to include prior knowledge

Page 90: Bayesiansk statistik ett alternativ till t-test och ANOVA? · 6/17/2020  · Ronald van den Berg Department of Psychology Uppsala University / Stockholm University Bayesiansk statistik

Take-home points

#6

Bayesians quantify belief; frequentists compute long-run frequencies

Page 91: Bayesiansk statistik ett alternativ till t-test och ANOVA? · 6/17/2020  · Ronald van den Berg Department of Psychology Uppsala University / Stockholm University Bayesiansk statistik

Take-home points

#7

Above all: make sure you know what you are doing!

Mindful Bayesian > Mindful frequentist >>>>>> Mindless Bayesian > Mindless frequentist

Page 97: Bayesiansk statistik ett alternativ till t-test och ANOVA? · 6/17/2020  · Ronald van den Berg Department of Psychology Uppsala University / Stockholm University Bayesiansk statistik

Some extra slides

Page 98: Bayesiansk statistik ett alternativ till t-test och ANOVA? · 6/17/2020  · Ronald van den Berg Department of Psychology Uppsala University / Stockholm University Bayesiansk statistik

Fisher vs Neyman-Pearson

Fisher's approach | Neyman-Pearson's approach
Outcome: significant / non-significant | Outcome: accept / reject
An alternative hypothesis cannot be specified | An alternative hypothesis must be specified
Does not have a concept of "power" | Power has to be specified prior to the experiment
A single rejection of H0 is the start, not the end, of an investigation. Replication is needed, and meta-analyses are useful. | A single rejection is meaningless: the framework only guarantees long-term type-1 and type-2 error rates and does not allow inference about a single case.
p is a measure of evidence against H0 | p is NOT a measure of evidence and should not be interpreted

Presently, much statistical testing in psychology research is an "inconsistent hybrid that every decent statistician would reject" (Gigerenzer, 2004)

Page 99: Bayesiansk statistik ett alternativ till t-test och ANOVA? · 6/17/2020  · Ronald van den Berg Department of Psychology Uppsala University / Stockholm University Bayesiansk statistik

Why should we bother about statistical literacy?

Main findings
1) Only 36% of significant results replicated
2) Effect sizes shrank by ~50% in the replications

Open Science Collaboration (2015). Estimating the reproducibility of psychological science. Science, 349(6251).

Page 100: Bayesiansk statistik ett alternativ till t-test och ANOVA? · 6/17/2020  · Ronald van den Berg Department of Psychology Uppsala University / Stockholm University Bayesiansk statistik

What caused the crisis?

A toxic mix of the following:

• Publication pressure
• Disregard for “null findings”

… which incentivizes poor methodological hygiene:

• Hide null findings (file drawer problem)
• Test many variables, report few (fishing)
• Try many tests, report few (p-hacking)
• Post-hoc hypothesizing (HARKing)
• …
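
Why testing many variables and reporting few is a problem can be sketched with a little arithmetic. The sketch assumes independent tests, a true null hypothesis in every test, and a significance level of 0.05; the function name is hypothetical.

```python
def familywise_error(num_tests, alpha=0.05):
    """Probability of at least one false positive across independent tests of true nulls."""
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    for k in (1, 5, 10, 20):
        print(f"{k:2d} tests: P(at least one p < 0.05) = {familywise_error(k):.2f}")
```

Under these assumptions, a researcher who tests 20 unrelated variables is more likely than not to find at least one "significant" result by chance alone.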

Bayesian stats is not a miracle cure, but understanding the Bayesian approach will make you a more insightful consumer of statistics, which will likely lead to better statistical practices even if you stick to frequentist methods.