
Ten Recommendations from the Reproducibility Crisis in Psychological Science

Jim Grange
j.a.grange@keele.ac.uk

http://xkcd.com/242/

Reproducibility

• “The extent to which consistent results are observed when scientific studies are repeated” (Open Science Collaboration, 2012)

• The demarcation between science and pseudo-science

• Scientific claims should not gain credence by virtue of status/authority of their originator

Reproducibility

• How reproducible are psychological findings?
  – Murmurings that reproducibility is low…

Journal of Personality and Social Psychology had an editorial policy not to publish “mere” replications (!)

Reproducibility

• An empirical question:
  – Replicate X number of studies, and estimate the reproducibility of psychological science

Reproducibility

• Open Science Collaboration
  – Formed in 2011 (~60 members)
  – Grew to 270 scientists from over 50 countries

Reproducibility

• 100 replications from 3 prominent journals:
  – Journal of Experimental Psychology: Learning, Memory, & Cognition
  – Journal of Personality and Social Psychology
  – Psychological Science

Reproducibility

• 97% of original studies reported significant effects (!)
  – 36% of replications had significant effects in the same direction

Reproducibility Crisis

• Psychology has a reproducibility crisis
  – The murmurings now have empirical support

• Reputation of psychology as a science is at stake

• How can we as individual scientists & institutions reverse this crisis?

Ten Recommendations

1. Replicate, replicate, replicate…
2. Statistics (i): Beware p-hacking
3. Statistics (ii): Know your p-values
4. Statistics (iii): Boost your power
5. Open data, open materials, open analysis
6. Conduct pre-registered confirmatory studies
7. Incorporate open science practices in teaching
8. Insist on open science practices as reviewers
9. Reward open science practices
10. Incorporate open science into hiring decisions

1. Replicate, replicate, replicate…

Observation → Hypothesis → Experimentation → Publish

…but the cycle should not stop at publication:

Observation → Hypothesis → Experimentation → VERIFICATION

1. Replicate, replicate, replicate…

• Devoting resources to the confirmation of findings is irrational if the original findings are valid
  – If all published findings are true, why waste time confirming them?

• Devoting resources to the confirmation of findings is rational if the original findings are invalid

1. Replicate, replicate, replicate…

• Strong incentives to pursue new ideas
  – Publications
  – Grant income
  – Employment
  – Promotion / tenure
  – …Fame

• Incentives need to change (see later…)

1. Replicate, replicate, replicate…

• We have a professional responsibility to ensure the findings we report are robust and replicable
  – Direct / conceptual replications (where possible) should be part of the research pipeline

1. Replicate, replicate, replicate…

• Conduct direct replications before embarking on a research program extending published work
  – Often considered a waste of time
  – Researchers get straight into developing new work (due to the incentives…)

1. Replicate, replicate, replicate…

• Conduct direct replications before embarking on a research program extending published work
  – Smith (2002) finds working memory capacity is improved by a new encoding strategy
  – A PhD student wishes to extend the work of Smith (2002)
  – The student spends 3 years conducting experiments testing boundary conditions on the effect
  – Finds nothing…



2. Statistics (i): Beware p-hacking

• A new term, coined post-2011

• Related to researcher degrees of freedom
  – The many decisions made during a research project
  – How many subjects? What conditions? How do I treat outliers? What analyses should I run? How many dependent variables? …ad infinitum

2. Statistics (i): Beware p-hacking

[Diagram: a branching tree of analysis paths, most ending p > .05 but one ending p < .05]

With enough choices, there will always be one path which leads to a significant effect, even in the absence of a true effect

• Examined impact of researcher degrees of freedom on type-1 error
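A minimal simulation of this point (Python, using numpy and scipy), in the spirit of Simmons et al.’s “false-positive psychology” demonstrations: both groups are drawn from the same population, yet testing several DVs and reporting the best one inflates the false-positive rate far beyond the nominal 5%. The 5 DVs and n = 20 per group are illustrative assumptions, not figures from the talk:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_sims, n_per_group, n_dvs = 10_000, 20, 5

false_positives = 0
for _ in range(n_sims):
    # Both groups come from the SAME population: any "effect" is noise.
    group_a = rng.normal(0, 1, size=(n_per_group, n_dvs))
    group_b = rng.normal(0, 1, size=(n_per_group, n_dvs))
    # The p-hacker tests every DV and keeps the best one.
    pvals = [stats.ttest_ind(group_a[:, j], group_b[:, j]).pvalue
             for j in range(n_dvs)]
    if min(pvals) < 0.05:
        false_positives += 1

# With 5 independent DVs the rate is roughly 1 - 0.95**5 ≈ 0.23, not .05.
print("nominal alpha: 0.05")
print(f"observed false-positive rate: {false_positives / n_sims:.3f}")
```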

2. Statistics (i): Beware p-hacking

• p-hacking:
  – Exploring researcher degrees of freedom to find a significant effect
  – Implicit bias or explicit “data manipulation”

• Is there evidence for it in psychology?

2. Statistics (i): Beware p-hacking

[Figure: distribution of p-values in published articles, with a spike just below .05; Masicampo et al. (2012)]

2. Statistics (i): Beware p-hacking

• Solution to p-hacking?
  – Pre-registered analysis plans (see later)


3. Statistics (ii): Know your p-values

• Researchers are not cognisant of the influence of p-hacking on inference because most don’t understand the p-value (Cumming, 2012)

• Quiz: a p-value is…

a) the probability that the results are due to chance; i.e. the probability that the null is true

b) the probability that the results are not due to chance; i.e. the probability that the null is false

c) the probability of observing results as extreme (or more) as those obtained, if the null is true

d) the probability that the results would be replicated if the experiment were conducted a second time

(Only c is correct.)
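Definition (c) can be checked by simulation (Python with numpy/scipy; the sample sizes and effect size below are illustrative): the p-value is just the long-run proportion of null datasets producing a test statistic at least as extreme as the one observed.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
a = rng.normal(0.3, 1, 30)   # illustrative sample (small true effect)
b = rng.normal(0.0, 1, 30)

t_obs, p_obs = stats.ttest_ind(a, b)

# Simulate the null: both samples drawn from the same population.
null_t = np.array([
    stats.ttest_ind(rng.normal(0, 1, 30), rng.normal(0, 1, 30)).statistic
    for _ in range(20_000)
])
p_sim = np.mean(np.abs(null_t) >= abs(t_obs))

print(f"scipy p-value:     {p_obs:.4f}")
print(f"simulated p(D|H0): {p_sim:.4f}")   # the two should agree closely
```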

3. Statistics (ii): Know your p-values

• I argue they don’t provide the information researchers are interested in. (You are free to disagree.)

• Probability of the data, given the null hypothesis: p(D|H0)

• Aren’t we interested in p(H|D)?
  – Specifically, p(H1|D)

3. Statistics (ii): Know your p-values

• Is p(D|H) the same as p(H|D)?

• Probability of a person being dead, given they’ve been murdered
  – p(Dead|Murdered) = 1

• Probability of a person having been murdered, given they are dead
  – p(Murdered|Dead) < .001

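Bayes’ rule makes the inversion explicit. A toy calculation (plain Python; the base rates are rough illustrative assumptions, not real statistics):

```python
p_dead_given_murdered = 1.0   # murder entails death
p_murdered = 0.00001          # assumed yearly base rate of being murdered
p_dead = 0.01                 # assumed yearly base rate of dying

# Bayes' rule: p(M|D) = p(D|M) * p(M) / p(D)
p_murdered_given_dead = p_dead_given_murdered * p_murdered / p_dead
print(f"p(Murdered|Dead) = {p_murdered_given_dead:.5f}")   # 0.001, not 1
```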

3. Statistics (ii): Know your p-values

• Consider data with p < .001
  – Should we reject H0? Surely that depends on H1?
  – If H1 represents a very small effect, the data might be just as unlikely under H1 as under H0

• The strength of evidence for H0 needs to be compared to the strength of evidence for H1, conditioned on the data
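A small numerical sketch of this point (Python with scipy; all numbers are illustrative assumptions): data can earn p < .001 against H0 while being only modestly more likely under a small-effect H1 than under H0, so the p-value alone cannot quantify the relative evidence.

```python
from scipy import stats

n = 50
se = 1 / n ** 0.5        # standard error of the mean (population sd = 1)
m_obs = 0.5              # observed sample mean

p_value = 2 * stats.norm.sf(m_obs / se)               # ≈ .0004 vs H0: mu = 0
like_h0 = stats.norm.pdf(m_obs, loc=0.0, scale=se)    # likelihood under H0
like_h1 = stats.norm.pdf(m_obs, loc=0.05, scale=se)   # H1: a very small effect

print(f"p-value against H0:     {p_value:.5f}")
print(f"likelihood ratio H1/H0: {like_h1 / like_h0:.2f}")
# Despite p < .001, the data are only ~3x more likely under this
# small-effect H1 than under H0: modest evidence, not overwhelming.
```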


4. Statistics (iii): Boost your power

• Power:
  – The probability of finding an effect in your study, if the effect is real

• Think back to your last study:
  – What was the power of that study?
  – How many participants did you need to obtain power = 0.80?
  – Did you plan your sample size based on power?

4. Statistics (iii): Boost your power

• Psychological studies are woefully under-powered (Asendorpf et al., 2013)
  – Median effect size is d = 0.50
  – Median sample size is 40
  – Power = 0.35 (!!!)

• Translation: the typical psychology study has a 35% chance of finding a real effect
  – Would you fund such studies?
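The slide’s arithmetic can be reproduced with a standard power calculator (Python’s statsmodels; reading the median N of 40 as 20 per group in a two-sample t-test is our assumption, not a figure stated in the talk):

```python
# Power of the "typical" psychology study: d = 0.5, alpha = .05,
# two-sided, assuming 20 participants per group.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

power = analysis.solve_power(effect_size=0.5, nobs1=20, alpha=0.05)
print(f"power at d = 0.5, n = 20/group: {power:.2f}")      # ≈ 0.34

# Per-group n needed for conventional 80% power, and for the 92% mean
# power of the OSC replication studies (d = 0.5 assumed by us).
n_80 = analysis.solve_power(effect_size=0.5, power=0.80, alpha=0.05)
n_92 = analysis.solve_power(effect_size=0.5, power=0.92, alpha=0.05)
print(f"n per group for 80% power: {n_80:.0f}")            # ≈ 64
print(f"n per group for 92% power: {n_92:.0f}")            # ≈ 91
```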

4. Statistics (iii): Boost your power

• Why are they underpowered?
  – Misunderstanding / lack of appreciation of power
  – Large studies are expensive
  – Large studies are time consuming
  – We need to publish MORE papers, MORE frequently

4. Statistics (iii): Boost your power

• The OSC replication studies, by contrast, had a mean power of 0.92


5. Open data, open materials, open analysis

• Make your experimental materials, data, & analysis scripts freely available online

  – Others can easily replicate your work
  – Others can check your data
  – Others can check the reproducibility of your analysis

5. Open data, open materials, open analysis

• Open Materials
  – Publish your experimental scripts (or equivalent) online together with your publication
  – Encourages others to replicate your work
  – Encourages others to engage with your methods
  – Encourages collaboration

5. Open data, open materials, open analysis

• Open Data
  – Allows independent verification of data quality; allows others to reproduce your analysis; transparency enhances trust in your findings; encourages collaboration…
  – Authors in most psychology journals are obligated to share data if asked
    • Just cut out the need to ask…

5. Open data, open materials, open analysis

• Open Data
  – Many are hesitant to share data due to fear of errors
  – A nice knock-on effect: if you know you are going to post your data online, and you are worried people will find an error, you will ensure your data is of high quality/integrity
    • So, if everyone did this, more data would be of higher quality…

5. Open data, open materials, open analysis

• Open Analysis
  – Post your analysis scripts online together with your raw data
  – Reproducibility of a research finding from the original data is a necessary requirement for replication
    • Researcher B, following the same methodology & analysis procedure on Researcher A’s data set, gets exactly the same results originally reported by Researcher A
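A minimal sketch of what such an open analysis script might look like (Python; the file name, column names, and exclusion rule are hypothetical placeholders): everything from raw data to the reported statistics in one runnable file.

```python
import pandas as pd
from scipy import stats

# Hypothetical raw data file with columns "group" and "rt".
data = pd.read_csv("raw_data.csv")

# Every processing step lives in code, so Researcher B can rerun it.
data = data[data["rt"] > 150]   # exclusion rule: drop anticipations
control = data.loc[data["group"] == "control", "rt"]
treatment = data.loc[data["group"] == "treatment", "rt"]

t, p = stats.ttest_ind(control, treatment)
print(f"control M = {control.mean():.1f}, treatment M = {treatment.mean():.1f}")
print(f"t = {t:.2f}, p = {p:.4f}")
```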

5. Open data, open materials, open analysis

• Open Analysis

  – Move away from menu-driven statistics programs
    • They leave no trace of your analysis steps
    • Scripts, in contrast, are fully reproducible

  – How confident are you that you could reproduce your exact reported findings given the same raw data?

5. Open data, open materials, open analysis

• Open Analysis
  – “I don’t have time to learn a new statistics program”
  – I understand…
  – Encourage your new MSc/PhD students to learn one


6. Conduct pre-registered confirmatory studies

• Confirmatory research:
  – Theory-driven
  – Hypotheses formed a priori
  – Methods decided a priori
  – Analytical methods decided a priori
  – …no exploitation of researcher degrees of freedom

6. Conduct pre-registered confirmatory studies


• Exploratory research is fine
  – But it must NOT be presented as though it were confirmatory
  – For example: using 20 dependent variables in a “scatter-gun” approach, then reporting only the 1 DV with p < .05 as if it had been hypothesised all along…

6. Conduct pre-registered confirmatory studies

• HARKing:
  – “Hypothesising After the Results are Known”

• Is there evidence for HARKing?
  – Anecdotal: how many times has an editor told you to change your introduction to better “tell the story” of your data?
  – 92% of psychology articles report confirmed hypotheses (Fanelli, 2010)

6. Conduct pre-registered confirmatory studies

How to conduct confirmatory research:

• Decide upon study details a priori:
  – Hypotheses to test
  – Number of subjects
  – Exact conditions
  – Which DVs to use
  – Analytical strategies (outlier management etc.)
  – Exclusion criteria
  – etc.

• Only THEN start recruiting
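As a sketch of what “deciding a priori” can look like in practice, the snippet below freezes an illustrative analysis plan (echoing the deck’s Smith (2002) example) and fingerprints it. In reality you would lodge the plan with a public registry such as the OSF, which timestamps it for you; every field value here is an assumption for illustration.

```python
import hashlib
import json

# Illustrative pre-registered plan; all values are hypothetical.
analysis_plan = {
    "hypothesis": "New encoding strategy improves working memory capacity",
    "n_per_group": 64,                # from an a priori power analysis
    "conditions": ["control", "strategy"],
    "dependent_variable": "WM capacity estimate (k)",
    "test": "independent-samples t-test, two-sided, alpha = .05",
    "exclusions": "accuracy < 60% on catch trials",
}

# A cheap tamper-evident fingerprint of the frozen plan; a public
# registry timestamp does this job properly.
frozen = json.dumps(analysis_plan, sort_keys=True).encode()
print("plan sha256:", hashlib.sha256(frozen).hexdigest())
```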

6. Conduct pre-registered confirmatory studies

How to conduct confirmatory research:

• Pre-register your study

[Diagram: pre-registration workflow, beginning with “Design Study”]


7. Incorporate open science practices in teaching

• We have a professional responsibility to ensure our students (graduate and undergraduate) are knowledgeable about good, open scientific practice

7. Incorporate open science practices in teaching

• We need to:
  – Ensure the next generation of researchers moves on from the reproducibility crisis
  – Teach the importance of conducting well-powered studies
  – Encourage critical evaluation of published studies in terms of open science practices
  – Encourage transparency, accurate documentation of scientific methods, and archiving of data
  – Encourage replication attempts, and not view them as “weaker” final year projects


8. Insist on open science practices as reviewers

• Open science practice leads to more reliable, reproducible science
  – We would like more people to be doing this

• Top-Down Model:
  – Wait for journals to insist on these practices

• Bottom-Up Model:
  – Incentivise these practices as editors/reviewers

8. Insist on open science practices as reviewers

• As of January 1, 2017, signatories (as reviewers and/or editors) make open practices a pre-condition for more comprehensive review

8. Insist on open science practices as reviewers

• Signatories will not offer comprehensive review for, nor recommend the publication of, any manuscript that does not meet the following minimum requirements:
  – Data should be made publicly available
  – Stimuli & materials should be made publicly available
  – Clear reasons should be provided for why data and/or materials cannot be open
  – Documents containing details for interpreting any data files or analysis code should be made available with the above items
  – The location of all of these files should be advertised in the manuscript, and all files should be hosted by a reliable third party


9. Reward open science practices

• Strong incentives to pursue new ideas
  – Publications
  – Grant income
  – Employment
  – Promotion / tenure
  – …Fame

• What incentives are there to conduct correct research?

9. Reward open science practices

Good for Science:
  – Truth seeking
  – Rigour
  – Quality
  – Reproducibility
  – Knowledge for its own sake

Good for Individuals/Institutions:
  – Publishable
  – Quantity
  – Novelty
  – Impact

9. Reward open science practices

• “The problem is that the incentives for publishable results can be at odds with the incentives for accurate results… the solution requires making the incentives for getting it right competitive with the incentives for getting it published” (Nosek et al., 2012)

9. Reward open science practices

• Institutions have a key role to play
  – Journal reputations are at stake (cf. Psychological Science, PNAS)
  – But also, university reputations are at stake
  – Change is coming. Do we want to lead the change, or react to it?

9. Reward open science practices

• (Utopian) ideas for Institutions:
  – Doing research right takes longer: be tolerant of lower output (if faculty are doing it right)
  – Incorporate the design quality of studies in unit assessments of faculty output
  – Provide “bonus points” for the number of pre-registered studies (regardless of outcome)
  – Provide “bonus points” for the number of studies with power greater than 90%
  – Incorporate reproducibility rates of research in unit assessments of faculty output


10. Incorporate open science into hiring decisions

• A thought-experiment…

Nosek et al. (2012)

10. Incorporate open science into hiring decisions

• Universe A & B:
  – Investigating embodiment of political extremism
  – Participants (N = 1,979!) from the political left, right, and center
  – Perceptual judgement task
  – Moderates perceived shades of grey more accurately than the left or right (p < .01)

Nosek et al. (2012)

10. Incorporate open science into hiring decisions

• Universe A:
  – The researcher writes the results up & submits to Nature
  – “Political extremists perceive the world in black and white, figuratively and literally”
  – 241 citations in the past 2 years

Nosek et al. (2012)

10. Incorporate open science into hiring decisions

• Universe B:
  – The researcher knows about researcher degrees of freedom
  – Conducts a pre-registered replication of the original study (99.5% power)
  – Finds no difference in accuracy between political views (p = .59)
  – No publication

Nosek et al. (2012)

10. Incorporate open science into hiring decisions

• In which universe will this student most likely receive a lectureship position?

• Universe A, most likely
  – There is something wrong with hiring decisions if “getting it published” is rewarded more than “getting it right”.

Nosek et al. (2012)

10. Incorporate open science into hiring decisions

• (Utopian) ideas for hiring committees:
  – Look for evidence of open science practice
  – Have open science practice as a “desired” (or “essential”!) item on the job specification
  – Judge publication quality rather than quantity
    • Ask candidates to submit their “best 5” papers, and have the sub-committee see only these
  – Judge these publications on impact (citation count, IF of journal etc.), but also on power, pre-registration, and reproducibility (open data etc.)

Thank You!

j.a.grange@keele.ac.uk
