Don’t make a fool of yourself · Foster Open Science · 2018-06-25


Don’t make a fool of yourself
Reputation and performance evaluation in academia

This presentation is licensed under a CC-BY 4.0 license. You may copy, distribute, and use the slides in your own work, as long as you give attribution to the original author on each slide that you use.

2018-06-20

PD Dr. Felix Schönbrodt
Ludwig-Maximilians-Universität München

www.nicebread.de
www.researchtransparency.org

@nicebread303


Thesis 1: Our current indicators for scientific quality do a bad job.

Thesis 2: Our current incentives foster bad science.

Thesis 3: Some ideas for how good scientific practice and incentive structures can be realigned.

Thesis 1: Our current indicators for scientific quality do a bad job.

Journal Impact Factor (JIF)

Lariviere, V., Kiermer, V., MacCallum, C. J., McNutt, M., Patterson, M., Pulverer, B., Swaminathan, S., et al. (2016). A simple proposal for the publication of journal citation distributions. bioRxiv, 062109. doi:10.1101/062109

JIF = 35

76% of papers have fewer citations than the JIF

2.1% of papers are never cited
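The mismatch between a journal's JIF and its typical paper follows directly from the skewness of citation distributions: the JIF is a mean, and in a right-skewed distribution most papers fall below the mean. A minimal sketch with simulated data (a hypothetical log-normal distribution, not the Larivière et al. dataset):

```python
# Hypothetical illustration: draw citation counts from a right-skewed
# (log-normal) distribution and compare the mean ("JIF") with the
# typical paper.
import random

random.seed(1)

# 10,000 simulated papers; parameters chosen only to produce heavy skew.
citations = [int(random.lognormvariate(mu=1.5, sigma=1.5)) for _ in range(10_000)]

jif = sum(citations) / len(citations)                     # the JIF is just a mean
below = sum(c < jif for c in citations) / len(citations)  # share of papers below it
uncited = sum(c == 0 for c in citations) / len(citations)

print(f"'JIF' (mean citations): {jif:.1f}")
print(f"Share of papers cited less than the mean: {below:.0%}")
print(f"Share never cited: {uncited:.0%}")
```

With any strongly skewed distribution, roughly three quarters of papers land below the mean, mirroring the pattern on the slide.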


Journal Impact Factor (JIF)

[Figure: quality of crystallographic structures vs. JIF; the arrow marks the direction of "better" quality. Brown and Ramaswamy, 2007]

Journal Impact Factor (JIF)

• Objective quantification of crystallographic quality: higher JIF ➙ lower quality (Brown and Ramaswamy, 2007)

• Negative relationship between JIF and statistical power/sample sizes in psychology (Fraley & Vazire, 2014; Szucs & Ioannidis, 2016)

• Positive relationship between JIF and objective errors in gene names in Excel sheets (Ziemann, Eren, & El-Osta, 2016)

• Positive relationship between JIF and the frequency of retractions (Brembs, Button, & Munafò, 2013)

For an overview, see Brembs, Button, & Munafò (2013) and http://bjoern.brembs.net/2016/01/even-without-retractions-top-journals-publish-the-least-reliable-science/

"Double blind peer review is the hallmark of scientific quality control"

Reliability for single case diagnostics

How well do reviewers agree in their assessment of a paper? ➙ interrater agreement

Krohne, H. W., & Hock, M. (2015). Psychologische Diagnostik: Grundlagen und Anwendungsfelder. Kohlhammer Verlag.


Shin, J. C., Toutkoushian, R. K., & Teichler, U. (2011). University Rankings: Theoretical Basis, Methodology and Impacts on Global Higher Education. Springer Science & Business Media, p. 151. Quote first seen in a presentation by Margit Osterloh.

„When I divide the week’s contribution into two piles – one that we are going to publish and the other we are going to return – I wonder whether it would make any real difference to the journal or its readers if I exchanged one pile for another“.

Sir Theodore Fox in The Lancet, 1965

Peer review

• The classic: Would top journals accept already published papers once more? Peters and Ceci (1982)

• Meta-analysis of reviewer agreement (k = 48, 19,443 manuscripts): ⌀ ICC = .34, kappa = .17 (Bornmann, Mutz, & Daniel, 2010)

• "Agreement about shared values" ≠ "agreement about the true value" ➙ correlation with the "true value" ≤ 1 ➙ estimated correlation of reviewers' assessments with the "true value" of a paper: r = .09–.27 (mean r = .18; explained variance: 3%) (Starbuck, 2004)

• Decisions depend heavily on the (random) selection of reviewers ➙ a lottery (Bornmann & Daniel, 2009)

• Summary: Pre-publication peer review in the current system has (perceived) value as feedback to improve a manuscript, but virtually no value as quality control; it is very inefficient.

For an overview, see Osterloh, M., & Kieser, A. (2015).
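The agreement statistics above can be made concrete with a small illustration. The verdict data below are invented; only the definition of Cohen's kappa and Starbuck's r = .18 come from the cited sources:

```python
# Hypothetical illustration: what weak chance-corrected agreement looks
# like for two reviewers making accept/reject recommendations, plus the
# "explained variance" implied by Starbuck's validity estimate.

def cohens_kappa(pairs):
    """Cohen's kappa for two raters with binary verdicts (1 = accept)."""
    n = len(pairs)
    observed = sum(a == b for a, b in pairs) / n
    # Chance agreement from each rater's marginal accept rate.
    p1 = sum(a for a, _ in pairs) / n
    p2 = sum(b for _, b in pairs) / n
    expected = p1 * p2 + (1 - p1) * (1 - p2)
    return (observed - expected) / (1 - expected)

# Invented verdicts for 100 manuscripts: agreement only modestly above chance.
verdicts = [(1, 1)] * 16 + [(1, 0)] * 19 + [(0, 1)] * 19 + [(0, 0)] * 46

print(f"kappa = {cohens_kappa(verdicts):.2f}")  # weak, chance-corrected agreement

# Starbuck (2004): reviewer assessment vs. 'true value' of a paper.
r = 0.18
print(f"explained variance = {r**2:.0%}")  # r² = .18² ≈ 3%
```

Note that 62% raw agreement still yields a kappa below .20 once chance agreement is removed, and that a validity of r = .18 means reviews explain about 3% of the variance in a paper's "true value".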

Emmy-Noether-Programm

[Figures: funded vs. rejected applicants compared on number of publications, average JIF, and citations per paper.]

Böhmer et al. (2008); Hornbostel, S., Böhmer, S., Klingsporn, B., Neufeld, J., & von Ins, M. (2008); Neufeld, J. (2016).

Emmy-Noether-Programm

"Taken together, the bibliometric results show remarkably small differences between funded and rejected applicants (prior to funding). Moreover, these small differences are not increased by the fact that one of both groups gets the funding of the Emmy Noether program* and the other doesn’t."

* €1–1.5 million

Böhmer et al. (2008)

Thesis 2: Our current incentives foster bad science.


"Much of the scientific literature, perhaps half, may simply be untrue. Part of the problem is that no one is incentivised to be right."

Richard Horton, Editor of The Lancet

Quantity, not quality

Actual (not desired) relevance at professorship hiring committees, by rank:

1. Number of peer-reviewed publications
2. Fit of research profile to the advertising institution
3. Quality of research talk
4. Number of publications
5. Volume of acquired third-party funding
6. Number of first authorships
…

Abele-Brehm, A. E., & Bühner, M. (2016). Wer soll die Professur bekommen? Psychologische Rundschau, 67(4), 250–261. http://doi.org/10.1026/0033-3042/a000335


Ideal strategy for a high quantity of publications: small n + many studies + questionable research practices (QRPs), such as p-hacking

"The rules of the game" · "The natural selection of bad science"

Bakker, M., van Dijk, A., & Wicherts, J. M. (2012). The Rules of the Game Called Psychological Science. Perspectives on Psychological Science, 7(6), 543–554. http://doi.org/10.1177/1745691612459060
Smaldino, P. E., & McElreath, R. (2016). The natural selection of bad science. Royal Society Open Science, 3(9), 160384. http://doi.org/10.1098/rsos.160384
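Why QRPs pay off can be seen in a minimal simulation of one such practice, optional stopping. This is my own sketch, not code from the cited papers: with no true effect, testing repeatedly while adding participants and stopping at the first p < .05 inflates the nominal 5% false-positive rate considerably.

```python
# Minimal sketch of one QRP (optional stopping): run a study with a small
# starting sample, test after every few added participants, and stop as
# soon as p < .05. The null is true throughout.
import random
import statistics as stats
from math import erf, sqrt

random.seed(7)

def p_value_one_sample(xs):
    """Two-sided test of mean 0 (normal approximation, for illustration)."""
    n = len(xs)
    se = stats.stdev(xs) / sqrt(n)
    z = abs(stats.mean(xs) / se)
    return 2 * (1 - 0.5 * (1 + erf(z / sqrt(2))))

def one_study(start_n=10, step=5, max_n=50):
    xs = [random.gauss(0, 1) for _ in range(start_n)]  # no true effect
    while True:
        if p_value_one_sample(xs) < 0.05:
            return True                  # "significant" -> publish
        if len(xs) >= max_n:
            return False                 # file drawer
        xs += [random.gauss(0, 1) for _ in range(step)]

false_positives = sum(one_study() for _ in range(2_000)) / 2_000
print(f"False-positive rate with optional stopping: {false_positives:.0%}")
```

The rate lands well above the nominal 5%, and combining several QRPs pushes it higher still — which is exactly the strategy the incentive structure rewards.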

"Innovative, unprecedented, transformative!" +880% from 1974 to 2014

Vinkers, C. H., Tijdink, J. K., & Otte, W. M. (2015). Use of positive and negative words in scientific PubMed abstracts between 1974 and 2014: retrospective analysis. BMJ, 351, h6467. http://doi.org/10.1136/bmj.h6467

Amazing!!

Enormous!!

Groundbreaking!!!


"In the past decade, the number of retraction notices has shot up 10-fold."

Retractions: +1000% in 10 years
[Chart: retraction notices per year; 2013: 467, 2015: 684]

Scientific misconduct: +1200% in 4 years
[Chart: U.S. Office of Research Integrity case summaries; 2009–2011: 3, 2012–2015: 36]

http://www.nature.com/news/2011/111005/full/478026a/box/2.html
http://retractionwatch.com/2016/03/24/retractions-rise-to-nearly-700-in-fiscal-year-2015-and-psst-this-is-our-3000th-post/
https://www.washingtonpost.com/news/speaking-of-science/wp/2016/04/01/when-scientists-lie-about-their-research-should-they-go-to-jail/
https://ori.hhs.gov/case_summary

Which part of published findings can be independently replicated?

• Psychology (2015; n = 97): 36% replicated (Open Science Collaboration)
• Economics* (2015; n = 67): 49% replicated (Chang & Li)
• Cancer research 1 (2011; n = 67): 21% replicated (Prinz et al.)
• Cancer research 2 (2012; n = 53): 11% replicated (Begley & Ellis)
• Experimental philosophy (2018; n = 40): 78% replicated (Cova et al.)

Open Science Collaboration (2015); Chang & Li (2015); Prinz, F., Schlange, T., & Asadullah, K. (2011); Begley, C. G., & Ellis, L. M. (2012); Cova et al. (2018)

* The data on economics are about reproducibility, i.e., the attempt to get the same results if you apply the original data analysis to the original data set.

Early career researchers are stuck

➙ a felt contradiction between "good research"/"open research" and "having a career in science"

What would be a good balance between Open Science and having a career in academia? […] Being open IMHO is a competitive disadvantage. Can you only afford open science when you are tenured?

My contract is limited to two years – although it would be nice to publish the data, I have no time to do it. I rather have to churn out another publication.

Why should I share my hard-won data with my rivals that presumably compete with me for the next post-doc position?


[Comic: "This is fine" © KC Green]

Thesis 3: Some ideas how to realign good scientific practice and incentive structures.

Publications · Job committees · Funding/Grants

Variant 1: Post-publication Peer Review (PPPR)

• Reviewers agree more when evaluating bad research ➙ negative selection is possible; assessment of excellence is hardly possible
• Therefore: only perform input control (i.e., cut out the crap); after that, use public post-publication peer review in a lively discourse to sort the excellent from the average

Publications

Kriegeskorte, N. (2012). Open evaluation: A vision for entirely transparent post-publication peer review and rating for science. Frontiers in Computational Neuroscience, 6, 79. http://doi.org/10.3389/fncom.2012.00079

Variant 2: Overlay Journals

1. All papers start as preprints, which receive comments in open peer review and are eventually revised on the preprint server.
2. Journals look out for the best preprints ("reader's digest"), perhaps even compete for the best papers, or authors actively submit their preprint to a journal.
3. Optionally, these papers receive an additional traditional peer review with selected reviewers.
4. The final paper stays on the preprint server and is linked from the overlay journal.

Example: Discrete Analysis (mathematics), no APCs: http://discreteanalysisjournal.com/

Publications

Potential drawbacks

• An even greater flood of papers?
  • Use intelligent recommender systems
  • Overlay journals as a reader's digest
  • Papers without any PPPR typically are not read ("a twilight zone of unevaluated papers")
• Attention economy: success = prolific social media activity?
• See also the discussion and FAQ in Kriegeskorte (2012).

Kriegeskorte, N. (2012). Open evaluation: A vision for entirely transparent post-publication peer review and rating for science. http://doi.org/10.3389/fncom.2012.00079

Publications

✓ Full open access, no APCs
✓ Non-commercial institutional publisher (Linnaeus U library)
✓ Open, citable peer review (with DOI)
✓ Well-powered null results and direct replications welcomed
✓ Registered Reports as an option
✓ Mandatory open data
✓ Open Science badges (including a reproducibility badge)

Publications

https://www.psychopen.eu/

Publications

Research funding

https://dirnagl.com/2014/01/14/otto-warburgs-research-grant/

Grant proposal submitted to the precursor of the German Research Foundation, around 1921 (accepted).

Osterloh & Frey: Aleatoric Principle

• Grant decisions as a lottery: it's not a bug, it's a feature!
• Tiered selection process:
  • Preselection by traditional peer review (but with a clear focus on negative selection only!)
  • Distribute funds either by chance or by traditional selection criteria
  • After some time: compare grant performance of the "chance track" with the "traditional track"
• Applicable to research funding, but also to search committees for professorship positions (cf. the University of Basel in the 18th century)
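The case for a partial lottery follows from the low validity of review scores reported earlier. A quick Monte Carlo sketch with assumed numbers (r = .18 is Starbuck's validity estimate for papers; everything else is invented): when scores correlate only weakly with true quality, funding the top-rated 20% yields only a modest quality gain over funding 20% at random.

```python
# Sketch: mean true quality of funded proposals when funding the top-rated
# 20% vs. 20% at random, assuming review scores correlate with true
# quality at only r = 0.18.
import random
from math import sqrt

random.seed(42)
R, N, SHARE = 0.18, 10_000, 0.20

quality = [random.gauss(0, 1) for _ in range(N)]
# score = R * quality + noise, so corr(score, quality) ≈ R
scores = [R * q + sqrt(1 - R**2) * random.gauss(0, 1) for q in quality]

k = int(N * SHARE)
by_score = sorted(range(N), key=lambda i: -scores[i])[:k]
by_chance = random.sample(range(N), k)

def mean_quality(indices):
    return sum(quality[i] for i in indices) / len(indices)

print(f"Funded by review score: mean quality {mean_quality(by_score):+.2f} SD")
print(f"Funded by lottery:      mean quality {mean_quality(by_chance):+.2f} SD")
```

Under these assumptions, score-based selection captures only a small fraction of the gain a perfectly valid ranking would deliver, which is why a negative-selection filter followed by a lottery is worth testing empirically.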

Funding/Grants

Peer-to-peer funding

http://www.laborjournal-archiv.de/epaper/LJ_17_06/index.html#22
https://dirnagl.com/2017/07/01/start-your-own-funding-organization/
Bollen, J., Crandall, D., Junk, D., Ding, Y., & Börner, K. (2014). From funding agencies to scientific agency: Collective allocation of science funding as an alternative to peer review. EMBO Reports, 15(2), 131–133. https://doi.org/10.1002/embr.201338068

Funding/Grants

Quantity, not quality

Job committees

Actual (not desired) relevance at professorship hiring committees, by rank:

1. Number of peer-reviewed publications
2. Fit of research profile to the advertising institution
3. Quality of research talk
4. Number of publications
5. Volume of acquired third-party funding
6. Number of first authorships
…
17. Quality assessment of the best three publications
…
41. Indicators of research transparency (last of 41)

Abele-Brehm, A. E., & Bühner, M. (2016). Wer soll die Professur bekommen? Psychologische Rundschau, 67(4), 250–261. http://doi.org/10.1026/0033-3042/a000335

Quality, not quantity

Abele-Brehm, A. E., & Bühner, M. (2016). Wer soll die Professur bekommen? Psychologische Rundschau, 67(4), 250–261. http://doi.org/10.1026/0033-3042/a000335

Job committees

Indicators with the largest discrepancy between "desired" and "actual"

Job committees

https://docs.google.com/document/d/1ty43Syw0Flkh8ncjW8MZArIkvYe8hLwwhLlIwbtSk_Y/edit?usp=drive_web&ouid=108982640291853577145

www.uni-koeln.de

The Department of Psychology at the Faculty of Human Sciences of the University of Cologne (UoC) seeks to appoint a

Full Professor (W3) of Social Psychology

to be filled as soon as possible.

The successful candidate is expected to have a record of excellence in social cognition, and/or related areas such as cognitive psychology or motivation science.

The candidate is also expected to strongly contribute to the UoC’s Center for Social and Economic Behavior and the Social Cognition Center Cologne of the Department of Psychology. Both structures are part of UoC’s Key Profile Area II, „Behavioral Economic Engineering and Social Cognition“.

The ideal candidate’s track record should show an excellent fit with these interrelated structures and a strong interest to bridge the fields of social cognition and behavioral economics.

The Department of Psychology aims for transparent and reproducible research (including Open Data, Open Materials, and Preregistrations). Applicants are asked to illustrate how they have pursued these goals in the past and/or how they plan to do so in the future.

We strongly encourage international applicants. Salaries and working conditions at the UoC - one of the German Universities of Excellence – meet international standards. Candidates are expected to be willing to learn the German language. The Faculties offer Bachelor, Master, and doctoral degrees. Courses are taught either in English or German.

Applicants will be hired in concordance with § 36 of the University Law of the State of North-Rhine Westphalia.

The UoC supports diversity, the multiplicity of perspectives, and equal opportunities. The University of Cologne particularly encourages applications from disabled persons. Disabled persons are given preference in case of equal qualification. Women are strongly encouraged to apply. Preferential treatment is given to women if their professional qualifications and abilities are equivalent to those of other applicants.

Applications with the usual documents (including vita, research statement, 5 most important publications, full list of publications and teaching experience, and diplomas) should be submitted via the University’s Academic Job Portal (https://berufungen.uni-koeln.de) until March 30th, 2017.


Since 2015, all professorship job descriptions at LMU Munich's Department of Psychology use this requirement.

See more such prof job ads at: https://osf.io/7jbnt/

Hiring committees: Make „open science“ a desirable or essential job characteristic

http://www.fak11.lmu.de/dep_psychologie/osc/open-science-hiring-policy/index.html

Job committees

CV: only the 10 best publications are allowed, plus extra information that is easy to gather:

• No journal name; the JIF is irrelevant or misleading
• Paper-level citation metrics
• Basic information for judging evidential value
• Open science indicators for judging reproducibility
• Data: own collection or reuse?

Summary

• We apply established methodological quality checks in our "normal" scientific work – we should also apply them to performance evaluations in academia ➙ critical analysis of the current system
• The current incentive structure fosters bad science.
• Early career researchers feel a tension: you can either do "good science" or have a "career in academia". Senior researchers have an obligation to resolve that dilemma for young scientists.
• Alternative ideas and structures for performance evaluation exist and are just waiting to be tested and evaluated.

Recommended