The challenge for the Neurosciences (and all bioscience)
Shai D. Silberberg
National Institute of Neurological Disorders and Stroke, NIH
Trial Watch: Phase II and Phase III attrition rates 2011–2012
Arrowsmith & Miller, Nature Reviews Drug Discovery, 2013; 12:569
Is poor translatability primarily due to the inadequacy of animal models?
Prinz, Schlange and Asadullah
Bayer HealthCare
Nature Reviews Drug Discovery
2011; 10:712-713
In-house experiments failed to reproduce the published findings in 43 of 67 projects
Nature, Vol. 505, pp. 612-13, 30 January 2014
Trans-NIH actions - Pilots
Pilot Focus Types of Efforts Being Developed
Evaluation of scientific premise in grant applications
New Funding Opportunities with additional review criteria regarding scientific premise
Checklist and Reporting Guidelines Reviewer checklists regarding reporting standards and scientific rigor
Changes to Biosketch Biosketch pilot with focus on accomplishments and not just publications
Approaches to reduce "perverse incentives” to publish
Exploring award options with a longer period of support for investigators
Supporting replication studies New Funding Opportunities for replication studies, and options to assess whether pre-clinical findings should be replicated
Training Developing materials on experimental design
Other efforts Use of Prize Challenges to encourage reproducibility of results, PubMed Commons
What are the causes of poor reproducibility?

Experimental bias × Chance × Publication bias × Lack of transparency = Poor reproducibility

Contributing factors include confounding variables, cutting-edge science, and resources.
These sources of irreproducibility can be probed with systematic reviews and meta-analysis.
We are all prone to bias!
Experimental bias × Chance × Publication bias × Lack of transparency = Poor reproducibility
(recurring framework slide; addressed through systematic reviews and meta-analysis)
Chalmers et al., N Engl J Med 1983; 309: 1358-1361
Effect size for studies of FK506 (Tacrolimus) in experimental stroke.
Sena et al., Trends Neurosci 2007; 30: 433-439
The fewer methodological parameters reported, the greater the apparent efficacy!
Sena et al., JCBFM. 2014; 34: 737-742
Experimental bias × Chance × Publication bias × Lack of transparency = Poor reproducibility
(recurring framework slide; addressed through systematic reviews and meta-analysis)
Amyotrophic lateral sclerosis (ALS)
Death within 5 years of diagnosis
Central pathological finding is motor neuron death
3% of cases arise from gain-of-function mutations in SOD1
Rodents over-expressing SOD1 recapitulate ALS
2002: Minocycline reported by a number of groups to extend survival of SOD1 mice
2003: Randomized placebo controlled trial (412 patients treated for 9 months)
2007: Results of the trial are published - minocycline found to have a harmful effect on patients with ALS
“In the past five years we have screened more than 70 drugs in 18,000 mice across 221 studies, using rigorous and appropriate statistical methodologies. While we were able to measure a significant difference in survival between males and females with great sensitivity, we observed no statistically significant positive (or negative) effects for any of the 70 compounds tested, including several previously reported as efficacious.”
Scott et al., Amyotroph Lateral Scler 2008; 9: 4-15
ALS Therapy Development Institute (ALS TDI)
“…the majority of published effects are most likely measurements of noise in the distribution of survival means as opposed to actual drug effect.”
2241 SOD1G93A control mice
The probability of seeing an apparent effect by chance alone is substantial even with 10 animals per group
Scott et al., Amyotroph Lateral Scler 2008; 9: 4-15
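The point about chance can be illustrated with a simple simulation (not from Scott et al.; the survival distribution here is an arbitrary stand-in): with 10 animals per group and no true drug effect, roughly 5% of comparisons still cross p < 0.05, and across a 70-compound screen a spurious “hit” is almost guaranteed.

```python
import numpy as np

rng = np.random.default_rng(0)
n, experiments = 10, 20_000

# Simulated survival times (days); both groups drawn from the SAME distribution,
# i.e., the drug truly does nothing
control = rng.normal(130.0, 10.0, size=(experiments, n))
treated = rng.normal(130.0, 10.0, size=(experiments, n))

# Pooled two-sample t statistic for each simulated experiment
diff = treated.mean(axis=1) - control.mean(axis=1)
sp2 = (control.var(axis=1, ddof=1) + treated.var(axis=1, ddof=1)) / 2
t = diff / np.sqrt(sp2 * 2 / n)

T_CRIT = 2.101  # two-tailed critical t, df = 18, alpha = 0.05
false_positive_rate = np.mean(np.abs(t) > T_CRIT)
print(f"per-experiment false-positive rate: {false_positive_rate:.3f}")  # ~0.05

# Screening 70 compounds at alpha = 0.05: chance of at least one spurious "hit"
print(f"chance of >=1 false positive across 70 compounds: {1 - 0.95 ** 70:.2f}")  # ~0.97
```

At 5% per comparison the family-wise error across 70 independent screens is 1 − 0.95⁷⁰ ≈ 97%, which is one way to read the "measurements of noise" conclusion above.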
SOD1G93A transgenic mice
Started at 10 weeks of age
i.p. 25 and 50 mg/kg/day
7 animals / group (females)
Not randomized
“The experimenter was blinded to the treatment protocol.”
SOD1G93A transgenic mice
Started at 5 weeks of age
i.p. 10mg/kg/day
10 animals / group (sex?)
Not randomized
Not blinded
The survival benefit of minocycline in the SOD1G93A mouse model of ALS is likely due to small sample size and/or bias
Lack of transparency
XChanceXExperimental
biasPoor
reproducibility=
Publication bias
X
Systematic reviews and Meta-analysis
Publication decisions and their possible effects on inferences drawn from tests of significance — or vice versa
Theodore D. Sterling, University of Cincinnati
“There is some evidence that in fields where statistical tests of significance are commonly used, research which yields nonsignificant results is not published. Such research being unknown to other investigators may be repeated independently until eventually by chance a significant result occurs - an ‘error of the first kind’ - and is published.”
Journal of the American Statistical Association, 1959; 54:30-34
The “File Drawer Problem” and Tolerance for Null Results
Robert Rosenthal, Harvard University
“For any given research area, one cannot tell how many studies have been conducted but never reported.
The extreme view of the ‘file drawer problem’ is that journals are filled with the 5% of the studies that show Type I errors, while the file drawers are filled with the 95% of the studies that show non-significant results.”
Psychological Bulletin, 1979; 86: 638-641
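Rosenthal's paper also proposed a “fail-safe N”: the number of unpublished null-result studies that would have to sit in file drawers to drag the combined result of k published studies back to non-significance. A minimal sketch, assuming the standard formula (ΣZ)² / z² − k with one-tailed z = 1.645 (the example numbers are hypothetical):

```python
def fail_safe_n(z_scores):
    """Rosenthal's fail-safe N: how many unpublished studies averaging
    Z = 0 would be needed to raise the combined one-tailed p of the
    k published studies above 0.05 (combined Z below 1.645)."""
    k = len(z_scores)
    z_sum = sum(z_scores)
    return (z_sum ** 2) / (1.645 ** 2) - k

# e.g., 10 published studies, each with Z = 2.0 (one-tailed p ~ 0.023)
print(round(fail_safe_n([2.0] * 10)))  # → 138
```

A small fail-safe N relative to k suggests the literature's combined result could plausibly be a file-drawer artifact.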
“We evaluated 340 articles included in prognostic marker meta-analyses (Database 1) and 1575 articles on cancer prognostic markers published in 2005 (Database 2).
…Only five articles in Database 1 (1.5%) and 21 in Database 2 (1.3%) were fully ‘negative’ for all presented results in the abstract and without efforts to expand on non-significant trends or to defend the importance of the marker with other arguments.”
European Journal of Cancer, 2007; 43: 2559 - 2579
Experimental bias × Chance × Publication bias × Lack of transparency = Poor reproducibility
(recurring framework slide; addressed through systematic reviews and meta-analysis)
Journal of Cerebral Blood Flow & Metabolism (2014) 34, 737–742
Vesterinen et al., J. Neurosc. Meth. 2014; 221: 92-102
“Systematic review sets out to use a structured process to identify all data relevant to a specific research question.”
“This may be followed by meta-analysis, a statistical process that provides a summary estimate of the outcomes from a group of studies, and allows these outcomes from different groups of studies to be compared.”
Sena et al., JCBFM. 2014; 34: 737-742
What is the difference between systematic reviews and meta-analysis?
“Because certain aspects of experimental design can introduce bias to the results of relevant studies, a central part of a Cochrane review is to include only those studies meeting a certain threshold of internal validity to allow confidence in the results reported.
In contrast, preclinical reviews are typically exploratory. Because the summary estimate of the effectiveness of an intervention in animal models is, of itself, not particularly useful information, the practice has been to include all available data.
This is useful for identifying if there are any gaps in the data.”
Vesterinen et al., J. Neurosc. Meth. 2014; 221: 92-102
Perrin, Nature 2014; 507: 423-425
“Dependency”
[Figure: Scenario A vs. Scenario B]
The use of “precision” to weigh results
1) The effect sizes of individual studies are determined
2) The precision of an effect size is determined by its standard error
3) More precise studies are given more weight in a meta-analysis than less precise ones
Sena et al., JCBFM. 2014; 34: 737-742
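Steps 1–3 amount to fixed-effect, inverse-variance pooling. A minimal sketch (the three study values are made up for illustration):

```python
import numpy as np

def fixed_effect_meta(effects, standard_errors):
    """Inverse-variance (fixed-effect) pooling: each study's weight is
    1/SE^2, so more precise studies pull the summary estimate harder."""
    effects = np.asarray(effects, dtype=float)
    w = 1.0 / np.asarray(standard_errors, dtype=float) ** 2
    pooled = np.sum(w * effects) / np.sum(w)
    pooled_se = np.sqrt(1.0 / np.sum(w))
    return pooled, pooled_se

# Three hypothetical studies: the most precise one (SE = 0.1) dominates
pooled, se = fixed_effect_meta([0.30, 0.50, 0.80], [0.10, 0.20, 0.40])
print(f"pooled effect = {pooled:.3f} +/- {se:.3f}")  # → pooled effect = 0.362 +/- 0.087
```

Note how the pooled estimate (0.362) sits close to the most precise study (0.30), not the unweighted mean (0.53).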
Trim-and-fill
Sena et al., PLOS Biology 2010; 8(3) e1000344
Estimating publication bias using trim-and-fill
“Trim-and-fill analysis suggested that publication bias might account for around one-third of the efficacy reported in systematic reviews, with reported efficacy falling from 31.3% to 23.8% after adjustment for publication bias.”
“We identified 16 systematic reviews of interventions tested in animal studies of acute ischaemic stroke involving 525 unique publications.
Only ten publications (2%) reported no significant effects on infarct volume and only six (1.2%) did not report at least one significant finding.”
Sena et al., PLOS Biology 2010; 8(3) e1000344
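Why publication bias inflates summary estimates can be shown with a toy simulation (the numbers are arbitrary, not Sena et al.'s data): if only nominally significant studies reach print, the mean of the published literature sits far above the true effect.

```python
import numpy as np

rng = np.random.default_rng(1)
true_effect, se, n_studies = 0.10, 0.20, 5_000

# Each "study" estimates a small true effect with sampling error
estimates = rng.normal(true_effect, se, n_studies)
z = estimates / se

# Publication bias: only nominally significant results (z > 1.96) reach print
published = estimates[z > 1.96]
print(f"true effect:              {true_effect:.2f}")
print(f"mean of all studies:      {estimates.mean():.2f}")
print(f"mean of published subset: {published.mean():.2f}")
```

With these settings the published subset overstates the true effect several-fold, which is why naive pooling of published animal studies can report efficacy where little exists.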
Trim-and-fill underestimates publication bias
Chopra et al., PNAS. 2007; 104: 16685-16689
In trim-and-fill all comparisons are included
“There is as much controversy over what constitutes an outlier as whether to remove them or not. Simple rules of thumb (e.g., data points three or more standard deviations from the mean) are good starting points. Some researchers prefer visual inspection of the data.”
Osborne & Overbay, Practical Assessment, Research & Evaluation, 9(6), 2004
Better precision due to removal of “outliers”
Better precision due to systematic differences between comparison groups
Technical vs. biological replicates
Neurobehavioural score in models of encephalomyelitis
Better precision due to chance
Vesterinen et al., Multiple Sclerosis. 2010; 16: 1044-1055
Improving the translational hit of experimental treatments in multiple sclerosis
What roles should systematic reviews and meta-analyses of animal studies play at the present time?
Provide a summary of available data
Identify gaps in the data
Highlight limitations in data quality
Serve as a wakeup call
Reveal sources of bias
Function as an education tool
Refrain from providing estimates of efficacy or publication bias!
“One important purpose (and perhaps the single most important impact) of systematic reviews of preclinical studies has been to explore the impact of possible sources of bias, and we recommend that this is carried out in all systematic reviews.”
Vesterinen et al., J. Neurosc. Meth. 2014; 221: 92-102