10/27/17
Statistical Problems with Preclinical Research - And How to Fix Them
Timothy T. Houle, PhD
Making Research Findings Less False
Disclosures
• NIH Grants
  – NS065257
  – GM113852
• Industry: None
Why Most Published Research Findings Are False - Ioannidis (2005)
http://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.0020124
Exhibit A: Replication Crisis
Fail to Replicate:   Others' Work   Their Own Work
Chemistry            90%            60%
Biology              80%            60%
Physics              70%            50%
Medicine             70%            60%
Earth Science        60%            40%
Baker (2016). 1,500 scientists lift the lid on reproducibility
Exhibit B: The Mystery of the Disappearing Effect Size
Decline Effect (Rhine, 1930)
Generalizations Decay (1975)
[Figure: effect size plotted against time, with 0 marked on the effect size axis]
Schooler, J. W.; Engstler-Schooler, T. Y. (1990). "Verbal overshadowing of visual memories: Some things are better left unsaid". Cognitive Psychology. 22 (1): 36–71.
Lehrer, J. (2010). "The Truth Wears Off". The New Yorker.
Exhibit C: The Dog That Didn’t Bark
• Prediction models that are not replicated, used, or applied in any way
• 1978 to 2016: 56,202 prediction models
  – 346 replication studies
Causes of the Crisis
• Multifaceted
  – Unprecedented rate of publication
  – Pressures to publish
  – Worship of novelty
  – Fraud
  – P-hacking
  – Researcher degrees of freedom
  – Selective publication
Garden of Forking Paths
Gelman (2013)
t-test
… The data must be a random, representative sample from the process being studied
… The observed data represent one realization of a process that could be indefinitely repeated

Scenario                                         Test Statistic   Description
1. Simple classical test                         T(y)             One planned statistical inference
2. Test pre-chosen from set of possible tests    T(y; f)          One test with pre-registered f
3. Test based on the data                        T(y; f(y))       Only one test. Different test would have been performed given different data
4. Fishing                                       T(y; f_j)        Performing j tests and reporting the best one(s)

f: control variables, covariates, transformations, data coding rules, exclusion, outliers, main effects, interactions, subgroups, alternate outcomes, direction of effect
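
The cost of scenario 4 (fishing) can be illustrated with a small simulation. The sketch below is not from the talk: it assumes independent outcomes, a known standard deviation (so a z-test stands in for the t-test), and no true effect anywhere. The function names and defaults are hypothetical choices made for illustration.

```python
import random
from statistics import NormalDist, mean


def z_test_p(sample_a, sample_b, sigma=1.0):
    """Two-sided z-test p-value for a difference in means (sigma assumed known)."""
    se = sigma * (1 / len(sample_a) + 1 / len(sample_b)) ** 0.5
    z = (mean(sample_a) - mean(sample_b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(n_outcomes, n_per_group=20, n_sims=2000, alpha=0.05, seed=1):
    """Share of null experiments in which the best of n_outcomes tests has p < alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        p_values = []
        for _ in range(n_outcomes):
            # Both groups come from the same distribution: the null is true.
            a = [rng.gauss(0, 1) for _ in range(n_per_group)]
            b = [rng.gauss(0, 1) for _ in range(n_per_group)]
            p_values.append(z_test_p(a, b))
        # Fishing: only the smallest p-value gets reported.
        if min(p_values) < alpha:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    print("one planned test:", false_positive_rate(1))
    print("best of 10 tests:", false_positive_rate(10))
```

Under these assumptions the single planned test should be wrong about 5% of the time, while the best of 10 independent tests should be "significant" in roughly 1 - 0.95^10, or about 40%, of null experiments. Correlated outcomes would inflate the rate less, but the direction is the same.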
Garden of Forking Paths
[Figure: f1 versus f2, with one point labeled 1. Caption: Simple classical test]
Garden of Forking Paths
[Figure: f1 versus f2, with 36 points labeled 4. Captions: Fishing; Tenure!]
Garden of Forking Paths
[Figure: f1 versus f2, with four points labeled 2. Caption: Test pre-chosen (registered) from many tests]
Replication Behavior
[Figure: effect size in the initial study and in the replication, shown for fishing, forking paths, pre-registration, and one-test analyses]
Definition of False
Traditional
• Type I error: Reject null hypothesis when it is true
• Type II error: Fail to reject null hypothesis when it is false
Gelman
• Type M error: Errors in the magnitude of the estimated effect size
• Type S error: Errors in the sign of the estimated effect size
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.432.8657&rep=rep1&type=pdf
[Figure: two panels, Type M and Type S]
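
Type M and Type S errors can be made concrete by simulating what happens when only statistically significant estimates get reported. The sketch below is an illustration, not the talk's own analysis: it assumes the estimate is normally distributed around a known true effect with a known standard error, and the function name and arguments are hypothetical.

```python
import random
from statistics import NormalDist, mean


def type_m_and_s(true_effect, std_error, alpha=0.05, n_sims=20000, seed=1):
    """Simulate estimates ~ Normal(true_effect, std_error); keep the significant ones.

    Returns (power, type_s_rate, exaggeration_ratio). Assumes true_effect != 0
    and that at least one simulated estimate is significant.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    estimates = (rng.gauss(true_effect, std_error) for _ in range(n_sims))
    significant = [est for est in estimates if abs(est) > z_crit * std_error]

    power = len(significant) / n_sims
    # Type S: a significant estimate with the wrong sign.
    type_s = sum(1 for est in significant if est * true_effect < 0) / len(significant)
    # Type M: how much significant estimates overstate the true magnitude on average.
    exaggeration = mean(abs(est) for est in significant) / abs(true_effect)
    return power, type_s, exaggeration


if __name__ == "__main__":
    print("noisy study  :", type_m_and_s(true_effect=1.0, std_error=2.0))
    print("precise study:", type_m_and_s(true_effect=1.0, std_error=0.25))
```

In the noisy setting every significant estimate must exceed about 1.96 standard errors, so it overstates the true effect several times over and occasionally has the wrong sign. In the precise setting both problems largely disappear.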
Recipe for Greater Chances of Being False
• Small effect size
• Small (modest) sample size
• Large measurement error
• High variation
• Low prior probability of effect
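
How these ingredients combine can be sketched with a simple application of Bayes' rule: the chance that a significant finding reflects a real effect depends on the prior probability of the effect, the power, and the significance level. This is a simplified version of the argument in the Ioannidis (2005) paper cited earlier; it ignores bias and multiple testing, and the function name and example inputs are assumptions chosen for illustration.

```python
def positive_predictive_value(prior, power, alpha=0.05):
    """Probability that a significant result is a true effect.

    prior: assumed probability that the tested effect is real
    power: probability of detecting the effect when it is real
    alpha: probability of a significant result when there is no effect
    """
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Well-powered test of a plausible hypothesis (hypothetical inputs).
    print(round(positive_predictive_value(prior=0.5, power=0.8), 3))
    # Underpowered test of a long-shot hypothesis (hypothetical inputs).
    print(round(positive_predictive_value(prior=0.1, power=0.2), 3))
```

With these made-up inputs the first case gives a value near 0.94 and the second near 0.31: small effects, small samples, and noisy measurement all lower power, and a low prior lowers the result further.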
Top 3 Forking Paths
• One reasonable hypothesis maps onto several reasonable statistical hypotheses (i.e., one to many)
  – PONV is associated with age
• Statistical interaction (moderation) must be taken into consideration for the primary interpretation
  – Age x sex is needed to consider the effect of age
• Adding confounder control post hoc
  – We should control for several covariates as they seem to be confounding the age association
Possible Reactions
Deductive
• This is a major concern
• Science should proceed from carefully crafted inferences
• Few inferences, high confidence in them
• P-values (and CI) don’t really indicate what they are supposed to under most applied circumstances
• We should change what we do
Inductive
• Not concerned
• The very idea of science is to learn from data; you have to explore your data to know what it tells you.
• Many inferences are okay
• P-values (and CI) are merely tools, I like to display the actual data anyway
• Carry on!
Recommendations:
• Avoid forking paths
  – Pre-registration
  – Fixed statistical analysis plans
    • Primary designation
    • Moderators
    • Multiplicity adjustments
  – Reproducible documents
https://ropensci.org/
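
As one example of a multiplicity adjustment that could be fixed in a statistical analysis plan, the sketch below implements Holm's step-down procedure. The talk does not name a specific method, so the choice of Holm is an assumption, and the p-values in the usage example are made up.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values, in the original order."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never drops below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical p-values for four outcomes
    print([round(p, 3) for p in holm_adjust(raw)])
```

A hypothesis is rejected at level alpha when its adjusted p-value is below alpha. Holm controls the familywise error rate like Bonferroni but is never less powerful, which is one reason it is a common default.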
Recommendations
• Okay, you insist on conducting data-driven analyses:
  – Be aware
    • P-values are suspect
    • CIs are too narrow (coverage falls below the nominal level)
    • Effect sizes will regress to 0 (how much?)
  – Report
    • Describe the nature of the plan of analysis
    • Attempt to describe the researcher degrees of freedom
    • Raise the issue in the Discussion
Recommendations
• Okay, you insist on conducting data-driven analyses:
  – Formal inductive inference
    • Bayesian inference is inductive inference for adults
  – Allow others to reproduce your work
  – Internal validation
    • Bootstrapping, etc.
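
A minimal sketch of bootstrapping follows, assuming the quantity of interest is a difference in group means and using a percentile interval. The data, function name, and defaults are hypothetical. For prediction models, internal validation would typically resample the entire model-building process, which this sketch does not attempt.

```python
import random
from statistics import mean


def bootstrap_mean_diff_ci(group_a, group_b, n_boot=2000, level=0.95, seed=1):
    """Percentile bootstrap interval for mean(group_a) - mean(group_b)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping its original size.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(mean(resample_a) - mean(resample_b))
    diffs.sort()
    lower_idx = int((1 - level) / 2 * n_boot)
    upper_idx = int((1 + level) / 2 * n_boot) - 1
    return diffs[lower_idx], diffs[upper_idx]


if __name__ == "__main__":
    # Made-up measurements for two groups.
    treated = [5.1, 4.8, 6.0, 5.5, 5.9, 6.2, 5.4, 5.0]
    control = [4.2, 4.9, 4.4, 5.0, 4.1, 4.6, 4.8, 4.3]
    print(bootstrap_mean_diff_ci(treated, control))
```

The interval describes sampling variability for one pre-specified comparison only. If the comparison was chosen after looking at the data, the selection step would have to be repeated inside each bootstrap resample for the interval to account for it.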