Use and Misuse of the Term Experiment in MSR Research
Natalia Juristo
University of Oulu & Technical University of Madrid
PROMISE September 7th 2016
Motivation
- Today empiricism is everywhere in SE
  - This does not mean SE is empirically mature
  - Conducting empirical studies does not imply they are carried out and understood properly
- I focus here on a methodological issue in MSR research
  - The use of experiments in MSR
Motivation
- For several years I have been struggling to match MSR research with the more traditional SE empirical research (conducted over the last 35 years)
- Very often I was shocked to hear empirical studies (in MSR works) called experiments when I do not consider them as such
- Today I discuss research we are conducting to clarify this issue
Collaboration
- This research has been conducted in collaboration with
  - Claudia Ayala
  - Xavier Franch
  - Burak Turhan
Evidence of Misuse
Small-scale Literature Review
- We conducted a literature review to double-check the use of the term experiment in MSR works
- 2015 MSR, ESEM and EMSE
  - MSR: 42 papers reviewed
  - ESEM: 36 papers
  - EMSE: 55 papers
Findings
| Venue (2015) | Use of term experiment | MSR vs. traditional experiment | MSR use vs. misuse |
|---|---|---|---|
| ESEM | 30.5% (11 out of 36) | 72.72% MSR works (8 papers); 27.28% traditional experiments (3 papers) | Wrong use: 12.5%; Proper use: 87.5% |
| MSR | 42.8% (18 out of 42) | 100% MSR works (18 papers); 0% traditional experiments | Wrong use: 44.45%; Proper use: 55.55% |
| EMSE | 52.72% (29 out of 55) | 65.51% MSR works (19 papers); 34.48% traditional experiments (10 papers) | Wrong use: 52.63%; Proper use: 47.36% |
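The shares in the findings table can be re-derived from the raw paper counts. The following is a minimal sketch in Python. The counts of wrong uses (1, 8 and 10 papers) are not stated on the slide; they are inferred here from the reported percentages and should be read as an assumption.

```python
# Raw counts as reported on the slide: papers reviewed, papers using the
# term "experiment", and how many of those are MSR works.
# ASSUMPTION: the "wrong_use" counts are inferred from the reported
# percentages; the slide itself gives only the percentages.
venues = {
    "ESEM": {"reviewed": 36, "use_term": 11, "msr_works": 8, "wrong_use": 1},
    "MSR": {"reviewed": 42, "use_term": 18, "msr_works": 18, "wrong_use": 8},
    "EMSE": {"reviewed": 55, "use_term": 29, "msr_works": 19, "wrong_use": 10},
}


def pct(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


def summarize(counts):
    """Compute the three shares shown in the findings table for one venue."""
    return {
        "use_term_pct": pct(counts["use_term"], counts["reviewed"]),
        "msr_works_pct": pct(counts["msr_works"], counts["use_term"]),
        "wrong_use_pct": pct(counts["wrong_use"], counts["msr_works"]),
    }


summary = {venue: summarize(counts) for venue, counts in venues.items()}
for venue, row in summary.items():
    print(venue, row)
```

Small differences from the slide (for example 30.56% here versus 30.5% in the table) come from rounding rather than truncating.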
...Let me elaborate why the term is misused
What Is an Experiment
Experiment Definition
- Empirical procedure where key variables of a reality are manipulated to investigate the impact of such variations
What Makes an Experiment
- Manipulation of variables under study
  - Treatments must be assigned to experimental units
- Controlling potential confounding variables impacting results
  - Confounding is eliminated through random assignment of treatments to units
What Makes an Experiment: Intervention
- Experimentation
  - There is a purposeful intervention by researchers
- Researchers allocate treatments to units
  - Experimental groups (exposed and unexposed) are determined by the researcher
- Observation
  - Researchers have a passive role and do not interfere with reality
  - Data are generated directly from reality and analyzed afterwards
- Exposure status is not determined by the researcher
What Makes an Experiment: Randomization
- Experiments limit the potential for any confounding factors (biases) by randomly assigning one participant pool to a treatment and another participant pool to control or another treatment
- Random allocation of treatments to subjects minimizes the chance that the incidence of confounding (particularly unknown confounding) variables will differ between the two groups
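Random allocation of treatments to units can be sketched in a few lines. This is a minimal illustration, not a prescribed procedure: the function name `randomize`, the participant identifiers, and the two group labels are all hypothetical.

```python
import random


def randomize(units, treatments=("treatment", "control"), seed=None):
    """Randomly allocate units to equally sized treatment groups.

    The researcher, not the unit, determines the exposure: units are
    shuffled and then dealt round-robin to the treatments.
    """
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups


# Hypothetical participant pool, for illustration only.
participants = [f"P{i:02d}" for i in range(1, 21)]
groups = randomize(participants, seed=42)
print({name: len(members) for name, members in groups.items()})
```

Because the allocation depends only on the shuffle, any characteristic of the participants, known or unknown, is expected to be spread evenly across the groups on average.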
What Makes an Experiment: Intervention + Randomization
- Intervention guarantees causality
  - Inspiring example
- In a quasi-experiment the allocation of treatment is not possible
  - Although it is run under controlled conditions
- The case of psychology experiments
  - Personality traits
What Does Not Make an Experiment
- Randomization
- Comparison
- Analysis techniques
What Does Not Make an Experiment: Randomization
- Randomization is a strategy aiming to reduce confounding variables (bias)
  - It is mandatory in controlled experiments
  - It can be applied to other types of empirical studies
- Inspiring example
  - Randomization in surveys
What Does Not Make an Experiment: Comparison
- Comparing the impact of different values of a variable does not mean we will be able to reveal causality
- Comparing, within a set of data, units with different values of a variable neither makes the study an experiment nor allows differences to be traced back to treatments
What Does Not Make an Experiment: Analysis
- Analysis techniques do not differentiate experiments from other empirical studies
  - What allows causality to be revealed is not the type of analysis technique; it is the design of the study
- Applying to a set of data an analysis technique typically used in experiments neither makes the study an experiment nor detects causality
What Does Not Make an Experiment
- An MSR study
  - Applying ANOVA does not mean it is an experiment
  - Comparing pools of data differing in a variable's value does not imply it is an experiment
- Even if MSR studies were randomized, they would not be experiments
- Design guarantees
  - The drop of bias and confounding variables
  - That the differences observed in behavior are caused by treatments
Impact of Randomization and Design
Types of Experiments
- Without intervention
  - Natural environment
    - Natural experiments
- Intervention
  - Where?
    - Artificial controlled environment
      - Laboratory controlled experiments
    - Natural environment
      - Field experiments
- Laboratory experiments: purposeful intervention; randomized allocation of treatments; artificial, highly controlled environment
- Field experiments: purposeful intervention; randomized allocation of treatments; natural, uncontrolled environment
- Natural experiments: no intervention; in a natural, uncontrolled environment
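The three experiment types above differ along three design questions: is there an intervention, is allocation randomized, and is the environment natural. The following sketch encodes that decision as a small function. The function name and the returned labels are hypothetical, and the "quasi-experiment" branch for intervention without random allocation is an assumption based on the earlier slide, not a rule stated in the taxonomy.

```python
def classify_study(intervention: bool, randomized: bool, natural_env: bool) -> str:
    """Classify an empirical study by its design, not by its analysis."""
    if not intervention:
        # Data come from reality as-is, so the study is observational.
        # In a natural environment this is what the slides call a
        # "natural experiment".
        if natural_env:
            return "observational study (natural experiment)"
        return "observational study"
    if not randomized:
        # ASSUMPTION: intervention without random allocation is labeled
        # a quasi-experiment here.
        return "quasi-experiment"
    return "field experiment" if natural_env else "laboratory experiment"


# A study that mines past repository data involves no intervention.
print(classify_study(intervention=False, randomized=False, natural_env=True))
print(classify_study(intervention=True, randomized=True, natural_env=False))
print(classify_study(intervention=True, randomized=True, natural_env=True))
```

Note that the analysis technique applied afterwards is not an input to the function: only the design determines the type of study.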
Mining Software Repositories
- MSR research
  - Outcomes (such as quality and productivity) are studied in large samples of past data to
    - Apply statistical methods to test hypotheses
    - Build machine learning and mining methods on past data into tools to support programming tasks
- The data stored in a repository have been obtained from reality (without intervention)
  - Therefore MSR works are observational studies
- We could call them natural experiments but that term is misleading
MSR and Epidemiology
Empirical Studies in Medicine
[Figure: stages of empirical research in medicine. Labels: Method Development; Laboratory Research or Pre-clinical (Non-Human Experiments); Field Research (Ill People; Ill & Healthy People; from 20-100 volunteers to 1-2M patients); Descriptive; Analytic (Retrospective, Prospective)]
Empirical Studies in Medicine
- Analytical
  - Experimental
    - Clinical Trial
    - Field Trial
    - Group Trial
  - Observational
    - Cohort Studies (Prospective study; Follow-up study; Concurrent study; Incidence study; Longitudinal study)
    - Historical Cohort Studies
    - Case-Control Studies (Retrospective study; Case comparison study; Case history study; Case compeer study; Case referent study; Trohoc study)
- Descriptive
  - Individuals
    - Cross-Sectional Studies (Prevalence study; Disease frequency study; Morbidity survey; Health survey)
    - Case series
    - Single case
  - Population
    - Ecological Studies
(Prospective) Cohort Study
- A collection of data at regular intervals from a group of people who do not have the disease, over a period of time, to see who develops the disease (new incidence)
- Cohort
  - Group of people who share a common characteristic within a defined period
  - e.g., are born, are exposed to a drug or vaccine or pollutant, or undergo a certain medical procedure
- Comparison group
  - The general population from which the cohort is drawn
  - Another cohort of persons thought to have had little or no exposure to the substance under investigation, but otherwise similar
  - SE: Projects/commits that have not applied the method under study
- Example
  - Does exposure to X (smoking) associate with outcome Y (lung cancer)?
  - Such a study would recruit a group of smokers and a group of non-smokers (the unexposed group), follow them for a set period of time, and note differences in the incidence of lung cancer between the groups at the end of this time
  - SE: A passive follow-up of projects/commits, collecting data at regular intervals and noting the quality/productivity they achieve
Retrospective Studies
- The researcher collects data from past records and does not follow patients up as in prospective studies
- All the events (exposure, latent period, and subsequent outcome, i.e. development of disease) have already occurred in the past
- Errors due to confounding and bias are more common in retrospective studies than in prospective studies
Retrospective Studies: Threats to Validity
- Some key data have not been measured
  - Biases may affect the selection of controls
- Selection bias
  - Only patients with the necessary information are selected
- Misclassification or information bias as a result of the retrospective aspect
- Researchers cannot control exposure or outcome assessment but instead need to rely on others for accurate record keeping
  - It can be very difficult to make accurate comparisons between the exposed and the non-exposed
Retrospective Cohort Study
- Records of groups of individuals who are alike in many ways but differ by a certain characteristic are compared for a particular output
  - For example, female nurses who smoke and those who do not smoke
  - SE: Use of past data in a repository to compare a certain output of projects with characteristic A and without A
- The researcher collects data from past records and does not follow patients up, as is the case with a prospective study
(Retrospective) Case-Control Study
- Records of individuals are divided into two groups differing in outcome (disease or not) and compared on the basis of some supposed causal attribute
  - Case-control studies select subjects based on their disease status (the effect)
- Cohort studies select subjects based on their exposure status (the cause)
- SE: Select projects/commits with a certain level (i.e. quality value) and trace back certain project characteristics that are believed to contribute to quality
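The difference between the two designs is the grouping variable: a cohort study groups by exposure and compares outcomes, while a case-control study groups by outcome and compares exposures. The following sketch shows both views on the same made-up repository data. The field names `has_a` and `defective`, the counts, and the choice of risk ratio and odds ratio as summary measures are assumptions for illustration.

```python
# Hypothetical repository records: each commit has an exposure
# (characteristic A present or not) and an outcome (defective or not).
records = (
    [{"has_a": True, "defective": True}] * 2
    + [{"has_a": True, "defective": False}] * 8
    + [{"has_a": False, "defective": True}] * 6
    + [{"has_a": False, "defective": False}] * 4
)


def cohort_risk_ratio(rows):
    """Cohort view: group by exposure (the cause), compare outcome incidence."""
    exposed = [r for r in rows if r["has_a"]]
    unexposed = [r for r in rows if not r["has_a"]]
    risk_exposed = sum(r["defective"] for r in exposed) / len(exposed)
    risk_unexposed = sum(r["defective"] for r in unexposed) / len(unexposed)
    return risk_exposed / risk_unexposed


def case_control_odds_ratio(rows):
    """Case-control view: group by outcome (the effect), compare exposure odds."""
    cases = [r for r in rows if r["defective"]]
    controls = [r for r in rows if not r["defective"]]

    def exposure_odds(group):
        exposed = sum(r["has_a"] for r in group)
        return exposed / (len(group) - exposed)

    return exposure_odds(cases) / exposure_odds(controls)


risk_ratio = cohort_risk_ratio(records)
odds_ratio = case_control_odds_ratio(records)
print(round(risk_ratio, 3), round(odds_ratio, 3))
```

For simplicity the sketch uses the full data set in both views; a real case-control study would typically sample cases and controls separately. In either design the result is an association, since nobody assigned characteristic A to the commits.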
Ecological Studies
- Units of analysis are populations
- Comparison of groups rather than individuals
- Explores correlations between group-level exposure and outcomes
Hierarchies of Evidence
Hierarchy of Evidence
- It is critical to understand which empirical study you are conducting
  - To fully understand what the results are telling us
- The type of results depends on the type of study!
- Evidence hierarchies reflect the relative authority of various types of empirical studies
Authority of Evidence
[Figure: hierarchy of evidence. Labels: Field Experiments; Observational Analytical (Prospective, Retrospective); Observational Descriptive; Laboratory Experiments]
Psychology Hierarchy of Evidence
Two MSR Examples
Example 1
- MSR'15
  - The Uniqueness of Changes: Characteristics and Applications
  - Ray, Nagappan, Bird, Nagappan, Zimmermann
- Why this paper
  - A very well written paper
  - Several empirical studies of different types about the same issue
  - Prominent MSR authors
Empirical Studies (Authors' Terms)
- Topic
  - Some changes are unique while others are not
  - They propose a way to identify uniqueness of changes
- Empirical studies (in authors' terms)
  - Analysis of the properties of unique and non-unique changes
    - What is the extent of unique changes; Who introduces unique changes; Where do unique changes take place
  - Applications
    - Experiment for Risk Analysis
      - Check whether U file commits have a higher defect rate than NU file commits
      - Use Mann-Whitney test for the comparison
    - Recommendation systems
      - A system is embedded in the development environment to suggest changes to developers
      - Precision and recall of the recommendations are analyzed
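The Mann-Whitney comparison mentioned above ranks all observations together and asks how often a value from one group exceeds a value from the other. The following is a minimal standard-library sketch of the U statistic with average ranks for ties. The defect rates are invented for illustration, and no p-value is computed; in practice a statistics library such as SciPy (`scipy.stats.mannwhitneyu`) would be used for that.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """Return the U statistic for sample_a (rank-sum formulation)."""
    ranks = average_ranks(list(sample_a) + list(sample_b))
    n_a = len(sample_a)
    rank_sum_a = sum(ranks[:n_a])
    return rank_sum_a - n_a * (n_a + 1) / 2


# Hypothetical defect rates per file, for illustration only.
unique_rates = [0.30, 0.25, 0.40, 0.35, 0.20]
non_unique_rates = [0.10, 0.15, 0.20, 0.05, 0.12]

u = mann_whitney_u(unique_rates, non_unique_rates)
# Share of (U, NU) pairs in which the U value is larger (ties count half).
effect = u / (len(unique_rates) * len(non_unique_rates))
print(u, effect)
```

For the made-up data, U is 24.5 out of a maximum of 25. The computation is identical whether the two groups arose from random allocation or from past records, which is why the test by itself says nothing about whether the study is an experiment.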
Type of Empirical Studies (Epidemiology Terms)
- Analysis of the properties of unique and non-unique changes
  - What is the extent of unique changes; Who introduces unique changes; Where do unique changes take place
  - Ecological study
    - Descriptive; Use of population aggregated data
- Application: Experiment for Risk Analysis
  - Check whether U file commits have a higher defect rate than NU file commits
  - Retrospective cohort study
    - Comparison of past data
- Applications: Recommendation systems
  - A system is embedded in the development environment to suggest changes to developers; Precision and recall of the recommendations are analyzed
  - Prospective observational study; ecological?
  - But no comparison is made (i.e., of the quality/productivity of developments using the recommendations)
    - Could be conducted as Field Trial or (Prospective) Cohort study
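Precision and recall of a recommender can be computed from two sets: what the system suggested and what turned out to be relevant. The sketch below uses made-up change identifiers; the function name and the convention of returning 0.0 for an empty set are assumptions.

```python
def precision_recall(recommended, relevant):
    """Return (precision, recall) for a set of recommendations."""
    recommended, relevant = set(recommended), set(relevant)
    hits = recommended & relevant
    precision = len(hits) / len(recommended) if recommended else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical identifiers of suggested and actually relevant changes.
suggested = ["c1", "c2", "c3", "c4"]
accepted = ["c2", "c4", "c7"]

precision, recall = precision_recall(suggested, accepted)
print(precision, recall)
```

Both measures describe the recommender's output only. They do not compare developments that used the recommendations with developments that did not, which is the missing comparison noted above.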
Example 2
- ESEM'15
  - How to make best use of cross-company data for web effort estimation
  - Minku, Sarro, Mendes, Ferrucci
- Topic
  - Compares CC (cross-company) dataset versus WC (within-company) dataset for web effort estimation
  - Compares Dycom against NN-filtering
    - Dycom: Framework for learning software effort estimation models for a company based on mapping CC models to the company's context
    - NN-filtering: Nearest Neighbor filtering to make CC estimations
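Nearest-neighbor filtering can be read as keeping only the cross-company projects that resemble the company's own projects. The sketch below shows one common form of that idea under stated assumptions: Euclidean distance, `k` neighbors per within-company project, and invented two-feature project vectors. The slides do not specify which variant, distance, or features the paper uses.

```python
import math


def nn_filter(cc_projects, wc_projects, k=2):
    """Keep the k cross-company projects nearest to each within-company project.

    ASSUMPTION: Euclidean distance over numeric feature vectors.
    """
    selected = set()
    for wc in wc_projects:
        by_distance = sorted(
            range(len(cc_projects)),
            key=lambda i: math.dist(cc_projects[i], wc),
        )
        selected.update(by_distance[:k])
    # Preserve the original order of the retained projects.
    return [cc_projects[i] for i in sorted(selected)]


# Hypothetical feature vectors (two size measures per project).
cc = [(1.0, 1.0), (2.0, 2.0), (10.0, 10.0), (11.0, 9.0), (50.0, 50.0)]
wc = [(1.5, 1.5), (10.5, 9.5)]

filtered = nn_filter(cc, wc, k=2)
print(filtered)
```

In this toy data the outlying project at (50.0, 50.0) is dropped, and an estimation model would then be trained on the retained cross-company projects only.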
Experiments in Effort Estimation Research
- Intervention
  - The two (effort estimation) techniques compared
  - Allocation of treatments to units?
    - Yes
      - Every project belonging to the test dataset is an experimental unit
      - Experimental groups are the test dataset estimated with one or the other technique
      - Typical AB designs; but could try others
- Control confounding variables through randomization?
  - No
Which Uses Were Right
| Venue (2015) | Use of term experiment | MSR vs. traditional experiment | MSR use vs. misuse |
|---|---|---|---|
| ESEM | 30.5% (11 out of 36) | 72.7% MSR works (8 papers); 27.3% traditional experiments (3 papers) | Observational: 12.5%; Data experiments: 87.5% |
| MSR | 42.8% (18 out of 42) | 100% MSR works (18 papers); 0% traditional experiments | Observational: 44.4%; Data experiments: 55.5% |
| EMSE | 52.72% (29 out of 55) | 65.5% MSR works (19 papers); 34.5% traditional experiments (10 papers) | Observational: 52.6%; Data experiments: 47.4% |
Conclusions
- MSR is a research method by which several types of empirical studies can be conducted
  - In any case most research is
    - Observational
    - Retrospective
      - Unless data is mined from development tools prospectively
- Therefore the evidence obtained is of lower quality than that from
  - Observational prospective studies
  - Field experimental studies
- MSR studies show correlation, but it is hard to prove causation
  - More powerful types of observational studies (Case-control; Cohort) could get better evidence