View
44
Download
0
Category
Preview:
Citation preview
Guide to
scoring evidenceusing the Maryland
Scientific Methods ScaleUpdated June 2016
Contents
1. Introduction 1
2. Impact evaluation 3
Using Counterfactuals 3
3. SMS 5 methods 6
Randomised Control Trial (RCT) 6
4. SMS 4 methods 10
Instrumental variables (IV) 10
Regression Discontinuity Design (RDD) 13
5. SMS 3/4 methods’ 17
6. SMS 3 methods 18
Difference-in-differences (DiD) 19
Panel data methods 20
Propensity Score Matching (PSM) 23
7. SMS 2 methods and below 26
Cross-sectional regression 26
Before-and-after 27
Additionality (not SMS scoreable) 28
Impact modelling (not SMS scoreable) 29
Appendix 1: Quick-scoring scoring guide for the Maryland Scientific Method Scale 30
Appendix 2: Mixed methods scoring 4 or 3. 33
Hazard Regressions 33
Heckman two-stage correction (H2S)/ Control function (CF) 36
Appendix 3: Arellano-Bond method 39
00
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 1
01
Introduction
TheWhatWorksCentreforLocalEconomicGrowth(WWG)producessystematicreviewsoftheevidencebaseonabroadrangeofpoliciesintheareaoflocaleconomicgrowth.Animportantstepinthereviewprocessistheassessmentofwhetheranevaluationprovidesconvincingevidenceonlikelypolicyimpacts.OurassessmentisbasedonthescoringofpapersontheMarylandScientificMethodsScale(SMS),whichrankspolicyevaluationsfrom1(leastrobust)to5(mostrobust)accordingtotherobustnessofthemethodusedandthequalityofitsimplementation.Robustness,asjudgedbytheMarylandSMS,istheextenttowhichthemethoddealswiththeselectionbiasesinherenttopolicyevaluations.
ThisdocumentexaminesawidevarietyofcommonlyemployedmethodsandexplainshowweplacethemontheMarylandSMS.Itthenlooksatanumberofexamplesofpolicyevaluationsforeachmethod,scoringthemonthequalityoftheirimplementation.
Thisscoringguideplaysseveralimportantroles.Firstly,consistentwithourcommitmenttoopenness,itservesasdocumentationastohowwerankstudiesbyrobustnessforourseriesofsystematicreviewsandenablesustobeconfidentthatweareachievingalevelofconsistencyinourrankingacrosstheteam.Secondly,itactsasascoringhandbookforanyonewantingtoassesstherobustnessofaparticularpolicyevaluation.Thisprovidesusefulguidanceonhowmuchweighttoputonaparticularpieceofevidence.Finallyitcanhelporganisationsundertakingevaluationseitherinassessingthemaftercompletionorinhelpingchoosebetweenmethodologiesbeforehand.
Thisdocumentisnosubstituteforbettertechnicaltrainingandexpertadvice.Butitshouldhelpthosewithsomeknowledgeofevaluationtechniquestobetterunderstandrecentadvancesandthewaythatwetreattheseinoursystematicreviews.It’salsoimportanttonotethattherankingofindividualstudiesisnotanexactscienceandofteninvolvesadegreeofjudgement.Statisticaltestingcanonlytakeussofarinassessingthesuitabilityofagivenmethodandthequalityofitsapplicationinpractice.Whenassessinganindividualstudy(includingthespecificexamplesthatwediscusshere)thereisalwaysscopeforsomedisagreementontheexactranking.Indeed,anyonewhohasattendedanacademicseminarwillknowtheextenttowhichsuchissuescanbehotlydisputed.Thatsaid,onaverageourscoringwilltendtoproducerankingsonwhichmanyevaluationexpertswouldbroadly
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 2
agree.Ofcourse,whenpullingtogetherourevidencereviewswelookattheoverallbalanceoftheevidenceandit’sthereforehighlyunlikelythataspecificrankingonanygivenstudywillinfluencetheconclusionsthatwereachonpolicyeffectiveness.
Theexamplesthatweusearedrawnfromawiderangeofstudies.Notallofthemarespecificallyfocussedonlocaleconomicgrowthbutinallcasestheydemonstrateapproachesthatcouldfeasiblybetakeninfutureevaluations.
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 3
Impact evaluation
Governmentsaroundtheworldincreasinglyhavestrongsystemstomonitorpolicyinputs(suchastheamountofloansguaranteedtoSMEs,whichpolicymakershopewillcreatenewjobs)andoutputs(suchasthenumberoffirmsthathavereceivedloanguarantees).However,theyarelessgoodatidentifyingpolicyoutcomes(suchastheeffectofprovidingaloanguaranteeonfirmemployment).Inparticular,manygovernment-sponsoredevaluationsthatlookatoutcomesdonotusecrediblestrategiestoassessthecausalimpactofpolicyinterventions.
Bycausalimpact,theevaluationliteraturemeansanestimateofthedifferencethatcanbeexpectedbetweentheoutcomefor(inthisexample)firms‘treated’inaprogramme,andtheaverageoutcometheywouldhaveexperiencedwithoutit.Pinningdowncausalityisacruciallyimportantpartofimpactevaluation.Estimatesofthebenefitsofaprojectareoflimitedusetopolicymakersunlessthosebenefitscanbeattributed,withareasonabledegreeofcertainty,tothatproject.
Thecredibilitywithwhichevaluationsestablishcausalityisthecriteriononwhichthisreviewassessestheliterature.
Using CounterfactualsEstablishing causality requires the construction of a valid counterfactual–i.e.whatwouldhavehappenedtoprogrammeparticipantshadtheynotbeentreatedundertheprogramme.Thatoutcomeisfundamentallyunobservable,soresearchersspendagreatdealoftimetryingtorebuildit.Thewayinwhichthiscounterfactualis(re)constructedisthekeyelementofimpactevaluationdesign.
A standard approach is to create a counterfactual group of similar places, people or businesses not undertaking the kind of project being evaluated. Changesinoutcomescanthenbecomparedbetweenthe‘treatmentgroup’(thoseaffectedbythepolicy)andthe‘controlgroup’(similarplaces,peopleorbusinessesnotexposedtothepolicy).
A key issue in creating the counterfactual group is dealing with the ‘selection into treatment’ problem.Selectionintotreatmentoccurswhenparticipantsintheprogrammedifferfromthosewhodonotparticipateintheprogramme.
02
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 4
Anexampleofthisproblemforaccesstofinanceprogrammes,asinourexampleabove,wouldbewhenmoreambitiousfirmsapplyforsupport.Ifthishappens,estimatesofpolicyimpactmaybebiasedupwardsbecauseweincorrectlyattributebetterfirmoutcomestothepolicy,ratherthantothefactthatthemoreambitiousparticipantswouldhavedonebetterevenwithouttheprogramme.
Selectionproblemsmayalsoleadtodownwardbias.Forexample,firmsthatapplyforsupportmightbeexperiencingproblemsandsuchfirmsmaybelesslikelytogroworsucceedindependentofanyadvicetheyreceive.Thesefactorsareoftenunobservabletoresearchers.
So the challenge for good programme evaluation is to deal with these issues, and to demonstrate that the control group is plausible.Iftheconstructionofplausiblecounterfactualsiscentraltogoodpolicyevaluation,thenthecrucialquestionbecomes:how do we design counterfactuals? Thisscoringguideexplainsthewaysinwhichweassesshowwellresearchershavedoneinansweringthisquestion.InparticularitexplainshowwerankimpactevaluationsbasedonanadjustedversionoftheMarylandScientificMethodsScale(SMS)todothis.1TheSMSisafive-pointscalerangingfrom1,forevaluationsbasedonsimplecrosssectionalcorrelations,to5forrandomisedcontroltrials.Box1providesanoverviewofthescaleandhighlightssomeoftheapproachesthatwediscussinmoredetailintheremainderofthisguide.
1 Sherman,Gottfredson,MacKenzie,Eck,Reuter,andBushway(1998).
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 5
Box 1: Our robustness scores (based on an adjusted Maryland Scientific Methods Scale)
Level 1:Either (a) across-sectional comparison of treated groups with untreated groups, or (b) a before-and-after comparison of treated group, without an untreated comparison group.Nouseofcontrolvariablesinstatisticalanalysistoadjustfordifferencesbetweentreatedanduntreatedgroupsorperiods.
Level 2:Use of adequate control variables and either (a) across-sectional comparison of treated groups with untreated groups, or (b) a before-and-after comparison of treated group, without an untreated comparison group. In(a), controlvariablesormatchingtechniquesusedtoaccountforcross-sectionaldifferencesbetweentreatedandcontrolsgroups.In(b),controlvariablesareusedtoaccountforbefore-and-afterchangesinmacrolevelfactors.
Level 3:Comparison of outcomes in treated group after an intervention, with outcomes in the treated group before the intervention, and a comparison group used to provide a counterfactual (e.g. difference in difference). Justificationgiventochoiceofcomparatorgroupthatisarguedtobesimilartothetreatmentgroup.Evidencepresentedoncomparabilityoftreatmentandcontrolgroups.Techniquessuchasregressionand(propensityscore)matchingmaybeusedtoadjustfordifferencebetweentreatedanduntreatedgroups,buttherearelikelytobeimportantunobserveddifferencesremaining.
Level 4:Quasi-randomness in treatment is exploited, so that it can be credibly held that treatment and control groups differ only in their exposure to the random allocation of treatment.Thisoftenentailstheuseofaninstrumentordiscontinuityintreatment,thesuitabilityofwhichshouldbeadequatelydemonstratedanddefended.
Level 5: Reserved for research designs that involve explicit randomisation into treatment and control groups, with Randomised Control Trials (RCTs) providing the definitive example. Extensiveevidenceprovidedoncomparabilityoftreatmentandcontrolgroups,showingnosignificantdifferencesintermsoflevelsortrends.Controlvariablesmaybeusedtoadjustfortreatmentandcontrolgroupdifferences,butthisadjustmentshouldnothavealargeimpactonthemainresults.Attentionpaidtoproblemsofselectiveattritionfromrandomlyassignedgroups,whichisshowntobeofnegligibleimportance.Thereshouldbelimitedor,ideally,nooccurrenceof‘contamination’ofthecontrolgroupwiththetreatment.
Note:TheselevelsarebasedonbutnotidenticaltotheoriginalMarylandSMS.Thelevelsherearegenerallyalittlestricterthantheoriginalscaletohelptoclearlyseparatelevels3,4and5whichformthebasisforourevidencereviews.
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 6
SMS 5 methods
SMS5methodsrequirefullrandomisationofprogrammeparticipation.Thiscanonlybedonebyarandomisedcontroltrial.AssuchthereisonlyonemethodinthiscategoryontheSMS.Randomisation,properlyapplied,meansthereisnoselectionintothetreatment.Thisensuresthattherearenodifferencesbetweenthetreatmentgroupandcontrolgroupeitheronobservable(e.g.age)orunobservable(e.g.ability)characteristics.Anydifferencepost-treatmentmustthereforebeaneffectofthetreatment.Evidenceofthistypeisthehighestqualityofevidenceandisconsideredthe‘goldstandard’forpolicyevaluations.
Randomised Control Trial (RCT)AnRCTisdefinedbyrandomassignmenttoeitherthetreatmentorcontrolgroup.AtypicalRCTwillinvolvethefollowingsteps.First,theoriginalprogrammeapplicantsmaybepre-screenedoneligibilityrequirements.Second,alottery(computerrandomisation)assignsapercentageoftheeligibleapplicants(usually50%)tothecontrolgroupandtheremaindertothetreatmentgroup.Third,baselinedataiscollected(eitherfromanexistingdatasourceorfromabespoke‘baseline’survey).Fourth,thetreatmentisapplied.Fifthandfinally,dataiscollectedsometimeafterthetreatment(again,eitherfromanexistingdatasourceorabespoke‘follow–up’survey).Becauseindividualsarerandomlyassignedtothetreatmentandcontrolgroups,thereisnoreasonthattheyshoulddifferoneitherobservableorunobservablecharacteristicsinthebaselinedata.Thismeansthatdifferencesintheoutcomevariablesinthepost-treatmentdatawillbeentirelyattributabletothetreatmenti.e.arefreefromselectionbias.ForthisreasonanRCTscoresthemaximumscoreof5ontheMarylandscale.
InorderforanRCTtoachievethemaximumSMS5scoreonimplementation,threecriteriamustbemet.Firstly,therandomisationmustbesuccessful.Whetherornotthisisthecase,isoftentestedusing“balancingtests”thatcomparetreatedandcontrolindividualsinthebaselinedataonarangeofcharacteristics.Ifrandomisationhasbeensuccessfulthenthebalancingtestsshouldshownosignificantdifferencesbetweenthetwogroups.Secondly,attritionmustbecarefullyaddressed.Attritionhappenswheneverindividualsdropoutfromthestudye.g.donotcompletetreatmentordonotprovidefollow-updata(whenthisiscollectedusingafollow-upsurvey).Theavailabilityoffollow-updataisoftenaparticularproblemforindividualsinthecontrolgroupsincetheyhavelessincentive
03
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 7
tostayinthestudy.Ifdropoutisonarandombasis,thenattritionisnotanissue.However,theremaybeelementsofselectionbiasfordroppingout.Forinstance,itmaybethecasethelessskilledindividualsinacontrolgroupchoosetoleaveanunemploymenttrainingstudy.Thiswouldmaketheremainingcontrolgroupmoreskilledonaverageincreasingthelikelihoodthattheyfindemployment.Inturn,thiswouldleadtoadownwardlybiasedtreatmenteffect.Forthisreason,policyevaluatorsshouldpaycarefulattentiontotheissueofattrition.Ifattritionisliabletoselectionbias,thenevaluatorsshouldsatisfactorilyaddresstheissue.
Forinstance,inastudythatlooksattheimpactofneighbourhoodcrimeonthepropensityofrefugeestocommitcrimethemselves,ifcertainindividualsdropoutofthestudybecausetheymoveabroad,thenthereisaproblemofpotentiallybiasedattrition.Toovercomethis,theauthorscouldlookintowhatfactorsleadrefugeestogoabroad,andincludethesecontrolsintheirstudy(wediscussanexampleofthisbelow).Thirdly,theexperimentshouldbedesignedsuchthatthetreatmentdoesnotspillovertothecontrolgroupi.e.contaminationmustnotbeissue.Inorderforthecontrolgrouptobeasuitablecomparator,itmustnotinanywaybeexposedtothetreatment.Ifthisconditionisnotrespected,thenthetreatmenteffectmaybedownwardlybiasedbecausethecontrolgrouppartlybenefitsfromthetreatment.Contaminationmayhappenif,forexample,individualsinagivenneighbourhoodaregivenfinancialmanagementadvice,andtheysharetheirknowledgewithpeopleinotherneighbourhoodswhoaresupposedtogountreated.
MethodMaximum SMS score (method, implementation)
Adjusted SMS score (method, implementation)
Randomised Control Trial (RCT)
a.k.a.
FieldExperiment
5,5if
• Randomisationissuccessful
• Attritioncarefullyaddressedornotanissue
• Contaminationnotanissue
5,4if
• Oneofthecriteriaisseverelyviolated
• 5,3if
• Twoormoreofthecriteriaareseverelyviolated
RCT Example 1 (SMS 5, 5)
A2013paperbyJensetal.publishedintheAmericanEconomicReviewevaluatestheimpactofneighbourhoodqualityonlifeoutcomes.2Inthiscase,theevaluationisdifficultbecauseitispossiblethattheresidentsofdisadvantagedneighbourhoodshaveworseoutcomesbecauseofindividualorfamilycharacteristics.Toanextent,certaintypesoffamilies“self-select”intobadneighbourhoods,meaningthatthereisaproblemofselectionbias.Inthisinstance,theauthorssurpassthisissuebyevaluatingtheimpactofMovingtoOpportunity(MTO),aprogrammethatgavefamilieswithindisadvantagedneighbourhoodstheopportunitytomovetomoreaffluentareasonarandombasis.InorderforthestudytoachievethemaximumSMSscoreof5,thethreecriteriamustbemet.
1. Randomisation is successful
Pass
TheauthorsmakeaconvincingargumentinfavourofthetrulyrandomnatureoftheMTOprogramme.Inapreliminarystage,interestedresidentsofpoorneighbourhoods(4,604low-incomepublichousingfamilies)enrolledinMTO.Thesefamilieswererandomlyassignedtooneofthreegroups:theExperimentalgroup(whichreceivedhousingvouchersthatsubsidisedprivate-marketrentsinlow-
2 Jens,L.,Duncan,G.,Gennetian,L.,Katz,L.,Kessler,R.,Kling,J.,Sanbonmatsu,L.(2013).Long-TermNeighborhoodEffectsonLow-IncomeFamilies:EvidencefromMovingtoOpportunity.AmericanEconomicReview,226-231.
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 8
povertycommunities),theSection8group(whichreceivedunconstrainedhousingvouchers),andthecontrolgroup.Balancingtestsshowthattreatmentandcontrolgroupsaresimilaraccordingtoarangeofobservablecharacteristics,meaningthattherandomisationwassuccessful.
2. Attrition carefully addressed
Pass
Around90percentoftheindividualsthatwereoriginallyenrolledintheprogramwereinterviewed10to15yearsafterthebaselineyear(from1994to1998).Thisisafairlyhighrate,sothatsampleattritionisnotaprobleminthisinstance.
3. Contamination not an issue
Pass
Thevoucherswereinnowaytransferable,meaningthatcontaminationisnotanissueinthiscase.
Given that all three criteria are satisfied, this study achieves SMS 5 for implementation.
RCT Example 2 (SMS 5, 5)
A2014paperbyDammandDustmannpublishedintheAmericanEconomicReviewevaluatestheimpactofearlyexposuretoneighbourhoodcrimeonsubsequentcriminalbehaviour.3Isolatinghowneighbourhoodsaffectyoungpeople’spropensitytocommitcrimesisdifficult,asitispossiblethatchildrengrowingupincrime-riddencommunitiescomefromfamilyenvironmentsthatarethemselvesconducivetocriminalbehaviour.Toovercomethisproblem,theauthorsexploitaneventinwhichasylum-seekingfamilieswererandomlyassignedtocommunitieswithdifferinglevelsofcrimebytheDanishgovernment.
1. Randomisation is successful
Pass
Uponbeingapprovedforrefugeestatus,asylum-seekerswereassignedtemporaryhousinginoneofDenmark’s15counties.Withinthecounty,thecouncil’slocalofficeassignedeachfamilytoamunicipality.Thisassignmentwasrandom,althoughcouncilswereawareofbirthdates,maritalstatuses,numberofchildren,andnationalities.Theauthorshighlightthefactthatthecouncilinnowaybaseditsdecisiononthefamily’seducationalattainment,criminalrecord,orfamilyincome,simplybecauseitwasnotprivytothisinformation.Italsodidnotknowanythingaboutthefamilybesideswhatwaswritteninthequestionnaire.Furthermore,thefamily’spersonalpreferenceswerenottakenintoaccount.Thebaselinebalancingtestconfirmsthatthetreatmentandcontrolgroupswereindeedobservablysimilarbeforethetreatment.
2. Attrition carefully addressed
Pass
Ofthesampleof5,615refugeechildren,975leftDenmarkbeforetheageof21.Furthermore,anadditional215childrenwerenotobservedineveryyearbetweenarrivalandage21.Nonetheless,theauthorsfindnosignificantrelationshipbetweenleavingDenmarkornotbeingobservedallyearsandtheoutcomevariables.Thisindicatesthatthereisnosignificantselectionbiasinherenttodroppingoutofthestudy.
3 Damm,A.&Dustmann,C.(2014).DoesGrowingUpinaHighCrimeNeighborhoodAffectYouthCriminalBehavior?.AmericanEconomicReview,1806-1832.
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 9
3. Contamination not an issue
Pass
Althoughfamilieswererandomlyassignedtoamunicipalityandwereencouragedtostaythereforatleastthedurationofthe18-monthintroductoryperiod,theywerenostrictrelocationrestrictions.Thismeansthatitispossiblethatfamiliesthatreceivedthetreatment(beinginaneighbourhoodwithmorecrime)didnotnecessarilyadheretothetreatment,andfamiliesthatwerecontrols(livedinsaferneighbourhoods)mayhavebeenexposedtothetreatment.Theauthorsfindthataround80percentofthefamiliesstayedintheirassignedareaafterthefirstyear,andhalfofallfamiliesstayedintheassignedareaaftereightyears.
Given that all three criteria are satisfied, this study achieves SMS 5 for implementation.
RCT Example 3 (SMS 5, 3)
A2012reportbyDoyleevaluatestheimpactofPreparingforLife(PFL),aprogrammethatprovidessupporttofamiliesfrompregnancyuntilthechildisoldenoughtostartschool.4Itislikelythatthereisasignificantselectionintotreatment–moreconcernedparentsmaychoosetojointheprogramme.Therefore,inordertoaccuratelyunderstandtheeffectofthetreatment,theauthoremploysarandomisedcontroltrial.
1. Randomisation is successful
Fail
PFLisspecifictocertaindeprivedcatchmentareasinIreland.Withinthesecatchmentareas,52percentofallpregnantwomenparticipatedintheprogramme,withtheremaining26percentrejectingtheoffer,andanother22percentnothavingbeenidentified.Inasubsequentstage,PFLrecipientswererandomlyassignedtooneoftwogroups:hightreatmentandlowtreatment.Anobservationallysimilarcommunitywaslaterusedasacontrol,whichimpliesthattreatmentwasnot,infact,randomlyassigned.
2. Attrition carefully addressed
Fail
Theauthorsdonotdiscusstheextenttowhichtreatedwomendroppedoutoftheprogramme.
3. Contamination not an issue
Fail
GiventhatalargepartofPFLentailstheprovisionofinformation(i.e.mentoringanddevelopmentinformationpacks)withinacommunity,itispossiblethatneighbourssharethecontentsoftheirtreatment.Thisisparticularlytrueforthehighandlowtreatmentgroups,astheyliveinthesamecommunity.
Given that all three criteria are violated, this study achieves SMS 3 for implementation.
4 Doyle,O.,&UCDGearyInstitute PFL EvaluationTeam.(2012). PreparingforLifeEarlyChildhoodInterventionAssessingtheEarlyImpactofPreparingforLifeatSixMonths.
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 10
SMS 4 methods
SMS4methodsarecharacterisedbytheexploitationofsomesourceof‘quasi-randomness’.Thatis,randomnessthathasnotbeendeliberatelyimposedbutarisesbecauseofsomeotherreason.Usually,thismeansidentifyinghistorical,socialornaturalfactorsthatresultinpolicybeingimplementedinawaythatistosomeextentrandom.Byidentifyingcaseswhereapolicywasimplementedtosomeextentrandomly,theSMS4methodstrytoensurethatthetreatmentandcontrolgroupsaresimilaronobservableandunobservablecharacteristics.However,unlikecontrolledrandomisation,theyneedtomakethecasethattheresultingvariationintreatmentistrulyrandom.Ifthisargumentfailsinpracticethentreatmentandcontrolgroupsdifferandthiscanleadustoincorrectconclusionsabouttreatmenteffects.Itisforthisreasonthatthesemethodsscore4,ratherthanamaximum5,ontheSMSscale.
Instrumental variables (IV)Tosolvetheproblemofselectionbias,apolicyevaluationcanusetheinstrumentalvariables(IV)method.Thisapproachentailsfindingsomethingthatexplainstreatmentbuthasnodirecteffectontheoutcomeofinterest(andisnotrelatedtoanyotherfactorsthatmightdeterminethatoutcome).Thisfactor,the‘instrument’,substitutesforthetreatmentvariablethatmayitselfbecorrelatedwithothercharacteristicsthataffectoutcomes(andthuscauseselectionbias).Forinstance,whenlookingattheimpactofhighwaysonpopulationdensitywithincities,itispossiblethatnotonlymighthighwaysimpactpopulationdensity,butthattheconstructionofhighwaysmightrespondtochangesinpopulationdensity.Agoodwaytocircumventthisproblem,then,istoemployanIV.Forexample(discussedfurtherbelow)Baum-Snow(2007)usesaplannedUShighwaygridthatwasplannedin1947(butnotnecessarilybuilt)asaninstrumentforactualUShighways.Thelogicbehindthisinstrumentisthatitisessentiallyrandominitsassignment(asfaraspopulationdensitygoes),becausethe1947gridwasplannedformilitaryandtradepurposes.ThiselementofrandomnessisessentialforasuccessfulIV.Inthissense,itisawayofsimulatingtherandomassignmenttotreatmentthatisdonewithrandomisedcontroltrials.Asuccessfulinstrumenthasthepotentialtoeliminatedifferencesbetweentreatmentandcontrolgrouponobservableorunobservablecharacteristicsandthusscoreamaximumof4ontheSMSscale.Inordertoachievethisscore,the
04
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 11
instrumentmustsatisfythreemaincriteria.Firstly,theinstrumentmustbeexogenous,orrandominassignment.Secondly,itmustberelevant,orcrediblyrelatedtothevariablethatitreplaces.Thirdly,itmustbeexcludable,meaningthatitdoesnotdirectlyimpacttheoutcome.
MethodMaximum SMS score (method, implementation)
Adjusted SMS score (method, implementation)
Instrumental Variable (IV)
a.k.a.
Two-StageLeastSquares(2SLS)
4,4ifinstrumentis:
• Relevant(explainstreatment)
• Exogenous(notexplainedbyoutcome)
• Excludable(doesnotdirectlyaffectoutcome)
Scoredasperunderlyingmethodif:
• Instrumentisinvalid
• e.g.crosssectionwithinvalidIVscores2;difference-in-differencewithinvalidIVscores3(seebelow)
IV Example 1 (SMS 4, 4)
A2007paperbyNathanielBaum-SnowpublishedintheQuarterlyJournalofEconomicsevaluatestheimpactofhighwaysonsuburbanisation.5Thisisadifficultquestiontoevaluatebecauseofthepotentialpresenceofselectionbias:highwaysthatwerebuiltmayhavebeenconstructedtoaccommodatesuburbanisation.Inordertoovercomethisproblem,theauthorusestheIVmethod,withhighwaysthatwereoriginallyplanned(butnotbuilt)in1947asaninstrumentforactualhighways.InordertoachievethemaximumSMSscoreof4,theinstrumentmustsatisfythethreecriteria.
1. Instrument is relevant
Pass
Thereisnodoubtthattheoriginal1947planisarelevantinstrumentbecausethecurrenthighwaysystemispartlybasedonit.
2. Instrument is exogenous
Pass
The1947planisanexogenousinstrumentbecauseitwasnotplannedwithsuburbanisationinmindbuttoconnectcitiesfortradepurposes.Onethreattoexogeneityisthatitmayhavebeenthecasethatlargercitiesweregivendenserhighwaysystemsinthe1947plan.Thereforetheauthorcontrolsforthe1947populationofeachcity.Conditionalonthisimportantcontrol,theinstrumentisconvincinglyexogenous.
3. Instrument is excludable
Pass
Byvirtueofthe1947gridbeingsimplyaplan,thereisnowayitcouldhavedirectlyaffectedsuburbanization,hencetheexclusionrestrictionholds.
Given that all three criteria are satisfied, this study achieves SMS 4 for implementation.
5 Baum-Snow,N.(2007).DidHighwaysCauseSuburbanization?.QuarterlyJournalofEconomics,775-805.
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 12
IV Example 2 (SMS 4, 4)
A2013paperbyCollinsandShesterevaluatestheimpactofurbanrenewalprogrammesoneconomicoutcomesinvariousU.S.cities.6Theseprogrammesareparticularlyproblematicintermsofevaluationbecauseitmaybethatpoorercitiesself-selectintotreatmentbecausetheyaremoreinneedofurbanrenewalschemes,orconversely,thatrichercitiesself-selectintotreatmentbecausetheycanaffordtoundertakemoreurbanrenewal.Toovercometheseissues,theauthorsusevariationinthetimingofwhenstatesapprovedtheurbanrenewallawsasaninstrument.
1. Instrument is relevant
Pass
Duringtheroll-outofthepolicy,theinstrumentofstate-leveldelaysstronglypredictedwhetherornoturbanrenewalprogrammeswereinplace.
2. Instrument is exogenous
Pass
Theauthorsarguethatthestates’decisiontoallowurbanrenewallawsissomewhatrandominitsprecisetiming.Theyconcede,however,thatstates’receptivenesstofederalinterventionmaybeanindicatorofliberalness,whichmayinturnbepositivelyrelatedtoeconomicprogress.Thus,controlsforstateconservatismareincluded.Conditionalonthiscontrol,theargumentappearssound.
3. Instrument is excludable
Pass
Theyarguethatiftimingofstateapprovalofurbanrenewallawsdirectlyaffectedurbaneconomicoutcomes,thenitwouldalsoaffecttheeconomyoftherestofthestate.Findingthattheruralareasofstatesthatpassedthelawslaterwerenoeconomicallydifferentfromtheruralareasofstatesthatpassedthelawsearlier,theauthorsconvincinglydemonstratethattheexclusionrestrictionholds.
Given that all three criteria are satisfied, this study achieves SMS 4 for implementation.
IV Example 3 (SMS 4, 2)
A2011paperbyNishimuraandOkamuroevaluatestheimpactofindustrialclustersonthenumberofpatentsafirmappliesfor.Thisevaluationisproblematicbecausefirmsthatchoosetolocatewithinanindustrialclusterareprobablyinherentlydifferentfromthosethatdonot.Toovercomethis,theauthorsusefirmageasaninstrumentforclusterparticipation.Thelogicbehindthisinstrumentisthatsmallerfirmsareattractedtoindustrialclusters.Firmsizeis,inturn,relatedtofirmage.Furthermore,theauthorsarguethattheageofafirmissomewhatrandomgiventhatthenumberofpatentsthatafirmappliesfordoesnotaffectitsage.
1. Instrument is relevant
Pass
Theauthorsarguethatonlysmallandmediumfirmsgravitatetowardsclustersbecauselarge,internationalfirms,arelessinterestedinregionalcollaborations.Theydefendthatthesizeofafirmis,inturn,significantlyrelatedtoitsage.Thisappearsareasonableargument.
6 Collins,W.&Shester,K.(2013).SlumClearanceandUrbanRenewalintheUnitedStates.AmericanEconomicJournal:AppliedEconomics,239-273.
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 13
2. Instrument is exogenous
Fail
Thisinstrumentisnotexogenousbecausefirmageisnotrandomlyassigned;onlyrelativelysuccessfulfirmssurvivebeyondacertainage.
3. Instrument is excludable
Fail
Theauthorsclaimthatfirmagedoesnotdirectlyinfluencethenumberofpatentsafirmappliesfor.However,youngfirms(e.g.start-ups)maybemoreinnovativeandapplyformorepatents.Conversely,oldfirmsaretypicallylargerandmayhaveagreaterbudgetforR&Dthatcouldleadtopatents.Ifeitheristruethentheexclusionrestrictionisviolated.Theauthorsdonotaddresstheseconcerns.
Given that only one of the three criteria is satisfied, this study is scored according to its underlying method (cross sectional with controls) and achieves SMS 2 for implementation.
Regression Discontinuity Design (RDD)Thismethodcanbeappliedincaseswheretherearecut-offsfortreatmenteligibility.Thesediscontinuitiesintreatment(wherebyunitsaretreatedaslongastheyareabove/belowacertainobservablethreshold)canbeexploitedtogeneratequasi-randomassignmentintotreatment.Essentially,thisinvolvescomparingunitswhoarejustabovethethreshold(andhencetreated)tothosethatarejustbelowthethreshold(andhenceuntreated).Whilstingeneraltheunitsthataretreatedaredifferenttothosethatarenot,theunitsthatarejusteithersideofthecut-offarelikelytobesimilar(bothintermsofobservableandunobservablecharacteristics).Thismakestreatmentaroundthecut-offalmostrandom.Giventhatthistypeofstudyexploitsacertainrandomnessintreatment(aswiththeIVmethod),itensuresthatthetreatmentgroupandcontrolgrouparesimilaronobservableandunobservablecharacteristics,thereforewarrantingamaximumofSMS4.
MethodMaximum SMS score (method, implementation)
Adjusted SMS score (method, implementation)
Regression Discontinuity Design (RDD)
4,4if:
• Discontinuityintreatmentissharp(e.g.stricteligibilityrequirement)orfuzzydiscontinuitymethodused
• Onlytreatmentchangesatboundary
• Behaviourisnotmanipulatedtomakethecut-off
Scoredasperunderlyingmethodif
• Discontinuityconditionsseverelyviolated
• e.gcrosssectionwithinvalidIVscores2;difference-in-differencewithinvalidIVscores3(seebelow)
However,inorderforthestudytoachievethismaximumscore,threeassumptionsmustbesatisfied.Firstly,thediscontinuityintreatmentmustbesufficientlysharp.Thismeansthatthethresholdfortreatmentmustbeclearandenforcedinpractice,sothatwhenindividualsaboveandbelowthethresholdarecomparedtheresearcherisinfactcomparingtreatedtountreatedindividuals.Secondly,onlythetreatmentshouldchangeattheboundary.Thisisimportanttoensurecomparabilitybetweentreatmentandcontrolgroups.Thisassumptionwouldbeviolatedif,forexample,firmsthathadlessthantenemployeeshadtopaylowerpayrolltaxesinadditiontobeingeligibleforthegrant.Finally,behaviourmustn’tbemanipulatedaroundthecut-off,soastoguaranteethatindividualsaround
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 14
thethresholdareinfactcomparableineverywayexceptfortheirexposuretotreatment.Thisisparticularlyrelevantifthethresholditselfinducesunitstochangetheirbehaviour–forexample,firmscouldartificiallyensurethattheyhavelessthantenemployeeseventhoughtheywouldotherwisehiremore,justbecausetheywanttobeeligibleforthegrant.Ifthishappens,thenitcannotbecrediblyheldthattreatmentisrandomaroundtheboundary.
RDD Example 1 (SMS 4, 4)
A2013paperbyPop-ElechesandUrquiolapublishedintheAmericanEconomicReviewevaluatestheimpactofschoolqualityoneconomicoutcomes.7Thisisaparticularlydifficultquestiontoaddress,aspupilsofbetterschoolsareacademicallymorecapableinthefirstplace,andthereforemorelikelytogetbetterresultsregardlessofschoolquality.Toovercomethisselectionbias,theauthorsemploytheregressiondiscontinuitymethod.Theyexploitadiscontinuityintreatment,wherebystudentsthatachieveaboveacertaingradepointaverageareadmittedintoahigherqualityhighschool,whilethosethatachievejustbelowtherequiredgradeaverageattendalowerqualityhighschool.InorderforthestudytoachievethemaximumSMSscoreof4,thethreecriteriamustbemet.
1. Discontinuity in treatment is sufficiently sharp
Pass
InRomania,highschooladmissionsaredeterminedsolelybyschoolaveragesandexaminationresults.Thestandardsapplytoallstudentsinthesamemanner.Therefore,itcanbearguedwithcertaintythatthediscontinuityissharp,inthatthosethatscorebelowthethresholddefinitelydonotgetaplaceinthebestschool,andthosethatscoreabovedefinitelydo.
2. Only treatment changes at the boundary
Pass
Becausethisstudylooksatschoolquality,itcanbeplausiblyheldthattheonlythingthatchangesattheboundaryisthequalityofschool,sincestudentsthatsurpassthecut-offdonotgetfurtherbenefits(suchasmeritawardsforexample).
3. Behaviour is not manipulated to make the cut-off
Pass
Inorderforthisconditiontohold,itcannotbe,forinstance,thatstudentsjustabovethethresholdimploredtheirteachersforslightlybettergrades,asthatwouldmaketheminherentlydifferenttothosethatdidnot.Forthiscase,itisunlikelythatthishappenedgiventhatthethresholdisonlyknownonceexamresultsareout.
Given that all three criteria are satisfied, this study achieve SMS 4 for implementation.
RDD Example 2 (SMS 4, 4)
A2014studybyAndersonpublishedintheAmericanEconomicReviewevaluatestheimpactofpublictransporttransitonhighwaycongestion.8Inthiscase,itisdifficulttoisolatethedirectionofcausality,asthenumberofpeoplewhousepublictransportcanaffecthighwaycongestion,butat
7 Pop-Eleches,C.&Urquiola,M.(2013).GoingtoaBetterSchool:EffectsandBehavioralResponses.AmericanEconomicReview,1289-1384.
8 Anderson,M.(2014).Subways,Strikes,andSlowdowns:TheImpactsofPublicTransitonTrafficCongestion.AmericanEconomicReview,2763-2796.
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 15
thesametime,highwaycongestioncanalsoaffectthenumberofpeoplethatusepublictransport.Toovercomethisissue,theauthorsexploitanaturalexperimentwherebyLosAngelespublictransportworkerssuddenlywentonstrikefor35days.Morespecifically,theyusetheRDDmethodtocomparehighwaycongestionjustbeforethestrikestartedandhighwaycongestionjustafterthestrikestarted.
1. Discontinuity in treatment is sufficiently sharp
Pass
Here,thediscontinuityisverysharp.Fromonedaytoanother,transitworkerswentonstrike,meaningthattheyworkednormallyoneday,anddidnotworkatallthenextday,andthedaysthereafter.Thisledtothealmostcompleteshutdownofpublictransportservices,withminorexceptionsofsomecontract-operatedbusservices.Theauthorsarguethattheseemergencyserviceswereonlyusefulforaverysmallfractionofpublictransportusers.
2. Only treatment changes at boundary
Pass
Thestrikewasspurredbyadisagreementbetweenthetransportmechanicsandtheemployersovercontributionstoahealthcarefund.Theauthorsarguethatthetimingofthestrikeisexogenousbecauseitstartedjustaftertheexpirationofa60-daycourt-orderedinjunctionthatprohibitedstriking.
3. Behaviour is not manipulated to make the cut-off
Pass
Itispossiblethatifpeoplewereexpectingthestrike,theychangedtheirworkarrangementstoadjusttotheinterruptions.Forinstance,itcouldbethatpeoplestartedworkingfromhomewhentheyotherwisewouldn’thave.Ifthiswerethecase,theeffectsofpublictransportonhighwaycongestionwouldbedownwardlybiased.However,theauthorsarguethatthestrikeledtoanabruptandunexpectedhaltinservice,meaningthatitisunlikelythatpeopleanticipatedit.
Giventhatallthreecriteriaaresatisfied,thisstudyachievesSMS4forimplementation.
RDD Example 3 (SMS 4, 2)
A2007reportbyHaeglandandMoenevaluatestheimpactofanR&DtaxcreditonactualfirmR&Dinvestment.9Inprinciple,thisisatoughpolicytoevaluateasfirmsoptintoreceivingthetaxcredit.Toovercomethisproblem,theauthorsusetheRDDmethodandexploitthefactthatfirmsthatinvestedupto4millionNOK(about£342,000)receivedan18percenttaxcreditoneveryNOKspent,whilefirmsthatinvestedabovethisamountreceivedthesamecredituptothe4millionthreshold,butnocreditthereafter.
1. Discontinuity in treatment is sufficiently sharp
Fail
Thediscontinuityseemsextremelyfuzzy,asfirmsabovethethresholdstillgetthetreatment,buttoalesserextent.Thisisbecausefirmsthatspendmorethanthethresholdstillgetataxcreditforthefirst4millionNOKspent,butdonotreceiveacreditforanyamountinvestedbeyondthat.Forexample,afirmthatinvests4million(belowthreshold)getsan18percenttaxcredit(i.e.720,000,about
9 Haegland,T.,&Moen,J.(2007).InputAdditionalityintheNorwegianR&DTaxCreditScheme.DiscussionPaper,StatisticsNorway.
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 16
£61,500)butafirmthatinvests4.1million(abovethreshold)alsogets720,000whichis17.5percent.Thusthereisvirtuallynodifferenceintreatmentoneithersideoftheboundary.
2. Only treatment changes at boundary
Pass
Theredoesnotseemtoanysortofotherelementofvariationatthe4millionNOKcut-off.
3. Behaviour is not manipulated to make the cut-off
Fail
Inthiscase,firmshaveaclearincentivetospendlessthan4millionNOKinR&D,aseveryNOKspentbeyondthatthresholdisnotsubsidised.Therefore,itispossiblethatonlymuchlargerandmoresuccessfulfirmsspendbeyondthethreshold,whilesmaller,lesssuccessfulfirmsdeliberatelyspendlessthantheyotherwisewouldhaveintheabsenceofthe4millioncut-off.
Given that only one of the criteria is satisfied, this study achieves SMS 2 for implementation.
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 17
SMS 3/4 methods
Thereareanumberofmethodswhichmayscorea4ora3dependingontheparticularwayinwhichtheyarespecified.ThisisaparticularissueforhazardregressionsandHeckmanselection(andother‘controlfunction’-type).Becausetheseapproachesarenotfrequentlyencounteredinexistingstudiesoflocaleconomicgrowth,andbecauseaspectsoftheseapproachesareparticularlytechnical,we’verelegatedconsiderationofthemtoAppendix2.Theseapproachesmayseeincreasedapplicationinfuture,butinthemaintextwefocusonthemorefrequentlyencounteredapproaches.
05
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 18
SMS 3 methods
WedefineSMS3methodstobetheminimumstandardforourreviews.Thesemethodsmustinvolveacomparisonoftreatedunitsagainstacontrolgroupbefore-and-afterthetreatmentdate.Thesemethodscontrolforselectiononobservablecharacteristics,andthroughabeforeandaftercomparison,eliminateanyfixedunobservabledifferencebetweentreatmentandcontrolgroup.However,theydonotinherentlycontrolforunobservabledifferencesthatdovarywithtime.Forthisreason,theyscoreamaximumof3ontheSMS.Inordertoachievesuchascore,however,theymustattempttosomehowcontrolfortime-varyingunobservableselectionbias.Thisisoftendonethroughtheconstructionofavalidcontrolgroup.Furthermore,theyshouldcontrolforobservabledifferencesbetweenthetreatmentandcontrolgroupsusingsomeformofregression,matching,controlforselection,orcontrolvariables.
Difference-in-differences (DiD)TheDIDmethodentailscomparingatreatmentandcontrolgroupbeforeandaftertreatment.Morespecifically,thetreatmenteffectiscalculatedbyfirstevaluatingthechangeintheoutcomevariableforthetreatedgroup,andthensubtractingthechangeininthecontrolgroupoverthesameperiod.Here,thecontrolgroupprovidesthecounterfactualgrowthpathi.e.whatwouldhavehappenedinthetreatmentgrouphaditnotbeentreated.Thisismuchbetterthanasimplebeforeandaftertreatmentcomparison,becauseitaccountsforthefactthatchangesinoutcomecanbeduetomanydifferentfactorsandnotjustthetreatmenteffect.Moreover,becauseDIDsubtractsthedifferencesbetweentreatmentandcontrolbothbeforeandafterthetreatment,iteffectivelycontrolsforanyunobserved,time-invariantdifferencesbetweenthetwogroups.However,itdoesnotaccountforunobservabledifferencesbetweenthetwothatvarywithtime.Forthisreason,itachievesamaximumofSMS3.Inordertoachievethisscore,twomaincriteriamustbesatisfied.Firstly,itmustbecrediblyarguedthatthetreatmentgroupwouldhavefollowedthesametrendasthecontrolgroup.Secondly,theremustalsobeaknowntimeperiodfortreatmentsothatthegroupscanbecomparedbeforeandafterthetreatment.
06
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 19
MethodMaximum SMS score (method, implementation)
Adjusted SMS score (method, implementation)
Difference in differences (DID)
a.k.a.
Diffindiff
3,3if
• Controlgroupwouldhavefollowedsametrendandtreatmentgroup
• Knowntimeperiodfortreatment
3,2if
• Eitherofthecriteriaisnotsatisfied
DiD Example 1 (SMS 3, 3)
A2011studybyCurrie,Greenstone,andMorettievaluatestheimpactofhazardouswastesiteclean-upsonthebirthoutcomesofmotherslivingwithin2,000metresofthesite.10Theyusebirthstomotherswholivebetween2,000and5,000metresawayfromthesiteasacontrolgroup.InorderforthestudytoachievethemaximumSMSscoreof3,itmustsatisfythetwocriteria.
1. Treatment group would have followed same trend as control group
Pass
Firstly,thecontrolgroupischosenbecauseitissufficientlyfarawaysothatitisunaffectedbytreatment,becausehazardouswastesitesimpactpeoplethroughdirectcontact,fumes,andwatersupply,andonlythoselivingcloseenoughtoit(i.e.within2,000metres)areaffected.Secondly,thecontrolgroupwaschosenbecauseitissimilartothetreatmentgroup,asthepeoplelivingwithin2,000metresofawastesitearecomparabletothoselivingslightlyfartheraway.Evenso,theauthorsrecognisethatmotherslivingextremelyclosetothesitemaybeslightlydifferenttomotherslivingabitfartheraway.Forthisreason,theycontrolformaternaleducation,race,ethnicity,birthorder,andsmoking,aswellasotherneighbourhoodcharacteristics.
2. Known time period for treatment
Pass
Becausetheauthorsusedatafrom154sitesthatwerecleanedupbetween1989and2003,itishardtopinpointprecisedatesforeachclean-up.However,theauthorsusedatafromtheEnvironmentalProtectionAgency,whichincludesthedatewhenthesitewasaddedtotheNationalPriorityList,whentheclean-upwasinitiated,andwhenitwascompleted.
Given that the two criteria are satisfied, this study achieves SMS 3 for implementation.
DiD Example 2 (SMS 3, 3)
A1993studybyCardandKruegerevaluatestheimpactofaNewJerseyminimumwageincreaseontheemploymentlevelsoffastfoodrestaurants.11Itishardtoaccuratelygaugetheeffectsofsuchanincrease,asitislikelythatchangesinemploymentarecausedbyotherfactors.Toovercomethisproblem,theauthorsemployadifference-in-differencesapproachwherebytheycomparetheemploymentinNewJerseywiththeemploymentinPennsylvania,aneighbouringstatethatdidnotincreaseitsminimumwage.
1. Treatment group would have followed same trend as control group
10 Currie,J.,Greenstone,M.,Moretti,E.(2011).SuperfundCleanupsandInfantHealth.MITCEEPRWorkingPapers.11 Card,D.,Krueger,A.(1993).MinimumWagesandEmployment:ACaseStudyoftheFast-FoodIndustryinNewJersey
andPennsylvania.AmericanEconomicReview,Vol.84,No.4.,pp.772-793
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 20
Pass
TheauthorsmakeaconvincingargumentthatNewJerseyandPennsylvaniaare,infact,comparablestates.Theyanalysetherestaurantcompositionofeachandfindthattherearenosignificantdifferencesbetweenthetwo.Furthermore,becausetheyareneighbouringstates,itcanbecrediblyheldthattheyareexposedtothesamemacroeconomicshocks.
2. Known time period for treatment
Pass
Theminimumwageincreasedfrom$4.25to$5.05perhouronthe1stofApril1992.Theauthorsshowthatthenewminimumwageslawwas,infact,bindinginNewJersey,whichmeansthatwecanreasonablyassertthatthetreatmentwasalsosignificant.
Given that the two criteria are satisfied, this study achieves SMS 3 for implementation.
DiD Example 3 (SMS 3, 2)
A2005studybyValentinandLundJensenevaluatestheimpactofalawthattransferredpatentrightsfromindividualresearcherstouniversitiesoncollaborativedrugdiscoveryresearchinDenmark.12Becausethechangeinresearchfollowingthelawcouldbeattributedtomanydifferentcauses,theauthorsemployaDiDapproachusingSwedenasthecontrolgroup.
1. Treatment group would have followed same trend as control group
Fail
TheauthorsarguethattheDanishandSwedishbiotechindustriesaresimilarinsize,history,responsetothe2001high-techbubble,numberofinventions,andamountofinventorsmobilisedtobringinventionsabout.TheyarguethattheonlydifferencebetweenthemisthatDanishscientiststendtoworkinfirms,whileSwedishscientiststendtobeacademics.Tocontrolforthis,theyemployavariablefortheratioofbiopharmaceuticaltosmallmoleculepatents(asbiopharmaceuticalresearchtendstobedoneinuniversities,whilesmallmoleculeresearchtendstobedoneinfirms).However,theauthorsdonotadequatelyaddressthefactthatSwedenandDenmarkaresubjectedtodifferentmacroeconomicclimates,andhavedifferentsetsoffederallaws.Giventhattheydidnotuseanycontrolstotackletheseissues,itcannotbecrediblyheldthattheirtreatmentandcontrolgroupschangedinsimilarways,meaningthattheyarenotdirectlycomparable.
2. Known time period for treatment
Pass
ThelawthattransferredtheownershipofpatentsfromindividualresearcherstouniversitieswasenactedinDenmarkin2000.
Given that only one of the criteria is satisfied, this study achieves SMS 2 for implementation.
Panel data methods Thesemethodsusedatathatfollowsthesameindividualsovertimeallowingtheresearchertocontrolforthingsthatremainconstantforeachindividualacrosstime.Thiscanbedoneinavarietyofways.Themostcommonisthefixedeffects(FE)approachwhichentailsincludingdummyvariables(the
12 Valentin,F.&Jensen,R.(2005).EffectsonAcademia-IndustryCollaborationofExtendingUniversityPropertyRights.WorkingPaper.
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 21
“fixedeffects”)forbothindividualcharacteristicsandanythingthathappenedinagiventimeperiod(timedummies).Alternatively,thefirstdifferences(FD)approachcanbeused,wherebytheoutcomeofapreviousperiodissubtractedinthefinalregression.Thisessentiallymeansthateverythingthathappenedinthepreviousperiodisremovedfromthefinalregression,thereforecancellingoutanythingthatremainsconstantacrosstime.13Inthiscase,becauseeverythingthatremainsconstantacrosstimeisdifferencedout,itisunnecessarytoincludefixedeffects.
Paneldatamethodssuccessfullycontrolforobservableandunobservablecharacteristicsthatremainconstantthroughouttime.However,theydonotaccountforunobservablecharacteristicsthatchangewithtime.Forthisreason,theycanachieveamaximumSMS3.Inordertoachievethemaximumscore,twomaincriteriamustbemet.Firstly,yearlyfixedeffectsmustbeaccountedfor,sothatthetreatmentisnottaintedbythecontextofagiventimeperiod.Secondly,whenusingthefixedeffectsmethod,thefixedeffectsmustbeattheunitofanalysis.Thismeansthatifthesampleisofpeople,thenthefixedeffectsmustrefertothem.Lastly,althoughpaneldatamethodscontrolforcharacteristicsthatdonotchangewithtime,theydonotcontrolforcharacteristicsthatdonotchangewithtime.Forthisreason,itisimportanttoincludeothertime-varyingcontrolsthatmayalsoaffecttheoutcome.
MethodMaximum SMS score (method, implementation)
Adjusted SMS score (method, implementation)
Panel methods
Panel Fixed Effects (FE)
3,3if
• Fixedeffectisattheunitofobservation
• Yeareffectsareincluded
• Appropriatetime-varyingcontrolsareused
3,2if
• Oneormoreofthethreecriteriaisnotsatisfied
First Differences (FD)
3,3if
• Yeareffectsareincluded
• Appropriatetime-varyingcontrolsareused
3,2if
• Eitherofthetwocriteriaisnotsatisfied
FE Example 1 (SMS 3, 3)
A2011studybyDinkelmanpublishedintheAmericanEconomicReviewevaluatestheimpactofelectricityroll-outontheemploymentoutcomesofruralcommunities.14Inthiscase,theevaluationmaybetaintedbyselectionbias,asitispossiblethatpoliticallyimportantorgrowingareasarefavouredoverothers(thereversecouldalsobetrue).Toovercomethisissue,theauthoremploysafixedeffectsstrategy.InorderforthisstudytoachievethemaximumSMSscoreof3,itmustsatisfythethreecriteria.
1. Fixed effect is at the unit of analysis
Pass
Theauthorincludesavariableforcommunityfixedeffects.Giventhattheunitofobservationisthecommunity,thiscriterionismet.
13 SometimesresearchersthinkthatoutcomesinpreviousperiodsmaydirectlyaffectoutcomestodaywhichcausecomplicationsinpaneldatasettingsthataresometimesaddressedusingtheArellano-Bondmethod(seeAppendix3).
14 Dinkelman,T.(2011).TheEffectsofRuralElectrificationonEmployment:NewEvidencefromSouthAfrica.AmericanEconomicReview,3078-3108.
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 22
2. Year effects are included
Pass
Controlsfortimetrendswithinthedistrictareincluded.
3. Appropriate time-varying controls used
Pass
Inthisparticularexample,itislikelythattherearemanykeyvariablesthatvaryovertime,andarethereforenotcapturedbythefixedeffects.Forthisreason,theauthorincludesavectorofcommunitycovariateswhichincludehouseholddensity,fractionofhouseholdslivingbelowthepovertyline,fractionofwhiteadults,educationalattainmentofadults,shareoffemale-headedhouseholds,andfemaletomaleratio.
Given that all three criteria are satisfied, this study achieves SMS 3 for implementation.
FE Example 2 (SMS 3, 2)
A2013studybyAtasoyanalysestheimpactofbroadbandinternetexpansiononemployment.15Thisevaluationposesthesameissuesofselectivityastheelectricityroll-outexample.Toovercomethis,theauthoremploysthefixedeffectsstrategy.
1. Fixed effect is at the unit of analysis
Pass
Theauthorderivescounty-levelbroadbandadoptionratesusingzip-codedata.Hesubsequentlyusescountyfixedeffects,whicharetherelevantunitofobservation.
2. Year effects are included
Pass
Yeardummyvariablesareincluded.
3. Appropriate time-varying controls used
Fail
Theauthorincludescontrolsfortime-variantpopulationcharacteristicssuchasincome,age,gender,race,anddensity.However,heneglectscertaincounty-levelcharacteristicsthataffectemployment,andarenotcapturedbythecountyfixedeffectsbecausetheychangewithtime.Thesecontrolscanincludetransportationinfrastructure,investmentineducationandjobtraining,andleveloffundinggrantedbythestate.
Given that only two of the criteria are satisfied, this study achieves SMS 2 for implementation.
15 Atasoy,H.(2013).TheEffectsofBroadbandInternetExpansiononLabourMarketOutcomes.ILRReview.
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 23
FE Example 3 (SMS 3, 2)
A1997studybyGuellecanddelaPotterieevaluatestheimpactofgovernmentsubsidiesonR&Dexpenditure.16Inthiscasethereisaclearpotentialproblemofselectionbias,asfirmsthatchoosetousethesubsidiesaredifferentfromfirmsthatdonot.Toovercomethis,theyusethefirstdifferencesmethod.InorderforthisstudytoachievethemaximumSMSscoreof3,thetwocriteriamustbesatisfied.
1. Year effects are included
Pass
Theauthorsincludeayeardummyvariableinthefinalregression.
2. Appropriate time-varying controls used
Fail
Theauthorsdonotincludeanytime-varyingcontrols.Thismeansthatwhatisthoughttobetheimpactofthegovernmentsubsidiesmayinfactbeduetothingslikegrowthinthenumberoffirmemployees.
Given that only one of the criteria is satisfied, this study achieves SMS 2 for implementation.
Box 2: Fixed effects in the cross-sectional data context
Sometimes,cross-sectionalstudieswillusetheterm“fixedeffects”.However,itisimportanttodistinguishfixedeffectsinthiscontext,andfixedeffectsinthepaneldatacontext.Whilepaneldataconsidersanumberofindividualsacrosstime,cross-sectionaldataonlyconsidersindividualsinacertaintimeperiod.Forthisreason,panelfixedeffectsrefertowhatremainsconstantacrosstimeforaparticularindividual,andcross-sectionalfixedeffectssimplyrefertocontrols.Forinstance,ina2014studybyGibbonsetal,theauthorslookattheimpactofnaturalamenitiesonpropertyprices.Althoughtheirdataiscross-sectional,theyinclude“TravelToWorkArea(TTWA)levelfixedeffects”,whichinthiscasemeanscontrolsfortheTTWAthepropertyislocatedin.
Itisworthnotingthatevencross-sectionalstudiesthatincluderelevantcontrolscanonlygetamaximumSMSof2.
Propensity Score Matching (PSM)Thismethodentailsmatchingeverytreatedunitwithanobservationallysimilaruntreatedunit,andthenlookingatthedifferencesbetweenthemtogaugethetreatmenteffect.Thetreatedanduntreatedunitsarenotmatchedaccordingtowhethertheyhavesimilarcharacteristicsperse,butaccordingtoapropensityscore.Thisscoreisessentiallytheprobabilitythataunitistreated,accordingtoasetofcharacteristics.Thepropensityscoreistypicallycalculatedusingaregressionthatdealswitha0to1probabilityastheoutcomevariable(e.g.alogisticorprobitmodel).However,evenso,thismethoddoesnotentirelycorrectforselectionbiasbecauseitonlyaccountsforobservablecharacteristics.Forthisreason,ifusedinapurelycross-sectionalcontext,itcanachieveamaximumSMS2.Inordertoachievethisscore,itmustbecrediblyheldthatthematchingcriteriaarerelevanttoselectionintotreatment.Furthermore,akeyissuewithPSMisthatunitsthatareactuallytreatedaremorelikelytohavehigherpropensityscores,whileunitsthatareactuallyuntreatedaremorelikelytohavelower
16 Guellec,D.&delaPotterie,B.(1997)DoesgovernmentsupportstimulateprivateR&D?OECDEconomicStudies
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 24
propensityscores.Therefore,havingasufficientlylargeareaofoverlap(wherebytreatedunitsarematchedwithsimilaruntreatedunits)isessentialforthemethodtowork.InorderforthemethodtobeofSMS3quality,itmustbeusedwithothermethodsthatattempttocontrolforunobservablecharacteristics.Forinstance,itcanbecombinedwithmethodsthatexploittimevariationintreatment(e.g.DID).Inthiscase,inorderforthemethodtoachieveSMS3,thecriteriaforDIDmustberespected.
MethodMaximum SMS score (method, implementation)
Adjusted SMS score (method, implementation)
Propensity Score Matching (PSM)
a.k.a.
Matching
With DID or panel method
3,3if
• Matchingcriteriasatisfied(seebelow)
• DIDorpanelcriteriasatisfied
3,2if
• Oneofthecriteriaisviolated
Cross-sectional
2,2if
• Goodmatchingvariables(i.e.relevanttoselection)
• Significantcommonsupport
2,1if
• Oneofthecriteriaisviolated
PSM Example 1 (SMS 3, 3)
A2012studybyBuscha,Maurel,Page,andSpeckesserevaluatestheimpactofhighschoolemploymentoneducationaloutcomes.17Thisquestionisdifficulttoevaluatebecausehighschoolstudentsthathaveapart-timejobmaynotbedirectlycomparabletotheirnon-workingpeers;itcouldbethattheyaremorehard-workinganddrivenandthereforehavebettergradesanyway,oritcouldbethattheycomefromdisadvantagedbackgroundsandthereforehaveworsegrades.Toovercomethefactthattreatedanduntreatedhighschoolstudentsmaybeintrinsicallydifferent,theauthorsemploythePropensityScoreMatchingapproachincombinationwiththeDIDmethod.InorderforthisstudytoachievethemaximumSMSscoreof3,thethreecriteriamustbesatisfied.
1. Good matching variables
Pass
Studentsarematchedaccordingtosocio-economicbackground,familybackground,schoollevelandregionalvariables,parentalexpenditure,schoolabsenteeismandconflict,alcoholanddrugissues,andeducationalaspirationsatgrade8.
2. Significant common support
Pass
Thereisaverylargeareaofcommonsupport,meaningthatfewobservationsweredroppedduetolackofsuitablematch.
3. Treatment group would have followed same trend as control group
Pass
17 Buscha,F.,Maurel,A.,Page,L.,Speckesser,S.(2012).TheEffectofEmploymentwhileinHighSchoolonEducationalAttainment:AConditionalDifference-in-DifferencesApproach.OxfordBulletinofEconomicsandStatistics,380-396.
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 25
Usingthematchingmethod,theauthorsuseobservationallysimilarstudentsascontrolsforworkingstudents.Althoughunobserveddifferencesmayremain,itcanbereasonablyarguedthatthetwogroupswilllikelyfollowthesametrend.
4. Treatment date known and singular
Pass
Theauthorsuseadatasetthattracksstudentsfromwhentheywere8thgradein1988,until1992whentheygraduatedhighschool.Thedatasetincludesinformationforthreewaves:1988,1990,and1992.Becausestudentsarenotlegallyallowedtoworkbeforetheageof14,thedatasethaspreandposttreatmentinformationforthosethatchoosetoworkduringtheirhighschoolyears.Therefore,becausestudentsareonlylegallyallowedtoworkfrom1990onwards,thetreatmentdateisknownandsingular.
Given that the four criteria are satisfied, this study achieves SMS 3 for implementation.
PSM Example 2 (SMS 2, 2)
A2011studybyGhalib,Malki,andImaievaluatestheimpactofmicrofinanceonruralhouseholdpoverty.18Inthiscasethereisaclearproblemofselectionbiasbecausehouseholdsthatapplyformicrofinancemaybeintrinsicallydifferentfromthosethatdonot.Forinstance,theymayhavemoreentrepreneurialdriveorsimplybemorefinanciallyliterate.Toovercomethisproblem,theauthorsemployPSM.
1. Good matching variables
Pass
Treatmentandcontrolgroupsarematchedcharacteristicsthatincludeageofadults,childdependencyratio,accesstoelectricity,homeownershipstatus,consumptionofluxuryfood,percentageofliterateadults,andavailabilityoftoilets.Thesevariablesareimportantinthesensethattheycaptureafamily’slevelofwealthandeducation,anditcanbereasonablyheldthatpoorerhouseholdsselectintotreatment(orapplyformicrocreditinthiscase).
2. Significant common support
Pass
Whenanalysingtheoverlapbetweentreatmentandcontrol,theauthorsfindthatonly11observations(ofthe1,132)aredroppedduetoalackofsuitablematch.
Given that the two criteria are satisfied, this study achieves SMS 2.
18 Ghalib,A.,Malki,I.,Imai,K.(2011).TheEffectofEmploymentwhileinHighSchoolonEducationalAttainment:AConditionalDifference-in-DifferencesApproach.RIEBDiscussionPaperSeries.
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 26
SMS 2 methods and below
Cross-sectional regressionCross-sectionalregressionsentailusingadatasetthatfeaturesmanydifferentindividualsatonepointintime.Thismethodsimplycomparestreatedindividualswithuntreatedindividuals,withouttakingintoaccountthedifferentcharacteristicsofthetwogroupsthatmayinfluencetheoutcome.Anattempttoaccountfordifferencesbetweentreatmentandcontrolgroupscanentailtheinclusionofcontrolvariables.However,evenifsomeobservablecharacteristicsarecontrolledfor,unobservabledifferencesbetweenthegroupsremain.Forthisreason,thismethodcanonlyachieveamaximumSMS2.Inordertoachievethisscore,however,adequatecontrolvariablesmustbeused.
MethodMaximum SMS score (method, implementation)
Adjusted SMS score (method, implementation)
Cross-section regression 2,2if
• Adequatecontrolvariablesareused
2,1if
• Inadequatecontrolvariables
Cross-Sectional Regression Example 1 (SMS 2, 2)
A2012studybySarzynskievaluatestheimpactofpopulationsizeonthelevelofacity’spollution.19Theauthorusesadatasetthatfeaturesasampleof8,038citiesworldwidefortheyear2005.InorderforthestudytoachievethemaximumscoreofSMS2,thecriterionmustbesatisfied.
1. Adequate control variables are used
Pass
Besidespopulation,othercityvariablessuchasGDPpercapita,populationdensity,growthrate,climate,anddevelopmentstatusareincludedintheregression.Here,itisimportanttothinkabout
19 Sarzynski,A.(2012).BiggerIsNotAlwaysBetter:AComparativeAnalysisofCitiesandtheirAirPollutionImpact.UrbanStudies,3121,3138.
07
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 27
whetherthereareanyimportantvariablesthatareomittedandmaythereforebiastheresults.Inthisparticularcase,theauthorseemstohaveincludedanexhaustivelistoffactorsthatmayinfluencepollutionlevels.
Given that the criterion is satisfied, this study achieves SMS 2 for implementation.
Cross-Sectional Regression Example 2 (SMS 2, 1)
A2007studybyPateletal.evaluatestheimpactoflivinginanurbanversusruralsettingonmentalhealth.20Theyuseacross-sectionaldatasetofhouseholdsinMaputoandCuambainMozambique.
1. Adequate control variables are used
Fail
Toestimatetheimpactoflivinginaruralsettingonmentalhealth,theauthorsperformasimplebivariateregression.
Giventhatthecriterionisnotrespected,thisstudyachievesSMS1forimplementation.
Before-and-AfterThismethodoftenentailsusingatimeseriesdatasetforwhichoneindividualistrackedacrosstime.Anexampleofthisisthevariationofaspecificcountry’sGDPoveranumberofdifferentyears.Inthiscase,itispossiblethattheindividualisobservedbeforethetreatmentandafterthetreatment.However,the“after”doesnotnecessarilycapturethepuretreatmenteffect-itispossiblethatothercontextualfactorsaffectedtheoutcome,aswellastheindividual’scharacteristics.Forthisreason,thismethodcanachieveamaximumSMSscoreof2.Inordertoachievethisscore,however,relevantcontrolvariablesmustbeused.
MethodMaximum SMS score (method, implementation)
Adjusted SMS score (method, implementation)
Before-and-after 2,2if
• Adequatecontrolvariablesareused
2,2if
• Inadequatecontrolvariables
Before-and-After Example 1 (SMS 2, 2)
A2014studybyBlankandEgginkevaluatestheimpactofdifferentgovernmentalhealthpoliciesonhospitalproductivity.Theyexploitatimeseriesdatasetthatfeaturestheoverallproductivityofdutchhospitalsfrom1972to2010.Itisworthnotingthatalthoughthismightseemlikeapaneldataset(becausethereareobviouslymanydifferenthospitals),itisessentiallyatimeseriesdatasetsimplybecausethereisnovariationintreatment-i.e.allhospitalsaresubjectedtothesamecentralgovernmentpolicies.Theonlyvariationintreatmentinthiscaseistemporal,asdifferentpolicieswereadoptedindifferenttimeperiods.InorderforthisstudytoachievethemaximumSMSof2,thecriterionmustberespected.
20 Patel,V.,Simbine,A.,Soares,I.Weiss,H.,Wheeler,E.(2007).PrevalenceofSevereMentalandNeurologicalDisordersinMozambique:aPopulation-BasedSurvey.Lancet,1055-1060.
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 28
1. Adequate control variables are used
Pass
Inthisparticularcase,theauthorsdocontrolforthingslikethepriceofinputs,wages,andnumberofadmissions.
Given that the criterion is satisfied, the study achieves SMS 2 for implementation.
Before and After Example 2 (SMS 2, 1)
A2000studybyCormanandMocanevaluatestheimpactofdruguseandlevelsoflawenforcementoncrime.21Morespecifically,theylookathowthenumberofarrestsandpoliceofficers,aswellasdruguse,impactthenumberofmurders,assaults,robberies,andburglaries.TheyexploitadatasetthatfeaturesmonthlycrimeratesforNewYorkCity.Here,the“treatment”isthevariationinarrests,numberofpoliceofficers,andhospitalisationsrelatedtodruguse.
1. Adequate control variables are used
Fail
Theauthorsdonotincludecontrolsforotherfactorsthatmayinfluencecrimerates.Forinstance,itispossibletheunemploymentlevelsorinflationmayimpactcrime.Becausethesefactorsarenotaccountedfor,itmaybethecasethatthetreatmenteffectsarenotentirelyattributabletonumberofarrests,policeofficers,anddruguse.
Given that the criterion is not satisfied, the study achieves SMS 1 for implementation.
Additionality (not SMS scoreable) Thismethodentailsaskingparticipantsinaprogrammeaboutthechangesinoutcomethattheyattributetotheprogrammeeffects.AnexamplewouldbetoaskafirminreceiptofanR&Dgrant‘doyouthinkyoudomoreR&Dasaresultofthegrant?’.Ifsay75%offirmsanswer‘yes’thentheremaining25%isconsidered‘deadweight’i.e.R&Dthatwouldhavehappenedanyway.Whilstthismethoddoesacknowledgedeadweight,thecounterfactualisbasedpurelyonopinion:whatthefirmthinkswouldhavehappened.Giventhelackofobservablecounterfactual,thismethodisnotSMSscoreable.
Additionality Example
A2013reportcommissionedbytheDepartmentforBusinessInnovationandSkillsevaluatestheimpactoftheEnterpriseFinanceGuaranteescheme(EFG),aprogrammethatprovidedSMEswithanadditionalsourceoffunding.22Themethodologyofthisevaluationrelieschieflyonself-reportedsurveyinformation.Morespecifically,EFGfirmswereaskedtorespondtoquestionnairesthataskedabouthowtheirbusinessesbenefitedfromtheprogramme.Self-reporteddataareoftentheobjectofbias,andforthisreason,thisstudyisnotSMSscoreable.
21 Corman,H.,&Mocan,H.(2000).ATime-SeriesAnalysisofCrime,Deterrence,andDrugAbuseinNewYorkCity.AmericanEconomicReview,584-604.
22 Allinson,G.,Robson,P.,Stone,I.(2013).EconomicEvaluationoftheEnterpriseFinanceGuarantee(EFG)Scheme.DepartmentforBusinessInnovation&Skills.
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 29
Impact modelling (not SMS scoreable)Impactmodellingisanapproachtoproduceexanteorexpostestimatesoftheeconomicimpactofapolicyorevent.Estimatesaremadeusing‘modellingassumptions’aboutthewayinwhicheventsorpolicesimpactonlocalareas.Forexample,impactmodelsadjustestimatesusingscalefactorstoaccountforissuessuchasdisplacementeffects–wherespendingassociatedwithalocaleventisoffsetfromspendingelsewhereorinanothertimeperiod.Sinceestimatesarebased(partlyorentirely)onassumptions,ratherthanobservedoutcomes,theapproachisparticularlyamenabletoexantestudiesofimpactwhereoutcomesarenotyetobservableinthedata(e.g.modellingthepotentialeconomicimpactsofafestivalorotherbigevent).
However,theresultsfromsuchmodellingprocessesareverysensitivetotheassumptionsmade,implyingthatinaccurateassumptionscanleadtoveryunreliableestimates.Inpractice,theassumptionsusedtendnotbebasedonstrongevidenceandareoftennotrelevanttothespecificcaseathand.Forexample,thereisnoreasontoexpectlargelocalmultipliereffectsinalocaleconomythatisalsopartofthelargernationaleconomy,andnostraightforwardwaytodividetheimpactsacrossspace.Similarly,itisneverreallypossibletoknowaccuratelyhowmuchdisplacementoccurs(e.g.inthefestivalexample,howmanyindividualswhowouldhavevisitedduringthefestival,avoidthefestivalandeithervisitanothertimeornotatall).Sincethismethoddoesnotprovideareasonablecounterfactual,itisnotSMSscoreable.
Impact modelling Example 1
AstudycommissionedbyChelmsfordCityCouncilevaluatestheimpactoftheVFestival(amusicfestival)onthelocaleconomy.Theystartbycalculatingthefestival’stotaldirectexpenditureanditsincidenceonthecity,county,andregion.Todothis,theauthorsusesurveydataontheexpenditurebyMetropolisMusic(theorganisersoftheevent),festivalcontractors,andvisitors.Inordertoestimatetheimpactofthisexpenditureonthelocalarea,theymakevariousassumptionsaboutthesizeofthelocalmultipliereffect,leakageeffectsnddisplacementeffects.Overallthedirectexpenditure(netimpacts)areestimatedat£7.4m(£6.6m)fortheEastofEnglandregion,£7.2m(£7.4m)forEssexcountyand£6.6m(£8.2m)forthecityofChelmsford.Notably,thenetimpactishigherthanexpenditureinChelmsfordbutlowerthanexpenditureintheregion.ThisreflectsthefactsomeexpenditureinChelmsfordisprobablypartlydisplacedfromthegreaterregion(50%ofvisitorstothefestivalcamefromtheEastofEnglandregion)whereasthelocalmultiplierisassumedtobethesame(about1.45)forthecity,countyandregion.
Inpracticethough,itisverydifficulttoensuretheseassumptionsarecorrect.Forexample,onewaytheseassumptionscouldfalldownisthatitislikelythatthelocalmultiplierissmallerinasinglecitythaninanentireregion.Alternatively,leakageanddisplacementeffectsmaybemuchlargerthanthought.ForChelmsford,itisassumedthatonlyaround12.8%ofexpenditurewouldhavehappenedanywayorleavesthecity.Giventhatthegreatmajorityofexpenditureisonfoodonthefestivalsite,thisfactorcouldbeanunderestimateif,forexample,alotofthecateringfirmscomefromoutsideofChelmsford.
Insummary,makingbroadassumptionsabouteffectsisoftenastraightforwardwaytoestimateimpactbut,duetothemanysourcesforerror,itisnosubstituteforobservingactualchangeinoutcomes(e.g.localfirmGVA)comparedwithacontrolgroup.SincethismethoddoesnotprovideacounterfactualitisnotSMSscoreable.
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 30
Appendix 1: Quick-scoring guide for the Maryland Scientific Method Scale
MethodMaximum SMS score (method, implementation)
Adjusted SMS score (method, implementation)
Randomised Control Trial (RCT)
a.k.a.
FieldExperiment
5,5if
• Randomisationissuccessful
• Attritioncarefullyaddressedornotanissue
• Contaminationnotanissue
5,4if
• Oneofthecriteriaisseverelyviolated
5,3if
• Twoormoreofthecriteriaareseverelyviolated
Instrumental Variable (IV)
a.k.a.
Two-StageLeastSquares(2SLS)
4,4ifinstrument
• Relevant(explainstreatment)
• Exogenous(notexplainedbyoutcome)
• Excludable(doesnotdirectlyaffectoutcome)
Scoredasperunderlyingmethodif
• Instrumentinvalid
• e.gcrosssectionwithinvalidIVscores2;difference-in-differencewithinvalidIVscores3(seebelow)
Regression Discontinuity Design (RDD)
4,4if
• Discontinuityintreatmentissharp(e.g.stricteligibilityrequirement)orfuzzydiscontinuitymethodused
• Onlytreatmentchangesatboundary
• Behaviourisnotmanipulatedtomakethecut-off
Scoredasperunderlyingmethodif
• Discontinuityconditionsseverelyviolated
• e.gcrosssectionwithinvalidIVscores2;difference-in-differencewithinvalidIVscores3(seebelow)
Difference in differences (DID)
a.k.a.
Diffindiff
3,3if
• Controlgroupwouldhavefollowedsametrendandtreatmentgroup
• Knowntimeperiodfortreatment
3,2if
• Eitherofthecriteriaisnotsatisfied
Panel methods
Panel Fixed Effects (FE)
3,3if
• Fixedeffectisattheunitofobservation
• Yeareffectsareincluded
• Appropriatetime-varyingcontrolsareused
3,2if
• Oneormoreofthethreecriteriaisnotsatisfied
First Differences (FD)
3,3if
• Yeareffectsareincluded
• Appropriatetime-varyingcontrolsareused
3,2if
• Eitherofthetwocriteriaisnotsatisfied
Arellano-Bond (AB)
3,3if
Yeareffectsareincluded
• Appropriatetime-varyingcontrolsareused
3,2if
• Oneofthetwocriteriaisnotsatisfied
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 31
MethodMaximum SMS score (method, implementation)
Adjusted SMS score (method, implementation)
Hazard Regressions
Mixed Proportional Hazards (MPH)
4,4if
• Keyassumptionof‘noanticipation’holds
• Variationintiming(e.g.peoplestarttrainingatdifferenttimesrelativetobecomingunemployed)
4,3if
• Anticipationoftreatmentlikely
• Littlevariationintiming
4,2if
• Neitherofthecriteriaissatisfied
Proportional Hazards (PH)
3,3if
• Adequatecontrolgroupisestablished
• Treatmentdateisknownandsingular
3,2if
• Oneofthetwocriteriaisnotsatisfied
Heckman Two-Stage Approach (H2S)
Or
Control Function (CF)
With IV 4,4if
• SelectionequationincludesanIVthatsatisfiesthethreecriteria(seeIV)
Scoredasperunderlyingmethodif
• Instrumentinvalid
• e.gcrosssectionwithinvalidIVscores2;difference-in-differencewithinvalidIVscores3(seebelow)
With DID or panel method
3,3if
• Selectionequationincludesrelevantobservablevariables
• DIDorpanelcriteriamet
3,2if
• Oneofthecriteriaisnotsatisfied
Cross-sectional
2,2if
• Selectionequationincludesrelevantobservablevariables
2,1if
• Selectionequationdoesnotincluderelevantobservablevariables
Propensity Score Matching (PSM)
a.k.a.
Matching
With DID or panel method
3,3if
• Matchingcriteriasatisfied(seebelow)
• DIDorpanelcriteriasatisfied
3,2if
• Oneofthecriteriaisviolated
Cross-sectional
2,2if
• Goodmatchingvariables(i.e.relevanttoselection)
• Significantcommonsupport
2,1if
• Oneofthecriteriaisviolated
Cross-sectional regression 2,2if
• Adequatecontrolvariablesareused
2,1if
• Inadequatecontrolvariables
Before-and-after 2,2if
• Adequatecontrolvariablesareused
2,1if
• Inadequatecontrolvariables
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 32
MethodMaximum SMS score (method, implementation)
Adjusted SMS score (method, implementation)
Stated effects/impact
a.k.a.
Additionality
Statedexperiences
NotSMSscoreable
Impact modelling NotSMSscoreable
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 33
Appendix 2: Mixed methods scoring 4 or 3.Asdiscussedinthemaintext,thereareanumberofmethodsthatarelessfrequentlyencounteredinlocaleconomicgrowthevaluationswhichmayscorea4ora3dependingontheparticularwayinwhichtheyarespecified.Thissectionexaminesthesemethodsandhowwescorethem.
Hazard Regressions Hazardfunctionsareusedtofindouttheimpactofapolicyonanoutcomethatrepresentsaduration.Theyarecommoninlaboureconomics,wheretheyareoftenusedtoanalysethevariablesthatimpactstrikeorunemploymentduration.Thehazardfunctioncalculatestheprobabilitythatanindividualwillleaveagivenstate(forinstance,unemployment)atagivenmomentintime.Proportionalhazardmodelsacknowledgethatthedependentvariableisafunctionofthetreatment,someobservedexplanatoryvariables(suchasageoreducation),arandomvariablethataccountsforindividualheterogeneity,andabase-linehazard.Thisbase-linehazardisessentiallythe“average”hazardfunction(i.e.propensitytoexitunemployment).Thismethoddealswithselectionintheprogrammeonobservablecharacteristicsbycontrollingfortheeffectofthesefactorsonthehazardrate.However,sinceselectionintothetreatmentgroupmayoccuronunobservablevariables,abiasmaypersist.BecausethismethodcontrolswellforobservablesbutisnotabletodealwithunobservablesitscoresamaximumofSMS3.However,inorderforthemethodtoachieveSMS3,twocriteriamustbesatisfied.Firstly,acontrolgroupmustbeused,anditmustbecrediblyarguedthattreatmentgroupwouldhavefollowedthesametrendasthiscontrolgroup,haditnotbeenexposedtotreatment.Secondly,theremustalsobeaknownandsingulartreatmentdatesothatthegroupscanbecomparedbeforeandafterthetreatment.
AnextensionofProportionalHazardmodelsistheMixedProportionalHazard(MPH)model.TheMPHexploitstimevariationintreatmenttoconstructacontrolgroup.Morespecifically,ithingesonthefactthatindividualsareexposedtotreatmentatdifferenttimes.Therefore,individualsthatreceivedthetreatmentatthebeginningoftheirunemploymentspellarecomparedtoindividualsthatreceivedthetreatmentafewmonthsaftertheyenteredunemployment.Here,theperiodwheretheindividualdidnothavetreatmentisusedasacontrolgroup.Thebasicideaisthattheindividualthatgottreatmentfromtheoutsetandtheindividualthatgottreatmentafewmonthslaterareverysimilarbecausetheybothself-selectedintotreatment,theonlydifferencebeingthepointatwhichtheyactuallygotit.BecausetheMPHfeaturesacontrolgroupthatisplausiblyunobservablyandobservablysimilartothetreatmentgroup,itcanachieveamaximumofSMS4.However,inorderforittoachievethismaximum,itmustmeettwokeycriteria.Firstly,individualscannotanticipatetreatment.Forinstance,ifanindividualknowstheyaregoingtoreceivetraininginamonth’stime,theymaynotsearchforjobsasintenselyastheydidbeforeknowing.Thismeansthatthe“control”groupiseffectivelytaintedandtheimpactofthepolicywillthereforelikelybeoverstated.Secondly,theremustbevariationintimingofindividuals’treatment.TheMPHmodelessentiallycomparesindividualsthatgotthetreatmentstraightawaywiththosethatgotthetreatmentabitlater.Forthisreason,itisextremelyimportantthatthisvariationexists.
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 34
MethodMaximum SMS score (method, implementation)
Adjusted SMS score (method, implementation)
Hazard Regressions
Mixed Proportional Hazards (MPH)
4,4if
• Keyassumptionof‘noanticipation’holds
• Variationintiming(e.g.peoplestarttrainingatdifferenttimesrelativetobecomingunemployed)
4,3if
• Anticipationoftreatmentlikely
• Littlevariationintiming
4,2if
• Neitherofthecriteriaissatisfied
Proportional Hazards (PH)
3,3if
• Adequatecontrolgroupisestablished
• Treatmentdateisknownandsingular
3,2if
• Oneofthetwocriteriaisnotsatisfied
MPH Example 1 (SMS 4, 4)
A2008paperbyLalive,vanOurs,andZweimullerevaluatestheimpactofactivelabourmarketprogrammesinSwitzerland.23Theseprogrammesaimtohelptheunemployedbyprovidingassistanceinsearchingforjobs.Aswithmostunemploymentprogrammes,thisisparticularlyhardtoevaluateastheunemployedarelikelytobedifferentfromtheemployed.Toovercomethis,theauthorsemployaMixedProportionalHazard(MPH)approach.InorderforthestudytoachievethemaximumSMSscoreof4,thetwocriteriamustbesatisfied.
1. No anticipation of treatment
Pass
Inthisparticularcase,itisarguedthatanticipationdoesnotplayasignificantrolebecauseSwissparticipantswereonlynotifiedaboutactualparticipationonetotwoweeksinadvance.Furthermore,theauthorshighlightthatinSwitzerlandtherearepenaltiesforjobseekersthatreducetheirsearcheffortsinanticipationofprogrammeparticipation.Lastly,theypointtothefactthattheprogrammewasoversubscribed,sojobseekerscouldnotaccuratelyknowwhenandiftheywouldbeparticipating.
2. Variation in timing
Pass
Theprogrammedoesnotspecifythatindividualsshouldbeunemployedforacertainamountoftimeinordertobeeligible.Thismeansthattheauthorscanstudytheeffectsoftheprogrammeforindividualsthathavebeenunemployedfordifferenttimeperiods.
Given that the two criteria are satisfied, this study achieves SMS 4 for implementation.
23 Lalive,R.,VanOurs,J.,Zweimuller,J.(2008).TheImpactofActiveLabourMarketProgrammesonTheDurationofUnemploymentinSwitzerland.EconomicJournal,235-257.
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 35
MPH Example 2 (SMS 4, 3)
A2006studybyHujerandZeissevaluatestheimpactofajobcreationschemeonemployment.24Policiesthatseektoimproveindividuals’chancesoffindingajobareparticularlyhardtoevaluatebecauseunemploymentisoftendeterminedbypeople’sobservedandunobservedcharacteristics.Accordingly,theauthorsusetheMPHmodeltoovercomethisissue.
1. No anticipation of treatment
Fail
Itisextremelyunlikelythatindividualsdidnotknowbeforehandthattheywouldbeparticipatinginthejobschemeinthefuture,meaningthatthisconditiondoesnothold.Itisthereforelikelythatparticipantsstoppedsearchingforjobsasintenselywhentheyfoundoutthattheywouldbeparticipatinginthejobcreationscheme.
2. Variation in timing
Pass
Peoplethatparticipateintheprogrammedosowithvaryingamountsoftimeinunemployment.Inpractice,thismeansthattherearen’tanyeligibilityrequirementsintermsofunemploymenttime,thereforeimplyingthattheeffectoftheprogrammeonpeoplewithdifferingtimesofunemploymentcanbestudied.
Given that only one of the criteria is satisfied, this study achieves SMS 3 for implementation.
PH Example 3 (SMS 3, 3)
A2013studybyPalaliandvanOursevaluatestheimpactoflivingclosetoacoffee-shoponcannabisuse.25Theauthorsuseahazardfunctiontolookathowlongpeople“survive”withoutusingcannabis.Furthermore,theyexploitthefactthatcoffee-shopsbecamewidespreadduringthe1980s/90s,andthatthereweren’tanybeforethen.Theythereforeusecohortsthatwereinprimedrug-usingageaftertheriseofcoffee-shopsasthetreatmentgroup,andcohortsthataretoooldtobeaffectedcoffee-shopsasacontrol.
1. Adequate control group is established
Pass
Inthiscase,thetreatmentgroupisthecohortbornbetween1974and1992,whilethecontrolgroupisthecohortbornbetween1955and1973.Itcouldbeargued,however,thatthedifferentcohortsareexposedtodifferentculturaltrends,andarethereforeinherentlydifferenttoeachother,particularlywithrespecttocannabisuse.Tocounterthis,theauthorscontrolforthingslikereligiousaffiliation,urbanversusruralresidence,andmigrantstatus.
2. Treatment date is known and singular
Pass
In1980,theDutchgovernmentpubliclyannounceditspolicyoftoleranceofcoffee-shops.Thisledtoarapidincreaseinthenumberofestablishments,sothatbythemid-1990stherewerearound
24 Hujer,R.,Zeiss,C.(2006).TheEffectsofJobCreationSchemesontheUnemploymentDurationinEastGermany.IABDiscussionPapers.
25 Palali,A.,&VanOurs,J.(2013).DistancetoCannabis-ShopsandAgeofOnsetofCannabisUse.CentERDiscussionPaper,2013-2048.
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 36
1500coffee-shops.Itcanbereasonablybelievedthatthisannouncementisasingularandtraceabletreatment.
Given that the two criteria are satisfied, this study achieves SMS 3 for implementation.
PH Example 2 (SMS 3, 2)
A1990studybyGundersonandMelinoevaluatestheimpactofpublicpolicyonstrikeduration.26Becausetheoutcomevariablehastodowithduration,theauthorsuseahazardregression.InordertoachievethemaximumSMSscoreof3,thetwocriteriamustbesatisfied.
1. Adequate control group is established
Fail
Inthiscase,theauthorsdonotattempttofindacontrolgroup.Instead,theysimplyrunaregressionwherebythestrikedurationsofdifferentunitswithdifferentpublicpoliciesarecompared.Therefore,itcanbeheldthattheauthorsdonotcomparetreatedunitswithuntreatedunits,butcompareunitswithdifferentintensitiesoftreatment.
2. Treatment date is known and singular
Fail
Thereisnoattempttodoacomparisonbeforeandafterthetreatment.Instead,placeswithinitialandfixeddifferinglevelsoftreatmentarecompared.
Given that none of the criteria are satisfied, this study achieves SMS 2 for implementation.
Heckman two-stage correction (H2S)/ Control function (CF)Thesemethodsuseatwo-stageapproachtoovercomeselectionbias.Thefirststageentailscalculatinga“selectionequation”thatincludesthecharacteristicsthatmayleadsomeonetoselectintotreatment.Inthesecondstage,elementsofthisselectionequationareincorporatedintothefinalregression(theonethatincludesthetreatmenteffectandoutcomevariableofinterest).Inthisway,theinclusionoftheselectionequationeffectively“absorbs”anyofthepre-existingselectionbias.However,althoughthesemethodsprovideapotentialsolutionforselectionbias,theydonotnecessarilyachievethisaim.Thisisbecausethequalityofthesemethodsislargelydependentonthetypesofvariablesthatareincludedintheselectionequation.Iftheselectionequationincludesahigh-qualityinstrument,thenitcanbeheldthattheselectionequationsuccessfullyaccountsforanyunobservableselectionbias,meaningthatthesemethodscanachieveamaximumofSMS4.Thisscorecanbeachievediftheinstrumentintheselectionequationsatisfiesthethreecriteriaforvalidinstruments(exogenous,relevant,andexclusionary).However,iftheselectionequationonlyincludesobservablecharacteristicsandnoinstrument,thentheunobservableelementofselectionbiasremainsanissue.Inthiscase,thesemethodscanachieveamaximumSMS2.Thismaximumscorecanbeachievedifthevariablesintheselectionequationadequatelyexplainselectionintotreatmentthatisbasedonobservablecharacteristics.However,evenifonlyobservablecharacteristicsareincludedintheselectionequation,ifthesemethodsarecombinedwithothermethodsthatattempttocontrolforunobservablecharacteristics(forinstance,DIDorpaneldatamethods),thentheycanachieveamaximumSMSscoreof3.Inordertoachievethisscore,theymustsatisfythecriteriaforthemethodthattheyareusedincombinationwith.
26 Morley,G.&Melino,A.(1990).TheEffectsofPublicPolicyonStrikeDuration.JournalofLaborEconomics
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 37
MethodMaximum SMS score (method, implementation)
Adjusted SMS score (method, implementation)
Heckman Two-Stage Approach (H2S)
Or
Control Function (CF)
With IV 4,4if
• SelectionequationincludesanIVthatsatisfiesthethreecriteria(seeIV)
Scoredasperunderlyingmethodif
• Instrumentinvalid
• e.gcrosssectionwithinvalidIVscores2;difference-in-differencewithinvalidIVscores3(seebelow)
With DID or panel method
3,3if
• Selectionequationincludesrelevantobservablevariables
• DIDorpanelcriteriamet
3,2if
• Oneofthecriteriaisnotsatisfied
Cross-sectional
2,2if
• Selectionequationincludesrelevantobservablevariables
2,1if
• Selectionequationdoesnotincluderelevantobservablevariables
H2S with IV Example (SMS 4, 3)
A2013studybyChen,Sheng,andFindlayevaluatestheimpactofindustry-levelforeigndirectinvestment(FDI)ontheexportvalueofdomesticfirmsinChina.27Thereisaclearproblemofselectionbias,asindustriesthatreceiveFDIarelikelydifferenttothosethatdonot.Toovercomethis,theauthorsemploytheHeckmantwo-stageapproach.Furthermore,theyuseFDIinflowstoASEANcountries(perindustry)asaninstrumentforFDIinflows,andincludethisvariableintheselectionequation.
Instrument is:
1. Instrument is relevant
Pass
ItcanbeplausiblyheldthatFDIinChinaiscloselyrelatedtotheFDIinflowsofASEANcountriesasawhole.
2. Instrument is exogenous
Fail
WhetherornotaparticularindustryattractsFDIisnotrandom.Forthisreason,itcannotbereasonablyarguedthattheinstrumentisexogenous.
3. Instrument is exclusionary
Pass
FDIinASEANcountriesdoesnotdirectlyimpacttheexportvalueofChinesefirms.
Given that only two of the criteria are satisfied, this study achieves SMS 3 for implementation.
27 Chen,C.,Sheng,Y.,Findlay,C.(2013).ExportSpilloversofFDIonChina’sDomesticFirms.ReviewofInternationalEconomics,841-856.
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 38
CF with DID Example (SMS 3, 3)
A2006studybyAndrenandAndrenevaluatestheimpactofvocationaltrainingonemployment.28Toovercomeissuesofselectionbias,theyemploytheC.F.method.Furthermore,itcontrolforunobservableselectionintotreatment,theyconstructacontrolgroupofindividualsthatdidnotreceivethetraining.Theythencomparetreatedandcontrolindividuals’propensitytoleaveunemploymentbeforeandafterthetraining.
1. Selection equation includes relevant observable variables
Pass
Theauthorsincludevariablessuchasage,sex,education,whethertheindividualhaschildrenornot,andregionofresidence.Withrespecttoobservables,thisseemstobeanadequatelist.
2. Control group would have followed same trend as treatment group
Pass
TheauthorshadaccesstoarichdatabaseofallunemployedindividualsinSweden.Theythereforechoseindividualsthatwereobservationallysimilartotreatedindividualstoformtheircontrolgroup.Accordingly,itcanbecrediblyarguedthattreatmentandcontrolgroupsareatleastobservationallysimilar,andthereforewilllikelyfollowthesametrends.
3. Treatment date is known and singular
Pass
Althoughdifferentindividualsstartedandendedthevocationaltrainingprogrammeatdifferenttimes,itisreasonabletopresumethatthereisaclearandsingulardateoftreatmentforeachperson.Thisisbecausevocationaltrainingprogrammeshaveaclearstartdate,sothatthereareno“partial”treatmenteffects.
Giventhatthethreecriteriaaresatisfied,thestudyachievesSMS3forimplementation.
H2S Cross-sectional Example (SMS 2, 2)
A2009studybyHammarstedtevaluatestheimpactoffutureearningsonthedecisiontoseekself-employmentinsteadofwage-employment.29Thedecisiontobecomeanentrepreneurislikelytaintedbyproblemsofself-selection–thosethatchoosetoforgowageemploymentareprobablyinherentlydifferenttothosethatdonot.Forthisreason,theauthorusestheHeckmantwostageapproachtocorrectforselectionbias.However,becausethestudyonfeaturesobservationsfortheyear2003,itispurelycross-sectional.
1. Selection equation includes relevant observable variables
Pass
Theauthorincludesanextensivelistofcontrolvariablesthataccountfordifferencesineducationandlocation.
Giventhatthecriterionissatisfied,thisstudyachievesSMS2forimplementation.
28 Andren,T.&Andren,D.(2006).AssessingtheEmploymentEffectsofVocationalTrainingUsingaOne-FactorModel.AppliedEconomics,2469-2486.
29 Hammarstedt,M.(2009).PredictedEarningsandthePropensityforSelf-Employment.InternationalJournalofManpower,349-359.
Guide to scoring evidence using the Maryland Scientific Method Scale - Updated June 2016 39
Appendix 3: Arellano-Bond methodTheArellano-Bond(AB)methodtakesasimilarapproachtoeitherthefixedeffectsorfirstdifferencesmethodbutincludesalagoftheoutcomevariableintheregression..However,ABrecognisesthatthelagmaybecorrelatedwiththeerrorterm,meaningthatthecoefficientsmaybebiased.Forthisreason,afurtherlagisusedasaninstrumentfortheoriginallag.
MethodMaximum SMS score (method, implementation)
Adjusted SMS score (method, implementation)
Panel Methods
Arellano-Bond (AB)
3,3if
• Yeareffectsareincluded
• Appropriatetime-varyingcontrolsareused
3,2if
• Oneofthetwocriteriaisnotsatisfied
AB Example 1 (SMS 3, 3)
A2010studybyBayona-SaezandGarcia-MarcoevaluatestheimpactoftheEurekaprogramme(aEuropeaninitiativetosupportR&D)onfirmperformance.30AswiththeevaluationofotherR&Dsubsidyprogrammes,thereisprobablyaproblemofselectionbiasasfirmsthatusethesubsidiesarelikelytobedifferentfromthosethatdonot.Toovercomethis,theauthorsusetheArellano-Bondmethod.
1. Year effects are included
Pass
Theauthorsincludeayearvariableinthefinalregression.
2. Appropriate time-varying controls used
Pass
Inthisparticularcase,itislikelythatthereareimportantvariablesthataffectfirmprofitabilityandvarywithtime.Accordingly,theauthorsincludeatime-varying“firmsize”variable.
Given that the two criteria are satisfied, this study achieves SMS 3 for implementation
30 Bayona-Sáez,C.&García-Marco,T.(2010).AssessingtheEffectivenessoftheEurekaProgram.ResearchPolicy,1375-1386.
The What Works Centre for Local Economic Growth is a collaboration between the London School of Economics and Political Science (LSE), Centre for Cities and Arup.
www.whatworksgrowth.org
ThisworkispublishedbytheWhatWorksCentreforLocalEconomicGrowth,whichisfundedbyagrantfromtheEconomicandSocialResearchCouncil,theDepartmentforBusiness,InnovationandSkillsandtheDepartmentofCommunitiesandLocalGovernment.ThesupportoftheFundersisacknowledged.TheviewsexpressedarethoseoftheCentreanddonotrepresenttheviewsoftheFunders.
Everyefforthasbeenmadetoensuretheaccuracyofthereport,butnolegalresponsibilityisacceptedforanyerrorsomissionsormisleadingstatements.
Thereportincludesreferencetoresearchandpublicationsofthirdparties;thewhatworkscentreisnotresponsiblefor,andcannotguaranteetheaccuracyof,thosethirdpartymaterialsoranyrelatedmaterial.
June2016
WhatWorksCentreforLocalEconomicGrowth
info@whatworksgrowth.org@whatworksgrowth
www.whatworksgrowth.org
©WhatWorksCentreforLocalEconomicGrowth2016
Recommended