The Acquisition and Use of Causal Structure Knowledge
Benjamin Margolin Rottman
Learning Research and Development Center, University of Pittsburgh, 3939 O’Hara Street, Pittsburgh PA 15260
In M. R. Waldmann (Ed.), Oxford Handbook of Causal Reasoning. Oxford: Oxford U.P.
Author Note: This research was supported by NSF BCS-1430439. I thank Michael Waldmann and Bob Rehder for providing helpful comments and suggestions.
Abstract
This chapter provides an introduction to how humans learn and reason about multiple causal relations connected together in a causal structure. The first half of the chapter focuses on how people learn causal structures. The main topics involve learning from observations vs. interventions, learning temporal vs. atemporal causal structures, and learning the parameters of a causal structure including individual cause-effect strengths and how multiple causes combine to produce an effect. The second half of the chapter focuses on how individuals reason about the causal structure, such as making predictions about one variable given knowledge about other variables, once the structure has been learned. Some of the most important topics involve reasoning about observations vs. interventions, how well people reason compared to normative models, and whether causal structure beliefs bias reasoning. In both sections I highlight open empirical and theoretical questions.
Keywords: Causal Structure Learning, Causal Reasoning, Causal Bayesian Networks
1 Introduction
In the past two decades, psychological research on causal learning has been strongly influenced by a normative framework developed by statisticians, computer scientists, and philosophers called Causal Bayesian Networks (CBN) or probabilistic directed acyclic graphical models. The psychological adoption of this computational approach is often called the CBN framework or causal models. The CBN framework provides a principled way to learn and reason about complex causal relations among multiple variables.
For example, Thornley (2013) used causal learning algorithms to extract the causal structure in Figure 1 from medical records. Having the causal structure is useful for experts such as epidemiologists and biologists to understand the disease and make predictions for groups of patients (e.g., the likelihood of having cardiovascular disease among 70-year-old smokers). It is also useful for scientists when planning future research; when researching cardiovascular disease as the primary outcome it is critical to measure and account for smoking status and age; it is not important to measure or statistically control for systolic blood pressure.
Figure 1: Causal Structure of Cardiovascular Disease (adapted from Thornley, 2013)
[Figure 1 nodes: Sex, Cholesterol Ratio, Smoking, Cardiovascular Disease, Ethnicity, Diabetes, Age, Statin Use, Systolic Blood Pressure]
Though causal structures are surely useful for scientists, the causal models approach to causal reasoning hypothesizes that lay people also have an intuitive understanding of causal structures and use a framework similar to CBNs to learn and reason about causal relations. Adopting a “man as intuitive statistician” or “intuitive scientist” approach (Peterson & Beach, 1967), we can also contemplate how a doctor might develop a set of causal beliefs about cardiovascular disease somewhat akin to Figure 1. Of course the doctor likely has some knowledge of specific causal links from medical school and research articles. But these links may be reinforced or contradicted by personal experience such as noticing which patients have which symptoms and diseases, and tracking how patients’ symptoms change after starting a treatment. Developing a set of causal beliefs such as in Figure 1 would allow a physician to make prognoses and treatment plans tailored to individual patients.
The CBN framework supports all of these different functions: learning, prediction, explanation, and intervention. The rest of this chapter will explain what the CBN framework entails, the evidence pertaining to how people learn and reason about causal networks, and how closely humans appear to mimic the normative CBN framework.
The outline of this chapter is as follows. I first explain what CBNs are, both normatively and as a model of human learning and reasoning. The bulk of the first half of this chapter is devoted to evidence about how people learn about causal networks including the structure, strength, and integration function. I then discuss evidence suggesting that instead of using the basic CBN framework, people may be using something akin to a generalized version of the CBN framework that allows for reasoning about time. The second half of the chapter is devoted to evidence on how people reason about their causal beliefs. At the end of the chapter I raise some questions for future research.
1.1 What Are Causal Bayesian Networks?
A Causal Bayesian Network is a compact visual way to represent the causal relations between variables. Each variable is represented as a node, and arrows represent causal relations from causes to effects. The absence of an arrow implies the absence of a causal relation.
Though Causal Bayesian Networks capture the causal relations among variables, they also summarize the statistical relations between the variables. The CBN framework explains how causal relations should be learned from statistical relations. For example, given a dataset with a number of variables, the CBN framework has rules for figuring out the causal structure(s) that are most likely to have produced the data. Conversely, a causal structure can be read such that if the causal structure is believed to be true, it implies certain statistical relations between the variables; that some sets of variables will be correlated and others will not be correlated. In order to understand how to “read” a CBN, it is important to understand these relations between the causal arrows and the statistical properties they imply.
First, it is critical to understand some basic statistical terminology. “Unconditional dependence” is whether two variables are statistically related to (or “dependent on”) each other (e.g., correlated) without controlling for any other variables. If they are correlated, they are said to be dependent, and if they are not correlated, they are said to be independent. “Conditional dependence” is whether two variables are statistically related to each other after controlling for one or more variables (e.g., whether there is a significant relation between two variables after controlling for a third variable in a multiple regression). Conditional and unconditional dependence and independence are critical for understanding a CBN, so it is important to be fluent with these terms before moving on.
There are two properties, the Markov property and the faithfulness assumption, that explain the relations between the causal arrows in a CBN and the statistical dependencies between the variables in a dataset that is summarized by the CBN (also see Rehder, this volume a and b). The Markov property states that once all the direct causes of a variable X are controlled for or held constant, X is statistically independent of every variable in the causal network that is not a direct or indirect effect of X.
For example, consider Figure 2b. The Markov assumption states that X will be independent of all variables (e.g., Z) that are not direct or indirect effects of X (X has no effects in Figure 2b) once controlling for all direct causes of X (Y is the only direct cause of X). Similar analyses can be used to see that X and Z are also independent conditional on Y in Figures 2a and 2c.
With regard to Figure 2d, the Markov assumption implies that X and Z are unconditionally independent (not correlated). Neither X nor Z has any direct causes, so X will be independent of every variable, such as Z, that is not a direct or indirect effect of X (Y is the only effect of X). The Markov property is symmetric: if X is independent of Z, then Z is independent of X; that is, they are uncorrelated.
The faithfulness assumption states that the only independencies between variables in a causal structure must be those implied by the Markov assumption (Glymour, 2001; Spirtes, Glymour, & Scheines, 1993). Stated another way, all variables in the structure will be dependent (correlated), except when the Markov property states that they would not be. This means that if we collect a very large amount of data from the structures in Figures 2a, 2b, or 2c, X and Y, Y and Z, and X and Z would all be unconditionally dependent; the only independency between the variables arises from the Markov assumption, namely that X and Z are conditionally independent given Y. If we collected a large amount of data and noticed that X and Z were unconditionally independent, this independency in the data would not be “faithful” to Figures 2a, 2b, or 2c, implying that the data do not come from one of these structures. For Figure 2d, the only independency implied by the Markov property is that X and Z are unconditionally independent. If a large amount of data were collected from structure 2d, X and Y, and Z and Y would be dependent (according to the faithfulness assumption).
Figure 2: Four CBNs
In sum, causal models provide a concise, intuitive, visual language for reasoning about complex webs of causal relations. The causal network diagram intuitively captures how the variables are causally and statistically related to each other. But causal networks can do much more than just describe the qualitative causal and statistical relations; they can precisely capture the quantitative relations between the variables.
To capture the quantitative relations among variables, causal networks need to be specified with a conditional probability distribution for each variable in the network given its direct causes. A conditional probability distribution establishes the likelihood that a variable such as Y will have a particular value given that another variable X, Y’s cause, has a particular value. Additionally, exogenous variables, variables that have no known causes in the structure, are specified by a probability distribution representing the likelihood that the exogenous variable assumes a particular state.
[Figure 2, panels a-d: each panel shows the nodes X, Y, and Z. As described in the text, (a) is X→Y→Z, (b) is X←Y←Z, (c) is X←Y→Z, and (d) is X→Y←Z.]
For example, the CBN in Figure 2a would be specified by a probability distribution for X, a conditional distribution of Y given X, and a conditional distribution of Z given Y. If X, Y, and Z are binary (0 or 1) variables, the distribution for X would simply be the probability that x=1, P(x=1). The conditional probability of Y given X would be the probability that y=1 given that x=1, P(y=1|x=1), and the probability that y=1 given that x=0, P(y=1|x=0), and likewise for the conditional probability of Z given Y. (There is also another way to specify these conditional distributions with “causal strength” parameters, which will be discussed in later sections, and summarized in Section 3.6. See also Cheng & Lu, this volume; Griffiths, this volume; Rehder, this volume, a and b, for more details about parameterizing a structure with causal strengths.)
If the variables are normally distributed continuous variables, the distribution for X would be captured by the mean and standard deviation of X. Then, the conditional distribution of Y would be captured by a regression coefficient of Y on X (which determines, e.g., the probability that y=2.3 given that x=1.7), as well as a parameter to capture the amount of error variance.
The CBN in Figure 2c would be specified by a distribution for Y, and conditional probability distributions for X given Y and Z given Y. The CBN in Figure 2d would be specified by probability distributions for X and Z, and a conditional probability distribution for Y given both X and Z. In this way, a large causal structure is broken down into small units.
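The idea that a structure is broken down into small units can be made concrete with a minimal sketch for the binary chain X→Y→Z of Figure 2a. The parameter values and the helper names (`bern`, `chain_joint`, `prob`) are assumptions chosen only for illustration; they do not come from the chapter. The sketch builds the joint distribution from the three units and checks the pattern the Markov property predicts: X and Z are dependent unconditionally but independent once Y is held fixed.

```python
from itertools import product

# Hypothetical parameters for the chain X -> Y -> Z (Figure 2a); illustrative only.
P_X1 = 0.3                        # P(x=1)
P_Y1_GIVEN_X = {1: 0.8, 0: 0.1}   # P(y=1 | x)
P_Z1_GIVEN_Y = {1: 0.9, 0: 0.2}   # P(z=1 | y)

NAMES = ("x", "y", "z")


def bern(p_one, value):
    """Probability that a binary variable equals `value` when P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


def chain_joint():
    """Joint P(x, y, z) assembled from the three small units P(x), P(y|x), P(z|y)."""
    return {
        (x, y, z): bern(P_X1, x) * bern(P_Y1_GIVEN_X[x], y) * bern(P_Z1_GIVEN_Y[y], z)
        for x, y, z in product((0, 1), repeat=3)
    }


def prob(joint, **fixed):
    """Total probability of all states consistent with `fixed`, e.g. prob(j, x=1, z=0)."""
    return sum(
        p for state, p in joint.items()
        if all(state[NAMES.index(name)] == value for name, value in fixed.items())
    )


joint = chain_joint()

# Unconditionally, Z depends on X: these two numbers differ.
z_given_x1 = prob(joint, z=1, x=1) / prob(joint, x=1)
z_given_x0 = prob(joint, z=1, x=0) / prob(joint, x=0)

# Conditional on Y (the Markov property), X carries no further information about Z.
z_given_x1_y1 = prob(joint, z=1, x=1, y=1) / prob(joint, x=1, y=1)
z_given_x0_y1 = prob(joint, z=1, x=0, y=1) / prob(joint, x=0, y=1)

print(z_given_x1, z_given_x0)        # differ
print(z_given_x1_y1, z_given_x0_y1)  # equal up to floating-point rounding
```

The same three dictionaries would have to be replaced by a different set of units, for instance P(y), P(x|y), and P(z|y), to represent Figure 2c.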
Once all the individual probability distributions are specified, Bayesian inference can be used to make inferences about any variable in the network given any set of other variables, for example, the probability that y=3.5 given that x=-0.7 and z=1.1. CBNs also support inferring what would happen if one could intervene and set a node to a particular value. Being able to predict the result of an intervention allows an agent to choose the action that produces the most desired outcome.
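The difference between inferring from an observation and predicting the result of an intervention can be sketched for a binary chain X→Y→Z. The parameters and function names below are hypothetical, and inference is done by brute-force enumeration rather than by any particular algorithm from the CBN literature. Observing y=1 raises the probability of its cause X, whereas setting y=1 by intervention cuts the X→Y link and leaves X at its base rate.

```python
from itertools import product

# Hypothetical parameters for a binary chain X -> Y -> Z; illustrative only.
P_X1 = 0.3
P_Y1_GIVEN_X = {1: 0.8, 0: 0.1}
P_Z1_GIVEN_Y = {1: 0.9, 0: 0.2}

X, Y, Z = 0, 1, 2  # positions of the variables inside a state tuple


def bern(p_one, value):
    """Probability that a binary variable equals `value` when P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


def joint(do_y=None):
    """Joint over (x, y, z). If `do_y` is given, Y is set by intervention:
    the X -> Y link is ignored and Y equals `do_y` with probability 1."""
    table = {}
    for x, y, z in product((0, 1), repeat=3):
        if do_y is None:
            p_y = bern(P_Y1_GIVEN_X[x], y)
        else:
            p_y = 1.0 if y == do_y else 0.0
        table[(x, y, z)] = bern(P_X1, x) * p_y * bern(P_Z1_GIVEN_Y[y], z)
    return table


def conditional(table, target, given):
    """P(target | given); both arguments map a variable position to a value."""
    def matches(state, cond):
        return all(state[i] == v for i, v in cond.items())

    denominator = sum(p for s, p in table.items() if matches(s, given))
    numerator = sum(p for s, p in table.items() if matches(s, given) and matches(s, target))
    return numerator / denominator


p_x_seeing_y = conditional(joint(), {X: 1}, {Y: 1})        # diagnostic inference
p_x_doing_y = conditional(joint(do_y=1), {X: 1}, {Y: 1})   # stays at the base rate
p_z_doing_y = conditional(joint(do_y=1), {Z: 1}, {Y: 1})   # effects still respond

print(p_x_seeing_y, p_x_doing_y, p_z_doing_y)
```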
CBNs have been tremendously influential across a wide range of fields including computer science, statistics, engineering, epidemiology, management sciences, and philosophy (Pearl, 2000; Spirtes, Glymour, & Scheines, 2000). The CBN framework is extremely flexible and supports many different sorts of tasks. CBNs can be used to make precise predictions (including confidence intervals), and they can incorporate background knowledge or uncertainty (e.g., uncertainty in the structure, or uncertainty in the strengths of the causal relations) for sensitivity analysis. They can be extended to handle processes that occur over time. And since CBNs are an extension of probability theory, they can incorporate any probability distribution (logistic, multinomial, Gaussian, exponential). In sum, the CBN framework is an extremely flexible way to represent and reason about probabilistic causal relations.
1.2 What is the Causal Bayesian Network Theory of Learning and Reasoning?
Most generally, the causal model theory of human learning and reasoning is that humans learn and reason about causal relations in ways that are similar to formal CBNs. This theory is part of a broader movement in psychology of using probabilistic Bayesian models as models of higher-level cognition.¹ The broader movement of using probabilistic models as models of higher-level cognition is typically viewed at Marr’s computational level of analysis: identifying the problem to be solved. Indeed, articles appealing to causal networks have fulfilled the promise of a computational-level model; for example, they have reframed the problem of human causal reasoning by clarifying the distinction between causal strength vs. structure (Griffiths & Tenenbaum, 2005), and by identifying causal structure learning as a goal unto itself (Gopnik et al., 2004; Steyvers, Tenenbaum, Wagenmakers, & Blum, 2003).
¹ Here “model” is serving two purposes. First, probabilistic Bayesian models are intended to be objective models of how the world works (e.g., Figure 1 is an objective model of cardiovascular disease). The second sense of model, as used by the psychologist, is that the same probabilistic model could also serve as a model of human reasoning, treating Figure 1 as a representation of how a doctor thinks about cardiovascular disease.
Though the flexibility of the CBN framework is obviously a tremendous advantage for its utility, the flexibility makes it challenging to specify a constrained descriptive theory of human learning and reasoning. The theoretical underpinning of the CBN framework (e.g., learning algorithms, inference algorithms) is an active area of research rather than a static theory. Additionally, there are many different instantiations of how to apply the framework in a specific instance (e.g., alternative learning algorithms, alternative parameterizations of a model).
Because of the flexibility and multifaceted nature of the CBN framework, it is not particularly useful to talk about the CBN framework as a whole. Instead, in the current chapter I focus on the fit between specific aspects of the framework and human reasoning, within a specific task.
2 Learning
2.1 Learning Causal Structure
2.1.1 Learning a Causal Structure from Observations
One of the most dramatic ways that the CBN framework has changed the field of human causal reasoning is by identifying causal structure learning as a primary goal for human reasoning (Steyvers et al., 2003). A fundamental principle of learning causal structure from observation is that it is often not possible to identify the exact causal structure; two or more structures may explain the data equally well. This is essentially a more sophisticated version of “correlation does not imply causation.”
Consider the 9 observations in Table 1, which summarizes the contingency between two variables, X and Y. The correlation between these two variables is .79. Just knowing that X and Y are correlated cannot tell us whether X causes Y [X→Y] or whether Y causes X [X←Y]. (Technically it is also possible that a third factor causes both X and Y, but I ignore this option for simplicity of explanation.) One simple way to see this is that under both of these causal structures, [X→Y] and [X←Y], we would expect X and Y to be correlated, so the fact that they are correlated cannot tell us anything about the causal structure. A more sophisticated way to understand how it is impossible to determine the true causal structure just from observing the data in Table 1 is to see how parameters can be created for both causal structures that fit the data equally well. This means that the structures are equally likely to have produced the data.
Table 1 also shows parameters for each causal structure that fit the data perfectly. For example, for [X→Y], we need to find three parameters to specify the structure. The base rate of X, P(x=1), can be obtained by calculating the percentage of times that X=1 regardless of Y, ((3+1)/9). P(y=1|x=1) is simply the percent of times that Y=1 given that X=1, and P(y=1|x=0) is the percent of times that Y=1 given that X=0. Parameters can be deduced for X←Y in a similar fashion. If we simulated a large number of observations that we would expect to see from each causal structure with the parameters specified in Table 1, we would find that both structures would produce data with proportions that look similar to the data in Table 1. Specifically, we would observe trials in which both variables are 1 about 3 out of 9 times, trials in which both variables are 0 about 5 out of 9 times, and trials in which X=1 and Y=0 about 1 in 9 times in the long run. Because we were able to find parameters for these structures that produce data very similar to the data we observed, these two structures are equally likely given the observed data.
Table 1: Sample Data for Two Variables
X | Y | Number of Observations
1 | 1 | 3
1 | 0 | 1
0 | 1 | 0
0 | 0 | 5
Parameters that fit the data perfectly for each causal structure:
X→Y: P(x=1)=4/9, P(y=1|x=1)=3/4, P(y=1|x=0)=0
X←Y: P(y=1)=3/9, P(x=1|y=1)=1, P(x=1|y=0)=1/6
The same logic also applies with more variables. Consider the data in Table 2 with three variables. If you ran a correlation between each pair of variables you would find that X and Y are correlated (r=.25), Y and Z are correlated (r=.65), and X and Z are correlated (r=.17) but are independent (r=0) once Y is controlled for. According to the Markov and faithfulness assumptions, this pattern of dependencies and conditional independencies is consistent with three and only three causal structures: X→Y→Z, X←Y←Z, and X←Y→Z.
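The pattern of dependencies just described can be checked directly from the observation counts that Table 2 reports. The following is a minimal sketch; the function names are hypothetical, and the first-order partial correlation formula is used here as one common way of "controlling for" Y, which the chapter does not specify. Computed this way, the pairwise correlations come out at roughly .26, .65, and .17, and the correlation between X and Z vanishes once Y is controlled for.

```python
from math import sqrt

# Observation counts from Table 2, keyed by the state (x, y, z).
TABLE_2 = {
    (1, 1, 1): 6, (1, 1, 0): 3, (1, 0, 1): 0, (1, 0, 0): 3,
    (0, 1, 1): 4, (0, 1, 0): 2, (0, 0, 1): 0, (0, 0, 0): 6,
}
X, Y, Z = 0, 1, 2  # positions inside a state tuple


def pearson(counts, i, j):
    """Pearson correlation between variables i and j, weighting states by their counts."""
    n = sum(counts.values())
    mean_i = sum(s[i] * c for s, c in counts.items()) / n
    mean_j = sum(s[j] * c for s, c in counts.items()) / n
    cov = sum((s[i] - mean_i) * (s[j] - mean_j) * c for s, c in counts.items()) / n
    var_i = sum((s[i] - mean_i) ** 2 * c for s, c in counts.items()) / n
    var_j = sum((s[j] - mean_j) ** 2 * c for s, c in counts.items()) / n
    return cov / sqrt(var_i * var_j)


def partial_corr(counts, i, j, k):
    """Correlation between i and j after controlling for k (first-order partial correlation)."""
    r_ij, r_ik, r_jk = pearson(counts, i, j), pearson(counts, i, k), pearson(counts, j, k)
    return (r_ij - r_ik * r_jk) / sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


r_xy = pearson(TABLE_2, X, Y)
r_yz = pearson(TABLE_2, Y, Z)
r_xz = pearson(TABLE_2, X, Z)
r_xz_given_y = partial_corr(TABLE_2, X, Z, Y)

print(round(r_xy, 3), round(r_yz, 3), round(r_xz, 3), round(r_xz_given_y, 3))
```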
Table 2 shows parameters for each of these causal structures that fit the data perfectly. If we sampled a large amount of data from any of the three structures in Table 2 with the associated parameters, the proportions of the 8 types of observations of X, Y, and Z would be very similar to the proportions of the number of observations in Table 2. These three causal structures are said to form a Markov equivalence class because they are all equally consistent with (or likely to produce) the set of conditional and unconditional dependencies in the observed data. Thus, it is impossible to know which of these three structures produced the set of data in Table 2.
Table 2: Sample Data for Three Variables
X | Y | Z | Number of Observations
1 | 1 | 1 | 6
1 | 1 | 0 | 3
1 | 0 | 1 | 0
1 | 0 | 0 | 3
0 | 1 | 1 | 4
0 | 1 | 0 | 2
0 | 0 | 1 | 0
0 | 0 | 0 | 6
Parameters that fit the data perfectly for each causal structure:
X→Y→Z: P(x=1)=1/2, P(y=1|x=1)=3/4, P(y=1|x=0)=1/2, P(z=1|y=1)=2/3, P(z=1|y=0)=0
X←Y←Z: P(z=1)=5/12, P(y=1|z=1)=1, P(y=1|z=0)=5/14, P(x=1|y=1)=3/5, P(x=1|y=0)=1/3
X←Y→Z: P(y=1)=5/8, P(x=1|y=1)=3/5, P(x=1|y=0)=1/3, P(z=1|y=1)=2/3, P(z=1|y=0)=0
Importantly, non-Markov-equivalent causal structures can be distinguished from one another with observations. For example, common effect structures such as X→Y←Z are in their own Markov equivalence class, so they can be uniquely identified. According to the Markov and faithfulness assumptions for [X→Y←Z], X and Y are dependent, and Z and Y are dependent, but X and Z are independent. There are no other three-variable structures with this particular set of conditional and unconditional dependencies. This means that even though X→Y→Z, X←Y←Z, and X←Y→Z are all equally likely to produce the data in Table 2, X→Y←Z is much less likely. Suppose we tried to find parameters for X→Y←Z to fit the data in Table 2. It would be possible to choose parameters such that Y and X are correlated roughly around r=.25, and that Y and Z are correlated roughly around r=.65, matching the data in Table 2 fairly closely. But critically, we would find that no matter what parameters we chose, X and Z would always be uncorrelated, and thus it would be very unlikely that the data from Table 2 would come from X→Y←Z.
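One way to see this comparison numerically is to fit each structure to the Table 2 counts by maximum likelihood and compare the resulting log-likelihoods. The sketch below is a simplification under stated assumptions: it uses maximized likelihoods rather than a full Bayesian comparison over parameters, and the names `STRUCTURES`, `count`, and `max_log_likelihood` are hypothetical. The three Markov-equivalent structures obtain identical values, while the common effect structure fits worse because it cannot reproduce the correlation between X and Z.

```python
from math import log

# Observation counts from Table 2, keyed by the state (x, y, z).
TABLE_2 = {
    (1, 1, 1): 6, (1, 1, 0): 3, (1, 0, 1): 0, (1, 0, 0): 3,
    (0, 1, 1): 4, (0, 1, 0): 2, (0, 0, 1): 0, (0, 0, 0): 6,
}
NAMES = ("X", "Y", "Z")

# Each structure maps a node to the tuple of its direct causes.
STRUCTURES = {
    "X->Y->Z": {"X": (), "Y": ("X",), "Z": ("Y",)},
    "X<-Y<-Z": {"Z": (), "Y": ("Z",), "X": ("Y",)},
    "X<-Y->Z": {"Y": (), "X": ("Y",), "Z": ("Y",)},
    "X->Y<-Z": {"X": (), "Z": (), "Y": ("X", "Z")},
}


def count(counts, assignment):
    """Number of observations consistent with {variable name: value}."""
    return sum(
        c for state, c in counts.items()
        if all(state[NAMES.index(name)] == value for name, value in assignment.items())
    )


def max_log_likelihood(counts, parents):
    """Log-likelihood of the data under maximum-likelihood parameters,
    i.e. each conditional probability is set to the observed relative frequency."""
    total = 0.0
    for state, c in counts.items():
        if c == 0:
            continue  # unobserved states contribute nothing
        for node, node_parents in parents.items():
            parent_values = {p: state[NAMES.index(p)] for p in node_parents}
            node_and_parents = dict(parent_values)
            node_and_parents[node] = state[NAMES.index(node)]
            total += c * log(count(counts, node_and_parents) / count(counts, parent_values))
    return total


LOG_LIK = {name: max_log_likelihood(TABLE_2, graph) for name, graph in STRUCTURES.items()}
for name, value in LOG_LIK.items():
    print(name, round(value, 3))
```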
In sum, by examining the dependencies between variables it is possible to identify which types of structures are more or less likely to have produced the observed data. Structures within the same Markov equivalence class always have the exact same likelihood of producing a particular set of data, which means that they cannot be distinguished, but structures from different Markov equivalence classes have different likelihoods of producing a particular set of data.
Steyvers et al. (2003) conducted a set of experiments to test whether people understand Markov equivalence classes and could learn the structure from purely observational data. First, they found that given a particular set of data, participants were above chance at detecting the correct Markov class. Furthermore, people seem to be fairly good at understanding that observations cannot distinguish X→Y from X←Y. And people also seem to understand to some extent that common effect structures X→Y←Z belong to their own Markov equivalence class.
However, Steyvers et al.’s participants were not good at distinguishing chain and common cause structures even when they were from different equivalence classes (e.g., X→Y→Z vs. X→Z→Y). Distinguishing these structures was made more difficult in the experiment because the most common type of observation for all these structures was for the three variables to have the same state. Still, the participants did not appear to use the trials when two of the variables share a state different from a third to discriminate causal structures (e.g., the observation X=Y≠Z is more consistent with X→Y→Z than X→Z→Y).
Given that Markov equivalence class is so important for theories of causal structure learning from observation, it is surprising that there is not more work on how well lay people understand Markov equivalence. One important future direction would be to give participants a set of learning data that unambiguously identifies a particular Markov equivalence class, and test the percent of participants who 1) identify the correct class, 2) identify all the structures in the Markov class, and 3) include incorrect structures outside the class. Such an experiment would help clarify how good or bad people are at learning causal structure from observations. Additionally, Steyvers et al. used categorical variables with a large number of categories and nearly deterministic causal relations, which likely facilitated accurate learning because it was very unlikely for two variables to have the same value unless they are causally related. It would be informative to examine how well people understand Markov equivalence classes with binary or Gaussian variables, which will likely be harder. Another question raised by this article is to what extent heuristic strategies may be able to explain the psychological processes involved in this inference. In the studies by Steyvers et al. there are some simple rules that can distinguish the Markov equivalence classes fairly successfully. For example, upon observing a trial in which X=Y=Z, X←Y→Z is much more likely than X→Y←Z, but upon observing a trial in which X≠Y=Z, the likelihoods flip. But in other types of parameterizations such as noisy binary data or Gaussian data, this discrimination would not be so easy.
Even though Markov equivalence is a core feature of causal structure learning from observations, as far as I know this study by Steyvers et al. is the only study to test how well people learn causal structures purely from the correlations between the variables. There are a number of other studies that have investigated other observational cues to causality. For example, a number of studies have found that if X occurs followed by Y, people quickly and robustly use this temporal order or delay cue to infer that X causes Y (Lagnado & Sloman, 2006; McCormack, Frosch, Patrick, & Lagnado, 2015). This inference occurs despite the fact that the temporal order may not necessarily represent the order in which these variables actually occurred, but instead might reflect the order in which they become available for the subject to observe them. Another cue that people use to infer causal direction is their beliefs about necessity and sufficiency. If a learner believes that all causes are sufficient to produce their effects, that whenever a cause is present, its effects will be present, then observing that X=1 and Y=0 implies that X is not a cause of Y, otherwise Y would be 1 (Mayrhofer & Waldmann, 2011). In Section 2.1.4 I will discuss one other way that people learn causal direction from observation. But the general point of all of these studies is that when these other cues to causality are pitted against pure correlations between the variables, people tend to use these other cues to causality (see Lagnado, Waldmann, Hagmayer, & Sloman, 2007 for a summary).
In sum, though there is some evidence that people do understand Markov equivalence class to some extent, this understanding appears limited. Furthermore, there are not many studies on how well people learn causal structures in a bottom-up fashion purely from correlational data when covariation is the only cue available. In contrast, what is clear is that when other strategies are available, people tend to use them instead of inferring causal structure purely from the dependencies and conditional independencies.
2.1.2 Learning Causal Structure from Interventions: Choosing Interventions
Another core principle underlying causal structure learning is that interventions (manipulations) have the capability to discriminate causal structures that would not be distinguishable from observation; this is the same reason why experiments are more useful than observational studies. Going back to the example in Figure 1, if we were trying to figure out the causal relations between diabetes (D), statin use (S), and systolic blood pressure (BP), observational data would only be able to narrow the possibilities down to three structures: D→S→BP, D←S←BP, and D←S→BP. However, if we could do a randomized experiment such that half the patients take a statin and the other half do not, we could infer the causal structure. If D→S→BP is the true causal structure, then the patients who take a statin would have lower BP than those who do not, but there would not be any difference in D across the two groups. In contrast, if D←S←BP is the true causal structure, there would be a difference in D across the two groups but there would not be a difference in BP across the two groups. Finally, if D←S→BP is the true causal structure, there would be a difference in both D and BP across the two groups. The rest of this section will explain in more detail how interventions can be used to precisely identify a causal structure, and how humans use interventions.
The language of causal structure diagrams has a simple notation to represent interventions. When an intervention sets the state of a variable, all other variables that would otherwise be causes of the manipulated variable are no longer causes, so those links get removed. For example, when the patients in our example are randomly assigned to take a statin, even though normally having diabetes is a cause of taking a statin, now because of the random assignment diabetes is no longer a cause of taking a statin.
More generally, the reason why interventions can make causal structures that are in the same Markov equivalence class distinguishable is that the intervention changes the causal structure. For this reason, interventions are sometimes called “graph surgery” (Pearl, 2000). Figure 3a-c shows three causal structures that are not differentiable from observation. Figure 3d-f and g-i show the same three causal structures under either an intervention on Y or an intervention on X; the i nodes represent the intervention. (The intervention on Y is analogous to the previous example of the randomized experiment about taking a statin; it could be useful to compare these two examples for generality.)
Under the intervention on Y all three causal structures now have different dependence relations. In Graph D, Z and Y would still be correlated, but neither would be correlated with X. In Graph E, X and Y would be correlated, but neither would be correlated with Z. And in Graph F, all three variables would be correlated, but X and Z would become uncorrelated conditional on Y. In sum, interventions on Y change the causal structure such that the resulting structures no longer fall within the same Markov equivalence class, so they can be discriminated. In contrast, an intervention on X can discriminate Graph G from H, but cannot discriminate Graph H from I. This means that an intervention on X does not provide as much information for discriminating these three causal structures as does an intervention on Y.
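Graph surgery lends itself to a short sketch. The code below assumes that the three structures are the chain, the reversed chain, and the common cause over X, Y, and Z; it represents each as a set of (cause, effect) edges, removes the edges pointing into the intervened-upon node, and lists which pairs of variables should remain unconditionally correlated. It relies on the standard consequence of the Markov and faithfulness assumptions that two variables are unconditionally dependent exactly when they share an ancestor (counting a variable as its own ancestor). The function names are hypothetical, and the intervention node itself is not modeled.

```python
NODES = ("X", "Y", "Z")

# The three observationally equivalent structures, as sets of (cause, effect) edges.
HYPOTHESES = {
    "X->Y->Z": {("X", "Y"), ("Y", "Z")},
    "X<-Y<-Z": {("Z", "Y"), ("Y", "X")},
    "X<-Y->Z": {("Y", "X"), ("Y", "Z")},
}


def intervene(edges, node):
    """Graph surgery: drop every edge pointing into the intervened-upon node."""
    return {(cause, effect) for cause, effect in edges if effect != node}


def ancestors(edges, node):
    """The node itself plus every direct or indirect cause of it."""
    found = {node}
    frontier = [node]
    while frontier:
        current = frontier.pop()
        for cause, effect in edges:
            if effect == current and cause not in found:
                found.add(cause)
                frontier.append(cause)
    return found


def dependent_pairs(edges):
    """Pairs expected to be unconditionally correlated: those sharing an ancestor."""
    pairs = set()
    for index, first in enumerate(NODES):
        for second in NODES[index + 1:]:
            if ancestors(edges, first) & ancestors(edges, second):
                pairs.add((first, second))
    return frozenset(pairs)


def signatures(target=None):
    """Dependence pattern of each hypothesis, optionally after intervening on `target`."""
    return {
        name: dependent_pairs(edges if target is None else intervene(edges, target))
        for name, edges in HYPOTHESES.items()
    }


for target in (None, "Y", "X"):
    distinct = len(set(signatures(target).values()))
    print(target, "->", distinct, "distinct dependence pattern(s)")
```

Under these assumptions the intervention on Y yields three distinct patterns, whereas the intervention on X yields only two, which matches the conclusion that intervening on Y is the more informative choice here. Conditional independencies are deliberately left out of the signature to keep the sketch short.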
Figure 3: Three Causal Structures with Different Types of Interventions.
Do people choose interventions that maximize “information gain,” the ability to discriminate between multiple possible structures? Before getting to the evidence, it is useful to consider an alternative strategy for choosing interventions to learn about causal structure aside from maximizing information gain: selecting interventions that have the largest influence on other variables. For example, consider again the three structures in the No Interventions row in Figure 3. In Graph A, X influences two variables directly or indirectly, and in Graphs B and C X does not influence either other variable, for a total “centrality” rating of 2. Z has the same centrality rating, 2. Y, in contrast, influences Z in Graph A, X in Graph B, and X and Z in Graph C, for a total centrality rating of 4. In sum, looking across all three possible structures Y is more “central” or more of a “root cause.” If a learner chooses to intervene to maximize the amount of changes in other variables they will tend to intervene on Y instead of X or Z.
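The centrality rating described here amounts to counting, for each variable, how many other variables it influences directly or indirectly, summed over the candidate structures. A minimal sketch follows, assuming Graphs A, B, and C are the chain X→Y→Z, the reversed chain X←Y←Z, and the common cause X←Y→Z; the function and variable names are hypothetical.

```python
# Candidate structures as sets of (cause, effect) edges (assumed layout of Graphs A-C).
HYPOTHESES = {
    "A": {("X", "Y"), ("Y", "Z")},   # X -> Y -> Z
    "B": {("Z", "Y"), ("Y", "X")},   # X <- Y <- Z
    "C": {("Y", "X"), ("Y", "Z")},   # X <- Y -> Z
}


def descendants(edges, node):
    """Every variable that `node` influences directly or indirectly."""
    found = set()
    frontier = [node]
    while frontier:
        current = frontier.pop()
        for cause, effect in edges:
            if cause == current and effect not in found:
                found.add(effect)
                frontier.append(effect)
    return found


def centrality(node):
    """Number of variables influenced by `node`, summed over all candidate structures."""
    return sum(len(descendants(edges, node)) for edges in HYPOTHESES.values())


CENTRALITY = {node: centrality(node) for node in ("X", "Y", "Z")}
print(CENTRALITY)
```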
Sometimes, as in the example in Figure 3, the information gain strategy and the root cause strategy produce the same interventions; Y is the most central variable and interventions on Y help discriminate the three structures the most. However, sometimes the two strategies lead to different interventions. For example, when trying to figure out whether [X→Z→Y] or [X→Y→Z] is the true structure, X has the highest centrality rating. However, intervening on X would not discriminate the structures well (low information gain) because for both structures it would tend to produce data in which X=Y=Z. Intervening on Y or Z would more effectively discriminate the structures. For example, intervening on Y would tend to produce data in which X=Z≠Y for [X→Z→Y] but would tend to produce data in which X≠Y=Z for [X→Y→Z], effectively discriminating the two structures. The root cause strategy, intervening on X, can be viewed as a type of positive or confirmatory testing strategy in the sense that it confirms the hypothesis that X has some influence on Y and Z, but does not actually help discriminate between the remaining hypotheses.
Coenen et al. (2015) tested whether people use these two strategies and found that most people use a mixture of both, though some appear to use mainly one or the other. In another experiment Coenen et al. tested whether people can shift towards primarily using the information gain strategy if they are first trained on scenarios for which the root cause positive testing strategy was very poor at discriminating the causal structures. Even without feedback, over time participants switched more towards using information gain. They also tended to use the root cause strategy more when answering faster. In sum, root cause positive testing is a heuristic that sometimes coincides with information gain, and it appears that people sometimes can overcome the heuristic when it is especially unhelpful.
[Figure 3 panels: No Interventions (a, b, c); Intervention on Y (d, e, f); Intervention on X (g, h, i). Each panel shows the nodes X, Y, and Z; an i node marks the intervened-upon variable.]
Though Steyvers et al. (2003) did not describe it in the same way as Coenen et al. (2015), they actually have evidence for a fairly similar phenomenon to the positive test strategy. They found that people tended to intervene on root causes more than would be expected by purely using information gain. In their study, participants first saw 10 observational learning trials, and then chose the causal structure that they thought was most plausible, for example [X→Y→Z]. Technically, since the data were observational, they could not distinguish models within the same Markov equivalence class at this stage. Next they selected one intervention on X, Y, or Z and received 10 more trials in which that same intervention was repeated. Steyvers et al. found that their participants tended to select root cause variables to intervene upon. If they thought that the chain [X→Y→Z] structure was most plausible, they most frequently intervened on X, then Y, and then Z. This pattern fits better with the root cause heuristic than the information gain strategy, which suggests intervening on Y.
Because this finding cannot be explained by information gain alone, Steyvers et al. created two additional models; here I only discuss Rational Test Model 2. This model makes two additional assumptions. First, it assumes that participants had a distorted set of hypotheses about the possible causal structure. Normatively their hypothesis space for the possible causal structures should have been the Markov equivalence class; if they selected [X→Y→Z] as the most likely structure after the 10 observational trials, they should have also viewed [X←Y←Z] and [X←Y→Z] as equally plausible, and the goal should have been to try to discriminate among these three. However, this model assumes that instead of trying to discriminate between the three Markov equivalent structures, participants were trying to discriminate between [X→Y→Z], [X→Y;Z], [X;Y→Z], and [X;Y;Z]; the latter three are the subset of the chain structures that include the same causal links or fewer links. I call this assumption the “alternate hypothesis space assumption” in that the set of possible structures (hypothesis space) is being changed.
Under this new hypothesis space, when X is intervened upon, each of these four structures would produce different patterns of data, making it possible to determine which of these four structures is most likely; see Table 3. This means that X has high information gain. In contrast, when Y or Z is manipulated, some of the structures produce the same patterns of data, meaning that they provide less information gain.
Table 3: Most Common Pattern of Data Produced by an Intervention on a Particular Node for a Particular Structure
Columns 2-4 (left): Alternate Hypothesis Space Assumption. Columns 5-7 (right): Alternate Hypothesis Space Assumption & Only Attending to Variables with Same State as Manipulated Variable.
Structure (Hypothesis Space) | Intervention on X | Intervention on Y | Intervention on Z | Intervention on X | Intervention on Y | Intervention on Z
X→Y→Z | X=Y=Z | X≠Y=Z | X=Y≠Z | X=Y=Z | Y=Z | Z
X→Y;Z | X=Y≠Z | X≠Y≠Z | X=Y≠Z | X=Y | Y | Z
X;Y→Z | X≠Y=Z | X≠Y=Z | X≠Y≠Z | X | Y=Z | Z
X;Y;Z | X≠Y≠Z | X≠Y≠Z | X≠Y≠Z | X | Y | Z
How Informative This Intervention Is | High | Medium | Medium | High | Medium | Low
Steyvers et al. introduce another assumption as well; they assume that people only attend to variables that take on the same state as the intervened-upon variable and ignore any variables that have a different state from the manipulated variable. The combination of the two assumptions is detailed in the right column of Table 3. When X is intervened upon it would produce three different patterns of data for the four causal structures, which means that it has fairly high information gain; it can distinguish all but the bottom two structures in Table 3. An intervention on X can no longer distinguish between the bottom two structures because, under this assumption, the other two variables, Y and Z, do not equal X and are therefore ignored.
When Y is intervened upon, it can narrow down the space of 4 structures to 2, a medium amount of information gain. When Z is intervened upon all the structures produce the same pattern of data, so an intervention on Z does not help at all to identify the true structure. In sum, the combination of these two assumptions now makes it such that intervening on X is more informative than intervening on Y, which is more informative than intervening on Z. This pattern matches the frequency of participants’ interventions, which were most frequently on X, then on Y, and lastly on Z.
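The informativeness ratings under both assumptions can be reproduced with a short sketch. It rests on a simplification that I am assuming for illustration: with many-valued variables and nearly deterministic links, variables that remain connected after graph surgery share a state, and unconnected variables differ. A data pattern is then just the grouping of variables into connected components, and an intervention is more informative the more distinct patterns it produces across the four hypothesized structures. The function names are hypothetical; this is not the model code of Steyvers et al.

```python
NODES = ("X", "Y", "Z")

# The alternate hypothesis space: the chain and its sub-structures.
HYPOTHESES = {
    "X->Y->Z": {("X", "Y"), ("Y", "Z")},
    "X->Y;Z": {("X", "Y")},
    "X;Y->Z": {("Y", "Z")},
    "X;Y;Z": set(),
}


def intervene(edges, node):
    """Graph surgery: drop every edge pointing into the intervened-upon node."""
    return {(cause, effect) for cause, effect in edges if effect != node}


def components(edges):
    """Groups of variables that stay linked and are therefore assumed to share a state."""
    group = {node: {node} for node in NODES}
    for cause, effect in edges:
        merged = group[cause] | group[effect]
        for node in merged:
            group[node] = merged
    return frozenset(frozenset(members) for members in group.values())


def pattern(edges, target, attend_only_to_target=False):
    """Most common data pattern after intervening on `target`.
    With `attend_only_to_target`, only variables matching the manipulated variable count."""
    groups = components(intervene(edges, target))
    if attend_only_to_target:
        return next(members for members in groups if target in members)
    return groups


def distinct_patterns(target, attend_only_to_target=False):
    """How many different patterns the four hypotheses produce for this intervention."""
    return len({pattern(edges, target, attend_only_to_target) for edges in HYPOTHESES.values()})


FULL = {node: distinct_patterns(node) for node in NODES}
ATTENDED = {node: distinct_patterns(node, attend_only_to_target=True) for node in NODES}
print(FULL)      # alternate hypothesis space assumption only
print(ATTENDED)  # plus attending only to variables matching the manipulated one
```

Under these assumptions, four distinct patterns correspond to "High", two or three to an intermediate rating, and one to "Low", so the ordering X, then Y, then Z emerges only when both assumptions are combined.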
There are three key points made by this analysis of the similarities between the findings of Coenen et al. and Steyvers et al. First, even though they approach the results from different perspectives and talk about the results in different ways, they both found that people tended to intervene on root causes. Second, even though Steyvers et al.’s model has rational elements to it, the resulting model is not very close to the ideal model, for which Y is the most informative intervention. Finally, by comparing different models with different assumptions it can be seen how the two assumptions made by Steyvers et al. effectively amount to the positive test strategy put forth by Coenen et al. Restated, the same behavioral pattern of intervening primarily on root causes could be explained in more than one way.
Bramley et al. (2015) also studied a number of important factors related to learning from interventions. Overall, they found that humans were highly effective causal learners, able to select and make use of interventions for narrowing down the number of possible structures. One particular factor they introduced was the possibility of intervening on two variables simultaneously rather than just one. Double interventions are particularly helpful to distinguish between [X→Y→Z and X→Z] vs. [X→Y→Z]. With a single intervention, these two structures are likely to produce very similar outcomes. For example, an intervention on X is likely to produce data in which X=Y=Z, an intervention on Y is likely to produce data in which X≠Y=Z, and an intervention on Z is likely to produce data in which X=Y≠Z.

However, consider a double intervention setting X=1 and Y=0. Under the simple chain [X→Y→Z], Z is most likely to be 0, but in the more complex structure in which Z is influenced by both X and Y, Z has a higher chance of being 1 because X=1. Bramley et al. found that distinguishing between these two types of structures was the hardest discrimination in this study and produced the most errors. Sixty-one of the subjects rarely used double interventions, whereas 49 were more likely to use them, suggesting that people may not use double interventions frequently enough for causal learning.
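A minimal numerical sketch of why the double intervention is diagnostic follows. The link strength, the background rate, and the use of a noisy-OR combination are hypothetical choices made only for illustration; they are not parameters reported by Bramley et al.

```python
def noisy_or(strengths, background=0.0):
    """P(effect = 1) when each active generative cause independently
    produces the effect with its own strength (noisy-OR combination)."""
    p_all_fail = 1.0 - background
    for s in strengths:
        p_all_fail *= 1.0 - s
    return 1.0 - p_all_fail

STRENGTH = 0.8      # hypothetical strength of every causal link
BACKGROUND = 0.1    # hypothetical chance that Z turns on by itself

def p_z_given_do(x, y, direct_x_to_z):
    """P(Z = 1 | do(X = x, Y = y)); Y is set directly, so X -> Y plays no role."""
    active = [STRENGTH] if y == 1 else []
    if direct_x_to_z and x == 1:
        active.append(STRENGTH)
    return noisy_or(active, BACKGROUND)

chain = p_z_given_do(1, 0, direct_x_to_z=False)          # [X→Y→Z]
chain_plus_link = p_z_given_do(1, 0, direct_x_to_z=True)  # [X→Y→Z and X→Z]
```

With these illustrative numbers, Z is unlikely under the simple chain and likely under the structure with the extra X→Z link, so observing Z after setting X=1 and Y=0 separates the two hypotheses.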
In sum, there are many open questions about how people choose interventions. There is some evidence that people do use information gain when selecting interventions, but there are also a variety of heuristics (only use single interventions, intervene on root causes) and simplifying assumptions (focus on a limited and distorted hypothesis space, only attend to variables with the same value as the manipulated variable). Clearly there is more work to be done to have a fuller and more robust understanding of how humans choose interventions to learn about causal structures.

2.1.3 Learning Causal Structures from Interventions: Interpreting Interventions and Updating Beliefs about the Structure

The prior section discussed how people choose interventions to learn a causal structure. This section examines how people interpret the outcome after making an intervention. Four patterns have been proposed to explain how people interpret the outcomes of interventions. The first is that if a variable X is manipulated and another variable Z assumes the same state as X, people tend to infer a direct link from X to Z. Though this heuristic makes sense when learning the relations between two variables, it can lead to incorrect inferences in cases involving three or more variables linked together in a chain structure such as X→Y→Z because it can lead people to infer additional links not in the structure. If a learner intervenes on X such that it is 1, and subsequently Y and Z are both 1, this heuristic implies that X→Y and X→Z. Indeed, people often infer that there is an X→Z link above and beyond X→Y→Z, even in cases when there is not a direct link from X to Z (Bramley et al., 2015; Fernbach & Sloman, 2009; Lagnado & Sloman, 2004; Rottman & Keil, 2012). (In reality, the correct way to determine whether there is an X→Z link above and beyond X→Y→Z is to see whether the probability of Z is correlated with the state of X within trials in which Y is 1 or within trials in which Y is 0, or to use double interventions as explained previously.)
This heuristic is problematic for two reasons. At a theoretical level, it suggests that people fail to pay attention to the fact that X and Z are independent conditional on Y. As already discussed, attending to statistical independencies is critical for understanding Markov equivalence classes, and this finding suggests that people do not fully understand the relations between statistical independence and causal Markov equivalence class. At a more applied level, adding this additional link X→Z could lead to incorrect inferences (see section 3.2). In particular, when inferring the likelihood that Z will be present given that Y is present, people tend to think that X has an influence on Z above and beyond Y. In reference to a subset of Figure 1, even though the true causal structure is Ethnicity→Smoking→Cardiovascular Disease, this heuristic could lead doctors to incorrectly predict that people of certain ethnicities are more likely to develop cardiovascular disease even after knowing their smoking status, even though ethnicity has no influence on cardiovascular disease above and beyond smoking (according to Thornley, 2013). Such a misperception could lead people of those ethnicities to feel unnecessarily worried that their ethnicity will cause them to have cardiovascular disease.

The second pattern of reasoning was already discussed in the previous section. Steyvers et al. (2003) proposed that when people intervene on a variable, they only attend to other variables that assume the same state as the manipulated variable. The previous section explained how this tendency would bias reasoners to intervene on root causes (Table 3). But this tendency would also decrease the effectiveness of learning from interventions. If one intervenes on Z and the resulting observation is X=Y≠Z, the fact that X and Y have the same state should increase the likelihood that there is some causal relation between X and Y; however, this heuristic implies that people would not learn anything about X or Y because people only attend to variables with the same state as the intervened-upon variable (Z). In sum, this simplification means that people do not extract as much information from interventions as they could.

The third and fourth habits of updating causal beliefs after interventions come from the study by Bramley et al. (2015). In this study participants made a series of interventions on X, Y, or Z, and after each intervention they drew the causal structure that they believed to be the most plausible structure given the evidence up to that point. The authors discovered two interrelated habits. First, participants updated their drawings of the causal structure slowly. This can be explained as a conservative tendency; people need considerable evidence before adding a causal relation to, or deleting one from, their set of beliefs. The second pattern is that when drawing the causal structures participants were influenced by the most recent intervention and appeared to forget many of the outcomes of prior interventions. The combination of these two habits, conservatism and forgetting, can be explained with an analogy to balancing a checkbook. After each transaction one updates the current balance by adding the most recent transaction to the prior balance, but one does not recalculate the balance from all past transactions after each transaction. Keeping the running balance is a way to simplify the calculation. Likewise, storing a representation of the causal structure as a summary of the past experience allows the learner to get by without remembering all the past experiences; the learner just has to update the prior causal structure representation. In a related vein, Fernbach and Sloman (2009) found that people have a recency bias: they are most influenced by the most recent data, which is similar to forgetfulness. Understanding the interplay between all of these habits will provide insights into how people learn causal structures from interventions in ways that are cognitively tractable.

2.1.4 Learning Temporal Causal Structures
So far this chapter has focused on how people learn about atemporal causal networks in which each observation is assumed to be temporally independent. In the example at the beginning of the chapter about cardiovascular disease, each observation captured the age, sex, smoking status, diabetes status, and other variables of an individual patient. The causal link between smoking and cardiovascular disease, for example, implies that across patients, those who smoke are more likely to have cardiovascular disease.

However, often it is important to understand how variables change over time. For example, a physician treating patients with cardiovascular disease is probably less interested in population-level effects, and instead is more interested in understanding how a change in smoking would influence an individual patient’s risk of developing cardiovascular disease.

Temporal versions of CBNs can be used to represent learning and reasoning about changes over time (Ghahramani, 1998; Murphy, 2002; also see Rehder, this volume b). Temporal CBNs are very similar to standard CBNs, except each variable is represented by a series of nodes, one for each time point t. The causal structure is often assumed to be the same across time, in which case the causal structure is repeated at each time point. Additionally, often variables are assumed to be influenced by their past state; positive autocorrelation means that if the variable was high at time t, it is likely to be high at t+1.
Figure 4: A Temporal CBN
[Figure 4 shows two rows of binary nodes, “Using an anti-hypertensive?” and “High Blood Pressure?”, at ages 70 to 77, annotated with the events “Physician prescribed antihypertensive”, “Stopped taking medicine because of fatigue”, “started exercising”, and “stopped exercising”.]

For example, Figure 4 shows a causal network representing the influence of using an antihypertensive on blood pressure. 1 represents using an antihypertensive or high blood pressure, whereas 0 represents not using an antihypertensive or having normal blood pressure. Instead of just having one node that represents using an antihypertensive and another for blood pressure, now the structure is repeated at each time point. Additionally, the autocorrelation can be seen with the horizontal arrows. All things being equal, if a patient’s blood pressure is high, it will tend to stay high for periods of time. Likewise, if a patient starts using an antihypertensive, they might continue to use it for a while.

Temporal CBNs follow the same rules and conventions as all CBNs. Here, instead of using i nodes to represent interventions, I used text to explain the intervention (e.g., a physician prescribed an antihypertensive). The interventions are the reason that some of the vertical and horizontal arrows are removed in Figure 4, because an intervention modifies the causal structure. The Markov condition still holds in exactly the same way as in atemporal CBNs. For example, a patient’s blood pressure (BP) at age 73 is influenced by his BP at 72, but his BP at 71 does not have an influence on his BP at 73 above and beyond his BP at age 72.

Causal learning from interventions works in essentially the same way in temporal and atemporal causal systems. In Figure 4 it is easy to learn that using an antihypertensive influences blood pressure, not the reverse. When the drug is started, the patient’s BP decreases, and when the drug is stopped, the patient’s BP increases. But when another intervention (e.g., exercising) changes the patient’s blood pressure, it does not have an effect on whether the patient uses the antihypertensive.

One interesting aspect of temporal causal systems is that it is possible to infer the direction of a causal relationship from observations, which is not possible with atemporal systems. Consider the data in Figure 5; the direction of the causal relation is not shown in the figure. There is an asymmetry in the data; sometimes X and Y change together, and sometimes Y changes without X changing, but X never changes without Y changing. Colleagues and I have found that both adults and children notice this asymmetry and use it to infer that X causes Y (Rottman & Keil, 2012; Rottman, Kominsky, & Keil, 2014; Soo & Rottman, 2014). The logic is that Y sometimes changes on its own, implying that whatever caused Y to change did not carry over to X; Y does not influence X. Furthermore, sometimes X and Y change together. Since we already believe that Y does not influence X, one way to explain the simultaneous change in X and Y is that a change in X caused the change in Y (see Footnote 2). This is one way in which human causal learning seems more akin to learning a temporal CBN than an atemporal CBN; the temporal aspect of these data is critical for inferring the causal direction.

Figure 5: Example of Learning Causal Direction from Temporal Data
[Figure 5 shows the binary states of X and Y at Times 0 to 4.]
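The asymmetry logic described above can be written as a short sketch. The function names and the example series are hypothetical (the series is constructed to show the asymmetry and is not a transcription of Figure 5), and the rule is a simplified heuristic reading of the argument rather than a model from the cited studies.

```python
def classify_changes(x, y):
    """Count transitions where both variables change, or only one does."""
    counts = {"both": 0, "x_only": 0, "y_only": 0}
    for t in range(1, len(x)):
        dx, dy = x[t] != x[t - 1], y[t] != y[t - 1]
        if dx and dy:
            counts["both"] += 1
        elif dx:
            counts["x_only"] += 1
        elif dy:
            counts["y_only"] += 1
    return counts

def infer_direction(x, y):
    """Heuristic: the variable that sometimes changes alone is the effect."""
    c = classify_changes(x, y)
    if c["both"] == 0:
        return "no evidence of a link"
    if c["y_only"] > 0 and c["x_only"] == 0:
        return "X causes Y"
    if c["x_only"] > 0 and c["y_only"] == 0:
        return "Y causes X"
    return "ambiguous"

# Hypothetical series constructed to show the asymmetry (not the data in Figure 5).
x = [0, 1, 1, 1, 0]
y = [0, 1, 0, 1, 0]
direction = infer_direction(x, y)
```

Here Y changes on its own twice and X never does, so the heuristic attributes the joint changes to X; swapping the two series reverses the verdict.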
A number of other phenomena fit well into the temporal CBN framework. Consider the data in Figure 6, Observed Data. In this situation, subjects know that C is a potential cause of E, not the reverse, and the goal is to judge the extent to which C influences E. Overall there is actually zero correlation between C and E. The faithfulness assumption states that the only independencies in the data arise through the Markov assumption. If C and E are unconditionally independent, it means that C cannot be a direct cause of E. Instead, one possibility (Possible Structure 1 in Figure 6) is that some unobserved third variable U is entirely responsible for E.

Figure 6: Learning about an Interaction with an Unobserved Factor
[Figure 6 has three panels, each showing binary variables over Times 0 to 7: “Observed Data” (Ct and Et), “Possible Structure 1: Only U is a Cause”, and “Possible Structure 2: Crossover Interaction”; the two structure panels add the unobserved variable Ut.]

However, when faced with data like those in Figure 6, people do not conclude that C is unrelated to E; instead they notice that there are periods of time in which C has a positive influence on E (Times 0-3), and other periods of time in which C has a negative influence on E (Times 4-7). They subsequently tend to infer that C does actually have a strong influence on E, but that there is some unobserved factor that is fairly stable over time, and C and the unobserved factor (U) interact to produce E (Rottman & Ahn, 2011). This explanation is represented in Figure 6, Possible Structure 2. In this structure, both C and U influence E, and there is an arc between the two links, which represents an interaction; in this case the interaction is a perfect crossover such that E is 1 if both C and U are 1 or both are 0. People appear to make this inference about a crossover interaction with an unobserved cause, rather than inferring that C is unrelated to E, because the data are grouped into distinct periods: in some periods there is a positive relation between C and E and in others a negative relation. This allows the reasoner to infer that some unobserved factor U must account for the switch. If the same 8 trials are presented in a randomized order, people tend to infer that only U is a cause of E, not C. This inference again suggests that people tend to represent causal systems as temporally extended (with variables such as U tending to be autocorrelated) rather than atemporal (see Rottman & Ahn, 2009, for another example).

Footnote 2: Technically, the reason why it is possible to learn the direction of the causal relation is the autocorrelation, the belief that Y_t→Y_{t+1} and that X_t→X_{t+1}. Thus, the learner is really discriminating between [Y_t→Y_{t+1}←X_{t+1}←X_t] and [Y_t→Y_{t+1}→X_{t+1}←X_t], which are in different Markov equivalence classes. I thank David Danks for pointing this out.
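To see how a perfect crossover interaction can hide a strong cause behind a zero overall contingency, consider the following sketch. The eight trials are hypothetical values chosen to match the verbal description (a stable U that switches once, E equal to 1 exactly when C and U match); they are not copied from Figure 6, and `delta_p` is an illustrative helper.

```python
def delta_p(cause, effect):
    """P(e=1 | c=1) - P(e=1 | c=0) computed from paired binary observations."""
    with_c = [e for c, e in zip(cause, effect) if c == 1]
    without_c = [e for c, e in zip(cause, effect) if c == 0]
    return sum(with_c) / len(with_c) - sum(without_c) / len(without_c)

# Hypothetical data for Times 0-7; U is the unobserved, temporally stable factor.
C = [0, 1, 0, 1, 0, 1, 0, 1]
U = [1, 1, 1, 1, 0, 0, 0, 0]
# Perfect crossover interaction: E is 1 exactly when C and U match.
E = [1 if c == u else 0 for c, u in zip(C, U)]

overall = delta_p(C, E)          # contingency over all eight trials
early = delta_p(C[:4], E[:4])    # Times 0-3, while U = 1
late = delta_p(C[4:], E[4:])     # Times 4-7, while U = 0
```

The overall contingency is zero, yet within each stable period C perfectly predicts E, positively in the first period and negatively in the second. A learner who tracks the periods separately has grounds to posit the interaction; a learner who sees the same trials shuffled does not.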
Elsewhere, colleagues and I have argued (Rottman et al., 2014) that many of the causal learning phenomena that have been used as evidence that people learn about causal relations in ways akin to CBNs are even better explained by temporal CBNs. For example, one study found that children can learn about bidirectional causal relations in which two variables both cause each other (Schulz, Gopnik, & Glymour, 2007). Bidirectional causal structures can only be represented through temporal, not atemporal, causal networks (Griffiths & Tenenbaum, 2009; Rottman et al., 2014).

In conclusion, there is growing evidence that, at least in certain situations, people appear to be learning something similar to a temporal causal network, and the temporal aspect of reasoning allows them to infer quite sophisticated causal relations that would otherwise be impossible to learn.

2.2 Learning about the Integration Function

Another aspect of a CBN that must be learned in addition to the structure is the integration function: the way that multiple causes combine to influence an effect (also see Griffiths, this volume; Rehder, this volume a and b). For example, in regression, the predictors are typically assumed to combine linearly. The CBN framework allows for the possibility that causes can potentially combine in any conceivable way, and humans are extremely flexible as well. For example, Waldmann (2007) demonstrated that people naturally reason about causes that combine additively (e.g., the effect of taking two medicines is the sum of the two individual effects) and about causes that average (e.g., the taste of two chemicals mixed together is the average of the two). Furthermore, people use background knowledge (e.g., about medicines and taste) to decide which type of integration function is more plausible in a given situation.
Most research on causal learning has focused on binary variables. The most prominent integration function for binary variables, called Noisy-OR, describes situations in which there are multiple generative causes (Cheng, 1997; Pearl, 1988). It stipulates that the probability of the effect being absent is equal to the probability that all the causes simultaneously fail to produce the effect. If there are two causes, each of which produces the effect 50% of the time on its own (a causal strength of .5), then both would fail simultaneously 25% of the time; the effect should occur 75% of the time. If there are three causes, each of which produces the effect 50% of the time, then all three would simultaneously fail .5^3 = 12.5% of the time; the effect would be present 87.5% of the time. An analogous integration function called Noisy-And-Not can be used to describe inhibitory causes that combine in a similar fashion. It is not difficult to imagine other sorts of integration functions, and the following studies have examined how people learn about the integration function from data.

Beckers et al. (2005; see also the chapter in this volume by Boddez, De Houwer, & Beckers) studied how beliefs about the integration function influence learning. In one study participants first learned about two causes, G and H, both of which produce an outcome of 1 on their own. In the “additive” condition they saw that G and H together produce an outcome of 2, which is consistent with an integration function in which two causes add together. In another condition they saw that G and H together produce an outcome of 1. This is inconsistent with the notion that the two causes add together; instead it suggests some sort of “sub-additive” integration function in which the effect can never be higher than 1. Subsequently participants in both conditions experienced a blocking paradigm in which they learned that A by itself produces an outcome of 1, and A plus X produces an outcome of 1. In the sub-additive condition participants still thought that X might be a cause because they believed that the effect could never go higher than 1. In contrast, in the additive condition they concluded that X was not a cause; if it were, then presumably the effect would have been 2.
Lucas and Griffiths (2010) investigated a similar phenomenon: initial training about how causes combine influences whether subjects conclude that a variable is a cause or not. They first presented people with data suggesting either that the causes worked conjunctively (multiple causes needed to be present for the effect to occur) or through the noisy-OR function (a single cause was sometimes sufficient to produce the effect). Afterwards, participants saw a cause D never produce the effect, and saw that two causes in combination, D and F, produced the effect. Participants in the conjunctive condition tended to conclude that both D and F were causes, whereas participants in the noisy-OR condition tended to infer that only F was a cause.
In sum, these results show that people quickly and flexibly learn about how causes combine to produce an effect, and the integration rule that they learn dramatically influences subsequent reasoning about the causal system.

2.3 Learning Causal Strength

So far this chapter has focused on how people learn causal structure, and to a lesser extent integration functions. One other important component of causal relations is causal strength, our internal measurement of how important a cause is. For example, if a medicine works very well to reduce a symptom, it has high causal strength, but if it does not reduce the symptom at all it has zero causal strength.

Prior to the CBN framework, theories of causal strength learning were based on simple measures of the contingency between the cause and effect. For example, the ΔP model computes the strength of the influence of a cause (C) on an effect (E) as the difference in the probability of the effect when the cause is present vs. absent: P(e=1|c=1) - P(e=1|c=0) (Cheng & Novick, 1992; Jenkins & Ward, 1965). This same contrast is calculated at asymptote by one of the most influential models of conditioning as a way to capture how strongly an animal comes to associate a cue with an outcome (Danks, 2003; Rescorla & Wagner, 1972). This same model has also been proposed as a model of causal learning, the idea being that the more strongly a cue is associated with an outcome, the more strongly humans would infer that the cue causes the outcome (Shanks & Dickinson, 1987).
With the introduction of the CBN framework, a number of theories of causal learning were proposed that incorporate different sorts of top-down causal beliefs into the learning process. A number of other chapters discuss causal strength learning, including those by Griffiths; Cheng and Lu; and Perales, Catena, Maldonado, and Cándido. Thus, I briefly discuss the connections between the CBN framework and theories of causal strength learning, while leaving the details to those other chapters.

2.3.1 Elemental Causal Induction: Learning Causal Strength Between Two Variables

One of the most important developments in models of causal strength learning is the Power-PC model (Cheng, 1997). This model builds off the ΔP model by incorporating causal beliefs and assumptions. It assumes that one generative cause combines through the Noisy-OR integration function with another unobserved cause. For example, imagine that the effect E occurs 25% of the time without the observed cause C; P(e=1|c=0)=.25. We can attribute this 25% to some background cause that has a strength of .25. Further, imagine that the observed cause has a strength of 2/3. When the observed cause is present, the effect should occur 75% of the time if C and the background cause combine through a noisy-OR function; P(e=1|c=1)=.75. (The effect would fail with a probability of 1/3 × 3/4 = 1/4.)
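The noisy-OR arithmetic used here and in Section 2.2 can be checked with exact fractions. This is only an illustrative sketch; the helper name `noisy_or` is not from the chapter, and the function covers generative causes only.

```python
from fractions import Fraction

def noisy_or(strengths):
    """Probability of the effect when independent generative causes combine
    by noisy-OR: one minus the probability that every cause fails."""
    p_all_fail = Fraction(1)
    for s in strengths:
        p_all_fail *= 1 - s
    return 1 - p_all_fail

half = Fraction(1, 2)
two_causes = noisy_or([half, half])           # 1 - (1/2)^2
three_causes = noisy_or([half, half, half])   # 1 - (1/2)^3

# Power-PC setup from the text: background strength 1/4, observed cause 2/3.
background, observed = Fraction(1, 4), Fraction(2, 3)
p_e_without_c = noisy_or([background])
p_e_with_c = noisy_or([background, observed])
```

The results reproduce the figures in the text: 3/4 and 7/8 for two and three causes of strength .5, and P(e=1|c=0) = 1/4 and P(e=1|c=1) = 3/4 for the Power-PC example.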
Cheng used this sort of logic, in reverse, to deduce that if an observed cause combines with a background cause through a noisy-OR integration function, the correct way to calculate causal strength involves dividing ΔP by P(e=0|c=0). Consider now the probabilities just presented, without knowing the causal strength: P(e=1|c=1)=.75 and P(e=1|c=0)=.25. According to ΔP, the causal strength is .5; the cause raises the probability of the effect by .5. According to Power-PC, the causal strength of C is (.75-.25)/(1-.25)=.67; the cause produces the effect in two-thirds of the cases in which the effect would otherwise be absent (raising its probability from .25 to .75). In sum, by specifying a set of prior beliefs about the causal relation, Cheng specified how causal strength should be induced given those beliefs.

Another influential development in causal strength learning is the Causal Support model. Griffiths and Tenenbaum (2005) proposed that when people estimate causal strength, what they are actually doing is not judging the magnitude of the influence of the cause on the effect, similar to effect sizes in inferential statistics, but rather judging the extent to which there is evidence that there is any causal relation at all, similar to the function of a p-value in hypothesis testing. At a theoretical level, this model is calculated by determining the relative likelihood that the true causal structure is [C→E←U], in which both C and an unobserved factor U influence E, vs. that the true causal structure is [C; E←U], in which C does not influence E and E is determined by an unobserved factor U. Thus, causal support treats causal strength learning as discriminating between two possible causal structures, one in which C actually is a cause of E, and one in which C is not a cause of E.

Causal Support has a number of behavioral implications, but the most obvious one and the easiest to think about is sample size. Whereas ΔP and Power-PC are unaffected by sample size, Causal Support is influenced by sample size. Going back to the analogy of Causal Support as a p-value and ΔP and Power-PC as effect size measures, if there is a large enough sample size it is possible to have a very low p-value (confidence that there is a causal relation) even if the effect size is small.

In sum, Power-PC and Causal Support were both motivated by understanding causality through a CBN perspective, involving top-down beliefs about how an observed cause combines with other unobserved factors.
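The contrast between the three measures can be illustrated on the same contingency data. The sketch below is a rough approximation rather than the published implementation: it assumes uniform priors over the strengths and a noisy-OR combination for the structure with the C→E link, and it integrates numerically on a grid. The counts and all function names are hypothetical.

```python
from math import log

def delta_p(n_e_c, n_c, n_e_notc, n_notc):
    """Delta-P: P(e=1 | c=1) - P(e=1 | c=0)."""
    return n_e_c / n_c - n_e_notc / n_notc

def causal_power(n_e_c, n_c, n_e_notc, n_notc):
    """Power-PC for a generative cause: Delta-P / P(e=0 | c=0)."""
    return delta_p(n_e_c, n_c, n_e_notc, n_notc) / (1 - n_e_notc / n_notc)

def causal_support(n_e_c, n_c, n_e_notc, n_notc, grid=200):
    """log P(data | C -> E <- U) - log P(data | only U -> E), averaging the
    likelihood over uniform priors on the strengths (midpoint-rule grid)."""
    points = [(i + 0.5) / grid for i in range(grid)]

    def likelihood(p_with_c, p_without_c):
        return (p_with_c ** n_e_c * (1 - p_with_c) ** (n_c - n_e_c)
                * p_without_c ** n_e_notc * (1 - p_without_c) ** (n_notc - n_e_notc))

    # Structure without the C -> E link: one background strength w0 for all trials.
    no_link = sum(likelihood(w0, w0) for w0 in points) / grid
    # Structure with the link: background w0 and cause w1 combine by noisy-OR.
    link = sum(likelihood(w0 + w1 - w0 * w1, w0)
               for w0 in points for w1 in points) / grid ** 2
    return log(link / no_link)

small = (6, 8, 2, 8)       # hypothetical counts: 6/8 with C, 2/8 without
large = (60, 80, 20, 80)   # same proportions, ten times the sample
```

Calling `delta_p` or `causal_power` on `small` and `large` gives identical values, because only the proportions matter, whereas `causal_support(*large)` exceeds `causal_support(*small)`: the same effect size backed by more data is stronger evidence that a causal link exists.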
2.3.2 Inferring Causal Strength: Controlling for Other Causes

The previous section focused on how people infer causal strength given observations of just a single cause and effect, elemental causal induction. However, often there are more than two variables. When inferring the strength of one cause on an effect it is important to control for certain types of third variables (and not others), depending on the causal structure. Consider Figure 1. When studying the strength of the effect of a new drug on cardiovascular disease, it is important to control for age and smoking habits, either statistically or through the design of the study. One should not control for statin use because it is not a direct cause of cardiovascular disease.

Figure 7: Possible Third Variables when Learning the Causal Relation from C to E
More generally, consider trying to learn whether there is a causal link from a potential cause C to a potential effect E, and if so, how strong the relation is. Figure 7 presents 8 different third variables (S-Z); the question is which of these variables should be controlled for. Readers familiar with multiple regression can think of C as the predictor of primary interest and E as the outcome variable. The question is then which of the third variables should be included as predictors or covariates in the analysis. The following bullets systematically explain each of the third factors and whether it should be controlled for when inferring the strength of C on E:
• V and X are confounds and must be controlled for when inferring the relation of C on E. If they are not controlled for there would be a spurious correlation between C and E even if there is no causal relation between C and E. (X represents the case when some unobserved factor causes both C and X.)
• W represents an alternative mechanism from C to E. In order to test whether there is a direct influence of C on E above and beyond W, it must be controlled for.
• Y is a noise variable. Accounting for it increases our power to detect a relation between C and E.
• U and Z should not be controlled for. The logic is a bit opaque (Eells, 1991, p. 203), but consider the simple case in which E deterministically causes Z such that they are perfectly correlated. Controlling for Z explains all the variance in E, and there will be no leftover variance for C to explain. Controlling for Z and U can distort the apparent relation between C and E.
• S and T never need to be controlled for. With large sample sizes it does not matter whether S and T are controlled for or not when inferring the influence of C on E. The reason is that even though S and T are correlated with C, since S and T are screened off from E (S and T are independent of E after controlling for C), they will not have any predictive power in a regression above and beyond C. However, with small sample sizes, S and T will most likely not be perfectly uncorrelated with E controlling for C, in which case they can change the estimated influence of C on E. Thus, they should not be controlled for.
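The first bullet, on confounds, can be illustrated with an exact calculation on a small hypothetical network in which V causes both C and E and C has no effect on E at all. The probabilities below are invented for illustration, and the helper names are not from the chapter.

```python
from itertools import product

# Hypothetical parameters: V is a common cause of C and E; C has NO effect on E.
P_V = 0.5
P_C_GIVEN_V = {1: 0.8, 0: 0.2}
P_E_GIVEN_V = {1: 0.9, 0: 0.1}

def joint(v, c, e):
    """Joint probability of one (V, C, E) state under V -> C and V -> E."""
    p = P_V if v else 1 - P_V
    p *= P_C_GIVEN_V[v] if c else 1 - P_C_GIVEN_V[v]
    p *= P_E_GIVEN_V[v] if e else 1 - P_E_GIVEN_V[v]
    return p

def p_e_given(c, v=None):
    """P(E=1 | C=c), optionally also conditioning on V=v."""
    vs = [0, 1] if v is None else [v]
    num = sum(joint(vv, c, 1) for vv in vs)
    den = sum(joint(vv, c, e) for vv, e in product(vs, [0, 1]))
    return num / den

naive = p_e_given(1) - p_e_given(0)   # contingency that ignores V
controlled = {v: p_e_given(1, v) - p_e_given(0, v) for v in (0, 1)}
```

Ignoring V yields a sizable spurious contingency between C and E, while the contingency computed separately within each level of V is zero, which is the sense in which a confound "must be controlled for".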
In sum, the overall rule is that when inferring the strength of a relation of C on E, third variables that are believed to be potential direct causes of E should be controlled for; other variables should not be controlled for (Cartwright, 1989; Eells, 1991; Pearl, 1996). This rule nicely dovetails with how causal structures are defined; each variable is modeled using a conditional probability distribution incorporating all of its direct causes.

Remarkably, a variety of research suggests that people have the ability to appropriately control for third variables when inferring causal strength. In fact, research on this topic was the first research on whether people intuitively use beliefs about causal structure when reasoning about causality (Waldmann & Holyoak, 1992; Waldmann, 1996, 2000). Michael Waldmann and colleagues called this theory the Causal Model theory; the idea was that when inferring causal strength, people use background knowledge about the causal structure (“model”) to determine which variables to control for. In the first study on this topic, a scenario with three variables X, Y, and Z was set up. Based on the cover story the three variables were either causally related in a common effect structure [X→Y←Z] or in a common cause structure [X←Y→Z]. In the common effect condition [X→Y←Z], the goal for participants was to decide the extent to which X and Z were causes of Y; normatively people should control for alternative causes (e.g., control for X when determining whether Z is a cause of Y). In the common cause condition [X←Y→Z], the goal for participants was to decide the extent to which X and Z are effects of Y; normatively these two decisions should be made separately (e.g., one should ignore X when determining the influence of Y on Z).

After the cover story manipulating the believed causal structure, participants first experienced a set of data in which X and Y were perfectly correlated; Z was not displayed. This training made it seem that there is a strong causal relation between X and Y. Then they experienced a set of data in which X, Y, and Z were all perfectly correlated; now Z is a redundant predictor of Y because X is entirely sufficient to predict Y. In sum, participants in both conditions experienced the exact same data, and the only difference between the two conditions was their belief about the causal structure.

In the common effect condition [X→Y←Z], participants controlled for X when interpreting whether Z was a cause of Y, and consequently concluded that Z is not a cause of Y because X is entirely sufficient to predict whether Y was present or absent. In contrast, in the common cause condition [X←Y→Z], participants did not control for X, and concluded that Y was a cause of both X and Z.

Subsequently, a number of other studies have also shown that people control for alternative causes (V-Y in Figure 7) of the main effect and not alternative effects of the main cause (T in Figure 7) (Goodie, Williams, & Crooks, 2003; Spellman, Price, & Logan, 2001; Waldmann, 2000). There is even work suggesting that people do not control for variables like S and Z (Waldmann & Hagmayer, 2001); however, there has not been research on whether people control for variables like U.
In sum, when learning about a causal relation between C and E, people have some core intuitions to control for variables that they believe to be alternative causes of E, and not for variables in other roles, which is critical for correct causal learning (Glymour, 2001). This research provides some of the most dramatic demonstrations of how top-down beliefs about causal structure influence learning, and consequently is some of the strongest evidence that human causal reasoning involves structured directional representations beyond just associations between variables (Waldmann, 1996).

3 Reasoning with the Causal Structure

So far this chapter has focused on how people learn about a causal network: the structure of the network, the parameters or causal strengths, and the functional form. The remainder of the chapter focuses on how people use this knowledge (see also Oaksford and Chater, this volume). Going back to Figure 1, one might desire to explain whether a person’s cardiovascular disease was caused by his age or by his smoking. One might desire to predict whether his cardiovascular disease will get worse as he ages. And one might desire to know which intervention, stopping smoking or starting to take a statin, would have the largest influence on his cardiovascular disease in order to choose the action with the greatest rewards.

Though this second half of the chapter focuses on reasoning about the causal network rather than learning, it is impossible to completely divorce learning and reasoning. In the real world we learn about causal relations both from first-hand experience with data (e.g., did starting the statin lower my blood pressure?) and also from communicated knowledge (e.g., from family members, teachers, doctors, newspaper articles). Research in psychology has used both personal experience and communicated knowledge, often in combination, to teach subjects about the causal structure before they reason about the structure. Typically words and pictures are used to convey the causal structure to participants, although the structural information is sometimes conveyed through or supplemented with experienced data. If the participants learn anything about the parameters (causal strengths) of the causal structure, it is usually conveyed through data-driven experience, though sometimes the parameters are conveyed textually. The integration function is usually not mentioned at all.
One of the challenges with studying how well people reason about causal structures is that apparent flaws in reasoning can either be explained as reasoning biases, or as poor, biased, or insufficient learning about the causal structure. It is not clear how to cleanly differentiate the two because checking that the causal structure is learned appropriately involves questions that are typically viewed as reasoning about the causal structure. This sets up a difficult situation because any observed reasoning bias can potentially be explained away by claiming that the researcher failed to sufficiently convey the causal structure to the participants. Here I do not try to solve this problem, but instead just present the empirical findings of how closely reasoning appears to fit with the causal structures presented to subjects. These conclusions are based on a much more thorough analysis of the literature than can be presented here (Rottman & Hastie, 2014), though this chapter includes some newly published evidence.

3.1 Reasoning based on Observations vs. Interventions

In Section 2, I explained how the CBN framework treats observations and interventions very differently for learning a causal structure. Interventions change the causal structure by removing links from variables that were previously causes of the manipulated variable. For example, given the structure X→Y→Z, if Y is intervened upon, Y gets severed from X, resulting in [X; Y→Z]. Under an intervention on Y, X would be statistically independent of (uncorrelated with) Y, even though Z would still be dependent upon Y.

Practically, given the structure X→Y→Z, if a reasoner can observe the state of Y, they can make a prediction about both X and Z. In the types of situations typically studied in the lab with binary variables and positive causal relations, if Y is observed as 1, then X and Z are both likely to be 1 as well. However, if a reasoner intervenes on Y and sets its value to 1, then Z is likely to be 1, but this intervention would have no influence on X, so the best estimate of X is simply its base rate. In sum, interventions only influence variables down-stream from the manipulated variable, not up-stream (but see Hiddleston, 2005, for an alternative approach, and also see the chapter by Over on whether “if...then” conditionals are interpreted as interventions).

A number of researchers have found that people discriminate between observations and interventions when making inferences based on a causal structure. Sloman and Lagnado (2005) set up simple verbal descriptions in which one event (X) causes the other (Y), and found that when Y was observed to have a particular value, X would be inferred to have the same value, but when Y was intervened upon to have a particular value, X was inferred to have its normal default value. In sum, when it was made very clear whether there was an observation vs. an intervention, subjects’ judgments largely followed the prescriptions of the CBN framework. In contrast, when more ambiguous language was used such that the value of a variable could be known either through an observation or an intervention, the responses were less clear-cut (see also Rips, 2010).
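The difference between observing Y and setting Y in the chain X→Y→Z can be sketched with a few lines of arithmetic. All parameter values and function names below are hypothetical, chosen only to make the contrast visible.

```python
# Hypothetical parameters for the chain X -> Y -> Z with binary variables.
P_X = 0.3                           # base rate of X
P_Y_GIVEN_X = {1: 0.9, 0: 0.1}      # P(Y=1 | X)
P_Z_GIVEN_Y = {1: 0.8, 0: 0.05}     # P(Z=1 | Y)

def p_x_given_observed_y(y):
    """P(X=1 | Y=y) by Bayes' rule: an observed Y is diagnostic of its cause."""
    like_x1 = P_Y_GIVEN_X[1] if y else 1 - P_Y_GIVEN_X[1]
    like_x0 = P_Y_GIVEN_X[0] if y else 1 - P_Y_GIVEN_X[0]
    return like_x1 * P_X / (like_x1 * P_X + like_x0 * (1 - P_X))

def p_x_given_do_y(y):
    """P(X=1 | do(Y=y)): the intervention cuts X -> Y, so X stays at its base rate."""
    return P_X

def p_z_given_y(y):
    """P(Z=1 | Y=y): identical for observed and intervened Y, since Y -> Z is intact."""
    return P_Z_GIVEN_Y[y]

observed = p_x_given_observed_y(1)
intervened = p_x_given_do_y(1)
```

Observing Y=1 raises the estimate of X well above its base rate, whereas setting Y=1 leaves X at the base rate; the prediction for Z is the same in both cases.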
Another set of studies took this basic finding a step further by demonstrating that this difference between interventions and observations also holds in contexts in which participants are told the causal structure and then learn the parameters (e.g., the base rates and the causal strengths) from experience. Consider a set of studies that investigated reasoning on a diamond structure [X←W→Y and X→Z←Y] (Meder, Hagmayer, & Waldmann, 2008, 2009; Waldmann & Hagmayer, 2005). These studies are unique for involving more than three variables, and also for having two causal routes, W→X→Z and W→Y→Z. Despite the complexities involved in these studies, the participants showed remarkable subtlety in reasoning about the causal structures, and treated interventions and observations differently.
Consider observing a low value of X, and trying to infer the value of Z. In the diamond structure there are two routes from X to Z: X←W→Y→Z and X→Z. Due to these two routes X and Z should be strongly correlated, and thus Z should be quite low when X is observed to be low. In contrast, if X is intervened upon and set to a low value, the route X←W→Y→Z is destroyed: the link from W to X is cut. The X→Z route is still open, so the predicted value of Z is still low, but it should not be as low as when X is observed. In fact, this is the exact pattern of reasoning that was observed; the inference of Z after an observation of X was lower than after an intervention on X. This finding further suggests that people reason about observations both down-stream and up-stream, but they reason about interventions only down-stream. This research also shows how people can reason about observations and interventions on more complex structures.
So far this section has focused on “perfect” interventions in which the intervention completely determines the state of the manipulated variables, and completely severs all other influences. However, often interventions are not perfect. For example, after prescribing a patient an antihypertensive to treat high blood pressure, the patient may not
actually take it, or may not take it exactly as prescribed (e.g., as frequently as they should, at the right dose). Furthermore, even if the patient does take the medicine as prescribed, the medicine does not guarantee that all patients will have a 120/80 blood pressure. Patients who initially had very high blood pressures will probably still tend to have higher blood pressures than those who initially had moderately high blood pressures. Or, perhaps the medicine only succeeds in bringing the blood pressure into a normal range for a certain percentage of patients, but not for others. In these ways, taking an antihypertensive is an “imperfect” intervention on blood pressure; a patient’s blood pressure is not completely determined by the intervention. In such cases of imperfect interventions, reasoning up-stream is warranted to some extent, similar to observations. Unfortunately, there has been fairly little work examining how people reason about imperfect interventions (Meder, Gerstenberg, Hagmayer, & Waldmann, 2010; Meder & Hagmayer, 2009).
In sum, the existing research has found that people do distinguish between interventions and observations when reasoning about causal systems, in particular that interventions only influence variables down-stream from the intervened-upon variable. An important direction for future research is to examine how people reason about imperfect interventions. This seems especially important given that many of the actions or “interventions” humans perform are not perfect interventions.
3.2 Do People Adhere to the Markov Condition when Reasoning about Causal Structures?
Recall that the Markov condition states that once all the direct causes of a variable Z are controlled for or held constant, Z is statistically independent of every variable in the causal network that is not a direct or indirect effect of Z. For example, in the structure X→Y→Z, Z is conditionally independent of X once Y (the only direct cause of Z) is held constant. People have often been found to violate the Markov assumption; their inferences about the state of Z are influenced by the state of X even when they already know the state of Y (Mayrhofer & Waldmann, 2015; Park & Sloman, 2013; Rehder & Burnett, 2005; Rehder, 2014; Rehder, this volume b; Walsh & Sloman, 2008). Specifically, people tend to infer that P(z=1|y=1,x=1) > P(z=1|y=1,x=0) even though they should be equivalent. Likewise, they use Z when inferring X even after knowing the state of Y. Going back to section 2.1.3, such a mistake could lead a doctor to incorrectly believe that ethnicity has an influence on cardiovascular disease above and beyond smoking even when the true causal structure is Ethnicity→Smoking→Cardiovascular Disease.
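For the chain X→Y→Z, the conditional independence required by the Markov condition follows from the factorization of the joint distribution. The short derivation below is a sketch that assumes only that factorization, P(x, y, z) = P(x) P(y|x) P(z|y), with lowercase letters denoting arbitrary states of the variables.

```latex
\begin{align*}
P(z \mid x, y) &= \frac{P(x, y, z)}{P(x, y)}
  = \frac{P(x)\,P(y \mid x)\,P(z \mid y)}{\sum_{z'} P(x)\,P(y \mid x)\,P(z' \mid y)} \\
 &= \frac{P(x)\,P(y \mid x)\,P(z \mid y)}{P(x)\,P(y \mid x)}
  = P(z \mid y).
\end{align*}
```

Because the result does not depend on x, the normative prediction is P(z=1|y=1,x=1) = P(z=1|y=1,x=0), which is the equality that participants' judgments tend to violate.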
There are a variety of possible explanations for why inferences violate the Markov condition, and most of the explanations have attempted to find rationalizations for the violations, reasons that such judgments would make sense according to the CBN framework assuming some modification to the structure due to prior knowledge. For example, if subjects believe that there is some other causal link between X and Z (e.g., X→Z, X←Z, or X←W→Z) in addition to the causal structure told to them by the experimenter (X←Y→Z), such additional information could justify their inferences. Three specific proposals are that people infer an unobserved factor that inhibits both X and Z, an unobserved factor that influences X, Y, and Z, or an intermediary mechanism M such that Y causes M, which in turn causes X and Z. Different articles in the list above have supported different accounts. For example, Rehder and Burnett (2005) argued for the account in which an unobserved factor influences X, Y, and Z. Park and Sloman (2013) found that people only make the Markov violation when the middle variable is present, not absent;
P(X=1|Y=1,Z=1) > P(X=1|Y=1,Z=0) but that P(X=1|Y=0,Z=1) = P(X=1|Y=0,Z=0). This finding is most consistent with the account that people infer an unobserved factor that inhibits X and Z. They also found that the size of the Markov violation was larger when participants believed that the two effects (X and Z) are both caused through the same mechanism (e.g., Y causes mechanism A, which in turn causes X and Z), than through separate mechanisms (e.g., X←A←Y→B→Z, where A and B are the two mechanisms that explain how X and Z are each caused by Y). Mayrhofer and Waldmann (2015) have also found evidence that people infer an unobserved inhibitory factor that influences multiple effects of the same cause. They further found that the size of the Markov violation was influenced by whether the causes and effects were described as agents vs. patients (e.g., cause “sending” information to effect vs. effect “reading” information from cause).
Rehder (2014) found some support for both the unobserved inhibitor and the one vs. two mechanism accounts, though more generally he found that none of these rationalizations provide a parsimonious and comprehensive explanation for all the reasoning errors. He argued that it is indeed highly likely that people embellish causal structures given in experiments with additional nodes and links based on their own prior knowledge. However, Rehder proposed that in addition to any embellishments due to background knowledge, some judgments followed an associative style of reasoning that does not obey the Markov assumption. He proposed taking an individual-differences approach to understanding why certain people are more likely to use an associative style of reasoning.
One surprising aspect about the work on whether people uphold the Markov condition is that there have been very few studies in which people learn the parameters of the causal structure through trial-by-trial experience, and then make judgments.³ Giving participants statistical experience with the correlations between the variables provides them with direct evidence that X and Z are statistically independent given Y. Park and Sloman (2013) conducted one experiment of this sort. Their participants inferred that P(z=1|y=1,x=1) > P(z=1|y=1,x=0), though P(z=1|y=0,x=1) = P(z=1|y=0,x=0); a violation of the Markov condition only when y=1. As discussed above, this pattern actually fits the proposal that people infer an unobserved inhibitory cause of both X and Z. However, the modified structure with the unobserved inhibitory cause is still unfaithful to the data that they observed; in the learning data X and Z were independent when y=1. This raises a question for future research: if being told the structure and experiencing data faithful to the structure is not sufficient to stamp out violations of the Markov assumption, what is?
³ In the sections above on learning causal structures, when the true structure is X→Y→Z, people tend to also infer the link X→Z, suggesting that they are not fully aware of the conditional independence. This section focuses on reasoning about the causal structure rather than learning, though of course they are related.
3.3 Qualitative and Quantitative Inferences when Reasoning about Causal Structures
Rottman and Hastie (2014) reviewed inferences on many different types of causal structures including one link [X→Y], chains [X→Y→Z], common cause [X←Y→Z], common effect [X→Y←Z], and diamond [X←W→Y and X→Z←Y] structures. For each of these structures we reviewed evidence about how well people make inferences on one variable given different observed combinations of the others (e.g., X given knowledge about Y, or Y given knowledge of X and Z, etc.).
We concluded that for almost all the causal structures (see the section below on explaining away for an exception) the inferences tend to go in the right direction. For example, for the chain [X→Y→Z], if both causal relations, X→Y and Y→Z, were positive or both were negative, people tended to infer a positive relation between X and Z. But if one of the links was positive and the other negative, people inferred a negative causal relation (Baetu & Baker, 2009).
The previously mentioned studies involving interventions and observations on a diamond structure [X←W→Y and X→Z←Y] also reveal how sensitive people are to the parameters of the structure (Meder et al., 2008, 2009). These studies systematically manipulated the base rates of some of the variables, and also the strengths of some of the causal links. Even though the causal structures involved 4 variables, and the inference required reasoning with two routes from X to Z, all of these manipulations had influences on subjects’ inferences in the predicted directions. In sum, reasoning habits often correspond to the qualitative predictions of the CBN framework.
Yet, despite the qualitative correspondence between human inferences and the normative judgments based on the CBN framework, the quantitative correspondence is not so tight. For example, in one condition when inferring the probability of Z given X for the study above, the normative answer was 12.5%, yet subjects answered on average 37%. Given that 50% is the middle of the scale, 37% is actually considerably closer to a default of 50% than the normative answer. This pattern of conservative results, judgments too close to the center of the scale, was very common across many studies reviewed in Rottman and Hastie (2014). For example, for both chain [X→Y→Z] and common cause [X←Y→Z] structures people do typically infer a correlation between X and Z; however, often the correlation is considerably weaker than the correlation in the data that the subjects observed (Baetu & Baker, 2009; Bes, Sloman, Lucas, & Raufaste, 2012; Hagmayer & Waldmann, 2000; Park & Sloman, 2013). There are multiple possible interpretations of such effects such as response biases or memory errors (Costello & Watts, 2014; Hilbert, 2012) or potentially priors on the parameters (Lu, Yuille, Liljeholm, Cheng, & Holyoak, 2008; Yeung & Griffiths, 2011). More evidence is needed to understand why these effects occur, and also to understand the accuracy when reasoning with more than 3 or 4 variables.
3.4 Reasoning about Explaining Away Situations
The previous section already addressed quantitative inferences on causal networks, and the conclusion is that for the most part people are fairly good at making inferences, though there is a conservative bias. However, there is one type of inference called explaining away that stands out as particularly difficult. Explaining away inferences involve judgments of P(x=1|y=1,z=1) and P(x=1|y=1,z=0) on a common effect structure [X→Y←Z]. The reason that explaining away inferences are so challenging is that once the state of Y is known, X and Z actually become negatively dependent, so the normative pattern of inference is P(x=1|y=1,z=1) < P(x=1|y=1,z=0). This is unlike any other type of inference. For example, on a chain structure [X→Y→Z], positive relations between X and Y and Y and Z mean that there is a positive relation between X and Z; P(x=1|z=1) > P(x=1|z=0), and because of the Markov assumption P(x=1|y=1,z=1) = P(x=1|y=1,z=0).
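The normative explaining-away pattern can be checked numerically. The sketch below assumes a common effect structure [X→Y←Z] whose causes combine through a Noisy-OR function, and borrows the parameter values shown in the chapter's Figure 8 (strengths of 1/2 for X, 1/4 for Z, and 1/3 for the background, with a base rate of .5 for X); the function names are hypothetical.

```python
from fractions import Fraction as F

# Parameter values as given in the chapter's Figure 8 (Noisy-OR, no interaction).
S_X, S_Z, S_B = F(1, 2), F(1, 4), F(1, 3)
P_X1 = F(1, 2)  # base rate of X; the base rate of Z cancels out below

def p_y1(x, z):
    """Noisy-OR: P(y=1 | x, z) with an always-present background cause."""
    return 1 - (1 - S_X) ** x * (1 - S_Z) ** z * (1 - S_B)

def p_x1_given_y1(z):
    """P(x=1 | y=1, z) by Bayes' rule; X and Z are independent a priori."""
    num = p_y1(1, z) * P_X1
    den = num + p_y1(0, z) * (1 - P_X1)
    return num / den

with_z = p_x1_given_y1(1)     # the alternative cause Z is present
without_z = p_x1_given_y1(0)  # the alternative cause Z is absent

print(with_z, without_z)
```

With these values the probability of X is lower when Z is present than when Z is absent, even though both causes are positively related to Y; this is the direction of inference that participants often fail to produce.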
In terms of Figure 1 [smoke→cardiovascular disease←age], explaining away could involve inferring the probability that someone smokes given their age and knowing that they have cardiovascular disease. Out of patients who have cardiovascular disease,
knowing that a given patient is old means that it is less necessary to infer that they smoke in order to explain the cardiovascular disease; old age “explains away” the cardiovascular disease. If the patient is young it becomes more necessary to infer that they smoke; otherwise what explains the cardiovascular disease? In sum, when the two causes have a positive influence on the effect, the causes become negatively related controlling for the effect.
Prior evidence did not decisively identify how well people explain away (Morris & Larrick, 1995; Sussman & Oppenheimer, 2011). The newest and clearest evidence suggests that people have considerable difficulties when making explaining-away judgments (Rehder, 2014). Though sometimes people get the direction of the inference correct, P(x=1|y=1,z=1) < P(x=1|y=1,z=0), they often are ambivalent about the direction of the inference, and sometimes think that Z would have a positive effect on X, P(x=1|y=1,z=1) > P(x=1|y=1,z=0). Rehder proposed that this type of reasoning is more akin to an associative spreading-activation network than causal reasoning. Reid Hastie and I have also recently collected data on explaining away; unlike the previous research we gave participants learning data so that they could reason from experience rather than just from the causal structure, and so that they also have direct evidence that P(x=1|y=1,z=1) < P(x=1|y=1,z=0). We sometimes found explaining away that was much weaker than normatively predicted by the CBN framework, and other times inference patterns in the opposite direction from explaining away.
The challenge people have with explaining away is somewhat mysterious. There are no other types of causal inference that give reasoners so much trouble, yet at the same time explaining away has also been touted as a fundamental strength of human reasoning (Jones, 1979; Kelley, 1972; Pearl, 1988, p. 49). There are also other results in which explaining away does occur. Oppenheimer et al. (2013) created stories to elicit explaining away. For example, participants were told about an animal with three features (feathers, lays eggs, and cannot fly) and asked to rate how likely this animal is to be an ostrich. Being an ostrich is a plausible explanation for why this bird cannot fly. Other participants were given the same three features with one additional feature, that it has a broken wing, which is an alternative cause for not being able to fly. These participants judged the likelihood of being an ostrich as lower than the participants who were not given this feature, suggesting explaining away (see also Oppenheimer & Monin, 2009). So sometimes people do get the direction of the inference correct.⁴
⁴ This study is different from the ones above in two ways. First, this study did not have a normatively correct quantitative answer to compare human inferences against. Second, this study tests the comparison P(ostrich | feathers, lays eggs, cannot fly, broken wing) vs. P(ostrich | feathers, lays eggs, cannot fly), not P(ostrich | feathers, lays eggs, cannot fly, no broken wing). This is analogous to P(x=1|y=1,z=1) vs. P(x=1|y=1) instead of P(x=1|y=1,z=0), so it is a slightly different comparison.
An additional complexity is that explaining away is related to another phenomenon. Explaining away involves inferring the probability of X given knowledge of Y and Z on the structure [X→Y←Z]. Another much studied topic is inferring the causal strength of X on Y. As already discussed, people know that they must control for Z when inferring the causal strength of X on Y. However, when Z is a very strong cause of Y, it is not uncommon for people to infer that the strength of X is very weak, weaker than it actually is; sometimes this is called “discounting” (Goedert & Spellman, 2005). This discounting effect is related to explaining away in that both phenomena require understanding that two causes are competing to explain an effect.
In sum, there is conflicting evidence as to when, whether, and how much people explain away. Despite the fact that explaining away has been studied for 40 years, there is still important work to be done to reconcile these findings.
3.5 Do Causal Relations Bias Reasoning?
It is a fairly common view in psychology that it is easier for people to reason from causes to effects than from effects to causes (Pennington & Hastie, 1993; White, 2006), and this hypothesis is supported by evidence that cause-to-effect judgments are made faster than effect-to-cause judgments (Fernbach & Darlow, 2010). The question in this section is whether cognitive ease has an influence on the inferences themselves.
Tversky and Kahneman (1980) found that inferences are judged higher when reasoning from causes to effects than from effects to causes. Similarly, Bes et al. (2012) found that when making inferences on the chain [X→Y→Z], inferences of P(z=1|x=1) were higher than P(x=1|z=1). Additionally, both of these inferences were higher than inferences of P(z=1|x=1) or P(x=1|z=1) on a common cause [X←Y→Z] structure. These differences are especially instructive because their participants received trial-by-trial training, according to which all the inferences mentioned above should have been equivalent. They speculate that making inferences between X and Z on the common cause is harder because one must reason about causal relations going in two different directions, and this increased difficulty could lower the final judgment.
This study reaches a very different conclusion than most of the rest of the articles presented in this chapter. The conclusion is that the strength of the inferences is determined by the ease of explaining how the two variables are connected, and that this cognitive ease overwhelms the probabilities participants experience. Even though the explanations for these findings appeal to causal structure and causal direction, they are inconsistent with the CBN framework; the CBN framework predicts that all the inferences mentioned above would be equal given the parameters used in the study.
Though the effects of causal direction were found consistently across three experiments, there are other results that do not entirely fit with the story that cause-to-effect judgments are higher than effect-to-cause judgments. First, Fernbach et al. (2011, p. 13) failed to replicate the study by Tversky and Kahneman (1980). More broadly, Fernbach et al. have found that inferences from causes to effects tend to be lower than the normative standard, but inferences from effects to causes tend to be roughly normative (Fernbach, Darlow, & Sloman, 2010; Fernbach et al., 2011; Fernbach & Rehder, 2013; see also Rehder, this volume b). The explanation is that when reasoning from causes to effects, people sometimes forget that alternative causes could produce the target effect aside from the main cause, though they do not forget about alternative causes when reasoning from the effect to a target cause.
There is some tension between these two sets of findings; Bes et al. found that effect-to-cause judgments are too low (lower than cause-to-effect judgments), whereas Fernbach et al. found that cause-to-effect judgments are too low. However, these results
cannot be directly compared because they differ on a variety of dimensions.⁵ Fernbach et al. used real world cover stories, asked participants their beliefs about the parameters of the causal structure, and then used those parameters to calculate the normative answers. Because of this approach, Fernbach et al. could not directly compare the cause-to-effect and effect-to-cause inferences and instead compared each inference to the normative standard for that inference.⁶ In contrast, Bes et al. (Experiment 3) gave participants trial-by-trial learning data; because the learning data were symmetric the cause-to-effect and effect-to-cause inferences could be directly compared (although the cover story labels for the variables were not counterbalanced).
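The normative analysis attributed to Fernbach et al. (2011) in the footnote to this paragraph (cause-to-effect inferences exceeding effect-to-cause inferences about 65% of the time, with an average effect base rate of .625) can be approximated by simulation. The sketch below is an assumption-laden reconstruction, not their procedure: it assumes a single target cause, generative alternative causes that combine with it through a Noisy-OR function, and independent uniform priors on the cause's base rate, its strength, and the strength of the alternatives. The function name `simulate` is hypothetical.

```python
import random

def simulate(n_samples=200_000, seed=0):
    """Monte Carlo over uniform priors; returns (share, mean effect base rate)."""
    rng = random.Random(seed)
    higher = 0
    effect_rate_sum = 0.0
    for _ in range(n_samples):
        c = rng.random()  # base rate of the cause, P(cause=1)
        s = rng.random()  # causal strength of the target cause
        b = rng.random()  # strength of generative alternative causes
        p_e_given_c = 1 - (1 - s) * (1 - b)    # Noisy-OR with cause present
        p_e = c * p_e_given_c + (1 - c) * b    # base rate of the effect
        p_c_given_e = c * p_e_given_c / p_e    # Bayes' rule
        higher += p_e_given_c > p_c_given_e
        effect_rate_sum += p_e
    return higher / n_samples, effect_rate_sum / n_samples

share_cause_to_effect_higher, mean_effect_base_rate = simulate()
print(share_cause_to_effect_higher, mean_effect_base_rate)
```

Under these assumptions the share comes out close to .65 and the mean effect base rate close to .625, while the mean cause base rate is .5 by construction. The asymmetry therefore depends on allowing only generative, and no inhibitory, alternative factors.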
In sum, though it is intuitive that it is easier to reason from causes to effects rather than vice versa, it is still unclear whether or how cognitive fluency and neglect of alternative causes manifest in judgments; it is not clear exactly whether or when cause-to-effect judgments are higher than effect-to-cause judgments. It is especially important to come to consensus on these results, or explain why different patterns of reasoning are found in different situations, because both of the patterns of findings imply deviations from the CBN framework.
3.6 Alternative Representations for Causal Reasoning
So far this chapter has presented the CBN framework as a single method of learning causal structures and making inferences. However, like most sophisticated modeling tools, there are actually many choices that the modeler can make. Assuming that human cognitive representations of causality are somehow similar to the representation of a Causal Bayesian network (directed representations of causality, parameters to capture the strength of causal relations and base rates), these choices correspond to different cognitive representations of the task and background knowledge. An accurate description of causal reasoning requires clarifying the representations being used. In the next two sections I discuss some representational options, and whether they can be empirically distinguished.
Consider the case that you are told that X and Z both cause Y [X→Y←Z], you experience a set of learning trials that instantiate the statistical relations between these variables, and are subsequently asked to infer P(x=1|y=1,z=1). Figure 8 details four possible processes for making the judgment.
The first route, the dashed line, involves making the inference directly from the experienced data. Whenever a learner experiences data that instantiates the causal structure it is possible to come to the correct inference by focusing on the experienced data and ignoring the causal structure. For example, in order to calculate P(x=1|y=1,z=1), a reasoner just needs to remember the total number of observations in which all three variables were 1, N(x=1,y=1,z=1), and divide this by the total number of observations in which y=1 and z=1 ignoring X, N(y=1,z=1); see Figure 8. This reasoning process can be thought of as similar to exemplar models of categorization; inference is performed by recalling specific exemplars.
The remaining three options all involve elaborating the causal structure with different kinds of parameters, and inference is performed through a computation on the parameters. Though in some ways the inference itself seems more complicated, the cognitive benefit is that the learner only needs to store the structure and the parameters, not all the individual instances. The difference between these three options is how they represent the conditional probability distribution of Y, the probability of Y given the causes X and Z. This conditional probability distribution is denoted as P(Y=y|X=x,Z=z), which means the probability that Y is in a particular state (y=0 or 1), given that X and Z are each in particular states, x and z.
⁵ I thank Michael Waldmann for highlighting these differences.
⁶ Assuming a world in which causes and effects have the same base rates, on average, Fernbach et al.’s findings imply that cause-to-effect judgments would be lower than effect-to-cause judgments. However, Fernbach et al. actually assume a world in which effects have higher base rates than causes on average. Fernbach et al. (2011, p. 13) claim that a normative CBN analysis shows that inferences of P(effect=1|cause=1) should be higher than P(cause=1|effect=1) 65% of the time when integrating across the entire parameter space with uniform priors. This finding arises because they assumed that there are alternative factors that can generate effects but not inhibit effects. This same analysis shows that even though causes have a base rate of .5 on average, effects have a base rate of .625. So their analysis is only appropriate in worlds in which there are no inhibitory factors.
Representation 1 involves calculating the conditional probability distribution P(Y=y|X=x,Z=z) directly from the experienced data. For example, the probability that y=1 given that x=1 and z=1 is calculated directly from rows 1 and 3 of the experience table. Inference can then proceed through simple probability theory (Figure 8). Heckerman (1998) provides a tutorial on this approach, and provides citations to other exact and approximate inference algorithms.
Representation 2 does not directly represent the conditional probability distribution P(Y=y|X=x,Z=z), but instead assumes that people spontaneously infer causal strengths from the learning data. S_{X→Y} and S_{Z→Y} refer to the strength of X on Y and Z on Y, respectively. The most popular way to represent causal strengths in the normative psychological literature is using causal power theory, which assumes that causes combine through a Noisy-OR function (Cheng, 1997; Novick & Cheng, 2004; also see Sections 2.2 and 2.3). This approach also requires the learner to estimate the probability that the effect is present without any of its causes, P(Y=1|x=0,z=0). The causal strengths and the functional form (Noisy-OR) subsequently allow a reasoner to deduce the conditional distribution P(Y=y|X=x,Z=z), which would be used for making the inference P(x=1|y=1,z=1). The critical difference between Representation 1 vs. 2 is that Representation 2 embodies the assumption that X and Z combine through a Noisy-OR function and do not interact (Novick & Cheng, 2004); the Noisy-OR assumption is the reason why Representation 2 has only 5 parameters instead of the 6 parameters in Representation 1.
Representation 3 is very similar to Representation 2; however, instead of representing the parameter P(Y=1|x=0,z=0), an additional background cause B is added that explains the cases when Y=1 but X and Z are 0. In Figure 8, B is assumed to always be present, and to have a strength of 1/3.
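The four routes described above can be traced step by step in code. The sketch below is illustrative only: the trial counts are reconstructed from the probabilities reported in Figure 8 (48 trials in total) and should be treated as an assumption, the strengths are obtained with the standard causal power formula for generative causes, and the helper names (`n`, `infer`) are hypothetical.

```python
from fractions import Fraction as F

# Trial counts N(x, y, z), reconstructed from the probabilities reported in
# Figure 8 (48 trials in total); an assumption of this sketch.
COUNTS = {
    (1, 1, 1): 9, (1, 1, 0): 8, (1, 0, 1): 3, (1, 0, 0): 4,
    (0, 1, 1): 6, (0, 1, 0): 4, (0, 0, 1): 6, (0, 0, 0): 8,
}

def n(**fixed):
    """Number of trials matching the fixed values, e.g. n(y=1, z=1)."""
    total = 0
    for (x, y, z), count in COUNTS.items():
        state = {"x": x, "y": y, "z": z}
        if all(state[k] == v for k, v in fixed.items()):
            total += count
    return total

# Direct route (dashed line): divide remembered counts.
direct = F(n(x=1, y=1, z=1), n(y=1, z=1))

# Representation 1: conditional probability table P(y=1 | x, z) from counts.
cpt = {(x, z): F(n(x=x, y=1, z=z), n(x=x, z=z)) for x in (0, 1) for z in (0, 1)}
p_x1 = F(n(x=1), n())  # base rate of X

def infer(p_y1):
    """P(x=1 | y=1, z=1) from p_y1(x, z); P(z=1) cancels as X, Z are independent."""
    num = p_y1(1, 1) * p_x1
    return num / (num + p_y1(0, 1) * (1 - p_x1))

rep1 = infer(lambda x, z: cpt[(x, z)])

# Representation 2: causal power strengths plus P(y=1 | x=0, z=0).
base = cpt[(0, 0)]
s_x = (cpt[(1, 0)] - base) / (1 - base)
s_z = (cpt[(0, 1)] - base) / (1 - base)
rep2 = infer(lambda x, z: 1 - (1 - s_x) ** x * (1 - s_z) ** z * (1 - base))

# Representation 3: same functional form, with the baseline recast as the
# strength of an always-present background cause B.
s_b = base
rep3 = infer(lambda x, z: 1 - (1 - s_x) ** x * (1 - s_z) ** z * (1 - s_b))

print(direct, rep1, rep2, rep3, s_x, s_z, s_b)
```

With point estimates, all four routes return the same value, which illustrates why the representations are hard to distinguish empirically from judgments alone.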
The question raised by these four options is whether some sort of representation of causal structure and strength mediates the process of making an inference based on experienced data, or whether the inference is made directly from the experienced data (dashed line). If indeed some sort of causal structure representation mediates the inference, which form of representation gets used? All four approaches make the exact same predictions, so they are difficult to distinguish empirically.
I do not know of any studies that address the first question, whether a causal structure representation mediates the process of making an inference based on experienced data. However, there are some studies that have attempted to distinguish the nature of the CBN representation, specifically the difference between Representations 2 vs. 3.
Figure 8: Four Possible Processes for Making an Inference
Given information: Reasoner is told the causal structure and receives trial-by-trial experience of the multivariate distribution, which can be summarized as a joint probability table. Reasoner is asked to infer P(x=1|y=1,z=1).
Causal Structure: X→Y←Z
Joint Probability Distribution: The memory for the number of observations of each type of event. When there are three binary variables, there are 8 possible event types.
  Row  X  Y  Z  N
  1    1  1  1  9
  2    1  1  0  8
  3    1  0  1  3
  4    1  0  0  4
  5    0  1  1  6
  6    0  1  0  4
  7    0  0  1  6
  8    0  0  0  8
Inference directly from the experienced data (dashed line): P(x=1|y=1,z=1) = N(x=1,y=1,z=1) / N(y=1,z=1) = N(x=1,y=1,z=1) / (N(x=1,y=1,z=1) + N(x=0,y=1,z=1)) = 9/(9+6) = 3/5
CBN Representation 1: Conditional distribution is represented as a table of all combinations of the causes. P(x=1)=.5, P(z=1)=.5; P(y=1|x=1,z=1)=3/4, P(y=1|x=1,z=0)=2/3, P(y=1|x=0,z=1)=1/2, P(y=1|x=0,z=0)=1/3.
CBN Representation 2: Conditional distribution is represented as the Noisy-OR functional form combining causal strengths for known causes, and a parameter for Y when all known causes are absent. P(x=1)=.5, P(z=1)=.5; S_{X→Y}=1/2, S_{Z→Y}=1/4; P(y=1|x=0,z=0)=1/3. Functional form = Noisy-OR, no interaction: P(y=1|X=x,Z=z) = 1 - (1-S_{X→Y})^x (1-S_{Z→Y})^z (1-P(y=1|x=0,z=0)).
CBN Representation 3: Conditional distribution is represented as the Noisy-OR functional form combining causal strengths. Reasoner infers a background cause B if Y is present without X or Z. P(x=1)=.5, P(z=1)=.5, P(b=1)=1; S_{X→Y}=1/2, S_{Z→Y}=1/4, S_{B→Y}=1/3. Functional form = Noisy-OR, no interaction: P(y=1|X=x,Z=z) = 1 - (1-S_{X→Y})^x (1-S_{Z→Y})^z (1-S_{B→Y}).
Inference from the CBN representations: P(x=1|y=1,z=1) = P(x=1,y=1,z=1) / P(y=1,z=1) = P(x=1,y=1,z=1) / (P(x=1,y=1,z=1) + P(x=0,y=1,z=1)) = P(y=1|x=1,z=1)P(x=1)P(z=1) / (P(y=1|x=1,z=1)P(x=1)P(z=1) + P(y=1|x=0,z=1)P(x=0)P(z=1)) = (3/4 × 1/2 × 1/2) / (3/4 × 1/2 × 1/2 + 1/2 × 1/2 × 1/2) = 3/5
Note: N refers to the number of trials of observations of a particular type.
Krynski and Tenenbaum (2007) studied how well people make inferences on the famous mammogram problem. In this problem, participants are told that breast cancer (cause) almost always results in a positive mammogram test (effect), and they are told the base rate of breast cancer. They are also told that mammograms have false positives 6% of the time. Critically, this false positive rate is framed either as inherent randomness (Representation 2, which has a parameter to represent the probability of the effect when the known cause is absent), or due to a benign cyst (an explicit background cause like in Representation 3). Krynski and Tenenbaum found that participants’ judgments about the probability of breast cancer given a positive mammogram were considerably more accurate when the false positive rate was framed as being caused by a benign cyst, suggesting that Representation 3 may be the most intuitive.
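The normative answer to the mammogram problem is an application of Bayes' rule. In the sketch below only the 6% false positive rate comes from the description above; the base rate and the hit rate are hypothetical placeholders chosen for illustration, not the values used by Krynski and Tenenbaum, and the function name is likewise hypothetical.

```python
def posterior_cancer(base_rate, hit_rate, false_positive_rate):
    """P(cancer | positive mammogram) by Bayes' rule."""
    true_pos = base_rate * hit_rate
    false_pos = (1.0 - base_rate) * false_positive_rate
    return true_pos / (true_pos + false_pos)

# The 6% false positive rate is from the text; the base rate and hit rate
# are hypothetical placeholders, not the values used in the study.
posterior = posterior_cancer(base_rate=0.02, hit_rate=0.95,
                             false_positive_rate=0.06)
print(f"P(cancer | positive) = {posterior:.3f}")
```

With these assumed inputs the posterior is roughly .24, far below the hit rate, because positive tests from the large group without cancer outnumber those from the small group with cancer. The normative answer is the same whether the 6% is framed as random noise or as the product of an explicit alternative cause.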
A number of recent studies help to clarify this finding by Krynski and Tenenbaum. Though this facilitation of Bayesian responding by a causal framing has sometimes been found, the effect has not always been consistent (Hayes et al., 2015; Hayes, Newell, & Hawkins, 2013; McNair & Feeney, 2014, 2015). There appear to be two main reasons for the inconsistency. First, the causal framing has a bigger influence for participants who have higher mathematical abilities (McNair & Feeney, 2015). Second, the facilitation effect is often seen in a reduction in extreme overestimations (called base rate ‘neglect’); however, the final judgments are often lower, closer to the normative response, but still not quite ‘normative’ (McNair & Feeney, 2014). A plausible explanation for this effect was put forth by Hayes, Hawkins, and Newell (2015; 2014), who found that the causal framing increases the perceived relevance of the false positive information. They concluded that the causal framing mainly has an influence on the attention paid to the false positive rate and possibly the construction of a representation of the problem, but does not necessarily help participants to actually use the false positive rate in a normative way when calculating the posterior inference.
In sum, it seems like having explicit alternative causes (Representation 3) may facilitate accurate causal inference. That said, this finding raises a worrying prospect that causal reasoning is apparently fragile enough that it can be harmed by a small change in framing. If causal reasoning is robust, why can’t people translate between these representations by mentally generating an alternative cause to represent the false positive rate?
More broadly, the purpose of this analysis in Figure 8 was to show that the CBN framework can be instantiated in multiple possible ways. Different articles present different versions. Even though they all make similar if not identical predictions, these alternative versions present different cognitive processes involved in making the inference. In order to move from a computational-level theory to an algorithmic-level theory it will be necessary to further clarify the representations and inference process. It is especially critical to clarify whether a causal structure representation mediates causal inference when a reasoner has experienced learning data because in such instances it is possible to make inferences directly from the remembered experiences without thinking about the causal structure at all.
3.7 Even More Complicated Alternative Models for Causal Reasoning
The previous section discussed four possible implementations of the CBN framework. However, in reality there are many more possibilities. A fully Bayesian treatment of learning and inference allows for a way for prior knowledge to influence the learning and inference processes. With regard to a causal structure, there are three possible roles of prior information: prior beliefs about the network, about the integration function, and about the strengths or parameters.
First, whereas Representations 2 and 3 in Figure 8 both assume one particular functional form, the Noisy-OR, in reality learning is not this simple. Section 2.2 on functional forms already covered experiments on how people learn the specific way in which multiple causes combine to produce an effect, and how this belief shapes further learning and reasoning about the causal system (Beckers & Miller, 2005; Lucas & Griffiths, 2010; Waldmann, 2007). Thus, a fully Bayesian version of Figure 8 would allow for multiple possible integration functions and priors on those functions.
Second, the parameters in Figure 8 were calculated by using point estimates. For example, the parameter P(y=1|x=0,z=0) for Representations 1 and 2, and the S_{B→Y} parameter in Representation 3, are all given as exactly 1/3 in Figure 8, which was calculated by comparing rows 6 and 8 in the data table. If a point estimate of the parameters is used, then all four approaches produce exactly the same inferences. Another option is that people represent uncertainty about all of the parameters based on the amount of data experienced. If this second approach is used, then Representation 1 will make somewhat weaker inferences than Representations 2 and 3, because Representation 1 requires inferring an additional parameter. Additionally, people may have prior beliefs about causal strengths that may bias the learning and inference process. For example, Lu et al. (2008) argued that people believe causes to be sparse and strong. Given the data in Figure 8, the sparse and strong priors pull the strengths downward; instead of a strength of .50, the sparse and strong priors would produce a strength estimate of .43, and with more data the estimate gets closer to .50. In contrast, Yeung and Griffiths (2015) found that people have priors such that they believe that most candidate causes are very strong. If people had such priors it would result in causal strength estimates above .50. Priors on strength would have a down-stream influence on inference; the stronger the causal strength beliefs, the stronger the inferences should be.
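The difference between a point estimate and an estimate that reflects a prior and the amount of data can be illustrated with a standard Beta-Binomial update. This sketch is not the sparse-and-strong prior of Lu et al. (2008) nor the prior of Yeung and Griffiths; it assumes a generic Beta(a, b) prior (uniform by default) on the parameter P(y=1|x=0,z=0), and the function name is hypothetical. The counts (4 of 12 trials) correspond to the 1/3 point estimate reported for Figure 8.

```python
from fractions import Fraction as F

def posterior_mean(successes, trials, prior_a=1, prior_b=1):
    """Mean of the Beta posterior after observing successes out of trials."""
    return F(successes + prior_a, trials + prior_a + prior_b)

point_estimate = F(4, 12)               # relative frequency, equals 1/3
small_sample = posterior_mean(4, 12)    # pulled toward the prior mean of 1/2
large_sample = posterior_mean(400, 1200)  # same proportion, 100x the data

print(point_estimate, small_sample, large_sample)
```

The prior's pull shrinks as data accumulate, which mirrors the qualitative point in the text that estimates under a prior approach the data-based value with more experience; the specific numbers reported for the sparse and strong prior would require that prior's own functional form.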
Third, people often have prior beliefs about the causal network. Lu et al.’s sparse and strong prior suggests that people believe that fewer causes are more likely than many causes (Lu et al., 2008). In a related vein, Meder et al. (2014) proposed that when performing an inference, even if told a causal structure, people may entertain the possibility that another causal structure could actually be the true structure, which can influence the judgment. In particular, Meder et al. told participants the structure [X→Y], had them observe contingency data so that they could learn the statistical relation between X and Y, and then had them make an inference of P(x=1|y=1). They found evidence that when the causal strength of X on Y is fairly weak, people may not believe the structure [X→Y] and instead entertain the possibility that X and Y may be unrelated. This general approach, that people may entertain the possibility that the causal structure presented by the experimenter may not actually be the true causal structure, has also been used to explain violations of the Markov assumption (see Section 3.2). One problem with this account, however, is that when there are more than two variables it is unclear what set of alternative structures is entertained, and considering multiple possibilities would quickly become cognitively unwieldy.
In sum, allowing for the possibility that people think about multiple possible strengths, functional forms, and causal structures makes the CBN framework very flexible, and on a case-by-case level it seems plausible that people may actually have priors for any of these aspects of the network. However, incorporating all of these priors makes the
reasoning task much harder than any of the options in Figure 8, and it seems unlikely that people are always engaged in reasoning with all these priors simultaneously. Thus, it will be important to understand when people make use of the priors and how well they incorporate priors with observed data for making inferences.
4 Final Questions, Future Directions, and Conclusions
Throughout the chapter I have highlighted questions and future directions. In this section I repeat some of those questions and add some new ones. I believe that these questions are critical for having a thorough and accurate understanding of human causal learning and reasoning.
1) Though recently there have been more attempts to explore other functional forms, the vast majority of research on the CBN framework has investigated binary variables that combine through a Noisy-OR function. There has been very little theorizing about what causal strength means, for example, when causes and/or effects are multilevel (Pacer & Griffiths, 2011; Rottman, 2016; White, 2001). For example, is the human interpretation of causal strength for multilevel (e.g., Gaussian) variables analogous to effect size measures for linear regression? What is the relation between function learning and causal strength learning? Do people face any challenges or use different heuristics when learning causal structures from multilevel rather than binary variables? In sum, causal reasoning is extremely diverse, and it will be critical to broaden our experimental paradigms to capture this diversity.
2) One of the goals of cognitive psychology is to understand the representations that people use for thought. As Figure 8 demonstrates, there are multiple possible representations for how people reason about causal structures, and many of these representations make exactly the same (or very similar) predictions. Clarifying which sorts of representations are used will help develop a more precise descriptive account of causal reasoning.
3) So far the CBN framework has been framed as a computational-level theory of human causal reasoning. However, the computations involved in inferring a causal structure from data, or making inferences on a network (e.g., Figure 8), are very complex. Thus, an important goal is to develop a process-level account of how people actually perform these inferences. A number of theorists have proposed various heuristics for causal learning, which often come close to the optimal solution, and often have equal or better fit to participants’ inferences (Bramley et al., 2015; Coenen et al., 2015; Lagnado & Sloman, 2004; Rottman & Keil, 2012; Rottman et al., 2014; Steyvers et al., 2003). Yet so far this heuristics approach has been disconnected and has often taken a back seat to proof-of-concept demonstrations that the CBN framework can model human learning. More attention to how these inferences are actually made through a process-level account will help provide psychological insight into this fascinating and complex reasoning process.
4) Lastly, all of the studies on human causal reasoning give participants toy examples and sample data in short periods of time. It is unclear how well this research strategy captures actual causal reasoning in the real world, which involves long-term accumulation of data and many more variables. An ideal approach would be to find a real-world domain involving causes and effects that includes records of experiences. For example, a highly accurate electronic medical records system might in the future permit us to track a doctor’s experiences with all the variables in Figure 1 to see if the doctor’s judgments fit closely with his or her personal experiences.
The causal Bayesian network framework has entirely reshaped the landscape of research on causality to the point that it is now rare to see articles that investigate causal learning without mentioning the CBN framework. Whereas research on causal reasoning used to be primarily about inferences between a single cause and effect, now the central questions are about larger causal structures. Thus, the new focus is on how people learn the structure and determine causal directionality, how people simplify complex structures into smaller units using the Markov assumption, and how various beliefs captured in the network, such as the integration function, influence learning and reasoning. Even older questions such as elemental causal learning have benefitted tremendously from the CBN framework by reinterpreting strength as a parameter in the causal network.
On the descriptive side, the most important fact about human causal reasoning is that humans are remarkably good causal reasoners; we adeptly incorporate many different beliefs when learning and reasoning (e.g., integration functions, autocorrelation, causal directionality), we can learn about quite complicated causal relations (e.g., unobserved causes that interact with observed causes), and we often do so with remarkably little data. The introduction of the CBN framework has revealed many of these capacities that were previously unknown and has also raised important questions, such as how to develop a process-level account of these sophisticated inferences, how closely the representations of the CBN framework map onto the actual representations that we use for causal reasoning, and how causal reasoning occurs with more diverse sorts of stimuli and in more naturalistic environments. Answering these questions will not only help us develop a more accurate and complete picture of human causal reasoning but may also identify ways to help people become even better causal reasoners.
References:
Baetu, I., & Baker, A. G. (2009). Human judgments of positive and negative causal chains. Journal of Experimental Psychology: Animal Behavior Processes, 35(2), 153–68. http://doi.org/10.1037/a0013764
Beckers, T., & Miller, R. R. (2005). Outcome additivity and outcome maximality influence cue competition in human causal learning, 31(2), 238–249. http://doi.org/10.1037/0278-7393.31.2.238
Bes, B., Sloman, S. A., Lucas, C. G., & Raufaste, E. (2012). Non-bayesian inference: causal structure trumps correlation. Cognitive Science, 36(7), 1178–203. http://doi.org/10.1111/j.1551-6709.2012.01262.x
Bramley, N. R., Lagnado, D. A., & Speekenbrink, M. (2015). Conservative forgetful scholars: How people learn causal structure through sequences of interventions. Journal of Experimental Psychology: Learning, Memory, and Cognition, 41(3), 708–731. http://doi.org/10.1037/xlm0000061
Cartwright, N. (1989). Nature’s capacities and their measurement. Oxford, UK: Clarendon Press.
Cheng, P. W. (1997). From covariation to causation: A causal power theory. Psychological Review, 104(2), 367–405. http://doi.org/10.1037//0033-295X.104.2.367
Cheng, P. W., & Novick, L. R. (1992). Covariation in natural causal induction. Psychological Review, 99(2), 365–82. http://doi.org/10.1037/0033-295X.99.2.365
Coenen, A., Rehder, B., & Gureckis, T. (2015). Strategies to intervene on causal systems are adaptively selected. Cognitive Psychology, 79, 102–133. http://doi.org/10.1016/j.cogpsych.2015.02.004
Costello, F., & Watts, P. (2014). Surprisingly Rational: Probability Theory Plus Noise Explains Biases in Judgment, 121(3), 463–480.
Danks, D. (2003). Equilibria of the Rescorla–Wagner model. Journal of Mathematical Psychology, 47(2), 109–121. http://doi.org/10.1016/S0022-2496(02)00016-0
Eells, E. (1991). Probabilistic causality. Cambridge, UK: Cambridge University Press.
Fernbach, P. M., & Darlow, A. (2010). Causal Conditional Reasoning and Conditional Likelihood. In Proceedings of the 32nd Annual Conference of the Cognitive Science Society (p. 305). Austin, TX: Cognitive Science Society. http://doi.org/10.1177/0272989X9101100408
Fernbach, P. M., Darlow, A., & Sloman, S. A. (2010). Neglect of alternative causes in predictive but not diagnostic reasoning. Psychological Science, 21(3), 329–36. http://doi.org/10.1177/0956797610361430
Fernbach, P. M., Darlow, A., & Sloman, S. A. (2011). Asymmetries in predictive and diagnostic reasoning. Journal of Experimental Psychology: General, 140(2), 168–85. http://doi.org/10.1037/a0022100
Fernbach, P. M., & Rehder, B. (2013). Cognitive shortcuts in causal inference. Argument & Computation, 4(1), 64–88. http://doi.org/10.1080/19462166.2012.682655
Fernbach, P. M., & Sloman, S. A. (2009). Causal learning with local computations. Journal of Experimental Psychology: Learning, Memory, and Cognition, 35(3), 678–93. http://doi.org/10.1037/a0014928
Ghahramani, Z. (1998). Learning dynamic Bayesian networks. In Adaptive processing of sequences and data structures (Vol. 1387, pp. 168–197). Springer Berlin Heidelberg.
Glymour, C. (2001). The Mind’s Arrows. Cambridge, MA: MIT Press.
Goedert, K. M., & Spellman, B. A. (2005). Nonnormative discounting: There is more to cue interaction effects than controlling for alternative causes. Animal Learning & Behavior, 33(2), 197–210. http://doi.org/10.3758/BF03196063
Goodie, A. S., Williams, C. C., & Crooks, C. L. (2003). Controlling for causally relevant third variables. The Journal of General Psychology, 130(4), 415–30. http://doi.org/10.1080/00221300309601167
Gopnik, A., Glymour, C., Sobel, D. M., Schulz, L. E., Kushnir, T., & Danks, D. (2004). A theory of causal learning in children: Causal maps and Bayes nets. Psychological Review, 111(1), 3–32. http://doi.org/10.1037/0033-295X.111.1.3
Griffiths, T. L., & Tenenbaum, J. (2005). Structure and strength in causal induction. Cognitive Psychology, 51(4), 334–84. http://doi.org/10.1016/j.cogpsych.2005.05.004
Griffiths, T. L., & Tenenbaum, J. (2009). Theory-based causal induction. Psychological Review, 116(4), 661–716. http://doi.org/10.1037/a0017201
Hagmayer, Y., & Waldmann, M. R. (2000). Simulating Causal Models: The Way to Structural Sensitivity. In L. R. Gleitman & A. K. Joshi (Eds.), Proceedings of the Twenty-Second Annual Conference of the Cognitive Science Society (pp. 214–219). Austin, TX: Cognitive Science Society.
Hayes, B. K., Hawkins, G. E., & Newell, B. R. (2015). Consider the Alternative: The Effects of Causal Knowledge on Representing and Using Alternative Hypotheses in Judgments Under Uncertainty. Journal of Experimental Psychology: Learning, Memory, and Cognition, 41(6). http://doi.org/10.1037/xlm0000205
Hayes, B. K., Hawkins, G. E., Newell, B. R., Pasqualino, M., & Rehder, B. (2014). The role of causal models in multiple judgments under uncertainty. Cognition, 133(3), 611–620. http://doi.org/10.1016/j.cognition.2014.08.011
Hayes, B. K., Newell, B. R., & Hawkins, G. E. (2013). Causal model and sampling approaches to reducing base rate neglect. Proceedings of the 35th Annual Conference of the Cognitive Science Society, 567–572.
Heckerman, D. (1998). A tutorial on learning with Bayesian networks. In M. I. Jordan (Ed.), Learning in Graphical Models (pp. 301–354). Springer.
Hiddleston, E. (2005). A causal theory of counterfactuals. Nous, 39, 632–657. http://doi.org/10.1111/j.0029-4624.2005.00542.x
Hilbert, M. (2012). Toward a synthesis of cognitive biases: how noisy information processing can bias human decision making. Psychological Bulletin, 138(2), 211–37. http://doi.org/10.1037/a0025940
Jenkins, H. M., & Ward, W. C. (1965). Judgment of contingency between responses and outcomes. Psychological Monographs: General and Applied, 79(1), 1–17.
Jones, E. (1979). The rocky road from acts to dispositions. The American Psychologist, 34(2), 107–17.
Kelley, H. H. (1972). Causal schemata and the attribution process. In E. Jones, D. E. Kanouse, H. H. Kelley, R. E. Nisbett, S. Valins, & B. Weiner (Eds.), Attribution: Perceiving the Causes of Behavior (pp. 151–174). Morristown, NJ: General Learning Press.
Krynski, T. R., & Tenenbaum, J. (2007). The role of causality in judgment under uncertainty. Journal of Experimental Psychology: General, 136(3), 430–50. http://doi.org/10.1037/0096-3445.136.3.430
Lagnado, D. A., & Sloman, S. A. (2004). The advantage of timely intervention. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30(4), 856–76. http://doi.org/10.1037/0278-7393.30.4.856
Lagnado, D. A., & Sloman, S. A. (2006). Time as a guide to cause. Journal of Experimental Psychology: Learning, Memory, and Cognition, 32(3), 451–60. http://doi.org/10.1037/0278-7393.32.3.451
Lagnado, D. A., Waldmann, M. R., Hagmayer, Y., & Sloman, S. A. (2007). Beyond covariation: cues to causal structure. In A. Gopnik & L. Schulz (Eds.), Causal learning: Psychology, philosophy, and computation (pp. 154–172). Oxford: Oxford University Press.
Lu, H., Yuille, A. L., Liljeholm, M., Cheng, P. W., & Holyoak, K. J. (2008). Bayesian generic priors for causal learning. Psychological Review, 115(4), 955–84. http://doi.org/10.1037/a0013256
Lucas, C. G., & Griffiths, T. L. (2010). Learning the form of causal relationships using hierarchical bayesian models. Cognitive Science, 34(1), 113–47. http://doi.org/10.1111/j.1551-6709.2009.01058.x
Mayrhofer, R., & Waldmann, M. R. (2011). Heuristics in Covariation-based Induction of Causal Models: Sufficiency and Necessity Priors. In C. H. Carlson & T. Shipley (Eds.), Proceedings of the 33rd Annual Conference of the Cognitive Science Society (pp. 3110–3115). Austin, TX.
Mayrhofer, R., & Waldmann, M. R. (2015). Agents and causes: Dispositional intuitions as a guide to causal structure. Cognitive Science, 39(1), 65–95. http://doi.org/10.1111/cogs.12132
McCormack, T., Frosch, C., Patrick, F., & Lagnado, D. A. (2015). Temporal and Statistical Information in Causal Structure Learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 41(2), 395–416.
McNair, S., & Feeney, A. (2014). When does information about causal structure improve statistical reasoning? Quarterly Journal of Experimental Psychology (2006), 67(4), 625–45. http://doi.org/10.1080/17470218.2013.821709
McNair, S., & Feeney, A. (2015). Whose statistical reasoning is facilitated by a causal structure intervention? Psychonomic Bulletin & Review, 22(1), 258–264. http://doi.org/10.3758/s13423-014-0645-y
Meder, B., Gerstenberg, T., Hagmayer, Y., & Waldmann, M. R. (2010). Observing and Intervening: Rational and Heuristic Models of Causal Decision Making. The Open Psychology Journal, (3), 119–135.
Meder, B., & Hagmayer, Y. (2009). Causal induction enables adaptive decision making. In N. A. Taatgen & H. van Rijn (Eds.), Proceedings of the 31st Annual Conference of the Cognitive Science Society (pp. 1651–1656). Austin, TX: Cognitive Science Society.
Meder, B., Hagmayer, Y., & Waldmann, M. R. (2008). Inferring interventional predictions from observational learning data. Psychonomic Bulletin & Review, 15(1), 75–80. http://doi.org/10.3758/PBR.15.1.75
Meder, B., Hagmayer, Y., & Waldmann, M. R. (2009). The role of learning data in causal reasoning about observations and interventions. Memory & Cognition, 37(3), 249–64. http://doi.org/10.3758/MC.37.3.249
Meder, B., Mayrhofer, R., & Waldmann, M. R. (2014). Structure Induction in Diagnostic Causal Reasoning. Psychological Review, 121(3), 277–301. http://doi.org/10.1037/a0035944
Morris, M. W., & Larrick, R. P. (1995). When one cause casts doubt on another: A normative analysis of discounting in causal attribution. Psychological Review, 102(2), 331–355. http://doi.org/10.1037/0033-295X.102.2.331
Murphy, K. P. (2002). Dynamic bayesian networks: representation, inference and learning. University of California, Berkeley.
Novick, L. R., & Cheng, P. W. (2004). Assessing interactive causal influence. Psychological Review, 111(2), 455–85. http://doi.org/10.1037/0033-295X.111.2.455
Oppenheimer, D. M., & Monin, B. (2009). Investigations in spontaneous discounting. Memory & Cognition, 37(5), 608–614. http://doi.org/10.3758/MC.37.5.608
Oppenheimer, D. M., Tenenbaum, J., & Krynski, T. R. (2013). Categorization as Causal Explanation. Discounting and Augmenting in a Bayesian Framework. In Psychology of Learning and Motivation - Advances in Research and Theory (Vol. 58, pp. 203–231). Elsevier. http://doi.org/10.1016/B978-0-12-407237-4.00006-2
Pacer, M. D., & Griffiths, T. L. (2011). A rational model of causal induction with continuous causes. In Advances in Neural Information Processing Systems (pp. 2384–2392).
Park, J., & Sloman, S. A. (2013). Mechanistic beliefs determine adherence to the Markov property in causal reasoning. Cognitive Psychology, 67(4), 186–216. http://doi.org/10.1016/j.cogpsych.2013.09.002
Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann Publishers.
Pearl, J. (1996). Structural and Probabilistic Causality. In D. Shanks, K. J. Holyoak, & D. L. Medin (Eds.), Psychology of learning and motivation: Causal Learning (Vol. 34, pp. 393–435). San Diego: Academic Press.
Pearl, J. (2000). Causality: Models, Reasoning, and Inference. Cambridge, UK: Cambridge University Press.
Pennington, N., & Hastie, R. (1993). Reasoning in explanation-based decision making. Cognition, 49, 123–163.
Peterson, C. R., & Beach, L. R. (1967). Man as an intuitive statistician. Psychological Bulletin, 68(1), 29–46.
Rehder, B. (2014). Independence and Dependence in Human Causal Reasoning. Cognitive Psychology, 72. http://doi.org/10.1016/j.cogpsych.2014.02.002
Rehder, B., & Burnett, R. C. (2005). Feature inference and the causal structure of categories. Cognitive Psychology, 50(3), 264–314. http://doi.org/10.1016/j.cogpsych.2004.09.002
Rescorla, R. A., & Wagner, A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In A. H. Black & W. F. Prokasy (Eds.), Classical conditioning II: Current research and theory (pp. 64–99). New York: Appleton-Century-Crofts.
Rips, L. J. (2010). Two causal theories of counterfactual conditionals. Cognitive Science, 34(2), 175–221. http://doi.org/10.1111/j.1551-6709.2009.01080.x
Rottman, B. M. (2016). Searching for the best cause: Roles of mechanism beliefs, autocorrelation, and exploitation. Journal of Experimental Psychology: Learning, Memory, and Cognition. http://doi.org/10.1037/xlm0000244
Rottman, B. M., & Ahn, W. (2009). Causal learning about tolerance and sensitization. Psychonomic Bulletin & Review, 16(6), 1043–9. http://doi.org/10.3758/PBR.16.6.1043
Rottman, B. M., & Ahn, W. (2011). Effect of grouping of evidence types on learning about interactions between observed and unobserved causes. Journal of Experimental Psychology: Learning, Memory, and Cognition, 37(6), 1432–48. http://doi.org/10.1037/a0024829
Rottman, B. M., & Hastie, R. (2014). Reasoning about causal relationships: Inferences on causal networks. Psychological Bulletin, 140(1), 109–39. http://doi.org/10.1037/a0031903
Rottman, B. M., & Keil, F. C. (2012). Causal structure learning over time: Observations and interventions. Cognitive Psychology, 64(1-2), 93–125. http://doi.org/10.1016/j.cogpsych.2011.10.003
Rottman, B. M., Kominsky, J. F., & Keil, F. C. (2014). Children Use Temporal Cues to Learn Causal Directionality. Cognitive Science, 38(3), 1–25. http://doi.org/10.1111/cogs.12070
Schulz, L. E., Gopnik, A., & Glymour, C. (2007). Preschool children learn about causal structure from conditional interventions. Developmental Science, 10(3), 322–32. http://doi.org/10.1111/j.1467-7687.2007.00587.x
Shanks, D. R., & Dickinson, A. (1987). Associative accounts of causality judgment. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 21, pp. 229–261). San Diego: Academic Press.
Sloman, S. A., & Lagnado, D. A. (2005). Do We “do”? Cognitive Science, 29, 5–39. http://doi.org/10.1207/s15516709cog2901_2
Soo, K., & Rottman, B. M. (2014). Learning Causal Direction from Transitions with Continuous and Noisy Variables. In P. Bello, M. Guarini, M. McShane, & B. Scassellati (Eds.), Proceedings of the 36th Annual Conference of the Cognitive Science Society. Austin, TX: Cognitive Science Society.
Spellman, B. A., Price, C. M., & Logan, J. M. (2001). How two causes are different from one: The use of (un)conditional information in Simpson’s paradox. Memory & Cognition, 29(2), 193–208. http://doi.org/10.3758/BF03194913
Spirtes, P., Glymour, C., & Scheines, R. (1993). Causation, Prediction, and Search. N.Y.: Springer-Verlag.
Spirtes, P., Glymour, C., & Scheines, R. (2000). Causation, prediction, and search (2nd ed.). New York, N.Y.: MIT Press.
Steyvers, M., Tenenbaum, J., Wagenmakers, E., & Blum, B. (2003). Inferring causal networks from observations and interventions. Cognitive Science, 27(3), 453–489. http://doi.org/10.1016/S0364-0213(03)00010-7
Sussman, A., & Oppenheimer, D. (2011). A Causal Model Theory of Judgment. In C. Hölscher, Carlson & T. Shipley (Eds.), Proceedings of the 33rd Annual Conference of the Cognitive Science Society (pp. 1703–1708). Austin, TX: Cognitive Science Society.
Thornley, S. (2013). Using Directed Acyclic Graphs for Investigating Causal Paths for Cardiovascular Disease. Journal of Biometrics & Biostatistics, 04(05). http://doi.org/10.4172/2155-6180.1000182
Tversky, A., & Kahneman, D. (1980). Causal Schemata in Judgments Under Uncertainty. In M. Fishbein (Ed.), Progress in social psychology (pp. 49–72). Hillsdale, New Jersey: Erlbaum.
Waldmann, M. R. (1996). Knowledge-based causal induction. In D. R. Shanks, K. L. Holyoak, & D. L. Medin (Eds.), The psychology of learning and motivation (Vol. 34: Causal, pp. 47–88). San Diego.
Waldmann, M. R. (2000). Competition among causes but not effects in predictive and diagnostic learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26(1), 53–76. http://doi.org/10.1037//0278-7393.26.1.53
Waldmann, M. R. (2007). Combining versus analyzing multiple causes: how domain assumptions and task context affect integration rules. Cognitive Science, 31(2), 233–56. http://doi.org/10.1080/15326900701221231
Waldmann, M. R., & Hagmayer, Y. (2001). Estimating causal strength: the role of structural knowledge and processing effort. Cognition, 82(1), 27–58. http://doi.org/10.1016/S0010-0277(01)00141-X
Waldmann, M. R., & Hagmayer, Y. (2005). Seeing versus doing: two modes of accessing causal knowledge. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31(2), 216–27. http://doi.org/10.1037/0278-7393.31.2.216
Waldmann, M. R., & Holyoak, K. J. (1992). Predictive and diagnostic learning within causal models: Asymmetries in cue competition. Journal of Experimental Psychology: General, 121(2), 222–236. http://doi.org/10.1037/0096-3445.121.2.222
Walsh, C., & Sloman, S. A. (2008). Updating beliefs with causal models: Violations of screening off. Memory and Mind: A Festschrift for Gordon H. Bower, 345–358.
White, P. A. (2001). Causal judgments about relations between multilevel variables. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27(2), 499–513.
White, P. A. (2006). The causal asymmetry. Psychological Review, 113(1), 132–47. http://doi.org/10.1037/0033-295X.113.1.132
Yeung, S., & Griffiths, T. L. (2011). Estimating human priors on causal strength. In Proceedings of the 33rd Annual Conference of the Cognitive Science Society (pp. 1709–1714). Austin, TX: Cognitive Science Society.
Yeung, S., & Griffiths, T. L. (2015). Identifying expectations about the strength of causal relationships. Cognitive Psychology, 76, 1–29. http://doi.org/10.1016/j.cogpsych.2014.11.001