Classical and quantum causal inference (An introduction to techniques and open questions) Sally Shrapnel

Classical and quantum causal inference

(An introduction to techniques and open questions)

Sally Shrapnel

Classical causal inference

...the Laplacian conception is more in tune with human intuitions. The few esoteric quantum experiments that conflict with the predictions of the Laplacian conception evoke surprise and disbelief, and they demand scientists give up deeply entrenched intuitions about locality and causality. Our objective is to preserve, explicate and satisfy - not destroy - those intuitions. (Pearl, 2009) [26]

Classical causal inference

Glymour on Bell experiments:

...real experiments ... create associations that have no causal explanation consistent with the Markov assumption, and the Markov assumption must be applied to obtain that conclusion. You can say there is no causal explanation of the phenomenon, or that there is a causal explanation but it doesn't satisfy the Markov assumption. (Glymour, 2006) [124]

Classical causal inference

Current quantum theory and experiments show that this assumption does not hold at the quantum level, where there are nonlocal variables that appear to have a direct causal effect on others. While these cases do not imply that the causal Markov assumption does not hold, they do suggest that we may see more violations of this assumption at the quantum level. However, in practice, the causal Markov assumption appears to be a reasonable working assumption in most macroscopic systems.

Outline

1. Classical causal inference

2. Quantum causal inference

3. Great in theory, but what about in practice?

4. Machine learning to the rescue?

Causal inference

[Figure: p(Ph, A) and p(Ph, B) (Peters et al., 2018)]

Causal structure matters

[Figure: Peters et al., 2018]

Interventionist causation

Causation is NOT simply correlation

Correlations do not enable us to distinguish between effective and ineffective strategies that bring about specific ends (Cartwright, 1979)

"Intervention invariant model"


Gold standard

Interventionist causation

Intervening and randomisation often not possible

• Look for footprints in the data to give us clues to the topology of the causal structure

• Need principles or assumptions to go from joint distribution over observed variables to causal structure.

Causal inference

Reichenbach's principle: if X and Y are statistically dependent, then

1. there exists either a common cause Z or a direct causal relationship between X and Y, and

2. Z screens off X and Y from each other ($X \perp Y \mid Z$)

[Figure: three candidate structures: X ← Z → Y, X → Y, X ← Y]

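Causal inference

The joint distribution is the same in all three cases: causal structure is captured by different possible factorisations

$X \leftarrow Z \rightarrow Y$: $p(X, Y) = \sum_Z p(X \mid Z)\, p(Y \mid Z)\, p(Z)$

$X \leftarrow Y$: $p(X, Y) = p(Y)\, p(X \mid Y)$

$X \rightarrow Y$: $p(X, Y) = p(X)\, p(Y \mid X)$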
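A small numerical check can make this concrete. The sketch below assumes a binary common-cause model with made-up probability tables (all numbers and names are illustrative, not from the talk) and confirms that the observed joint can be rewritten in either direct-cause form, so the joint alone does not single out a structure.

```python
from itertools import product

# Hypothetical binary common-cause model X <- Z -> Y (numbers are illustrative).
p_z = {0: 0.3, 1: 0.7}
p_x_given_z = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}  # p_x_given_z[z][x]
p_y_given_z = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.1, 1: 0.9}}  # p_y_given_z[z][y]

# Common-cause factorisation: p(x, y) = sum_z p(x|z) p(y|z) p(z)
joint = {
    (x, y): sum(p_x_given_z[z][x] * p_y_given_z[z][y] * p_z[z] for z in (0, 1))
    for x, y in product((0, 1), repeat=2)
}

# Marginals of the observed pair.
p_x = {x: joint[(x, 0)] + joint[(x, 1)] for x in (0, 1)}
p_y = {y: joint[(0, y)] + joint[(1, y)] for y in (0, 1)}

# The same joint written as p(x) p(y|x) and as p(y) p(x|y).
joint_x_to_y = {(x, y): p_x[x] * (joint[(x, y)] / p_x[x]) for x, y in joint}
joint_y_to_x = {(x, y): p_y[y] * (joint[(x, y)] / p_y[y]) for x, y in joint}


def max_gap(a, b):
    """Largest absolute difference between two tables with the same keys."""
    return max(abs(a[k] - b[k]) for k in a)
```

Both `max_gap(joint, joint_x_to_y)` and `max_gap(joint, joint_y_to_x)` vanish up to rounding, which is why extra principles are needed to go from a joint distribution to a causal graph.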

Causal graphical model

Causal model = graph + causal parameters

Causal parameters = distribution of each variable conditioned on its parents

[Figure: DAG with edges X1 → X3, X3 → X4, X3 → X5, X2 → X5]

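Structural causal model

$X_1 = f_1(N_1)$

$X_2 = f_2(N_2)$

$X_3 = f_3(X_1, N_3)$

$X_4 = f_4(X_3, N_4)$

$X_5 = f_5(X_3, X_2, N_5)$

[Figure: the DAG over X1, ..., X5 with an independent noise node N1, ..., N5 attached to each variable]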
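The structural equations translate directly into a sampling procedure. The following is a minimal sketch, assuming linear mechanisms and standard normal noise; the specific coefficients and the function names `scm` and `sample` are hypothetical choices made here, since the slides leave the functions f1, ..., f5 unspecified.

```python
import random


def scm(n1, n2, n3, n4, n5):
    """Map independent noise values to (X1, ..., X5) following the graph."""
    x1 = n1                      # X1 = f1(N1)
    x2 = n2                      # X2 = f2(N2)
    x3 = 2.0 * x1 + n3           # X3 = f3(X1, N3), assumed linear
    x4 = -x3 + n4                # X4 = f4(X3, N4), assumed linear
    x5 = x3 + 0.5 * x2 + n5      # X5 = f5(X3, X2, N5), assumed linear
    return x1, x2, x3, x4, x5


def sample(rng, n):
    """Draw n joint samples by pushing i.i.d. standard normal noise through the SCM."""
    return [scm(*(rng.gauss(0.0, 1.0) for _ in range(5))) for _ in range(n)]


samples = sample(random.Random(0), 1000)
```

Each variable is computed only from its parents and its own noise term, so the order of the assignments mirrors a topological order of the graph.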

Markov Condition

[Figure: the DAG over X1, ..., X5 with noise nodes; parents and non-descendants of a node are labelled]

Graphically, children are conditionally independent of their non-descendants, given their parents.

$(X \perp Y \mid Z)_G \Rightarrow (X \perp Y \mid Z)_P$
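As an illustration, the chain X1 → X3 → X4 from the example graph implies that X4 is independent of X1 given its parent X3. The sketch below assumes linear Gaussian mechanisms (coefficients are arbitrary) and uses partial correlation as the conditional independence measure, which is only appropriate under that linear Gaussian assumption.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)


def partial_corr(a, b, c):
    """Correlation of a and b after linearly controlling for c."""
    r_ab, r_ac, r_bc = pearson(a, b), pearson(a, c), pearson(b, c)
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


rng = random.Random(0)
x1 = [rng.gauss(0, 1) for _ in range(20000)]
x3 = [0.8 * v + rng.gauss(0, 1) for v in x1]   # X1 -> X3
x4 = [1.5 * v + rng.gauss(0, 1) for v in x3]   # X3 -> X4

marginal = pearson(x1, x4)              # clearly non-zero: X1 and X4 are dependent
conditional = partial_corr(x1, x4, x3)  # close to zero: X4 is screened off by X3
```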

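Faithfulness Condition

[Figure: the DAG over X1, ..., X5 with noise nodes; parents and non-descendants of a node are labelled]

Graphically, children are conditionally independent of their non-descendants, given their parents.

Independencies found in the distribution are only those implied by the Markov condition.

Markov: $(X \perp Y \mid Z)_G \Rightarrow (X \perp Y \mid Z)_P$

Faithfulness: $(X \perp Y \mid Z)_G \Leftarrow (X \perp Y \mid Z)_P$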
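Faithfulness fails when mechanisms are fine-tuned so that causal paths cancel. The sketch below uses a hypothetical three-variable linear model (not from the slides) in which the direct path X → Y exactly cancels the indirect path X → Z → Y: X and Y look independent even though X is a cause of Y, and the dependence reappears once Z is conditioned on.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)


rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(20000)]
z = [xi + rng.gauss(0, 1) for xi in x]                    # X -> Z with coefficient +1
y = [zi - xi + rng.gauss(0, 1) for xi, zi in zip(x, z)]   # Z -> Y (+1) and X -> Y (-1)

r_xy = pearson(x, y)  # near zero: an independence not implied by the graph
r_xz, r_yz = pearson(x, z), pearson(y, z)
# Partial correlation of X and Y given Z is clearly non-zero.
r_xy_given_z = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

An algorithm that reads the vanishing marginal correlation as "no edge between X and Y" would recover the wrong graph here, which is the sense in which discovery algorithms rely on faithfulness.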


Interventions

[Figure: the DAG over X1, ..., X5 with noise nodes]

Intervention separates variable from antecedent causes

$X_1 = f_1(N_1)$

$X_2 = f_2(N_2)$

$X_3 = f_3(X_1, N_3)$

$X_4 = f_4(X_3, N_4)$

Causal inference

[Figure: the same DAG with X3 set to 2]

Intervention separates variable from antecedent causes

$X_1 = f_1(N_1)$

$X_2 = f_2(N_2)$

$X_3 = 2$

$X_4 = f_4(X_3, N_4)$
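The difference between conditioning on X3 = 2 and intervening to set X3 = 2 can be simulated. This is a sketch under assumed linear Gaussian mechanisms (the coefficients and the `do_x3` argument are illustrative choices, not the author's): conditioning shifts our estimate of the upstream cause X1, while the intervention leaves X1 untouched and still propagates to X4.

```python
import random


def sample(rng, do_x3=None):
    """One draw of (X1, X2, X3, X4); do_x3 replaces the mechanism f3 by a constant."""
    x1 = rng.gauss(0.0, 1.0)                       # X1 = f1(N1)
    x2 = rng.gauss(0.0, 1.0)                       # X2 = f2(N2)
    n3 = rng.gauss(0.0, 1.0)
    x3 = x1 + n3 if do_x3 is None else do_x3       # X3 = f3(X1, N3), or X3 = 2
    x4 = 2.0 * x3 + rng.gauss(0.0, 1.0)            # X4 = f4(X3, N4)
    return x1, x2, x3, x4


def mean(values):
    values = list(values)
    return sum(values) / len(values)


rng = random.Random(0)
observational = [sample(rng) for _ in range(50000)]
interventional = [sample(rng, do_x3=2.0) for _ in range(50000)]

# Conditioning: keep observational draws where X3 happens to be close to 2.
selected = [s for s in observational if abs(s[2] - 2.0) < 0.1]
x1_given_x3 = mean(s[0] for s in selected)        # shifted: X3 carries information about X1
x1_under_do = mean(s[0] for s in interventional)  # unshifted: X3 no longer listens to X1
x4_under_do = mean(s[3] for s in interventional)  # the downstream mechanism f4 still operates
```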

Why is it causal?

Local CPDs (conditional probability distributions) are generated by autonomous causal mechanisms that exist between parents and children.

Causal structure of model is isomorphic to network of autonomous causal mechanisms.

Interventions bring local distribution under control of experimenter and don't disrupt other mechanisms in the model

This structure has pragmatic value: it tells us how to act to bring about certain ends

Latent variables?

Graph not Markovian and Faithful?

Implicit assumption that this is always a consequence of unmeasured common causes (latent variables)

In principle, locating and measuring such variables ought to restore Markovianity to the model for some causal structure.

Constraint based methods

Algorithms: PC

– Identifies adjacencies via dependencies

– Identifies V-structures (X → Z ← Y)

– Propagation rules to orient remaining edges to avoid cycles

– Identifies Markov equivalent set of DAGs under CMA, CFA, acyclicity and CSA (the causal Markov, causal faithfulness and causal sufficiency assumptions)

Constraint based methods

PROBLEMS

Gives you graph but not the structural equations (can't infer counterfactuals).

Statistical tests accept or reject CI (conditional independence) at a given confidence level, and graph structure is very sensitive to how you set this confidence level.

CI testing is unjustified for finite data (for arbitrarily complex functions)

Propagation rules tend to propagate errors.

What about two variables?

Alternatives to CI testing?

General principles?

Key idea to retain is independence of causal mechanisms:

"Statistical correlations between variables in a system are the result of a causal generative process that is composed of autonomous modules that do not inform or influence each other."

For bivariate case this reduces to independence of cause and mechanism relating cause to effect.

$p(c, e) = p(c)\, p(e \mid c)$ $(c \rightarrow e)$

Asymmetry of cause and effect

$p(x, y) = p(x)\, p(y \mid x)$ $(x \rightarrow y)$

$p(x, y) = p(y)\, p(x \mid y)$ $(y \rightarrow x)$

𝑦 = 𝑥@ + 𝑥 + 𝑁

Want low structural variability of mechanism for different input values of cause

Alternatives?

Make assumptions about data generating causal mechanisms:

Bivariate

1. Additive noise model (ANM)
• assume nonlinear function, additive noise, noise independent of cause
• Admits only a single unidentifiable case: linear function with Gaussian cause and noise

2. Post-nonlinear model (PNL)
• Adds an extra nonlinear function which is invertible
• More non-identifiable cases

3. Gaussian Process Inference model (GPI)

4. Algorithmic complexity: exploit asymmetries in factorisation according to Kolmogorov complexity measures (selects simplest explanation)
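The additive noise idea can be illustrated with a deliberately crude direction test. The sketch assumes a cubic mechanism y = x^3 + x + N with uniform cause and Gaussian noise (an assumption made for this example), fits a piecewise-constant regression in each direction, and scores how much the residual spread varies across the input range. This heteroscedasticity score is a simple stand-in for a proper independence test between residual and input, not the procedure used in the ANM literature.

```python
import random
import statistics


def dependence_score(inputs, outputs, n_bins=20):
    """Relative variation of the residual spread across equal-count input bins.

    Within each bin the bin mean acts as the regression estimate, so the
    population standard deviation of the outputs is the residual spread.
    A low score means the residuals look the same for all input values.
    """
    pairs = sorted(zip(inputs, outputs))
    size = len(pairs) // n_bins
    spreads = []
    for b in range(n_bins):
        chunk = [out for _, out in pairs[b * size:(b + 1) * size]]
        spreads.append(statistics.pstdev(chunk))
    return statistics.pstdev(spreads) / statistics.mean(spreads)


rng = random.Random(1)
x = [rng.uniform(-2, 2) for _ in range(20000)]       # cause
y = [xi ** 3 + xi + rng.gauss(0, 1) for xi in x]     # effect = f(cause) + noise

score_x_to_y = dependence_score(x, y)  # residuals of y given x
score_y_to_x = dependence_score(y, x)  # residuals of x given y
inferred = "x -> y" if score_x_to_y < score_y_to_x else "y -> x"
```

In the causal direction the residual is just the additive noise, so its spread barely depends on x; in the reverse direction the spread of x given y changes strongly with y, and the lower score picks out x → y.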

Alternatives?

Multivariate:

1. Constraint based methods: PC, FCI (relaxes causal sufficiency), RFCI. All require large amounts of data

2. Score based algorithms search model space and minimise a global score
• GES explores graph space using operators "add edge", "remove edge", "reverse edge" and optimises according to BIC.
• FGES is more computationally efficient

3. Hybrid algorithms = constraint + score based
• Max-min Hill climbing (MMHC) builds skeleton using CI tests then performs a greedy hill-climbing search to orient edges
• GFCI uses FGES to sketch graph and then CI to orient edges

4. Exploit asymmetry between cause and effect (LiNGAM) + MORE
• linear functions, non-Gaussian source nodes, additive noise

Crowdsourced causal discovery

Guyon (2013)

Principled methods (with voting): 0.6 accuracy on classification task

Generalised: 0.8 (low level features of joint distribution - 9000!)

Not learning the functional relationships, just directionality

Machine learning methods

[Figure: supervised learning pipeline: Data = x, Model = f(x), p(y|x), Label = y. Optimise and regularise!]

Machine learning methods

Supervised learning

Large databases of sampled variable pairs with known cause-effect relation.

Cast as classification problem.

Task is to generalise solution to unseen datasets (borrow existing regularisation methods from ML).

BUT need to feed it "ground truth" models, motivated by common sense and domain knowledge.

Machine learning methods

Generative models (CiGAN, CausalGAN, CGNN)

SAM "Structural Agnostic Model" with penalised adversarial learning.

Recovers causal graph from data (learns both joint and interventional distributions).

Relaxes CMC, CFC, CSA, acyclicity.

Scales to hundreds of variables.

Estimates both structure of graph and functional causal mechanisms.

90% accurate.

Need to feed it "ground truth" models: sensitive to "Bayes error".

Needs verification!

Classical causal inference

Causal inference from joint distributions is very difficult.

Nice theoretical and conceptual approaches, but many practical difficulties.

Machine learning seems a promising (brute force) approach but at the cost of interpretability (for now).

Causal inference to the exclusion of other domain knowledge is an occupational hazard...

Parachute use to prevent death and major trauma related to gravitational challenge: systematic review of randomised controlled trials. BMJ 2003; 327

Objectives: To determine whether parachutes are effective in preventing major trauma related to gravitational challenge.

Design: Systematic review of randomised controlled trials.

Data sources: Medline, Web of Science, Embase, and the Cochrane Library databases; appropriate internet sites and citation lists.

Study selection: Studies showing the effects of using a parachute during free fall.

Main outcome measure: Death or major trauma, defined as an injury severity score > 15.

Results: We were unable to identify any randomised controlled trials of parachute intervention.

Conclusions: As with many interventions intended to prevent ill health, the effectiveness of parachutes has not been subjected to rigorous evaluation by using randomised controlled trials. Advocates of evidence based medicine have criticised the adoption of interventions evaluated by using only observational data. We think that everyone might benefit if the most radical protagonists of evidence based medicine organised and participated in a double blind, randomised, placebo controlled, crossover trial of the parachute.

Bell inequalities

2006 Glymour:

Bell's theorem is a particular example of a more general theory of causal inference:

Causal graphical models

Glymour, Clark, "Markov Properties and Quantum Experiments," in W. Demopoulos and I. Pitowsky, eds., Physical Theory and Its Interpretation: Essays in Honor of Jeffrey Bub, Springer 2006.

Quantum causal inference

"Any causal model which can reproduce Bell-inequality violations while respecting the observed independences ... will necessarily violate a principle that is at the core of all the best causal discovery algorithms [Faithfulness]."

Quantum causal inference?

Non-locality violates faithfulness.

• Settings are marginally independent: $A \perp B$

• No signalling: $X \perp B \mid A$; $Y \perp A \mid B$

Quantum causal inference

Retrocausality violates faithfulness.

• Settings are marginally independent: $A \perp B$

• No signalling: $X \perp B \mid A$; $Y \perp A \mid B$

Quantum causal inference

Superdeterminism violates faithfulness.

• Settings are marginally independent: $A \perp B$

• No signalling: $X \perp B \mid A$; $Y \perp A \mid B$
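The two constraints can be checked numerically for the textbook singlet-state statistics. The sketch assumes spin measurements along coplanar directions with angles a and b, for which p(x, y | a, b) = (1 - x y cos(a - b)) / 4 with outcomes x, y in {+1, -1}. The CHSH combination and the particular angles are standard choices introduced here for illustration, not taken from the slides.

```python
import math


def p_outcomes(x, y, a, b):
    """Singlet-state probability of outcomes x, y (+1 or -1) for setting angles a, b."""
    return (1 - x * y * math.cos(a - b)) / 4


def marginal_x(x, a, b):
    """p(x | a, b): marginal of one wing, summed over the other wing's outcome."""
    return sum(p_outcomes(x, y, a, b) for y in (+1, -1))


def correlator(a, b):
    """E(a, b) = sum over outcomes of x * y * p(x, y | a, b)."""
    return sum(x * y * p_outcomes(x, y, a, b) for x in (+1, -1) for y in (+1, -1))


def chsh(a0, a1, b0, b1):
    """CHSH combination; any local hidden-variable model obeys |S| <= 2."""
    return correlator(a0, b0) + correlator(a0, b1) + correlator(a1, b0) - correlator(a1, b1)


# No signalling: the marginal of X does not change with the distant setting b.
marginals = [marginal_x(+1, 0.3, b) for b in (0.0, 0.7, 1.9, 3.0)]

# Bell-inequality violation at the standard optimal angles.
s_value = chsh(0.0, math.pi / 2, math.pi / 4, -math.pi / 4)
```

The marginals all equal 1/2 while |S| reaches 2√2, so the outcome X is statistically independent of the distant setting B even though no classical causal model without a fine-tuned link between the wings reproduces the correlations.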

Quantum causal inference

Conclusion: CI testing is not a rich enough methodology to capture quantum causal relations

Three approaches

1. Classical approach. Plug quantum data into classical algorithms. Useful as certification for quantumness. Does it really help with causal explanation?

2. Quantum domain is acausal.

3. Try and develop a more general version of causal inference.

Acausal?

Alternative approach

Assume quantum causal structure is primitive, build a causal theory from the ground up (using mathematical objects from QM), that recovers classical causal structure in a suitable limit.

Write definitions according to the way physicists use quantum theory to make interventionist inferences that distinguish between effective and ineffective strategies (in most general form).

Obeys independence assumptions that underpin classical causal inference.

Costa and Shrapnel (2016) "Quantum causal modelling", NJP, 18, 063062

Shrapnel (2016) "Using interventions to discover quantum causal structure", PhD thesis, http://espace.library.uq.edu.au/view/UQ:411093

Shrapnel (2017) "Discovering Quantum Causal Models", The British Journal for the Philosophy of Science (advance article)

Desiderata

1. Empirical = The formalism should allow for the discovery of causal structure from empirical data (causal structure can act as an oracle for interventions).

2. Explanation = All correlations between empirically derived data should be accounted for via notions of direct, indirect or common cause relations, i.e. there should be no "unexplained" correlations.

3. Classicality = Classical causal models should be recovered as a limiting case of quantum ones.

Quantum causal models

[Figure: causal graph over regions A, B, C, D, E, F, G]

Variables = regions

Values = CP maps

Intervention = instruments

Causal mechanisms = CPTP maps

Causal structure = process

Generalised circuit

Process is generated by autonomous causal mechanisms (channels).

Mechanisms = deterministic unitaries with unmodelled noise

Contextuality?

Shrapnel, Costa and Milburn, NJP (2018); Shrapnel and Costa, Quantum (2018)

$p(j) = \mathrm{Tr}(E_j \rho)$

Unique!
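For readers less familiar with the rule $p(j) = \mathrm{Tr}(E_j \rho)$, here is a minimal worked instance for a single qubit. The density matrix and the two-outcome measurement in the computational basis are hypothetical examples chosen here; the helper names are likewise illustrative.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))


# Hypothetical qubit state: Hermitian, unit trace, positive determinant.
rho = [[0.75, 0.25j], [-0.25j, 0.25]]

# Measurement effects E_0 = |0><0| and E_1 = |1><1|, which sum to the identity.
effects = {0: [[1, 0], [0, 0]], 1: [[0, 0], [0, 1]]}

# Born rule: p(j) = Tr(E_j rho); the trace is real for Hermitian E_j and rho.
probs = {j: trace(matmul(e, rho)).real for j, e in effects.items()}
```

Here `probs[0]` and `probs[1]` are the diagonal entries of `rho`, and they sum to one because the effects sum to the identity.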

Quantum causal models

Assumptions:

• Independence of interventions and mechanisms.

• Do-calculus: ignore incoming state and re-prepare outgoing state.

• Markovianity: set of linear constraints on the process.

• Faithfulness: no fine-tuned mechanisms (measure theoretic sense).

Quantum causal models

Causal model is "intervention invariant"

Non-Markovian? (unmodelled (latent) common cause) - extending the model to include such nodes restores the Markov property to the causal graph

Can prove that all classical causal models can be given a quantum representation (all instruments fixed in some basis)

In principle:

Measurement data → graph that

(i) Accounts for all correlations (classical or quantum) via autonomous causal mechanisms

(ii) Acts as an oracle for future interventions

Need causal discovery algorithms...

Discovery algorithm:

1. Determines if process is causally ordered

2. Checks if all latent variables are included

3. If Markovian gives unique DAG

Giarmatzi and Costa, NQI, 2018

Still work to do

Quantum causal inference is computationally and practically very hard.

Need to input process, which requires informationally complete tomography: exponential in number of variables.

Need to know dimension of input systems and number of subsystems.

Switch? Indefinite causal order? Larger classes of W that are theoretically possible.

Supervised learning of process

Task: classify process as Markovian vs non-Markovian, and estimate dimension of non-Markovian environment.

Simulate data and use supervised ML technique on labelled processes.

Trained Random Forest Regressor, tested on unseen data (from inside and outside the training range)

99% accurate on Markovian vs non-Markovian.

95% accurate on dimensionality of environment.

No loss of accuracy or generality on less-than-informationally complete data (only 20% of features included).

Supervised learning of process

Caveats:

Simulated data: try experimental data next.

Scaling?

Transferable to different size processes?

Interpretability?

Conclusions

Causal inference is impossible without assumptions.

Independence of causal mechanisms (including interventions) is key.

Machine learning methods may provide a powerful tool for identification of causal structure (but at some cost to interpretability).

Implications of "theory blind" classical-quantum causal inference?

References

Acknowledgements:

Causal parents? Hardy, Brukner, Costa, Oreshkov, Spekkens, Liefer, Chiribella, Tucci, Ried, Cavalcanti, Lal, Henson, Pusey, Chaves, Pienaar, Giarmatzi ... + many more (Lloyd)

Causal sisters? Reid et al., Allen et al.

Causal children? Schmid, Pienaar ... + more

Books: "Elements of Causal Inference", Peters et al. (2018); "Deep Learning", Goodfellow et al. (2016)