Reasoning on Data

Preview:

Citation preview

ReasoningonData:

TheOntology-MediatedQueryAnsweringProblem

Marie-Laure Mugnier

University of Montpellier

UNILOGSchool–Vichy–2018

KNOWLEDGEREPRESENTATIONANDREASONING(KR)

•  A field historically at the heart of Artificial Intelligence

•  Study formalisms (or languages) to

•  represent various kinds of human knowledge •  do reasoning on these representations

•  along the tradeoff expressivity / tractability of reasoning

à KR languages based on computational logic In this talk: classical first-order logic (FOL)

Major conferences: IJCAI, AAAI, KR

M.-L.Mugnier–UNILOGSchool– 2018 2

Part1:Basics

Knowledge bases, Ontologies Logical view of Queries and Data Main KR formalisms to represent and reason with ontologies Ontology-Mediated Query Answering

Part2:KRformalismsandalgorithmicapproaches

Part3:DecidabilityissuesintheexistenFalruleframework

OVERVIEWOFTHETUTORIAL

M.-L.Mugnier–UNILOGSchool– 2018 3

Knowledge Base (KB)

Reasoning Services

•  General knowledge on the application domain « Cats are Mammals »

•  Factual Knowledge Description of specific individuals, situations, ...

Félix is a Cat

Factbase, Database

Fundamental tasks

•  Checking the consistency of the KB

•  Computing answers to a query over the KB

...

KNOWLEDGEBASEDSYSTEMS

Ontology

Knowledge expressed in a KR language

Reasoning algorithms associated with the KR language

M.-L.Mugnier–UNILOGSchool– 2018 4

WHATISANONTOLOGY?

In computer science: a formal specification of the knowledge of a particular domain Ø  which allows for machine processing Ø  that relies on the semantics of knowledge

> automated reasoning

Such a specification consists of Ø  a vocabulary in terms of concepts and relations Ø  semantic relationships between these elements

M.-L.Mugnier–UNILOGSchool– 2018 5

EXAMPLESOFONTOLOGIES

¢  Medecine and life sciences : hundreds of available ontologies

�  general medical ontologies

SNOMED CT (400 000 terms) GALEN (> 30 000 terms)

�  specialized medical ontologies FMA (anatomy) NCI (cancer), ...

�  biology �  agronomy

¢  Information systems of large organizations and corporations

M.-L.Mugnier–UNILOGSchool– 2018 6

ATTHECOREOFONTOLOGIES:CONCEPTS/CLASSES

¢  Concept/class:acategoryofenSSes(objects)thatshareproperSes InFOL:unarypredicate:Cat,Mammal

¢  Instanceofclass:aspecificmemberofthisclass

¢  FundamentalrelaSononclasses:specializaFon(subsumpFon)(«isakindof»,«subclassof»)

SemanScs:everyinstanceofC1isalsoinstanceofC2

C1

C2

C1 subclass of C2

�x (C1(x) à C2(x)) �x (Cat(x) à Mammal(x))

In FOL: a term (variable or constant)

M.-L.Mugnier–UNILOGSchool– 2018 7

ATTHEHEARTOFONTOLOGIES:CONCEPTS/CLASSES

Disease

LungDisease

Pneumonia

InfecSousPneumonia

InfecSousDisease

BacterialDisease

ViralDisease

BacterialPneumonia

Legionella

InfecSousAgent

Bacteria Virus

OrganismDisorder

However, an ontology is not just a classification!

Concepts organized by specialization

(SNOMEDCT)

M.-L.Mugnier–UNILOGSchool– 2018 8

ONTOLOGIESAREMUCHMORETHANCLASSIFICATIONS

Vocabulary 1.  concepts / classes

2.  relations (between instances)

« An ontology specifies the vocabulary of an application domain and semantic relationships between the terms of the vocabulary»

+ semantic relationships between concepts

+ semantic relationships between relations

+ properties of relations

+ other axioms that more generally express domain knowledge

+ properties of concepts

M.-L.Mugnier–UNILOGSchool– 2018 9

RELATIONSBETWEENINSTANCES

Often these are binary relations (also called « roles » or « properties »)

subrelation of

SignatureofarelaSon:assignsamaximumconcepttoeachargument(«domain»and«range»inOWL)

�x�y (hasCA(x,y) à dueTo(x,y))

dueTo

hasCausaFveAgent

�x�y (hasCA(x,y) à Disease(x) � Organism(y))

Argument1:aDisorderArgument2:anOrganismor[...]

Argument1:aDiseaseArgument2:anOrganism

M.-L.Mugnier–UNILOGSchool– 2018 10

EXAMPLESOFOTHERFREQUENTTYPESOFAXIOMS

•  Necessaryand/orsufficientproperSesofconcepts (ex: BacterialDisease)

•  Properties of relations

•  NegaSveconstraints(disjointnessbetweenconcepts,relaSons,...)

Bacteria∩Virus=� �x(Bacteria(x)�Virus(x)à )�x(Bacteria(x)à¬Virus(x))

�x (BacterialDisease(x) → �y (Bacteria(y) � hasCausativeAgent(x,y))

A bacterial disease is caused by a bacteria

inverse relations: �x�y (hasPart(x,y) ⟷ isPartOf(y,x))

symmetry, transitivity, ...

functional relation: �x�y�z (isPartOf(x,y) � isPartOf(x,z) → y = z)

M.-L.Mugnier–UNILOGSchool– 2018 11

WHATKINDSOFLANGUAGESTOEXPRESSONTOLOGIES?

HierarchiesofclassesHierarchiesofbinaryrelaSons(called«properSes»)SignaturesoftheserelaSons(«domain»and«range»)

àOWLDLfragmentofRDFSchema(SemanScWeb)

DescripFonLogicsRule-basedlanguages Datalog,existenSalrules,

RDFdeducSverules,AnswerSetProgramming...

Very light languages

More expressive fragments of first-order logics

From a logical viewpoint: anontologyiscomposedofafinitesetofpredicates(toexpressconceptsandrelaSons)afinitesetof(closed)formulasoverthesepredicates oftheform�X(condiFon[X]àconclusion[X])

M.-L.Mugnier–UNILOGSchool– 2018 12

WHATAREONTOLOGIESGOODFOR?

•  provideacommonvocabulary

àitiseasiertoshareinformaSon(typicallybetweenexpertsofseveraldomains)

•  constrainthemeaningofterms

àforcestoexplicitnot-saidthingsandtoremoveambiguiSeshencelessmisunderstandings

•  todoautomatedreasoning,basisofhigh-levelservices

à findimplicitlinksbetweenpiecesofknowledgeà checktheconsistencyoftheKB,finderrorsinmodelingà enrichdataqueryanswering

M.-L.Mugnier–UNILOGSchool– 2018 13

«findallpaSentsaffectedbyalungdiseaseduetoabacteria»

Data

Query (SQL, SPARQL, MongoDB ...)

ONTOLOGY-MEDIATEDQUERYANSWERING(EX:MEDICALRECORDS)

Database(relaSonal,RDF,NoSQL,...)

??

PaSentP:Diagnosis=«legionella»

M.-L.Mugnier–UNILOGSchool– 2018 14

DonnéesData

Query

Knowledge Base

AlegionellaisbacterialpneumoniaAbacterialpneumoniaisapneumoniaApneumoniaisalungdiseaseAbacterialpneumoniaiscausedbyabacteriaIfxiscausedbyythenxisduetoyIfthediagnosisofapaSentxcontainsadiseaseythenxisaffectedbyy

ONTOLOGY-MEDIATEDQUERYANSWERING

«findallpaSentsaffectedbyalungdiseaseduetoabacteria»

PaSentP:Diagnosis=«legionella»

Ontology

M.-L.Mugnier–UNILOGSchool– 2018 15

Knowledgebase

Query

Ontology

Addinganontologicallayerontopofdata

1-Enrichthevocabularyallowingtoabstractfromaspecificdatastorage

2-Infernewfacts,notexplicitelystored,allowingforincompletedatarepresentaSon

Data

ONTOLOGY-MEDIATEDQUERYANSWERING(OMQA)

M.-L.Mugnier–UNILOGSchool– 2018 16

Query

Ontology

Data

3–provideaunifiedviewofmulSplesources

Data Data

ONTOLOGY-MEDIATEDQUERYANSWERING(OMQA)

M.-L.Mugnier–UNILOGSchool– 2018 17

OMQAEXAMPLE:ONTOLOGICALKNOWLEDGE

AlegionellaisbacterialpneumoniaAbacterialpneumoniaisapneumoniaApneumoniaisalungdiseaseAbacterialpneumoniaiscausedbyabacteriaIfxiscausedbyythenxisduetoyIfthediagnosisofapaSentxcontainsadiseaseythenxisaffectedbyy

�x(Legionella(x)→BacterialPneumonia(x))

�x(BacterialPneumonia(x)→�y(hasCausaSveAgent(x,y)�Bacteria(y)))

�x�y(hasCausaSveAgent(x,y)→dueTo(x,y))

�x�y((Diagnosis(x,y)�Disease(y))→isAffectedBy(x,y))

M.-L.Mugnier–UNILOGSchool– 2018 18

FACTBASE

RelaSonalschema: finitesetRofrelaSonsàpredicates infinitedomainofvaluesàconstants

Factbase:usuallyasetofgroundatoms(ontheontologicalvocabulary)seenastheconjuncSonoftheseatoms

F={PaSent(P),Diagnosis(P,M),Legionella(M)}

ArelaFonaldatabasemaynaturallybeviewedasafactbase

InstanceofarelaSonr: finitesetoftuplesonràatomsonr

rakr1 akr2a1a2a1

a2a3a1

«Thediagnosisforthepa4entPislegionella»

Databaseinstance={instanceforeachrinR} àfactbase

{r(a1,a2),r(a2,a3),r(a1,a1)}

M.-L.Mugnier–UNILOGSchool– 2018 19

CONJUNCTIVEQUERIES(CQ)

« find all patients affected by a lung disease due to a bacteria»q(x) = ∃y ∃z (Patient(x) � isAffBy(x,y) � LungDisease(y) � dueTo(y,z) � Bacteria(z)) A CQ is an existentially quantified conjunction of atoms The free variables are the answer variables If closed formula: Boolean CQ DatalognotaSonans(x)ßPaSent(x),isAffBy(x,y),LungDisease(y),dueTo(y,z),Bacteria(z)Select-Join-ProjectqueriesinrelaSonalalgebra(SQL)SELECT...FROM…WHERE<joincondi4ons> SPARQL(semanScwebqueries)SELECT…WHERE<basicgraphpa=ern>

M.-L.Mugnier–UNILOGSchool– 2018 20

ANSWERSTOACONJUNCTIVEQUERY

¢  TheanswertoaBooleanCQqinFisyesifF�qyes=()

¢  LettheCQq(x1,...,xk).Atuple(a1,…,ak)ofconstantsisananswertoqwithrespecttoafactbaseFif F�q[a1,...,ak],whereq[a1,...,ak]isobtainedfromq(x1,...,xk)byreplacingeachxibyai

¢  LetFandqbeseenassetsofatoms.AhomomorphismhfromqtoFisa

mappingfromvariables(q)toterms(F)suchthath(q)F

F�q()iffqcanbemappedbyhomomorphismtoF(a1,…,ak)isananswertoq(x1,...,xk)w.r.t.Fiff

thereisahomomorphismfromqtoFthatmapseachxitoai

M.-L.Mugnier–UNILOGSchool– 2018 21

KEYNOTION:HOMOMORPHISM

q(x)=∃y(movie(y)∧play(x,y))

HomomorphismhfromqtoF:subsStuSonofvar(q)byterms(F)suchthath(q)F

F

h1:xàayàm1

h2:xàayàm2

h1(q)=movie(m1)∧play(a,m1)

h2(q)=movie(m2)∧play(a,m2)

movie(m1)movie(m2)movie(m3)actor(a)actor(b)actor(c)play(a,m1)play(a,m2)play(c,m3)

Answers:obtainedbyrestricSngthedomainsofhomomorphisms toanswervariables

x = a x = c

h3:xàcyàm3 h3(q)=movie(x0)∧play(c,m3)

movie(y)play(x,y)

M.-L.Mugnier–UNILOGSchool– 2018 22

« find all patients affected by a l u n g d i s e a s e d u e t o a bacteria »

q(x) = ∃y∃z(PaSent(x)�isAffectedBy(x,y)�LungDisease(y)�dueTo(y,z)�Bacteria(z))

Factbase = { Patient(P), Diagnosis(P,M), Legionella(M) } « The diagnosis for the patient P is legionella »

Legionellaspecialisa4onofLungDiseaseandBacterialDisease(andDisease)henceLungDisease(M) henceBacterialDisease(M),

Disease(M)

�x(BacterianDisease(x)→�y(hasCausaSveAgent(x,y)�Bacteria(y)))hencehasCausaSveAgent(M,b)andBacteria(b)

�x�y(hasCausaSveAgent(x,y)→dueTo(x,y))hencedueTo(M,b)

�x�y((Diagnosis(x,y)�Disease(y))→isAffectedBy(x,y))henceisAffectedBy(P,M) Answer : x = P

ONTHEOMQAEXAMPLE

M.-L.Mugnier–UNILOGSchool– 2018 23

AMOREGENERALSCHEMA

Query

Ontology

Data Data Data

Mappingsfromdatatofacts

{Databasequery⤳Facts}

Factbase

Conceptuallevel

DescripSonoftheapplicaSondomainwithahighabstracSonlevel

Queryusingthevocabularyoftheontology

Factbase(possiblyvirtual)usingthevocabularyoftheontology

Independentandheterogeneousdatasources

Theanswerstothequeryareinferredfromtheknowledgebase

«Ontology-BasedDataAccess»[Poggietal.,JoDS,2008]

M.-L.Mugnier–UNILOGSchool– 2018 24

MAPPINGS

q(x):�n�sPaSent_T(x,n,s) ⤳PaSent(x)q’(x):�n�sPaSent_T(x,n,s)�DiagnosSc_T(x,y)�y=«Legionella»

⤳�z(diagnosis(x,z)�legionella(z))

Patient_T [ID_PATIENT, NAME,SSN] Diagnosis_T[ID_PATIENT, DISORDER]

PaSent/1Diagnosis/2Legionella/1

Mapping:databasequery(X)⤳conjuncFonwithfreevariablesX

PaSent(P)Diagnosis(P,M)Legionella(M)

PaSent_T Diagnosis_T

......

id ssn disidname

⤳ P....

..

..

..

..

..

..

«Leg.»....

P....

M.-L.Mugnier–UNILOGSchool– 2018 25

Knowledgebase

Query

Ontology

Factbase

ONTOLOGY-MEDIATEDQUERYANSWERING(OMQA)

(Boolean)conjuncSvequeryq

TheoryOinasuitableFOLfragment

Setofgroundatoms(orexistenSallyclosedformula)F

Fundamental decision problem

O, F � q ?

M.-L.Mugnier–UNILOGSchool– 2018 26

OVERVIEWOFTHELECTURE

Part1:Basics

Part2:KRformalismsandalgorithmicapproaches

Outline of description logics – Horn DLs Existential Rules Materialization approach (forward chaining) Query rewriting approach (related to backward chaining)

Part3:DecidabilityissuesintheexistenFalruleframework

M.-L.Mugnier–UNILOGSchool– 2018 27

DESCRIPTIONLOGICS

¢  AfamilyofKRlanguagesforrepresenSngandreasoningwithontologies

¢  MostlycorrespondtodecidablefragmentsofFOL

(relatedtomodalproposiSonallogic,theguardedfragmentofFOL,...)¢  Variable-freesyntax

¢  Usedtobecalled«conceptlanguages»:fromconceptandrolenames(unaryandbinarypredicates)andasetofconstructorsdefinecomplexconcepts(morerecently:complexroles)

¢  Anontologyisasetofaxiomsthatstateinclusionsbetweenconcepts (andbetweenroles)

M.-L.Mugnier–UNILOGSchool– 2018 28

DESCRIPTIONLOGICS:BUILDINGBLOCKS(SYNTAX)Vocabulary

Atomicconcepts: Human,Parent,Student…(unarypredicates)Atomicroles: parentOf,siblingOf,… (binarypredicates)

Complexconceptsandrolescanbebuiltusingasetofconstructors(whichdependsoneachparFcularDL)

conjuncSon(П),disjuncSon(�),negaSon(¬)HumanП¬Parent Female�Male

restrictedformsofexistenSalanduniversalquanSficaSon(�,�)

∃parentOf.(FemaleПStudent) �parentOf.Male inverseofarole(-),composiSonofroles(o) ∃parentOf- parentOfoparentOf

M.-L.Mugnier–UNILOGSchool– 2018 29

DESCRIPTIONLOGICS:BUILDINGBLOCKS(SEMANTICS)

ToeachconceptisassignedaFOLsentencewithfreevariablex Human Human(x)

HumanП¬Parent Human(x)�¬Parent(x)∃parentOf.(FemaleПStudent) ∃y(parentOf(x,y)�Female(y)

�Student(y))�parentOf.Female �y(parentOf(x,y)àFemale(y))ToeachroleisassignedaFOLsentencewith2freevariablesxandy parentOfoparentOf ∃z(parentOf(x,z)�parentOf(z,y))

M.-L.Mugnier–UNILOGSchool– 2018 30

DESCRIPTIONLOGICS:KNOWLEDGEBASE

KnowledgeBase=TBox(ontology)+ABox(factbase)Tbox:axiomsoftheformC1�C2∀x(fol(C1)àfol(C2))

orr1�r2 ∀x∀y(fol(r1)àfol(r2))

Abox:setofgroundfacts parentOf(A,B),Female(A),…

Human�Male�Female ∀x(Human(x)àMale(x)�Female(x))Adult�¬Child ∀x(Adult(x)∧Child(x)à⊥)Parent��parentOf ∀x(Parent(x)à�yparentOf(x,y))HappyFather��parentOf.Female ∀x(HP(x)à(�y(parentOf(x,y)àFemale(y))Human��parentOf-.Human ∀x(Human(x)à�y(parentOf(y,x)∧Human(y)))parentOfoparentOf�ancestorOf ∀x∀y(�z(parentOf(x,z)∧parentOf(z,y))à

ancestorOf(x,y)

M.-L.Mugnier–UNILOGSchool– 2018 31

DESCRIPTIONLOGICS:STANDARDREASONINGTASKSStandardreasoningtasksonaKB(T,A)

¢  ConceptsubsumpSon T�C�D?¢  ConceptsaSsfiability isCsaSsfiablew.r.t.T ?¢  KBsaSsfiability is(T,A)saSsfiable?

¢  Instancechecking (T,A)�C(b),wherebisaconstant?

ConceptsubsumpSon T�C�Diff(T, {C(a),¬D(a)})unsaSsfiableConceptsaSsfiability CsaSsfiablew.r.t.T iff(T, {C(a)})saSsfiable

Instancechecking (T,A)�C(b)iff(T,A�{¬C(b)})unsaSsfiable

AllthesetaskscanbeexpressedintermsofKB(un)saSsfiabilityprovidedthattheconstructorsintheconsideredDLallowforit

Queryansweringbeyondinstancechecking?cannotbereducedtothestandardreasoningtasks

M.-L.Mugnier–UNILOGSchool– 2018 32

EVOLUTIONOFDLS

¢  Concepts:

¢  TBoxaxioms:onlyconceptinclusions

SaSsfiabilityandinstancecheckinginALCare:EXPTIME-completeincombinedcomplexitycoNP-completeindatacomplexity

Evenworseifweaddinverseroles:2EXPTIME-completeincombinedcomplexity

StandardexpressiveDL ALC

M.-L.Mugnier–UNILOGSchool– 2018 33

TWOCOMPLEXITYMEASURESFORQUERYANSWERINGPROBLEMS

Combined complexity (usual complexity measure) The input is O, F and q

Data complexity

The input is F (O and q supposed to be fixed)

This distinction comes from database theory: the size of the query is negligible compared to the size of the data

Problem: Given a KB = (O, F), with O the ontology and F the factbase, and a query q, is q entailed by the KB?

E.g.,qBooleanCQ,FfactbaseDoesF�q?

NP-complete(combined)PTime(data)

M.-L.Mugnier–UNILOGSchool– 2018 34

EVOLUTIONOFDLS

¢  Concepts:

¢  TBoxaxioms:onlyconceptinclusions

SaSsfiabilityandinstancecheckinginALCare:EXPTIME-completeincombinedcomplexitycoNP-completeindatacomplexity

Evenworseifweaddinverseroles:2EXPTIME-completeincombinedcomplexity

TwofactorsledtotheevoluFonofdescripFonlogics:1.  pracScaluse(e.g.SNOMEDCT):peoplemostlyuseconjuncSonandexistenSal

quanSficaSon2.  complexitytoohighforqueryansweringproblems

StandardexpressiveDL ALC

M.-L.Mugnier–UNILOGSchool– 2018 35

NEWDLSWITHLOWERCOMPLEXITY

DL-LiteR EL

Commonfeature:nodisjuncSon(no«true»negaSon) ThenasaSsfiableKBhasauniquecanonicalmodelM:

ForanyBooleanCQq,KB�qiffMisamodelofqReasoningtechniquesfortheselighterDLsareverysimilartoforwardorbackwardchaininginrule-basesystems

where

where LargeTBoxesClassificaSon

LargeABoxesQueryanswering

M.-L.Mugnier–UNILOGSchool– 2018 36

COMPLEXITYINTRODUCEDBYDISJUNCTIONORNEGATION

q(): �x�y (Blue(x) � on(x,y) � Other(y))

CBA

T: T � Blue � Other A: Blue(A), Other(C), on(A,B), on(B,C)

To answer q, we have to consider two cases: in each model of the KB, either Blue(B) or Other(B) holds

Similarly if we replace T by: ¬Blue � Other (equivalentaxiom)

KB (T,A)

Note that Other � ¬Blue is harmless: it is just a disjointness constraint

M.-L.Mugnier–UNILOGSchool– 2018 37

INSUMMARY

DL ontology (TBox) has axioms of the form

∀x (fol(C1) à fol(C2))

∀x∀y (fol(r1) à fol(r2)) wherefol(r)isapathofatomicrolesortheirinverses

With the new DLs: left and right parts of the implication are both existentially quantified conjunctions of atoms

called « Horn description logics »

DLs essentially satisfy the tree model property:

if a KB is satisfiable then it has a « tree-shaped » model

M.-L.Mugnier–UNILOGSchool– 2018 38

WHY«HORNDLS»ONANEXAMPLE

ELAxiom

Letusskolemize(uandvresp.replacedbyf1(x)andf2(x)):weobtainasetof3Hornclauses(withskolemterms)

HencethenameHorndescripFonlogics

FOLtransla4on

prenexform

M.-L.Mugnier–UNILOGSchool– 2018 39

X,Y,Z:setsofvariables

∀x(actor(x)à∃zplay(x,z))

EXISTENTIALRULES

∀X∀Y(Body[X,Y]à∃ZHead[X,Z])

any positive conjunction (without functional symbols except constants)

Key point: ability to assert the existence of unknown entities

Crucial for representing ontological knowledge in open domains

See « value invention » in databases

∀x ∀y ( siblingOf(x,y) à ∃ z (parentOf(z,x) ∧ parentOf(z,y)) )

we often simplify by omitting universal quantifiers

M.-L.Mugnier–UNILOGSchool– 2018 40

DATA/FACTS

m1m2?x

RelaSonaldatabase RDF

rdf:type

am1am2c?x

ex:actor

AbstracFoninfirst-orderlogic(FOL)

∃x(movie(m1)∧movie(m2)∧movie(x)actor(a)∧actor(b)∧actor(c)play(a,m1)∧play(a,m2)∧play(c,x))

Etc.

ex:play Movie PlayActor

abc

ex:b

rdf:type

rdf:type

ex:movie

rdf:type ex:a ex:m1

ex:c

rdf:type

ex:m2

ex:play

WegeneralizeheretheclassicalnoSonofafactbyallowingexistenSalvariablesfact/factbase=existenFallyclosedconjuncFonofatoms

rdf:type

......

...

...

m_id a_id m_id a_id

_:x

M.-L.Mugnier–UNILOGSchool– 2018 41

LABELLEDHYPERGRAPH/GRAPHREPRESENTATION¢  Afactorasetoffactscanbeseenasasetofatoms

movie(m1),movie(m2),movie(x),actor(a),actor(b),actor(c),play(a,m1),play(a,m2),play(c,x)àhenceahypergraphoritsassociatedbiparFte(mulF-)graph

pvie

12

34

p(x,y,a,x),r(x,y)

•  one(labelled)nodeperterm•  one(labelled)nodeperatom(~hyperedge)•  totallyorderededges

1

2

a

r

x

y

M.-L.Mugnier–UNILOGSchool– 2018 42

movievie

m1 m2

a b c

movievie

actorvie

playvie

movievie

1 1 1

1 11

2 2 2

1 1 1

Ifpredicatesareatmostbinary:atomnodescanbereplacedbylabelsanddirectededges

a

m1 m2

b c

play

actor actoractor

movie movie movie

actor

actor

play

play

movie(m1),movie(m2),movie(x),actor(a),actor(b),actor(c),play(a,m1),play(a,m2),play(c,x)

M.-L.Mugnier–UNILOGSchool– 2018 43

GRAPHHOMOMORPHISMS(1)

•  LetG1=(V1,E1)toG2=(V2,E2)beclassicalgraphs.HomomorphismhfromG1toG2: mappingfromV1toV2s.t.

foreveryedge(u,v)inE1,(h(u),h(v))isinE2

mapsto mapsto

M.-L.Mugnier–UNILOGSchool– 2018 44

a

m1 m2

b c play

actor actoractor

movie movie movie

GRAPHHOMOMORPHISMS(2)

•  LetG1=(V1,E1)toG2=(V2,E2)beclassicalgraphs.HomomorphismhfromG1toG2: mappingfromV1toV2s.t.

foreveryedge(u,v)inE1,(h(u),h(v))isinE2

•  Iftherearelabels:theyhavetobe``kept’’aswell

play

movie

q F

M.-L.Mugnier–UNILOGSchool– 2018 45

GRAPHHOMOMORPHISMS(3)

•  LetG1=(V1,E1)toG2=(V2,E2)beclassicalgraphs.HomomorphismhfromG1toG2: mappingfromV1toV2s.t.

foreveryedge(u,v)inE1,(h(u),h(v))isinE2

•  Iftherearelabels:theyhavetobe``kept’’aswell

play 1

2

1 movie

qF

movievie

m1 m2

a b c

movievie

actorvie

playvie

movievie

1 1 1

1 11

2 2 2

1 1 1

actor

actor

play

play

M.-L.Mugnier–UNILOGSchool– 2018 46

∀x ∀y ( siblingOf(x,y) à ∃ z (parentOf(z,x) ∧ parentOf(z,y)) )

graph graph

P

P 2

S

1

2

2

1

1

GRAPHVIEWOFEXISTENTIALRULES

∀X∀Y(Body[X,Y]à∃ZHead[X,Z])

s

p

p

Theruleheadhas2kindsofvariables:-fronFer:sharedwiththebody-existenFal(new``blank’’nodes)

x x

y

y

zz

M.-L.Mugnier–UNILOGSchool– 2018 47

F = siblingOf(a,b)

R = ∀x ∀y (siblingOf(x,y) à ∃ z (parentOf(z,x) ∧ parentOf(z,y)))

F�=∃z0(siblingOf(a,b)∧parentOf(z0,a)∧parentOf(z0,b))

s

p

p

a

b

s

RisapplicabletoFifthereisahomomorphismhfrombody(R)toF

xàayàb

ApplyingRtoFw.r.t.hproducesF∪h(head(R))whereanewvariableiscreatedforeachexistenSalvariableinR

a

b

s

p

p

GENERATIONOFFRESH(UNKNOWN)INDIVIDUALS

R F

M.-L.Mugnier–UNILOGSchool– 2018 48

EXISTENTIALRULEFRAMEWORK(LOGICAL/GRAPHICAL)

q(x)=∃y(movie(y)∧play(x,y))

Data/Facts

« Pure » existential rules

Equality rules

Negative Constraints

Conjunctive Queries

∀x ( actor(x) à ∃ z (movie(z) ∧ play(x,z)) )

∀x ∀y ∀z ( movie(y) ∧ director(x,y) ∧ director(z,y) à x = z )

∀x ( movie(x) ∧ person(x) à ⊥ )

movie(m1) play(a,m1) play(c, x) ...

M.-L.Mugnier–UNILOGSchool– 2018 49

MULTIPLETHEORETICALFOUNDATIONS

Datalog(70-80s)

ConceptualGraphs[Sowa1984]

logical translation

+ « value invention »

��-rules,existenFalRules[Baget+IJCAI2009]

Datalog+/- [Cali+PODS2009]

[CheinMugnier1992,2009]

•  Samelogicalformas«Tuple-GeneraSngDependencies»(TGDs)longstudiedinrelaSonaldatabases

LightweightDescripSonLogics,e.g.OWL2tractableprofilesMoregenerally,HornDLs

+ «unrestricted cycles » on variables + unbounded arity

M.-L.Mugnier–UNILOGSchool– 2018 50

u  TheFOLtranslaSonofHornDLsyieldsexistenSalrulesu  ExistenSalrulesarestrictlymoreexpressive:

siblingOf(x,y)à∃z(parentOf(z,x)∧parentOf(z,y))cannotbeexpressedinmostDLsbecauseofthe«cycleonvariables»(needsrolecomposi4on: s�pop)

u  Theunboundedpredicatearityallowsformoreflexibility:àdirecttranslaSonofdatabaserelaSonsàaddingcontextualinformaSoniseasy(provenance,trust,etc.)

EXISTENTIALRULESAREMOREEXPRESSIVETHANHORN-DLS

s p

p

x

y

z

Morecomplexinterac4onsbetweenvariablescannotbeexpressedatallinDLs

Unsurprisingly,thisaddedexpressivityhasacost

M.-L.Mugnier–UNILOGSchool– 2018 51

¢  Fundamentaldecisionproblem

Input:K=(F,R)knowledgebase qBooleanconjuncSvequeryQuesSon:isqentailedbyK ?¢  Thisproblemisnotdecidable

f.i.[BeeriVardiICALP1981]onTGDsevenwithasinglerule[Baget&al.KR2010]à  find«decidable»classesofrules

withgoodexpressivity/tractabilitytradeoff

EXISTENTIALRULEFRAMEWORK

Data/Facts

« Pure » existential rules

Equality rules

Negative Constraints

Conjunctive Queries

M.-L.Mugnier–UNILOGSchool– 2018 52

atomic body

frontier-1

weakly- guarded

weakly frontier-guarded

datalog

guarded

weakly- acyclic

acyclic Graph of Rule Dependencies

wa-GRD jointly- acyclic

frontier- guarded

jointly-fg

sticky-join

w-sticky-join

sticky

w-sticky

1970s

2003 2004

Since 2008

(PARTIAL)MAPOFDECIDABLECLASSES

DL-LiteR EL

M.-L.Mugnier–UNILOGSchool– 2018 53

FUNDAMENTALNOTIONSFORREASONINGINFOL(�,∧)

¢  BacktotheposiFveconjuncFveexistenFalfragmentofFOL:FOL(�,∧)

¢  Allowstoexpressfactsand(Boolean)conjuncFvequeries¢  Suchformulascanbeseenassetsofatoms,labelledgraphs,relaSonal

structures,...¢  HomomorphismisafundamentalnoSoninthisfragment:

�  AninterpretaSonI isamodelofasentencefiffthereisahomomorphism

fromftoI

�  OnecandefinehomomorphismsbetweeninterpretaSons.Then:IfI1mapstoI2then,foranyf,I1modeloff�I2modeloff

�  Toaformulaf,weassignitsisomorphicmodelM(f)(akacanonicalmodel)

M.-L.Mugnier–UNILOGSchool– 2018 54

MODELISOMORPHICTOAFOL(�,∧)FORMULA

ToaformulafinFOL(�,∧),weassignitsisomorphicmodelM(f) alsocalledcanonicalmodel

f=�x�y�z(p(x,y)∧p(y,z)∧r(x,z,a))

M(f): D={dx,dy,dz,a} pM(f)={(dx,dy),(dy,dz)} rM(f)={(dx,dz,da)}

ThecanonicalmodelM(f)isuniversal:forallM’modeloff,M(f)mapstoM’

foranyfandginFOL(�,∧),g�fiffM(g)isamodeloffifffmapstog

M.-L.Mugnier–UNILOGSchool– 2018 55

ADDINGRANGE-RESTRICTED(=DATALOG)RULESTOFACTS

K = (F, R) whereRisasetofrange-restrictedrules(i.e.,var(head)var(body))

Fisafactbase(ruleswithanemptybody):groundatomsByapplyingrulesfromRstarSngfromF,auniqueresultisobtained:thesaturaFonofF(denotedbyF*)F*isfinitesincenonewvariableiscreatedF*isacore(noredundancies)

F

q

R

F*

TheniceproperSesofFOL(�,∧)arekept:

F*isauniversalmodelofK Hence:foranyCQq,K ⊨ q iff q maps to F*

M.-L.Mugnier–UNILOGSchool– 2018 56

KNOWLEDGEBASESWITHEXISTENTIALRULES

K = (F, R) whereRisasetofexistenFalrulesFisafactbase(ruleswithanemptybody):existenSalconjuncSonsofatoms

Mainchange:F*canbeinfinite

R=person(x)à�yhasParent(x,y)∧person(y)

∧person(y0)∧hasParent(a,y0)

∧person(y1)∧hasParent(y0,y1) Etc.

F=person(a)

butitremainsauniversalmodelhence K ⊨ q iff q mapstoF*

M.-L.Mugnier–UNILOGSchool– 2018 57

F

q

R

« bottom-up » « chase » (TGDs)

APPROACH1TORULES:FORWARDCHAINING/MATERIALISATION

K= (F, R)

K ⊨ q iff q maps by homomorphism to F*

Pros: materialisation offline, then online query answering is fast Cons: volume of the materialisation

needs writing access rights to the data not feasible if data is distributed among several databases not adapted if data change frequently

F*

M.-L.Mugnier–UNILOGSchool– 2018 58

EXAMPLE(MATERIALIZATION)

x = a y = m1 x = a y = m2 x = b y = z0 x = c y = x0

movie(m1)movie(m2)movie(x0)movieActor(a)movieActor(b)play(a,m1)play(a,m2)play(c,x0)

∀x(movieActor(x)à∃z(movie(z)∧play(x,z)))

q(x) = ∃ y (movie(y) ∧ play(x, y)) « find those who play in a movie »

SaturaFon

movie(z0) play(b,z0)

M.-L.Mugnier–UNILOGSchool– 2018 59

« top-down » decomposition into 2 steps [DL-Lite]

APPROACH2TORULES:BACKWARDCHAINING/QUERYREWRITING

R

Rewriting into a set of CQs, seen as a union of conjunctive queries (UCQ) and more generally into a « first-order » query (core SQL query)

F

q K= (F, R)

Query rewriting is independant from any factbase. For any F, F,R�qiffF�Q(i.e.,ifQisaUCQ:thereisqi�QwithF�qi)

Q

Pros: independent from the data Cons: rewriting done at query time, easily leads to huge and unusual queries

M.-L.Mugnier–UNILOGSchool– 2018 60

EXAMPLE

Rewq(x) = ∃ y (movie(y) ∧ play(x, y)) � movieActor(x) Query rewriting

x = a y = m1 x = a y = m2 x = c y = x0

x = a x = b

movie(m1)movie(m2)movie(x0)movieActor(a)movieActor(b)play(a,m1)play(a,m2)play(c,x0)

∀x(movieActor(x)à∃z(movie(z)∧play(x,z)))

q(x) = ∃ y (movie(y) ∧ play(x, y)) « find those who play in a movie »

M.-L.Mugnier–UNILOGSchool– 2018 61

BACKWARD CHAINING SCHEME!

UnificaSonbyaunifieru(ofq’andh’)

Body Headq’

QueryrewriSng

Body

Basicstep:Queryq RuleR

Newquery

h’

Direct rewriting of q with R and u = u(q \ q’) � u(body(R))

M.-L.Mugnier–UNILOGSchool– 2018 62

BASICPROPERTIES(1)

LetF2beobtainedfromF1bytheapplicaSonofRuleRLetaqueryQ1thatmapstoF2byahomomorphismthatusesatleastoneatom

broughtbyRThenthereisQ2,adirectrewriSngofQ1withR,suchthatQ2mapstoF1

F1 F2application of R

Q1

h1

Q2

h2

direct rewriting with R

and h1usesF2\F1

The reciprocal property holds

M.-L.Mugnier–UNILOGSchool– 2018 63

BASICPROPERTIES(2)

F1 F2application of R

Q1

h1

Q2

h2

direct rewriting with R

LetQ2beadirectrewriSngofQ1withRuleRLetF1beafactbasesuchthatQ2mapstoF2ThenthereisanapplicaSonofRtoF1thatproducesF2suchthatQ2mapstoF1

M.-L.Mugnier–UNILOGSchool– 2018 64

ForanyconjuncSvequeryq,foranyfactbaseF,foranysetofrules:thereisahomomorphismfromqtoF’,whereF’isobtainedfromFbyaruleapplicaSonsequenceoflength≤niffthereisahomomorphismfromq’toF,whereq’isobtainedfromqbyarewriSngsequenceoflength≤n

EQUIVALENCEDERIVATION/REWRITINGSEQUENCES

F1 F2

Q1

h1

Q2

h2

F1 F2

Q1

h1

Q2

h2

s.t. h1 usesF2\F1

M.-L.Mugnier–UNILOGSchool– 2018 65

TAKINGINTOACCOUNTEXISTENTIALVARIABLESINRULEHEADS(1)

¢  WewantacompletesetofsoundrewriSngs(setofCQs):qis.t.foranyF,ifF⊨qithenF,R⊨q

R=person(x)à�yhasParent(x,y)q=hasParent(v,w),denSst(w)u={x�v,y�w}rew(q,R,u)=qi=person(v),denSst(w)

qiisunsound:F=person(Maria),denSst(Giorgos)F⊨qihowever(F,{R})doesnotentailq

(1)IfwinqisunifiedwithanexistenSalvariableofR,thenallatomsinwhichwoccurmustbepartoftheunificaSon

M.-L.Mugnier–UNILOGSchool– 2018 66

TAKINGINTOACCOUNTEXISTENTIALVARIABLESINRULEHEADS(2)

R=p(x)à�z1�z2r(x,z1),r(x,z2),s(z1,z2)q=r(v,w),s(w,w)u={x�v,z1�w,z2�w}rew(q,R,u)=qi=p(v)(2)AnexistenSalvariableofRcannotbeunifiedwithanotherterminhead(R)

qiisunsound:F=p(a)F⊨ qi however(F,{R})doesnotentailq

M.-L.Mugnier–UNILOGSchool– 2018 67

PIECE-UNIFIER(FORBOOLEANCQS)

Apiece-unifieruofq’�qandh’head(R)isasubsStuSonofvar(q’+h’)byterms(q’+h’)[ifxisunchanged,wewriteu(x)=x]suchthat:¢  u(q’)=u(h’)

¢  existenSalvariablesofh’areunifiedonlywithvariablesofq’thatdonotoccur in(q\q’)(i.e.,ifxisexistenSalandu(x)=u(t),thentisavariableofq’andnotof(q\q’))

Queryq RuleR

q’ body head

h’

variablessharedbyq’and(q\q’)

ToextendthenoSontogeneralCQs:universalvariablescannotbeunifiedwithanswervariables

M.-L.Mugnier–UNILOGSchool– 2018 68

EXAMPLE

R=twin(x,y)à�zmotherOf(z,x)∧motherOf(z,y)

q=motherOf(v,w)∧motherOf(v,t)∧Female(w)∧Male(t)?

R=twin(x,y)à�zmotherOf(z,x)∧motherOf(z,y)

q=motherOf(v,w)∧motherOf(v,t)∧Female(w)∧Male(t)?

piece-unifieru1 = {z � v,x� w, y � w}

rewrite(q,R,u1)=motherOf(v,t) ∧Female(w)∧Male(w) ∧ twin(w,w)

R=twin(x,y)à�zmotherOf(z,x)∧motherOf(z,y)

q=motherOf(v,w)∧motherOf(v,t)∧Female(w)∧Male(t)?

piece-unifieru2 = {z � v,x � w,y � t}

rewrite(q,R,u2)=twin(w,t)∧Female(w)∧Male(t)

Ifwerewriteagainthisquerywecouldremovethefirstatom

M.-L.Mugnier–UNILOGSchool– 2018 69

WHATIFWESKOLEMIZEDRULES?

qiisunsound:F=person(Maria),denSst(Giorgos)F⊨qihowever(F,{R})doesnotentailq

R=person(x)à�yhasParent(x,y)q=hasParent(v,w),denSst(w)u = { x � v, y � w } rew(q,R,u)=qi=person(v),denSst(w)

Skolem(R)=person(x)àhasParent(x,f(x))

ClassicalmostgeneralunifierofhasParent(x,f(x))andhasParent(v,w):v�xandw�f(x)rew(q,R,u)=denSst(f(x))�person(x) whichcannotbeunifiedwitharulehead

(wouldnotbekeptintheouputsinceitcontainsaskolemfuncSon

Wecouldskolemizetherulesandrelyonusualm.g.u.thenkeeponlyrewriSngswithoutskolemfuncSon

butthiswouldcreateuselessintermediaterewriSngs

M.-L.Mugnier–UNILOGSchool– 2018 70

WHY«PIECES»?

Apieceisaunitofknowledgebroughtbyarule:¢  FronFervariables(andconstants)actascutpointstodecomposeruleheads

intopieces(«minimalnon-emptysubsetsgluedbyexistenSalvariables»)

R=b(x)à�y�zp(x,y)∧p(y,z)∧p(z,x)∧q(x,x) x y

z

¢  Arulewithkpiecescanbedecomposedintokrules,oneforeachpiece,whilekeepingthesamebody

¢  Itcannotbefurtherdecomposed(exceptbyintroducingnewpredicates)

b(x)à�y�z p(x,y)∧p(y,z)∧p(z,x)

b(x)àq(x,x)

M.-L.Mugnier–UNILOGSchool– 2018 71

DECOMPOSITIONOFRULESINTOATOMICHEADRULES(1)

R:b(x)à�y�z p(x,y)∧p(y,z)∧p(z,x) rule with single-piece head

Decomposition into rules with atomic head by introducing a fresh predicate

R0:b(x)à�y�z pR(x,y,z)R1:pR(x,y,z)àp(x,y)R2:pR(x,y,z)àp(y,z)R3:pR(x,y,z)àp(z,x)

We lose the structure of the head •  much less efficient query rewriting

•  may even lead to lose the property of having a finite universal model

(if the set of rules has this property)

M.-L.Mugnier–UNILOGSchool– 2018 72

DECOMPOSITIONOFRULESINTOATOMICHEADRULES(2)

R:p(x,y)→�zp(y,z),p(z,y)

F:p(a,b)

a b z0 ...F1

F2

z1

z2

F2≡ F1 (F2 mapstoF1 ) henceF*≡ F1

Finiteuniversalmodel

AÅerdecomposiSonintoatomicheadrules:

R0:p(x,y)→�zpR(y,z)R1:pR(y,z)àp(y,z)R2:pR(y,z)àp(z,y)

a b z0 ...F1

F2

z1

z2

F2F1

Nofiniteuniversalmodel

M.-L.Mugnier–UNILOGSchool– 2018 73

OVERVIEWOFTHELECTURE

Part1:Basics

Part2:KRformalismsandalgorithmicapproaches

Part3:DecidabilityissuesintheexistenFalruleframework

UndecidabilityofthefundamentalproblemGenericproperSesthatensuredecidability

Main«concrete»decidableclassesofexistenSalrules

M.-L.Mugnier–UNILOGSchool– 2018 74

However,here:queryrewriSngwithRisfiniteforanyq

SATURATIONMAYNOTHALT

R=person(x)àhasParent(x,y)∧person(y)

∧person(y0)∧hasParent(a,y0)

∧person(y1)∧hasParent(y0,y1)

F=person(a)

NoredundanciesareaddedTheKBhasnofiniteuniversalmodel

M.-L.Mugnier–UNILOGSchool– 2018 75

R=friend(u,v)∧friend(v,w)àfriend(u,w)

q=friend(Giorgos,Maria)

q1=friend(Giorgos,v0)∧friend(v0,Maria)

q2=friend(Giorgos,v1)∧friend(v1,v0)∧friend(v0,Maria)

q2’=friend(Giorgos,v0)∧friend(v0,v1)∧friend(v1,Maria)

q2andq2’areequivalent

Etc.q3=friend(Giorgos,v2)∧friend(v2,v1)∧friend(v1,v0)∧friend(v1,Maria)

QUERYREWRITINGMAYNOTHALT

However,here:saturaSonwithRisfiniteforanyF

There are cases where both processes do not halt (even if the factbase is known)

Thereisaninfinitenumberofnon-redundantrewriSngs

M.-L.Mugnier–UNILOGSchool– 2018 76

UNDECIDABILITYOFTHEFUNDAMENTALPROBLEM

FundamentaldecisionproblemInput:K=(F,R)knowledgebase,qBooleanconjuncSvequeryQuesSon:isqentailedbyK?

This problem is undecidable (only semi-decidable)

E.g. proof by reduction from the word problem in a semi-Thue system

There is a one-step derivation from a word w to w’ if there is a rule wi à wj in G, and w = w1wiw2, w' = w1wjw2 w’ is derived from w if

there is a (finite) sequence of one-step derivations from w to w’

Input: a set G of rules of the form wi à wj, 2 words w0 and wf

Question: is it possible to derive (exactly) wf from w0 using the rules in G?

M.-L.Mugnier–UNILOGSchool– 2018 77

REDUCTIONFROMTHEWORDPROBLEMFromG,w0andwfwebuildaKB(F, R)andaBooleanCQq

Vocabulary constants:thelekersoccuringinG,w0andwf +twospecialconstantsBandE binarypredicates:succandval

FactbaseF=T(w0,B,E)

SetofrulesRisobtainedbytranslaSngeachrulewiàwjintotheexistenSalrule �x �y (T(wi,x,y)à T(wj,x,y))

Toawordw=a1...anweassignthefollowinggraphT(w,x,y) wheretheziareexistenSalvariablesandx,yarefree

x y

Queryq=T(wf,B,E)

Key:anywordwderivablefromw0withGcorrespondstoapathT(w, B, E) inthesaturaSonofFbyR,andreciprocally

M.-L.Mugnier–UNILOGSchool– 2018 78

finite saturation

bounded treewidth saturation

finite UCQ rewriting

atomic body (linear)

frontier-1

weakly- guarded

weakly frontier-guarded

datalog

guarded

weakly- acyclic

acyclic GRD

wa-GRD jointly- acyclic

frontier- guarded

jointly-fg

sticky-join

w-sticky-j

sticky

weakly-sticky

(PARTIAL)MAPOFDECIDABLECASES

DL-Lite

EL

M.-L.Mugnier–UNILOGSchool– 2018 79

GENERICPROPERTIESTHATENSUREDECIDABILITY

ThreegenerickindsofproperSesensuringdecidability:-  SaturaSonbyForwardChaininghaltsforanyfactbase

(«finiteexpansionset»,fes)

-  QueryrewriSnghaltsforanyconjuncSvequery(«finiteunificaSonset»,fus,orUCQ-rewritability)

-  SaturaSonbyForwardChainingmaynothaltbutforanyfactbasethegeneratedfactshaveatree-likestructure(«boundedtreewidthset»,bts)

NoneoftheseproperSesisrecognizable[Baget+KR10]buttheseproperSesprovidegenericalgorithmicschemes

M.-L.Mugnier–UNILOGSchool– 2018 80

No existential variables

MainClasseswithFiniteSaturaFon(fes)

Datalog

Acyclic position dependency graph

Weak-acyclicity

Acyclic Graph of Rule Dependencies Acyclic

GRD

Joint-acyclicity

Acyclic existential dependency graph

GRD with fes strongly connected components

fes-GRD

[BagetKR�04]

[BagetKR�04]

[Deutsch+ICDT�03][Fagin+ICDT03]

[Krötzsch+IJCAI�11]

Position dependency graph: nodes are positions in predicates edges show how existential variables are propagated

Graph of rule dependencies: nodes are rules edges express that a rule may lead to trigger a rule

M.-L. Mugnier – UNILOG School – 2018 81

WEAK-ACYCLICITY

R1:p(x)→�yr(x,y)�q(y)R2:r(x,y)→p(x)

PosiFondependencygraphnodes:posiSons(p,i)inpredicatesedges:foreachfronServariablexinposiSon(p,i)inarulebody -anedgefrom(p,i)toeachposiSon(q,j)ofxintherulehead

-aspecialedgefrom(p,i)toeachposiSonofanexistenSalintheruleheadRisweakly-acyclicifitsposiSongraphcontainsnocircuitwithaspecialedge(*)

weaklyacyclic

notweaklyacyclicspecialedge(p,1)à(r,1)duetoR1edge(r,1)à(p,1)duetoR2

R1:p(x)→�y�zr(x,y)�r(y,z)�r(z,x)R2:r(x,y)�r(y,x)→p(x)

M.-L.Mugnier–UNILOGSchool– 2018 82

ACYCLICGRAPHOFRULEDEPENDENCYGraphofRuleDependenciesnodes:therulesedges:anedgefromRitoRjifanapplicaSonofRimayleadtotriggeranew

applicaSonofRj(«RjdependsonRi»)DependencycanbeeffecSvelycomputedbycheckingifthereisapiece-unifierof

body(Rj)andhead(Ri)

Theseexamplesshowthatweak-acyclicityandacyclicGRDareincomparablecriteriaCommongeneralizaSonsofthesetwonoSonshavebeendefined

CyclicGRDsinceR1andR2dependoneachother

R1:p(x)→�yr(x,y)�q(y)R2:r(x,y)→p(x)

R1:p(x)→�y�zr(x,y)�r(y,z)�r(z,x)R2:r(x,y)�r(y,x)→p(x)

M.-L.Mugnier–UNILOGSchool– 2018 83

E.g. inclusion dependencies, necessary properties of concepts / relations

MainClasseswithFiniteQueryRewriFng(fus)

Atomic- body Sticky Domain-

restricted

Sticky-join Each head atom contains

all or none of the body variables

E.g. concept product Elephant(x) ∧ Mouse(y) à bigger-than(x,y)

[Cali+PVLDB2010]

[Baget+IJCAI�09][Baget+IJCAI�09]

[Cali+RR�10]

= linear Datalog+ [Cali+PODS2009]

Body restricted to a single atom

Restricts multiple occurrences of body variables that do not occur in all head atoms

Each head atom contains all the body variables

E.g. Human(x) àparentOf(y,x) ∧Human(y) is atomic-body, sticky and domain-restricted

M.-L. Mugnier – UNILOG School – 2018 84

WidthofatreedecomposiSon=maxnumberofnodesinabag(minus1)Treewidthofagraph=minwidthoveralldecomposiSontreesofthisgraph

r

a

p(a,b)q(b,z0)r(a,b,t0)p(b,t0)q(t0,z1)r(b,t0,t1)p(t0,t1)

DecomposiFontree1)eachnode(term)appearsinabag2)eachhyperedge(atom)hasallitsnodesinabag3)foreachnodex,thesubgraphinducedbythebagscontainingxisconnected

b

r

t0

z0 ab

node

hyperedge

DecomposiFonTree/Treewidth

p p p

q q

z1

t1bz0 abt0

t0z1 bt0t1

p(a,b)

q(b,z0)

r(a,b,t0) p(b,t0)

r(b,t0,t1) p(t0,t1)

q(t0,z1)

M.-L. Mugnier – UNILOG School – 2018 85

The decidability proof does not provide a halting algorithm (relies on the bounded treewidth model property [Courcelle90])

R is bts if the forward chaining with R generates facts with bounded treewidth: i.e., for any factbase F, there is an integer b s.t. any factbase R -derived from F has treewidth bounded by b

F

BoundedTreewidthoftheDerivedFacts(bts)

EssenSally[CaliGoklobKiferKR’08]

fes (finite saturation) is included in bts (bound given by the number of terms in the finite saturation)

M.-L. Mugnier – UNILOG School – 2018 86

An atom in the body guards all the body variables

Guard only the frontier

Guard only affected variables (i.e.possibly mapped to new existentials)

Guard only affected variables from the frontier

Frontier: variables shared by the body and the head

SomeRecognizablebts(andnotfes)ClassesofRules

guarded [Cali+KR’08]

weakly guarded

[Cali+KR�08]The frontier has size 1

[Baget+KR�10]

frontier guarded

[Baget+KR�10]

weakly frontier guarded

[Baget+IJCAI�09]

frontier 1

r(x,y) ∧ r(y,z) ∧ s(x,y,z) à �u r(y,u) ∧ r(z,u) r(x,y) ∧ r(y,z) ∧ r(x,z) à �u r(z,u)

r(x,y) ∧ r(y,z) à r(y,u) ∧ r(z,u)

datalog

These classes are moreover « greedy bts » => a halting algorithm [Baget+IJCAI�11]M.-L. Mugnier – UNILOG School – 2018 87

Greedybts

R1 = p(x,y) à ∃z p(y,z) R2 = p(x,y) ∧ q(x,z) à ∃t r(x,y,t) ∧ p(y,t)

F = p(a,b) ab

bz0 abt0

t0z1 bt0t1

p(a,b)

q(b,z0)r(a,b,t0)p(b,t0)

r(b,t0,t1)p(t0,t1)q(t0,z1)

R1 R2

R1 R2

Greedy construction of a decomposition tree of derived facts

with bounded width

Etc.

M.-L. Mugnier – UNILOG School – 2018 88

The«Greedybts»Property[Baget+ IJCAI�11]

T0 T0 = terms(F) + {constants} F

T0 ∪ var(h(H))

B

H

h

Derived facts Decomposition tree

All bags contain T0 F

h(H)

Foranyfactbase,foreachruleapplicaSon,fronServariablesnotbeingmappedtoiniSaltermsarejointlymappedtovariablesoccurringinatomsaddedbyasinglepreviousruleapplicaSon

M.-L. Mugnier – UNILOG School – 2018 89

MainIdeasoftheAlgorithmforgbts(1)

1.  Bagpakern={homomorphismsfrompartofarulebodyto«currentfact»

thatusesometermsofthebag}

•  Aruleisapplicabletothecurrentfactbaseiffabagpakerncontainsitsbody•  FCcanbeperformedonthedecoratedtree

2.  EquivalencerelaSononbags

OnlyonebagperequivalenceclassisdevelopedTheothernodesareblocked

Boundednumberofequivalenceclassesàfinite«fullblockedtree»T*

BuildafinitedecomposiSontreethatencodesapotenSallyinfinitefact

M.-L. Mugnier – UNILOG School – 2018 90

MainIdeasoftheAlgorithmforgbts(2)

[Baget+IJCAI2011]qaddedasarule«qàmatch»

qisentailediffmatchoccursinabagpa=erni.e.,qmapsbyhomomorphismtoatoms(T*)

[Thomazo+KR2012]offline/onlineseparaSon

(1)compilaSon:treeT*builtindependentlyfromanyquery

(2)querying:anyqisentailediffitmapsby*-homomorphismtoT* i.e.qmapsbyhomomorphismtoabounded«development»ofT*

QuerythisfinitedecomposiSontree

M.-L. Mugnier – UNILOG School – 2018 91

DataComplexityofgbtsClasses

guarded

weakly guarded

frontier guarded

weakly frontier guarded

frontier 1

Datalog (fes)

ExpTime-c

PTime-c

Previousalgorithmisworst-caseopSmalongbtsfordata/combinedcomplexity.

CanbespecializedtobeopSmalonthesegbtssubclassesM.-L. Mugnier – UNILOG School – 2018 92

FES GBTS

FUS

linear

frontier-1

weakly- guarded

weakly frontier-guarded

Datalog

guarded weakly- acyclic

aGRD jointly- acyclic frontier-

guarded

jointly-fg

sticky-join

w-sticky-join

sticky

weakly-sticky

DL-Lite

EL

MFA

glut-fg (BTS)

domain- restricted

super-weak- acyclic

M.-L.Mugnier–UNILOGSchool– 2018 93

CONCLUSION

•  Reasoningwithontologiesisbecomingcentralinmanydata-centricapplicaSons

•  SolidtheoreScalfoundaSonswitharangeofontologicalformalisms

thatoffervarioustradeoffexpressivity/complexity

•  Ongoingresearch•  Gobeyond(unionsof)conjuncSvequeries,e.g.combinethemwith

navigaSonalquerieslikeregularpathqueries•  NewqueryrewriSngtechniquesthattargetmorepowerful

langages,e.g.Datalog•  NewqueryansweringtechniquesthatcombinematerialisaSonand

queryrewriSng•  StudytheinteracSonoftheontologywithmappings,whichiskey

toefficientqueryansweringoverheterogeneousdata•  RepresenSngandreasoningwithtemporalandspaSaldata

•  Dealingwithdatainconsistencies ....M.-L.Mugnier–UNILOGSchool– 2018 94

(Small) Bibliography BienvenuM.,LeclèreM.,Mugnier,M.-L.andRousset,M.-C., Reasoning with Ontologies, chapter6,volume1in«AguidedtourofarSficialintelligenceresearch»,Springer,toappear.IntroducFonstoseveralaspectsofontology-mediatedqueryansweringwithdescripFonlogicsorexistenFalrulesintheReasoningWebsummerschoolbooks:inparScular:Bienvenu,M.andOrSz,M.(2015).Ontology-mediatedqueryansweringwithdatatractabledescripSonlogics.11thInternaSonalReasoningWebSummerSchool,volume9203ofLNCS,pages218–307.Springer.Mugnier,M.andThomazo,M.(2014).AnintroducSontoontology-basedqueryansweringwithexistenSalrules.10thInternaSonalReasoningWebSummerSchool,volume8714ofLNCS,pages245–278.Springer.Goklob,G.,Orsi,G.,Pieris,A.,andSimkus,M.(2012).DataloganditsextensionsforsemanScwebdatabases.10thInternaSonalReasoningWebSummerSchool,volume7487ofLNCS,pages54–77.Springer.

ThesesynthesesprovidefurtherreferencesM.-L.Mugnier–UNILOGSchool– 2018 95

APPENDIX:FURTHERDETAILS

¢  FundamentaldefiniSonsandproperSesfortheFOL(�,∧)fragment

¢  Piece-unifiers

M.-L.Mugnier–UNILOGSchool– 2018 96

INTERPRETATIONS/MODELS(1)

¢  VocabularyV=(P,C),where P=finitesetofpredicates C=setofconstants

¢  InterpretaFonI=(DI ,.I)ofV,where DI≠ø (domain) forallcinC,cIinDI

forallpinPwitharityk,pIDIk

¢  Furthermore,uniquenameassumpSon:forallcanddinC,cI≠dI

¢  SimplifyingassumpSon(inlinewiththeuniquenameassumpSon):CDIandforallcinC,cI=c

V = ( {p/2, r/3 }, {a, b} ) I: DI = {a, b, d1} pI = { (b, a), (b, d1), (d1, b) }

rI = { (d1, d1, a) }

¢  Iisamodeloff(builtonV)iffistrueinI

M.-L.Mugnier–UNILOGSchool– 2018 97

INTERPRETATIONS/MODELS(2)

¢  LetfinFOL(�,∧).Iisamodeloffiff thereisamappingvfromterms(f)toDIsuchthat

forallp(e1,...,ek)inf,(v(e1),...,v(ek))inpI

I: DI = {a, b, d1} pI = {(b, a), (b, d1), (d1, b)} rI = {(d1, d1, a)} f = �x�y�z ( p(x, y) ∧ p(y, z) ∧ r(x, z, a) )

v: x↦ d1 y↦ b z↦ d1

¢  InterpretaSonscanbeseenassetsofatoms

(withelementsfromD\Cseenasvariables)

p(b,a), p(b,x1), p(x1,b), r(x1, x1,a)¢  I isamodeloffiffthereisahomomorphismfromftoI

M.-L.Mugnier–UNILOGSchool– 2018 98

HOMOMORPHISMSAGAINANDAGAIN

¢  OnecandefinehomomorphismsbetweeninterpretaFons¢  Wehave:

IfI1mapsI2then,foranyf,I1modeloff�I2modeloff

¢  ToaformulafinFOL(�,∧),weassignitsisomorphicmodelM(f) (alsocalledcanonicalmodel)

f=�x�y�z(p(x,y)∧p(y,z)∧r(x,z,a))

M(f): D={dx,dy,dz,a} pM(f)={(dx,dy),(dy,dz)} rM(f)={(dx,dz,da)}

M.-L.Mugnier–UNILOGSchool– 2018 99

NICESEMANTICPROPERTIESOFFOL(�,∧)

¢  ThecanonicalmodelM(f)isuniversal,i.e.,forallM’modeloff,M(f)mapstoM’

Proof:LetM’modeloff.Then,fmapstoM’.SinceM(f)isomorphictof,M(f)mapstoM’

¢  g⊨f(i.e.,everymodelofgisamodeloff)ifffmapsbyhomomorphismtoM(g)ifffmapsbyhomomorphismtogProof:⇒Assumeg⊨f.InparScularM(g)isamodeloff,hencefmapstoM(g)

⇐AssumefmapstoM(g).SinceM(g)isuniversal:foranyM’modelofg,fmapstoM’,i.e.,M’isamodeloff,henceg⊨f

M.-L.Mugnier–UNILOGSchool– 2018 100

WHY«PIECES»?(CONT’D)–PIECE-UNIFIERS

¢  UnificaSonmust«map»partsofqaccordingto«pieces»thatcanbeprovidedbyaruleapplicaSon

(otherwiseitisunsoundoruseless) body head

body head

specializa4onofthefron4er

homomorphism

ThetermsofqunifiedwiththefronSer(orwithconstants)cutqinto«pieces»⇒ enSre«pieces»ofqmustbemappedtothepiecesoftherule

R

q

M.-L.Mugnier–UNILOGSchool– 2018 102

PIECE-UNIFIER:ALTERNATIVEDEFINITION

Letu1:fronSer(R)àfronSer(R)�constants(u1isaspecializaSonofthefronSerofR)Letu2beahomomorphismfromq’�qtou1(head(R))Cutpoints:termsofq’mappedtou1(fron4er(R))

ortoconstantsThecutpointscutqinto«pieces»u1+u2isapiece-unifierifq’iscomposedofpieces

body head

body head

u1

u2

M.-L.Mugnier–UNILOGSchool– 2018 103

Recommended