William Vassilis Karageorgos Relational databases vs. First-Order Logic 1
William Vassilis Karageorgos, University of Athens - MPLA
Relational Databases vs. First-Order Logic
Based on
“Equality and Domain Closure in First-Order Databases”
and
“Towards a Logical Reconstruction of Relational Database Theory”
by Raymond Reiter
Part I
Introduction to Relational Databases
Two different approaches: Model Theoretic and Proof Theoretic

Model Theoretic
● The database is a model of some set of integrity constraints
● A query is a formula to be evaluated with respect to the above model

Proof Theoretic
● The database is a set of first-order formulæ
● A query is a set of formulæ to be proven
● Integrity constraints are considered in terms of consistency
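Introduction to Relational Databases: Model theoretic approach
● Relational language $R = (A, W)$, where
– $A$ is an alphabet whose cardinality $\|A\|$ is bounded by $\aleph_0$
– $W \subseteq P(A)$ is a set of arbitrary wffs
● Relational interpretation $I = (D, K, E)$, where
– $D \neq \{\}$ is the domain
– $K : A \mapsto D$ is the mapping of constants
– $E$ is the mapping of predicates, which maps each n-ary predicate $P$ of $A$ to a relation $E(P) \subseteq D^n$, such that $E(\approx) = \{(d, d) \mid d \in D\}$
– $E(P)$ is called the extension of $P$ in the interpretation $I$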
● Relational Database $DB = (R, I, IC)$, where $IC$ is a set of wffs called integrity constraints,
$$IC = \{\varphi \mid \varphi \equiv \forall x_1 \forall x_2 \dots \forall x_n\, [P(x_1, x_2, \dots, x_n) \rightarrow \tau_1(x_1) \land \tau_2(x_2) \land \dots \land \tau_n(x_n)]\}$$
for every predicate $P$ of $A$, with the exception of ≈(_,_)
– $\tau_i$ are types, called the domains of $P$
– $E(P)$ are called relations
● No need for the concept of “safe” and “unsafe” queries, since adom(I) has only finitely many elements
Example
A relational language R=(A,W) where the only predicates and constants of A are the following:
● Predicates: TEACHER(_), COURSE(_), STUDENT(_), TEACH(_,_), ENROLLED(_,_), ≈(_,_)
● Simple Types: TEACHER(_), COURSE(_), STUDENT(_)
● Constants: A, B, C, a, b, c, d, CS100, CS200, P100, P200
Then the following table defines an interpretation of R with domain D = {A, B, C, a, b, c, d, CS100, CS200, P100, P200}
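As a rough illustration of how an interpretation can be checked against the type integrity constraints, the Python sketch below uses the constants of the example and treats TEACH and ENROLLED as binary. The tuples placed in TEACH and ENROLLED are hypothetical (the original table is not reproduced here), and the names `extension`, `domains` and `satisfies_type_constraints` are assumptions made for this sketch.

```python
# Extensions E(P) of an interpretation. The unary type extensions follow the
# constants of the example; the TEACH and ENROLLED tuples are ASSUMED.
extension = {
    "TEACHER": {("A",), ("B",), ("C",)},
    "COURSE": {("CS100",), ("CS200",), ("P100",), ("P200",)},
    "STUDENT": {("a",), ("b",), ("c",), ("d",)},
    "TEACH": {("A", "CS100"), ("B", "P100")},      # hypothetical tuples
    "ENROLLED": {("a", "CS100"), ("b", "P100")},   # hypothetical tuples
}

# Domains of each non-type predicate, read as the integrity constraint
# P(x1, ..., xn) -> tau1(x1) & ... & taun(xn).
domains = {
    "TEACH": ("TEACHER", "COURSE"),
    "ENROLLED": ("STUDENT", "COURSE"),
}


def satisfies_type_constraints(extension, domains):
    """Return True when every tuple of every relation respects its domains."""
    for pred, types in domains.items():
        for tup in extension[pred]:
            for value, tau in zip(tup, types):
                if (value,) not in extension[tau]:
                    return False
    return True
```

With the tuples above the check succeeds; replacing a TEACH tuple by one whose first component is a student would make it fail.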
Problems
● Disjunctive information
– Can lead to indefinite answers to queries
– Example: Student “a” is enrolled in class P200 or CS200
Solution: Split the initial interpretation into two distinct interpretations $I_1$ and $I_2$, where in $I_1$ the relation ENROLLED contains the tuple (a,P200) and in $I_2$ the tuple (a,CS200)
A case such as the above cannot be solved by treating a formula like ENROLLED(a,P200) ∨ ENROLLED(a,CS200) as an integrity constraint, since that would require that at least one of the tuples be present in the relation, and we do not know which one
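The splitting idea can be sketched in a few lines of Python. This is only an illustration under the assumption that a query counts as answered when it holds in every candidate interpretation; the helper names `holds_in_all` and `holds_in_some` are invented for this sketch.

```python
# One candidate extension of ENROLLED per disjunct.
I1 = {("a", "P200")}
I2 = {("a", "CS200")}
interpretations = [I1, I2]


def holds_in_all(query, interpretations):
    """True when the query is satisfied by every candidate interpretation."""
    return all(query(rel) for rel in interpretations)


def holds_in_some(query, interpretations):
    """True when at least one candidate interpretation satisfies the query."""
    return any(query(rel) for rel in interpretations)


def in_p200(rel):
    return ("a", "P200") in rel


def in_p200_or_cs200(rel):
    return ("a", "P200") in rel or ("a", "CS200") in rel
```

Here `in_p200` holds in some interpretation but not in all of them, which is the indefinite case, while `in_p200_or_cs200` holds in both.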
Problems (cont.)
● Null values
– Their need arises when we want a relation to reference a value that does not belong to our set of constants
– Example: Student “a” may be enrolled in class P200 or CS200
Solution: Add a new entity ω and allow tuples such as (a,ω). Now, ENROLLED(a,ω) means that student a is either
● enrolled in one of the known classes (previous disjunctive case)
● enrolled in some “unknown” class, one not in the domain D. Then, ω takes the form of a truth value in a 3-valued truth system. This is the only case which properly “extends” our existing model in a way that equality and joins keep their intended meaning in relational algebra.
Introduction to Relational Databases: Proof theoretic approach
● The proof theoretic approach stems from the previous model theoretic approach (m.t.a.)
● In order for every wff that is satisfiable in the m.t.a. to be provable in the proof theoretic approach (p.t.a.), we need a domain closure axiom
$$\forall x\, [\approx(x, A) \lor \approx(x, B) \lor \dots \lor \approx(x, Z)], \quad \text{where } D = \{A, B, \dots, Z\}$$
● To compensate for negations of equality, which are satisfiable in the m.t.a. but unprovable in the p.t.a., we need unique name axioms (i.e. no aliases), so that for each two distinct constants $c$, $c'$ of the domain
$$\neg \approx(c, c')$$
● A relational theory $T \subseteq W$ contains
– a domain closure axiom
– unique name axioms
– equality axioms
  · $\forall x\, \approx(x, x)$
  · $\forall x \forall y\, [\approx(x, y) \rightarrow \approx(y, x)]$
  · $\forall x \forall y \forall z\, [\approx(x, y) \land \approx(y, z) \rightarrow \approx(x, z)]$
  · Leibniz substitution of equal terms in wffs
– a set $\Delta \subseteq T$, and for each predicate $P$ distinct from the equality predicate a set $C_P$ of n-tuples of constants, $C_P = \{\vec{c} \mid P(\vec{c}) \in \Delta\}$. The set $\{P(\vec{c}) \mid \vec{c} \in C_P\}$ is called the extension of $P$ in $T$
– completion axioms for each predicate $P$, of the form
$$\forall x_1 \dots \forall x_n\, [P(x_1, \dots, x_n) \rightarrow (\approx(x_1, c_1^1) \land \dots \land \approx(x_n, c_n^1)) \lor \dots \lor (\approx(x_1, c_1^r) \land \dots \land \approx(x_n, c_n^r))]$$
assuming that $C_P = \{(c_1^1, \dots, c_n^1), \dots, (c_1^r, \dots, c_n^r)\}$
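To make the shape of a completion axiom concrete, the following Python sketch builds the axiom as a plain-text formula from a non-empty set of tuples. The function name `completion_axiom` and the textual rendering are assumptions of this sketch, not notation taken from the papers.

```python
def completion_axiom(pred, tuples):
    """Render the completion axiom of `pred` for a non-empty set C_P.

    `tuples` plays the role of C_P; all tuples are assumed to have the
    same arity.
    """
    tuples = sorted(tuples)
    arity = len(tuples[0])
    xs = [f"x{i + 1}" for i in range(arity)]
    # One conjunction of equalities per tuple of C_P.
    disjuncts = [
        "(" + " ∧ ".join(f"≈({x},{c})" for x, c in zip(xs, tup)) + ")"
        for tup in tuples
    ]
    prefix = "".join(f"∀{x}" for x in xs)
    return f"{prefix} [{pred}({','.join(xs)}) → {' ∨ '.join(disjuncts)}]"


axiom = completion_axiom("SUPPLIES", {("Acme", "a"), ("Foo", "b")})
```

For the two SUPPLIES tuples above, `axiom` is a formula with one disjunct per tuple, each disjunct equating the variables with the components of that tuple.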
● Completion axioms provide a proof theoretic way of describing the contents of a relation (database table)
● Then, a relational database is defined as DB=(R,T,IC)
● The above definitions suffice to prove that
– If T is a relational theory then T has a unique model I which is a relational interpretation for R
– If I is a relational interpretation of R then there is a relational theory T of R such that I is the only model of T
● Hence the analogy between the model theoretic and the proof theoretic approach is complete
Problems
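● Disjunctive information
– The original constraints of the m.t.a. are inadequate due to the equality predicate
– Example: Foo supplies a or Foo supplies b
Simply adding to IC the wff
SUPPLIES(Foo,a) ∨ SUPPLIES(Foo,b)
in a theory that has the completion axiom
$$\forall x \forall y\, [SUPPLIES(x, y) \rightarrow (\approx(x, Acme) \land \approx(y, a)) \lor (\approx(x, Foo) \land \approx(y, b))]$$
whose contrapositive is
$$\forall x \forall y\, [[\neg\approx(x, Acme) \lor \neg\approx(y, a)] \land [\neg\approx(x, Foo) \lor \neg\approx(y, b)] \rightarrow \neg SUPPLIES(x, y)]$$
leads, together with the unique name axioms, to proving
$$\neg SUPPLIES(Foo, a)$$
which is inconsistent with the intended meaning of the disjunction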
Problems (cont.)
● Need to “enrich” the completion axioms
● A wff of W is called a positive ground clause of R iff it has the form $A_1 \lor \dots \lor A_r$, where each $A_i$ is a ground atomic formula whose predicate is distinct from the equality predicate.
● This leads to a generalized relational theory T, by redefining the sets $C_P$:
$$C_P = \{\vec{c} \mid \text{for some positive ground clause } A_1 \lor \dots \lor A_r \text{ of } \Delta \text{ and some } i,\ 1 \le i \le r,\ A_i \text{ is } P(\vec{c})\}$$
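The redefinition of the sets C_P can be sketched as follows, assuming that Δ is represented as a list of positive ground clauses and each clause as a list of (predicate, tuple) atoms. Both this representation and the name `generalized_C` are assumptions of the sketch; the clauses reuse the SUPPLIES example.

```python
# Delta as a list of positive ground clauses. A unit clause is a plain fact,
# a longer clause is disjunctive information.
delta = [
    [("SUPPLIES", ("Acme", "a"))],
    [("SUPPLIES", ("Foo", "a")), ("SUPPLIES", ("Foo", "b"))],
]


def generalized_C(pred, delta):
    """Collect every tuple c such that P(c) occurs in some clause of delta."""
    return {args for clause in delta for (p, args) in clause if p == pred}


c_supplies = generalized_C("SUPPLIES", delta)
```

Because (Foo,a) now belongs to the set, the completion axiom built from it no longer forces ¬SUPPLIES(Foo,a).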
Problems (cont.)
● Null Values
– As in the m.t.a., we add a new entity ω, which is now taken to be a constant (a Skolem constant)
– As such, the domain closure axiom must be expanded to include ω
– The only thing that distinguishes the new constant from all the others is its absence from every unique name axiom (since it may denote any one of them)
– The resulting theory T' is called a generalized relational theory with null values
Part II
Equality and Domain Closure in Relational Databases
● Of the two different approaches to the first-order logic reconstruction of relational databases, the latter (proof theoretic) has been shown to be far more fruitful
● However, the equality relation presents severe computational problems to any automated first-order theorem prover, due to the complexity that it introduces in the generalized relational theories
● Goal: Devise a “natural” class of databases that can be described by a simpler relational theory
● We assume a function-free first-order language to describe our databases
Formal Preliminaries (Entities)
● Constants: $c_1, c_2, \dots$
● Variables: $x_1, x_2, \dots$
● Logical connectives: ∧, ∨, ¬, →
● Predicates: P, Q, R, ...
– Subset of unary predicates called simple types
– Remaining predicates called common predicates
● A set of types, where
– A simple type is a type
– If $\tau_1$ and $\tau_2$ are types then so are $\tau_1 \,\Theta\, \tau_2$, where $\Theta \in \{\land, \lor, \rightarrow, \neg\}$
● Quantifiers: ∀, ∃
Formal Preliminaries (Syntax of formulæ)
● Terms = Variables ∪ Constants
● Literals: $P(t_1, \dots, t_r)$, where P is a predicate and $t_i$ are terms
– Type literals if P is a simple type predicate
– Common literals if P is a common predicate
● τ-wffs
– A type literal is a τ-wff
– If $W_1$ and $W_2$ are τ-wffs then so are $W_1 \,\Theta\, W_2$
– If W is a τ-wff then so are $\forall x\, W$, $\exists x\, W$
● Typed wffs (twffs)
– A common literal is a twff
– If $W_1$ and $W_2$ are twffs then so are $W_1 \,\Theta\, W_2$
– If W is a twff and τ is a type then so are $(\forall x/\tau)\, W$, $(\exists x/\tau)\, W$, which abbreviate $\forall x\, [\tau(x) \rightarrow W]$ and $\exists x\, [\tau(x) \land W]$ respectively
Formal Preliminaries (Databases)
● A database is made up of two disjoint components
– A type database (TDB), where all information about types resides
– A principal database (PDB), which contains information about all the other predicates
● So $DB = TDB \cup PDB$
● No wff in DB contains any existential quantifier in its prenex normal form (PNF)
● PDB is a set of twffs which contains the aforementioned equality axioms
● TDB is a τ-complete finite set of τ-wffs (for each constant c and type τ, either TDB ⊢ τ(c) or TDB ⊢ ¬τ(c))
● TDB is formally a set of formulæ in the monadic predicate calculus and hence decidable
● If τ is a type, we define $|\tau|_{TDB} = \{c \mid c \text{ is a constant and } TDB \vdash \tau(c)\}$
Example
● TDB
Axioms
– $\forall x\, [CALCULUS(x) \lor CS(x) \lor HISTORY(x) \rightarrow COURSE(x)]$
– $\forall x\, \neg[COURSE(x) \land TEACHER(x)]$
– $\forall x\, \neg[TEACHER(x) \land STUDENT(x)]$
Simple type instances
– |CALCULUS| = {C100, C200}
– |CS| = {CS100, CS200, CS300}
– |HISTORY| = {H100, H200}
– |TEACHER| = {A, B, C}
– |STUDENT| = {a, b, c, d}
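The sets |τ| of the example can be computed mechanically. The Python sketch below assumes that COURSE holds of exactly the calculus, CS and history courses, and that, by τ-completeness, the instances of ¬τ are all remaining constants. The nested-tuple encoding of compound types and the name `instances` are inventions of this sketch.

```python
simple = {
    "CALCULUS": {"C100", "C200"},
    "CS": {"CS100", "CS200", "CS300"},
    "HISTORY": {"H100", "H200"},
    "TEACHER": {"A", "B", "C"},
    "STUDENT": {"a", "b", "c", "d"},
}
# COURSE is derived from the first TDB axiom (ASSUMED to be its only source).
simple["COURSE"] = simple["CALCULUS"] | simple["CS"] | simple["HISTORY"]
constants = set().union(*simple.values())


def instances(tau):
    """Compute |tau| for a type name or a nested tuple such as
    ("and", "COURSE", ("not", "HISTORY"))."""
    if isinstance(tau, str):
        return simple[tau]
    op, *args = tau
    if op == "not":
        # Relies on tau-completeness: not-tau holds wherever tau does not.
        return constants - instances(args[0])
    left, right = instances(args[0]), instances(args[1])
    if op == "and":
        return left & right
    if op == "or":
        return left | right
    if op == "implies":
        return (constants - left) | right
    raise ValueError(f"unknown connective: {op}")
```

For instance, `instances(("and", "COURSE", "TEACHER"))` is empty, in agreement with the second axiom.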
Example (cont.)
● PDB
– $(\forall x/CALCULUS)\, TEACH(A, x)$
– $(\forall x/CS)\, TEACH(B, x)$
– $(\forall x/TEACHER)(\forall y/COURSE)(\forall z/STUDENT)\, [TEACH(x, y) \land ENROLLED(z, y) \rightarrow TEACHER\text{-}OF(z, x)]$

TEACH
  C   H100
  C   H200

ENROLLED
  a   C100
  a   CS100
  b   C200
  b   CS200
  b   CS300
  c   C100
  c   H200
  d   H100
Formal Preliminaries (Queries)
● A query is any expression of the form
$$\langle x_1/\tau_1, \dots, x_n/\tau_n \mid (\Theta_1 y_1/\theta_1) \dots (\Theta_m y_m/\theta_m)\, W(x_1, \dots, x_n, y_1, \dots, y_m) \rangle$$
where $\tau_i$, $\theta_i$ are types, each $\Theta_i$ is a quantifier (∀ or ∃), and W is a quantifier-free twff whose only free variables are $x_1, \dots, x_n, y_1, \dots, y_m$, and which contains only constants occurring in DB
● Types only appear in the index of a query (the sequence of $x_i$ variables)
Examples:
● Who are a's teachers?
$\langle x/TEACHER \mid TEACHER\text{-}OF(a, x) \rangle$
● Who teaches all history courses?
$\langle x/TEACHER \mid (\forall y/HISTORY)\, TEACH(x, y) \rangle$
● Who teaches calculus but not history?
$\langle x/TEACHER \mid (\exists y/CALCULUS)(\forall z/HISTORY)\, TEACH(x, y) \land \neg TEACH(x, z) \rangle$
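The three example queries can be evaluated directly over the example database. The Python sketch below is only an illustration: it expands the two universally quantified TEACH facts into tuples, derives TEACHER-OF from the rule of the PDB, and reads ¬TEACH(x,z) as absence of the tuple, which is a closed-world assumption made for this sketch. All variable names are hypothetical.

```python
types = {
    "CALCULUS": {"C100", "C200"},
    "CS": {"CS100", "CS200", "CS300"},
    "HISTORY": {"H100", "H200"},
    "TEACHER": {"A", "B", "C"},
}

# TEACH: the two stored tuples plus the instances of the universal facts.
teach = {("C", "H100"), ("C", "H200")}
teach |= {("A", y) for y in types["CALCULUS"]}
teach |= {("B", y) for y in types["CS"]}

enrolled = {
    ("a", "C100"), ("a", "CS100"), ("b", "C200"), ("b", "CS200"),
    ("b", "CS300"), ("c", "C100"), ("c", "H200"), ("d", "H100"),
}

# TEACHER-OF(z, x) follows from TEACH(x, y) and ENROLLED(z, y).
teacher_of = {(z, x) for (x, y) in teach for (z, y2) in enrolled if y == y2}

# < x/TEACHER | TEACHER-OF(a, x) >
a_teachers = {x for x in types["TEACHER"] if ("a", x) in teacher_of}

# < x/TEACHER | (forall y/HISTORY) TEACH(x, y) >
teaches_all_history = {
    x for x in types["TEACHER"]
    if all((x, y) in teach for y in types["HISTORY"])
}

# < x/TEACHER | (exists y/CALCULUS)(forall z/HISTORY) TEACH(x,y) & not TEACH(x,z) >
calculus_not_history = {
    x for x in types["TEACHER"]
    if any(
        all((x, y) in teach and (x, z) not in teach for z in types["HISTORY"])
        for y in types["CALCULUS"]
    )
}
```

Since this database contains no disjunctive information, every answer computed here is definite.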
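Formal Preliminaries (Semantics of Queries)
● Answers to queries
– Defining an n-tuple $\vec{c} = (c_1, \dots, c_n)$ as an answer to Q iff
$$DB \vdash \tau_1(c_1) \land \dots \land \tau_n(c_n) \land (\Theta y/\theta)\, W(\vec{c}, y)$$
poses problems in the case of disjunctive information
– Correct: a set of n-tuples $\{\vec{c}^{(1)}, \dots, \vec{c}^{(m)}\}$ is an answer to Q iff
$$DB \vdash \bigvee_{i \le m} [\tau(\vec{c}^{(i)}) \land (\Theta y/\theta)\, W(\vec{c}^{(i)}, y)]$$
where $\tau(\vec{c})$ denotes $\tau_1(c_1) \land \dots \land \tau_n(c_n)$
● We define
$$\{\vec{c}^{(1)}, \dots, \vec{c}^{(m)}\} \equiv \vec{c}^{(1)} + \dots + \vec{c}^{(m)}$$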
Formal Preliminaries (Semantics of Queries cont.)
● An answer $\vec{c}^{(1)} + \dots + \vec{c}^{(m)}$ to Q is minimal iff for no i, $1 \le i \le m$, is $\vec{c}^{(1)} + \dots + \vec{c}^{(i-1)} + \vec{c}^{(i+1)} + \dots + \vec{c}^{(m)}$ an answer to Q
● An answer to Q is definite if m=1
● An answer to Q is indefinite if m>1
An indefinite answer means that x is one of the tuples $\vec{c}^{(i)}$, but it is impossible to say which
● The value of a query, $\|Q\|_{DB}$, is the set of minimal answers to Q
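Given a finite collection of known answers, the minimal ones can be filtered as in the Python sketch below. It assumes each answer is represented as a set of tuples and that the collection already contains every answer of interest; the names `minimal_answers` and `is_definite` are invented for this sketch.

```python
def minimal_answers(answers):
    """Keep the answers that have no proper subset which is also an answer."""
    answers = {frozenset(a) for a in answers}
    return {a for a in answers if not any(b < a for b in answers)}


def is_definite(answer):
    """A definite answer consists of a single tuple (m = 1)."""
    return len(answer) == 1


known = [
    {("A",)},              # definite answer
    {("A",), ("B",)},      # not minimal: contains the answer {A}
    {("B",), ("C",)},      # indefinite answer B + C
]
value = minimal_answers(known)
```

Here `value` contains the definite answer A and the indefinite answer B + C.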
Reduction of arbitrary queries to existential in closed databases
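● If $Q = \langle x/\tau, z/\psi \mid (\Theta y/\theta)\, W(x, y, z) \rangle$ we define the quotient of $\|Q\|$ by ψ, $\Delta_\psi \|Q\|$, as follows:
$\vec{c}^{(1)} + \dots + \vec{c}^{(m)} \in \Delta_\psi \|Q\|$ iff
  · for all $a_i \in |\psi|$, $1 \le i \le m$, $(\vec{c}^{(1)}, a_1) + \dots + (\vec{c}^{(m)}, a_m)$ is an answer to Q
  · for no i, $1 \le i \le m$, does $\vec{c}^{(1)} + \dots + \vec{c}^{(i-1)} + \vec{c}^{(i+1)} + \dots + \vec{c}^{(m)}$ have the latter property
● Then, $\|\langle x/\tau \mid (\forall y/\theta)\, W(x, y) \rangle\| = \Delta_\theta \|\langle x/\tau, y/\theta \mid W(x, y) \rangle\|$
● So we can strip off any leading universal quantifiers until we are left with a twff that has only existential quantifiers (if any).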
Reduction of arbitrary queries to existential in closed databases (cont.)
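● If $Q = \langle x/\tau, z/\psi \mid (\Theta y/\theta)\, W(x, y, z) \rangle$ we define the projection of $\|Q\|$ with respect to ψ, $\pi_\psi \|Q\|$, as follows:
$\vec{c}^{(1)} + \dots + \vec{c}^{(m)} \in \pi_\psi \|Q\|$ iff
  · there exist constants $a_1, \dots, a_r \in |\psi|$ such that the sum, in the + notation, of all $(\vec{c}^{(i)}, a_j)$ with $i \le m$, $j \le r$ is an answer to Q
  · for no i, $1 \le i \le m$, does $\vec{c}^{(1)} + \dots + \vec{c}^{(i-1)} + \vec{c}^{(i+1)} + \dots + \vec{c}^{(m)}$ have the latter property
● Then, $\|\langle x/\tau \mid (\exists y/\theta)\, W(x, y) \rangle\| = \pi_\theta \|\langle x/\tau, y/\theta \mid W(x, y) \rangle\|$
● So we can strip off any existential quantifiers preceding universal quantifiers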
Reduction of arbitrary queries to existential in closed databases (cont.)
● Examples:
  · Suppose $\|Q\| = \{(A, a) + (B, b),\ (A, b),\ (B, a),\ (C, a),\ (C, b),\ (D, a),\ (E, a) + (F, a),\ (E, b)\}$ and $|\psi| = \{a, b\}$. Then $\Delta_\psi \|Q\| = \{A + B,\ C\}$
  · Suppose $Q = \langle x_1/\theta_1, x_2/\theta_2, z/\psi \mid W(x, y, z) \rangle$ and that $\|Q\| = \{(a, a, A) + (a, a, B) + (a, d, C),\ (a, b, A) + (a, c, D),\ (a, b, C)\}$. Then $\pi_\psi \|Q\| = \{(a, a) + (a, d),\ (a, b)\}$
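Both operators can be sketched in Python for finite sets of minimal answers. The sketch assumes that a set of tuples is an answer exactly when it contains some minimal answer, that the component being removed is the last one of each tuple, and, for the first example, that (C,b) is among the minimal answers, as the stated result requires. The function names are invented, and the projection is computed by dropping the last component and keeping the minimal results, which matches the definition only under these assumptions.

```python
from itertools import combinations, product


def is_answer(candidate, minimal):
    """A set of tuples is an answer when it contains some minimal answer."""
    return any(m <= candidate for m in minimal)


def keep_minimal(sets):
    """Drop every set that has a proper subset in the collection."""
    return {s for s in sets if not any(t < s for t in sets)}


def quotient(minimal, psi):
    """Delta_psi: the last component must range over all of |psi|."""
    heads = sorted({t[:-1] for m in minimal for t in m})
    found = set()
    for size in range(1, len(heads) + 1):
        for cs in combinations(heads, size):
            # Every way of pairing the c's with constants of |psi| must
            # yield an answer.
            if all(
                is_answer(frozenset(c + (a,) for c, a in zip(cs, choice)), minimal)
                for choice in product(sorted(psi), repeat=size)
            ):
                found.add(frozenset(cs))
    return keep_minimal(found)


def projection(minimal):
    """pi_psi: drop the last component of every tuple, then minimise."""
    return keep_minimal({frozenset(t[:-1] for t in m) for m in minimal})


q1 = {
    frozenset({("A", "a"), ("B", "b")}), frozenset({("A", "b")}),
    frozenset({("B", "a")}), frozenset({("C", "a")}), frozenset({("C", "b")}),
    frozenset({("D", "a")}), frozenset({("E", "a"), ("F", "a")}),
    frozenset({("E", "b")}),
}
q2 = {
    frozenset({("a", "a", "A"), ("a", "a", "B"), ("a", "d", "C")}),
    frozenset({("a", "b", "A"), ("a", "c", "D")}),
    frozenset({("a", "b", "C")}),
}
```

The exhaustive enumeration in `quotient` is exponential in the number of candidate tuples, so it is meant only for small examples such as these.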
Reduction of arbitrary queries to existential in closed databases (cont.)
● For closed databases,
$$DB \vdash (\exists y/\theta)\, W(y) \quad \text{iff} \quad DB \setminus DC \vdash (\exists y/\theta)\, W(y)$$
where DC is the domain closure axiom, and W is a quantifier-free twff
● Therefore, we can dispose of the domain closure axiom.
● The previous theorems are based on the τ-completeness of the TDB
● The aforementioned operators are appropriate generalizations of the division and projection operators of the relational algebra
● These operators allow us to compute the set of minimal answers to an arbitrary query by applying them to the minimal answers of an existential query
For example:
$$\|\langle x/\tau \mid (\exists y/\theta)(\forall z/\psi)(\exists w/\rho)\, W(x, y, z, w) \rangle\| = \pi_\theta \Delta_\psi \|\langle x/\tau, y/\theta, z/\psi \mid (\exists w/\rho)\, W(x, y, z, w) \rangle\|$$
● Their computation is straightforward in the case of definite answers
Special cases
● Let W be a twff in PNF such that $W = (\forall x/\tau)\, C(x)$, where C is quantifier-free. Then W is Horn iff C(x) has at most one positive literal. A database is Horn if all of its twffs are Horn
● Closed-world databases are databases with appropriate rules so that the negation of a ground literal is derived upon failure to derive the said literal. Therefore, only positive information need be stored.
● In the case of Horn databases with positive queries, and in the case of closed-world databases, no indefinite answers arise
Special cases (cont.)
● A database is -saturated iff it has unique name axioms for all constants in its domain
● In such a case, we can dispose of all equality axioms but the one of reflexivity
● This is of great importance since these axioms pose significant computational difficulties
Conclusion
● Certain types of databases, mainly closed databases and their special cases, exhibit a significant degree of simplification as far as computational feasibility is concerned
● The restriction to a language without function symbols is not very limiting; one can insert functions into the database in the guise of relations
● More generally, functions can be shown to pose no problem to the above findings, as long as they are total
● Partial functions may lead to indefinite answers even if the information is in store