IDEAS 2011
International Database Engineering & Applications
Symposium
September 21-23, Lisbon – Portugal
Aggregates and Priorities in P2P Data Management Systems
DEIS – University Of Calabria - Italy
Luciano Caroprese - Ester Zumpano
P2P Systems
[Figure: a P2P system made of peers P1, P2, P3 and P4; a query is posed to a peer.]
Each peer:
• is an autonomous system
• imports/exports data from/to other peers
• has local integrity constraints (IC): imported data should not 'violate' them
AN EXAMPLE
Peers: P1, P2
Mapping rule: q(X) ← r(X)
Integrity constraint: X=Y ⇐ q(X), q(Y)
Facts: r(a), r(b)
FOL semantics: both q(a) and q(b) are imported, which violates the integrity constraint.
The whole system is inconsistent!
To "isolate" the inconsistency…
The P2P system is consistent after removing the inconsistent P2.
AN EXAMPLE
Peers: P1, P2
Mapping rule: q(X) ← r(X)
Integrity constraint: X=Y ⇐ q(X), q(Y)
Facts: r(a), r(b)
Which are the 'true' atoms?
2 possible scenarios:
M1 = { r(a), r(b), q(a) }
M2 = { r(a), r(b), q(b) }
The first step is…
…modeling mapping rules to capture this semantics
Our proposed semantics for mapping rules
Peers: P1, P2
Mapping rule: q(X) ← r(X)
Integrity constraint: X=Y ⇐ q(X), q(Y)
Facts: r(a), r(b)
FOL semantics: q(X) ← r(X) is satisfied if Val(q(X)) ≥ Val(r(X))
New semantics: q(X) ← r(X) is satisfied if Val(q(X)) ≤ Val(r(X))
Possible scenarios:
{ r(a), r(b) }
{ r(a), r(b), q(a) }
{ r(a), r(b), q(b) }
Maximal weak model semantics
Our system PS (peers P1, P2):
q(X) ← r(X)
← q(X), q(Y), X≠Y     (that is, X=Y ⇐ q(X), q(Y))
r(a), r(b)
Its weak models…
M1 = { r(a), r(b) }
M2 = { r(a), r(b), q(a) }
M3 = { r(a), r(b), q(b) }
In each weak model we look for imported atoms.
Maximal weak models are those that contain a maximal subset of imported atoms (M2, M3).
Its maximal weak models:
MWM(PS) = { M2, M3 }
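The weak models and maximal weak models of this small system can be enumerated by brute force. The following is only a sketch for this one example, not the authors' algorithm: atoms are encoded as hypothetical `(predicate, constant)` pairs, and the names `weak_models` and `maximal_weak_models` are illustrative assumptions.

```python
from itertools import combinations

# Facts in the source peer, and the atoms that the mapping rule
# q(X) <- r(X) may import into the target peer (encoding is an assumption).
facts = {("r", "a"), ("r", "b")}
importable = {("q", x) for (_, x) in facts}

def satisfies_ic(model):
    """Integrity constraint X=Y <= q(X), q(Y): at most one q atom."""
    return len({x for (p, x) in model if p == "q"}) <= 1

def weak_models():
    """Each consistent choice of imported atoms yields one weak model."""
    models = []
    for k in range(len(importable) + 1):
        for chosen in combinations(sorted(importable), k):
            model = facts | set(chosen)
            if satisfies_ic(model):
                models.append(frozenset(model))
    return models

def maximal_weak_models(models):
    """Keep models whose imported atoms are not strictly contained in another's."""
    imported = {m: m & importable for m in models}
    return [m for m in models
            if not any(imported[m] < imported[n] for n in models)]

wm = weak_models()               # three weak models: M1, M2, M3
mwm = maximal_weak_models(wm)    # the two models that import one q atom
```

Exhaustive enumeration is exponential in the number of importable atoms, so it is only meant to make the definitions concrete.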
Modeling a P2P system with a disjunctive logic program with priorities
An equivalent characterization
A (positive) exclusive disjunctive Datalog rule is of the form:
A1 ⊕ … ⊕ Am ← B1, …, Bn
The program:
a ⊕ b ← c
c
has the minimal models M1 = { a, c }, M2 = { b, c }.
A priority rule is of the form: a ≥ b
The priority rule intuitively reads: a is preferable over b. Thus M1 is the preferred minimal model.
Modeling a P2P system with a disjunctive logic program with priorities
An equivalent characterization
If r(X) is true in the source peer then it is possible either to import or not to import q(X) in the target peer.
The mapping rule
q(X) ← r(X)
is rewritten as
q(X) ⊕ q'(X) ← r(X)
Obviously, we prefer to import as much knowledge as possible in each peer. Thus…
q(X) ≥ q'(X)
Preferred Minimal Model Semantics
Our system PS (peers P1, P2):
q(X) ← r(X)
← q(X), q(Y), X≠Y     (that is, X=Y ⇐ q(X), q(Y))
r(a), r(b)
Our system becomes…
q(X) ⊕ q'(X) ← r(X)
q(X) ≥ q'(X)     (used to select preferred models)
← q(X), q(Y), X≠Y
r(a), r(b)
Its minimal models…
M1 = { r(a), r(b), q'(a), q'(b) }
M2 = { r(a), r(b), q(a), q'(b) }
M3 = { r(a), r(b), q'(a), q(b) }
Its preferred minimal models:
PMM(PS) = { M2, M3 }
Deleting primed atoms, we obtain…
PMM(PS) = MWM(PS)
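The rewriting can be checked on the same example with a small script. This is a sketch under simplifying assumptions: the priority q(X) ≥ q'(X) is approximated by preferring models whose set of unprimed q atoms is larger under set inclusion, which is enough for this example but is not the general prioritized semantics. All function names are hypothetical.

```python
from itertools import product

facts = [("r", "a"), ("r", "b")]

def minimal_models():
    """Each instance of q(X) (+) q'(X) <- r(X) picks exactly one head atom."""
    models = []
    for choice in product(["q", "q'"], repeat=len(facts)):
        heads = {(p, x) for p, (_, x) in zip(choice, facts)}
        model = set(facts) | heads
        # Integrity constraint: at most one unprimed q atom.
        if sum(1 for (p, _) in model if p == "q") <= 1:
            models.append(frozenset(model))
    return models

def q_atoms(model):
    return {a for a in model if a[0] == "q"}

def preferred(models):
    """Simplified reading of q(X) >= q'(X): discard a model if another
    model contains a strict superset of its unprimed q atoms."""
    return [m for m in models
            if not any(q_atoms(m) < q_atoms(n) for n in models)]

def drop_primed(model):
    """Delete primed atoms to get back to the original vocabulary."""
    return frozenset(a for a in model if a[0] != "q'")

mm = minimal_models()                       # M1, M2, M3 of the rewritten program
pmm = preferred(mm)                         # M2 and M3
unprimed = {drop_primed(m) for m in pmm}    # the maximal weak models
```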
The problem with this framework is that it does not allow preferences to be set among maximal weak models.
Example…
"In the case of conflicting information, it is preferable to import data from the neighbor peer that can provide the maximum number of tuples."
"In the case of conflicting information, it is preferable to import data from the neighbor peer such that the sum of the values of an attribute is minimum."
Peers: P1, P2, P3
Mapping rules:
cons(1,N,S) ← emp(N,S)
cons(2,N,S) ← emp(N,S)
Integrity constraint: Pa=Pb ⇐ cons(Pa,Na,Sa), cons(Pb,Nb,Sb)
DB1 = { emp(john,200), emp(mary,50), emp(tom,50) }
DB2 = { emp(dan,200), emp(lucy,50) }
M1 = { cons(1,john,200), cons(1,mary,50), cons(1,tom,50) } ∪ DB1 ∪ DB2
M2 = { cons(2,dan,200), cons(2,lucy,50) } ∪ DB1 ∪ DB2
We introduce in our framework:
1) aggregate functions
2) priorities
New Framework…
Bag: b(Source,<<Salary>>) ← cons(Source,Name,Salary)
Set: s(Source,<Salary>) ← cons(Source,Name,Salary)
M1 = { cons(1,john,200), cons(1,mary,50), cons(1,tom,50), b(1, {200,50,50}), s(1, {200,50}) } ∪ DB1 ∪ DB2
M2 = { cons(2,dan,200), cons(2,lucy,50), b(2, {200,50}), s(2, {200,50}) } ∪ DB1 ∪ DB2
New Framework…
To bags and sets we can apply many aggregate functions:
1) MIN / MAX
2) AVG (average)
3) COUNT
4) SUM
S(Source,SUM<<Salary>>) ← cons(Source,Name,Salary)
M1 = { cons(1,john,200), cons(1,mary,50), cons(1,tom,50), S(1,300) } ∪ DB1 ∪ DB2
M2 = { cons(2,dan,200), cons(2,lucy,50), S(2,250) } ∪ DB1 ∪ DB2
Using aggregate functions we can derive aggregate data. Then we can apply priority rules.
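The bag, set and aggregate atoms above can be reproduced with a few lines of Python. This is a sketch only: the cons tuples of the two alternative models are placed in one list and grouped by source for compactness (in the framework they belong to different models), and the helper names `bags`, `sets` and `aggregate` are assumptions made for illustration.

```python
from collections import defaultdict

# cons(Source, Name, Salary) tuples from M1 (source 1) and M2 (source 2).
cons = [(1, "john", 200), (1, "mary", 50), (1, "tom", 50),
        (2, "dan", 200), (2, "lucy", 50)]

def bags(tuples):
    """b(Source, <<Salary>>): salaries per source, duplicates kept."""
    out = defaultdict(list)
    for source, _name, salary in tuples:
        out[source].append(salary)
    return dict(out)

def sets(tuples):
    """s(Source, <Salary>): salaries per source, duplicates removed."""
    return {src: set(vals) for src, vals in bags(tuples).items()}

def aggregate(tuples, fn):
    """Apply an aggregate function (sum, len, min, max, ...) to each bag."""
    return {src: fn(vals) for src, vals in bags(tuples).items()}

b = bags(cons)
s = sets(cons)
total = aggregate(cons, sum)   # S(Source, SUM<<Salary>>)
count = aggregate(cons, len)   # COUNT over the bag
```

Aggregating over the bag rather than the set matters for SUM and COUNT: source 1 has two employees earning 50, and both must be counted.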
Our goal…
Peers: P1, P2, P3
LP:
cons(1,N,S) ← emp(N,S)
cons(2,N,S) ← emp(N,S)
S(Source,SUM<<Salary>>) ← cons(Source,Name,Salary)
IC:
Pa=Pb ⇐ cons(Pa,Na,Sa), cons(Pb,Nb,Sb)
Priorities:
<{S(P1,Sum1) ≥ S(P2,Sum2) | Sum1 < Sum2}>
Facts:
emp(john,200), emp(mary,50), emp(tom,50)
emp(dan,200), emp(lucy,50)
M1 = { cons(1,john,200), cons(1,mary,50), cons(1,tom,50), S(1,300) } ∪ DB1 ∪ DB2
M2 = { cons(2,dan,200), cons(2,lucy,50), S(2,250) } ∪ DB1 ∪ DB2
The complete framework allows many levels of preferences to be defined!
Levels of preferences…
Peers: P1, P2, P3
LP:
cons(1,N,S) ← emp(N,S)
cons(2,N,S) ← emp(N,S)
C(Source,COUNT<<Salary>>) ← cons(Source,Name,Salary)
S(Source,SUM<<Salary>>) ← cons(Source,Name,Salary)
IC:
Pa=Pb ⇐ cons(Pa,Na,Sa), cons(Pb,Nb,Sb)
Priorities:
<{C(P1,Count1) ≥ C(P2,Count2) | Count1 > Count2},
{S(P1,Sum1) ≥ S(P2,Sum2) | Sum1 < Sum2}>
Facts:
emp(john,200), emp(mary,50)
emp(dan,150), emp(lucy,50)
We prefer to import tuples from the peer that can provide the maximum number of tuples. If the peers provide the same number of tuples, we prefer to import tuples from the peer for which the total salary is minimum.
M1 = { cons(1,john,200), cons(1,mary,50), C(1,2), S(1,250) } ∪ DB1 ∪ DB2
M2 = { cons(2,dan,150), cons(2,lucy,50), C(2,2), S(2,200) } ∪ DB1 ∪ DB2
We allow many levels of priorities.
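The two preference levels can be read as successive filters over the candidate models: first keep the models with the highest COUNT, then, among those, keep the ones with the lowest SUM. The sketch below assumes each model is summarized by its source and its aggregate values; `summarize`, `apply_levels` and the dictionary layout are hypothetical.

```python
def summarize(source, db):
    """Aggregate atoms C(Source, Count) and S(Source, Sum) for one model."""
    return {"source": source,
            "C": len(db),
            "S": sum(salary for _name, salary in db)}

def apply_levels(models, levels):
    """Apply each level in order; a level is (aggregate key, min or max)."""
    survivors = list(models)
    for key, better in levels:
        best = better(m[key] for m in survivors)
        survivors = [m for m in survivors if m[key] == best]
    return survivors

# Level 1: maximum number of tuples. Level 2: minimum total salary.
levels = [("C", max), ("S", min)]

m1 = summarize(1, [("john", 200), ("mary", 50)])
m2 = summarize(2, [("dan", 150), ("lucy", 50)])
chosen = apply_levels([m1, m2], levels)
```

Here both models tie on the first level (two tuples each), so the second level decides and the model importing from source 2 (total salary 200) is selected. A single-level priority such as the SUM-only one is the special case of a one-element `levels` list.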
Extended Prioritized Logic Program
The program:
a ⊕ b ← c
d ⊕ e ← c
c
has the minimal models M1 = { a, c, d }, M2 = { b, c, d }, M3 = { a, c, e }, M4 = { b, c, e }.
A preference rule is of the form: <{a ≥ b}, {d ≥ e}>
We first apply the first level, containing a ≥ b, and then the second level, containing d ≥ e.
We extend the previous rewriting to allow levels of priorities.
Let us suppose that our P2P system has just one mapping rule and the following levels of priorities.
Mapping rule: q(a) ← r(a)
Priorities: <level 1, ..., level n>
The rewriting yields:
q(a) ⊕ q'(a) ← r(a)
<{q(a) ≥ q'(a)}, level 1, ..., level n>
Computation
The levels of priorities will be used sequentially in order to select the preferred stable models of the logic program.
The priorities derived from mapping rules are the most important (the first level)!
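Sequential use of the levels can be illustrated on the program with models M1 to M4 and priorities <{a ≥ b}, {d ≥ e}>. The sketch below uses a deliberately simplified notion of dominance (an assumption, not the formal definition): a model beats another at a level if some atom found only in the first has priority over some atom found only in the second, and not the other way round. The names `dominates` and `select` are hypothetical.

```python
def dominates(m, n, level):
    """Simplified test: m beats n at this level of priorities."""
    only_m, only_n = m - n, n - m
    forward = any((x, y) in level for x in only_m for y in only_n)
    backward = any((y, x) in level for x in only_m for y in only_n)
    return forward and not backward

def select(models, levels):
    """Filter the models level by level, most important level first."""
    survivors = list(models)
    for level in levels:
        survivors = [m for m in survivors
                     if not any(dominates(n, m, level) for n in survivors)]
    return survivors

M1 = frozenset("acd")
M2 = frozenset("bcd")
M3 = frozenset("ace")
M4 = frozenset("bce")

# Each level is a set of pairs (x, y) meaning x >= y.
levels = [{("a", "b")}, {("d", "e")}]
result = select([M1, M2, M3, M4], levels)
```

The first level discards M2 and M4 (they contain b instead of a); the second level then discards M3 (it contains e instead of d), leaving M1.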
Conclusions
This work enhances a previous semantics for P2P systems by introducing aggregate functions and priorities in order to define preferences among maximal weak models.
It also presents an alternative characterization of the proposed semantics, rewriting the P2P system into an extended prioritized logic program.
Complexity Results
The problem of deciding whether an atom is true in some preferred weak model is $\Sigma_2^p$-complete.
The problem of deciding whether an atom is true in all preferred weak models is $\Pi_2^p$-complete.