Consistent Query Answering Under Inclusion Dependencies

Authors: Loreto Bravo and Leopoldo Bertossi

Carleton University, Canada

Presented by: Zhijun Lin

Advisors: Dr. Hactor Hernandez

Dr. Yuanlin Zhang

Integrity Constraints

Integrity constraints (ICs) describe the valid instances of a database.

For example:

- “Every student has a unique ID number.”

- “Students can enroll only in offered courses.”

- “No employee can have a higher salary than his or her manager.”

Inconsistent databases

Inconsistent database: a database that violates the given integrity constraints.

Some reasons for a database to be inconsistent:

- The DBMS does not enforce all ICs.

- Data from different databases is integrated.

- New constraints are imposed on a pre-existing database.

- Soft or user constraints.

Inconsistent databases

In several cases we do not want to repair the database to restore consistency:

- no permission,

- too expensive,

- temporary inconsistency.

How can we obtain consistent query answers from an inconsistent database?

Example

A database instance r:

Student
Name    Grade
John    90
John    80
Smith   70

IC (functional dependency): Name → Grade.

Example

If only deletion or insertion of whole tuples is allowed, there are two ways to repair the database with minimal changes.

Repair 1:

Student
Name    Grade
John    90
Smith   70

Repair 2:

Student
Name    Grade
John    80
Smith   70

Note that Student(Smith, 70) persists in both repairs, whereas Student(John, 90) does not.

Repair

A repair of a database instance r is a database instance r’ that

- is over the same database schema and domain,

- satisfies the ICs,

- differs from r by a set of changes (insertions or deletions of tuples) that is minimal with respect to set inclusion.
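
The definition can be tried out on the Student instance from the earlier example. The following is a minimal Python sketch, not the method of the paper: it assumes the only IC is the functional dependency Name → Grade and considers deletions only, since inserting a tuple can never remove a violation of a functional dependency. The function names are hypothetical.

```python
from itertools import combinations

# Instance r from the running example; tuples are (name, grade).
r = frozenset({("John", 90), ("John", 80), ("Smith", 70)})

def satisfies_fd(instance):
    """Check the functional dependency Name -> Grade."""
    grades = {}
    for name, grade in instance:
        if grades.setdefault(name, grade) != grade:
            return False
    return True

def repairs_by_deletion(instance):
    """Consistent subsets whose set of deleted tuples is minimal wrt set inclusion."""
    tuples = sorted(instance)
    consistent = [
        frozenset(subset)
        for size in range(len(tuples) + 1)
        for subset in combinations(tuples, size)
        if satisfies_fd(subset)
    ]
    # A minimal set of deletions is the same thing as a maximal consistent subset.
    return [s for s in consistent if not any(s < other for other in consistent)]

for repair in repairs_by_deletion(r):
    print(sorted(repair))
```

The brute-force search is exponential in the number of tuples, so it only serves to illustrate the definition on small instances.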

Consistent Query Answer

A tuple (a1,…,an) is a consistent query answer to a query Q(x1,…,xn) in a database r if it is an answer to Q in every repair of r.

Example

Repair 1:

Student
Name    Grade
John    90
Smith   70

Repair 2:

Student
Name    Grade
John    80
Smith   70

Student(Smith, 70) is a consistent answer.

Student(John, 90) is not a consistent answer.

For the query asking for students with a higher grade than Smith, John is a consistent answer, because his grade exceeds Smith's in both repairs.
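
The two observations above can be reproduced by intersecting query answers over the repairs. This is a small Python sketch under the assumption that the two repairs are given explicitly; the query functions and their names are illustrative, not part of the original slides.

```python
# The two repairs of the Student instance, as sets of (name, grade) tuples.
repairs = [
    {("John", 90), ("Smith", 70)},
    {("John", 80), ("Smith", 70)},
]

def all_tuples(instance):
    """Query Q1(name, grade): every Student tuple."""
    return set(instance)

def better_than_smith(instance):
    """Query Q2(name): students with a strictly higher grade than Smith."""
    smith_grades = [g for n, g in instance if n == "Smith"]
    return {(n,) for n, g in instance if any(g > s for s in smith_grades)}

def consistent_answers(query, repairs):
    """A tuple is a consistent answer if the query returns it in every repair."""
    return set.intersection(*(query(rep) for rep in repairs))

print(consistent_answers(all_tuples, repairs))         # {('Smith', 70)}
print(consistent_answers(better_than_smith, repairs))  # {('John',)}
```

Note that John is a consistent answer to the second query even though neither of his Student tuples is a consistent answer to the first.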

Classes of ICs

Consider two important classes of ICs:

- Universal integrity constraints (UICs)

- Referential integrity constraints (RICs), also known as inclusion dependencies (INDs).

UIC and RIC

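A universal integrity constraint has the form

$$\forall \bar{x}_1, \ldots, \bar{x}_n \left[ \bigvee_{i=1}^{m} P_i(\bar{x}_i) \lor \bigvee_{i=m+1}^{n} \neg P_i(\bar{x}_i) \lor \varphi(\bar{x}_1, \ldots, \bar{x}_n) \right] \qquad (2)$$

A referential integrity constraint has the form

$$\forall \bar{x}_1 \, \exists \bar{x}_3 \left[ P(\bar{x}_1) \rightarrow Q(\bar{x}_2, \bar{x}_3) \right] \qquad (3)$$

where the $\bar{x}_i$ are sequences of distinct variables, with $\bar{x}_2$ contained in $\bar{x}_1$, and $P$, $Q$ are database relations.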

Example of UIC

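Functional dependency “Emp: id → dept” can be expressed as:

$$\forall \mathit{id}\, \forall \mathit{dept}_1\, \forall \mathit{dept}_2\, (\mathit{Emp}(\mathit{id}, \mathit{dept}_1) \land \mathit{Emp}(\mathit{id}, \mathit{dept}_2) \rightarrow \mathit{dept}_1 = \mathit{dept}_2).$$

which is equivalent to:

$$\forall \mathit{id}\, \forall \mathit{dept}_1\, \forall \mathit{dept}_2\, (\neg \mathit{Emp}(\mathit{id}, \mathit{dept}_1) \lor \neg \mathit{Emp}(\mathit{id}, \mathit{dept}_2) \lor \mathit{dept}_1 = \mathit{dept}_2).$$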

Example of RIC

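Consider a database schema {Emp(id, dept), People(id, name)}. To represent the IND “Emp[id] ⊆ People[id]”, which says that employees are people, we use the RIC:

$$\forall \mathit{id}\, \forall \mathit{dept}\, \exists \mathit{name}\, [\mathit{Emp}(\mathit{id}, \mathit{dept}) \rightarrow \mathit{People}(\mathit{id}, \mathit{name})]$$

which is equivalent to:

$$\forall \mathit{id}\, \forall \mathit{dept}\, (\mathit{Emp}(\mathit{id}, \mathit{dept}) \rightarrow \exists \mathit{name}\, \mathit{People}(\mathit{id}, \mathit{name}))$$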

Special treatment of null-value

A UIC holds if it is satisfied by the tuples with non-null values.

For example, {Student(john,90), Student(john,null)} satisfies the UIC Name → Grade.

A RIC is satisfied if it holds when only non-null values are considered for the universally quantified variables, while any value (including null) is accepted for the existentially quantified variables.

For example, {Emp(777,CS), People(777,null)} satisfies the RIC Emp[id] ⊆ People[id], and so do {Emp(555,null)} and {Emp(null,cs)}.
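
The null-value semantics can be checked mechanically on the instances above. The sketch below is one possible Python reading of the slide, with `None` standing in for null; the function names are hypothetical, and the treatment of nulls follows only what is stated on this slide.

```python
NULL = None  # stand-in for the database null value

def satisfies_fd_name_grade(student):
    """UIC Name -> Grade, checked only on tuples without nulls."""
    seen = {}
    for name, grade in student:
        if name is NULL or grade is NULL:
            continue
        if seen.setdefault(name, grade) != grade:
            return False
    return True

def satisfies_ind(emp, people):
    """RIC Emp[id] <= People[id] under the null-aware semantics."""
    people_ids = {pid for pid, _name in people}
    for emp_id, dept in emp:
        # A null in a universally quantified position: nothing to check.
        if emp_id is NULL or dept is NULL:
            continue
        # Any value, including null, is fine for the existential attribute name.
        if emp_id not in people_ids:
            return False
    return True

print(satisfies_fd_name_grade({("john", 90), ("john", NULL)}))
print(satisfies_ind({(777, "CS")}, {(777, NULL)}))
print(satisfies_ind({(555, NULL)}, set()))
print(satisfies_ind({(555, "CS")}, set()))
```

The last call is an added contrast case: an Emp tuple without nulls and with no matching People tuple violates the RIC.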

Example

Given database: D = { emp(john,cs), emp(mary,ee), dept(ee), salary(mary,2000) }

UIC: emp(X,Y) → dept(Y)

RIC: emp(X,Y) → ∃Z salary(X,Z)

Repair 1
  New database instance: { emp(john,cs), salary(john,null), dept(cs), emp(mary,ee), dept(ee), salary(mary,2000) }
  Changes: insert salary(john,null), dept(cs)

Repair 2
  New database instance: { emp(mary,ee), dept(ee), salary(mary,2000) }
  Changes: delete emp(john,cs)
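
The two repairs in this example can be recovered by brute force. The Python sketch below makes several assumptions that are not in the slides: the RIC is read as requiring a salary tuple for the employee X (the reading the repair table relies on), facts are encoded as string tuples, and the only insertions considered are the dept facts the UIC can ask for and salary tuples with null in the existential position.

```python
from itertools import combinations

NULL = "null"

D = frozenset({
    ("emp", "john", "cs"), ("emp", "mary", "ee"),
    ("dept", "ee"), ("salary", "mary", "2000"),
})

def consistent(db):
    """UIC emp(X,Y) -> dept(Y) and RIC emp(X,Y) -> exists Z salary(X,Z)."""
    depts = {t[1] for t in db if t[0] == "dept"}
    paid = {t[1] for t in db if t[0] == "salary"}
    for t in db:
        if t[0] == "emp" and NULL not in t[1:]:
            if t[2] not in depts or t[1] not in paid:
                return False
    return True

def repairs(db):
    """Consistent instances whose symmetric difference with db is minimal."""
    emps = [t for t in db if t[0] == "emp"]
    # Candidate insertions (an assumption of this sketch, see the lead-in).
    candidates = {("dept", t[2]) for t in emps} | {("salary", t[1], NULL) for t in emps}
    universe = sorted(db | candidates)
    instances = [
        frozenset(c)
        for k in range(len(universe) + 1)
        for c in combinations(universe, k)
        if consistent(frozenset(c))
    ]
    deltas = {inst: inst ^ db for inst in instances}
    return [i for i in instances if not any(deltas[j] < deltas[i] for j in instances)]

for rep in repairs(D):
    print(sorted(rep ^ D), "->", sorted(rep))
```

Each printed line shows the set of changes followed by the repaired instance, which can be compared with the two repairs listed above.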

Use ASP to compute repairs

1. % dom(a) for all constants a != null.
   dom(john). dom(mary). dom(cs). dom(ee). dom(2000).

2. % td denotes a database fact.
   emp(john,cs,td). emp(mary,ee,td). salary(mary,2000,td). dept(ee,td).

3. % t1 denotes true or becomes true.
   % ta denotes advised to be true.
   % (Also for salary and dept.)
   emp(A,B,t1) :- emp(A,B,td), dom(A), dom(B).
   emp(A,B,t1) :- emp(A,B,ta), dom(A), dom(B).

Use ASP to compute repairs

4. % Repair for the UIC: emp(X,Y) → dept(Y).
   % fa denotes advised to be false.
   emp(A,B,fa) v dept(B,ta) :- emp(A,B,t1), not dept(B,td), dom(A), dom(B).
   emp(A,B,fa) v dept(B,ta) :- emp(A,B,t1), dept(B,fa), dom(A), dom(B).

Use ASP to compute repairs

5. % Repair for the RIC: emp(X,Y) → ∃Z salary(X,Z).
   % aux(A) means that salary(A,Z) is in the final database for some Z.
   emp(A,B,fa) v salary(A,null,ta) :- emp(A,B,t1), not aux(A), not salary(A,null,td), dom(A), dom(B).

   aux(A) :- salary(A,Z,td), not salary(A,Z,fa), dom(A), dom(Z).
   aux(A) :- salary(A,Z,ta), dom(A), dom(Z).

Use ASP to compute repairs

6. % t2 denotes true in the repair. (Also for dept and salary.)
   emp(A,B,t2) :- emp(A,B,ta).
   emp(A,B,t2) :- emp(A,B,td), not emp(A,B,fa).

7. % A tuple cannot be both deleted and inserted. (Also for dept and salary.)
   :- emp(A,B,ta), emp(A,B,fa).

Consistent Query Answering

For the query ?emp(X,Y), add the rule

ans(X,Y) :- emp(X,Y,t2).

to the repair program. If ans(a,b) appears in every stable model, then (a,b) is a consistent answer to the query.

How does it work

The basic idea behind the repair program:

Whenever an IC may be violated, the program lists the possible repair actions (insertion or deletion of tuples) as a disjunction.

Since ASP produces answer sets that are minimal with respect to set inclusion, the changes should also be minimal with respect to set inclusion, which matches the definition of a repair.

Problem

Now consider D = { p(a,a) } and the RIC p(X,Y) → ∃Z p(Y,Z). Clearly D satisfies the RIC.

But the repair program also generates a redundant repair, which deletes p(a,a).

Grounded program

dom(a).
p(a,a,td).
p(a,a,t1) :- p(a,a,td), dom(a).

p(a,a,fa) v p(a,null,ta) :- p(a,a,t1), not aux(a), not p(a,null,td), dom(a).

aux(a) :- p(a,a,td), not p(a,a,fa), dom(a).
aux(a) :- p(a,a,ta), dom(a).

% p(a,a,fa) justifies itself.
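
The self-justification can be observed by enumerating the stable models of this ground program. The following Python sketch is a naive brute-force enumerator written for illustration only (it is not how DLV or any other ASP solver works), and the string names for ground atoms are an assumed encoding. With p(a,a,fa) assumed true, the first aux rule is blocked, aux(a) is absent, and the disjunctive rule fires and derives p(a,a,fa).

```python
from itertools import combinations

# A ground rule is (head, positive_body, negative_body);
# a head with more than one atom is a disjunction.
PROGRAM = [
    (("dom_a",), (), ()),
    (("p_aa_td",), (), ()),
    (("p_aa_t1",), ("p_aa_td", "dom_a"), ()),
    (("p_aa_fa", "p_anull_ta"), ("p_aa_t1", "dom_a"), ("aux_a", "p_anull_td")),
    (("aux_a",), ("p_aa_td", "dom_a"), ("p_aa_fa",)),
    (("aux_a",), ("p_aa_ta", "dom_a"), ()),
]

def subsets(atoms):
    atoms = sorted(atoms)
    for k in range(len(atoms) + 1):
        for combo in combinations(atoms, k):
            yield frozenset(combo)

def is_model(interp, positive_rules):
    """Every rule whose body holds must have some head atom true."""
    return all(
        not set(body) <= interp or set(head) & interp
        for head, body in positive_rules
    )

def stable_models(program):
    atoms = {a for head, pos, neg in program for a in head + pos + neg}
    result = []
    for cand in subsets(atoms):
        # Gelfond-Lifschitz reduct: drop rules blocked by cand, then drop "not".
        reduct = [(h, p) for h, p, n in program if not set(n) & cand]
        if is_model(cand, reduct) and not any(
            is_model(smaller, reduct) for smaller in subsets(cand) if smaller != cand
        ):
            result.append(cand)
    return result

for model in stable_models(PROGRAM):
    print(sorted(model))
```

One stable model keeps p(a,a) and contains aux_a; the other contains p_aa_fa, which corresponds to the redundant repair that deletes p(a,a).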

Other example with Circular justification

RICs: p(X,Y) → ∃Z q(Y,Z) and q(X,Y) → ∃Z p(Y,Z).

D = { p(a,b), q(b,a) }

Program:

p(a,b,td). q(b,a,td).

p(a,b,fa) v q(b,null,ta) :- p(a,b,t1), not aux1(b).
q(b,a,fa) v p(a,null,ta) :- q(b,a,t1), not aux2(a).

aux1(b) :- q(b,a,td), not q(b,a,fa).
aux2(a) :- p(a,b,td), not p(a,b,fa).

Cyclic / Acyclic RICs

- A set of RICs is said to be acyclic if there is no cycle in the directed graph whose vertices are the relations of the schema and which has an edge from P to R for each RIC P(X1) → ∃Z R(X2,Z). Otherwise it is cyclic.

- Examples of cyclic sets of RICs:

  1. ∀X∀Y (p(X,Y) → ∃Z q(Y,Z)) and ∀X∀Y (q(X,Y) → ∃Z p(Y,Z)).

  2. ∀X∀Y (p(X,Y) → ∃Z p(Y,Z)).

[Figure: dependency graphs of the two examples. In example 1, p and q point to each other; in example 2, p has a self-loop.]
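
The acyclicity test amounts to cycle detection in the dependency graph. A minimal Python sketch follows, assuming each RIC is given simply as a pair (P, R) of relation names; the function name and this input format are illustrative choices.

```python
def is_acyclic(rics):
    """rics: iterable of (P, R) pairs, one per RIC P(X1) -> exists Z R(X2, Z)."""
    graph = {}
    for src, dst in rics:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {v: WHITE for v in graph}

    def has_cycle_from(v):
        colour[v] = GREY
        for w in graph[v]:
            if colour[w] == GREY:
                return True  # back edge: w is on the current path
            if colour[w] == WHITE and has_cycle_from(w):
                return True
        colour[v] = BLACK
        return False

    return not any(colour[v] == WHITE and has_cycle_from(v) for v in graph)

print(is_acyclic([("p", "q"), ("q", "p")]))  # False
print(is_acyclic([("p", "p")]))              # False
print(is_acyclic([("emp", "salary")]))       # True
```

The first two calls correspond to the two cyclic examples; the third is the single RIC from the emp/salary example, which is acyclic.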

Problem

The problem shown earlier arises only for cyclic RICs.

The authors concluded that their repair program generates exactly the repairs for UICs and acyclic RICs. When cyclic RICs are present, the program produces a superset of the set of repairs.

Can we fix the repair program to make it work for cyclic RICs?

New repair program

Our solution is to add constraints that prevent redundant changes.

Suppose we have a cyclic RIC set {p, q}, and in the old program p(A,B,fa) and q(B,A,fa) justify each other.

The repair rules for these RICs in the old program look like:

p(A,B,fa) v q(B,null,ta) :- G1.    % r1
q(A,B,fa) v p(B,null,ta) :- G2.    % r2

and assume there is another repair rule (not for a cyclic RIC) that involves p(A,B,fa):

p(A,B,fa) v H :- G3.    % r3

New repair program

First we rewrite r1-r3 to:

p(A,B,fa) v q(B,null,ta) :- G1, not other_p_fa(1,A,B).
p_fa(1,A,B) :- p(A,B,fa), G1, not other_p_fa(1,A,B).

q(A,B,fa) v p(B,null,ta) :- G2, not other_q_fa(1,A,B).
q_fa(1,A,B) :- q(A,B,fa), G2, not other_q_fa(1,A,B).

p(A,B,fa) v H :- G3.
p_fa(0,A,B) :- p(A,B,fa), G3.

New repair program

Add the following rules:

1. Suppose we have only one cyclic RIC set:

   type(1).  % repair rules for cyclic RIC violations.
   type(0).  % repair rules for other IC violations.

2. other_p_fa(X,A,B) :- p_fa(Y,A,B), X != Y, type(X), type(Y).
   other_q_fa(X,A,B) :- q_fa(Y,A,B), X != Y, type(X), type(Y).

3. Deny circular justification:

   :- p_fa(1,A,B), q_fa(1,B,A).

An interesting observation

The main idea of our new program is to avoid circular justification. Our method can also be used in the ASP-to-SAT translation process.

Consider the program P = { a :- b.  b :- a. }. Its completion, Comp(P) = $\{ a \leftrightarrow b,\ b \leftrightarrow a \}$, has two models, {} and {a,b}, while P has one answer set, {}.

We want to prevent circular justification between a and b.
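
The gap between the completion and the answer sets of P can be verified by brute force. The sketch below is illustrative Python for small normal programs only; the rule encoding (head, positive body, negative body) and the function names are assumptions of this example.

```python
from itertools import combinations

# Normal rules as (head, positive_body, negative_body).
P = [("a", ("b",), ()), ("b", ("a",), ())]
ATOMS = ["a", "b"]

def interpretations(atoms):
    for k in range(len(atoms) + 1):
        for combo in combinations(atoms, k):
            yield frozenset(combo)

def body_holds(pos, neg, interp):
    return set(pos) <= interp and not set(neg) & interp

def completion_models(program, atoms):
    """Models of the completion: an atom is true iff some rule body for it holds."""
    return [
        i for i in interpretations(atoms)
        if all(
            (x in i) == any(body_holds(p, n, i) for h, p, n in program if h == x)
            for x in atoms
        )
    ]

def answer_sets(program, atoms):
    """Interpretations equal to the least model of their own reduct."""
    result = []
    for i in interpretations(atoms):
        reduct = [(h, p) for h, p, n in program if not set(n) & i]
        least, changed = set(), True
        while changed:
            changed = False
            for h, p in reduct:
                if set(p) <= least and h not in least:
                    least.add(h)
                    changed = True
        if least == i:
            result.append(i)
    return result

print(completion_models(P, ATOMS))  # the empty set and {a, b}
print(answer_sets(P, ATOMS))        # only the empty set
```

The model {a,b} of the completion is exactly the circular justification: a is supported only by b and b only by a.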

ASP to SAT

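Rewrite P to P’:

a :- b, not a2.       % a2: some other rule makes a true.
a1 :- a, b, not a2.
b :- a, not b2.
b1 :- b, a, not b2.
:- a1, b1.

Now Comp(P’) = $\{ a \leftrightarrow (b \land \neg a_2),\ a_1 \leftrightarrow (a \land b \land \neg a_2),\ b \leftrightarrow (a \land \neg b_2),\ b_1 \leftrightarrow (b \land a \land \neg b_2),\ \neg(a_1 \land b_1),\ \neg a_2,\ \neg b_2 \}$, which has only one model, {}.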

ASP to SAT

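Suppose we add the fact a to program P. The rewritten program P’ should then add { a.  a2 :- a. }. Then

Comp(P’) = $\{ a,\ a_2 \leftrightarrow a,\ a_1 \leftrightarrow (a \land b \land \neg a_2),\ b \leftrightarrow (a \land \neg b_2),\ b_1 \leftrightarrow (b \land a \land \neg b_2),\ \neg(a_1 \land b_1),\ \neg b_2 \}$

It now has the single model {a, b, a2, b1}, which corresponds to the answer set of P, namely {a, b}.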
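
Both claims about Comp(P’) can be checked by enumerating interpretations. The Python sketch below is a brute-force check under an assumed encoding: rules are (head, positive body, negative body), integrity constraints are bodies that must not hold, and an atom without any rule is forced to be false by the completion.

```python
from itertools import combinations

def completion_models(rules, constraints, atoms):
    """Models of the completion of `rules` that violate no constraint body."""
    def holds(pos, neg, i):
        return set(pos) <= i and not set(neg) & i

    models = []
    for k in range(len(atoms) + 1):
        for combo in combinations(atoms, k):
            i = frozenset(combo)
            iff_ok = all(
                (x in i) == any(holds(p, n, i) for h, p, n in rules if h == x)
                for x in atoms
            )
            if iff_ok and not any(holds(p, n, i) for p, n in constraints):
                models.append(i)
    return models

ATOMS = ["a", "a1", "a2", "b", "b1", "b2"]
P_PRIME = [
    ("a", ("b",), ("a2",)),
    ("a1", ("a", "b"), ("a2",)),
    ("b", ("a",), ("b2",)),
    ("b1", ("b", "a"), ("b2",)),
]
NO_CIRCULAR = [(("a1", "b1"), ())]  # the constraint  :- a1, b1.

print(completion_models(P_PRIME, NO_CIRCULAR, ATOMS))

# P' after adding the fact a, together with  a2 :- a.
WITH_FACT = P_PRIME + [("a", (), ()), ("a2", ("a",), ())]
print(completion_models(WITH_FACT, NO_CIRCULAR, ATOMS))
```

The first call yields only the empty model; the second yields only {a, a2, b, b1}, whose restriction to the atoms of P is {a, b}.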

THE END