Query Folding Xiaolei Qian Presented by Ram Kumar Vangala

Query Folding

Query folding refers to the activity of determining if and how a query can be answered using a given set of resources.

Resources can be views or cached results of previous queries.

Why Query Folding

The base relations referred to in a query might be stored remotely, and accessing them might be expensive.

Accessing the database might not be possible because of network problems (the client is disconnected).

The database might exist only conceptually and not be physically available.

Query Folding Is Used For

Query optimization in centralized databases

Query processing in distributed databases

Query answering in federated databases

Example

Patients (patient_id, clinic, dob, insurance)
Physician (physician_id, clinic, pager_no)
Drugs (drug_name, generic)
Notes (note_id, patient_id, physician_id, note_text)
Allergy (note_id, drug_name, allergy_text)
Prescription (note_id, drug_name, prescription_text)

Suppose that the database maintains a materialized view defined as

CREATE VIEW Drug_Allergy (patient_id, drug_name, text) AS
  SELECT patient_id, drug_name, allergy_text
  FROM Notes, Allergy
  WHERE Notes.note_id = Allergy.note_id

General Query

A user might use the following query to get the IDs and allergy text of Palo Alto clinic patients who are allergic to drug xd_2001.

SELECT Patients.patient_id, allergy_text
FROM Patients, Notes, Allergy
WHERE Patients.patient_id = Notes.patient_id
  AND Notes.note_id = Allergy.note_id
  AND clinic = 'palo_alto' AND drug_name = 'xd_2001'

Folded Query Using View

SELECT Patients.patient_id, text
FROM Patients, Drug_Allergy
WHERE Patients.patient_id = Drug_Allergy.patient_id
  AND clinic = 'palo_alto' AND drug_name = 'xd_2001'

This query can be more efficient than the original query because the join of Notes and Allergy is already materialized in Drug_Allergy.
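As a quick sanity check that the folded query returns the same answer as the original, the two can be run side by side. The sketch below uses Python's built-in sqlite3 module; the sample rows are invented for illustration, and because SQLite has no materialized views, an ordinary view stands in for Drug_Allergy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Patients (patient_id, clinic, dob, insurance);
CREATE TABLE Notes (note_id, patient_id, physician_id, note_text);
CREATE TABLE Allergy (note_id, drug_name, allergy_text);

-- Toy rows, invented for this sketch.
INSERT INTO Patients VALUES (1, 'palo_alto', '1970-01-01', 'acme'),
                            (2, 'menlo_park', '1980-02-02', 'acme');
INSERT INTO Notes VALUES (10, 1, 100, 'visit'), (11, 2, 100, 'visit');
INSERT INTO Allergy VALUES (10, 'xd_2001', 'rash'), (11, 'xd_2001', 'hives');

-- An ordinary view stands in for the materialized view (assumption).
CREATE VIEW Drug_Allergy (patient_id, drug_name, text) AS
  SELECT patient_id, drug_name, allergy_text
  FROM Notes, Allergy
  WHERE Notes.note_id = Allergy.note_id;
""")

ORIGINAL = """
SELECT Patients.patient_id, allergy_text
FROM Patients, Notes, Allergy
WHERE Patients.patient_id = Notes.patient_id
  AND Notes.note_id = Allergy.note_id
  AND clinic = 'palo_alto' AND drug_name = 'xd_2001'
"""

FOLDED = """
SELECT Patients.patient_id, text
FROM Patients, Drug_Allergy
WHERE Patients.patient_id = Drug_Allergy.patient_id
  AND clinic = 'palo_alto' AND drug_name = 'xd_2001'
"""

original_rows = sorted(conn.execute(ORIGINAL).fetchall())
folded_rows = sorted(conn.execute(FOLDED).fetchall())
print(original_rows == folded_rows)
```

Agreement on one toy database does not prove equivalence; it only illustrates what the folding is meant to preserve.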

Query containment is a special case of query folding.

The containment problem for conjunctive queries is known to be NP-complete.

NP-complete: the hardest problems in NP, for which no polynomial-time algorithm is known.

Conjunctive Queries

Queries that result from project-select-join expressions in which the selection conditions are restricted to equality.

Conjunctive query form:
h :- p1, ..., pn

where h, p1, ..., pn are atomic formulas whose arguments are variables or constants, h is the head, and p1, ..., pn is the body.

Variables in the head are distinguished and must also appear in the body.

X, Y: distinguished variables
W, U: other variables
A, B: constants

Example of a conjunctive query

q(X,Y) :- patients(X, palo_alto, W1, W2), notes(W3, X, W4, W5), allergy(W3, xd_2001, Y)
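To make the rule notation concrete, here is a minimal sketch of evaluating a conjunctive query over a small set of facts. The tuple encoding of atoms, the `evaluate` helper, and the toy facts are assumptions made for illustration; terms starting with an uppercase letter are treated as variables, as in the slides.

```python
def is_var(term):
    """Uppercase-initial terms are variables; everything else is a constant."""
    return term[0].isupper()


def evaluate(head, body, db):
    """Return the set of head tuples of a conjunctive query.

    head: tuple of terms
    body: list of (predicate, terms) conjuncts
    db:   dict mapping predicate name -> set of fact tuples
    """
    bindings = [{}]  # partial variable assignments built conjunct by conjunct
    for pred, terms in body:
        extended = []
        for binding in bindings:
            for row in db.get(pred, ()):
                candidate = dict(binding)
                ok = True
                for term, value in zip(terms, row):
                    if is_var(term):
                        # Bind the variable, or check an earlier binding.
                        if candidate.setdefault(term, value) != value:
                            ok = False
                            break
                    elif term != value:  # constant must match the fact
                        ok = False
                        break
                if ok:
                    extended.append(candidate)
        bindings = extended
    return {tuple(b[t] if is_var(t) else t for t in head) for b in bindings}


# Toy facts, invented for this sketch.
db = {
    "patients": {("p1", "palo_alto", "1970", "acme"),
                 ("p2", "menlo_park", "1980", "acme")},
    "notes": {("n1", "p1", "d1", "visit"), ("n2", "p2", "d1", "visit")},
    "allergy": {("n1", "xd_2001", "rash"), ("n2", "xd_2001", "hives")},
}

head = ("X", "Y")
body = [
    ("patients", ("X", "palo_alto", "W1", "W2")),
    ("notes", ("W3", "X", "W4", "W5")),
    ("allergy", ("W3", "xd_2001", "Y")),
]
answers = evaluate(head, body, db)
print(answers)
```

The shared variables X and W3 play the role of the equality join conditions in the SQL version of the query.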

Hypergraph Representation

A hypergraph is a set of nodes together with a set of hyperedges, where each hyperedge can connect any number of nodes.

A conjunctive query can be represented by a hypergraph, with one hyperedge for each conjunct in its body.

A conjunctive query is said to be acyclic if its hypergraph is acyclic.

Example:

q(X,Y) :- notes(W1, X, W2, W3), allergy(W1, Y, W4), notes(W5, X, W6, W7), prescription(W5, Y, W8)

The example computes patients X and drugs Y such that X is prescribed Y and has a recorded allergy to Y.
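The slides do not say how hypergraph acyclicity is tested. One standard test is the GYO (Graham / Yu-Ozsoyoglu) reduction: repeatedly delete nodes that occur in only one hyperedge and hyperedges contained in another hyperedge; the hypergraph is acyclic if nothing remains. The sketch below assumes that this is the intended notion of acyclicity and that the nodes are simply the arguments of each conjunct; the helper names are hypothetical.

```python
def hyperedges(body):
    """One hyperedge per conjunct, containing that conjunct's arguments."""
    return [set(args) for _pred, args in body]


def is_acyclic(edges):
    """GYO reduction: True if the hypergraph reduces to nothing."""
    edges = [set(e) for e in edges]
    changed = True
    while changed:
        changed = False
        # Rule 1: drop nodes that occur in exactly one hyperedge.
        for edge in edges:
            for node in list(edge):
                if sum(node in other for other in edges) == 1:
                    edge.remove(node)
                    changed = True
        # Rule 2: drop one hyperedge that is empty or contained in another.
        for i, edge in enumerate(edges):
            if not edge or any(i != j and edge <= other
                               for j, other in enumerate(edges)):
                del edges[i]
                changed = True
                break
    return not edges


# The four-conjunct example query from this slide.
EXAMPLE_Q = [
    ("notes", ("W1", "X", "W2", "W3")),
    ("allergy", ("W1", "Y", "W4")),
    ("notes", ("W5", "X", "W6", "W7")),
    ("prescription", ("W5", "Y", "W8")),
]

# The query that the slides later introduce as acyclic.
ACYCLIC_Q = [
    ("allergy", ("X", "W1", "W2")),
    ("drugs", ("W1", "W3")),
    ("notes", ("X", "W4", "W5", "W6")),
    ("patients", ("W4", "Y", "W7", "W8")),
]

print(is_acyclic(hyperedges(EXAMPLE_Q)))
print(is_acyclic(hyperedges(ACYCLIC_Q)))
```

Under these assumptions the four-conjunct example does not reduce: after the single-occurrence variables are removed, X, W1, Y, and W5 form a cycle, whereas the later allergy/drugs/notes/patients query reduces to nothing.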

Query-Folding Problem

Folding Rules

Let Q be a query, and R = {R1, ..., Rn} be a set of resources.

We assume that no two resources have the same resource predicate, and that there are no variables in common between Q and Ri, or between Ri and Rj, for 1 ≤ i, j ≤ n with i ≠ j.

Folding Types

• Partial folding
• Strong folding

Partial Folding: A partial folding of Q using R is a conjunctive query Q’ such that Q’ ⊆ Q and the body of Q’ contains one or more resource predicates defined in R.

Strong Folding: A strong folding of Q using R is a partial folding Q’ of Q using R such that Q ⊆ Q’.

A strong folding of a query is a partial folding that contains the original query.
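Both definitions rest on containment between conjunctive queries. A classical way to test whether one conjunctive query is contained in another is to search for a containment mapping (a homomorphism) from the containing query to the contained one. The brute-force sketch below illustrates that idea; it is not taken from the slides, and the query encoding, the helper names, and the extra `drugs` conjunct in `Q_EXTRA` are assumptions for illustration.

```python
def is_var(term):
    """Uppercase-initial terms are variables; everything else is a constant."""
    return term[0].isupper()


def extend(mapping, src, dst):
    """Try to extend mapping so that atom src is mapped onto atom dst."""
    if src[0] != dst[0] or len(src[1]) != len(dst[1]):
        return None
    result = dict(mapping)
    for s, d in zip(src[1], dst[1]):
        if is_var(s):
            if result.setdefault(s, d) != d:  # conflicting image for s
                return None
        elif s != d:  # constants must map to themselves
            return None
    return result


def contained_in(q1, q2):
    """True if q1 is contained in q2 (every answer of q1 is an answer of q2).

    Searches for a mapping from q2's variables to q1's terms that sends
    q2's head to q1's head and every body atom of q2 to a body atom of q1.
    """
    head1, body1 = q1
    head2, body2 = q2

    def search(i, mapping):
        if i == len(body2):
            return True
        for atom in body1:
            extended = extend(mapping, body2[i], atom)
            if extended is not None and search(i + 1, extended):
                return True
        return False

    start = extend({}, head2, head1)
    return start is not None and search(0, start)


Q = (("q", ("X", "Y")), [
    ("notes", ("W1", "X", "W2", "W3")),
    ("allergy", ("W1", "Y", "W4")),
    ("notes", ("W5", "X", "W6", "W7")),
    ("prescription", ("W5", "Y", "W8")),
])
# Hypothetical variant of Q with one extra conjunct, so it is more restrictive.
Q_EXTRA = (Q[0], Q[1] + [("drugs", ("Y", "W9"))])

print(contained_in(Q_EXTRA, Q))
print(contained_in(Q, Q_EXTRA))
```

The search is exponential in the worst case, which is consistent with the NP-completeness of conjunctive-query containment noted earlier.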

Example:

r1(X1, X2, X3) :- notes(U1, X1, U2, U3), allergy(U1, X2, X3)
r2(Y1, Y2, Y3, Y4) :- notes(V1, Y1, Y2, V2), prescription(V1, Y3, V3), drugs(Y3, Y4).

where
X, Y: distinguished variables
U, V: other variables

A complete folding of the above example is as follows:

q(X,Y) :- r1(X, Y, W), r2(X, W1, Y, W2).
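One way to see what this folding computes is to unfold it: replace each resource atom by the body of its definition, substituting the atom's arguments for the resource's head variables and giving the remaining variables fresh names. The sketch below does this for r1 and r2; the encoding, the `unfold` helper, and the fresh-name scheme (suffixing `_1`, `_2`) are assumptions for illustration.

```python
import itertools


def is_var(term):
    """Uppercase-initial terms are variables; everything else is a constant."""
    return term[0].isupper()


def unfold(body, resources):
    """Expand resource atoms in body into base-relation conjuncts.

    resources maps a resource predicate to (head_variables, resource_body).
    Non-head variables of each expanded resource get a fresh suffix so that
    separate expansions do not share variables by accident.
    """
    counter = itertools.count(1)
    result = []
    for pred, args in body:
        if pred not in resources:
            result.append((pred, args))
            continue
        head_vars, resource_body = resources[pred]
        substitution = dict(zip(head_vars, args))
        n = next(counter)
        for r_pred, r_args in resource_body:
            new_args = tuple(
                substitution.setdefault(a, f"{a}_{n}") if is_var(a) else a
                for a in r_args
            )
            result.append((r_pred, new_args))
    return result


RESOURCES = {
    "r1": (("X1", "X2", "X3"),
           [("notes", ("U1", "X1", "U2", "U3")),
            ("allergy", ("U1", "X2", "X3"))]),
    "r2": (("Y1", "Y2", "Y3", "Y4"),
           [("notes", ("V1", "Y1", "Y2", "V2")),
            ("prescription", ("V1", "Y3", "V3")),
            ("drugs", ("Y3", "Y4"))]),
}

FOLDING_BODY = [("r1", ("X", "Y", "W")), ("r2", ("X", "W1", "Y", "W2"))]

unfolded = unfold(FOLDING_BODY, RESOURCES)
for atom in unfolded:
    print(atom)
```

The unfolded body contains the notes, allergy, and prescription conjuncts over X and Y, plus an extra drugs(Y, W2) conjunct contributed by r2.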

Query Folding Algorithm

Let Q be a query, GQ be the hypergraph representing Q, and F be a set of folding rules. The query folding algorithm then computes complete or partial foldings of Q using F.

Two steps:
• Initialization
• Folding Generation

Initialization:
• Compute labels for every hyperedge in GQ.
• Given a hyperedge e ∈ GQ and the conjunct p associated with e, its label Le is a relation with attributes var(p). For every f ∈ F such that p unifies with head(f) with most general unifier θ, there is a tuple in Le consisting of two parts: the tuple var(p)θ and the expression body(f)θ, where the second part is used to store the folding of p.

Folding Generation:
• Construct the set of foldings by u-joining the labels of all the hyperedges in an arbitrary order.
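The initialization step relies on unifying a conjunct with the head of a folding rule. The sketch below computes a most general unifier for flat atoms, meaning atoms whose arguments are only variables or constants. The rule used in the example is a simplified, hypothetical stand-in: the slides do not show the actual folding rules, and the sketch omits whatever treatment the algorithm gives to non-distinguished variables such as U1.

```python
def is_var(term):
    """Uppercase-initial terms are variables; everything else is a constant."""
    return term[0].isupper()


def resolve(term, theta):
    """Follow variable bindings in theta until an unbound term is reached."""
    while is_var(term) and term in theta:
        term = theta[term]
    return term


def unify(atom1, atom2):
    """Return a most general unifier of two flat atoms, or None."""
    (pred1, args1), (pred2, args2) = atom1, atom2
    if pred1 != pred2 or len(args1) != len(args2):
        return None
    theta = {}
    for s, t in zip(args1, args2):
        s, t = resolve(s, theta), resolve(t, theta)
        if s == t:
            continue
        if is_var(s):
            theta[s] = t
        elif is_var(t):
            theta[t] = s
        else:  # two different constants cannot be unified
            return None
    return theta


def apply(atom, theta):
    """Apply the substitution theta to every argument of atom."""
    pred, args = atom
    return (pred, tuple(resolve(a, theta) for a in args))


# Conjunct p from a query, and a hypothetical folding rule head/body.
p = ("allergy", ("W1", "xd_2001", "Y"))
rule_head = ("allergy", ("U1", "X2", "X3"))
rule_body = ("r1", ("X1", "X2", "X3"))

theta = unify(p, rule_head)
print(theta)
print(apply(rule_body, theta))
```

Applying the unifier to the rule body yields the resource atom that would be stored as the folding of p in this simplified setting.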

Query Folding for Acyclic Queries

Existence of Folding
• Pairwise consistency is necessary but not sufficient for the existence of foldings of cyclic queries.

Example:

q(X,Y) :- patients(W1, W2, W3, W4), notes(X, W1, W5, Y), physician(W5, W2, W6)

with resources

r1(X1, X2) :- patients(B1, A1, U1, U2), notes(X1, B1, C1, X2), physician(C1, A2, U3)
r2(Y1, Y2) :- patients(B2, A2, V1, V2), notes(Y1, B2, C2, Y2), physician(C2, A1, V3)

Example:

Label for hyperedges

Theorem: There exists a complete folding of an acyclic query Q using folding rules F iff no hyperedge in reduction(GQ) has an empty label.

Example: consider an acyclic query that computes the notes that record allergic reactions, together with the clinics of the patients concerned.

q(X,Y) :- allergy(X, W1, W2), drugs(W1, W3), notes(X, W4, W5, W6), patients(W4, Y, W7, W8)

Resources:

r1(X1, X2) :- allergy(X1, U1, U2), drugs(U1, X2), notes(X1, U3, U4, U5)
r2(Y1, Y2) :- notes(Y1, V1, V2, V3), patients(V1, Y2, V4, V5), drugs(V6, V7)

Folding rules

Labels for hypergraph

Theorem:

There does not exist a partial folding of an acyclic query Q using folding rules F iff every hyperedge in reduction(GQ) has a singleton label.

Resources:Resources:

Folding Rules:Folding Rules:

Labels to HypergraphLabels to Hypergraph

Conclusion

Query folding can be used in centralized databases.

Queries can be answered using views instead of base relations.

With multiple queries, the result of one query can be used to partially answer another query.

In client-server applications, views can be cached.

Questions?

Thank you