28
CS 4432 logical query rewriting - lecture 15 1 CS4432: Database Systems II Lecture #15 Logical Query Rewriting Professor Elke A. Rundensteiner

CS 4432logical query rewriting - lecture 151 CS4432: Database Systems II Lecture #15 Logical Query Rewriting Professor Elke A. Rundensteiner

  • View
    220

  • Download
    0

Embed Size (px)

Citation preview

CS 4432 logical query rewriting - lecture 15

1

CS4432: Database Systems IILecture #15

Logical Query Rewriting

Professor Elke A. Rundensteiner

CS 4432 logical query rewriting - lecture 15

2

Query in SQL Query Plan in Algebra (logical) Other Query Plan in Algebra (logical)

CS 4432 logical query rewriting - lecture 15

3

Query plan 1 (in relational algebra)

B,D

R.A=“c” S.E=2 R.C=S.C

XR S

CS 4432 logical query rewriting - lecture 15

4

Query plan 2 (in relational algebra)

B,D

R.A = “c” S.E = 2

R S

natural join on R.C=S.C

CS 4432 logical query rewriting - lecture 15

5

Relational algebra optimization

• What are transformation rules ?– preserve equivalence

• What are good transformations?– reduce query execution costs

CS 4432 logical query rewriting - lecture 15

6

Rules: Natural join rewriting.

R S = S R(R S) T = R (S T)

R S S T

T R

Can also write as trees, e.g.:

CS 4432 logical query rewriting - lecture 15

7

Rules: Other binary operators ?

R S = S R(R S) T = R (S T)

What about :• Cross product?• Condition join?• Union?• Intersection ?• Difference ?

CS 4432 logical query rewriting - lecture 15

10

Rules: Selects

p1p2(R)= p1 [ p2 (R)]

[ p1 (R)] U [ p2 (R)]

p1vp2(R) =

CS 4432 logical query rewriting - lecture 15

11

Bags vs. Sets

R = {a,a,b,b,b,c}S = {b,b,c,c,d}What about union R U S = ?

• Option 1 SUMR U S = {a,a,b,b,b,b,b,c,c,c,d}

• Option 2 MAXR U S = {a,a,b,b,b,c,c,d}

CS 4432 logical query rewriting - lecture 15

12

Which option makes this rule work ?

p1vp2 (R) = p1(R) U p2(R)

Example: R={a,a,b,b,b,c}P1 satisfied by a,b; P2 satisfied by b,c

p1vp2 (R) = {a,a,b,b,b,c}

p1(R) = {a,a,b,b,b}

p2(R) = {b,b,b,c}

p1(R) U p2 (R) = {a,a,b,b,b,c}

Let us try MAX():

CS 4432 logical query rewriting - lecture 15

13

Which option makes this rule work ?

p1vp2 (R) = p1(R) U p2(R)

Example: R={a,a,b,b,b,c}P1 satisfied by a,b; P2 satisfied by b,c

p1vp2 (R) = {a,a,b,b,b,c}

p1(R) = {a,a,b,b,b}

p2(R) = {b,b,b,c}

p1(R) U p2 (R) = {a,a,b,b,b,b,b,b,c}

What about Sum()?

CS 4432 logical query rewriting - lecture 15

14

Which option makes this rule work ?

p1p2(R)= p1 [ p2 (R)]

Example: R={a,a,b,b,b,c}P1 satisfied by a,b; P2 satisfied by b,c

MAX or SUM ?

CS 4432 logical query rewriting - lecture 15

16

Yet another example !

Senators (……) Reps (……)

T1 = yr,state Senators; T2 = yr,state Reps

T1 Yr State T2 Yr State 97 CA 99 CA 99 CA 99 CA 98 AZ 98 CA

Union?

“Sum” option makes more sense!

CS 4432 logical query rewriting - lecture 15

17

Decision

-> In summary, we tend to use “SUM” option for bag union

-> Thus great care must be taken, as some rules cannot be used for bags !

CS 4432 logical query rewriting - lecture 15

18

Rules: Project

Let: X = set of attributesY = set of attributesXY = X U Y

xy (R) = x [y (R)]

CS 4432 logical query rewriting - lecture 15

19

Let p = predicate with only R attributes q = predicate with only S attributes m = predicate with both R and S attribs

p (R S) =

q (R S) =

Rules: combined

[p (R)] S

R [q (S)]

CS 4432 logical query rewriting - lecture 15

20

pq (R S) = ?

Rules: combined

Rule can be derived !

CS 4432 logical query rewriting - lecture 15

21

Derivation for rule :

pq (R S) =

p [q (R S) ] =

p [ R q (S) ] =

[p (R)] [q (S)]

CS 4432 logical query rewriting - lecture 15

22

More Rules can be Derived:

pq (R S) =

pqm (R S) =

pvq (R S) =

Rules: combined (continued)

CS 4432 logical query rewriting - lecture 15

23

We did one, do others on your own :

pq (R S) = [p (R)] [q (S)]

pqm (R S) =

m [(p R) (q S)]pvq (R S) =

[(p R) S] U [R (q S)]

CS 4432 logical query rewriting - lecture 15

24

Rules: combined

Let x = subset of R attributes z = attributes in predicate P

(subset of R attributes)

x[p (R) ] = {p [ x (R) ]} x xz

CS 4432 logical query rewriting - lecture 15

25

Rules: combined

Let x = subset of R attributes y = subset of S attributes

z = intersection of R,S attributes

xy (R S) =

xy{[xz (R) ] [yz (S) ]}

CS 4432 logical query rewriting - lecture 15

26

In textbook: more transformations

• More rewrite rules

• Other operations, such as, duplicate elimination, etc.

• Eliminate common sub-expressions

• Identify contradictions

CS 4432 logical query rewriting - lecture 15

30

Which are “good” transformations?

CS 4432 logical query rewriting - lecture 15

31

Conventional wisdom: do projects early

Example: relation R(A,B,C,D,E) predicate P: (A=3) (B=“cat”)

E {p (R)} vs. E {p{ABE(R)}}

CS 4432 logical query rewriting - lecture 15

32

What if we have A, B indexes?

B = “cat” A=3

Intersect pointers to get

pointers to matching tuples!

But

Then better to do projection later !

CS 4432 logical query rewriting - lecture 15

33

p1p2 (R) p1 [p2 (R)]

p (R S) [p (R)] S

R S S R

x [p (R)] x {p [xz (R)]}

Which are “good” transformations?

CS 4432 logical query rewriting - lecture 15

34

Bottom line:

• Some heuristics : – Early selection is usually good

• No transformation is always good

• Rule application defines a search space– Need cost criteria to make decision