ICS 421 Spring 2010 Query Evaluation ( i )

Preview:

DESCRIPTION

ICS 421 Spring 2010 Query Evaluation ( i ). Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa. SELECT * FROM Reserves WHERE sid =101. Query.  Sid=101. Parse Query. A. B. Reserves. Enumerate Plans. fetch. SCAN ( sid =101). Reserves. - PowerPoint PPT Presentation

Citation preview

1

ICS 421 Spring 2010

Query Evaluation (i)

Asst. Prof. Lipyeow LimInformation & Computer Science Department

University of Hawaii at Manoa

2/9/2010 Lipyeow Lim -- University of Hawaii at Manoa

22/9/2010 Lipyeow Lim -- University of Hawaii at Manoa

Parse Query

Enumerate Plans

Estimate Cost

Choose Best Plan

Evaluate Query Plan

Result

QuerySELECT * FROM Reserves WHERE sid=101

Sid=101

Reserves

SCAN (sid=101)

ReservesIDXSCAN (sid=101)

Reserves

Index(sid)

fetch

32.0 25.0

Pick B

A B

Evaluate Plan A

Optimizer

3

Parse Query• Input : SQL

– Eg. SELECT-FROM-WHERE, CREATE TABLE, DROP TABLE statements

• Output: Some data structure to represent the “query”– Relational algebra ?

• Also checks syntax, resolves aliases, binds names in SQL to objects in the catalog

• How ?2/9/2010 Lipyeow Lim -- University of Hawaii at Manoa

Parse Query

Enumerate Plans

Estimate Cost

Choose Best Plan

Evaluate Query Plan

Result

Query

4

Enumerate Plans• Input : a data structure representing the

“query”• Output: a collection of equivalent query

evaluation plans• Query Execution Plan (QEP): tree of

database operators.– high-level: RA operators are used– low-level: RA operators with particular

implementation algorithm.• Plan enumeration: find equivalent plans

– Different QEPs that return the same results– Query rewriting : transformation of one

QEP to another equivalent QEP.2/9/2010 Lipyeow Lim -- University of Hawaii at Manoa

Parse Query

Enumerate Plans

Estimate Cost

Choose Best Plan

Evaluate Query Plan

Result

Query

5

Estimate Cost• Input : a collection of equivalent

query evaluation plans• Output: a cost estimate for each

QEP in the collection• Cost estimation: a mapping of a

QEP to a cost– Cost Model: a model of what counts

in the cost estimate. Eg. Disk accesses, CPU cost …

• Statistics about the data and the hardware are used.

2/9/2010 Lipyeow Lim -- University of Hawaii at Manoa

Parse Query

Enumerate Plans

Estimate Cost

Choose Best Plan

Evaluate Query Plan

Result

Query

6

Choose Best Plan• Input : a collection of equivalent

query evaluation plans and their cost estimate

• Output: best QEP in the collection• The steps: enumerate plans, estimate

cost, choose best plan collectively called the:

• Query Optimizer: – Explores the space of equivalent plan

for a query– Chooses the best plan according to a

cost model2/9/2010 Lipyeow Lim -- University of Hawaii at Manoa

Parse Query

Enumerate Plans

Estimate Cost

Choose Best Plan

Evaluate Query Plan

Result

Query

7

Evaluate Query Plan• Input : a QEP (hopefully the best)• Output: Query results• Often includes a “code

generation” step to generate a lower level QEP in executable “code”.

• Query evaluation engine is a “virtual machine” that executes some code representing low level QEP.

2/9/2010 Lipyeow Lim -- University of Hawaii at Manoa

Parse Query

Enumerate Plans

Estimate Cost

Choose Best Plan

Evaluate Query Plan

Result

Query

8

Query Execution Plans (QEPs)• A tree of database operators: each operator is a RA

operator with specific implementation• Selection : Index Scan or Table Scan• Projection π:

– Without DISTINCT : Table Scan– With DISTINCT : requires sorting or index scan

• Join : – Nested loop joins (naïve)– Index nested loop joins– Sort merge joins

• Sort :– In-memory sort– External sort

2/9/2010 Lipyeow Lim -- University of Hawaii at Manoa

9

QEP Examples

2/9/2010 Lipyeow Lim -- University of Hawaii at Manoa

SELECT S.snameFROM Reserves R, Sailors SWHERE R.sid=S.sid AND R.bid=100 AND S.rating>5

S.rating>5 AND R.bid=100

Reserves Sailors

R.sid=S.sid

πS.sname

Nested Loop Join

On the fly

On the fly

(SCAN) (SCAN)

S.rating>5 AND R.bid=100

Reserves Sailors

R.sid=S.sid

πS.sname

S.rating>5

Reserves Sailors

R.sid=S.sid

πS.sname

R.bid=100

S.rating>5

Reserves Sailors

R.sid=S.sid

πS.sname

Nested Loop Join

On the fly

R.bid=100

(SCAN) (SCAN)

Temp T1

10

Access Paths• An access path is a method of retrieving

tuples. Eg. Given a query with a selection condition:– File or table scan– Index scan

• Index matching problem: given a selection condition, which indexes can be used for the selection, i.e., matches the selection ?– Selection condition normalized to conjunctive

normal form (CNF), where each term is a conjunct

– Eg. (day<8/9/94 AND rname=‘Paul’) OR bid=5 OR sid=3

– CNF: (day<8/9/94 OR bid=5 OR sid=3 ) AND (rname=‘Paul’ OR bid=5 OR sid=3)

2/9/2010 Lipyeow Lim -- University of Hawaii at Manoa

S.rating>5

Reserves Sailors

R.sid=S.sid

πS.sname

Nested Loop Join

On the fly

R.bid=100

(SCAN) (SCAN)

Temp T1

Index(R.bid)

R.bid=100

(IDXSCAN)

Fetch

Reserves

11

Index Matching

• A tree index matches a selection condition if the selection condition is a prefix of the index search key.

• A hash index matches a selection condition if the selection condition has a term attribute=value for every attribute in the index search key

2/9/2010 Lipyeow Lim -- University of Hawaii at Manoa

I1: Tree Index (a,b,c)

I2: Tree Index (b,c,d)

I3: Hash Index (a,b,c)

Q1: a=5 AND b=3

Q2: a=5 AND b>6

Q3: b=3

Q4: a=5 AND b=3 AND c=5

Q5: a>5 AND b=3 AND c=5

12

One Approach to Selections

• The selectivity of an access path is the size of the result set (in terms of tuples or pages).– Sometimes selectivity is also used to mean reduction factor:

fraction of tuples in a table retrieved by the access path or selection condition.

• Eg. Consider the selection: day<8/9/94 AND bid=5 AND sid=3

– Tree Index(day) – Hash index (bid,sid)

2/9/2010 Lipyeow Lim -- University of Hawaii at Manoa

1. Find the most selective access path, retrieve tuples using it2. Apply remaining terms in selection not matched by the

chosen access path

13

Join Algorithms• Cost model

– Single DBMS server: I/Os in number of pages– Distributed DBMS: network I/Os + local disk I/Os– td : time to read/write one page to local disk

– ts: time to ship one page over the network to another node

• Single server:– Nested Loop Join– Index Nested Loop Join– Sort Merge Join– Hash Join

• Distributed:– Semi-Join– Bloom Join

2/9/2010 Lipyeow Lim -- University of Hawaii at Manoa

14

Nested Loop Join

For each data page PS1 of S1

For each tuple s in PS1

For each data page PR1 of R1

For each tuple r in PR1

if (s.sid==r.sid) then output s,r

• Worst case number of local disk reads = Npages(S1) + |S1|*Npages(R1)

2/9/2010 Lipyeow Lim -- University of Hawaii at Manoa

sid bid day22 101 10/10/9658 103 11/12/96

sid sname rating age

22 Dustin 7 45.0

31 Lubber 8 55.5

58 Rusty 10 35.0

R1S1

15

Index Nested Loop Join

For each data page PS1 of S1

For each tuple s in PS1

if (s.sid Index(R1.sid)) then fetch r & output <s,r>

• Worst case number of local disk reads with tree index= Npages(S1) + |S1|*( 1 + logF Npages(R1))

• Worst case number of local disk reads with hash index= Npages(S1) + |S1|* 2

2/9/2010 Lipyeow Lim -- University of Hawaii at Manoa

sid bid day22 101 10/10/9658 103 11/12/96

sid sname rating age

22 Dustin 7 45.0

31 Lubber 8 55.5

58 Rusty 10 35.0

R1S1

Index(R1.sid)

16

Sort Merge Join

1. Sort S1 on SID2. Sort R1 on SID3. Compute join on SID using Merging algorithm

• If join attributes are relatively unique, the number of disk pages = Npages(S1) log Npages(S1) + Npages(R1) log Npages(R1) + Npages(S1) + Npages(R1)

• If the number of duplicates in the join attributes is large, the number of disk pages approaches that of nested loop join.

2/9/2010 Lipyeow Lim -- University of Hawaii at Manoa

sid bid day19 100 8/8/9922 101 10/10/9622 99 10/12/9558 103 11/12/96

sid sname rating age

22 Dustin 7 45.0

31 Lubber 8 55.5

58 Rusty 10 35.0

R1S1

17

Distributed Joins

• Consider:– Reserves join Sailors

• Depends on:– Which node get the query– Whether tables are

fragmented/partitioned or not

• Node 1 gets query– Perform join at Node 3 (or 4)

ship results to Node 1 ?– Ship tables to Node 1 ?

• Node 3 gets query– Fetch sailors in loop ?– Cache sailors locally ?

2/9/2010 Lipyeow Lim -- University of Hawaii at Manoa

Network

Boats1

Node 1

Boats2

Node 2

Reserves

Node 3

Sailors

Node 4

18

Distributed Joins over Fragments

R join S = R.sid=S.sid (R S)

= R.sid=S.sid ((R1R2) (S1 S2))

= R.sid=S.sid ((R1 S1) (R1 S2) (R2 S1) (R2 S2))

= R.sid=S.sid (R1 S1) R.sid=S.sid (R1 S2) R.sid=S.sid (R2 S1) R.sid=S.sid (R2 S2)

= (R1 join S1) (R1 join S2) (R2 join S1) (R2 join S2)

2/9/2010 Lipyeow Lim -- University of Hawaii at Manoa

Network

Reserves1

Node 1

Reserves2

Node 2

Sailors1

Node 3

Sailors2

Node 4

Equivalent to a union of joins over each pair of fragments

This equivalence applies to splitting a relation into pages in a single server DBMS system too!

19

Distributed Nested Loop• Consider performing R1 join S2 on

Node 1• Page-oriented nested loop join:

For each page r of R1Fetch r from local diskFor each page s of S2

Fetch s if scacheOutput r join s

• Cost = Npages(R1)* td + Npages(R1)*Npages(S2)*(td + ts)

• If cache can hold entire S2, cost is Npages(R1)* td + Npages(S2)* (td + ts)

2/9/2010 Lipyeow Lim -- University of Hawaii at Manoa

Network

R1

Node 1

S2

Node 2

foreachR1 page r Fetch

S2 page s

r join s

20

Semijoins• Consider performing R1 join S2 on

Node 1• S2 needs to be shipped to R1• Does every tuple in S2 join with R1 ?• Semijoin:

– Don’t ship all of S2– Ship only those S2 rows that will join with

R1– Assumes that the join causes a reduction

in S2!

• Cost = Npages(R1)*td + Npages(πsidR1)*ts + Cost() + Npages(sidjsidS2)*ts + Cost(R1 join sidjsidS2)

2/9/2010 Lipyeow Lim -- University of Hawaii at Manoa

Network

R1

Node 1

S2

Node 2

πsidR1 (jsid,πsidR1

πsidS2)

sidjsidS2

R1 joinsidjsidS2

21

Bloomjoins• Consider performing R1 join S2 on

Node 1• Can we do better than semijoin ?• Bloomjoin:

– Don’t ship all of (πsidR1)– Node 1: Ship a “bloom filter” (like a

signature) of (πsidR1)• Hash each sid• Set the bit for hash value in a bit vector• Send the bit vector v1

– Node 2: • Hash each (πsidS2) to bit vector v2• Computer (v1 v2) • Send rows of S2 in the intersection

• False positives2/9/2010 Lipyeow Lim -- University of Hawaii at Manoa

Network

R1

Node 1

S2

Node 2

v1=Bloom(πsidR1)

v2=Bloom(πsidS2)

sidjsidS2R1 joinsidjsidS2

jsid=v1v2

Recommended