Status “Lifetime of a Query” –Query Rewrite –Query Optimization –Query Execution Optimization –Use cost-estimation to iterate over all possible plans,

Status• “Lifetime of a Query”

– Query Rewrite– Query Optimization– Query Execution

• Optimization– Use cost-estimation to iterate over all possible plans, pick

one of minimum cost– Saw how to cost 1 relation ops– Saw how to cost joins– Saw that join ordering is complex

• Inner vs. outer (e.g., AB ≠ BA)• Join ordering (e.g., A(BC) ≠ (AB)C)• Join type (e.g., nested loops vs. sort-merge)

– We will return to Shapiro at the end of class

O(2n-1)! Plans; n = 7 -> > 6 Billion

Selinger Pruning• How does Selinger reduce the search space?

– Only considers left-deep plans– Pushes all cross products to the top– Uses a dynamic programming algorithm

(Basic) Selinger Dynamic Prog Alg.

if (bestPlan[S].cost ≠ ∞) ; array lookup ; independent of ; ordering of S

return bestPlan[S]if (|S| = 1)

bestPlan[S].plan = scan SbestPlan[S].cost = cost(scan S)return bestPlan[S]

for each size 1 non-empty subset S1 of SP1 = findBestPlan(S1)P2 = findBestPlan(S - S1)A = best algorithm for joining P1, P2 ;inner v outer?cost = P1.cost + P2.cost + cost(A)if (cost < bestPlan[S].cost)

bestPlan[S].plan ={execute P1.plan, execute P2.plan, join P1 and P2 using A }bestPlan[S].cost = cost

return bestPlan[S]

findBestPlan(JoinList S)

Merge-SortPhase 1:Repeat until S is done

Read a run of SSortWrite out

(Repeat with R)

Phase 2:Read concurrently from each run of S and RMerge runs, then join overlapping regions

Simple Hashi=0

Choose partition of hash range {vi, vi+1}Scan S, hash, if in partition, insert into hash

tableOtherwise, write back outScan R, hash, probe into hash table, output

matchesOtherwise, write back outRepeat with reduced R and S, in round i+1

GRACE HashChoose sqrt(|R|) partitions, with one page memory

per partition

Hash R into partitions, flushing pages as they fill

Hash S into partitions, flushing pages as they fill

For each partition p

Build a hash table H on R tuples in p

Hash S tuples in p into H, output matches

ComparisonChoose sqrt(|R|) partitions, with one page

memory per partition

Hash R into partitions, flushing pages as they fill

Hash S into partitions, flushing pages as they fill

For each partition p

Build a hash table H on R tuples in p

Hash S tuples in p into H, output matches

GRACE

i=0

Choose partition of hash range {vi, vi+1}Scan S, hash, if in partition, insert into hash tableOtherwise, write back outScan R, hash, probe into hash table, output

matchesOtherwise, write back outRepeat with reduced R and S, in round i+1

Simple

Phase 1:Repeat until S is done

Read a run of SSortWrite out

(Repeat with R)

Phase 2:Read concurrently from each run of S and RMerge runs, then join overlapping regions

Sort-Merge

Documents

Status “Lifetime of a Query” –Query Rewrite –Query Optimization –Query Execution Optimization –Use cost-estimation to iterate over all possible plans,