Status
• “Lifetime of a Query”
  – Query Rewrite
  – Query Optimization
  – Query Execution
• Optimization
  – Use cost-estimation to iterate over all possible plans, pick one of minimum cost
  – Saw how to cost 1 relation ops
  – Saw how to cost joins
  – Saw that join ordering is complex
    • Inner vs. outer (e.g., A ⋈ B ≠ B ⋈ A)
    • Join ordering (e.g., A ⋈ (B ⋈ C) ≠ (A ⋈ B) ⋈ C)
    • Join type (e.g., nested loops vs. sort-merge)
  – We will return to Shapiro at the end of class
O((2n-1)!) plans; n = 7 → more than 6 billion
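As a quick check of the arithmetic, the short sketch below evaluates the slide's (2n-1)! bound for small n. The function name `plan_bound` is made up for this illustration, and the bound is taken from the slide at face value rather than derived here.

```python
import math


def plan_bound(n: int) -> int:
    """Return (2n - 1)!, the slide's bound on the number of plans for n relations."""
    return math.factorial(2 * n - 1)


# Print the bound for a few small join sizes to show how fast it grows.
for n in range(2, 8):
    print(n, plan_bound(n))
```

For n = 7 this evaluates 13!, which is a little over 6.2 billion.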
Selinger Pruning
• How does Selinger reduce the search space?
  – Only considers left-deep plans
  – Pushes all cross products to the top
  – Uses a dynamic programming algorithm
(Basic) Selinger Dynamic Prog Alg.

findBestPlan(JoinList S)
    if (bestPlan[S].cost ≠ ∞)        ; array lookup, independent of ordering of S
        return bestPlan[S]
    if (|S| = 1)
        bestPlan[S].plan = scan S
        bestPlan[S].cost = cost(scan S)
        return bestPlan[S]
    for each size 1 non-empty subset S1 of S
        P1 = findBestPlan(S1)
        P2 = findBestPlan(S - S1)
        A = best algorithm for joining P1, P2        ; inner v. outer?
        cost = P1.cost + P2.cost + cost(A)
        if (cost < bestPlan[S].cost)
            bestPlan[S].plan = {execute P1.plan, execute P2.plan, join P1 and P2 using A}
            bestPlan[S].cost = cost
    return bestPlan[S]
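The recursion can be sketched in Python roughly as follows. This is a minimal illustration, not the Selinger optimizer itself: the cardinalities in `CARD`, the uniform selectivity `SEL`, and the toy cost model (a scan costs the relation's cardinality, a join costs the product of its input cardinalities) are all assumptions invented for the example, and the "best algorithm" choice is collapsed into that single cost formula.

```python
import math

# Hypothetical statistics, invented for illustration only.
CARD = {"A": 1000, "B": 100, "C": 10}
SEL = 0.01  # assumed uniform join selectivity


def cardinality(rels):
    """Estimated output size of joining every relation in rels (toy model)."""
    return math.prod(CARD[r] for r in rels) * SEL ** (len(rels) - 1)


def find_best_plan(rels, best_plan=None):
    """Return (cost, plan) for joining rels, memoized on the unordered set."""
    if best_plan is None:
        best_plan = {}
    s = frozenset(rels)
    if s in best_plan:              # lookup is independent of the ordering of S
        return best_plan[s]
    if len(s) == 1:
        (r,) = s
        best_plan[s] = (float(CARD[r]), r)   # cost(scan S)
        return best_plan[s]
    best = (math.inf, None)
    for r in sorted(s):             # each size-1 subset S1, so plans stay left-deep
        s1 = frozenset([r])
        c1, p1 = find_best_plan(s1, best_plan)
        c2, p2 = find_best_plan(s - s1, best_plan)
        join_cost = cardinality(s1) * cardinality(s - s1)   # toy cost(A)
        cost = c1 + c2 + join_cost
        if cost < best[0]:
            best = (cost, (p2, p1))  # deep subtree on the left, single relation on the right
    best_plan[s] = best
    return best


cost, plan = find_best_plan(["A", "B", "C"])
print(cost, plan)
```

With these made-up numbers the sketch joins the two small relations B and C first and brings in A last. Keying the memo table on a `frozenset` is what makes the lookup independent of the order in which the relations are listed.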
Merge-Sort
Phase 1:
  Repeat until S is done
    Read a run of S
    Sort
    Write out
  (Repeat with R)
Phase 2:
  Read concurrently from each run of S and R
  Merge runs, then join overlapping regions
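The two phases above might look like this in an in-memory Python sketch. It assumes lists stand in for on-disk runs, a hypothetical `run_size` parameter plays the role of available memory, and both inputs are joined on a key extracted by caller-supplied functions; none of these names come from the slides.

```python
import heapq
from itertools import groupby


def make_runs(tuples, key, run_size):
    """Phase 1: cut the input into runs of run_size tuples and sort each run."""
    return [sorted(tuples[i:i + run_size], key=key)
            for i in range(0, len(tuples), run_size)]


def sort_merge_join(s, r, key_s, key_r, run_size=2):
    """Phase 2: merge the sorted runs of S and R, joining groups with equal keys."""
    s_groups = groupby(heapq.merge(*make_runs(s, key_s, run_size), key=key_s), key=key_s)
    r_groups = groupby(heapq.merge(*make_runs(r, key_r, run_size), key=key_r), key=key_r)
    out = []
    s_item = next(s_groups, None)
    r_item = next(r_groups, None)
    while s_item is not None and r_item is not None:
        (ks, s_grp), (kr, r_grp) = s_item, r_item
        if ks < kr:
            s_item = next(s_groups, None)      # S is behind, advance S
        elif ks > kr:
            r_item = next(r_groups, None)      # R is behind, advance R
        else:
            r_list = list(r_grp)               # overlapping region: same key on both sides
            out.extend((a, b) for a in s_grp for b in r_list)
            s_item = next(s_groups, None)
            r_item = next(r_groups, None)
    return out


S = [(3, "s3"), (1, "s1"), (2, "s2"), (2, "s2b")]
R = [(2, "r2"), (4, "r4"), (1, "r1")]
print(sort_merge_join(S, R, key_s=lambda t: t[0], key_r=lambda t: t[0]))
```

A real implementation would write each sorted run to disk and stream pages back during the merge; the lists here only illustrate the control flow.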
Simple Hash
i = 0
Choose partition of hash range {v_i, v_{i+1}}
Scan S, hash, if in partition, insert into hash table
  Otherwise, write back out
Scan R, hash, probe into hash table, output matches
  Otherwise, write back out
Repeat with reduced R and S, in round i+1
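A minimal in-memory sketch of the rounds described above, under some stated assumptions: the hash range is split into `num_rounds` equal slices using `hash(key) % num_rounds`, and the "written back out" tuples are simply kept in Python lists. The function and parameter names are hypothetical.

```python
def simple_hash_join(s, r, key_s, key_r, num_rounds=3):
    """Join S and R in num_rounds passes, one slice of the hash range per pass."""
    out = []
    for i in range(num_rounds):
        table = {}
        s_rest, r_rest = [], []
        # Scan S: build a hash table on tuples whose hash falls in slice i.
        for t in s:
            if hash(key_s(t)) % num_rounds == i:
                table.setdefault(key_s(t), []).append(t)
            else:
                s_rest.append(t)               # "write back out" for a later round
        # Scan R: probe with tuples in slice i, defer the rest.
        for t in r:
            if hash(key_r(t)) % num_rounds == i:
                out.extend((m, t) for m in table.get(key_r(t), []))
            else:
                r_rest.append(t)
        s, r = s_rest, r_rest                  # repeat with reduced R and S
    return out


S = [(3, "s3"), (1, "s1"), (2, "s2"), (2, "s2b")]
R = [(2, "r2"), (4, "r4"), (1, "r1")]
print(simple_hash_join(S, R, key_s=lambda t: t[0], key_r=lambda t: t[0]))
```

Each round rereads and rewrites everything that did not fall into the current slice, which is the repeated I/O that makes this variant expensive when many rounds are needed.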
GRACE Hash
Choose sqrt(|R|) partitions, with one page of memory per partition
Hash R into partitions, flushing pages as they fill
Hash S into partitions, flushing pages as they fill
For each partition p
  Build a hash table H on R tuples in p
  Hash S tuples in p into H, output matches
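The partition-then-join structure above can be sketched in Python as follows. This is an illustration under simplifying assumptions: |R| is counted in tuples rather than pages, partitions are in-memory lists instead of files with one output page each, and the names `grace_hash_join`, `key_r`, and `key_s` are invented for the example.

```python
import math


def grace_hash_join(r, s, key_r, key_s):
    """Partition both inputs by hash, then join each pair of matching partitions."""
    n = max(1, math.isqrt(len(r)))             # about sqrt(|R|) partitions
    r_parts = [[] for _ in range(n)]
    s_parts = [[] for _ in range(n)]
    # Partitioning pass: every tuple is hashed to exactly one partition.
    for t in r:
        r_parts[hash(key_r(t)) % n].append(t)
    for t in s:
        s_parts[hash(key_s(t)) % n].append(t)
    out = []
    # Join pass: matching keys always land in the same partition number.
    for r_part, s_part in zip(r_parts, s_parts):
        h = {}
        for t in r_part:                       # build hash table H on R tuples in p
            h.setdefault(key_r(t), []).append(t)
        for t in s_part:                       # probe H with S tuples in p
            out.extend((m, t) for m in h.get(key_s(t), []))
    return out


R = [(2, "r2"), (4, "r4"), (1, "r1"), (2, "r2b")]
S = [(3, "s3"), (1, "s1"), (2, "s2")]
print(grace_hash_join(R, S, key_r=lambda t: t[0], key_s=lambda t: t[0]))
```

Unlike the simple hash sketch, each input is partitioned once and each partition is read once more during the join pass.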
Comparison

GRACE
Choose sqrt(|R|) partitions, with one page of memory per partition
Hash R into partitions, flushing pages as they fill
Hash S into partitions, flushing pages as they fill
For each partition p
  Build a hash table H on R tuples in p
  Hash S tuples in p into H, output matches

Simple
i = 0
Choose partition of hash range {v_i, v_{i+1}}
Scan S, hash, if in partition, insert into hash table
  Otherwise, write back out
Scan R, hash, probe into hash table, output matches
  Otherwise, write back out
Repeat with reduced R and S, in round i+1

Sort-Merge
Phase 1:
  Repeat until S is done
    Read a run of S
    Sort
    Write out
  (Repeat with R)
Phase 2:
  Read concurrently from each run of S and R
  Merge runs, then join overlapping regions