Query Optimization Techniques for Partitioned Tables

06/14/2011, Duke University

Herodotos Herodotou

Nedyalko Borisov

Shivnath Babu

Duke University

Table Partitioning

Split parent table into smaller child tables (partitions)

Partitioning methods: hash, range, list

Benefits

Improve query performance

Faster data loading, archival, backup

Efficient statistics maintenance

Better cardinality estimation

Fine-grained control for tuning

. . .


[Figure: parent table Sales (id, date, . . .) split into child tables Sales_Jan_11, Sales_Feb_11, and Sales_Mar_11, each with the same columns]
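The three partitioning methods named above can be illustrated with a minimal sketch that routes a row to a child table. The function names, the month boundaries (the year 2011 is inferred from the child table names), and the region lists are all assumptions made for illustration; they do not come from the slides.

```python
from bisect import bisect_right
from datetime import date

def hash_partition(key, num_partitions):
    """Hash partitioning: map a key to one of num_partitions buckets."""
    return hash(key) % num_partitions

# Range partitioning: child table i holds dates in [BOUNDS[i], BOUNDS[i + 1]).
# The boundaries are assumed from the child table names on the slide.
BOUNDS = [date(2011, 1, 1), date(2011, 2, 1), date(2011, 3, 1), date(2011, 4, 1)]
CHILDREN = ["Sales_Jan_11", "Sales_Feb_11", "Sales_Mar_11"]

def range_partition(d):
    i = bisect_right(BOUNDS, d) - 1
    if not 0 <= i < len(CHILDREN):
        raise ValueError(f"no partition covers {d}")
    return CHILDREN[i]

# List partitioning: each child table owns an explicit set of values.
# These region lists are hypothetical.
REGION_LISTS = {"Sales_East": {"NY", "NC"}, "Sales_West": {"CA", "WA"}}

def list_partition(region):
    for child, values in REGION_LISTS.items():
        if region in values:
            return child
    raise ValueError(f"no partition lists {region!r}")

print(range_partition(date(2011, 2, 14)))
print(list_partition("NC"))
print(hash_partition(42, 4))
```

Range and list partitioning keep the mapping from values to child tables explicit, which is what lets an optimizer reason about which child tables a predicate can touch.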

Recent Trends

Growing usage due to increased data sizes

Growing user control due to new SQL extensions

Partitioning conditions for derived tables

DBA must satisfy multiple objectives and constraints regarding partitioning

Implications

DBA may have limited control over partitioning scheme

Diverse mix of partitioning schemes


Partitioning Schemes

Multidimensional (hierarchical) partitioning

Tables partitioned on same key but different ranges

Range partitioning with non-equi ranges


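[Figure: example partitioning schemes. Sales S is partitioned on S.d into S1, S2, S3, and hierarchically on S.id (boundaries 0, 20, 40, 60, 80) and S.d into child partitions S11 through S43. Loans L is partitioned on L.id into L1 through L4 (boundaries 0, 20, 40, 60, 80). Payments P is partitioned on P.id into P1 through P8 (boundaries 0, 10, 20, ..., 80). Ads A is partitioned on A.d into A1 through A5.]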

Partition-aware Query Optimization


[Figure: Sales S (child partitions S11 through S43, on S.id and S.d) and Payments P (partitions P1 through P8, on P.id), the two tables joined by the example query]

SELECT *

FROM Sales S, Payments P

WHERE S.id = P.id

AND S.d > Feb-15 AND P.id < 25
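A minimal sketch of how the predicate P.id < 25 prunes partitions, assuming the ranges read off the figure axes (Payments in steps of 10 on P.id, Sales rows in steps of 20 on S.id). The `prune` helper and the row labels `S1x` to `S4x` are hypothetical. The S.d predicate is left out because the slides do not give the date boundaries of S1, S2, S3.

```python
import math

def prune(partitions, lo, hi):
    """Return names of partitions whose [start, end) range overlaps [lo, hi)."""
    return [name for name, (start, end) in partitions.items()
            if start < hi and lo < end]

# Ranges assumed from the figure axes: 8 Payments partitions of width 10 on P.id,
# 4 rows of Sales child partitions of width 20 on S.id.
payments = {f"P{i + 1}": (10 * i, 10 * (i + 1)) for i in range(8)}
sales_rows = {f"S{i + 1}x": (20 * i, 20 * (i + 1)) for i in range(4)}

# P.id < 25 is the range (-inf, 25). Because S.id = P.id, the same bound
# can be applied to S.id as well.
kept_payments = prune(payments, -math.inf, 25)
kept_sales_rows = prune(sales_rows, -math.inf, 25)
print(kept_payments)
print(kept_sales_rows)
```

Only the first three Payments partitions and the first two rows of Sales child partitions survive, which agrees with the partitions that appear in the plans on the following slides.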



[Figure: after partition pruning, only Sales partitions S12, S13, S22, S23 (S.id in [0, 40)) and Payments partitions P1, P2, P3 (P.id in [0, 30)) remain]


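Operators: TS = table scan, IS = index scan, HJ = hash join, MJ = merge join

P1: “join of unions”

HJ( Union(TS(P1), TS(P2), TS(P3)), Union(TS(S12), TS(S13), TS(S22), TS(S23)) )

P2: “union of (partition-wise) joins”

Union( MJ( IS(P3), Union(TS(S22), TS(S23)) ), HJ( Union(TS(P1), TS(P2)), Union(TS(S12), TS(S13)) ) )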


More efficient partition-wise joins

More appropriate join orders

More appropriate join operators


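[Figure: two plans for S ⋈ P ⋈ L (legend: S = Sales, P = Payments, L = Loans). In the first, an HJ joins the union of the partition-wise joins of S and P (an MJ over IS(P3) and Union(TS(S22), TS(S23)), and an HJ over Union(TS(P1), TS(P2)) and Union(TS(S12), TS(S13))) with Union(TS(L1), TS(L2)). In the second, a top-level Union combines three-way partition-wise joins: each child join of S and P is joined with TS(L1) or TS(L2), using an MJ in one branch and an INLJ in the other.]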

Problem & Challenges

Problem Definition

Given a partitioning scheme and a query, find the optimal query execution plan

Challenges

Dealing with plan space explosion

Incorporating into state-of-the-art optimizers

Partitions as physical or logical properties?

Supporting a wide range of partitioning conditions


Overview


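[Figure: the three optimization phases for S ⋈ P on S.id = P.id. 1. Matching: find which of S12, S13, S22, S23 can join with which of P1, P2, P3. 2. Clustering: group the matched partitions into clusters. 3. Path Selection: create and cost a join path per cluster, such as an HJ or MJ over unions of table scans.]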

Matching Phase

Goal

Identify partition-wise join pairs that can generate output records

New data structure: Partition Index Tree (PIT)

Core idea: associate each partition with intervals

Functionalities: index intervals, efficient lookups

Implementation: Augmented red-black tree


Matching Algorithm

Inputs: Table S, Table P, Join condition J

Step 1: Convert partitioning conditions to intervals

Step 2: Build PIT with intervals of S

Step 3: Probe PIT with intervals of P

Output: Partition-wise join pairs (Si, Pj)
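The three steps above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it uses an unbalanced binary search tree augmented with a per-subtree maximum endpoint, whereas the slides describe an augmented red-black tree. The names `Node`, `insert`, `probe`, and `match` are hypothetical, and the intervals are the ones shown for the running example.

```python
class Node:
    """Tree node holding one partition's interval [lo, hi) on the join key."""
    def __init__(self, lo, hi, partition):
        self.lo, self.hi, self.partition = lo, hi, partition
        self.max_hi = hi            # largest hi in this node's subtree
        self.left = self.right = None

def insert(root, lo, hi, partition):
    """Insert keyed on lo (no rebalancing in this sketch)."""
    if root is None:
        return Node(lo, hi, partition)
    if lo < root.lo:
        root.left = insert(root.left, lo, hi, partition)
    else:
        root.right = insert(root.right, lo, hi, partition)
    root.max_hi = max(root.max_hi, hi)
    return root

def probe(root, lo, hi, out):
    """Collect partitions whose interval overlaps [lo, hi)."""
    if root is None or root.max_hi <= lo:
        return out                  # nothing below ends after the probe starts
    probe(root.left, lo, hi, out)
    if root.lo < hi and lo < root.hi:
        out.append(root.partition)
    if root.lo < hi:                # right subtree starts at or after root.lo
        probe(root.right, lo, hi, out)
    return out

def match(s_intervals, p_intervals):
    root = None
    for name, (lo, hi) in s_intervals.items():      # Step 2: build the tree from S
        root = insert(root, lo, hi, name)
    pairs = []
    for p_name, (lo, hi) in p_intervals.items():    # Step 3: probe with P
        pairs.extend((s_name, p_name) for s_name in probe(root, lo, hi, []))
    return sorted(pairs)

# Step 1 is assumed done: partitioning conditions already expressed as intervals.
S = {"S12": (0, 20), "S13": (0, 20), "S22": (20, 40), "S23": (20, 40)}
P = {"P1": (0, 10), "P2": (10, 20), "P3": (20, 30)}
print(match(S, P))
```

The subtree maximum lets a probe skip whole subtrees that end before the probe interval begins. A balanced tree would bound the depth of each lookup, which this sketch does not attempt.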


Notation: Partition, Interval, Max

PIT built with the intervals of S:

S22 [20, 40) 40

S13 [0, 20) 40

S23 [20, 40) 40

S12 [0, 20) 40

Probe with P2 [10, 20): matches S12 ✓ and S13 ✓

Output: Partition Join Pairs

(S12, P1) (S12, P2) (S13, P1) (S13, P2) (S22, P3) (S23, P3)

Matching Algorithm


Benefits

O(n log(n)), n = number of partitions of S

Θ(n) memory needs

Build & reuse PITs multiple times

Additional support

Numeric, dates, and string ranges or lists

Complex partitioning conditions (AND, OR)

Complex join conditions (AND, OR)

Non-equi joins

Details in the paper

Clustering Phase

Goal

Minimize number of partition-wise join pairs


Input: Partition Join Pairs

(S12, P1)(S12, P2)(S13, P1)(S13, P2)(S22, P3)(S23, P3)

Output: Clustered Join Pairs

({S12,S13},{P1,P2})({S22,S23},{P3})

Intermediate: Join Partition Graph

Edges: S12 - P1, S12 - P2, S13 - P1, S13 - P2, S22 - P3, S23 - P3

Input: Partition join pairs (output from Matching Phase)

Step 1: Build bipartite join partition graph

Step 2: Find connected components using breadth-first search

Output: Clustered join pairs
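The clustering steps can be sketched in a few lines. The function name `cluster` and the node tagging scheme are assumptions made for illustration; the input is the list of partition join pairs from the running example.

```python
from collections import defaultdict, deque

def cluster(pairs):
    """Group partition join pairs into connected components of the bipartite graph."""
    graph = defaultdict(set)
    for s, p in pairs:
        # Tag each node with its side so equal names on both sides cannot collide.
        graph[("S", s)].add(("P", p))
        graph[("P", p)].add(("S", s))

    seen, clusters = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        s_side, p_side = set(), set()
        while queue:                      # breadth-first search from start
            node = queue.popleft()
            side, name = node
            (s_side if side == "S" else p_side).add(name)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        clusters.append((sorted(s_side), sorted(p_side)))
    return clusters

pairs = [("S12", "P1"), ("S12", "P2"), ("S13", "P1"),
         ("S13", "P2"), ("S22", "P3"), ("S23", "P3")]
print(cluster(pairs))
```

Six pairwise joins collapse into two clustered joins, each of which can be planned as a single join over unions of its member partitions.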

Path Creation and Selection

Goal

Create and cost partition-wise join paths for child tables


[Figure: example plan for S ⋈ P ⋈ L built as a union of partition-wise join paths over the child tables (scans of S12, S13, S22, S23, P1, P2, P3, L1, L2 combined by HJ, MJ, INLJ, and Union operators)]

Bottom-up Query Optimization


P, S, L: find & retain best access paths

LS, LP, SP: find & retain best 2-way join paths per interesting order

LSP: find & retain best 3-way join paths per interesting order
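The level-by-level enumeration can be sketched as a small dynamic program over table subsets. This is a toy illustration under stated assumptions: the cost model (sum of intermediate result sizes), the cardinalities, and the selectivities are invented, physical operators are not modeled, and interesting orders are ignored, so only one best plan is kept per subset.

```python
from itertools import combinations

def best_plans(cards, selectivity):
    """Bottom-up enumeration: best plan per subset of tables.

    cards: {table: row count}
    selectivity: {frozenset({a, b}): join selectivity}; missing pairs are cross products.
    Returns {subset: (cost, estimated rows, plan)}.
    """
    tables = sorted(cards)
    # Level 1: access paths (cost 0 in this toy model).
    best = {frozenset([t]): (0.0, float(cards[t]), t) for t in tables}
    # Levels 2..n: combine the retained best plans of smaller subsets.
    for size in range(2, len(tables) + 1):
        for subset in map(frozenset, combinations(tables, size)):
            for k in range(1, size):
                for left in map(frozenset, combinations(sorted(subset), k)):
                    right = subset - left
                    left_cost, left_rows, left_plan = best[left]
                    right_cost, right_rows, right_plan = best[right]
                    sel = 1.0
                    for a in left:
                        for b in right:
                            sel *= selectivity.get(frozenset({a, b}), 1.0)
                    rows = left_rows * right_rows * sel
                    cost = left_cost + right_cost + rows
                    if subset not in best or cost < best[subset][0]:
                        best[subset] = (cost, rows, (left_plan, right_plan))
    return best

# Hypothetical statistics for Sales (S), Payments (P), and Loans (L).
cards = {"S": 1000, "P": 100, "L": 10}
selectivity = {frozenset({"S", "P"}): 0.01, frozenset({"P", "L"}): 0.1}
plans = best_plans(cards, selectivity)
print(plans[frozenset({"S", "P", "L"})])
```

Because each level only consults the best plans retained at lower levels, whatever is discarded early can never reappear in a larger plan. That is why the choice of what to retain per logical relation matters for the extended plan space.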

Bottom-up Query Optimization


[Figure: paths retained for the logical relation (join) SP.

Best (physical) join path: HJ( Union(TS(P1), TS(P2), TS(P3)), Union(TS(S12), TS(S13), TS(S22), TS(S23)) )

Best (physical) join path with interesting order: MJ( Union(IS(P1), IS(P2), IS(P3)), Union(TS(S12), TS(S13), TS(S22), TS(S23)) )]

Original enumeration

Create and cost join of unions

Extended enumeration

Create and cost union of joins

Retain best path per interesting order

Not enough!

Not considering entire plan space (e.g., if Pj is best, no 3-way partition-wise joins)

Extended Enumeration


Pj (join of unions): HJ( Union(TS(P1), TS(P2), TS(P3)), Union(TS(S12), TS(S13), TS(S22), TS(S23)) )

Pu (union of joins): Union( MJ( TS(P3), Union(TS(S22), TS(S23)) ), HJ( Union(TS(P1), TS(P2)), Union(TS(S12), TS(S13)) ) )

Treating Partitions as Physical Properties

Interesting partitions in joins

Can make later joins less expensive

Approach

Retain best path for each interesting order

Retain best path for each interesting partition

Limitation

Not considering entire plan space (e.g., cannot create union of joins with different join orders)


Treating Partitions as Logical Properties


Logical child joins

Treated like logical parent joins

Retain best join paths per interesting order

Logical parent join: logical child joins

P: P1, P2, P3

S: S12, S13, S22, S23

L: L1, L2

LS: {L1, S12, S13}; {L2, S22, S23}

LP: {L1, P1, P2}; {L2, P3}

SP: {S12, S13, P1, P2}; {S22, S23, P3}

LSP: {L1, S12, S13, P1, P2}; {L2, S22, S23, P3}

Treating Partitions as Logical Properties

Property 1

Interesting orders independent across child joins

Property 2

Child joins can have different join orders/operators

Property 3

Entire extended plan space is enumerated

Optimality guarantee

Our bottom-up optimizer will find the optimal plan in the extended plan space
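Property 2 can be illustrated with a toy sketch in which each logical child join picks its own cheapest operator and the choices are then combined under a union. The cost numbers and the `best_union_of_joins` helper are invented for illustration and are not results from the talk.

```python
def best_union_of_joins(child_joins):
    """Pick the cheapest operator for each child join independently.

    child_joins: {child join name: {operator: estimated cost}}
    Returns (chosen operator per child join, total cost of the union of joins).
    """
    choice = {name: min(costs, key=costs.get) for name, costs in child_joins.items()}
    total = sum(child_joins[name][op] for name, op in choice.items())
    return choice, total

# Hypothetical operator costs for the two child joins of SP.
child_costs = {
    "{S12,S13} x {P1,P2}": {"HJ": 40, "MJ": 55},
    "{S22,S23} x {P3}": {"HJ": 30, "MJ": 20},
}
# Hypothetical costs for joining the full unions with a single operator.
join_of_unions_costs = {"HJ": 90, "MJ": 110}

choice, union_cost = best_union_of_joins(child_costs)
single_op = min(join_of_unions_costs, key=join_of_unions_costs.get)
print(choice, union_cost)
print(single_op, join_of_unions_costs[single_op])
```

With these made-up numbers the union of joins uses a hash join for one child join and a merge join for the other, a combination that a single join over unions cannot express.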


Experimental Evaluation

Prototype using PostgreSQL 8.3.7

TPC-H benchmark (scale 30)

Evaluation Methodology

DBA has full/limited control over partitioning scheme

State-of-the-art vs. our partition-aware optimizer

Optimizer evaluation metrics

Execution time

Optimization time

Memory utilization


Evaluation: Execution Time


[Figure: bar chart of execution time (min, axis 0 to 140) for queries 2, 3, 4, 5, 7, 8, 9, 10, 12, 14. Labels: Basic, Advanced, State-of-the-art, Partition aware.]

Evaluation: Optimization Time


[Figure: bar chart of optimization time (ms, axis 0 to 350) for queries 2, 3, 4, 5, 7, 8, 9, 10, 12, 14. Series: Basic/Intermediate, Advanced.]

Evaluation: Memory Utilization


[Figure: bar chart of memory usage (MB, axis 0 to 40) for queries 2, 3, 4, 5, 7, 8, 9, 10, 12, 14. Labels: Basic, Advanced, PostgreSQL, Partition-aware.]

Summary

Extended plan space to include plans with multiway partition-wise joins

Developed new partition-aware optimization techniques

Easy incorporation into bottom-up query optimizers


Extensions to Parallel Databases

Data placement strategy

Hash partitioning to nodes

Range/list partitioning within each node

Our extensions

Create partition-wise joins Si ⋈ Pi for each node Ni

Produce child joins for Si ⋈ Pi


Data placement strategy

Replicate dimension tables

Partition fact tables

Our extensions

Further partition all tables

Produce child joins on each node

Evaluation: Vary Partition Size


[Figure: optimization time (ms, axis 0 to 350) for partition sizes of 64, 96, 128, 192, and 256 MB, shown for two groups]

Evaluation: Vary Data Size


Cardinality Estimation


[Figure: estimated number of records (log scale, 1.E+00 to 1.E+15) for queries Q02, Q03, Q04, Q05, Q07, Q08, Q09, Q10, Q12, Q14. Series: Basic/Intermediate, Advanced, Actual.]
