Andy Pavlo


Magical Parallel OLTP Databases. Andy Pavlo. November 10th, 2011 – MIT CSAIL. Databases? Evan Jones? Lebron is going to Miami! The McRib will be back! Michael Jackson is in trouble! On-Line Transaction Processing: Fast, Repetitive, Small. H-Store. H-Store Partitioning.


Magical Parallel OLTP Databases

Andy Pavlo

November 10th, 2011 – MIT CSAIL

Databases?

Evan Jones?

The McRib will be back!

Michael Jackson is in trouble!

Lebron is going to Miami!

On-Line Transaction Processing

Fast

Repetitive

Small

H-Store

H-Store Partitioning

[Figure: tables mapped to partitions.]

Tables: WAREHOUSE, DISTRICT, CUSTOMER, ORDERS, ORDER_ITEM, STOCK, ITEM

Partitions: P1, P2, P3, P4, P5

Replicated: ITEM
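The partitioning figure can be sketched in code. This is a minimal illustration only: it assumes rows are routed by warehouse id with simple modulo hashing and that ITEM is copied to every partition, as the "Replicated" label suggests. The function names and the zero-based partition numbering are hypothetical and are not taken from H-Store.

```python
from typing import Optional, Set

NUM_PARTITIONS = 5  # P1..P5 in the figure, numbered 0..4 here (assumption)

# Assumed split: these tables are divided by warehouse id ...
PARTITIONED_TABLES = {
    "WAREHOUSE", "DISTRICT", "CUSTOMER", "ORDERS", "ORDER_ITEM", "STOCK",
}
# ... and this table is copied to every partition.
REPLICATED_TABLES = {"ITEM"}


def partition_for_warehouse(w_id: int, num_partitions: int = NUM_PARTITIONS) -> int:
    """Route a warehouse id to a partition with modulo hashing (illustrative)."""
    return w_id % num_partitions


def partitions_for_row(table: str, w_id: Optional[int] = None,
                       num_partitions: int = NUM_PARTITIONS) -> Set[int]:
    """Return every partition that stores a given row."""
    if table in REPLICATED_TABLES:
        return set(range(num_partitions))
    if table in PARTITIONED_TABLES:
        if w_id is None:
            raise ValueError("partitioned tables need a warehouse id")
        return {partition_for_warehouse(w_id, num_partitions)}
    raise KeyError(table)
```

With this layout, a transaction that only touches one warehouse and reads ITEM can run entirely at a single partition.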

Client Application

H-Store Cluster

Procedure Name

Input Parameters

This transaction will execute 4 queries on partitions 1, 3, and 6!

Optimization #1

[Figure: Client Application and partitions P1, P2, P3, P4.]

Optimization #2

[Figure: Client Application and partitions P1, P2, P3, P4.]

InsertOrder, InsertOrderLine, UpdateStock, InsertOrderLine, UpdateStock

STOP LOGGING

Optimization #3


Optimization #4

Throughput (txn/s)

Why this Matters

Pro Tip: Canadians do not like unnecessary surgeries.

On Predictive Modeling for Optimizing Transaction Execution in Parallel OLTP Systems, in PVLDB, vol. 5, issue 2, October 2011

Houdini

Client Application

Main Idea: Use models to predict before execution.

Step #1: Estimate the path that a transaction will take.
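One way to picture Step #1 is a walk through a probabilistic model of the procedure. The sketch below assumes a Markov-style model whose states are the procedure's queries plus begin, commit, and abort. The transition probabilities, the state names beyond those shown on the slides, and the greedy "follow the most likely edge" strategy are all illustrative assumptions, not the talk's actual algorithm.

```python
from typing import Dict, List, Tuple

# Hypothetical transition probabilities for a NewOrder-like procedure.
MODEL: Dict[str, Dict[str, float]] = {
    "begin": {"GetWarehouse": 1.0},
    "GetWarehouse": {"CheckStock": 1.0},
    "CheckStock": {"InsertOrder": 0.8, "abort": 0.2},
    "InsertOrder": {"commit": 1.0},
}

TERMINAL_STATES = {"commit", "abort"}


def estimate_path(model: Dict[str, Dict[str, float]],
                  start: str = "begin",
                  max_steps: int = 100) -> Tuple[List[str], float]:
    """Follow the most likely edge from each state until a terminal state.

    Returns the estimated path and the product of the edge probabilities
    along it, which serves as a rough confidence value.
    """
    path = [start]
    confidence = 1.0
    state = start
    for _ in range(max_steps):  # guard against cycles in the model
        if state in TERMINAL_STATES:
            break
        next_state, prob = max(model[state].items(), key=lambda kv: kv[1])
        confidence *= prob
        path.append(next_state)
        state = next_state
    return path, confidence


path, confidence = estimate_path(MODEL)
```

A low confidence value signals that the estimate should be treated cautiously before any optimization is enabled.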

Step #2: Determine which optimizations to enable.

Current State:

GetWarehouse:
SELECT * FROM WAREHOUSE WHERE W_ID = ?

Input Parameters:
w_id=0 i_w_id=[0,1] i_ids=[1001,1002]

Transaction Estimate:
Confidence Coefficient: 0.56
Best Partition: 0
Use Undo Logging: Yes
Partitions Read: { 0 }
Partitions Written: { 0 }
Partitions Done: {1, 2, 3}
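The fields of the transaction estimate map naturally onto the decisions in Step #2. The following is a sketch under stated assumptions: the record mirrors the fields shown on the slide, while the four-partition cluster, the confidence threshold of 0.5, and the fallback to a conservative plan are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet

ALL_PARTITIONS = frozenset({0, 1, 2, 3})  # assumed four-partition cluster


@dataclass(frozen=True)
class TransactionEstimate:
    confidence: float
    best_partition: int
    use_undo_logging: bool
    partitions_read: FrozenSet[int]
    partitions_written: FrozenSet[int]
    partitions_done: FrozenSet[int]


def choose_optimizations(est: TransactionEstimate,
                         threshold: float = 0.5) -> Dict[str, object]:
    """Turn an estimate into an execution plan (illustrative policy)."""
    if est.confidence < threshold:
        # Not confident enough: lock everything and keep undo logging on.
        return {
            "base_partition": est.best_partition,
            "lock": ALL_PARTITIONS,
            "undo_logging": True,
            "speculate_on": frozenset(),
        }
    touched = est.partitions_read | est.partitions_written
    return {
        "base_partition": est.best_partition,
        "lock": touched,                          # only lock what is needed
        "undo_logging": est.use_undo_logging,     # disable only if predicted safe
        "speculate_on": est.partitions_done - touched,
    }


# The values shown on the slide.
slide_estimate = TransactionEstimate(
    confidence=0.56,
    best_partition=0,
    use_undo_logging=True,
    partitions_read=frozenset({0}),
    partitions_written=frozenset({0}),
    partitions_done=frozenset({1, 2, 3}),
)
plan = choose_optimizations(slide_estimate)
```

For the slide's values this plan runs the transaction at partition 0, locks only partition 0, keeps undo logging on, and leaves partitions 1, 2, and 3 free for other work.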


Limitations:

1) Long/wide models.

2) Keeping models in synch.

3) Incorrect predictions.

Current State:

Input Parameters:
w_id=0 i_w_id=[0,1] i_ids=[1001,1002]

CheckStock:
SELECT S_QTY FROM STOCK WHERE S_W_ID = ? AND S_I_ID = ?;

InsertOrder:
INSERT INTO ORDERS (o_id, o_w_id) VALUES (?, ?);

Refinement: Partition models based on input properties.
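The refinement can be illustrated by keeping one smaller model per class of input. In this sketch the chosen properties (the number of item ids and whether any item comes from a different warehouse) are assumptions picked to match the slide's example parameters, and the ModelSet class is a hypothetical helper, not part of Houdini.

```python
from typing import Dict, List, Tuple

Features = Tuple[int, bool]


def input_features(params: Dict[str, object]) -> Features:
    """Summarize a procedure's input parameters as a small feature tuple."""
    w_id = params["w_id"]
    i_w_id: List[int] = params["i_w_id"]  # type: ignore[assignment]
    i_ids: List[int] = params["i_ids"]    # type: ignore[assignment]
    touches_remote_warehouse = any(w != w_id for w in i_w_id)
    return (len(i_ids), touches_remote_warehouse)


class ModelSet:
    """Holds one model per feature tuple instead of one large model."""

    def __init__(self) -> None:
        self.models: Dict[Features, dict] = {}

    def model_for(self, params: Dict[str, object]) -> dict:
        # Create an empty model the first time a feature tuple is seen.
        return self.models.setdefault(input_features(params), {})


# The input parameters shown on the slide.
slide_params = {"w_id": 0, "i_w_id": [0, 1], "i_ids": [1001, 1002]}
features = input_features(slide_params)
```

Transactions with the same features share a model, so each model stays short and narrow while covering only inputs that tend to follow similar paths.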

Input Parameters:
w_id=0 i_w_id=[0,1] i_ids=[1001,1002]

CheckStock:
SELECT S_QTY FROM STOCK WHERE S_W_ID = ? AND S_I_ID = ?

Houdini:

1) Execute txn at the best partition.

2) Only lock the partitions needed.

3) Disable undo logging if not needed.

4) Speculatively commit transactions.

Houdini:

5) Estimate initial path.

6) Update as transaction executes.

7) Recompute if workload changes.

8) Partition for better accuracy.
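Recomputing a model when the workload changes can be pictured as re-deriving edge probabilities from observed paths. The sketch below assumes the model is a table of transition counts between queries. The class name, the recorded paths, and the idea of normalizing counts into probabilities are illustrative assumptions rather than a description of Houdini's implementation.

```python
from collections import defaultdict
from typing import Dict, List


class TransitionCounts:
    """Counts observed state-to-state transitions from executed transactions."""

    def __init__(self) -> None:
        self.counts: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))

    def record(self, path: List[str]) -> None:
        """Add one to the counter of every edge the transaction followed."""
        for current, following in zip(path, path[1:]):
            self.counts[current][following] += 1

    def probabilities(self) -> Dict[str, Dict[str, float]]:
        """Normalize the counters into per-state transition probabilities."""
        result: Dict[str, Dict[str, float]] = {}
        for state, followers in self.counts.items():
            total = sum(followers.values())
            result[state] = {nxt: n / total for nxt, n in followers.items()}
        return result


observed = TransitionCounts()
observed.record(["begin", "GetWarehouse", "CheckStock", "InsertOrder", "commit"])
observed.record(["begin", "GetWarehouse", "CheckStock", "abort"])
model = observed.probabilities()
```

Calling probabilities() again after recording newer transactions yields a model that reflects the changed workload.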

Experimental Evaluation

Model Accuracy

TATP: 94.9%
TPC-C: 95.0%
AuctionMark: 90.2%

Estimation Overhead

[Figure: estimation overhead for TATP, TPC-C, and AuctionMark.]

Throughput (txn/s)

TATP: +57%
TPC-C: +126%
AuctionMark: +117%

Conclusion: Small overhead cost improves throughput.

November 9, 2011

H-Store: hstore.cs.brown.edu

Twitter: @andy_pavlo

Future work
