@andy_pavlo On Predictive Modeling for Distributed Databases VLDB - August 28th, 2012

Databases? Evan Jones?

Romney has a Swiss bank account!

Muammar Gaddafi is in trouble!

Putin is going to get re-elected!

High-Volume Transaction Processing

Main Memory • Parallel • Shared-Nothing

H-Store: A High-Performance, Distributed Main Memory Transaction Processing System. Proc. VLDB Endow., vol. 1, iss. 2, pp. 1496-1499, 2008.

Fast • Repetitive • Small

[Figure: a client sends a procedure name and input parameters to the database cluster; the cluster executes the transaction and returns the result to the client. The cluster consists of partitions P1, P2, P3, and P4.]

This transaction will execute 4 queries on partitions 1, 3, and 6!

Pro Tip: Canadians do not like unnecessary surgeries.

Main Idea:

Use models to predict transaction behavior before execution.

On Predictive Modeling for Optimizing Transaction Execution in Parallel OLTP Systems. Proc. VLDB Endow., vol. 5, iss. 2, pp. 85-96, 2011.

[Figure: Client and Database Cluster]

Step #1: Estimate the path that the transaction will take.

[Figure: estimated transaction path, with the current state marked]

GetWarehouse:
SELECT * FROM WAREHOUSE WHERE W_ID = ?

Input Parameters:
w_id=0 i_w_ids=[0,0] i_ids=[1001,1002]
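One way to picture Step #1 is as a walk over a transition graph learned from past executions. The sketch below is a minimal illustration and not the implementation from the cited paper: it assumes each state is a query name plus the number of times that query has already run in the transaction, and that the estimated path is found by always following the most frequently observed next state. The class name `PathModel`, the helper `to_states`, and the training traces are hypothetical.

```python
from collections import defaultdict

# Sentinel states marking the start and end of a transaction (assumed names).
BEGIN, COMMIT = ("<begin>", 0), ("<commit>", 0)


def to_states(trace):
    """Turn query names into (name, prior invocations of that name) states."""
    seen = defaultdict(int)
    states = []
    for name in trace:
        states.append((name, seen[name]))
        seen[name] += 1
    return states


class PathModel:
    """Hypothetical transition-count model for one stored procedure."""

    def __init__(self):
        # counts[src][dst] = how many training traces moved from src to dst
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, trace):
        states = [BEGIN, *to_states(trace), COMMIT]
        for src, dst in zip(states, states[1:]):
            self.counts[src][dst] += 1

    def estimate_path(self, max_steps=32):
        """Greedily follow the most frequent transition until commit."""
        path, state = [], BEGIN
        for _ in range(max_steps):  # guard against cycles in the graph
            successors = self.counts.get(state)
            if not successors:
                break
            state = max(successors, key=successors.get)
            if state == COMMIT:
                break
            path.append(state[0])
        return path


model = PathModel()
for trace in [
    ["GetWarehouse", "CheckStock", "CheckStock", "InsertOrder"],
    ["GetWarehouse", "CheckStock", "CheckStock", "InsertOrder"],
    ["GetWarehouse", "CheckStock", "InsertOrder"],
]:
    model.train(trace)

print(model.estimate_path())
```

Keeping the invocation counter in the state lets the sketch tell the first CheckStock apart from the second, so a repeated query does not collapse into a self-loop.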

Step #2: Determine which optimizations to enable in the DBMS.

Input Parameters:
w_id=0 i_w_ids=[0,0] i_ids=[1001,1002]

Optimizations:
• Best Partition?
• Touched Partitions?
• Finished Partitions?
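The three questions in Step #2 can be answered mechanically once an estimated path is available. The following is a rough sketch under stated assumptions, not the DBMS's actual logic: the path is given as a list of (query name, partitions) pairs, the "best" partition is taken to be the one accessed by the most queries, and a partition counts as finished once no later query in the path touches it. The function name `plan_optimizations` and the example path are hypothetical.

```python
from collections import Counter


def plan_optimizations(path):
    """Derive partition hints from an estimated path.

    path: list of (query_name, partitions) pairs in predicted execution order.
    """
    # Count how many queries in the path access each partition.
    hits = Counter(p for _, parts in path for p in parts)

    # Best partition: most frequently accessed; ties go to the lowest id.
    best = max(sorted(hits), key=hits.get) if hits else None

    # Record the last step at which each partition is used.
    last_use = {}
    for step, (_, parts) in enumerate(path):
        for p in parts:
            last_use[p] = step

    # After each step, list the partitions that are never touched again.
    finished_after = [
        sorted(p for p, last in last_use.items() if last <= step)
        for step in range(len(path))
    ]

    return {
        "best_partition": best,
        "touched_partitions": sorted(hits),
        "finished_after_step": finished_after,
    }


# Assumed example: partition id equals warehouse id, with i_w_ids=[0,1].
example_path = [
    ("GetWarehouse", [0]),
    ("CheckStock", [0]),
    ("CheckStock", [1]),
    ("InsertOrder", [0]),
]
hints = plan_optimizations(example_path)
print(hints)
```

In this example partition 1 is no longer needed after the second CheckStock, which is the kind of hint a DBMS could use to release that partition before the transaction commits.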

[Figure: estimated transaction path, with the current state marked]

CheckStock:
SELECT S_QTY FROM STOCK WHERE S_W_ID = ? AND S_I_ID = ?;

InsertOrder:
INSERT INTO ORDERS (o_id, o_w_id) VALUES (?, ?);

Input Parameters:
w_id=0 i_w_ids=[0,1] i_ids=[1001,1002]

November 9, 2011

[Figure: models selected by features of the input parameters. For w_id=0 i_w_ids=[0,1] i_ids=[1001,1002], the branches are HashValue(w_id) = 0 or 1 and ArrayLength(i_w_ids) = 1 or 2.]
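The two features named here, HashValue(w_id) and ArrayLength(i_w_ids), can be read as a lookup key that picks which model to use for a given request. The sketch below is only an illustration of that idea: it assumes HashValue is the parameter modulo the number of partitions and that the cluster has two partitions, neither of which is stated in the slides. The names `hash_value`, `array_length`, `feature_key`, and `select_model` are hypothetical.

```python
def hash_value(value, num_partitions):
    """Assumed hash: map an integer parameter to a partition id by modulo."""
    return value % num_partitions


def array_length(values):
    """Number of elements in an array-valued input parameter."""
    return len(values)


def feature_key(params, num_partitions=2):
    """Build the (HashValue(w_id), ArrayLength(i_w_ids)) key for a request."""
    return (
        hash_value(params["w_id"], num_partitions),
        array_length(params["i_w_ids"]),
    )


def select_model(params, models, default=None):
    """Pick the model registered for this request's feature key, if any."""
    return models.get(feature_key(params), default)


# Placeholder model labels keyed by feature values (illustrative only).
models = {
    (0, 1): "model for HashValue=0, ArrayLength=1",
    (0, 2): "model for HashValue=0, ArrayLength=2",
    (1, 1): "model for HashValue=1, ArrayLength=1",
    (1, 2): "model for HashValue=1, ArrayLength=2",
}

params = {"w_id": 0, "i_w_ids": [0, 1], "i_ids": [1001, 1002]}
print(feature_key(params), "->", select_model(params, models))
```

Splitting models by such features keeps each one small, since every model only has to describe transactions whose parameters look alike.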

Experimental Evaluation

            TATP      TPC-C     AuctionM
Accuracy    94.9%     95.0%     90.2%
Overhead    +1.86%    +1.17%    +8.15%

[Figure: throughput (txn/s). Legend: Houdini, Assume Single-Partitioned]

TATP      TPC-C     AuctionM
+57%      +126%     +117%

Conclusion:

Scaling your OLTP DBMS must come from within.
