
@andy_pavlo On Predictive Modeling for Distributed Databases VLDB - August 28th, 2012

Page 1:

@andy_pavlo

On Predictive Modeling for Distributed Databases

VLDB - August 28th, 2012

Page 2:

Databases? Evan Jones?

Page 3:

Romney has a Swiss bank account!

Muammar Gaddafi is in trouble!

Putin is going to get re-elected!

Page 4:
Page 5:

High-Volume Transaction Processing

Page 6:

Main Memory • Parallel • Shared-Nothing

H-Store: A High-Performance, Distributed Main Memory Transaction Processing System. Proc. VLDB Endow., vol. 1, iss. 2, pp. 1496-1499, 2008.

Page 7:

Fast • Repetitive • Small

Page 8:

Client → Database Cluster: Proc. Name, Input Params

Database Cluster: Transaction Execution

Database Cluster → Client: Transaction Result
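The request flow on this slide can be sketched with a tiny in-memory stand-in. Everything named here (`TransactionRequest`, the `PROCEDURES` registry, `execute`, and the `new_order` stub) is hypothetical and is not H-Store's client API. The sketch only illustrates that the client ships a procedure name plus input parameters and gets a result back.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class TransactionRequest:
    """What the client sends: a procedure name and its input parameters."""
    proc_name: str
    params: Dict[str, Any]


def new_order(params: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical stand-in for a stored procedure body.
    return {"status": "OK", "items_ordered": len(params["i_ids"])}


# Hypothetical registry of predefined stored procedures on the cluster.
PROCEDURES: Dict[str, Callable[[Dict[str, Any]], Dict[str, Any]]] = {
    "NewOrder": new_order,
}


def execute(request: TransactionRequest) -> Dict[str, Any]:
    """Look up the named procedure, run it, and return its result."""
    procedure = PROCEDURES[request.proc_name]
    return procedure(request.params)


request = TransactionRequest(
    "NewOrder", {"w_id": 0, "i_w_ids": [0, 0], "i_ids": [1001, 1002]}
)
result = execute(request)
```

Because the procedure name and parameters are known before any query runs, they are the only inputs a predictor has to work with.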

Page 9:

Client

Database Cluster

P1 P2 P3 P4

Page 10:

This transaction will execute 4 queries on partitions 1, 3, and 6!

Page 11:

Pro Tip: Canadians do not like unnecessary surgeries.

Page 12:

Main Idea:

Use models to predict transaction behavior before execution.

On Predictive Modeling for Optimizing Transaction Execution in Parallel OLTP Systems. Proc. VLDB Endow., vol. 5, iss. 2, pp. 85-96, 2011.

Page 13:

Client

Database Cluster

Page 14:
Page 15:

Step #1: Estimate the path that the transaction will take.

Page 16:

Current State

GetWarehouse:
SELECT * FROM WAREHOUSE WHERE W_ID = ?

Input Parameters:
w_id=0 i_w_ids=[0,0] i_ids=[1001,1002]
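One way to picture Step #1 is a walk over a probabilistic model of the procedure's execution states. The sketch below is an illustration under assumptions, not the method from the talk: the state names echo the queries on the slides, the transition probabilities are invented, and the greedy "follow the most likely edge" traversal in the hypothetical `estimate_path` function is a simplification chosen for brevity.

```python
from typing import Dict, List, Tuple

# Hypothetical transition probabilities between execution states.
# The numbers are made up purely for illustration.
MODEL: Dict[str, Dict[str, float]] = {
    "begin": {"GetWarehouse": 1.0},
    "GetWarehouse": {"CheckStock": 0.95, "abort": 0.05},
    "CheckStock": {"InsertOrder": 0.9, "abort": 0.1},
    "InsertOrder": {"commit": 1.0},
}
TERMINAL_STATES = {"commit", "abort"}


def estimate_path(
    model: Dict[str, Dict[str, float]],
    start: str = "begin",
    max_steps: int = 100,
) -> Tuple[List[str], float]:
    """Greedily follow the most probable edge until a terminal state.

    Returns the estimated path and the product of the edge
    probabilities along it, used here as a rough confidence value.
    """
    path = [start]
    confidence = 1.0
    state = start
    for _ in range(max_steps):  # guard against cycles in the model
        if state in TERMINAL_STATES:
            break
        state, probability = max(model[state].items(), key=lambda edge: edge[1])
        path.append(state)
        confidence *= probability
    return path, confidence


path, confidence = estimate_path(MODEL)
```

A low confidence value would be a natural signal to fall back to conservative execution, although how such a threshold is chosen is outside what the slides state.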

Page 17:

Step #2: Determine which optimizations to enable in the DBMS.

Page 18:

Optimizations:

• Best Partition?
• Touched Partitions?
• Finished Partitions?

[Diagram annotations: +1, shown five times]

Input Parameters:
w_id=0 i_w_ids=[0,0] i_ids=[1001,1002]
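The three questions on this slide can be answered from an estimated path once each step is annotated with the partitions it is expected to access. The following is a minimal sketch under that assumption. The `Step` representation, the function names, and the example path are all hypothetical, and the tie-breaking rule in `best_partition` is an arbitrary choice.

```python
from collections import Counter
from typing import List, Optional, Set, Tuple

# Hypothetical representation: (query name, partitions it is expected to access).
Step = Tuple[str, Set[int]]


def best_partition(path: List[Step]) -> Optional[int]:
    """Partition accessed by the most steps; ties go to the lowest id."""
    counts = Counter(p for _, partitions in path for p in partitions)
    if not counts:
        return None
    return min(counts, key=lambda p: (-counts[p], p))


def touched_partitions(path: List[Step]) -> Set[int]:
    """Every partition the transaction is expected to access."""
    return set().union(*(partitions for _, partitions in path))


def finished_partitions(path: List[Step], step_index: int) -> Set[int]:
    """Partitions already used that no later step is expected to need."""
    used_so_far = touched_partitions(path[: step_index + 1])
    still_needed = touched_partitions(path[step_index + 1 :])
    return used_so_far - still_needed


# Made-up estimated path for illustration.
estimated_path: List[Step] = [
    ("GetWarehouse", {0}),
    ("CheckStock", {0}),
    ("CheckStock", {1}),
    ("InsertOrder", {0}),
]

base = best_partition(estimated_path)
touched = touched_partitions(estimated_path)
done_after_third_step = finished_partitions(estimated_path, 2)
```

In this example the transaction would run at partition 0, lock partitions 0 and 1, and could release partition 1 early after the third step.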

Page 19:

Current State

CheckStock:
SELECT S_QTY FROM STOCK WHERE S_W_ID = ? AND S_I_ID = ?;

InsertOrder:
INSERT INTO ORDERS (o_id, o_w_id) VALUES (?, ?);

Input Parameters:
w_id=0 i_w_ids=[0,1] i_ids=[1001,1002]

[Diagram annotation: X]
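The change from `i_w_ids=[0,0]` to `i_w_ids=[0,1]` matters because each `CheckStock` query is routed by its `S_W_ID` value. The sketch below assumes, purely for illustration, that data is partitioned by warehouse id with modulo hashing across two partitions. `NUM_PARTITIONS`, `partition_for`, and the other function names are hypothetical and are not taken from the talk.

```python
from typing import List, Set

NUM_PARTITIONS = 2  # hypothetical cluster size


def partition_for(warehouse_id: int, num_partitions: int = NUM_PARTITIONS) -> int:
    """Assumed routing rule: modulo hashing on the warehouse id."""
    return warehouse_id % num_partitions


def check_stock_partitions(i_w_ids: List[int]) -> Set[int]:
    """Partitions the CheckStock queries would access, one query per item."""
    return {partition_for(w) for w in i_w_ids}


def is_single_partition(w_id: int, i_w_ids: List[int]) -> bool:
    """True when every query in the transaction lands on one partition."""
    partitions = {partition_for(w_id)} | check_stock_partitions(i_w_ids)
    return len(partitions) == 1


local_case = is_single_partition(0, [0, 0])        # parameters from Page 16
distributed_case = is_single_partition(0, [0, 1])  # parameters from Page 19
```

Under these assumptions the first parameter set stays on one partition while the second one needs two, so a model that ignores the input parameters cannot tell the two cases apart.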

Page 20:

November 9, 2011

Page 21:

Input Parameters:
w_id=0 i_w_ids=[0,1] i_ids=[1001,1002]

[Diagram: models selected by features of the input parameters. Features shown: ArrayLength(i_w_ids) and HashValue(w_id). Branch labels shown: =2, =1, =1, =2, =1, =0.]
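The two feature names on this slide suggest deriving a few simple values from the input parameters and using them to pick a more specific model. Here is a minimal sketch of that idea. The modulo-based `hash_value`, the `MODELS_BY_FEATURES` lookup table, and the model names are assumptions made for illustration only.

```python
from typing import Any, Dict, List, Tuple

NUM_PARTITIONS = 2  # hypothetical cluster size


def array_length(values: List[Any]) -> int:
    """ArrayLength feature: number of elements in an array parameter."""
    return len(values)


def hash_value(value: int, num_partitions: int = NUM_PARTITIONS) -> int:
    """HashValue feature, assumed here to be a modulo hash of the value."""
    return value % num_partitions


def extract_features(params: Dict[str, Any]) -> Tuple[int, int]:
    """Return (ArrayLength(i_w_ids), HashValue(w_id)) for one request."""
    return array_length(params["i_w_ids"]), hash_value(params["w_id"])


# Hypothetical table mapping feature values to a specialized model name.
MODELS_BY_FEATURES: Dict[Tuple[int, int], str] = {
    (1, 0): "one_item_partition_0",
    (1, 1): "one_item_partition_1",
    (2, 0): "two_items_partition_0",
    (2, 1): "two_items_partition_1",
}


def select_model(params: Dict[str, Any]) -> str:
    """Pick the specialized model, or a general one if none matches."""
    return MODELS_BY_FEATURES.get(extract_features(params), "general_model")


params = {"w_id": 0, "i_w_ids": [0, 1], "i_ids": [1001, 1002]}
features = extract_features(params)
chosen = select_model(params)
```

For the parameters on this slide the features come out as an array length of 2 and a hash value of 0, which matches two of the branch labels visible in the diagram.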

Page 22:

Experimental Evaluation

Page 23:

            TATP      TPC-C     AuctionM
Accuracy    94.9%     95.0%     90.2%
Overhead    +1.86%    +1.17%    +8.15%

Page 24:

[Chart: throughput (txn/s) on TATP, TPC-C, and AuctionM, comparing Houdini with Assume Single-Partitioned. Labels shown: TATP +57%, TPC-C +126%, AuctionM +117%.]

Page 25:

Conclusion:

Scaling your OLTP DBMS must come from within.

Page 27:

November 9, 2011