@andy_pavlo
On Predictive Modeling for Distributed Databases
VLDB - August 28th, 2012
Databases? Evan Jones?
Romney has a Swiss
bank account!
Muammar Gaddafi is in trouble!
Putin is going to get re-elected!
High-Volume Transaction Processing
Main Memory • Parallel • Shared-Nothing
H-Store: A High-Performance, Distributed Main Memory Transaction Processing System. Proc. VLDB Endow., vol. 1, iss. 2, pp. 1496-1499, 2008.
Fast • Repetitive • Small
Client
Database Cluster
Proc. Name
Input Params
Transaction
Execution
Database Cluster
Transaction
Result
Client
Database Cluster
P1 P2 P3 P4
This transaction will execute 4 queries on partitions 1, 3, and 6!
Pro Tip: Canadians do not like unnecessary surgeries.
Main Idea:
On Predictive Modeling for Optimizing Transaction Execution in Parallel OLTP Systems. Proc. VLDB Endow., vol. 5, iss. 2, pp. 85-96, 2011.
Use models to predict transaction behavior before execution.
Client
Database Cluster
Step #1: Estimate the path that the transaction will take.
Current State
SELECT * FROM WAREHOUSE WHERE W_ID = ?
w_id=0 i_w_ids=[0,0] i_ids=[1001,1002]
GetWarehouse:
Input Parameters:
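Step #1 can be sketched as a walk over a transition graph learned from earlier executions. The sketch below is a toy, not the system's actual model: it assumes each state is a query name plus the number of times that query has already run in the transaction, and it simply follows the most frequently observed transition. The function names and the training traces are hypothetical.

```python
from collections import Counter, defaultdict

BEGIN, COMMIT = ("begin", 0), ("commit", 0)

def to_states(trace):
    """Tag each query with how many times it already ran in this transaction."""
    seen = Counter()
    states = []
    for query in trace:
        states.append((query, seen[query]))
        seen[query] += 1
    return [BEGIN] + states + [COMMIT]

def train(traces):
    """Count the observed transitions between consecutive states."""
    counts = defaultdict(Counter)
    for trace in traces:
        states = to_states(trace)
        for current, nxt in zip(states, states[1:]):
            counts[current][nxt] += 1
    return counts

def estimate_path(counts):
    """Greedily follow the most frequent transition from begin towards commit."""
    path, state, visited = [], BEGIN, {BEGIN}
    while state in counts:
        state = counts[state].most_common(1)[0][0]
        if state == COMMIT or state in visited:  # stop at commit or on a cycle
            break
        visited.add(state)
        path.append(state[0])
    return path

# Hypothetical traces: two orders with two items, one order with a single item.
traces = [
    ["GetWarehouse", "CheckStock", "CheckStock", "InsertOrder"],
    ["GetWarehouse", "CheckStock", "CheckStock", "InsertOrder"],
    ["GetWarehouse", "CheckStock", "InsertOrder"],
]
estimated = estimate_path(train(traces))
```

With these traces the two-item path is the most common one, so `estimated` lists GetWarehouse, two CheckStock queries, and InsertOrder.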
Step #2: Determine which optimizations to enable in the DBMS.
Optimizations:
w_id=0 i_w_ids=[0,0] i_ids=[1001,1002]
• Best Partition?
• Touched Partitions?
• Finished Partitions?
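The three questions above can be answered from an estimated path once each step is annotated with the partitions it is expected to access. The following is a minimal sketch under that assumption; `plan_optimizations`, its return format, and the tie-breaking rule are illustrative choices, not the DBMS's real interface.

```python
from collections import Counter

def plan_optimizations(path):
    """path: ordered (query_name, partitions) pairs from an estimated execution path."""
    touches = Counter()
    last_step = {}
    for step, (_query, partitions) in enumerate(path):
        for p in partitions:
            touches[p] += 1
            last_step[p] = step
    # Best partition: the most frequently accessed one (ties go to the lowest id).
    best = min(touches, key=lambda p: (-touches[p], p)) if touches else None
    # Finished partitions: for each step, the partitions not needed again afterwards.
    finished_after = {}
    for p, step in sorted(last_step.items()):
        finished_after.setdefault(step, []).append(p)
    return {
        "best": best,
        "touched": sorted(touches),
        "finished_after": finished_after,
    }

# Hypothetical estimated path for an order whose second item lives on partition 1.
path = [
    ("GetWarehouse", {0}),
    ("CheckStock", {0}),
    ("CheckStock", {1}),
    ("InsertOrder", {0}),
]
plan = plan_optimizations(path)
```

Here partition 0 is the best place to run the transaction, partitions 0 and 1 are the only ones touched, and partition 1 is finished after the third query, before the transaction itself completes.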
Input Parameters:
SELECT S_QTY FROM STOCK WHERE S_W_ID = ? AND S_I_ID = ?;
Current State
X
w_id=0 i_w_ids=[0,1] i_ids=[1001,1002]
CheckStock:
Input Parameters:
INSERT INTO ORDERS (o_id, o_w_id) VALUES (?, ?);
InsertOrder:
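The two parameter sets shown on these slides differ only in i_w_ids, and that difference decides whether the transaction stays on one partition. A rough sketch of that mapping follows. It assumes the procedure runs GetWarehouse once, CheckStock once per item, and InsertOrder once, and that warehouses map to partitions by modulo; the two-partition cluster and all function names are hypothetical.

```python
NUM_PARTITIONS = 2  # hypothetical cluster size

def partition_for(warehouse_id, num_partitions=NUM_PARTITIONS):
    # Assumption: warehouses are spread over partitions by simple modulo hashing.
    return warehouse_id % num_partitions

def estimate_query_partitions(w_id, i_w_ids, i_ids):
    """Return the (query name, partition) pairs one order is expected to generate."""
    plan = [("GetWarehouse", partition_for(w_id))]
    # One CheckStock per ordered item, routed by the item's supplying warehouse.
    for supply_w_id, _item_id in zip(i_w_ids, i_ids):
        plan.append(("CheckStock", partition_for(supply_w_id)))
    plan.append(("InsertOrder", partition_for(w_id)))
    return plan

local = estimate_query_partitions(0, [0, 0], [1001, 1002])
remote = estimate_query_partitions(0, [0, 1], [1001, 1002])

local_partitions = {p for _, p in local}
remote_partitions = {p for _, p in remote}
```

With i_w_ids=[0,0] every query lands on partition 0. With i_w_ids=[0,1] the second CheckStock is routed to partition 1, so assuming the transaction is single-partitioned would be wrong.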
November 9, 2011
w_id=0 i_w_ids=[0,1] i_ids=[1001,1002]
[Figure: branching on the input-parameter features ArrayLength(i_w_ids) and HashValue(w_id), with branch labels =0, =1, and =2]
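The two features named in this figure can be computed directly from the procedure's input parameters and used as a key for picking a model. The sketch below is only an illustration: the modulo hash, the two-partition default, and the `models` registry are assumptions, not the system's actual hashing scheme or data structures.

```python
def array_length(params, name):
    """ArrayLength feature: number of elements in an array parameter."""
    return len(params[name])

def hash_value(params, name, num_partitions):
    """HashValue feature. Assumption: integer keys map to partitions by modulo."""
    return params[name] % num_partitions

def feature_vector(params, num_partitions=2):
    """Combine the two features from the figure into a lookup key."""
    return (
        array_length(params, "i_w_ids"),
        hash_value(params, "w_id", num_partitions),
    )

# Hypothetical registry mapping feature vectors to pre-trained models.
models = {
    (1, 0): "model: 1 item, base partition 0",
    (2, 0): "model: 2 items, base partition 0",
    (2, 1): "model: 2 items, base partition 1",
}

params = {"w_id": 0, "i_w_ids": [0, 1], "i_ids": [1001, 1002]}
key = feature_vector(params)
selected = models.get(key)  # None if no model was trained for this key
```

For the parameters shown on the slide, ArrayLength(i_w_ids) is 2 and HashValue(w_id) is 0, so the lookup picks the model keyed by (2, 0).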
Experimental Evaluation
            TATP      TPC-C     AuctionM
Accuracy    94.9%     95.0%     90.2%
Overhead    +1.86%    +1.17%    +8.15%
[Figure: throughput (txn/s) for Houdini and Assume Single-Partitioned]
TATP +57%, TPC-C +126%, AuctionM +117%
Conclusion: Scaling your OLTP DBMS must come from within.
https://github.com/apavlo/h-store
http://hstore.cs.brown.edu