Upload
lynn-owens
View
218
Download
2
Embed Size (px)
Citation preview
@andy_pavlo@andy_pavloFASFASTERTER
Making Making Fast Fast
DatabasesDatabases
FastCheap
+
OLTP Through the Looking Glass,OLTP Through the Looking Glass,and What We Found There and What We Found There SIGMOD 2008SIGMOD 2008
CPU Cycles
TPC-C NewOrder
8.1%
21.1%
18.7%
10.2%
29.6%
12.3%
Legacy Systems
FastRepetitiveSmall
OLTP Transactio
ns
Main Memory • Parallel • Shared-NothingTransaction Processing
H-Store: A High-Performance, DistributedH-Store: A High-Performance, DistributedMain Memory Transaction Processing SystemMain Memory Transaction Processing SystemVLDB VLDB vol. 1, issue 2, 2008vol. 1, issue 2, 2008
ClientApplication
Database Cluster
Procedure NameInput
Parameters
Stored Procedure Execution
Database Cluster
Transaction
Result
TPC-C NewOrder
Skew-Aware Automatic Database PartitioningSkew-Aware Automatic Database Partitioningin Shared-Nothing, Parallel OLTP Systemsin Shared-Nothing, Parallel OLTP SystemsSIGMOD 2012SIGMOD 2012
Partition Partition database to database to reduce the reduce the number of number of distributed distributed txns.txns.
OptimizatioOptimization #1:n #1:
o_id o_c_id
o_w_id
…
78703 1004 5 -
78704 1002 3 -
78705 1006 7 -
78706 1005 6 -
78707 1005 6 -
78708 1003 12 -
c_id c_w_id
c_last
…
1001 5 RZA -
1002 3 GZA -
1003 12 Raekwon
-
1004 5 Deck -
1005 6 Killah -
1006 7 ODB -
CUSTOMER ORDERS
CUSTOCUSTOMERMERORDERORDER
SS
CUSTOCUSTOMERMERORDERORDER
SS
CUSTOCUSTOMERMERORDERORDER
SS
ITEMi_id i_na
mei_price
…
603514
XXX 23.99 -
267923
XXX 19.99 -
475386
XXX 14.99 -
578945
XXX 9.98 -
476348
XXX 103.49
-
784285
XXX 69.99 -
ITEMITEM ITEMITEM ITEMITEM
CUSTOMERc_id c_w_i
dc_las
t…
1001 5 RZA -
1002 3 GZA -
1003 12 Raekwon
-
1004 5 Deck -
1005 6 Killah -
1006 7 ODB -
CUSTOCUSTOMERMERORDERORDER
SS
CUSTOCUSTOMERMERORDERORDER
SS
CUSTOCUSTOMERMERORDERORDER
SSITEMITEM ITEMITEM ITEMITEM
Client Application
NewOrder(5, “Method Man”,
1234)
CUSTCUSTOMEROMERORDEORDE
RSRSITEMITEM
CUSTCUSTOMEROMERORDEORDE
RSRSITEMITEM
CUSTCUSTOMEROMERORDEORDE
RSRSITEMITEM
CUSTCUSTOMEROMERORDEORDE
RSRSITEMITEM
…
Large-Neighorhood Large-Neighorhood Search AlgorithmSearch Algorithm
Schema
Workload
-------
-------
-------
-------
-------
-------
DDL
DDL
SELECT * FROM WAREHOUSE WHERE W_ID = 10;
INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID) VALUES (10, 9, 12345);
⋮
SELECT * FROM WAREHOUSE WHERE W_ID = 10;
INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID) VALUES (10, 9, 12345);
⋮
SELECT * FROM WAREHOUSE WHERE W_ID = 10;
INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID) VALUES (10, 9, 12345);
⋮
SELECT * FROM WAREHOUSE WHERE W_ID = 10;
INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID) VALUES (10, 9, 12345);
⋮
SELECT * FROM WAREHOUSE WHERE W_ID = 10;
SELECT * FROM DISTRICT D_W_ID = 10 AND D_ID =9;
INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID) VALUES (10, 9, 12345);
⋮
SELECT * FROM WAREHOUSE WHERE W_ID = 10;
SELECT * FROM DISTRICT D_W_ID = 10 AND D_ID =9;
INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID) VALUES (10, 9, 12345);
⋮
SELECT * FROM WAREHOUSE WHERE W_ID = 10;
SELECT * FROM DISTRICT WHERE D_W_ID = 10 AND D_ID =9;
INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID,…) VALUES(10, 9, 12345,…);
⋮
SELECT * FROM WAREHOUSE WHERE W_ID = 10;
SELECT * FROM DISTRICT WHERE D_W_ID = 10 AND D_ID =9;
INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID,…) VALUES(10, 9, 12345,…);
⋮
NewOrdNewOrd
ererNewOrdNewOrd
ererDDLDDLCUSTOMERCUSTOMER
ITEMITEM
ORDERSORDERS
Workload
-------
-------
-------
-------
-------
-------
Schema
DDL
DDL
Initial DesignRelaxationLocal Search
Restart
Large-Neighborhood
Search
TATP TPC-CTPC-C Skewed
(txn/s)
+88% +16% +183%
HorticultureState-of-the-Art
Throughput
TATP SEATS
TPC-C TPC-C Skewed
AuctionMark TPC-E
Search Times
% S
ing
le-P
art
itio
ned
Tra
nsa
ctio
ns
Database Cluster
Database Cluster
ClientApplication
Undo Log
On Predictive Modeling for OptimizingOn Predictive Modeling for OptimizingTransaction Execution in Parallel OLTP SystemsTransaction Execution in Parallel OLTP SystemsVLDB, vol 5. issue 2, October 2011VLDB, vol 5. issue 2, October 2011
Predict what Predict what txns will do txns will do beforebefore they they execute.execute.
OptimizatioOptimization #2:n #2:
Database Cluster
Database Cluster
ClientApplication
» Partitions Touched?» Undo Log?
» Done with Partitions?
w_id=0i_w_ids=[0,1] i_ids=[1001,1002]
w_id=0i_w_ids=[0,1] i_ids=[1001,1002]
Input Parameters:Current State:
SELECT * FROM WAREHOUSE WHERE W_ID = ?SELECT * FROM WAREHOUSE WHERE W_ID = ?
GetWarehouse:
Confidence Coefficient:
0.96
Best Partition: 0
Partitions Accessed:
{ 0 }
Use Undo Logging:
Yes
Transaction Estimate:
Estimated Execution Path
w_id=0i_w_ids=[0,1] i_ids=[1001,1002]
w_id=0i_w_ids=[0,1] i_ids=[1001,1002]
Input Parameters:
TATP TPC-CAuctionMark
(txn/s)
+57% +126%+117%
HoudiniAssume Single-Partitioned
Throughput
TATP TPC-CAuctionMark
Prediction Overhead
Future WorkFuture Work::Reduce distributed txn overheadReduce distributed txn overheadthrough creative scheduling.through creative scheduling.
Conclusion:Conclusion:Achieving fast performance isAchieving fast performance ismore than just using only RAM.more than just using only RAM.
h-h-storestorehstore.cs.br
own.eduhstore.cs.br
own.edugithub.com/apavlo/h-storegithub.com/apavlo/h-store
Help is AvailableHelp is Available
Graduate Student Graduate Student Abuse HotlineAbuse Hotline
AvailableAvailable 24/724/7Collect Calls AcceptedCollect Calls Accepted
+1-212-939-7064+1-212-939-7064