47
MY DATABASE SYSTEM IS THE ONLY THING I CAN TRUS T @ANDY_PAVLO

MY DATABASE SYSTEM IS THE ONLY THING I CAN

Embed Size (px)

DESCRIPTION

MY DATABASE SYSTEM IS THE ONLY THING I CAN. TRUST. @ ANDY_PAVLO. Thirty Years Ago…. I NTERACTIVE T ransactions. S mall # of CPU C ores. S mall M emory S izes. Voter Setup. TP C-C BENCHMARK. Warehouse Order Processing. Application. Transaction:. NewOrder Transaction. - PowerPoint PPT Presentation

Citation preview

Page 1: MY DATABASE SYSTEM IS THE ONLY THING I CAN

MY DATABASE SYSTEM ISTHE ONLY

THING I CANTRUST @ANDY_PAVLO

Page 2: MY DATABASE SYSTEM IS THE ONLY THING I CAN

Thirty Years Ago…

2

Page 3: MY DATABASE SYSTEM IS THE ONLY THING I CAN

SMALL # OF CPU CORESSMALL MEMORY SIZES

INTERACTIVE TRANSACTIONS

Page 4: MY DATABASE SYSTEM IS THE ONLY THING I CAN

Voter Setup

1.Check item stock level.

2.Create new order information.

3.Update item stock levels.

NewOrder Transaction

TPC-C BENCHMARKWarehouse Order Processing

4

APPLICATION

Page 5: MY DATABASE SYSTEM IS THE ONLY THING I CAN

1 2 3 4 5 6 7 8 9 10 11 120

5,000

10,000

15,000

20,000

Voter MySQL/Postgres

5

TPC-C BENCHMARKWarehouse Order Processing

MySQL

Postgres

TXN/SEC

CPU CORES

Page 6: MY DATABASE SYSTEM IS THE ONLY THING I CAN

Buffer PoolLockingRecoveryReal Work

28%

30%

30% 12%

CPU OverheadTRADITIONAL

DBMSMeasured CPU Cycles

OLTP THROUGH THE LOOKING GLASS, AND WHAT WE FOUND THERESIGMOD, pp. 981-992, 2008. 6

Page 7: MY DATABASE SYSTEM IS THE ONLY THING I CAN

NOSQL

SHARDING MIDDLEWARE

HARDWARE UPGRADEREPLICATIONDISTRIBUTED CACHE

Page 8: MY DATABASE SYSTEM IS THE ONLY THING I CAN

HOW TO SCALE UP WITHOUT GIVING UP TRANSACTI

ONS?

Page 9: MY DATABASE SYSTEM IS THE ONLY THING I CAN

H-STORE: A HIGH-PERFORMANCE, DISTRIBUTEDMAIN MEMORY TRANSACTION PROCESSING SYSTEMProc. VLDB Endow., vol. 1, iss. 2, pp. 1496-1499, 2008.

Distributed Main MemoryTransaction Processing System

Page 10: MY DATABASE SYSTEM IS THE ONLY THING I CAN

DISK ORIENTED

CONCURRENT EXECUTION

HEAVYWEIGHT RECOVERY

xi/

MAIN MEMORY STORAGE

SERIAL EXECUTION

COMPACT LOGGING

Page 11: MY DATABASE SYSTEM IS THE ONLY THING I CAN

Transaction

Execution

Applic

ati

on

PARTITIONS

SINGLE-THREADEDEXECUTION ENGINES

Transaction

Result

CMD LOGSNAPSHOTS

H-Store Architecture DiagramProcedure

NameInput

Parameters

run(phoneNum, contestantId, currentTime) {result = execute(VoteCount,

phoneNum);if (result > MAX_VOTES) {

return (ERROR);}execute(InsertVote, phoneNum,

contestantId, currentTime);

return (SUCCESS);}

VoteCount:SELECT COUNT(*) FROM votes WHERE phone_num = ?;

InsertVote:INSERT INTO votes VALUES (?, ?, ?);

STORED PROCEDURE

11

Page 12: MY DATABASE SYSTEM IS THE ONLY THING I CAN

1 2 3 4 5 6 7 8 9 10 11 120

5,000

10,000

15,000

20,000

Voter MySQL/Postgres

12

TPC-C BENCHMARKWarehouse Order Processing

MySQL

Postgres H-Store

40x

TXN/SEC

CPU CORES

Page 13: MY DATABASE SYSTEM IS THE ONLY THING I CAN

DISTRIBUTED

TRANSACTIONS

Page 14: MY DATABASE SYSTEM IS THE ONLY THING I CAN

14

1 2 3 40

10,000

20,000

30,000

40,000

8 Cores per Node10% Distributed Transactions

Distributed TPC-CTPC-C

BENCHMARKH-Store

TXN/SEC

NODES

Page 15: MY DATABASE SYSTEM IS THE ONLY THING I CAN

Query CountP1P2P3P4

Distributed Txn Example

Applic

ati

on

15

DISTRIBUTED TRANSACTION

S

Page 16: MY DATABASE SYSTEM IS THE ONLY THING I CAN

KNOW WHAT TRANSACTIONS

WILL DO BEFORE THEY

START

Page 17: MY DATABASE SYSTEM IS THE ONLY THING I CAN

BUT PEOPLE ALWAYS GIVE ME

BAD ADVICE

Page 18: MY DATABASE SYSTEM IS THE ONLY THING I CAN

DON’T GET INVOLVED WITH COMPUTERS.YOU’LL NEVER MAKE ANY MONEY.

Page 19: MY DATABASE SYSTEM IS THE ONLY THING I CAN

DON’T GET A PHD.EVERYONE WILL THINK YOU ARE A JERK.

Page 20: MY DATABASE SYSTEM IS THE ONLY THING I CAN

THE DATABASE SYSTEM

ALWAYS HAS MOREINFOR

MATION

Page 21: MY DATABASE SYSTEM IS THE ONLY THING I CAN

DO USE MACHINE

LEARNING TO PREDICT

TRANSACTION BEHAVIOR.ON PREDICTIVE MODELING FOR OPTIMIZING

TRANSACTION EXECUTION IN PARALLEL OLTP SYSTEMSProc. VLDB Endow., Vol 5, Iss. 2, pp. 85-96, 2011

Page 22: MY DATABASE SYSTEM IS THE ONLY THING I CAN

Houdini OverviewPREDICTIVE

MODELS

22

Page 23: MY DATABASE SYSTEM IS THE ONLY THING I CAN

Model Genera

tor

Feature2 Feature2

Feature1

Classifier

Feature

Clusterer

SELECT * FROM WAREHOUSE WHERE W_ID = 10;

INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID) VALUES (10, 9, 12345);

SELECT * FROM WAREHOUSE WHERE W_ID = 10;

INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID) VALUES (10, 9, 12345);

SELECT * FROM WAREHOUSE WHERE W_ID = 10;

SELECT * FROM DISTRICT D_W_ID = 10 AND D_ID =9;

INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID) VALUES (10, 9, 12345);

SELECT * FROM WAREHOUSE WHERE W_ID = 10;

SELECT * FROM DISTRICT WHERE D_W_ID = 10 AND D_ID =9;

INSERT INTO ORDERS (O_W_ID, O_D_ID, O_C_ID,…) VALUES(10, 9, 12345,…);

WORKL

OAD

Decision Tree

______________________________________________________

____________________________________________________________________________________________________________

______________________________________________________

Creating Models

Markov Models

23

Page 24: MY DATABASE SYSTEM IS THE ONLY THING I CAN

Houdini ExampleA

pplic

ati

on

DISTRIBUTED TRANSACTION

S

24

Page 25: MY DATABASE SYSTEM IS THE ONLY THING I CAN

25

1 2 3 40

5,000

10,000

15,000

20,000

25,000

1 2 3 40

15,000

30,000

45,000

60,000

Houdini TPC-C

2xHoudiniNaïve OPTIMAL

8 Cores per Node10% Distributed Transactions

TPC-C BENCHMARK

TXN/SEC

NODES

Page 26: MY DATABASE SYSTEM IS THE ONLY THING I CAN

Houdini Example

Zzzz…

Zzzz…

Zzzz…

Applic

ati

on

DISTRIBUTED TRANSACTION

S

SP1 - Waiting for Query ResultSP2 - Waiting for Query RequestSP3 - Two-Phase Commit

Zzzz…

26

Page 27: MY DATABASE SYSTEM IS THE ONLY THING I CAN

TRANSACTION STALL POINTS

27

BASE PARTITION

REMOTE PARTITION

SP1 - Waiting for Query ResultSP2 - Waiting for Query RequestReal Work

45%

18% 37

%

73% 22

%

5%

SP3 - Two-Phase Commit

Page 28: MY DATABASE SYSTEM IS THE ONLY THING I CAN

DO SOMETHING

USEFULWHEN

STALLED

Page 29: MY DATABASE SYSTEM IS THE ONLY THING I CAN

DON’T BE SURPRISED IF YOU & KB DON’T LAST THROUGHGRAD SCHOOL.

Page 30: MY DATABASE SYSTEM IS THE ONLY THING I CAN

DON’T BE STAN’S STUDENTIF YOU GO TOBROWN.

Page 31: MY DATABASE SYSTEM IS THE ONLY THING I CAN

DO USE MACHINE

LEARNING TO SCHEDULE

SPECULATIVE TASKS.THE ART OF SPECULATIVE EXECUTION

In Progress (August 2013)

Page 32: MY DATABASE SYSTEM IS THE ONLY THING I CAN

Serializable Schedule

32

Distributed Transaction

Speculative Transaction

Speculative Transaction

Zzzz… C

C

C

C

C

VERIFY

Single-Partition Transaction

Single-Partition Transaction

SERIALIZABLE SCHEDULE

Page 33: MY DATABASE SYSTEM IS THE ONLY THING I CAN

SPECULATIVE TRANSACTIONS

Transaction Queue

Distributed Transaction:

Speculation Candidate:

Hermes OverviewZzzz…

READ X READ X

WRITE X

33

Page 34: MY DATABASE SYSTEM IS THE ONLY THING I CAN

SPECULATIVE QUERIES

Distributed Transaction:

Hermes Overview

34

SELECT S_QTY FROM STOCK WHERE S_W_ID = ? AND S_I_ID = ?;

QueryY:

Page 35: MY DATABASE SYSTEM IS THE ONLY THING I CAN

SELECT * FROM WAREHOUSE WHERE W_ID = ?

w_id=0i_w_ids=[1,0] i_ids=[1001,1002]

GetWarehouse:

Transaction Parameters:

SELECT S_QTY FROM STOCK WHERE S_W_ID = ? AND S_I_ID = ?;

CheckStock:

Mapping: Example

35

Page 36: MY DATABASE SYSTEM IS THE ONLY THING I CAN

Distributed Transaction

Speculative Transactions

Hermes Overview

Query1Query2Query3

Query1Query3Query3

Query1Query2Query3

Query3

VERIFICATION

36

Page 37: MY DATABASE SYSTEM IS THE ONLY THING I CAN

1 2 3 40

10,000

20,000

30,000

40,000

50,000Spec QueriesNone

Hermes TPC-C

All

37

8 Cores per Node10% Distributed Transactions

TPC-C BENCHMARK

Spec Txns

TXN/SEC

NODES

Page 38: MY DATABASE SYSTEM IS THE ONLY THING I CAN

Optimize Single-Partition ExecutionH-STORE: A HIGH-PERFORMANCE, DISTRIBUTED MAIN MEMORY TRANSACTION PROCESSING SYSTEM Proc. VLDB Endow., vol. 1, iss. 2, pp. 1496-1499, 2008.

Minimize Distributed TransactionsSKEW-AWARE AUTOMATIC DATABASE PARTITIONING IN SHARED-NOTHING, PARALLEL OLTP SYSTEMSProceedings of SIGMOD, 2012.

Identify Distributed Transactions ON PREDICTIVE MODELING FOR OPTIMIZING TRANSACTION EXECUTION IN PARALLEL OLTP SYSTEMSProc. VLDB Endow., vol. 5, pp. 85-96, 2011.

Utilize Transaction StallsTHE ART OF SPECULATIVE EXECUTIONIn Progress (August 2013)

Page 39: MY DATABASE SYSTEM IS THE ONLY THING I CAN

FUTUREWORK

Page 40: MY DATABASE SYSTEM IS THE ONLY THING I CAN

One SizeAlmostFits All

Page 41: MY DATABASE SYSTEM IS THE ONLY THING I CAN

41

H-STORE

N-STORE

NS-STORE

Page 42: MY DATABASE SYSTEM IS THE ONLY THING I CAN

One SizeAlmostFits All™

Page 43: MY DATABASE SYSTEM IS THE ONLY THING I CAN

43

H-STORE

N-STORE

NS-STORE

Page 44: MY DATABASE SYSTEM IS THE ONLY THING I CAN

Escape From Planet

Zdonik(i.e., Andy Needs to Get Tenure)

Page 45: MY DATABASE SYSTEM IS THE ONLY THING I CAN

Beyond the ‘Stores• Non-Partitionable

Workloads.

• The Poor Man’s Spanner.

• Scientific Databases.

Page 46: MY DATABASE SYSTEM IS THE ONLY THING I CAN

DON’T MESS IT UPWITH KB.

Page 47: MY DATABASE SYSTEM IS THE ONLY THING I CAN

StanZdonik

“The Thrill”Stonebraker

SamMadden

UgurCetintemel

DavidDeWitt

DanAbadi

JustinDeBrabant

SauryaVelagapudi

JohnMeehan

XinJia

CarloCurino

EvanJones

YangZou

NingShi

VisaweeAngkana.