Scaling Dynamic Content Applications through Data Replication - Opportunities for Compiler Optimizations Cristiana Amza UofT


Scaling Dynamic Content Applications through Data Replication -

Opportunities for Compiler Optimizations

Cristiana Amza

UofT


Dynamic Content is Ubiquitous


We focus on scaling the database

Dynamic Content Web Sites

3-tier: Web Server, Application Server, Database

[Diagram: Web Server, App Server, Database Server; arrows labeled HTTP Request, HTML Page, Query, Response]


Common Scaling Solution

[Diagram: Web Server, App Server, SMP Database Server; arrows labeled HTTP Request, HTML Page, Query, Response]


Alternative: Database Cluster

Read-one, write all replication

[Diagram: Web Server, App Server, Cluster of Database Servers; arrows labeled HTTP Request, HTML Page, Query, Response]
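
A minimal sketch of read-one, write-all routing, assuming in-memory dictionaries stand in for database replicas. The class name `ReplicatedCluster`, the round-robin read policy, and the key `stock:book42` are illustrative assumptions, not taken from the talk.

```python
import itertools


class ReplicatedCluster:
    """Toy read-one, write-all router over in-memory replicas."""

    def __init__(self, num_replicas):
        # Each replica is a plain dict (hypothetical stand-in for a database).
        self.replicas = [dict() for _ in range(num_replicas)]
        self._next = itertools.cycle(range(num_replicas))

    def write(self, key, value):
        # Write-all: every replica applies the update.
        for replica in self.replicas:
            replica[key] = value

    def read(self, key):
        # Read-one: a single replica answers; chosen round-robin here.
        index = next(self._next)
        return index, self.replicas[index].get(key)


cluster = ReplicatedCluster(3)
cluster.write("stock:book42", 7)
served_by = [cluster.read("stock:book42") for _ in range(3)]
print(served_by)
```

Reads spread across replicas, so read capacity grows with the cluster, while every write costs work on all replicas.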


Where to Implement?

[Diagram: Web Server, App Server, Scheduler, Cluster of Database Servers; Query and Response arrows appear twice]


Problem: Conflict Ordering

Bestseller book

Consistent order for conflicting transactions

(Same client buys the book on all replicas)

[Diagram: two Buy requests reach the Cluster of Database Servers; arrows labeled Query, Response]
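
The ordering problem can be illustrated with a toy example, assuming one remaining copy of the book and two concurrent buyers. The function `apply_buys` and the client names are hypothetical.

```python
def apply_buys(stock, buyers):
    """Apply purchase requests in the given order.

    Returns (list of buyers who got a copy, remaining stock).
    """
    winners = []
    for buyer in buyers:
        if stock > 0:
            stock -= 1
            winners.append(buyer)
    return winners, stock


# One copy left; the two replicas receive the same requests in different orders.
replica_1 = apply_buys(1, ["alice", "bob"])  # replica 1 sees alice first
replica_2 = apply_buys(1, ["bob", "alice"])  # replica 2 sees bob first
diverged = replica_1 != replica_2
print(replica_1, replica_2, diverged)
```

Both replicas end with zero stock but disagree on who bought the book, which is why conflicting transactions need the same order everywhere.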


Synchronous (Eager) Replication

[Diagram: Web Server, App Server, Scheduler, Cluster of Database Servers; arrows labeled Query, Response, Write]

Does not perform well


Our Asynchronous Solution

[Diagram: Web Server, App Server, Scheduler, Cluster of Database Servers; arrows labeled Query, Response, Write]


Our Asynchronous Solution

[Diagram: Web Server, App Server, Scheduler, Cluster of Database Servers; arrows labeled Query, Response]

How about conflict ordering?


Idea: Application Code Known

begin-transaction
    write a
    write b
    write c
commit-transaction


Make Conflicts Explicit

begin-transaction use a, b, c
    write a
    last-use a
    write b
    last-use b
    write c
commit-transaction

(Could be automated by compiler)
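
A rough sketch of how such annotation might be automated, assuming straight-line transaction code (no branches or loops) where each statement touches exactly one table. The function `annotate` and its input format are assumptions made for illustration, not the compiler pass from the talk.

```python
def annotate(statements):
    """Add a use declaration and last-use markers to a transaction.

    statements: list of (text, table) pairs in program order.
    Returns the annotated transaction as a list of lines.
    """
    last_index = {}
    for i, (_, table) in enumerate(statements):
        last_index[table] = i  # later occurrences overwrite earlier ones

    # Declare tables in order of first use, without duplicates.
    tables = list(dict.fromkeys(table for _, table in statements))
    lines = ["begin-transaction use " + ", ".join(tables)]
    for i, (text, table) in enumerate(statements):
        lines.append(text)
        # No marker after the final statement: commit releases what is left.
        if last_index[table] == i and i != len(statements) - 1:
            lines.append("last-use " + table)
    lines.append("commit-transaction")
    return lines


annotated = annotate([("write a", "a"), ("write b", "b"), ("write c", "c")])
print("\n".join(annotated))
```

For the three-write transaction this reproduces the annotated form on the slide. Real application code with control flow would need a conservative analysis instead of a single linear scan.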


Order Conflicts

Scheduler assigns versions at “use” table declaration

Atomic (per transaction)

Per table

If conflict, assign a higher version number
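
A minimal sketch of per-table version assignment, assuming every declared table is written by the transaction. The class `VersionScheduler` and its method `use` are hypothetical names; the lock is one simple way to make the assignment atomic per transaction.

```python
import threading


class VersionScheduler:
    """Hands out per-table version numbers at the use declaration."""

    def __init__(self):
        self._next_version = {}        # table -> version for the next transaction
        self._lock = threading.Lock()  # makes assignment atomic per transaction

    def use(self, tables):
        """Assign one version per declared table in a single atomic step."""
        with self._lock:
            assigned = {}
            for table in tables:
                version = self._next_version.get(table, 0)
                assigned[table] = version
                # Assumption: every declared table is written, so the next
                # conflicting transaction receives a higher version number.
                self._next_version[table] = version + 1
            return assigned


scheduler = VersionScheduler()
t0 = scheduler.use(["a", "b", "c"])
t1 = scheduler.use(["a", "b", "c"])
t2 = scheduler.use(["c", "d"])
print(t0, t1, t2)
```

Here `t0` receives version 0 of each table, `t1` receives version 1, and `t2` receives version 2 of `c` but version 0 of the untouched table `d`, so only conflicting transactions are ordered against each other.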


Execution Rules

Enforced by database proxy at each query

Wait for appropriate version of tables to be produced

Versions are produced

• At commit/abort
• At last-use
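
The rules above can be sketched as a small, non-blocking model of one replica's proxy. It assumes a transaction that was assigned version v of a table produces version v + 1 when it releases that table. `ReplicaProxy`, `can_run`, and `release` are hypothetical names, and a real proxy would block the query rather than return a flag.

```python
class ReplicaProxy:
    """Tracks which version of each table has been produced at one replica."""

    def __init__(self):
        self.produced = {}  # table -> latest version available locally

    def can_run(self, table, needed_version):
        # A query may run only once its assigned version exists here.
        return self.produced.get(table, 0) >= needed_version

    def release(self, table, needed_version):
        # Called at last-use, or for each remaining table at commit/abort:
        # the transaction that used version v produces version v + 1.
        self.produced[table] = needed_version + 1


proxy = ReplicaProxy()
t0 = {"a": 0, "b": 0}  # versions assigned to the first transaction
t1 = {"a": 1, "b": 1}  # versions assigned to a conflicting transaction

before = proxy.can_run("a", t1["a"])             # T1 must wait for T0
proxy.release("a", t0["a"])                      # T0 reaches last-use of a
after = proxy.can_run("a", t1["a"])              # T1's query on a may proceed
still_blocked = not proxy.can_run("b", t1["b"])  # b has not been released yet
print(before, after, still_blocked)
```

Releasing at last-use rather than at commit lets the second transaction start on table `a` while the first is still working on `b`.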


How It Works

begin
use a, b, c
get versions, e.g., (0, 0, 0)
write a
write b
write c
commit

1 2 3 4 5 6
T0: a0, b0, c0
T1: a1, b1, c1


Order-placement

begin orders, order_line, item, customer, credit_card
insert new order in orders
for all items in shopping_cart {
    insert order in order_line
    adjust item.stock field in item
}
find customer
insert info in credit_card
commit


Annotated Order-placement

begin orders, order_line, item, customer, credit_card
insert new order in orders table
release orders
for all items in shopping_cart
    insert order in order_line
release order_line
for all items in shopping_cart
    adjust item.stock field in item
release item
find customer
release customer
insert info in credit_card
commit


Refine “Object” Granularity

begin orders.*, order_line.*, item.stock, customer.* …
insert new order in orders table
release orders.*
for all items in shopping_cart
    insert order in order_line
release order_line.*
for all items in shopping_cart
    adjust item.stock field in item
release item.stock
find customer
release customer.*
insert info in credit_card
commit
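
A small sketch of why finer granularity reduces conflicts, assuming objects are written as `table.column` with `*` meaning every column, and treating every declared object as written. The helper names and the two extra transactions (`price_update`, `restock`) are hypothetical, as is the `item.price` column.

```python
def overlaps(obj_a, obj_b):
    """Objects are 'table.column' strings; '*' stands for every column."""
    table_a, col_a = obj_a.split(".", 1)
    table_b, col_b = obj_b.split(".", 1)
    if table_a != table_b:
        return False
    return col_a == "*" or col_b == "*" or col_a == col_b


def conflict(decl_a, decl_b):
    """Two declarations conflict if any pair of their objects overlaps."""
    return any(overlaps(a, b) for a in decl_a for b in decl_b)


order_placement = ["orders.*", "order_line.*", "item.stock", "customer.*"]
price_update = ["item.price"]  # hypothetical transaction, not from the slides
restock = ["item.stock"]       # hypothetical transaction, not from the slides

print(conflict(order_placement, price_update))  # different columns of item
print(conflict(order_placement, restock))       # same column of item
print(conflict(["item.*"], price_update))       # table-level declaration
```

With a table-level declaration the price update would be ordered against order placement. With the column-level declaration it is not, because the two touch different columns of `item`.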


Value Flow Analysis

Known argument values for program
BestSellers.php?Category=“KIDS”

Infer values of query fields

Bestseller transaction on category = “KIDS” is disjoint from book order in category = “SPORTS”
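
A conservative disjointness check along these lines might look as follows, assuming the analysis has already inferred which query fields are pinned to known values. The function `disjoint`, the table name `item`, and the column name `category` are illustrative assumptions.

```python
def disjoint(access_a, access_b):
    """Decide whether two accesses are provably disjoint.

    Each access is (table, {column: known_value}). Two accesses are disjoint
    if they touch different tables, or if some column is pinned to different
    known values in both. False means "could not prove", not "overlaps".
    """
    table_a, pinned_a = access_a
    table_b, pinned_b = access_b
    if table_a != table_b:
        return True
    return any(
        column in pinned_b and pinned_b[column] != value
        for column, value in pinned_a.items()
    )


bestseller_kids = ("item", {"category": "KIDS"})
order_sports = ("item", {"category": "SPORTS"})
order_unknown = ("item", {})  # category could not be inferred

print(disjoint(bestseller_kids, order_sports))
print(disjoint(bestseller_kids, order_unknown))
```

When a value cannot be inferred, the check falls back to treating the two transactions as conflicting, which keeps the ordering safe.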


Dependences

$var = SELECT …

SELECT …. WHERE … = $var


Other Applications for Optimizations

Transparent Caching

Determine if an update transaction should invalidate a cached query response

[Diagram: App Server, Scheduler, Database server; Query and Response arrows appear twice]
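
A minimal sketch of table-based invalidation, assuming each cached response records the tables its query read and each update declares the tables it wrote. The class `QueryCache` and the sample queries are hypothetical.

```python
class QueryCache:
    """Caches query responses together with the tables each query read."""

    def __init__(self):
        self._entries = {}  # query text -> (response, tables read)

    def put(self, query, response, tables_read):
        self._entries[query] = (response, frozenset(tables_read))

    def get(self, query):
        entry = self._entries.get(query)
        return entry[0] if entry else None

    def on_update(self, tables_written):
        """Drop every cached response that read a table the update wrote."""
        written = set(tables_written)
        stale = [q for q, (_, read) in self._entries.items() if read & written]
        for q in stale:
            del self._entries[q]
        return stale


cache = QueryCache()
cache.put("best sellers KIDS", ["book 1", "book 2"], ["item", "order_line"])
cache.put("customer 7 profile", {"name": "A"}, ["customer"])
invalidated = cache.on_update(["order_line"])
print(invalidated)
```

Table-level tracking is the coarsest option. The finer-grained object and value analyses described earlier would let fewer updates invalidate a given cached response.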


Comparison to Eager

TPC-W

[Chart: Throughput (axis 0 to 500) vs. Database engines (0 to 60); series: DVersion, Eager]


TPC-W Benchmark

Models an on-line book store

Three standard workloads (differ in % of writes)

• browsing (5%)
• shopping (20%)
• ordering (50%)

Application size: 4 GB


Comparison to Eager

TPC-W browsing mix

[Chart: Throughput (axis 0 to 400) vs. Database engines (0 to 60); series: DVersion, Eager]


Comparison to Eager

TPC-W ordering mix

[Chart: Throughput (axis 0 to 180) vs. Database engines (0 to 30); series: DVersion, Eager]


Comparison to Loose Consistency Methods

TPC-W shopping mix

[Chart: Throughput (axis 0 to 500) vs. Database engines (0 to 60); series: Level 0, Level 1, Level 2, DVersion, Eager]


Conclusions

1-copy serializability (1-copy SR) can be implemented with good performance

Key ingredients: asynchrony and conflict reduction

Looser consistency models needed only for (very) write-heavy workloads


Consistency & Lazy Replication

[Diagram: Web/App servers, L4 Switch, Schedulers, Database engines; message labels: LOCK, WR, RD, Seq, SN]

Async writes scaling

Conflict ordering consistency

Conflict? Order SN


Experimental Environment

Software

• Apache + PHP, MySQL

Hardware

• Athlon 800 MHz, 256 MB RAM, Fast Ethernet

Implementation: on 8 replicas

Simulation: for up to 60 databases

• Calibrated against prototype