41
DOLLY: Virtualization-Driven Database Provisioning for the Cloud Emmanuel Cecchet Joint work with Rahul Singh, Upendra Sharma and Prashant Shenoy VEE 2011 paper available at: http://lass.cs.umass.edu/papers.html

DOLLY: Virtualization-Driven Database Provisioning for the Cloud

  • Upload
    durin

  • View
    51

  • Download
    0

Embed Size (px)

DESCRIPTION

VEE 2011 paper available at: http://lass.cs.umass.edu/papers.html. DOLLY: Virtualization-Driven Database Provisioning for the Cloud. Emmanuel Cecchet Joint work with Rahul Singh, Upendra Sharma and Prashant Shenoy. PROVISIONING IN THE CLOUD. Virtualization No shared disk - PowerPoint PPT Presentation

Citation preview

Page 1: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

DOLLY: Virtualization-Driven Database Provisioning for the

Cloud

Emmanuel CecchetJoint work with Rahul Singh, Upendra Sharma and Prashant Shenoy

VEE 2011 paper available at:http://lass.cs.umass.edu/papers.html

Page 2: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

2

PROVISIONING IN THE CLOUD Virtualization No shared disk Provisioning based on volume and machine load

InternetFrontend/Load balancer

DatabasesProvisioning

logic

App.Servers

Page 3: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

3

WHY IS IT HARD TO ADD A DB REPLICA?

Administration issues Install/setup new database Backup/Restore database content Configure new replica Synchronize live content

How much time does it take? Depends on the database size Depends on the backup/restore technique From minutes to hours…

Dolly: VM cloning to blackbox the database

Page 4: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

Dolly

Database replication in the Cloud

Provisioning with DollyPrototype & Evaluation

Page 5: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

5

SPAWNING A REPLICA Multi-master middleware-based replication Three main phases

Backup Restore Replay

DB1new

replicaDB2

Client SQL requests

Replication middleware

Transactional log

Load balancer

Management console

1add

replica

snapshotbackup

2

4

resynchronize

restore3

Page 6: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

6

SPAWNING A REPLICA WITH CLONING Backup & Restore replace by VM cloning

DB1 DB2

Client SQL requests

Replication middleware

Transactional log

Load balancer

Management console

1add

replica

VM1

OSVM2

OS

clone2

DB2

VM3

OS

clone3 DB2

VM3

OS

4

resynchronize

3

Page 7: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

7

CLONING: BACKUP/RESTORE IN CONSTANT TIME

Database DB size on disk

DB Backup Restore

Dolly 4GB VM cloning

Dolly 16GB VM

cloningRUBiS –c–i 1022MB 843s 281s 899sRUBiS +c+bi 1.4GB 5761s 282s 900sRUBiS +c+fi 1.5GB 6017s 280s 900sTPC-W 684MB 288s 275s 905sTPC-H 1GB 1.8GB 1477s 271s 918sTPC-H 10GB 12GB 5573s n/a 911s

Filesystem snapshot/copy is DB agnostic Only depends on VM size

Page 8: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

Dolly

Database replication in the Cloud

Provisioning with DollyPrototype & Evaluation

Page 9: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

9

MODELING SPAWNING TIME Predictable backup and restore times are

required Replay time can be estimated from current

write throughput (wt)

timebackup restore replay

bi ri

updates

max max max

. . .t t ti i

w w wb rw w w

replica spawning time

Page 10: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

10

WHEN TO SNAPSHOT? Time to spawn from a live replica

Time to spawn from an existing snapshot

Faster to take a new snapshot j to spawn a new replica than using old snapshot i if:

backupj+restorej<restorei +replayi

max

maxi i

t

ws b r

w w

max

maxi i

t

ws r replay

w w

Page 11: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

11

DOLLY OVERVIEW

Input capacity

prediction write prediction

Output schedule of

snapshots schedule of replica

spawning admission control if

needed

Predictors

Capacity Provisioning

Spawning options

Admission Control Management API

Scheduler

start/stopclone/ snapshot

Monitoring

write throttling/ read throttling

Dolly

Snapshot scheduler

write predictionscapacity predictions

Paused pool cleaner

Free pool Manager

delete VM/ snapshot

HA adjuster

Write throttling

reclaim

Page 12: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

12

PROVISIONING REPLICAS

Workloadprediction

Capacityprediction

Writeprediction

Dolly does not provide predictors Dolly can work with any predictor (see

[Eurosys09])

Page 13: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

13

CLOUD COST FUNCTIONS Adapt the provisioning decisions to the cloud

platform specifics Cost can be $ on public cloud or time on private

cloud

Cost function name Definitionpause_cost(VM, t) cost of pausing VM at time tspawn_cost(s, t, d) cost to spawn a replica from snapshot s at time t to

meet deadline dspawn_cost(VM, t, d) cost to spawn a replica from a paused VM at time t to

meet deadline drunning_cost(VM,t1,t2) cost to run a VM from time t1 to time t2pause_resume_cost(VM, t1, t2) cost to pause a VM at time t1 and resume it at time t2backup_paused_cost(VM) cost to backup a paused VMbackup_live_cost(VM, t) cost to backup an active VM at time t

Page 14: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

14

PROVISIONING REPLICAS Parse capacity provisioning predictions Decrease capacity by pausing VMs Increasing capacity

Check if we can reuse a paused VMCheck if we can spawn from an existing

snapshotChoose cheapest options according to spawn_cost function

Perform admission control if all replicas cannot be provisioned in time

Page 15: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

15

SNAPSHOT SCHEDULING How to snapshot?

Clone a paused VMPause an active VM to clone it

When to snapshot?At time j when backupj+restorej<restorei

+replayi If new snapshot is scheduled, re-run capacity

provisioning

Prediction window must have minimum size

Page 16: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

Dolly

Database replication in the Cloud

Provisioning with DollyPrototype & Evaluation

Page 17: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

17

IMPLEMENTATION

C-JDBC/Sequoia replication middleware

OpenNebula Cloud management middleware

Cost functions private cloud: minimize

resource utilization time Amazon EC2: minimize

cost

OpenNebula

TPC-W load injector

Scheduler

Recovery LogLog table

Dump table

JMX Management API

Backupers

Dolly OpenNebula

DB1 DB2 DB3

SQL requestsadd/remove replica snapshot/pause/…

VM1

OSVM2

OSVM3

OS

New replica

VM5

OS

DB3 snapshot

VMclone

OS

Loadbalancer

New replica

VM4

OS

Sequoia controller

predictions

Sequoia driver

admission control

Backup serveror NAS

start/stop/ clone/…

clone clone

DollyPrivate EC2

write throttling

Page 18: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

18

IMPLEMENTATION – COST FUNCTIONS Private cloud: minimize resource utilization Amazon EC2: minimize cost

Cost function name Private Cloud EC2pause_cost(VM, t) return 1/VM->machine->temp return 60-((t-VM->start)%60)

spawn_cost(s, t, d) return d-t comp$=(d-t)/60*hour$io$=EBS_storage$*s->size + EBS_io$* (s->restore_io+s->replay_io)return comp$+io$

spawn_cost(VM, t, d) return d-t comp$=(d-t)/60*hour$io$= EBS_io$* (s->resume_io+s->replay_io)return comp$+io$

running_cost(VM,t1,t2) return 1 (t2-t1)/60*hour$

pause_resume_cost(VM, t1, t2)

if (t2-t1 > VM->pause + VM->resume) return 0else return 2

io$= EBS_io$* (VM->pause_io+VM->resume_io)comp$=(60-(VM->stop-VM->start) %60)/60*hour$return io$+ comp$

backup_paused_cost(VM) return backup_time return S3_storage$*s->size

backup_live_cost(VM, t) return VM->pause + backup_time + VM->resume

return pause_cost(VM, t)$+ S3_storage$*s->size + (VM->stop_io+VM->start_io)* EBS_io$

Page 19: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

19

WORKLOAD DESCRIPTION TPC-W online bookstore e-commerce benchmark Snapshot s0 available at t0

Page 20: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

20

Overprovisioning with 6 replicas – 1h snapshotCost MRM

Private cloud 720m 0Amazon EC2 $8.39 0

Page 21: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

21

Reactive provisioning – 2h snapshotCost MRM

Private cloud 410m 42.1Amazon EC2 $4.61 41.5

replica spawning triggered here

replicas available

Page 22: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

22

Dolly – 30m Prediction Window

Cost MRMPrivate cloud 352m 0Amazon EC2 $3.73 0

Private Cloud Amazon EC2

s1 s2 cheaper to leave instances online

Page 23: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

23

CONCLUSION VM cloning

Solves administration issues by blackboxing the database

Constant time backup/restore needed to predict replica spawning time

New provisioning algorithmDecouples capacity provisioning from

snapshot schedulingCost functions to optimize for cloud

platform specifics

VEE 2011 paper available at:http://lass.cs.umass.edu/papers.html

Page 24: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

BONUS SLIDES

Page 25: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

25

Reactive provisioning – 15m snapshotCost MRM

Private cloud 381m42s 17.5Amazon EC2 $18.29 27.2

Page 26: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

26

Reactive provisioning – 1h snapshotCost MRM

Private cloud 360m30s 25.8Amazon EC2 $5.00 33.7

Page 27: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

27

Dolly – 10m Prediction Window

Cost MRMPrivate cloud 381m54s 0Amazon EC2 $7.16 0

Private Cloud Amazon EC2

s1 s2 s1 s2

Page 28: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

28

SPAWNING IN A PUBLIC CLOUD Storage decoupled from computing resource Starting a new instance clones the volume

DB1

Vol1

OS

DB1

Vol1

OS

stop snapshot DB2

Vol2

OSregister

DB2

Vol2

OSrestartDB1

Vol1

OS

DB2

Vol3

OS

DB2

Vol4

OS

start

Page 29: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

29

BACKUP/RESTORE TECHNIQUES

Database native toolsVendor specific or 3rd party ETLUnderstand database semantics

Filesystem copyLow-level data copyNeed to know what to copy

VM cloningCopies database content + configuration

+ OSUnused space can be compressed

Page 30: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

30

DATABASE SIZESBenchmark DB size Snapshot size VM size

RUBiS

MyISAM no constraint 836MB

844MB 4.1GB

MyISAM w/ constraints 1.1GBMyISAM w/ constraint &

index 1.2GB

InnoDB no constraint 1022MBInnoDB w/ constraints 1.4GB

InnoDB w/ constraint & index 1.5GB

TPC-WPostgreSQL binary dump

684MB210MB

2.1GBPostgreSQL sql dump 314MB

TPC-H scale

1(GB)

PostgreSQL binary dump

1.8GB

307MB 1.1GB (OS) + 2.1GB (data)

PostgreSQL sql dump 1.2GB

TPC-H scale

10(GB)

PostgreSQL binary dump12GB

2.0GB16GB

PostgreSQL sql dump 7.3GB

Page 31: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

31

BACKUP/RESTORE PERFORMANCE (1/3) Performance depends on database content

290

826 843

292

899

5761

293

1141

6017

0

1000

2000

3000

4000

5000

6000

7000

VMcloning

MySQLMyISAM

MySQLInnoDB

VMcloning

MySQLMyISAM

MySQLInnoDB

VMcloning

MySQLMyISAM

MySQLInnoDB

RUBiS no constraint RUBiS w/ constraints & basicindex

RUBiS w/ constraints & full textindex

Tim

e in

sec

onds

VM shutdownVM copyVM cloningVM bootMySQL backupMySQL restore

Page 32: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

32

BACKUP/RESTORE PERFORMANCE (2/3)

65

126

177

288275

0

50

100

150

200

250

300

350

File copy VM cloning direct VM cloning & copy PG bin PG sql

TPC-W

Tim

e in

sec

onds

VM shutdownVM/File copyVM cloningVM bootPostgreSQL stop/startPG bin backupPG bin restorePG sql backupPG sql restore

File copy is the most effective for small databases

Page 33: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

33

BACKUP/RESTORE PERFORMANCE (3/3)

147 189 273

1477 1468

937699

1215

55735256

0

1000

2000

3000

4000

5000

6000

File copy VMcloningdirect

VMcloning& copy

PG bin PG sql File copy VMcloningdirect

VMcloning& copy

PG bin PG sql

TPC-H 1GB TPC-H 10GB

Tim

e in

sec

onds

VM shutdownVM/File copyVM cloningVM bootPostgreSQL stop/startPG bin backupPG bin restorePG sql backupPG sql restore

VM cloning most effective on large databases

Page 34: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

34

BACKUP/RESTORE SUMMARY

Feature DB Backup/Restore

Filesystem Copy

VM Cloning

Database specific knowledge Medium Very high None

Performance Slow Fastest FastSnapshot size Small DB size VM sizeSpawning time

predictability Hard Moderate Easy

Database installation Moderate Moderate NoneDatabase configuration Hard Hard NoneMissing data in transfer Possible Unlikely NoSpawning atomicity No No YesResynchronization

limitations Yes Yes Yes

Page 35: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

35

DOLLY MAIN ALGORITHM Capacity provisioning depends on available snapshots Snapshots scheduled according to capacity demand Decouple capacity provisioning from snapshot scheduling

if (predictor.capacity_changes || predictor.write_workload_changes) { do { schedule = capacity_provisioning(predictions) snapshot_schedule = snapshot_scheduling(predictions) } while (snapshot_schedule schedules new snapshots) scheduler.schedule(snapshot_schedule) scheduler.schedule(capacity_schedule)}if (time since last operation > threshold) { paused_pool_cleaner.release_old_paused_vms(); paused_pool_cleaner.delete_old_snapshots();}

Page 36: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

36

RELEASING RESOURCES

Paused VMs VM never re-used if cost to resume > cost to

spawn from last snapshot Snapshots

Old snapshots can be released based on cost to keep them around

Free server pool Can reclaim servers with paused VMs when pool

is empty

Page 37: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

37

TPC-W EVALUATION Multi-tier online bookstore benchmark 4GB VM for the database Large EC2 instances from EBS volumes with CloudWatch

Operation Private Cloud Public Cloud (EC2)

start VM 42s 220spause VM 26s 30sresume VM 42s 30sbackup (stop/clone) 150s 320srestore (clone/start) 165s 220swmax 149 writes/sec 197 writes/sec

Avg IOs per write 15 13

Page 38: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

38

PHASE 1 – CAPACITY PROVISIONING Pause VMs when capacity decreases Resume VMs when capacity increases Spawn VMs from snapshot when additional capacity

required Not cost effective to pause VMs for less than 1 hour with

EC2Deadline Private Cloud EC2

d1 Pause v4

d2 Pause v3

d3

Resume v3 @ d3-7minResume v4 @ d4-9min

Spawn v5 from s0 @ d3-13min

d4 Pause v1,v2,v3 -

d5

Resume v1,v2,v3 @ d5-6min -

Spawn v6 from s0 @ d5-18min

Page 39: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

39

PHASE 2 – SNAPSHOT SCHEDULING Faster (but more costly) to spawn a new replica and take a

new snapshot from it to spawn 3 replicas at d3 Faster to snapshot paused VM and spawn replica from it for

d5 Cost of volume storage on EC2 more expensive than IO

costDeadline Private Cloud EC2

d3 Spawn replica v5 at d3-18m + snapshot s1 -

d5 Snapshot s2 from paused v1@ d4+1min -

Page 40: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

40

PHASE 3 – CAPACITY PROVISIONING Schedule new replica spawning using new snapshots Next snapshot scheduling does not generate new

snapshots and the algorithm terminates

Deadline Private Cloud

d1 Pause v4

d2 Pause v3

d3

Resume v5 @ d3-1minSpawn v6,v7 from s1 @ d3-3min

d4 Pause v1,v2,v5

d5

Resume v1,v2,v5@ d5-6minSpawn v8 from s2 @ d5-3min

Page 41: DOLLY: Virtualization-Driven Database Provisioning for the Cloud

NE

DB

Sum

mit

– Ja

n 28

, 201

1 –

cecc

het@

cs.u

mas

s.ed

u

41

FINAL SCHEDULING Different scheduling for private and public cloud

Minimize resource (energy) usage for private cloud Minimize cost for for public cloud