DOLLY: Virtualization-Driven Database Provisioning for the Cloud
Emmanuel Cecchet
Joint work with Rahul Singh, Upendra Sharma and Prashant Shenoy
VEE 2011 paper available at: http://lass.cs.umass.edu/papers.html
NEDB Summit – Jan 28, 2011 – cecchet@cs.umass.edu

PROVISIONING IN THE CLOUD
  Virtualization
  No shared disk
  Provisioning based on volume and machine load
[Diagram labels: Internet, frontend/load balancer, app. servers, databases, provisioning logic]

WHY IS IT HARD TO ADD A DB REPLICA?
  Administration issues
    Install/setup new database
    Backup/restore database content
    Configure new replica
    Synchronize live content
  How much time does it take?
    Depends on the database size
    Depends on the backup/restore technique
    From minutes to hours…
  Dolly: VM cloning to blackbox the database

Dolly
  Database replication in the Cloud
  Provisioning with Dolly
  Prototype & Evaluation

SPAWNING A REPLICA
  Multi-master middleware-based replication
  Three main phases
    Backup
    Restore
    Replay
[Diagram labels: client SQL requests; replication middleware (load balancer, transactional log, management console); DB1; new replica DB2. Numbered steps: 1 add replica, 2 snapshot backup, 3 restore, 4 resynchronize]

SPAWNING A REPLICA WITH CLONING
  Backup & restore replaced by VM cloning
[Diagram labels: client SQL requests; replication middleware (load balancer, transactional log, management console); DB1 in VM1; DB2 in VM2; clones of DB2 in VM3. Numbered steps: 1 add replica, 2 clone, 3 clone, 4 resynchronize]

CLONING: BACKUP/RESTORE IN CONSTANT TIME
  Database       DB size on disk   DB backup/restore   Dolly 4GB VM cloning   Dolly 16GB VM cloning
  RUBiS –c–i     1022MB            843s                281s                   899s
  RUBiS +c+bi    1.4GB             5761s               282s                   900s
  RUBiS +c+fi    1.5GB             6017s               280s                   900s
  TPC-W          684MB             288s                275s                   905s
  TPC-H 1GB      1.8GB             1477s               271s                   918s
  TPC-H 10GB     12GB              5573s               n/a                    911s
  Filesystem snapshot/copy is DB agnostic
  Only depends on VM size

Dolly
  Database replication in the Cloud
  Provisioning with Dolly
  Prototype & Evaluation
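
MODELING SPAWNING TIME
  Predictable backup and restore times are required
  Replay time can be estimated from current write throughput ($w_t$)
[Timeline figure: the replica spawning time spans backup ($b_i$), restore ($r_i$) and replay. Updates accumulated during backup take $\frac{w_t}{w_{max}} \cdot b_i$ to replay, updates accumulated during restore take $\frac{w_t}{w_{max}} \cdot r_i$, and updates arriving during replay take a further $\frac{w_t}{w_{max}} \cdot \ldots$]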
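
A minimal sketch of how the spawning-time estimate could be computed. It assumes that updates arrive at a steady rate w_t and are replayed at the maximum write rate w_max, so replay must also absorb the writes that arrive while it runs; under that assumption the total time is (backup + restore) · w_max / (w_max − w_t). The function and parameter names are illustrative and are not taken from Dolly.

```python
def estimate_spawn_time(backup_s: float, restore_s: float,
                        write_rate: float, max_write_rate: float) -> float:
    """Estimate the total replica spawning time in seconds.

    Assumption: writes arrive at `write_rate` (w_t) and are replayed at
    `max_write_rate` (w_max), so the replay phase keeps shrinking the
    backlog at a net rate of w_max - w_t.
    """
    if not 0 <= write_rate < max_write_rate:
        raise ValueError("need 0 <= write_rate < max_write_rate")
    return (backup_s + restore_s) * max_write_rate / (max_write_rate - write_rate)


def estimate_replay_time(backup_s: float, restore_s: float,
                         write_rate: float, max_write_rate: float) -> float:
    """Replay time is whatever remains once backup and restore are done."""
    total = estimate_spawn_time(backup_s, restore_s, write_rate, max_write_rate)
    return total - backup_s - restore_s
```

With no write load the estimate collapses to backup + restore; as w_t approaches w_max the replica never catches up, which is why the sketch rejects that case.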

WHEN TO SNAPSHOT?
  Time to spawn from a live replica
    $s = \frac{w_{max}}{w_{max} - w_t} \, (b_i + r_i)$
  Time to spawn from an existing snapshot
    $s = \frac{w_{max}}{w_{max} - w_t} \, (r_i + replay_i)$
  Faster to take a new snapshot j to spawn a new replica than using old snapshot i if:
    $backup_j + restore_j < restore_i + replay_i$
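
The snapshot decision on this slide can be sketched as a single comparison: take a new snapshot j when backing it up and restoring it is expected to be quicker than restoring the old snapshot i and replaying everything written since. The function name and the sample timings in the usage note are illustrative assumptions.

```python
def new_snapshot_is_faster(backup_j: float, restore_j: float,
                           restore_i: float, replay_i: float) -> bool:
    """Return True when spawning via a fresh snapshot j beats reusing snapshot i.

    All arguments are durations in the same unit (for example seconds).
    `replay_i` grows as snapshot i ages, because more updates must be replayed.
    """
    return backup_j + restore_j < restore_i + replay_i
```

For example, with a 150 s backup and a 165 s restore, a fresh snapshot pays off once the replay backlog of the old snapshot exceeds 150 s: `new_snapshot_is_faster(150, 165, 165, 400)` is true, while `new_snapshot_is_faster(150, 165, 165, 100)` is false.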

DOLLY OVERVIEW
  Input
    capacity prediction
    write prediction
  Output
    schedule of snapshots
    schedule of replica spawning
    admission control if needed
[Architecture diagram labels: Predictors (capacity predictions, write predictions); Dolly (Capacity Provisioning, Spawning options, Snapshot scheduler, Admission Control, Scheduler, Paused pool cleaner, Free pool manager, HA adjuster, Write throttling); Monitoring; Management API. Operations shown: start/stop, clone/snapshot, delete VM/snapshot, reclaim, write throttling/read throttling]

PROVISIONING REPLICAS
[Figure labels: workload prediction, capacity prediction, write prediction]
  Dolly does not provide predictors
  Dolly can work with any predictor (see [Eurosys09])

CLOUD COST FUNCTIONS
  Adapt the provisioning decisions to the cloud platform specifics
  Cost can be $ on public cloud or time on private cloud

  Cost function name               Definition
  pause_cost(VM, t)                cost of pausing VM at time t
  spawn_cost(s, t, d)              cost to spawn a replica from snapshot s at time t to meet deadline d
  spawn_cost(VM, t, d)             cost to spawn a replica from a paused VM at time t to meet deadline d
  running_cost(VM,t1,t2)           cost to run a VM from time t1 to time t2
  pause_resume_cost(VM, t1, t2)    cost to pause a VM at time t1 and resume it at time t2
  backup_paused_cost(VM)           cost to backup a paused VM
  backup_live_cost(VM, t)          cost to backup an active VM at time t

PROVISIONING REPLICAS
  Parse capacity provisioning predictions
  Decrease capacity by pausing VMs
  Increasing capacity
    Check if we can reuse a paused VM
    Check if we can spawn from an existing snapshot
    Choose cheapest options according to spawn_cost function
  Perform admission control if all replicas cannot be provisioned in time
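
The capacity-increase step above can be sketched as picking the cheapest option that still meets the deadline, and falling back to admission control when none does. This is only an illustration: `SpawnOption`, `lead_time` and `choose_cheapest` are hypothetical names, and the cost values stand in for whatever the platform's spawn_cost function returns.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SpawnOption:
    """One way to bring a replica online (hypothetical structure)."""
    label: str        # e.g. "resume paused VM v3" or "spawn from snapshot s0"
    lead_time: float  # minutes needed before the replica can serve requests
    cost: float       # value returned by the platform's spawn_cost function


def choose_cheapest(options: Sequence[SpawnOption], now: float,
                    deadline: float) -> Optional[SpawnOption]:
    """Return the cheapest option ready by `deadline`, or None.

    A None result means no replica can be provisioned in time, which is
    the case where admission control would have to step in.
    """
    feasible = [o for o in options if now + o.lead_time <= deadline]
    if not feasible:
        return None
    return min(feasible, key=lambda o: o.cost)
```

Filtering on the deadline before minimizing cost keeps the two concerns separate: a cheap option that arrives late is never preferred over a more expensive one that arrives on time.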

SNAPSHOT SCHEDULING
  How to snapshot?
    Clone a paused VM
    Pause an active VM to clone it
  When to snapshot?
    At time j when $backup_j + restore_j < restore_i + replay_i$
    If new snapshot is scheduled, re-run capacity provisioning
  Prediction window must have minimum size

Dolly
  Database replication in the Cloud
  Provisioning with Dolly
  Prototype & Evaluation

IMPLEMENTATION
  C-JDBC/Sequoia replication middleware
  OpenNebula Cloud management middleware
  Cost functions
    private cloud: minimize resource utilization time
    Amazon EC2: minimize cost
[Architecture diagram labels: TPC-W load injector; Sequoia driver; load balancer; Sequoia controller (scheduler, recovery log with log table and dump table, backupers, JMX management API); DB1, DB2, DB3 in VM1, VM2, VM3; new replicas in VM4 and VM5; DB3 snapshot VM clone; backup server or NAS; OpenNebula; Dolly (Private, EC2). Arrow labels: SQL requests; predictions; add/remove replica, snapshot/pause/…; admission control; write throttling; start/stop/clone/…; clone]

IMPLEMENTATION – COST FUNCTIONS
  Private cloud: minimize resource utilization
  Amazon EC2: minimize cost

  pause_cost(VM, t)
    Private cloud: return 1/VM->machine->temp
    EC2:           return 60-((t-VM->start)%60)
  spawn_cost(s, t, d)
    Private cloud: return d-t
    EC2:           comp$=(d-t)/60*hour$
                   io$=EBS_storage$*s->size + EBS_io$*(s->restore_io+s->replay_io)
                   return comp$+io$
  spawn_cost(VM, t, d)
    Private cloud: return d-t
    EC2:           comp$=(d-t)/60*hour$
                   io$=EBS_io$*(s->resume_io+s->replay_io)
                   return comp$+io$
  running_cost(VM,t1,t2)
    Private cloud: return 1
    EC2:           return (t2-t1)/60*hour$
  pause_resume_cost(VM, t1, t2)
    Private cloud: if (t2-t1 > VM->pause + VM->resume) return 0 else return 2
    EC2:           io$=EBS_io$*(VM->pause_io+VM->resume_io)
                   comp$=(60-(VM->stop-VM->start)%60)/60*hour$
                   return io$+comp$
  backup_paused_cost(VM)
    Private cloud: return backup_time
    EC2:           return S3_storage$*s->size
  backup_live_cost(VM, t)
    Private cloud: return VM->pause + backup_time + VM->resume
    EC2:           return pause_cost(VM, t)$ + S3_storage$*s->size + (VM->stop_io+VM->start_io)*EBS_io$

WORKLOAD DESCRIPTION
  TPC-W online bookstore e-commerce benchmark
  Snapshot s0 available at t0

Overprovisioning with 6 replicas – 1h snapshot
                   Cost     MRM
  Private cloud    720m     0
  Amazon EC2       $8.39    0

Reactive provisioning – 2h snapshot
                   Cost     MRM
  Private cloud    410m     42.1
  Amazon EC2       $4.61    41.5
[Figure annotations: replica spawning triggered here; replicas available]

Dolly – 30m Prediction Window
                   Cost     MRM
  Private cloud    352m     0
  Amazon EC2       $3.73    0
[Figure panels: Private Cloud, Amazon EC2. Annotations: s1, s2; cheaper to leave instances online]

CONCLUSION
  VM cloning
    Solves administration issues by blackboxing the database
    Constant time backup/restore needed to predict replica spawning time
  New provisioning algorithm
    Decouples capacity provisioning from snapshot scheduling
    Cost functions to optimize for cloud platform specifics
  VEE 2011 paper available at: http://lass.cs.umass.edu/papers.html

BONUS SLIDES

Reactive provisioning – 15m snapshot
                   Cost       MRM
  Private cloud    381m42s    17.5
  Amazon EC2       $18.29     27.2

Reactive provisioning – 1h snapshot
                   Cost       MRM
  Private cloud    360m30s    25.8
  Amazon EC2       $5.00      33.7

Dolly – 10m Prediction Window
                   Cost       MRM
  Private cloud    381m54s    0
  Amazon EC2       $7.16      0
[Figure panels: Private Cloud, Amazon EC2. Annotations on each panel: s1, s2]

SPAWNING IN A PUBLIC CLOUD
  Storage decoupled from computing resource
  Starting a new instance clones the volume
[Diagram labels: instances (OS + DB1 or DB2) attached to volumes Vol1, Vol2, Vol3, Vol4. Operations shown: stop, snapshot, register, restart, start]

BACKUP/RESTORE TECHNIQUES
  Database native tools
    Vendor specific or 3rd party ETL
    Understand database semantics
  Filesystem copy
    Low-level data copy
    Need to know what to copy
  VM cloning
    Copies database content + configuration + OS
    Unused space can be compressed

DATABASE SIZES
  Benchmark             Configuration                   DB size   Snapshot size   VM size
  RUBiS                 MyISAM no constraint            836MB     844MB           4.1GB
                        MyISAM w/ constraints           1.1GB
                        MyISAM w/ constraint & index    1.2GB
                        InnoDB no constraint            1022MB
                        InnoDB w/ constraints           1.4GB
                        InnoDB w/ constraint & index    1.5GB
  TPC-W                 PostgreSQL binary dump          684MB     210MB           2.1GB
                        PostgreSQL sql dump                       314MB
  TPC-H scale 1(GB)     PostgreSQL binary dump          1.8GB     307MB           1.1GB (OS) + 2.1GB (data)
                        PostgreSQL sql dump                       1.2GB
  TPC-H scale 10(GB)    PostgreSQL binary dump          12GB      2.0GB           16GB
                        PostgreSQL sql dump                       7.3GB

BACKUP/RESTORE PERFORMANCE (1/3)
  Performance depends on database content
[Bar chart: time in seconds (axis 0 to 7000). Legend: VM shutdown, VM copy, VM cloning, VM boot, MySQL backup, MySQL restore]
  Bar labels (seconds)                       VM cloning   MySQL MyISAM   MySQL InnoDB
  RUBiS no constraint                        290          826            843
  RUBiS w/ constraints & basic index         292          899            5761
  RUBiS w/ constraints & full text index     293          1141           6017

BACKUP/RESTORE PERFORMANCE (2/3)
[Bar chart, TPC-W: time in seconds (axis 0 to 350). Legend: VM shutdown, VM/File copy, VM cloning, VM boot, PostgreSQL stop/start, PG bin backup, PG bin restore, PG sql backup, PG sql restore]
  Bar labels (seconds)   File copy   VM cloning direct   VM cloning & copy   PG bin   PG sql
  TPC-W                  65          126                 177                 288      275
  File copy is the most effective for small databases

BACKUP/RESTORE PERFORMANCE (3/3)
[Bar chart, TPC-H: time in seconds (axis 0 to 6000). Legend: VM shutdown, VM/File copy, VM cloning, VM boot, PostgreSQL stop/start, PG bin backup, PG bin restore, PG sql backup, PG sql restore]
  Bar labels (seconds)   File copy   VM cloning direct   VM cloning & copy   PG bin   PG sql
  TPC-H 1GB              147         189                 273                 1477     1468
  TPC-H 10GB             937         699                 1215                5573     5256
  VM cloning most effective on large databases

BACKUP/RESTORE SUMMARY
  Feature                         DB Backup/Restore   Filesystem Copy   VM Cloning
  Database specific knowledge     Medium              Very high         None
  Performance                     Slow                Fastest           Fast
  Snapshot size                   Small               DB size           VM size
  Spawning time predictability    Hard                Moderate          Easy
  Database installation           Moderate            Moderate          None
  Database configuration          Hard                Hard              None
  Missing data in transfer        Possible            Unlikely          No
  Spawning atomicity              No                  No                Yes
  Resynchronization limitations   Yes                 Yes               Yes

DOLLY MAIN ALGORITHM
  Capacity provisioning depends on available snapshots
  Snapshots scheduled according to capacity demand
  Decouple capacity provisioning from snapshot scheduling

  if (predictor.capacity_changes || predictor.write_workload_changes) {
    // Iterate until snapshot scheduling stops adding new snapshots
    do {
      capacity_schedule = capacity_provisioning(predictions)
      snapshot_schedule = snapshot_scheduling(predictions)
    } while (snapshot_schedule schedules new snapshots)
    scheduler.schedule(snapshot_schedule)
    scheduler.schedule(capacity_schedule)
  }
  // Release paused VMs and snapshots that are no longer worth keeping
  if (time since last operation > threshold) {
    paused_pool_cleaner.release_old_paused_vms();
    paused_pool_cleaner.delete_old_snapshots();
  }
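
The iteration between capacity provisioning and snapshot scheduling can be sketched as a fixed-point loop. This is an illustrative sketch, not Dolly's code: the two planning steps are passed in as callables, the explicit `snapshots` list and the `max_rounds` safety bound are assumptions added so the example is self-contained and guaranteed to stop.

```python
from typing import Callable, List, Tuple


def plan(predictions: object,
         capacity_provisioning: Callable[[object, List[str]], object],
         snapshot_scheduling: Callable[[object, List[str]], List[str]],
         max_rounds: int = 10) -> Tuple[object, List[str]]:
    """Alternate the two planning steps until no new snapshot is scheduled.

    Returns the final capacity schedule and the list of scheduled snapshots.
    """
    snapshots: List[str] = []
    for _ in range(max_rounds):
        # Capacity planning sees every snapshot scheduled so far.
        capacity_schedule = capacity_provisioning(predictions, snapshots)
        new_snapshots = snapshot_scheduling(predictions, snapshots)
        if not new_snapshots:
            return capacity_schedule, snapshots
        # New snapshots change the spawning options, so plan again.
        snapshots = snapshots + new_snapshots
    raise RuntimeError("planning did not converge within max_rounds")
```

The loop ends on the first round in which snapshot scheduling adds nothing, at which point the capacity schedule already reflects every scheduled snapshot.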

RELEASING RESOURCES
  Paused VMs
    VM never re-used if cost to resume > cost to spawn from last snapshot
  Snapshots
    Old snapshots can be released based on cost to keep them around
  Free server pool
    Can reclaim servers with paused VMs when pool is empty

TPC-W EVALUATION
  Multi-tier online bookstore benchmark
  4GB VM for the database
  Large EC2 instances from EBS volumes with CloudWatch

  Operation                Private Cloud     Public Cloud (EC2)
  start VM                 42s               220s
  pause VM                 26s               30s
  resume VM                42s               30s
  backup (stop/clone)      150s              320s
  restore (clone/start)    165s              220s
  wmax                     149 writes/sec    197 writes/sec
  Avg IOs per write        15                13

PHASE 1 – CAPACITY PROVISIONING
  Pause VMs when capacity decreases
  Resume VMs when capacity increases
  Spawn VMs from snapshot when additional capacity required
  Not cost effective to pause VMs for less than 1 hour with EC2

  Deadline   Private Cloud   EC2
  d1         Pause v4
  d2         Pause v3
  d3         Resume v3 @ d3-7min
             Resume v4 @ d4-9min
             Spawn v5 from s0 @ d3-13min
  d4         Pause v1,v2,v3   -
  d5         Resume v1,v2,v3 @ d5-6min   -
             Spawn v6 from s0 @ d5-18min

PHASE 2 – SNAPSHOT SCHEDULING
  Faster (but more costly) to spawn a new replica and take a new snapshot from it to spawn 3 replicas at d3
  Faster to snapshot paused VM and spawn replica from it for d5
  Cost of volume storage on EC2 more expensive than IO cost

  Deadline   Private Cloud                               EC2
  d3         Spawn replica v5 at d3-18m + snapshot s1    -
  d5         Snapshot s2 from paused v1 @ d4+1min        -

PHASE 3 – CAPACITY PROVISIONING
  Schedule new replica spawning using new snapshots
  Next snapshot scheduling does not generate new snapshots and the algorithm terminates

  Deadline   Private Cloud
  d1         Pause v4
  d2         Pause v3
  d3         Resume v5 @ d3-1min
             Spawn v6,v7 from s1 @ d3-3min
  d4         Pause v1,v2,v5
  d5         Resume v1,v2,v5 @ d5-6min
             Spawn v8 from s2 @ d5-3min

FINAL SCHEDULING
  Different scheduling for private and public cloud
    Minimize resource (energy) usage for private cloud
    Minimize cost for public cloud