Cellular Disco: Resource management using virtual clusters on shared-memory multiprocessors Kinshuk Govil, Dan Teodosiu*, Yongqiang Huang, and Mendel Rosenblum Computer Systems Laboratory, Stanford University * Xift, Inc., Palo Alto, CA www-flash.stanford.edu


Page 1: Cellular Disco : Resource management using virtual clusters on shared-memory multiprocessors

Cellular Disco: Resource management using virtual clusters on shared-memory multiprocessors

Kinshuk Govil, Dan Teodosiu*, Yongqiang Huang, and Mendel Rosenblum

Computer Systems Laboratory, Stanford University
* Xift, Inc., Palo Alto, CA

www-flash.stanford.edu

Page 2: Motivation

• Why buy a large shared-memory machine?
– Performance, flexibility, manageability, show-off
• These machines are not being used at their full potential
– Operating system scalability bottlenecks
– No fault containment support
– Lack of scalable resource management
• Operating systems are too large to adapt

Page 3: Previous approaches

• Operating system: Hive, SGI IRIX 6.4, 6.5
+ Knowledge of application resource needs
– Huge implementation cost (a few million lines)
• Hardware: static and dynamic partitioning
+ Cluster-like (fault containment)
– Inefficient, granularity, OS changes, large apps
• Virtual machine monitor: Disco
+ Low implementation cost (13K lines of code)
– Cost of virtualization

Page 4: Questions

• Can virtualization overhead be kept low?
– Usually within 10%
• Can fault containment overhead be kept low?
– In the noise
• Can a virtual machine monitor manage resources as well as an operating system?
– Yes

Page 5: Overview of virtual machines

• IBM 1960s
• Trap privileged instructions
• Physical to machine address mapping
• No/minor OS modifications

[Diagram: two virtual machines, each running an OS and an application, on top of the virtual machine monitor, which runs on the hardware]
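The "physical to machine address mapping" bullet can be illustrated with a small sketch. It assumes a per-VM table that maps each guest "physical" page to a real machine page, filled in on first use. The names `Monitor`, `VirtualMachine`, `pmap`, and the 4 KB page size are hypothetical and chosen for illustration; they are not taken from the slides.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


class VirtualMachine:
    """A guest whose OS believes it owns physical memory starting at 0."""

    def __init__(self, name):
        self.name = name
        self.pmap = {}  # guest-physical page number -> machine page number


class Monitor:
    """Toy monitor that hands out machine pages on first touch."""

    def __init__(self, machine_pages):
        self.free = list(range(machine_pages))  # unallocated machine pages

    def translate(self, vm, phys_addr):
        """Map a guest-physical address to a machine address."""
        ppn, offset = divmod(phys_addr, PAGE_SIZE)
        if ppn not in vm.pmap:
            if not self.free:
                raise MemoryError("no free machine pages")
            vm.pmap[ppn] = self.free.pop(0)  # allocate backing page lazily
        return vm.pmap[ppn] * PAGE_SIZE + offset
```

Two guests can both use physical address 0x10 without conflict, because each one's page 0 is backed by a different machine page. This extra level of indirection is what lets a monitor place and move a VM's memory without the guest OS being aware of it.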

Page 6: Avoiding OS scalability bottlenecks

[Diagram: virtual machines, each running an operating system and applications, on top of Cellular Disco, which runs on the CPUs and interconnect of a 32-processor SGI Origin 2000]

Page 7: Experimental setup

• Workloads
– Informix TPC-D (decision support database)
– Kernel build (parallel compilation of IRIX 5.3)
– Raytrace (from Stanford Splash suite)
– SpecWEB (Apache web server)

[Diagram: IRIX 6.4 on a 32P Origin 2000 vs. IRIX 6.2 on Cellular Disco on a 32P Origin 2000]

Page 8: MP virtualization overheads

[Chart: 32-processor overheads, normalized execution time of IRIX vs. Cellular Disco. Cellular Disco overhead: Informix TPC-D +10%, Kernel build +20%, Raytrace +1%, SpecWEB +4%]

• Worst case uniprocessor overhead only 9%

Page 9: Fault containment

[Diagram: VMs running on Cellular Disco, on top of CPUs connected by the interconnect]

• Requires hardware support as designed in FLASH multiprocessor

Page 10: Fault containment overhead 0%

• 1000 fault injection experiments (SimOS): 100% success

[Chart: normalized execution time, 1 cell vs. 8 cells. Overhead with 8 cells: Informix TPC-D +1%, Kernel build -2%, Raytrace +1%, SpecWEB +1%]

Page 11: Resource management challenges

• Conflicting constraints
– Fault containment
– Resource load balancing
• Scalability
• Decentralized control
• Migrate VMs without OS support

Page 12: CPU load balancing

[Diagram: VMs running on Cellular Disco, on top of CPUs connected by the interconnect]

Page 13: Idle balancer (local view)

• Check neighboring run queues (intra-cell only)
• VCPU migration cost: 37 µs to 1.5 ms
– Cache and node memory affinity: > 8 ms
• Backoff
• Fast, local

[Diagram: VCPUs A0, A1, B0, B1 placed on CPU 0 through CPU 3]
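The idle balancer bullets (check neighboring run queues, stay within the cell, back off) can be sketched as follows. This is a minimal illustration under stated assumptions, not the monitor's actual algorithm: the class name `IdleBalancer`, the rule "steal only from a queue holding more than one VCPU", and the wait intervals are all hypothetical.

```python
class IdleBalancer:
    """Runs on an idle CPU and tries to pull work from nearby run queues."""

    def __init__(self, cpu, min_wait_us=50, max_wait_us=1600):
        self.cpu = cpu
        self.min_wait_us = min_wait_us  # assumed initial retry interval
        self.max_wait_us = max_wait_us  # assumed cap on the retry interval
        self.wait_us = min_wait_us

    def try_steal(self, run_queues, cell_of, neighbors):
        """Move one VCPU to this CPU, or back off if none is available."""
        for other in neighbors:
            if cell_of[other] != cell_of[self.cpu]:
                continue  # intra-cell only: never look across a cell boundary
            queue = run_queues[other]
            if len(queue) > 1:  # leave the neighbor at least one VCPU to run
                vcpu = queue.pop()
                run_queues[self.cpu].append(vcpu)
                self.wait_us = self.min_wait_us  # success resets the backoff
                return vcpu
        # Nothing to steal: wait longer before the next attempt.
        self.wait_us = min(self.wait_us * 2, self.max_wait_us)
        return None
```

Restricting the search to neighbors in the same cell keeps each check cheap and avoids creating new cross-cell fault dependencies, which matches the "fast, local" characterization on the slide.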

Page 14: Periodic balancer (global view)

• Check for disparity in load tree
• Cost
– Affinity loss
– Fault dependencies

[Diagram: load tree over CPU 0 through CPU 3 with per-CPU loads 1, 0, 2, 1, subtree loads 1 and 3, and root load 4; VCPUs A0, A1, B0, B1; a fault containment boundary is marked]
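The load tree in the diagram (leaf loads 1, 0, 2, 1; subtree loads 1 and 3; root 4) can be reproduced with a short sketch. The functions `build_load_tree` and `find_disparity` and the disparity threshold are hypothetical, and the sketch covers only the detection step; weighing affinity loss and fault dependencies before migrating is left out.

```python
def build_load_tree(leaf_loads):
    """Return tree levels, from per-CPU loads up to the single root sum."""
    levels = [list(leaf_loads)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        # Each parent holds the combined load of its (up to) two children.
        levels.append([sum(prev[i:i + 2]) for i in range(0, len(prev), 2)])
    return levels


def find_disparity(levels, threshold=2):
    """Scan top-down for the first sibling pair whose loads differ by
    at least `threshold`; return (level, left_index, right_index) or None."""
    for depth in range(len(levels) - 2, -1, -1):
        row = levels[depth]
        for i in range(0, len(row) - 1, 2):
            if abs(row[i] - row[i + 1]) >= threshold:
                return depth, i, i + 1
    return None


levels = build_load_tree([1, 0, 2, 1])
print(levels)
print(find_disparity(levels))
```

With the loads from the slide, the two halves of the machine carry loads 1 and 3, so the imbalance is visible one level below the root even though no pair of adjacent CPUs differs by 2 or more.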

Page 15: CPU management results

[Chart: normalized execution time, Ideal vs. Cellular Disco, for One VM, Four VMs, and Eight VMs; labeled Cellular Disco overheads: +0.3%, +9%]

• IRIX overhead (13%) is higher

Page 16: Memory load balancing

[Diagram: VMs running on Cellular Disco, on top of RAM modules connected by the interconnect]

Page 17: Memory load balancing policy

• Borrow memory before running out
• Allocation preferences for each VM
• Borrow based on:
– Combined allocation preferences of VMs
– Memory availability on other cells
– Memory usage
• Loan when enough memory available
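The borrowing policy can be sketched as a pair of thresholds: a cell asks for memory once its free pages drop below a borrow threshold, and another cell lends only what it holds above a reserve. The function names, the threshold parameters, the fixed chunk size, and the way preferences are ranked are assumptions for illustration; the slide does not specify them.

```python
def choose_lender(borrower, free_pages, preferences, lend_reserve):
    """Pick the cell to borrow from: highest combined VM preference first,
    then most free memory. Cells at or below the reserve do not lend."""
    candidates = [
        cell for cell, free in free_pages.items()
        if cell != borrower and free > lend_reserve
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda c: (preferences.get(c, 0), free_pages[c]))


def balance_memory(borrower, free_pages, preferences,
                   borrow_threshold, lend_reserve, chunk):
    """Borrow up to `chunk` pages before the borrower actually runs out.
    Returns (lender, pages_moved), or None if no borrowing happened."""
    if free_pages[borrower] >= borrow_threshold:
        return None  # still comfortable, nothing to do
    lender = choose_lender(borrower, free_pages, preferences, lend_reserve)
    if lender is None:
        return None
    amount = min(chunk, free_pages[lender] - lend_reserve)
    free_pages[lender] -= amount
    free_pages[borrower] += amount
    return lender, amount
```

For example, with `free_pages = {0: 10, 1: 500, 2: 800}` and preferences `{1: 3, 2: 1}`, cell 0 borrows from cell 1 even though cell 2 has more free memory, because the VMs' combined preference for cell 1 is higher.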

Page 18: Memory management results

• Ideally: same time if perfect memory balancing

[Diagram: database (DB) on Cellular Disco in two configurations: eight cells, each labeled 4, and a single cell with 32 CPUs and 3.5 GB]

Only +1% overhead

Page 19: Comparison to related work

• Operating system (IRIX 6.4)
• Hardware partitioning
– Simulated by disabling inter-cell resource balancing

[Diagram: Cellular Disco on four 8-CPU cells connected by the interconnect, running TPC-D and a 16-process Raytrace]

Page 20: Results of comparison

• CPU utilization: 31% (HW) vs. 58% (VC)

Time (seconds)
                        Raytrace  Database
Operating system        216       231
Virtual clusters        221       229
Hardware partitioning   434       325
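The chart's data labels can be turned into relative slowdowns with a few lines of arithmetic. The pairing of labels with series below is an assumption: it reads the six labels in series order (operating system, virtual clusters, hardware partitioning), which is not stated explicitly on the slide.

```python
# Times in seconds from the chart's data labels. The assignment of each
# label to a series is an assumption (labels read in series order).
times = {
    "Raytrace": {"os": 216, "vc": 221, "hw": 434},
    "Database": {"os": 231, "vc": 229, "hw": 325},
}


def ratio(workload, numerator, denominator):
    """Execution time of one approach relative to another."""
    row = times[workload]
    return row[numerator] / row[denominator]


for workload in times:
    vc_vs_os = ratio(workload, "vc", "os")
    hw_vs_vc = ratio(workload, "hw", "vc")
    print(f"{workload}: VC/OS = {vc_vs_os:.3f}, HW/VC = {hw_vs_vc:.3f}")
```

Under that reading, virtual clusters stay within a few percent of the operating system on both workloads, while hardware partitioning takes roughly twice as long on Raytrace, consistent with a 16-process job confined to an 8-CPU partition.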

Page 21: Conclusions

• Virtual machine approach adds flexibility to system at a low development cost
• Virtual clusters address the needs of large shared-memory multiprocessors
– Avoid operating system scalability bottlenecks
– Support fault containment
– Provide scalable resource management
– Small overheads and low implementation cost