Cellular Disco: resource management using virtual clusters on shared memory multiprocessors
Published in ACM 1999 by K. Govil, D. Teodosiu, Y. Huang, M. Rosenblum.
Presenter: Soumya Eachempati

Motivation

• Large-scale shared-memory multiprocessors
  – Large number of CPUs (32-128)
  – NUMA architectures
• Off-the-shelf OS not scalable
  – Cannot handle large number of resources
  – Memory management not optimized for NUMA
  – No fault containment

Existing Solutions

• Hardware partitioning
  – Provides fault containment
  – Rigid resource allocation
  – Low resource utilization
  – Cannot dynamically adapt to workload
• New operating system
  – Provides flexibility and efficient resource management
  – Considerable effort and time

Goal: To exploit hardware resources to the fullest with minimal effort while improving flexibility and fault-tolerance.

Solution: DISCO (VMM)

  – Virtual machine monitor
  – Addresses NUMA awareness issues and scalability

Issues not dealt with by DISCO:
  – Hardware fault tolerance/containment
  – Resource management policies

Cellular DISCO

• Approach: Convert Multiprocessor machine into a Virtual Cluster

• Advantages:
  – Inherits the benefits of DISCO
  – Can support legacy OS transparently
  – Combines the strengths of hardware partitioning and a new OS
  – Provides fault containment
  – Fine-grained resource sharing
  – Less effort than developing an OS

Cellular DISCO

• Internally structured into semi-independent cells.

• Much less development effort compared to HIVE

• No performance loss - with fault containment.

Warranted design decision: the Cellular DISCO code is assumed to be correct (it is treated as a trusted layer).

Cellular Disco Architecture

Resource Management

• Over-commits resources
• Gives flexibility to adjust the fraction of resources assigned to a VM.
• Restrictions on resource allocation due to fault containment.
• Both CPU and memory load balancing under constraints.
  – Scalability
  – Fault containment
  – Avoid contention
• First-touch allocation, dynamic migration, replication of hot memory pages

Hardware Virtualization

• VM's interface mimics the underlying H/W.
• Virtual machine resources (user-defined)
  – VCPUs, memory, I/O devices (physical)
• Physical vs. machine resources (allocated dynamically - priority of VM)
  – VCPUs - CPUs
  – Physical - machine pages
• VMM intercepts privileged instructions
  – 3 modes - user & supervisor (guest OS), kernel (VMM).
  – In supervisor mode, all memory accesses are mapped.
• Allocates machine memory to back the physical memory.
• Pmap and memmap data structures.
• Second-level software TLB (L2TLB).
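
The physical-to-machine mapping above can be sketched in a few lines. This is only an illustration, under the assumption that the pmap records which machine page backs each physical page of a VM and that the memmap keeps the reverse information per machine page. The class name `MemoryMaps`, its fields, and the `translate` method are hypothetical and are not taken from the Cellular Disco code.

```python
class MemoryMaps:
    """Toy model of pmap/memmap bookkeeping (names are illustrative only)."""

    def __init__(self, num_machine_pages):
        self.free = list(range(num_machine_pages))  # unallocated machine pages
        self.pmap = {}    # (vm_id, physical page) -> machine page
        self.memmap = {}  # machine page -> set of (vm_id, physical page) users

    def translate(self, vm_id, ppn):
        """Return the machine page backing a VM's physical page,
        allocating one on first touch."""
        key = (vm_id, ppn)
        if key not in self.pmap:
            if not self.free:
                raise MemoryError("no free machine pages")
            mpn = self.free.pop(0)
            self.pmap[key] = mpn
            self.memmap.setdefault(mpn, set()).add(key)
        return self.pmap[key]


maps = MemoryMaps(num_machine_pages=4)
first = maps.translate("vm0", 7)
again = maps.translate("vm0", 7)   # same physical page, same machine page
other = maps.translate("vm1", 7)   # another VM gets its own machine page
```

Two VMs can use the same physical page number without conflict because the monitor, not the guest OS, decides which machine page backs it.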

Hardware fault containment

• VMM - software fault containment.
• Cell
• Inter-cell communication
  – Inter-processor RPC
  – Messages - no need for locking since serialized.
  – Shared memory for some data structures (pmap, memmap).
  – Low latency, exactly-once semantics
• Trusted system software layer - enables us to use shared memory.

Implementation 1: MIPS R10000

• 32-processor SGI Origin 2000
• Piggybacked on IRIX 6.4 (host OS)
• Guest OS - IRIX 6.2
• Spawns Cellular DISCO (CD) as a multi-threaded kernel process.
  – Additional overhead < 2% (time spent in host IRIX)
  – No fault isolation: IRIX kernel is monolithic
• Solution: Some host OS support needed - one copy of host OS per cell.

I/O Request Execution

• Cellular Disco piggybacked on IRIX kernel

32 - MIPS R10000

Characteristics of workloads

• Database - decision support workload
• Pmake - I/O-intensive workload
• Raytrace - CPU-intensive workload
• Web - kernel-intensive web-server workload

Virtualization Overheads

Fault-containment Overheads

Left bar - single cell config

Right bar - 8 cell system.

CPU Management

• Load balancing mechanisms:
  – Three types of VCPU migrations - intra-node, inter-node, inter-cell.
  – Intra-node - loss of CPU cache affinity
  – Inter-node - cost of copying L2TLB, higher long-term cost.
  – Inter-cell - loss of both cache and node affinity, increases fault vulnerability.
• Alleviates penalty by replicating pages.
• Load balancing policies - idle (local load stealer) and periodic (global redistribution) balancers.
• Each CPU has local run queue of VCPUs.
• Gang scheduling
  – Run all VCPUs of a VM simultaneously.
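
The three migration types listed above differ only in how far the VCPU moves in the machine topology. A minimal sketch of that classification follows, assuming each CPU can be described by a (cell, node, cpu) triple. The `CpuLocation` type and `migration_kind` function are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class CpuLocation(NamedTuple):
    cell: int  # fault-containment cell the CPU belongs to
    node: int  # NUMA node within the machine
    cpu: int   # CPU index within the node


def migration_kind(src: CpuLocation, dst: CpuLocation) -> str:
    """Classify a VCPU move by the widest boundary it crosses."""
    if src == dst:
        return "none"
    if src.cell != dst.cell:
        return "inter-cell"   # loses cache and node affinity, adds fault exposure
    if src.node != dst.node:
        return "inter-node"   # L2TLB must be copied, higher long-term cost
    return "intra-node"       # only CPU cache affinity is lost


kind = migration_kind(CpuLocation(0, 0, 0), CpuLocation(0, 1, 0))
```

A balancer built on this idea would prefer the cheapest kind of move that still fixes the imbalance.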

Load Balancing

• Low-contention distributed data structure - load tree.
• Contention on higher-level nodes
• Each VCPU has a list of the cells it is vulnerable to.
• Heavily loaded system - idle balancer not enough
• Local periodic balancer for each 8-CPU region.

CPU Scheduling and Results

• Scheduling - highest-priority gang runnable VCPU that has been waiting. Sends out RPC.

• 3 configs on 32 processors:
  a) One VM - 8 VCPUs - 8-process raytrace.
  b) 4 VMs
  c) 8 VMs (total of 64 VCPUs).

• Pmap migrated only when all VCPUs are migrated out of a cell.

• Data pages also migrated for independence

Memory Management

• Each cell has its own freelist of pages indexed by the home node.
• Page allocation request
  – Satisfied from local node
  – Else satisfied from same cell
  – Else borrowed from another cell
• Memory balancing
  – Low memory threshold for borrowing and lending
  – Each VM has priority list of lender cells
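
The three-step fallback for a page allocation request can be sketched as follows. This is a simplified illustration that ignores the low-memory thresholds. The `allocate_page` function, its arguments, and the nested-dictionary freelist layout are assumptions made for the example.

```python
def allocate_page(freelists, cell, node, lenders):
    """Try the local node, then other nodes in the same cell, then
    lender cells in priority order.

    freelists: {cell: {home_node: [free machine pages]}}
    lenders:   lender cells, highest priority first
    Returns (page, source), or (None, "none") if nothing is free.
    """
    own = freelists[cell]
    if own.get(node):
        return own[node].pop(), "local node"
    for pages in own.values():
        if pages:
            return pages.pop(), "same cell"
    for lender in lenders:
        for pages in freelists[lender].values():
            if pages:
                return pages.pop(), "borrowed"
    return None, "none"


freelists = {
    "cell0": {0: [], 1: [11]},
    "cell1": {2: [20, 21]},
}
page, source = allocate_page(freelists, "cell0", 0, lenders=["cell1"])
```

Borrowing comes last because a page that lives in another cell makes the VM vulnerable to faults in that cell.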

Memory Paging

• Page replacement
  – Second-chance FIFO
• Avoids double paging overheads.
• Tracking used pages
  – Use annotated OS routines
• Page sharing
  – Explicit marking of shared pages
• Redundant paging
  – Avoided by trapping every access to the virtual paging disk
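
For readers unfamiliar with second-chance FIFO, here is a minimal sketch of the textbook policy. It is not the Cellular Disco implementation, and the `SecondChanceFIFO` class and its method names are illustrative. Pages are evicted in arrival order, except that a page referenced since it was last considered has its bit cleared and goes to the back of the queue instead.

```python
from collections import deque


class SecondChanceFIFO:
    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()   # resident pages, oldest first
        self.referenced = {}   # page -> reference bit

    def access(self, page):
        """Touch a page; return the evicted page, or None if none was evicted."""
        if page in self.referenced:
            self.referenced[page] = True   # hit: mark as recently used
            return None
        victim = None
        if len(self.queue) >= self.capacity:
            victim = self._evict()
        self.queue.append(page)
        self.referenced[page] = False      # new pages start unreferenced
        return victim

    def _evict(self):
        while True:
            page = self.queue.popleft()
            if self.referenced[page]:
                # Second chance: clear the bit and requeue at the back.
                self.referenced[page] = False
                self.queue.append(page)
            else:
                del self.referenced[page]
                return page


cache = SecondChanceFIFO(capacity=3)
for p in (1, 2, 3):
    cache.access(p)
cache.access(1)            # page 1 is now marked referenced
evicted = cache.access(4)  # page 1 is spared, so page 2 is evicted
```

The loop in `_evict` always terminates, since each requeued page has its bit cleared and will be evictable on the next pass.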

Implementation 2: FLASH Simulation

• FLASH has hardware fault recovery support

• Simulation of FLASH architecture on SimOS

• Use fault injector
  – Power failure
  – Link failure
  – Firmware failure (?)

• Results: 100% fault containment

Fault Recovery

• Hardware support needed
  – Determine what resources are operational
  – Reconfigure the machine to use good resources
• Cellular Disco recovery
  – Step 1: All cells agree on a liveset of nodes
  – Step 2: Abort RPCs/messages to dead cells
  – Step 3: Kill VMs dependent on failed cells
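
The three recovery steps can be sketched as a single function. This is a simplification under stated assumptions: the agreement protocol of step 1 is not modeled (the failed set is simply given), and the liveset is tracked per cell rather than per node. The `recover` function and its argument shapes are hypothetical.

```python
def recover(cells, failed, pending_rpcs, vm_cells):
    """Apply the recovery steps to a toy system description.

    cells:        iterable of all cell ids
    failed:       set of cell ids known to have failed
    pending_rpcs: list of (source cell, destination cell) pairs
    vm_cells:     {vm name: set of cells the VM depends on}
    """
    failed = set(failed)
    # Step 1: the liveset (the agreement round itself is not modeled).
    live = set(cells) - failed
    # Step 2: abort RPCs/messages addressed to dead cells.
    kept_rpcs = [rpc for rpc in pending_rpcs if rpc[1] in live]
    # Step 3: kill every VM that depends on a failed cell.
    surviving_vms = {vm for vm, deps in vm_cells.items() if not (deps & failed)}
    return live, kept_rpcs, surviving_vms


live, rpcs, vms = recover(
    cells=[0, 1, 2],
    failed={2},
    pending_rpcs=[(0, 1), (1, 2)],
    vm_cells={"vmA": {0}, "vmB": {1, 2}},
)
```

In this run `vmB` is killed because it depends on failed cell 2, while `vmA` keeps running. That is the fault containment that cells are meant to provide.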

Fault-recovery Times

• Recovery times higher for larger memory
  – Requires memory scanning for fault detection

Summary

• Virtual Machine Monitor
  – Flexible resource management
  – Legacy OS support
• Cellular Disco
  – Cells provide fault containment
  – Create virtual cluster
  – Need hardware support