Disco: Running Commodity Operating Systems on Scalable Multiprocessors Bugnion et al. Presented by: Ahmed Wafa

Overview

• Shared Memory Multiprocessor Machines

• System Software for Shared Memory Multiprocessors

• Disco as a virtual machine monitor

• Performance

• Conclusions

Shared Memory Multiprocessors

• UMA (Uniform Memory Access)

• NUMA (Non Uniform Memory Access)

• ccNUMA (Cache-Coherent NUMA)

The Stanford FLASH Multiprocessor

Fig.1 The FLASH system architecture

System Software for Shared Memory Multiprocessors

• Invest a large development effort in designing and developing a custom OS

• Statically partition the machine and run commodity operating systems that communicate using distributed protocols

• Run multiple commodity operating systems on top of a virtual machine monitor that provides dynamic resource sharing

Disco: A Virtual Machine Monitor

Fig.2 Disco's Architecture

Challenges facing Virtual Machines

• Overheads

• Resource Management

• Communication and Sharing

Disco

• Interface

• Implementation

• Running Commodity Operating Systems

Disco's Interface

Processors

– Virtual CPUs that abstract the MIPS R10000

Physical Memory

– Contiguous space starting at address 0

– Near-uniform memory access time

I/O Devices

– Virtualized access to devices

– Virtual subnet to allow communication

Disco's Implementation

• Virtual CPUs

• Virtual Physical Memory

• NUMA Memory Management

• Virtual I/O Devices

• Copy-on-Write Disks

• Virtual Network Interfaces

Implementation: Virtual CPUs

• Direct Execution when possible

• MIPS modes

– Kernel

– Supervisor

– User
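A minimal sketch of direct execution with trap-and-emulate, assuming a toy instruction set. The instruction names, the `VirtualCPU` class, and the `run` helper are hypothetical and are not Disco's code. In the Disco paper the monitor runs in kernel mode, the guest OS in supervisor mode, and applications in user mode, so privileged guest instructions trap to the monitor; the sketch models that trap as a branch.

```python
from dataclasses import dataclass, field

# Hypothetical toy instruction names; real MIPS privileged operations differ.
PRIVILEGED_OPS = {"write_status", "tlb_write"}


@dataclass
class VirtualCPU:
    regs: dict = field(default_factory=dict)        # ordinary registers
    privileged: dict = field(default_factory=dict)  # virtualized privileged state
    traps: int = 0                                  # number of emulated instructions


def run(vcpu, program):
    """Execute (op, target, value) tuples on a virtual CPU."""
    for op, target, value in program:
        if op in PRIVILEGED_OPS:
            # On real hardware this instruction would trap to the monitor,
            # which applies its effect to the virtual CPU's saved state.
            vcpu.traps += 1
            vcpu.privileged[target] = value
        else:
            # Unprivileged instructions execute directly, with no monitor work.
            vcpu.regs[target] = value
    return vcpu
```

Only the privileged instruction costs a trap in this model, which is why direct execution keeps the common case cheap.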

Implementation: Virtual Physical Memory

• Fast and Correct Mapping from the VM virtual address space to the real machine address space

• Handling of TLB misses and the pmap data structures

• TLB miss cost and Disco's Second-Level TLB
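A rough sketch of the translation path, assuming a dictionary-based model. The `VirtualMemory` class, its field names, and the tiny TLB size are illustrative assumptions, not Disco's data structures. It shows the idea of the bullets above: a miss in the small hardware TLB is first looked up in a larger software second-level TLB, and only a miss there pays for the guest page table walk plus the pmap lookup (guest physical page to machine page).

```python
class VirtualMemory:
    def __init__(self, pmap, hw_tlb_size=2):
        self.pmap = pmap              # guest physical page -> machine page
        self.hw_tlb = {}              # small hardware TLB: virtual page -> machine page
        self.l2_tlb = {}              # software second-level TLB (assumed unbounded here)
        self.hw_tlb_size = hw_tlb_size
        self.slow_misses = 0          # misses that needed the full lookup

    def translate(self, vpage, guest_page_table):
        if vpage in self.hw_tlb:
            return self.hw_tlb[vpage]             # hit: no monitor involvement
        if vpage in self.l2_tlb:
            mpage = self.l2_tlb[vpage]            # cheap refill from the software TLB
        else:
            self.slow_misses += 1
            ppage = guest_page_table[vpage]       # guest OS mapping: virtual -> physical
            mpage = self.pmap[ppage]              # monitor mapping: physical -> machine
            self.l2_tlb[vpage] = mpage
        if len(self.hw_tlb) >= self.hw_tlb_size:
            # Evict the oldest inserted entry (dicts preserve insertion order).
            self.hw_tlb.pop(next(iter(self.hw_tlb)))
        self.hw_tlb[vpage] = mpage
        return mpage
```

With a two-entry hardware TLB, touching three pages and then returning to the first refills it from the second-level TLB without another slow miss.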

Implementation: NUMA Memory Management

• Satisfying Cache misses from local memory

• Page Migration and Replication

• FLASH cache miss counter and Hot Pages

• Disco memmap and TLB shootdowns

Fig.3 Page Replication
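The migration and replication policy can be sketched as a small decision function. This is a simplified illustration: the function name, the per-node miss-count dictionary, and the `hot_threshold` value are assumptions, not values from the slides. It follows the policy described in the Disco paper: a hot page used by a single remote node is migrated, a hot read-shared page is replicated, and a write-shared page is left in place.

```python
def placement_decision(miss_counts, home_node, is_written, hot_threshold=64):
    """Decide what to do with one page.

    miss_counts: node id -> cache misses to this page (as a per-page,
    per-node miss counter would report). hot_threshold is a made-up value.
    """
    if sum(miss_counts.values()) < hot_threshold:
        return "leave"                      # not a hot page
    sharers = [node for node, count in miss_counts.items() if count > 0]
    if len(sharers) == 1:
        # Used by one node only: move it there unless it is already local.
        return "leave" if sharers[0] == home_node else "migrate"
    if not is_written:
        return "replicate"                  # read-shared: give each node a copy
    return "leave"                          # write-shared: moving would not help
```

Either action changes the machine page behind a mapping, so stale TLB entries must be invalidated, which is where the memmap and TLB shootdowns come in.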

Implementation: I/O

• Virtual I/O Devices

– Intercepting device access and DMA requests

– Special device drivers, single trap per request

• Copy-on-Write Disks

– Used for shared disks

– Speed up disk reads by sharing memory

Fig.4 Copy-on-Write Page Sharing
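A minimal sketch of a copy-on-write disk, assuming a block-level model. The `CowDisk` class and its fields are hypothetical and only illustrate the idea. The first read of a block goes to disk and the resulting page is shared by every VM that reads it later, while a write creates a private copy visible only to the writing VM.

```python
class CowDisk:
    def __init__(self, blocks):
        self.blocks = blocks    # shared disk image: block number -> bytes
        self.cache = {}         # block number -> page shared by all VMs
        self.private = {}       # (vm, block number) -> private modified copy
        self.disk_reads = 0     # reads that actually reached the disk

    def read(self, vm, block):
        if (vm, block) in self.private:
            return self.private[(vm, block)]     # this VM's own modified version
        if block not in self.cache:
            self.disk_reads += 1                 # only the first reader pays for I/O
            self.cache[block] = self.blocks[block]
        return self.cache[block]                 # later readers share the same page

    def write(self, vm, block, data):
        # Copy-on-write: the shared page is untouched, the writer gets its own copy.
        self.private[(vm, block)] = data
```

Two VMs reading the same block cause a single disk read, which is the memory-sharing speedup named on the slide.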

Virtual Network Interface

• Virtual subnet to allow VM communication

• No MTU limit for packets

• Aligned messages that span complete pages will be re-mapped rather than copied

• Sharing vs Replicating for maximizing data locality

Fig.5 NFS sharing, replacing bcopy to remap data
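The remap-versus-copy rule can be illustrated with a short planning function. The 4096-byte page size and the `plan_transfer` name are assumptions made for this sketch. Each piece of a message that starts on a page boundary and covers a whole page is marked for remapping, and every other piece falls back to copying.

```python
PAGE_SIZE = 4096  # assumed page size for the sketch


def plan_transfer(offset, length, page_size=PAGE_SIZE):
    """Split a message into (action, start, size) pieces, one per page touched."""
    pieces = []
    pos, end = offset, offset + length
    while pos < end:
        page_end = (pos // page_size + 1) * page_size   # end of the page holding pos
        chunk_end = min(page_end, end)
        covers_full_page = pos % page_size == 0 and chunk_end - pos == page_size
        action = "remap" if covers_full_page else "copy"
        pieces.append((action, pos, chunk_end - pos))
        pos = chunk_end
    return pieces
```

For example, an aligned two-page message is remapped entirely, while the same amount of data starting at an unaligned offset has to be copied.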

Running Commodity Operating Systems

IRIX 5.3, an SVR4-based UNIX from SGI

Changes for IRIX

• MIPS and KSEG0

• Device Drivers

• HAL

• Network mbuf sharing

Experimental Results

• Setup

• Execution Overheads

• Memory Overheads

• Scalability

• Dynamic Page Migration and Replication

Experimental Setup

• The FLASH machine was not available at the time of development

• Hardware simulated using SimOS

Execution Overheads

• 3-16% slowdown

– TLB misses

– System calls

Fig.2 Virtualization Overhead

Fig.6 Service Breakdown for pmake workload

Memory Overheads

Fig.7 Service Breakdown for pmake workload

Scalability

• IRIX and memory management

• Using 8 VMs reduces delay by 40%

Fig.8 Workload Scalability Under Disco

Dynamic Page Migration and Replication

• The NUMA problem

• IRIX (not NUMA-aware) vs. Disco

Fig.9 Performance Benefits of Page Migration

Conclusions

• Developing system software for shared memory multiprocessor machines can be costly

• Disco can present such a machine as a cluster of connected machines through virtualization

• Disco allows commodity operating systems to be used on such machines

Conclusions

• Virtualization overhead is modest, ranging from 3% to 16%

• Optimization techniques can enhance scalability and hide NUMA-ness from the commodity OS

References

• “The Stanford FLASH Multiprocessor”, Kuskin et al.

• MIPS R4400 manual, http://groups.csail.mit.edu/cag/raw/documents/R4400_Uman_book_Ed2.pdf