Process Migration
Checkpoint/Restart
ECI, July 2005

Process Migration
Process migration benefits:
- Tool for load balancing
- Data access locality
- Improved system administration
- Mobile computing

Process Migration Issues
- Execution model: home, remote
- Migrating virtual memory
- Minimizing downtime
- Cost of migration
  - Run time cost (home, remote)
  - Migration operation
- Limitations of migration

Checkpoint / Restart
Checkpoint/restart benefits:
- Like migration plus …
- Fault resilience
- Fault recovery
- High availability
- Gang scheduling
- Debugging, testing, developing
- Security (honey-pot)

Checkpoint/restart goals
- Transparency
- Support parallel programs
  - Multi-process
  - Multi-node
- Security
- Minimize required state
- Minimize required storage

CKPT: Application Level
- Application level
  - Efficient
  - Non-preemptive
  - Lack of common API
  - Source code changes
  - Possible compiler support
- Examples?

CKPT: Library Level
- Library level
  - Typically uses a signal handler (callback)
  - Common API
  - Restricts functionality (e.g., no IPC)
  - Relatively portable
- Examples…

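The signal-handler approach can be illustrated with a minimal sketch. It assumes a Unix system (SIGUSR1), a single-threaded program, and that all state worth saving lives in one dictionary; the names `state`, `CKPT_PATH`, `checkpoint_handler`, `restart` and `run` are hypothetical and do not come from any of the libraries discussed in these slides.

```python
import os
import pickle
import signal
import tempfile

# Hypothetical application state; a real library captures far more
# (stack, heap, open files), which is why functionality gets restricted.
state = {"iteration": 0, "partial_sum": 0}
CKPT_PATH = os.path.join(tempfile.gettempdir(), "demo_%d.ckpt" % os.getpid())


def checkpoint_handler(signum, frame):
    """Callback run on SIGUSR1: write the state to disk."""
    tmp = CKPT_PATH + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, CKPT_PATH)  # atomic rename: never leaves a torn file


def restart():
    """Reload the most recent checkpoint."""
    with open(CKPT_PATH, "rb") as f:
        return pickle.load(f)


# The library installs its handler; the application code stays unchanged.
signal.signal(signal.SIGUSR1, checkpoint_handler)


def run(steps):
    """Stand-in for the application's main loop."""
    for i in range(steps):
        state["iteration"] = i + 1
        state["partial_sum"] += i
```

A checkpoint is then requested from outside with `kill -USR1 <pid>`, and a restarted process would call `restart()` to recover the saved dictionary.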
CKPT: Library (contd)
- Libckpt
  - Memory exclusion, incremental, forked
  - Modify source code, link statically
- Condor
  - Support memory mapping, shared libraries
  - Relink to special library (needs object file)
- Score, co-check
  - Parallel applications
  - Modify communication layer

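Incremental checkpointing saves only what changed since the previous checkpoint. The sketch below is an illustration of that idea, not Libckpt's mechanism: it detects changed pages by comparing SHA-256 digests, and the page size, class name and `restore` helper are assumptions made for the example.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size for this sketch


def split_pages(memory: bytes):
    """Cut a memory image into fixed-size pages."""
    return [memory[i:i + PAGE_SIZE] for i in range(0, len(memory), PAGE_SIZE)]


class IncrementalCheckpointer:
    def __init__(self):
        self.digests = {}  # page index -> digest at the last checkpoint

    def checkpoint(self, memory: bytes):
        """Return {page index: bytes} for pages changed since the last call."""
        delta = {}
        for idx, page in enumerate(split_pages(memory)):
            digest = hashlib.sha256(page).digest()
            if self.digests.get(idx) != digest:
                delta[idx] = page
                self.digests[idx] = digest
        return delta


def restore(deltas):
    """Rebuild the memory image by replaying deltas, oldest first."""
    pages = {}
    for delta in deltas:
        pages.update(delta)
    return b"".join(pages[i] for i in sorted(pages))
```

The first call returns every page; later calls return only the pages written in between, which is what keeps the stored data small.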
Implementation (contd)
- Kernel level
  - Loadable kernel module vs. change kernel
  - Preemptive / cooperative
  - Access to entire process state
  - Complex, less portable
  - Examples: Sprite, Zap
- Virtual machines (soon)

Multi-process Checkpoint
- Global state
  - A set of states from all processes
- Consistent global state
  - If the state of A reflects a message received from B, then the state of B reflects sending it
  - If the state of A reflects a message sent to B but not yet received, the message must be part of the channel state

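The two consistency rules can be checked mechanically. In this sketch a global state is a "cut" giving, per process, how many of its local events are included; the `Message` record and both function names are hypothetical, chosen only for the illustration.

```python
from typing import Dict, List, NamedTuple


class Message(NamedTuple):
    sender: str
    send_event: int   # index of the send in the sender's local event order
    receiver: str
    recv_event: int   # index of the receive in the receiver's local order


def is_consistent(cut: Dict[str, int], messages: List[Message]) -> bool:
    """cut[p] = number of events of process p included in the global state."""
    for m in messages:
        received = m.recv_event < cut[m.receiver]
        sent = m.send_event < cut[m.sender]
        if received and not sent:
            return False  # orphan message: received but never sent
    return True


def channel_state(cut: Dict[str, int], messages: List[Message]):
    """Messages sent inside the cut but not yet received (in transit)."""
    return [m for m in messages
            if m.send_event < cut[m.sender]
            and m.recv_event >= cut[m.receiver]]
```

A cut that records a receive without the matching send is rejected; a cut that records a send without the receive is accepted, provided the message is saved as channel state.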
Consistent Global State
Multi-process Checkpoint
- Uncoordinated checkpoint
  - Inspect data to find recovery line
  - Processes are independent, efficient
  - Domino effect, much storage

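Finding the recovery line, and the domino effect it can trigger, can be sketched with a simple rollback-propagation loop. The data model is an assumption for the example: each process has a sorted list of checkpoint times on its own local clock (always starting with 0, the initial state), a checkpoint at time c covers local events with time <= c, and message times are positive.

```python
def recovery_line(checkpoints, messages):
    """checkpoints: {proc: sorted list of local checkpoint times, first is 0}
    messages: list of (sender, send_time, receiver, recv_time)
    Returns {proc: checkpoint time} forming a consistent recovery line."""
    # Start optimistically from everyone's latest checkpoint.
    line = {p: cps[-1] for p, cps in checkpoints.items()}
    changed = True
    while changed:
        changed = False
        for sender, t_send, receiver, t_recv in messages:
            # Orphan: the receive is kept but the send has been rolled back.
            if t_recv <= line[receiver] and t_send > line[sender]:
                earlier = [c for c in checkpoints[receiver] if c < t_recv]
                line[receiver] = max(earlier)  # roll the receiver back
                changed = True
    return line
```

With checkpoints `{"P": [0, 2], "Q": [0, 3]}` and messages `("P", 3, "Q", 2)` and `("Q", 1, "P", 1)`, each rollback orphans another message and both processes end up at their initial state, which is the domino effect.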
Multi-process Checkpoint
- Coordinated checkpoint
  - Centrally managed
  - Blocking
    - All processes suspended
    - Flush communication channels
  - Non blocking
    - Delay in triggers may yield inconsistency

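The blocking variant can be sketched as a single-threaded simulation: suspend everyone, drain the channels, then save. The `Proc` class, the channel dictionary and the function name are hypothetical; a real coordinator would exchange control messages rather than touch the processes directly.

```python
from collections import deque


class Proc:
    def __init__(self, name):
        self.name = name
        self.received = []     # application state: messages consumed so far
        self.suspended = False

    def deliver(self, msg):
        self.received.append(msg)


def coordinated_checkpoint(procs, channels):
    """procs: {name: Proc}; channels: {(src, dst): deque of in-flight msgs}."""
    for p in procs.values():               # 1. suspend: no new sends
        p.suspended = True
    for (src, dst), queue in channels.items():
        while queue:                       # 2. flush in-flight messages
            procs[dst].deliver(queue.popleft())
    snapshot = {n: list(p.received) for n, p in procs.items()}  # 3. save
    for p in procs.values():               # 4. resume
        p.suspended = False
    return snapshot
```

Because the channels are empty when the states are saved, the snapshot needs no channel state and only the latest checkpoint has to be kept.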
Multi-process Checkpoint
- Communication-induced
  - Piggyback process checkpoint status and requests on messages
  - May require enforcing global checkpoint
  - Unpredictable checkpoint times

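One common way to realize piggybacking is an index-based scheme, sketched here as an assumption rather than as the specific protocol the slide has in mind. Every message carries the sender's checkpoint index, and a receiver that sees a higher index takes a forced checkpoint before delivering the message. The class and field names are hypothetical.

```python
class CicProcess:
    def __init__(self, name):
        self.name = name
        self.index = 0   # sequence number of the latest checkpoint
        self.log = []    # (index, kind) for every checkpoint taken

    def local_checkpoint(self):
        """Checkpoint taken on the process's own schedule."""
        self.index += 1
        self.log.append((self.index, "local"))

    def send(self, payload):
        # Piggyback the checkpoint index on the application message.
        return {"payload": payload, "ckpt_index": self.index}

    def receive(self, msg):
        if msg["ckpt_index"] > self.index:
            # Forced checkpoint, taken before the message is delivered.
            self.index = msg["ckpt_index"]
            self.log.append((self.index, "forced"))
        return msg["payload"]
```

Forced checkpoints depend on message arrival, which is why checkpoint times become unpredictable.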
Multi-process Checkpoint
Summary:

                         Uncoordinated   Coordinated   Communication-induced
Domino effect            Possible        No            No
Management overhead      None            More          Less
Decision making          Local           Central       Local/central
Checkpoint data stored   All             Latest only   Several

Virtual Machines
“Any problem in computer science can be solved by another layer of indirection”
ECI, July 2005

What is a Virtual Machine?
- An indirection layer below the execution environment seen by applications and OS
- Decouples the architecture and user-perceived behavior of SW and HW resources from their physical implementation
- Provides a uniform view of the underlying resources
- Multiplexes multiple virtual systems on a single (physical) resource

VM History
- 1960’s – Hypervisors (mainframes)
  - Time-share expensive hardware
  - No change to legacy software
- 1980-90’s – Obsolete
  - Proliferation of cheap hardware
  - Hardware support neglected
- Later 1990’s – Reincarnation
  - For complex MPP lacking OS infrastructure
- 2000 - Today: Renaissance
  - Consolidation, isolation, reliability

VM Benefits
- Performance
  - Server consolidation
  - Efficient HW utilization
  - Adaptive resource balancing
  - Checkpoint/restart and migration
- Security
  - Simple (reduced complexity)
  - Encapsulation and isolation
  - Mediation

VM benefits (contd)
- Reliability
  - Redundancy through replication
  - Disaster recovery
  - Deployment testing
- And…
  - Quality of service
  - Transparent (for legacy SW)
  - Enhanced interoperability
  - Development & testing

Server utilization
Cumulative usage of 28 servers:
- Memory
  - 45% of RAM not used 99.9% of time
  - 25% of RAM never used concurrently
- CPU
  - 85% of CPU not used 99.9% of time
  - 81% of CPU never used concurrently
- Disk
  - 68% of storage space never used

Virtualization levels
- HOST entity: encapsulates the guest
- GUEST entity: managed by the host
[Figure: system layers (Application programs, Libraries, Operating system, Hardware) and the interfaces between them (API, ABI, ISA)]

Process & System VM
[Figure: two stacks. Process virtual machine: Applications on a VMM, on the OS, on the Hardware. System virtual machine: Applications on guest OSes, on a VMM, on the Hardware.]

VM at different levels
- HW level
  - VMware, Xen, Denali, Virtual PC, UML
- OS level
  - Virtual Servers, BSD Jail, Zap
- Programming language level
  - Java, .NET
- Network
  - VLAN, VPN

VM Taxonomy
Process VM - virtual platform that exists solely to support the process
- Unix
- Emulators (interpreters)
- Dynamic binary translators
  - Optimize by block translation and caching
- Java – “compile once run everywhere”
  - Intermediate machine code
  - Optimize by native compilation on-the-fly

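Block translation with caching can be sketched on a toy instruction set. Everything here is invented for the illustration: the three opcodes (`add`, `jnz`, `halt`), the block labels, and the choice of Python source as the "native" target. Each basic block is translated once into a compiled function and reused on every later visit.

```python
def translate(block):
    """Translate one basic block of the toy ISA into a compiled function."""
    lines = ["def run(regs):"]
    for op, *args in block:
        if op == "add":      # add immediate to register
            lines.append(f"    regs[{args[0]!r}] += {args[1]}")
        elif op == "jnz":    # branch on register != 0
            lines.append(
                f"    return {args[1]!r} if regs[{args[0]!r}] else {args[2]!r}")
        elif op == "halt":
            lines.append("    return None")
    lines.append("    return None")  # fall-through ends execution
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["run"]


class Translator:
    def __init__(self, program):
        self.program = program   # {label: list of instructions}
        self.cache = {}          # label -> translated function
        self.translations = 0

    def execute(self, entry, regs):
        pc = entry
        while pc is not None:
            if pc not in self.cache:          # translate on first visit only
                self.cache[pc] = translate(self.program[pc])
                self.translations += 1
            pc = self.cache[pc](regs)         # run translated block
        return regs
```

A loop body that executes many times is translated only once, so the translation cost is amortized over the run.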
VM Taxonomy (contd)
System VM - complete persistent system environment providing access to virtual hardware
- Classic - bare HW
- Hosted VM
  - Easy install and maintenance
  - Leverage native services of underlying OS
- Multiprocessor virtualization

Hardware Virtualization
Challenges to build virtual machines
- Performance isolation
  - Scheduling priority
  - Memory demand
  - Network traffic
  - Disk access
- Support for various OS platforms
- Small performance overhead

Lack of Hardware Support
- Ring aliasing
- Non-faulting access to privileged state
  - Does the guest see the right state?
- Address space compression
  - Where does the VMM reside?
- Impact on transitions
  - Traps, SYSENTER, SYSEXIT
- Interrupt masking
- Hidden state

Now What?
- Hardware extensions
  - Change semantics to support VM
  - Intel, AMD
- Software virtualization
  - Translate code to emulate desired behavior
  - VMware
- Paravirtualization
  - Xen, Denali

Hardware Extensions for VM
- Root mode
  - Runs VMM
  - Like ring-0 before
- Non-Root mode
  - Runs guest OS
  - Less privileged
- Mask of events to trap

VMware
- Hardware virtualization
  - CPU, memory, I/O
  - Suspend/resume
  - Live migration
- Design goals:
  - Compatibility
  - Performance
  - Simplicity

VMware: CPU Virtualization
- CPU virtualization
  - Execute guest on bare hardware while retaining control by the VMM
  - Traps privileged ops & emulates their action
- Challenge: lack of HW support
  - POPF and read access to privileged state
- Solution: fast binary translation
  - Only kernel mode code
  - Eliminate unnecessary traps

VMware: Memory Virtualization
- Memory virtualization
  - Shadow page tables
- Challenges:
  - Inefficient page replacement
  - Oversized due to replication
- Solutions:
  - Ballooning
  - Content based sharing

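Content-based sharing can be sketched as a store that keeps one physical copy per distinct page content and breaks the sharing on write. This is an illustration of the idea only, not VMware's implementation; the class, its methods and the use of SHA-256 are assumptions.

```python
import hashlib


class SharedPageStore:
    def __init__(self):
        self.frames = {}    # digest -> page contents (one physical copy)
        self.refcount = {}  # digest -> number of guest pages mapped to it

    def map_page(self, content: bytes) -> bytes:
        """Map a guest page; identical contents share one frame."""
        digest = hashlib.sha256(content).digest()
        existing = self.frames.get(digest)
        if existing is None:
            self.frames[digest] = content
        elif existing != content:
            # A hash hit must be confirmed byte for byte before sharing.
            raise ValueError("hash collision: pages differ")
        self.refcount[digest] = self.refcount.get(digest, 0) + 1
        return digest

    def write_page(self, digest: bytes, new_content: bytes) -> bytes:
        """Copy-on-write: leave the shared frame, map a private one."""
        self.refcount[digest] -= 1
        if self.refcount[digest] == 0:
            del self.refcount[digest]
            del self.frames[digest]
        return self.map_page(new_content)
```

Two guests that both map a zero-filled page consume one frame until one of them writes to it.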
VMware: I/O Virtualization
- Challenge: wide variety of devices and interfaces
- Solution:
  - Hosted architecture
  - Trap through the VMM
  - Export special devices

Xen: Paravirtualization
- Provide some exposure to the underlying hardware
  - Better performance
  - Must modify OS to adapt
  - No modifications to applications

Xen (contd)
- Downgrade privilege of guest OS
- Guest registers syscall and page-fault handlers with Xen
- Partial access to page tables
- Fast handlers for most exceptions
- Expose set of simple device abstractions

Xen (contd)
The cost of porting an OS to Xen:
- Privileged instructions
- Page table access
- Network driver
- Block device driver
- <2% of code-base

Denali
- Lightweight protection domains
- Minimalistic method geared for performance
- Changes:
  - Idle loops - avoid busy wait
  - Interrupt queueing - save context switch
  - Interrupt semantics – “just”/“recent”
  - No virtual memory (!)
  - No BIOS – no legacy “crap”
  - Generic I/O devices

Virtual Machine Migration
Optimizations:
- Reduce memory state before snapshot
  - Ballooning
- Reduce total cost by incremental updates
  - COW hierarchy
- Reduce start-up time by paging on-demand
- Reduce transfer time relying on common data
  - Use hash functions to identify common blocks

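Identifying common blocks by hash can be sketched as a two-step exchange: the source sends a manifest of block digests, and only blocks the destination does not already hold are transferred. The function names and the use of SHA-256 are assumptions for the example.

```python
import hashlib


def digest(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()


def plan_transfer(source_blocks, dest_blocks):
    """Return (manifest, payload): the ordered digests of the source image
    and only those blocks the destination is missing."""
    known = {digest(b) for b in dest_blocks}
    manifest, payload = [], {}
    for block in source_blocks:
        d = digest(block)
        manifest.append(d)
        if d not in known and d not in payload:
            payload[d] = block   # sent over the wire exactly once
    return manifest, payload


def rebuild(manifest, payload, dest_blocks):
    """Reassemble the source image on the destination."""
    have = {digest(b): b for b in dest_blocks}
    have.update(payload)
    return [have[d] for d in manifest]
```

When source and destination share an OS image or libraries, most of the manifest resolves locally and the payload stays small.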
Virtual Machine Migration
Minimizing down time
- Reduce size of VM state
- Pre-copy static parts (or..)
- Demand-copy static parts
- Hot-copy dynamic parts

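The pre-copy idea can be sketched as a loop that copies pages while the guest keeps running, re-sends whatever was dirtied during each round, and stops the guest only for the small remainder. The workload trace `dirty_per_round`, the threshold and the round limit are hypothetical parameters of this simulation.

```python
def precopy_migrate(pages, dirty_per_round, threshold=2, max_rounds=10):
    """pages: {page_no: content} on the source (mutated by the 'guest').
    dirty_per_round: list of {page_no: new_content} writes the guest makes
    while each copy round is in flight.
    Returns (destination pages, rounds used, pages sent during downtime)."""
    dest = {}
    to_send = set(pages)          # round 1 sends everything
    rounds = 0
    trace = iter(dirty_per_round)
    while len(to_send) > threshold and rounds < max_rounds:
        for n in to_send:         # copy while the guest is still running
            dest[n] = pages[n]
        writes = next(trace, {})  # guest dirties pages during the round
        pages.update(writes)
        to_send = set(writes)     # only dirtied pages need re-sending
        rounds += 1
    # Stop-and-copy: the guest is paused only for the remaining pages.
    for n in to_send:
        dest[n] = pages[n]
    return dest, rounds, len(to_send)
```

Downtime is proportional to the last value of `to_send`, so a guest with a small writable working set migrates with a short pause.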
OS Virtualization
- Confine applications in containers
- Advantages:
  - Fine granularity
  - Low overhead
  - Easier maintenance
- Challenges
  - Transparency
  - Correctness
  - Extend OS: modify kernel, loadable module, library

Isolation – BSD Jail
- Create an isolated environment by software means.
- Uses chroot (private root per jail)
- Processes in a jail are isolated from files, processes, or network services in other jails.
- A jail can be restricted to a single IP address.

Specialized Virtualization – Linux VServer
- Hosting (consolidation)
- Experimentation
- Education (do you trust students … ?)
- Personal security box
- Manage several “versions”
- Applications
  - Virtual servers
  - Per user firewall
  - Fail over servers
  - Honey-pots

Specialized Virtualization – Linux VServer
- Isolation
  - Processes, file system, IPC, network, super user capabilities
- Kernel patch
  - Add a “context” tag per process/resource
  - Syscalls to handle contexts (irreversible)
- Challenges
  - Capture all holes (indirect access!)
  - Efficient storage

General Virtualization – Zap
- Virtualization for isolation
  - POD – PrOcess Domain
  - Private namespace
- Virtualization for migration
  - Decouple process from OS
  - Capture state and reconstruct state

Zap – virtualization
- Process environment
  - Interpose on system calls
- File system
  - Rely on “chroot” environment
- Network
  - Per protocol methods
- Challenges
  - Race conditions (SMP)
  - Life-span of objects
  - Fast translation

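System-call interposition with a private namespace can be sketched for process identifiers. The `Pod` class below is a toy model, not Zap's code: processes inside the pod see stable virtual PIDs, the interposed calls translate to and from host PIDs, and `rebind` stands in for what happens after a restart on another host. The `host_kill` callback is a hypothetical hook so the sketch never signals real processes.

```python
class Pod:
    """Toy private PID namespace: virtual PIDs stay stable across hosts."""

    def __init__(self):
        self.v2r = {}        # virtual pid -> real (host) pid
        self.r2v = {}        # real pid -> virtual pid
        self.next_vpid = 1

    def attach(self, real_pid):
        """Admit a host process into the pod and assign its virtual PID."""
        vpid = self.next_vpid
        self.next_vpid += 1
        self.v2r[vpid] = real_pid
        self.r2v[real_pid] = vpid
        return vpid

    def sys_getpid(self, real_pid):
        """Interposed getpid(): the caller only ever sees its virtual PID."""
        return self.r2v[real_pid]

    def sys_kill(self, vpid, sig, host_kill):
        """Interposed kill(): translate the target, refuse names outside."""
        if vpid not in self.v2r:
            raise ProcessLookupError(vpid)
        return host_kill(self.v2r[vpid], sig)

    def rebind(self, old_to_new):
        """After restart elsewhere: same virtual PIDs, new host PIDs."""
        self.v2r = {v: old_to_new[r] for v, r in self.v2r.items()}
        self.r2v = {r: v for v, r in self.v2r.items()}
```

Keeping both directions of the mapping in dictionaries is one simple answer to the fast-translation challenge; it does not address races on SMP or object life-span.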
Zap – Migration
- Checkpoint – outside process context
  - Capture process tree
  - Capture pod state
  - Capture per-process state
- Restart – inside process context
  - Restore process tree
  - Restore processes
- Example issues
  - Sharing
  - Deleted files