Process Migration Checkpoint/Restart
ECI, July 2005

Page 1: Process Migration Checkpoint/Restart

Process Migration Checkpoint/Restart

ECI, July 2005

Page 2: Process Migration Checkpoint/Restart

Process Migration

Process migration benefits:
  • Tool for load balancing
  • Data access locality
  • Improved system administration
  • Mobile computing

Page 3: Process Migration Checkpoint/Restart

Process Migration Issues

  • Execution model: home, remote
  • Migrating virtual memory
  • Minimizing downtime
  • Cost of migration
      • Run time cost (home, remote)
      • Migration operation
  • Limitations of migration

Page 4: Process Migration Checkpoint/Restart

Checkpoint / Restart

Checkpoint/restart benefits: like migration, plus …
  • Fault resilience
  • Fault recovery
  • High availability
  • Gang scheduling
  • Debugging, testing, developing
  • Security (honey-pot)

Page 5: Process Migration Checkpoint/Restart

Checkpoint/restart goals

  • Transparency
  • Support parallel programs
      • Multi-process
      • Multi-node
  • Security
  • Minimize required state
  • Minimize required storage

Page 6: Process Migration Checkpoint/Restart

CKPT: Application Level

Application level
  • Efficient
  • Non-preemptive
  • Lack of common API
  • Source code changes
  • Possible compiler support
  • Examples ?

Page 7: Process Migration Checkpoint/Restart

CKPT: Library Level

Library level
  • Typically use a signal handler (callback)
  • Common API
  • Restricts functionality (e.g., no IPC)
  • Relatively portable
  • Examples…
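The signal-handler approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `state` dictionary, the `CKPT_PATH` location, and the choice of `SIGUSR1` are hypothetical, and a real checkpoint library saves the process memory image rather than one pickled object.

```python
import os
import pickle
import signal
import tempfile

# Hypothetical application state; a real library captures the memory image.
state = {"iteration": 0, "partial_sum": 0}
CKPT_PATH = os.path.join(tempfile.gettempdir(), "demo.ckpt")  # assumed location


def checkpoint_handler(signum, frame):
    """Callback run when the checkpoint signal arrives: save the state."""
    tmp = CKPT_PATH + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, CKPT_PATH)  # atomic rename: never leaves a half-written file


def restart():
    """Reload the most recent checkpoint from disk."""
    with open(CKPT_PATH, "rb") as f:
        return pickle.load(f)


signal.signal(signal.SIGUSR1, checkpoint_handler)

for i in range(1, 4):  # stand-in for the application's real work
    state["iteration"] = i
    state["partial_sum"] += i

os.kill(os.getpid(), signal.SIGUSR1)  # normally sent by an external tool
restored = restart()
```

The application code itself does not change apart from registering the handler, which is what makes the approach relatively portable. Open sockets or pipes are not captured, which matches the "no IPC" restriction listed above.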

Page 8: Process Migration Checkpoint/Restart

CKPT: Library (contd)

  • Libckpt
      • Memory exclusion, incremental, forked
      • Modify source code, link statically
  • Condor
      • Support memory mapping, shared libraries
      • Relink to special library (needs object file)
  • Score, co-check
      • Parallel applications
      • Modify communication layer

Page 9: Process Migration Checkpoint/Restart

Implementation (contd)

  • Kernel level
      • Loadable kernel module vs. change kernel
      • Preemptive / cooperative
      • Access to entire process state
      • Complex, less portable
      • Examples: Sprite, Zap
  • Virtual machines (soon)

Page 10: Process Migration Checkpoint/Restart

Multi-process Checkpoint

  • Global state
      • A set of states from all processes
  • Consistent global state
      • If the state of A reflects a message received from B, then the state of B reflects sending it
      • If the state of A reflects a message sent to B but not yet received, the message must be part of the channel state
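The two consistency rules can be checked mechanically. The sketch below is an illustration only: the `Message` record, the per-process event indices, and the `cut` dictionary are assumptions made for the example, not part of the slides.

```python
from typing import NamedTuple


class Message(NamedTuple):
    sender: str
    receiver: str
    send_time: int  # index of the send event in the sender's local history
    recv_time: int  # index of the receive event in the receiver's local history


def is_consistent(cut, messages):
    """cut maps each process to the last local event included in its saved state."""
    for m in messages:
        received = m.recv_time <= cut[m.receiver]
        sent = m.send_time <= cut[m.sender]
        if received and not sent:
            return False  # orphan message: receipt recorded, send not recorded
    return True


def channel_state(cut, messages):
    """Messages sent inside the cut but not yet received: they are in transit."""
    return [m for m in messages
            if m.send_time <= cut[m.sender] and m.recv_time > cut[m.receiver]]
```

For a single message sent by B at its event 2 and received by A at its event 3, the cut `{"A": 3, "B": 1}` is inconsistent, while `{"A": 2, "B": 2}` is consistent and places the message in the channel state.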

Page 11: Process Migration Checkpoint/Restart

Consistent Global State

Page 12: Process Migration Checkpoint/Restart

Multi-process Checkpoint

  • Uncoordinated checkpoint
      • Inspect data to find recovery line
      • Processes are independent, efficient
      • Domino effect, much storage
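A minimal sketch of how a recovery line might be found by rollback propagation, and of how the domino effect arises. The data layout is an assumption: each process has a sorted list of checkpoint times on its own local clock, starting with 0 for the initial state, and each message is a `(sender, send_time, receiver, recv_time)` tuple.

```python
def recovery_line(checkpoints, messages):
    """Return, per process, the latest checkpoint that forms a consistent set.

    checkpoints: process -> sorted local checkpoint times, beginning with 0
    messages:    (sender, send_time, receiver, recv_time), local clocks
    """
    line = {p: times[-1] for p, times in checkpoints.items()}
    changed = True
    while changed:
        changed = False
        for sender, send_t, receiver, recv_t in messages:
            # Orphan: the receiver's checkpoint records the receipt, but the
            # sender's checkpoint was taken before the message was sent.
            if recv_t <= line[receiver] and send_t > line[sender]:
                line[receiver] = max(t for t in checkpoints[receiver] if t < recv_t)
                changed = True
    return line


# Each rollback exposes a new orphan, so both processes fall back to time 0.
ckpts = {"P": [0, 3, 7], "Q": [0, 5, 9]}
msgs = [("P", 8, "Q", 8), ("Q", 6, "P", 6), ("P", 4, "Q", 4), ("Q", 1, "P", 2)]
domino = recovery_line(ckpts, msgs)
```

With no messages the recovery line is simply the latest checkpoint of each process. With the zigzag pattern above, every stored checkpoint is discarded, which is why uncoordinated schemes must keep all checkpoints and still risk restarting from the beginning.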

Page 13: Process Migration Checkpoint/Restart

Multi-process Checkpoint

  • Coordinated checkpoint
      • Centrally managed
      • Blocking
          • All processes suspended
          • Flush communication channels
      • Non-blocking
          • Delay in triggers may yield inconsistency
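The blocking variant can be illustrated with a small single-threaded simulation. Everything here is hypothetical scaffolding (the `Proc` class, the `channels` dictionary of in-flight messages); it only shows the order of the steps: suspend, flush, save, resume.

```python
import copy
from collections import deque


class Proc:
    """Toy process whose entire state is the list of messages it has received."""

    def __init__(self, name):
        self.name = name
        self.received = []
        self.suspended = False

    def deliver(self, msg):
        self.received.append(msg)


def blocking_checkpoint(procs, channels):
    """procs: name -> Proc; channels: (src, dst) -> deque of in-flight messages."""
    for p in procs.values():
        p.suspended = True  # 1. suspend everyone, so no new sends occur
    for (src, dst), queue in channels.items():
        while queue:  # 2. flush: deliver whatever is still in transit
            procs[dst].deliver(queue.popleft())
    snapshot = {n: copy.deepcopy(p.received) for n, p in procs.items()}  # 3. save
    for p in procs.values():
        p.suspended = False  # 4. resume
    return snapshot
```

Because the channels are empty when the states are saved, no channel state needs to be recorded. The price is that every process is stopped for the whole operation.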

Page 14: Process Migration Checkpoint/Restart

Multi-process Checkpoint

  • Communication-induced
      • Piggyback process checkpoint status and requests on messages
      • May require enforcing global checkpoint
      • Unpredictable checkpoint times
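One common way to realize piggybacking is an index-based scheme: each message carries the sender's checkpoint index, and a receiver that is behind takes a forced checkpoint before delivering the message. The slides do not name a specific protocol, so the sketch below is an assumed variant and all names in it are hypothetical.

```python
class Process:
    def __init__(self, name):
        self.name = name
        self.ckpt_index = 0
        self.state = []              # toy state: payloads delivered so far
        self.checkpoints = {0: []}   # checkpoint index -> saved copy of state

    def take_checkpoint(self, index=None):
        """Local checkpoint (index is None) or forced checkpoint at a given index."""
        self.ckpt_index = self.ckpt_index + 1 if index is None else index
        self.checkpoints[self.ckpt_index] = list(self.state)

    def send(self, payload):
        # Piggyback the current checkpoint index on every outgoing message.
        return (payload, self.ckpt_index)

    def receive(self, message):
        payload, sender_index = message
        if sender_index > self.ckpt_index:
            # Forced checkpoint BEFORE delivery, so that checkpoints with the
            # same index never record a receipt without the matching send.
            self.take_checkpoint(sender_index)
        self.state.append(payload)
```

The forced checkpoint is triggered by message arrival rather than by a local timer, which is the source of the unpredictable checkpoint times noted above.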

Page 15: Process Migration Checkpoint/Restart

Multi-process Checkpoint

Summary:

                          Uncoordinated   Coordinated   Communication induced
  Domino effect           Possible        No            No
  Management overhead     None            More          Less
  Decision making         Local           Central       Local/central
  Checkpoint data stored  All             Latest only   Several

Page 16: Process Migration Checkpoint/Restart

Virtual Machines

“Any problem in computer science can be solved by another layer of indirection”

Page 17: Process Migration Checkpoint/Restart

What is a Virtual Machine ?

  • An indirection layer below the execution environment seen by applications and OS
  • Decouple architecture and user-perceived behavior of SW and HW resources from their physical implementation
  • Provide a uniform view of the underlying resources
  • Multiplex multiple virtual systems on a single (physical) resource

Page 18: Process Migration Checkpoint/Restart

VM History

  • 1960’s – Hypervisors (mainframes)
      • Time-share expensive hardware
      • No change to legacy software
  • 1980-90’s – Obsolete
      • Proliferation of cheap hardware
      • Hardware support neglected
  • Later 1990’s – Reincarnation
      • For complex MPP lacking OS infrastructure
  • 2000 - Today: Renaissance
      • Consolidation, isolation, reliability

Page 19: Process Migration Checkpoint/Restart

VM Benefits

  • Performance
      • Server consolidation
      • Efficient HW utilization
      • Adaptive resource balancing
      • Checkpoint/restart and migration
  • Security
      • Simple (reduced complexity)
      • Encapsulation and isolation
      • Mediation

Page 20: Process Migration Checkpoint/Restart

VM benefits (contd)

  • Reliability
      • Redundancy through replication
      • Disaster recovery
      • Deployment testing
  • And…
      • Quality of service
      • Transparent (for legacy SW)
      • Enhanced interoperability
      • Development & testing

Page 21: Process Migration Checkpoint/Restart

Server utilization

Cumulative usage of 28 servers:
  • Memory
      • 45% of RAM not used 99.9% of time
      • 25% of RAM never used concurrently
  • CPU
      • 85% of CPU not used 99.9% of time
      • 81% of CPU never used concurrently
  • Disk
      • 68% of storage space never used

Page 22: Process Migration Checkpoint/Restart

Virtualization levels

  • HOST entity: encapsulates the guest
  • GUEST entity: managed by the host

[Figure: layered stack, top to bottom: Application programs, Libraries, Operating system, Hardware. Interface labels: API, ABI, ISA.]

Page 23: Process Migration Checkpoint/Restart

Process & System VM

[Figure: two diagrams. Process virtual machine, with labels Application, Application, VMM, OS, Hardware. System virtual machine, with labels Application, Application, OS, OS, VMM, Hardware.]

Page 24: Process Migration Checkpoint/Restart

VM at different levels

  • HW level
      • VMware, Xen, Denali, Virtual PC, UML
  • OS level
      • Virtual Servers, BSD Jail, Zap
  • Programming language level
      • Java, .NET
  • Network
      • VLAN, VPN

Page 25: Process Migration Checkpoint/Restart

VM Taxonomy

Process VM - virtual platform that exists solely to support the process
  • Unix
  • Emulators (interpreters)
  • Dynamic binary translators
      • Optimize by block translation and caching
  • Java – “compile once run everywhere”
      • Intermediate machine code
      • Optimize by native compilation on-the-fly

Page 26: Process Migration Checkpoint/Restart

VM Taxonomy (contd)

System VM - complete persistent system environment providing access to virtual hardware
  • Classic - bare HW
  • Hosted VM
      • Easy install and maintenance
      • Leverage native services of underlying OS
  • Multiprocessor virtualization

Page 27: Process Migration Checkpoint/Restart

Hardware Virtualization

Challenges to build virtual machines
  • Performance isolation
      • Scheduling priority
      • Memory demand
      • Network traffic
      • Disk access
  • Support for various OS platforms
  • Small performance overhead

Page 28: Process Migration Checkpoint/Restart

Lack of Hardware Support

  • Ring aliasing
  • Non-faulting access to privileged state
      • Does the guest see the right state ?
  • Address space compression
      • Where does the VMM reside ?
  • Impact on transitions
      • Traps, SYSENTER, SYSEXIT
  • Interrupts masking
  • Hidden state

Page 29: Process Migration Checkpoint/Restart

Now What ?

  • Hardware extensions
      • Change semantics to support VM
      • Intel, AMD
  • Software virtualization
      • Translate code to emulate desired behavior
      • VMware
  • Paravirtualization
      • Xen, Denali

Page 30: Process Migration Checkpoint/Restart

Hardware Extensions for VM

  • Root mode
      • Runs VMM
      • Like ring-0 before
  • Non-Root mode
      • Runs guest OS
      • Less privileged
  • Mask of events to trap

Page 31: Process Migration Checkpoint/Restart

VMware

  • Hardware virtualization
      • CPU, memory, I/O
      • Suspend/resume
      • Live migration
  • Design goals:
      • Compatibility
      • Performance
      • Simplicity

Page 32: Process Migration Checkpoint/Restart

VMware: CPU Virtualization

  • CPU virtualization
      • Execute guest on bare hardware while retaining control by the VMM
      • Traps privileged ops & emulates their action
  • Challenge: lack of HW support
      • POPF and read access to privileged state
  • Solution: fast binary translation
      • Only kernel mode code
      • Eliminate unnecessary traps

Page 33: Process Migration Checkpoint/Restart

VMware: Memory Virtualization

  • Memory virtualization
      • Shadow page tables
  • Challenges:
      • Inefficient page replacement
      • Oversized due to replication
  • Solutions:
      • Ballooning
      • Content based sharing

Page 34: Process Migration Checkpoint/Restart

VMware: I/O Virtualization

  • Challenge: wide variety of devices and interfaces
  • Solution:
      • Hosted architecture
      • Trap through the VMM
      • Export special devices

Page 35: Process Migration Checkpoint/Restart

Xen: Paravirtualization

  • Provide some exposure to the underlying hardware
  • Better performance
  • Must modify OS to adapt
  • No modifications to applications

Page 36: Process Migration Checkpoint/Restart

Xen (contd)

  • Downgrade privilege of guest OS
  • Guest registers syscall and page-fault handlers with Xen
  • Partial access to page tables
  • Fast handlers for most exceptions
  • Expose set of simple device abstractions

Page 37: Process Migration Checkpoint/Restart

Xen (contd)

The cost of porting an OS to Xen:
  • Privileged instructions
  • Page table access
  • Network driver
  • Block device driver
  • <2% of code-base

Page 38: Process Migration Checkpoint/Restart

Denali

  • Lightweight protection domains
  • Minimalistic method geared for performance
  • Changes:
      • Idle loops - avoid busy wait
      • Interrupt queueing - save context switch
      • Interrupt semantics – “just”/“recent”
      • No virtual memory (!)
      • No BIOS – no legacy “crap”
      • Generic I/O devices

Page 39: Process Migration Checkpoint/Restart

Virtual Machine Migration

Optimizations:
  • Reduce memory state before snapshot
      • Ballooning
  • Reduce total cost by incremental updates
      • COW hierarchy
  • Reduce start-up time by paging on-demand
  • Reduce transfer time relying on common data
      • Use hash functions to identify common blocks
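The hash-based optimization can be sketched as follows. The function name, the choice of SHA-256, and the representation of blocks as byte strings are assumptions for illustration; the slides do not say which hash or block size the actual systems use.

```python
import hashlib


def blocks_to_send(source_blocks, dest_digests):
    """Build a transfer plan that ships only blocks the destination lacks.

    source_blocks: list of bytes objects (e.g., disk or memory blocks)
    dest_digests:  set of hex digests of blocks already at the destination
    """
    known = set(dest_digests)
    plan = []
    for block in source_blocks:
        digest = hashlib.sha256(block).hexdigest()
        if digest in known:
            plan.append(("ref", digest))   # destination can reuse its own copy
        else:
            plan.append(("data", block))   # must travel over the network
            known.add(digest)              # later duplicates become references
    return plan
```

A reference is far smaller than the block it replaces, so images that share an operating system installation with data already at the destination transfer much less. A careful implementation would also consider hash collisions, which this sketch ignores.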

Page 40: Process Migration Checkpoint/Restart

Virtual Machine Migration

  • Minimizing down time
      • Reduce size of VM state
      • Pre-copy static parts (or..)
      • Demand-copy static parts
      • Hot-copy dynamic parts
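A toy simulation of the pre-copy idea: memory is copied while the guest keeps running, pages dirtied in the meantime are re-sent in later rounds, and the guest is paused only for the small remainder. The function, its parameters, and the stopping rule are assumptions chosen for the example, not a description of any particular hypervisor.

```python
def precopy_migrate(memory, dirtied_per_round, stop_threshold=1):
    """Simulate iterative pre-copy.

    memory:            page number -> contents at the source (mutated in place)
    dirtied_per_round: list of dicts, the writes made during each copy round
    stop_threshold:    pause the guest once this few pages remain dirty
    Returns (destination memory, pages copied while the guest was paused).
    """
    dest = {}
    to_send = set(memory)
    rounds = iter(dirtied_per_round)
    while len(to_send) > stop_threshold:
        for page in to_send:          # copy while the guest is still running
            dest[page] = memory[page]
        writes = next(rounds, {})     # the guest dirtied these pages meanwhile
        memory.update(writes)
        to_send = set(writes)
    downtime_pages = len(to_send)     # guest paused: copy the last dirty pages
    for page in to_send:
        dest[page] = memory[page]
    return dest, downtime_pages


src = {0: "a", 1: "b", 2: "c", 3: "d"}
dest, paused = precopy_migrate(src, [{1: "B", 2: "C"}, {2: "CC"}])
```

In this run four pages are sent in the first round and two in the second, and only one page is copied during downtime. A demand-copy scheme inverts the trade-off: the guest resumes at the destination first and pages are fetched when touched.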

Page 41: Process Migration Checkpoint/Restart

OS Virtualization

  • Confine applications in containers
  • Advantages:
      • Fine granularity
      • Low overhead
      • Easier maintenance
  • Challenges
      • Transparency
      • Correctness
  • Extend OS: modify kernel, loadable module, library

Page 42: Process Migration Checkpoint/Restart

Isolation – BSD Jail

  • Create an isolated existing environment via software means.
  • Uses chroot (private root per jail)
  • Processes in a jail are isolated from files, processes, or network services in other jails.
  • A jail can be restricted to a single IP address.

Page 43: Process Migration Checkpoint/Restart

Specialized Virtualization – Linux VServer

  • Hosting (consolidation)
  • Experimentation
  • Education (do you trust students … ?)
  • Personal security box
  • Manage several “versions”
  • Applications
      • Virtual servers
      • Per user firewall
      • Fail over servers
      • Honey-pots

Page 44: Process Migration Checkpoint/Restart

Specialized Virtualization – Linux VServer

  • Isolation
      • Processes, file system, IPC, network, super user capabilities
  • Kernel patch
      • Add a “context” tag per process/resource
      • syscalls to handle contexts (irreversible)
  • Challenges
      • Capture all holes (indirect access !)
      • Efficient storage

Page 45: Process Migration Checkpoint/Restart

General Virtualization – Zap

  • Virtualization for isolation
      • POD – PrOcess Domain
      • Private namespace
  • Virtualization for migration
      • Decouple process from OS
      • Capture state and reconstruct state

Page 46: Process Migration Checkpoint/Restart

Zap – virtualization

  • Process environment
      • Interpose on system calls
  • File system
      • Rely on “chroot” environment
  • Network
      • Per protocol methods
  • Challenges
      • Race conditions (SMP)
      • Life-span of objects
      • Fast translation
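System-call interposition with a private namespace can be pictured as a translation table between the identifiers a process sees and the ones the host kernel uses. The `Pod` class below is a hypothetical illustration of that idea for process IDs only; it is not Zap's interface, and `real_kill` stands in for the underlying system call.

```python
class Pod:
    """Toy private PID namespace: virtual PIDs stay fixed, real PIDs may change."""

    def __init__(self):
        self._virt_to_real = {}
        self._real_to_virt = {}
        self._next_vpid = 1

    def register(self, real_pid):
        """Admit a host process into the pod and give it a virtual PID."""
        vpid = self._next_vpid
        self._next_vpid += 1
        self._virt_to_real[vpid] = real_pid
        self._real_to_virt[real_pid] = vpid
        return vpid

    def rebind(self, vpid, new_real_pid):
        """After restart on another host, the same virtual PID maps to a new real PID."""
        del self._real_to_virt[self._virt_to_real[vpid]]
        self._virt_to_real[vpid] = new_real_pid
        self._real_to_virt[new_real_pid] = vpid

    def getpid(self, real_pid):
        """Interposed getpid(): return the virtual name, never the real one."""
        return self._real_to_virt[real_pid]

    def kill(self, vpid, sig, real_kill):
        """Interposed kill(): translate the argument, then call the real operation."""
        if vpid not in self._virt_to_real:
            raise ProcessLookupError(vpid)  # PIDs outside the pod are invisible
        return real_kill(self._virt_to_real[vpid], sig)
```

Processes inside the pod only ever see virtual PIDs, so a PID saved in application memory before migration remains valid afterwards. The lookups sit on the system-call path, which is why fast translation appears among the challenges.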

Page 47: Process Migration Checkpoint/Restart

Zap – Migration

  • Checkpoint – outside process context
      • Capture process tree
      • Capture pod state
      • Capture per-process state
  • Restart – inside process context
      • Restore process tree
      • Restore processes
  • Example issues
      • Sharing
      • Deleted files
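The capture and restore of a process tree can be sketched with plain dictionaries. The helper names and the `spawn` callback are hypothetical; in a real system `spawn` would fork a child that then restores its own state from inside its own context.

```python
def capture_tree(root, children, proc_state):
    """Checkpoint side: walk the tree from outside the processes.

    children:   pid -> list of child pids
    proc_state: pid -> dict holding that process's saved state
    """
    return {
        "pid": root,
        "state": dict(proc_state[root]),
        "children": [capture_tree(c, children, proc_state)
                     for c in children.get(root, [])],
    }


def restore_tree(image, spawn, parent=None):
    """Restart side: recreate each parent before its children.

    spawn(parent_pid, state) creates one process and returns its new pid.
    """
    pid = spawn(parent, image["state"])
    for child in image["children"]:
        restore_tree(child, spawn, parent=pid)
    return pid
```

Restoring parents first preserves the parent/child relationships. State shared between processes would additionally need to be saved once and re-linked, which this sketch does not attempt.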