16
Xen Summit at Oracle Feb 24-25, 2009 VM Snapshots Patrick Colp Chris Matthews, Bill Aiello, Andrew Warfield Department of Computer Science University of British Columbia Vancouver, BC, Canada (in collaboration with George S. Coker, II)

XS Oracle 2009 Vm Snapshots

Embed Size (px)

DESCRIPTION

Patrick Colp: VM Snapshots

Citation preview

Page 1: XS Oracle 2009 Vm Snapshots

Xen Summit at Oracle Feb 24-25, 2009

VM Snapshots

Patrick Colp

Chris Matthews, Bill Aiello, Andrew Warfield

Department of Computer ScienceUniversity of British Columbia

Vancouver, BC, Canada

(in collaboration with George S. Coker, II)

Page 2: XS Oracle 2009 Vm Snapshots

Xen Summit at Oracle Feb 24-25, 2009

Problem

● Want a fixed image of a VM at a point in time– e.g. to dynamically inspect a VM from the outside

● VMs don't sit still– memory/state keeps changing

Page 3: XS Oracle 2009 Vm Snapshots

Xen Summit at Oracle Feb 24-25, 2009

Goal

● Capture a consistent view of a VM– Without pausing and still allowing the VM to run

– Create a lightweight, general purpose mechanism

Snapshot

Page 4: XS Oracle 2009 Vm Snapshots

Xen Summit at Oracle Feb 24-25, 2009

Overview

● Copy-on-Write frames to snapshot images– Allows VM to keep running

– Lightweight as only dirtied frames are copied

● General purpose mechanism– Can use CoW to do other things too!

snapshot = vm_snapshot(domid);

Page 5: XS Oracle 2009 Vm Snapshots

Xen Summit at Oracle Feb 24-25, 2009

Degrees of Snapshots

1. Static view of guest memory• VM introspection (XenAccess, LKIM)

• VM checkpointing (Remus)

2. Execution rollback• VM undo• Speculative execution (SpecHint)

3. VM fork• Quickly spawn template VMs (Potemkin, SnowFlock)

Page 6: XS Oracle 2009 Vm Snapshots

Xen Summit at Oracle Feb 24-25, 2009

Implementation

● Save initial state– Registers, shared frames, etc.

● Copy-on-Write VM's memory– Mark memory read-only

– Catch write faults (log dirty)

● What about disks?– Easy: use Parallax for disk snapshotting

– Alternatively can use VHD/QCoW

Page 7: XS Oracle 2009 Vm Snapshots

Xen Summit at Oracle Feb 24-25, 2009

Architecture

Page 8: XS Oracle 2009 Vm Snapshots

Xen Summit at Oracle Feb 24-25, 2009

Architecture

● Library– Provide a nice API for snapshotting

– Integrate with things like XenAccess and Remus

● FUSE driver– Allocate per snapshot buffer space

– Copy buffered frames into backing file

– Fetch frames from target VM as requested

– Exposes snapshot through read() and mmap()

Page 9: XS Oracle 2009 Vm Snapshots

Xen Summit at Oracle Feb 24-25, 2009

Architecture

● Kernel driver– Validate and pin user buffer frames

– Populate ring with MFNs and pass to Xen

● Xen– domctl to take snapshots

– Copy frames into snapshot buffers on write faults

Page 10: XS Oracle 2009 Vm Snapshots

Xen Summit at Oracle Feb 24-25, 2009

Fault Path Flow

Page 11: XS Oracle 2009 Vm Snapshots

Xen Summit at Oracle Feb 24-25, 2009

Accessing CoW Memory

Page 12: XS Oracle 2009 Vm Snapshots

Xen Summit at Oracle Feb 24-25, 2009

Trickier Bits

● Write after read● Catching all writes● Running out of buffer space

Page 13: XS Oracle 2009 Vm Snapshots

Xen Summit at Oracle Feb 24-25, 2009

Write After Read

● Problem:– Check snapshot buffer for frame (not found)

– Read VM's memory directly

– VM writes to frame after buffer was checked but before it was read

● Solution:– After reading from VM's memory directly, re-check

snapshot buffer for frame

Page 14: XS Oracle 2009 Vm Snapshots

Xen Summit at Oracle Feb 24-25, 2009

Catching All Writes

● Problem:– Normal writes, mark dirty called before writes

– Emulated mode/QEMU, mark dirty after writes

– Grant tables, mark dirty called on unmap

● Solution:– Normal writes, save frame when mark dirty is called

– Emulated mode/QEMU, add new hypercall before write

– Grant tables, save frame on map

Page 15: XS Oracle 2009 Vm Snapshots

Xen Summit at Oracle Feb 24-25, 2009

Running Out of Buffer Space

● Problem:– Buffer is limited size and filled in fault context

– Once faulted, it's too late to flush the buffer

● Solution:– Guarantee enough buffer space before faulting

– Threshold with minimum number of frames (~16)

– Pause target domain and flush buffer on return from fault

Page 16: XS Oracle 2009 Vm Snapshots

Xen Summit at Oracle Feb 24-25, 2009

Conclusion

● VM snapshots● General purpose, lightweight mechanism

Coming soon to Xen-devel!

I'd be happy to talk about OCaml XenStore too