Linux-CR: Transparent Application Checkpoint-Restart in Linux

Oren Laadan, Columbia University (orenl@cs.columbia.edu)
Serge E. Hallyn, IBM

Linux Symposium, July 2010

Application C/R

◆ Application Checkpoint/Restart: a mechanism to save the state of a running application so that it can later resume its execution from that point

What is it good for?

◆ Application roll back to the past
  ◆ recover from faults
  ◆ effective debugging
  ◆ improved response time
  ◆ retry a move in a game

What else is it good for?

◆ Application suspend and resume
  ◆ improved system utilization
  ◆ suspend/resume a user's session

What more is it good for?

◆ Application migration
  ◆ load balancing and resource sharing
  ◆ mobile desktop on a USB key
  ◆ zero-downtime maintenance
  ◆ improved availability

Application vs Virtual-Machine

               Application C/R           Virtual Machine
granularity    specific applications     operating system as a whole unit
saved state    application state only    entire operating system state
overhead       none                      visible overhead

Some History

◆ Linux 2.4
  ◆ EPCKPT (Rutgers)
  ◆ CRAK (Columbia)
◆ Linux 2.6
  ◆ BLCR (Berkeley)
  ◆ OpenVZ (Parallels)
  ◆ Zap (Columbia)

Requirements

◆ Reliable
  ◆ if checkpoint succeeds, restart succeeds
◆ Transparent
  ◆ applications are oblivious to operation
◆ Secure
  ◆ must not introduce vulnerabilities
◆ Mainline
  ◆ aim for inclusion in mainline kernel

Usage Model

◆ Checkpoint granularity
  ◆ a process hierarchy
  ◆ top-down traversal

[Figure: original hierarchy → checkpoint → BLOB → restart → restored hierarchy]

Checkpoint Categories

◆ Container-checkpoint
◆ Subtree-checkpoint
◆ Self-checkpoint

Namespaces

◆ Private and virtual view of resources
  ◆ e.g. pid, mount, ipc, network...
◆ Private view
  ◆ provide isolation from other processes
◆ Virtual view
  ◆ decouple from underlying kernel instance

Container Checkpoint

◆ Hierarchy is self-contained
  ◆ includes all the processes that are referenced within the hierarchy
◆ Hierarchy is isolated
  ◆ resources only referenced by processes that belong to the hierarchy

⇩
◆ Checkpoint is consistent and reliable


Subtree Checkpoint

◆ Arbitrary process hierarchy
  ◆ no constraints on the target hierarchy
  ◆ simplifies admin, but no guarantees
  ◆ suitable for many use-cases


Self-Checkpoint

◆ For a process to save its state
  ◆ record only current process
  ◆ ignore sharing and dependencies
◆ Analogous to fork syscall:

…
/* Like fork(), the call returns twice: once after the checkpoint
   is taken, and once when the saved image is restarted. */
ret = checkpoint(0, fd, flags, -1);
if (ret < 0)
    return ret;                          /* checkpoint failed */
else if (ret)
    printf("checkpoint succeeded\n");
else
    printf("returned from restart\n");
…

System Calls

long checkpoint(pid, fd, flags, logfd)

◆ target hierarchy with root task @pid
◆ output to @fd, log to @logfd

long restart(pid, fd, flags, logfd)

◆ new hierarchy with coordinator @pid
◆ input from @fd, log to @logfd


Example

# Quote the heredoc delimiter so that $$ is expanded when the
# script runs, not when the script file is written.
cat > myscript.sh << 'EOF'
#!/bin/sh
echo $$ > /cgroup/1/tasks
exec 0>&- ; exec 1>&- ; exec 2>&-
/usr/sbin/sshd -p 9999
screen -A -d -m -S mysession somejob.sh
EOF

# Mount the freezer cgroup and create a group for the container
mkdir -p /cgroup
mount -t cgroup -o freezer cgroup /cgroup
mkdir /cgroup/1

# Start the script in new namespaces; its pid is written to pid.out
nohup nsexec -tgcmpiUP pid.out myscript.sh &

# Freeze, checkpoint, kill, then thaw so the kill takes effect
PID=`cat pid.out`
echo FROZEN > /cgroup/1/freezer.state
checkpoint $PID -l clog.out -o image.out
kill -9 $PID
echo THAWED > /cgroup/1/freezer.state

# Restart from the saved image
restart -l rlog.out -i image.out

Architecture

◆ Reliability
◆ Transparency
◆ Kernel vs userspace
◆ Checkpoint image
◆ Shared resources
◆ Leak detection

Architecture: Reliability

◆ How to maintain global consistency?
◆ Requirements:
  ◆ keep tasks frozen
  ◆ keep resources unmodified
◆ Outcome: state is protected
  ◆ from tasks in the hierarchy
  ◆ from tasks outside the hierarchy

Architecture: Transparency

◆ How to maintain transparency?
◆ Requirements:
  ◆ include all resources in use by tasks
  ◆ preserve resource identifiers on restart
◆ Outcome: state visible as before
  ◆ all necessary state is restored
  ◆ state accessible via same identifiers

Kernel vs. Userspace

[Figure: In-Kernel and Userspace approaches compared on Completeness, Transparency, and Extensibility]

Kernel vs. Userspace

◆ The rule: in-kernel implementation
  ◆ transparency, completeness
  ◆ leverage extensive kernel API
◆ The exception: userspace possible
  ◆ if straightforward with existing APIs
  ◆ if it provides significant added value
  ◆ if it occurs before entering the kernel

Checkpoint

(1) Freeze process hierarchy

(2) Save global data

(3) Save process hierarchy

(4) Save state of all tasks

(?) Filesystem snapshot

(5) Thaw/kill process hierarchy

In-Kernel

Restart

(1) Create container

(?) Restore (stage) filesystem

(3) Create process hierarchy

(4) Restore state of all tasks

(5) Resume execution

In-Kernel

Restart: Create Hierarchy

◆ DumpForest
  ◆ convert hierarchy data to instructions
◆ CreateForest
  ◆ execute instructions to re-create tasks
  ◆ proceeds from root task recursively
◆ Curious how it works?
  ◆ see USENIX 2007 paper (Zap)
  ◆ read comments in code
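The two phases can be illustrated with a minimal sketch. It assumes a heavily simplified model in which each saved task records only its pid and its parent's pid, and the input forms a tree. The names `task_rec`, `instr`, `dump_forest`, and `create_forest` are hypothetical and are not taken from the Linux-CR code. The real CreateForest would fork tasks, while this sketch only records the order in which they would be created.

```c
#include <stddef.h>

/* Hypothetical, simplified task record: only pid and parent pid. */
struct task_rec { int pid; int ppid; };

/* One instruction: "creator forks child". */
struct instr { int creator; int child; };

/* Walk the hierarchy top-down from `parent`, emitting one instruction
 * per child so that every task appears after its own creator. */
static size_t emit(const struct task_rec *t, size_t n, int parent,
                   struct instr *out, size_t k)
{
    for (size_t i = 0; i < n; i++) {
        if (t[i].ppid == parent && t[i].pid != parent) {
            out[k].creator = parent;
            out[k].child = t[i].pid;
            k++;
            k = emit(t, n, t[i].pid, out, k);
        }
    }
    return k;
}

/* DumpForest-like step: convert hierarchy data to instructions.
 * `out` must have room for n entries. Returns the instruction count. */
size_t dump_forest(const struct task_rec *t, size_t n, int root,
                   struct instr *out)
{
    return emit(t, n, root, out, 0);
}

/* CreateForest-like step (simulated): task `self` executes the
 * instructions that name it as creator, and each new child recurses.
 * Real code would fork() here; the sketch logs the creation order. */
size_t create_forest(const struct instr *ins, size_t n, int self,
                     int *created, size_t k)
{
    for (size_t i = 0; i < n; i++) {
        if (ins[i].creator == self) {
            created[k++] = ins[i].child;
            k = create_forest(ins, n, ins[i].child, created, k);
        }
    }
    return k;
}
```

Calling `dump_forest` with the root pid and then `create_forest` with the same root yields the non-root tasks in an order where each parent precedes its children.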


Restart Coordination

[Timeline figure, time axis T; rows in time order]

Coordinator   1st task      2nd task      …     Nth task
create tree
wait tasks    created
              sys_restart   created       ...   created
              wait coord    sys_restart   ...   sys_restart
                            wait coord    ...   wait coord
wait done
wake 1st
              restore self
              wake 2nd
              wait coord    restore self
                            wake 3rd      ...
                            wait coord          restore self
                                                wake coord
                                                wait coord
wait done
wake tasks
              resume        resume        ...   resume

Checkpoint/Restart Compared

Checkpoint                Restart
auxiliary process         restore in context
tasks are passive         tasks participate
detect non-restartable    detect non-secure
non-intrusive (errors)    cleanup (errors)

Checkpoint Image

◆ The image is a BLOB
  ◆ internals may change over time
  ◆ conversion to be done in userspace
◆ Designed for streaming
  ◆ for migration, or for image filters: sign, compress, encrypt, convert, etc.

Checkpoint Image

◆ Representation of kernel data
  ◆ already need to inspect on restart
  ◆ compatibility across kernel versions
  ◆ does not save unnecessary fields
  ◆ unified format for 32/64-bit architectures

Checkpoint Image

◆ a sequence of object records
◆ records have header and payload

struct ckpt_hdr {
    __u32 type;
    __u32 len;
};

struct ckpt_hdr_task {
    struct ckpt_hdr h;
    __u32 state;
    ...
};
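A minimal userspace sketch of reading and writing such a stream of header-plus-payload records follows. It reuses the `struct ckpt_hdr` layout from the slide but substitutes `uint32_t` for the kernel's `__u32`. The function names `ckpt_write` and `ckpt_read`, the record type values, and the choice that `len` counts the header as well as the payload are all assumptions made for illustration, not the Linux-CR image format.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

struct ckpt_hdr {
    uint32_t type;
    uint32_t len;   /* assumption: total record length, header included */
};

/* Append one record at offset `off` in `buf` (capacity `cap`).
 * Returns the offset just past the record, or 0 if it does not fit. */
size_t ckpt_write(uint8_t *buf, size_t cap, size_t off,
                  uint32_t type, const void *payload, uint32_t plen)
{
    struct ckpt_hdr h;

    if (plen > UINT32_MAX - sizeof(h))
        return 0;
    h.type = type;
    h.len = (uint32_t)sizeof(h) + plen;
    if (off > cap || cap - off < h.len)
        return 0;
    memcpy(buf + off, &h, sizeof(h));
    if (plen)
        memcpy(buf + off + sizeof(h), payload, plen);
    return off + h.len;
}

/* Read the record header at `off`, validate its length against the
 * stream size, and point *payload at the record body.
 * Returns the offset of the next record, or 0 if malformed or at end. */
size_t ckpt_read(const uint8_t *buf, size_t size, size_t off,
                 struct ckpt_hdr *h, const uint8_t **payload)
{
    if (off > size || size - off < sizeof(*h))
        return 0;
    memcpy(h, buf + off, sizeof(*h));
    if (h->len < sizeof(*h) || h->len > size - off)
        return 0;
    *payload = buf + off + sizeof(*h);
    return off + h->len;
}
```

Because every record carries its own length, a reader can consume the stream front to back without seeking, which fits the streaming goal, and can skip over record types it does not recognize.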

Shared Resources

◆ Resources in use by multiple tasks
  ◆ open files, namespaces, signals, handlers, memory descriptor
  ◆ only checkpoint/restore once each
  ◆ use a hash-table to track instances

Shared Resources

◆ Checkpoint:
  ◆ physical pointer → unique tag
  ◆ save before the "parent" object
  ◆ "parent" object saves only the tag
◆ Restart:
  ◆ unique tag → (new) physical pointer
  ◆ restore before the "parent" object
  ◆ use tag in "parent" to locate instance
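The two mappings can be sketched as follows. This is an illustration under stated assumptions, not the Linux-CR object hash: the names `obj_hash`, `obj_tag`, `obj_table`, `obj_register`, and `obj_lookup` are hypothetical, the table has a fixed size and never resizes, and tags are small dense integers starting at 1. Both structures must be zero-initialized before use.

```c
#include <stdint.h>
#include <stddef.h>

#define OBJ_SLOTS 64   /* power of two; this sketch never resizes */

struct obj_entry { const void *ptr; int tag; };

struct obj_hash {
    struct obj_entry slot[OBJ_SLOTS];
    int next_tag;      /* tags start at 1; tag 0 marks an empty slot */
};

static size_t obj_slot(const void *ptr)
{
    return (size_t)(((uintptr_t)ptr >> 4) * 2654435761u) & (OBJ_SLOTS - 1);
}

/* Checkpoint side: map a pointer to its unique tag. *first is set to 1
 * only the first time the object is seen, which is when its state must
 * be saved; later references store just the tag. Returns -1 if full. */
int obj_tag(struct obj_hash *h, const void *ptr, int *first)
{
    size_t i = obj_slot(ptr);

    for (size_t probes = 0; probes < OBJ_SLOTS; probes++) {
        if (h->slot[i].tag == 0) {
            h->slot[i].ptr = ptr;
            h->slot[i].tag = ++h->next_tag;
            *first = 1;
            return h->slot[i].tag;
        }
        if (h->slot[i].ptr == ptr) {
            *first = 0;
            return h->slot[i].tag;
        }
        i = (i + 1) & (OBJ_SLOTS - 1);   /* linear probing */
    }
    return -1;
}

/* Restart side: tags are dense, so an array maps tag -> new object. */
struct obj_table { void *obj[OBJ_SLOTS + 1]; };

/* Record the newly restored object for `tag`; -1 on bad or reused tag. */
int obj_register(struct obj_table *t, int tag, void *obj)
{
    if (tag < 1 || tag > OBJ_SLOTS || t->obj[tag] != NULL)
        return -1;
    t->obj[tag] = obj;
    return 0;
}

/* Locate the restored instance that a "parent" refers to by tag. */
void *obj_lookup(const struct obj_table *t, int tag)
{
    return (tag >= 1 && tag <= OBJ_SLOTS) ? t->obj[tag] : NULL;
}
```

Saving a shared object before its "parent" means that, on restart, the tag stored in the parent always refers to an instance that has already been registered.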

Shared Resources

[Figure: 1st, 2nd, and 3rd tasks, each with mm and files pointers; two file tables, Files A and Files B (fd 0, fd 1, fd 2, ...); file objects fp1 through fp5]

Checkpoint Image Example

…
hdr_mm [1st]
hdr_fd [fp1]
hdr_fd [fp2]
hdr_fd [fp3]
hdr_files [files A]
hdr_task [1st]
…
hdr_mm [2nd]
hdr_task [2nd]
…
hdr_mm [3rd]
hdr_fd [fp4]
hdr_fd [fp5]
hdr_files [files B]
hdr_task [3rd]
...

[Figure: 1st, 2nd, and 3rd tasks with their mm and files pointers; Files A and Files B; file objects fp1 through fp5]

Leak Detection

◆ Resources in use must not be modified from outside the container
  ◆ collect: count references in hierarchy
  ◆ compare with kernel reference count
  ◆ leverage shared instances repository
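The collect-and-compare idea can be sketched as below. The model is deliberately simplified and the names are hypothetical: `shared_obj.refcount` stands in for the kernel's reference count, `collected` is the count gathered by walking the hierarchy, and the `all` array stands in for the shared instances repository. Real kernel objects may hold additional internal references that a real implementation would have to account for.

```c
#include <stddef.h>

/* Hypothetical shared object (e.g. a files table or memory descriptor). */
struct shared_obj {
    int refcount;    /* stand-in for the kernel reference count */
    int collected;   /* references found inside the hierarchy */
};

/* Hypothetical task: the shared objects it references. */
struct task {
    struct shared_obj **objs;
    size_t nobjs;
};

/* Collect phase: count each reference held by a task in the hierarchy. */
void collect(const struct task *tasks, size_t ntasks)
{
    for (size_t i = 0; i < ntasks; i++)
        for (size_t j = 0; j < tasks[i].nobjs; j++)
            tasks[i].objs[j]->collected++;
}

/* Compare phase: an object whose total count differs from the collected
 * count is also referenced from outside the hierarchy, i.e. it leaks.
 * Returns the number of such objects. */
size_t count_leaks(struct shared_obj *const *all, size_t n)
{
    size_t leaks = 0;

    for (size_t i = 0; i < n; i++)
        if (all[i]->refcount != all[i]->collected)
            leaks++;
    return leaks;
}
```

A nonzero result means some resource could be modified by a task outside the container, so the checkpoint cannot be guaranteed consistent.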

Leak Detection

[Figure: the three tasks with Files A, Files B, and fp1 through fp5, plus an additional task, Task Z, with its own mm and files pointers]


Leak Detection

◆ Resources in use must not be modified from outside the container

◆ collect: count references in hierarchy
◆ compare with kernel reference count
◆ leverage shared instances repository

◆ What about races during collection?


Leak Detection

[Diagram: the three tasks, Files A, Files B, open files fp1 through fp5, and Task Z, as before.]


Leak Detection

[Diagram: as before, but one of the file tables now lists only fd 0 and fd 2, ...; its fd 1 entry is gone.]


Leak Detection

[Diagram: the same file table now lists fd 0, fd 3, fd 2, ...; a new open file, fp6, appears alongside fp1 through fp5.]



Error Handling

◆ When an error occurs:
  ◆ syscall reports a single error value
  ◆ detailed log written to @logfd
  ◆ users can examine the log
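The split between a single error value and a detailed log can be illustrated from user space. In the sketch below, `failing_checkpoint_stub()` is a hypothetical stand-in for a failed checkpoint call: the message text and the choice of `-EBUSY` are made up for illustration, and the real syscall interface is not shown on this slide. Only POSIX `write()` and `read()` are real calls here.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for a checkpoint call that fails: it writes a
 * detailed (made-up) message to logfd and returns one error value. */
static int failing_checkpoint_stub(int logfd)
{
    static const char detail[] = "files table shared outside container\n";
    if (write(logfd, detail, sizeof(detail) - 1) < 0)
        return -EIO;
    return -EBUSY; /* arbitrary error value chosen for the sketch */
}

/* Read whatever detail the failed call left in the log, as a C string.
 * Returns the number of bytes read, or -1 on error. */
static ssize_t read_log(int logfd, char *buf, size_t len)
{
    assert(len > 0);
    ssize_t n = read(logfd, buf, len - 1);
    if (n >= 0)
        buf[n] = '\0';
    return n;
}
```

A caller would pass the write end of a pipe (or an open file) as the log descriptor, check the returned error value, and then call `read_log()` on the read end to see why the operation failed.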


Kernel API - Overview

ckpt_hdr_...():   record handling (e.g. alloc/dealloc)
ckpt_write_...(): write records/data to image
ckpt_read_...():  read records/data from image
ckpt_msg_...():   output to log file (and debug)
ckpt_err_...():   report an error condition
ckpt_obj_...():   manage objects and hash-table


Kernel API – Shared Objects

struct ckpt_obj_ops {
    char *obj_name;
    int obj_type;
    void (*ref_drop)(...);
    int (*ref_grab)(...);
    int (*ref_users)(...);
    int (*checkpoint)(...);
    void (*restart)(...);
};

Register/unregister object handlers:
    register_checkpoint_obj(ops)
    unregister_checkpoint_obj(ops)
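A user-space sketch of how such a handler table might be registered and looked up by object type. The field names and the two function names come from the slide; every parameter list, the return conventions, the fixed-size table, and the `demo_*` handler are assumptions made for illustration, since the slide elides the signatures with "...".

```c
#include <assert.h>
#include <stddef.h>

#define CKPT_OBJ_MAX 8 /* hypothetical number of object types */

/* Field names follow the slide; all parameter lists are assumptions. */
struct ckpt_obj_ops {
    char *obj_name;
    int obj_type;
    void (*ref_drop)(void *obj);
    int (*ref_grab)(void *obj);
    int (*ref_users)(void *obj);
    int (*checkpoint)(void *ctx, void *obj);
    void (*restart)(void *ctx);
};

static const struct ckpt_obj_ops *ckpt_obj_table[CKPT_OBJ_MAX];

/* Install a handler for its object type; -1 if invalid or taken. */
static int register_checkpoint_obj(const struct ckpt_obj_ops *ops)
{
    assert(ops != NULL);
    if (ops->obj_type < 0 || ops->obj_type >= CKPT_OBJ_MAX)
        return -1;
    if (ckpt_obj_table[ops->obj_type])
        return -1;
    ckpt_obj_table[ops->obj_type] = ops;
    return 0;
}

/* Remove a handler, but only if it is the one currently installed. */
static void unregister_checkpoint_obj(const struct ckpt_obj_ops *ops)
{
    assert(ops != NULL);
    if (ops->obj_type >= 0 && ops->obj_type < CKPT_OBJ_MAX &&
        ckpt_obj_table[ops->obj_type] == ops)
        ckpt_obj_table[ops->obj_type] = NULL;
}

/* Hypothetical handler for a reference-counted demo object. */
struct demo_obj { int refs; };
static int demo_grab(void *obj) { return ++((struct demo_obj *)obj)->refs; }
static void demo_drop(void *obj) { ((struct demo_obj *)obj)->refs--; }
static int demo_users(void *obj) { return ((struct demo_obj *)obj)->refs; }

static const struct ckpt_obj_ops demo_ops = {
    .obj_name = "demo",
    .obj_type = 1,
    .ref_drop = demo_drop,
    .ref_grab = demo_grab,
    .ref_users = demo_users,
};
```

Dispatching through the table keeps generic code, such as reference counting for leak detection, independent of the concrete object type: it only calls `ref_users()` on whatever handler is registered for that type.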


Current State

◆ Supported architectures:
  ◆ x86-32, x86-64, s390x, PowerPC, ARM

◆ Features:
  ◆ see up-to-date information at https://ckpt.wiki.kernel.org/index.php/Checklist

◆ experimental integration with LXC


Contributions

◆ Sukadev Bhattiprolu, Serge Hallyn, Dave Hansen, Matt Helsley, Nathan Lynch, Dan Smith, and myself...

◆ Suggestions, ideas, and reviews from many other people … Thank you!


Join the Effort!

◆ Implement more features [kernel]
◆ Checkpoint optimizations [kernel]
◆ Convert between kernel versions [user]
◆ Inspection of checkpoint image [user]
◆ Plug-in architecture for restart [user]
◆ … and more ...


Questions?

◆ More information
  ◆ Web page: http://www.linux-cr.org/
  ◆ Git tree(s): git://www.linux-cr.org/git/
  ◆ Email: [email protected]

Thank You!