Virtualization Recap - OS3 · Hardware 5. Interrupt 2. Set interval timer t' Kernel Mode 1....

Preview:

Citation preview

1

VirtualizationVirtualizationRecapRecap

2

Recap 1

What is the user part of an ISA? What is the system part of an ISA? What functionality do they provide?

3

Recap 2

Application Programs

Hardware

Libraries

Operating System

Arrows?What runs in User / Kernel Mode?

4

Recap 3

Difference Process VM and System VM in terms of the interface virtualized?

Classify Java in the taxonomy

5

Recap 4

Sa

S'a

Sb

S'b

e

e'

6

Implementing Virtual Machines Implementing Virtual Machines with the Same ISAwith the Same ISA

7

Same ISA VMs

Emulation needed for different ISA VMs For same ISA: Theoretically, source instructions

can be executed directly on target Fastest Does this work for all

instructions?

Virtual Machine Monitor

Application Programs

Hardware

Libraries

Guest OS

8

Same ISA VMs (cont'd)

No: System ISA instructions need to be controlled When? System VMs with a guest OS How?

Guest OS runs in CPU User Mode System ISA instructions called in User Mode

activate Kernel Mode (trap) VMM in Kernel Mode then emulates system

instruction

9

Example: App Scheduling

Applications

Hardware

Operating System

Run

Interrupts, Traps, faults

Privileged instructions

User Mode

Kernel Mode

10

Example: App Scheduling

Applications

Hardware

Operating System

2. Run app in User Mode

3. Interrupt 1. Set interval timer

Kernel Mode

User Mode

11

Virtual Machine Monitor

Example: App Scheduling

Applications

Operating System

4. Run app in User Mode

Hardware

5. Interrupt 2. Set interval timer t'

Kernel Mode 1. Set interval timer t

3. Run OS in User Mode

User Mode

12

Virtual Machine Monitor

Example: App Scheduling

Applications

Operating System

4. Run app in User Mode

Hardware

5. Interrupt 2. Set interval timer t'

Kernel Mode 1. Set interval timer t

3. Run OS in User Mode

t = requested by OS

t' = granted by VMM for fair scheduling of multiple VMs

User Mode

13

Virtual Machine Monitor

Example: App Scheduling

Applications

Operating System

4. Run app in User Mode

Hardware

5. Interrupt 2. Set interval timer t'

1. Set interval timer t

3. Run OS in User Mode

Guest OS schedules Apps

VMM schedules VMs

14

x86 Same ISA Problems (1/3)

Normal: VM: OS no longer in kernel mode!

Apps

OSOS

OS + Apps

OSVMM

Ring 0: Kernel Mode Ring 3: User Mode

15

x86 Same ISA Problems (2/3)

In x86, not all system instructions in User Mode activate Kernel Mode!

When Guest OS runs in User Mode, not all system instruction calls observed by VMM

Old solution: patch all binary code! Replace these critical instructions with explicit traps

to Kernel Mode

16

x86 Same ISA Problems (3/3)

New solution: Intel VT-x Allows Guest OS to run in Kernel Mode (Ring 0) Shared resources still controlled by VMM Using extra mode: VMX

VMX Root for VMX VMX Non-root for Guest OS

“Ring -1” Also hardware support for

VM context switch

Apps

OS

OS

VMM

17

VM Memory Management

Normal: Guest OS has virtual

memory Guest OS maintains

mapping to physical memory

VM: Guest OS has virtual

memory Guest OS maintains

mapping to real memory

VMM maintains mapping to physical memory

Virtual Real Physical

18

Host OS

Native and Hosted VMs

Guest OS

Apps

VMM

Hardware

Guest OS

Apps

Hardware

VMM

(a) Native VM

(b) Dual-Mode Hosted VM

User Mode

Kernel Mode

19

VMWare Workstation

Install on top of existing host OS Easy to use Can use myriad of device drivers available in host

OS

20

VMWare Architecture

VMApp

Host OS

VMMonitor

X86 Hardware

VMDriver

Applications

Guest OS

22

VMWare I/O

VMApp

Host OS

VMMonitor

X86 Hardware

VMDriver

Applications

Guest OS

DeviceDriver

DeviceDriver

DeviceDriver

Direct support, e.g. IDE

Use host support, e.g. CD, sound, serial port

23

VMWare: New Capabilities

VMApp

Host OS

VMMonitor

X86 Hardware

VMDriver

Applications

Guest OS

DeviceDriver

DeviceDriver

COW

24

Operating System Support Operating System Support for Virtualizationfor Virtualization

25

Native, Hosted, Paravirtualized VMs

Guest OS

Apps

VMM

Hardware

Host OS

Guest OS

Apps

Hardware

VMM

Apps

VMM

Hardware

Guest OS

Modify the Guest OS!

26

Paravirtualization

System VMs can be faster when Guest OS can be modified for virtualization

Showcased in Xen Project Modified

Linux Windows XP

Near native performance!

27

Xen Evolution

Problems: only open-source OSes can be modified Xen implementation tricks not on x86-64

New approach: Start from Full virtualization with Hardware Support

Apply Paravirtualization in areas where speed can be gained:

1. Disk and network I/O2. Interrupts and timers3. Emulated motherboard, legacy boot4. Privileged instructions, page tables

28

Xen Mode: HVM

Source: Lars Kurth, http://wiki.xen.org/wiki/File:XenModes.pnghttp://wiki.xen.org/wiki/Virtualization_Spectrum

30

Xen Mode: HVM + PV Drivers

31

Xen Mode: PVHVM Drivers

32

Xen Mode: PVH

33

KVM

KVM

34

Applications

Guest OS(Modified)

Xen Hypervisor

Xen Architecture

Domain 0Toolstack

Host OS

X86 Hardware

Applications

Guest OS(Modified)

Domain

virtual x86 CPUScheduler MMU Timers

Drivers PV front PV front

35

Applications

Guest OS(Modified)

Xen Hypervisor

Xen Architecture

Domain 0

Toolstack

Host OS

X86 Hardware

Applications

Guest OS(Modified)

Domain

virtual x86 CPUScheduler MMU Timers

Drivers PV front PV front

36

Operating-System Level Virtualization

In between System VM and Process VM Not System VM:

Cannot choose OS Not Process VM:

Multiple processes, not isolated As if multiple instances of the same OS are running

on the same machine Example: Linux Containers

cf. Docker

37

Linux Host OS

Linux Containers

Applications

OS View

X86 Hardware

Namespaces CGroups

Applications

OS View

Applications

OS View

38

Linux Containers: Namespaces

Linux has a configuration Controlled via many files and outside input Idea: allow a configuration per process (group) E.g. for process A the hostname is X, for process B

the hostname is Y cf. chroot Now: configuration is set of 6 namespaces

Source: Rami Rosen, “Linux Kernel Networking”, APress.

http://www.haifux.org/lectures/299/netLec7.pdf

39

6 Namespaces of Linux

uts (hostname) mnt (mount points, filesystems) pid (processes) user (UIDs) ipc (System V IPC) net (network stack)

(plans to add more)

40

UTS Namespace (1/3)

“UNIX time sharing” ?! Contains 6 strings:

sysname Operating system name (e.g., "Linux") nodename Name within "some implementation

-defined network" release OS release (e.g., "2.6.28") version OS version machine Hardware identifier domainname NIS or YP domain name

i.e., control the names of the containerSource: uname(2)

41

UTS Namespaces (2/3)

The old implementation of gethostname():

asmlinkage long sys_gethostname(char __user *name, int len)

{

...

if (copy_to_user(name, system_utsname.nodename, I))

errno = -EFAULT;

...

}

system_utsname is a global variable

42

UTS Namespaces (3/3)The new implementation of gethostname():static inline struct new_utsname *utsname(void)

{

return &current->nsproxy->uts_ns->name;

}

SYSCALL_DEFINE2(gethostname, char __user *, name, int, len)

{

struct new_utsname *u;

...

u = utsname();

if (copy_to_user(name, u->nodename, i))

errno = -EFAULT;

43

MNT Namespace

View of which filesystems are mounted New mounts only visible in current mnt namespace Unless special flags are used:

mount –make-shared

/ (root)

/mnt /home

sdb1 sda3 mnt ns1

/ (root)

/mnt /home

sdc1 mnt ns1

44

PID Namespace

Processes in different PID namespaces can have the same process ID.

When creating the first process in a new namespace, its PID is 1.

Hierarchy of PID namespaces: PIDs visible to parent namespace Nested upto 32 levels deep

45

User namespace

New namespace = new set of UIDs and GIDs Hierarchy Existing UIDs are mapped into new space

E.g. UID 1000 becomes UID 0 in new space First process in the new space has “root”

Only for namespaces inside the new space! Create container:

1. Create new user namespace2. Create new UTS, MNT, PID, etc. namespaces from that

namespace Outside: permissions of parent UID

46

NET Namespace (1/2)

A network namespace is logically another copy of the network stack: own routes, own firewall rules, own network devices.

A network device belongs to exactly one network namespace

A socket belongs to exactly one network namespace

47

NET Namespace (2/2)

The initial network namespace includes: loopback device all physical devices, networking tables, etc.

New network namespace includes only the loopback device Real devices can be moved into NS Virtual devices can be added Control via ip netns command And e.g. /etc/netns/<nsname>/hosts

48

Containers via namespaces

Create a container: 1. Create a user namespace2. Create a PID and UTS namespace inside3. Create a MNT namespace to get your own

filesystem 4. Mount container disk image5. Create NET namespace, add virtual devices6. Connect virtual devices to real network via e.g.

virtual bridges

49

CGroups

Namespaces can give groups of processes: Same view of the OS Illusion there are no other groups

Control Groups is a mechanism for resource management for groups of processes: Set limits, e.g. on memory usage (main + FS cache) Set priorities (CPU or disk bandwidth) Accounting Checkpointing

Recommended