SSG System Software Division

Virtualization Technology Overview

Liu, Jinsong ([email protected])


2012/11/28 2

Agenda

• Introduction
    • History
    • Usage model
• Virtualization overview
    • CPU virtualization
    • Memory virtualization
    • I/O virtualization
• Xen/KVM architecture
    • Xen
    • KVM
• Some Intel work for OpenStack
    • OAT


Virtualization history

• 1960s: IBM - CP/CMS on S/360, VM/370, …
• 1970s and 1980s: Silence
• 1998: VMware - SimOS project, Stanford
• 2003: Xen - Xen project, Cambridge
• After that: KVM/Hyper-V/Parallels …


What is Virtualization

• VMM is a layer of abstraction
    • Supports multiple guest OSes
    • De-privileges each OS to run as a guest OS
• VMM is a layer of redirection
    • Redirects the physical platform to virtual platforms (the illusion of many)
    • Provides a virtual platform to each guest OS

[Figure: VM0, VM1, ..., VMn, each running apps on a guest OS, on top of the Virtual Machine Monitor (VMM), which runs on the platform hardware (processors, memory, I/O devices)]


Server Virtualization Usage Model

Server Consolidation
• Benefit: cost savings
    • Consolidate services
    • Power saving

R&D Production
• Benefit: business agility and productivity

Dynamic Load Balancing
• Benefit: productivity

[Figure: four VMs (App 1 to App 4) spread across hosts, with CPU usage shown as 30% and 90%]

Disaster Recovery
• Benefit: reduced losses
    • RAS
    • Live migration
    • Loss relief


Agenda

• Introduction
• Virtualization overview
    • CPU virtualization
    • Memory virtualization
    • I/O virtualization
• Xen/KVM architecture
• Some Intel work for OpenStack


X86 virtualization challenges

• Ring deprivileging
    • Goal: isolate the guest OS from
        • Controlling physical resources directly
        • Modifying VMM code and data
    • Ring deprivileging layout
        • VMM runs at fully privileged ring 0
        • Guest kernel runs at
            • x86-32: deprivileged ring 1
            • x86-64: deprivileged ring 3
        • Guest apps run at ring 3
    • Ring deprivileging problems
        • Unnecessary faulting
            • Some privileged instructions
            • Some exceptions
        • Guest kernel protection (x86-64)
• Virtualization holes
    • 19 instructions
        • SIDT/SGDT/SLDT …
        • PUSHF/POPF …
    • Some userspace holes are hard to fix with a software approach
        • Hard to trap, or
        • Performance overhead
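The PUSHF/POPF hole can be illustrated with a toy model. This is only a sketch: the function names and the reduced EFLAGS handling are assumptions made for illustration, and a real CPU tracks far more state. It contrasts CLI, which faults when deprivileged and so can be trapped, with POPF, which silently ignores the interrupt-flag change.

```python
IF_FLAG = 1 << 9  # interrupt-enable flag (IF) bit in EFLAGS


class PrivilegeFault(Exception):
    """Stands in for the #GP fault that a VMM can intercept and emulate."""


def cli(cpl: int, iopl: int, eflags: int) -> int:
    """CLI faults when CPL > IOPL, so a deprivileged guest kernel traps to the VMM."""
    if cpl > iopl:
        raise PrivilegeFault("CLI executed with CPL > IOPL")
    return eflags & ~IF_FLAG


def popf(cpl: int, iopl: int, eflags: int, popped: int) -> int:
    """POPF does not fault when CPL > IOPL: it silently keeps the old IF bit."""
    if cpl > iopl:
        popped = (popped & ~IF_FLAG) | (eflags & IF_FLAG)
    return popped


# A guest kernel deprivileged to ring 1 (IOPL 0) tries to disable interrupts.
flags = popf(cpl=1, iopl=0, eflags=IF_FLAG, popped=0)
print(bool(flags & IF_FLAG))  # IF is unchanged and the VMM was never notified
```

Because no fault is raised, a trap-and-emulate VMM never gets the chance to update the guest's virtual interrupt flag, which is why such instructions need PV, BT, or hardware assistance.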


X86 virtualization challenges

[Figure: ring layout for VM 0, VM 1, and VM 2. The Virtual Machine Monitor (VMM) runs at Ring0, guest kernels at Ring1, and guest apps at Ring3]


Typical X86 virtualization approaches

• Para-virtualization (PV)
    • Para-virtualization approach, like Xen
    • Modified guest OS is aware of the VMM and cooperates with it
    • Standardization milestone: Linux 3.0
        • VMI vs. PVOPS
        • Bare metal vs. virtual platform
• Binary Translation (BT)
    • Full virtualization approach, like VMware
    • Unmodified guest OS
    • Translates binary code on the fly
        • Translation blocks with caching
        • Usually used for the kernel, ~80% of native performance
        • Userspace apps run natively as much as possible, ~100% of native performance
        • Overall ~95% of native performance
    • Complicated
        • Involves excessive complexity, e.g., self-modifying code
• Hardware-assisted Virtualization (VT)
    • Full virtualization approach assisted by hardware, like KVM
    • Unmodified guest OS
    • Intel VT-x, AMD-V
    • Benefits:
        • Closes virtualization holes in hardware
        • Simplifies VMM software
        • Optimizes for performance
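The translation block caching mentioned for binary translation can be sketched as follows. This is a toy model under stated assumptions: instructions are bare mnemonic strings, `vmm_emulate_*` targets are hypothetical names, and no real translator works on text like this.

```python
# Mnemonics treated as sensitive in this toy model (taken from the holes listed earlier).
SENSITIVE = {"sidt", "sgdt", "sldt", "pushf", "popf"}


def translate_block(block):
    """Rewrite sensitive instructions into calls into the VMM; copy the rest unchanged."""
    return [f"call vmm_emulate_{insn}" if insn in SENSITIVE else insn for insn in block]


class TranslationCache:
    """Translate each guest basic block once and reuse the result afterwards."""

    def __init__(self, guest_code):
        self.guest_code = guest_code  # guest address -> list of mnemonics
        self.cache = {}               # guest address -> translated block
        self.misses = 0

    def fetch(self, addr):
        if addr not in self.cache:
            self.misses += 1
            self.cache[addr] = translate_block(self.guest_code[addr])
        return self.cache[addr]


guest_code = {0x1000: ["mov", "popf", "add", "ret"]}
tc = TranslationCache(guest_code)
tc.fetch(0x1000)   # first execution pays the translation cost
tc.fetch(0x1000)   # later executions hit the cache
print(tc.fetch(0x1000), tc.misses)
```

The cache is what keeps the overhead tolerable for hot kernel paths. Self-modifying code is hard precisely because it invalidates cached blocks, which this sketch does not handle.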


Memory virtualization challenges

• Guest OS has 2 assumptions
    • Expects to own physical memory starting from 0
        • BIOS/legacy OSes are designed to boot from the low 1 MB of the address space
    • Expects to own basically contiguous physical memory
        • OS kernel requires a minimum amount of contiguous low memory
        • DMA requires a certain level of contiguous memory
        • Efficient memory management, e.g., less buddy allocator overhead
        • Efficient TLB use, e.g., super page TLB entries
• MMU virtualization
    • How to keep the physical TLB valid
    • Different approaches involve different complexity and overhead


Memory virtualization challenges

[Figure: the hypervisor maps the guest pseudo-physical memory pages of VM1 to VM4 onto machine physical memory pages]


Memory virtualization approaches

• Direct page table
    • Guest and VMM run in the same linear address space
    • Guest and VMM share the same page table
• Shadow page table
    • Guest page table unmodified
        • gva -> gpa
    • VMM shadow page table
        • gva -> hpa
    • Complexity and memory overhead
• Extended page table
    • Guest page table unmodified
        • gva -> gpa
        • Guest has full control of CR3 and page faults
    • VMM extended page table
        • gpa -> hpa
        • Hardware based
        • Good scalability for SMP
        • Low memory overhead
        • Greatly reduces page-fault VM exits
• Flexible choices
    • Para-virtualization
        • Direct page table
        • Shadow page table
    • Full virtualization
        • Shadow page table
        • Extended page table

[Figure: the guest page table maps GVA (guest virtual address) to GPA (guest physical address) and the extended page table maps GPA to HPA (host physical address); the shadow page table and the direct page table map GVA to HPA]


Shadow page table

• Guest page table remains unmodified from the guest's point of view
    • Translates gva -> gpa
• Hypervisor creates a new page table for the physical CPU
    • Uses hpa in PDE/PTE
    • Translates gva -> hpa
    • Invisible to the guest

[Figure: the virtual page table (vCR3 -> page directory PDE -> page table PTE) next to the physical shadow page table (pCR3 -> page directory PDE -> page table PTE)]
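A minimal sketch of how a shadow page table relates to the guest page table, assuming flat single-level tables keyed by page number (real x86 tables are multi-level) and a hypothetical `p2m` map from guest physical pages to host physical pages maintained by the hypervisor:

```python
PAGE_SHIFT = 12
PAGE_MASK = (1 << PAGE_SHIFT) - 1


def build_shadow(guest_pt, p2m):
    """Compose the guest table (gva page -> gpa page) with p2m (gpa page -> hpa page)."""
    return {vpn: p2m[gpn] for vpn, gpn in guest_pt.items() if gpn in p2m}


def translate(shadow, gva):
    """What the physical MMU does with the shadow table: gva -> hpa in one lookup."""
    vpn, offset = gva >> PAGE_SHIFT, gva & PAGE_MASK
    if vpn not in shadow:
        raise KeyError(f"page fault at gva {gva:#x}")
    return (shadow[vpn] << PAGE_SHIFT) | offset


guest_pt = {0x10: 0x2, 0x11: 0x3}   # maintained by the guest, holds gpa pages
p2m = {0x2: 0x80, 0x3: 0x41}        # maintained by the hypervisor
shadow = build_shadow(guest_pt, p2m)
print(hex(translate(shadow, 0x10ABC)))
```

The cost hinted at on the slide comes from keeping `shadow` in sync: every guest page table write has to be tracked so the composed table can be rebuilt or patched, and each guest address space needs its own shadow copy.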


Extended page table

• Guest has full control over its page tables and events
    • CR3, INVLPG, page fault
• VMM controls the extended page tables
    • Complicated shadow page table is eliminated
    • Improved scalability for SMP guests

[Figure: a guest linear address is translated by the guest page tables (rooted at guest CR3) into a guest physical address, which the extended page tables (rooted at the EPT base pointer) translate into a host physical address]
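The two-stage translation can be sketched as below, assuming flat tables keyed by page number and hypothetical exception names. On real hardware the guest page table entries themselves are also fetched through the extended page tables, which this sketch leaves out.

```python
PAGE_SHIFT = 12
PAGE_MASK = (1 << PAGE_SHIFT) - 1


class GuestPageFault(Exception):
    """Handled by the guest OS itself; no VM exit is needed."""


class EPTViolation(Exception):
    """Causes a VM exit so the VMM can fix up the gpa -> hpa mapping."""


def walk(guest_pt, ept, gva):
    """Stage 1: gva -> gpa via the guest table. Stage 2: gpa -> hpa via the EPT."""
    vpn, offset = gva >> PAGE_SHIFT, gva & PAGE_MASK
    if vpn not in guest_pt:
        raise GuestPageFault(hex(gva))
    gpn = guest_pt[vpn]
    if gpn not in ept:
        raise EPTViolation(hex(gpn << PAGE_SHIFT))
    return (ept[gpn] << PAGE_SHIFT) | offset


guest_pt = {0x10: 0x2}   # owned and edited freely by the guest
ept = {0x2: 0x80}        # owned by the VMM
print(hex(walk(guest_pt, ept, 0x10ABC)))
```

Since the guest edits `guest_pt` without VMM involvement, ordinary guest page faults stay inside the guest, which is where the reduction in VM exits comes from.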


I/O virtualization requirements

• I/O device from the OS point of view
    • Resource configuration and probing
    • I/O requests: port I/O, MMIO
    • I/O data: DMA
    • Interrupts
• I/O virtualization requires presenting the guest OS driver with a complete device interface
    • Presenting an existing interface
        • Software emulation
        • Direct assignment
    • Presenting a brand new interface
        • Paravirtualization

[Figure: the CPU and the device interact through register access, DMA to shared memory, and interrupts]


I/O virtualization approaches

• Emulated I/O
    • Software emulates a real hardware device
    • VMs run the same driver for the emulated hardware device
    • Good legacy software compatibility
    • Emulation overhead limits performance
• Paravirtualized I/O
    • Uses abstract interfaces and stack for I/O services
    • FE (front-end) driver: the guest runs virtualization-aware drivers
    • BE (back-end) driver: driver based on a simplified I/O interface and stack
    • Better performance than emulated I/O
• Direct I/O
    • Directly assigns a device to the guest
        • Guest accesses the I/O device directly
        • High performance and low CPU utilization
    • DMA issue
        • Guest programs guest physical addresses
        • DMA hardware only accepts host physical addresses
    • Solution: DMA remapping (a.k.a. IOMMU)
        • An I/O page table is introduced
        • DMA engine translates addresses according to the I/O page table
    • Some limitations under live migration
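The DMA remapping idea can be sketched as a per-device I/O page table lookup. The class and method names here are hypothetical and the tables are flat. Real IOMMUs use multi-level tables indexed by the device's bus/device/function identifier.

```python
PAGE_SHIFT = 12
PAGE_MASK = (1 << PAGE_SHIFT) - 1


class Iommu:
    """Toy DMA remapping unit: one I/O page table (gpa page -> hpa page) per device."""

    def __init__(self):
        self.io_page_tables = {}

    def assign(self, device, io_page_table):
        """Called by the VMM when a device is directly assigned to a guest."""
        self.io_page_tables[device] = dict(io_page_table)

    def translate(self, device, gpa):
        """Translate the guest physical address a device was programmed with."""
        table = self.io_page_tables.get(device)
        gpn, offset = gpa >> PAGE_SHIFT, gpa & PAGE_MASK
        if table is None or gpn not in table:
            raise PermissionError(f"DMA from {device} to gpa {gpa:#x} blocked")
        return (table[gpn] << PAGE_SHIFT) | offset


iommu = Iommu()
iommu.assign("nic0", {0x2: 0x80})           # guest page 0x2 lives in host page 0x80
print(hex(iommu.translate("nic0", 0x2010)))  # guest driver wrote gpa 0x2010 into the NIC
```

Besides fixing the address mismatch, the lookup also confines the device to the memory of the guest it is assigned to, since any unmapped address is rejected.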


Virtual platform models

[Figure: three virtual platform models side by side: Hypervisor Model, Host-based Model, and Hybrid Model. Boxes shown: Hypervisor, U-Hypervisor, Host OS, Service VM, Preferred OS with apps, ULM, LKM, Guest OS with guest apps]

Legend:
• P: Processor management code
• M: Memory management code
• DR: Device driver
• DM: Device model
• N: NoDMA


Agenda

• Introduction
• Virtualization
• Xen/KVM architecture
• Some Intel work for OpenStack


Xen Architecture

[Figure: Xen architecture]
• Xen Hypervisor: control interface, hypercalls, event channel, scheduler, inter-domain event channels; processor, memory, and I/O (PIT, APIC, PIC, IOAPIC)
• Domain 0 (XenLinux64): control panel (xm/xend), device models, native device drivers, back-end virtual drivers
• Domain U (XenLinux64): front-end virtual drivers, callback / hypercall
• HVM domains (32-bit and 64-bit): unmodified OS with FE drivers and guest BIOS on a virtual platform, reaching the hypervisor through VM exits
• Privilege labels shown in the figure: 0P, 1/3P, 3P, 0D, 3D


KVM Architecture

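[Figure: KVM architecture. Windows and Linux guests run in non-root mode on virtual platforms (vCPU, vMEM, vTimer, vPIC, vAPIC, vIOAPIC), each with a VMCS; Qemu-kvm and the Linux kernel with the KVM module run in root mode]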


Agenda

• Introduction
• Virtualization
• Xen/KVM architecture
• Some Intel work for OpenStack


Trusted Pools - Implementation

[Figure: trusted pools implementation with OpenStack and OAT]
• User specifies: Mem > 2G, Disk > 50G, GPGPU=Intel, trusted_host=trusted
• OpenStack: EC2 API, OS API, Scheduler with TrustedFilter, Create VM
• Attestation Service / Attestation Server (OAT-based): Query API, Host Agent API, Whitelist API, Privacy CA, Appraiser, Whitelist DB
• Host (tboot-enabled): HW/TXT, Hypervisor / tboot, OSes with apps, host agent
• Flows: Create, Attest, Report, Query trusted/untrusted
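The scheduling flow in the figure can be sketched as a host filter that combines resource checks with an attestation lookup. This is not the OpenStack or OAT API: `attestation_query`, `trusted_filter`, the host dictionaries, and the `reports` table are hypothetical stand-ins for the Query API and the TrustedFilter.

```python
def attestation_query(host_name, reports):
    """Stand-in for the attestation service Query API: unknown hosts are untrusted."""
    return reports.get(host_name, "untrusted")


def trusted_filter(hosts, reports, min_mem_gb, min_disk_gb, trusted_host="trusted"):
    """Keep hosts that satisfy the resource request and the requested trust level."""
    return [
        host["name"]
        for host in hosts
        if host["mem_gb"] > min_mem_gb
        and host["disk_gb"] > min_disk_gb
        and attestation_query(host["name"], reports) == trusted_host
    ]


hosts = [
    {"name": "node1", "mem_gb": 8, "disk_gb": 200},
    {"name": "node2", "mem_gb": 16, "disk_gb": 500},
    {"name": "node3", "mem_gb": 1, "disk_gb": 500},
]
reports = {"node1": "trusted", "node2": "untrusted", "node3": "trusted"}

# Request from the slide: Mem > 2G, Disk > 50G, trusted_host=trusted
print(trusted_filter(hosts, reports, min_mem_gb=2, min_disk_gb=50))
```

Treating hosts with no attestation report as untrusted is a design choice of this sketch, so a host that never attested cannot end up in the trusted pool by default.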