Intel’s Vision For Virtualization And Benchmarking

Fernando Martins, Director, Virtualization Strategy and Planning
Tom Adelmeyer, Principal Engineer, Virtualization Performance and Benchmarking

Page 2

Legal Disclaimer

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL® PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. INTEL PRODUCTS ARE NOT INTENDED FOR USE IN MEDICAL, LIFE SAVING, OR LIFE SUSTAINING APPLICATIONS.

Intel may make changes to specifications and product descriptions at any time, without notice.

All products, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.

Intel, processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Intel, the Intel logo, Intel Leap ahead, Intel Leap ahead logo, Intel vPro, Intel vPro logo, Intel VIIV, Intel VIIV logo, Intel Centrino Duo, Intel Centrino Duo logo, Intel Xeon, Intel Xeon Inside logo, Intel Itanium 2 and Intel Itanium 2 Inside logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

*Other names and brands may be claimed as the property of others.

Copyright © 2006 Intel Corporation.

Throughout this presentation:
  VT-x refers to Intel® VT for IA-32 and Intel® 64
  VT-i refers to the Intel® VT for IA-64, and
  VT-d refers to Intel® VT for Directed I/O and its extensions

Page 3

Abstract

The confluence of compelling usage models and robust solutions is driving virtualization to mainstream adoption

New usage models require radically new approaches to performance measurement and capacity planning

This session will describe Intel’s portfolio of virtualization technologies, and through practical examples provide a deep technical dive into the challenging problem of meaningful benchmarking in a virtualized environment

We will discuss Intel’s research in the space and share our latest results, including vConsolidate - Intel’s seed contribution to a vendor-agnostic standard virtualization benchmark currently being developed by SPEC

Page 4

Agenda

  Intel’s Strategy for Virtualization
  Intel® Virtualization Technology Evolution
  Current and Emerging Usage Models
  Usage Model Based Benchmarking

Page 5

Virtualization Is Mainstream

[Chart: IDC Virtualized Server Forecast, 2005 to 2010, vertical axis 0% to 20%. Series: Sep-06 WW Forecast, Feb-07 Forecast Update, Apr-07 Forecast Update.]

41% of new x86 servers purchased in 2007 will be virtualized
- IDC End User Study; Jun-06

Server virtualization is now considered a mainstream technology among IT buyers. IT professionals are bullish on future use: driving 45% server use in 12 months
- IDC Directions 2007; Feb-07

>81% of businesses are using virtualization in production environments
- 451 Group Special Report; Dec-06

Page 6

Intel’s Strategy

Platform of Choice for Virtualization

Broad Ecosystem Support

Remove Adoption Barriers

Page 7

Platform Of Choice For Virtualization

Leadership in HW assists for Virtualization
  CPU virtualization (VT-x and VT-i)
  IO virtualization (VT-d)
  Networking virtualization (IOAT and VMDq)

Better Platform Reliability Features
  Leader in reliability features

Proven Platform Architecture
  40X more Intel servers (source: Q4’05 IDC Server Tracker, 1996-2005 total systems shipped)

More Power/Performance Headroom
  Quad-Core 4-way NICs

Page 8

IA System Virtualization Today

[Diagram: virtual machines run on a Virtual Machine Monitor (VMM), which sits on the logical processors, physical memory, and I/O devices. VMM software techniques shown: binary translation, paravirtualization, page-table shadowing, IO-device emulation, interrupt virtualization, DMA remap.]

IA-based system virtualization today requires frequent VMM software intervention

Page 9

Intel® Virtualization Technology Evolution

VMM software evolution (timeline from Past to Today)
  Past, no hardware support: software-only VMMs, binary translation, paravirtualization
  Simpler and more secure VMM through foundation of virtualizable ISAs
  Increasingly better CPU and I/O virtualization performance and functionality as I/O devices and VMMs exploit infrastructure provided by VT-x, VT-i, VT-d

Vector 1: Processor Focus (VT-x, VT-i)
  Establish foundation for virtualization in the IA-32 and Itanium architectures…
  … followed by on-going evolution of support:
    Micro-architectural (e.g., lower VM switch times)
    Architectural (e.g., Extended Page Tables)

Vector 2: Platform Focus (VT-d)
  Hardware support for I/O virtualization
  Device DMA remapping
  Direct assignment of I/O devices to VMs
  Interrupt routing and remapping

Vector 3: I/O Focus (PCI-SIG, IOAT, VMDq)
  Standards for IO-device sharing:
    Multi-context I/O devices
    Endpoint address translation caching
    Under definition in the PCI-SIG* IOV WG

*Other names and brands may be claimed as the property of others

Page 10

CPU Virtualization With VT-x

New CPU Operating Mode
  VMX Root Operation (for VMM)
  Non-Root Operation (for Guest)
  Eliminates ring deprivileging

New Transitions
  VM entry to guest OS
  VM exit to VMM

VM Control Structure (VMCS)
  Configured by VMM software
  Specifies guest Operating System (OS) state
  Controls when VM exits occur (eliminates over and under exiting)
  Supports on-die CPU state caching

[Diagram: VM0 (WinXP, apps) through VMn (Linux, apps) run with guest OSes at their intended rings (ring 0 and ring 3). The VMM runs in VMX root mode on CPU0 through CPUn, processors with VT-x (or VT-i), provides memory and I/O virtualization, and configures the hardware VM Control Structure (VMCS). VM entry and VM exit arrows connect the guests and the VMM.]

Page 11

Memory Virtualization With EPT

Extended Page Table (EPT)
  A new page-table structure under the control of the VMM
  Maps guest-physical to host-physical addresses (the addresses that access memory)

Performance Benefit
  Guest OS able to freely modify its own page tables
  Eliminates VM exits due to page faults, INVLPG, or CR3 accesses

Memory Savings
  Without EPT, shadow page tables are required for each guest user process
  A single EPT supports an entire VM

[Diagram: VM0 through VMn run with no VM exits above a VMM that handles I/O virtualization; CPU0 with VT-x and EPT contains an EPT walker that uses the Extended Page Table (EPT).]

Page 12

I/O Virtualization With VT-d

Platform implementation for I/O virtualization
  Defines an architecture for DMA remapping
  Implemented as part of core logic chipset
  Will be supported broadly in Intel server and client chipsets

Improves system reliability
  Contains and reports errant DMA to software

Basic infrastructure for I/O virtualization
  Enables direct assignment of I/O devices to unmodified or paravirtualized VMs

[Diagram: VM0 and VMn each contain a guest device driver; the VMM contains a device model per VM and a physical device driver, above memory, storage, and network hardware.]

Page 13

Intel VT-d Features Include

DMA remapping
  Improves reliability and security through device isolation
  Improves I/O performance through direct assignment of devices
  Improves I/O performance of 32-bit devices that would otherwise have to use bounce buffers

Interrupt remapping
  Interrupt isolation: isolate interrupts across VMs
  Interrupt migration: efficiently migrate interrupts across CPUs

Address Translation Services (ATS)
  Support for ATS-capable endpoint devices
  DMA remapping performance improvements

Page 14

Additional Platform Innovations

Intel® I/OAT (NIC + chipset)
  Lower CPU overhead
  Efficient VM migration with data acceleration support

VMDq
  Network performance and CPU utilization reduction
  Efficient end point sharing

Quad Core
  Unrivalled energy-efficient performance

[Diagram: platform layers labeled Network, Chipset, Processor]

Intel’s holistic design approach delivers platforms built to excel in virtualization

Page 15

Putting It All Together…

[Diagram: the system virtualization stack from Page 8 (virtual machines, Virtual Machine Monitor, logical processors, physical memory, I/O devices), with the VMM software techniques (binary translation, paravirtualization, page-table shadowing, IO-device emulation, interrupt virtualization, DMA remap) overlaid by hardware features: VT-x, VT-d, IOAT, VMDq, Quad Core.]

Hardware virtualization mechanisms under VMM control

Page 16

Established Usage Models

Server Consolidation
  [Diagram: OS and App stacks from separate servers HW0 … HWn are consolidated as VM1 … VMn on a VMM on a single HW.]
  Cost savings
    Power and cooling
    Hardware, software, management
    Real estate

Test and Development
  [Diagram: several OS and App stacks run on a VMM on a single HW.]
  Business agility and productivity

Page 17

Emerging Usage Models

Dynamic Load Balancing
  [Diagram: VM1 … VMn (OS and App) distributed across VMMs on HW0 … HWn.]
  High availability and productivity

Disaster Recovery
  [Diagram: VMs (OS and App) running on a VMM on one HW are brought up on a VMM on another HW.]
  Business continuity and operational efficiency

Page 18

Benchmarking In A Virtualized Environment

Traditional benchmarking covers Performance, Power, Scalability

Metrics: Throughput (MB/s), Response time, #users, etc

Micro-architecture focus: cache sizing, frequency, bandwidth, etc.

New technology requires new areas of analysis and metrics

Areas of focus driven by use models, e.g., VM migration time, VM utilization

Need to measure how Intel® Virtualization technology benefits end-users and ISVs

Page 19

Benchmarking Challenges

Virtualization presents unique challenges

Which configurations to focus on
  Homogeneous or heterogeneous OS
  Number of virtual machines
  Configuration of individual VMs (CPU, Memory, NIC, HBA, HDD)

Measuring performance
  Virtual clock accuracy induces platform-dependent error
  Availability of performance monitoring capabilities

Consolidation use case adds additional testing challenges
  Synchronicity: Use automation scripts
  Utilization: Avoid harmonic bottlenecks
  Steady State: Easy, repeatable measurements

Only way to overcome the challenges is to develop the benchmarks
  Tier consolidation using SAP SD
  vConsolidate: a server application consolidation benchmark

Page 20

Tier Consolidation Using SAP-SD

SAP SD (Sales and Distribution)
  OLTP-style benchmark that measures performance of a server running the Enterprise Resource Planning (ERP) solution from SAP AG

Tier Consolidation
  Database and app server run in VMs
  Benefits of 3-Tier (isolation, maintainability), cost of 2-Tier

Benchmark value
  Reuse existing metrics
  New focus area: inter-VM communication

[Diagram, 3-Tier Configuration: Clients, Application Server(s), Database Server, Storage.]

[Diagram, Tier Consolidation: Clients connect to one HW running a VMM with VM1 … VMn, one VM holding OS and DB and another holding OS and AppSvr.]

Page 21

Server Consolidation Benchmark vConsolidate

Description
  Benchmark that represents predominant use case -> server application consolidation
  Application types selected for consolidation guided by market data

vConsolidate provides
  A methodology for measuring performance in a consolidated environment
  A means for fellow travelers to publish virtualization performance proof points
  The ability to analyze performance across VMMs and hardware platforms

Knowledge obtained -> SPEC virtualization workload

Percent of Application Types Consolidated
(Data based on IDC Multiclient Report: Server Virtualization 2005)

  Application Type           Percent
  Business Processing        26.2%
  Database                   28.5%
  Decision Support            9.2%
  Collaborative               8.4%
  Application Development    12.0%
  Web Infrastructure          6.8%
  IT Infrastructure           4.8%
  Technical                   3.5%
  Other                       0.6%

Page 22

vConsolidate Framework

  5 Virtual Machines
  3 Clients: Controller, Mail, and Web

Page 23

Software Stack

  Workload   Benchmark             Application          OS
  Web        Webbench              IIS or Apache        Microsoft* Windows or RedHat Linux
  Java       Modified SPECjbb2005  BEA* JRockit         Microsoft* Windows or RedHat Linux
  Mail       Microsoft* LoadSim    Microsoft* Exchange  Microsoft* Windows
  Database   Sysbench              SQL Server or MySQL  Windows* or Linux*

Web pairs Windows with IIS and RedHat Linux with Apache; Database pairs Windows with SQL Server and Linux with MySQL.

*Other names and brands may be claimed as the property of others

Page 24

vConsolidate CSU

Consolidation Stack Unit (CSU)
  Smallest granule in vCon
  Consists of 5 virtual machines
    Database
    Commercial Mail
    Web Server
    Java Application Server
    Idle
  Each CSU represents a single score
  Final score is the aggregate of the individual CSU scores

[Diagram: 1 CSU shown as a group of Database, Mail, Web, Java, and idle VMs; additional CSUs repeat the same group.]

Page 25

vConsolidate Profiles

A profile defines a CSU configuration

Profile # 1
  Workload             vCPUs  vMemory  OS              App
  Web (Webbench)       1      1.0 GB   Windows 32-bit  IIS
  Mail (Loadsim)       1      1.0 GB   Windows 32-bit  Exchange
  Database (Sysbench)  1      1.0 GB   Windows 32-bit  MS SQL
  Java (SPECjbb)       1      1.7 GB   Windows 32-bit  BEA JVM
  Idle                 1      0.4 GB   Windows 32-bit

Profile # 2
  Workload             vCPUs  vMemory  OS              App
  Web (Webbench)       2      1.5 GB   Windows 32-bit  IIS
  Mail (Loadsim)       1      1.5 GB   Windows 32-bit  Exchange
  Database (Sysbench)  2      1.5 GB   Windows 64-bit  MS SQL
  Java (SPECjbb)       2      2.0 GB   Windows 64-bit  BEA JVM
  Idle                 1      0.4 GB   Windows 32-bit

Profile # 3
  Workload             vCPUs  vMemory  OS              App
  Web (Webbench)       2      1.5 GB   Linux 32-bit    Apache
  Mail (Loadsim)       1      1.5 GB   Windows 32-bit  Exchange
  Database (Sysbench)  2      1.5 GB   Linux 64-bit    MySQL
  Java (SPECjbb)       2      2.0 GB   Linux 64-bit    BEA JVM
  Idle                 1      0.4 GB   Windows 32-bit

Profile # 4
  Workload             vCPUs  vMemory  OS              App
  Web (Webbench)       2      2.0 GB   Windows 32-bit  IIS
  Mail (Loadsim)       2      2.0 GB   Windows 32-bit  Exchange
  Database (Sysbench)  4      2.0 GB   Windows 64-bit  MS SQL
  Java (SPECjbb)       4      2.0 GB   Windows 64-bit  BEA JVM
  Idle                 1      0.4 GB   Windows 32-bit

Page 26

Methodology: Running vConsolidate

Controller application
  Starts the tests via helper scripts
  Runs for 30 minutes
  Stops the test and reports the score

Time is measured by the “Controller Client” external timer

Scoring
  The “Controller” application calculates the final score
  SPECjbb, Sysbench, and Loadsim: transactions/second
  WebBench: throughput

CSU Final Score = GEOMEAN (VM Relative Perf[i])

Page 27

vConsolidate Example Scoring

VM relative scores = Measured/Reference (e.g., WebBench = 3.52)

1 CSU score: GEOMEAN (3.52, 1.04, 1.14, 1.16) = 1.48

Measured Score: SUT

  #CSU  CPU%  Web Raw  Web Rel.  Java Raw  Java Rel.  Database Raw  Database Rel.  Mail Raw  Mail Rel.
  1     65%   1124     3.52      14842     1.04       229           1.14           15.6      1.16

Reference Score

  Web  Java   Database  Mail
  319  14236  201       13.5
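The scoring arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the raw and reference numbers are copied from the slide, while the function and variable names (`relative_scores`, `csu_score`) are illustrative and are not part of vConsolidate itself.

```python
from math import prod

# Raw numbers copied from the example slide.
REFERENCE = {"Web": 319, "Java": 14236, "Database": 201, "Mail": 13.5}
MEASURED = {"Web": 1124, "Java": 14842, "Database": 229, "Mail": 15.6}


def relative_scores(measured, reference):
    """Return measured/reference for each workload VM."""
    return {name: measured[name] / reference[name] for name in reference}


def csu_score(relative):
    """Geometric mean of the per-VM relative scores of one CSU."""
    values = list(relative.values())
    return prod(values) ** (1.0 / len(values))


rel = relative_scores(MEASURED, REFERENCE)
score = csu_score(rel)
print({name: round(value, 2) for name, value in rel.items()})
print(round(score, 2))
```

Rounded to two decimals, this reproduces the relative scores (3.52, 1.04, 1.14, 1.16) and the 1 CSU score of 1.48 shown on the slide. The geometric mean keeps one workload with a large relative gain, here Web, from dominating the score the way it would in an arithmetic mean.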

Page 28

vConsolidate Results

vConsolidate workload with Xen Release 3.0.3* running Red Hat Linux 4 update 4*

System Utilization (lower is better)
  Dual Core AMD     77%
  Dual Core Intel   68%
  Quad Core Intel   33%
  Callout: headroom for more VMs

System Performance (higher is better)
  Dual Core AMD     1.0
  Dual Core Intel   +7%
  Quad Core Intel   +22%
  Callout: 22% more responsive

Administrators can, at minimum, double the number of virtual machines on the server using quad core instead of dual core, with the same amount of power and space

Page 29

What Is Intel Doing On Benchmarking?

Seeding industry with benchmark workloads
  vConsolidate: consolidated stack of business workloads consisting of Server Side Java, Commercial Database, Commercial Mail, Commercial Web Server on 4 VMs

Collaborating with virtualization leaders
  Microsoft and OEMs: consolidation workloads, methodology & metrics
  VMware: VMmark* consolidation stack

Establishing benchmarks with ISV/OSVs
  Contributing to standard benchmarks through SPEC (long term)

New usage models, new benchmarks

Intel is working with the industry to remove barriers to adoption

*Other names and brands may be claimed as the property of others.

Page 30

Summary

Platform of Choice for Virtualization
  Dedicated HW support
  Reliability leadership
  High performance / energy efficient

Broader Ecosystem Support
  VMM vendors, ISVs, OEMs, SIGs, standards

Removing Adoption Barriers
  Education programs / best practices
  New benchmarks

Page 31

FACT: “Works with Windows Vista” Means: Baseline Compatibility

Page 32

Page 33

Intel® Dual-Port Gigabit Ethernet Controller (Zoar)

Dual Port 10/100/1000 x4 PCI Express* Gigabit Ethernet Controller

External Interfaces
  Dual 1000BASE-T, SerDes, and SGMII interfaces
  PCIe ver 1.1 x4

Intel® I/O Acceleration Technology (IOAT2)
  MSI-X
  Low latency interrupt
  Direct cache access
  Header-splitting and replication

Virtualization support (VMDq): 4 TX/RX queues (per port)

I/O Enhancements
  Offloads compatible with IPv4, IPv6 & multiple VLAN tags
  Receive Side Scaling

Manageability
  PXE, iSCSI boot
  RMII, SMBus interfaces

ECC on all memory

25mm x 25mm FCBGA

Schedule
  Sampling now
  Production: Q2’07

[Block diagram: PCI Express (x4, x2, x1) feeding a DMA/host interface; per-port TX and RX FIFOs, GbE MAC, PHY, and SerDes; a management block with RAM and FIFO; external connections: 1000BASE-T and SerDes/SGMII for each port, plus RMII and SMBus.]

Page 34

A More Reliable Server

Unique Intel x86 Reliability Features

[Table comparing Intel Xeon processor based servers with other x86 based servers. Columns: Feature, Benefit, Description.]

  Feature                        Description
  Memory ECC                     Detects & corrects single-bit errors
  Enhanced Memory ECC            Retries double-bit errors, vs. standard memory ECC that handles single-bit errors only
  Memory CRC (FBD)               Address & command transmissions are automatically retried if a transient error occurs, vs. the potential of silent data corruption
  Symmetric Access to all CPUs   Enables a system to restart and operate if the primary processor fails
  Memory Mirroring               Data is written to 2 locations in system memory so that if a DRAM device fails, mirrored memory enables continued operation and data availability
  Memory Sparing                 Predicts a “failing” DIMM & copies the data to a spare memory DIMM, maintaining server availability & uptime

Benefits listed: Data Integrity & Availability (twice), Continued Operation & Availability, Data Availability, Data Protection, Server Continuity

A Better Business Foundation: Less Downtime, Higher Service Availability and Improved Confidence

Enabled by a combination of processor, chipset and platform memory technologies. Data as of March 6, 2006

Page 35

VT-d Overview

Intel Virtualization Technology For Directed I/O

Page 36

Options For I/O Virtualization

Monolithic Model
  Pro: Higher performance
  Pro: I/O device sharing
  Pro: VM migration
  Con: Larger hypervisor
  [Diagram: VM0 … VMn (guest OS and apps) above a hypervisor that contains I/O services and device drivers for shared devices.]

Service VM Model
  Pro: High security
  Pro: I/O device sharing
  Pro: VM migration
  Con: Lower performance
  [Diagram: service VMs contain I/O services and device drivers for shared devices; guest VMs VM0 … VMn (guest OS and apps) run alongside them above the hypervisor.]

Pass-through Model
  Pro: Highest performance
  Pro: Smaller hypervisor
  Pro: Device-assisted sharing
  Con: Migration challenges
  [Diagram: VM0 … VMn (guest OS and apps) each contain device drivers for assigned devices, above the hypervisor.]

VT-d Goal: Support all Models

Page 37

VT-d Overview

VT-d is platform infrastructure for I/O virtualization
  Defines architecture for DMA remapping
  Implemented as part of platform core logic
  Will be supported broadly in Intel server and client chipsets

[Diagram: CPUs connect over the system bus to the North Bridge, which contains VT-d and connects to DRAM, integrated devices, and PCIe* root ports (PCI Express); the South Bridge connects PCI, LPC, legacy devices, …]

Page 38

VT-d Usage

Basic infrastructure for I/O virtualization
  Enable direct assignment of I/O devices to unmodified or paravirtualized VMs

Improves system reliability
  Contain and report errant DMA to software

Enhances security
  Support multiple protection domains under SW control
  Provide foundation for building trusted I/O capabilities

Other usages
  Generic facility for DMA scatter/gather
  Overcome addressability limitations on legacy devices

Page 39

VT-d Architecture Detail

[Diagram: DMA requests (Device ID, Virtual Address, Length) enter the DMA Remapping Engine, which contains a Translation Cache, a Context Cache, and Fault Generation, and which issues the memory access with a system physical address. The engine uses memory-resident partitioning and translation structures: Device Assignment Structures indexed by bus (Bus 0 … Bus N … Bus 255) and by device and function (Dev 0, Func 0 … Dev P, Func 1; Dev P, Func 2 … Dev 31, Func 7), which point to the Address Translation Structures for each device (Device D1, Device D2). The Address Translation Structures are 4KB page tables that lead to a page frame.]

Page 40

VT-d: Remapping Structures

VT-d Device Assignment Entry (128 bits)
  Bits 63:0 fields shown: P, Controls, Rsvd, Page-Table Root Pointer
  Bits 127:64 fields shown: Address Width, Rsvd, Domain ID, Rsvd, Ext. Controls

VT-d Page Table Entry (64 bits, 63:0)
  Fields shown: R, W, SP, Available, Rsvd, Page-Frame / Page-Table Address, Rsvd, Ext. Controls

VT-d hardware selects page-table based on source of DMA request
  Requestor ID (bus / device / function) in request identifies DMA source

VT-d supports hierarchical page tables for address translation
  Page directories and page tables are 4 KB in size
  4 KB base page size with support for larger page sizes
  Support for DMA snoop control through page table entries

Page 41

VT-d: Hardware Page Walk

Requestor ID (16 bits)
  Bits 15:8   Bus
  Bits 7:3    Device
  Bits 2:0    Func

DMA Virtual Address (64 bits)
  Bits 63:48  zero
  Bits 47:39  Level-4 table offset
  Bits 38:30  Level-3 table offset
  Bits 29:21  Level-2 table offset
  Bits 20:12  Level-1 table offset
  Bits 11:0   Page Offset

[Diagram: the Requestor ID indexes the Device Assignment Tables; the example Device Assignment Table Entry specifies a 4-level page table and supplies the base of the Level-4 Page Table. The walk proceeds through the Level-4, Level-3, Level-2, and Level-1 Page Tables to the Page.]
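The bit layouts on this slide can be turned into a small decoding sketch. The field boundaries below follow the slide (Requestor ID: bus 15:8, device 7:3, function 2:0; DMA virtual address: four 9-bit table offsets and a 12-bit page offset). The function names and the choice to reject non-zero upper bits are assumptions made for illustration, not behavior taken from the VT-d specification.

```python
from typing import NamedTuple


class RequestorId(NamedTuple):
    bus: int
    device: int
    function: int


def decode_requestor_id(rid: int) -> RequestorId:
    """Split a 16-bit requestor ID into bus[15:8], device[7:3], function[2:0]."""
    if not 0 <= rid <= 0xFFFF:
        raise ValueError("requestor ID must fit in 16 bits")
    return RequestorId(bus=rid >> 8, device=(rid >> 3) & 0x1F, function=rid & 0x7)


def split_dma_address(addr: int) -> dict:
    """Split a DMA virtual address for a 4-level walk with 4 KB pages."""
    if addr < 0 or addr >> 48:
        # Assumption: a 4-level table covers 48 bits, so bits 63:48 must be zero.
        raise ValueError("bits 63:48 must be zero for a 4-level table")
    return {
        "level4": (addr >> 39) & 0x1FF,  # bits 47:39
        "level3": (addr >> 30) & 0x1FF,  # bits 38:30
        "level2": (addr >> 21) & 0x1FF,  # bits 29:21
        "level1": (addr >> 12) & 0x1FF,  # bits 20:12
        "offset": addr & 0xFFF,          # bits 11:0
    }


print(decode_requestor_id(0xFFFF))
print(split_dma_address((1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5))
```

The first call decodes to bus 255, device 31, function 7, the last entry of the device assignment structures on Page 39. Each 9-bit offset selects one of 512 entries, which matches 4 KB tables holding 8-byte entries.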

Page 42

VT-d: Translation Caching

Architecture supports caching of remapping structures

Context Cache: Caches frequently used device-assignment entries

IOTLB: Caches frequently used translations (results of page walk)

Non-leaf Cache: Caches frequently used page-directory entries

When updating VT-d translation structures, software enforces consistency of these caches

Architecture supports global, domain-selective, and page-range invalidations of these caches

Primary invalidation interface through MMIO registers for synchronous invalidations

Extended invalidation interface for queued invalidations
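The three invalidation granularities named above can be illustrated with a toy software model of an IOTLB. Everything here is hypothetical: the `Iotlb` class, its method names, and the `(domain_id, virtual page number)` key are assumptions chosen to show the idea, and do not describe the hardware's register interface or its actual cache organization.

```python
class Iotlb:
    """Toy translation cache keyed by (domain_id, virtual page number)."""

    def __init__(self):
        self._entries = {}

    def insert(self, domain_id, vpn, pfn):
        """Cache the result of a page walk."""
        self._entries[(domain_id, vpn)] = pfn

    def lookup(self, domain_id, vpn):
        """Return the cached page frame number, or None on a miss."""
        return self._entries.get((domain_id, vpn))

    def invalidate_global(self):
        """Drop every cached translation."""
        self._entries.clear()

    def invalidate_domain(self, domain_id):
        """Drop all translations that belong to one domain."""
        self._entries = {
            key: pfn for key, pfn in self._entries.items() if key[0] != domain_id
        }

    def invalidate_pages(self, domain_id, first_vpn, count):
        """Drop a contiguous range of pages within one domain."""
        for vpn in range(first_vpn, first_vpn + count):
            self._entries.pop((domain_id, vpn), None)


tlb = Iotlb()
tlb.insert(1, 0x10, 0xA0)
tlb.insert(1, 0x11, 0xA1)
tlb.insert(2, 0x10, 0xB0)
tlb.invalidate_pages(1, 0x10, 1)   # software changed one mapping in domain 1
tlb.invalidate_domain(2)           # software tore down domain 2
print(tlb.lookup(1, 0x10), tlb.lookup(1, 0x11), tlb.lookup(2, 0x10))
```

In this model, software that updates a translation structure calls the narrowest invalidation that covers the change, so unrelated cached entries (here domain 1, page 0x11) survive.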

Page 43

VT-d: Extended Features

PCI Express protocol extensions being defined by PCI-SIG for Address Translation Services (ATS)
  Enables scaling of translation caches to devices
  Devices may request translations from root complex and cache them
  Protocol extensions to invalidate translation caches on devices

VT-d extended capabilities
  Support for ATS
    Enables VMM software to control device participation in ATS
    Returns translations for valid ATS translation requests
    Supports ATS invalidations
  Provides capability to isolate, remap and route interrupts to VMs
  Support device-specific demand paging by ATS capable devices

VT-d extended features utilize PCI Express enhancements being pursued within the PCI-SIG

Page 44

Extended Page Tables (EPT)

A VMM must protect host physical memory
  Multiple guest operating systems share the same host physical memory

VMM typically implements protections through “page-table shadowing” in software

Page-table shadowing accounts for a large portion of virtualization overheads

VM exits due to: #PF, INVLPG, MOV CR3

Goal of EPT is to reduce these overheads

Page 45

What Is EPT?

Extended Page Table
  A new page-table structure, under the control of the VMM
  Defines mapping between guest- and host-physical addresses
  EPT base pointer (new VMCS field) points to the EPT page tables
  EPT (optionally) activated on VM entry, deactivated on VM exit

Guest has full control over its own IA-32 page tables
  No VM exits due to guest page faults, INVLPG, or CR3 changes

[Diagram: Guest Linear Address -> Guest IA-32 Page Tables (rooted at CR3) -> Guest Physical Address -> Extended Page Tables (rooted at the EPT Base Pointer, EPTP) -> Host Physical Address]

Page 46

EPT Translation: Details

[Diagram: a guest linear address is walked through the guest page directory and page table, starting from CR3. Each guest-physical address used in the walk is first translated through the EPT tables. The guest-physical page base address plus the page offset gives the guest physical address, which a final pass through the EPT tables maps to the host physical address.]

All guest-physical memory addresses go through EPT tables (CR3, PDE, PTE, etc.)

Above example is for a 2-level table for a 32-bit address space
Translation is possible for other page-table formats (e.g., PAE)
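The 2-level walk described on this slide can be sketched as a small simulation. This is a simplified model under several stated assumptions: the EPT is represented as a flat dictionary from guest page number to host page number (real EPT tables are themselves hierarchical), table entries hold only a page-aligned guest-physical base address with no flag bits, and all names (`ept_translate`, `read_entry`, `translate`) are illustrative.

```python
PAGE = 4096  # 4 KB pages


def ept_translate(ept, gpa):
    """Map a guest-physical address to a host-physical address.

    `ept` is a flat dict {guest page number: host page number}; a missing
    key (KeyError) stands in for an EPT violation in this sketch.
    """
    return ept[gpa // PAGE] * PAGE + gpa % PAGE


def read_entry(host_mem, ept, table_gpa, index):
    """Read a 4-byte table entry; the table base is a guest-physical address."""
    return host_mem[ept_translate(ept, table_gpa + 4 * index)]


def translate(linear, cr3, ept, host_mem):
    """2-level walk of a 32-bit guest linear address with EPT on every step."""
    pde = read_entry(host_mem, ept, cr3, linear >> 22)            # page directory
    pte = read_entry(host_mem, ept, pde, (linear >> 12) & 0x3FF)  # page table
    gpa = pte + (linear & 0xFFF)                                  # guest physical
    return ept_translate(ept, gpa)                                # host physical


# Guest pages 1, 2, 3 live in host pages 10, 20, 30.
ept = {1: 10, 2: 20, 3: 30}
cr3 = 0x1000  # guest-physical base of the page directory
host_mem = {
    10 * PAGE + 4 * 1: 0x2000,  # PDE[1] -> page table at guest-physical 0x2000
    20 * PAGE + 4 * 3: 0x3000,  # PTE[3] -> data page at guest-physical 0x3000
}
print(hex(translate(0x00403ABC, cr3, ept, host_mem)))
```

Three EPT translations occur for a single guest access in this example: one for the CR3-based directory read, one for the page-table read, and one for the final guest-physical address. That is the sense in which all guest-physical addresses go through the EPT tables.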