1
A Comprehensive Implementation and Evaluation of Direct Interrupt Delivery
Cheng-Chun Tu, Michael Ferdman, Chao-tung Lee, Tzi-cker Chiueh
Oracle Labs, Stony Brook University, and Industrial Technology Research Institute
2
Motivation
I/O-intensive workloads incur high overhead in VMs
High context-switch overhead stems from:
- Setting up the device's data path
- Copying payload to/from the system
- Interrupt handling
Hardware solutions (SR-IOV) and software solutions (paravirtualization) optimize the first two
Reducing the interrupt overhead is the last battleground
3
Hypervisor Context Switch
Under intensive I/O workload:
- More than 100K VM exits per second, each ~2us
- Only 80% of CPU time spent in guest mode
Exit handling cost:
- Guest/host context switch (exits and entries)
- Arrival of the external interrupt
[Figure: timeline of guest (VM) and host execution, showing the SR-IOV interrupt, the PV interrupt or timer interrupt, interrupt injection, and End-of-Interrupt (EOI).]
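The two figures on this slide are consistent with each other, which a short calculation makes visible. The sketch below is only back-of-the-envelope arithmetic; the function name is illustrative, and it assumes the ~2us per-exit cost is the only time lost to the host.

```python
def guest_time_fraction(exits_per_second: float, exit_cost_us: float) -> float:
    """Fraction of each second left for guest mode after paying for VM exits."""
    overhead = exits_per_second * exit_cost_us * 1e-6  # seconds lost per second
    return max(0.0, 1.0 - overhead)


# Figures from the slide: 100K exits per second at roughly 2 us each.
fraction = guest_time_fraction(100_000, 2.0)
print(f"time in guest: {fraction:.0%}")  # prints: time in guest: 80%
```

At 100K exits per second, 2us per exit adds up to 0.2 seconds of every second, which matches the 80% guest-mode figure.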
4
Direct Interrupt Delivery
Definition: An interrupt destined for a VM goes directly to the VM without any additional software intervention.
Challenges:
- Mis-delivery problem: delivering an interrupt to an unintended VM
  - Routing: which core is the VM running on?
  - Scheduling: is the VM currently de-scheduled or not?
- Directly signaling completion of the interrupt to the controller
[Figure: a VM with its three interrupt sources: virtual device, local APIC timer, and SRIOV device.]
5
DID: Requirements
If the target VM is running:
- All interrupts are delivered directly
- Interrupts from: SRIOV devices, emulated devices, timers, etc.
If the target VM is not running:
- Interrupts are delivered indirectly through a virtual interrupt
Interrupt delivery and completion do not incur a context switch
No VM exit
No paravirtualization / VM modification
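The first two requirements amount to a single decision on whether the target VM currently occupies a core. A minimal sketch of that rule follows; the `VM` and `Path` types are hypothetical names introduced for illustration and do not come from the DID implementation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Path(Enum):
    DIRECT = "direct"              # reaches the guest handler with no VM exit
    VIRTUAL_INTERRUPT = "virtual"  # hypervisor injects it when the VM next runs


@dataclass
class VM:
    name: str
    running_on_core: Optional[int]  # None while the VM is de-scheduled


def delivery_path(vm: VM) -> Path:
    """Pick the delivery path required for an interrupt destined for `vm`."""
    if vm.running_on_core is not None:
        return Path.DIRECT
    return Path.VIRTUAL_INTERRUPT


running = VM("M", running_on_core=3)
descheduled = VM("N", running_on_core=None)
print(delivery_path(running).value, delivery_path(descheduled).value)
```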
6
Related Work
ELI: Bare-Metal Performance for I/O Virtualization
- Constructs a shadow IDT to force a VM exit when an interrupt is mis-delivered
- SRIOV interrupts are delivered directly; others are delivered indirectly
ELVIS: Efficient and Scalable Paravirtual I/O System
- Paravirtualizes the guest to support direct interrupts
- DID sends the IPI directly without requiring guest modification
Fully dedicated hardware solutions
- E.g., NoHype, Jailhouse: require heavy guest modification
Hardware solutions:
- Intel APICv provides APIC virtualization without trapping
- Others include AMD's AVIC and ARM VGIC
- DID does not depend on such hardware support
7
Contributions
Direct Interrupt Delivery:
- Leverages existing HW features: interrupt remapping table (IOMMU), inter-processor interrupt (IPI), x2APIC support
- Implements direct HW/emulated device interrupt delivery, direct timer delivery, and direct interrupt completion (EOI)
- First system to support direct delivery of all interrupt types
Performance:
- Reduces interrupt invocation latency by 80% (14us -> 2.9us)
- Improves TCP throughput by 21%
- Improves Memcached RPS (requests per second) by 3x
8
DESIGN / IMPLEMENTATION
Direct Delivery of SRIOV, PV, Timer, and direct EOI
9
Sources of Interrupts
- Virtual device: the I/O thread triggers a VM exit by sending an IPI
- SRIOV / local timer: the VM core receiving the external interrupt triggers a VM exit
[Figure: hypervisor with back-end drivers and virtual devices, an SRIOV device, and VM cores; the interrupt sources are the virtual device, the local APIC timer, and the SRIOV device.]
10
Direct SRIOV Interrupt
Assume VM M and SRIOV virtual function F.
If M is running on core C:
- Program the IOMMU to direct F's interrupts to C
- Disable the VMCS External-Interrupt Exiting (EIE) bit
- An interrupt from F then invokes the handler in M's IDT (Interrupt Descriptor Table)
[Figure: case 1, VM M is running. Shows the SRIOV VF (F), the IOMMU, core C, and VM (M).]
VMCS: Virtual Machine Control Structure
11
Direct SRIOV Interrupt
If M is de-scheduled:
- Deliver the interrupt to the hypervisor
- The hypervisor creates a virtual interrupt when the VM is scheduled
Challenges:
- How to force a VM exit? NMI
- How to inject the virtual interrupt? Self-IPI (not the emulated LAPIC)
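[Figure: case 2, an interrupt for VM M arrives from the SRIOV VF (F) while M is de-scheduled and VM N runs on core C. The interrupt arrives through the IOMMU as an NMI: (1) VM exit, (2) KVM receives the interrupt, (3) KVM injects the virtual interrupt.]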
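The two SRIOV cases can be summarized as a small state machine over one interrupt remapping entry. The following is a toy model only, not KVM or IOMMU driver code: `ToyHost`, `RemapEntry`, and the method names are hypothetical, the core and vector numbers are arbitrary, and switching the entry to an NMI delivery mode is an assumption about how the NMI is produced (the slides only state that an NMI forces the exit).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RemapEntry:
    core: int    # destination core
    mode: str    # "fixed": vectored interrupt; "nmi": forces a VM exit
    vector: int  # guest vector assigned to the VF


class ToyHost:
    """Toy model of the running and de-scheduled cases for one VF."""

    def __init__(self) -> None:
        self.remap: Dict[str, RemapEntry] = {}   # VF name -> remapping entry
        self.pending: Dict[str, List[int]] = {}  # VF name -> vectors held by the hypervisor

    def schedule_in(self, vf: str, core: int, vector: int) -> List[int]:
        """The VM starts running on `core`: route the VF straight to it.

        Returns the vectors that arrived while the VM was de-scheduled and
        that the hypervisor now has to replay (via self-IPI in the slides).
        """
        self.remap[vf] = RemapEntry(core, "fixed", vector)
        return self.pending.pop(vf, [])

    def schedule_out(self, vf: str) -> None:
        """The VM leaves its core: the VF's interrupt must reach the hypervisor."""
        self.remap[vf].mode = "nmi"

    def raise_interrupt(self, vf: str) -> str:
        entry = self.remap[vf]
        if entry.mode == "fixed":
            return "direct"      # guest IDT handler runs, no VM exit
        self.pending.setdefault(vf, []).append(entry.vector)
        return "hypervisor"      # NMI forced an exit; the interrupt is queued


host = ToyHost()
host.schedule_in("F", core=2, vector=0x41)
first = host.raise_interrupt("F")    # M is running on core 2
host.schedule_out("F")
second = host.raise_interrupt("F")   # M is de-scheduled
replayed = host.schedule_in("F", core=2, vector=0x41)
print(first, second, replayed)
```

The point of the model is that the routing decision is made once per scheduling event rather than once per interrupt, so the running case needs no software on the delivery path.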
12
Virtual Device Interrupt
Assume VM M has a virtual device with vector #v.
DID: the virtual device thread (back-end driver) issues an IPI with vector #v to the CPU core running the VM.
The device's handler in the VM gets invoked directly.
Traditional: the I/O thread sends a pre-defined IPI (vector x) that kicks the VM out of guest mode (VM exit); the hypervisor then injects virtual interrupt v.
DID: the I/O thread sends the IPI directly with vector v.
[Figure: in both cases, an I/O thread on one core sends an IPI to the core running the VM; the device vector is v.]
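The difference between the two paths is easiest to see by listing the steps and counting VM exits. This sketch only models the delivery step as described on the slide; `KICK_VECTOR` is a hypothetical value standing in for the pre-defined IPI vector, and the step strings are illustrative.

```python
from typing import List, Tuple

KICK_VECTOR = 0xF0  # hypothetical pre-defined IPI vector used only to force an exit


def traditional_path(device_vector: int) -> Tuple[List[str], int]:
    """Steps for the traditional path and the number of VM exits it takes."""
    steps = [
        f"I/O thread sends IPI 0x{KICK_VECTOR:x} to the VM's core",
        "VM exit",
        f"hypervisor injects virtual interrupt 0x{device_vector:x}",
        "VM entry, guest handler runs",
    ]
    return steps, steps.count("VM exit")


def did_path(device_vector: int) -> Tuple[List[str], int]:
    """Steps for the DID path: the IPI carries the device vector itself."""
    steps = [
        f"I/O thread sends IPI 0x{device_vector:x} to the VM's core",
        "guest handler runs",
    ]
    return steps, steps.count("VM exit")


for name, path in (("traditional", traditional_path), ("DID", did_path)):
    steps, exits = path(0x41)
    print(f"{name}: {exits} VM exit(s), {len(steps)} steps")
```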
13
Direct Timer Interrupt / EOI
Direct End of Interrupt:
- A software-emulated virtual interrupt requires trapping to the hypervisor, which updates the related data structures
- DID uses the HW LAPIC without trapping to the host, enabling direct EOI
Local APIC timer:
- The x86 timer is located in the per-core local APIC registers
- KVM virtualizes the LAPIC timer for the VM. Drawback: high latency due to several VM exits per timer operation
- DID: the timer interrupt is delivered directly to the VM
[Figure: a LAPIC timer expiration on CPU1 reaching VM (t); the emulated path goes through a VM exit and an injection.]
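In x2APIC mode the EOI and timer registers are accessed as MSRs, so whether a guest write traps depends on how MSR writes are intercepted. The slides do not say which mechanism DID uses to expose the hardware LAPIC; the sketch below assumes the Intel VT-x MSR bitmap (a 4KB page in which a set bit means the access causes a VM exit) and only manipulates an in-memory copy of such a bitmap. The helper names are hypothetical.

```python
X2APIC_MSR_BASE = 0x800
WRITE_LOW_OFFSET = 2048  # byte offset of the write bitmap for MSRs 0x0-0x1FFF


def x2apic_msr(mmio_offset: int) -> int:
    """x2APIC MSR index for a LAPIC register, given its legacy MMIO offset."""
    return X2APIC_MSR_BASE + (mmio_offset >> 4)


EOI_MSR = x2apic_msr(0x0B0)    # 0x80B, End-of-Interrupt
TMICT_MSR = x2apic_msr(0x380)  # 0x838, timer initial count


def allow_guest_write(bitmap: bytearray, msr: int) -> None:
    """Clear the write-intercept bit so a guest WRMSR to `msr` does not exit."""
    if not 0 <= msr <= 0x1FFF:
        raise ValueError("only low MSRs are handled in this sketch")
    bitmap[WRITE_LOW_OFFSET + msr // 8] &= ~(1 << (msr % 8)) & 0xFF


def write_traps(bitmap: bytearray, msr: int) -> bool:
    """True if a guest write to `msr` would cause a VM exit."""
    return bool(bitmap[WRITE_LOW_OFFSET + msr // 8] & (1 << (msr % 8)))


bitmap = bytearray(b"\xff" * 4096)  # start with every MSR access intercepted
allow_guest_write(bitmap, EOI_MSR)
allow_guest_write(bitmap, TMICT_MSR)
print(write_traps(bitmap, EOI_MSR), write_traps(bitmap, TMICT_MSR))
```

Passing these registers through is only safe if the guest cannot use them to affect other VMs, which is part of what the mis-delivery handling in the design addresses.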
14
Performance Evaluation
Experimental Setup
- Two servers connected back-to-back
- Each VM pinned to one CPU, with 1GB RAM, 1 VF, and 1 virtio-net device with vhost backend
Compared application performance running on:
- Bare metal without a hypervisor
- Vanilla Linux KVM
- KVM with DID support
Performance Metrics
- Time in Guest (TIG): % of time spent in guest mode
- VM exit rate and breakdown of exit reasons
Server: Supermicro E3 tower, 8-core Intel Xeon 3.4GHz, 8GB memory
SRIOV NIC: Intel 82599
OS/hypervisor: Linux 3.6, QEMU 1.0
15
VM Exit Breakdown
Setup: a client sends UDP packets through the SRIOV NIC to the VM
- KVM-TIG: TIG value of vanilla KVM
- DID-TIG: TIG value of DID
DID keeps most of the cycles in guest mode under intensive I/O!
KVM:
- MSR writes due to EOI dominate
- External interrupts dominate
DID:
- No exits due to interrupts
- VM exit rate < 1K
- TIG > 99%
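An exit breakdown like the one on this slide is a per-reason share of all recorded VM exits. The sketch below shows the computation on a made-up trace; the reason labels and counts are purely illustrative and are not the measured data behind the slide.

```python
from collections import Counter
from typing import Dict, Iterable


def exit_breakdown(reasons: Iterable[str]) -> Dict[str, float]:
    """Share of VM exits per exit reason, most frequent first."""
    counts = Counter(reasons)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {reason: count / total for reason, count in counts.most_common()}


# Made-up trace for illustration only; not measured data.
sample = ["MSR_WRITE"] * 5 + ["EXTERNAL_INTERRUPT"] * 4 + ["IO_INSTRUCTION"]
shares = exit_breakdown(sample)
for reason, share in shares.items():
    print(f"{reason}: {share:.0%}")
```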
16
Interrupt Invocation Latency
Setup: the VM runs cyclictest, which measures the latency between the generation of a hardware interrupt and the invocation of the user-level handler.
Experiment: highest priority, 1K interrupts/sec.
KVM shows 14us due to 3 exits per interrupt handled: external interrupt, programming the x2APIC (TMICT), and EOI.
KVM latency is much higher due to 3 VM exits
DID has 0.9us overhead
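The latency figures can be checked against the headline claim of an 80% reduction with simple arithmetic. The values 14us and 2.9us come from the slides; reading the 0.9us overhead as measured against bare metal is an assumption made here, so the implied baseline is only indicative.

```python
def relative_reduction(before_us: float, after_us: float) -> float:
    """Fractional reduction when going from `before_us` to `after_us`."""
    return (before_us - after_us) / before_us


kvm_us, did_us = 14.0, 2.9  # invocation latencies reported in the slides
reduction = relative_reduction(kvm_us, did_us)
print(f"latency reduction: {reduction:.1%}")  # prints: latency reduction: 79.3%

# Assumption: the 0.9 us DID overhead is relative to bare metal.
implied_baseline_us = round(did_us - 0.9, 1)
print(f"implied bare-metal latency: {implied_baseline_us} us")
```

The computed 79.3% rounds to the 80% quoted in the contributions.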
17
Network Performance
SRIOV/SRIOV-DID: compares the SRIOV VF in KVM and DID
- DID improves TIG by 11% and throughput by 0.1Gbps
- The saved cycles cannot be turned into throughput because of link capacity
PV/PV-DID: compares virtio-net in KVM and DID
- DID improves TIG by 10% and throughput by 4Gbps
VM exit rate:
- 50%: interrupt
- 50%: EOI
I/O instruction exits are due to the front-end driver
18
Block I/O Performance
Setup: KVM provides a 1GB ramdisk to the VM; fio in the VM performs 4KB random reads/writes.
Note: in iperf, the number of exits due to interrupts is close to the number due to MSR writes.
Observation: fio programs a timer as a timeout after submitting an I/O and clears the timer when the request completes -> two more MSR writes.
High exit count due to MSR writes that set/clear the timer registers.
DID retains a high TIG: 98%.
19
Conclusion
Achieving high application performance requires the CPU to spend most of its cycles in guest mode
DID completely eliminates VM exits due to interrupt dispatches and EOI notifications
- First system to support direct delivery of all interrupt types
- Consistently preserves a high TIG
- Greatly reduces interrupt invocation latency
- Leverages existing hardware and requires no guest modification
20
THANK YOU
Questions?