Xen on ARM: A success story
Stefano Stabellini - Citrix Xen Project Team
Achievements of one year
11/11 Part-time Xen ARM hacking starts
08/12 First Xen on ARM talk at Xen Summit 2012
09/12 Xen running on real ARM hardware
11/12 Xen support for ARM upstream in Linux 3.7
01/13 Xen 64-bit on ARM64
03/13 Citrix announces that it will be joining Linaro
06/13 Xen support for ARM64 upstream in Linux 3.11
07/13 Xen 4.3 released with ARM and ARM64 support
You are here
A growing community
Xen-devel ARM traffic since August 2012:
● 4685 emails: 360 emails per month!
● 39% of which are not from Citrix
Hardware support
Upstream:
● Versatile Express Cortex A15
● Arndale board
● ARMv8 FVP
In progress:
● Calxeda “Midway”
● Applied Micro “Mustang”
● Cubieboard2
● Broadcom Brahma-B15
● OMAP5
Upstream features
Xen v4.3:
● basic lifecycle operations
● memory ballooning
● scheduler configurations and vcpu pinning
Linux v3.11:
● dom0 and domU support
● 32-bit and 64-bit support
● SMP support
● PV disk, network and console
Coming in Xen 4.4
● 64-bit guest support
● live-migration
● SWIOTLB
The problem
[Diagram: virtual address -> (stage 1, Linux) -> physical address -> (stage 2, Xen) -> machine address -> hardware]
The problem: dom0 DMA
[Diagram: virtual address -> (stage 1, Linux) -> physical address -> (stage 2, Xen) -> machine address; Device DMA]
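The two-stage translation above can be illustrated with a toy model. This is only a sketch: the page numbers, table contents, and function names are invented for illustration and are not taken from Xen or Linux.

```python
# Toy model of two-stage address translation (all values are made up).
PAGE_SHIFT = 12  # assumes 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1

# Stage 1, maintained by Linux: virtual page -> physical page.
stage1 = {0x10: 0x200}
# Stage 2, maintained by Xen: physical page -> machine page.
stage2 = {0x200: 0x9A3}


def translate(table, addr):
    """Translate an address through one page-granular table."""
    page, offset = addr >> PAGE_SHIFT, addr & PAGE_MASK
    return (table[page] << PAGE_SHIFT) | offset


def cpu_access(vaddr):
    """A CPU access is translated by both stages."""
    return translate(stage2, translate(stage1, vaddr))


def naive_dma_address(vaddr):
    """The address dom0 would give a device if it only knew stage 1."""
    return translate(stage1, vaddr)
```

In this model a device with no IOMMU in front of it treats whatever address it is given as a machine address, so the result of naive_dma_address points at a different page than the one the CPU reaches through cpu_access.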
The best solution: IOMMU
[Diagram: virtual address -> (Linux) -> physical address -> (stage 2, Xen; MMU and IOMMU) -> machine address; Device DMA goes through the IOMMU]
The workaround: Dom0 1:1 mapping
[Diagram: virtual address -> (Linux) -> physical address = machine address (Xen); Device DMA]
The workaround: Dom0 1:1 mapping
● rigid solution
● no ballooning in dom0
● no page sharing in dom0
● does not work with foreign grant table mappings
UNHAPPY
The alternative: SWIOTLB
[Diagram: virtual address -> (MMU) -> physical address -> (DMA ops, Linux) -> machine address; Device DMA]
The alternative: SWIOTLB
● use memory_exchange_and_pin hypercall
  ○ create a contiguous buffer in machine memory
  ○ retrieve the machine address of the buffer
● introduce an additional memcpy
● remove the need for the 1:1 workaround
STILL UNHAPPY
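The extra memcpy can be pictured with a toy bounce buffer. This is a simplified sketch, not the swiotlb code: the class, its methods, and the addresses are hypothetical, and it copies in both directions regardless of the transfer direction.

```python
# Toy bounce-buffer model; names, sizes, and addresses are hypothetical.
class BounceBuffer:
    """A fixed buffer whose machine address is known in advance."""

    def __init__(self, machine_addr, size):
        self.machine_addr = machine_addr
        self.data = bytearray(size)
        self.copies = 0  # number of extra memcpy-style copies performed

    def map_for_device(self, payload):
        """Copy the payload into the buffer and return the address for DMA."""
        if len(payload) > len(self.data):
            raise ValueError("payload larger than bounce buffer")
        self.data[:len(payload)] = payload
        self.copies += 1
        return self.machine_addr

    def unmap_from_device(self, length):
        """Copy the buffer contents back out after the transfer."""
        self.copies += 1
        return bytes(self.data[:length])
```

The device only ever sees the buffer's machine address, which is why the 1:1 workaround is no longer needed, but every transfer pays for copies that a direct mapping would avoid.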
SWIOTLB: the improved version
pin and unpin hypercalls:
● dynamically retrieve P2M mappings
● pin a mapping for DMA
● remove additional memcpy
[Diagram: map_page passes a pfn through XENMEM_pin to pin; the mfn is returned back to map_page]
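The pin and unpin flow can be sketched as a small bookkeeping model. Everything here is an assumption made for illustration: the class name, the dictionaries standing in for Xen's P2M table and for the tree Linux maintains, and the counter.

```python
# Toy model of pin/unpin; the class and its fields are hypothetical.
class PinTracker:
    def __init__(self, p2m):
        self.p2m = p2m        # stand-in for Xen's physical-to-machine table
        self.pinned = {}      # stand-in for the tree kept by Linux: pfn -> mfn
        self.hypercalls = 0   # counts round trips into the hypervisor

    def pin(self, pfn):
        """Look up the mfn behind pfn and remember it while DMA is in flight."""
        self.hypercalls += 1
        mfn = self.p2m[pfn]
        self.pinned[pfn] = mfn
        return mfn

    def unpin(self, pfn):
        """Forget the mapping once the DMA transfer has completed."""
        self.hypercalls += 1
        del self.pinned[pfn]
```

No data is copied in this model, but each mapping costs a lookup on the Xen side plus an insertion and a removal in the Linux-side structure.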
SWIOTLB: the “improved” version
● Linux rbtree maintenance is expensive
● too many uncached address translations in Xen
  ○ guest virtual to machine
  ○ guest physical to machine
CPU utilization increase
NOT AN IMPROVEMENT
SWIOTLB: the compromise
● keep the dom0 1:1 workaround
  ○ dom0 without ballooning and page sharing is the default configuration in XenServer x86 today
● use the swiotlb only to handle DMA involving foreign grants
  ○ we already know the p2m mappings of grants
    ■ no need for pin and unpin hypercalls
  ○ can take shortcuts: avoid many tree lookups
  ○ tree lookups are much faster
  ○ avoidable with IOMMU support
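The compromise amounts to a two-way dispatch, sketched below. The function name, the page numbers, and the use of a dictionary in place of a tree are all assumptions made for illustration.

```python
# Toy model of the compromise; all names and numbers are hypothetical.
PAGE_SHIFT = 12  # assumes 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1

# Foreign grant mappings: dom0 physical page -> machine page.
# These are recorded when the grant is mapped, so no hypercall is needed later.
foreign_grants = {0x300: 0x7F1}


def phys_to_dma(paddr):
    """Return the address to program into a device for a dom0 buffer."""
    mfn = foreign_grants.get(paddr >> PAGE_SHIFT)
    if mfn is None:
        # Ordinary dom0 memory is mapped 1:1: physical == machine.
        return paddr
    # Foreign grant: use the machine page recorded at grant-map time.
    return (mfn << PAGE_SHIFT) | (paddr & PAGE_MASK)
```

Only buffers backed by foreign grants take the lookup path; everything else falls through to the identity mapping.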
SWIOTLB: the compromise
Testing platform:
● 1.5 GHz quad-core Cortex A15
● 1 Gbit link
Benchmark results:
● same network throughput as native (line rate)
● < 2% CPU usage increase
THAT’S BETTER
SWIOTLB: where to find it
The patches (swiotlb-xen v8):
http://marc.info/?l=linux-kernel&m=138203180707683&w=2
The kernel tree:
git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen.git swiotlb-xen-8
Xen 4.5+
● IOMMU support in Xen
● device assignment
● UEFI booting
● ACPI support
DEMO
Questions?