Xen on ARM: A success story. Stefano Stabellini - Citrix Xen Project Team
Achievements of one year Timeline axis: 11/11, 08/12, 09/12, 11/12, 01/13, 03/13, 06/13, 07/13, "You are here" Milestones: ● Part-time Xen ARM hacking starts ● First Xen on ARM talk at Xen Summit 2012 ● Xen running on real hardware ● Xen support for ARM upstream in Linux 3.7 ● Xen 64-bit on ARM64 ● Citrix announces that it will be joining Linaro ● Xen 4.3 released with ARM and ARM64 support ● Xen support for ARM64 upstream in Linux 3.11
A growing community Xen-devel ARM traffic from August 2012: ● 4685 emails: 360 emails per month! ● 39% of which are not from Citrix
Hardware support Upstream: ● Versatile Express Cortex A15 ● Arndale board ● ARMv8 FVP In progress: ● Calxeda “Midway” ● Applied Micro “Mustang” ● Cubieboard2 ● Broadcom Brahma-B15 ● OMAP5
Upstream features Xen v4.3: ● basic lifecycle operations ● memory ballooning ● scheduler configurations and vcpu pinning Linux v3.11: ● dom0 and domU support ● 32-bit and 64-bit support ● SMP support ● PV disk, network and console
Coming in Xen 4.4 ● 64-bit guest support ● live-migration ● SWIOTLB
The problem: virtual address → [stage 1, Linux] → physical address → [stage 2, Xen] → machine address → hardware
The problem: dom0 DMA virtual address → [stage 1, Linux] → physical address → [stage 2, Xen] → machine address ← Device DMA
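The dom0 DMA problem can be illustrated with a toy model. This is only a sketch: the page numbers, table contents and function names are made up for illustration, and real page tables are multi-level structures rather than dictionaries. It shows that the CPU reaches the right machine page through both stages, while a device programmed with the address Linux knows (the guest-physical one) would target a different machine address.

```python
# Toy model of two-stage address translation; all page numbers are made up.
PAGE_SIZE = 4096

# Stage 1 (owned by Linux): virtual page -> guest-physical page.
stage1 = {0x10: 0x2}
# Stage 2 (owned by Xen): guest-physical page -> machine page.
stage2 = {0x2: 0x7A}


def translate(table, addr):
    """Translate addr through one table, keeping the offset within the page."""
    page, offset = divmod(addr, PAGE_SIZE)
    return table[page] * PAGE_SIZE + offset


def cpu_access(vaddr):
    """The CPU goes through both stages and reaches the right machine page."""
    return translate(stage2, translate(stage1, vaddr))


def device_dma_target(vaddr):
    """Without an IOMMU the device uses the address Linux hands it as is.

    Linux only knows the guest-physical address, so the device ends up
    treating that value as a machine address.
    """
    return translate(stage1, vaddr)


vaddr = 0x10 * PAGE_SIZE + 0x123
print(hex(cpu_access(vaddr)))         # machine address the CPU reaches
print(hex(device_dma_target(vaddr)))  # address the device would use
```

The two printed addresses differ whenever stage 2 is not an identity mapping, which is the mismatch the following slides try to solve.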
The best solution: IOMMU virtual address → [MMU, Linux] → physical address → [stage 2, Xen] → machine address; Device DMA → [IOMMU, Xen] → machine address
The workaround: Dom0 1:1 mapping virtual address → [Linux] → physical address = machine address [Xen] ← Device DMA
The workaround: Dom0 1:1 mapping ● rigid solution ● no ballooning in dom0 ● no page sharing in dom0 ● does not work with foreign grant table mappings
UNHAPPY
The alternative: SWIOTLB virtual address → [MMU] → physical address → [DMA ops, Linux] → machine address → Device DMA
The alternative: SWIOTLB ● use memory_exchange_and_pin hypercall ○ create a contiguous buffer in machine memory ○ retrieve the machine address of the buffer ● introduce an additional memcpy ● remove the need for the 1:1 workaround
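The bounce-buffer idea behind this slide can be sketched as follows. This is a simplified model, not the kernel's swiotlb code: the class name, the bump allocator and the machine base address are assumptions made for illustration. It shows where the additional memcpy comes from: every transfer is copied into a buffer whose machine address is already known.

```python
class BounceBuffer:
    """Toy bounce buffer: a fixed region whose machine address is known."""

    def __init__(self, machine_base, size):
        self.machine_base = machine_base  # hypothetical machine address
        self.data = bytearray(size)
        self.next_free = 0                # simple bump allocator, never frees

    def map_single(self, payload):
        """Copy payload into the buffer; return (dma_addr, slot offset)."""
        if self.next_free + len(payload) > len(self.data):
            raise MemoryError("bounce buffer full")
        offset = self.next_free
        self.data[offset:offset + len(payload)] = payload  # the extra memcpy
        self.next_free += len(payload)
        return self.machine_base + offset, offset

    def unmap_single(self, offset, length):
        """Copy data back out for the driver (device-to-memory transfers)."""
        return bytes(self.data[offset:offset + length])


buf = BounceBuffer(machine_base=0x80000000, size=4096)
dma_addr, slot = buf.map_single(b"packet")
print(hex(dma_addr))
```

The device is only ever given addresses inside the contiguous buffer, so dom0 memory no longer has to be mapped 1:1, at the cost of one copy per transfer.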
STILL UNHAPPY
SWIOTLB: the improved version pin and unpin hypercalls: ● dynamically retrieve P2M mappings ● pin a mapping for DMA ● remove additional memcpy Flow: pfn → map_page → pfn → XENMEM_pin → mfn → pin → mfn
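A rough model of the pin and unpin flow is given below. Everything in it is a stand-in: FakeHypervisor is not a Xen interface, and the dictionary only plays the role of whatever bookkeeping the kernel needs to find the pfn again at unmap time. The sketch shows that each mapping costs one call into the hypervisor plus one update of a Linux-side lookup structure, and the same again on unmap.

```python
class FakeHypervisor:
    """Stand-in for Xen: owns the P2M and the set of pinned pages."""

    def __init__(self, p2m):
        self.p2m = dict(p2m)   # pfn -> mfn
        self.pinned = set()

    def pin(self, pfn):
        """Pin the mapping so it cannot change, and return the mfn."""
        self.pinned.add(pfn)
        return self.p2m[pfn]

    def unpin(self, pfn):
        self.pinned.discard(pfn)


class PinningDmaOps:
    """Hypothetical DMA ops that pin pages instead of copying them."""

    def __init__(self, hypervisor):
        self.hypervisor = hypervisor
        self.mfn_to_pfn = {}   # stands in for the Linux-side lookup tree

    def map_page(self, pfn):
        mfn = self.hypervisor.pin(pfn)   # one hypercall per mapping
        self.mfn_to_pfn[mfn] = pfn       # one bookkeeping update per mapping
        return mfn                       # address handed to the device

    def unmap_page(self, mfn):
        pfn = self.mfn_to_pfn.pop(mfn)
        self.hypervisor.unpin(pfn)
        return pfn


xen = FakeHypervisor({5: 900})
ops = PinningDmaOps(xen)
print(ops.map_page(5))
```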
SWIOTLB: the “improved” version ● Linux rbtree maintenance is expensive ● too many uncached address translations in Xen ○ guest virtual to machine ○ guest physical to machine ● result: CPU utilization increases
NOT AN IMPROVEMENT
SWIOTLB: the compromise ● keep the dom0 1:1 workaround ○ dom0 without ballooning and page sharing is the default configuration in XenServer x86 today ● use the swiotlb only to handle DMA involving foreign grants ○ we already know the p2m mappings of grants ■ no need for pin and unpin hypercalls ○ can take shortcuts: avoid many tree lookups ○ tree lookups are much faster ○ avoidable with IOMMU support
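The compromise can be sketched as a single lookup with a fast path. The class and method names below are hypothetical, and a dictionary replaces the kernel's tree. The point it illustrates is that local dom0 pages need no lookup because of the 1:1 mapping, while foreign grant pages use a p2m entry that is already known when the grant is mapped, so no pin or unpin hypercall is involved.

```python
class CompromiseDmaOps:
    """Hypothetical DMA ops: 1:1 for local pages, lookup for foreign grants."""

    def __init__(self):
        self.foreign_pfn_to_mfn = {}   # filled in when a grant is mapped

    def map_grant(self, pfn, mfn):
        """Record the machine frame behind a foreign grant mapping."""
        self.foreign_pfn_to_mfn[pfn] = mfn

    def unmap_grant(self, pfn):
        del self.foreign_pfn_to_mfn[pfn]

    def dma_address(self, pfn):
        """Return the frame number to hand to the device.

        Foreign pages need a lookup; local dom0 pages are mapped 1:1,
        so their pfn is already the mfn.
        """
        return self.foreign_pfn_to_mfn.get(pfn, pfn)


ops = CompromiseDmaOps()
ops.map_grant(pfn=0x300, mfn=0x9000)
print(hex(ops.dma_address(0x100)))   # local page
print(hex(ops.dma_address(0x300)))   # foreign grant page
```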
SWIOTLB: the compromise Testing platform: ● 1.5 GHz quad-core Cortex-A15 ● 1 Gbit link Benchmark results: ● same network throughput as native (line rate) ● < 2% CPU usage increase
THAT’S BETTER
SWIOTLB: where to find it The patches (swiotlb-xen v8): http://marc.info/?l=linux-kernel&m=138203180707683&w=2 The kernel tree: git://git.kernel.org/pub/scm/linux/kernel/git/sstabellini/xen.git swiotlb-xen-8
Xen 4.5+ ● IOMMU support in Xen ● device assignment ● UEFI booting ● ACPI support
DEMO
Questions?