Improve ARM guest performance with 64KB pages
Julien Grall
julien.grall@citrix.com
Xen Developer Summit 2015

Outline: Kézaco (What is it?), Why?, Constraints, Implementation, Improvements, Conclusion

Kézaco (What is it?)
◮ A page is 64KB
◮ Removes one level of page table compared to 4KB pages
◮ Faster TLB lookup
◮ Introduced for AArch64 in ARMv8

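The saved page-table level follows from the table geometry: each translation table fills one page and holds 8-byte descriptors, so a 4KB table resolves 9 address bits per level and a 64KB table resolves 13. A minimal sketch of that arithmetic, assuming a 48-bit virtual address space (the slides do not state the address width; the function name is illustrative):

```python
import math

def table_levels(va_bits: int, page_shift: int, desc_bytes: int = 8) -> int:
    """Number of translation-table levels needed to resolve va_bits."""
    # Each table occupies one page, so it indexes log2(page_size / desc_bytes) bits.
    bits_per_level = page_shift - int(math.log2(desc_bytes))
    # Bits left to translate once the in-page offset is removed.
    remaining = va_bits - page_shift
    return math.ceil(remaining / bits_per_level)

print(table_levels(48, 12))  # 4KB granule  -> 4
print(table_levels(48, 16))  # 64KB granule -> 3
```
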
[Figure: 4KB page granularity]

[Figure: 64KB page granularity]

Why?
◮ The page granularity is chosen at configuration time in Linux
◮ Some major distributions will ship Linux only with 64KB pages

Xen and hypercalls
◮ Based on 4KB page granularity
◮ Must be able to run guests with different page granularities
◮ Modifying the interface too much might not be possible

PV drivers
◮ Grants are currently only 4KB
◮ Based on the hypercall page granularity
◮ Must be able to talk with the current backend/frontend

Goal
◮ First implementation
◮ Allow 64KB guests to run on current Xen
◮ No modification to the hypercalls or the PV protocol
◮ Get something upstreamed quickly
◮ Linux with 64KB pages is crashing at the moment

Changes in Xen
◮ Hypervisor: none
◮ Tools: 3 minor patches to use the correct size for the rings
◮ Present in Xen 4.6
◮ Backport requested for Xen 4.5

Changes in Linux
◮ Linux assumes that Xen uses the same page granularity
◮ Need to introduce XEN_PAGE_* helpers
◮ 1 foreign grant = 1 Linux page
  ◮ Easier implementation
  ◮ 60KB of memory wasted per grant
  ◮ Affects only the backend domain
◮ A Linux page may be split between multiple grants

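The numbers in these bullets can be checked with a few lines of arithmetic. The names below echo the idea of the XEN_PAGE_* helpers (a Xen page stays 4KB whatever the Linux page size), but the Python constants and functions are illustrative assumptions, not the kernel's definitions:

```python
XEN_PAGE_SHIFT = 12                  # Xen interface granularity: 4KB
XEN_PAGE_SIZE = 1 << XEN_PAGE_SHIFT

def xen_pfns_per_page(linux_page_size: int) -> int:
    """How many 4KB Xen frames make up one Linux page (hypothetical helper)."""
    return linux_page_size // XEN_PAGE_SIZE

def waste_per_foreign_grant(linux_page_size: int) -> int:
    """Bytes left unused when a whole Linux page backs one 4KB foreign grant."""
    return linux_page_size - XEN_PAGE_SIZE

LINUX_64K = 64 * 1024
print(xen_pfns_per_page(LINUX_64K))                 # 16 grants can cover one Linux page
print(waste_per_foreign_grant(LINUX_64K) // 1024)   # 60 (KB wasted per foreign grant)
```
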
[Figure: Example of handling a request on 4KB]

[Figure: Example of handling a request on 64KB]

Changes in Linux - 2
◮ Introduce helpers to deal with the splitting
◮ Avoid exposing the page granularity to PV drivers
◮ Easier to spot changes that don't handle 64KB/4KB granularity

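As a rough model of what such a splitting helper has to do, the sketch below cuts a buffer that lives inside one Linux page into pieces that each fit in a single 4KB grant. The function name and tuple layout are hypothetical and chosen for illustration only; the real helpers are kernel C code and are not shown in the slides.

```python
XEN_PAGE_SIZE = 4096  # grant granularity, assumed fixed at 4KB

def split_into_grants(offset: int, length: int):
    """Yield (xen_frame_index, offset_in_frame, chunk_len) for a buffer
    starting at `offset` inside one Linux page (hypothetical helper)."""
    while length > 0:
        frame = offset // XEN_PAGE_SIZE        # which 4KB frame of the Linux page
        off = offset % XEN_PAGE_SIZE           # position inside that frame
        chunk = min(XEN_PAGE_SIZE - off, length)  # never cross a frame boundary
        yield frame, off, chunk
        offset += chunk
        length -= chunk

# A 6KB buffer starting 3KB into the page touches three 4KB frames.
chunks = list(split_into_grants(offset=3072, length=6144))
print(chunks)  # [(0, 3072, 1024), (1, 0, 4096), (2, 0, 1024)]
```

A PV driver written against a helper of this shape only sees one grant-sized chunk at a time, which is how the page granularity can stay hidden from it.
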
[Figure: Improvement - 1]

[Figure: Support of 64KB grant - 1]

Support of 64KB grant - 2
◮ PV drivers can take advantage of it
◮ No need to split pages
◮ Fewer grants to set up
◮ Need to agree on where the grant size is decided:
  ◮ during the protocol negotiation
  ◮ can change for each request

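To make "fewer grants to set up" concrete, the sketch below counts the grants needed to cover a request at each grant size. It assumes a request that starts on a grant boundary, and the 64KB grant size is the proposal discussed on this slide rather than an existing interface; the function name is illustrative.

```python
def grants_needed(request_bytes: int, grant_size: int) -> int:
    """Grants required to cover an aligned request (ceiling division)."""
    return -(-request_bytes // grant_size)

request = 64 * 1024  # one full 64KB Linux page of data
print(grants_needed(request, 4 * 1024))   # 16 grants with 4KB grants
print(grants_needed(request, 64 * 1024))  # 1 grant with a 64KB grant
```
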
Improvement 2 - Memory Usage
◮ Share a Linux page between multiple foreign grants
◮ Needs some care with swiotlb
◮ Make Xen drivers fully use the Linux page
  ◮ Event channel
  ◮ PV ring
  ◮ ...

Status
◮ Where are we?
  ◮ First implementation done
  ◮ Only net and block PV drivers supported
  ◮ On the way to version 4
◮ Future
  ◮ Write a design doc for the grant improvement
  ◮ Fix memory usage with 64KB page granularity
  ◮ Convert the remaining PV drivers and QEMU

The end