RT-Xen: Real-Time Virtualization
Sisu Xi, Meng Xu, Chenyang Lu, Linh T.X. Phan, Christopher D. Gill, Insup Lee, Oleg Sokolsky
Real-Time Virtualization
Cars: consolidate ~100 ECUs onto ~10 multicore processors
  Infotainment on Linux or Android
  Safety-critical control on AUTOSAR
Cloud computing's killer app: gaming [IEEE Spectrum]
  Need to compute and stream 30 to 50 frames per second
Applications must meet real-time performance constraints on virtualized platforms!
RT-Xen: Real-Time Virtualization
Real-time hypervisor scheduling framework in Xen
  Implements a suite of real-time scheduling algorithms
Based on compositional scheduling theory
  VMs specify resource interfaces
  Real-time guarantees to tasks in VMs
Open source, RT-Xen patch submitted: https://sites.google.com/site/realtimexen/
Xen Virtualization Architecture
Guest OS runs on VCPUs
Hypervisor schedules VCPUs on PCPUs
Credit scheduler: [weight, cap] per VM, round robin
RT-Xen Interface
VM resource interface
  A set of VCPUs, each characterized by <period, budget>
  Optional: use cpumask to specify VCPU affinity with PCPUs
  Hides task-specific information
Real-time scheduling algorithms
  Ordering of VCPUs? Priority scheme
  Placement of VCPUs? Global vs. partitioned
  Resource isolation? Server mechanisms
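The resource interface on this slide can be pictured as a small record per VCPU. The sketch below is illustrative only: the names `VcpuInterface`, `period_us`, `budget_us`, `cpumask`, and `vm_bandwidth` are hypothetical and are not taken from the RT-Xen code. It assumes the budget can never exceed the period.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class VcpuInterface:
    """One VCPU of a VM's resource interface (hypothetical names)."""
    period_us: int                            # replenishment period
    budget_us: int                            # CPU time granted per period
    cpumask: Optional[FrozenSet[int]] = None  # allowed PCPUs; None means any

    def __post_init__(self):
        # Assumption: a budget larger than the period is meaningless.
        if not 0 < self.budget_us <= self.period_us:
            raise ValueError("need 0 < budget <= period")

    @property
    def bandwidth(self) -> float:
        """Fraction of one PCPU that this VCPU may consume."""
        return self.budget_us / self.period_us


def vm_bandwidth(vcpus: List[VcpuInterface]) -> float:
    """Total CPU share requested by a VM; no task details are exposed."""
    return sum(v.bandwidth for v in vcpus)


vm = [VcpuInterface(10_000, 4_000),
      VcpuInterface(5_000, 1_000, cpumask=frozenset({1}))]
print(round(vm_bandwidth(vm), 3))
```

The interface carries only periods, budgets, and an optional affinity, which is how the hypervisor can schedule a VM without knowing anything about the tasks inside it.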
Real-Time Scheduling Policies
Priority scheme
  Static priority: Deadline Monotonic (DM)
  Dynamic priority: Earliest Deadline First (EDF)
Global scheduling
  Schedule VCPUs based on global information
  Allow VCPU migration across cores
  Flexible use of multiple cores
  Migration overhead and cache penalty
Partitioned scheduling
  Assign and bind VCPUs to PCPUs
  Schedule VCPUs on each core independently
  May underutilize PCPUs
  No migration overhead or associated cache penalty
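A minimal sketch of the two choices on this slide, under stated assumptions: each VCPU's relative deadline equals its period, and partitioning uses a simple first-fit rule with a per-core bandwidth limit of 1. The names `Vcpu`, `dm_order`, `edf_order`, and `first_fit_partition` are hypothetical, and first-fit is only one possible placement heuristic, not necessarily the one used in RT-Xen.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class Vcpu:
    name: str
    period: int        # relative deadline is assumed equal to the period
    abs_deadline: int  # end of the VCPU's current period


def dm_order(vcpus):
    """Static priority (DM): shorter relative deadline runs first."""
    return [v.name for v in sorted(vcpus, key=lambda v: (v.period, v.name))]


def edf_order(vcpus):
    """Dynamic priority (EDF): earlier absolute deadline runs first."""
    return [v.name for v in sorted(vcpus, key=lambda v: (v.abs_deadline, v.name))]


def first_fit_partition(bandwidths, n_pcpus):
    """Bind each VCPU to the first PCPU with room; None if none fits."""
    load = [Fraction(0)] * n_pcpus
    placement = []
    for bw in bandwidths:
        for cpu in range(n_pcpus):
            if load[cpu] + bw <= 1:
                load[cpu] += bw
                placement.append(cpu)
                break
        else:
            placement.append(None)  # no single core can host this VCPU
    return placement


vcpus = [Vcpu("a", period=10, abs_deadline=20),
         Vcpu("b", period=15, abs_deadline=15)]
print(dm_order(vcpus))   # DM prefers "a" (shorter period)
print(edf_order(vcpus))  # EDF prefers "b" (earlier deadline right now)
print(first_fit_partition([Fraction(3, 5)] * 3, n_pcpus=2))
```

In the last call, three VCPUs of bandwidth 3/5 ask for 1.8 cores in total, yet no assignment to two cores exists without splitting a VCPU. This is the underutilization that the slide attributes to partitioned scheduling.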
Scheduling a VCPU as a Deferrable Server
A VCPU receives `budget` µs of CPU time every `period` µs
The budget is replenished at the start of every period
A VCPU consumes budget while running and is suspended when no budget is left
The budget is preserved while there is no task to run
[Figure: tasks T1 (10, 3) and T2 (10, 3) served by a deferrable server (5, 3); task execution and server budget plotted over time 0 to 15.]
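The rules above can be traced with a small discrete-time simulation. This is a sketch under simplifying assumptions: time advances in whole units, replenishment resets the budget to its full value, and the two tasks are merged into a single backlog of pending work. The function name `simulate_deferrable` and the `releases` map are hypothetical.

```python
def simulate_deferrable(period, budget, releases, horizon):
    """Trace a deferrable server one time unit at a time.

    releases maps a time instant to the units of work arriving then.
    Returns a string: 'R' = running, '-' = work pending but budget
    exhausted, '.' = no work (budget is kept, not discarded).
    """
    remaining = 0  # budget left in the current period
    pending = 0    # backlog of task work inside the VM
    timeline = []
    for t in range(horizon):
        if t % period == 0:
            remaining = budget          # replenish at period start
        pending += releases.get(t, 0)
        if pending > 0 and remaining > 0:
            pending -= 1
            remaining -= 1
            timeline.append("R")
        elif pending > 0:
            timeline.append("-")        # suspended: out of budget
        else:
            timeline.append(".")        # idle: budget preserved
    return "".join(timeline)


# Slide example: server (5, 3); T1 (10, 3) and T2 (10, 3) both release
# 3 units of work at t = 0 and again at t = 10.
print(simulate_deferrable(5, 3, {0: 6, 10: 6}, 15))  # RRR--RRR..RRR--
# Work arriving mid-period still finds the preserved budget.
print(simulate_deferrable(5, 3, {2: 3}, 5))          # ..RRR
```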
RT-Xen Investigation Roadmap
RT-Xen 1.0: single-core
RT-Xen 1.1: single-core, enhanced
RT-Xen 2.0: multi-core
[Figure: roadmap diagram. Priority schemes: fixed priority (DM: gDM, pDM) and dynamic priority (EDF: gEDF, pEDF). Placement: global scheduling and partitioned scheduling. Server labels: deferrable, periodic, polling, sporadic, with work-conserving and capacity-reclaiming variants.]
RT-Xen 2.0: Run Queues
A run queue
  holds VCPUs that are runnable (have a task to run)
  has two parts: VCPUs with budget and VCPUs out of budget
  is sorted by priority (DM or EDF) within each part
rt-global: all cores share one run queue, protected by a spinlock
rt-partition: one run queue per core
Patches for a more efficient implementation are on the way!
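The two-part run queue can be sketched as two sorted lists. This is an illustration of the layout described on the slide, not the data structure used in the Xen source; the class `RunQueue` and its methods are hypothetical, and the locking that rt-global needs around a shared queue is left out.

```python
import bisect


class RunQueue:
    """Runnable VCPUs: those with budget first, then those without.

    A smaller priority key means higher priority (a period for DM,
    an absolute deadline for EDF).
    """

    def __init__(self):
        self._with_budget = []  # sorted list of (priority_key, name)
        self._depleted = []     # sorted list of (priority_key, name)

    def insert(self, name, priority_key, has_budget):
        part = self._with_budget if has_budget else self._depleted
        bisect.insort(part, (priority_key, name))

    def pick_next(self):
        """Highest-priority VCPU that still has budget, else None."""
        return self._with_budget[0][1] if self._with_budget else None

    def snapshot(self):
        """Queue order from head to tail."""
        return ([name for _, name in self._with_budget]
                + [name for _, name in self._depleted])


rq = RunQueue()
rq.insert("v1", priority_key=30, has_budget=True)
rq.insert("v2", priority_key=10, has_budget=False)
rq.insert("v3", priority_key=20, has_budget=True)
print(rq.snapshot())   # ['v3', 'v1', 'v2']
print(rq.pick_next())  # v3
```

Note that `v2` has the highest priority but sits behind both other VCPUs because it is out of budget. Keeping depleted VCPUs in a separate tail means the scheduler only ever inspects the head of the queue.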
Experimental Setup
Hardware: Intel i7 processor, six cores running at 3.33 GHz
  One PCPU is dedicated to domain 0
  All guest VMs use the remaining cores
Cache architecture
  Each core has a dedicated L1 cache (32 KB) and L2 cache (256 KB)
  All six cores share the L3 cache (12 MB)
  The L3 cache is inclusive: all data in an L2 cache must also be in L3
Software
  Xen 4.3 patched with RT-Xen
  Guest OS: Linux patched with LITMUS^RT
RT-Xen 2.0: Scheduling Overhead
rt-global has extra overhead due to the global lock
credit has high maximum overhead due to load balancing
RT-Xen 2.0: Credit Scheduler
The credit scheduler misses deadlines at 22% CPU capacity
RT-Xen delivers real-time performance up to 78%
RT-Xen 2.0: gEDF vs. pEDF
Global scheduling wins empirically!
gEDF + deferrable server -> best real-time performance
Demo
YouTube: "RT-Xen Demonstration"
https://www.youtube.com/watch?v=wisxWn3mR5s
Patch Status
Patch RFC v1 & v2: July 10th & July 29th
  gEDF + deferrable server
  cpupool support
Patch RFC v3: expected on Aug 24th
  Scheduling trace support
  Performance improvement (splitting RunQ)
Patch RFC v4: expected before Sep 10th
  Performance improvement
    • Timer-based budget replenishment
    • Improve the timing resolution of budget
Conclusion
Diverse applications demand real-time virtualization
  Real-time virtualization in the embedded area
  Cloud gaming
RT-Xen provides real-time performance
  Efficient implementation of diverse real-time scheduling policies
  Leverages compositional scheduling theory -> analytical guarantee
  gEDF + deferrable server wins empirically
Research Contributions
RT-Xen 1.0: S. Xi, J. Wilson, C. Lu, and C.D. Gill, "RT-Xen: Towards Real-Time Hypervisor Scheduling in Xen," ACM International Conference on Embedded Software (EMSOFT), 2011
RT-Xen 1.1: J. Lee, S. Xi, S. Chen, L.T.X. Phan, C. Gill, I. Lee, C. Lu, and O. Sokolsky, "Realizing Compositional Scheduling through Virtualization," IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS), 2012
RT-Xen 2.0: S. Xi, M. Xu, C. Lu, L.T.X. Phan, C. Gill, I. Lee, and O. Sokolsky, "Real-Time Multi-Core Virtual Machine Scheduling in Xen," ACM International Conference on Embedded Software (EMSOFT), 2014
RT-Xen 2.1 + RT-OpenStack: S. Xi, C. Li, C. Lu, C. Gill, M. Xu, L.T.X. Phan, I. Lee, and O. Sokolsky, "RT-OpenStack: Co-Hosting RT VM with non RT VMs," in submission
RTCA: S. Xi, C. Li, C. Lu, and C. Gill, "Prioritizing Local Inter-Domain Communication in Xen," ACM/IEEE International Symposium on Quality of Service (IWQoS), 2013
RT-Xen patch (gEDF with deferrable server)
  "RT-Xen: Real-Time Virtualization in Xen," Xen Blog, 2013
  "RT-Xen: Real-Time Virtualization in Xen," Xen Developer Summit, 2014
Backup Slides
RT-Xen 1.0
Implementation
Three queues within one physical core (PCPU)
[Figure: queue layout for the periodic server, showing the RunQ and RdyQ, VCPUs with (period, budget, priority) parameters, their queue positions and budgets, tasks, and IDLE entries.]
Server Design – Deferrable & Polling
Servers: (period, budget, priority)
Design questions: 1. Replenish? 2. Budget but NO task?
Example: server S1 (5, 3) with two tasks, T1 (10, 3) and T2 (10, 3)
[Figure: actual execution and budget in S1 over time 0 to 15, for the deferrable server (annotated "back-to-back") and for the polling server.]
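The second design question, what to do with budget when no task is ready, is where these two servers differ. The sketch below assumes the textbook definitions: a deferrable server keeps unused budget until the end of its period, while a polling server discards its budget as soon as it finds no pending work. It uses whole time units and a single backlog counter; `simulate_server` and its parameters are hypothetical names.

```python
def simulate_server(policy, period, budget, releases, horizon):
    """Trace a 'deferrable' or 'polling' server in unit time steps.

    'R' = running, '-' = work pending but no budget, '.' = no work.
    """
    if policy not in ("deferrable", "polling"):
        raise ValueError("unknown policy: " + policy)
    remaining = 0
    pending = 0
    timeline = []
    for t in range(horizon):
        if t % period == 0:
            remaining = budget      # both servers replenish periodically
        pending += releases.get(t, 0)
        if pending > 0 and remaining > 0:
            pending -= 1
            remaining -= 1
            timeline.append("R")
        elif pending > 0:
            timeline.append("-")
        else:
            if policy == "polling":
                remaining = 0       # budget but no task: discard it
            timeline.append(".")
    return "".join(timeline)


# Three units of work arrive at t = 2, in the middle of a period.
late = {2: 3}
print(simulate_server("deferrable", 5, 3, late, 10))  # ..RRR.....
print(simulate_server("polling", 5, 3, late, 10))     # ..---RRR..
```

With the late arrival, the deferrable server serves the work immediately from its preserved budget, while the polling server has already given its budget up and the work waits until the next replenishment at t = 5.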
Server Design – Periodic & Sporadic
Servers: (period, budget, priority)
Design questions: 1. Replenish? 2. Budget but NO task?
Example: server S1 (5, 3) with two tasks, T1 (10, 3) and T2 (10, 3)
[Figure: actual execution and budget in S1 over time 0 to 15, for the periodic server (annotated "theory favored") and for the sporadic server (annotated "overhead ++", with +3 replenishments).]
RT-Xen 2.0
RT-Xen 2.0: Workload
Periodic task sets: [period, execution time, deadline]
CPU-intensive, independent tasks
Task sets are generated randomly until they reach a target total task utilization; the tasks are then distributed to four VMs, and compositional scheduling theory is applied to calculate each VM's resource interface
25 task sets per data point; the metric is the fraction of schedulable task sets
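A rough sketch of the generation step follows. The slide does not give the distributions, so everything numeric here is an assumption made for illustration: the period choices, the per-task utilization range, implicit deadlines (deadline equal to period), clipping the last task to hit the target exactly, and round-robin distribution across VMs. The names `generate_taskset` and `distribute` are hypothetical. Computing each VM's resource interface from its tasks is not shown.

```python
import random


def generate_taskset(target_util, rng, max_task_util=0.3):
    """Add random tasks until total utilization reaches target_util.

    Each task is (period, execution_time, deadline); all numeric
    ranges below are assumptions, not values from the experiments.
    """
    tasks = []
    total = 0.0
    while target_util - total > 1e-9:
        period = rng.choice([10, 20, 40, 50, 100])  # assumed, in ms
        util = min(rng.uniform(0.05, max_task_util), target_util - total)
        tasks.append((period, util * period, period))  # deadline = period
        total += util
    return tasks


def distribute(tasks, n_vms=4):
    """Assumed policy: hand tasks to the VMs in round-robin order."""
    vms = [[] for _ in range(n_vms)]
    for i, task in enumerate(tasks):
        vms[i % n_vms].append(task)
    return vms


rng = random.Random(0)  # fixed seed so a run can be repeated
tasks = generate_taskset(2.0, rng)
vms = distribute(tasks)
print(len(tasks), [len(vm) for vm in vms])
```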
RT-Xen 2.0: Context Saved
Less than 1 µs overhead for the spinlock
8/18/2014
RT-Xen 2.0: Theory vs. Experiments
In theory, gEDF < pEDF, because the schedulability analysis for gEDF is pessimistic
In experiments, gEDF > pEDF, thanks to global scheduling
RT-Xen 2.0: How about Cache?
The benefit of global scheduling dominates the migration cost on a platform with a shared L3 cache.
RT-Xen 2.0: Context Switch
"Fake" context switch with idle VCPUs