  1. ECRTS 2013 A Coordinated Approach for Practical OS-Level Cache Management in Multi-Core Real-Time Systems Hyoseung Kim Arvind Kandhalu Prof. Raj Rajkumar Electrical and Computer Engineering Carnegie Mellon University

  2. Why Multi-Core Processors?
  • Processor development trend: increasing overall performance by integrating multiple cores
  • Embedded systems are actively adopting multi-core CPUs
  • Automotive: Freescale i.MX6 quad-core CPU; Qorivva dual-core ECU
  • Avionics and defense: COTS multi-core processors, e.g., rugged Intel i7-based single-board computers
  Motivation → Coordinated Cache Mgmt → Evaluation → Conclusion 2/24

  3. Multi-Core CPUs for Real-Time Systems
  • Large shared caches in COTS multi-core processors: Intel Core i7 (8–15 MB L3 cache), Freescale i.MX6 (1 MB L2 cache)
  • Use of the shared cache in real-time systems:
  – Reduces task execution time
  – Consolidates more tasks on a single multi-core processor
  – Enables a cost-efficient real-time system

  4. Uncontrolled Shared Cache
  • Setting: cores C1–C4, each with private L1/L2 caches, sharing an L3 cache
  1. Inter-core interference: up to 40% slowdown*
  2. Intra-core interference: up to 27% slowdown*
  • Uncontrolled use of the shared cache severely degrades the predictability of real-time systems
  * PARSEC benchmarks on an Intel i7

  5. Cache Partitioning
  • Page coloring (S/W cache partitioning)
  – Can be implemented on COTS multi-core processors
  – Provides cache performance isolation among tasks
  • Address mapping (page size 2^g, 2^s cache sets, cache-line size 2^l):
  – Virtual address = virtual page # | page offset (g bits)
  – Physical address = physical page # | page offset
  – Cache mapping = set index (s bits) | line offset (l bits)
  – Color index: the (s + l − g) bits shared by the physical page number and the cache set index
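The color-index computation above is just a bit-field extraction from the physical address. A minimal sketch, using an illustrative cache geometry (4 KiB pages, 64 B lines, 8192 sets; these numbers are assumptions, not figures from the slides):

```python
# Sketch of page-color computation under page coloring.
# Geometry is hypothetical: 4 KiB pages, 64 B lines, 8192 sets.
PAGE_SHIFT = 12          # g: page size = 2^12 = 4 KiB
LINE_SHIFT = 6           # l: cache line = 2^6 = 64 B
SET_BITS   = 13          # s: 2^13 = 8192 sets

# Number of cache colors = 2^(s + l - g)
NUM_COLORS = 1 << (SET_BITS + LINE_SHIFT - PAGE_SHIFT)   # 128 here

def page_color(phys_addr: int) -> int:
    """Color index = the set-index bits that lie above the page offset."""
    return (phys_addr >> PAGE_SHIFT) & (NUM_COLORS - 1)

# Two pages whose physical page numbers differ by NUM_COLORS share a color,
# i.e., they map to the same cache partition:
assert page_color(0x0000) == page_color(NUM_COLORS << PAGE_SHIFT)
```

With this geometry there are 2^(13+6−12) = 128 colors; the OS gets one cache partition per color it can steer page allocations into.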

  6. Problems with Page Coloring (1/2)
  1. Memory co-partitioning problem
  – Physical pages are grouped into memory partitions, one per color index
  – Memory usage ≠ cache usage:
  • If τ2's memory usage < 2 memory partitions → memory wastage
  • If τ1's memory usage > 1 memory partition → page swapping or memory pressure
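The co-partitioning arithmetic can be made concrete. The numbers below (1 GiB of RAM, 4 KiB pages, 32 colors) are illustrative assumptions, not values from the slides:

```python
import math

# Illustrative geometry: 1 GiB RAM, 4 KiB pages, 32 colors.
TOTAL_PAGES = (1 << 30) // (1 << 12)     # 262144 physical pages
NUM_COLORS = 32
PART_PAGES = TOTAL_PAGES // NUM_COLORS   # 8192 pages (32 MiB) per partition

def colors_forced_by_memory(task_pages: int) -> int:
    """Under strict coloring, a task confined to k colors can hold at most
    k * PART_PAGES pages, so its memory footprint alone dictates a minimum
    number of colors -- even if one color would satisfy its cache needs."""
    return math.ceil(task_pages / PART_PAGES)

# A 100 MiB task is forced to occupy 4 cache colors:
assert colors_forced_by_memory((100 << 20) >> 12) == 4
```

The converse is the wastage case: a task that needs only one color but uses a fraction of its 32 MiB partition strands the rest of that partition's pages.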

  7. Problems with Page Coloring (2/2)
  2. Limited number of cache partitions
  – Results in degraded performance as the number of tasks increases
  – The number of tasks cannot exceed the number of cache partitions: with 32 partitions (color indices 0–31), at most 32 tasks can each receive a private partition

  8. Our Goals
  • Challenges
  – Uncontrolled shared cache: cache interference penalties
  – Cache partitioning (page coloring): memory co-partitioning (memory wastage or shortage) and a limited number of cache partitions
  • Key idea: controlled sharing of partitioned caches while maintaining timing predictability
  1. Provide predictability on multi-core real-time systems
  2. Mitigate the problems of memory co-partitioning and limited partitions
  3. Allocate cache partitions efficiently

  9. Outline
  • Motivation
  • Coordinated Cache Management
  – System Model
  – Per-core Cache Reservation
  – Reserved Cache Sharing
  – Cache-Aware Task Allocation
  • Evaluation
  • Conclusion

  10. System Model
  • Task model τi : (Ci^p, Ti, Di, Mi)
  – Ci^p: worst-case execution time (WCET) of task τi when it runs alone in a system with p cache partitions; Ci^p is non-increasing with p
  – Ti: period of task τi
  – Di: relative deadline of task τi
  – Mi: maximum physical memory requirement of task τi
  • Partitioned fixed-priority preemptive scheduling
  • Assumptions: tasks do not self-suspend and do not share memory
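The task model translates directly into a small data type. A minimal sketch; the WCET table and its values are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class Task:
    """Task model tau_i : (C_i^p, T_i, D_i, M_i) from the slide.
    wcet[p-1] holds C_i^p, the WCET with p cache partitions."""
    wcet: list       # non-increasing in p
    period: float    # T_i
    deadline: float  # D_i
    mem_pages: int   # M_i, maximum physical memory requirement

    def C(self, p: int) -> float:
        """WCET when the task's core holds p cache partitions."""
        return self.wcet[p - 1]

# Hypothetical task: more cache partitions never increase the WCET.
t = Task(wcet=[10.0, 8.0, 7.0, 7.0], period=30.0, deadline=30.0, mem_pages=64)
assert all(t.C(p) >= t.C(p + 1) for p in range(1, len(t.wcet)))
```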

  11. Coordinated Cache Management
  • Building blocks: partitioned fixed-priority scheduling + page coloring (cache partitioning); tasks τi : (Ci^p, Ti, Di, Mi) are allocated to Nc cores and Np cache/memory partitions
  • Mechanisms for controlled sharing of cache partitions:
  1. Per-core cache reservation: prevents inter-core cache interference
  2. Reserved cache sharing: mitigates the problems with page coloring, with bounded penalties
  3. Cache-aware task allocation: the policy module controlling the mechanisms, driven by the task parameters
  • Considerations: (1) preserving schedulability, (2) guaranteeing memory requirements

  12. Intra-Core Cache Interference
  1. Cache warm-up delay
  – Occurs at the beginning of each period of a task
  – Caused by the execution of other tasks while the task is inactive
  2. Cache-related preemption delay
  – Occurs when a task is preempted by a higher-priority task
  – Imposed on the preempted task
  • Example: high-priority τ1 (C1 = 3) preempts low-priority τ2 (C2 = 3); τ2 incurs a cache warm-up delay at its arrival and a cache-related preemption delay at its resumption
  • Our response-time test bounds intra-core cache interference, is independent of the specific cache analysis used, and allows estimating WCET in isolation from other tasks
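A generic iterative response-time recurrence can illustrate how both delay terms enter the analysis: a warm-up delay charged once per job, and a cache penalty charged per preemption. This is a simplified sketch, not the paper's actual RT-test:

```python
import math

def response_time(task, hp_tasks, warmup, crpd):
    """Fixed-point response-time iteration with cache terms.
    task = (C, T, D); hp_tasks = [(C_j, T_j, D_j), ...] higher priority.
    'warmup' is a per-job cache warm-up bound, 'crpd' a per-preemption
    cache-related preemption delay bound (both assumed given).
    Returns the response time, or None on a deadline miss."""
    C, _T, D = task
    R = C + warmup
    while True:
        # Each higher-priority job contributes its WCET plus one CRPD.
        interference = sum(math.ceil(R / T_j) * (C_j + crpd)
                           for (C_j, T_j, _D_j) in hp_tasks)
        R_new = C + warmup + interference
        if R_new == R:
            return R if R <= D else None
        if R_new > D:
            return None
        R = R_new
```

Because the cache delays appear as separate additive terms, the per-task WCET C can be measured or analyzed in isolation, as the slide notes.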

  13. Page Allocation for Cache Sharing
  • Sharing cache partitions = sharing memory partitions
  – Cache sharing can be restricted by task memory requirements
  – Depends on how pages are allocated
  • Our approach: allocate pages to a task from its memory partitions in round-robin order (e.g., 8 pages over 2 partitions → 4 pages from each)
  – Bounds the worst-case memory usage in a memory partition
  – We developed a memory feasibility test for cache-partition sharing
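The round-robin policy and the bound it yields can be sketched as follows (`allocate_round_robin` is a hypothetical helper, not the Linux/RK implementation):

```python
import math
from collections import Counter

def allocate_round_robin(num_pages: int, colors: list) -> list:
    """Assign each of a task's pages a color by cycling through the task's
    memory partitions in round-robin order. The worst-case usage in any
    single partition is then ceil(num_pages / len(colors))."""
    return [colors[i % len(colors)] for i in range(num_pages)]

# The slide's example: 8 pages over 2 partitions -> 4 pages from each.
pages = allocate_round_robin(8, [0, 1])
assert max(Counter(pages).values()) == math.ceil(8 / 2)
```

The ceil bound is what makes a memory feasibility test possible: for any set of tasks sharing a group of partitions, the per-partition demand is bounded regardless of allocation order within the cycle.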

  14. Coordinated Cache Management
  • The same framework (partitioned fixed-priority scheduling + page coloring, with per-core cache reservation and reserved cache sharing), now focusing on the third mechanism:
  • Cache-aware task allocation: an algorithm to allocate tasks τi : (Ci^p, Ti, Di, Mi) and cache partitions to the Nc cores

  15. Cache-Aware Task Allocation (1/2)
  • Objectives
  – Reduce the number of cache partitions required for a given taskset: remaining cache partitions can serve non-real-time tasks or save CPU usage
  – Exploit the benefits of cache sharing
  • Our approach
  – Based on the best-fit decreasing (BFD) bin-packing heuristic: load concentration is helpful for cache sharing
  – Gradually assign caches to cores while allocating tasks to cores, using cache reservation and cache sharing during task allocation

  16. Cache-Aware Task Allocation (2/2)
  • Step 1: Each core is initially assigned zero cache partitions
  • Step 2: Find the core where the task fits best
  • Step 3: If none is found, retry the best-fit search assuming each core has one more cache partition than before
  • Step 4: Once found, the best-fit core is assigned the task and the assumed cache partition(s)
  • Example (Ui = Ci / Ti): tasks τ1 (0.7), τ2 (0.4), τ3 (0.3), τ4 (0.2); assigning cache partitions to τ1's core decreases its utilization (0.7 → 0.5), so the harmonic task τ3 fits on the same core, concentrating load and sharing cache
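The four steps can be sketched as follows. This is a simplified illustration, not the paper's algorithm: schedulability is approximated by a utilization bound of 1.0 rather than the response-time test, and each task is described by a hypothetical function util(p), its utilization when its core holds p cache partitions (non-increasing in p):

```python
def cache_aware_bfd(tasks, num_cores, total_partitions):
    """Simplified sketch of the 4-step cache-aware allocation.
    tasks: {task_id: util(p)}; returns core assignments or None."""
    cores = [{"tasks": [], "parts": 0} for _ in range(num_cores)]
    free = total_partitions

    def util_of(core, extra, new=None):
        # Core utilization if it held 'extra' more partitions (and 'new').
        p = core["parts"] + extra
        members = core["tasks"] + ([new] if new else [])
        return sum(tasks[t](p) for t in members)

    # BFD: allocate heavier tasks first (utilization at 1 partition).
    for tid in sorted(tasks, key=lambda t: -tasks[t](1)):
        placed = False
        for extra in range(0, free + 1):          # Steps 2-3
            fits = [c for c in cores
                    if c["parts"] + extra >= 1
                    and util_of(c, extra, tid) <= 1.0]
            if fits:                               # Step 4: best fit
                best = min(fits, key=lambda c: 1.0 - util_of(c, extra, tid))
                best["tasks"].append(tid)
                best["parts"] += extra
                free -= extra
                placed = True
                break
        if not placed:
            return None   # taskset not allocatable under this sketch
    return cores
```

With constant utilizations 0.7, 0.4, 0.3, 0.2 on 4 cores and 4 partitions, this sketch packs the four tasks onto two cores using only two cache partitions, leaving the rest free, which is the effect the slide's objectives describe.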

  17. Outline
  • Motivation
  • Coordinated Cache Management
  – System Model
  – Per-core Cache Reservation
  – Reserved Cache Sharing
  – Cache-Aware Task Allocation
  • Evaluation
  • Conclusion

  18. Implementation
  • Based on Linux/RK memory reservation
  – A page pool stores unallocated physical pages
  – Pages are classified into memory partitions by their color indices (1 … Np)
  • A real-time task τi : (Ci^p, Ti, Di, Mi) is admitted with its parameters (memory requirement Mi = m pages, cache indices, core index) and receives a CPU/memory reserve together with its cache partitions
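A page pool that classifies free pages by color might look like this sketch; the geometry and the API are illustrative assumptions, not Linux/RK's actual interface:

```python
from collections import defaultdict

PAGE_SHIFT, NUM_COLORS = 12, 32   # hypothetical geometry, not from the slides

class PagePool:
    """Sketch of a Linux/RK-style page pool: unallocated physical pages
    are grouped into memory partitions by their color index."""
    def __init__(self, phys_addrs):
        self.partitions = defaultdict(list)
        for pa in phys_addrs:
            color = (pa >> PAGE_SHIFT) % NUM_COLORS
            self.partitions[color].append(pa)

    def alloc(self, color):
        """Hand out a free page from the requested memory partition,
        or None if that partition is exhausted."""
        return self.partitions[color].pop() if self.partitions[color] else None

# 64 consecutive page frames spread evenly over the 32 colors:
pool = PagePool(pa << PAGE_SHIFT for pa in range(64))
assert pool.alloc(5) is not None
```

A reservation for task τi would then draw Mi pages from the pool, restricted to τi's assigned color indices (round-robin across them, per slide 13).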
