Reliability-Aware Scheduling on Heterogeneous Multicore Processors


  1. Reliability-Aware Scheduling on Heterogeneous Multicore Processors • Ajeya Naithani (Ghent University, Belgium) • Stijn Eyerman (Intel, Belgium) • Lieven Eeckhout (Ghent University, Belgium) • HPCA 2017

  2. Motivation • Wide use of Heterogeneous Chip Multi-Processors (HCMPs) in mobile SoCs - e.g., ARM’s big.LITTLE, Nvidia’s Tegra • Big cores: fast, but more transistors! • Little cores: simple, but longer execution!

  3. Goal • Increasing robustness of HCMPs against soft errors • Reliability-aware scheduling of HCMPs based on: - Core type - Workload

  4. Terminology • ACE bit: Architecturally Correct Execution bit • ABC: ACE Bit Count, accumulated over the execution time from 0 to T • AVF: Architectural Vulnerability Factor, $\mathrm{AVF} = \frac{\mathrm{ABC}}{\text{Total Bit Count}}$ • IFR: Intrinsic Fault Rate • SER: Soft Error Rate, $\mathrm{SER} = \frac{\mathrm{ABC}}{T} \times \mathrm{IFR}$
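
The two definitions on this slide can be sketched in a few lines of Python. This is only an illustration: the function names and the numbers are hypothetical, and the total bit count is assumed here to be the number of bits in the structure multiplied by the number of cycles observed, which the slide does not spell out.

```python
def avf(abc: float, total_bit_count: float) -> float:
    """Architectural Vulnerability Factor: AVF = ABC / Total Bit Count."""
    return abc / total_bit_count


def ser(abc: float, exec_time: float, ifr: float) -> float:
    """Soft Error Rate: SER = (ABC / T) * IFR."""
    return abc / exec_time * ifr


# Hypothetical numbers: a 1024-bit structure observed for 1000 cycles,
# of which 256000 bit-cycles held ACE bits.
bits, cycles, ace_bit_cycles = 1024, 1000, 256_000
print(avf(ace_bit_cycles, bits * cycles))        # 0.25
print(ser(ace_bit_cycles, cycles, ifr=1e-6))     # IFR value is made up
```

With these made-up inputs a quarter of the structure's bit-cycles are vulnerable, and the SER scales linearly with whatever IFR the technology has.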

  5. Reliability-Aware Scheduling • Requires a metric for system-wide reliability: $\mathrm{System\_SER} = \sum_{i=1}^{n} \mathrm{SER}_i$ • Caveat: applications may slow down!
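
One way to read the slowdown caveat, as an illustrative sketch rather than the authors' own derivation: because SER = ABC / T × IFR is a rate, stretching the execution time lowers it even when the same ACE bit count is exposed. Assuming a hypothetical application whose ABC stays fixed while its run time doubles:

```latex
\[
\mathrm{SER}_{\text{slow}}
  = \frac{\mathrm{ABC}}{2T} \times \mathrm{IFR}
  = \frac{1}{2}\,\mathrm{SER}_{\text{fast}},
\qquad
\mathrm{SER}_{\text{slow}} \times 2T
  = \mathrm{SER}_{\text{fast}} \times T
  = \mathrm{ABC} \times \mathrm{IFR}.
\]
```

The rate halves, yet the exposure accumulated over the whole run is unchanged, so a plain sum of per-application SERs can reward a schedule simply for slowing applications down.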

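  6. Reliability-Aware Scheduling • Reliability metric: System-level Soft Error Rate (SSER), $\mathrm{SSER} = \sum_{i=1}^{n} \mathrm{wSER}_i$ • Weighted SER: $\mathrm{wSER} = \frac{\mathrm{ABC}}{T_{\mathrm{ref}}} \times \mathrm{IFR} = \frac{\mathrm{ABC}}{T} \times \frac{T}{T_{\mathrm{ref}}} \times \mathrm{IFR}$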
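
A minimal Python sketch of the weighted metric follows. The slide does not define T_ref; it is assumed here to be the application's execution time on some reference core. All names and numbers are hypothetical.

```python
def ser(abc: float, exec_time: float, ifr: float) -> float:
    """Plain soft error rate: SER = (ABC / T) * IFR."""
    return abc / exec_time * ifr


def wser(abc: float, t_ref: float, ifr: float) -> float:
    """Weighted SER: wSER = (ABC / T_ref) * IFR."""
    return abc / t_ref * ifr


def sser(apps, ifr: float) -> float:
    """System-level SER: sum of wSER over (abc, t_ref) pairs."""
    return sum(wser(abc, t_ref, ifr) for abc, t_ref in apps)


# Hypothetical application: ABC = 100, actual time T = 20, reference time 10.
abc, t, t_ref, ifr = 100.0, 20.0, 10.0, 1.0

# The second form of the formula: wSER equals SER scaled by T / T_ref.
assert wser(abc, t_ref, ifr) == ser(abc, t, ifr) * (t / t_ref)

# Two hypothetical applications sharing the same reference time.
print(sser([(100.0, 10.0), (30.0, 10.0)], ifr))  # 13.0
```

Because T cancels out, an application's wSER does not fall just because it runs for longer; it falls only if its ABC falls.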

  7. Reliability-Aware Scheduling (flowchart) • Start → Sampling data up to date? - No: update sampling data - Yes: update all the possible wSERs → Switch the pair of apps that decreases SSER → Record performance and ABC for each application • Sampling data per application: 1. ABC on different cores 2. Performance on different cores
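
The swap step of this flowchart could be sketched as below. This is not the authors' implementation. It assumes that sampled ABC values are available for every application on both core types, that T_ref is a per-application reference time, that IFR is a constant, and that the scheduler picks the single pair whose swap lowers SSER the most. The slide only says that a pair which decreases SSER is switched. All names and numbers are hypothetical.

```python
from itertools import combinations

IFR = 1.0  # hypothetical constant intrinsic fault rate


def wser(abc: float, t_ref: float) -> float:
    """Weighted SER: wSER = (ABC / T_ref) * IFR."""
    return abc / t_ref * IFR


def best_swap(assignment, abc, t_ref):
    """Return (app_a, app_b, delta) for the swap that lowers SSER most, or None.

    assignment: app -> core type it currently runs on
    abc:        (app, core type) -> sampled ABC on that core type
    t_ref:      app -> reference execution time
    """
    best = None
    for a, b in combinations(assignment, 2):
        core_a, core_b = assignment[a], assignment[b]
        if core_a == core_b:
            continue  # swapping within one core type changes nothing
        before = wser(abc[a, core_a], t_ref[a]) + wser(abc[b, core_b], t_ref[b])
        after = wser(abc[a, core_b], t_ref[a]) + wser(abc[b, core_a], t_ref[b])
        delta = after - before
        if delta < 0 and (best is None or delta < best[2]):
            best = (a, b, delta)
    return best


def schedule_step(assignment, abc, t_ref):
    """Apply the best SSER-reducing swap, if any, and return the new assignment."""
    new_assignment = dict(assignment)
    swap = best_swap(assignment, abc, t_ref)
    if swap is not None:
        a, b, _ = swap
        new_assignment[a], new_assignment[b] = assignment[b], assignment[a]
    return new_assignment


# Hypothetical sampling data for four applications on a 2-big, 2-little system.
assignment = {"A": "big", "B": "big", "C": "little", "D": "little"}
abc = {
    ("A", "big"): 100.0, ("A", "little"): 40.0,
    ("B", "big"): 50.0, ("B", "little"): 45.0,
    ("C", "big"): 120.0, ("C", "little"): 30.0,
    ("D", "big"): 60.0, ("D", "little"): 20.0,
}
t_ref = {"A": 10.0, "B": 10.0, "C": 10.0, "D": 10.0}

print(schedule_step(assignment, abc, t_ref))
```

With this made-up data the only beneficial swap is A with D: application A loses much more ABC by moving to a little core than D gains by moving to a big one. A second call on the result finds no further improving swap and leaves the assignment unchanged.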

  8. Hardware Overhead • Profiled structures: - ROB - Issue queue - Load/store queue - Physical output registers - Functional units • Total hardware overhead: - 904 bytes per big core - 67 bytes per little core • Approximation: Only profile ROB → HW overhead: 296 bytes per big core

  9. Evaluation • Three schedulers: - Random - Reliability-optimized - Performance-optimized • Benchmark suite - SPEC CPU2006

  10. Application Characteristics [Fig. 1 in the paper]

  11. Results – Symmetric HCMP (2B2S) [Fig. 6 in the paper] • Reliability: - 32% improvement over random scheduling - 25.4% improvement over performance-optimized scheduling • Performance: - Same as random scheduling - 6.3% degradation relative to performance-optimized scheduling

  12. Results – Asymmetric HCMP [Fig. 8 in the paper]

  13. Results – Workload Categories [Fig. 7 in the paper]

  14. Results – Approximate Profiling [Modified version of Fig. 10 in the paper]

  15. Conclusion • Applications and cores → Different vulnerability characteristics • Reliability-aware scheduling: - System reliability 25.4% ↑ - Performance 6.3% ↓ • The proposed scheduler is robust across: - Core configurations - Workload types

  16. Discussion Points • HCMPs are mainly used in mobile SoCs, while the evaluated benchmarks target server/desktop processors. Is the sampling method actually applicable to mobile applications, which are heavily I/O-based? • The scheduling is implemented in software. Is it efficient on a battery-constrained platform? • What if the number of applications exceeds the number of cores?
