THE LINUX SCHEDULER: A DECADE OF WASTED CORES


  1. THE LINUX SCHEDULER: A DECADE OF WASTED CORES (slide 1/16)
     Jean-Pierre Lozi (jplozi@unice.fr)
     Baptiste Lepers (baptiste.lepers@epfl.ch)
     Fabien Gaud (me@fabiengaud.net)
     Vivien Quéma (vivien.quema@imag.fr)
     Alexandra Fedorova (sasha@ece.ubc.ca)
     Justin Funston (jfunston@ece.ubc.ca)

  2. INTRODUCTION (slide 2/16)
     - Take a machine with a lot of cores (64 in our case)
     - Run two CPU-intensive processes in two terminals (e.g. R scripts):
         R < script.R --nosave &
         R < script.R --nosave &
     - Compile your kernel in a third terminal:
         make -j 62 kernel
     - Here is what might happen:
     - Two NUMA nodes with many idle cores (white)
     - Other NUMA nodes with many overloaded cores (orange, red)
     - Performance degradation: 14% for the make process!

  3. INTRODUCTION (slide 3/16)
     - General-purpose schedulers aim to be work-conserving on multicore architectures
     - Basic invariant: no idle cores if some cores have several threads in their runqueues
     - Violations can actually happen, but only in transient situations!
     - We found four major bugs that break this invariant in the Linux scheduler (CFS)!
     - This talk: presentation of the CFS scheduler + issues we found + discussion
     - Disclaimer: this is a motivation paper! Don't expect a solved problem
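The invariant on this slide can be written as a small predicate. This is a minimal sketch, not code from the talk: the function name and the list-of-counts representation are assumptions made for illustration, and each count is assumed to include the thread currently running on that core.

```python
def violates_work_conserving(runqueue_lengths):
    """Return True if some core is idle while another core has several threads.

    runqueue_lengths[i] is the number of runnable threads on core i,
    including the one currently running (an assumption of this sketch).
    """
    has_idle_core = any(n == 0 for n in runqueue_lengths)
    has_overloaded_core = any(n >= 2 for n in runqueue_lengths)
    return has_idle_core and has_overloaded_core


print(violates_work_conserving([1, 1, 1, 1]))  # False
print(violates_work_conserving([0, 3, 1, 1]))  # True
```

A snapshot that returns True is not necessarily a bug, since the slide allows transient violations; the concern is a violation that persists.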

  4. THE COMPLETELY FAIR SCHEDULER (CFS): CONCEPT (slide 4/16)
     - One runqueue, threads sorted by runtime
     - When a thread is done running for its timeslice: enqueued again
     - Lower niceness = longer timeslice (tasks allowed to run longer)
     - Cores: next task from runqueue
     - In practice: cannot work with a single runqueue because of contention!
     [Figure: one shared runqueue holding threads with runtimes R = 12, 18, 24, 82 and 103, plus a re-enqueued thread with R = 112, feeding Core 0 to Core 3]
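The single-runqueue concept above can be sketched as follows. This is an illustrative simplification, not the kernel's implementation: a binary heap stands in for the kernel's sorted structure, every thread gets the same fixed timeslice (niceness is ignored), and the `simulate` function and its parameters are hypothetical names. The starting runtimes are taken from the figure.

```python
import heapq


def simulate(runtimes, timeslice, steps):
    """Make `steps` scheduling decisions on one shared runqueue.

    Each step picks the thread with the smallest accumulated runtime, runs it
    for `timeslice`, and enqueues it again with its updated runtime.
    Returns the thread ids in the order they were picked.
    """
    queue = [(runtime, tid) for tid, runtime in enumerate(runtimes)]
    heapq.heapify(queue)
    order = []
    for _ in range(steps):
        runtime, tid = heapq.heappop(queue)  # thread that has run the least
        order.append(tid)
        heapq.heappush(queue, (runtime + timeslice, tid))  # enqueued again
    return order


print(simulate([12, 18, 24, 82, 103], timeslice=10, steps=5))  # [0, 1, 0, 2, 1]
```

Threads that have run the least are picked repeatedly until their runtime catches up with the others, which is the fairness property the slide describes.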

  5. CFS: IN PRACTICE (slide 5/16)
     - One runqueue per core to avoid contention
     - CFS periodically balances "loads":
         load(task) = weight [1] x % CPU use [2]
         [1] Lower niceness = higher weight
         [2] Prevent high-priority thread from taking whole CPU just to sleep
     - Since there can be many cores: hierarchical approach!
     [Figure: per-core runqueues on Core 0 and Core 1 holding six threads with W=1 and one thread with W=6]
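The load formula on this slide can be illustrated with a short sketch. The function names are hypothetical, CPU use is expressed as a fraction between 0 and 1 rather than a percentage, and the assignment of threads to cores is an assumed example that reuses the weights shown in the figure.

```python
def task_load(weight, cpu_use):
    """load(task) = weight x fraction of CPU time the task actually uses."""
    if not 0.0 <= cpu_use <= 1.0:
        raise ValueError("cpu_use must be a fraction between 0 and 1")
    return weight * cpu_use


def core_load(tasks):
    """Sum the loads of all (weight, cpu_use) pairs in one runqueue."""
    return sum(task_load(weight, cpu_use) for weight, cpu_use in tasks)


# One always-running thread with W=6 weighs as much as six with W=1.
print(core_load([(6, 1.0)]))      # 6.0
print(core_load([(1, 1.0)] * 6))  # 6.0

# A high-weight thread that mostly sleeps contributes far less load.
print(task_load(6, 0.25))         # 1.5
```

The last line shows why the CPU-use factor matters: without it, a mostly sleeping high-priority thread would count as a full load of 6 and could keep a core to itself.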

  6. CFS: BALANCING THE LOAD (slide 6/16)
     [Figure: animation of load balancing across Core 0 to Core 3; runqueues show load values L=1000, L=2000, L=3000 and L=6000, and the final frames are marked "Balanced!"]
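The idea of evening out per-core load can be sketched with a simple greedy loop. This is not the kernel's algorithm, which is hierarchical according to the previous slide; the `balance` function, the flat list-of-lists layout, and the example loads are assumptions made for illustration.

```python
def balance(cores):
    """Greedily even out per-core load.

    `cores` is a non-empty list of lists of positive thread loads. A thread
    moves from the most loaded core to the least loaded one as long as the
    move strictly narrows the gap between the two. The input is modified in
    place and returned.
    """
    while True:
        busiest = max(cores, key=sum)
        idlest = min(cores, key=sum)
        gap = sum(busiest) - sum(idlest)
        # Only threads lighter than the gap narrow it when moved.
        movable = [load for load in busiest if 0 < load < gap]
        if not movable:
            return cores
        thread = max(movable)
        busiest.remove(thread)
        idlest.append(thread)


cores = [[1000] * 6, [], [1000], [1000]]
print([sum(core) for core in balance(cores)])  # [2000, 2000, 2000, 2000]
```

Each move lowers the sum of squared core loads, so the loop always terminates. In this example the initially idle core ends up with work, which restores the work-conserving invariant.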
