
Heterogeneity-Aware Resource Allocation and Scheduling in the Cloud

Gunho Lee (UC Berkeley), Byung-Gon Chun (Yahoo! Research), Randy H. Katz (UC Berkeley)


  1. Heterogeneity-Aware Resource Allocation and Scheduling in the Cloud. Gunho Lee (UC Berkeley), Byung-Gon Chun (Yahoo! Research), Randy H. Katz (UC Berkeley)

  2. We have resources and jobs (diagram labels: Resource, Job/Task)

  3. Allocate resources (slots) (diagram labels: Allocation, Resource, Job/Task)

  4. Then schedule jobs/tasks on them (diagram labels: Allocation, Scheduling, Resource, Job/Task)

  5. Goal 1: Minimize the cluster size while providing good performance. Dynamic resource allocation (diagram labels: Resource, Job/Task)

  6. Goal 2: Provide each job with a “fair share” of resources. Fair scheduling (diagram labels: Resource, Job/Task)

  7. Heterogeneity makes the problem more complex (diagram labels: Allocation ???, Scheduling ???, Resource, Job/Task)

  8. Our Approach
     • Consider job affinity to match more suitable resources to jobs
     • Redefine a share metric to provide fairness
     • Allocation: Core Nodes + Accelerator Nodes
     • Scheduling: Progress Share

  9. Fair Share Metric
     • The scheduler tries to equalize the “share” of all jobs
       – SlotShare: number of slots owned
         • Does not work well in heterogeneous environments
       – ProgressShare: progress being made with owned slots / progress with all slots
         • Contribution of a slot to a job’s progress rate
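The two metrics on slide 9 can be compared on a toy cluster. The following is a minimal sketch, not the authors' implementation: it assumes per-slot contributions to a job's progress rate simply add up, normalizes SlotShare by the cluster size, and uses made-up slot names (`g0`, `n0`, ...) and rates.

```python
def slot_share(owned, all_slots):
    """Fraction of the cluster's slots that a job owns."""
    return len(owned) / len(all_slots)


def progress_share(rate, owned, all_slots):
    """Progress rate on the owned slots relative to the rate on all slots.

    `rate` maps a slot name to that slot's contribution to the job's
    progress rate. Additive contributions are an assumption of this sketch.
    """
    return sum(rate[s] for s in owned) / sum(rate[s] for s in all_slots)


# Hypothetical cluster: 2 fast ("g") slots and 8 normal ("n") slots.
all_slots = ["g0", "g1"] + [f"n{i}" for i in range(8)]

# Hypothetical job A: 4x faster on the "g" slots than on the "n" slots.
rate_a = {s: (4.0 if s.startswith("g") else 1.0) for s in all_slots}

# Job A is given half of the slots, but none of the fast ones.
owned_a = ["n0", "n1", "n2", "n3", "n4"]

a_slot_share = slot_share(owned_a, all_slots)
a_progress_share = progress_share(rate_a, owned_a, all_slots)
print(a_slot_share, a_progress_share)
```

With these numbers job A holds half of the slots yet makes less than a third of the progress it would make alone, which is the kind of gap SlotShare cannot see.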

  10. Progress Share (figure: progress from 0 to 1 against time; line for progress without sharing, 1 job)

  11. Progress Share (figure: adds a “just good” line for progress with sharing, 2 jobs)

  12. Progress Share (figure: adds the labels “Even better” and “Under-served”)

  13. Progress Share (figure: adds slopes a and b). Progress Share of Job A = ratio of progress slopes (b/a)
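The slope ratio on slide 13 can be written out as follows. This is one reading of the figure, not a formula taken from the slides: the symbol $P_A(t)$ for job A's progress is introduced here, and it is assumed that $a$ is the slope without sharing and $b$ the slope with sharing, which keeps the ratio between 0 and 1.

```latex
\[
\mathrm{ProgressShare}_A = \frac{b}{a},
\qquad
a = \frac{dP_A}{dt}\Big|_{\text{no sharing}},
\quad
b = \frac{dP_A}{dt}\Big|_{\text{with sharing}}
\]
% Illustrative numbers (not from the slides):
% a = 1/8 and b = 1/16 give ProgressShare_A = 1/2.
```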

  14. Homogeneous case (figure: slot share, and progress share of Job A and Job B, over time)

  15. Heterogeneous case: Job A runs faster on gray slots (figure: grid of slots assigned to jobs A and B; progress of Job A and Job B over time)

  16. Heterogeneous case 1: Using SlotShare (figure: slot assignment to A and B; slot share over time; progress share of Job A and Job B over time)

  17. Heterogeneous case 1: Using SlotShare (figure as on slide 16)

  18. Heterogeneous case 1: Using SlotShare. Job A is making less progress with the same number of slots (figure as on slide 16)

  19. Heterogeneous case 2: Using ProgressShare (figure: slot assignment to A and B; slot share over time; progress share of Job A and Job B over time)

  20. Heterogeneous case 2: Using ProgressShare (figure as on slide 19)

  21. Heterogeneous case 2: Using ProgressShare. Both jobs making progress >= 0.5 (figure as on slide 19)
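The contrast between the two heterogeneous cases can be reproduced with a small greedy allocator. This is only a sketch under stated assumptions, not the scheduler from the talk: the function names, the affinity tie-break, the additive progress model, and the cluster (2 gray slots on which job A runs 4x faster, 8 normal slots) are all hypothetical.

```python
def progress_share(job_rates, owned, slots):
    """Progress rate on owned slots relative to the rate on all slots."""
    return sum(job_rates[s] for s in owned) / sum(job_rates[s] for s in slots)


def allocate_by_progress_share(rates, slots):
    """Hand out slots one at a time to the job with the lowest progress share.

    `rates[job][slot]` is the slot's contribution to the job's progress rate.
    The chosen job takes the free slot with the best affinity: its own
    normalized gain minus the largest normalized gain of any other job.
    """
    totals = {job: sum(rates[job][s] for s in slots) for job in rates}
    owned = {job: [] for job in rates}
    free = list(slots)

    def affinity(job, slot):
        mine = rates[job][slot] / totals[job]
        others = [rates[o][slot] / totals[o] for o in rates if o != job]
        return mine - max(others, default=0.0)

    while free:
        shares = {j: progress_share(rates[j], owned[j], slots) for j in rates}
        job = min(shares, key=shares.get)  # most under-served job
        slot = max(free, key=lambda s: affinity(job, s))
        free.remove(slot)
        owned[job].append(slot)
    return owned


# Hypothetical cluster: job A is 4x faster on gray ("g") slots, job B is not.
slots = ["g0", "g1"] + [f"n{i}" for i in range(8)]
rates = {
    "A": {s: (4.0 if s.startswith("g") else 1.0) for s in slots},
    "B": {s: 1.0 for s in slots},
}

owned = allocate_by_progress_share(rates, slots)
final = {j: progress_share(rates[j], owned[j], slots) for j in rates}
print(owned, final)
```

In this toy run job A ends up with fewer slots than job B but receives both gray slots, and both jobs finish with a progress share of at least 0.5. An equal split by slot count that gave job A only normal slots would leave it at 5/16.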

  22. Performance Gain of Using Progress Share

  23. Summary
     • Heterogeneity should be taken into account at both levels of two-level scheduling
       – Resource Allocation and Job Scheduling
     • Need to redefine “share” to provide performance and fairness simultaneously in heterogeneous environments
       – Propose “progress share”
     • Future Work
       – Combine with a sub-linear performance model
       – Consider interference of co-located jobs
