Dominant Resource Fairness (DRF)
Fair Allocation of Multiple Resource Types
Ali Ghodsi, Matei Zaharia, Benjamin Hindman, Andy Konwinski, Scott Shenker, Ion Stoica
University of California, Berkeley
alig@cs.berkeley.edu
What is fair sharing?
• n users want to share a resource (e.g. CPU)
– Solution: allocate each user 1/n of the shared resource
• Generalized by max-min fairness
– Handles the case where a user wants less than her fair share
– E.g. user 1 wants no more than 20%
• Generalized by weighted max-min fairness
– Give weights to users according to importance
– E.g. user 1 gets weight 1, user 2 gets weight 2
[Figure: three CPU bar charts. Equal split: 33% / 33% / 33%. Max-min split: 20% / 40% / 40%. Weighted split: 33% / 66%.]
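The three splits on this slide can be reproduced with a small progressive-filling routine. This is an illustrative sketch, not code from the talk: the function name `max_min_share` and its parameters are assumptions, and the resource is treated as perfectly divisible.

```python
def max_min_share(capacity, demands, weights=None):
    """Weighted max-min fair split of one divisible resource (progressive filling)."""
    n = len(demands)
    weights = list(weights) if weights is not None else [1.0] * n
    alloc = [0.0] * n
    active = set(range(n))          # users whose allocation is not final yet
    remaining = float(capacity)
    while active:
        total_w = sum(weights[i] for i in active)
        # Tentative weighted share of what is left for every active user.
        share = {i: remaining * weights[i] / total_w for i in active}
        # Users who want no more than their share are capped at their demand.
        capped = [i for i in active if demands[i] <= share[i]]
        if not capped:
            # Everyone wants more than the share: hand out the shares and stop.
            for i in active:
                alloc[i] = share[i]
            break
        for i in capped:
            alloc[i] = float(demands[i])
            remaining -= demands[i]
            active.remove(i)
    return alloc


print(max_min_share(100, [100, 100, 100]))        # equal split, about 33% each
print(max_min_share(100, [20, 100, 100]))         # user 1 capped at 20%
print(max_min_share(100, [100, 100], [1, 2]))     # weights 1 and 2
```

Capping a user at her demand frees capacity that is then re-split among the remaining users, which is why the second call yields 20 / 40 / 40 rather than 20 / 33 / 33.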
Properties of max-min fairness
• Share guarantee
– Each user can get at least 1/n of the resource
– But will get less if her demand is less
• Strategy-proof
– Users are not better off by asking for more than they need
– Users have no reason to lie
• Max-min fairness is the only "reasonable" mechanism with these two properties
Why care about fairness?
• Desirable properties of max-min fairness
– Isolation policy: a user gets her fair share irrespective of the demands of other users
– Flexibility separates mechanism from policy: proportional sharing, priority, reservation, ...
• Many schedulers use max-min fairness
– Datacenters: Hadoop's fair scheduler, capacity scheduler, Quincy
– OS: round-robin, proportional sharing, lottery, Linux CFS, ...
– Networking: WFQ, WF2Q, SFQ, DRR, CSFQ, ...
Why is max-min fairness not enough?
• Job scheduling in datacenters is not only about CPUs
– Jobs consume CPU, memory, disk, and I/O
• Does this pose any challenge?
Heterogeneous Resource Demands
• Most tasks need ~<2 CPU, 2 GB RAM>
• Some tasks are memory-intensive
• Some tasks are CPU-intensive
[Figure: per-task resource demands in a 2000-node Hadoop cluster at Facebook (Oct 2010)]
Problem
• Single resource example
– 1 resource: CPU
– User 1 wants <1 CPU> per task
– User 2 wants <3 CPU> per task
[Figure: CPU bar split 50% / 50% between the two users]
• Multi-resource example
– 2 resources: CPUs & memory
– User 1 wants <1 CPU, 4 GB> per task
– User 2 wants <3 CPU, 1 GB> per task
– What's a fair allocation?
[Figure: CPU and memory bars with the split marked "?"]
Problem definition
How to fairly share multiple resources when users have heterogeneous demands on them?
Talk Outline
• What properties do we want?
• How do we solve it (DRF)?
• How would an economist solve this?
• How well does this work in practice?
Model
• Users have tasks according to a demand vector
– E.g. <2, 3, 1>: the user's tasks each need 2 of R1, 3 of R2, 1 of R3
– Not needed in practice: can measure actual consumption
• Resources given in multiples of demand vectors
• Assume divisible resources
A Natural Policy
• Asset Fairness
– Equalize each user's sum of resource shares
• Cluster with 70 CPUs, 70 GB RAM
– U1 needs <2 CPU, 2 GB RAM> per task
– U2 needs <1 CPU, 2 GB RAM> per task
A Natural Policy
• Asset Fairness
– Equalize each user's sum of resource shares
• Cluster with 70 CPUs, 70 GB RAM
– U1 needs <2 CPU, 2 GB RAM> per task
– U2 needs <1 CPU, 2 GB RAM> per task
• Asset fairness yields
– U1: 15 tasks: 30 CPUs, 30 GB (∑=60)
– U2: 20 tasks: 20 CPUs, 40 GB (∑=60)
• Problem
– User 1 has < 50% of both CPUs and RAM
– Better off in a separate cluster with 50% of the resources
[Figure: CPU and RAM bars. User 1: 43% CPU, 43% RAM. User 2: 28% CPU, 57% RAM.]
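The asset-fairness numbers above can be checked with exact arithmetic. The sketch below is illustrative only: `asset_fair_tasks` is a hypothetical helper, and it assumes integer capacities and demands, divisible tasks, unbounded demand, and that every resource is requested by at least one user.

```python
from fractions import Fraction


def asset_fair_tasks(capacity, demands):
    """Tasks per user when every user's summed resource share is equalized."""
    # Asset share (sum of per-resource shares) consumed by one task of each user.
    per_task = [sum(Fraction(d, c) for d, c in zip(dem, capacity)) for dem in demands]
    # With a common asset share s, user i runs s / per_task[i] tasks.
    # The largest feasible s is set by the first resource that fills up.
    s = min(
        Fraction(c) / sum(Fraction(dem[r]) / per_task[i] for i, dem in enumerate(demands))
        for r, c in enumerate(capacity)
    )
    return [s / p for p in per_task]


capacity = (70, 70)            # <CPUs, GB RAM>
demands = [(2, 2), (1, 2)]     # per-task demand of U1 and U2
tasks = asset_fair_tasks(capacity, demands)
u1_shares = [tasks[0] * d / c for d, c in zip(demands[0], capacity)]
print(tasks)       # 15 tasks for U1, 20 tasks for U2
print(u1_shares)   # U1 holds 3/7 (about 43%) of each resource
```

User 1 ends up below 1/2 on every resource, which is the problem the slide points out for a two-user cluster.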
Share Guarantee
• Every user should get 1/n of at least one resource
• Intuition:
– "You shouldn't be worse off than if you ran your own cluster with 1/n of the resources"
Cheating the Scheduler
• Users are willing to game the system to get more resources
• Real-life examples
– A cloud provider had quotas on map and reduce slots. Some users found out that the map quota was low
– Users implemented maps in the reduce slots!
– A search company provided dedicated machines to users that could ensure a certain level of utilization (e.g. 80%)
– Users used busy-loops to inflate utilization
Strategy-proofness
• A user should not be able to increase her allocation by lying about her demand vector
• Intuition:
– Users are incentivized to provide truthful resource requirements
Challenge
• Can we find a fair sharing policy that provides
– Strategy-proofness
– Share guarantee
• Max-min fairness for a single resource had these properties
– Can we generalize max-min fairness to multiple resources?
Talk Outline
• What properties do we want?
• How do we solve it (DRF)?
• How would an economist solve this?
• How well does this work in practice?
Dominant Resource Fairness
• A user's dominant resource is the resource she has the biggest share of
– Example: total resources: <10 CPU, 4 GB>
– User 1's allocation: <2 CPU, 1 GB>
– Dominant resource is memory, as 1/4 > 2/10 (1/5)
• A user's dominant share is the fraction of the dominant resource she is allocated
– User 1's dominant share is 25% (1/4)
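The definition above amounts to taking the largest per-resource share. A minimal sketch follows; the helper name `dominant_share` and the dictionary keys are illustrative assumptions, not part of the talk.

```python
def dominant_share(allocation, capacity):
    """Return (dominant resource, dominant share) for one user's allocation."""
    # Share of each resource held by the user.
    shares = {name: allocation[name] / capacity[name] for name in capacity}
    # The dominant resource is the one with the largest share.
    resource = max(shares, key=shares.get)
    return resource, shares[resource]


total = {"cpu": 10, "mem_gb": 4}
user1 = {"cpu": 2, "mem_gb": 1}
print(dominant_share(user1, total))   # ('mem_gb', 0.25)
```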
Dominant Resource Fairness (2)
• Apply max-min fairness to dominant shares
• Equalize the dominant share of the users
– Example: total resources: <9 CPU, 18 GB>
– User 1 demand: <1 CPU, 4 GB>, dominant resource: memory
– User 2 demand: <3 CPU, 1 GB>, dominant resource: CPU
[Figure: CPU (9 total) and memory (18 total) bars. User 1: 3 CPUs, 12 GB (66% of memory). User 2: 6 CPUs, 2 GB (66% of CPU).]
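The example allocation can be derived in closed form by equalizing dominant shares. The sketch below is an illustration under stated assumptions, not the authors' implementation: `drf_tasks` is a hypothetical name, tasks are divisible, demand is unbounded, and every user requests a positive amount of every resource (so all users stop growing when the first resource saturates).

```python
from fractions import Fraction


def drf_tasks(capacity, demands):
    """Common dominant share and tasks per user under divisible DRF."""
    # Dominant share consumed by one task of each user.
    per_task = [max(Fraction(d, c) for d, c in zip(dem, capacity)) for dem in demands]
    # With common dominant share s, user i runs s / per_task[i] tasks.
    # The first resource to saturate fixes the largest feasible s.
    s = min(
        Fraction(c) / sum(Fraction(dem[r]) / per_task[i] for i, dem in enumerate(demands))
        for r, c in enumerate(capacity)
    )
    return s, [s / p for p in per_task]


share, tasks = drf_tasks((9, 18), [(1, 4), (3, 1)])
print(share)   # 2/3, the 66% on the slide
print(tasks)   # 3 tasks for user 1 (3 CPUs, 12 GB), 2 tasks for user 2 (6 CPUs, 2 GB)
```

Here CPU is the resource that saturates first (3 + 6 = 9 CPUs), while 4 GB of memory stays unused.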
Online DRF Scheduler
• Whenever there are available resources and tasks to run:
– Schedule a task to the user with the smallest dominant share
• O(log n) time per decision using binary heaps
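The scheduling rule above can be sketched with Python's `heapq` binary heap. This is a simplified illustration, not the scheduler evaluated in the talk: `drf_schedule` is a hypothetical name, tasks are indivisible, each user has unlimited pending tasks, every demand vector has at least one positive entry, and a user whose next task no longer fits is simply dropped.

```python
import heapq
from fractions import Fraction


def drf_schedule(capacity, demands):
    """Launch tasks one at a time for the user with the smallest dominant share."""
    used = [0] * len(capacity)
    tasks = [0] * len(demands)
    # Heap entries are (dominant share, user index); ties go to the lower index.
    heap = [(Fraction(0), u) for u in range(len(demands))]
    heapq.heapify(heap)
    while heap:
        _, u = heapq.heappop(heap)              # O(log n) per decision
        need = demands[u]
        if any(used[r] + need[r] > capacity[r] for r in range(len(capacity))):
            continue                            # next task does not fit: drop user
        for r in range(len(capacity)):
            used[r] += need[r]
        tasks[u] += 1
        new_share = max(Fraction(tasks[u] * need[r], capacity[r])
                        for r in range(len(capacity)))
        heapq.heappush(heap, (new_share, u))
    return tasks


print(drf_schedule((9, 18), [(1, 4), (3, 1)]))   # [3, 2]
```

On the <9 CPU, 18 GB> example this reaches the same 3-task / 2-task allocation as the divisible calculation, because the equalized shares happen to correspond to whole tasks.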
Talk Outline
• What properties do we want?
• How do we solve it (DRF)?
• How would an economist solve this?
• How well does this work in practice?
Why not use pricing?
• Approach
– Set prices for each good
– Let users buy what they want
• Problem
– How do we determine the right prices for different goods?
How would an economist solve it?
• Let the market determine the prices
• Competitive Equilibrium from Equal Incomes (CEEI)
– Give each user 1/n of every resource
– Let users trade in a perfectly competitive market
• Not strategy-proof!
DRF vs CEEI
• User 1: <1 CPU, 4 GB>, User 2: <3 CPU, 1 GB>
– DRF more fair, CEEI better utilization
[Figure: CPU and memory bar charts for both policies. DRF: user 1 holds 66% of memory, user 2 holds 66% of CPU. CEEI: user 1 holds 91% of memory, user 2 holds 55% of CPU.]
DRF vs CEEI
• User 1: <1 CPU, 4 GB>, User 2: <3 CPU, 1 GB>
– DRF more fair, CEEI better utilization
[Figure: CPU and memory bar charts for both policies. DRF: user 1 holds 66% of memory, user 2 holds 66% of CPU. CEEI: user 1 holds 91% of memory, user 2 holds 55% of CPU.]
• User 1: <1 CPU, 4 GB>, User 2: <3 CPU, 2 GB>
– User 2 increased her share of both CPU and memory
[Figure: same charts with user 2's changed demand. DRF: shares stay at 66% and 66%. CEEI: user 1 holds 80% of memory, user 2 holds 60% of CPU.]
Gaming Utilization-Optimal Schedulers
• Cluster with <100 CPU, 100 GB>
• 2 users, each demanding <1 CPU, 2 GB> per task
[Figure: CPU and memory bar charts for user 1 and user 2]