
DC-DRF: Adaptive Multi-Resource Sharing at Public Cloud Scale

ACM Symposium on Cloud Computing 2018. Ian A Kash, Greg O’Shea, Stavros Volos.


  1. DC-DRF: Adaptive Multi-Resource Sharing at Public Cloud Scale
 ACM Symposium on Cloud Computing 2018
 Ian A Kash, Greg O’Shea, Stavros Volos

  2. Public Cloud DC hosting enterprise customers
 O(100K) servers, mostly small tenants

  3. Small customer: one VM accessing storage
 [Figure: data path between a compute rack and a storage rack. Elements labelled, in order: TX1, VTOR1, VTORb, RXb, SSDb, TXb, VTORa, VTOR2, RX1.]

  4. Small customer: one VM accessing storage
 One VM in compute server in compute rack
 [Figure as on slide 3.]

  5. Small customer: one VM accessing storage
 One VM in compute server in compute rack
 One VHD in storage server in storage rack
 [Figure as on slide 3.]

  6. Small customer: one VM accessing storage
 [Figure as on slide 3.]

  7. Small customer: one VM accessing storage
 [Figure as on slide 3.]

  8. Small customer: one VM accessing storage
 [Figure as on slide 3.]

  9. Small customer: one VM accessing storage
 [Figure as on slide 3.]

  10. Small customer: one VM accessing storage
 [Figure as on slide 3.]

  11. Small customer: one VM accessing storage
 [Figure as on slide 3.]

  12. Small customer: one VM accessing storage
 [Figure as on slide 3.]

  13. Small customer: one VM accessing storage
 [Figure as on slide 3.]

  14. Result: a multi-resource “demand vector”
 [Figure: compute-to-storage data path. Elements labelled, in order: TX1, VTOR1, VTORb, RXb, SSDb, TXb, VTORa, VTOR2, RX1.]

  15. Encodes resource id and proportions
 [Figure as on slide 14.]

  16. Encodes resource id and proportions
 Any element could be a bottleneck to performance
 [Figure as on slide 14.]

  17. Demand vectors form a sparse demand matrix
 [Figure: empty 10 × 10 matrix with rows n0 to n9 and columns r0 to r9.]

  18. Columns are shared physical resources
 [Figure as on slide 17.]

  19. Rows are tenants’ demand vectors
 [Figure as on slide 17.]

  20. Shown as fractions of a resource

          r0   r1   r2   r3   r4   r5   r6   r7   r8   r9
      n0    -    -  1.0    -    -    -    -    -    -  .92

 Rows n1 to n9 are still empty on this slide.

  21. Large and very sparse matrix
 DC matrix 100K by 100K
 Rows mostly empty

          r0   r1   r2   r3   r4   r5   r6   r7   r8   r9
      n0    -    -  1.0    -    -    -    -    -    -  .92
      n1  .95    -  .47    -    -    -    -    -  1.0    -
      n2  .54  1.0    -  .30  .33  .23  .55    -  .56  .31
      n3    -  .41  .20  .12  .13  .09  .23  1.0  .23  .13
      n4    -  1.0    -  .30    -  .23  .55    -    -  .31
      n5    -  .41  .09  .12  1.0  .64  .23  .20  .13  .13
      n6  .32    -  .09  .12  1.0  .64  .23  .20  .13  .13
      n7    -    -    -    -    -    -  1.0    -    -  .57
      n8    -    -  .56  .64  .20  .32  .13  .09  1.0  .23
      n9  .90  .27  .45  .64  .20  .32  .13  .09  1.0  .56
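
As an illustration of why sparsity matters at this size, the matrix can be stored row by row with only the non-empty entries kept. The sketch below is a minimal Python example, not the representation used by the authors; the function names `tenants_by_resource` and `density` are hypothetical, and only the rows for n0 and n1 (taken from the slide) are included.

```python
# Sparse demand matrix: tenant -> {resource: fraction of that resource demanded}.
# Only non-empty entries are stored. Rows n0 and n1 are copied from the slide;
# the dict-of-dicts layout itself is an assumption made for illustration.
demand = {
    "n0": {"r2": 1.0, "r9": 0.92},
    "n1": {"r0": 0.95, "r2": 0.47, "r8": 1.0},
}


def tenants_by_resource(demand):
    """Invert the row-wise store into columns: resource -> {tenant: fraction}."""
    columns = {}
    for tenant, row in demand.items():
        for resource, fraction in row.items():
            columns.setdefault(resource, {})[tenant] = fraction
    return columns


def density(demand, n_resources):
    """Fraction of matrix cells that actually hold a value."""
    stored = sum(len(row) for row in demand.values())
    return stored / (len(demand) * n_resources)


if __name__ == "__main__":
    print(tenants_by_resource(demand)["r2"])  # tenants competing for r2
    print(density(demand, n_resources=10))
```

The column view answers "which tenants compete for this resource", which is the question an allocator asks when a resource becomes the bottleneck.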

  22. Provider has multi-resource allocation problem
 • Goal: maintain acceptable service level for all tenants
 • Acceptable means always “willing to pay”
 • Avoid abrupt performance collapse for any tenant
 • Assuming aggressive (noisy) neighbors and oversubscription
 • DC-DRF builds on existing multi-resource algorithms
 • DRF [Ghodsi et al., NSDI’11]
 • EDRF [Parkes et al., EC2012]
 • Challenging at DC scale: EDRF iterates and is

  23. Systems aspects

  24. Systems challenges
 • How to capture multi-resource demand vectors?
 • How to enforce multi-resource allocations?
 • DRF implies central SDN-like controller – good or bad?
 • Good: Simpler algorithm and global view
 • Bad: EDRF at Public Cloud DC scale

  25. SIGCOMM 2015 demonstration

  26. SIGCOMM 2015 demonstration
 Central controller running EDRF
 Pass 1: reservation-based SLAs
 Pass 2: work conservation of residual

  27. SIGCOMM 2015 demonstration
 4 tenants, 30 VMs each
 Spread over 10 servers
 R/W to 2X storage servers
 40Gb RDMA switch

  28. SIGCOMM 2015 demonstration
 Demand estimation and enforcement in HyperV

  29. SIGCOMM 2015 demonstration
 Aggressive red tenant
 Perf. collapses for blue, yellow, green

  30. [Video]

  31. SIGCOMM 2015 demonstration
 What did we learn from prototype?
 Potentially very powerful.
 But EDRF algorithm not scaling well.

  32. The algorithms
 • To understand DC-DRF first understand EDRF
 • To understand DRF first understand max-min

  33. Max-Min fairness: mice before elephants
 • Maximize the minimum allocation across competing tenants
 • Allocate fractions of a single shared resource based on demand
 • No tenant gets a larger fraction than its demand
 • Tenants with unsatisfiable demand obtain equal share
 [Figure: bar chart of demand and allocated share for tenants A, B, C, D. First round: residual resource = 1.0, tenants remaining = 4, current share = 1.0/4 = 0.25; A and B are allocated their demands of 0.1 and 0.2. Second round: residual resource = 0.7, tenants remaining = 2, current share = 0.7/2 = 0.35; C and D are each allocated 0.35.]
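
The two rounds on this slide can be reproduced with a short water-filling loop. The following is a minimal Python sketch of single-resource max-min fairness, not code from the talk; `max_min_share` is a hypothetical name. The demands for A and B (0.1 and 0.2) follow from the slide, while the demands used for C and D (0.5 and 0.6) are an assumption read from the garbled figure. Any demands above 0.35 give the same allocation.

```python
def max_min_share(demands, capacity=1.0):
    """Max-min fair split of one divisible resource.

    demands: tenant -> demanded fraction of the resource.
    Returns tenant -> allocated fraction.
    """
    alloc = {}
    remaining = dict(demands)
    residual = capacity
    while remaining:
        # Equal split of what is left among tenants not yet served.
        share = residual / len(remaining)
        satisfied = {t: d for t, d in remaining.items() if d <= share}
        if not satisfied:
            # Every remaining tenant wants more than the equal share,
            # so each of them gets exactly that share.
            for t in remaining:
                alloc[t] = share
            break
        # "Mice" get their full demand; the rest is redistributed.
        for t, d in satisfied.items():
            alloc[t] = d
            residual -= d
            del remaining[t]
    return alloc


if __name__ == "__main__":
    # C and D demands are assumed values for illustration.
    print(max_min_share({"A": 0.1, "B": 0.2, "C": 0.5, "D": 0.6}))
```

With these inputs the first pass serves A and B at a share of 0.25, and the second pass splits the residual 0.7 equally, giving C and D roughly 0.35 each.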

  34. How to handle multiple resources?

          r0   r1   r2   r3   r4   r5   r6   r7   r8   r9
      n0    -    -  1.0    -    -    -    -    -    -  .92
      n1  .95    -  .47    -    -    -    -    -  1.0    -
      n2  .54  1.0    -  .30  .33  .23  .55    -  .56  .31
      n3    -  .41  .20  .12  .13  .09  .23  1.0  .23  .13
      n4    -  1.0    -  .30    -  .23  .55    -    -  .31
      n5    -  .41  .09  .12  1.0  .64  .23  .20  .13  .13
      n6  .32    -  .09  .12  1.0  .64  .23  .20  .13  .13
      n7    -    -    -    -    -    -  1.0    -    -  .57
      n8    -    -  .56  .64  .20  .32  .13  .09  1.0  .23
      n9  .90  .27  .45  .64  .20  .32  .13  .09  1.0  .56

  35. Dominant Resource Fairness (DRF)
 • For each tenant identifies its Dominant Resource
 • The resource of which it demands the largest fraction
 • Apply max-min fairness across dominant shares
 • Maximize smallest dominant share in system
 • Then second smallest, and so on…
 • Think: find the smallest mouse across all columns
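
One way to picture "maximize the smallest dominant share, then the second smallest" is progressive filling: raise every active tenant's dominant share at the same rate until some resource runs out, freeze the tenants that use it, and continue with the rest. The Python sketch below illustrates that idea under simplifying assumptions (divisible resources, tenants with unbounded demand, demand vectors already normalized so the dominant entry is 1.0). It is not the DRF, EDRF, or DC-DRF implementation from the cited work, and `drf_progressive_fill` and the tenant and resource names are hypothetical.

```python
def drf_progressive_fill(demand, capacity, eps=1e-12):
    """Water-fill dominant shares over several resources.

    demand:   tenant -> {resource: normalized demand}, dominant entry = 1.0,
              every tenant has at least one entry.
    capacity: resource -> available fraction (1.0 for a whole resource).
    Returns tenant -> dominant share.
    """
    share = {t: 0.0 for t in demand}
    free = dict(capacity)
    active = set(demand)
    while active:
        # Consumption of each resource per unit increase in dominant share.
        rate = {}
        for t in active:
            for r, d in demand[t].items():
                rate[r] = rate.get(r, 0.0) + d
        # Largest uniform increase before the first resource is exhausted.
        step = min(free[r] / rate[r] for r in rate)
        for t in active:
            share[t] += step
        for r in rate:
            free[r] -= step * rate[r]
        # Freeze every tenant that touches a saturated resource.
        saturated = {r for r in rate if free[r] <= eps}
        active = {t for t in active
                  if not any(r in saturated for r in demand[t])}
    return share


if __name__ == "__main__":
    demand = {
        "A": {"r0": 1.0, "r1": 0.25},
        "B": {"r0": 0.5, "r1": 1.0},
        "C": {"r2": 1.0},
    }
    capacity = {"r0": 1.0, "r1": 1.0, "r2": 1.0}
    print(drf_progressive_fill(demand, capacity))
```

In this example r0 saturates first, which freezes A and B at about two thirds, while C shares nothing with them and continues up to a full share of r2. Each round rescans the active part of the matrix, which hints at why an iterative scheme becomes expensive when the matrix has on the order of 100K rows.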

  36. Demand vectors normalized by Dominant Resource

          r0   r1   r2   r3   r4   r5   r6   r7   r8   r9
      n0    -    -  1.0    -    -    -    -    -    -  .92
      n1  .95    -  .47    -    -    -    -    -  1.0    -
      n2  .54  1.0    -  .30  .33  .23  .55    -  .56  .31
      n3    -  .41  .20  .12  .13  .09  .23  1.0  .23  .13
      n4    -  1.0    -  .30    -  .23  .55    -    -  .31
      n5    -  .41  .09  .12  1.0  .64  .23  .20  .13  .13
      n6  .32    -  .09  .12  1.0  .64  .23  .20  .13  .13
      n7    -    -    -    -    -    -  1.0    -    -  .57
      n8    -    -  .56  .64  .20  .32  .13  .09  1.0  .23
      n9  .90  .27  .45  .64  .20  .32  .13  .09  1.0  .56
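
Each row of the matrix has exactly one entry equal to 1.0, which is the tenant's dominant resource. A small Python sketch of that normalization step follows; `normalize_by_dominant` is a hypothetical helper, and the resource names, usage figures, and capacities in the example are invented for illustration rather than taken from the talk.

```python
def normalize_by_dominant(usage, capacity):
    """Scale a tenant's demand vector so its dominant resource equals 1.0.

    usage:    resource -> amount the tenant uses (any unit).
    capacity: resource -> total capacity in the same unit.
    Returns (dominant resource, resource -> normalized demand).
    """
    # Express each demand as a fraction of that resource's capacity.
    fractions = {r: u / capacity[r] for r, u in usage.items() if u > 0}
    # The dominant resource is the one with the largest fraction.
    dominant = max(fractions, key=fractions.get)
    top = fractions[dominant]
    return dominant, {r: f / top for r, f in fractions.items()}


if __name__ == "__main__":
    # Hypothetical tenant: 2 Gb/s on a 10 Gb/s link, 30K IOPS on a 100K IOPS SSD.
    usage = {"net": 2.0, "ssd": 30_000}
    capacity = {"net": 10.0, "ssd": 100_000}
    print(normalize_by_dominant(usage, capacity))
```

Here the SSD is dominant (0.3 of its capacity against 0.2 of the link), so the normalized row holds 1.0 for the SSD and about 0.67 for the link.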
