NUMFabric: Fast and Flexible Bandwidth Allocation in Datacenters

NUMFabric: Fast and Flexible Bandwidth Allocation in Datacenters. Kanthi Nagaraj (Stanford), Dinesh Bharadia (M.I.T.), Mohammad Alizadeh (M.I.T.), Hongzi Mao (M.I.T.), Sandeep Chinchali (Stanford) and Sachin Katti (Stanford). SIGCOMM 2016


  1. NUMFabric: Fast and Flexible Bandwidth Allocation in Datacenters. Kanthi Nagaraj (Stanford), Dinesh Bharadia (M.I.T.), Mohammad Alizadeh (M.I.T.), Hongzi Mao (M.I.T.), Sandeep Chinchali (Stanford) and Sachin Katti (Stanford). SIGCOMM 2016

  2. Datacenter fabric proposals

  3. Which one does the operator pick?

  4. Is there a single fabric that provides flexible and fast bandwidth allocation control? Yes! NUMFabric provides a flexible fabric that is also fast.

  5. Flexible and Fast • Fast: flows converge to correct rates before the datacenter workload changes • Flexible: supports a wide variety of bandwidth allocation objectives

  6. NUMFabric: Flexibility. The application-level objective is translated to utility functions, and the utility function is sent to the hosts. $U_i(x_i)$ is flow $i$'s utility at rate $x_i$.
     • Minimize average flow completion time: maximize $\sum_i x_i / s_i$
     • Weighted proportional fairness: maximize $\sum_i w_i \log(x_i)$
     • Resource pooling: maximize $\sum_i w_i \log(y_i)$, where $y_i$ = aggregate rate of flow $i$ across all subpaths
     Here $x_i$ is the rate of flow $i$, $s_i$ is the size of flow $i$, and $w_i$ is the weight of flow $i$.
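The three objectives on slide 6 can be written down as plain functions. The sketch below is only an illustration: the function names are hypothetical, and it assumes the first objective reads as the sum of $x_i / s_i$ (rate over flow size).

```python
import math

# Hypothetical helper names; each returns one flow's utility at a given rate.

def fct_utility(rate, size):
    """x_i / s_i: short flows gain more utility per unit of rate."""
    return rate / size

def weighted_prop_fair_utility(rate, weight):
    """w_i * log(x_i): weighted proportional fairness."""
    return weight * math.log(rate)

def resource_pooling_utility(subpath_rates, weight):
    """w_i * log(y_i), where y_i is the aggregate rate over all subpaths."""
    return weight * math.log(sum(subpath_rates))

def network_utility(per_flow_utilities):
    """The NUM objective is the sum of the per-flow utilities."""
    return sum(per_flow_utilities)
```

Switching the operator's objective then amounts to swapping which utility function the hosts evaluate; the rest of the fabric is unchanged.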

  7. Network Utility Maximization in general: maximize $U(X) = \sum_i U_i(x_i)$ subject to $AX \le C$, $X \ge 0$. Problem? Existing NUM solutions are slow and unsuitable for data center workloads.

  8. Existing distributed NUM solutions. Sources send traffic; the network sends congestion signals; each source sets its rate based on the gradient of its utility function and the network feedback. [Figure: flow rates vs. iterations for hosts H1 to H9] • Each source iteratively adjusts rates following its own gradient towards optimal • The sum of the rates moves towards the global optimal

  9. Gradient based methods. Overshooting might cause bloated queues and packet drops. [Figure: two panels, "Larger steps to optimal" and "Smaller steps to optimal"; normalized rates vs. iterations, with link capacity marked]

  10. How can we fix this? Overshooting might cause drops and bloated queues. Can we enable larger steps to optimal but without over-shooting and under-utilization? [Figure: two panels, both "Larger steps to optimal"; normalized rates vs. iterations, with link capacity marked] Use weights instead of rates! Setting the weights of the flows and allowing a fabric to allocate rates proportional to the weights enables exactly this.

  11. NUMFabric key idea • In NUMFabric, sources give up direct control over rates • The sources specify “weights” and the Weighted Max-Min fabric allocates relative rates proportional to the weights of all flows. [Figure: "Setting weights to control rates"; rates vs. iterations, with link capacity marked]
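To make the weighted max-min idea on slide 11 concrete, here is a minimal sketch of a centralized water-filling computation. It is only an illustration of the allocation the fabric is meant to realize, not the distributed mechanism from the talk; the function name, argument layout, and the assumption that every flow crosses at least one link are all hypothetical.

```python
def weighted_max_min(capacities, paths, weights):
    """Weighted max-min rates by progressive filling.

    capacities: {link: capacity}
    paths:      {flow: [links on its path]} (assumed non-empty per flow)
    weights:    {flow: positive weight}
    """
    remaining = dict(capacities)
    active = list(paths)
    rates = {}
    while active:
        # Find the link that offers the smallest rate per unit of weight.
        bottleneck, share = None, float("inf")
        for link, cap in remaining.items():
            wsum = sum(weights[f] for f in active if link in paths[f])
            if wsum > 0 and cap / wsum < share:
                bottleneck, share = link, cap / wsum
        if bottleneck is None:
            break
        # Freeze every active flow that crosses the bottleneck link.
        frozen = [f for f in active if bottleneck in paths[f]]
        for f in frozen:
            rates[f] = weights[f] * share
            for link in paths[f]:
                remaining[link] -= rates[f]
        active = [f for f in active if f not in frozen]
    return rates


example_rates = weighted_max_min(
    {"A": 1.0, "B": 1.0},
    {"f1": ["A"], "f2": ["A", "B"], "f3": ["B"]},
    {"f1": 1.0, "f2": 1.0, "f3": 3.0},
)
```

In this example link B is the first bottleneck, so f2 and f3 split it 1:3 and f1 takes what is left on link A. Both links end up fully used, and no link is ever assigned more than its capacity.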

  12. Application level objective: translate to utility functions • Flexible: layer that sets weights of the flows based on network feedback • Fast: layer that realizes rates proportional to the weights of the flows (Weighted Max-Min rate allocation according to the weights) • The upper layer passes weights down; the lower layer returns network feedback

  13. Weight inference

  14. Distributed NUM mechanism. NUM objective: maximize $\sum_i U_i(x_i)$. KKT conditions: equations that must necessarily be true at the optimal solution. Price of a link: variable that indicates the congestion level at the switch. Below, $S(l)$ is the set of flows crossing link $l$, $L(i)$ is the set of links on flow $i$'s path, and $c_l$ is the capacity of link $l$.
     • At optimal, either the link is fully utilized or the price of the link is zero: $p_l \left( \sum_{i \in S(l)} x_i - c_l \right) = 0$
     • At optimal, the marginal utility of the source is equal to the sum of the prices along the path of the flow: $U_i'(x_i) = \sum_{l \in L(i)} p_l$

  15. Distributed NUM mechanism. [Figure: switches above hosts H1 to H9; switches send network congestion signals $\sum_{l \in L(i)} p_l$ to the sources]
     • Switches set their prices by measuring congestion: $p_l = p_l + \alpha \left( \sum_{i \in S(l)} x_i - c_l \right)$, to solve $p_l \left( \sum_{i \in S(l)} x_i - c_l \right) = 0$
     • Sources set the rates of the flows using price feedback: $x_i = U_i'^{-1}\left( \sum_{l \in L(i)} p_l \right)$, to solve $U_i'(x_i) = \sum_{l \in L(i)} p_l$
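The price and rate updates on slide 15 can be tried out on a toy network. The sketch below assumes log utilities $U_i(x_i) = a_i \log(x_i)$, so that the inverse marginal utility is simply $a_i$ divided by the path price. The function name, the step size, the starting price of 1.0, and the small positive floor on prices (added to avoid dividing by zero) are assumptions of this example, not values from the talk.

```python
def dual_gradient_descent(capacities, paths, util_weights, step=0.5, iterations=2000):
    """Toy dual gradient descent for U_i(x) = a_i * log(x).

    capacities:   {link: capacity}
    paths:        {flow: [links on its path]}
    util_weights: {flow: a_i}
    """
    prices = {link: 1.0 for link in capacities}
    rates = {}
    for _ in range(iterations):
        # Sources: x_i = inverse of U_i' at the path price, here a_i / price.
        rates = {
            f: util_weights[f] / sum(prices[l] for l in paths[f])
            for f in paths
        }
        # Switches: raise the price if the link is overloaded, lower it otherwise.
        for link, cap in capacities.items():
            load = sum(rates[f] for f in paths if link in paths[f])
            prices[link] = max(1e-9, prices[link] + step * (load - cap))
    return rates, prices


dgd_rates, dgd_prices = dual_gradient_descent(
    {"A": 1.0, "B": 1.0},
    {"f1": ["A"], "f2": ["A", "B"], "f3": ["B"]},
    {"f1": 1.0, "f2": 1.0, "f3": 3.0},
)
```

In this example the very first iterate asks for 3.5 units on link B, whose capacity is 1, which is the overshoot that slide 9 warns about. The rates only become feasible as the prices settle.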

  16. NUMFabric iterations. Controlling rates directly causes the brittleness in the existing solutions. [Figure: switches above hosts H1 to H9]
     • Switches adapt prices at every iteration so that the flow rates move closer to optimal
     • ✔ $p_l \left( \sum_{i \in S(l)} x_i - c_l \right) = 0$: the WMM layer always achieves 100% link utilization
     • Hosts set weights $w_i = U_i'^{-1}\left( \sum_{l \in L(i)} p_l \right)$; the WMM layer converts these weights to rates
     • ✖ $U_i'(x_i) = \sum_{l \in L(i)} p_l$

  17. NUMFabric iterations. As we know, controlling rates directly causes the brittleness in the existing solutions. [Figure: switches above hosts H1 to H9; hosts send residues to the switches]
     • ✔ $p_l \left( \sum_{i \in S(l)} x_i - c_l \right) = 0$
     • ✖ $U_i'(x_i) = \sum_{l \in L(i)} p_l$, so each flow reports $\text{Residue}_i = U_i'(x_i) - \sum_{l \in L(i)} p_l$
     • Switches adapt prices every iteration so that the flow rates move closer to optimal: $p_l = p_l + \min_{i \in S(l)} \dfrac{\text{residue}_i}{\text{hops traversed by flow } i}$
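The loop described on slides 16 and 17 (hosts turn path prices into weights, the fabric turns weights into rates, hosts report residues, switches move prices by the smallest per-hop residue) can be sketched on a toy network. This is a simplified illustration under stated assumptions, not the authors' implementation: it uses log utilities $U_i(x_i) = a_i \log(x_i)$, replaces the fabric with a centralized water-filling routine, starts all prices at 1.0, and floors prices at a small positive value. All function and parameter names are hypothetical.

```python
def weighted_max_min(capacities, paths, weights):
    """Centralized stand-in for the WMM fabric (progressive filling)."""
    remaining = dict(capacities)
    active = list(paths)
    rates = {}
    while active:
        bottleneck, share = None, float("inf")
        for link, cap in remaining.items():
            wsum = sum(weights[f] for f in active if link in paths[f])
            if wsum > 0 and cap / wsum < share:
                bottleneck, share = link, cap / wsum
        if bottleneck is None:
            break
        frozen = [f for f in active if bottleneck in paths[f]]
        for f in frozen:
            rates[f] = weights[f] * share
            for link in paths[f]:
                remaining[link] -= rates[f]
        active = [f for f in active if f not in frozen]
    return rates


def numfabric_sketch(capacities, paths, util_weights, iterations=100):
    """Weight/price iteration for U_i(x) = a_i * log(x)."""
    prices = {link: 1.0 for link in capacities}
    rates = {}
    for _ in range(iterations):
        path_price = {f: sum(prices[l] for l in paths[f]) for f in paths}
        # Hosts: weight = inverse of U_i' at the path price, here a_i / price.
        weights = {f: util_weights[f] / path_price[f] for f in paths}
        # Fabric: rates proportional to weights, always within capacity.
        rates = weighted_max_min(capacities, paths, weights)
        # Hosts: residue_i = U_i'(x_i) - path price, with U_i'(x) = a_i / x.
        residue = {f: util_weights[f] / rates[f] - path_price[f] for f in paths}
        # Switches: move the price by the smallest per-hop residue on the link.
        for link in prices:
            per_hop = [residue[f] / len(paths[f]) for f in paths if link in paths[f]]
            if per_hop:
                prices[link] = max(1e-9, prices[link] + min(per_hop))
    return rates, prices


nf_rates, nf_prices = numfabric_sketch(
    {"A": 1.0, "B": 1.0},
    {"f1": ["A"], "f2": ["A", "B"], "f3": ["B"]},
    {"f1": 1.0, "f2": 1.0, "f3": 3.0},
)
```

Because the rates always come out of the weighted max-min step, every intermediate allocation in this sketch respects link capacities; only the split between flows changes as the prices move.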

  18. Operation summary • Weight adaptation at hosts: hosts receive path prices and send flow weights and residues • Price adaptation at switches: switches receive residues and send prices • Weighted Max-Min: feasible and stable rates for all flows based on weights

  19. Evaluation

  20. Evaluation setup: 8 racks, 40 Gbps fabric links, 10 Gbps edge links • ns3 simulations: 128-port leaf-spine fabric • RTT ≈ 16 µs • Evaluate speed of convergence • Evaluate flexibility • Compare the bandwidth allocations on NUMFabric with different utility functions against point solutions for different objectives (pFabric, MPTCP, etc.)

  21. Fast convergence • 100 flows start/stop at every “event” • We let the system converge before triggering another event • Median convergence time of NUMFabric (335 µs) is 2.3X better than that of the other algorithms • DGD: Dual Gradient Descent algorithm • RCP*: Alpha-Fair RCP

  22. Flexibility: minimize flow completion times. Maximize $\sum_i x_i / s_i$, where $x_i$ is the rate of the flow and $s_i$ is the size of the flow.

  23. Flexibility: resource pooling. Maximize $\sum_i \log(y_i)$, where $y_i$ = aggregate rate of flow $i$ across all sub-paths.

  24. Conclusions • NUMFabric enables operators to flexibly optimize the network’s bandwidth allocation for different objectives • NUMFabric uses weights as knobs to influence rates and thus decouples the objectives of finding optimal rates and stable rates. This makes it 2-3X faster than existing mechanisms. • Using NUMFabric with objective functions on co-flows, VM-level and tenant-level aggregates is the focus of our current and future work.

  25. Thank you
