
Aemon: Information-agnostic Mix-flow Scheduling in Data Center Networks

Tao Wang¹, Hong Xu², Fangming Liu¹ (¹ Huazhong University of Science and Technology, ² NetX Lab @ City University of Hong Kong). August 2017 @ APNet, Hong Kong.


  1. Aemon: Information-agnostic Mix-flow Scheduling in Data Center Networks
     Tao Wang¹, Hong Xu², Fangming Liu¹
     ¹ Huazhong University of Science and Technology
     ² NetX Lab @ City University of Hong Kong
     August 2017 @ APNet, Hong Kong

  2. Why information-agnostic mix-flow scheduling?

  5. Mix-flow in DCN
     Hundreds of thousands of servers; workloads: Web Services, ML, Analytics, HPC
     ‣ Non-deadline flows: minimize flow completion time (FCT)
     ‣ Deadline flows: minimize deadline miss ratio

  7. Flow size is hard to obtain
     ‣ Multi-stage job processing techniques (e.g., pipelining)
     ‣ Real-time characteristics (e.g., streaming applications)
     Hard to know flow sizes beforehand!

  11. Existing solutions fall short
     ‣ Deadline-unaware transport
       ‣ TCP, DCTCP, etc.
       ‣ Fail to meet deadlines for deadline flows [1-2]
     ‣ Deadline-aware transport
       ‣ D³, D²TCP, PDQ, pFabric, Karuna, etc.
       ‣ Either impossible to deploy in DCN (PDQ, pFabric)
       ‣ Or assume flow size is known (D³, D²TCP, Karuna)
     [1] pFabric: Minimal Near-Optimal Datacenter Transport, SIGCOMM’13
     [2] Scheduling Mix-flows in Commodity Datacenters with Karuna, SIGCOMM’16

  14. Aemon
     Maester Aemon was the blind maester at Castle Black in Game of Thrones.

  20. Aemon’s Design
     [Figure: Aemon architecture]
     ‣ End-host: UCP, Urgency-based Congestion Control, applied to flows w. deadline and w/o deadline
     ‣ End-host: Priority Tagging
     ‣ Switch: Priority Scheduling over queues Prio 1, Prio 2, …, Prio 2K-1, Prio 2K
     ‣ 2LPS: Two-level PS

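  25. UCP Overview
     ‣ DCTCP expression of network congestion:
       $\alpha \leftarrow (1 - g) \cdot \alpha + g \cdot F$
     ‣ Deadline flow’s urgency (non-deadline flow’s urgency is 1), where $T_d$ is the deadline and $T_e$ is the elapsed time:
       $s = \frac{T_e}{T_d - T_e}$
     ‣ Congestion window modulation:
       $\mathit{cwnd} = \begin{cases} \mathit{cwnd} \cdot (1 - \alpha^{s}/2), & \alpha^{s} > 0 \\ \mathit{cwnd} + 1, & \alpha^{s} = 0 \end{cases}$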
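
The three UCP rules on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, `g = 1/16` is a commonly used DCTCP gain rather than a value from the slides, the exponent form `alpha ** s` is one reading of the extracted formula, and the handling of `alpha == 0` and of an already-expired deadline is a choice made here.

```python
from typing import Optional


def update_alpha(alpha: float, marked_fraction: float, g: float = 1 / 16) -> float:
    """DCTCP-style moving average of the fraction F of ECN-marked packets."""
    return (1 - g) * alpha + g * marked_fraction


def urgency(deadline: Optional[float], elapsed: float) -> float:
    """s = T_e / (T_d - T_e); a non-deadline flow (deadline=None) has s = 1."""
    if deadline is None:
        return 1.0
    remaining = deadline - elapsed
    if remaining <= 0:
        # Assumption: an expired deadline is treated as maximal urgency.
        return float("inf")
    return elapsed / remaining


def update_cwnd(cwnd: float, alpha: float, s: float) -> float:
    """Shrink cwnd by alpha**s / 2 under congestion, otherwise grow by 1."""
    if alpha == 0:
        # Assumption: no observed congestion always means additive increase,
        # which also avoids the 0**0 corner case for a brand-new flow.
        return cwnd + 1
    penalty = alpha ** s
    if penalty > 0:
        return cwnd * (1 - penalty / 2)
    return cwnd + 1
```

With `s = 1` the update reduces to the plain DCTCP window cut, which is how non-deadline flows behave in this sketch.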

  27. UCP Rationale
     ‣ Penalize low-urgency deadline flows
       • leave more bandwidth for non-deadline flows
     ‣ Protect high-urgency deadline flows
       • meet deadlines
     [Figure: window penalty vs. urgency (i.e. s) for flows w/o deadline, flows w/ deadline, and their difference]
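
A small numeric check may make the rationale concrete. The sketch below assumes the window penalty is `alpha ** s / 2` (the fraction of the window removed on congestion) and uses a hypothetical congestion level `alpha = 0.5`; neither the function name nor the sample values come from the slides.

```python
def window_penalty(alpha: float, s: float) -> float:
    """Fraction of cwnd removed on congestion, assuming penalty = alpha**s / 2."""
    return alpha ** s / 2


alpha = 0.5  # hypothetical congestion estimate, 0 < alpha < 1
baseline = window_penalty(alpha, 1.0)  # non-deadline flows always use s = 1

for s in (0.5, 1.0, 2.0):
    diff = window_penalty(alpha, s) - baseline
    print(f"s={s}: penalty={window_penalty(alpha, s):.3f}, diff vs. non-deadline={diff:+.3f}")
```

For any `alpha` strictly between 0 and 1, a deadline flow with `s < 1` is cut harder than a non-deadline flow, and one with `s > 1` is cut more gently, so the difference changes sign at `s = 1`.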

  32. 2LPS Overview
     ‣ Within the same type (Level-1)
       ‣ A non-deadline flow demotes its priority as it sends more bytes
       ‣ A deadline flow promotes its priority as its urgency increases
     ‣ Within the same prio (Level-2)
       ‣ Non-deadline flows are strictly prioritized
     [Figure: logical view (high priority to low priority) and physical view (Prio 1 … Prio 2K) for deadline and non-deadline flows]
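
One way to read the logical-to-physical mapping is sketched below. It assumes that each of K logical levels is split into two adjacent physical queues, with the non-deadline queue placed first, and that the switch serves queues in strict priority order. The value of `K`, the function names, and the exact queue numbering are illustrative assumptions, not details confirmed by the slides.

```python
from collections import deque
from typing import Deque, Dict, Optional

K = 4  # hypothetical number of logical levels, giving 2K physical queues


def physical_queue(level: int, has_deadline: bool, k: int = K) -> int:
    """Map logical level 1..k to physical priority 1..2k (1 is highest).

    Assumption: within a level, the non-deadline flow gets queue 2*level - 1
    and the deadline flow gets queue 2*level.
    """
    if not 1 <= level <= k:
        raise ValueError("level out of range")
    return 2 * level - 1 + (1 if has_deadline else 0)


def dequeue(queues: Dict[int, Deque[str]]) -> Optional[str]:
    """Strict priority: serve the lowest-numbered non-empty queue."""
    for prio in sorted(queues):
        if queues[prio]:
            return queues[prio].popleft()
    return None


queues = {
    physical_queue(1, True): deque(["deadline-pkt"]),
    physical_queue(1, False): deque(["non-deadline-pkt"]),
}
first = dequeue(queues)  # the non-deadline packet at the same logical level goes first
```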

  35. 2LPS: Level-1 rationale
     ‣ Within the same type (Level-1)
       ‣ For non-deadline flows
         ‣ PIAS [1]-like priority demotion to approximate SJF
           • prioritize short flows
       ‣ For deadline flows
         ‣ Priority promotion scheme based on urgency
           • prioritize flows whose deadlines are approaching
     ‣ Why not Earliest-Deadline-First (EDF) as the tagging option?
       ‣ EDF is optimal when scheduling deadline flows
       ‣ but it is over-aggressive in a mix-flow context
       ‣ and the number of priority queues is limited, etc.
     [1] Information-Agnostic Flow Scheduling for Commodity Data Centers, NSDI’15
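
The Level-1 tagging rules can be sketched as two threshold lookups. All threshold values below are hypothetical (the slides give none), as are the function names; the sketch only shows the direction of movement: non-deadline flows start at the top level and move down as bytes accumulate, while deadline flows start at the bottom level and move up as urgency grows.

```python
import bisect

# Hypothetical thresholds for K = 4 logical levels (K - 1 thresholds each).
DEMOTION_BYTES = [100_000, 1_000_000, 10_000_000]
PROMOTION_URGENCY = [0.5, 1.0, 2.0]


def non_deadline_level(bytes_sent: int) -> int:
    """PIAS-like demotion: level 1 at start, one level lower per threshold crossed."""
    return 1 + bisect.bisect_right(DEMOTION_BYTES, bytes_sent)


def deadline_level(s: float) -> int:
    """Urgency-based promotion: lowest level K at start, one level higher per threshold crossed."""
    k = len(PROMOTION_URGENCY) + 1
    return k - bisect.bisect_right(PROMOTION_URGENCY, s)
```

Neither function needs the flow size: the non-deadline rule uses only bytes already sent, and the deadline rule uses only the deadline and elapsed time through the urgency `s`.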

  39. 2LPS: Level-2 rationale
     ‣ Within the same prio (Level-2)
       • Protect (short) non-deadline flows from over-aggressive (long) deadline flows
     [Figure: deadline and non-deadline flows in queues Prio 1 (high priority) and Prio 2 (low priority), with priority promotion]

  43. 2LPS: Level-2 rationale
     ‣ Within the same prio (Level-2)
       • Protect (short) non-deadline flows from over-aggressive (long) deadline flows
     [Figure: deadline and non-deadline flows in queues Prio 1 (high priority) and Prio 2 (low priority); the (short) non-deadline flow is delayed!]

  44. How does Aemon perform?

  45. Packet-level NS2 simulation
     [Figure: spine-leaf topology with 9 racks, 10Gbps and 40Gbps links]
     ‣ Spine-leaf fabric with 144 hosts
     ‣ RTT: ~85.2 μs (80 μs at hosts)
     ‣ Buffer size: 360KB per port
     ‣ ECN thresholds: 65/250 packets for 10/40Gbps links
     ‣ Workloads
       ‣ Web Search (DCTCP paper), Data Mining (VL2 paper)
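
As a rough sanity check on these parameters, the arithmetic below relates the 10Gbps ECN threshold to the bandwidth-delay product. It assumes hosts are spread evenly across racks, a 1500-byte packet, and 360KB read as 360,000 bytes; none of these three assumptions is stated on the slide.

```python
HOSTS, RACKS = 144, 9
RTT_S = 85.2e-6            # ~85.2 microseconds
LINK_BPS = 10e9            # 10Gbps host link
ECN_THRESHOLD_PKTS = 65    # threshold for 10Gbps links
PACKET_BYTES = 1500        # assumption: the slide gives thresholds in packets only
BUFFER_BYTES = 360_000     # assumption: 360KB read as 360 * 1000 bytes

hosts_per_rack = HOSTS // RACKS           # assumes an even spread of hosts
bdp_bytes = LINK_BPS * RTT_S / 8          # bandwidth-delay product of one host link
ecn_bytes = ECN_THRESHOLD_PKTS * PACKET_BYTES

print(f"hosts per rack: {hosts_per_rack}")
print(f"BDP at 10Gbps: {bdp_bytes / 1000:.1f} KB")
print(f"ECN threshold: {ecn_bytes / 1000:.1f} KB of a {BUFFER_BYTES / 1000:.0f} KB port buffer")
```

Under these assumptions the marking threshold is a little below one bandwidth-delay product and well under the per-port buffer.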

  47. Overall Average FCT (Web Search workload)
     [Figure: average FCT (ms) vs. load (0.75 to 0.9) for Aemon, Karuna, PIAS+DCTCP, PIAS+UCP, and 2LPS+DCTCP]
     ‣ Compared with PIAS
       ‣ Aemon reduces average FCT by ~45.1%
       ‣ UCP lowers non-deadline flows’ FCT
       ‣ 2LPS also lowers non-deadline flows’ FCT
