Aemon: Information-agnostic Mix-flow Scheduling in Data Center Networks
Tao Wang¹, Hong Xu², Fangming Liu¹
¹ Huazhong University of Science and Technology
² NetX Lab @ City University of Hong Kong
August 2017 @ APNet, Hong Kong
Why information-agnostic mix-flow scheduling?
Mix-flow in DCN
[Figure: hundreds of thousands of servers running Web Services, ML, Analytics, and HPC workloads.]
‣ Non-deadline flows
  ‣ minimize FCT
‣ Deadline flows
  ‣ minimize deadline miss ratio
Flow size is hard to obtain
‣ Multi-stage job processing technique (e.g. pipelining, etc.)
‣ Real-time characteristics (e.g. streaming application, etc.)
Hard to know flow sizes beforehand!
Existing solutions fall short
‣ Deadline-unaware transport
  ‣ TCP, DCTCP, etc.
  ‣ Fail to meet deadlines for deadline flows [1-2]
‣ Deadline-aware transport
  ‣ D³, D²TCP, PDQ, pFabric, Karuna, etc.
  ‣ Either impossible to deploy in DCN (PDQ, pFabric)
  ‣ Or assume flow size is known (D³, D²TCP, Karuna)
[1] pFabric: Minimal Near-Optimal Datacenter Transport, SIGCOMM’13
[2] Scheduling Mix-flows in Commodity Datacenters with Karuna, SIGCOMM’16
Aemon
Maester Aemon was the blind maester at Castle Black in Game of Thrones
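Aemon’s Design
[Figure: Aemon architecture. At the end-host, flows w. deadline and w/o deadline pass through UCP (Urgency-based Congestion Control) and Priority Tagging. At the switch, Priority Scheduling serves queues Prio 1, Prio 2, …, Prio 2K-1, Prio 2K. Priority Tagging and Priority Scheduling together are labeled 2LPS: Two-level PS.]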
UCP Overview
‣ DCTCP expression of network congestion
$$\alpha \leftarrow (1 - g) \cdot \alpha + g \cdot F$$
‣ Deadline flow’s urgency (non-deadline flow’s urgency is 1), where $T_d$ is the deadline and $T_e$ is the elapsed time
$$s = \frac{T_e}{T_d - T_e}$$
‣ Congestion window modulation
$$cwnd = \begin{cases} cwnd \cdot (1 - \alpha^{s} / 2), & \alpha^{s} > 0, \\ cwnd + 1, & \alpha^{s} = 0. \end{cases}$$
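The three UCP steps on this slide can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the default gain `g = 1/16`, and the handling of an expired deadline and of `alpha == 0` are hypothetical choices the slide does not specify.

```python
import math
from typing import Optional


def update_alpha(alpha: float, marked_fraction: float, g: float = 1 / 16) -> float:
    """DCTCP-style congestion estimate: alpha <- (1 - g) * alpha + g * F."""
    return (1 - g) * alpha + g * marked_fraction


def urgency(deadline: Optional[float], elapsed: float) -> float:
    """Urgency s = T_e / (T_d - T_e); non-deadline flows use s = 1."""
    if deadline is None:
        return 1.0
    remaining = deadline - elapsed
    if remaining <= 0:
        # Assumption: an expired deadline is treated as maximally urgent.
        return math.inf
    return elapsed / remaining


def next_cwnd(cwnd: float, alpha: float, s: float) -> float:
    """Window modulation: shrink by alpha^s / 2 when congested, else grow by 1."""
    # Assumption: alpha == 0 means no congestion, so the penalty is 0 even
    # when s == 0 (Python would otherwise evaluate 0.0 ** 0 as 1.0).
    penalty = 0.0 if alpha == 0 else alpha ** s
    if penalty > 0:
        return cwnd * (1 - penalty / 2)
    return cwnd + 1


if __name__ == "__main__":
    alpha = update_alpha(alpha=0.5, marked_fraction=0.5)
    for label, s in [("non-deadline", urgency(None, 2.0)),
                     ("deadline, early", urgency(10.0, 2.0)),
                     ("deadline, late", urgency(10.0, 8.0))]:
        print(label, "s =", s, "cwnd 10 ->", next_cwnd(10.0, alpha, s))
```

With the same congestion estimate, the early deadline flow (small `s`) backs off more than the non-deadline flow, and the late deadline flow (large `s`) backs off less.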
UCP Rationale
‣ Penalize low-urgency deadline flow
  • leave more bandwidth for non-deadline flows
‣ Protect high-urgency deadline flow
  • meet deadlines
[Figure: window penalty versus urgency (i.e. s) for flows w/o deadline and w/ deadline, plus their difference (diff).]
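To make the rationale concrete, the sketch below tabulates the window penalty for a deadline flow against that of a non-deadline flow (urgency fixed at 1). It assumes the penalty is the term α^s from the window modulation rule; the value `ALPHA = 0.5` and the function name are illustrative assumptions, not values from the talk.

```python
def window_penalty(alpha: float, s: float) -> float:
    """Penalty term alpha^s; a larger penalty means a larger window cut."""
    return alpha ** s


ALPHA = 0.5  # assumed congestion estimate, for illustration only

if __name__ == "__main__":
    non_deadline = window_penalty(ALPHA, 1.0)  # non-deadline urgency is 1
    for s in (0.0, 0.5, 1.0, 1.5, 2.0):
        with_deadline = window_penalty(ALPHA, s)
        diff = with_deadline - non_deadline
        print(f"s={s:.1f}  w/ ddl={with_deadline:.3f}  "
              f"w/o ddl={non_deadline:.3f}  diff={diff:+.3f}")
```

For `s < 1` the deadline flow is penalized more than a non-deadline flow (positive diff); for `s > 1` it is penalized less (negative diff).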
2LPS Overview
‣ Within the same type (Level-1)
  ‣ Non-deadline flow demotes its prio as more bytes sent
  ‣ Deadline flow promotes its prio as urgency increases
‣ Within the same prio (Level-2)
  ‣ Non-deadline flows are strictly prioritized
[Figure: logical view versus physical view of the priority queues, from high priority (Prio 1) to low priority (Prio 2K), with deadline and non-deadline flows marked.]
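One way to read the two levels is as an end-host tagging function, sketched below. The slide does not give thresholds or the exact logical-to-physical mapping, so everything concrete here is an assumption: `K = 4`, both threshold lists, and the mapping of logical level k to physical queues 2k-1 (non-deadline) and 2k (deadline).

```python
from bisect import bisect_right

K = 4  # assumed number of logical priority levels
# Hypothetical thresholds (K - 1 each); the talk does not state real values.
BYTE_THRESHOLDS = [100_000, 1_000_000, 10_000_000]
URGENCY_THRESHOLDS = [0.5, 1.0, 2.0]


def logical_priority(has_deadline: bool, bytes_sent: int = 0,
                     urgency: float = 0.0) -> int:
    """Level-1: return a logical priority in 1..K (1 is highest)."""
    if has_deadline:
        # Promotion: each urgency threshold crossed raises the priority.
        return K - bisect_right(URGENCY_THRESHOLDS, urgency)
    # Demotion: each byte threshold crossed lowers the priority.
    return 1 + bisect_right(BYTE_THRESHOLDS, bytes_sent)


def physical_queue(has_deadline: bool, logical: int) -> int:
    """Level-2: within a logical level, non-deadline flows get the higher queue."""
    return 2 * logical if has_deadline else 2 * logical - 1


if __name__ == "__main__":
    new_short = logical_priority(False, bytes_sent=10_000)
    urgent = logical_priority(True, urgency=2.5)
    print(physical_queue(False, new_short), physical_queue(True, urgent))
```

In this sketch a new non-deadline flow and a highly urgent deadline flow share logical level 1, but the non-deadline flow lands in physical queue 1 and the deadline flow in queue 2.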
2LPS: Level-1 rationale
‣ Within the same type (Level-1)
  ‣ For non-deadline flows
    ‣ PIAS [1]-like priority demotion to approximate SJF
      • prioritize short flows
  ‣ For deadline flows
    ‣ Priority promotion scheme based on urgency
      • prioritize flows with deadline approaching
‣ Why not Earliest-Deadline-First as tagging option?
  ‣ EDF is optimal when scheduling deadline flows
  ‣ but over-aggressive in mix-flow context
  ‣ and limited priority queues, etc.
[1] Information-Agnostic Flow Scheduling for Commodity Data Centers, NSDI’15
2LPS: Level-2 rationale
‣ Within the same prio (Level-2)
  • Protect (short) non-deadline flows from over-aggressive (long) deadline flows
[Figure: a deadline flow and a non-deadline flow in queues Prio 1 (high priority) and Prio 2 (low priority), annotated with "Priority Promotion".]
2LPS: Level-2 rationale
‣ Within the same prio (Level-2)
  • Protect non-deadline (short) flows from over-aggressive (long) deadline flows
[Figure: a deadline flow and a non-deadline flow in queues Prio 1 (high priority) and Prio 2 (low priority). Callout: (short) non-deadline flow is delayed!]
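The Level-2 protection can be illustrated with a toy strict-priority scheduler. This is a sketch only: the class name, the packet labels, and the assumption that each logical level owns two adjacent queues with the non-deadline queue served first are hypothetical, chosen to match the "non-deadline flows are strictly prioritized" rule.

```python
from collections import deque
from typing import Deque, List, Optional


class TwoLevelScheduler:
    """Strict-priority scheduler over 2K queues (index 0 is served first)."""

    def __init__(self, logical_levels: int) -> None:
        self.queues: List[Deque[str]] = [deque() for _ in range(2 * logical_levels)]

    def enqueue(self, packet: str, logical_prio: int, has_deadline: bool) -> None:
        # Assumed mapping: logical level k owns two adjacent queues, and the
        # non-deadline queue comes first so it wins within the same level.
        index = 2 * (logical_prio - 1) + (1 if has_deadline else 0)
        self.queues[index].append(packet)

    def dequeue(self) -> Optional[str]:
        # Serve the first non-empty queue; return None when all are empty.
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None


sched = TwoLevelScheduler(logical_levels=2)
sched.enqueue("D1", 1, True)    # long deadline flow, promoted to level 1
sched.enqueue("D2", 1, True)
sched.enqueue("N1", 1, False)   # short non-deadline flow arrives later
sched.enqueue("N2", 2, False)   # demoted non-deadline flow
order = [sched.dequeue() for _ in range(4)]
print(order)
```

Although `N1` arrives after `D1` and `D2`, it is dequeued first because it shares their logical level; the demoted `N2` still waits behind the deadline packets.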
How does Aemon perform?
Packet-level NS2 simulation
[Figure: spine-leaf topology with 9 racks, 10Gbps links, and 40Gbps links.]
‣ Spine-leaf Fabric with 144 hosts
‣ RTT: ~85.2 μs (80 μs at hosts)
‣ Buffer size: 360KB each port
‣ ECN thresholds: 65/250 #pkts for 10/40Gbps link
‣ Workloads
  ‣ Web Search (DCTCP paper), Data Mining (VL2 paper)
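A quick back-of-the-envelope check relates these settings to each other. The sketch assumes hosts are spread evenly across racks and that packets are 1500 bytes; neither is stated on the slide, so the derived numbers are illustrative only.

```python
HOSTS = 144
RACKS = 9
RTT_SECONDS = 85.2e-6
LINK_BPS_10G = 10e9
ECN_THRESHOLD_PKTS_10G = 65
ASSUMED_PACKET_BYTES = 1500  # assumption: not given on the slide

# Assumption: hosts are evenly distributed across racks.
hosts_per_rack = HOSTS // RACKS

# Bandwidth-delay product of a 10Gbps link at the stated RTT, in bytes.
bdp_bytes = LINK_BPS_10G * RTT_SECONDS / 8

# ECN marking threshold for a 10Gbps link, converted from packets to bytes.
ecn_threshold_bytes = ECN_THRESHOLD_PKTS_10G * ASSUMED_PACKET_BYTES

if __name__ == "__main__":
    print(hosts_per_rack, round(bdp_bytes), ecn_threshold_bytes)
```

Under these assumptions the 10Gbps ECN threshold sits slightly below one bandwidth-delay product.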
Overall Average FCT
Web Search workload
[Figure: average FCT (ms) versus load (0.75 to 0.9) for Aemon, Karuna, PIAS+DCTCP, PIAS+UCP, and 2LPS+DCTCP.]
‣ Compared with PIAS
  ‣ Aemon reduces average FCT by ~45.1%
  ‣ UCP lowers non-deadline flows’ FCT
  ‣ 2LPS also lowers non-deadline flows’ FCT