Congestion Management for Ethernet-based Lossless DataCenter Networks


  1. Congestion Management for Ethernet-based Lossless DataCenter Networks
     Pedro Javier Garcia (1), Jesus Escudero-Sahuquillo (1), Francisco J. Quiles (1) and Jose Duato (2)
     (1) University of Castilla-La Mancha (UCLM); (2) Technical University València (UPV)
     NENDICA DCN: 1-19-0012-00-ICne

  2. Abstract: This paper describes congestion phenomena in lossless data center networks and their negative consequences. It explores proposed solutions, analyzing their pros and cons to determine which are suited to the requirements of modern data centers. The conclusions identify important issues that should be addressed in the future.

  3. Agenda: Introduction; Congestion Dynamics in DCNs; Reducing In-Network and Incast Congestion; Combining Congestion Management Mechanisms; Conclusions

  4. Agenda: Introduction; Congestion Dynamics in DCNs; Reducing In-Network and Incast Congestion; Combining Congestion Management Mechanisms; Conclusions

  5. Introduction: On-Line Data Intensive (OLDI) Services [Congdon18]
     • Require immediate answers to requests that are coming in at a high rate.
     • End-user experience is highly dependent upon the system responsiveness.
     • The network becomes a significant component of overall DC latency when congestion occurs in the network.
     [Figure: OLDI request tree. A request reaches a top-level aggregator (deadline = 250 ms), which fans out to aggregators (deadline = 50 ms), which in turn fan out to workers (deadline = 10 ms).]

  6. Introduction: Data-Center Networks (DCNs)
     • Today's DCNs require a flexible fabric that carries, in a converged way, traffic from different types of applications, storage, or control.
     • Latency is a concern: fabric design for DCNs must minimize or eliminate packet loss, provide high throughput, and maintain low latency.
     • These goals are crucial for OLDI applications, Deep Learning, NVMe over Fabrics, and Cloudified Central Offices.
     • However, congestion threatens these applications.

  7. Introduction: Why is congestion isolation needed?
     • HoL blocking dramatically degrades network performance (e.g., PFC does not have enough granularity and there is no congested-flow identification) [Garcia05].
     • Classical e2e congestion control for lossless networks is difficult to tune, reacts slowly, and may introduce oscillations and instability [Escudero11].
     [Figure: normalized network throughput vs. time (nanoseconds, 0 to 5e+06) for 1Q, ITh, and VOQnet in a 64-node CLOS network with 4 hot spots. HS = traffic injected to the hot-spot destination; the plot marks where the HS starts and ends.]

  8. Introduction: Why is congestion isolation needed?
     [Figure: example network with sources A to E, switches 1 to 9, and destinations X, Y, and Z. Links are labeled with their utilization (33%, 66%, 100%) for congested flows (Dst. X) and non-congested flows (Dst. Y and Dst. Z). The figure marks high-order and low-order HoL blocking, with each flow's 33% share shown as either sending or stopped.]

  9. Introduction: Why is congestion isolation needed?
     • We need a congestion isolation (CI) mechanism that reacts quickly when transient congestion appears, preventing the network performance degradation caused by HoL blocking.
     • We want a CI mechanism that complements other technologies available in DCNs, so that CI improves their performance while they reduce the complexity of CI.
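
The HoL-blocking effect described on the slides above can be illustrated with a toy model. The sketch below is not taken from the presentation: the traffic pattern (packets alternating between a hot destination X and a free destination Y), the drain rate of X (one packet every three cycles), and the function name `delivered_to_y` are all assumptions chosen for illustration. It compares one shared FIFO queue against one queue per destination, which is a crude stand-in for isolating the congested flow.

```python
from collections import deque


def delivered_to_y(isolate, cycles=300):
    """Count packets delivered to the uncongested destination Y.

    Assumed toy model: one packet arrives per cycle, alternating X, Y, X, Y.
    Destination X is a hot spot that accepts a packet only every third cycle.
    Destination Y always accepts a packet.
    """
    queues = {"X": deque(), "Y": deque()} if isolate else {"shared": deque()}
    delivered = 0
    for t in range(cycles):
        dst = "X" if t % 2 == 0 else "Y"
        queues[dst if isolate else "shared"].append(dst)
        x_ready = t % 3 == 0          # assumption: X drains at 1/3 of link rate
        busy = set()                  # each output takes at most one packet per cycle
        for q in queues.values():
            if not q:
                continue
            head = q[0]
            # A blocked head packet also blocks everything queued behind it.
            if head in busy or (head == "X" and not x_ready):
                continue
            q.popleft()
            busy.add(head)
            delivered += head == "Y"
    return delivered


shared = delivered_to_y(isolate=False)
isolated = delivered_to_y(isolate=True)
print(f"Y packets delivered, shared queue:    {shared}")
print(f"Y packets delivered, isolated queues: {isolated}")
```

With the shared queue, packets for Y wait behind stalled packets for X even though Y is idle. With separate queues, Y traffic is unaffected by the hot spot in this model.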

  10. Agenda: Introduction; Congestion Dynamics in DCNs; Reducing In-Network and Incast Congestion; Combining Congestion Management Mechanisms; Conclusions

  11. Congestion Dynamics in DCNs: Appearance of Congestion
     [Figure: switch diagrams with injection at 100% of the link bandwidth (full rate) and switch speedups of 1, 1.5, and 2. The panels mark where congestion appears at time t0 and at time t0+T.]
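
One way to read the role of switch speedup in the slide above is through a simple fluid model. This is an assumption made for illustration, not a model given in the presentation: several flows at full rate target one output link, the switch fabric can move traffic toward that output at `speedup` times the link rate, and the output link drains at the link rate. The function name `buffer_growth` is hypothetical.

```python
def buffer_growth(num_flows, speedup, link_rate=1.0):
    """Return (input_growth, output_growth) in units of link rate.

    Assumed fluid model: num_flows inputs inject at full link rate toward a
    single output. The fabric forwards at most speedup * link_rate toward
    that output, and the output link drains at link_rate.
    """
    offered = num_flows * link_rate
    into_output = min(offered, speedup * link_rate)
    output_growth = max(0.0, into_output - link_rate)  # backlog at the output
    input_growth = offered - into_output               # backlog left at inputs
    return input_growth, output_growth


for s in (1, 1.5, 2):
    inp, out = buffer_growth(num_flows=2, speedup=s)
    print(f"speedup={s}: input buffers grow at {inp}, output buffer at {out}")
```

Under these assumptions, a speedup of 1 places the backlog at the input buffers immediately, while a higher speedup moves it to the output buffer first. Once a finite output buffer fills, the backlog would spread back to the inputs, which is consistent with congestion being detected at different places at t0 and t0+T.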

  12. Congestion Dynamics in DCNs: Growth of Congestion Trees (from root to leaves)
     [Figure: packet flows crossing Switches 1 to 5 with switch speedup = 1.5; congestion points are marked.]

  13. Congestion Dynamics in DCNs: Growth of Congestion Trees (from leaves to root)
     [Figure: packet flows crossing Switches 1 to 7 with switch speedup = 1.5; congestion points are marked.]

  14. Congestion Dynamics in DCNs: Growth of Congestion Trees (roots movement)
     [Figure: two panels over Switches 1 to 3 with switch speedup = 1.5, showing the packet flows at the start and the packet flows afterwards; congestion points are marked.]

  15. Congestion Dynamics in DCNs: Growth of Congestion Trees (in-network roots)
     [Figure: Switches 1 to 8 with switch speedup = 1.5, carrying packet flows addressed to destination X and packet flows addressed to destination Y; congestion points are marked.]

  16. Congestion Dynamics in DCNs: Growth of Congestion Trees (overlapping)
     [Figure: Switches 1 to 9 with switch speedup = 1.5, carrying packet flows addressed to destination X and packet flows addressed to destination Y; congestion points are marked.]

  17. Congestion Dynamics in DCNs: Growth of Congestion Trees (vanishing)
     [Figure: two panels over Switches 1 to 3 with switch speedup = 1.5. Legend: permanent packet flows; packet flows disappearing first; congestion point first appeared in the switch.]

  18. Agenda: Introduction; Congestion Dynamics in DCNs; Reducing In-Network and Incast Congestion; Combining Congestion Management Mechanisms; Conclusions

  19. Reducing Congestion: Incast congestion reduction - ECMP
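
ECMP, named on the slide above, is commonly implemented by hashing a flow's header fields and using the result to pick one of several equal-cost next hops. The following is a minimal sketch of that idea only. The function name `ecmp_next_hop`, the use of CRC32 as the hash, and the example addresses and port names are assumptions for illustration; real switches use vendor-specific hash functions.

```python
import zlib


def ecmp_next_hop(flow, next_hops):
    """Pick a next hop for a flow by hashing its identifying fields.

    flow: tuple such as (src_ip, dst_ip, protocol, src_port, dst_port).
    next_hops: list of equal-cost next hops (hypothetical port names here).
    """
    key = "|".join(str(field) for field in flow).encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]


uplinks = ["port1", "port2", "port3", "port4"]   # hypothetical uplinks
flow_a = ("10.0.0.1", "10.0.1.9", "tcp", 40000, 80)
flow_b = ("10.0.0.2", "10.0.1.9", "tcp", 40001, 80)
print(ecmp_next_hop(flow_a, uplinks), ecmp_next_hop(flow_b, uplinks))
```

Every packet of a flow maps to the same path, which preserves ordering. The hash is unaware of load, so two heavy flows can still land on the same link.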

  20. Reducing Congestion: In-network congestion reduction - ECN
     [Figure: Switches 1 to 9 with switch speedup = 1.5, carrying packet flows addressed to destination X, packet flows addressed to destination Y, and a victim flow; congestion points are marked.]

  21. Reducing Congestion: Limitations of current technologies [Escudero19]
     • These technologies may work together to eliminate loss in the cloud data center network.
     • Load balancing and destination scheduling are end-to-end solutions that incur RTT delays when congestion appears.
     • However, there is no time for loss in the network due to congestion, and congestion trees grow very quickly.
     • Transient congestion may still produce HoL blocking, which leads to increased latency, lower throughput, and buffer overflow, significantly degrading performance.
     • Even with these mechanisms, we still need something that deals with HoL blocking locally and quickly.

  22. Agenda: Introduction; Congestion Dynamics in DCNs; Reducing In-Network and Incast Congestion; Combining Congestion Management Mechanisms; Conclusions

  23. Combining Congestion Management Mechanisms
     • CI is needed to react locally and very fast, so that HoL blocking is eliminated immediately.
     • The previous technologies reduce the use of PFC and ECN, but their closed- and open-loop approaches still incur delays.
     • Congestion trees appear suddenly, are difficult to predict (even more so when load balancing is applied), and grow quickly.
     • New techniques can be applied in combination with the previous technologies, improving their behavior.

  24. Combining Congestion Management Mechanisms: Dynamic Virtual Lanes (DVL)
     [Figure: Switch A and Switch B, with ports P1 to P4 each holding a CFQ and an nCFQ queue, a congestion root, and a CIP exchanged between the switches. Legend: output port requested by the packet on top; congestion root; Congestion Isolation Packets (CIP); packets from congested flows; packets from non-congested flows.]
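
The DVL figure above can be read as a per-port pair of queues: packets from congested flows go to the CFQ, all other packets go to the nCFQ, and a CIP tells the upstream switch which flows to isolate. The sketch below follows that reading and is not the authors' implementation. The class name `DvlPort`, the detection rule (too many queued packets for one destination), the threshold value, and the choice to move already-queued packets into the CFQ are all simplifying assumptions.

```python
from collections import deque


class DvlPort:
    """Toy port with a non-congested queue (nCFQ) and a congested queue (CFQ)."""

    def __init__(self, threshold=4):
        self.threshold = threshold     # hypothetical detection threshold (packets)
        self.ncfq = deque()            # packets from non-congested flows
        self.cfq = deque()             # packets from congested flows
        self.congested_dsts = set()    # destinations currently isolated
        self.cips_sent = []            # CIPs that would be sent upstream

    def enqueue(self, dst):
        """Store a packet, identified here only by its destination."""
        if dst in self.congested_dsts:
            self.cfq.append(dst)
            return
        self.ncfq.append(dst)
        # Assumed detection rule: one destination holds too many nCFQ slots.
        if sum(1 for d in self.ncfq if d == dst) > self.threshold:
            self.isolate(dst)

    def isolate(self, dst):
        """Move the flow to the CFQ and record a CIP for the upstream switch."""
        self.congested_dsts.add(dst)
        remaining = deque()
        for d in self.ncfq:            # per-flow packet order is preserved
            (self.cfq if d == dst else remaining).append(d)
        self.ncfq = remaining
        self.cips_sent.append(dst)


port = DvlPort(threshold=2)
for dst in ["X", "Y", "X", "X", "X", "Y"]:
    port.enqueue(dst)
print(list(port.cfq), list(port.ncfq), port.cips_sent)
```

In this model, packets for Y are no longer queued behind packets for the congested destination X. An upstream switch that receives the CIP would apply the same isolation step to its own queues.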

  25. Agenda: Introduction; Congestion Dynamics in DCNs; Reducing In-Network and Incast Congestion; Combining Congestion Management Mechanisms; Conclusions

  26. References
     [Duato03] J. Duato, S. Yalamanchili, and L. M. Ni, Interconnection Networks: An Engineering Approach. San Francisco, CA, USA: Morgan Kaufmann Publishers, 2003.
     [Garcia05] P. J. Garcia, J. Flich, J. Duato, I. Johnson, F. J. Quiles, and F. Naven, “Dynamic Evolution of Congestion Trees: Analysis and Impact on Switch Architecture,” in High Performance Embedded Architectures and Compilers, ser. Lecture Notes in Computer Science. Springer, Berlin, Heidelberg, Nov. 2005, pp. 266–285.
     [Congdon18] P. Congdon, “IEEE 802 Nendica Report: The Lossless Network for Data Centers,” IEEE-SA Industry Connections White Paper, Aug. 2018.
     [Leiserson85] C. E. Leiserson, “Fat-trees: Universal networks for hardware-efficient supercomputing,” IEEE Transactions on Computers, vol. C-34, pp. 892–901, Oct. 1985.
     [Escudero11] J. Escudero-Sahuquillo, E. G. Gran, P. J. García, J. Flich, T. Skeie, O. Lysne, F. J. Quiles, and J. Duato, “Combining Congested-Flow Isolation and Injection Throttling in HPC Interconnection Networks,” in ICPP 2011, pp. 662–672.
     [Escudero19] J. Escudero-Sahuquillo, P. J. García, F. J. Quiles, and J. Duato, “P802.1Qcz interworking with other data center technologies,” IEEE 802.1 Plenary Meeting, San Diego, CA, USA, July 8, 2018 (cz-escudero-sahuquillo-ci-internetworking-0718-v1.pdf).
