
Data Driven Connectivity

Junda Liu, Aurojit Panda, Ankit Singla, Brighten Godfrey, Michael Schapira, Scott Shenker


  1. Data Driven Connectivity Junda Liu, Aurojit Panda, Ankit Singla, Brighten Godfrey, Michael Schapira, Scott Shenker

  2. Division of Concerns

  3. Division of Concerns • Routing is a control plane operation. • Operates on the order of milliseconds.

  4. Division of Concerns • Routing is a control plane operation. • Operates on the order of milliseconds. • Packet forwarding is a data plane operation. • Operates on the order of microseconds.

  5. Link Failures Hard

  6. Link Failures Hard • Some users require low latency packet delivery.

  7. Link Failures Hard • Some users require low latency packet delivery. • Some users require high reliability.

  8. Link Failures Hard • Some users require low latency packet delivery. • Some users require high reliability. • Control Plane response to link failure is too slow.

  9. Today's Solution • Rely on precomputed backup paths

  10. Today's Solution • Rely on precomputed backup paths • Typically support single link failures.

  11. Today's Solution • Rely on precomputed backup paths • Typically support single link failures. • State grows exponentially for more links.

  12. Today's Solution • Rely on precomputed backup paths • Typically support single link failures. • State grows exponentially for more links. • Hard to generalize. Hard to configure.
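To make the state-growth bullet concrete, the sketch below counts how many distinct failure scenarios a scheme would face if it stored one precomputed backup configuration per set of failed links. That one-entry-per-scenario model, the 100-link network, and the name `failure_scenarios` are assumptions for illustration, not figures from the talk.

```python
from math import comb


def failure_scenarios(num_links: int, max_failures: int) -> int:
    """Count the non-empty sets of at most max_failures failed links.

    Hypothetical model: one backup entry is needed per such set.
    """
    return sum(comb(num_links, k) for k in range(1, max_failures + 1))


# Scenario count for a 100-link network as more simultaneous
# failures must be tolerated.
for k in (1, 2, 3):
    print(k, failure_scenarios(100, k))
```

Under this model, tolerating one failure needs 100 entries, two failures 5,050, and three failures 166,750, which is why covering more than single-link failures quickly becomes impractical.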

  13. Routing is the Problem! • Routing conflates two functions • Optimality - Use good paths • Inherently global, requires coordination. • Connectivity - Deliver packets • Can it be local?

  14. Data Plane Connectivity

  15. Data Plane Connectivity • Can we push connectivity to the data plane?

  16. Data Plane Connectivity • Can we push connectivity to the data plane? • What would it take?

  17. Data Plane Connectivity • Can we push connectivity to the data plane? • What would it take? • No FIB changes at packet rate.

  18. Data Plane Connectivity • Can we push connectivity to the data plane? • What would it take? • No FIB changes at packet rate. • No additional data in packet header.

  19. Data Plane Connectivity • Can we push connectivity to the data plane? • What would it take? • No FIB changes at packet rate. • No additional data in packet header. • Impossible

  21. Data Plane Connectivity • Relax constraints • Change a few bits in FIB at packet rates. • Clearly feasible, but is it enough?

  22. Guaranteeing Connectivity 1. Take advantage of available redundancy.

  23. Guaranteeing Connectivity 1. Take advantage of available redundancy. 2. Restore connectivity at data speeds.

  24. Guaranteeing Connectivity 1. Take advantage of available redundancy. 2. Restore connectivity at data speeds. 3. Achieve optimality at control speeds.

  25. Using Redundancy: DAGs [Diagram label: Destination]

  26. Using Redundancy: DAGs [Diagram label: Destination] • Current paths to a destination do not use all links.

  27. Using Redundancy: DAGs [Diagram label: Destination] • Current paths to a destination do not use all links. • Extend routing tables to increase redundancy.
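One way to picture "extend routing tables to increase redundancy" is to orient every link toward the destination, not only the links on shortest paths. The sketch below does this with a breadth-first search and a tie-break on node name. The function name `build_dag`, the tie-break rule, and the example topology are assumptions for illustration; the slides do not say how the DAG is constructed.

```python
from collections import deque


def build_dag(links, destination):
    """Orient every link toward the destination.

    links: list of (u, v) pairs in a connected network (assumed).
    Returns a dict mapping each link (u, v) to the endpoint it points to.
    """
    neighbors = {}
    for u, v in links:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    # Breadth-first search gives each node its hop distance to the destination.
    dist = {destination: 0}
    queue = deque([destination])
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)

    # Each link points to the endpoint with the smaller (distance, name) key.
    # Following links always decreases that key, so no cycle can form.
    return {(u, v): min(u, v, key=lambda n: (dist[n], n)) for u, v in links}


links = [("a", "d"), ("b", "d"), ("a", "b"), ("c", "a"), ("c", "b")]
dag = build_dag(links, "d")
```

In this example, shortest-path routing would leave the a-b link unused, while the DAG orients it from b to a and so gives b a second way to reach d.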

  28. Restoring Connectivity

  29. Reverse to Reconnect

  30. Reverse to Reconnect • Link failure can disconnect a DAG.

  31. Reverse to Reconnect • Link failure can disconnect a DAG. • Disconnected node reverses all links to point out.

  32. Reverse to Reconnect • Link failure can disconnect a DAG. • Disconnected node reverses all links to point out. • A finite number of reversals reconnects the DAG.
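The reversal rule on this slide can be sketched as a small centralized simulation: any node other than the destination that has no outgoing link reverses all of its incoming links, and this repeats until no such node remains. The function name, the `max_steps` guard, and the example topology are assumptions for illustration. The sketch also assumes the network is still physically connected to the destination; it runs the rule step by step rather than in the data plane.

```python
def reverse_until_connected(out_links, destination, max_steps=1000):
    """Apply link reversal until every node has a way out.

    out_links: dict mapping each node to the set of neighbors it points to.
    Returns (final out_links, list of nodes in the order they reversed).
    """
    out_links = {n: set(s) for n, s in out_links.items()}
    reversals = []
    for _ in range(max_steps):
        stuck = [n for n in sorted(out_links)
                 if n != destination and not out_links[n]]
        if not stuck:
            return out_links, reversals
        node = stuck[0]
        # Reverse every link that currently points at the stuck node.
        for other in out_links:
            if node in out_links[other]:
                out_links[other].discard(node)
                out_links[node].add(other)
        reversals.append(node)
    raise RuntimeError("no convergence; is the destination still reachable?")


# DAG toward d was a->d, b->a, c->b, c->d; the a-d link has just failed.
state = {"a": set(), "b": {"a"}, "c": {"b", "d"}, "d": set()}
final, order = reverse_until_connected(state, "d")
```

Here node a reverses, which strands b, which reverses and strands a again; after three reversals the DAG is a->b->c->d and every node can reach the destination.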

  33. Reversals in Data Plane • Two challenges must be addressed

  34. Reversals in Data Plane • Two challenges must be addressed • Notifications can be lost.

  35. Reversals in Data Plane • Two challenges must be addressed • Notifications can be lost. • Notifications can be delayed.

  36. Walk Through

  38. Walk Through [Diagram: source node]

  39. Create an OUT Link

  40. Create an OUT Link [Diagram: per-link state field Local Sequence]

  41. Create an OUT Link [Diagram: per-link state fields Local Sequence, Remote Sequence]

  42. Create an OUT Link [Diagram: per-link state fields Local Sequence, Remote Sequence, Reversible]

  43. Create an OUT Link • Reverse link direction [Diagram: per-link state fields Local Sequence, Remote Sequence, Reversible]

  44. Create an OUT Link • Reverse link direction • Increment Local Sequence [Diagram: per-link state fields Local Sequence, Remote Sequence, Reversible]

  45. Create an OUT Link • Reverse link direction • Increment Local Sequence • Forward packet [Diagram: per-link state fields Local Sequence, Remote Sequence, Reversible]
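The three steps on this slide (reverse direction, increment the local sequence, forward the packet) can be sketched as follows. The `Link` class, the `forward` function, and the idea that the forwarded packet carries the link's local sequence value are assumptions for illustration; the `Reversible` field from the diagram is omitted because the transcript does not explain it, and the sequence is kept as a plain integer rather than a fixed number of bits.

```python
from dataclasses import dataclass


@dataclass
class Link:
    neighbor: str
    direction: str       # "IN" or "OUT", from this node's point of view
    local_seq: int = 0   # bumped each time this node reverses the link
    remote_seq: int = 0  # last sequence value seen from the neighbor


def forward(links):
    """Choose an OUT link for a packet; assumes the node has at least one link.

    If the node has no OUT link, every link is reversed to point out
    and its local sequence is incremented before forwarding.
    Returns (next hop, sequence value the packet carries).
    """
    if not any(link.direction == "OUT" for link in links):
        for link in links:
            link.direction = "OUT"
            link.local_seq += 1
    out = next(link for link in links if link.direction == "OUT")
    return out.neighbor, out.local_seq


links = [Link("b", "IN"), Link("c", "IN")]
hop = forward(links)
```

Because the new sequence value travels with ordinary data packets in this sketch, no separate notification message is needed to tell the neighbor about the reversal.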

  46. Dealing with Notifications [Diagram: per-link state fields Local Sequence, Remote Sequence, Reversible]

  47. Dealing with Notifications • Receive on link pointing OUT [Diagram: per-link state fields Local Sequence, Remote Sequence, Reversible]

  48. Dealing with Notifications • Receive on link pointing OUT • Compare sequence numbers [Diagram: per-link state fields Local Sequence, Remote Sequence, Reversible]

  49. Dealing with Notifications • Receive on link pointing OUT • Compare sequence numbers • See if anything changed [Diagram: per-link state fields Local Sequence, Remote Sequence, Reversible]

  50. Dealing with Notifications • Receive on link pointing OUT • Compare sequence numbers • See if anything changed • Reverse link [Diagram: per-link state fields Local Sequence, Remote Sequence, Reversible]
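The receive-side steps above can be sketched in the same style: a packet arriving on a link that this node believes points OUT is compared against the stored remote sequence, and a mismatch is taken to mean the neighbor has reversed the link. The `Link` class, the `receive` function, and the packet-carried sequence value are assumptions for illustration, not the authors' exact mechanism.

```python
from dataclasses import dataclass


@dataclass
class Link:
    neighbor: str
    direction: str       # "IN" or "OUT", from this node's point of view
    local_seq: int = 0
    remote_seq: int = 0  # last sequence value seen from the neighbor


def receive(link, packet_seq):
    """Process the sequence value carried by a packet arriving on link.

    Returns True if the packet revealed a reversal by the neighbor.
    """
    if link.direction == "OUT" and packet_seq != link.remote_seq:
        # The neighbor's sequence changed: it reversed the link toward us.
        link.remote_seq = packet_seq
        link.direction = "IN"
        return True
    return False


link = Link("b", "OUT")
unchanged = receive(link, 0)   # same sequence as recorded: nothing happened
reversed_now = receive(link, 1)  # new sequence: neighbor reversed the link
```

In this sketch a lost packet does no harm because every later packet carries the same new value, and a delayed packet with an already-recorded value is ignored, which is how the comparison addresses the two challenges named on the "Reversals in Data Plane" slide.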

  51. Zooming Out [Diagram sequence; content not recoverable]

  54. What about Optimality?

  55. Safe Control Plane • Cannot interfere with data plane.

  56. Safe Control Plane • Cannot interfere with data plane. • Build a safe primitive

  57. Safe Control Plane • Cannot interfere with data plane. • Build a safe primitive • Set all edges of a node to point out

  58. Safe Control Plane • Cannot interfere with data plane. • Build a safe primitive • Set all edges of a node to point out • Described in paper

  59. Evaluation

  60. Evaluation Overview • Test on WAN and datacenter topologies • Stretch, Throughput, Latency • Effect of FIB update delays • On latency and throughput • End-to-end benefits of using DDC.

  62. End-to-End Test • 8-pod FatTree • Partition-aggregate workload • 5 link failures • Simulated for 550 seconds

  63. Requests Fulfilled [Plot; axis labels not recoverable] • Bucketed into 10-second intervals. • Percentage of requests satisfied.

  64. Request Latency [Plot; axis labels not recoverable]

  65. FIB Update Delay • What is the impact of delayed FIB changes • On packet latency? • Three link failure: all traffic in test affected. • Focus on behavior before convergence.

  66. FIB Update Delay [Plot; axis labels not recoverable] • Overall, ~99% of packets are delivered in under 3 ms. • No packets get dropped; there is just a long tail.

  67. FIB Update Delay • What is the impact of delayed FIB changes • On TCP throughput? • Use a WAN topology (AS 2914) • 1 Gbps links • 5 link failures

  68. FIB Update Delay [Plot; axis labels not recoverable]

  69. In the Same Vein... • FCP (SIGCOMM '07) • Unbounded bits in header • Extensive FIB changes on failure packet • Packet Re-Cycling (HotNets '10) • First solve an NP-Complete problem. • log(network diameter) bits in header. • DDC is simpler.

  70. Potential Impact • ASICs implement DDC • Connectivity guaranteed by the data plane. • Control Plane focuses on optimality/functionality.

  71. Questions?
