PAST: Scalable Ethernet for Data Centers (Brent Stephens, Alan Cox)

PAST: Scalable Ethernet for Data Centers. Brent Stephens, Alan Cox, Wes Felter, Colin Dixon, and John Carter. Rice University and IBM Research. December 11th, 2012.


  1. PAST: Scalable Ethernet for Data Centers. Brent Stephens†, Alan Cox†, Wes Felter‡, Colin Dixon‡, and John Carter‡ (†Rice University, ‡IBM Research). December 11th, 2012.

  2. PAST is a large flat L2 network that can use arbitrary topologies; is implementable on existing Ethernet switch hardware and unmodified host network stacks; and meets or exceeds the performance of the state of the art.

  3. Data Center Network Requirements: host mobility; effective use of bandwidth; autonomous; scalability.

  4. Additional Design Requirements: no hardware changes; respects layering; topology independent.

  5. PAST: Per-Address Spanning Tree routing algorithm. Unmodified Ethernet switches and hosts ◮ Implementable today ◮ Exploits existing features. Arbitrary topologies ◮ Tens of thousands of hosts. Performance comparable to or greater than ECMP.

  6. Routing Space

  17. PAST Algorithm. Goal: route using the Ethernet table (DMAC, VLAN). Constraint 1: full pair-wise connectivity per VLAN. Constraint 2: the Ethernet table forces a tree. Solution: build a spanning tree rooted at each address. Load-balances at the granularity of addresses (hosts or virtual hosts).
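
The constraint that the Ethernet table forces a tree can be illustrated with a minimal sketch. It assumes each switch holds an exact-match table keyed by (DMAC, VLAN) with a single next hop per key, and that a tree is supplied as a child-to-parent map pointing toward the switch the destination attaches to. The names `install_tree` and `forward`, the switch labels, and the MAC addresses are hypothetical and not taken from the talk.

```python
from typing import Dict, List, Tuple

Switch = str
Key = Tuple[str, int]                      # (destination MAC, VLAN)
Table = Dict[Switch, Dict[Key, Switch]]    # per-switch exact-match table


def install_tree(tables: Table, dmac: str, vlan: int,
                 parent: Dict[Switch, Switch]) -> None:
    """Install one destination-rooted tree as exact-match entries.

    parent[s] is the neighbor that switch s forwards to on the way to the
    root, i.e. the switch the destination address is attached to. The root
    has no entry here; it would deliver on its host-facing port.
    """
    for switch, next_hop in parent.items():
        tables.setdefault(switch, {})[(dmac, vlan)] = next_hop


def forward(tables: Table, start: Switch, dmac: str, vlan: int) -> List[Switch]:
    """Follow table entries hop by hop and return the switches visited."""
    path = [start]
    while (dmac, vlan) in tables.get(path[-1], {}):
        path.append(tables[path[-1]][(dmac, vlan)])
        if len(path) > len(tables) + 1:
            raise RuntimeError("forwarding loop: entries do not form a tree")
    return path


# Four switches in a ring A-B-C-D-A; two addresses attached to switch A.
tables: Table = {}
install_tree(tables, "00:00:00:00:00:01", 1, {"B": "A", "D": "A", "C": "B"})
install_tree(tables, "00:00:00:00:00:02", 1, {"B": "A", "D": "A", "C": "D"})

print(forward(tables, "C", "00:00:00:00:00:01", 1))  # ['C', 'B', 'A']
print(forward(tables, "C", "00:00:00:00:00:02", 1))  # ['C', 'D', 'A']
```

Because each switch has exactly one entry per (DMAC, VLAN), all traffic to one address follows one tree, while two addresses on the same switch can use different links (C-B versus C-D here), which is where per-address load balancing comes from.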

  20. Tree Construction. Goal: use efficient paths. Solution: use a BFS tree for minimal paths. Goal: load-balance over all links. Solution: tree selection ◮ Random ◮ Weight links by load.
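
A minimal sketch of the two tree-selection options follows. A BFS from the root fixes each switch's distance, and every switch then picks its parent among the neighbors one hop closer to the root, so all paths stay minimal. The random variant picks uniformly. The weighted variant here picks the candidate link used by the fewest trees so far, which is one possible reading of "weight links by load" and an assumption rather than the talk's stated rule. `build_tree` and `link_load` are hypothetical names.

```python
import random
from collections import deque
from typing import Dict, FrozenSet, List, Optional

Graph = Dict[str, List[str]]
LinkLoad = Dict[FrozenSet[str], int]   # undirected link -> number of trees using it


def bfs_distances(graph: Graph, root: str) -> Dict[str, int]:
    """Hop distance from root to every reachable switch."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def build_tree(graph: Graph, root: str, rng: random.Random,
               link_load: Optional[LinkLoad] = None) -> Dict[str, str]:
    """Return child -> parent pointers of a minimal-path tree toward root.

    With link_load=None the parent is chosen uniformly at random among the
    minimal-path candidates. Otherwise the least-loaded candidate link is
    chosen (ties broken randomly) and link_load is updated in place.
    """
    dist = bfs_distances(graph, root)
    parent: Dict[str, str] = {}
    for s in sorted(dist, key=dist.get):
        if s == root:
            continue
        candidates = [n for n in graph[s] if dist[n] == dist[s] - 1]
        if link_load is None:
            choice = rng.choice(candidates)
        else:
            least = min(link_load.get(frozenset((s, n)), 0) for n in candidates)
            best = [n for n in candidates
                    if link_load.get(frozenset((s, n)), 0) == least]
            choice = rng.choice(best)
            link_load[frozenset((s, choice))] = least + 1
        parent[s] = choice
    return parent


ring = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["A", "C"]}
load: LinkLoad = {}
rng = random.Random(0)
first = build_tree(ring, "A", rng, load)
second = build_tree(ring, "A", rng, load)
print(first["C"], second["C"])  # the two trees use different links out of C
```

Sharing one `link_load` map across trees is what spreads successive trees over the equal-length alternatives.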

  29. Valiant Load Balancing

  30. Non-Minimal Tree Construction: NM-PAST ◮ Root the tree for host h at a random intermediate switch i ◮ Inspired by Valiant Load Balancing
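
One way to sketch the NM-PAST idea is to build a BFS tree rooted at a randomly chosen switch i and then reverse the pointers along the path between i and the switch that host h attaches to, so that every table entry still leads to h. The slide only states that the tree is rooted at a random intermediate switch; the re-rooting step and the names `nm_past_tree` and `route` are assumptions made for this illustration.

```python
import random
from collections import deque
from typing import Dict, List, Tuple

Graph = Dict[str, List[str]]


def bfs_parents(graph: Graph, root: str) -> Dict[str, str]:
    """Child -> parent pointers of a BFS tree rooted at root."""
    parent: Dict[str, str] = {}
    seen = {root}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in seen:
                seen.add(v)
                parent[v] = u
                queue.append(v)
    return parent


def nm_past_tree(graph: Graph, host_switch: str,
                 rng: random.Random) -> Tuple[str, Dict[str, str]]:
    """Build a tree through a random intermediate switch, delivering to host_switch."""
    intermediate = rng.choice(sorted(graph))
    parent = bfs_parents(graph, intermediate)
    # Walk from the host's switch up to the intermediate switch ...
    path = [host_switch]
    while path[-1] != intermediate:
        path.append(parent[path[-1]])
    # ... and reverse those pointers so the path now leads toward the host.
    for child, par in zip(path, path[1:]):
        parent[par] = child
    parent.pop(host_switch, None)   # the host's switch is the new root
    return intermediate, parent


def route(parent: Dict[str, str], start: str) -> List[str]:
    """Follow parent pointers from start until the root is reached."""
    path = [start]
    while path[-1] in parent:
        path.append(parent[path[-1]])
        if len(path) > len(parent) + 1:
            raise RuntimeError("loop in parent pointers")
    return path


ring = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["A", "C"]}
i, tree = nm_past_tree(ring, "A", random.Random(3))
print(i, {s: route(tree, s) for s in ring})
```

Switches whose BFS path to i does not pass near h end up on longer-than-minimal routes, which is the price paid for spreading load in the style of Valiant Load Balancing.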

  33. PAST Discussion. Broadcast/Multicast: unaffected; may be provided through STP or SDN. Security: use VLANs as normal. Virtualization: use any higher-layer virtualization overlay (NetLord, SecondNet, MOOSE, VXLAN).

  34. PAST Implementation: IBM RackSwitch G8264

  35. Implementation Scalability. Eliminate broadcasts: improve scalability by using the controller for address detection and resolution (ARP, DHCP, IPv6 ND, and RS). Route computation: 8,000 hosts ⇒ 40 µs to 1 ms per tree (300 ms per network); trivially parallelizable. Route installation: install and forward to 100K addresses; 2-12 ms rule install latency ⇒ masked by migration latency. Failure recovery: should patch affected portions of trees.
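
The route-computation figures can be turned into a rough whole-network estimate. The sketch below assumes one tree per host and a uniform per-tree cost, neither of which the slide states, and the worker count is purely illustrative. Under those assumptions the 40 µs lower bound gives roughly the quoted 300 ms for 8,000 hosts when computed serially.

```python
def serial_seconds(num_trees: int, seconds_per_tree: float) -> float:
    """Total time if every tree is computed one after another."""
    return num_trees * seconds_per_tree


def parallel_seconds(num_trees: int, seconds_per_tree: float, workers: int) -> float:
    """Total time if independent trees are split evenly across workers."""
    rounds = -(-num_trees // workers)   # ceiling division
    return rounds * seconds_per_tree


hosts = 8_000
print(serial_seconds(hosts, 40e-6))        # about 0.32 s at 40 µs per tree
print(serial_seconds(hosts, 1e-3))         # about 8 s at 1 ms per tree
print(parallel_seconds(hosts, 1e-3, 32))   # about 0.25 s with 32 workers (illustrative)
```

Since each tree depends only on the topology and its own root, the split across workers needs no coordination, which is what "trivially parallelizable" refers to.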

  36. Simulator. Simulate to evaluate performance at scale ◮ Flow-based simulator assumes max-min fairness. Workloads:
      URand-8: i ∈ 1..8, Dst_i = rand() % N. Benign.
      Stride-64: Dst_n = (n + 64) % N. Adversarial.
      Shuffle-10: 128MB to all hosts, random order, 10 active connections. More stressful than URand.
      MSR: synthetically generated from a 1500-server cluster. Light load.
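
A flow-based simulator that assumes max-min fairness needs a way to assign rates to flows given the links each flow crosses. The sketch below uses the standard progressive-filling (bottleneck) procedure. It is not the authors' simulator; `max_min_rates`, the link names, and the capacities are hypothetical.

```python
from typing import Dict, List


def max_min_rates(flows: Dict[str, List[str]],
                  capacity: Dict[str, float]) -> Dict[str, float]:
    """Max-min fair rates for flows, each given as the list of links it crosses.

    Repeatedly find the link with the smallest fair share of remaining
    capacity, fix every flow crossing it at that share, and subtract the
    allocated bandwidth from the other links those flows use.
    """
    if any(not links for links in flows.values()):
        raise ValueError("every flow must cross at least one link")
    remaining = dict(capacity)
    active = {f: set(links) for f, links in flows.items()}
    rates: Dict[str, float] = {}
    while active:
        count: Dict[str, int] = {}
        for links in active.values():
            for link in links:
                count[link] = count.get(link, 0) + 1
        share, bottleneck = min((remaining[l] / n, l) for l, n in count.items())
        for f in [f for f, links in active.items() if bottleneck in links]:
            rates[f] = share
            for link in active[f]:
                remaining[link] -= share
            del active[f]
    return rates


# Flow b shares link L1 with a and link L2 with c.
rates = max_min_rates(
    {"a": ["L1"], "b": ["L1", "L2"], "c": ["L2"]},
    {"L1": 1.0, "L2": 2.0},
)
print(rates)  # a and b are limited by L1; c takes what is left on L2
```

In this example L1 is the first bottleneck, so a and b each get 0.5, and c then receives the remaining 1.5 on L2. Aggregate throughput, the metric on the evaluation plots, would be the sum of such rates.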

  37. Topologies. Compare equal bisection-bandwidth (oversubscription ratio) networks: EGFT (Fat Tree), HyperX (Flattened Butterfly), Jellyfish (Random Regular Graph).

  38. Evaluation. Demonstrate PAST performance equal to or greater than other routing algorithms. Demonstrate PAST performs well under adversarial workloads. Demonstrate that PAST can effectively use a variety of topologies.

  39. Evaluation. Demonstrate PAST performance equal to or greater than other routing algorithms ◮ URand-8 on a 1:2 Bisection Bandwidth HyperX ◮ Shuffle-10 on a 1:2 Bisection Bandwidth HyperX. Demonstrate PAST performs well under adversarial workloads. Demonstrate that PAST can effectively use a variety of topologies.

  40. URand-8 on a 1:2 Bisection Bandwidth HyperX: PAST matches ECMP. [Plot: aggregate throughput (0.0 to 1.0) versus number of hosts (1K to 8K) for PAST, NM-PAST, EthAir, ECMP, VAL, and STP, with an annotation reading PAST/ECMP.]

  41. Shuffle-10 on a 1:2 Bisection Bandwidth HyperX: EthAir scales poorly. [Plot: aggregate throughput (0.0 to 1.0) versus number of hosts (1K to 8K) for PAST, NM-PAST, EthAir, ECMP, VAL, and STP, with an annotation reading EthAir.]

  42. Evaluation. Demonstrate PAST performance equal to or greater than other routing algorithms ◮ PAST matches ECMP ◮ EthAir scales poorly. Demonstrate PAST performs well under adversarial workloads. Demonstrate that PAST can effectively use a variety of topologies.
