
Jellyfish: networking data centers randomly. Brighten Godfrey, UIUC

Jellyfish: networking data centers randomly. Brighten Godfrey, UIUC. Cisco Systems, September 12, 2013. [Photo: Kevin Raskoff] Ask me about... low latency networked systems; data plane verification (Veriflow). Ankit Singla UIUC Chi-Yao


  1. Example. Fat tree (16 servers, 20 switches, degree 4): 4 of 16 reachable from origin in ≤ 5 hops. Jellyfish random graph (16 servers, 20 switches, degree 4; good expander): 12 of 16 reachable from origin in ≤ 5 hops.

  2. Example. Jellyfish random graph (good expander): 12 of 16 reachable from origin in ≤ 5 hops. Fat tree and Jellyfish random graph: 16 servers, 20 switches, degree 4 each.

  3. Example. Jellyfish random graph (good expander): 12 of 16 reachable from origin in ≤ 5 hops. Fat tree and Jellyfish random graph: 16 servers, 20 switches, degree 4 each.

  4. Example. Jellyfish random graph (good expander): 12 of 16 reachable from origin in ≤ 5 hops. Fat tree and Jellyfish random graph: 16 servers, 20 switches, degree 4 each.

  5. Jellyfish has short paths: fat-tree with 686 servers

  6. Jellyfish has short paths: Jellyfish, same equipment

  7. System Design: Performance Consistency

  8. Is performance more variable? Performance depends on choice of random graph • if you expand the network, would performance change dramatically? Extreme case: graph could be disconnected! • never happens, with high probability

  9. Little variation if size is moderate ({min, avg, max} of 20 trials shown)
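The consistency question above can be explored with a small experiment. The following is a minimal sketch, not the construction or evaluation from the talk: `random_regular_graph`, `mean_path_length`, and `trial_summary` are hypothetical helper names, the generator is a simple greedy pairing with restarts (it does not sample uniformly), and the metric is the mean switch-to-switch hop count rather than throughput.

```python
import random
from collections import deque


def random_regular_graph(n, d, rng):
    """Adjacency sets of a d-regular simple graph on n switches.

    Illustrative generator (an assumption, not the talk's procedure):
    join random pairs of switches that still have free ports and are not
    yet neighbors; restart from scratch if leftover ports cannot be paired.
    """
    if (n * d) % 2 or d >= n:
        raise ValueError("need n * d even and d < n")
    while True:
        adj = [set() for _ in range(n)]
        stuck = False
        while not stuck:
            open_nodes = [v for v in range(n) if len(adj[v]) < d]
            if not open_nodes:
                return adj
            pairs = [(u, v) for i, u in enumerate(open_nodes)
                     for v in open_nodes[i + 1:] if v not in adj[u]]
            if pairs:
                u, v = rng.choice(pairs)
                adj[u].add(v)
                adj[v].add(u)
            else:
                stuck = True  # dead end; the outer loop starts over


def hop_counts(adj, src):
    """BFS hop counts from src; unreachable nodes are absent."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def mean_path_length(adj):
    """Mean shortest-path hop count over ordered pairs; None if disconnected."""
    n = len(adj)
    total = 0
    for s in range(n):
        dist = hop_counts(adj, s)
        if len(dist) < n:
            return None
        total += sum(dist.values())
    return total / (n * (n - 1))


def trial_summary(n, d, trials=20, seed=0):
    """{min, avg, max} of mean path length over independent random graphs."""
    rng = random.Random(seed)
    lengths = [mean_path_length(random_regular_graph(n, d, rng))
               for _ in range(trials)]
    ok = [x for x in lengths if x is not None]
    return {
        "disconnected": trials - len(ok),
        "min": min(ok) if ok else None,
        "avg": sum(ok) / len(ok) if ok else None,
        "max": max(ok) if ok else None,
    }


print(trial_summary(n=40, d=4))
```

Changing `n` and `d` shows how the spread between `min` and `max` behaves as the graph grows; the `disconnected` count tracks the extreme case mentioned above.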

  10. System Design: Routing

  11. Routing. Intuition: if we fully utilize all available capacity, $\#\,\text{1 Gbps flows} = \dfrac{\text{total capacity}}{\text{used capacity per flow}}$. How do we effectively utilize capacity without structure?

  12. Routing without structure In theory, just a multicommodity flow (MCF) problem Potential issues: • Solve MCF using a distributed protocol? • Optimal solution could have too many small subflows

  13. Routing. Does ECMP work? • No • ECMP doesn’t use Jellyfish’s path diversity

  14. Routing: a simple solution. Find k shortest paths. Let Multipath TCP do the rest • [Wischik, Raiciu, Greenhalgh, Handley, NSDI’10]. [Plot: Normalized Throughput vs. #Servers (70 to 960), packet level simulation against optimal.] Packet level simulation: 86-90% of optimal. (TCP is within 3 percentage points of MPTCP)
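To make the contrast between ECMP and k-shortest-path routing concrete, here is a minimal sketch. It is a brute-force enumeration suitable only for small graphs (not Yen's algorithm, and not the implementation used in the talk); `k_shortest_paths`, `ecmp_paths`, and the toy topology are hypothetical.

```python
from collections import deque


def k_shortest_paths(adj, src, dst, k):
    """Up to k loop-free paths from src to dst, in nondecreasing hop count.

    Breadth-first enumeration of simple paths: partial paths leave the FIFO
    queue in order of length, so complete paths are found shortest first.
    Exponential in general; an illustration, not a production algorithm.
    """
    found = []
    queue = deque([(src,)])
    while queue and len(found) < k:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            found.append(list(path))
            continue
        for nxt in sorted(adj[node]):
            if nxt not in path:  # keep paths loop-free
                queue.append(path + (nxt,))
    return found


def ecmp_paths(adj, src, dst, limit=64):
    """Only the equal-cost shortest paths, which is all ECMP spreads over."""
    paths = k_shortest_paths(adj, src, dst, limit)
    return [p for p in paths if len(p) == len(paths[0])] if paths else []


# Hypothetical 6-switch topology: one 2-hop path and two 3-hop paths from 0 to 5.
toy = {0: {1, 2, 4}, 1: {0, 5}, 2: {0, 3}, 3: {2, 4, 5}, 4: {0, 3}, 5: {1, 3}}

print(ecmp_paths(toy, 0, 5))           # [[0, 1, 5]]
print(k_shortest_paths(toy, 0, 5, 8))  # [[0, 1, 5], [0, 2, 3, 5], [0, 4, 3, 5]]
```

In this toy graph ECMP sees a single path, while k-shortest paths exposes two more that a multipath transport could use, even though they are one hop longer.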

  15. Throughput: Jellyfish vs. fat tree. +25% more servers. 8-shortest paths + MPTCP

  16. Deploying k-shortest paths. Multiple options: • SPAIN [Mudigonda, Yalagandula, Al-Fares, Mogul, NSDI’10] • Equal-cost MPLS tunnels • IBM Research’s SPARTA [CoNEXT 2012] • SDN controller based methods

  17. System Design: Cabling

  18. Cabling

  19. Cabling [Photo: Javier Lastras / Wikimedia]

  20. Cabling solutions. Fewer cables for same # servers as fat tree. Aggregate bundles. [Diagram labels: cluster A, cluster B, new rack X, cluster of switches, rack of servers, aggregate cable.] Generic optimization: Place all switches centrally

  21. Interconnecting clusters How many “long” cables do we need?

  22. Interconnecting clusters. [Plot: Normalized Throughput vs. Cross-cluster Links (Ratio to Expected Under Random Connection), marked “?”]

  23. Interconnecting clusters. [Plot: Normalized Throughput vs. Cross-cluster Links (Ratio to Expected Under Random Connection)]

  24. Intuition

  25. Intuition

  26. Intuition. Still need one crossing! Throughput should drop when less than $\Theta\left(\frac{1}{\text{APL}}\right)$ of total capacity crosses the cut!

  27. Explaining throughput. [Plot: Normalized Throughput vs. Cross-cluster Links (Ratio to Expected Under Random Connection)]

  28. Explaining throughput. Upper bounds... [Plot: Normalized Throughput vs. Cross-cluster Links (Ratio to Expected Under Random Connection)] And constant-factor matching lower bounds in special case.

  29. Two regimes of throughput: sparsest cut; “plateau”: (total cap) / APL. [Plot: Normalized Throughput vs. Cross-cluster Links (Ratio to Expected Under Random Connection)]

  30. Two regimes of throughput: sparsest cut; “plateau”: (total cap) / APL. High-capacity switches needn’t be clustered. Bisection bandwidth is poor predictor of performance! Cables can be localized. [Plot: Normalized Throughput vs. Cross-cluster Links (Ratio to Expected Under Random Connection)]

  31. What’s Next

  32. Research agenda Prototype in the lab • High throughput routing even in unstructured networks • New techniques for near-optimal TE applicable generally • SDN-based implementation Topology-aware application & VM placement Tech transfer

  33. For more... “Networking Data Centers Randomly”, A. Singla, C. Hong, L. Popa, P. B. Godfrey, NSDI 2012. “High throughput data center topology design”, A. Singla, P. B. Godfrey, A. Kolla, Manuscript (check arxiv soon!)

  34. Conclusion High throughput Expandability

  35. [Photo: Kevin Raskoff]

  36. Backup Slides

  37. Hypercube vs. Random Graph

  38. Is Jellyfish’s advantage just that it’s a “direct” network? [Plot: Relative Throughput for Hypercube-n (8 to 256 switches); legend: Hypercube_1serv] Answer: No

  39. Are There Even Better Topologies?

  40. A simple upper bound: $\text{Throughput per flow} \le \dfrac{\sum_{\text{links}} \text{capacity}(\text{link})}{\#\,\text{flows} \cdot \text{mean path length}}$. Lower bound this!
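The bound above can be turned into numbers once the mean path length is bounded from below. The sketch below assumes "Lower bound this!" refers to the mean path length and uses a standard counting argument: a node of degree d has at most d·(d−1)^(j−1) nodes exactly j hops away. The function names and the example figures (20 switches, degree 4, 1 Gbps links, 40 flows) are hypothetical, not taken from the talk.

```python
def mean_path_length_lower_bound(n, d):
    """Lower bound on mean hop count in any n-node graph with max degree d.

    At most d * (d - 1) ** (j - 1) nodes can sit exactly j hops from a given
    node, so the best case fills the nearest distances first.
    """
    if n < 2 or d < 1:
        raise ValueError("need n >= 2 and d >= 1")
    remaining = n - 1
    total_hops = 0
    j = 1
    while remaining > 0:
        room = d * (d - 1) ** (j - 1)
        if room == 0:
            raise ValueError("degree too small to reach all nodes")
        at_j = min(remaining, room)
        total_hops += j * at_j
        remaining -= at_j
        j += 1
    return total_hops / (n - 1)


def throughput_upper_bound(link_capacities, num_flows, mean_path_length):
    """Total capacity divided by the capacity all flows consume together.

    Capacities and the per-flow result share one unit (e.g. Gbps). A
    bidirectional link is assumed to be listed once per direction.
    """
    return sum(link_capacities) / (num_flows * mean_path_length)


# Hypothetical example: 20 switches of degree 4, so 40 bidirectional 1 Gbps
# links (80 directed), carrying 40 switch-to-switch flows.
apl_lb = mean_path_length_lower_bound(20, 4)  # 37 / 19
print(apl_lb)
print(throughput_upper_bound([1.0] * 80, 40, apl_lb))
```

Because the path-length value is a lower bound, dividing by it keeps the result a valid upper bound on per-flow throughput for any topology built from the same equipment.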
