
TCEP: Traffic Consolidation for Energy-Proportional High-Radix Networks



  1. TCEP: Traffic Consolidation for Energy-Proportional High-Radix Networks
     Gwangsun Kim (Arm Research); Hayoung Choi, John Kim (KAIST)

  2. High-radix Networks
     ▪ A large number of narrow links → low network diameter, high path diversity
     ▪ Energy-proportionality can be challenging
       – Links use high-speed signaling
       – High energy consumption regardless of load (‘Idle’ packets transmitted)
     [Figures: Dragonfly network in Cray XC30 system (image source: Cray); 1D Flattened butterfly (fully connected); 2D Flattened butterfly (fully connected within each dimension)]


  4. Motivation
     ▪ Data center networks can be significantly underutilized
       – Resources provisioned to meet peak demand
       – Low link utilization measured by Facebook
     ▪ Network energy waste can be high at low system utilization
     ▪ Exploit link power-gating opportunity in high-radix routers
     ▪ Link power-gating challenges:
       – How to maximize power reduction?
       – How to keep network connected?
       – How to achieve scalability?
       – How to minimize performance impact?
       – How to load-balance network?
     [Roy et al., SIGCOMM’15] [Abts et al., ISCA’10]

  5. Contents
     ▪ Background / Motivation
     ▪ Traffic consolidation
     ▪ Maintaining connectivity
     ▪ Criteria for selecting links to power-gate
     ▪ Power-aware load-balanced routing
     ▪ Evaluation
     ▪ Conclusion

  6. Traffic Consolidation
     ▪ Energy-proportionality requires aggressive power-gating
     ▪ Consolidate flows onto fewer links through non-minimal routing
     [Figure: traffic consolidation between routers, before and after; example flows: Flow 0 at 50% link util., Flow 1 at 25% link util., Flow 2 at 50% link util.]
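
The consolidation idea on this slide can be illustrated with a small packing sketch. This is not TCEP's algorithm: it uses a generic first-fit-decreasing heuristic, and the function name `consolidate` and the `max_util` cap of 0.8 are hypothetical choices made only for illustration. The flow utilizations are the ones shown on the slide.

```python
def consolidate(flows, max_util=0.8):
    """Pack flow utilizations onto as few links as possible (first-fit decreasing).

    `flows` maps a flow name to its link utilization (0.0 to 1.0).
    `max_util` is a hypothetical cap on how full a link may get.
    Returns a list of links, each a list of (flow_name, utilization) pairs.
    """
    links = []
    # Place the largest flows first, each on the first link that still has room.
    for name, util in sorted(flows.items(), key=lambda kv: -kv[1]):
        for link in links:
            if sum(u for _, u in link) + util <= max_util:
                link.append((name, util))
                break
        else:
            links.append([(name, util)])  # no link had room: keep another link ON
    return links


flows = {"flow0": 0.50, "flow1": 0.25, "flow2": 0.50}
active = consolidate(flows)
print(len(active), "links stay ON instead of", len(flows))
```

Every link that ends up carrying no flow becomes a candidate for power-gating; the price is that some flows take a non-minimal path.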

  7. Subnetwork-based Distributed Approach
     ▪ Subnetwork: routers that are fully connected in a dimension
     ▪ Independently manage power with local information
     [Figure: 2D Flattened butterfly (routers R0–R15), fully connected within each dimension]

  8. Subnetwork-based Distributed Approach
     ▪ Subnetwork: routers that are fully connected in a dimension
     ▪ Independently manage power with local information
     [Figure: 2D Flattened butterfly (routers R0–R15) with the row subnetworks highlighted]

  9. Subnetwork-based Distributed Approach
     ▪ Subnetwork: routers that are fully connected in a dimension
     ▪ Independently manage power with local information
     [Figure: 2D Flattened butterfly (routers R0–R15) with the column subnetworks highlighted]

  10. Subnetwork-based Distributed Approach
      ▪ Subnetwork: routers that are fully connected in a dimension
      ▪ Independently manage power with local information
      [Figure: 2D Flattened butterfly (routers R0–R15) next to a 1D Flattened butterfly, which consists of a single subnetwork]
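
The subnetwork definition above can be made concrete with a short sketch. It assumes a k x k 2D flattened butterfly with row-major router IDs (router `r * k + c` sits at row `r`, column `c`), which matches the R0–R15 labels only by assumption; the function name `subnetworks_2d` is hypothetical.

```python
from itertools import combinations


def subnetworks_2d(k):
    """Return the row and column subnetworks of a k x k 2D flattened butterfly.

    Router IDs are assumed to be row-major: router r * k + c is at (r, c).
    """
    rows = [[r * k + c for c in range(k)] for r in range(k)]
    cols = [[r * k + c for r in range(k)] for c in range(k)]
    return rows, cols


rows, cols = subnetworks_2d(4)
print(rows[0])  # routers sharing row 0: [0, 1, 2, 3]
print(cols[0])  # routers sharing column 0: [0, 4, 8, 12]

# Each subnetwork is fully connected, so it owns C(k, 2) links of its own.
links_in_row0 = list(combinations(rows[0], 2))
print(len(links_in_row0))  # 6
```

Because every link belongs to exactly one subnetwork, each subnetwork can decide which of its own links to power-gate using only local information.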

  11. Root Network – Maintaining Connectivity
      ▪ Constantly checking connectivity can incur high overhead
      ▪ Root network: a subset of links that are always ON to keep all nodes connected
      ▪ Star topology → minimal # of links and low network diameter
      [Figure: root network for 1D Flattened butterfly (routers R0–R7), rearranged as a star; legend: root network links, other links; max. hop count = 2]
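
The two properties claimed for the star topology (fewest links that keep everything connected, and a maximum hop count of 2) can be checked with a small sketch. The choice of router 0 as the hub and the helper names are assumptions made for illustration.

```python
from collections import deque


def star_root_network(routers, hub):
    """Always-ON links of a star: one link from the hub to every other router."""
    return {frozenset((hub, r)) for r in routers if r != hub}


def max_hop_count(routers, links):
    """Longest shortest path over all router pairs (BFS); None if disconnected."""
    adj = {r: set() for r in routers}
    for link in links:
        a, b = tuple(link)
        adj[a].add(b)
        adj[b].add(a)
    worst = 0
    for src in routers:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < len(routers):
            return None  # some router is unreachable
        worst = max(worst, max(dist.values()))
    return worst


routers = list(range(8))                    # 1D flattened butterfly, R0..R7
root = star_root_network(routers, hub=0)    # hub choice is an assumption
print(len(root), "of", 8 * 7 // 2, "links stay ON")
print("max hop count:", max_hop_count(routers, root))
```

With N routers the star keeps N - 1 links ON, which is the minimum for a connected network, and any two non-hub routers reach each other through the hub in 2 hops.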



  14. Root Network for Higher Dimensions
      ▪ Star topology is formed within each subnetwork
      ▪ Further reducing ON links? → too complex, little added benefit (4.3% for radix-64 routers)
      ▪ Other links can be power-gated without affecting connectivity
      [Figure: root network for 2D Flattened butterfly (routers R0–R15); legend: root network links, other links]
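
A sketch of the per-subnetwork star for a 2D flattened butterfly follows. It assumes row-major router IDs and picks the first router of each subnetwork as its hub; the slides do not say which router serves as hub, so that choice and all names here are hypothetical.

```python
from itertools import combinations


def fbfly_2d_links(k):
    """Return (all_links, root_links) for a k x k 2D flattened butterfly."""
    subnets = [[r * k + c for c in range(k)] for r in range(k)]    # row subnetworks
    subnets += [[r * k + c for r in range(k)] for c in range(k)]   # column subnetworks
    all_links, root_links = set(), set()
    for subnet in subnets:
        all_links |= {frozenset(pair) for pair in combinations(subnet, 2)}
        hub = subnet[0]  # assumption: first router of the subnetwork is the hub
        root_links |= {frozenset((hub, r)) for r in subnet[1:]}
    return all_links, root_links


def is_connected(n, links):
    """Union-find connectivity check over routers 0..n-1."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for link in links:
        a, b = tuple(link)
        parent[find(a)] = find(b)
    return len({find(x) for x in range(n)}) == 1


all_links, root_links = fbfly_2d_links(4)
print(len(root_links), "of", len(all_links), "links are always ON")
print("connected with root network only:", is_connected(16, root_links))
```

Every link outside `root_links` can be switched off without a connectivity check, because the stars alone already connect all routers.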

  15. Observation on Maximizing Path Diversity
      ▪ Which links should be ON for high path diversity?
        – Approach 1: distribute ON links
        – Approach 2: concentrate ON links → provides better path diversity
      [Figure legend: OFF, ON (root network), ON (additional links)]


  17. Hub Routers for High Path Diversity
      ▪ ‘Hub’ routers created → ‘small-world’ network
      ▪ Similarly, airlines create hub airports to reduce cost
      ▪ Quantitative results:
        – 1D Flattened butterfly (32 routers, 1024 nodes)
        – No non-minimal paths with more than 2 hops
        – Random distribution: average from 10,000 samples
      ▪ TCEP concentrates ON links to a small number of “hub” routers.
      [Figure: normalized number of total paths vs. fraction of active links (%), comparing "Concentrate to hub routers" with "Randomly distribute"; total # of paths improved by 1.9x]
      [Figure: hub airport; edges are direct flights by United Airlines]
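
The comparison behind the plot can be approximated with a small path-counting sketch. It follows the slide's setup (1D flattened butterfly with 32 routers, paths of at most 2 hops) but is not the authors' evaluation code: the star root at router 0, the budget of 30 extra links, the single random sample, and all function names are assumptions.

```python
import random
from itertools import combinations


def count_paths(n, links):
    """Total number of 1-hop and 2-hop paths over all router pairs."""
    adj = {r: set() for r in range(n)}
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    total = 0
    for s, d in combinations(range(n), 2):
        total += (d in adj[s]) + len(adj[s] & adj[d])  # direct link + shared neighbors
    return total


def root_star(n):
    """Always-ON star rooted at router 0 (assumed hub)."""
    return [(0, r) for r in range(1, n)]


def concentrated(n, extra):
    """Turn ON extra links hub by hub: all of router 1's links, then router 2's, ..."""
    pool = [(hub, r) for hub in range(1, n) for r in range(hub + 1, n)]
    return root_star(n) + pool[:extra]


def randomly_distributed(n, extra, rng):
    """Turn ON extra links chosen uniformly among the non-root links."""
    pool = list(combinations(range(1, n), 2))
    return root_star(n) + rng.sample(pool, extra)


n, extra = 32, 30
hub_paths = count_paths(n, concentrated(n, extra))
rnd_paths = count_paths(n, randomly_distributed(n, extra, random.Random(0)))
print("concentrated:", hub_paths, "random:", rnd_paths)
```

With the same number of ON links, piling them onto one more hub gives every router pair an additional shared neighbor, which is why the concentrated layout yields more 2-hop paths than a scattered one.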



  20. Observation on Minimizing Impact on Network
      ▪ Differentiate the type of traffic (minimally vs. non-minimally routed)
        – Candidate 1 carries minimally routed traffic: power-gating it increases hop count & BW usage
        – Candidate 2 carries non-minimally routed traffic: power-gating it keeps the same hop count & BW usage
      ▪ TCEP prioritizes power-gating links with the least amount of minimally routed traffic.
      [Figure: power-gating decision made at R0 among routers R0–R3; legend: ON, OFF]
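
The selection rule on this slide can be sketched as a simple minimum over per-link traffic counters. The `LinkStats` record, its counter fields, and the example numbers are hypothetical; the slides do not describe how routers measure traffic.

```python
from dataclasses import dataclass


@dataclass
class LinkStats:
    link: tuple        # (router, neighbor)
    minimal: int       # flits seen that were minimally routed (hypothetical counter)
    non_minimal: int   # flits seen that were non-minimally routed (hypothetical counter)


def pick_link_to_gate(candidates, root_links):
    """Pick the non-root link carrying the least minimally routed traffic."""
    eligible = [c for c in candidates if c.link not in root_links]
    if not eligible:
        return None  # only always-ON root links remain
    return min(eligible, key=lambda c: c.minimal).link


# Decision at R0: the R0-R1 link is part of the (assumed) root network.
root_links = {(0, 1)}
stats = [
    LinkStats(link=(0, 1), minimal=10, non_minimal=0),
    LinkStats(link=(0, 2), minimal=900, non_minimal=50),   # like Candidate 1
    LinkStats(link=(0, 3), minimal=20, non_minimal=930),   # like Candidate 2
]
print(pick_link_to_gate(stats, root_links))  # (0, 3)
```

Gating the link that mostly carries non-minimal traffic moves that traffic to another detour of the same length, whereas gating a link full of minimal traffic would lengthen its paths.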


  22. Problem with Load-balanced Routing
      ▪ No global link state information → non-minimal path can become significantly longer
      ▪ Baseline non-minimal routing: Source → Intermediate (INTM) router → Destination
      ▪ Some links are OFF!
      [Figure: SRC → INTM → DEST route avoiding congestion in a 2D Flattened butterfly (routers R0–R15); baseline (no power-gating): 4 hops; with power-gating: 8 hops]


  24. Proposed PAL Routing
      ▪ PAL: Power-Aware progressive Load-balanced (Routing)
      ▪ Instead of global randomization, locally randomize
      ▪ Compared to the baseline load-balanced routing:
        – The same maximum hop count (2 hops in each dimension)
        – No additional virtual channels (dimension-order routing)
      [Figure: PAL route avoiding congestion in a 2D Flattened butterfly (routers R0–R15); labels: SRC, INTM_X, DEST_X, INTM_Y, DEST; 4 hops]
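
The sketch below illustrates the general idea of randomizing locally, one dimension at a time, over links known to be ON. It is only loosely modeled on the bullets above and is not the PAL algorithm from the paper: the row-first order, the congestion set, the hub choice, and every function name are assumptions.

```python
import random


def route_in_dimension(src, dst, subnet, on_links, congested, rng):
    """Hops from src to dst inside one subnetwork, using at most 2 hops."""
    if src == dst:
        return []
    direct = frozenset((src, dst))
    if direct in on_links and direct not in congested:
        return [dst]  # minimal route
    # Locally randomize: only intermediates whose two links are both ON.
    intms = [m for m in subnet if m not in (src, dst)
             and frozenset((src, m)) in on_links
             and frozenset((m, dst)) in on_links]
    if intms:
        return [rng.choice(intms), dst]
    if direct in on_links:
        return [dst]  # congested, but no ON detour exists
    raise ValueError("no path of at most 2 hops in this subnetwork")


def route_2d(src, dst, k, on_links, congested=frozenset(), rng=random):
    """Dimension-order route (row first, then column) in a k x k flattened butterfly."""
    row = [(src // k) * k + c for c in range(k)]
    turn = (src // k) * k + dst % k            # router in src's row and dst's column
    col = [r * k + dst % k for r in range(k)]
    path = route_in_dimension(src, turn, row, on_links, congested, rng)
    return path + route_in_dimension(turn, dst, col, on_links, congested, rng)


# Only the root network is ON: a star in every row and column (hub = first router).
k = 4
subnets = [[r * k + c for c in range(k)] for r in range(k)]
subnets += [[r * k + c for r in range(k)] for c in range(k)]
on_links = {frozenset((s[0], r)) for s in subnets for r in s[1:]}

path = route_2d(5, 14, k, on_links)
print(path, len(path), "hops")
```

Because each intermediate router is drawn from the current subnetwork and both of its links are checked locally, the route never exceeds 2 hops per dimension even when most links are power-gated.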

  25. Other Issues Addressed in the Paper
      ▪ Challenge: the observations can lead to different decisions → We propose a low-complexity algorithm to reconcile them
      ▪ Only one link turned on/off from a router per epoch → avoid supply voltage shift
      ▪ Routers keep track of non-minimal paths within a subnetwork
      ▪ More details in the paper
