RADWAN | Rate Adaptive Wide Area Networks • Rachee Singh / U. Massachusetts Amherst • Manya Ghobadi / Microsoft Research • Klaus-Tycho Foerster / University of Vienna • Mark Filer / Microsoft Research • Phillipa Gill / U. Massachusetts Amherst
Wide Area Networks: O(100) datacenters are connected by a dedicated wide area network that costs O(100) million dollars per year. [SIGCOMM ’13] [SIGCOMM ’14] [SIGCOMM ’16]
O(100,000 miles) of fiber • O(1,000) optical devices • Fiber is scarce and expensive • Identify inefficiencies in the optical backbone to gain capacity and availability at reduced cost.
This Talk: Gain 134 Tbps of capacity and prevent 25% of link failures in a large North American WAN.
Talk Outline: 1. How inefficient are optical backbones? 2. Dynamic capacity links in WANs 3. Challenges in dynamically adapting link capacities 4. Rate Adaptive WANs
Optical Backbone Networks: • Optical cross-connects (OXCs) switch optical signals between fibers • Signal-to-noise ratio (SNR) measures signal quality • Signal quality is measured at the OXCs: 8,000 wavelengths, every 15 minutes, from February 2015 to June 2017
Longitudinal Signal Quality on Fiber: [Figure: SNR of a fiber over time (higher is better), with capacity thresholds marked at 200, 175, 150, 125, 100, 75, and 50 Gbps, and a failure level.]
Opportunity for capacity gain: For 8,000 wavelengths in the WAN: • Analyze average SNR • Compare with thresholds for link capacity • 64% of optical wavelengths can operate at 175 Gbps or more • 95% of optical wavelengths can operate at higher than 100 Gbps. [Figure: CDF of average SNR (dB), with capacity thresholds marked at 100, 125, 150, 175, and 200 Gbps.]
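The threshold comparison on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, and the threshold values are placeholders. Only the 2.5 dB / 50 Gbps pair is loosely suggested by the failure-SNR slide; the slides do not list the other values.

```python
from bisect import bisect_right

# Hypothetical SNR thresholds (dB) and the capacity (Gbps) each one unlocks.
# Placeholder values for illustration; the slides do not give the real table.
THRESHOLDS_DB = [2.5, 5.0, 6.5, 8.0, 9.5, 11.0, 12.5]
CAPACITIES_GBPS = [50, 75, 100, 125, 150, 175, 200]


def feasible_capacity(snr_db, thresholds=THRESHOLDS_DB, capacities=CAPACITIES_GBPS):
    """Return the highest capacity whose threshold is at or below snr_db (0 if none)."""
    idx = bisect_right(thresholds, snr_db)  # number of thresholds <= snr_db
    return capacities[idx - 1] if idx > 0 else 0


def fraction_at_or_above(avg_snrs_db, rate_gbps):
    """Fraction of wavelengths whose average SNR supports at least rate_gbps."""
    if not avg_snrs_db:
        return 0.0
    hits = sum(1 for snr in avg_snrs_db if feasible_capacity(snr) >= rate_gbps)
    return hits / len(avg_snrs_db)
```

Given a list of per-wavelength average SNR values, `fraction_at_or_above(snrs, 175)` would give the kind of statistic quoted on the slide (64% at 175 Gbps or more), provided the real threshold table is substituted for the placeholders.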
Opportunity for availability gain: • Distribution of SNR at link failure, across WAN links, over 2.5 years • 25% of failures have SNR > 2.5 dB • These failures can be prevented by reducing link capacity to 50 Gbps
Our Proposal: Dynamically adapt link capacities in response to changes in SNR. • Gain 134 Tbps of capacity by increasing link capacity when SNR is high • Prevent 25% of link failures by reducing link capacity when SNR is low
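A minimal sketch of the proposed behavior for a single link, assuming a link currently provisioned at some fixed rate and a hypothetical SNR-to-capacity table (`RATE_TABLE` and `adapt_link` are illustrative names, and the threshold values are placeholders rather than figures from the talk):

```python
# Hypothetical (SNR threshold in dB, capacity in Gbps) pairs; placeholder values.
RATE_TABLE = [(2.5, 50), (6.5, 100), (9.5, 150), (12.5, 200)]


def adapt_link(current_gbps, snr_db, table=RATE_TABLE):
    """Return (new_gbps, action) for one link given its current SNR.

    The link moves to the highest rate the SNR supports: up when SNR is high,
    down when SNR is low, and fully down only if no rate is sustainable.
    """
    feasible = max((cap for thr, cap in table if snr_db >= thr), default=0)
    if feasible == 0:
        return 0, "down"
    if feasible > current_gbps:
        return feasible, "upgrade"
    if feasible < current_gbps:
        return feasible, "downgrade"
    return current_gbps, "keep"
```

With these placeholder thresholds, a 100 Gbps link whose SNR drops to 3.0 dB would be downgraded to 50 Gbps instead of failing outright, which is the availability gain the slide describes.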
Talk Outline: 1. How inefficient are optical backbones? 2. Dynamic capacity links in WANs 3. Challenges in dynamically adapting link capacities 4. Rate Adaptive WANs
Challenges in dynamically adapting link capacities: 1. Requires hardware support for capacity reconfiguration 2. Requires re-thinking IP-layer traffic engineering
Key question: Can we use commodity hardware for changing link capacities? • Bandwidth Variable Transceiver (BVT) • Supports higher-order modulations (QPSK, 8-QAM, 16-QAM) • Link capacity of 100G, 150G, 200G • Arista 7504 linecards
Challenge 1: Adapting capacity on commodity hardware. [Figure: testbed with an Arista 7504 chassis (ports Ethernet 3/1/1 and Ethernet 4/1/1) and a variable optical attenuator; as noise from the attenuator increases, capacity is downgraded from 200G to 150G and then to 100G, with a "Link Down" period at each downgrade.] It takes over 1 minute to change capacity, resulting in link downtime.
Problem: Commodity hardware is not optimized for dynamically adapting link capacity.
Question: What causes the latency of capacity reconfiguration? [Figure: reconfiguration timeline. Link usable (laser is on), then turn off laser, program registers, turn laser on, then link usable again; the link is not usable for about 1 minute.] The majority of the time is spent turning the laser on.
Question: Can we reduce the latency of capacity reconfiguration by not turning off the laser?
Question: Can we reduce the latency of capacity reconfiguration by not turning off the laser? • Program registers for modulation change • Do not turn off the laser in the evaluation board • Repeat the experiment 200 times • If the laser is left on, changing capacity causes an outage of only 35 ms. [Figure: Acacia BVT evaluation board.]
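To put the two measurements side by side, the arithmetic can be written out as below. The constants come from the slides ("over 1 minute" is treated as a 60-second lower bound, and 35 ms is the laser-on result); the names and the helper function are illustrative only.

```python
# Outage per capacity change, in seconds.
LASER_CYCLE_OUTAGE_S = 60.0  # slides say "over 1 minute", so this is a lower bound
HITLESS_OUTAGE_S = 0.035     # slides report 35 ms when the laser is left on


def total_outage_s(num_changes, per_change_s):
    """Total link downtime caused by num_changes capacity reconfigurations."""
    return num_changes * per_change_s


# Ratio between the two approaches (at least this large, given the lower bound).
speedup = LASER_CYCLE_OUTAGE_S / HITLESS_OUTAGE_S
```

Under these numbers, leaving the laser on shortens each outage by a factor of roughly 1,700, so ten reconfigurations cost about 0.35 seconds of downtime rather than at least ten minutes.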
Key question: How should traffic engineering incorporate dynamic capacity links?
Question: How should traffic engineering incorporate dynamic capacity links? • Capacity changes cause links to be unavailable for carrying traffic. • Capacity changes lead to network churn and can be disruptive.
Talk Outline: 1. How inefficient are optical backbones? 2. Dynamic capacity links in WANs 3. Challenges in dynamically adapting link capacities 4. Rate Adaptive WANs
Solution: We design the Rate Adaptive Wide Area Network (RADWAN) traffic engineering controller. • Rate adaptive: adapts link rates to meet demands and improve availability • SNR-aware: knows the possible capacity gain of each link • Minimally disruptive: reconfigures capacity while minimizing network churn
RADWAN Traffic Engineering Formulation: • Inputs: network topology, demand matrix, optical topology and SNR, current flow allocation • Optimization: objective and constraints • Outputs: flow allocations, links to reconfigure
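The slide does not spell out the objective or constraints, so the following is only a toy sketch of the trade-off it names (throughput versus reconfiguration churn), not the RADWAN formulation. Assumptions: a single source-to-sink demand instead of a full demand matrix, directed links, brute-force enumeration in place of an optimizer, and a hypothetical `churn_penalty` expressed in Gbps per reconfigured link. All names and numbers are illustrative.

```python
from collections import deque
from itertools import combinations


def max_flow(capacity, source, sink):
    """Edmonds-Karp max flow; capacity maps directed edges (u, v) to Gbps."""
    residual = {}
    for (u, v), cap in capacity.items():
        residual.setdefault(u, {}).setdefault(v, 0)
        residual[u][v] += cap
        residual.setdefault(v, {}).setdefault(u, 0)  # reverse edge
    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual.get(u, {}).items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Find the bottleneck along the path, then push flow along it.
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


def plan_upgrades(base, upgraded, source, sink, demand, churn_penalty):
    """Choose links to reconfigure, maximizing throughput - churn_penalty * changes."""
    best_score, best_set, best_throughput = float("-inf"), (), 0
    candidates = sorted(upgraded)
    for k in range(len(candidates) + 1):
        for chosen in combinations(candidates, k):
            caps = dict(base)
            for edge in chosen:
                caps[edge] = upgraded[edge]
            throughput = min(demand, max_flow(caps, source, sink))
            score = throughput - churn_penalty * len(chosen)
            if score > best_score:
                best_score, best_set, best_throughput = score, chosen, throughput
    return set(best_set), best_throughput


# Toy inputs: current capacities, and the rate each link's SNR would allow.
BASE = {("A", "B"): 100, ("B", "D"): 100, ("A", "C"): 100, ("C", "D"): 100}
UPGRADED = {("A", "B"): 200, ("B", "D"): 200, ("A", "C"): 150}
```

Calling `plan_upgrades(BASE, UPGRADED, "A", "D", demand=350, churn_penalty=10)` selects only the two links on the A-B-D path, since upgrading A-C alone cannot raise throughput past the 100 Gbps C-D link. With a much larger penalty, the sketch leaves every link untouched, which mirrors the "minimally disruptive" goal.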
Proof of concept: RADWAN. [Figure: testbed topology with routers A, B, C, and D and amplifiers, over fiber spans of 365 km, 375 km, 390 km, and 410 km.]
Throughput Gains with RADWAN: [Figure: network throughput (Gbps) for RADWAN-hitless, RADWAN, SWAN-150, and SWAN [SIGCOMM ’13].] RADWAN has 40% higher network throughput than SWAN.
Conclusion: • The physical layer today is configured statically • We show that this leaves money on the table in terms of network capacity, link availability, and equipment cost ($/Gbps) • RADWAN introduces programmability in Layer 1 • RADWAN improves network throughput by 40%, reduces link downtime by a factor of 18, and reduces equipment cost ($/Gbps) by 32%