CS244 Advanced Topics in Networking Lecture 6: Switching Nick - PowerPoint PPT Presentation

CS244 Advanced Topics in Networking Lecture 6: Switching Nick McKeown “High-speed switch scheduling for local-area networks” [Tom Anderson, Susan Owicki, James Saxe, Chuck Thacker. 1993] Spring 2020

Context
Tom Anderson. At the time: DEC SRC (Palo Alto). Previously: UC Berkeley, EECS. Professor of CS, University of Washington.
James B. Saxe. At the time: DEC SRC (Palo Alto). After that: Compaq and HP Labs.
Susan Owicki. At the time: DEC SRC (Palo Alto). Before that: Prof of EE & CS, Stanford. Today: Marriage and Family Therapist, Palo Alto.
Chuck Thacker (d. 2017). At the time: DEC SRC (Palo Alto). Before that: Xerox PARC (“Alto”). After that: Microsoft. 2010 Turing Award Winner.
At the time the paper was written…
• WWW was new, and Internet traffic was growing fast
• Fastest Ethernet networks ran at 100Mb/s
• Lots of interest in building faster switches and routers
• Lively debate about an alternative to the Internet, called “ATM”

But first…

A few words about packet queues…
R = line rate, e.g. 100 Mb/s, 10 Gb/s.
[Figure: a packet buffer drained at rate R, with arrivals labeled $\lambda R$ (one line) and $(\lambda/2) R$ (each of two lines).]
Q: For any “load” $\lambda \le 1$, what arrival pattern leads to the most customers in the queue?
[Figure: cumulative bits versus time, showing cumulative arrivals A(t) and queue occupancy q(t). With one line the gradient of A(t) is ≤ R; with two lines it is ≤ 2R.]
Observation: The arrival rate is “bounded” by R on average.
Observation: With one arrival “line” at the same rate, the queue is always empty (or holds at most one store-and-forward packet). The arrival process is “bounded” by R.

Different cases for $\lambda = 1$
[Figure: three arrival patterns for line 1 and line 2. Cases 1 and 2 are plotted over seconds (0.5 s to 2 s); case 3 is plotted over hours (1 hr to 4 hr).]
Q: How big does the buffer need to be?
Observation: For a given arrival rate, in order to know the queueing delay, we need to know the pattern (or “process”) of arrivals.
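The observation that the arrival pattern, and not only the average rate, determines the backlog can be checked with a small slotted-queue sketch. The two patterns below are illustrative assumptions rather than the cases drawn on the slide: both deliver an average of one packet per slot to a buffer that drains one packet per slot, but one is smooth and the other sends everything in the first half of the run.

```python
def max_backlog(arrivals, service_per_slot=1):
    """Peak queue occupancy (in packets) of a slotted queue drained at a fixed rate."""
    backlog = 0
    peak = 0
    for arrived in arrivals:
        # Packets arriving in a slot can be served in that same slot.
        backlog = max(0, backlog + arrived - service_per_slot)
        peak = max(peak, backlog)
    return peak


SLOTS = 1000

# Smooth: the two lines alternate, so exactly one packet arrives per slot.
smooth = [1] * SLOTS

# Bursty: both lines send at full rate for half the time, then stay idle.
bursty = [2] * (SLOTS // 2) + [0] * (SLOTS // 2)

if __name__ == "__main__":
    print("average load:", sum(smooth) / SLOTS, sum(bursty) / SLOTS)
    print("peak backlog, smooth:", max_backlog(smooth))
    print("peak backlog, bursty:", max_backlog(bursty))
```

Both patterns have the same load, yet the bursty one needs a buffer that grows with the length of the burst, which is why the timescale of the pattern matters for buffer sizing.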

Background
[Figure: a switch with N input ports and N output ports, each running at rate R.]
A switch, or router, with N “ports”. Each port runs at rate R b/s. We say the “switching capacity” is N x R b/s.

An output-queued (OQ) switch
[Figure: an OQ switch with N ports, each running at rate R.]
Properties of an OQ switch
• All buffering takes place at the output.
• Output queues must be able to write packets at rate N x R.
Consequences
• “Work conserving”: Whenever there is a packet in the system, its output is busy sending a packet. No unnecessary idling.
• Average delay is minimized.
• But memory bandwidth limits the switching capacity.

Traffic Matrix
Traffic matrix $\Lambda = [\lambda_{i,j}]$, where $\lambda_{i,j}$ is the fraction of traffic from input $i$ to output $j$.
[Figure: an N-port switch at rate R, with an example 4 x 4 traffic matrix $\Lambda$.]
Note that the row (input) sum: $\sum_j \lambda_{i,j} \le 1, \forall i$
Non-oversubscribed TM: total traffic rate to each output is ≤ 1, i.e. $\sum_i \lambda_{i,j} \le 1, \forall j$, and still $\sum_j \lambda_{i,j} \le 1, \forall i$.
Uniform Traffic Matrix: $\Lambda = \lambda \cdot \mathbf{1}_{N \times N}$ (every entry equal to $\lambda$), where $\lambda \le 1/N$.
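The two conditions on this slide (every input row sums to at most 1, and every output column sums to at most 1) translate directly into a short check. This is a minimal sketch: the function names and the example matrices are assumptions made for illustration, not values taken from the slide.

```python
def is_non_oversubscribed(tm, tol=1e-9):
    """True if every row (input) sum and every column (output) sum is <= 1."""
    rows_ok = all(sum(row) <= 1 + tol for row in tm)
    cols_ok = all(sum(col) <= 1 + tol for col in zip(*tm))
    return rows_ok and cols_ok


def uniform_tm(n_ports, entry):
    """N x N matrix with every entry equal to `entry` (the slide requires entry <= 1/N)."""
    return [[entry] * n_ports for _ in range(n_ports)]


if __name__ == "__main__":
    print(is_non_oversubscribed(uniform_tm(4, 0.25)))  # entry = 1/N
    print(is_non_oversubscribed(uniform_tm(4, 0.30)))  # entry > 1/N
    # Hypothetical hot spot: both inputs send all their traffic to output 0.
    print(is_non_oversubscribed([[1.0, 0.0], [1.0, 0.0]]))
```

The hot-spot case satisfies the row constraint but not the column constraint, which is exactly the situation the non-oversubscribed condition rules out.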

OQ Switches and “100% Throughput”
If we send traffic according to any non-oversubscribed traffic matrix to an OQ switch (with infinite buffers), then the output rates correspond to the column sums, i.e. the traffic rate at output $j$ is $R \sum_i \lambda_{i,j} \le R$.
Put another way, an OQ switch can “keep up” with any reasonable traffic matrix we throw at it. We often say an OQ switch can “sustain 100% throughput”.
Q: What happens if the buffers are finite?

An input-queued (IQ) switch
[Figure: an IQ switch with N ports, each running at rate R.]
Properties of an IQ switch
• All buffering takes place at the input.
• Input queues only need to be able to write packets at rate R (instead of N x R).
Consequences
• Can build a switch N times faster.
• But a packet can be held up by a packet ahead of it that is destined to a different output.
• Hence an IQ switch is not “work conserving”. It can unnecessarily idle.
• May not achieve “100% throughput”.
• Average delay is not minimized.

Head of Line Blocking

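Head of Line Blocking
IQ switch with uniform traffic matrix, $\lambda \le 1$.
Observation: HOL blocking means we lose 42% of the switching capacity.
• IQ switch, Poisson arrivals: $\lambda \le 2 - \sqrt{2} \approx 58\%$ (Karol ‘87)
• OQ switch, Poisson arrivals: $E(d) = \frac{2 - \lambda}{2(1 - \lambda)}$
[Figure: delay $d$ versus load $\lambda$ for the OQ switch and the IQ switch. Delay axis ticks at 1, 3/2 and 5/2; load axis ticks at 0, 0.5, 0.58, 0.75 and 1.]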
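The two Poisson-arrival expressions on this slide can be evaluated numerically. The sketch below assumes the readings $E(d) = (2 - \lambda) / (2(1 - \lambda))$ for the OQ switch and $2 - \sqrt{2}$ for the IQ saturation load, reconstructed from the slide; the function and constant names are made up for illustration.

```python
import math


def oq_mean_delay(load):
    """Mean delay E(d) of the OQ switch, in packet times, for a load in [0, 1)."""
    if not 0 <= load < 1:
        raise ValueError("load must satisfy 0 <= load < 1")
    return (2 - load) / (2 * (1 - load))


# Saturation load of a FIFO input-queued switch under uniform traffic (Karol '87).
HOL_SATURATION_LOAD = 2 - math.sqrt(2)

if __name__ == "__main__":
    for load in (0.0, 0.5, 0.75):
        print(f"load {load}: E(d) = {oq_mean_delay(load)}")
    print(f"HOL saturation load = {HOL_SATURATION_LOAD:.4f}")
```

The values at loads 0.5 and 0.75 come out to 3/2 and 5/2, which matches the tick marks on the delay axis of the plot.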

What does the “58%” result mean?
[Figure: departure rate $\mu$ versus arrival rate $\lambda$, with $\lambda, \mu \le 1$, for an N-port switch at rate R. One panel shows an OQ switch; the other shows an IQ switch with uniform TM and Poisson arrivals, marked at 0.58.]
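One way to see where a figure near 58% comes from is to simulate a saturated FIFO input-queued switch. The sketch below is an illustrative model, not the analysis behind the slide: every input always has a head-of-line packet with a uniformly random destination, each contended output serves one randomly chosen input per slot, and the losers stay blocked. The function name and parameters are assumptions.

```python
import random


def hol_saturation_throughput(n_ports, n_slots, seed=0):
    """Fraction of output capacity used by a saturated FIFO input-queued switch."""
    rng = random.Random(seed)
    # hol[i] is the destination of the packet at the head of input i's FIFO.
    hol = [rng.randrange(n_ports) for _ in range(n_ports)]
    delivered = 0
    for _ in range(n_slots):
        # Group inputs by the output their head-of-line packet wants.
        contenders = {}
        for inp, out in enumerate(hol):
            contenders.setdefault(out, []).append(inp)
        # Each requested output serves one contender; the others remain blocked.
        for inputs in contenders.values():
            winner = rng.choice(inputs)
            hol[winner] = rng.randrange(n_ports)  # next packet moves to the head
            delivered += 1
    return delivered / (n_ports * n_slots)


if __name__ == "__main__":
    for n in (2, 16, 64):
        print(n, round(hol_saturation_throughput(n, 2000), 3))
```

For small switches the measured throughput is higher (0.75 for two ports), and it falls towards $2 - \sqrt{2} \approx 0.586$ as the port count grows.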

Virtual Output Queues (VOQs)

Basic idea
With a VOQ, a packet cannot be held up by a packet in front of it, destined to a different output.
Q: With VOQs, does/can 58% become 100% throughput?
[Figure: departure rate versus arrival rate $\lambda$. IQ switch (uniform TM, Poisson): 0.58. IQ switch with VOQs (any TM, any arrivals): ?]

100% Throughput
Reminder: “100% throughput” is equivalent to: for a non-oversubscribing traffic matrix, queues don’t grow without bound, i.e. $\mu \ge \lambda$ for every queue in the system.
Observations:
1. Burstiness of arrivals does not affect throughput.
2. For a uniform traffic matrix, the solution is trivial!

An input-queued (IQ) switch with VOQs and a crossbar
[Figure: N inputs at rate R, holding $N^2$ VOQs in total, connected through a crossbar to N outputs at rate R.]
Observation: scheduling is equivalent to choosing a permutation.

[Figure: the $N^2$ VOQs define a bipartite request graph; the scheduler chooses a bipartite match (e.g. a “maximum size match”), which configures the crossbar.]

Crossbar schedule
Fixed cycle of permutations: [Figure: the crossbar stepping through a fixed sequence of configurations.]
Uniform TM: each VOQ has arrival rate $(\lambda/N) R$. Schedule: each VOQ is served at rate $(1/N) R$.
$\lambda \le 1$, therefore arrival rate ≤ departure rate. True for all VOQs, therefore 100% throughput for uniform TM.

100% throughput for uniform traffic
Four (trivial) algorithms for a uniform traffic matrix:
1. Cycle through permutations in “round-robin” (i.e. previous slide).
2. Each time, randomly pick one of the permutations in (1).
3. Each time, pick a permutation uniformly and at random from all possible N! permutations.
4. Wait until all VOQs are non-empty, then pick any algorithm above.
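Algorithm 1 can be sketched in a few lines. The particular cycle used here (in slot t, input i connects to output (i + t) mod N) is an assumption chosen for illustration; any cycle of N permutations that covers every input-output pair exactly once would do.

```python
def round_robin_schedule(n_ports):
    """N crossbar configurations; in slot t, input i connects to output (i + t) % N."""
    return [[(i + t) % n_ports for i in range(n_ports)] for t in range(n_ports)]


def service_counts(schedule):
    """counts[i][j] = number of slots in one cycle in which VOQ (i, j) is served."""
    n_ports = len(schedule[0])
    counts = [[0] * n_ports for _ in range(n_ports)]
    for permutation in schedule:
        for inp, out in enumerate(permutation):
            counts[inp][out] += 1
    return counts


if __name__ == "__main__":
    schedule = round_robin_schedule(4)
    for t, permutation in enumerate(schedule):
        print(f"slot {t}: input -> output {permutation}")
    print(service_counts(schedule))
```

Every VOQ is served exactly once per N slots, i.e. at rate R/N, which is at least the per-VOQ arrival rate under a non-oversubscribed uniform traffic matrix.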

Quick recap so far

An input-queued (IQ) switch
[Figure: an IQ switch with N ports, each running at rate R.]
Properties of an IQ switch
• All buffering takes place at the input.
• Input queues only need to be able to write packets at rate R (instead of N x R).
Consequences
• Can build a switch N times faster.
• HOL Blocking: a packet can be held up by a packet ahead of it that is destined to a different output.
• Hence an IQ switch is not “work conserving”. It can unnecessarily idle.
• May not achieve “100% throughput”.
• Average delay is not minimized.

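Head of Line Blocking
IQ switch with uniform traffic matrix, $\lambda \le 1$.
Observation: HOL blocking means we lose 42% of the switching capacity.
• IQ switch, Poisson arrivals: $\lambda \le 2 - \sqrt{2} \approx 58\%$ (Karol ‘87)
• OQ switch, Poisson arrivals: $E(d) = \frac{2 - \lambda}{2(1 - \lambda)}$
[Figure: delay $d$ versus load $\lambda$ for the OQ switch and the IQ switch. Delay axis ticks at 1, 3/2 and 5/2; load axis ticks at 0, 0.5, 0.58, 0.75 and 1.]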

100% throughput easy for uniform traffic
Four (trivial) algorithms for a uniform traffic matrix:
1. Cycle through permutations in “round-robin”.
2. Each time, randomly pick one of the permutations in (1).
3. Each time, pick a permutation uniformly and at random from all possible N! permutations.
4. Wait until all VOQs are non-empty, then pick any algorithm above.

Q: So why did the authors need Parallel Iterative Matching (PIM)? Because in practice, arrivals are not uniform. (If we know the traffic matrix, we can still create a cycle of permutations that serves every VOQ at the rate in the traffic matrix.) In practice we don’t know the traffic matrix. Hence, PIM.

Parallel Iterative Matching
[Figure: PIM on a 4 x 4 switch. Iteration 1 runs three phases: Request, Grant (uniform-at-random selection), Accept (uniform-at-random selection).]
Q: Are we done? Q: Is a larger match possible?
[Figure: Iteration 2 repeats the three phases, ending in a maximal bipartite match.]
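The request, grant and accept phases can be sketched as a sequential simulation of what the hardware does in parallel. This is a minimal illustration under stated assumptions, not the authors' implementation: `requests[i]` is a hypothetical encoding of the set of outputs for which input i has a non-empty VOQ, and the random choices come from a seeded Python generator.

```python
import random


def pim(requests, max_iterations=None, seed=0):
    """Run Parallel Iterative Matching; return (input -> output match, iterations used)."""
    rng = random.Random(seed)
    n_ports = len(requests)
    limit = n_ports if max_iterations is None else max_iterations
    match = {}               # matched input -> output
    matched_outputs = set()
    iterations = 0
    while iterations < limit:
        # Request: every unmatched input asks every unmatched output it has packets for.
        incoming = {}
        for inp in range(n_ports):
            if inp in match:
                continue
            for out in requests[inp]:
                if out not in matched_outputs:
                    incoming.setdefault(out, []).append(inp)
        if not incoming:
            break  # nothing left to add: the match is maximal
        iterations += 1
        # Grant: each requested output picks one requesting input uniformly at random.
        grants = {}
        for out, inputs in sorted(incoming.items()):
            grants.setdefault(rng.choice(inputs), []).append(out)
        # Accept: each input that received grants picks one uniformly at random.
        for inp, outs in sorted(grants.items()):
            out = rng.choice(outs)
            match[inp] = out
            matched_outputs.add(out)
    return match, iterations


if __name__ == "__main__":
    demo_requests = [{0, 1}, {0, 2}, {0}, {3}]  # hypothetical 4 x 4 request pattern
    print(pim(demo_requests))
    print(pim(demo_requests, max_iterations=1))
```

Each iteration that sees at least one request adds at least one pair to the match, so the loop ends within N iterations. Passing `max_iterations=1` gives the single-iteration variant, which may stop before the match is maximal.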

PIM Properties
1. Inputs and outputs make decisions independently and in parallel.
2. Guaranteed to find a maximal match in at most N iterations.
3. Typically completes in far fewer than N iterations.
Q: How large is a maximal match compared to a maximum match?
A maximal match is guaranteed to be at least half the cardinality (size) of a maximum match.
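The gap between a maximal and a maximum match shows up even on a two-port example. The sketch below is illustrative only: the greedy routine stands in for any procedure that stops at a maximal match, the maximum match size is found with a standard augmenting-path search, and the request pattern is made up.

```python
def greedy_maximal_match(requests):
    """Build a maximal match by giving each input its lowest-numbered free output."""
    match, used = {}, set()
    for inp, outs in enumerate(requests):
        for out in sorted(outs):
            if out not in used:
                match[inp] = out
                used.add(out)
                break
    return match


def maximum_match_size(requests):
    """Size of a maximum bipartite match, found with augmenting paths."""
    owner = {}  # output -> input currently matched to it

    def try_assign(inp, seen):
        for out in requests[inp]:
            if out in seen:
                continue
            seen.add(out)
            # Take a free output, or re-route the input that currently holds it.
            if out not in owner or try_assign(owner[out], seen):
                owner[out] = inp
                return True
        return False

    return sum(try_assign(inp, set()) for inp in range(len(requests)))


if __name__ == "__main__":
    demo_requests = [{0, 1}, {0}]  # input 0 wants outputs 0 and 1; input 1 wants output 0
    print(greedy_maximal_match(demo_requests))
    print(maximum_match_size(demo_requests))
```

Here the greedy choice pairs input 0 with output 0, which blocks input 1 and leaves a maximal match of size 1, while the maximum match has size 2. That is the worst case the factor-of-two guarantee allows.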

Parallel Iterative Matching
[Figure: simulation of a 16-port switch with a uniform traffic matrix (note log scale). Curves: IQ + FIFO, VOQ + Maximum Size Match, Output Queued.]

Parallel Iterative Matching
[Figure: simulation of a 16-port switch with a uniform traffic matrix. Curves: IQ + FIFO, PIM with one iteration, VOQ + Maximum Size Match, Output Queued.]
