CompSci514/ECE558: Computer Networks Lecture 17: Programmable Switches Xiaowei Yang xwy@cs.duke.edu http://www.cs.duke.edu/~xwy Some slides are adapted from Prof Nick McKeown’s lecture slides
Overview • The Trio of modern networking – SDN – NFV – Programmable switches
Network Functions Virtualisation
Software Defined Network (SDN) [Figure: control programs run on top of a global network map provided by the control plane, which controls the packet forwarding elements]
Motivation • Network changes fast • Need to extend the forwarding plane
History of Programmable Routers • Mini-computer based routers (1969–1990) • Active networks (mid-1990s) • Software routers (1999–present) – Click, RouteBricks, PacketShader • Software Defined Networking (2004–present)
Open Compute Project: Wedge switch (open-source design from Facebook). Switch chip: state of the art is about 3.2 Tb/s (32 x 100GE).
2014: The bare-metal switch [Figure: software stack with three "Feature Code" blocks on top of the control plane, running on Linux]
Now I can tailor my network to meet my needs! I can: 1. Quickly deploy new protocols. 2. See what my forwarding plane is doing. 3. Put expensive middlebox functions into the network. 4. Try out beautiful new ideas. 5. Differentiate. Now I own my intellectual property.
“Beautiful ideas” 1. Deploy new protocols and new headers 2. Simplify the data plane. Throw out unused protocols. 3. Reallocate resources in switches: tables, packet buffers, etc. 4. Add new telemetry for debugging and diagnostics 5. Verify network behavior 6. Embed “middlebox” functions into the network: load-balancing, gateways and firewalls. 7. In-network congestion control 8. New routing and reliability algorithms 9. …
Tailoring my network switches today [Figure: the same software stack with a "New Feature Code" block added next to the existing feature code, on top of the control plane, Linux and the driver]
Can a CPU forward all my packets?
Packet Forwarding Speeds [Plot: forwarding speed in Gb/s per chip (log scale, 0.1 to 100000) against year (1990 to 2020); the switch chip curve reaches 3.2 Tb/s]
Packet Forwarding Speeds [Plot: forwarding speed in Gb/s per chip (log scale, 0.1 to 100000) against year (1990 to 2020), for switch chips and CPUs; the switch chip curve reaches 3.2 Tb/s and is about 50x above the CPU curve]
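To put the 3.2 Tb/s figure in per-packet terms, here is a rough back-of-the-envelope calculation. It assumes worst-case minimum-size Ethernet frames (64 bytes plus 20 bytes of preamble and inter-frame gap on the wire) and reads the "50x" label on the plot as the gap at the 3.2 Tb/s point; both are assumptions of this sketch, not numbers stated on the slides.

```python
# Back-of-the-envelope packet rate for a 3.2 Tb/s switch chip.
LINK_RATE_BPS = 3.2e12       # 3.2 Tb/s, from the slide
MIN_FRAME_BYTES = 64         # minimum Ethernet frame (assumed worst case)
WIRE_OVERHEAD_BYTES = 20     # 8-byte preamble/SFD + 12-byte inter-frame gap


def packets_per_second(rate_bps, frame_bytes=MIN_FRAME_BYTES):
    """Packets per second at line rate for a given frame size."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return rate_bps / bits_on_wire


pps = packets_per_second(LINK_RATE_BPS)
budget_ns = 1e9 / pps                 # time available per packet
cpu_rate_bps = LINK_RATE_BPS / 50     # assumed reading of the "50x" gap

print(f"{pps / 1e9:.2f} Gpps, {budget_ns:.2f} ns per packet")
print(f"CPU at 1/50 of the chip: {cpu_rate_bps / 1e9:.0f} Gb/s")
```

Under these assumptions the chip must handle several billion packets per second, which leaves well under a nanosecond per packet.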
Conventional Wisdom: “Programmable devices are 10-100x slower. They consume much more power and area.”
Wedge: whitebox CPU, blackbox switch
My whitebox switch has a blackbox switch inside
Fixed-Function Switch Chips [Figure: a packet passes through a parser, then fixed L2, L3 (IPv4, IPv6) and ACL stages, then the queues]
Domain Specific Processors [Figure: graphics: my renderer (application), compiler, GPU; signal processing: my codec (application), compiler, DSP]
Conventional wisdom said: programmability is too expensive. Then, someone identified: 1. The right model for data-parallelism 2. Basic underlying processing primitives. Domain-specific processors were then built, along with domain-specific languages, compilers and tool-chains.
Control Flow Graph and Switch Pipeline [Figure: the control flow graph (L2, then v4 or v6, then ACL) next to the fixed switch pipeline: parser, L2 table, IPv4 table, IPv6 table and ACL table, each with a fixed action, then queues]
Fixed-Function Switch Chips Are Limited 1. Can’t add new forwarding functionality
Fixed-Function Switch Chips [Figure: a new MyEncap node is added to the control flow graph (L2, v4/v6, ACL), while the fixed switch pipeline (parser, L2, IPv4, IPv6 and ACL tables with fixed actions, queues) stays the same]
Fixed-Function Switch Chips Are Limited 1. Can’t add new forwarding functionality 2. Can’t move resources between functions [Figure: the fixed pipeline with its L2, IPv4, IPv6 and ACL tables and fixed actions]
Programmable Switch Chips [Figure: the control flow graph (L2, v4/v6, ACL) next to a switch pipeline in which the fixed L2, IPv4, IPv6 and ACL tables and their fixed actions become generic match tables and action macros, between the parser and the queues]
Mapping Control Flow to Programmable Switch Chip [Figure: each control flow node (L2, v4, v6, ACL) is mapped to a match table and an action macro in the switch pipeline]
RMT: Reconfigurable Match + Action (now more commonly called “PISA”)
PISA: Protocol Independent Switch Architecture [Figure: a programmable parser followed by match+action stages, each with memory and ALUs]
[Figure: programmable parser and match+action stages]
P4 Programming [Figure: P4 code passes through a compiler that configures the programmable parser and the match+action stages (memory, ALUs)]
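The match+action half of this architecture can be sketched as a list of stages, each of which looks up one packet field in a table and runs the action it finds. This is a toy model under stated assumptions: the packet is an already-parsed dict of fields, matching is exact-match only, and all names (`Stage`, `run_pipeline`, `set_port`, the field names) are hypothetical rather than taken from any real switch API.

```python
def set_port(port):
    """Build an action that sets the egress port (hypothetical action)."""
    def action(pkt):
        pkt["egress_port"] = port
    return action


def drop(pkt):
    """Mark the packet as dropped."""
    pkt["drop"] = True


def no_op(pkt):
    """Default action on a table miss."""


class Stage:
    """One match+action stage: exact match on a single field."""

    def __init__(self, key_field, entries, default=no_op):
        self.key_field = key_field
        self.entries = entries      # match value -> action
        self.default = default

    def process(self, pkt):
        action = self.entries.get(pkt.get(self.key_field), self.default)
        action(pkt)


def run_pipeline(stages, pkt):
    """Send one parsed packet through the stages in order."""
    for stage in stages:
        if pkt.get("drop"):
            break
        stage.process(pkt)
    return pkt


l2 = Stage("eth_dst", {0xAA: set_port(1), 0xBB: set_port(2)})
acl = Stage("tcp_dport", {23: drop})

forwarded = run_pipeline([l2, acl], {"eth_dst": 0xBB, "tcp_dport": 80})
blocked = run_pipeline([l2, acl], {"eth_dst": 0xAA, "tcp_dport": 23})
print(forwarded, blocked)
```

The stages themselves know nothing about L2 or ACLs; the behavior comes entirely from the key field and table entries loaded into them, which is the sense in which the hardware is protocol independent.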
P4 (http://p4.org/)
[Figure: a P4 program has three parts (parser, match tables with actions, control flow graph) that map onto the switch pipeline: parser, then match tables and action macros for L2, IPv4, IPv6 and ACL, then queues]

Parser:
    parser parse_ethernet {
        extract(ethernet);
        select(latest.etherType) {
            0x800 : parse_ipv4;
            0x86DD : parse_ipv6;
        }
    }

Match tables and actions:
    table ipv4_lpm {
        reads {
            ipv4.dstAddr : lpm;
        }
        actions {
            set_next_hop;
            drop;
        }
    }

Control flow graph:
    control ingress {
        apply(l2_table);
        if (valid(ipv4)) {
            apply(ipv4_table);
        }
        if (valid(ipv6)) {
            apply(ipv6_table);
        }
        apply(acl);
    }
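The `lpm` match kind on `ipv4.dstAddr` means longest-prefix match: among all entries whose prefix contains the destination address, the most specific one wins. A minimal Python sketch of that lookup follows. The `LpmTable` class, the example prefixes and the `port` parameter are hypothetical, and treating a table miss as `drop` is an assumption of this sketch; a real switch would typically use TCAM or a trie instead of a linear scan.

```python
import ipaddress


class LpmTable:
    """Toy longest-prefix-match table (linear scan, for illustration)."""

    def __init__(self):
        self.entries = []   # (network, action name, action parameters)

    def add(self, prefix, action, **params):
        self.entries.append((ipaddress.ip_network(prefix), action, params))

    def lookup(self, dst):
        addr = ipaddress.ip_address(dst)
        matches = [e for e in self.entries if addr in e[0]]
        if not matches:
            return ("drop", {})   # assumed default action on a miss
        _, action, params = max(matches, key=lambda e: e[0].prefixlen)
        return (action, params)


ipv4_lpm = LpmTable()
ipv4_lpm.add("10.0.0.0/8", "set_next_hop", port=1)
ipv4_lpm.add("10.1.0.0/16", "set_next_hop", port=2)

specific = ipv4_lpm.lookup("10.1.2.3")     # matches both; the /16 wins
general = ipv4_lpm.lookup("10.9.9.9")      # matches only the /8
miss = ipv4_lpm.lookup("192.168.0.1")      # matches nothing
print(specific, general, miss)
```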
Question: How can we exploit the parallelism within each stage?
Naïve Mapping: Control Flow Graph [Figure: each control flow node (L2, v4, v6, ACL) gets its own pipeline stage with a match table and action macro, in sequence between the parser and the queues]
Table Dependency Graph (TDG) [Figure: the control flow graph (L2, v4/v6, ACL) and the table dependency graph derived from it over the same four tables]
Efficient Mapping: TDG [Figure: the control flow graph and table dependency graph (L2, v4/v6, ACL) next to a shorter switch pipeline in which the IPv4 and IPv6 tables appear to share a stage between the L2 table and the ACL table]
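One simple way to turn a table dependency graph into a stage assignment is to place each table one stage after the latest table it depends on, so independent tables land in the same stage. The sketch below assumes the dependency edges L2 → IPv4, L2 → IPv6, and IPv4/IPv6 → ACL, treats every edge as forcing a later stage, assumes the graph is acyclic, and ignores per-stage resource limits. The function name `assign_stages` is hypothetical; this is an illustration, not the algorithm an actual P4 compiler uses.

```python
def assign_stages(deps):
    """Map each table to the earliest stage allowed by its dependencies.

    deps maps a table name to the set of tables it depends on.
    Assumes the dependency graph has no cycles.
    """
    stage = {}

    def place(table):
        if table not in stage:
            # One stage after the latest dependency; stage 0 if none.
            stage[table] = 1 + max((place(d) for d in deps[table]), default=-1)
        return stage[table]

    for table in deps:
        place(table)
    return stage


# Assumed dependency edges for the four-table example.
tdg = {
    "L2": set(),
    "IPv4": {"L2"},
    "IPv6": {"L2"},
    "ACL": {"IPv4", "IPv6"},
}

stages = assign_stages(tdg)
num_stages = max(stages.values()) + 1
print(stages, num_stages)
```

With these edges the four tables fit in three stages instead of the four used by the naïve one-table-per-stage mapping.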
Resource constraints [Figure: the control flow graph (L2, v4/v6, ACL) mapped onto a switch pipeline whose stages have limited table and action resources]
Step 1: P4 Program. Step 2: Control Flow Graph. Step 3: Table Dependency Graph. Step 4: Table Configuration.
RMT Switch [Figure: an RMT pipeline with 32 stages]
Example: A Typical TDG and its Configuration for RMT [Figure: table dependency graph of a typical switch program; tables include IPv4 and IPv6 unicast host and LPM lookup, ECMP, next hop, multicast and uRPF, plus ingress (IG) and egress (EG) ACL, MAC and port-property tables]
Area Comparison with Fixed Function Switches
Section                        Area (% of chip)   Extra cost
I/O, buffer, queue, CPU, etc   37%                0.0%
Match memory & logic           54.3%              8.0%
VLIW action engine             7.4%               5.5%
Parser + deparser              1.3%               0.7%
Total extra area cost                             14.2%
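As a quick consistency check on the table, the per-section numbers can be summed directly: the area shares should cover the whole chip, and the extra costs should add up to the stated total. The snippet below only re-adds the figures from the table; the variable names are hypothetical.

```python
# (section, area as % of chip, extra cost in %), copied from the table
rows = [
    ("I/O, buffer, queue, CPU, etc", 37.0, 0.0),
    ("Match memory & logic", 54.3, 8.0),
    ("VLIW action engine", 7.4, 5.5),
    ("Parser + deparser", 1.3, 0.7),
]

# Round to one decimal place to avoid floating-point noise.
total_area = round(sum(area for _, area, _ in rows), 1)
total_extra = round(sum(extra for _, _, extra in rows), 1)

print(total_area, total_extra)
```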
Design goals of P4 • Reconfigurability – Packet parsing and processing can be redefined • Protocol independence – The switch is not tied to specific packet formats • Target independence – The programmer does not need to know the details of the underlying switch hardware
The abstract forwarding model
P4 Concepts • Headers • Parsers • Tables • Actions
An example P4 program
The packet parser
Table specification
Action specifications
The control program • Implements a control diagram
The control program
P4 primitive actions • set_field: Set a specific field in a header to a value. • copy_field: Copy one field to another. • add_header: Set a specific header instance (and all its fields) as valid. • remove_header: Delete (“pop”) a header (and all its fields) from a packet. • increment: Increment or decrement the value in a field. • checksum: Calculate a checksum over some set of header fields (e.g., an IPv4 checksum).
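To make the primitives concrete, here is a rough Python model of them acting on a packet represented as a dict of headers, each a dict of fields. The representation, the function signatures and the 8-bit default field width are assumptions of this sketch, not P4 semantics; the checksum shown is the standard 16-bit ones' complement sum used for the IPv4 header.

```python
def set_field(pkt, header, field, value):
    pkt[header][field] = value


def copy_field(pkt, dst, src):
    """dst and src are (header, field) pairs."""
    (dst_hdr, dst_fld), (src_hdr, src_fld) = dst, src
    pkt[dst_hdr][dst_fld] = pkt[src_hdr][src_fld]


def add_header(pkt, header, fields):
    """Make a header instance valid, with all fields zeroed."""
    pkt[header] = dict.fromkeys(fields, 0)


def remove_header(pkt, header):
    pkt.pop(header, None)


def increment(pkt, header, field, delta, width_bits=8):
    """Add delta (may be negative), wrapping at the field width."""
    pkt[header][field] = (pkt[header][field] + delta) % (1 << width_bits)


def checksum(words):
    """16-bit ones' complement checksum over a list of 16-bit words."""
    total = sum(words)
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


pkt = {"ethernet": {"dst": 0x1, "src": 0x2}, "ipv4": {"ttl": 64}}
copy_field(pkt, ("ethernet", "src"), ("ethernet", "dst"))
set_field(pkt, "ethernet", "dst", 0xAABBCCDDEEFF)
increment(pkt, "ipv4", "ttl", -1)
add_header(pkt, "vlan", ["vid", "pcp"])
remove_header(pkt, "vlan")

# IPv4 header words with the checksum field set to zero.
ipv4_words = [0x4500, 0x0073, 0x0000, 0x4000, 0x4011,
              0x0000, 0xC0A8, 0x0001, 0xC0A8, 0x00C7]
csum = checksum(ipv4_words)
print(pkt, hex(csum))
```

A compound action such as "set the next hop" would then be a short sequence of these primitives, for example rewriting the Ethernet addresses and decrementing the TTL.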
P4 compiler • Compiles the program into target-specific configurations • Packet parser → state machine • Control program → table dependency graphs
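The "parser → state machine" step can be pictured as follows: each parser state extracts one header and then selects the next state from a field value. The Python sketch below uses state names in the style of the earlier `parse_ethernet` example, but the `accept` terminal state, the `run_parser` driver and the selection of fields kept per header are hypothetical choices for illustration.

```python
import struct


def parse_ethernet(data, offset, headers):
    dst, src, ether_type = struct.unpack_from("!6s6sH", data, offset)
    headers["ethernet"] = {"dst": dst, "src": src, "etherType": ether_type}
    # Select the next state on etherType.
    transitions = {0x0800: "parse_ipv4", 0x86DD: "parse_ipv6"}
    return transitions.get(ether_type, "accept"), offset + 14


def parse_ipv4(data, offset, headers):
    (ver_ihl,) = struct.unpack_from("!B", data, offset)
    ttl, proto = struct.unpack_from("!BB", data, offset + 8)
    src, dst = struct.unpack_from("!4s4s", data, offset + 12)
    headers["ipv4"] = {"ttl": ttl, "protocol": proto, "src": src, "dst": dst}
    # Header length is the low nibble of the first byte, in 32-bit words.
    return "accept", offset + (ver_ihl & 0x0F) * 4


def parse_ipv6(data, offset, headers):
    next_hdr, hop_limit = struct.unpack_from("!BB", data, offset + 6)
    headers["ipv6"] = {"nextHdr": next_hdr, "hopLimit": hop_limit}
    return "accept", offset + 40


STATES = {
    "parse_ethernet": parse_ethernet,
    "parse_ipv4": parse_ipv4,
    "parse_ipv6": parse_ipv6,
}


def run_parser(data, start="parse_ethernet"):
    """Run the state machine until it reaches the terminal state."""
    headers, state, offset = {}, start, 0
    while state != "accept":
        state, offset = STATES[state](data, offset, headers)
    return headers, offset


frame = bytes.fromhex(
    "ffffffffffff" "020000000001" "0800"          # Ethernet header
    "45000073" "00004000" "4011b861"              # IPv4 header
    "c0a80001" "c0a800c7"
)
headers, payload_offset = run_parser(frame)
print(headers, payload_offset)
```

Adding a new protocol in this model means adding one state function and one transition entry, which mirrors how a new header is added to a P4 parser.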
Discussion • Is P4 the right language? • Is match-action the right abstraction?
New idea: Packet Transactions
Summary • Completes our coverage of modern networking – SDN – NFV – Programmable switches • P4 • Match-Action Tables