

1. ServerSwitch: A Programmable and High Performance Platform for Data Center Networks
Guohan Lu, Chuanxiong Guo, Yulong Li, Zhiqiang Zhou†, Tong Yuan, Haitao Wu, Yongqiang Xiong, Rui Gao, Yongguang Zhang
Microsoft Research Asia, †Tsinghua University
NSDI 2011, Boston, USA

2. Motivations
• Lots of research and innovation in DCN
  – PortLand, DCell/BCube, CamCube, VL2, …
  – Topology, routing, congestion control, network services, etc.
• Many DCN designs depart from current practice
  – BCube uses a self-defined packet header for source routing
  – PortLand performs LPM on the destination MAC
  – Quantized Congestion Notification (QCN) requires switches to send explicit congestion notifications
• A platform is needed to prototype existing and future DCN designs

3. Requirements
• Programmable and high-performance packet forwarding engine
  – Wire-speed packet forwarding for various packet sizes
  – Various packet forwarding schemes and formats
• New routing and signaling, flow/congestion control
  – ARP interception (PortLand), adaptive routing (BCube), congestion control (QCN)
• Support new DCN services by enabling in-network packet processing
  – Network cache service (CamCube), switch-assisted reliable multicast (SideCar)

4. Existing Approaches
• Existing switches/routers
  – Usually closed systems with no programming interface
• OpenFlow
  – Mainly focused on the control plane at present
  – Unclear how to support new congestion control mechanisms and in-network data processing
• Software routers
  – Performance not comparable to switching ASICs
• NetFPGA
  – Not a commodity device and difficult to program

5. Technology Trends
• Modern switching chip
  – High switching capacity (640Gbps)
  – Rich protocol support (Ethernet, IP, MPLS)
  – TCAM for advanced packet filtering
• PCI-E interconnect
  – High bandwidth (160Gbps)
  – Low latency (<1us)
• Commodity server
  – Multi-core
  – Multi-10GE packet processing capability

6. Design Goals
• Programmable packet forwarding engine in silicon
  – Leverage the high capacity and programmability of the modern switching chip for packet forwarding
• Low latency software processing for control plane and congestion control messages
  – Leverage the low latency PCI-E interface for latency-sensitive schemes
• Software-based in-network packet processing
  – Leverage the rich programmability and high performance provided by the modern server

7. Architecture
• Hardware
  – Modern switching chip
  – Multi-core CPU
  – PCI-E interconnect
• Software stack
  – C APIs for switching chip management
  – Packet processing in both kernel and user space
[Architecture diagram: apps and the API/library sit in user space; the kernel holds TCP/IP, the ServerSwitch driver, the switching chip (SC) driver, and the NIC driver; the ServerSwitch card, carrying the switching chip, TCAM, Ethernet controllers, external ports, and NIC, attaches to the server over PCI-E]

8. Programmable Packet Forwarding Engine
[Diagram of the BCM56338 forwarding pipeline: fixed paths with no or limited programmability (Eth parser → exact match on DMAC; IP parser → LPM and exact match on DIP, with IP/L2 modifiers; MPLS parser → exact match on label, with MPLS modifier) alongside the highly programmable path (packet classifier → user-defined lookup key (UDLK) parser → TCAM → index table)]
• Destination-based forwarding, e.g., IP, Ethernet
• Tag-based forwarding, e.g., MPLS
• Source-routing-based forwarding, e.g., BCube

9. TCAM Basic
• A TCAM entry marks each bit as cared or non-cared; a lookup key matches an entry when all cared bits agree
[Figure: a six-entry TCAM and an example lookup key (2, A, B, A)]
  Key    Value
  1 A    Value 1
  1 B    Value 2
  2 A    Value 3
  2 B    Value 4
  3 A    Value 5
  3 B    Value 6
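
To make the cared/non-cared matching concrete, here is a minimal C sketch of TCAM semantics (my illustration, not code from the talk): each entry stores a value and a care mask, and the first entry whose cared bytes agree with the key wins.

```c
#include <stdint.h>
#include <stdio.h>

#define KEY_LEN 4  /* illustrative 4-byte lookup key */

/* One TCAM entry: care[j] is 0xFF where byte j must match, 0 otherwise. */
struct tcam_entry {
    uint8_t value[KEY_LEN];
    uint8_t care[KEY_LEN];
    const char *result;
};

/* Return the first (highest-priority) entry whose cared bytes match. */
static const char *tcam_lookup(const struct tcam_entry *t, int n,
                               const uint8_t key[KEY_LEN])
{
    for (int i = 0; i < n; i++) {
        int hit = 1;
        for (int j = 0; j < KEY_LEN; j++) {
            if ((key[j] & t[i].care[j]) != (t[i].value[j] & t[i].care[j])) {
                hit = 0;
                break;
            }
        }
        if (hit)
            return t[i].result;
    }
    return "miss";
}

int main(void)
{
    /* Entries care about bytes 0 and 1 only; bytes 2-3 are wildcards. */
    struct tcam_entry table[] = {
        { {1, 'A'}, {0xFF, 0xFF, 0, 0}, "Value 1" },
        { {1, 'B'}, {0xFF, 0xFF, 0, 0}, "Value 2" },
        { {2, 'A'}, {0xFF, 0xFF, 0, 0}, "Value 3" },
    };
    uint8_t key[KEY_LEN] = {2, 'A', 'B', 'A'};
    printf("%s\n", tcam_lookup(table, 3, key)); /* prints "Value 3" */
    return 0;
}
```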

10. TCAM Based Source Routing
[Figure: an incoming packet carries an index (Idx) and intermediate addresses IA1-IA3; each TCAM entry cares only about Idx and the IA that Idx selects]
  Idx  IA   Output Port
  1    A    1
  1    B    2
  2    A    1
  2    B    2
  3    A    1
  3    B    2
• Example: packet (Idx=2, IA1=A, IA2=B, IA3=A) matches the entry (2, B) and is forwarded on output port 2
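
A hedged sketch of the per-packet decision this enables, reduced to plain C (the hardware performs the match as a single parallel TCAM lookup; the loop below only models its semantics):

```c
#include <stdint.h>
#include <stdio.h>

/* A source-routing TCAM entry cares about two fields of the lookup key:
 * the index byte and the intermediate address (IA) that the index selects. */
struct sr_entry {
    uint8_t idx;   /* which hop of the source route */
    uint8_t ia;    /* neighbor address at that hop */
    int     port;  /* output port toward that neighbor */
};

static int sr_lookup(const struct sr_entry *t, int n,
                     uint8_t idx, const uint8_t ia[3])
{
    for (int i = 0; i < n; i++)
        if (t[i].idx == idx && t[i].ia == ia[idx - 1])
            return t[i].port;
    return -1; /* miss: hand the packet to software */
}

int main(void)
{
    /* A switch with port 1 toward neighbor A and port 2 toward neighbor B. */
    struct sr_entry table[] = {
        {1, 'A', 1}, {1, 'B', 2},
        {2, 'A', 1}, {2, 'B', 2},
        {3, 'A', 1}, {3, 'B', 2},
    };
    uint8_t ia[3] = {'A', 'B', 'A'};
    /* Packet with Idx=2: the cared address is IA2='B', so port 2. */
    printf("port %d\n", sr_lookup(table, 6, 2, ia));
    return 0;
}
```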

11. ServerSwitch API
• Switching chip management
  – User-defined lookup key extraction
  – Forwarding table manipulation
  – Traffic statistics collection
• Examples:
  – SetUDLK(1, (B0-5))
  – SetLookupTable(TCAM, 1, 1, "000201000000", "FFFFFF000000", {act=REDIRECT_VIF, vif=3})
  – ReadRegister(OUTPUT_QUEUE_BYTES_PORT0)
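
Assuming call shapes like those on the slide, a hypothetical snippet tying the three examples together might look as follows; the header name, exact signatures, the `lookup_action` struct, and the error conventions are my guesses for illustration, not the library's documented interface.

```c
/* Hypothetical glue around the three API calls shown above; only the
 * call names and constants come from the slide. */
#include <stdint.h>
#include "serverswitch.h"   /* assumed header for the user library */

int setup_tcam_redirect(void)
{
    /* Define user-defined lookup key 1 as bytes 0-5 of the packet. */
    if (SetUDLK(1, B0_5) != 0)
        return -1;

    /* Install TCAM entry 1 for UDLK 1: match 00:02:01 on the first
     * three bytes (mask FF:FF:FF:00:00:00) and redirect hits to
     * virtual interface 3. */
    struct lookup_action act = { .act = REDIRECT_VIF, .vif = 3 };
    if (SetLookupTable(TCAM, 1, 1, "000201000000", "FFFFFF000000", &act) != 0)
        return -1;

    /* Collect a traffic statistic: bytes queued at output port 0. */
    uint64_t qbytes = ReadRegister(OUTPUT_QUEUE_BYTES_PORT0);
    (void)qbytes;
    return 0;
}
```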

12. Implementation
• Hardware (BCM56338 switching chip, 4xGE, 2x10GE, Intel 82576EB NIC)
  – 4 GE external ports
  – x4 PCI-E to the server
  – 2x10GE board-to-board interconnection
  – Cost: $400 at 80 pieces
  – Power consumption: 15.7W
• Software
  – Windows Server 2008 R2
  – Switching chip driver (2,670 lines of C)
  – NIC driver (binary from Intel)
  – ServerSwitch driver (20,719 lines of C)
  – User library (based on the Broadcom SDK)

13. Example 1: BCube
• Self-defined packet header for BCube source routing:
  Bytes    Fields
  B14-17   Version, HL, ToS, Total length
  B18-21   Identification, Flags, Fragment offset
  B22-25   TTL, Protocol, Header checksum
  B26-29   Source Address
  B30-33   Destination Address
  B34-37   NHA 1, NHA 2, NHA 3, NHA 4
  B38-41   NHA 5, NHA 6, NHA 7, NHA 8
  B42-45   BCube Protocol, NH, Pad
• Easy to program: less than 200 LoC to program the switching chip
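
The byte layout maps directly onto a packed C struct; below is a sketch with field widths inferred from the byte offsets above (reading NH as the index of the current next hop in nha[] is my assumption, not taken from the slide):

```c
#include <stdint.h>

/* BCube header occupying bytes 14-45 of the frame, i.e. right after the
 * 14-byte Ethernet header; widths inferred from the byte offsets. */
#pragma pack(push, 1)
struct bcube_header {
    /* B14-33: standard IP header fields reused by BCube */
    uint8_t  version_hl;      /* Version + header length */
    uint8_t  tos;
    uint16_t total_length;
    uint16_t identification;
    uint16_t flags_fragoff;   /* Flags + fragment offset */
    uint8_t  ttl;
    uint8_t  protocol;
    uint16_t header_checksum;
    uint32_t source_addr;
    uint32_t dest_addr;
    /* B34-41: eight one-byte next-hop addresses for source routing */
    uint8_t  nha[8];
    /* B42-45: BCube-specific fields */
    uint8_t  bcube_protocol;
    uint8_t  nh;              /* assumed: index of the current hop in nha[] */
    uint16_t pad;
};
#pragma pack(pop)
```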

14. BCube Experiment
[Figures: forwarding rate and latency, comparing ServerSwitch, software forwarding on a 4-core i7 server, and NetFPGA]
• ServerSwitch: wire-speed packet forwarding for 64B packets
• ServerSwitch: 15.6us forwarding latency, about 1/3 of the software forwarding latency

15. Example 2: Quantized Congestion Notification
[Figure: QCN over ServerSwitch: a UDP source with a token bucket and packet marker acts as the reaction point (RP); the ServerSwitch output port is the congestion point (CP); ① the CP samples the output queue length (qlen), ② it sends a congestion notification back to the source, ③ the source's rate limiter reacts]
• Congestion notification generation requires very low latency
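
The latency-critical operation is the congestion point turning a sampled qlen into feedback. Below is a sketch of the standard QCN feedback computation that a software CP would run; the parameter values are illustrative, not from the talk.

```c
#include <stdint.h>

/* Standard QCN (802.1Qau-style) congestion-point feedback: combine the
 * queue's offset from the equilibrium point Q_EQ with its growth since
 * the last sample, weighted by W. */
#define Q_EQ 33000   /* equilibrium queue length in bytes (example value) */
#define W    2       /* derivative weight (example value) */

int32_t qcn_feedback(int32_t qlen, int32_t qlen_old)
{
    int32_t q_off   = qlen - Q_EQ;      /* distance from equilibrium */
    int32_t q_delta = qlen - qlen_old;  /* growth since the last sample */
    int32_t fb = q_off + W * q_delta;
    /* A congestion notification carrying the (quantized) magnitude of
     * -fb is sent to the source only when fb is positive, i.e. when the
     * queue is congested; otherwise no message is generated. */
    return fb > 0 ? -fb : 0;
}
```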

16. QCN Experiment
[Figures: queue length at the congestion point and sender-to-receiver throughput as the bottleneck bandwidth changes]
• Queue fluctuates around the equilibrium point (Q_EQ)

17. Limitations
• Only supports modifications of standard protocol headers
  – Ethernet MACs, IP TTL, MPLS label
• Not suitable for schemes that need low-latency, per-packet software processing
  – e.g., XCP
• Limited number of ports and port speed
  – Cannot be directly used for fat-tree or VL2
  – Four ServerSwitch cards form a 16-port ServerSwitch, still viable for prototyping fat-tree and VL2

18. Summary
• ServerSwitch: integrates a high-performance, partially programmable ASIC switching chip with a powerful, fully programmable server
  – Line-rate forwarding for various user-defined forwarding schemes
  – Supports new signaling and congestion control mechanisms
  – Enables in-network data processing
• Ongoing work: a 10GE ServerSwitch

19. Thanks. Q&A
