Microboxes: High Performance NFV with Customizable, Asynchronous TCP Stacks and Dynamic Subscriptions
Guyue (Grace) Liu, Yuxin Ren, Mykola Yurchenko, K.K. Ramakrishnan, Timothy Wood
Why Improve Existing NFV Frameworks?
• Existing NFV frameworks focus on L2/L3 processing: ClickOS [NSDI'14], OpenNetVM [HotMiddlebox'16], E2 [SOSP'15], netmap [USENIX ATC'12], PF_RING [SANE'04]
Guyue Liu – George Washington University
Why Improve Existing NFV Frameworks?
• Existing NFV frameworks are based on a packet-centric model
[Figure: the layered network model (L1 Physical: NIC; L2 Data Link: Ethernet, PPP, LLDP; L3 Network: IP, ICMP, ARP; L4 Transport: TCP, UDP; L5-7 Application: HTTP, DNS, FTP, SMTP, SSH, POP, TELNET) beside a chain NF1, NF2, NF3 that exchanges packets (P) through the NFV IO layer between two NICs. Example NFs shown: L2 Fwd, L3 Fwd, Firewall, NAT, IPsec, Shaper.]
Why Improve Existing NFV Frameworks?
• Existing NFV frameworks are based on a packet-centric model
• Protocol processing becomes part of each NF, so protocol stack processing is repeated redundantly within a chain
[Figure: a chain NF1, NF2, NF3 in which every NF carries its own stack and data layers above the NFV IO layer and NICs, shown beside the layered network model. Higher-layer example NFs shown: Load Balancer, Web Proxy, Transcoder, Gateway, IDS [NSDI'17].]
Issue #1: Redundant Stack Processing
• As the chain length increases, repeating stack processing in every NF adds significant overhead
[Figure: processing latency (µs, 0 to 140) versus chain length (1 to 8) for two series, stack-8KB and fwd-8KB; annotations on the plot mark 79% and 169%.]
Idea #1: Consolidate Stack Processing
• How can we remove the redundancy within a chain?
[Figure: NF1, NF2, NF3, each with its own app and stack, above the NFV IO layer and NICs.]
Idea #1: Consolidate Stack Processing
• How can we remove the redundancy within a chain?
• Deploy all NFs and the stack as a single, monolithic process?
[Figure: NF1, NF2, NF3 merged into one app on top of a single stack, above the NFV IO layer and NICs.]
Idea #1: Consolidate Stack Processing
• How can we remove the redundancy within a chain?
• Move stack processing out of the NFs and into the NFV framework
[Figure: NF1, NF2, NF3 contain only app logic; a single shared stack sits in the NFV IO layer above the NICs.]
Issue #2: A Monolithic Stack is Not Efficient
• Throughput drops as the stack takes on more functionality
[Figure: throughput (Gbps, 0 to 10) for increasing levels of stack functionality: Simple Fwd, Connection Tracking, TCP State, TCP Splicing, Bytestream Assembly.]
Idea #2: Customizable Stack Modules
• How can we avoid unnecessary processing in the stack?
[Figure: NF1, NF2, NF3 (app logic only) above a single shared stack in the NFV IO layer.]
Idea #2: Customizable Stack Modules
• How can we avoid unnecessary processing in the stack?
• Split the stack into modules by functionality and customize the processing for each NF or chain
[Figure: NF1, NF2, NF3 (app logic only) above separate stack modules, State Monitoring and Bytestream Reconstruction, in the NFV IO layer.]
Issue #3: Separate Stacks for NFs and Endpoint Applications
• Middlebox NFs and endpoint applications use different underlying frameworks for protocol support
[Figure: IDS, DPI, and MON run on Stack 1, Stack 2, and Stack 3 over NFV IO, while the Web Proxy uses the Socket API on top of Linux TCP.]
Issue #3: Separate Stacks for NFs and Endpoint Applications
• How to transparently manage both middleboxes and endpoints?
Idea #3: Event Communication Interface
• How to transparently manage both middleboxes and endpoints?
• A flexible event interface can represent packet, data, and legacy events for a variety of services
[Figure: IDS, DPI, MON, and Web Proxy each receive events from Stack 1 to Stack 4, all running over NFV IO.]
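One way to picture such an event interface is a small publish/subscribe bus on which stacks publish typed events and NFs subscribe only to the types they care about. The sketch below is illustrative only: `EventBus`, `subscribe`, `publish`, and the `"PKT"` and `"DATA"` type names are hypothetical and are not the Microboxes API.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

Handler = Callable[[str, Any], None]


class EventBus:
    """Toy publish/subscribe bus (hypothetical, not the Microboxes API)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        # An NF registers interest in one event type.
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, payload: Any) -> int:
        # A stack delivers the event to every subscriber of that type
        # and returns how many handlers were invoked.
        handlers = self._subscribers.get(event_type, [])
        for handler in handlers:
            handler(event_type, payload)
        return len(handlers)


bus = EventBus()
seen = []
# A monitor wants raw packets; an IDS and a proxy want reassembled data.
bus.subscribe("PKT", lambda kind, payload: seen.append(("monitor", kind, payload)))
bus.subscribe("DATA", lambda kind, payload: seen.append(("ids", kind, payload)))
bus.subscribe("DATA", lambda kind, payload: seen.append(("proxy", kind, payload)))

bus.publish("PKT", b"\x45\x00")
bus.publish("DATA", b"GET / HTTP/1.1")
```

With this shape, a packet-level NF and a bytestream-level NF share one interface and differ only in the event types they subscribe to.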
Microboxes = µStack + µEvent
• µStack: Idea #1 (Consolidate Stack Processing) and Idea #2 (Customizable Stack Modules)
• µEvent: Idea #3 (Event Communication Interface)
[Figure: NFs connected through µEvent interfaces to a set of composable µStacks.]
Outline
➢ Why Improve NFV Frameworks?
➢ Microboxes = µStack + µEvent
➢ µStack
  ▪ Customizable Modules
  ▪ Consistency Challenges
➢ µEvent
  ▪ Hierarchical Events
  ▪ Publish/Subscribe Interface
µStack Modules
• TCP processing is divided into five basic µStacks, which can be composed to support different NFs.
[Figure: the five µStacks (L2/3, TCP Monitor, TCP Splicer, TCP Endpoint, TCP Split Proxy) arranged in layers, with example NFs: Firewall, IDS, HTTP LB, Web Proxy, Transcoder.]
µStack Modules
• Layer 2/3: performs network-layer processing to determine which flow each packet belongs to, and maintains minimal state such as flow statistics.
µStack Modules
• TCP Monitor: tracks the TCP state and reconstructs the bytestream for both the client and server sides of a connection.
µStack Modules
• TCP Splicer: redirects a TCP connection after the handshake completes; it does not support modifying the bytestream.
µStack Modules
• TCP Endpoint: contains the full TCP logic, so it can terminate client connections and respond to requests directly.
µStack Modules
• TCP Split Proxy: sets up two TCP connections, one with the client and one with the server, and allows NFs to transform the bytestream.
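To make the idea of composing µStacks concrete, the sketch below computes which modules a chain needs given the highest-level module each NF asks for. The dependency table `REQUIRES`, the module identifiers, and the `modules_for` helper are assumptions made for illustration; the slides name the five µStacks but do not spell out this exact layering.

```python
# Assumed layering, for illustration only: each µStack lists the
# modules it builds on. This table is not taken from the slides.
REQUIRES = {
    "l2l3": [],
    "tcp_monitor": ["l2l3"],
    "tcp_splicer": ["tcp_monitor"],
    "tcp_endpoint": ["tcp_monitor"],
    "tcp_split_proxy": ["tcp_endpoint"],
}


def modules_for(needs):
    """Return the µStacks to run, dependencies first, without duplicates."""
    ordered = []

    def visit(name):
        for dep in REQUIRES[name]:
            visit(dep)
        if name not in ordered:
            ordered.append(name)

    for name in needs:
        visit(name)
    return ordered


# A chain with a firewall and an IDS only needs the two lowest modules,
# so the heavier proxy logic is never instantiated for it.
chain = {"firewall": "l2l3", "ids": "tcp_monitor"}
plan = modules_for(chain.values())
proxy_plan = modules_for(["tcp_split_proxy"])
```

The point of the sketch is that a chain pays only for the modules its NFs actually request.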
Stack Consistency
• The stack and the NFs run on separate cores.
• Both the stack and the NFs need to access the stack state.
[Figure: packets P1, P2, P3 pass through Stack, NF1, NF2, NF3, all of which share the stack state.]
Stack Consistency
• Stack state can be inconsistent if an NF reads it after the stack has already updated it for newly arrived packets.
• Stack consistency: the protocol stack state associated with a packet must be consistent when each NF processes that packet.
[Figure: packets P1, P2, P3 pass through Stack, NF1, NF2, NF3, all of which share the stack state.]
Stack Consistency
• Sequential processing achieves correctness but leads to an inefficient pipeline
[Figure: packets P1, P2, P3 pass through Stack, NF1, NF2, NF3 with shared stack state; a timeline shows Core 0 (Stack), Core 1 (NF1), Core 2 (NF2), and Core 3 (NF3) handling P1, P2, P3 one after another.]
Stack Consistency
• Only one core does useful work at a time while the others sit idle
[Figure: the same timeline of Core 0 (Stack) through Core 3 (NF3), with the idle periods highlighted.]
Stack Consistency: Stack Snapshots
• Take a snapshot of the stack state for each packet to avoid the inconsistency problem
[Figure: packets P1, P2, P3 pass through Stack, NF1, NF2, NF3 with shared stack state; the timeline shows Core 0 (Stack) through Core 3 (NF3) working on different packets at the same time.]
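A minimal sketch of per-packet snapshots follows, assuming a toy flow state with only two counters. `FlowState`, `Packet`, and `Stack` are hypothetical names for illustration. The state object is immutable, so attaching it to a packet fixes the view that downstream NFs will see, even if the stack has already moved on to later packets.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class FlowState:
    """Toy stack state (hypothetical fields, for illustration)."""
    packets_seen: int = 0
    bytes_seen: int = 0


@dataclass
class Packet:
    name: str
    payload_len: int
    snapshot: Optional[FlowState] = None


class Stack:
    def __init__(self) -> None:
        self.live = FlowState()

    def process(self, pkt: Packet) -> Packet:
        # Build a new immutable state instead of mutating in place, then
        # attach it to the packet as that packet's snapshot.
        self.live = replace(
            self.live,
            packets_seen=self.live.packets_seen + 1,
            bytes_seen=self.live.bytes_seen + pkt.payload_len,
        )
        pkt.snapshot = self.live
        return pkt


stack = Stack()
p1, p2, p3 = Packet("P1", 100), Packet("P2", 200), Packet("P3", 50)
for pkt in (p1, p2, p3):
    stack.process(pkt)  # the stack runs ahead of the NFs

# An NF that handles P1 late still sees the state as of P1,
# not the live state that already includes P2 and P3.
nf_view_of_p1 = p1.snapshot
```

Reading `stack.live` while handling P1 would report three packets; reading `p1.snapshot` reports one, which is the consistent answer.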
Stack Consistency: Track Bytestream
• Store an offset into the bytestream instead of copying the whole bytestream
• This allows stack and NF processing to proceed asynchronously
[Figure: packets P1, P2, P3 pass through Stack, NF1, NF2, NF3; the shared stack state includes the bytestream; the timeline shows Core 0 (Stack) through Core 3 (NF3) working concurrently.]
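The offset idea can be sketched with an append-only buffer: the stack appends payload and hands each packet the end offset at that moment, and an NF later reads only up to its packet's offset. `ByteStream`, `append`, and `view_up_to` are hypothetical names. For simplicity this toy copies bytes on read; the point being illustrated is only that the per-packet snapshot stores an integer rather than a copy of the stream.

```python
class ByteStream:
    """Append-only reassembled stream shared by the stack and the NFs."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def append(self, data: bytes) -> int:
        # The stack appends in-order payload and returns the new end
        # offset, which is all a per-packet snapshot needs to store.
        self._buf += data
        return len(self._buf)

    def view_up_to(self, offset: int) -> bytes:
        # An NF reads the stream as it was when its packet was processed.
        # (Copies here for simplicity; a real system would avoid that.)
        return bytes(self._buf[:offset])


stream = ByteStream()
off1 = stream.append(b"GET /ind")   # offset recorded for packet P1
off2 = stream.append(b"ex.html")    # offset recorded for packet P2

# The NF handling P1 runs after P2 was appended, yet sees only P1's data.
p1_view = stream.view_up_to(off1)
p2_view = stream.view_up_to(off2)
```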
NF Consistency: Parallel Processing
• Parallel processing increases core utilization and can be used for NFs without dependencies (NFP [SIGCOMM'17], ParaBox [SOSR'17])
[Figure: the stack hands each packet to NF1, NF2, and NF3 in parallel; the timeline shows Core 1 (NF1), Core 2 (NF2), and Core 3 (NF3) processing the same packet at the same time after Core 0 (Stack).]
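As a rough illustration of running independent NFs side by side, the sketch below hands the same packet to three read-only functions through a thread pool. The three NFs and `process_in_parallel` are hypothetical, and Python threads merely stand in for the dedicated cores a real NFV platform would use; this is not how NFP or ParaBox are implemented.

```python
from concurrent.futures import ThreadPoolExecutor


# Three toy NFs that only read the packet, so none depends on another.
def nf_counter(pkt: bytes):
    return ("counter", len(pkt))


def nf_signature(pkt: bytes):
    return ("ids", b"attack" in pkt)


def nf_logger(pkt: bytes):
    return ("logger", pkt[:4])


INDEPENDENT_NFS = [nf_counter, nf_signature, nf_logger]


def process_in_parallel(pkt: bytes, nfs=INDEPENDENT_NFS):
    # Submit the same packet to every NF at once. map() returns results
    # in the order of `nfs`, regardless of which NF finishes first.
    with ThreadPoolExecutor(max_workers=len(nfs)) as pool:
        return list(pool.map(lambda nf: nf(pkt), nfs))


results = process_in_parallel(b"GET /attack")
```

This only works because none of the NFs writes to the packet or reads another NF's output; NFs with such dependencies would still have to run in sequence.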
Flow Consistency: Parallel Stacks
• Run multiple copies of the same stack to maximize performance
• Packets are distributed at flow level to keep flow consistency
[Figure: the NIC uses RSS to spread flows across several µStack instances, which feed NF1 through NF5.]
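Flow-level distribution can be sketched as hashing a packet's 5-tuple to choose a stack instance, so that every packet of a flow reaches the same copy of the stack. In the sketch below, `FiveTuple` and `pick_stack` are hypothetical, and `zlib.crc32` is only a software stand-in for the hash that RSS computes in NIC hardware.

```python
import zlib
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: int


def pick_stack(flow: FiveTuple, num_stacks: int) -> int:
    # Hash the flow identifier; identical 5-tuples always produce the
    # same index, so per-flow state lives in exactly one stack instance.
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % num_stacks


flow_a = FiveTuple("10.0.0.1", "10.0.0.2", 40000, 80, 6)
flow_b = FiveTuple("10.0.0.3", "10.0.0.2", 40001, 80, 6)

first = pick_stack(flow_a, 4)
again = pick_stack(flow_a, 4)   # a later packet of the same flow
other = pick_stack(flow_b, 4)   # may or may not share a stack with flow_a
```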