
GEN: A GPU-Accelerated Elastic Framework for NFV



  1. GEN: A GPU-Accelerated Elastic Framework for NFV. Zhilong Zheng, Jun Bi, Chen Sun, Heng Yu, Hongxin Hu, Zili Meng, Shuhe Wang, Kai Gao, Jianping Wu

  2. Network Function Virtualization (NFV)
     • Dedicated hardware devices are replaced by NFV on commodity hardware
     • Service Function Chain (SFC): VPN, Monitor, Firewall, Load Balancer, each running in a VM through virtualization techniques
     • Low cost
     • Elasticity control
     • Service provisioning flexibility

  3. CPU-based NFV
     • NFV platforms: OpenNetVM (HotMiddlebox’16), NetBricks (OSDI’16), NFP (SIGCOMM’17), Metron (NSDI’18), …
     • NFV infrastructure: general-purpose multi-core servers
     • Problems
       – Low performance, with poor prospects for improvement
       – Coarse-grained scaling

  4. Problems of CPU-based NFV
     • Low performance, with poor prospects for improvement
       – Hard to achieve high performance (e.g., 40~100 Gbps) for a wide range of NFs
       – Measured on an E5-2650 v2 (8 cores, 2.6 GHz): IPSec (AES & SHA1) reaches 2.6 ~ 7.7 Gbps; NIDS (Aho-Corasick) reaches 4.2 ~ 10.4 Gbps. Source: Go, Younghwan, et al. "APUNet: Revitalizing GPU as Packet Processing Accelerator." NSDI. 2017.
       – The slowdown or end of Moore’s Law
     • Coarse-grained scaling
       – (Figure: scaling from 1 CPU core to 2 CPU cores; labels: 1 Mpps, 9 Mpps, 11 Mpps, 10 Mpps, 10 Mpps.)

  5. GPU as an Accelerator for NFV
     • Benefits of GPU
       – Massive processing cores
       – Fine-grained computing units
       – Potential: high-performance NFs, fine-grained resources
     • Existing work
       – Router (PacketShader, SIGCOMM’10)
       – SSL proxy (SSLShader, NSDI’11)
       – NIDS (Kargus, CCS’12)
       – IPSec (NBA, EuroSys’15)
       – NFV framework (G-NET, NSDI’18)
       – Problems unsolved: high-performance SFCs, fine-grained fast scaling

  6. GEN exploits GPU to support high-performance SFCs with fine-grained scaling

  7. GEN Framework Overview
     • (Figure: an Orchestrator passes SFC configurations to the infrastructure. Each server in the infrastructure has a CPU and a GPU and hosts an SFC Manager and several SFC Controllers.)

  8. Infrastructure Design
     • High-performance NIC: 10 / 40 / 100 GbE ports (Rx, Tx)
     • CPU (user space): SFC Manager, SFC Classifier, and SFC Controllers #1 … #n; components shown include Adaptive Batcher, SFC Starter, Output Queuing, Packet Forwarder, and Packet Dropper
     • GPU (2k~3k physical cores): global memory and SFC Agents #1 … #n, each running the NFs of its chain (Chain #1: NF #1, NF #2, NF #3; Chain #n: NF #1 … NF #m)
     • Elastic scaling

  9. Problem #1: SFC Model Selection
     • Run-to-completion (RTC): packets pass through NF1 and NF2 inside a single instance (Instance #1)
     • Pipelining: packets pass through NF1 and NF2 running in separate instances (Instance #1, Instance #2)

  10. SFC Model Selection: Pipelining
      • Two potential ways to support pipelining in GPU
        – Persistent kernels: NF1 and NF2 each run as a persistent kernel; after the CPU places a packet batch in the buffer, the steps shown are 2. reading, 3. next NF, 4. reading. Drawback: hard and costly scaling
        – Sequenced invocations: the CPU invokes one kernel per NF; the steps shown are 2. kernel invocation, 3. reading, 4. synchronization, 5. next NF, 6. kernel invocation, 7. reading, 8. synchronization. Drawback: high overhead from frequent kernel invocations (~5 µs per invocation)
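To make the invocation overhead concrete, the sketch below compares the launch cost per packet batch when every NF needs its own kernel launch against a single launch for the whole chain. Only the ~5 µs per-invocation figure comes from the slide. The function names and the three-NF chain length are illustrative assumptions, and reading and synchronization costs are ignored.

```python
# Launch-overhead arithmetic only; this is not a measurement.
KERNEL_LAUNCH_US = 5.0  # ~5 us per kernel invocation (figure from the slide)


def sequenced_launch_overhead_us(num_nfs, launch_us=KERNEL_LAUNCH_US):
    """Sequenced invocations: one kernel launch per NF for every batch."""
    return num_nfs * launch_us


def single_launch_overhead_us(launch_us=KERNEL_LAUNCH_US):
    """One kernel launch per batch for the whole chain."""
    return launch_us


chain_length = 3  # hypothetical chain of three NFs
print(sequenced_launch_overhead_us(chain_length))  # 15.0
print(single_launch_overhead_us())                 # 5.0
```

Under these assumptions the launch cost of sequenced invocations grows linearly with chain length, while a single launch keeps it constant.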

  11. SFC Model Selection: RTC
      • RTC-based model
        – Fewer kernel invocations (once per SFC)
        – Easier scaling (the kernel is not persistent)
        – Steps shown: the CPU places a packet batch in the buffer, 2. kernel invocation, 4. synchronization
      • NFs are integrated into a specific SFC Agent (kernel fusion)
      • SFC Agent (in GPU) is launched by SFC Starter (in CPU)
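The fusion idea can be sketched on the CPU in plain Python: the NFs of a chain are composed into one function, so each batch needs a single call instead of one call per NF. This only illustrates run-to-completion control flow and is not GEN's GPU code. `fuse_chain`, `drop_odd_ports`, and `decrement_ttl` are hypothetical names, and an NF returning `None` is assumed to mean that the packet is dropped.

```python
from typing import Callable, Dict, List, Optional

Packet = Dict[str, int]
NF = Callable[[Packet], Optional[Packet]]  # None means "drop" (assumption)


def fuse_chain(nfs: List[NF]) -> Callable[[List[Packet]], List[Packet]]:
    """Compose NFs so each packet runs through the whole chain in one call."""
    def run_batch(batch: List[Packet]) -> List[Packet]:
        out: List[Packet] = []
        for pkt in batch:
            current: Optional[Packet] = pkt
            for nf in nfs:
                current = nf(current)
                if current is None:  # dropped: skip the remaining NFs
                    break
            if current is not None:
                out.append(current)
        return out
    return run_batch


# Two toy NFs, hypothetical stand-ins for real network functions.
def drop_odd_ports(pkt: Packet) -> Optional[Packet]:
    return pkt if pkt["port"] % 2 == 0 else None


def decrement_ttl(pkt: Packet) -> Optional[Packet]:
    return {**pkt, "ttl": pkt["ttl"] - 1} if pkt["ttl"] > 1 else None


sfc = fuse_chain([drop_odd_ports, decrement_ttl])
result = sfc([
    {"port": 80, "ttl": 64},
    {"port": 81, "ttl": 64},
    {"port": 22, "ttl": 1},
])
print(result)  # [{'port': 80, 'ttl': 63}]
```

The caller issues one call per batch regardless of how many NFs the chain contains, which mirrors the "once per SFC" invocation count on the slide.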

  12. Problem #2: Elastic Scaling
      • Avoid monitoring NF load for scaling
        – Avoid deciding when to scale
        – Avoid deciding to what extent an NF should be scaled
        – Avoid considering how to quickly carry out NF scaling
      • Avoid state management caused by scale out / in
        – Intuition: use scale up / down to avoid state management
      • Adaptive Batcher

  13. Elastic Scaling – Adaptive Batcher
      • Design of the adaptive batcher
        – Keeping the buffer occupancy at a low level
        – Scaling up/in GPU resource provisioning
      • (Figure: packets arrive in the buffer; the adaptive batcher fetches all packets in the buffer and batches them, scaling up/in more mini-batches in GPU. Callouts: state management avoidance, load monitoring avoidance.)
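A minimal sketch of the batching idea, assuming that each round drains every packet currently in the buffer and splits the result into fixed-size mini-batches, so the number of mini-batches follows the load without a separate load monitor. The class name, the `mini_batch_size` parameter, and the fixed-size split are assumptions for illustration; the slide does not say how mini-batches are sized or mapped to GPU resources.

```python
from collections import deque
from typing import Deque, List


class AdaptiveBatcherSketch:
    """Hypothetical CPU-side model of an adaptive batcher (not GEN's code)."""

    def __init__(self, mini_batch_size: int = 4) -> None:
        if mini_batch_size <= 0:
            raise ValueError("mini_batch_size must be positive")
        self.mini_batch_size = mini_batch_size
        self.buffer: Deque[int] = deque()

    def enqueue(self, packets: List[int]) -> None:
        """Packets arriving from the NIC are appended to the buffer."""
        self.buffer.extend(packets)

    def next_round(self) -> List[List[int]]:
        """Fetch all buffered packets and split them into mini-batches."""
        drained = list(self.buffer)
        self.buffer.clear()  # occupancy returns to zero after every round
        size = self.mini_batch_size
        return [drained[i:i + size] for i in range(0, len(drained), size)]


batcher = AdaptiveBatcherSketch(mini_batch_size=4)

batcher.enqueue(list(range(3)))   # light load
light = batcher.next_round()

batcher.enqueue(list(range(10)))  # heavier load
heavy = batcher.next_round()

print(len(light), len(heavy))  # 1 3
```

In this model, a heavier load simply yields more mini-batches in the next round, and the buffer is emptied each time, which is one way to read "keeping the buffer occupancy at a low level".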

  14. Preliminary Evaluation
      • Hardware
        – CPU: Two Intel Xeon E5-2650 v4 (10 physical cores)
        – GPU: NVIDIA TITAN Xp
        – NIC: Two Intel X520 (40 Gbps in total)
      • Software
        – DPDK 17.11 for networking IO
        – CUDA 8.0 for GPU programming
      • NFs & SFCs
        – IPV4Router (1k entries) → NIDS (3k rules) → IPSec (SHA1 & AES-128-CBC)

  15. Performance of RTC vs. Pipelining
      • (Figure: throughput in Gbps versus packet size in bytes, for Pipeline and RTC.)
      • (Figure: CDF of latency in µs, for Pipeline and RTC, with the 95th percentile marked.)
      • Annotations on the figures: 33.7%; 29.2% and 28.1%

  16. Fast Elastic Scaling
      • (Figure: throughput in Gbps over a timeline in seconds; annotation: fast converging, < 100 ms.)

  17. Conclusion and Future Work
      • GEN: a GPU-accelerated elastic framework for NFV
        – High-performance SFC
        – Elastic scaling
      • Future work
        – More SFC performance enhancement in GPU
        – Coordination between CPU and GPU
        – Impact of dynamic traffic load

  18. Thank You http://netarchlab.tsinghua.edu.cn
