Network Stack as a Service in the Cloud


Zhixiong Niu¹, Hong Xu¹, Dongsu Han², Peng Cheng³, Yongqiang Xiong³, Guo Chen³, Keith Winstein⁴
¹City University of Hong Kong, ²KAIST, ³Microsoft Research Asia, ⁴Stanford University


  1. Network Stack as a Service in the Cloud. Zhixiong Niu¹, Hong Xu¹, Dongsu Han², Peng Cheng³, Yongqiang Xiong³, Guo Chen³, Keith Winstein⁴ (¹City University of Hong Kong, ²KAIST, ³Microsoft Research Asia, ⁴Stanford University)

  2. Imagine you’re a tenant. You want to deploy a new stack.

  3. Motivation: Tenants. “I heard that BBR is great. Let’s deploy it to my VMs!”

  4. Motivation: Tenants. “I heard that BBR is great. Let’s deploy it to my VMs!” [Diagram: VM]

  5. Motivation: Tenants. “I heard that BBR is great. Let’s deploy it to my VMs!” [Diagram: network stack inside the VM]

  6. Motivation: Tenants. “I heard that BBR is great. Let’s deploy it to my VMs!” [Diagram: network stack inside the VM] Problem: cannot deploy a stack across OSes.


  8. Motivation: Tenants. [Diagram: network stack inside the VM]

  9. Motivation: Tenants. [Diagram: network stack inside the VM] [Chart: number of commits to mTCP and F-Stack per month, Aug to Nov 2017; y-axis 0 to 30]

  10. Motivation: Tenants. [Diagram: network stack inside the VM] [Chart: number of commits to mTCP and F-Stack per month, Aug to Nov 2017; y-axis 0 to 30] Problem: high deployment and maintenance cost.


  12. So your life as a tenant sucks. What about the cloud provider?

  13. Motivation: Provider. “I know that BBR is great. Let me deploy it for my tenants!”

  14. Motivation: Provider. “I know that BBR is great. Let me deploy it for my tenants!” [Diagram: VM and its stack (tenant) above the hypervisor (provider)]

  15. Motivation: Provider. “I know that BBR is great. Let me deploy it for my tenants!” [Diagram: VM and its stack (tenant) above the hypervisor (provider)] Problem: can’t touch the tenant stack.


  17. So what’s wrong here?

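  18. Current architecture. [Diagram: in the VM (tenant), APP1 and APP2 sit on the networking API and the network stack; the vNIC belongs to the provider]

  19. Current architecture: the network stack is coupled to the guest OS. [Diagram: in the VM (tenant), APP1 and APP2 sit on the networking API and the network stack; the vNIC belongs to the provider]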

  20. [Diagram: in the VM (tenant), APP1 and APP2 use the networking API; the network stack runs in a provider-side network stack module (NSM)]

  21. [Diagram: in the VM (tenant), APP1 and APP2 use the networking API; the network stack runs in a provider-side network stack module (NSM)] Interface unchanged (BSD sockets, etc.).

  22. [Diagram: in the VM (tenant), APP1 and APP2 use the networking API; the network stack runs in a provider-side network stack module (NSM)] Interface unchanged (BSD sockets, etc.). Packets handled in the NSM.

  23. Vision: Network Stack as a Service. [Diagram: in the VM (tenant), APP1 and APP2 use the networking API; the network stack runs in a provider-side network stack module (NSM)] Interface unchanged (BSD sockets, etc.). Packets handled in the NSM.
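
The "interface unchanged" point can be illustrated from the application side. The sketch below is ordinary BSD-socket code (via Python's standard `socket` module) doing a TCP echo over loopback. Nothing in it is NetKernel-specific, and it runs on whatever stack the host kernel provides; the function names `recv_exact` and `echo_over_tcp` are made up for this example. Under the vision on the slide, calls like these would stay as they are while packet processing moves into the NSM.

```python
import socket


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes from a stream socket (recv may return fewer)."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed before all bytes arrived")
        buf += chunk
    return buf


def echo_over_tcp(payload: bytes) -> bytes:
    """Send payload to a loopback TCP server and return what it echoes back."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen(1)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            # The handshake completes against the listen backlog,
            # so a single thread can connect first and accept afterwards.
            client.connect(server.getsockname())
            conn, _ = server.accept()
            with conn:
                client.sendall(payload)
                conn.sendall(recv_exact(conn, len(payload)))  # server-side echo
                return recv_exact(client, len(payload))


if __name__ == "__main__":
    print(echo_over_tcp(b"hello"))
```

The application only sees `socket`, `connect`, `sendall`, and `recv`; which congestion control or stack implementation sits underneath is invisible at this level, which is the property the slide relies on.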

  24. What’re the benefits?

  25. Flexibility for Tenants. [Diagram: one VM with an mTCP NSM, another VM with a BBR NSM]

  26. Flexibility for Tenants. [Diagram: one VM with an mTCP NSM, another VM with a BBR NSM] ‣ Stack independent of the guest OS

  27. Flexibility for Tenants. [Diagram: one VM with an mTCP NSM, another VM with a BBR NSM] ‣ Stack independent of the guest OS ‣ No deployment or maintenance cost

  28. Efficiency for Provider

  29. Efficiency for Provider ‣ Offer meaningful SLAs
      NSM      Capacity  Price
      mTCP     25 Mpps   $2/hr
      mTCP     50 Mpps   $4/hr
      F-Stack  20 Mpps   $2/hr

  30. Efficiency for Provider ‣ Offer meaningful SLAs (table as in slide 29) ‣ Optimize resource utilization [Diagram: BBR NSM]

  31. Efficiency for Provider ‣ Offer meaningful SLAs (table as in slide 29) ‣ Optimize resource utilization [Diagram: BBR NSM] ‣ Easier to assert coordination and control

  32. Efficiency for Provider ‣ Offer meaningful SLAs (table as in slide 29) ‣ Optimize resource utilization [Diagram: BBR NSM] ‣ Easier to assert coordination and control [Diagram labels: NUM, pHost, mon., Fabric]

  33. Accelerate Innovation. [Diagram: VMs alongside mTCP NSMs dated Oct 2017, Nov 2017, Dec 2017, …]

  34. Accelerate Innovation. [Diagram: VMs alongside mTCP NSMs dated Oct 2017, Nov 2017, Dec 2017, …] ‣ Allow the stack to evolve independently of the guest OS ‣ Write once, run everywhere

  35. Accelerate Innovation. [Diagram: VMs alongside mTCP NSMs dated Oct 2017, Nov 2017, Dec 2017, …] ‣ Allow the stack to evolve independently of the guest OS (not possible in the current architecture) ‣ Write once, run everywhere

  36. NetKernel. [Diagram components: VM with APP1, APP2, and the network API; NSM with the network stack; vNIC; virtual switch / embedded switch (SR-IOV); hypervisor; pNICs (physical NICs)]

  37. NetKernel. [Diagram as in slide 36, adding the socket API and GuestLib]

  38. NetKernel. [Diagram as in slide 37, adding ServiceLib]

  39. NetKernel. [Diagram as in slide 38, adding huge pages that hold the data]

  40. NetKernel. [Diagram as in slide 39, adding queues and the CoreEngine]
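
The slides name the components (GuestLib, ServiceLib, huge pages, queues, CoreEngine) but do not spell out how they interact. The sketch below is one plausible reading, offered only as an illustration: data stays in a shared region while small descriptors travel through queues, and a relay moves descriptors from the VM side to the NSM side. Every class, function, and field name here is hypothetical, and a plain `bytearray` and `deque` stand in for huge pages and shared-memory queues.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Descriptor:
    """Hypothetical queue element: points at data instead of carrying it."""
    op: str
    offset: int
    length: int


class SharedRegion:
    """Stand-in for the shared huge-page data area (assumption)."""

    def __init__(self, size: int) -> None:
        self.buf = bytearray(size)
        self.next_free = 0

    def put(self, op: str, data: bytes) -> Descriptor:
        """Copy data into the region and return a descriptor for it."""
        if self.next_free + len(data) > len(self.buf):
            raise MemoryError("shared region full")
        offset = self.next_free
        self.buf[offset:offset + len(data)] = data
        self.next_free += len(data)
        return Descriptor(op, offset, len(data))

    def get(self, desc: Descriptor) -> bytes:
        """Read back the bytes a descriptor refers to."""
        return bytes(self.buf[desc.offset:desc.offset + desc.length])


def guestlib_send(region: SharedRegion, vm_queue: deque, data: bytes) -> None:
    """VM side: place data in the shared region, enqueue only a descriptor."""
    vm_queue.append(region.put("send", data))


def core_engine_poll(vm_queue: deque, nsm_queue: deque) -> int:
    """Relay: move pending descriptors from the VM queue to the NSM queue."""
    moved = 0
    while vm_queue:
        nsm_queue.append(vm_queue.popleft())
        moved += 1
    return moved


def servicelib_drain(region: SharedRegion, nsm_queue: deque) -> list:
    """NSM side: resolve each descriptor to the bytes the stack should send."""
    out = []
    while nsm_queue:
        out.append(region.get(nsm_queue.popleft()))
    return out


if __name__ == "__main__":
    region, vm_q, nsm_q = SharedRegion(4096), deque(), deque()
    guestlib_send(region, vm_q, b"GET / HTTP/1.1\r\n")
    core_engine_poll(vm_q, nsm_q)
    print(servicelib_drain(region, nsm_q))
```

The design point the sketch tries to capture is that only descriptors cross between the VM and the NSM, so the per-operation cost is dominated by the data copy into and out of the shared region rather than by the queue traffic.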

  41. Microbenchmark ‣ 3000 lines of C code, in user space ‣ QEMU KVM 2.5.0, Linux Kernel 4.9 ‣ Intel Xeon CPU E5-2618L v3 @ 2.30GHz x 2
      Communication between ServiceLib and GuestLib (random read and copy):
      Chunk size  64B  512B  1KB    2KB    4KB    8KB
      Latency     8ns  64ns  117ns  214ns  425ns  809ns

  42. Microbenchmark (setup and table as in slide 41). Throughput annotations: 64Gbps, 81Gbps
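
The two throughput figures on this slide appear to follow directly from the latency table: bits copied divided by nanoseconds taken gives Gbps. The short calculation below reproduces that arithmetic, assuming that KB means 1024 bytes and that the 64Gbps and 81Gbps annotations belong to the 64B and 8KB columns (the slide does not say so explicitly). The function name `throughput_gbps` is made up for this example.

```python
# Chunk sizes (bytes) and copy latencies (ns) as listed on the slide.
# Assumption: 1KB = 1024 bytes.
CHUNK_LATENCY_NS = {
    64: 8,
    512: 64,
    1024: 117,
    2048: 214,
    4096: 425,
    8192: 809,
}


def throughput_gbps(chunk_bytes: int, latency_ns: float) -> float:
    """Bits moved per nanosecond, which is numerically equal to Gbps."""
    return chunk_bytes * 8 / latency_ns


if __name__ == "__main__":
    for size, latency in CHUNK_LATENCY_NS.items():
        rate = throughput_gbps(size, latency)
        print(f"{size:>5} B  {latency:>4} ns  {rate:6.1f} Gbps")
```

Under these assumptions the 64B row works out to exactly 64 Gbps and the 8KB row to roughly 81 Gbps, matching the slide's annotations.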

  43. Windows VM + BBR NSM. [Diagram: VMs and BBR NSMs on a Beijing to California path; 350 ms RTT, 12 Mbps uplink] [Chart: throughput (Mbps), y-axis 0 to 12, for Win + NSM BBR, Linux BBR, Windows CTCP, and Linux CUBIC]
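
As context for this experiment, the path's bandwidth-delay product (BDP) indicates how much data must be in flight to keep the link full. The slide gives only the RTT (350 ms) and uplink rate (12 Mbps); the BDP itself is not on the slide and is computed here as a worked example, with `bdp_bytes` being a name chosen for this sketch.

```python
def bdp_bytes(bandwidth_mbps: float, rtt_ms: float) -> float:
    """Bandwidth-delay product in bytes.

    bits in flight = (Mbps * 1e6) * (ms / 1e3); dividing by 8 gives bytes,
    so the combined divisor is 8000.
    """
    return bandwidth_mbps * 1_000_000 * rtt_ms / 8_000


if __name__ == "__main__":
    bdp = bdp_bytes(12, 350)
    print(f"BDP: {bdp:.0f} bytes ({bdp / 1024:.0f} KiB)")
```

For 12 Mbps and 350 ms this gives 525,000 bytes, so a sender has to sustain about half a megabyte in flight to saturate the uplink on this path.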

  44. Takeaway ‣ Vision: Network Stack as a Service ‣ Decouple the network stack from the guest OS ‣ Better flexibility and efficiency, and faster innovation ‣ NetKernel as a solution ‣ GuestLib, ServiceLib, CoreEngine

  45. Research Agenda

  46. Research Agenda ‣ NSM form ‣ VM? unikernel-based VMs? containers? hypervisor modules?

  47. Research Agenda ‣ NSM form ‣ VM? unikernel-based VMs? containers? hypervisor modules? ‣ Support for containers ‣ Currently a container has to use the host stack ‣ Let different containers on the same host use different stacks

  48. Research Agenda ‣ NSM form ‣ VM? unikernel-based VMs? containers? hypervisor modules? ‣ Support for containers ‣ Currently a container has to use the host stack ‣ Let different containers on the same host use different stacks (example: Spark with DCTCP, Nginx with BBR)

  49. Research Agenda ‣ NSM form ‣ VM? unikernel-based VMs? containers? hypervisor modules? ‣ Support for containers ‣ Currently a container has to use the host stack ‣ Let different containers on the same host use different stacks (example: Spark with DCTCP, Nginx with BBR) ‣ Network stacks to NSMs ‣ …

  50. Open Questions ‣ Any downsides? ‣ Other use cases in a production cloud? ‣ How about a private data center? ‣ What’s the right abstraction boundary of the network stack?
