
Multitenancy for Fast and Programmable Networks in the Cloud

Multitenancy for Fast and Programmable Networks in the Cloud. Tao Wang*, Hang Zhu*, Fabian Ruffy, Xin Jin, Anirudh Sivaraman, Dan Ports, and Aurojit Panda (* equal contribution)


  1. Multitenancy for Fast and Programmable Networks in the Cloud
     Tao Wang*, Hang Zhu*, Fabian Ruffy, Xin Jin, Anirudh Sivaraman, Dan Ports, and Aurojit Panda (* equal contribution)

  2. What does today's cloud offer as a service?
     - Generic compute and storage resources
     - Specialized accelerators

  3. Emergence of programmable network devices
     - Pipeline-based programmable devices
       - In-network switches
       - At-host SmartNICs
     - Enable a wide range of innovations for classical networked systems
       - Consensus: NOPaxos, NetPaxos
       - Concurrency control: Eris
       - Caching: NetCache, IncBricks
       - Storage: NetChain, SwitchKV
       - Applications: SwitchML, NetAccel
       - …

  4. Why not offer such systems as a cloud service?
     - Need for multitenancy support
       - Provider's aspect
         - Improve resource utilization
           - One application can hardly consume all the hardware resources
           - Heterogeneous resource requirements
       - Tenant's aspect
         - Enable innovations
           - New programs can be easily tested without impacting basic network functionality

  5. How to enable multitenancy for programmable devices?
     Requirements:
     - Resource efficiency
       - Little overhead
     - Isolation
       - Performance
       - Allocated resource
     Our vision: a hybrid compile-time and run-time solution

  6. Background on programmable network devices
     [Figure: architecture of a pipeline-based programmable device. Top-level blocks: Parser, Ingress Pipeline, Queues, Egress Pipeline. Per-stage labels (Stage 1, …): exact match Xbar, ternary match Xbar, SRAMs/TCAMs, match-action tables, action units, stateful memory (e.g., register), hardware circuit. PHV container contents: packet headers (e.g., Ethernet header, …) and per-packet metadata (e.g., queue length, enqueue port, …).]

  7. Programmable devices' characteristics
     - Various types of hardware resources
       - Most of them are decided at compile time
     - Limited run-time support
       - Hardware wiring is decided at compile time
       - Line-rate performance is achieved after successful compilation
       - No temporal scheduling (e.g., CPU or NPU scheduling)
       - No spatial reconfiguration (e.g., FPGA [AmorphOS, OSDI'18])
     - Requirements (recap)
       - Resource efficiency: little overhead
       - Isolation: performance, allocated resource
     [Figure labels: Performance, Programmability]

  8. A hybrid compile-time and run-time solution
     - Compile-time program linker
       - Targets generic resources (e.g., SRAMs/TCAMs, action units, etc.)
       - But static
     - Run-time memory allocator
       - Targets stateful memory
       - But limited

  9. System overview
     [Figure: system architecture. Tenants submit requests (system program S and tenant programs T1 … Tn). Compile-time linker: Translation Layer, Resource Sharing Policy, Resource Usage Checker, Header & Metadata, Program Linker; its output is a merged jumbo program. Run-time control plane: Memory Allocator with Utility Calculator, Reallocation Problem Solver, and Table Entry Handler. Data plane: Stage 1, Stage 2, Stage 3, …, Stage m, holding Sys & Tenant Tables, Config Params, Counter, and Record, each stage backed by One Big Array. Numbered steps 1, 2, 3 mark the workflow.]

  10. Goals of compile-time linker
      - Restrict resource usage
      - Provide isolation
        - Ensure a tenant program does not interfere with others'
        - Ensure no infinite packet resubmission
        - Ensure no forwarding-loop configuration
        - …

  11. Parser
      - Fixed packet format
        - Eth, VLAN, IP, TCP or UDP header followed by custom headers
      - System program
        - Extracts common headers
      - Tenant programs
        - Extract tenant-defined headers
      [Figure: merged parser. Apply S's parser to extract common headers; if (tag == T1's VID) apply T1's parser; … Merged header: Header { Ethernet hdr, VLAN hdr, IP hdr, TCP or UDP hdr (system program); T1 hdr, …, Tn hdr (tenant programs) }.]
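
The parser merge described on this slide can be illustrated with a small software model. This is only a sketch under stated assumptions: the real linker emits a merged P4 parser, not Python, and the tenant header layout (`parse_t1`, a 1-byte opcode plus a 4-byte key) and the `TENANT_PARSERS` registry are hypothetical names invented for illustration. It assumes a VLAN-tagged IPv4 packet carrying TCP or UDP, with the VLAN ID (VID) used as the tenant tag.

```python
import struct


def parse_common(pkt: bytes):
    """System parser: Ethernet + 802.1Q tag + IPv4 + TCP/UDP ports.

    Returns the common fields and the offset where tenant headers start.
    """
    _dst, _src, tpid, tci, ethertype = struct.unpack_from("!6s6sHHH", pkt, 0)
    if tpid != 0x8100 or ethertype != 0x0800:
        raise ValueError("expected a VLAN-tagged IPv4 packet")
    vid = tci & 0x0FFF                      # low 12 bits of the tag
    ip_off = 18                             # Ethernet (14) + VLAN tag (4)
    ihl = (pkt[ip_off] & 0x0F) * 4          # IPv4 header length in bytes
    proto = pkt[ip_off + 9]
    if proto not in (6, 17):
        raise ValueError("expected TCP or UDP")
    l4_off = ip_off + ihl
    sport, dport = struct.unpack_from("!HH", pkt, l4_off)
    # UDP headers are 8 bytes; TCP length comes from the data-offset field.
    l4_len = 8 if proto == 17 else (pkt[l4_off + 12] >> 4) * 4
    common = {"vid": vid, "proto": proto, "sport": sport, "dport": dport}
    return common, l4_off + l4_len


def parse_t1(payload: bytes):
    """Hypothetical tenant header: 1-byte opcode followed by a 4-byte key."""
    op, key = struct.unpack_from("!BI", payload, 0)
    return {"op": op, "key": key}


# Hypothetical registry: VID -> tenant parser.
TENANT_PARSERS = {1: parse_t1}


def parse(pkt: bytes):
    """Apply the system parser, then the tenant parser selected by VID."""
    common, off = parse_common(pkt)
    tenant_parser = TENANT_PARSERS.get(common["vid"])
    tenant = tenant_parser(pkt[off:]) if tenant_parser else None
    return common, tenant
```

Packets whose VID has no registered tenant parser yield only the common headers, which mirrors the idea that the system program always runs first and tenant logic is gated on the tag.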

  12. Control (ingress and egress) pipeline
      - Feed-forward packet flow
      - "Sandwich" architecture
        - Write-then-read half
        - Read-then-write half
      - System program
        - Interacts with tenant programs
          - E.g., passes system states
        - Converts virtual addresses to physical ones
      [Figure: control pipeline along the packet flow. Pass system states to tenants; if (tag == T1's VID) apply T1's ctrl; …; convert to system states. Two boxes labelled System states list { egress_port, … } and { link utilization, packet count, … }.]

  13. Run-time memory allocator
      - Page-table-like indirection
        - Match VID == 1 (Tenant 1): action metadata.offset = 0, metadata.amount = 2^6
        - Match VID == 2 (Tenant 2): action metadata.offset = 512, metadata.amount = 2^4
        - …
      - Address translation: pkt.physical_address = metadata.offset + (pkt.virtual_address % metadata.amount)
      [Figure: the control-plane memory allocator installs the match-action entries above; the register arrays behind Config Params, Counter, and Record are each One Big Array shared by all tenants.]
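
The translation formula on this slide can be modelled in a few lines. This is a minimal sketch, not the prototype's code: `SharedRegisterArray`, `Region`, and `set_region` are hypothetical names, the overlap check is an added assumption about how the control plane might keep regions disjoint, and the example sizes assume the slide's amounts are powers of two.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    offset: int  # first physical slot owned by the tenant
    amount: int  # number of slots owned by the tenant


class SharedRegisterArray:
    """One big array shared by all tenants (illustrative model only)."""

    def __init__(self, size: int):
        self.cells = [0] * size
        self.regions = {}  # VID -> Region, playing the role of a page table

    def set_region(self, vid: int, offset: int, amount: int) -> None:
        """Control-plane step: install or update a tenant's table entry."""
        if amount <= 0 or offset < 0 or offset + amount > len(self.cells):
            raise ValueError("region does not fit in the array")
        for other_vid, r in self.regions.items():
            overlaps = offset < r.offset + r.amount and r.offset < offset + amount
            if other_vid != vid and overlaps:
                raise ValueError("region overlaps another tenant")
        self.regions[vid] = Region(offset, amount)

    def translate(self, vid: int, virtual_address: int) -> int:
        """physical = offset + (virtual % amount), as on the slide."""
        region = self.regions[vid]
        return region.offset + (virtual_address % region.amount)

    def read(self, vid: int, virtual_address: int) -> int:
        return self.cells[self.translate(vid, virtual_address)]

    def write(self, vid: int, virtual_address: int, value: int) -> None:
        self.cells[self.translate(vid, virtual_address)] = value
```

Because the modulo wraps every virtual address back into the tenant's own region, a tenant cannot reach another tenant's slots regardless of the address it computes; resizing a tenant only requires rewriting its one table entry.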

  14. Implementation
      - Prototype on Barefoot Tofino switch
      - Compile-time linker
        - Extends the open-source P4 compiler [1]
      - Run-time memory allocator
        - Based on auto-generated APIs to pull records and modify table entries
      [1] https://github.com/p4lang/p4c

  15. Compile-time program linker correctness
      - Resource usage on Tofino
      - Packet-level validation on PTF
      - Sys program
        - Basic parsing and forwarding logic
      - [SOSP'17] NetCache
      - [NSDI'18] NetChain
      - Overhead
        - Additional gateway tables to check which program is to be executed
        - Additional tag-along PHV containers
      [Figure: bar chart of resource usage (% of total, axis from 0 to 150) per hardware resource type, comparing Merged program, Sys program, NetCache, and NetChain.]

  16. Run-time memory allocator efficiency
      - Experimental setting
        - 64 tenants submit 1-min heavy-hitter detection tasks against source IP addresses within their /6 subnets
        - 10-min CAIDA trace replay
      - Evaluation metrics
        - Utility: memory hit ratio
        - Satisfaction: fraction of time with utility > 0.9
        - We show the mean and 5th percentile
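
The satisfaction metric defined on this slide can be written out as follows. This is a sketch under assumptions not stated in the slides: utility is taken to be sampled at equal time intervals, the mean and 5th percentile are assumed to be taken across tenants, and the percentile uses the nearest-rank method. The function names are hypothetical.

```python
import math
import statistics


def satisfaction(utilities, threshold=0.9):
    """Fraction of equally spaced time samples whose utility exceeds threshold."""
    if not utilities:
        raise ValueError("need at least one utility sample")
    return sum(u > threshold for u in utilities) / len(utilities)


def percentile(values, q):
    """Nearest-rank percentile for q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(per_tenant_utilities):
    """Mean and 5th-percentile satisfaction across tenants.

    per_tenant_utilities maps a tenant id to its list of utility samples.
    """
    sats = [satisfaction(u) for u in per_tenant_utilities.values()]
    return {"mean": statistics.fmean(sats), "p5": percentile(sats, 5)}
```

Reporting the 5th percentile alongside the mean shows how the worst-served tenants fare, which the mean alone would hide.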

  17. Conclusion
      - Takeaways
        - A hybrid solution for multitenancy support
        - Compile-time linker: general but static
        - Run-time memory allocator: dynamic but limited
      - Future work
        - Seek a new hardware design
        - Both general and dynamic

  18. Thanks! Happy to take questions: tw1921@nyu.edu
