
Fido: Fast Inter-Virtual-Machine Communication for Enterprise Appliances
Anton Burtsev, Kiran Srinivasan, Prashanth Radhakrishnan, Lakshmi N. Bairavasundaram, Kaladhar Voruganti, Garth R. Goodson
NetApp, Inc. and University of Utah, School of Computing


  1. Fido: Fast Inter-Virtual-Machine Communication for Enterprise Appliances
     Anton Burtsev†, Kiran Srinivasan, Prashanth Radhakrishnan, Lakshmi N. Bairavasundaram, Kaladhar Voruganti, Garth R. Goodson
     NetApp, Inc.; †University of Utah, School of Computing

  2. Enterprise appliances
     Network-attached storage, routers, etc.
     • High performance
     • Scalable and highly-available access

  3. Example Appliance
     • Monolithic kernel
     • Kernel components
     Problems:
     • Fault isolation
     • Performance isolation
     • Resource provisioning

  4. Split architecture

  5. Benefits of virtualization
     • High availability
       • Fault isolation
       • Micro-reboots
       • Partial functionality in case of failure
     • Performance isolation
     • Resource allocation
       • Consolidation and load balancing, VM migration
     • Non-disruptive updates
       • Hardware upgrades via VM migration
       • Software updates as micro-reboots
     • Computation to data migration

  6. Main Problem: Performance
     Is it possible to match the performance of a monolithic environment?
     • Large amount of data movement between components
       • Mostly cross-core
       • Connection-oriented (established once)
       • Throughput-optimized (asynchronous)
       • Coarse-grained (no one-word messages)
     • Multi-stage data processing
     • Main cost contributors
       • Transitions to the hypervisor
       • Memory map/copy operations
       • Not VM context switches (multi-core)
       • Not IPC marshaling

  7. Main Insight: Relaxed Trust Model
     • Appliance is built by a single organization
     • Components:
       • Pre-tested and qualified
       • Collaborative and non-malicious
     • Share memory read-only across VMs!
     • Fast inter-VM communication
       • Exchange only pointers to data
       • No hypervisor calls (only cross-core notification)
       • No memory map/copy operations
       • Zero-copy across entire appliance

  8. Contributions
     • Fast inter-VM communication mechanism
     • Abstraction of a single address space for traditional systems
     • Case study
       • Realistic microkernelized network-attached storage

  9. Design

  10. Design Goals
      • Performance
        • High throughput
      • Practicality
        • Minimal guest system and hypervisor dependencies
        • No intrusive guest kernel changes
      • Generality
        • Support for different communication mechanisms in the guest system

  11. Transitive Zero Copy
      • Goal
        • Zero-copy across entire appliance
        • No changes to guest kernel
      • Observation
        • Multi-stage data processing

  12. Pseudo Global Virtual Address Space
      [Figure: the 0 to 2^64 virtual address range]
      Insight:
      • CPUs support a 64-bit address space
      • Individual VMs do not need all of it

  13. Pseudo Global Virtual Address Space
      [Figure: the 0 to 2^64 virtual address range]

  14. Pseudo Global Virtual Address Space
      [Figure: the 0 to 2^64 virtual address range]
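
The idea on slides 12–14 can be illustrated with a few lines of C: carve the 64-bit range into per-VM slices so that a pointer into one VM's slice stays valid, unmodified, in every VM that maps that slice. This is only a sketch; the 1 TB slice size and the use of the VM ID as a slice index are assumptions for illustration, not values taken from Fido.

    /*
     * Sketch of a pseudo global virtual address space.
     * Slice size and layout are illustrative assumptions.
     */
    #include <stdint.h>
    #include <stdio.h>

    #define SLICE_BITS 40ULL                      /* 1 TB per VM (assumed) */
    #define SLICE_SIZE (1ULL << SLICE_BITS)

    /* Base of the slice that VM `vm_id` allocates its shared buffers from. */
    static uint64_t slice_base(unsigned vm_id)
    {
        return (uint64_t)vm_id * SLICE_SIZE;
    }

    int main(void)
    {
        /* A buffer allocated by VM 2 at offset 0x1000 inside its own slice. */
        uint64_t addr = slice_base(2) + 0x1000;

        /*
         * Because every VM maps slice 2 (read-only) at the same virtual
         * address, this value can be handed to any other VM as-is:
         * no translation, no copy.
         */
        printf("buffer address, valid in every VM: %#llx\n",
               (unsigned long long)addr);
        return 0;
    }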

  15. Transitive Zero Copy

  16. Fido: High-level View

  17. Fido: High-level View
      • “c” – connection management
      • “m” – memory mapping
      • “s” – cross-VM signaling
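
As a rough illustration of how the three facilities fit together, here is a hypothetical C interface with trivial stubs. All names, signatures, and print statements are invented for this sketch; they are not Fido's actual API.

    #include <stdio.h>

    struct fido_conn { unsigned peer_vm_id; };   /* opaque in a real system */

    /* "c" - connection management: establish a long-lived link to a peer VM. */
    static int fido_connect(struct fido_conn *c, unsigned peer_vm_id)
    {
        c->peer_vm_id = peer_vm_id;
        printf("connected to VM %u\n", peer_vm_id);
        return 0;
    }

    /* "m" - memory mapping: map the peer's memory read-only into this VM's
     * view of the pseudo global virtual address space. */
    static int fido_map_peer(struct fido_conn *c)
    {
        printf("mapped VM %u read-only\n", c->peer_vm_id);
        return 0;
    }

    /* "s" - cross-VM signaling: notify the peer that new work has been queued
     * (a cross-core notification in the real system). */
    static int fido_signal(struct fido_conn *c)
    {
        printf("signaled VM %u\n", c->peer_vm_id);
        return 0;
    }

    int main(void)
    {
        struct fido_conn c;
        fido_connect(&c, 2);   /* c: connections are established once */
        fido_map_peer(&c);     /* m: set up the read-only mapping */
        fido_signal(&c);       /* s: notify after enqueuing descriptors */
        return 0;
    }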

  18. IPC Organization
      • Shared memory ring
      • Pointers to data

  19. IPC Organization
      • Shared memory ring
      • Pointers to data
      • For complex data structures
        • Scatter-gather array

  20. IPC Organization
      • Shared memory ring
      • Pointers to data
      • For complex data structures
        • Scatter-gather array
        • Translate pointers

  21. IPC Organization
      • Shared memory ring
      • Pointers to data
      • For complex data structures
        • Scatter-gather array
        • Translate pointers
      • Signaling:
        • Cross-core interrupts (event channels)
        • Batching and in-ring polling
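
A minimal sketch of the descriptor ring described on slides 18–21: the ring lives in shared memory and carries only scatter-gather pointers into the producer's read-only-shared buffers, never the data itself. Slot counts, field names, the memory-barrier choice, and the signal-when-idle policy are assumptions for illustration, not Fido's actual data structures.

    #include <stdbool.h>
    #include <stdint.h>

    #define RING_SLOTS 256        /* power of two (assumed) */
    #define MAX_SEGS   8          /* segments per scatter-gather array (assumed) */

    struct sg_entry {
        uint64_t addr;            /* pointer valid in the pseudo global
                                     virtual address space */
        uint32_t len;
    };

    struct ring_desc {            /* one message: pointers only, no data */
        uint32_t nsegs;
        struct sg_entry seg[MAX_SEGS];
    };

    struct ipc_ring {             /* resides in memory shared by both VMs */
        volatile uint32_t prod;   /* advanced by the producer VM */
        volatile uint32_t cons;   /* advanced by the consumer VM */
        struct ring_desc desc[RING_SLOTS];
    };

    /* Producer: publish a descriptor. Returns -1 if the ring is full, 1 if the
     * consumer looked idle and should get a cross-core notification, and 0 if
     * it is presumably still polling (so notifications are naturally batched). */
    static int ring_produce(struct ipc_ring *r, const struct ring_desc *d)
    {
        uint32_t p = r->prod;
        if (p - r->cons == RING_SLOTS)
            return -1;                        /* ring full; caller retries */
        r->desc[p % RING_SLOTS] = *d;
        __sync_synchronize();                 /* descriptor visible before index */
        r->prod = p + 1;
        return p == r->cons ? 1 : 0;          /* ring was empty: peer may be idle */
    }

    /* Consumer: dequeue the next descriptor, if any. */
    static bool ring_consume(struct ipc_ring *r, struct ring_desc *out)
    {
        uint32_t c = r->cons;
        if (c == r->prod)
            return false;                     /* empty: keep polling or block */
        *out = r->desc[c % RING_SLOTS];
        __sync_synchronize();
        r->cons = c + 1;
        return true;
    }

    int main(void)
    {
        static struct ipc_ring ring;          /* in reality: a shared mapping */
        struct ring_desc d = { .nsegs = 1 };
        d.seg[0].addr = 0x20000001000ULL;     /* address inside another VM's slice */
        d.seg[0].len  = 4096;

        int need_signal = ring_produce(&ring, &d);
        (void)need_signal;                    /* real code would raise an event here */

        struct ring_desc got;
        return ring_consume(&ring, &got) ? 0 : 1;
    }

Only the producer decides whether to raise a cross-core event, which is how batching and in-ring polling keep the notification rate (and thus hypervisor involvement) low.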

  22. Fast device-level communication
      • MMNet
        • Link-level
        • Standard network device interface
        • Supports full transitive zero-copy
      • MMBlk
        • Block-level
        • Standard block device interface
        • Zero-copy on write
        • Incurs one copy on read

  23. Evaluation

  24. MMNet Evaluation
      Compared configurations: Loop, NetFront, XenLoop, MMNet
      • AMD Opteron with 2 × 2.1 GHz 4-core CPUs (8 cores total)
      • 16 GB RAM
      • NVIDIA 1 Gbps NICs
      • 64-bit Xen (3.2), 64-bit Linux (2.6.18.8)
      • Netperf benchmark (2.4.4)

  25. MMNet: TCP Throughput
      [Chart: TCP throughput in Mbps vs. message size from 0.5 KB to 256 KB for Monolithic, Netfront, XenLoop, and MMNet]

  26. MMBlk Evaluation
      Compared configurations: Monolithic, XenBlk, MMBlk
      • Same hardware: AMD Opteron with 2 × 2.1 GHz 4-core CPUs (8 cores total), 16 GB RAM, NVIDIA 1 Gbps NICs
      • VMs are configured with 4 GB and 1 GB RAM
      • 3 GB in-memory file system (tmpfs)
      • IOZone benchmark

  27. MMBlk Sequential Writes
      [Chart: throughput in MB/s vs. record size from 4 KB to 4 MB for Monolithic, XenBlk, and MMBlk]

  28. Case Study

  29. Network-attached Storage

  30. Network-attached Storage
      • RAM
        • VMs have 1 GB each, except the FS VM (4 GB)
        • Monolithic system has 7 GB RAM
      • Disks
        • RAID-5 over 3 × 64 MB/s disks
      • Benchmark
        • IOZone reads/writes an 8 GB file over NFS (async)

  31. Sequential Writes
      [Chart: throughput in MB/s vs. record size from 4 KB to 4 MB for Monolithic, Native-Xen, and MM-Xen]

  32. Sequential Reads
      [Chart: throughput in MB/s vs. record size from 4 KB to 4 MB for Monolithic, Native-Xen, and MM-Xen]

  33. TPC-C (On-Line Transaction Processing)
      [Chart: transactions per minute (tpmC) for Monolithic, MM-Xen, and Native-Xen]

  34. Conclusions
      • We match monolithic performance
        • “Microkernelization” of traditional systems is possible!
      • Fast inter-VM communication
        • The search for VM communication mechanisms is not over
      • Important aspects of design
        • Trust model
        • VM as a library (for example, FSVA)
        • End-to-end zero copy
        • Pseudo Global Virtual Address Space
      • There are still problems to solve
        • Full end-to-end zero copy
        • Cross-VM memory management
        • Full utilization of pipelined parallelism

  35. Thank you. aburtsev@flux.utah.edu

  36. Backup Slides

  37. Related Work
      • Traditional microkernels [L4, EROS, Coyotos]
        • Synchronous (effectively thread migration)
        • Optimized for a single CPU: fast context switch, small messages (often in registers), efficient marshaling (IDL)
      • Buffer management [Fbufs, IO-Lite, Beltway Buffers]
        • Shared buffer is a unit of protection
        • FastForward – fast cache-to-cache data transfer
      • VMs [Xen split drivers, XWay, XenSocket, XenLoop]
        • Page flipping, later buffer sharing
        • IVC, VMCI
      • Language-based protection [Singularity]
        • Shared heap, zero-copy (only pointer transfer)
      • Hardware acceleration [Solarflare]
      • Multi-core OSes [Barrelfish, Corey, FOS]
