
Arrakis: The Operating System is the Control Plane



  1. Arrakis: The Operating System is the Control Plane
     Simon Peter, Jialin Li, Irene Zhang, Dan Ports, Doug Woos, Arvind Krishnamurthy, Tom Anderson (University of Washington); Timothy Roscoe (ETH Zurich)

  2. Building an OS for the Data Center
     • Server I/O performance matters
       • Key-value stores, web & file servers, lock managers, …
     • Can we deliver performance close to hardware?
     • Example system: Dell PowerEdge R520 ($1,200)
       • Intel Sandy Bridge CPU: 6 cores, 2.2 GHz
       • Intel X520 10G NIC: 2 us / 1KB packet
       • Intel RS3 RAID, 1GB flash-backed cache: 25 us / 1KB write

  3. Building an OS for the Data Center (same slide, with callout): Today's I/O devices are fast

  4. Can’t we just use Linux?

  5. Linux I/O Performance
     % of 1KB request time spent (Redis):
     • GET (9 us): HW 18%, Kernel 62%, App 20%
     • SET (163 us): HW 13%, Kernel 84%, App 3%
     [Diagram: the data path runs from the Redis app through a kernel layer (API, multiplexing, naming, resource limits, access control, I/O scheduling, I/O processing, copying, protection) to the 10G NIC (2 us / 1KB packet) and RAID storage (25 us / 1KB write).]

  6. Linux I/O Performance (same slide, with callout): Kernel mediation is too heavyweight
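The percentages on this slide can be turned into absolute times to show how much of each request is spent in the kernel. The following is a small arithmetic sketch using only the figures from the slide; the names `BREAKDOWN` and `absolute_times` are made up for illustration.

```python
# Figures from the slide: total request time (us) and percentage split.
BREAKDOWN = {
    "GET": {"total_us": 9, "shares": {"HW": 18, "Kernel": 62, "App": 20}},
    "SET": {"total_us": 163, "shares": {"HW": 13, "Kernel": 84, "App": 3}},
}


def absolute_times(total_us, shares):
    """Convert percentage shares of a request into microseconds."""
    return {part: total_us * pct / 100 for part, pct in shares.items()}


for op, data in BREAKDOWN.items():
    times = absolute_times(data["total_us"], data["shares"])
    summary = ", ".join(f"{part} {us:.1f} us" for part, us in times.items())
    print(f"{op}: {summary}")
```

With these inputs the kernel accounts for about 5.6 us of a GET and about 137 us of a SET, while the device figures quoted on the slide are 2 us per 1KB packet and 25 us per 1KB write.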

  7. Arrakis Goals
     • Skip kernel & deliver I/O directly to applications
       • Reduce OS overhead
     • Keep classical server OS features
       • Process protection
       • Resource limits
       • I/O protocol flexibility
       • Global naming
     • The hardware can help us…

  8. Hardware I/O Virtualization
     • Standard on NIC, emerging on RAID
     • Multiplexing
       • SR-IOV: Virtual PCI devices w/ own registers, queues, INTs
     • Protection
       • IOMMU: Devices use app virtual memory
       • Packet filters, logical disks: Only allow eligible I/O
     • I/O Scheduling
       • NIC rate limiter, packet schedulers
     [Diagram labels: SR-IOV NIC, user-level VNIC 1 and VNIC 2, rate limiters, packet filters, network.]

  9. How to skip the kernel?
     [Diagram: two Redis instances above a kernel layer labeled API, multiplexing, naming, resource limits, access control, I/O scheduling, I/O processing, copying and protection. The data path runs through this layer to the I/O devices.]

  10. How to skip the kernel?
      [Same diagram: multiplexing and I/O scheduling are now drawn at the I/O devices.]

  11. How to skip the kernel?
      [Same diagram: API and I/O processing are now drawn with Redis.]

  12. How to skip the kernel?
      [Same diagram: copying no longer appears on the data path.]

  13. Arrakis I/O Architecture
      [Diagram: the system is split into a control plane and a data plane. Control plane: kernel with naming, access control and resource limits. Data plane: Redis instances with API and I/O processing, and a data path to the I/O devices, which are labeled protection, multiplexing and I/O scheduling.]

  14. Arrakis I/O Architecture (slide repeated)

  15. Arrakis I/O Architecture
      [Same diagram, with the kernel components (naming, access control, resource limits) emphasized.]

  16. Arrakis Control Plane
      • Access control
        • Do once when configuring data plane
        • Enforced via NIC filters, logical disks
      • Resource limits
        • Program hardware I/O schedulers
      • Global naming
        • Virtual file system still in kernel
        • Storage implementation in applications
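The split described on this slide (check access once at configuration time, then let the hardware enforce it on every packet) can be sketched as a toy model. Everything below is hypothetical and is not the Arrakis API: `ControlPlane`, `VirtualNic`, the port-based access list and the rate-limit field are assumptions made for illustration, and the rate limit is only recorded, not enforced.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualNic:
    """Hypothetical stand-in for one virtual NIC handed to an application."""
    owner: str
    allowed_ports: frozenset   # packet filter installed by the control plane
    rate_limit_pps: int        # rate limit recorded by the control plane
    rx_queue: list = field(default_factory=list)

    def deliver(self, dst_port, payload):
        """Data path: only the filter is consulted, never the control plane."""
        if dst_port not in self.allowed_ports:
            return False       # ineligible I/O is dropped by the filter
        self.rx_queue.append(payload)
        return True


class ControlPlane:
    """Hypothetical kernel-side control plane: runs once, at setup time."""

    def __init__(self, port_acl):
        self.port_acl = port_acl   # app name -> set of ports it may use

    def create_vnic(self, app, ports, rate_limit_pps):
        # Access control is checked here, once, not on every packet.
        if not set(ports) <= self.port_acl.get(app, set()):
            raise PermissionError(f"{app} may not use ports {sorted(ports)}")
        return VirtualNic(app, frozenset(ports), rate_limit_pps)


kernel = ControlPlane({"redis": {6379}})
vnic = kernel.create_vnic("redis", [6379], rate_limit_pps=1_000_000)
print(vnic.deliver(6379, b"GET key"))   # accepted by the filter
print(vnic.deliver(22, b"probe"))       # rejected by the filter
```

The point of the sketch is that `create_vnic` is the only place where the access list is read; `deliver` relies purely on state that was installed at configuration time.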

  17. Global Naming
      [Diagram labels: Redis, fast HW ops, a virtual storage area holding /tmp/lockfile, /var/lib/key_value.db, /etc/config.rc, …, the logical disk, and the kernel VFS.]

  18. Global Naming
      [Same diagram: a second application, emacs, is added.]

  19. Global Naming
      [Same diagram: emacs calls open("/etc/config.rc").]

  20. Global Naming
      [Same diagram: emacs reaches the file through an indirect IPC interface.]
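One way to read the Global Naming slides is that the kernel keeps only the name space, while the application that owns a virtual storage area serves the contents to other processes over IPC. The toy model below sketches that reading; `StorageApp`, `KernelVFS`, `export`, `ipc_read` and the file contents are all hypothetical names and values invented for illustration, and the "IPC" is a plain method call.

```python
class StorageApp:
    """Hypothetical application that owns a virtual storage area."""

    def __init__(self, name):
        self.name = name
        self.files = {}            # path -> bytes, kept in the app's own layout

    def write(self, path, data):
        self.files[path] = data    # fast path: the app accesses its own storage

    def ipc_read(self, path):
        """Indirect path used by other processes, reached via the kernel VFS."""
        return self.files[path]


class KernelVFS:
    """Hypothetical kernel name space: maps global names to owning apps."""

    def __init__(self):
        self.owners = {}

    def export(self, path, app):
        self.owners[path] = app

    def open_and_read(self, path):
        # The kernel only resolves the name; the owner serves the contents.
        return self.owners[path].ipc_read(path)


vfs = KernelVFS()
redis = StorageApp("redis")
redis.write("/etc/config.rc", b"example contents\n")
vfs.export("/etc/config.rc", redis)
print(vfs.open_and_read("/etc/config.rc"))   # what another app, e.g. an editor, would read
```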

  21. Arrakis I/O Architecture (slide repeated)

  22. Arrakis I/O Architecture (slide repeated)

  23. Arrakis I/O Architecture
      [Same diagram, with the application-side components (Redis, API, I/O processing) emphasized.]

  24. Storage Data Plane: Persistent Data Structures
      • Examples: log, queue
      • Operations immediately persistent on disk
      Benefits:
      • In-memory = on-disk layout
        • Eliminates marshaling
      • Metadata in data structure
        • Early allocation
        • Spatial locality
      • Data structure specific caching/prefetching
      • Modified Redis to use persistent log: 109 LOC changed
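To make the idea of a persistent log concrete, here is a minimal sketch of an append-only log whose appends are flushed to disk before they return and whose metadata (the record length) is stored inline with the data. It is an assumption-laden illustration, not the Arrakis storage library: it goes through the ordinary OS file API rather than direct hardware access, and `PersistentLog` and its methods are hypothetical names.

```python
import os
import struct
import tempfile

HEADER = struct.Struct("<I")   # 4-byte little-endian length prefix per record


class PersistentLog:
    """Toy append-only log; each append is synced to disk before returning."""

    def __init__(self, path):
        # "a+b": writes always go to the end of the file, reads are allowed.
        self.f = open(path, "a+b")

    def append(self, payload: bytes):
        # Metadata (the length) lives inside the structure, next to the data.
        self.f.write(HEADER.pack(len(payload)) + payload)
        self.f.flush()
        os.fsync(self.f.fileno())   # persistent once this call returns

    def entries(self):
        """Read all records back by walking the length prefixes."""
        self.f.seek(0)
        data = self.f.read()
        offset, out = 0, []
        while offset < len(data):
            (length,) = HEADER.unpack_from(data, offset)
            offset += HEADER.size
            out.append(data[offset:offset + length])
            offset += length
        return out

    def close(self):
        self.f.close()


path = os.path.join(tempfile.mkdtemp(), "example.log")
log = PersistentLog(path)
log.append(b"SET k1 v1")
log.append(b"SET k2 v2")
print(log.entries())
log.close()
```

Because records are laid out sequentially in the order they were appended, reading the log back is a single linear scan, which is the kind of spatial locality the slide lists as a benefit.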

  25. Evaluation

  26. Redis Latency
      • Reduced (in-memory) GET latency by 65%
        • Linux: 9 us (HW 18%, Kernel 62%, App 20%)
        • Arrakis: 4 us (HW 33%, libIO 35%, App 32%)
      • Reduced (persistent) SET latency by 81%
        • Linux (ext4): 163 us (HW 13%, Kernel 84%, App 3%)
        • Arrakis: 31 us (HW 77%, libIO 7%, App 15%)

  27. Redis Throughput
      • Improved GET throughput by 1.75x
        • Linux: 143k transactions/s
        • Arrakis: 250k transactions/s
      • Improved SET throughput by 9x
        • Linux: 7k transactions/s
        • Arrakis: 63k transactions/s
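The speedup factors on this slide follow directly from the quoted transaction rates. A short check, using only the numbers from the slide (the names `RESULTS` and `speedup` are illustrative):

```python
# Throughput in thousands of transactions per second, as quoted on the slide.
RESULTS = {
    "GET": {"linux_ktps": 143, "arrakis_ktps": 250},
    "SET": {"linux_ktps": 7, "arrakis_ktps": 63},
}


def speedup(linux, arrakis):
    """Ratio of Arrakis throughput to Linux throughput."""
    return arrakis / linux


for op, r in RESULTS.items():
    print(f"{op}: {speedup(r['linux_ktps'], r['arrakis_ktps']):.2f}x")
# prints:
# GET: 1.75x
# SET: 9.00x
```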

  28. memcached Scalability
      [Chart: throughput (k transactions/s, axis 0 to 1200) against number of CPU cores (1, 2, 4) for Linux and Arrakis. Annotated speedups: 1.8x, 2x and 3.1x. The 10Gb/s interface limit is marked.]

  29. Single-core Performance
      [Chart: UDP echo benchmark, throughput (k packets/s, axis 0 to 1200). Bars: Linux (1x), Arrakis/POSIX (2.3x), Arrakis/Zero-copy (3.4x), Driver (3.6x). The 10Gb/s interface limit is marked.]

  30. Summary
      • OS is becoming an I/O bottleneck
        • Globally shared I/O stacks are slow on data path
      • Arrakis: Split OS into control/data plane
        • Direct application I/O on data path
        • Specialized I/O libraries
      • Application-level I/O stacks deliver great performance
        • Redis: up to 9x throughput, 81% speedup
        • Memcached scales linearly to 3x throughput
      Source code: http://arrakis.cs.washington.edu
