Pushing the Limits of Kernel Networking
Networking Services Team, Red Hat
Alexander Duyck
August 19th, 2015
Agenda
● Identifying the Limits
● Memory Locality Effect
● Death by Interrupts
● Flow Control and Buffer Bloat
● DMA Delay
● Performance
● Synchronization Slow Down
● The Cost of MMIO
● Memory Alignment, Memcpy, and Memset
● How the FIB Can Hurt Performance
● What more can be done?
Identifying the Limits
● With 60B frames, achieving line rate is difficult
  ● Only 24B of additional overhead per frame (preamble, FCS, inter-frame gap)
  ● 10Gb/s × 125MB/Gb ÷ 84B/frame = 14.88Mpps, or 67.2ns per packet
● L3 cache latency on Ivy Bridge is about 30 cycles
  ● Each nanosecond an E5-2690 will process 2.6 cycles
  ● 30 cycles ÷ 2.6 cycles/ns ≈ 12ns
● To achieve line rate at 10G we need to do two things
  ● Lower processing time
  ● Improve scalability
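As a sanity check, the per-packet budget works out with a bit of shell arithmetic (a back-of-the-envelope sketch; 84B is the 60B frame plus the 24B of preamble, FCS, and inter-frame gap):

    echo '10^10 / 8 / 84' | bc          # bytes/s over wire bytes/frame: ~14880952 pps
    echo 'scale=1; 84 * 8 / 10' | bc    # 672 bits per frame at 10Gb/s: 67.2 ns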
Memory Locality Effect
● NUMA – Non-Uniform Memory Access
Memory Locality Effect
● DDIO – Data Direct I/O
  ● Xeon E5 26xx feature
  ● Local socket only
  ● No need for memory access
● XPS – Transmit Packet Steering
  ● Transmit packets on local CPU:

    echo 01 > /sys/class/net/enp5s0f0/queues/tx-0/xps_cpus
    echo 02 > /sys/class/net/enp5s0f0/queues/tx-1/xps_cpus
    echo 04 > /sys/class/net/enp5s0f0/queues/tx-2/xps_cpus
    echo 08 > /sys/class/net/enp5s0f0/queues/tx-3/xps_cpus
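The same mapping can be scripted for any queue count; a minimal sketch assuming one Tx queue per CPU and fewer than 32 CPUs:

    for q in /sys/class/net/enp5s0f0/queues/tx-*; do
        n=${q##*tx-}                              # queue index from the sysfs path
        printf '%x' $((1 << n)) > "$q/xps_cpus"   # CPU mask with only bit n set
    done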
Death by Interrupts
● Interrupts can change location based on irqbalance
● Too low of an interrupt rate
  ● Overruns ring buffers on the device
  ● Adds unnecessary latency
  ● Overruns socket memory if NAPI shares the CPU
● Too high of an interrupt rate
  ● Frequent context switches
  ● Frequent wake-ups
● Interrupt moderation schemes often tuned for benchmarks instead of real workloads (see the sketch below)
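Moderation can be retuned per device with ethtool -C; the value below is an illustrative starting point rather than a recommendation from the talk, and the supported options vary by driver:

    ethtool -c enp5s0f0              # show the current coalescing settings
    ethtool -C enp5s0f0 rx-usecs 50  # wait up to ~50us before raising an Rx interrupt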
Flow Control and Buffer Bloat
● Flow control can significantly harm performance
  ● Adds additional buffering, adding extra latency
  ● Creates head-of-line blocking, which limits throughput
  ● Faster queues drop packets while waiting on the slowest CPU
● Some NICs implement per-queue drop when flow control is disabled
● Disabling it requires just one line of ethtool:

    ethtool -A enp5s0f0 tx off rx off autoneg off
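To confirm pause frames are off and watch for the per-queue drops that replace them (statistic names vary by driver):

    ethtool -a enp5s0f0                 # show the current pause settings
    ethtool -S enp5s0f0 | grep -i drop  # per-queue drop counters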
DMA Delay
● The IOMMU can add security, but at significant overhead
  ● Resource allocation/free requires a lock
  ● Hardware access required to add/remove resources
● If you don't need it, you can turn it off:  intel_iommu=off
● If you only need it for virtualization (KVM/Xen):  iommu=pt
● Some drivers include mitigation strategies
  ● Page reuse
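On a RHEL-style system the boot parameter can be made persistent with grubby (a sketch; adjust for your bootloader, and only after confirming nothing else needs the IOMMU):

    grubby --update-kernel=ALL --args="intel_iommu=off"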
Performance Data Ahead!!!
● Single socket Xeon E5-2690
● Dual port 82599ES
  ● Assigned addresses 192.168.100.64 & 192.168.101.64
● Disabled flow control
● Pinned IRQs 1:1
● Used ntuple filters to force flows to specific queues (sketched below)
● CPU C-states disabled via /dev/cpu_dma_latency
● Traffic generator sent IP data w/ round-robin source address
  ● Each frame sent 4 times before moving to the next address
● Your Experience May Vary
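A minimal sketch of that pinning and steering; the IRQ numbers and queue index are hypothetical and must be read from /proc/interrupts on the actual system:

    echo 1 > /proc/irq/34/smp_affinity   # queue 0 IRQ -> CPU 0 (IRQ numbers are examples)
    echo 2 > /proc/irq/35/smp_affinity   # queue 1 IRQ -> CPU 1
    ethtool -K enp5s0f0 ntuple on        # enable ntuple filtering
    ethtool -N enp5s0f0 flow-type udp4 dst-ip 192.168.100.64 action 0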
Routing Performance
[Chart: packets per second vs. number of threads (1–12), RHEL 7.1]
Synchronization Slow Down
● Synchronization primitives come at a heavy cost
  ● local_irq_save/restore costs 10s of ns
    ● Not needed when all requests are in the same context
  ● rmb/wmb flush pipelines, which adds delay
    ● Needed for some architectures but not others
● Kernel updated to remove unnecessary bits in 3.19
  ● NAPI allocator for page fragments and skbs
  ● dma_rmb/dma_wmb for DMA memory ordering
The Cost of MMIO
● The MMIO write to notify the device can cost hundreds of ns
  ● Latency shows up as either Qdisc lock or Tx queue unlock overhead
● xmit_more was added in the 3.18 kernel to address this
  ● Reduces MMIO writes to the device
  ● Reduces locking overhead per packet
  ● Reduces interrupt rates as packets are coalesced
  ● Allows for 10Gb/s line rate with 60B packets w/ pktgen (see the sketch below)
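A minimal pktgen sketch that exercises xmit_more via its burst option (device name, thread binding, and destination are assumptions; dst_mac and friends are omitted — see Documentation/networking/pktgen.txt):

    modprobe pktgen
    echo "add_device enp5s0f0@0" > /proc/net/pktgen/kpktgend_0
    echo "pkt_size 60"  > /proc/net/pktgen/enp5s0f0@0
    echo "count 0"      > /proc/net/pktgen/enp5s0f0@0   # 0 = run until stopped
    echo "burst 32"     > /proc/net/pktgen/enp5s0f0@0   # batch frames, defer the MMIO tail write
    echo "dst 192.168.100.64" > /proc/net/pktgen/enp5s0f0@0
    echo "start" > /proc/net/pktgen/pgctrl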
Memory Alignment, Memcpy, and Memset
● Partial cache-line writes come at a cost
  ● Most architectures now start with NET_IP_ALIGN = 0
  ● On x86, partial writes trigger a read-modify-write cycle
● String ops change implementation based on CPU flags
  ● erms and rep_good can have an impact on performance
  ● KVM doesn't copy CPU flags by default
● tx-nocache-copy
  ● Enables use of movntq for user-to-kernel-space copies
  ● Enabled by default for kernels 3.0 – 3.13
  ● Prevents use of features such as DDIO

    ethtool -K enp5s0f0 tx-nocache-copy off
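To verify the offload state and check which string-op flags the (possibly virtualized) CPU advertises:

    ethtool -k enp5s0f0 | grep tx-nocache-copy        # confirm the offload is off
    grep -oE 'erms|rep_good' /proc/cpuinfo | sort -u  # flags visible to this kernel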
How the FIB Can Hurt Performance
● Starting with kernel 4.0, fib_trie was rewritten
  ● FIB statistics were made per-CPU instead of global
  ● Penalty for trie depth significantly reduced
  ● Kernel 4.1 merged the local and main tries for further gains
● Recommendations for kernels prior to 4.0 (the trie can be inspected as shown below)
  ● Disable CONFIG_IP_FIB_TRIE_STATS in the kernel config
  ● Avoid assigning addresses such as 192.168.122.1
    ● IPs in the range 192.168.122.64 – 191 can reduce depth by 1
  ● Use class A reserved addresses to reduce the trie walk
    ● 10.x.x.x will likely contain fewer bits than 192.168.x.x
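The trie depth for a given address layout can be inspected via procfs (on pre-4.1 kernels the local and main tries are listed separately):

    cat /proc/net/fib_triestat   # aggregate depth and node statistics
    cat /proc/net/fib_trie       # full dump of the tries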
Routing Performance
[Chart: packets per second vs. number of threads (1–12), RHEL 7.1 vs. RHEL 7.2]
What More Can Be Done?
● SLAB/SLUB bulk allocation
  ● https://lwn.net/Articles/648211/
● Tuning interrupt moderation to work in more cases
● Pktgen with 60B packets
● Explore optimizing users of memset()/memcpy()
  ● build_skb()
● Find a way to better use xmit_more on small packets
● Explore shortening Tx/Rx queue lengths
Routing Performance
[Chart: packets per second vs. number of threads (1–12), RHEL 7.1 vs. RHEL 7.2 vs. tweaked 7.2]
Questions?
● Alexander Duyck
● alexander.h.duyck@redhat.com
● AlexanderDuyck@gmail.com