Analyzing DPDK applications with eBPF
Sharpening the toolset
Stephen Hemminger, Microsoft
FOSDEM, February 1, 2020
Table of Contents
• Introduction
• Packet Capture
• Tracing
  • LTTng
  • bpftrace
• Performance
• Conclusion
Introduction
Ancient wisdom
French proverb: Mauvés ovriers ne trovera ja bon hostill
  (Bad workers will never find a good tool)
Chinese proverb: To do a good job, a craftsman must sharpen his tools.
Methodology
• Don't focus on a tool set
• Problem statement
• Workload characterization
• USE
  • Utilization
  • Saturation
  • Errors
See Linux tracing talks (Brendan Gregg et al.)
Capture vs Tracing
[Diagram. Capture: DPDK Application (receive, send), pdump, ring, tcpdump. Tracing: Tracer.]
Packet Capture
DPDK pdump
• Packet copied and queued to ring
• Secondary process sends to libpcap
• Packets recorded in pcap format
[Diagram labels: Traffic Generator, dpdk_port0, DPDK Primary Application, librte_pdump, dpdk-pdump tool, PCAP PMD, capture.pcap]
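The copy-and-queue step in the bullets above can be pictured with a plain C ring. This is only a sketch under stated assumptions: it does not use librte_pdump or rte_ring, the names `capture_ring`, `ring_enqueue` and `ring_dequeue` are hypothetical, and a single producer with a single consumer is assumed to keep it short.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RING_SLOTS 8u   /* must be a power of two */
#define SNAPLEN    128u /* bytes kept per copied packet */

struct pkt_copy {
    uint16_t len;            /* number of valid bytes in data[] */
    uint8_t  data[SNAPLEN];
};

/* Hypothetical single-producer/single-consumer ring (not rte_ring). */
struct capture_ring {
    _Atomic uint32_t head;   /* next slot the producer writes */
    _Atomic uint32_t tail;   /* next slot the consumer reads */
    struct pkt_copy  slot[RING_SLOTS];
};

/* Datapath side: copy the packet and queue the copy.
 * Returns 0 on success, -1 if the ring is full (the copy is dropped). */
static int ring_enqueue(struct capture_ring *r, const void *pkt, size_t len)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == RING_SLOTS)
        return -1;

    struct pkt_copy *c = &r->slot[head & (RING_SLOTS - 1)];
    c->len = (uint16_t)(len < SNAPLEN ? len : SNAPLEN);
    memcpy(c->data, pkt, c->len);

    /* Publish the slot only after the copy is complete. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return 0;
}

/* Capture side: take the oldest copy off the ring.
 * Returns 0 on success, -1 if the ring is empty. */
static int ring_dequeue(struct capture_ring *r, struct pkt_copy *out)
{
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (head == tail)
        return -1;

    *out = r->slot[tail & (RING_SLOTS - 1)];
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return 0;
}
```

Even in this toy form the cost is visible: every captured packet is copied once on the datapath, and a full ring means the copy is silently lost.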
Pdump limitations
• No metadata (vlan, offload, ...)
• Inaccurate timestamp
• No direction information
• Single port only
• No filtering
• Poor performance
PCAP Next Generation
• Nanosecond resolution timestamp
• System and Interface metadata
• Multiple interfaces
• Flags (direction, hash, ...)
• Comments
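To make the per-packet fields concrete, here is a rough sketch of writing one pcapng Enhanced Packet Block with a direction flag. It assumes the file is written in host byte order and that the Interface Description Block (not shown) sets if_tsresol to 9, so the 64-bit timestamp counts nanoseconds. The helper names `put16`, `put32` and `epb_write` are hypothetical and not part of any DPDK or libpcap API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PCAPNG_EPB_TYPE  0x00000006u /* Enhanced Packet Block */
#define PCAPNG_OPT_END   0u          /* opt_endofopt */
#define PCAPNG_EPB_FLAGS 2u          /* epb_flags option code */
#define EPB_DIR_INBOUND  0x1u        /* flags bits 0-1 = 01 */
#define EPB_DIR_OUTBOUND 0x2u        /* flags bits 0-1 = 10 */

static uint8_t *put32(uint8_t *p, uint32_t v) { memcpy(p, &v, 4); return p + 4; }
static uint8_t *put16(uint8_t *p, uint16_t v) { memcpy(p, &v, 2); return p + 2; }

/* Write one block into buf. Returns the block length, or 0 if it does not fit.
 * ts_ns is only nanoseconds if the interface declares if_tsresol = 9. */
static size_t epb_write(uint8_t *buf, size_t cap, uint32_t ifid, uint64_t ts_ns,
                        const void *pkt, uint32_t caplen, uint32_t origlen,
                        uint32_t flags)
{
    size_t padded = ((size_t)caplen + 3) & ~(size_t)3; /* data padded to 32 bits */
    size_t total = 28 + padded + 8 + 4 + 4; /* header, data, flags opt, end opt, trailer */
    uint8_t *p = buf;

    if (total > cap)
        return 0;

    p = put32(p, PCAPNG_EPB_TYPE);
    p = put32(p, (uint32_t)total);
    p = put32(p, ifid);                     /* which interface: multi-port capable */
    p = put32(p, (uint32_t)(ts_ns >> 32));  /* timestamp, upper 32 bits */
    p = put32(p, (uint32_t)ts_ns);          /* timestamp, lower 32 bits */
    p = put32(p, caplen);
    p = put32(p, origlen);
    memcpy(p, pkt, caplen);
    memset(p + caplen, 0, padded - caplen);
    p += padded;
    p = put16(p, PCAPNG_EPB_FLAGS);         /* option: epb_flags, 4 bytes */
    p = put16(p, 4);
    p = put32(p, flags);
    p = put16(p, PCAPNG_OPT_END);           /* option list terminator */
    p = put16(p, 0);
    p = put32(p, (uint32_t)total);          /* length repeated at the end */
    return (size_t)(p - buf);
}
```

For a 60-byte packet this produces a 104-byte block. A complete file would also need a Section Header Block and at least one Interface Description Block before the first packet.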
Packet filtering with libpcap
PCAP filter string: ip dst fosdem.org
cBPF program (6 insns):
    (000) ldh  [12]
    (001) jeq  #0x800       jt 2  jf 5
    (002) ld   [30]
    (003) jeq  #0x1f16168c  jt 4  jf 5
    (004) ret  #65535
    (005) ret  #0
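To see what the six-instruction program above does to a packet, it can be run by a toy interpreter. The sketch below is not libpcap or DPDK code: `cbpf_run` and `ip_dst_filter` are hypothetical names, and only the four opcodes this filter needs are handled. The opcode values and the relative jt/jf offsets follow the classic BPF encoding.

```c
#include <stddef.h>
#include <stdint.h>

struct cbpf_insn {
    uint16_t code;
    uint8_t  jt, jf;  /* relative jump offsets, as in classic BPF */
    uint32_t k;
};

#define OP_LDH_ABS 0x28 /* A = 16-bit big-endian load at offset k */
#define OP_LD_ABS  0x20 /* A = 32-bit big-endian load at offset k */
#define OP_JEQ_K   0x15 /* if (A == k) skip jt insns, else skip jf */
#define OP_RET_K   0x06 /* return k (bytes to keep, 0 = drop) */

/* "ip dst" filter from the slide; 0x1f16168c is 31.22.22.140. */
static const struct cbpf_insn ip_dst_filter[6] = {
    { OP_LDH_ABS, 0, 0, 12 },         /* (000) EtherType */
    { OP_JEQ_K,   0, 3, 0x0800 },     /* (001) IPv4? jt 2, jf 5 */
    { OP_LD_ABS,  0, 0, 30 },         /* (002) IPv4 destination address */
    { OP_JEQ_K,   0, 1, 0x1f16168c }, /* (003) match? jt 4, jf 5 */
    { OP_RET_K,   0, 0, 65535 },      /* (004) accept */
    { OP_RET_K,   0, 0, 0 },          /* (005) reject */
};

/* Returns the filter verdict; out-of-bounds loads reject the packet. */
static uint32_t cbpf_run(const struct cbpf_insn *prog, size_t n,
                         const uint8_t *pkt, size_t len)
{
    uint32_t a = 0; /* accumulator */

    for (size_t pc = 0; pc < n; pc++) {
        const struct cbpf_insn *i = &prog[pc];

        switch (i->code) {
        case OP_LDH_ABS:
            if ((size_t)i->k + 2 > len)
                return 0;
            a = (uint32_t)pkt[i->k] << 8 | pkt[i->k + 1];
            break;
        case OP_LD_ABS:
            if ((size_t)i->k + 4 > len)
                return 0;
            a = (uint32_t)pkt[i->k] << 24 | (uint32_t)pkt[i->k + 1] << 16 |
                (uint32_t)pkt[i->k + 2] << 8 | pkt[i->k + 3];
            break;
        case OP_JEQ_K:
            pc += (a == i->k) ? i->jt : i->jf;
            break;
        case OP_RET_K:
            return i->k;
        default:
            return 0; /* opcode not handled by this sketch */
        }
    }
    return 0;
}
```

Offsets 12 and 30 assume an untagged Ethernet frame with the IPv4 header immediately after the 14-byte Ethernet header.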
Packet filtering: cBPF translated to eBPF
eBPF program (11 insns):
    L0:  xor r0, r0
    L1:  xor r7, r7
    L2:  mov r6, r1
    L3:  ldh r0, [12]
    L4:  jne r0, #0x800, L9
    L5:  ldw r0, [30]
    L6:  jne r0, #0x1f16168c, L9
    L7:  mov32 r0, #0xffff
    L8:  exit
    L9:  mov32 r0, #0x1
    L10: exit
Tracing
Linux Trace toolkit
• Easy to use
• User Defined Trace Points
• Filtering
• Common Trace Format
• High performance
Adding LTTng tracepoint
    /* Send burst of packets on an output interface */
    static inline int
    send_burst(struct lcore_conf *qconf, uint16_t n, uint16_t port)
    {
        struct rte_mbuf **m_table = qconf->tx_mbufs[port].m_table;
        uint16_t queueid = qconf->tx_queue_id[port];
        int ret;

        ret = rte_eth_tx_burst(port, queueid, m_table, n);
        tracepoint(l3fwd, tx_burst, port, queueid, n, ret);
        if (unlikely(ret < n))
            rte_pktmbuf_free_bulk(&m_table[ret], n - ret);
        return 0;
    }
Using eBPF from userspace
• Origin: DTrace
• Adds NOP locations and ELF section
• Run code at tracepoint
• Prerequisites
  • uprobe: Linux 3.14 (or later) kernel
  • sys/sdt.h: systemtap-sdt-dev package
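A minimal sketch of a statically defined probe, assuming a Linux toolchain. `DTRACE_PROBE1` comes from sys/sdt.h; the provider name `demo` and the function `demo_poll` are made up for illustration. The fallback macro is an addition of this sketch so the file still builds where the header is not installed, in which case no probe is emitted.

```c
#include <stdint.h>

/* Use the real header when it is installed; otherwise compile the probe out. */
#if defined(__has_include)
#  if __has_include(<sys/sdt.h>)
#    include <sys/sdt.h>
#  endif
#endif
#ifndef DTRACE_PROBE1
#  define DTRACE_PROBE1(provider, name, arg1) ((void)(arg1))
#endif

/* Hypothetical poll step: the probe fires on every call and passes the
 * burst size as arg0 to whatever is attached to it. */
static uint16_t demo_poll(uint16_t nb_rx)
{
    DTRACE_PROBE1(demo, rx_burst, nb_rx);
    return nb_rx;
}
```

When the real header is used, the probe site is a NOP plus a note in the ELF file, so it costs very little while nothing is attached. The notes can be listed with `readelf -n` on the resulting binary.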
Adding DTRACE probes
    static void
    pkt_burst_receive(struct fwd_stream *fs)
    {
        struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
        uint16_t i, nb_rx;

        /* Receive a burst of packets. */
        nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue,
                                 pkts_burst, nb_pkt_per_burst);
        DTRACE_PROBE1(testpmd, rx_burst, nb_rx);
        if (unlikely(nb_rx == 0))
            return;
Looking for USDT
Use bpftrace to look for tracepoints in the application:
    $ sudo bpftrace -l "usdt:./build/app/testpmd"
    usdt:./build/app/testpmd:testpmd:rx_burst
Running bpftrace
Build a histogram of the number of packets per loop:
    $ sudo bpftrace -e 'usdt:./build/app/testpmd:rx_burst { @ = hist(arg0); }'
    Attaching 1 probe...
    ^C
    @:
    [0]       16001930 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
    [1]              0 |                                                    |
    [2, 4)           0 |                                                    |
    [4, 8)     5333977 |@@@@@@@@@@@@@@@@@                                   |
    [8, 16)          0 |                                                    |
    [16, 32)         0 |                                                    |
    [32, 64)         0 |                                                    |
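The rows in the histogram are power-of-two buckets. The following sketch shows one way to compute the same bucket boundaries in C. It is an approximation written for illustration, not bpftrace's implementation, and `hist_bucket`, `hist_bucket_low` and `hist_add` are hypothetical names.

```c
#include <stdint.h>

/* Buckets: [0], [1], [2, 4), [4, 8), ... up to the top bit of a uint64_t. */
#define HIST_BUCKETS 65

struct hist {
    uint64_t count[HIST_BUCKETS];
};

/* Map a value to its bucket index: 0 -> 0, 1 -> 1, 2..3 -> 2, 4..7 -> 3, ... */
static unsigned hist_bucket(uint64_t v)
{
    unsigned b = 0;

    while (v) {   /* count the bits needed to represent v */
        b++;
        v >>= 1;
    }
    return b;
}

/* Inclusive lower bound of a bucket, e.g. bucket 3 starts at 4. */
static uint64_t hist_bucket_low(unsigned b)
{
    return b == 0 ? 0 : (uint64_t)1 << (b - 1);
}

/* Record one sample, e.g. the nb_rx value passed to the probe. */
static void hist_add(struct hist *h, uint64_t v)
{
    h->count[hist_bucket(v)]++;
}
```

With this mapping an empty poll lands in the [0] row and a burst of 4 to 7 packets lands in the [4, 8) row, which are the two rows populated in the output above.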
Performance
Caveats
• Limited hardware - x86 with 25G NIC
• One-off test
• Untuned
• Limited scope
• Testpmd - 64 byte packets
• Immediate drop
• Current DPDK 19.11
• Single queue active
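For scale, the theoretical ceiling of this setup can be worked out from the link speed and frame size. The sketch below assumes that "64 byte packets" means 64-byte Ethernet frames including the FCS, and adds the standard 20 bytes of per-frame wire overhead. The function names are hypothetical, and the result is an upper bound from arithmetic, not a measurement from the talk.

```c
#include <stdint.h>

/* Per-frame overhead on the wire: 7 preamble + 1 SFD + 12 inter-frame gap. */
#define ETH_OVERHEAD_BYTES 20u

/* Maximum frames per second for a given link speed and frame size. */
static double max_pps(double link_bps, unsigned frame_bytes)
{
    return link_bps / (8.0 * (frame_bytes + ETH_OVERHEAD_BYTES));
}

/* Time budget per packet, in nanoseconds, at that rate. */
static double ns_per_packet(double link_bps, unsigned frame_bytes)
{
    return 1e9 / max_pps(link_bps, frame_bytes);
}
```

Calling `max_pps(25e9, 64)` gives roughly 37.2 million packets per second, or just under 27 ns per packet, which is why even a small per-packet probe cost matters at this rate.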