Revisiting Benchmarking Methodology for Interconnect Devices


1. Revisiting Benchmarking Methodology for Interconnect Devices
Chair of Network Architectures and Services, Department of Informatics, Technical University of Munich
Daniel Raumer, Sebastian Gallenmüller, Florian Wohlfart, Paul Emmerich, Patrick Werneck, and Georg Carle
July 16, 2016

2. Contents
• Case study: benchmarking software routers
• Flaws of benchmarks
• Latency metrics
• Latency under load
• Traffic pattern
• Omitted tests
• Reproducibility
• Conclusion

3. Why revisit the benchmarking state of the art?
• Numerous standards, recommendations, and best practices
• Well-known benchmarking definition: RFC 2544
• Various extensions
• Divergence of benchmarks
• New class of devices:
  • High-speed network I/O frameworks
  • Virtual switching
  • Many-core CPU architectures (CPU ↔ NIC)

4. Common metrics
• Throughput: highest rate that the device under test (DuT) can serve without loss (see the search sketch below).
• Back-to-back frame burst size: longest burst (in frames) forwarded without loss.
• Frame loss rate: percentage of dropped frames under a given load.
• Latency: average time a packet spends within the DuT.
• ... extended metrics, e.g., FIB-dependent performance
• ... additional SHOULDs, rarely measured
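The RFC 2544 throughput metric is typically determined by a binary search over offered rates. Below is a minimal sketch of such a search, assuming a hypothetical send_at_rate(rate, duration) helper that offers traffic at the given rate and returns the observed loss fraction; it is an illustration, not the authors' test suite.

```python
# Minimal sketch of an RFC 2544-style throughput search (illustration only).
# `send_at_rate` is a hypothetical helper: it offers traffic at `rate` Mpps for
# `duration` seconds and returns the fraction of frames lost.

def throughput_binary_search(send_at_rate, line_rate_mpps, duration=60,
                             resolution=0.01):
    """Return the highest offered rate (Mpps) observed without frame loss."""
    low, high = 0.0, line_rate_mpps
    best = 0.0
    while high - low > resolution:
        rate = (low + high) / 2
        loss = send_at_rate(rate, duration)
        if loss == 0.0:
            best = rate      # no loss: try a higher rate
            low = rate
        else:
            high = rate      # loss observed: back off
    return best
```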

5. Case study: RFC 2544 benchmarks
[Setup diagram: RFC 2544 test suite connected bidirectionally to the DuT]
Three different DuTs:
• Linux router
• FreeBSD router
• MikroTik router

6. Flaws of benchmarks: selected examples

7. Meaningful latency measurements: case study
[Histogram: probability [%] over latency [µs], 0–30 µs]
• FreeBSD, 64-byte packets
• The average does not reflect the long-tail distribution (see the sketch below)
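To see why an average hides a long tail, consider a synthetic example; the numbers below are made up for illustration and are not measurements from the talk.

```python
# Illustrative only: a synthetic long-tailed latency sample showing why the
# average is a poor summary. Values are invented, not taken from the case study.
import numpy as np

rng = np.random.default_rng(0)
# 99% of packets around 3 µs, 1% delayed to ~25 µs (e.g., batching effects)
fast = rng.normal(loc=3.0, scale=0.3, size=99_000)
slow = rng.normal(loc=25.0, scale=2.0, size=1_000)
latency_us = np.concatenate([fast, slow])

print(f"mean   = {latency_us.mean():.2f} µs")          # pulled up by the tail
print(f"median = {np.percentile(latency_us, 50):.2f} µs")
print(f"99.9th = {np.percentile(latency_us, 99.9):.2f} µs")
```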

8. Meaningful latency measurements: 2nd example
[Histogram: probability [%] over latency [µs], 1–4 µs]
• Pica8 switch tested in [IFIP NETWORKING 16]
• Different processing paths through a device
• Bimodal distribution
• Average latency is misleading
→ Extensive reports: histograms for visualization
→ Short reports: percentiles (25th, 50th, 75th, 95th, 99th, and 99.9th; see the sketch below)
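A short report of the recommended percentiles can be computed directly from a latency trace; the following is a minimal NumPy sketch, not part of the presented tooling.

```python
# Minimal sketch: the percentile summary recommended above, computed with NumPy.
import numpy as np

RECOMMENDED_PERCENTILES = (25, 50, 75, 95, 99, 99.9)

def latency_report(latency_us: np.ndarray) -> dict:
    """Return the recommended percentiles of a latency trace (values in µs)."""
    return {p: float(np.percentile(latency_us, p)) for p in RECOMMENDED_PERCENTILES}

def latency_histogram(latency_us: np.ndarray, bins: int = 100):
    """For extensive reports: probability [%] per bin plus the bin edges."""
    counts, edges = np.histogram(latency_us, bins=bins)
    return counts / counts.sum() * 100, edges
```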

9. Latency under load
[Plot: latency [µs] over offered load [Mpps]; CBR median and 25th/75th percentiles]
• Open vSwitch (Linux NAPI & ixgbe) [IMC15]
• Latency at maximum throughput is not the worst case
→ Measure at different loads (10, 20, ..., 100% of maximum throughput; see the sketch below)
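The recommended load sweep can be scripted as a simple loop; measure_latency(rate, duration) is a hypothetical helper standing in for the actual packet generator and latency measurement.

```python
# Minimal sketch of the recommended load sweep. `measure_latency(rate, duration)`
# is a hypothetical helper that offers traffic at `rate` Mpps and returns a
# NumPy array of per-packet latencies in µs.
import numpy as np

def latency_under_load(measure_latency, max_throughput_mpps, duration=30):
    """Measure latency at 10%, 20%, ..., 100% of the maximum throughput."""
    results = {}
    for load in range(10, 101, 10):
        rate = max_throughput_mpps * load / 100
        samples = measure_latency(rate, duration)
        results[load] = {
            "median": float(np.percentile(samples, 50)),
            "p25": float(np.percentile(samples, 25)),
            "p75": float(np.percentile(samples, 75)),
        }
    return results
```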

10. Traffic pattern & latency
[Plot: latency [µs] over offered load [Mpps]; CBR and Poisson, median and 25th/75th percentiles]
• Open vSwitch (NAPI + ixgbe) [IMC15]
• Different behavior for different traffic patterns
→ Test with different traffic patterns
→ Use a Poisson process to approximate real-world traffic (see the sketch below)
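For a Poisson traffic pattern, inter-packet gaps are drawn from an exponential distribution with mean 1/rate. The sketch below only computes send timestamps; how precisely a generator can enforce them is a separate concern.

```python
# Minimal sketch of Poisson traffic timing: for a Poisson arrival process, the
# inter-packet gaps are exponentially distributed with mean 1/rate.
import numpy as np

def poisson_send_times(rate_mpps: float, n_packets: int, seed: int = 0) -> np.ndarray:
    """Send timestamps in µs for a Poisson process with the given rate (Mpps)."""
    rng = np.random.default_rng(seed)
    gaps_us = rng.exponential(scale=1.0 / rate_mpps, size=n_packets)
    return np.cumsum(gaps_us)

def cbr_send_times(rate_mpps: float, n_packets: int) -> np.ndarray:
    """Send timestamps in µs for constant bit rate (CBR) traffic, for comparison."""
    return np.arange(1, n_packets + 1) / rate_mpps
```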

11. Omitted tests
[Plot: throughput [Mpps] and L1/L2/L3 cache misses per packet over the number of IP addresses (log scale, 10^0 to 10^6)]
• CPU caches affect the performance
→ Additional tests for certain device classes
→ Functionality-dependent tests (e.g., scaling the number of destination IP addresses as in the plot; see the sketch below)
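A functionality-dependent test like the IP-address sweep above can be scripted as a loop over address-set sizes; measure_throughput(dst_addresses) is a hypothetical helper for the underlying traffic generator.

```python
# Minimal sketch of a cache/FIB-stress test: sweep the number of distinct
# destination IP addresses and measure throughput for each setting.
# `measure_throughput(dst_addresses)` is a hypothetical helper that generates
# traffic round-robin over the given destinations and returns the rate in Mpps.
import ipaddress

def ip_scaling_sweep(measure_throughput, base_net="10.0.0.0/8", max_exponent=6):
    """Throughput with 10^0, 10^1, ..., 10^max_exponent destination addresses."""
    hosts = ipaddress.ip_network(base_net).hosts()
    addresses = []
    results = {}
    for exponent in range(max_exponent + 1):
        target = 10 ** exponent
        while len(addresses) < target:
            addresses.append(str(next(hosts)))
        results[target] = measure_throughput(addresses)
    return results
```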

12. Reproducibility of configurations
• Manual device configuration is error-prone
• Device configuration is hard to reproduce
→ Configure the DuT reproducibly via scripts
→ Have the benchmarking tool execute the configuration scripts (see the sketch below)
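One way a benchmarking tool can apply DuT configuration scripts is over SSH; the host name and paths below are placeholders, and the snippet is only a sketch of the idea, not the authors' implementation.

```python
# Minimal sketch of script-driven DuT configuration, run by the benchmark tool
# itself so every test starts from a documented, reproducible state.
# Host name and script paths are placeholders, not taken from the talk.
import subprocess

def configure_dut(host: str, script_path: str) -> None:
    """Copy a configuration script to the DuT and execute it over SSH."""
    subprocess.run(["scp", script_path, f"{host}:/tmp/dut-setup.sh"], check=True)
    subprocess.run(["ssh", host, "sh", "/tmp/dut-setup.sh"], check=True)

# Example: apply a Linux-router setup before starting an RFC 2544 run.
# configure_dut("dut.example.org", "configs/linux-router.sh")
```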

13. Conclusion
• A novel class of devices requires additional tests
• There are arguments for reconsidering best practice:
  • Average latency may be misleading → histograms / percentiles
  • Latency is load-dependent → measure at 10, 20, ..., 100% of maximum throughput
  • CBR traffic is an unrealistic test pattern → Poisson process
  • Device-specific functionality → perform device-specific benchmarks
  • Manual configuration is error-prone → automatic configuration by the benchmark tool

14. Novelty: RFC 2544 test suite on commodity hardware
• MoonGen [IMC15] is a fast software packet generator
  • Hardware-assisted latency measurements (misusing the NICs' PTP support)
  • Precise software rate control and traffic patterns
• http://net.in.tum.de/pub/router-benchmarking/
  • RFC 2544 benchmark reports for Linux, FreeBSD, and MikroTik
  • Early version of the MoonGen RFC 2544 module
