
TailBench: A Benchmark Suite and Evaluation Methodology for Latency-Critical Applications
Harshad Kasture, Daniel Sanchez
IISWC 2016
tailbench.csail.mit.edu

Executive Summary
- Latency-critical applications have stringent performance requirements, which leads to low datacenter utilization
  - Wastes billions of dollars in energy and equipment annually
- Research in this area hampered by the lack of a comprehensive benchmark suite
  - Few latency-critical applications, hence limited coverage
  - Complicated setup and configuration
  - Methodological issues, hence inaccurate latency measurements
- TailBench makes latency-critical applications easy to analyze
  - Varied application domains and latency characteristics
  - Standardized, statistically sound methodology
  - Supports simplified load-testing configurations

Outline
- Background and Motivation
- TailBench Applications
- TailBench Harness
- Simplified Configurations

Understanding Latency-Critical Applications
[Figure: clients, a root node, leaf nodes, and back ends within a datacenter; two links are labeled 1 ms]
- The few slowest responses determine user-perceived latency
- Tail latency (e.g., 95th/99th percentile), not mean latency, determines performance
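A short worked example of why the slowest responses dominate: if a request fans out to several leaf nodes and must wait for all of them, the chance that at least one leaf answers slower than its own 99th-percentile latency grows quickly with the fan-out. This is a sketch that assumes leaf latencies are independent; the function name and the fan-out values are illustrative and do not come from the slides.

```python
def prob_any_leaf_in_tail(fanout: int, percentile: float = 0.99) -> float:
    """Probability that at least one of `fanout` independent leaf responses
    is slower than the leaf's `percentile` latency (assumed model)."""
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    if not 0.0 < percentile < 1.0:
        raise ValueError("percentile must be strictly between 0 and 1")
    # All leaves are fast with probability percentile ** fanout.
    return 1.0 - percentile ** fanout


if __name__ == "__main__":
    for n in (1, 10, 100):  # illustrative fan-outs
        print(f"fanout={n:3d}  P(request sees a tail response)="
              f"{prob_any_leaf_in_tail(n):.3f}")
```

Under this assumption, a rare per-leaf event becomes the common case for the whole request once the fan-out is large.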

Latency Requirements Cause Low Utilization
- End-to-end latency increases rapidly with load
- Must keep utilization low to keep latency within reasonable bounds
- Traditional resource management techniques (e.g., colocation) often cannot be used since they degrade latency
- Low resource utilization wastes billions of dollars in energy and equipment
- Sparked research in latency-critical systems
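To make the load/latency relationship concrete, the sketch below uses the textbook M/M/1 queue (Poisson arrivals, exponential service times, one server, FCFS). This model is an assumption chosen for illustration and is not the model used in the slides; real applications will deviate from it. The function name and the 1 ms service time are hypothetical.

```python
import math


def mm1_latency(service_time_ms: float, utilization: float,
                percentile: float = 0.95) -> tuple[float, float]:
    """Return (mean, percentile) sojourn time in ms for an M/M/1 FCFS queue."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    if not 0.0 < percentile < 1.0:
        raise ValueError("percentile must be strictly between 0 and 1")
    # Mean time in system: S / (1 - rho).
    mean = service_time_ms / (1.0 - utilization)
    # In M/M/1 FCFS the time in system is exponentially distributed,
    # so any percentile is a fixed multiple of the mean.
    tail = -math.log(1.0 - percentile) * mean
    return mean, tail


if __name__ == "__main__":
    for rho in (0.1, 0.5, 0.8, 0.9, 0.95):
        mean, p95 = mm1_latency(1.0, rho)  # hypothetical 1 ms service time
        print(f"utilization={rho:.2f}  mean={mean:6.2f} ms  p95={p95:6.2f} ms")
```

In this model, going from 50% to 90% utilization multiplies latency by five, which is why a latency bound translates into a utilization cap.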

Benchmark Suite Design Goals
- Applications from a diverse set of domains
- Applications with diverse tail latency characteristics
  [Figure: latency scale from 100 μs to 1 s, annotated with live VM migration, LLC warmup, and DVFS]
- Easy to set up and run
- Support different measurement scenarios
- Robust latency measurement methodology

TailBench Applications
- xapian: Online Search
- masstree: Key-Value Store
- moses: Statistical Machine Translation
- sphinx: Speech Recognition
- shore: On-disk Database
- silo: In-memory Database
- specjbb: Java Middleware
- img-dnn: Image Recognition

Wide Range of End-to-End Latencies
[Figure: applications placed along a latency scale from 100 μs to 1 s: silo, specjbb, masstree, shore, xapian, img-dnn, moses, sphinx]

Varied Service Time Characteristics
- masstree service times are more tightly distributed
- xapian service times are more loosely distributed

End-to-End Latency vs. Load

Tail ≠ Mean
- Tail latency increases more rapidly with load than mean latency
- Relationship between mean and tail latencies is hard to predict
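A small illustration of why the mean cannot stand in for the tail: the two synthetic latency samples below have the same mean but very different 99th percentiles. The data and function names are made up for this example, and the nearest-rank percentile definition is one common choice rather than necessarily the one the harness uses.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def mean(samples):
    return sum(samples) / len(samples)


# Synthetic latencies in ms (illustrative only).
steady = [10.0] * 100                 # every request takes 10 ms
bursty = [5.0] * 95 + [105.0] * 5     # mostly fast, a few very slow

if __name__ == "__main__":
    for name, data in (("steady", steady), ("bursty", bursty)):
        print(f"{name}: mean={mean(data):.1f} ms  p99={percentile(data, 99):.1f} ms")
```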

Impact of Parallelism

Parallelism Helps Some Applications

…But Hurts Others

TailBench Harness
- Measuring tail latency accurately is complicated
  - Load generation, statistics aggregation, warmup periods…
- Harness encapsulates most of the complexity
- Harness makes TailBench easily extensible
  - New benchmarks reuse existing harness functionality
- Simplified harness configurations enable different measurement scenarios
  - Trade off some accuracy for reduced setup complexity

Example: Open- vs. Closed-Loop Clients
[Figure: clients, network, application]
- Many popular load testers use closed-loop clients
  - Clients wait for response before submitting next request
  - Increase in application load throttles client request rate
- Latency-critical applications typically service a large number of independent clients
  - Request rate independent of application load
  - Better modeled by open-loop clients
- Closed-loop clients can underestimate latency by orders of magnitude [Tene LLS 2013, Zhang ISCA 2016]
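The difference between the two client models can be sketched with a tiny deterministic simulation of a single-threaded FIFO server. Everything here is an illustrative assumption (one closed-loop client, fixed inter-arrival time, made-up service times with one 10-unit stall); it is not the TailBench harness.

```python
def open_loop_latencies(service_times, interarrival):
    """Requests arrive on a fixed schedule regardless of server state,
    so they queue up behind a slow request."""
    latencies = []
    server_free_at = 0.0
    for i, service in enumerate(service_times):
        arrival = i * interarrival
        start = max(arrival, server_free_at)  # wait if the server is busy
        server_free_at = start + service
        latencies.append(server_free_at - arrival)
    return latencies


def closed_loop_latencies(service_times, think_time):
    """A single client issues the next request only after the previous
    response (plus think time), so it never observes queuing."""
    latencies, issue_times = [], []
    now = 0.0
    for service in service_times:
        issue_times.append(now)
        latencies.append(service)  # server is always idle on arrival
        now += service + think_time
    return latencies, issue_times


if __name__ == "__main__":
    service = [1, 1, 10, 1, 1, 1, 1, 1]  # one stalled request (illustrative)
    open_lat = open_loop_latencies(service, interarrival=2.0)
    closed_lat, issued = closed_loop_latencies(service, think_time=1.0)
    print("open-loop latencies:  ", open_lat)
    print("closed-loop latencies:", closed_lat)
    print("closed-loop issue times:", issued)
```

In the open-loop run the stall delays every request that arrives behind it, while the closed-loop client simply stops sending during the stall and records a single slow sample.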

Networked Harness Configuration
[Figure: clients, each with a Traffic Shaper and a Stats Collector, connect over TCP/IP and the network to the application, which is fronted by a Request Queue]
- Application and the clients run on separate machines
- Traffic Shaper inserts inter-request delays to model load
- Request Queue enqueues incoming requests and measures service times and queuing delays
- Statistics Collector aggregates latency data
- Faithfully captures all sources of overhead
X Difficult to configure and deploy
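The three harness roles described above can be sketched in a few lines. This is a simplified stand-in, not the TailBench code: it assumes exponentially distributed inter-request delays, a single worker serving requests in FIFO order, and no network delay, and all names (`RequestTiming`, `interarrival_delays`, `run_fifo`, `summarize`) are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class RequestTiming:
    sent: float      # client generated the request
    started: float   # application dequeued it and began service
    finished: float  # application produced the response

    @property
    def queuing_delay(self) -> float:
        return self.started - self.sent

    @property
    def service_time(self) -> float:
        return self.finished - self.started

    @property
    def sojourn_time(self) -> float:
        return self.finished - self.sent


def interarrival_delays(qps, count, seed=0):
    """Traffic-shaper stand-in: exponential gaps with mean 1/qps seconds
    (the distribution is an assumption of this sketch)."""
    rng = random.Random(seed)
    return [rng.expovariate(qps) for _ in range(count)]


def run_fifo(delays, service_times):
    """Request-queue stand-in: one worker, FIFO order, timestamps recorded."""
    timings, sent, free_at = [], 0.0, 0.0
    for gap, service in zip(delays, service_times):
        sent += gap
        started = max(sent, free_at)
        free_at = started + service
        timings.append(RequestTiming(sent, started, free_at))
    return timings


def summarize(timings):
    """Stats-collector stand-in: aggregate the per-request measurements."""
    n = len(timings)
    return {
        "mean_queuing": sum(t.queuing_delay for t in timings) / n,
        "mean_service": sum(t.service_time for t in timings) / n,
        "max_sojourn": max(t.sojourn_time for t in timings),
    }


if __name__ == "__main__":
    gaps = interarrival_delays(qps=800, count=10_000)
    stats = summarize(run_fifo(gaps, [0.001] * len(gaps)))  # 1 ms service time
    print(stats)
```

Separating queuing delay from service time, as the Request Queue does, shows whether latency comes from the work itself or from waiting behind other requests.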

Loopback Harness Configuration
[Figure: application and clients communicate over TCP/IP through the loopback interface]
- Application and clients reside on the same machine
- Reduced setup complexity
- Highly accurate in many cases
X Difficult to simulate

Load-Latency for Networked Configuration

Loopback Configuration Highly Accurate
- Loopback and Networked configurations have near-identical performance
- Networking delays minimal in our setup

Loopback Harness Configuration
[Figure: application and clients communicate over TCP/IP through the loopback interface]
- Application and clients reside on the same machine
- Reduced setup complexity
- Highly accurate in many cases
X Still difficult to simulate

Integrated Harness Configuration
[Figure: application and client within a single process]
- Application and client integrated into a single process
- Easy to set up
X Some loss of accuracy

Integrated Configuration Validation
[Figure annotations: 39%, 23%]
- Networked/Loopback configurations saturate earlier for applications with short requests (silo, specjbb)
- TCP/IP processing overhead a significant fraction of request
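A back-of-the-envelope sketch of why a fixed per-request TCP/IP cost matters most for short requests. It assumes the overhead is paid on the same thread that services the request, so the two simply add; the 20 μs overhead and the service times are hypothetical numbers, not measurements from the slides.

```python
def overhead_fraction(service_time_us: float, overhead_us: float) -> float:
    """Share of per-request time spent on fixed overhead."""
    return overhead_us / (service_time_us + overhead_us)


def saturation_qps(service_time_us: float, overhead_us: float = 0.0,
                   threads: int = 1) -> float:
    """Upper bound on throughput when each request occupies a thread
    for its service time plus the fixed overhead."""
    return threads * 1_000_000 / (service_time_us + overhead_us)


if __name__ == "__main__":
    overhead = 20.0  # hypothetical per-request TCP/IP cost in microseconds
    for name, service in (("short request", 30.0), ("long request", 10_000.0)):
        print(f"{name}: overhead share={overhead_fraction(service, overhead):.1%}, "
              f"saturation QPS {saturation_qps(service):.0f} -> "
              f"{saturation_qps(service, overhead):.0f}")
```

With these made-up numbers the short request loses 40% of its peak throughput to overhead, while the long request is essentially unaffected.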

Integrated Harness Configuration
[Figure: application and client within a single process]
- Application and client integrated into a single process
- Easy to set up
X Some loss of accuracy
- Enables user-level simulations

Simulation vs. Real System
[Figure annotations: 16%, 32%, 20%, 16%, 31%]
- Performance difference between real and simulated systems well within usual simulation error bounds
  - Average absolute error in saturation QPS: 14%
  - zsim IPC error for SPEC CPU2006 applications: 8.5% to 21%

Conclusions
- TailBench includes a diverse set of latency-critical applications with varied latency characteristics
- TailBench harness implements a statistically sound experimental methodology to achieve accurate results
- Various harness configurations allow trading off configuration complexity for some accuracy
- Our results show that the integrated configuration is highly accurate for six of our eight benchmarks

Thanks for your attention! Questions?
tailbench.csail.mit.edu
