Estimating Cloud Application Performance Based on Micro-Benchmark Profiling
Joel Scheuner, Philipp Leitner
Contact: scheuner@chalmers.se · GitHub: joe4dev · Twitter: @joe4dev
Context: Public Infrastructure-as-a-Service Clouds
The cloud service models split the stack (Applications, Data, Runtime, Middleware, OS, Virtualization, Servers, Storage, Networking) differently between user and provider:
- Infrastructure-as-a-Service (IaaS): the user manages Applications, Data, Runtime, Middleware, and OS; the provider manages Virtualization, Servers, Storage, and Networking.
- Platform-as-a-Service (PaaS): the user manages Applications and Data; the provider manages everything below.
- Software-as-a-Service (SaaS): the provider manages the entire stack.
IEEE CLOUD'18, 2018-07-02
Motivation: Capacity Planning in IaaS Clouds
Which cloud provider should I choose? (e.g., compared via https://www.cloudorado.com)
Motivation: Capacity Planning in IaaS Clouds
Which cloud service (i.e., instance type) should I choose?
[Chart: number of EC2 instance types per year, growing from a handful in 2006 to over 100 in 2017]
- Smallest example: t2.nano (0.05–1 vCPU, 0.5 GB RAM, $0.006/h)
- Largest example: x1e.32xlarge (128 vCPUs, 3904 GB RAM, $26.688/h)
→ Impractical to test all instance types
Topic: Performance Benchmarking in the Cloud
“The instance type itself is a very major tunable parameter”
— Brendan Gregg (@brendangregg), re:Invent’17, https://youtu.be/89fYOo1V2pA?t=5m4s
Background
- Micro benchmarks target individual resources (Memory, CPU, I/O, Network): generic, artificial workloads with resource-specific usage.
- Application benchmarks measure overall performance (e.g., response time): domain-specific, real-world workloads with resource-heterogeneous usage.
Problem: Isolation and Reproducibility of Execution
Micro benchmarks (Memory, CPU, I/O, Network) run isolated, resource-specific, artificial workloads; application benchmarks measure overall performance (e.g., response time) of real-world, resource-heterogeneous workloads, which makes isolated and reproducible execution hard.
Question
How relevant are resource-specific micro benchmark results (Memory, CPU, I/O, Network) for the overall performance (e.g., response time) of real-world, resource-heterogeneous applications?
Research Questions
- PRE – Performance Variability: Does the performance of equally configured cloud instances vary relevantly?
- RQ1 – Estimation Accuracy: How accurately can a set of micro benchmarks estimate application performance?
- RQ2 – Micro Benchmark Selection: Which subset of micro benchmarks estimates application performance most accurately?
Idea
Run micro benchmarks (Memory, CPU, I/O, Network) and application benchmarks (overall performance, e.g., response time) on N VMs, then evaluate a prediction model that estimates an instance type's application performance, and thus its performance/cost trade-off, from its micro benchmark results.
Methodology
Benchmark Design
Micro Benchmarks
Broad resource coverage and specific resource testing

CPU:
- sysbench/cpu-single-thread
- sysbench/cpu-multi-thread
- stressng/cpu-callfunc
- stressng/cpu-double
- stressng/cpu-euler
- stressng/cpu-ftt
- stressng/cpu-fibonacci
- stressng/cpu-int64
- stressng/cpu-loop
- stressng/cpu-matrixprod

Memory:
- sysbench/memory-4k-block-size
- sysbench/memory-1m-block-size

I/O:
- [file I/O] sysbench/fileio-1m-seq-write
- [file I/O] sysbench/fileio-4k-rand-read
- [disk I/O] fio/4k-seq-write
- [disk I/O] fio/8k-rand-read

Network:
- iperf/single-thread-bandwidth
- iperf/multi-thread-bandwidth
- stressng/network-epoll
- stressng/network-icmp
- stressng/network-sockfd
- stressng/network-udp

Software (OS):
- sysbench/mutex
- sysbench/thread-lock-1
- sysbench/thread-lock-128
Application Benchmarks
Overall performance (e.g., response time)
- Molecular Dynamics Simulation (MDSim)
- WordPress Benchmark (WPBench): multiple short blogging session scenarios (read, search, comment)
[Chart: WPBench load profile — number of concurrent threads (0–100) over ~8 minutes of elapsed time]
Methodology
Benchmark Design → Benchmark Execution
Benchmark design published in: A Cloud Benchmark Suite Combining Micro and Application Benchmarks. QUDOS@ICPE’18, Scheuner and Leitner
Execution Methodology
Randomized Multiple Interleaved Trials (RMIT) [1]: each trial runs all benchmarks in a freshly randomized order, e.g., B A C | C B A | A C B
30 benchmark scenarios × 3 trials ≈ 2–3h runtime
[1] A. Abedi and T. Brecht. Conducting repeatable experiments in highly variable cloud computing environments. ICPE’17
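As a minimal sketch (not the authors' tooling), an RMIT schedule can be generated by shuffling the full benchmark list independently for each trial, so ordering effects average out across trials; the function name rmit_schedule is illustrative:

```python
import random

def rmit_schedule(benchmarks, trials, seed=None):
    """Randomized Multiple Interleaved Trials (RMIT): every trial
    executes all benchmarks exactly once, each trial in an
    independently shuffled order."""
    rng = random.Random(seed)
    schedule = []
    for _ in range(trials):
        trial = list(benchmarks)
        rng.shuffle(trial)  # fresh random order per trial
        schedule.append(trial)
    return schedule

# e.g., 3 benchmarks x 3 trials, as in the B A C | C B A | A C B example
for trial in rmit_schedule(["A", "B", "C"], trials=3, seed=42):
    print(" ".join(trial))
```

With 30 benchmark scenarios and 3 trials, the same function would produce a 90-step execution plan for one VM.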
Benchmark Manager: Cloud WorkBench (CWB)
Tool for scheduling cloud experiments — GitHub: sealuzh/cloud-workbench
- Cloud WorkBench – Infrastructure-as-Code Based Cloud Benchmarking. CloudCom’14, Scheuner, Leitner, Cito, and Gall
- Cloud WorkBench: Benchmarking IaaS Providers based on Infrastructure-as-Code. Demo@WWW’15, Scheuner, Cito, Leitner, and Gall
Methodology
Benchmark Design → Benchmark Execution → Data Pre-Processing → Data Analysis
[Boxplot: Relative Standard Deviation (RSD) [%] per configuration (instance type and region); detailed in the PRE results]
Publications:
- A Cloud Benchmark Suite Combining Micro and Application Benchmarks. QUDOS@ICPE’18, Scheuner and Leitner
- Estimating Cloud Application Performance Based on Micro Benchmark Profiling. CLOUD’18, Scheuner and Leitner
Performance Data Set *

Instance Type | vCPU | ECU  | RAM [GiB] | Virtualization | Network Performance | Region
m1.small      |  1   |  1   |  1.7      | PV             | Low                 | eu + us
m1.medium     |  1   |  2   |  3.75     | PV             | Moderate            | eu
m3.medium     |  1   |  3   |  3.75     | PV / HVM       | Moderate            | eu + us
m1.large      |  2   |  4   |  7.5      | PV             | Moderate            | eu
m3.large      |  2   |  6.5 |  7.5      | HVM            | Moderate            | eu
m4.large      |  2   |  6.5 |  8.0      | HVM            | Moderate            | eu
c3.large      |  2   |  7   |  3.75     | HVM            | Moderate            | eu
c4.large      |  2   |  8   |  3.75     | HVM            | Moderate            | eu
c3.xlarge     |  4   | 14   |  7.5      | HVM            | Moderate            | eu
c4.xlarge     |  4   | 16   |  7.5      | HVM            | High                | eu
c1.xlarge     |  8   | 20   |  7        | PV             | High                | eu

PRE uses the eu + us configurations (m1.small, m3.medium) plus m3.large (eu); RQ1 and RQ2 use all instance types.
* ECU := Elastic Compute Unit (i.e., Amazon’s metric for CPU performance)
>240 Virtual Machines (VMs) × 3 iterations ≈ 750 VM hours; >60,000 measurements (258 per instance)
PRE – Performance Variability Results
Does the performance of equally configured cloud instances vary relevantly?
[Boxplot: RSD [%] across benchmarks per configuration — m1.small (eu), m1.small (us), m3.medium (eu), m3.medium (us), m3.large (eu)]
- Mean RSD per configuration is low: 4.41%, 4.3%, 4.14%, 3.32%, and 3.16% (plus 2 outliers at 54% and 56%)
- Most variable: thread latency and random file I/O; least variable: network and sequential file I/O
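The RSD reported here is the coefficient of variation in percent (sample standard deviation divided by the mean). A minimal sketch of computing it for repeated measurements of one benchmark across equally configured VMs; the sample values are made up for illustration:

```python
import statistics

def relative_standard_deviation(samples):
    """Relative Standard Deviation (RSD, a.k.a. coefficient of
    variation) in percent: sample stdev / mean * 100."""
    mean = statistics.mean(samples)
    return 100.0 * statistics.stdev(samples) / mean

# e.g., the same micro benchmark run on five equally configured VMs
measurements = [102.0, 98.5, 101.2, 99.7, 100.6]
print(f"RSD = {relative_standard_deviation(measurements):.2f}%")
```

An RSD around 3–4%, as observed for most configurations in the study, means repeated measurements typically scatter only a few percent around their mean.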
RQ1 – Estimation Accuracy Approach
How accurately can a set of micro benchmarks estimate application performance?
- Input per instance type (Instance Type 1 (m1.small) … Instance Type 12 (c1.xlarge)): micro benchmark results micro 1, micro 2, …, micro N and application results app 1, app 2
- Fit a linear regression model, e.g., app 1 ~ micro 1
- Forward feature selection picks micro benchmarks to optimize the relative error
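A hedged sketch of this approach: linear regression combined with greedy forward feature selection that minimizes mean relative error. This is a generic reimplementation, not the authors' code, and for brevity it scores candidates on in-sample error, whereas the study evaluates on held-out instance types:

```python
import numpy as np

def fit_predict(X_train, y_train, X_eval):
    """Ordinary least squares with an intercept term."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_eval)), X_eval]) @ coef

def mean_relative_error(y_true, y_pred):
    return float(np.mean(np.abs(y_pred - y_true) / np.abs(y_true)))

def forward_select(X, y, max_features=3):
    """Greedily add the micro benchmark (column of X) that most
    reduces the mean relative error of a linear model for the
    application metric y; stop when no candidate improves it."""
    selected, remaining = [], list(range(X.shape[1]))
    best_err = float("inf")
    while remaining and len(selected) < max_features:
        scores = {}
        for f in remaining:
            cols = selected + [f]
            scores[f] = mean_relative_error(
                y, fit_predict(X[:, cols], y, X[:, cols]))
        f, err = min(scores.items(), key=lambda kv: kv[1])
        if err >= best_err:
            break  # no candidate improves the fit
        selected.append(f)
        remaining.remove(f)
        best_err = err
    return selected, best_err
```

In the study's setting, X would hold one row per instance type with the micro benchmark results as columns, and y the application benchmark metric (e.g., WPBench response time).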
RQ1 – Estimation Accuracy Results
How accurately can a set of micro benchmarks estimate application performance?
[Scatter plot: Sysbench CPU multi-thread duration [s] vs. WPBench read response time [ms]; train/test points across all 12 instance types (m1.small, m1.medium, m3.medium (pv), m3.medium (hvm), m1.large, m3.large, m4.large, c3.large, c4.large, c3.xlarge, c4.xlarge, c1.xlarge)]
The Sysbench CPU multi-thread benchmark estimates WPBench read response time with a relative error (RE) of 12.5% and R² = 99.2%.
RQ2 – Micro Benchmark Selection Results
Which subset of micro benchmarks estimates application performance most accurately?

Predictor                    | Relative Error [%]
Sysbench – CPU Multi Thread  |  12
Sysbench – CPU Single Thread | 454
Baseline: vCPUs              | 616
Baseline: ECU *              | 359
Baseline: Cost               | 663
* ECU: Amazon’s metric for CPU performance
RQ – Implications
- The selected micro benchmarks are suitable for estimating application performance
- Benchmarks cannot be used interchangeably → configuration is important
- The baseline metrics vCPU and ECU are insufficient