Automatic Detection of Performance Deviations in Load Testing of Large Scale Systems
Haroon Malik
Software Analysis and Intelligence Lab (SAIL), Queen's University, Kingston, Canada
Large-scale systems need to satisfy performance constraints.
PERFORMANCE PROBLEMS
• System not responding fast enough
• Consuming too much of an important resource
• Hanging and/or crashing under heavy load
Symptoms include:
• High response time
• Increased latency
• Low throughput under load
LOAD TESTING
Performance analysts use load testing to detect performance problems early, before they become critical field problems.
LOAD TESTING STEPS
1. Environment Setup
2. Load Test Execution
3. Load Test Analysis
4. Report Generation
2. LOAD TEST EXECUTION
[Diagram: Load Generator 1 and Load Generator 2 drive the system under test; a monitoring tool records performance counters into a performance repository.]
CHALLENGES WITH LOAD TEST ANALYSIS
• Large number of counters
• Limited knowledge
I propose 4 methodologies (3 unsupervised, 1 supervised) to automatically analyze load test results.
Use Performance Counters to Construct a Performance Signature
Examples: %CPU Busy, %CPU Idle, Commits, Disk writes/sec, % Cache Faults/Sec, Bytes received
PERFORMANCE COUNTERS ARE HIGHLY CORRELATED
CPU, Disk (IOPS), Memory, Network, Transactions/sec
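This redundancy is what makes signature-based reduction viable. As a minimal sketch (not from the talk) of how the redundancy could be checked on a counter log, assuming a CSV file and column names that are purely hypothetical:

```python
import pandas as pd

# Load a performance counter log; the file name and column names are
# hypothetical examples, not taken from the original study.
counters = pd.read_csv("load_test_counters.csv")

# Pairwise Pearson correlation between counters such as %CPU Busy,
# Disk writes/sec, Bytes received, etc.
corr = counters.corr(numeric_only=True)

# Report counter pairs whose absolute correlation exceeds 0.8 --
# highly redundant counters are candidates for reduction.
strong = (
    corr.abs()
        .where(lambda m: m < 1.0)        # ignore self-correlation
        .stack()
        .loc[lambda s: s > 0.8]
        .sort_values(ascending=False)
)
print(strong)
```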
HIGH LEVEL OVERVIEW OF OUR METHODOLOGIES
Input Load Tests (baseline test, new test) → Data Preparation (sanitization, standardization) → Signature Generation → Deviation Detection → Performance Report
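As a rough illustration of the data-preparation stage, the sketch below sanitizes a counter table (drops incomplete observations and constant counters) and standardizes each counter so baseline and new tests are comparable. The exact cleaning rules used in the study are not stated on the slides, so the details here are assumptions.

```python
import pandas as pd

def prepare(counters: pd.DataFrame) -> pd.DataFrame:
    """Sanitize and standardize a table of numeric performance counter observations.

    Sketch only: the study's exact sanitization rules are not shown in the
    slides, so the missing-value and constant-counter handling is an assumption.
    """
    # Sanitization: drop observations with missing counter values and
    # counters that never change (they carry no signal).
    clean = counters.dropna()
    clean = clean.loc[:, clean.std() > 0]

    # Standardization: z-score each counter so counters on very different
    # scales (e.g. %CPU vs. bytes/sec) can be compared.
    return (clean - clean.mean()) / clean.std()
```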
UNSUPERVISED SIGNATURE GENERATION
• Random Sampling Methodology: Load Test → Random Sampling → Signature
• Clustering Methodology: Load Test → Data Reduction → Clustering (analyst tunes a weight parameter) → Extracting Centroids → Signature
• PCA Methodology: Load Test → Dimension Reduction (PCA) → Ranking → Identifying Top k Performance Counters → Mapping → Signature
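A minimal sketch of the PCA methodology's core idea: reduce the (standardized) counter matrix with PCA and rank counters by how strongly they load on the dominant components, keeping the top k as the signature. The weighting scheme and the value of k below are placeholders, not the study's exact choices.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

def pca_signature(counters: pd.DataFrame, k: int = 5) -> list[str]:
    """Return the names of the top-k counters ranked via PCA loadings (sketch)."""
    pca = PCA()
    pca.fit(counters.values)

    # Weight each component's loadings by the variance it explains, then
    # score every counter by its total (absolute) contribution.
    loadings = np.abs(pca.components_)                      # (n_components, n_counters)
    weights = pca.explained_variance_ratio_[:, np.newaxis]  # (n_components, 1)
    scores = (loadings * weights).sum(axis=0)               # one score per counter

    ranked = counters.columns[np.argsort(scores)[::-1]]
    return list(ranked[:k])
```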
SUPERVISED SIGNATURE GENERATION (WRAPPER Methodology)
Prepared Load Test → Labeling (only for baseline) → Partitioning the Data (SPC1, SPC2, … SPC10) → Attribute Selection (Genetic Search, OneR) → Identifying Top k Performance Counters (i. count, ii. % frequency) → Signature
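The slides name genetic search with OneR (as in WEKA) for attribute selection. The sketch below only approximates that wrapper idea using scikit-learn's greedy sequential selection around a one-level decision stump; it is a stand-in for the actual search strategy, not the study's implementation.

```python
import pandas as pd
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.tree import DecisionTreeClassifier

def wrapper_signature(baseline: pd.DataFrame, labels: pd.Series, k: int = 5) -> list[str]:
    """Select k counters by wrapping a simple classifier around counter subsets (sketch).

    `labels` marks each baseline observation as pass/fail; greedy forward
    selection stands in for the genetic search + OneR used in the study.
    """
    stump = DecisionTreeClassifier(max_depth=1)   # one-rule-style learner
    selector = SequentialFeatureSelector(
        stump, n_features_to_select=k, direction="forward", cv=5
    )
    selector.fit(baseline, labels)
    return list(baseline.columns[selector.get_support()])
```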
DEVIATION DETECTION TECHNIQUES
• Using a control chart: for the Clustering and Random Sampling methodologies
• Using methodology-specific techniques: for the PCA and WRAPPER methodologies
CONTROL CHART
The Upper/Lower Control Limits (UCL/LCL) are the upper/lower limits of the range of a counter under the normal behavior of the system.
[Chart: performance counter value over time (min) for the baseline load test and a new load test, with the baseline centre line (CL) and baseline LCL/UCL.]
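A minimal sketch of the control-chart check. The slides define UCL/LCL only as the bounds of normal behavior, so the concrete limit rule below (baseline mean plus/minus three standard deviations) and the 10% threshold are assumptions; new-test observations outside the baseline limits are counted as violations.

```python
import numpy as np

def control_chart_violations(baseline: np.ndarray, new_test: np.ndarray) -> float:
    """Return the fraction of new-test observations outside the baseline limits.

    Sketch only: limits here are mean +/- 3 standard deviations of the
    baseline counter, a common convention but still an assumption.
    """
    center = baseline.mean()
    spread = baseline.std()
    ucl, lcl = center + 3 * spread, center - 3 * spread

    violations = (new_test > ucl) | (new_test < lcl)
    return violations.mean()          # violation ratio for this counter

# Example: flag the counter if more than 10% of new-test observations violate the limits.
baseline = np.array([8.1, 7.9, 8.4, 8.0, 8.2, 7.8])
new_test = np.array([8.0, 9.7, 12.3, 11.9, 12.8, 8.1])
print(control_chart_violations(baseline, new_test) > 0.10)   # prints True
```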
DEVIATION DETECTION
• Clustering and Random Sampling: baseline signature + new test signature → Control Chart → Performance Report
• PCA approach: baseline signature + new test signature → Comparing PCA Counter Weights → Performance Report
• WRAPPER approach: baseline signature + new test signature → Logistic Regression → Performance Report
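For the WRAPPER approach, the slides feed the baseline and new-test signatures into logistic regression. A hedged sketch of that step: the model is trained on the labeled baseline signature counters and then scores each new-test observation; the 0.5 probability threshold is a placeholder, not the study's setting.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

def detect_deviations(baseline_sig: pd.DataFrame, baseline_labels: pd.Series,
                      new_sig: pd.DataFrame, threshold: float = 0.5) -> pd.Series:
    """Flag new-test observations as deviations via logistic regression (sketch).

    `baseline_sig`/`new_sig` contain only the signature counters, and
    `baseline_labels` encodes pass/fail as 0/1 for each baseline observation.
    """
    model = LogisticRegression(max_iter=1000)
    model.fit(baseline_sig, baseline_labels)

    fail_probability = model.predict_proba(new_sig)[:, 1]   # probability of the fail class
    return pd.Series(fail_probability > threshold, index=new_sig.index, name="deviation")
```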
CASE STUDY
RQ: How effective are our signature-based approaches in detecting performance deviations in load tests?
Evaluation using precision, recall, and F-measure. An ideal approach should predict a minimal and correct set of performance deviations.
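For reference, these are the standard definitions of the metrics, where true positives (TP) are deviations correctly flagged by an approach, false positives (FP) are false alarms, and false negatives (FN) are missed deviations:

```latex
\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad
\mathrm{Recall} = \frac{TP}{TP + FN}, \qquad
F\text{-measure} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}
```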
SUBJECTS OF STUDY
• Industrial System — Domain: telecom. Type of data: (1) a load test repository; (2) data from our experiments on the company's testing platform.
• DVD Store — System: open source. Domain: e-commerce. Type of data: data from our experiments with an open-source benchmark application.
FAULT INJECTION
Category — Faults
• Software Failure — CPU stress, memory stress
• Abnormal Workload — Interfering workload
• Operator Errors — Unscheduled replication
CASE STUDY FINDINGS
• Effectiveness (precision/recall/F-measure)
• Practical differences
CASE STUDY FINDINGS (Effectiveness)
[Chart: precision, recall, and F-measure for the WRAPPER, PCA, Clustering, and Random approaches.]
• Random sampling has the lowest effectiveness.
• On average, and in all experiments, PCA performs better than the Clustering approach.
• WRAPPER dominates the best unsupervised approach, i.e., PCA.
• Overall, both the WRAPPER and PCA approaches achieve an excellent balance of high precision and recall for deviation detection (on average 0.95 and 0.94 for WRAPPER, and 0.82 and 0.84 for PCA).
CASE STUDY FINDINGS (Practical Differences)
• Real-time analysis
• Manual overhead
• Stability
REAL-TIME ANALYSIS
• WRAPPER: detects deviations on a per-observation basis.
• PCA: requires a certain number of observations before it can detect deviations (wait time).
STABILITY
We refer to 'stability' as the ability of an approach to remain effective as its signature size is reduced.
STABILITY
[Charts: F-measure versus signature size for the unsupervised (PCA) and supervised (WRAPPER) approaches on both subject systems.]
The WRAPPER methodology is more stable than the PCA approach.
MANUAL OVERHEAD
The WRAPPER approach requires all observations of the baseline performance counter data to be labeled as pass/fail.
Labeling each observation is time-consuming.