Empirical Evaluation of Workload Forecasting Techniques for Predictive Cloud Resource Scaling
In Kee Kim, Wei Wang, Yanjun (Jane) Qi, and Marty Humphrey
Computer Science @ University of Virginia
Motivation – Cloud Resource Scaling Approach
• Reactive Auto Scaling [AWS, Google, Azure, etc.]
- Autoscaling based on resource utilization: CPU, memory, network I/O, …
[Figure: number of instances vs. resource demand over time; the reactive scaler trails the demand curve, producing scaling delays on both ramp-up and ramp-down.]
Motivation – Cloud Resource Scaling Approach
• Reactive Auto Scaling [AWS, Google, Azure, etc.]
- Autoscaling based on resource utilization: CPU, memory, network I/O, …
- Suffers from scaling delays when demand changes.
• Predictive Resource Scaling
- Resource scaling based on forecasting: (1) future resource usage and (2) the workload arrival pattern.
Predictive Resource Scaling
[Figure: a Predictive Resource Management Engine (Workload Predictor + Resource Scaling) sits between the incoming workload and the cloud infrastructure.]
1. Workload Predictor
- Detects the cloud workload pattern.
- Predicts the job arrival pattern in the near future.
2. Resource Scaling
- Allocates/deallocates cloud resources based on the prediction.
• What should drive the Workload Predictor? Regression? Machine learning? Time series?
Research Questions
• Question #1: Which workload predictor has the highest accuracy for job arrival time prediction?
• Question #2: Which existing workload predictor provides the best cost efficiency and performance benefits?
• Question #3: Which style of predictive scaling achieves the best cost efficiency and performance benefits?
Research Big Picture
[Figure: a resource manager (WL predictor, resource scaling, job scheduling, VM control) driving public clouds against a collection of realistic workloads; candidate predictor families include naïve, regression, time series, and non-temporal models.]
• 24 workload patterns × 21 predictors × 4 scaling policies × 2 cloud configurations = 4,032 ≈ 4K cases.
• 4K cases are very challenging to evaluate via actual deployment on IaaS clouds.
- We instead use PICS (Public IaaS Cloud Simulator) [KWH, CLOUD'15].
Experiment Design
• Collection of workload predictors.
• Simulation workloads.
• Design of the resource management system.
• Implementation and performance tuning.
Collection of (Existing) Workload Predictors
• We collect 21 workload predictors across four families (a usage sketch follows below):
1) Naïve models: mean-based and recent-mean (kNN).
2) Regression models: global model (linear, quadratic, cubic) and local model (linear, quadratic, cubic).
3) Time series models: smoothing (WMA, EMA, DES) and Box-Jenkins (AR, ARMA, ARIMA).
4) Non-temporal (ML) models: SVMs (linear, Gaussian), decision tree, and ensembles (RF, GBM, ExTs).
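To make the four families concrete, here is a minimal sketch of how one representative of each family could be run over a history of job inter-arrival times, using the same Python stack described later in the talk (scikit-learn, statsmodels). The make_lagged/predict_next helpers and every parameter choice are our own illustrations, not the paper's implementation.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR
from statsmodels.tsa.arima.model import ARIMA

def make_lagged(series, lag=3):
    """Turn a 1-D series into (X, y) pairs: `lag` past values -> next value."""
    X = np.array([series[i:i + lag] for i in range(len(series) - lag)])
    y = np.array(series[lag:])
    return X, y

def predict_next(series, lag=3):
    """Next-value forecasts from one representative of each predictor family."""
    X, y = make_lagged(series, lag)
    last = np.array(series[-lag:]).reshape(1, -1)
    return {
        "naive-mean": float(np.mean(series)),                           # naive family
        "linear-reg": LinearRegression().fit(X, y).predict(last)[0],    # regression family
        "linear-svm": SVR(kernel="linear").fit(X, y).predict(last)[0],  # ML family
        "gaussian-svm": SVR(kernel="rbf").fit(X, y).predict(last)[0],
        # Box-Jenkins family: order=(p, d, q), so (1, 0, 1) is ARMA(1, 1).
        "arma": ARIMA(series, order=(1, 0, 1)).fit().forecast(1)[0],
    }

# Example: a noisy, roughly periodic stream of inter-arrival times (seconds).
history = (6 + np.sin(np.arange(50)) +
           0.3 * np.random.default_rng(1).normal(size=50)).tolist()
print(predict_next(history))
```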
Simulation Workload Patterns
• We generate 24 workload patterns based on four base shapes (synthesis sketch below):
- On-and-off (batch/scientific): compute phases separated by inactivity periods.
- Growing (emerging service): steadily increasing demand.
- Cyclic bursting (e-commerce): periodic demand with bursts.
- Random/unpredictable (media): no stable structure.
[Figure: compute vs. time sketches for each of the four patterns.]
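For intuition, a hedged sketch of how these four base shapes might be synthesized as demand-over-time series; the periods, rates, and noise levels below are illustrative assumptions, not the paper's workload generator.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(24 * 60)  # one simulated day at minute resolution

# On-and-off (batch/scientific): alternating compute and inactivity periods.
on_off = np.where((t // 180) % 2 == 0, 10.0, 0.0)
# Growing (emerging service): steadily rising demand plus noise.
growing = 0.01 * t + rng.normal(0, 0.5, t.size)
# Cyclic bursting (e-commerce): a daily cycle with occasional sharp bursts.
cyclic = 5 + 4 * np.sin(2 * np.pi * t / t.size) + 8 * (rng.random(t.size) < 0.01)
# Random/unpredictable (media): heavy-tailed, structureless demand.
random_ = rng.exponential(5.0, t.size)
```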
Design of Resource Management System
[Figure: jobs (duration, deadline) arrive at a job portal and enter a job queue; job arrival info is logged in a workload repository, which supplies samples to a predictor for scaling-out and a predictor for scaling-in; their prediction results feed a predictive scaler (+/- VMs), while a resource management module handles job scheduling, job assignment, VM scaling, and VM management on the cloud infrastructure (e.g., AWS, Azure).] A skeleton of this loop is sketched below.
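A hedged skeleton of the loop this diagram implies: arrivals are logged into a sliding-window repository, separate predictors serve scaling-out and scaling-in, and their outputs drive the scaler. All class and method names here are our own illustration, not the paper's code.

```python
import collections
import time

class WorkloadRepository:
    """Sliding window of recent job-arrival samples (50-100 per the talk)."""
    def __init__(self, max_samples=100):
        self.arrivals = collections.deque(maxlen=max_samples)
    def record(self, arrival_time):
        self.arrivals.append(arrival_time)
    def samples(self):
        return list(self.arrivals)

class ResourceManager:
    """Wires the repository, the two predictors, and the predictive scaler."""
    def __init__(self, repo, out_predictor, in_predictor, scaler):
        self.repo = repo
        self.out_p = out_predictor   # predictor for scaling-out
        self.in_p = in_predictor     # predictor for scaling-in
        self.scaler = scaler
    def on_job_arrival(self, job):
        self.repo.record(time.time())
        samples = self.repo.samples()
        # Each side gets its own forecast, matching the diagram's two predictors.
        self.scaler.decide(self.out_p.predict(samples),
                           self.in_p.predict(samples))
```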
Implementations and Performance Tuning
• Workload predictor implementation:
- All predictors are written in Python.
- numpy and pandas for data handling.
- statsmodels for the time-series models.
- scikit-learn for the non-temporal (ML) models.
• Predictor performance tuning (sketched below):
- (Training) sample size: a tradeoff between prediction performance and overhead; most predictors use the 50-100 most recent job arrival samples.
- Parameter selection: a grid search driven by prediction accuracy.
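A minimal sketch of that tuning loop for one predictor family: keep a sliding window of the most recent arrival samples and grid-search SVR parameters by prediction accuracy. The window size follows the slide; the candidate grid, the lag-feature formulation, and the tune_svm helper are our own assumptions.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.svm import SVR

def tune_svm(history, lag=3, window=100):
    """Grid-search an SVR on the last `window` job-arrival samples."""
    recent = history[-window:]
    X = np.array([recent[i:i + lag] for i in range(len(recent) - lag)])
    y = np.array(recent[lag:])
    grid = {"C": [0.1, 1, 10], "epsilon": [0.01, 0.1], "kernel": ["linear", "rbf"]}
    # TimeSeriesSplit keeps the scoring honest: never train on future samples.
    search = GridSearchCV(SVR(), grid, cv=TimeSeriesSplit(n_splits=3),
                          scoring="neg_mean_absolute_percentage_error")
    search.fit(X, y)
    return search.best_estimator_, search.best_params_
```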
Performance Evaluation
• Experiment #1: statistical predictor performance.
• Experiment #2: predictive scaling performance.
Experiment #1 – (Statistical) Predictor Performance
• Purpose: measure statistical predictor accuracy and overhead.
- Accuracy: MAPE (Mean Absolute Percentage Error; computed as sketched below).
- Overhead: sum of all prediction times.
• Overall results:
- Accuracy: the average MAPE across predictors is 0.6360; SVMs achieve 0.37-0.40 (42% lower than the average).
- Overhead: for 10K jobs, kNN needs 0.5s of total prediction time, while ARMA needs 6032s.
[Figure: overall prediction accuracy and overall prediction overhead (10K jobs) across all 21 predictors.]
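For completeness, MAPE as used above (the standard definition, not anything specific to the paper):

```python
import numpy as np

def mape(actual, predicted):
    """Mean Absolute Percentage Error: mean(|actual - predicted| / |actual|)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.mean(np.abs((actual - predicted) / actual))

print(mape([10, 12, 8], [11, 11, 10]))  # ~0.144
```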
Experiment #1 – (Statistical) Predictor Performance
• Accuracy of workload predictors per pattern (top 3 per pattern, by MAPE):

  Workload   Rank   Predictor     MAPE
  Growing    1      Lin. SVM      0.28
             2      AR            0.29
             3      ARMA          0.30
             Avg.   --            0.51
  On/Off     1      Gau. SVM      0.22
             2      ARMA          0.30
             3      Lin. SVM      0.44
             Avg.   --            0.69
  Bursty     1      ARIMA         0.38
             2      Brown's DES   0.41
             3      Lin. SVM      0.43
             Avg.   --            0.75
  Random     1      Gau. SVM      0.45
             2      Lin. Reg.     0.46
             3      Lin. SVM      0.46
             Avg.   --            0.52
Experiment #2 – Predictive Scaling Performance
• Purpose: measure how much benefit the resource manager gains from (1) a good predictor and (2) different styles of predictive scaling.
• Predictors: the 8 best from Experiment #1: Linear Regression, WMA, Brown's DES (BRDES), AR, ARMA, ARIMA, Linear SVM, Gaussian SVM.
• Four styles of resource scaling (sketched below):
- RR (Reactive scaling-out + Reactive scaling-in) -- baseline.
- PR (Predictive scaling-out + Reactive scaling-in).
- RP (Reactive scaling-out + Predictive scaling-in).
- PP (Predictive scaling-out + Predictive scaling-in).
• Cloud configurations: two pricing models, hourly and minutely.
• Metrics: cloud cost and job deadline miss rate.
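Here is a hedged sketch of how the four styles might combine reactive and predictive triggers in one decision function; the utilization thresholds, the forecast interface, and all names are illustrative assumptions rather than the paper's implementation.

```python
from dataclasses import dataclass

@dataclass
class ScalingPolicy:
    predictive_out: bool  # P? styles forecast demand before scaling out
    predictive_in: bool   # ?P styles forecast slack before scaling in

RR = ScalingPolicy(False, False)  # baseline
PR = ScalingPolicy(True, False)
RP = ScalingPolicy(False, True)
PP = ScalingPolicy(True, True)

def scaling_decision(policy, utilization, forecast_demand, capacity):
    """Return +1 (scale out), -1 (scale in), or 0 (hold)."""
    # Scale-out side: reactive looks at current utilization, predictive at forecast.
    if policy.predictive_out:
        if forecast_demand > capacity:
            return +1
    elif utilization > 0.8:   # assumed reactive scale-out threshold
        return +1
    # Scale-in side: same split for releasing resources.
    if policy.predictive_in:
        if forecast_demand < 0.5 * capacity:
            return -1
    elif utilization < 0.2:   # assumed reactive scale-in threshold
        return -1
    return 0

print(scaling_decision(PP, utilization=0.6, forecast_demand=12, capacity=10))  # +1
```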
Experiment #2 – Predictive Scaling Performance
• Overall results, normalized to the RR baseline (= 1.0):
- Hourly pricing model: predictive scaling reduces cost by 37-58% and the deadline miss rate by 67-87%, depending on the policy.
- Minutely pricing model: cost shows no improvement, but the deadline miss rate drops by 60-72%.
[Figure: normalized cost and deadline miss rate of PR, RP, and PP against the RR baseline, under (a) the hourly and (b) the minutely pricing model.]
Experiment #2 – Predictive Scaling Performance
• PP (Predictive scaling-out + Predictive scaling-in) details: deadline miss rate per workload pattern.
[Figure: deadline miss rate of the best predictor vs. the predictor average for each workload pattern, under the hourly and minutely pricing models.]

Best predictors per workload pattern:

  Workload   Top 1          Top 2               Top 3
  Growing    Linear SVM     AR                  ARMA
  On/Off     Gaussian SVM   ARMA                Linear SVM
  Bursty     ARIMA          Brown's DES         Linear SVM
  Random     Gaussian SVM   Linear Regression   Linear SVM