Opprentice: Towards Practical and Automatic Anomaly Detection Through Machine Learning
Dapeng Liu, Youjian Zhao, Haowen Xu, Yongqian Sun, Dan Pei, Jiao Luo, Xiaowei Jing, Mei Feng
2015/12/3 Dapeng Liu (liudp10@mails.tsinghua.edu.cn)
KPIs and Anomaly Detection
[Figure: page views (PV) of Baidu]
KPIs (Key Performance Indicators): a set of performance measures that evaluate service quality.
KPI anomalous (unexpected) behaviors: potential failures, bugs, attacks, etc.
Anomaly detection matters:
• Find anomalous behaviors of the KPI curve
• Diagnose and fix them
• Avoid further impact and revenue losses
IMC'15: "The Dark Menace: Characterizing Network-based Attacks in the Cloud"
IMC'15: "Dissecting UbuntuOne: Autopsy of a Global-scale Personal Cloud Back-end"
How to Build the Anomaly Detection System
Domain experts (operators):
• Responsible for the KPIs
• Know the KPI behaviors well
Developers:
• Build the detection system
• Know several anomaly detectors: simple threshold, historical average, wavelet, Holt-Winters, …
How to Build the Anomaly Detection System
In practice, it is more complex:
• Operators describe anomalies to developers
• Developers select detectors and tune parameters (wavelet, moving average, Holt-Winters, …)
• The detection system reports anomalies
How to Build the Anomaly Detection System: Challenges
1. Operators have difficulty defining anomalies precisely and formally in advance
2. Selecting and combining suitable detectors is tricky
3. Detectors are not intuitive to tune
Opprentice (Operators' apprentice)
A More Natural Way
[Figure: Opprentice, operators (OP), and the PV curve]
Design Goal
Labeling vs. anomaly detection:
• Operators provide labels and an accuracy preference (precision and recall)
• Opprentice performs the anomaly detection
Outline
• Background and Motivation
• Key Ideas
• Results
• Conclusion
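Key Ideas
Detector model: a detector measures a severity for each data point and compares it with a severity threshold (sThld), outputting 1 or 0.
For example, historical average: $\text{severity} = \frac{|\text{value} - \mu|}{\sigma}$
The severity is used as an anomaly feature.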
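The detector model above can be illustrated with a minimal sketch. It assumes that μ and σ are the mean and (population) standard deviation of a window of historical values; the function names, the toy history, and the sThld value of 3.0 are hypothetical and not taken from the slides.

```python
import statistics


def historical_average_severity(history, value):
    """Severity = |value - mu| / sigma, with mu and sigma estimated from history."""
    mu = statistics.mean(history)
    sigma = statistics.pstdev(history)  # assumption: population standard deviation
    if sigma == 0:
        # Degenerate case: a flat history. Any deviation is infinitely severe.
        return 0.0 if value == mu else float("inf")
    return abs(value - mu) / sigma


def is_anomaly(severity, s_thld):
    """Apply the severity threshold (sThld): 1 means anomaly, 0 means normal."""
    return 1 if severity >= s_thld else 0


# Toy history of a KPI at the same time slot on previous days (hypothetical data).
history = [100, 102, 98, 101, 99]
sev_normal = historical_average_severity(history, 101)
sev_spike = historical_average_severity(history, 120)

print(sev_normal, is_anomaly(sev_normal, 3.0))
print(sev_spike, is_anomaly(sev_spike, 3.0))
```

In the Opprentice setting the raw severity, rather than the thresholded 0/1 output, is what gets passed on as a feature, so sThld does not need to be tuned by hand.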
Key Ideas
KPI data → detector configurations (detectors with different parameters) → extract features
Example configurations:
• Historical average-4 season
• EWMA-0.7
• WMA-WIN30
• Differencing-last slot
• Differencing-last season
• Differencing-last day
• Time series decomposition
• HW 0.2 0.2 0.2
• HW 0.5 0.7 0.7
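The feature-extraction step can be sketched as follows: every detector configuration is run over the same KPI series, and the severities it reports become one feature column. This is an illustration under stated assumptions, not the authors' implementation. Only two detector families (EWMA and differencing) are shown, severity is taken to be the absolute prediction error, and the names `CONFIGS`, `ewma_severity`, `diff_severity`, and the toy series are hypothetical.

```python
def ewma_severity(series, alpha):
    """Severity per point: |value - EWMA prediction| (the first point gets 0)."""
    severities = [0.0]
    prediction = series[0]
    for value in series[1:]:
        severities.append(abs(value - prediction))
        prediction = alpha * value + (1 - alpha) * prediction
    return severities


def diff_severity(series, lag):
    """Severity per point: |value - value `lag` slots earlier| (0 until enough history)."""
    return [
        0.0 if i < lag else abs(series[i] - series[i - lag])
        for i in range(len(series))
    ]


# One entry per detector configuration (same detector, different parameters).
# A lag of 3 stands in for "last day"/"last season" on this toy series.
CONFIGS = {
    "ewma_0.7": lambda s: ewma_severity(s, 0.7),
    "ewma_0.3": lambda s: ewma_severity(s, 0.3),
    "diff_last_slot": lambda s: diff_severity(s, 1),
    "diff_lag_3": lambda s: diff_severity(s, 3),
}


def extract_features(series):
    """Return one feature vector per data point, one column per configuration."""
    columns = [detector(series) for detector in CONFIGS.values()]
    return [list(row) for row in zip(*columns)]


series = [10, 10, 10, 30, 10, 10]  # hypothetical KPI values with one spike
features = extract_features(series)
for value, row in zip(series, features):
    print(value, row)
```

Because no configuration is selected or tuned in advance, the choice of which columns matter is left to the classifier trained on the operators' labels.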
Key Ideas
Classification in the feature space (supervised machine learning), with labels from operators
Address Challenges of Designing Opprentice
• Labeling overhead. Solution: an effective labeling tool
• Incomplete anomaly types in the historical data. Solution: incremental re-training with new data
• Class imbalance problem. Solution: adjusting the classification threshold (cThld) based on the preference
• Irrelevant and redundant features. Solution: random forests
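The cThld adjustment listed above can be sketched as a search over candidate thresholds on the classifier's anomaly score. The selection rule here is an assumption made for illustration, not necessarily the rule used in the paper: prefer thresholds that satisfy the operators' precision and recall preference, and break ties by F-score. All function names and the toy scores and labels are hypothetical.

```python
def precision_recall(scores, labels, c_thld):
    """Precision and recall when points with score >= c_thld are flagged as anomalies."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= c_thld and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= c_thld and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < c_thld and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall


def f_score(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def choose_c_thld(scores, labels, min_precision, min_recall):
    """Pick the threshold that meets the preference if possible, else the best F-score."""
    best_key, best_thld = None, None
    for c_thld in sorted(set(scores)):
        p, r = precision_recall(scores, labels, c_thld)
        # Tuples compare element-wise: meeting the preference outranks F-score.
        key = (p >= min_precision and r >= min_recall, f_score(p, r))
        if best_key is None or key > best_key:
            best_key, best_thld = key, c_thld
    return best_thld


# Hypothetical classifier scores (e.g. fraction of trees voting "anomaly") and labels.
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.80, 0.90]
labels = [0, 0, 0, 1, 0, 1, 1]

default_pr = precision_recall(scores, labels, 0.5)
chosen = choose_c_thld(scores, labels, min_precision=0.7, min_recall=0.9)
chosen_pr = precision_recall(scores, labels, chosen)
print("default 0.5:", default_pr)
print("chosen", chosen, ":", chosen_pr)
```

On this toy data the default threshold of 0.5 misses one labeled anomaly, while the chosen threshold trades a little precision for full recall, which is the kind of adjustment the preference is meant to drive.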
Design Overview
• Training a classifier
• Detecting anomalies
See the paper for full details.
Evaluation
Random Forests vs. Basic Detectors and Static Combinations
[Figure: random forest compared with basic detectors]
Random Forests vs. Other Learning Algorithms
(The order of features is based on mutual information.)
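Ordering features by mutual information with the labels can be sketched as below. The slides do not say how the continuous severities were discretized, so splitting each feature at its median is an assumption here, and the feature names and values are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equally long discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


def binarize(values):
    """Assumed discretization: 1 if the value is at or above the (upper) median."""
    median = sorted(values)[len(values) // 2]
    return [1 if v >= median else 0 for v in values]


def rank_features(feature_columns, labels):
    """Return feature names sorted by decreasing mutual information with the labels."""
    scored = {
        name: mutual_information(binarize(column), labels)
        for name, column in feature_columns.items()
    }
    return sorted(scored, key=scored.get, reverse=True), scored


labels = [0, 0, 0, 0, 1, 1, 1, 1]
feature_columns = {
    "noise": [0.5, 0.1, 0.9, 0.2, 0.4, 0.8, 0.3, 0.7],
    "informative": [0.1, 0.2, 0.1, 0.3, 0.9, 0.8, 0.7, 0.9],
}
ranking, mi_scores = rank_features(feature_columns, labels)
print(ranking, mi_scores)
```

Adding features in this order makes it easier to see how each learning algorithm copes as less informative features are included.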
Evaluation
See the paper for full details.
Opprentice as a Whole
[Figure: oracle mode (best case), 5-fold cross-validation, and Opprentice]
Opprentice achieves 23%, 40%, and 110% more points inside the preference regions than 5-fold cross-validation.
Conclusion
• Opprentice is an automatic and accurate machine learning framework for KPI anomaly detection.
• Opprentice takes over defining anomalies, selecting detectors, and tuning detectors.
• Opprentice bridges the gap in applying complex detectors in practice.
• The idea behind Opprentice, i.e., using machine learning to model domain knowledge, could be a promising way to automate other service management tasks.