Nesime Tatbul, Tae Jun Lee, Stan Zdonik, Mejbah Alam, Justin Gottschlich - PowerPoint PPT Presentation


  1. Nesime Tatbul, Tae Jun Lee, Stan Zdonik, Mejbah Alam, Justin Gottschlich. 32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montreal, Canada

  2. Motivation: Time Series Anomaly Detection
     - Anomaly: patterns that do not conform to expected behavior.
     - Anomalies can have critical impact: loss of life, property damage, monetary loss, ...
     - Applications of anomaly detection (AD) are numerous and diverse, e.g., autonomous driving and cancer detection.
     - Anomalies often occur over a period of time.
     - Six levels of autonomy: L0: No automation; L1: Driver assistance; L2: Partial automation; L3: Conditional automation; L4: High automation; L5: Full automation. L3+ autonomy requires robust AD systems.
     - Sources: Society of Automotive Engineers (SAE); National Highway Traffic Safety Administration (NHTSA); http://www.vaccinogeninc.com/oncovax/science-due-diligence/overview-part-1

  3. Motivation: Range-based Anomalies
     - Time series anomalies are range-based, i.e., they occur over a period of time.
     - [Figure: Atrial Premature Contraction anomaly in human ECG. Source: Chandola et al., "Anomaly Detection: A Survey", ACM Computing Surveys, 41(3), 2009.]
     - There are domain-specific application preferences:
       - Cancer detection, real-time systems: early response; avoid false negatives!
       - Robotic defense systems: delayed response; avoid false positives!
       - Emergency braking in self-driving cars: neither too early nor too late; avoid false negatives!

  4. Problem: How to Measure Accuracy?
     - Point-based anomalies, counted as True Positives (TP), False Positives (FP), and False Negatives (FN):
       $\text{Precision} = TP \div (TP + FP)$
       $\text{Recall} = TP \div (TP + FN)$
     - Range-based anomalies: Precision = ? Recall = ?
       - Must express partial detection
       - Must support flexible time bias
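To make the point-based definitions on this slide concrete, here is a minimal Python sketch. The function name `point_precision_recall` and the example data are illustrative assumptions, not part of the original slides; anomalies are represented as sets of integer time points.

```python
def point_precision_recall(real, predicted):
    """Classical point-based Precision and Recall.

    `real` and `predicted` are iterables of anomalous time points
    (an assumed representation for this sketch).
    """
    real, predicted = set(real), set(predicted)
    tp = len(real & predicted)   # points flagged and truly anomalous
    fp = len(predicted - real)   # points flagged but normal
    fn = len(real - predicted)   # anomalous points that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical example: the anomaly spans t = 10..19, the detector flags t = 15..24.
precision, recall = point_precision_recall(range(10, 20), range(15, 25))
print(precision, recall)  # 0.5 0.5
```

Every point counts equally here, so the score cannot tell whether the detector caught the start or the end of the anomaly. That is the gap the slide's two requirements (partial detection, flexible time bias) point at.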

  5. State of the Art
     - Classical Precision and Recall
       - Point-based anomalies
       - Precision penalizes FP, Recall penalizes FN
       - F_β-Score to combine and weight them:
         $F_\beta = (1 + \beta^2) \times \frac{\text{Precision} \times \text{Recall}}{(\beta^2 \times \text{Precision}) + \text{Recall}}$
         - β: relative importance of Recall to Precision
         - β = 1: evenly weighted (harmonic mean)
         - β = 2: weights Recall higher (i.e., no FN!)
         - β = 0.5: weights Precision higher (i.e., no FP!)
     - Numenta Anomaly Benchmark (NAB) [2]
       - Point-based anomalies
       - Focuses specifically on early detection use cases
       - Difficult to use in practice (irregularities, ambiguities, magic numbers) [3]
     - Activity recognition metrics
       - No support for flexible time bias

     [2] Lavin and Ahmad, "Evaluating Real-Time Anomaly Detection Algorithms - The Numenta Anomaly Benchmark", IEEE ICMLA, 2015.
     [3] Singh and Olinsky, "Demystifying Numenta Anomaly Benchmark", IEEE IJCNN, 2017.
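The F_β formula on this slide can be sketched in a few lines of Python. The function name `f_beta` and the sample Precision and Recall values are assumptions chosen for illustration only.

```python
def f_beta(precision, recall, beta=1.0):
    """F-beta score: (1 + beta^2) * P * R / (beta^2 * P + R)."""
    b2 = beta ** 2
    denom = b2 * precision + recall
    return (1 + b2) * precision * recall / denom if denom else 0.0


# Hypothetical detector with perfect Recall but only 50% Precision.
p, r = 0.5, 1.0
print(round(f_beta(p, r, beta=1.0), 3))  # 0.667, evenly weighted
print(round(f_beta(p, r, beta=2.0), 3))  # 0.833, Recall weighted higher
print(round(f_beta(p, r, beta=0.5), 3))  # 0.556, Precision weighted higher
```

With high Recall and low Precision, raising β rewards the detector and lowering β penalizes it, which matches the slide's reading of β = 2 and β = 0.5.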

  6. Precision and Recall for Time Series
     - We extend classical Precision and Recall to measure ranges.
     - Our model is expressive, flexible, and extensible.
     - [Figure: Range-based Recall and Range-based Precision formulas, with their customizable parameters highlighted]
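The formulas on this slide were images and did not survive in the transcript, so the following Python sketch is only one plausible reading of a range-based Recall built from the parameters named on the next slide: an existence weight α, a cardinality factor γ(), and an overlap size. The exact form, the function names, the inclusive `(start, end)` range encoding, the flat (unbiased) overlap size, and the assumption that predicted ranges do not overlap each other are all assumptions of this sketch; see the paper for the authors' definitions.

```python
def overlap_len(a, b):
    """Number of integer time points shared by two inclusive (start, end) ranges."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def range_recall(real_ranges, predicted_ranges, alpha=0.0, gamma=lambda n: 1.0):
    """Average per-range recall: alpha * existence + (1 - alpha) * overlap reward.

    gamma(n) is an assumed penalty applied when one real range is covered
    by n > 1 separate predicted ranges (fragmented detection).
    """
    if not real_ranges:
        return 0.0
    total = 0.0
    for r in real_ranges:
        overlaps = [overlap_len(r, p) for p in predicted_ranges]
        overlaps = [o for o in overlaps if o > 0]
        existence = 1.0 if overlaps else 0.0          # was the anomaly seen at all?
        size = sum(overlaps) / (r[1] - r[0] + 1)      # flat overlap size
        cardinality = 1.0 if len(overlaps) <= 1 else gamma(len(overlaps))
        total += alpha * existence + (1 - alpha) * cardinality * size
    return total / len(real_ranges)


real = [(10, 19)]
print(range_recall(real, [(15, 24)], alpha=0.0))  # 0.5: half of the range is covered
print(range_recall(real, [(15, 24)], alpha=1.0))  # 1.0: only existence is rewarded
# Two fragments covering 6 of 10 points, with fragmentation penalized by 1/n.
print(range_recall(real, [(10, 12), (16, 18)], gamma=lambda n: 1.0 / n))
```

A range-based Precision could be sketched the same way by swapping the roles of real and predicted ranges; that symmetry is likewise an assumption here.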

  7. Customization Examples
     - [Figure: example definitions of Overlap Size ω() and Positional Bias δ()]
     - Cancer Detection: set δ() = Front-end, β = 2
     - Robotic Defense: set δ() = Back-end, β = 0.5
     - Emergency Braking: set δ() = Middle, β = 1.5
     - Our model subsumes the classical point-based model when:
       - all ranges are represented as unit-size ranges, and
       - α = 0, γ() = 1, ω() is as above, and δ() = Flat
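The slide's definitions of ω() and δ() were shown as a figure that is not preserved here. As a hedged illustration, the Python sketch below assumes simple linear forms for the four positional biases named on the slide (Flat, Front-end, Back-end, Middle) and an overlap size that is the bias-weighted share of the anomaly covered by a prediction. The function bodies, names, and the `PRESETS` table layout are assumptions of this sketch; only the (δ, β) pairings come from the slide.

```python
def flat(i, length):
    return 1.0                      # every position counts equally

def front_end(i, length):
    return float(length - i + 1)    # earlier positions weigh more

def back_end(i, length):
    return float(i)                 # later positions weigh more

def middle(i, length):
    return float(i if i <= length / 2 else length - i + 1)


def overlap_size(anomaly, overlap, delta):
    """Bias-weighted fraction of `anomaly` covered by `overlap`.

    Both arguments are inclusive (start, end) ranges; `overlap` may be None.
    Positions inside the anomaly are numbered 1..length.
    """
    start, end = anomaly
    length = end - start + 1
    my_value = max_value = 0.0
    for i in range(1, length + 1):
        bias = delta(i, length)
        max_value += bias
        t = start + i - 1
        if overlap is not None and overlap[0] <= t <= overlap[1]:
            my_value += bias
    return my_value / max_value


# (delta, beta) pairings taken from the slide.
PRESETS = {
    "cancer detection": (front_end, 2.0),
    "robotic defense": (back_end, 0.5),
    "emergency braking": (middle, 1.5),
}

# A detector that catches only the first half of a 10-point anomaly.
for name, (delta, beta) in PRESETS.items():
    print(name, round(overlap_size((1, 10), (1, 5), delta), 3), beta)
```

Under these assumed biases, the same early partial detection earns about 0.727 with Front-end, 0.273 with Back-end, and 0.5 with Middle or Flat. For a unit-size range with Flat bias the overlap size is either 0 or 1, which is consistent with the slide's claim that the point-based model is a special case.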

  8. Selected Experimental Results
     - Comparison to Classical model (LSTM-AD). Our model:
       - subsumes the classical model
       - is sensitive to positional bias
     - Comparison to Numenta model (LSTM-AD). Our model can:
       - mimic the Numenta model
       - catch additional intricacies
     - Multiple Anomaly Detectors (NYC-Taxi). Our model is more effective in:
       - evaluating multiple detectors
       - capturing subtleties in data
     - Please see our paper for details of this experimental study and additional results.

  9. Key Takeaways
     - This work extends the classical Precision and Recall model to time series data.
     - We provide tunable parameters to capture domain-specific application preferences.
     - Experiments with diverse datasets and anomaly detectors prove the benefits of our approach.
     - Future work includes:
       - designing new training strategies for range-based anomaly detection
       - exploring use in other time series classification tasks and applications

  10. More Information
     - Watch our short video: https://www.youtube.com/watch?v=K5f-dUBiQP4
     - Read our paper: https://arxiv.org/abs/1803.03639/
     - Download our tool: https://github.com/IntelLabs/TSAD-Evaluator/
     - Visit our poster session at NeurIPS'18: Today at 5:00 - 7:00 PM in Room 210 & 230 AB #116
     - Thanks to Intel and NSF for funding this research.
