

  1. Incorporating Feedback into Tree-based Anomaly Detection. Shubhomoy Das, Weng-Keen Wong, Alan Fern, Thomas G. Dietterich and Md Amran Siddiqui, School of EECS

  2-3. Anomaly Detection • Goal: Identify rare or strange objects

  4-8. Typical Investigation
  [Figure: an anomaly detector produces a ranking g(x); instances are investigated from the top of the ranking down.]
  • Major problem: statistical anomalies don't necessarily correspond to semantic anomalies
  • Need to deal with a large number of false positives
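A minimal sketch of this workflow in Python, assuming scikit-learn's IsolationForest as the off-the-shelf detector (the toy data and the budget of 20 investigations are purely illustrative): score every instance, sort, and hand the analyst the top of the ranking.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(42)
# Toy data: a dense nominal cluster plus a few scattered points.
X = np.vstack([rng.normal(0, 1, size=(500, 2)),
               rng.uniform(-6, 6, size=(10, 2))])

detector = IsolationForest(n_estimators=100, random_state=42).fit(X)
scores = -detector.score_samples(X)   # negate: higher = more anomalous

# The "typical investigation": walk the ranking from the top.
ranking = np.argsort(-scores)
for rank, idx in enumerate(ranking[:20], start=1):
    print(f"{rank:2d}. instance {idx}  anomaly score {scores[idx]:.3f}")
```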

  9-18. Investigation with Feedback
  [Figure: the same ranking g(x), but the analyst labels each investigated instance as Nominal or Anomaly and the detector re-ranks after every label.]
  • Ranking is adaptive
  • Reduces false positives
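The loop itself can be sketched independently of any particular detector; the names below (investigate_with_feedback, update, get_label) are placeholders introduced here for illustration, not from the slides. The concrete weight update that AAD plugs into this loop is sketched after the Active Anomaly Discovery slides below.

```python
def investigate_with_feedback(scores, update, get_label, budget):
    """Analyst-in-the-loop protocol: repeatedly show the top-ranked
    unlabeled instance, collect an 'anomaly'/'nominal' label, and let
    the detector adapt its ranking before the next query."""
    labeled = {}                                  # index -> label
    for _ in range(budget):
        unlabeled = [i for i in range(len(scores)) if i not in labeled]
        top = max(unlabeled, key=lambda i: scores[i])
        labeled[top] = get_label(top)             # analyst's verdict
        scores = update(scores, labeled)          # adaptive re-ranking
    return labeled
```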

  19. Tree-based Anomaly Detection • Isolation Forest • HS-Trees • RS-Forest • RPAD • Random Projection Forest • …

  20-22. Isolation Forest
  [Figure: each tree splits on a random feature at a random split value (< / ≥); a shallow leaf indicates an anomaly, a deeper leaf indicates a nominal point.]
  • Typically 100 trees in practice
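A minimal single isolation tree in Python, as a sketch of the idea on the slide rather than the authors' implementation: recursively split on a random feature at a random value and record how quickly a point gets isolated. Averaging this depth over many such trees gives the forest score, and a small average depth (a shallow leaf) marks an anomaly.

```python
import numpy as np

def isolation_depth(X, x, depth=0, max_depth=10, rng=np.random):
    """Depth at which x is isolated by random axis-parallel splits of X."""
    if len(X) <= 1 or depth >= max_depth:
        return depth
    feature = rng.randint(X.shape[1])            # random feature
    lo, hi = X[:, feature].min(), X[:, feature].max()
    if lo == hi:
        return depth
    split = rng.uniform(lo, hi)                  # random split value
    left = X[:, feature] < split
    X_next = X[left] if x[feature] < split else X[~left]
    return isolation_depth(X_next, x, depth + 1, max_depth, rng)

# Averaging isolation_depth over many random trees approximates the
# Isolation Forest score: a small average depth => anomaly.
```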

  23. Weighted Representation of Trees
  • Each instance x maps to a sparse vector with one entry per leaf, e.g. z(x) = [-1, 0, 0, -1, 0, 0, 0, -1, -1, …]^T (extremely sparse)
  • Weights for Isolation Forest: w = [1, 1, 1, 1, 1, 1, 1, 1, 1, …]^T
  • A different set of weights yields other tree-based detectors
  • score(x) = w^T · z(x)
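A sketch of this representation built on a fitted scikit-learn IsolationForest (an assumption for illustration, not the authors' code; using minus the leaf depth as the per-leaf value is one plausible choice consistent with "shallow leaf indicates anomaly"): apply() gives the leaf each tree routes an instance to, those leaf values form the sparse vector z(x), and the detector's score is w^T z(x). Uniform weights reproduce the plain forest's ranking; learning non-uniform weights from feedback is what changes the detector.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(0)
X = np.vstack([rng.normal(0, 1, size=(500, 2)),
               rng.uniform(-6, 6, size=(10, 2))])
forest = IsolationForest(n_estimators=100, random_state=0).fit(X)

# One coordinate per tree node across the whole forest; only the leaf
# an instance lands in gets a nonzero entry, so z(x) is extremely sparse.
offsets = np.cumsum([0] + [est.tree_.node_count for est in forest.estimators_])

def leaf_vector(forest, x):
    """Sparse z(x): minus the depth of the leaf x reaches in each tree,
    zero everywhere else."""
    z = np.zeros(offsets[-1])
    x = x.reshape(1, -1)
    for offset, est in zip(offsets, forest.estimators_):
        leaf = est.apply(x)[0]
        depth = est.decision_path(x).sum() - 1   # nodes on path minus root
        z[offset + leaf] = -depth                # shallow leaf -> larger value
    return z

w = np.ones(offsets[-1])             # uniform weights ~ plain Isolation Forest
score = float(w @ leaf_vector(forest, X[0]))
```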

  24-32. Active Anomaly Discovery
  [Figure: at iteration t, instances are scored with the current weights w_t and compared against a quantile threshold on the scores; the analyst labels the queried instance as Nominal or Anomaly, and the weights are updated to w_{t+1}, w_{t+2}, … after each label.]
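A simplified sketch of that weight update, assuming a hinge-style gradient step (the actual AAD optimization also regularizes w toward the uniform Isolation Forest weights and treats the quantile constraint more carefully; this only conveys the intuition): after each label, nudge w so that labeled anomalies score above the current tau-quantile of all scores and labeled nominals score below it.

```python
import numpy as np

def update_weights(w, Z_all, Z_labeled, y, tau=0.97, lr=0.01, steps=50):
    """One feedback round (simplified): move w so labeled anomalies score
    above the tau-quantile of all scores and labeled nominals below it.
    Z_all      -- leaf vectors z(x) for every instance, one row each
    Z_labeled  -- rows of Z_all the analyst has already judged
    y          -- array of +1 for 'anomaly', -1 for 'nominal'"""
    for _ in range(steps):
        q_tau = np.quantile(Z_all @ w, tau)       # current score threshold
        margins = y * (Z_labeled @ w - q_tau)     # want every margin > 0
        violated = margins < 0
        if not violated.any():
            break
        # Hinge-loss gradient step on the violated constraints only.
        grad = -(y[violated, None] * Z_labeled[violated]).sum(axis=0)
        w = w - lr * grad
        w = w / np.linalg.norm(w)                 # keep the weights normalized
    return w
```

In the generic loop sketched after slide 18, this plays the role of the update callback: each new label refits w and therefore re-ranks the remaining instances.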

  33. Result (synthetic dataset)
  [Figure: true anomalies highlighted.]
  • Baseline discovers 12 anomalies in 35 iterations
  • AAD discovers 23 anomalies in 35 iterations

  34-38. Result
  [Figure: anomalies discovered after 0, 10, 20, 25, and 35 feedback iterations.]

  39. A closer look at the data with t-SNE
