

  1. Correlation-aware Deep Generative Model for Unsupervised Anomaly Detection
  Haoyi Fan 1, Fengbin Zhang 1, Ruidong Wang 1, Liang Xi 1, Zuoyong Li 2
  1 Harbin University of Science and Technology, 2 Minjiang University
  isfanhy@hrbust.edu.cn

  2. Background
  [Figure: normal and anomalous samples in the observed space and the latent space.]

  3. Background
  Application scenarios: Fraud Detection, Intrusion Detection, Disease Detection, Fault Detection.
  (Image credits: https://www.explosion.com/135494/5-effective-strategies-of-fraud-detection-and-prevention-for-ecommerce/ ; https://towardsdatascience.com/building-an-intrusion-detection-system-using-deep-learning-b9488332b321 ; https://planforgermany.com/switching-private-public-health-insurance-germany/ ; https://blog.exporthub.com/working-with-chinese-manufacturers/ )

  4.-7. Background — Unsupervised Anomaly Detection from the Density Estimation Perspective
  Data samples: X_train = {x_1, x_2, ..., x_n}, where every x_i is assumed normal.
  Model: a density estimate p(x) fitted on X_train.
  Test samples: X_test = {x_1, x_2, ..., x_n}, where the class of each x_t is unknown.
  Decision rule: if p(x_t) < λ, x_t is abnormal; if p(x_t) ≥ λ, x_t is normal.
  Key idea: anomalies reside in the low-probability-density areas of the latent space.
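This density-estimation view can be illustrated with a minimal numpy sketch. Here a single Gaussian stands in for the density model p(x), and the 3-sigma density threshold is an arbitrary illustrative choice, not something prescribed by the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Normal" training data; fit a single Gaussian as a toy density model p(x)
x_train = rng.normal(loc=0.0, scale=1.0, size=1000)
mu, sigma = x_train.mean(), x_train.std()

def p(x):
    # Gaussian density estimate learned from the training samples
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Threshold: the density value 3 sigma away from the mean (illustrative)
threshold = p(mu + 3 * sigma)

def is_abnormal(x_t):
    # Low-density test points are flagged as anomalies
    return p(x_t) < threshold
```

In the real method the single Gaussian is replaced by a learned deep generative model, but the decision rule stays the same: compare the estimated density (or an energy derived from it) against a threshold.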

  8. Background — Correlation among data samples
  Conventional pipeline: Feature Learning → Anomaly Detection, operating on the feature space only.
  Correlation-aware pipeline: Graph Modeling → Feature Learning → Anomaly Detection, operating on both the feature space and the structure space.
  Question: how to discover the normal pattern at both the feature level and the structural level?

  9. Problem Statement
  Notations: 𝒢: graph; 𝒱: set of nodes in a graph; ℰ: set of edges in a graph; N: number of nodes; F: dimension of the attribute; A ∈ ℝ^(N×N): adjacency matrix of a network; X ∈ ℝ^(N×F): feature matrix of all nodes.
  Anomaly detection: given a set of input samples 𝒳 = {x_i | i = 1, ..., N}, each associated with an F-dimensional feature x_i ∈ ℝ^F, we aim to learn a score function u(x_i): ℝ^F ↦ ℝ and classify sample x_i against a threshold λ:
  y_i = 1 if u(x_i) ≥ λ, and y_i = 0 otherwise,
  where y_i denotes the label of sample x_i, with 0 being the normal class and 1 the anomalous class.

  10. Method — CADGMM overview
  Pipeline: Graph Construction → Dual-Encoder → Feature Decoder → Estimation Network.

  11. Method — CADGMM: Graph Construction
  K-nearest neighbors (e.g., K = 5).
  Original features: 𝒳 = {x_i | i = 1, ..., N}.
  Find neighbors by K-NN: 𝒩_i = {x_i^k | k = 1, ..., K}.
  Model the correlation as a graph: 𝒢 = {𝒱, ℰ, X}, with 𝒱 = {v_i = x_i | i = 1, ..., N} and ℰ = {e_ik = (v_i, v_i^k) | v_i^k ∈ 𝒩_i}.
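The graph-construction step above can be sketched with a brute-force numpy K-NN. The function name `build_knn_graph` is illustrative, not the paper's code:

```python
import numpy as np

def build_knn_graph(X, k=5):
    """Build the K-NN correlation graph: each sample is a node,
    connected to its k nearest neighbors in feature space."""
    # Pairwise Euclidean distances between all samples
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)              # exclude self-loops
    neighbors = np.argsort(d, axis=1)[:, :k]  # k nearest per node
    edges = [(i, int(j)) for i in range(len(X)) for j in neighbors[i]]
    return neighbors, edges

# Tiny example: three nearby points and one far-away point
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0]])
neighbors, edges = build_knn_graph(X, k=2)
```

For large N one would use an approximate nearest-neighbor index rather than the O(N²) distance matrix shown here.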

  12. Method — CADGMM: Dual-Encoder and Decoder
  Feature encoder (e.g., MLP, CNN, or LSTM, depending on the data type), graph encoder (e.g., GAT), and a feature decoder for reconstruction.
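A heavily simplified sketch of a dual-encoder of this shape: a one-layer MLP as the feature encoder and a GAT-style attention aggregation as the graph encoder. All weights are random and untrained, and every name here is illustrative rather than the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def feature_encoder(X, W):
    # Stand-in for the feature encoder (MLP/CNN/LSTM): one linear layer + tanh
    return np.tanh(X @ W)

def graph_encoder(H, neighbors, a):
    # GAT-style layer (simplified): softmax attention over each node's
    # neighbors, then attention-weighted aggregation of neighbor embeddings
    Z = np.zeros_like(H)
    for i, nbrs in enumerate(neighbors):
        scores = np.array([a @ np.concatenate([H[i], H[j]]) for j in nbrs])
        alpha = np.exp(scores - scores.max())
        alpha /= alpha.sum()
        Z[i] = sum(w * H[j] for w, j in zip(alpha, nbrs))
    return Z

X = rng.normal(size=(4, 3))     # 4 nodes, 3-dim features
W = rng.normal(size=(3, 2))     # feature-encoder weights (untrained)
a = rng.normal(size=4)          # attention vector over concatenated pairs
neighbors = [[1, 2], [0, 2], [0, 1], [0, 1]]  # toy K-NN graph

H = feature_encoder(X, W)       # feature-level embedding
Z = graph_encoder(H, neighbors, a)  # structure-level embedding
```

In the full model the two embeddings are fused before decoding; a real GAT also uses a LeakyReLU on the attention scores and multiple heads, omitted here for brevity.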

  13. Method — CADGMM: Gaussian Mixture Model (estimation network)
  Initial embedding: Z.
  Membership prediction: Z^(l) = σ(Z^(l−1) W^(l−1) + b^(l−1)), with Z^(0) = Z; ℳ = Softmax(Z^(L)), ℳ ∈ ℝ^(N×M).
  Parameter estimation:
  μ_m = Σ_{i=1}^{N} ℳ_{i,m} z_i / Σ_{i=1}^{N} ℳ_{i,m},
  Σ_m = Σ_{i=1}^{N} ℳ_{i,m} (z_i − μ_m)(z_i − μ_m)^T / Σ_{i=1}^{N} ℳ_{i,m}.
  Energy:
  E(z) = −log Σ_{m=1}^{M} φ_m · exp(−(1/2) (z − μ_m)^T Σ_m^(−1) (z − μ_m)) / |2π Σ_m|^(1/2), where φ_m = Σ_{i=1}^{N} ℳ_{i,m} / N.
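The parameter-estimation and energy equations above can be sketched directly in numpy, assuming the embeddings Z and the softmax memberships ℳ are already produced by the networks. The function name `gmm_energy` is illustrative, and the small `eps` added to each covariance is a standard numerical-stability regularizer, not part of the slide's formulas:

```python
import numpy as np

def gmm_energy(Z, M, eps=1e-6):
    """Recover GMM parameters from soft memberships M (N x M_comp) over
    embeddings Z (N x d), and return the per-sample energy
    E(z) = -log sum_m phi_m N(z; mu_m, Sigma_m)."""
    N, d = Z.shape
    Nk = M.sum(axis=0)                 # soft counts per component
    phi = Nk / N                       # mixture weights phi_m
    mu = (M.T @ Z) / Nk[:, None]       # component means mu_m
    comps = np.zeros((N, len(Nk)))
    for m in range(len(Nk)):
        diff = Z - mu[m]
        # Weighted covariance Sigma_m, regularized for invertibility
        Sigma = (M[:, m, None] * diff).T @ diff / Nk[m] + eps * np.eye(d)
        inv, det = np.linalg.inv(Sigma), np.linalg.det(Sigma)
        quad = np.einsum('nd,de,ne->n', diff, inv, diff)  # Mahalanobis term
        comps[:, m] = phi[m] * np.exp(-0.5 * quad) / np.sqrt((2 * np.pi) ** d * det)
    return -np.log(comps.sum(axis=1) + eps)

rng = np.random.default_rng(0)
Z = rng.normal(size=(100, 2))                      # toy embeddings
logits = rng.normal(size=(100, 3))                 # toy estimation-net output
M = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)  # softmax
E = gmm_energy(Z, M)
```

Samples falling in low-density regions of the fitted mixture receive high energy, which is exactly what the anomaly score measures.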

  14. Method — Loss and Anomaly Score
  Loss function:
  L = ||X − X̂||²₂ + λ₁ E(Z) + λ₂ Σ_{m=1}^{M} Σ_{j=1}^{d} 1 / (Σ_m)_{jj} + λ₃ ||Z||²₂,
  i.e., reconstruction error + energy + covariance penalty + embedding penalty.
  Anomaly score: Score = E(Z).
  Solution to the problem statement: y_i = 1 if u(x_i) ≥ λ, and y_i = 0 otherwise, where λ is determined from the distribution of the anomaly scores.
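A sketch of how the four loss terms combine. The λ weights below are illustrative placeholders, not the values used in the paper:

```python
import numpy as np

def cadgmm_loss(X, X_hat, energy, Sigmas, Z, l1=0.1, l2=0.005, l3=0.005):
    """Illustrative combination of the four training-objective terms:
    reconstruction error + energy + covariance penalty + embedding penalty.
    The l1/l2/l3 weights are placeholders, not the paper's settings."""
    rec = np.mean(np.sum((X - X_hat) ** 2, axis=1))          # reconstruction error
    cov_pen = sum(np.sum(1.0 / np.diag(S)) for S in Sigmas)  # penalizes tiny diagonal
                                                             # entries (singular Sigma_m)
    emb_pen = np.mean(np.sum(Z ** 2, axis=1))                # keeps embeddings bounded
    return rec + l1 * np.mean(energy) + l2 * cov_pen + l3 * emb_pen
```

The covariance penalty deserves the comment it gets: without it, a mixture component can collapse onto a few points, driving a diagonal entry of Σ_m toward zero and the energy toward −∞.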

  15. Experiment Setup
  Datasets, baselines, and evaluation metrics (Precision, Recall, F1-Score).
  Baselines: OC-SVM (Chen et al., 2001), IF (Liu et al., 2008), DSEBM (Zhai et al., 2016), DAGMM (Zong et al., 2018), AnoGAN (Schlegl et al., 2017), ALAD (Zenati et al., 2018).

  16. Experiment Results
  Consistent performance improvement over the baselines!

  17. Experiment Results
  Less sensitive to noisy data — more robust!

  18. Experiment Results
  Fig.: Impact of different K values of the K-NN algorithm in graph construction.
  Less sensitive to hyper-parameters — easy to use!

  19. Experiment Results
  Fig.: Embedding visualization on KDD99, (a) DAGMM vs. (b) CADGMM (blue indicates the normal samples and orange the anomalies).
  Explainable and effective!

  20. Conclusion and Future Work
  • Conventional feature learning models cannot effectively capture the correlation among data samples for anomaly detection.
  • We propose a general representation-learning framework to model the complex correlation among data samples for unsupervised anomaly detection.
  • We plan to explore the correlation among samples for extremely high-dimensional data sources such as images and video.
  • We plan to develop an adaptive, learnable graph-construction module for more reasonable correlation modeling.

  21. References
  • [OC-SVM] Chen, Y., Zhou, X.S., Huang, T.S.: One-class SVM for learning in image retrieval. ICIP, 2001.
  • [IF] Liu, F.T., Ting, K.M., Zhou, Z.H.: Isolation forest. ICDM, 2008.
  • [DSEBM] Zhai, S., Cheng, Y., Lu, W., Zhang, Z.: Deep structured energy based models for anomaly detection. ICML, 2016.
  • [DAGMM] Zong, B., Song, Q., Min, M.R., Cheng, W., Lumezanu, C., Cho, D., Chen, H.: Deep autoencoding Gaussian mixture model for unsupervised anomaly detection. ICLR, 2018.
  • [AnoGAN] Schlegl, T., Seeböck, P., Waldstein, S.M., Schmidt-Erfurth, U., Langs, G.: Unsupervised anomaly detection with generative adversarial networks to guide marker discovery. IPMI, 2017.
  • [ALAD] Zenati, H., Romain, M., Foo, C.S., Lecouat, B., Chandrasekhar, V.: Adversarially learned anomaly detection. ICDM, 2018.

  22. Thanks
  Thanks for listening!
  Contact: isfanhy@hrbust.edu.cn | Home page: https://haoyfan.github.io/
