
Correlation-aware Deep Generative Model for Unsupervised Anomaly Detection
Haoyi Fan 1, Fengbin Zhang 1, Ruidong Wang 1, Liang Xi 1, Zuoyong Li 2
1 Harbin University of Science and Technology, 2 Minjiang University
isfanhy@hrbust.edu.cn


  1. Correlation-aware Deep Generative Model for Unsupervised Anomaly Detection. Haoyi Fan 1, Fengbin Zhang 1, Ruidong Wang 1, Liang Xi 1, Zuoyong Li 2. Harbin University of Science and Technology 1, Minjiang University 2. isfanhy@hrbust.edu.cn

  2. Background. Fig.: normal and anomalous samples in the observed space and the latent space.

  3. Background. Applications: Fraud Detection, Intrusion Detection, Disease Detection, Fault Detection. Image sources: https://www.explosion.com/135494/5-effective-strategies-of-fraud-detection-and-prevention-for-ecommerce/ ; https://towardsdatascience.com/building-an-intrusion-detection-system-using-deep-learning-b9488332b321 ; https://planforgermany.com/switching-private-public-health-insurance-germany/ ; https://blog.exporthub.com/working-with-chinese-manufacturers/

  4.–7. Background: Unsupervised Anomaly Detection from the Density Estimation Perspective (progressive build).
  Data samples: X_train = {x_1, x_2, ..., x_n}, where each x_i is assumed normal.
  Model: a density p(x).
  Test samples: X_test = {x_1, x_2, ..., x_n}, where x_t is unknown.
  If p(x_t) < λ, x_t is abnormal; if p(x_t) ≥ λ, x_t is normal.
  Anomalies reside in the low-probability-density areas.
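The thresholding rule above can be sketched in a few lines of NumPy. This is a minimal illustration, not the deck's deep model: a single Gaussian fitted to the training data stands in for the learned density p(x), and the threshold λ is set (arbitrarily, for illustration) to the 5th percentile of training densities.

```python
import numpy as np

rng = np.random.default_rng(0)
# Training samples, all assumed normal (a 2-D Gaussian cluster).
X_train = rng.normal(0.0, 1.0, size=(500, 2))

# "Model" p(x): a single Gaussian fitted by mean and covariance.
mu = X_train.mean(axis=0)
cov = np.cov(X_train.T)
cov_inv = np.linalg.inv(cov)
norm_const = 1.0 / np.sqrt(np.linalg.det(2 * np.pi * cov))

def p(x):
    d = np.asarray(x) - mu
    return norm_const * np.exp(-0.5 * d @ cov_inv @ d)

# Threshold lambda: here, the 5th percentile of training densities.
lam = np.percentile([p(x) for x in X_train], 5)

def is_anomaly(x):
    # x is abnormal if p(x) < lambda, normal if p(x) >= lambda.
    return p(x) < lam
```

A point near the training cluster falls in a high-density region and is kept as normal; a point far away lands in a low-density region and is flagged.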

  8. Background: correlation among data samples.
  Conventional anomaly detection: feature learning in the feature space only.
  Correlation-aware anomaly detection: feature learning plus graph modeling in the structure space.
  How to discover the normal pattern from both the feature level and the structural level?

  9. Problem Statement.
  Notations: G: graph; V: set of nodes in a graph; E: set of edges in a graph; N: number of nodes; F: dimension of attributes; A ∈ R^{N×N}: adjacency matrix of a network; X ∈ R^{N×F}: feature matrix of all nodes.
  Anomaly Detection: Given a set of input samples X = {x_i | i = 1, ..., N}, each of which is associated with an F-dimensional feature X_i ∈ R^F, we aim to learn a score function u(X_i): R^F ↦ R and to classify sample x_i based on the threshold λ:
  y_i = 1 if u(X_i) ≥ λ, 0 otherwise,
  where y_i denotes the label of sample x_i, with 0 being the normal class and 1 the anomalous class.

  10. Method: CADGMM. Components: Graph Construction, Dual-Encoder, Feature Decoder, Estimation network.

  11. Method: CADGMM, Graph Construction (K-nearest neighbours, e.g. K = 5).
  Original features: X = {x_i | i = 1, ..., N}
  Find neighbours by K-NN: N_i = {x_i^k | k = 1, ..., K}
  Model correlation as a graph: G = {V, E, X}, with
  V = {v_i = x_i | i = 1, ..., N} and E = {e_i^k = (v_i, v_i^k) | v_i^k ∈ N_i}.
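The graph-construction step above can be sketched with plain NumPy: treat each sample as a node and connect it to its K nearest neighbours. This is a small stand-in for the paper's pipeline; the function name `build_knn_graph` is mine, not from the paper.

```python
import numpy as np

def build_knn_graph(X, K=5):
    """Model sample correlation as a graph: nodes are samples,
    edges e_i^k = (v_i, v_i^k) connect each sample to its K
    nearest neighbours in feature space."""
    N = X.shape[0]
    # Pairwise Euclidean distances between all samples.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)               # exclude self-loops
    neighbours = np.argsort(dist, axis=1)[:, :K]  # N_i for each node i
    A = np.zeros((N, N))                          # adjacency matrix
    for i in range(N):
        A[i, neighbours[i]] = 1.0
    return A, neighbours

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
A, nbrs = build_knn_graph(X, K=5)
```

Each row of `A` then has exactly K nonzero entries, one per neighbour, and no self-loops.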

  12. Method: CADGMM, Dual-Encoder and Feature Decoder. Feature encoder: e.g. MLP, CNN, LSTM. Graph encoder: e.g. GAT. A feature decoder reconstructs the input.

  13. Method: CADGMM, Gaussian Mixture Model (estimation network).
  Initial embedding: Z.
  Membership: Z^(l) = σ(Z^(l−1) W^(l−1) + b^(l−1)), with Z^(0) = Z; M = Softmax(Z^(L)), M ∈ R^{N×C}.
  Parameter estimation:
  μ_c = (Σ_{i=1}^N M_{i,c} Z_i) / (Σ_{i=1}^N M_{i,c}),
  Σ_c = (Σ_{i=1}^N M_{i,c} (Z_i − μ_c)(Z_i − μ_c)^T) / (Σ_{i=1}^N M_{i,c}).
  Energy:
  E(Z) = −log Σ_{c=1}^C [ (Σ_{i=1}^N M_{i,c} / N) · exp(−(1/2)(Z − μ_c)^T Σ_c^{−1} (Z − μ_c)) / |2π Σ_c|^{1/2} ].
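The parameter-estimation and energy formulas above can be sketched directly in NumPy: given embeddings Z and soft memberships M produced by the estimation network, recover mixture weights, means, and covariances, then score each sample by its negative log-likelihood. A minimal sketch with my own function name `gmm_energy` and random memberships standing in for the network's softmax output.

```python
import numpy as np

def gmm_energy(Z, M):
    """Recover GMM parameters from soft memberships M (N x C)
    over embeddings Z (N x D), and return per-sample energy E(z)."""
    N, D = Z.shape
    C = M.shape[1]
    phi = M.sum(axis=0) / N                     # mixture weights
    mu = (M.T @ Z) / M.sum(axis=0)[:, None]     # component means mu_c
    dens = np.zeros(N)
    for c in range(C):
        diff = Z - mu[c]
        # Membership-weighted covariance Sigma_c.
        Sigma = (M[:, c, None, None] *
                 (diff[:, :, None] * diff[:, None, :])).sum(0) / M[:, c].sum()
        Sigma += 1e-6 * np.eye(D)               # numerical stability
        inv = np.linalg.inv(Sigma)
        norm = np.sqrt(np.linalg.det(2 * np.pi * Sigma))
        mahal = np.einsum('nd,de,ne->n', diff, inv, diff)
        dens += phi[c] * np.exp(-0.5 * mahal) / norm
    return -np.log(dens + 1e-12)                # high energy = anomalous

rng = np.random.default_rng(0)
Z = rng.normal(size=(50, 2))
M = rng.dirichlet(np.ones(3), size=50)          # softmax-like memberships
E = gmm_energy(Z, M)
```

In the model proper, M comes from the softmax head of the estimation network rather than a random draw, and the energy is both a training signal and the anomaly score.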

  14. Method: Loss and Anomaly Score.
  Loss function:
  L = ||X − X̂||²₂ + λ₁ E(Z) + λ₂ Σ_{c=1}^C Σ_{j=1}^d 1/(Σ_c)_{jj} + λ₃ ||Z||²₂
  (reconstruction error + energy + covariance penalty + embedding penalty).
  Anomaly score: Score = E(Z).
  Solution to the problem: y_i = 1 if u(X_i) ≥ λ, 0 otherwise, with λ chosen from the distribution of scores.
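Assembling the four loss terms and the thresholding rule is mechanical once the pieces exist. A minimal sketch; the λ values below are illustrative placeholders, not the paper's hyper-parameters, and `cadgmm_loss`/`classify` are my names.

```python
import numpy as np

def cadgmm_loss(X, X_rec, Z, energy, Sigmas, lam1=0.1, lam2=0.005, lam3=0.1):
    """Total loss: reconstruction error + lam1 * mean energy
    + lam2 * covariance penalty (sum of 1/Sigma_jj, which discourages
    degenerate covariances) + lam3 * embedding norm penalty."""
    rec = np.sum((X - X_rec) ** 2)
    cov_pen = sum(np.sum(1.0 / np.diag(S)) for S in Sigmas)
    emb_pen = np.sum(Z ** 2)
    return rec + lam1 * np.mean(energy) + lam2 * cov_pen + lam3 * emb_pen

def classify(scores, lam):
    # Anomaly rule: score = E(z); flag as anomalous when score >= lambda.
    return (scores >= lam).astype(int)   # 1 = anomalous, 0 = normal
```

At test time only `classify` is needed: the energy of each test sample is compared against a threshold λ read off the distribution of training scores.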

  15. Experiment setup. Datasets; Baselines: OC-SVM (Chen et al. 2001), IF (Liu et al. 2008), DSEBM (Zhai et al. 2016), DAGMM (Zong et al. 2018), AnoGAN (Schlegl et al. 2017), ALAD (Zenati et al. 2018); Evaluation metrics: Precision, Recall, F1-Score.

  16. Experiment Results. Consistent performance improvement!

  17. Experiment Results. Less sensitive to noisy data; more robust!

  18. Experiment Results. Fig.: impact of different K values of the K-NN algorithm in graph construction. Less sensitive to hyper-parameters; easy to use!

  19. Experiment Results. (a) DAGMM, (b) CADGMM. Fig.: embedding visualization on KDD99 (blue indicates the normal samples and orange the anomalies). Explainable and effective!

  20. Conclusion and Future Work.
  • Conventional feature-learning models cannot effectively capture the correlation among data samples for anomaly detection.
  • We propose a general representation-learning framework that models the complex correlation among data samples for unsupervised anomaly detection.
  • We plan to explore the correlation among samples for extremely high-dimensional data sources such as image or video.
  • We plan to develop an adaptive, learnable graph-construction module for more reasonable correlation modeling.

  21. References.
  • [OC-SVM] Chen, Y., Zhou, X.S., Huang, T.S.: One-class SVM for learning in image retrieval. ICIP 2001.
  • [IF] Liu, F.T., Ting, K.M., Zhou, Z.H.: Isolation forest. ICDM 2008.
  • [DSEBM] Zhai, S., Cheng, Y., Lu, W., Zhang, Z.: Deep structured energy based models for anomaly detection. ICML 2016.
  • [DAGMM] Zong, B., Song, Q., Min, M.R., Cheng, W., Lumezanu, C., Cho, D., Chen, H.: Deep autoencoding Gaussian mixture model for unsupervised anomaly detection. ICLR 2018.
  • [AnoGAN] Schlegl, T., Seeböck, P., Waldstein, S.M., Schmidt-Erfurth, U., Langs, G.: Unsupervised anomaly detection with generative adversarial networks to guide marker discovery. IPMI 2017.
  • [ALAD] Zenati, H., Romain, M., Foo, C.S., Lecouat, B., Chandrasekhar, V.: Adversarially learned anomaly detection. ICDM 2018.

  22. Thanks for listening! Contact: isfanhy@hrbust.edu.cn Home Page: https://haoyfan.github.io/
