
LogAnomaly: Unsupervised Detection of Sequential and Quantitative Anomalies in Unstructured Logs

Weibin Meng, Ying Liu, Yichen Zhu, Shenglin Zhang, Dan Pei, Yuqing Liu, Yihao Chen, Ruizhi Zhang, Shimin Tao, Pei Sun and Rong Zhou
2019/9/10


  1. LogAnomaly: Unsupervised Detection of Sequential and Quantitative Anomalies in Unstructured Logs
     Weibin Meng, Ying Liu, Yichen Zhu, Shenglin Zhang, Dan Pei, Yuqing Liu, Yihao Chen, Ruizhi Zhang, Shimin Tao, Pei Sun and Rong Zhou
     Slide footer: 2019/9/10, Weibin Meng

  2. Internet Services
     ■ The number of Internet services is growing rapidly
     ■ The Internet provides various types of services
     ■ Stability of services is becoming more important
     Traffic will increase more than three times
     [Chart: traffic by year. 2017: 122, 2018: 156, 2019: 201, 2020: 254, 2021: 319, 2022: 396. Source: Cisco VNI Global IP]

  3. Anomaly Detection
     ■ Anomalies will impact revenue and user experience.
     ■ Anomaly detection plays an important role in service management.

  4. Logs for Anomaly Detection
     ■ Logs are one of the most valuable data for anomaly detection
     Diverse
     ■ Logs record a vast range of runtime information
     General
     ■ Every service and device generates logs
     Unstructured logs:
     Types | Timestamps | Detailed messages
     Switch | Jul 10 19:03:03 | Interface te-1/1/59, changed state to down
     Supercomputer | Jun 4 6:45:50 | RAS KERNEL INFO 87 L3 EDRAM error(s) (dcr 0x0157) detected and corrected over 27362 seconds
     HDFS | Jun 8 13:42:26 | INFO dfs.DataNode$PacketResponder: PacketResponder 1 for block blk_-1608999687919862906 terminating
     Router | Jul 11 11:05:07 | Neighbour(rid:10.231.0.43, addr:10.231.39.61) on vlan23, changed state from Exchange to Loading

  5. Logs for Anomaly Detection
     scenario | pattern | detection
     Single log anomaly | A single log can reflect an anomaly, e.g., “power down” | Keywords & regular expressions
     Quantitative anomalies | A change in the number of multiple logs can reflect anomalies, e.g., num(down) != num(up) | Based on log sequence
     Sequential anomalies | A change in the sequence of multiple logs can reflect anomalies, e.g., OSPF failed to start | Based on log sequence
     Our work focuses on log sequence anomaly detection
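The first two scenarios on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the keyword pattern, the function names, and the sample window are assumptions made up for the example, and the count check simply compares the number of "down" and "up" state changes in a window.

```python
import re

# Hypothetical keyword pattern; a real deployment would maintain its own list.
KEYWORD_PATTERN = re.compile(r"power down", re.IGNORECASE)


def single_log_anomalies(logs):
    """Return the logs that match an anomaly keyword on their own."""
    return [log for log in logs if KEYWORD_PATTERN.search(log)]


def quantitative_anomaly(logs):
    """Flag a window of logs in which num(down) != num(up)."""
    downs = sum(1 for log in logs if log.endswith("changed state to down"))
    ups = sum(1 for log in logs if log.endswith("changed state to up"))
    return downs != ups


# Illustrative window: two "down" events but only one "up" event.
window = [
    "Interface ae3, changed state to down",
    "Interface ae3, changed state to up",
    "Interface ae1, changed state to down",
]

print(single_log_anomalies(window))
print(quantitative_anomaly(window))
```

No single log in the window matches the keyword, yet the window as a whole is flagged, which is the distinction the slide draws between single-log and log-sequence detection.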

  6. Manual Detection
     ■ The explosion of logs
       • e.g., 10T/day in Huawei
     ■ Not all anomalies are explicitly displayed
       • Some anomalies hide in log sequence.
     ■ An operator has incomplete information of the overall system
     Automatically detect anomalies based on unstructured logs

     Workflow of OSPF (a network protocol) startup:
       Down → Attempt → Init → Two-way → Exstart → Exchange → Loading → Full
     Runtime logs:
       OSPF ADJCHG, Nbr 1.1.1.1 on FastEthernet0/0 from Attempt to Init
       OSPF ADJCHG, Nbr 1.1.1.1 on FastEthernet0/0 from Init to Two-way
       OSPF ADJCHG, Nbr 1.1.1.1 on FastEthernet0/0 from Two-way to Exstart
       OSPF ADJCHG, Nbr 1.1.1.1 on FastEthernet0/0 from Two-way to Exstart
     Every log is normal, but OSPF failed to start

     Quantitative relationship of interface flapping:
       num(interface down) = num(interface up)
     Runtime logs:
       Line protocol on Interface ae3, changed state to down
       Interface ae3, changed state to down
       Interface ae3, changed state to up
     An interface down event occurs
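The OSPF example on this slide can be checked mechanically. The sketch below is a deliberately strict toy, not how LogAnomaly works: it assumes every transition must advance exactly one step in the workflow listed on the slide and must start from the state the previous log ended in. The function name and the regular expression are assumptions for illustration.

```python
import re

# Expected OSPF neighbor state order, as listed on the slide.
WORKFLOW = ["Down", "Attempt", "Init", "Two-way", "Exstart",
            "Exchange", "Loading", "Full"]

# Captures the "from X to Y" tail of an ADJCHG log line.
TRANSITION = re.compile(r"from (\S+) to (\S+)$")


def first_sequence_violation(logs):
    """Return the index of the first log that breaks the expected order, or None.

    Assumes every state name in the logs appears in WORKFLOW.
    """
    expected_from = None
    for i, log in enumerate(logs):
        match = TRANSITION.search(log)
        if match is None:
            continue  # not a state-transition log
        src, dst = match.groups()
        # Toy rule 1: a transition advances exactly one step in the workflow.
        step_ok = WORKFLOW.index(dst) == WORKFLOW.index(src) + 1
        # Toy rule 2: it starts from the state the previous transition reached.
        chain_ok = expected_from is None or src == expected_from
        if not (step_ok and chain_ok):
            return i
        expected_from = dst
    return None


logs = [
    "OSPF ADJCHG, Nbr 1.1.1.1 on FastEthernet0/0 from Attempt to Init",
    "OSPF ADJCHG, Nbr 1.1.1.1 on FastEthernet0/0 from Init to Two-way",
    "OSPF ADJCHG, Nbr 1.1.1.1 on FastEthernet0/0 from Two-way to Exstart",
    "OSPF ADJCHG, Nbr 1.1.1.1 on FastEthernet0/0 from Two-way to Exstart",
]
print(first_sequence_violation(logs))  # 3: the repeated Two-way to Exstart line
```

Each of the four lines is a valid transition on its own; only the fourth is flagged, because the neighbor had already reached Exstart. A hand-written rule like this has to be maintained per protocol, which is the motivation the slide gives for learning sequence patterns automatically.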

  7. Previous studies
     ■ Existing log anomaly detection:
       • Quantitative pattern based methods
       • Sequential pattern based methods
     ■ Methods cited on the slide: PreFix (SIGMETRICS’18), PCA (SOSP’09), LogCluster (ICSE’16), IM (ATC’10), DeepLog (CCS’17)
     ■ Only comparing template indexes loses the information hidden in template semantics

     Templates (log keys):
       T1. Interface *, changed state to down
       T2. Vlan-interface *, changed state to down
       T3. Interface *, changed state to up
       T4. Vlan-interface *, changed state to up
     Logs:
       L1. Interface ae3, changed state to down
       L2. Vlan-interface vlan22, changed state to down
       L3. Interface ae3, changed state to up.
       L4. Interface ae1, changed state to down
       L5. Vlan-interface vlan22, changed state to up
       L6. Interface ae1, changed state to up
     Logs -> Template indexes: L1->T1, L2->T2, L3->T3, L4->T1, L5->T4, L6->T3
     Log template index sequence: T1, T2, T3, T1, T4, T3

     Quantitative anomalies detection methods (sliding/session windows, count matrix):
              v1  v2  v3  v4
       Cj      1   1   1   0
       Cj+1    1   1   1   0
       Cj+2    1   0   1   1
       Cj+3    1   0   1   1

     Sequential anomalies detection methods (sliding/session windows), DeepLog (CCS’17):
       sequence      next
       [v1 v2 v3]    v1
       [v2 v3 v1]    v4
       [v3 v1 v4]    v3
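The preprocessing shown on this slide (logs to template indexes, then a count matrix or sequence/next pairs) can be reproduced with a short Python sketch. The regular expressions, the function names, and the window size of 3 are assumptions chosen to match the slide's example; the cited methods use their own template extraction and windowing.

```python
import re
from collections import Counter

# Templates T1..T4 from the slide; "*" is written as \S+ (an assumption).
TEMPLATES = [
    ("T1", re.compile(r"Interface \S+, changed state to down")),
    ("T2", re.compile(r"Vlan-interface \S+, changed state to down")),
    ("T3", re.compile(r"Interface \S+, changed state to up")),
    ("T4", re.compile(r"Vlan-interface \S+, changed state to up")),
]


def to_template_index(log):
    """Map one log line to its template index, or None if nothing matches."""
    for name, pattern in TEMPLATES:
        if pattern.fullmatch(log):
            return name
    return None


def count_matrix(sequence, window):
    """One row of per-template counts for every sliding window."""
    names = [name for name, _ in TEMPLATES]
    rows = []
    for start in range(len(sequence) - window + 1):
        counts = Counter(sequence[start:start + window])
        rows.append([counts[name] for name in names])
    return rows


def next_pairs(sequence, window):
    """(window, next index) pairs, the input shape for next-log prediction."""
    return [(sequence[i:i + window], sequence[i + window])
            for i in range(len(sequence) - window)]


logs = [
    "Interface ae3, changed state to down",
    "Vlan-interface vlan22, changed state to down",
    "Interface ae3, changed state to up",
    "Interface ae1, changed state to down",
    "Vlan-interface vlan22, changed state to up",
    "Interface ae1, changed state to up",
]
sequence = [to_template_index(log) for log in logs]
print(sequence)
print(count_matrix(sequence, window=3))
print(next_pairs(sequence, window=3))
```

Both representations keep only the index of each template. T1 and T3 differ only in "down" versus "up", but nothing in the index sequence or the count matrix records that relationship.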

  8. Challenges
     ■ Valuable information could be lost if only log template index is used.
       • Some templates are similar in semantics but different in indexes
     ■ Services can generate new log templates between two re-trainings
       • Existing approaches cannot address this problem
     ■ Existing methods cannot detect sequential and quantitative anomalies simultaneously.

  9. Overview of LogAnomaly
     An anomaly detection system based on unstructured logs
     [Figure: LogAnomaly architecture.
      Offline learning: Historical logs, Extract, Templates, Synonyms & Antonyms, template2Vec, Word Vectors, Template Vectors, Match, Template Vector sequence, Count Vector sequence, LSTM, Attention, Classification, Model.
      Online detection: Real-time logs, Match, Existing Vectors, Temporary Templates, Temporary Vectors, Comparison, Update Template Vector sequence, Output.]

  10. Template Representation
      Address the first challenge and save template semantics.
      [Figure: the LogAnomaly architecture diagram from slide 9, repeated.]

  11. Template Representations
      Insights
      ■ Some existing templates have similar semantics
      ■ Some logs containing antonyms look similar but have opposite semantics
      Goals
      ■ Convert log templates to “soft” representations
      ■ Take antonyms and synonyms into consideration

      Logs:
        1. Interface ae3, changed state to down
        2. Vlan-interface vlan22, changed state to down
        3. Interface ae3, changed state to up
        4. Vlan-interface vlan22, changed state to up
        5. Interface ae1, changed state to down
        6. Vlan-interface vlan20, changed state to down
        7. Interface ae1, changed state to up
        8. Vlan-interface vlan20, changed state to up
      Templates:
        1. Interface *, changed state to down
        2. Vlan-interface *, changed state to down
        3. Interface *, changed state to up
        4. Vlan-interface *, changed state to up
      Logs -> Templates:
        L1->T1, L2->T2, L3->T3, L4->T4, L5->T1, L6->T2, L7->T3, L8->T4

  12. Template2Vec
      ■ template2Vec (template representation method):
        1. Construct the set of synonyms and antonyms
           • Combine domain knowledge and WordNet
        2. Generate word vectors by using the dLCE [1] algorithm
           • dLCE is a distributional lexical-contrast embedding model
        3. Calculate template vectors.

      Syns&Ants:
      Relations | Word pairs | Adding methods
      Synonyms | down, low | WordNet
      Synonyms | Interface, port | Operators
      Antonyms | DOWN, UP | WordNet
      Antonyms | powerDown, powerOn | Operators

      [Figure: (1) synonyms and antonyms feed (2) word vectors, e.g., Interface [x1,…,xn], changed [x1,…,xn]; (3) templates T1 … Tn+1, e.g., "Interface * changed state to up", are converted to template vectors V1 … Vn+1, each [x1,…,xn].]

      [1] Kim Anh Nguyen, Sabine Schulte im Walde, and Ngoc Thang Vu. Integrating distributional lexical contrast into word embeddings for antonym-synonym distinction. arXiv preprint arXiv:1605.07766, 2016.
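Step 3 above can be illustrated with a small sketch. The slide does not say how word vectors are combined, so plain averaging is an assumption here, and the 2-dimensional word vectors are invented toy values rather than dLCE output. They are chosen so that the antonyms "down" and "up" point in different directions, which is the property the synonym/antonym step is meant to provide.

```python
# Toy 2-dimensional word vectors, invented for illustration only
# (not produced by dLCE). "down" and "up" are deliberately far apart.
WORD_VECTORS = {
    "interface": [0.9, 0.1],
    "changed": [0.2, 0.3],
    "state": [0.1, 0.4],
    "to": [0.0, 0.0],
    "down": [-0.8, 0.5],
    "up": [0.8, 0.5],
}


def template_vector(template):
    """Average the vectors of the known words in a template (an assumption)."""
    words = [word.strip(",.").lower() for word in template.split()]
    vectors = [WORD_VECTORS[word] for word in words if word in WORD_VECTORS]
    if not vectors:
        raise ValueError("template contains no known words")
    dim = len(vectors[0])
    return [sum(vec[d] for vec in vectors) / len(vectors) for d in range(dim)]


v_down = template_vector("Interface *, changed state to down")
v_up = template_vector("Interface *, changed state to up")
print(v_down)
print(v_up)
```

The wildcard `*` has no word vector and is skipped. The two templates share every other word, so their vectors differ only through the "down"/"up" component, which an index such as T1 versus T3 cannot express.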

  13. Template Approximation
      A mechanism to address new templates at runtime
      [Figure: the LogAnomaly architecture diagram from slide 9, repeated; the online comparison step is labelled "Similarity comparison".]

  14. Template Approximation
      Between two re-trainings
      ■ Extract a temporary template for a log of a new type
      ■ Map the temporary template vector onto one of the existing vectors
      [Figure: Template Approximation Between Two Consecutive Trainings. Offline: Templates, Word Vectors, Template Vectors. Online: Real-time logs, Temporary Templates, Temporary Vectors, Existing Vectors.]
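The mapping step can be sketched as a nearest-neighbor lookup. The slides only say "similarity comparison", so cosine similarity is an assumption here, and the vectors and function names are toy values invented for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def approximate(temp_vector, existing):
    """Return the name of the existing template vector closest to temp_vector."""
    return max(existing, key=lambda name: cosine(temp_vector, existing[name]))


# Toy existing template vectors (hypothetical values).
existing = {
    "T1": [0.08, 0.36],  # e.g. "Interface *, changed state to down"
    "T3": [0.40, 0.36],  # e.g. "Interface *, changed state to up"
}

# Vector of a temporary template extracted from a new type of log.
temporary = [0.35, 0.30]
print(approximate(temporary, existing))  # T3
```

Between two re-trainings the model only knows the existing vectors, so the new log is treated as an instance of its closest existing template until the next re-training adds the new template properly.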
