
Mylobot, Detecting the Undetected Using Deep Learning Yael Daihes - PowerPoint PPT Presentation



  1. 01 Mylobot, Detecting the Undetected Using Deep Learning Yael Daihes

  2. 02 WHO AM I: Yael Daihes, Security Data Science Team Lead @ Akamai Technologies. My things: Botnets, Traffic, Data, Algorithms, Reading, Painting and a bit of Gaming (:

  3. 03 DNS Empire

  4. 04 AGENDA: (1) Mylobot - What is it? (2) What Is DGA? How Has the Defense Community Tackled the DGA Problem So Far? (3) How Did We Tackle this Issue? Overview of Our Detection System (4) Results in the Wild (5) Mylobot - As we see it

  5. 05 Mylobot?

  6. 06 2018

  7. 07 Tons of Evasion Techniques: *Anti-VM techniques *Anti-sandbox techniques *Anti-debugging techniques *Wrapping internal parts with an encrypted resource file *Code injection *Process hollowing *CNC communication delaying mechanism - 14 days before accessing its command and control servers *DGA. [1] www.deepinstinct.com/2018/06/20/meet-mylobot-a-new-highly-sophisticated-never-seen-before-botnet-thats-out-in-the-wild/

  8. 08 Command and Control. Bot to botmaster.com: "Dear Bot master, what do I do next?" Bot master: "Send me a screen shot"

  9. 09 Defense. Bot to botmaster.com: "Dear Bot master, what do I do next?" Bot master: "Send me a screen shot"

  10. 10 What is DGA?

  11. 11 Day 1. Generated domains and their DNS responses: Asdiuouoi.top - NX; Whjrhkejwh.biz - NX; Hakjhsdkjh.top - 1.2.3.4 (CNC channel for Sunday); Hjkwrjkhew.biz - NX; ...

  12. 12 Day 2. Generated domains and their DNS responses: ycrxmen.com - 5.6.7.8 (CNC channel for Monday); ljfsmaroqok.com - NX; dtswwomss.eu - NX; gvzoutzukdzth.ru - NX; ...
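
The Day 1 / Day 2 behaviour on slides 11 and 12 can be sketched with a toy, date-seeded generator. This is a minimal illustration only: the hashing scheme, the `TLDS` list, and the names `generate_domains` and `seed` are assumptions made for the example, not the algorithm of Mylobot or any other real malware.

```python
import hashlib
from datetime import date
from typing import List

# Illustrative TLD pool; a real DGA would carry its own list.
TLDS = ["com", "biz", "top", "ru", "eu"]


def generate_domains(seed: str, day: date, count: int = 5) -> List[str]:
    """Derive `count` pseudo-random domains from a shared seed and the date."""
    domains = []
    for i in range(count):
        material = f"{seed}|{day.isoformat()}|{i}".encode()
        digest = hashlib.sha256(material).digest()
        length = 8 + digest[0] % 6  # label length between 8 and 13
        label = "".join(chr(ord("a") + b % 26) for b in digest[1:1 + length])
        tld = TLDS[digest[-1] % len(TLDS)]
        domains.append(f"{label}.{tld}")
    return domains


if __name__ == "__main__":
    # Bot and bot master run the same code, so both know the day's list.
    print(generate_domains("demo-seed", date(2020, 1, 5)))
    print(generate_domains("demo-seed", date(2020, 1, 6)))
```

Because the bot master only needs to register one of the day's domains, most lookups return NX and a single domain resolves, which is the pattern the two slides show.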

  13. 13 How has the defense community tackled the DGA problem so far?

  14. 14 Reverse the DGA code. Example - code for creating "Simda" domain names, created by reversing the binary and implementing the logic in Python (github.com/baderj/domain_generation_algorithms). It can be used to create the domain names and add them to a block list (given the seed). How to find the seed? Cool ways that involve checking what's in the traffic and brute-forcing possibilities with really strong computers
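
The seed search described on slide 14 can be sketched as follows, under stated assumptions: `toy_dga` is a hypothetical stand-in for a reversed DGA (it is not Simda's logic), the seed is assumed to be a small integer, and `recover_seed` and `build_blocklist` are illustrative names.

```python
import hashlib
from datetime import date
from typing import Iterable, List, Optional


def toy_dga(seed: int, day: date, count: int = 20) -> List[str]:
    """Hypothetical reversed DGA: NOT the real Simda algorithm."""
    return [
        hashlib.sha256(f"{seed}-{day.isoformat()}-{i}".encode()).hexdigest()[:12] + ".com"
        for i in range(count)
    ]


def recover_seed(observed: Iterable[str], day: date,
                 candidates: Iterable[int]) -> Optional[int]:
    """Return the first candidate seed whose output covers every observed domain."""
    wanted = set(observed)
    if not wanted:
        return None
    for seed in candidates:
        if wanted <= set(toy_dga(seed, day)):
            return seed
    return None


def build_blocklist(seed: int, days: Iterable[date]) -> List[str]:
    """Once the seed is known, pre-compute domains for the given days."""
    return [domain for day in days for domain in toy_dga(seed, day)]


if __name__ == "__main__":
    day = date(2020, 3, 1)
    seen_in_traffic = toy_dga(4242, day)[:3]  # pretend these were observed
    seed = recover_seed(seen_in_traffic, day, range(10_000))
    print(seed, build_blocklist(seed, [day])[:2])
```

A real seed space is usually far larger than the 10,000 candidates tried here, which is why the slide mentions really strong computers.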

  15. 15 But.. How does that detect new malware? What about DGAs whose seed we can't break?

  16. 16 Our Goal - Detect DGA Domains: Detect new domains of DGAs that are known and that I can break; Detect new DGAs never seen or reported before; Bonus - Detect DGAs I know, but couldn't break

  17. 17 Let's Solve This Together Maybe we should check if the characters used in the domain name are basically.. gibberish?
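
One simple way to make the "gibberish" check on slide 17 concrete is the Shannon entropy of the characters in the domain label. The sketch below is an assumption about what such a check could look like, not the detector used in the talk; `shannon_entropy` and `label_of` are illustrative names.

```python
import math
from collections import Counter


def label_of(domain: str) -> str:
    """Return the left-most label, e.g. 'google' for 'google.com'."""
    return domain.lower().split(".")[0]


def shannon_entropy(text: str) -> float:
    """Entropy in bits per character of the character distribution of `text`."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    for domain in ["google.com", "facebook.com", "gvzoutzukdzth.ru"]:
        print(domain, round(shannon_entropy(label_of(domain)), 3))
```

A score like this separates only the most random-looking names, and slide 18 notes that attackers adapt, so a fixed threshold on it is fragile.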

  18. 18 Let's Solve This Together Attackers adapt..

  19. 19 Let's Solve This Together. OK OK OK.. hold up - we're smarter. What does the 2-character distribution look like? Hooray!
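
The 2-character idea on slide 19 can be sketched as a smoothed bigram model fitted on benign names: a label whose character pairs were rarely seen gets a low average log-probability. The tiny training list, the `VOCAB_SIZE` constant, and the `BigramModel` class are assumptions made for the illustration.

```python
import math
from collections import Counter
from typing import Iterable

# 26 letters + 10 digits + hyphen + start/end markers (assumed alphabet).
VOCAB_SIZE = 39


class BigramModel:
    """Add-one smoothed character-bigram model over domain labels."""

    def __init__(self, benign_labels: Iterable[str]) -> None:
        self.pair_counts: Counter = Counter()
        self.first_counts: Counter = Counter()
        for label in benign_labels:
            padded = f"^{label.lower()}$"  # mark start and end of the label
            for a, b in zip(padded, padded[1:]):
                self.pair_counts[(a, b)] += 1
                self.first_counts[a] += 1

    def avg_log_prob(self, label: str) -> float:
        """Average log-probability per bigram; lower means less natural."""
        padded = f"^{label.lower()}$"
        total = 0.0
        for a, b in zip(padded, padded[1:]):
            numerator = self.pair_counts[(a, b)] + 1
            denominator = self.first_counts[a] + VOCAB_SIZE
            total += math.log(numerator / denominator)
        return total / (len(padded) - 1)


if __name__ == "__main__":
    benign = ["google", "facebook", "youtube", "amazon", "wikipedia",
              "twitter", "instagram", "linkedin", "netflix", "microsoft"]
    model = BigramModel(benign)
    for label in ["facebook", "whjrhkejwh"]:
        print(label, round(model.avg_log_prob(label), 3))
```

In practice the benign corpus would be far larger than ten names; the point is only that the score depends on which character pairs are common in real names.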

  20. 20 Let's Solve This Together Or not..?

  21. 21 Deep Learning to the Rescue. WHY DEEP LEARNING? *Sequence model (char by char) *Should be able to distinguish between a sequence generated artificially (DGA) and a sequence from a natural language *Learns the patterns (by training). Reference: Predicting Domain Generation Algorithms with Long Short-Term Memory Networks, Endgame - https://arxiv.org/abs/1611.00791
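
A char-by-char sequence model needs each domain turned into a fixed-length sequence of integer ids before it reaches the network. The sketch below shows only that preprocessing step; the alphabet, `MAX_LEN`, the left-padding choice, and the name `encode_domain` are assumptions, and the talk does not describe its actual input pipeline.

```python
import string
from typing import List

# Assumed alphabet: lowercase letters, digits, hyphen, dot, underscore.
ALPHABET = string.ascii_lowercase + string.digits + "-._"
CHAR_TO_ID = {ch: i + 1 for i, ch in enumerate(ALPHABET)}  # 0 is padding
UNKNOWN_ID = len(ALPHABET) + 1
MAX_LEN = 63  # hypothetical fixed input length


def encode_domain(domain: str, max_len: int = MAX_LEN) -> List[int]:
    """Map characters to ids, truncate to max_len, and left-pad with zeros."""
    ids = [CHAR_TO_ID.get(ch, UNKNOWN_ID) for ch in domain.lower()[:max_len]]
    return [0] * (max_len - len(ids)) + ids


if __name__ == "__main__":
    print(encode_domain("ycrxmen.com", max_len=16))
```

Sequences encoded this way would typically be fed to an embedding layer followed by a recurrent layer and a single sigmoid output giving the DGA score between 0 and 1.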

  22. 22 Training Data: 1.2 MILLION DGA DOMAINS, generated by the reverse codes or captured in the wild [1]; 1.2 MILLION BENIGN DOMAINS, captured from normal traffic. Results: 90% model accuracy; can't classify which malware family. [1] dgarchive.caad.fkie.fraunhofer.de

  23. 23 Visualization of Domains as seen by the Deep Learning model. Intuition - The Challenges of Classifying the Malware Family: some clusters are separable; some clusters are too intertwined

  24. 24 Overview of the System. DNS queries made by one user in a specified time window. What does the Deep Learning model think? Deep Learning model response - "How likely is this a DGA, between 0-1?": facebook.com 0.0; oyeeedysb.com 0.999; gvzozukdzth.ru 0.91; goooogle.com 0.0; ycrxmen.com 0.999; ljfsaroqok.com 0.98. Classify the domains detected (oyeeedysb.com, gvzozukdzth.ru, ycrxmen.com, ljfsaroqok.com) and attribute them to the specific malware: DYRE, UNKNOWN #1, LOCKY
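
The first half of the pipeline on slide 24 (score each query, keep what looks like a DGA, group per user) can be sketched as below. The scores are copied from the slide; the `THRESHOLD` value, the `user-1` identifier, and the function name `detect_per_user` are assumptions, and the family attribution step is not modelled.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

THRESHOLD = 0.9  # hypothetical cut-off on the model score

# (user, domain, model score) for one time window; scores as shown on the slide.
WINDOW: List[Tuple[str, str, float]] = [
    ("user-1", "facebook.com", 0.0),
    ("user-1", "oyeeedysb.com", 0.999),
    ("user-1", "gvzozukdzth.ru", 0.91),
    ("user-1", "goooogle.com", 0.0),
    ("user-1", "ycrxmen.com", 0.999),
    ("user-1", "ljfsaroqok.com", 0.98),
]


def detect_per_user(scored_queries: Iterable[Tuple[str, str, float]],
                    threshold: float = THRESHOLD) -> Dict[str, List[str]]:
    """Group domains whose score reaches the threshold by the querying user."""
    detected: Dict[str, List[str]] = defaultdict(list)
    for user, domain, score in scored_queries:
        if score >= threshold and domain not in detected[user]:
            detected[user].append(domain)
    return dict(detected)


if __name__ == "__main__":
    print(detect_per_user(WINDOW))
```

Grouping per user and time window is what lets the later step look at a set of detected domains together when attributing them to a malware family.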

  25. 25 Results in the Wild: 2.5 million domains detected and blocked daily; 70 million DNS requests blocked daily; ~0% False Positives; ±8 unknown (zero day?) DGAs detected

  26. 26 MYLOBOT

  27. 27 How Does Mylobot Look in Traffic. Stage 1 - Mylobot (Downloader): Connect to C2s for grabbing and executing second stage. Step 1: DNS query m8.zdrussle.ru to the DNS Server; response IP: x.x.x.x. [1] www.deepinstinct.com/2018/06/20/meet-mylobot-a-new-highly-sophisticated-never-seen-before-botnet-thats-out-in-the-wild/ [2] blog.centurylink.com/mylobot-continues-global-infections/

  28. 28 How Does Mylobot Look in Traffic. Stage 1 - Mylobot (Downloader): Connect to C2s for grabbing and executing second stage. Step 2: ask IP x.x.x.x "Where do I need to go next?"; reply: "Go and get http://1.2.3.4/malware.gif"

  29. 29 How Does Mylobot Look in Traffic. Stage 1 - Mylobot (Downloader): Connect to C2s for grabbing and executing second stage. Step 3: GET http://1.2.3.4/malware.gif from IP: 1.2.3.4; response: Malware file (unknown)

  30. 30 How Does Mylobot Look in Traffic. Step 4: Stage 2 - Run second malware. Unknown malicious activity. Reported to have been using Khalesi as second stage [1][2]. [1] www.deepinstinct.com/2018/06/20/meet-mylobot-a-new-highly-sophisticated-never-seen-before-botnet-thats-out-in-the-wild/ [2] blog.centurylink.com/mylobot-continues-global-infections/

  31. 31 How Does Mylobot Look in Traffic. Stage 1 - Mylobot (Downloader): Connect to C2s for grabbing and executing second stage. Step 1: DNS query m8.zdrussle.ru to the DNS Server; response IP: x.x.x.x. Step 2: ask IP x.x.x.x "Where do I need to go next?"; reply: "Go and get http://1.2.3.4/malware.gif". Step 3: GET http://1.2.3.4/malware.gif from IP: 1.2.3.4; response: Malware file (unknown). Step 4: Stage 2 - Run second malware. Unknown malicious activity. Reported to have been using Khalesi as second stage [1][2]. [1] www.deepinstinct.com/2018/06/20/meet-mylobot-a-new-highly-sophisticated-never-seen-before-botnet-thats-out-in-the-wild/ [2] blog.centurylink.com/mylobot-continues-global-infections/

  32. 32 What Did We Detect in Traffic? The first step: "DNS query m8.zdrussle.ru". The domain name is generated by a DGA, and the deep learning model detected it. ~1,400 domains detected, of the form m<number between 0 and 43>.<domain generated by a DGA>.com|in|biz|org|net|me|cc|ru

  33. 33

  34. 34 What Did We Detect in Traffic? The first step: "DNS query m8.zdrussle.ru". The domain name is generated by a DGA, and the deep learning model detected it. ~8,000 domains detected, of what we understand to be 4 variants. The four variants differ in the DGA pattern used: (1) m<number between 0 and 43>.<domain generated by a DGA>.com|in|biz|org|net|me|cc|ru (2) x<number between 0 and 43>.<domain generated by a DGA>.com|ru|net|org|bz|in|biz|su|eu|cc (3) green<number between 0 and 43>.<domain generated by a DGA>.com|ru| (4) v1.<domain generated by a DGA>.com|ru|net|org|bz|in|biz|su|eu|cc
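
The four naming patterns on slide 34 translate naturally into regular expressions for tagging domains by variant. This is a sketch under assumptions: the middle label is assumed to be letters, digits, or hyphens; the TLD list for the green variant is cut off in the slide text, so only the two TLDs that are visible (`com`, `ru`) are used; `classify_variant` is an illustrative name.

```python
import re
from typing import Optional

# Numeric prefix between 0 and 43, as stated on the slide.
_NUM = r"(?:[0-9]|[1-3][0-9]|4[0-3])"
# Assumed shape of the DGA-generated label.
_LABEL = r"[a-z0-9-]+"
_TLDS_M = "com|in|biz|org|net|me|cc|ru"
_TLDS_X_V1 = "com|ru|net|org|bz|in|biz|su|eu|cc"
_TLDS_GREEN = "com|ru"  # list is truncated in the source; only these two are known

VARIANT_PATTERNS = {
    "M": re.compile(rf"^m{_NUM}\.{_LABEL}\.(?:{_TLDS_M})$"),
    "X": re.compile(rf"^x{_NUM}\.{_LABEL}\.(?:{_TLDS_X_V1})$"),
    "Green": re.compile(rf"^green{_NUM}\.{_LABEL}\.(?:{_TLDS_GREEN})$"),
    "V1": re.compile(rf"^v1\.{_LABEL}\.(?:{_TLDS_X_V1})$"),
}


def classify_variant(domain: str) -> Optional[str]:
    """Return the variant name whose pattern matches the domain, if any."""
    name = domain.lower().rstrip(".")
    for variant, pattern in VARIANT_PATTERNS.items():
        if pattern.match(name):
            return variant
    return None


if __name__ == "__main__":
    for d in ["m8.zdrussle.ru", "x0.example.su", "v1.example.eu", "facebook.com"]:
        print(d, classify_variant(d))
```

A pattern match alone says nothing about whether the middle label is DGA-generated; in the talk that judgement comes from the deep learning model, so a filter like this would only be applied to domains the model has already flagged.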

  35. 35 IOC MAP (2019): X Variant, M Variant. As researched by us and by our findings

  36. 36 IOC MAP (2019): Green Variant, M Variant, V1 Variant; 671 more domains resolving. As researched by us and by our findings

  37. 37 IOC MAP (2020): X Variant, M Variant. As researched by us and by our findings

  38. 38 How Did It Look In Our Traffic Over Time. For comparison, DNS queries a day as seen in Akamai's traffic: Pykspa ~1 million; Qsnatch ~1 million; Emotet ~500k; Gameover Zeus ~200k

  39. 39 How Did It Look In Our Traffic Over Time. Entities - could be a single user or a NAT

  40. 40 SUMMARY. What's DGA: a piece of code some malware has that generates domain names for forming a C&C channel. Defense System - DGA detection in traffic: trained a Deep Learning model and used it over live traffic. Mylobot: super active, newly seen botnet this system detected, look out!

  41. 41 Takeaways. Mylobot - Countermeasures: Monitor DNS traffic; Investigate new patterns and domains. Threat Intelligence - Network Perspective: We will publish IOCs; DGA detection beasts are possible!

  42. 42 TWITTER Reach Out @Yael_Daihes
