01 Mylobot, Detecting the Undetected Using Deep Learning Yael Daihes
02 WHO AM I Yael Daihes Security Data Science Team Lead @ Akamai Technologies My things - Botnets, Traffic, Data, Algorithms, Reading, Painting and a bit of Gaming (:
03 DNS Empire
04 AGENDA
1 Mylobot - What is it?
2 What Is DGA? How Has the Defense Community Tackled the DGA Problem So Far?
3 How Did We Tackle this Issue? Overview of Our Detection System
4 Results in the Wild
5 Mylobot - As we see it
05 Mylobot?
06 2018
07 Tons of Evasion Techniques
*Anti VM techniques
*Anti-sandbox techniques
*Anti-debugging techniques
*Wrapping internal parts with an encrypted resource file
*Code injection
*Process hollowing
*CNC communication delaying mechanism - 14 days before accessing its command and control servers
*DGA
[1] www.deepinstinct.com/2018/06/20/meet-mylobot-a-new-highly-sophisticated-never-seen-before-botnet-thats-out-in-the-wild/
08 Command and Control Dear Bot master, botmaster.com what do I do next? Send me a screen shot
09 Defense Dear Bot master, botmaster.com what do I do next? Send me a screen shot
10 What is DGA?
11 Day 1
Generated Domains | DNS Response
Asdiuouoi.top | NX
Whjrhkejwh.biz | NX
Hakjhsdkjh.top | 1.2.3.4 (CNC Channel for Sunday)
Hjkwrjkhew.biz | NX
... | ...
12 Day 2
Generated Domains | DNS Response
ycrxmen.com | 5.6.7.8 (CNC Channel for Monday)
ljfsmaroqok.com | NX
dtswwomss.eu | NX
gvzoutzukdzth.ru | NX
... | ...
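To make the Day 1 / Day 2 idea concrete, here is a minimal sketch of a date-seeded generator. It is a toy, not the algorithm of Mylobot or any real malware family: the function name `toy_dga`, the seed string, the TLD pool, and the hashing scheme are all assumptions made for illustration.

```python
import hashlib
from datetime import date
from typing import List

# Illustrative TLD pool (assumption, not taken from any real DGA)
TLDS = ["com", "biz", "top", "ru"]


def toy_dga(seed: str, day: date, count: int = 5) -> List[str]:
    """Return `count` pseudo-random domains derived from a seed and a date."""
    domains = []
    for i in range(count):
        material = f"{seed}-{day.isoformat()}-{i}".encode()
        digest = hashlib.sha256(material).hexdigest()
        # Map the first 12 hex digits to the letters a-p for a letters-only label
        label = "".join(chr(ord("a") + int(c, 16)) for c in digest[:12])
        tld = TLDS[int(digest[12:14], 16) % len(TLDS)]
        domains.append(f"{label}.{tld}")
    return domains


day_1 = toy_dga("demo-seed", date(2021, 1, 3))
day_2 = toy_dga("demo-seed", date(2021, 1, 4))
```

The bot and the bot master run the same function with the same seed, so both sides arrive at the same list each day; the bot master only needs to register one of the names for the channel to work.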
13 How has the defense community tackled the DGA problem so far?
14 Reverse the DGA code
Example - Code for creating "Simda" domain names, created by reversing the binary and implementing the logic in Python (github.com/baderj/domain_generation_algorithms)
*Could be used for creating the domain names and adding them to a block list (given the seed)
*How to find the seed? Cool ways that involve checking what's in the traffic and brute forcing possibilities with really strong computers
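The seed search mentioned above can be sketched roughly as follows, assuming the DGA logic has already been reversed. Everything here is hypothetical: `toy_dga` is a made-up generator (not Simda), the seed is a small integer, and the candidate range is tiny so the loop finishes instantly.

```python
from datetime import date
from typing import Iterable, List, Optional, Set


def toy_dga(seed: int, day: date, count: int = 5) -> List[str]:
    """Hypothetical reversed DGA: a small LCG keyed by seed and date."""
    state = (seed ^ day.toordinal()) & 0xFFFFFFFF
    domains = []
    for _ in range(count):
        label = []
        for _ in range(10):
            state = (1103515245 * state + 12345) & 0x7FFFFFFF
            label.append(chr(ord("a") + (state >> 16) % 26))
        domains.append("".join(label) + ".com")
    return domains


def find_seed(observed: Set[str], day: date, candidates: Iterable[int]) -> Optional[int]:
    """Return the first candidate seed whose output covers every observed domain."""
    for seed in candidates:
        if observed <= set(toy_dga(seed, day)):
            return seed
    return None


def build_blocklist(seed: int, days: Iterable[date]) -> Set[str]:
    """Pre-compute the domains the DGA will produce on the given days."""
    return {domain for day in days for domain in toy_dga(seed, day)}


today = date(2021, 1, 3)
observed_in_traffic = set(toy_dga(4242, today)[:2])  # pretend these were seen on the wire
recovered = find_seed(observed_in_traffic, today, range(10_000))
blocklist = build_blocklist(recovered, [date(2021, 1, 4), date(2021, 1, 5)])
```

With a realistic seed space the same loop is what needs the "really strong computers"; the structure stays the same.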
15 But.. How does that detect new malware? What about DGAs whose seed we can't break?
16 Our Goal - Detect DGA Domains
*Detect new domains of DGAs I know, but couldn't break
*Detect new DGAs never seen or reported before
*Bonus - Detect what I can break and is known
17 Let's Solve This Together Maybe we should check if the characters used in the domain name are basically.. gibberish?
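A minimal sketch of such a gibberish check, assuming two hand-picked heuristics (vowel ratio and longest consonant run) and arbitrary thresholds. This is not the speaker's detector, only an illustration of the naive first attempt, which the next slide points out attackers can adapt to.

```python
VOWELS = set("aeiou")


def second_level_label(domain: str) -> str:
    """Naively take the label left of the TLD (ignores suffixes like co.uk)."""
    parts = domain.lower().rstrip(".").split(".")
    return parts[-2] if len(parts) >= 2 else parts[0]


def vowel_ratio(label: str) -> float:
    """Share of vowels among the letters of the label."""
    letters = [ch for ch in label if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(ch in VOWELS for ch in letters) / len(letters)


def longest_consonant_run(label: str) -> int:
    """Length of the longest run of consecutive consonants."""
    longest = current = 0
    for ch in label:
        if ch.isalpha() and ch not in VOWELS:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def looks_gibberish(domain: str, min_vowel_ratio: float = 0.3, max_consonant_run: int = 4) -> bool:
    """Flag a domain when its label has few vowels or a long consonant run."""
    label = second_level_label(domain)
    return vowel_ratio(label) < min_vowel_ratio or longest_consonant_run(label) >= max_consonant_run
```

For example, `looks_gibberish("gvzoutzukdzth.ru")` is true while `looks_gibberish("facebook.com")` is false; a DGA that glues dictionary words together would pass this check untouched.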
18 Let's Solve This Together Attackers adapt..
19 Let's Solve This Together OK OK OK.. hold up - we're smarter. What does the 2-character distribution look like? Hooray!
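The 2-character idea can be sketched as a bigram model: count which character pairs appear in known-good names, then score a new label by how probable its pairs are. The tiny `BENIGN_SAMPLE` list, the add-one smoothing, and the function names are assumptions for illustration; a real system would train on far more data.

```python
import math
from collections import Counter
from typing import Iterable, Tuple

# Tiny illustrative sample of benign labels (assumption, not the talk's data)
BENIGN_SAMPLE = [
    "facebook", "google", "youtube", "wikipedia", "amazon", "twitter",
    "instagram", "linkedin", "netflix", "microsoft", "akamai", "weather",
]


def train_bigrams(words: Iterable[str]) -> Tuple[Counter, Counter]:
    """Count character pairs and how often each character starts a pair."""
    pair_counts: Counter = Counter()
    context_counts: Counter = Counter()
    for word in words:
        for a, b in zip(word, word[1:]):
            pair_counts[(a, b)] += 1
            context_counts[a] += 1
    return pair_counts, context_counts


def bigram_score(label: str, pair_counts: Counter, context_counts: Counter,
                 vocab_size: int = 26) -> float:
    """Mean log-probability of the label's character pairs (add-one smoothing)."""
    pairs = list(zip(label, label[1:]))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        p = (pair_counts[(a, b)] + 1) / (context_counts[a] + vocab_size)
        total += math.log(p)
    return total / len(pairs)


PAIRS, CONTEXTS = train_bigrams(BENIGN_SAMPLE)
benign_score = bigram_score("facebook", PAIRS, CONTEXTS)
dga_score = bigram_score("gvzoutzukdzth", PAIRS, CONTEXTS)
```

A higher (less negative) score means the label looks more like the benign sample. As with the gibberish check, a generator that imitates natural bigram statistics would defeat it.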
20 Let's Solve This Together Or not..?
21 Deep Learning to the Rescue
WHY DEEP LEARNING?
*Sequence model (char by char)
*Should be able to distinguish between a sequence generated artificially (DGA) and a sequence from a natural language
*Learns the patterns (by training)
Predicting Domain Generation Algorithms with Long Short-Term Memory Networks, Endgame - https://arxiv.org/abs/1611.00791
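A character-by-character sequence model needs each domain turned into a fixed-length list of integers first. The sketch below shows only that input step, under assumptions of my own: the alphabet, the padding id 0, the unknown-character id, and the maximum length of 32 are illustrative choices, not details from the talk or the cited paper.

```python
import string
from typing import List

# Assumed vocabulary: lowercase letters, digits, hyphen and dot
ALPHABET = string.ascii_lowercase + string.digits + "-."
CHAR_TO_ID = {ch: i + 1 for i, ch in enumerate(ALPHABET)}  # id 0 is reserved for padding
UNKNOWN_ID = len(ALPHABET) + 1
MAX_LEN = 32  # assumed fixed sequence length


def encode_domain(domain: str, max_len: int = MAX_LEN) -> List[int]:
    """Map a domain to integer ids, left-padded with zeros to `max_len`."""
    ids = [CHAR_TO_ID.get(ch, UNKNOWN_ID) for ch in domain.lower()]
    ids = ids[-max_len:]  # keep the rightmost characters if the name is too long
    return [0] * (max_len - len(ids)) + ids


encoded = encode_domain("ycrxmen.com")
```

Sequences like `encoded` would then be fed to an embedding layer followed by a recurrent layer such as an LSTM, which outputs a single score per domain.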
22 Training Data
*1.2 MILLION DGA DOMAINS - Generated by the reverse codes or captured in the wild [1]
*1.2 MILLION BENIGN DOMAINS - Captured from normal traffic
Results
*90% Model Accuracy
*Can't classify which malware family
[1] dgarchive.caad.fkie.fraunhofer.de
23 Visualization of Domains as seen by the Deep Learning model Intuition - The Challenges of Classifying the Malware Family Some clusters are separable Some clusters are too intertwined
24 Overview of the System
*DNS queries made by one user in a specified time window
*What does the Deep Learning model think? Model response - "How likely is this a DGA, between 0-1?"
facebook.com 0.0
oyeeedysb.com 0.999
gvzozukdzth.ru 0.91
goooogle.com 0.0
ycrxmen.com 0.999
ljfsaroqok.com 0.98
*Classify the domains detected: oyeeedysb.com, gvzozukdzth.ru, ycrxmen.com, ljfsaroqok.com
*Attribute to the specific malware: DYRE, UNKNOWN #1, LOCKY
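The flow on this slide can be sketched as a small function that groups queries per user and time window and keeps the domains the model scores highly. This is a sketch under assumptions: the 0.9 threshold, the one-hour window, and the tuple layout are my choices, and `stub_score` simply looks up the example scores from the slide in place of a trained model. The final attribution step (DYRE, LOCKY, unknown) is not covered.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

Query = Tuple[str, float, str]  # (user id, unix timestamp, queried domain)


def detect_dga_per_window(queries: Iterable[Query],
                          score: Callable[[str], float],
                          window_seconds: int = 3600,
                          threshold: float = 0.9) -> Dict[Tuple[str, int], List[str]]:
    """Collect, per (user, time window), the domains scored at or above the threshold."""
    detected: Dict[Tuple[str, int], List[str]] = defaultdict(list)
    for user, timestamp, domain in queries:
        if score(domain) < threshold:
            continue
        key = (user, int(timestamp // window_seconds))
        if domain not in detected[key]:
            detected[key].append(domain)
    return dict(detected)


# Example scores copied from the slide; a real system would call the model here
SLIDE_SCORES = {
    "facebook.com": 0.0,
    "oyeeedysb.com": 0.999,
    "gvzozukdzth.ru": 0.91,
    "goooogle.com": 0.0,
    "ycrxmen.com": 0.999,
    "ljfsaroqok.com": 0.98,
}


def stub_score(domain: str) -> float:
    """Stand-in for the deep learning model's 0-1 DGA likelihood."""
    return SLIDE_SCORES.get(domain, 0.0)


queries = [("user-1", 10.0 * i, domain) for i, domain in enumerate(SLIDE_SCORES)]
result = detect_dga_per_window(queries, stub_score)
```

Grouping by user and window matters because the later attribution step looks at the set of detected domains together rather than one name at a time.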
25 Results in the Wild
*2.5 million domains detected and blocked daily
*70 million DNS requests blocked daily
*~0% False Positives
*±8 unknown (zero day?) DGAs detected
26 MYLOBOT
27 How Does Mylobot Look in Traffic
Stage 1 - Mylobot (Downloader): Connect to C2s for grabbing and executing second stage
1 DNS query m8.zdrussle.ru to the DNS Server, answered with IP: x.x.x.x
[1] www.deepinstinct.com/2018/06/20/meet-mylobot-a-new-highly-sophisticated-never-seen-before-botnet-thats-out-in-the-wild/
[2] blog.centurylink.com/mylobot-continues-global-infections/
28 How Does Mylobot Look in Traffic
Stage 1 - Mylobot (Downloader): Connect to C2s for grabbing and executing second stage
2 IP: x.x.x.x - "Where do I need to go next?" "Go and get http://1.2.3.4/malware.gif"
29 How Does Mylobot Look in Traffic
Stage 1 - Mylobot (Downloader): Connect to C2s for grabbing and executing second stage
3 GET http://1.2.3.4/malware.gif from IP: 1.2.3.4 - Malware file (unknown)
30 How Does Mylobot Look in Traffic
4 Stage 2 - Run second malware
*Unknown malicious activity
*Reported to have been using Khalesi as second stage [1][2]
31 How Does Mylobot Look in Traffic
Stage 1 - Mylobot (Downloader): Connect to C2s for grabbing and executing second stage
1 DNS query m8.zdrussle.ru to the DNS Server, answered with IP: x.x.x.x
2 IP: x.x.x.x - "Where do I need to go next?" "Go and get http://1.2.3.4/malware.gif"
3 GET http://1.2.3.4/malware.gif from IP: 1.2.3.4 - Malware file (unknown)
4 Stage 2 - Run second malware
*Unknown malicious activity
*Reported to have been using Khalesi as second stage [1][2]
32 What Did We Detect in Traffic?
The first step: "DNS query m8.zdrussle.ru"
*The domain name is generated by a DGA, and the deep learning model detected it
*~1,400 domains detected
m<number between 0 and 43>.<domain generated by a DGA>.com|in|biz|org|net|me|cc|ru
33
34 What Did We Detect in Traffic?
The first step: "DNS query m8.zdrussle.ru"
*The domain name is generated by a DGA, and the deep learning model detected it
*~8,000 domains detected, of what we understand to be 4 variants
*The four variants differ in the DGA pattern used
1 m<number between 0 and 43>.<domain generated by a DGA>.com|in|biz|org|net|me|cc|ru
2 x<number between 0 and 43>.<domain generated by a DGA>.com|ru|net|org|bz|in|biz|su|eu|cc
3 green<number between 0 and 43>.<domain generated by a DGA>.com|ru|
4 v1.<domain generated by a DGA>.com|ru|net|org|bz|in|biz|su|eu|cc
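The variant patterns above translate fairly directly into regular expressions. The sketch below makes two assumptions that are not in the slides: the DGA-generated label is taken to be lowercase letters only, and the green variant uses just the two TLDs that are visible (its list appears cut off on the slide, so it is likely incomplete). The names `VARIANT_PATTERNS` and `match_variant` are hypothetical.

```python
import re
from typing import Optional

NUM_0_43 = r"(?:[0-9]|[1-3][0-9]|4[0-3])"  # integer between 0 and 43
DGA_LABEL = r"[a-z]+"                      # assumption: letters-only generated label

TLDS_M = "com|in|biz|org|net|me|cc|ru"
TLDS_X_V1 = "com|ru|net|org|bz|in|biz|su|eu|cc"
TLDS_GREEN = "com|ru"  # only the TLDs visible on the slide; the list looks truncated

VARIANT_PATTERNS = {
    "M": re.compile(rf"m{NUM_0_43}\.{DGA_LABEL}\.(?:{TLDS_M})"),
    "X": re.compile(rf"x{NUM_0_43}\.{DGA_LABEL}\.(?:{TLDS_X_V1})"),
    "Green": re.compile(rf"green{NUM_0_43}\.{DGA_LABEL}\.(?:{TLDS_GREEN})"),
    "V1": re.compile(rf"v1\.{DGA_LABEL}\.(?:{TLDS_X_V1})"),
}


def match_variant(domain: str) -> Optional[str]:
    """Return the name of the variant pattern the domain fits, or None."""
    name = domain.strip().lower().rstrip(".")
    for variant, pattern in VARIANT_PATTERNS.items():
        if pattern.fullmatch(name):
            return variant
    return None
```

For instance, `match_variant("m8.zdrussle.ru")` returns `"M"`. Matching the shape alone is not proof of infection; in the talk the patterns were observed on domains the model had already flagged.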
35 IOC MAP
As researched by us and by our findings
2019: X Variant, M Variant
36 IOC MAP
As researched by us and by our findings
2019: Green Variant, M Variant, V1 Variant
671 more domains resolving
37 IOC MAP
As researched by us and by our findings
2020: X Variant, M Variant
38 How Did It Look In Our Traffic Over Time
For comparison, DNS queries a day as seen in Akamai's traffic:
*Pykspa ~1 million
*Qsnatch ~1 million
*Emotet ~500k
*Gameover Zeus ~200k
39 How Did It Look In Our Traffic Over Time
Entities - could be a single user or a NAT
40 SUMMARY
*What's DGA: Piece of code some malwares have that generates domain names for forming a C&C channel
*Defense System: DGA detection in traffic. Trained a Deep Learning model and use it over live traffic
*Mylobot: Super active newly seen botnet this system detected, look out!
41 Takeaways
Mylobot - Countermeasures
*Monitor DNS traffic
*Investigate new patterns and domains
Threat Intelligence - Network Perspective
*We will publish IOCs
*DGA detection beasts are possible!
42 TWITTER Reach Out @Yael_Daihes