Poking the Bear: Lessons Learned from Probing Three Android Malware Datasets
Aleieldin Salem and Alexander Pretschner
Technische Universität München, Garching bei München
{salem, pretschn}@in.tum.de
Montpellier, 04.09.2018

Abstract
• Stumbled upon some inconsistencies while experimenting with different Android malware datasets
• Investigate the source of discrepancies
• A series of experiments performed on three Android malware datasets
• Some (interesting) findings
Alei Salem (TUM) | A-Mobile 2018 | Montpellier, France

Background
• Working on a solution based on “Active Learning”
• Evaluating on Malgenome vs. Piggybacking
  • Datasets of Repackaged/Piggybacked Malware
  • Malgenome = great results!
  • Piggybacking = mediocre results?
• Trying on AMD and Drebin
  • Works like a charm!
• What the .. ?

Research Questions

Dissection Experiments
• Infer some information about the malicious instances found in:
  • Malgenome (Zhou et al. 2012)
  • Piggybacking (Li et al. 2017)
  • AMD (Wei et al. 2017)
• VirusTotal detection rates, involved marketplaces, malware types, etc.
• Backed up by information in Euphony (Hurier et al. 2017)

Dissection Experiments
• Backed up by information in Euphony (Hurier et al. 2017)
[Figure not recovered; surviving annotation: “around 50”]
More information: https://androidmalwareinsights.github.io

Dissection Experiments (cont'd)
• What about repackaging?
  • What is in fact the definition of repackaging?
  • E.g., must the app be decompiled/disassembled?
  • Wei et al. [authors of AMD] claim it has been declining
• How to quickly infer whether an app is repackaged?
  • Simple technique using compiler fingerprinting (with APKiD¹)
¹ https://rednaga.io/2016/07/31/detecting_pirated_and_malicious_android_apps_with_apkid/

Dissection Experiments (cont'd)
• Simple technique using compiler fingerprinting (with APKiD¹)
  • Legitimate developer = access to source code = using IDE
  • Compile app using Android SDK’s dx and dexmerge compilers
  • If app compiled using other compilers (e.g., dexlib) = repackaged = no access to source code != legitimate developer?
  • Different compilers leave unique marks on the compiled code
¹ https://rednaga.io/2016/07/31/detecting_pirated_and_malicious_android_apps_with_apkid/
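The compiler-fingerprinting heuristic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the compiler labels are assumed to have been extracted beforehand (for example with APKiD), and the function name, the input format (one label string per DEX file), and the first-word matching rule are hypothetical choices, not APKiD's API or the authors' code.

```python
# Hypothetical helper: the input is a list of compiler labels, one per DEX
# file in an APK, obtained beforehand from a fingerprinting tool.

# Compilers shipped with the Android SDK, as named on the slide.
SDK_COMPILERS = ("dx", "dexmerge")


def looks_repackaged(compiler_labels):
    """Return True if any DEX file was built by a non-SDK compiler.

    An empty list yields False: without a fingerprint there is no
    evidence either way.
    """
    for label in compiler_labels:
        name = label.strip().lower()
        # Labels may carry a version suffix (e.g. "dexlib 2.x"), so only
        # the first word is compared against the SDK compiler names.
        first_word = name.split()[0] if name else ""
        if first_word and first_word not in SDK_COMPILERS:
            return True
    return False
```

For example, `looks_repackaged(["dx", "dexlib 2.x"])` flags the app, while `looks_repackaged(["dx", "dexmerge"])` does not. As the question mark on the slide suggests, a non-SDK compiler is a hint rather than proof of repackaging.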

Dissection Experiments (cont'd)
• What about repackaging?
• What is in fact the definition of repackaging?
[Figure not recovered; surviving annotations: “lazy developers?”, “wrong labeling?”]

Dissection Experiments (cont'd)
• What about repackaging?
• What is in fact the definition of repackaging?
[Figure not recovered; surviving annotations: “86% repackaged?!”, “declining?”]

Detection Experiments
• How do conventional detection techniques fare against different datasets?
• Conventional:
  • Machine learning classifiers
  • Trained with static/dynamic features
  • Validated using K-fold CV

Detection Experiments
• How do conventional detection techniques fare against different datasets?
• Ensemble classifier
  • KNN, with K = {10, 25, 50, 100, 250, 500}
  • Random Forests with estimators = {10, 25, 50, 75, 100}
  • Support Vector Machine with linear kernel
• 10-Fold CV
• Trained with static/dynamic features
  • Static: Extracted from APK using androguard
  • Dynamic: Running apps within VM + recording issued API calls
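One way to assemble such an ensemble is sketched below with scikit-learn. The hyperparameter values come from the slide; everything else is an assumption made for illustration: combining the members by hard (majority) voting, the helper names `build_voting_ensemble` and `ten_fold_accuracy`, and the synthetic data standing in for the static/dynamic feature vectors.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC


def build_voting_ensemble():
    """Majority-vote ensemble over the configurations listed on the slide."""
    members = []
    for k in (10, 25, 50, 100, 250, 500):
        members.append((f"knn_{k}", KNeighborsClassifier(n_neighbors=k)))
    for n in (10, 25, 50, 75, 100):
        members.append(
            (f"rf_{n}", RandomForestClassifier(n_estimators=n, random_state=0))
        )
    members.append(("svm_linear", SVC(kernel="linear")))
    # Hard voting is an assumption: each member casts one vote per sample.
    return VotingClassifier(estimators=members, voting="hard")


def ten_fold_accuracy(X, y):
    """Accuracy of the ensemble on each fold of a 10-fold cross-validation."""
    return cross_val_score(build_voting_ensemble(), X, y, cv=10, scoring="accuracy")


if __name__ == "__main__":
    # Synthetic stand-in for real feature vectors. Each training fold must
    # hold at least 500 samples, otherwise KNN with K = 500 cannot be fitted.
    X, y = make_classification(n_samples=700, n_features=20, random_state=0)
    scores = ten_fold_accuracy(X, y)
    print(f"mean accuracy over {len(scores)} folds: {scores.mean():.3f}")
```

Note the constraint that K = 500 places on dataset size: with 10 folds, the dataset needs at least 556 samples so that every training fold contains 500 or more.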

Detection Experiments
• How do conventional detection techniques fare against different datasets?
• But why?
  • Piggybacking = original, benign apps + repackaged, malicious versions
  • Majority = Adware
  • ~70% of misclassified apps = Adware

Detection Experiments (cont'd)
• What is the lifespan of malware datasets?
• Can we use an old/new dataset to detect newer/older datasets?
• Train voting classifier using dataset A, and test using dataset B
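The train-on-A, test-on-B protocol can be sketched as follows. The helper names and the `MajorityClassifier` stand-in are hypothetical and exist only to keep the example self-contained; any model with scikit-learn style `fit` and `predict` methods, such as a voting classifier, could be passed instead.

```python
from itertools import permutations


def cross_dataset_accuracy(model, train, test):
    """Fit on one dataset and return the accuracy on another.

    `train` and `test` are (X, y) pairs; `model` is any object exposing
    fit(X, y) and predict(X).
    """
    X_train, y_train = train
    X_test, y_test = test
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    correct = sum(1 for p, t in zip(predictions, y_test) if p == t)
    return correct / len(y_test)


def all_pairs(make_model, datasets):
    """Evaluate every ordered (train, test) pair of named datasets."""
    return {
        (a, b): cross_dataset_accuracy(make_model(), datasets[a], datasets[b])
        for a, b in permutations(datasets, 2)
    }


class MajorityClassifier:
    """Trivial stand-in: always predicts the most frequent training label."""

    def fit(self, X, y):
        labels = list(y)
        self.label_ = max(set(labels), key=labels.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]
```

Calling `all_pairs` with a dictionary such as `{"Malgenome": (X1, y1), "AMD": (X2, y2)}` returns one accuracy per ordered pair, which covers both the old-to-new and the new-to-old direction.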

Adversarial Experiments
• How can an adversary make use of this?
• Consider a marketplace using an ML classifier as its “bouncer”
  • The classifier is trained using malicious + benign apps
• If I [adversary] figure out one (or more) of the benign apps
  • Repackage benign apps + upload to marketplace
  • Classifier will be confused!!

Adversarial Experiments (cont'd)
• How can an adversary make use of this?
• If I [adversary] figure out one (or more) of the benign apps
  • Many people presume apps on Google Play to be benign
  • Use Google Play apps as benchmark/reference for benign behaviors
  • Adversaries make the same assumption!

Adversarial Experiments (cont'd)
• Piggybacking dataset = benign apps + repackaged versions
• Train voting classifier with dataset A, and test with dataset B
• Observe the effect of adding “Original” segment of Piggybacking on classification accuracy
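The intuition behind this experiment can be shown with a deliberately small toy example. It is not the authors' setup: a 1-nearest-neighbour rule replaces the voting classifier, and the feature sets (permission names) and helper names are invented purely for illustration.

```python
def hamming(a, b):
    """Number of features present in exactly one of the two apps."""
    return len(a ^ b)


def predict_1nn(training_set, features):
    """Label of the closest training app (1-nearest-neighbour)."""
    _, label = min(training_set, key=lambda item: hamming(item[0], features))
    return label


# Invented feature sets, standing in for "dataset A".
dataset_a = [
    (frozenset({"INTERNET", "SEND_SMS", "READ_CONTACTS"}), "malicious"),
    (frozenset({"INTERNET", "CAMERA"}), "benign"),
]

# A benign "Original" app and its piggybacked (repackaged) version, which
# differs only by the injected payload's extra feature.
original = frozenset(
    {"INTERNET", "ACCESS_FINE_LOCATION", "READ_CONTACTS", "VIBRATE"}
)
piggybacked = original | {"SEND_SMS"}

# Train on dataset A alone, then on dataset A plus the benign original.
without_original = predict_1nn(dataset_a, piggybacked)
with_original = predict_1nn(dataset_a + [(original, "benign")], piggybacked)
```

In this toy setting `without_original` is `"malicious"` and `with_original` is `"benign"`: once the benign original is part of the training data, its repackaged version sits closest to it and inherits its label.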

Conclusion
• Trojans appear to be most popular malware type
• Adware is the go-to model for repackaging
• Repackaging is losing popularity
• Malicious apps continue to bypass Google Play’s safeguards

Conclusion (cont'd)
• AMD is 5-6 years younger than Malgenome
• Yet, apps from Malgenome are still out there!
• Malware authors prefer re-using/building on older malware
• Five years to use a dataset for training?

Conclusion (cont'd)
• Already answered that in the detection experiments.
• Adware most challenging to detect = Ambiguous nature
• Binary-labeling problem? What are the alternatives?

Conclusion (cont'd)
• In what we called the “adversarial setting”:
  • Effectively circumvent app vetting safeguards (especially ML-based ones)
  • By repackaging benign apps used during training

Thank You
Any questions?

How it all began
• Working on a solution based on “Active Learning”
