
Automatically Evading Classifiers: A Case Study on PDF Malware

Automatically Evading Classifiers: A Case Study on PDF Malware Classifiers. Weilin Xu, David Evans, Yanjun Qi. University of Virginia.


  1. Automatically Evading Classifiers: A Case Study on PDF Malware Classifiers. Weilin Xu, David Evans, Yanjun Qi. University of Virginia.

  2. Machine Learning is Solving Our Problems: Spam, IDS, Malware, Fake Accounts, Fake …

  5. Machine Learning is Eating the World? (Figures: Data Scientist, Security Expert.)

  6. Machine Learning is Eating the World. Callout: No! Security is different. (Figures: Data Scientist, Security Expert.)

  7. Security Tasks are Different: Adversary Adapts. Goal: understand classifiers under attack. Results: vulnerable to automated evasion.

  8. Building Machine Learning Classifiers. Training (Supervised Learning): Labelled Training Data → Feature Extraction → Vectors → ML Algorithm → Trained Classifier.

  9. Assumption: Training Data is Representative. Training (Supervised Learning): Labelled Training Data → Feature Extraction → Vectors → ML Algorithm → Trained Classifier. Deployment: Operational Data → Trained Classifier → Malicious / Benign.
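
The training and deployment pipeline on slides 8 and 9 can be sketched in a few lines. This is a toy illustration only: the keyword-count features, the nearest-centroid learner, and every name in it (`KEYWORDS`, `extract_features`, `train`, `classify`) are assumptions made for the example, not the feature sets or learning algorithms of PDFrate or Hidost.

```python
from typing import Dict, List, Tuple

# Hypothetical feature set: how often each keyword appears in the file text.
KEYWORDS = ["/JavaScript", "/OpenAction", "/Font"]


def extract_features(pdf_text: str) -> List[float]:
    """Feature extraction: turn raw text into a numeric vector."""
    return [float(pdf_text.count(k)) for k in KEYWORDS]


def centroid(vectors: List[List[float]]) -> List[float]:
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def train(samples: List[Tuple[str, str]]) -> Dict[str, List[float]]:
    """Training: labelled data -> vectors -> one centroid per label."""
    by_label: Dict[str, List[List[float]]] = {}
    for text, label in samples:
        by_label.setdefault(label, []).append(extract_features(text))
    return {label: centroid(vs) for label, vs in by_label.items()}


def classify(model: Dict[str, List[float]], pdf_text: str) -> str:
    """Deployment: label operational data by the nearest centroid."""
    x = extract_features(pdf_text)

    def squared_distance(c: List[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(x, c))

    return min(model, key=lambda label: squared_distance(model[label]))


training_data = [
    ("/JavaScript /OpenAction", "malicious"),
    ("/JavaScript /JavaScript /OpenAction", "malicious"),
    ("/Font /Font", "benign"),
    ("/Font /Font /Font", "benign"),
]
model = train(training_data)
print(classify(model, "/OpenAction /JavaScript"))  # malicious
```

The classifier only works as long as operational data resembles the training data, which is the assumption slide 9 calls out and the one an adaptive adversary breaks.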

  10. Results: Evaded PDF Malware Classifiers
      Metric                               PDFrate* [ACSAC’12]   Hidost [NDSS’13]
      Accuracy                             0.9976                0.9996
      False Negative Rate                  0.0000                0.0056
      False Negative Rate with Adversary   1.0000                1.0000
      * Mimicus [Oakland ’14], an open source reimplementation of PDFrate.

  11. Results: Evaded PDF Malware Classifiers (same table as slide 10). Callout: Very robust against “strongest conceivable mimicry attack”.
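
For readers unfamiliar with the metrics in the table, the following sketch shows how accuracy and false negative rate are computed from labels and predictions. The label encoding (1 = malicious, 0 = benign) and the small sample lists are assumptions for illustration, not data from the talk.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of all samples whose prediction matches the label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def false_negative_rate(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of malicious samples (label 1) predicted benign (0)."""
    preds_on_malicious = [p for t, p in zip(y_true, y_pred) if t == 1]
    misses = sum(p == 0 for p in preds_on_malicious)
    return misses / len(preds_on_malicious)


# Illustrative labels: four malicious samples followed by four benign ones.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
print(accuracy(y_true, y_pred))             # 0.75
print(false_negative_rate(y_true, y_pred))  # 0.25

# If every malicious sample is classified as benign, the rate is 1.0,
# which is the situation the "with Adversary" row reports.
print(false_negative_rate([1, 1, 1], [0, 0, 0]))  # 1.0
```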

  12. Automated Evasion Approach Based on Genetic Programming. Diagram: Malicious PDF → Clone → Variants → Mutation (with Benign PDFs) → Variants → Select (✓ / ✗) → Variants.

  13. Automated Evasion Approach Based on Genetic Programming. Same diagram, with a Modified Parser and a PDF tree: /Root, /Catalog, /Pages, /JavaScript eval(‘…’). Reference: Curtis Carmony, et al., Extract Me If You Can: Abusing PDF Parsers in Malware Detectors.

  14. Automated Evasion Approach Based on Genetic Programming: Mutation. Operations on the PDF tree (/Root, /Catalog, /Pages, /JavaScript eval(‘…’)): Insert / Replace / Delete, with material taken from Benign PDFs.

  15-19. Automated Evasion Approach Based on Genetic Programming: Mutation (animation builds of slide 14). Node labels appearing in the PDF tree across the builds: 0, 128, 546.
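
The Insert / Replace / Delete mutation shown on the slides above can be sketched on a toy tree. Everything here is an assumption for illustration: nested dictionaries stand in for a parsed PDF object tree, and `all_paths`, `get_node`, and `mutate` are hypothetical helpers, not functions from the EvadeML source code.

```python
import copy
import random

# Toy stand-ins for parsed PDF object trees; the structure is hypothetical.
MALICIOUS = {"/Root": {"/Catalog": {"/Pages": {}}, "/JavaScript": "eval('…')"}}
BENIGN = {"/Root": {"/Catalog": {"/Pages": {"/Count": 3}}, "/Font": "Helvetica"}}


def all_paths(tree, prefix=()):
    """Yield the key path to every node below the top-level dict."""
    for key, child in tree.items():
        path = prefix + (key,)
        yield path
        if isinstance(child, dict):
            yield from all_paths(child, path)


def get_node(tree, path):
    """Follow a key path down from the top-level dict."""
    for key in path:
        tree = tree[key]
    return tree


def mutate(variant, benign, rng):
    """Return a mutated copy: insert, replace, or delete at one random node."""
    variant = copy.deepcopy(variant)
    paths = list(all_paths(variant))
    if not paths:
        return variant  # nothing left to mutate
    target = rng.choice(paths)
    parent = get_node(variant, target[:-1])
    op = rng.choice(["insert", "replace", "delete"])
    if op == "delete":
        del parent[target[-1]]
        return variant
    # Insert and replace both take a subtree from the benign file.
    source = rng.choice(list(all_paths(benign)))
    donor = copy.deepcopy(get_node(benign, source))
    if op == "replace":
        parent[target[-1]] = donor
    else:
        # Insert next to the target; overwrites a sibling with the same key.
        parent[source[-1]] = donor
    return variant


rng = random.Random(0)
print(mutate(MALICIOUS, BENIGN, rng))
```

Working on a deep copy keeps the seed intact, so many variants can be derived from the same malicious file.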

  20. Automated Evasion Approach Based on Genetic Programming (overview diagram of slide 12 repeated).

  21. Automated Evasion Approach Based on Genetic Programming. Each variant goes to an Oracle (Malicious?) and to the Target Classifier (Score); the Fitness Function f(x) turns both into a Fitness Score used by Select.

  22. Automated Evasion Approach Based on Genetic Programming (build of slide 21). Two outcome labels are added: Malicious (next to the Oracle's “Malicious?”) and Benign.
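
One way to combine the oracle verdict and the classifier score into a single fitness value is sketched below. The exact formula, the 0.5 threshold, and the penalty constant are assumptions chosen for illustration; the slides only state that a fitness function f(x) uses both inputs.

```python
from typing import Callable

THRESHOLD = 0.5   # assumed decision threshold of the target classifier
LOW_SCORE = -1.0  # assumed penalty for variants that lost malicious behaviour


def fitness(variant: object,
            oracle: Callable[[object], bool],
            classifier_score: Callable[[object], float]) -> float:
    """Positive fitness means: still malicious AND classified as benign."""
    if not oracle(variant):
        return LOW_SCORE
    return THRESHOLD - classifier_score(variant)


# Hypothetical stubs: the oracle confirms maliciousness, the classifier
# returns a maliciousness score between 0 and 1.
print(fitness("variant-a", lambda v: True, lambda v: 0.25))   # 0.25 (evasive)
print(fitness("variant-b", lambda v: True, lambda v: 0.75))   # -0.25 (detected)
print(fitness("variant-c", lambda v: False, lambda v: 0.0))   # -1.0 (broken)
```

With scores in the range 0 to 1, a variant that is no longer malicious always ranks below one that still is, so selection never rewards breaking the payload.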

  23. Automated Evasion Approach Based on Genetic Programming (overview diagram of slide 12 repeated).
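
Putting the stages together, the Clone → Mutation → Select loop might look like the following toy sketch. Token lists stand in for PDF files, and the oracle, the classifier, the population size, and the selection rule (keep the fitter half, refill by cloning) are all assumptions made so the example runs on its own; none of them is taken from the EvadeML implementation.

```python
import random

THRESHOLD = 0.5
BENIGN_TOKENS = ["/Font", "/Page", "/Image"]  # hypothetical benign material


def oracle(variant):
    """Toy oracle: a variant is still malicious if it keeps its payload."""
    return "payload" in variant


def classifier_score(variant):
    """Toy target classifier: share of tokens that are not known-benign."""
    if not variant:
        return 0.0
    suspicious = sum(tok not in BENIGN_TOKENS for tok in variant)
    return suspicious / len(variant)


def fitness(variant):
    """Positive only if the variant is malicious and scores below threshold."""
    return THRESHOLD - classifier_score(variant) if oracle(variant) else -1.0


def mutate(variant, rng):
    """Insert, replace, or delete one token, using benign tokens as material."""
    variant = list(variant)  # work on a copy
    op = rng.choice(["insert", "replace", "delete"])
    if op == "insert" or not variant:
        variant.insert(rng.randrange(len(variant) + 1), rng.choice(BENIGN_TOKENS))
    elif op == "replace":
        variant[rng.randrange(len(variant))] = rng.choice(BENIGN_TOKENS)
    else:
        del variant[rng.randrange(len(variant))]
    return variant


def evolve(seed, pop_size=20, generations=50, rng=None):
    """Return an evasive variant of seed, or None if the budget runs out."""
    if rng is None:
        rng = random.Random(0)
    population = [list(seed) for _ in range(pop_size)]          # clone
    for _ in range(generations):
        population = [mutate(v, rng) for v in population]       # mutation
        best = max(population, key=fitness)
        if fitness(best) > 0:
            return best                                         # evasive variant
        ranked = sorted(population, key=fitness, reverse=True)  # select
        survivors = ranked[: pop_size // 2]
        refill = [list(rng.choice(survivors)) for _ in range(pop_size - len(survivors))]
        population = survivors + refill
    return None


found = evolve(["payload", "js", "js", "js"])
print(found, classifier_score(found) if found else None)
```

The search needs only black-box access: it calls the classifier for a score and the oracle for a verdict, and never inspects the classifier's internals.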

  24. Results: Evaded PDFrate, 100%. Chart series: Original Malware Seeds.

  25. Results: Evaded PDFrate, 100%. Chart series: Original Malware Seeds, Evasive Variants.

  26. Evaded PDFrate with Adjusted Threshold. Chart series: Original Malware Seeds, Evasive Variants, Evasive Variants with lower threshold.

  27. Results: Evaded Hidost, 100%. Chart series: Original Malware Seeds.

  28. Results: Evaded Hidost, 100%. Chart series: Original Malware Seeds, Evasive Variants.

  29. Results: Accumulated Evasion Rate. Difficulty varies by seed: simple mutations often work; complex mutations are sometimes needed. Difficulty varies by target: PDFrate took 6 days to evade all; Hidost took 2 days to evade all.

  30. Cross-Evasion Effects. Diagram elements: PDF Malware Seeds, Automated Evasion (against Hidost), Evasive PDF Malware, Hidost, PDFrate. Figures shown: 387/500 Evasive (77.4%); 3/500 Evasive (0.6%). Question: Gmail’s classifier is secure?

  31. Cross-Evasion Effects (build of slide 30), with “different.” added beside the question “Gmail’s classifier is secure?”

  32. Evading Gmail’s Classifier. Evasion rate on Gmail: 135/380 (35.5%).

  33. Evading Gmail’s Classifier. Evasion rate on Gmail: 179/380 (47.1%).
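
The percentages on the cross-evasion and Gmail slides follow directly from the counts. A small helper (the name `evasion_rate` is hypothetical) reproduces them:

```python
def evasion_rate(evasive: int, total: int) -> str:
    """Format an evasion count as 'k/n (p%)' with one decimal place."""
    return f"{evasive}/{total} ({evasive / total:.1%})"


print(evasion_rate(387, 500))  # 387/500 (77.4%)
print(evasion_rate(3, 500))    # 3/500 (0.6%)
print(evasion_rate(135, 380))  # 135/380 (35.5%)
print(evasion_rate(179, 380))  # 179/380 (47.1%)
```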

  34. Conclusion (a “Vs.” graphic): Who will win this arms race? Source Code: http://EvadeML.org
