
Security through Multi-Layer Diversity

Meng Xu (Qualifying Examination Presentation)

Bringing Diversity to Computing Monoculture: Current computing monoculture leaves our infrastructure vulnerable to massive and rapid attacks. Knowing


  1. Security

     • RIPE Benchmark

       Config           | Succeed | Probabilistic | Failed | Not possible
       Default          | 114     | 16            | 720    | 2990
       AddressSanitizer | 8       | 0             | 842    | 2990
       Bunshin          | 8       | 0             | 842    | 2990

     • Real-world CVEs

       Config          | CVE       | Exploits         | Sanitizer                  | Detect
       nginx-1.4.0     | 2013-2028 | Blind ROP        | AddressSanitizer           |
       cpython-2.7.10  | 2016-5636 | Integer overflow | AddressSanitizer           |
       php-5.6.6       | 2015-4602 | Type confusion   | AddressSanitizer           |
       openssl-1.0.1a  | 2014-0160 | Heartbleed       | AddressSanitizer           |
       httpd-2.4.10    | 2014-3581 | Null dereference | UndefinedBehaviorSanitizer |

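  2. Performance

     Benchmark                        | Item | Strict-Lockstep | Selective-Lockstep
     SPEC CPU2006 (19 programs)       | Max  | 17.5%           | 14.7%
     SPEC CPU2006 (19 programs)       | Min  | 1.6%            | 1.0%
     SPEC CPU2006 (19 programs)       | Ave  | 8.6%            | 5.6%
     SPLASH-2X / PARSEC (19 programs) | Max  | 21.4%           | 18.9%
     SPLASH-2X / PARSEC (19 programs) | Min  | 10.7%           | 6.6%
     SPLASH-2X / PARSEC (19 programs) | Ave  | 16.6%           | 14.5%
     lighttpd (1MB file request)      | Ave  | 1.44%           | 1.21%
     nginx (1MB file request)         | Ave  | 1.71%           | 1.41%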

  3. Performance Highlights
     • Low overhead (5% - 16%) for standard benchmarks
     • Negligible overhead (<= 2%) for server programs
     • Extra cost of ensuring weak determinism is 8%
     • Selective-lockstep saves around 3% overhead

  5. Scalability - Number of Variants
     [Figure: sync overhead (%) versus number of variants (2, 4, 6, 8), with Ave, Max, and Min series.]
     The number of variants Bunshin can support with a reasonable overhead depends on machine configurations and program characteristics.

  7. Scalability - System Load
     [Figure: sync overhead (%) at 2%, 50%, and 99% system load, with Ave, Max, and Min series.]
     Bunshin works well at all levels of system load (i.e., Bunshin does not require exclusive cores).

  8. Check Distribution - ASan
     [Figure: two bar charts of overhead (%). Left: Whole, V1, V2, Bunshin. Right: Whole, V1, V2, V3, Bunshin. The whole-program bar is 107% in both.]

  9. Sanitizer Distribution - UBSan
     [Figure: two bar charts of overhead (%). Left: Whole, V1, V2, Bunshin. Right: Whole, V1, V2, V3, Bunshin. The whole-program bar is 228% in both.]
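The slides do not spell out how checks are assigned to variants. As a minimal sketch of one possibility, the Python below splits per-function check costs across N variants with a greedy "largest cost first" heuristic so that each variant carries a similar share. The function name `distribute_checks`, the cost numbers, and the function labels are all hypothetical and are not taken from Bunshin.

```python
import heapq


def distribute_checks(costs, n_variants):
    """Greedily assign each checked function to the currently lightest variant.

    costs: dict mapping a function name to its (assumed, profiled) check cost.
    Returns a list of n_variants lists of function names.
    """
    # Min-heap of (accumulated cost, variant index).
    heap = [(0.0, i) for i in range(n_variants)]
    heapq.heapify(heap)
    groups = [[] for _ in range(n_variants)]
    # Largest cost first; ties broken by name for a deterministic result.
    for name, cost in sorted(costs.items(), key=lambda kv: (-kv[1], kv[0])):
        load, idx = heapq.heappop(heap)
        groups[idx].append(name)
        heapq.heappush(heap, (load + cost, idx))
    return groups


# Hypothetical per-function overheads (arbitrary units).
costs = {"parse": 40.0, "render": 30.0, "decode": 20.0, "io": 10.0, "log": 7.0}
groups = distribute_checks(costs, 2)
loads = [sum(costs[name] for name in group) for group in groups]
print(groups, loads)
```

Each function is checked in exactly one variant, so the union of all variants still covers every check while no single variant pays the whole-program cost.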

  11. Unifying LLVM Sanitizers
      [Figure: overhead (%) of ASan, MSan, UBSan, and Bunshin on gobmk, povray, h264ref, and their average.]
      With an average of 5% more slowdown, Bunshin can seamlessly unify all three LLVM sanitizers.

  12. Limitations and Future Work
      • Finer-grained check distribution
      • Sanitizer integration
      • Record-and-replay

  13. Conclusion
      • It is feasible to achieve both comprehensive protection and high throughput with an N-version system
      • Bunshin is effective in reducing slowdown caused by sanitizers
        • 107% → 47.1% for ASan, 228% → 94.5% for UBSan
      • Bunshin can seamlessly unify three LLVM sanitizers with 5% extra slowdown
      https://github.com/sslab-gatech/bunshin (Source code will be released soon)

  14. Enhance System Security Through Diversity
      [Diagram: the same Input is fed, through Virtualization, to Variant 1, Variant 2, and Variant 3; execution is synchronized and outputs are consolidated into a single Output. Diversity is introduced at different layers of the software stack:]
      • MSan, UBSan, ASan: Bunshin (ATC’17)
      • HHVM, Zend, JPHP: Future work
      • Linux, Windows, MacOS: PlatPal (Security’17)

  15. PlatPal: Detecting Malicious Documents with Platform Diversity
      Meng Xu and Taesoo Kim, Georgia Tech
      Presented at the 2017 USENIX Security Symposium (Security’17)

  16. Malicious Documents On the Rise

  19. Adobe Components Exploited
      • Element parser
      • JavaScript engine
      • Font manager
      • System dependencies
      137 CVEs in 2015; 227 CVEs in 2016

  20. Maldoc Formula
      Flexibility of doc spec + A large attack surface + Less caution from users → More opportunities to profit

  21. Battle against Maldoc - A Survey

      Category | Focus      | Work               | Year | Detection
      Static   | JavaScript | PJScan             | 2011 | Lexical analysis
      Static   | JavaScript | Vatamanu et al.    | 2012 | Token clustering
      Static   | JavaScript | Lux0r              | 2014 | API reference classification
      Static   | JavaScript | MPScan             | 2013 | Shellcode and opcode sig
      Static   | Metadata   | PDF Malware Slayer | 2012 | Linearized object path
      Static   | Metadata   | Srndic et al.      | 2013 | Hierarchical structure
      Static   | Metadata   | PDFrate            | 2012 | Content meta-features
      Static   | Both       | Maiorca et al.     | 2016 | Many heuristics combined
      Dynamic  | JavaScript | MDScan             | 2011 | Shellcode and opcode sig
      Dynamic  | JavaScript | PDF Scrutinizer    | 2012 | Known attack patterns
      Dynamic  | JavaScript | ShellOS            | 2011 | Memory access patterns
      Dynamic  | JavaScript | Liu et al.         | 2014 | Common attack behaviors
      Dynamic  | Memory     | CWXDetector        | 2012 | Violation of invariants

  23. Reliance on External PDF Parser

      Category | Focus      | Work               | Year | Detection                    | External Parser?
      Static   | JavaScript | PJScan             | 2011 | Lexical analysis             | Yes
      Static   | JavaScript | Vatamanu et al.    | 2012 | Token clustering             | Yes
      Static   | JavaScript | Lux0r              | 2014 | API reference classification | Yes
      Static   | JavaScript | MPScan             | 2013 | Shellcode and opcode sig     | No
      Static   | Metadata   | PDF Malware Slayer | 2012 | Linearized object path       | Yes
      Static   | Metadata   | Srndic et al.      | 2013 | Hierarchical structure       | Yes
      Static   | Metadata   | PDFrate            | 2012 | Content meta-features        | Yes
      Static   | Both       | Maiorca et al.     | 2016 | Many heuristics combined     | Yes
      Dynamic  | JavaScript | MDScan             | 2011 | Shellcode and opcode sig     | Yes
      Dynamic  | JavaScript | PDF Scrutinizer    | 2012 | Known attack patterns        | Yes
      Dynamic  | JavaScript | ShellOS            | 2011 | Memory access patterns       | Yes
      Dynamic  | JavaScript | Liu et al.         | 2014 | Common attack behaviors      | No
      Dynamic  | Memory     | CWXDetector        | 2012 | Violation of invariants      | No

      Weakness: Parser-confusion attacks (Carmony et al., NDSS’16)

  25. Reliance on Machine Learning

      Category | Focus      | Work               | Year | Detection                    | Machine Learning?
      Static   | JavaScript | PJScan             | 2011 | Lexical analysis             | Yes
      Static   | JavaScript | Vatamanu et al.    | 2012 | Token clustering             | Yes
      Static   | JavaScript | Lux0r              | 2014 | API reference classification | Yes
      Static   | JavaScript | MPScan             | 2013 | Shellcode and opcode sig     | No
      Static   | Metadata   | PDF Malware Slayer | 2012 | Linearized object path       | Yes
      Static   | Metadata   | Srndic et al.      | 2013 | Hierarchical structure       | Yes
      Static   | Metadata   | PDFrate            | 2012 | Content meta-features        | Yes
      Static   | Both       | Maiorca et al.     | 2016 | Many heuristics combined     | Yes
      Dynamic  | JavaScript | MDScan             | 2011 | Shellcode and opcode sig     | No
      Dynamic  | JavaScript | PDF Scrutinizer    | 2012 | Known attack patterns        | No
      Dynamic  | JavaScript | ShellOS            | 2011 | Memory access patterns       | No
      Dynamic  | JavaScript | Liu et al.         | 2014 | Common attack behaviors      | No
      Dynamic  | Memory     | CWXDetector        | 2012 | Violation of invariants      | No

      Weakness: Automatic classifier evasions (Xu et al., NDSS’16)

  27. Reliance on Known Attacks

      Category | Focus      | Work               | Year | Detection                    | Known Attacks?
      Static   | JavaScript | PJScan             | 2011 | Lexical analysis             | Yes
      Static   | JavaScript | Vatamanu et al.    | 2012 | Token clustering             | Yes
      Static   | JavaScript | Lux0r              | 2014 | API reference classification | Yes
      Static   | JavaScript | MPScan             | 2013 | Shellcode and opcode sig     | Yes
      Static   | Metadata   | PDF Malware Slayer | 2012 | Linearized object path       | Yes
      Static   | Metadata   | Srndic et al.      | 2013 | Hierarchical structure       | Yes
      Static   | Metadata   | PDFrate            | 2012 | Content meta-features        | Yes
      Static   | Both       | Maiorca et al.     | 2016 | Many heuristics combined     | Yes
      Dynamic  | JavaScript | MDScan             | 2011 | Shellcode and opcode sig     | Yes
      Dynamic  | JavaScript | PDF Scrutinizer    | 2012 | Known attack patterns        | Yes
      Dynamic  | JavaScript | ShellOS            | 2011 | Memory access patterns       | Yes
      Dynamic  | JavaScript | Liu et al.         | 2014 | Common attack behaviors      | Yes
      Dynamic  | Memory     | CWXDetector        | 2012 | Violation of invariants      | No

      Weakness: How about zero-day attacks?

  29. Reliance on Detectable Discrepancy (between benign and malicious docs)

      Category | Focus      | Work               | Year | Detection                    | Discrepancy?
      Static   | JavaScript | PJScan             | 2011 | Lexical analysis             | Yes
      Static   | JavaScript | Vatamanu et al.    | 2012 | Token clustering             | Yes
      Static   | JavaScript | Lux0r              | 2014 | API reference classification | Yes
      Static   | JavaScript | MPScan             | 2013 | Shellcode and opcode sig     | No
      Static   | Metadata   | PDF Malware Slayer | 2012 | Linearized object path       | Yes
      Static   | Metadata   | Srndic et al.      | 2013 | Hierarchical structure       | Yes
      Static   | Metadata   | PDFrate            | 2012 | Content meta-features        | Yes
      Static   | Both       | Maiorca et al.     | 2016 | Many heuristics combined     | Yes
      Dynamic  | JavaScript | MDScan             | 2011 | Shellcode and opcode sig     | No
      Dynamic  | JavaScript | PDF Scrutinizer    | 2012 | Known attack patterns        | No
      Dynamic  | JavaScript | ShellOS            | 2011 | Memory access patterns       | Yes
      Dynamic  | JavaScript | Liu et al.         | 2014 | Common attack behaviors      | Yes
      Dynamic  | Memory     | CWXDetector        | 2012 | Violation of invariants      | No

      Weakness: Mimicry and reverse mimicry attacks (Srndic et al., Oakland’14 and Maiorca et al., AsiaCCS’13)

  30. Highlights of the Survey
      Prior works rely on:
      • External PDF parsers → Parser-confusion attacks
      • Machine learning → Automatic classifier evasion
      • Known attack signatures → Zero-day attacks
      • Detectable discrepancy → Mimicry and reverse mimicry

  31. Motivations for PlatPal

      Prior works rely on     | What PlatPal aims to achieve
      External PDF parsers    | Using Adobe’s parser
      Machine learning        | Using only simple heuristics
      Known attack signatures | Capable of detecting zero-days
      Detectable discrepancy  | Does not assume discrepancy

      • Complementary to prior works

  37. A Motivating Example
      • A CVE-2013-2729 PoC against Adobe Reader 10.1.4
        SHA-1: 74543610d9908698cb0b4bfcc73fc007bfeb6d84

  40. Platform Diversity as A Heuristic
      When the same document is opened across different platforms:
      • A benign document “behaves” the same
      • A malicious document “behaves” differently
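As a minimal sketch of this heuristic, assume (purely for illustration) that a document's behavior on one platform has already been reduced to an ordered list of platform-neutral event labels. The helper `first_divergence` and the event names are hypothetical and are not part of PlatPal.

```python
from itertools import zip_longest


def first_divergence(trace_a, trace_b):
    """Return (index, event_a, event_b) at the first mismatch, or None.

    A missing event (one trace is shorter) is reported as None.
    """
    for i, (a, b) in enumerate(zip_longest(trace_a, trace_b)):
        if a != b:
            return i, a, b
    return None


# Hypothetical traces of the same document on two platforms.
benign_win = ["parse", "build_tree", "render"]
benign_mac = ["parse", "build_tree", "render"]
suspect_win = ["parse", "run_script", "spawn_process"]
suspect_mac = ["parse", "run_script", "crash"]

print(first_divergence(benign_win, benign_mac))    # no mismatch
print(first_divergence(suspect_win, suspect_mac))  # mismatch at index 2
```

Under this assumption, a document is flagged whenever the function returns a mismatch, and the returned index points at where the two platforms stopped agreeing.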

  41. Questions for PlatPal
      • What is a “behavior”?
      • What is a divergence?
      • How to trace them?
      • How to compare them?

  42. PlatPal Basic Setup
      [Diagram: the document under test is opened in Adobe Reader inside Virtual Machine 1 (Windows host) and Virtual Machine 2 (MacOS host).]

  44. PlatPal Dual-Level Tracing
      [Diagram: the document under test is opened in Adobe Reader inside Virtual Machine 1 (Windows host) and Virtual Machine 2 (MacOS host).]
      • Internal Tracer (inside Adobe Reader): traces of PDF processing
      • External Tracer (observing syscalls): impacts on the host platform

  45. PlatPal Internal Tracer
      • Implemented as an Adobe Reader plugin.
      • Hooks critical functions and callbacks during the PDF processing lifecycle.
      • Very fast and stable across Adobe Reader versions.
      [Diagram: stages hooked by the Internal Tracer inside Adobe Reader: COS object parsing, PD tree construction, Script execution, Other actions, Element rendering.]

  46. PlatPal External Tracer
      • Implemented based on NtTrace (for Windows) and DTrace (for MacOS).
      • Resembles high-level system impacts in the same manner as Cuckoo guest agent.
      • Starts tracing only after the document is loaded into Adobe Reader.
      [Diagram: inside the virtual machine, syscalls made by Adobe Reader to the host platform pass through the External Tracer, which records Filesystem Operations, Program Executions, Network Activities, and Normal Exit or Crash.]
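To make the idea of turning raw syscalls into comparable high-level impacts concrete, here is a hedged sketch. The lookup table is illustrative only: it lists a handful of real syscall names per platform, but it is not PlatPal's actual mapping, and the category labels and the function `summarize_impacts` are hypothetical.

```python
# Illustrative (not exhaustive, not PlatPal's) syscall-to-impact table.
IMPACT_OF_SYSCALL = {
    "windows": {
        "NtCreateFile": "filesystem",
        "NtWriteFile": "filesystem",
        "NtCreateUserProcess": "program_execution",
    },
    "macos": {
        "open": "filesystem",
        "write": "filesystem",
        "execve": "program_execution",
        "posix_spawn": "program_execution",
        "connect": "network",
    },
}


def summarize_impacts(platform, syscalls):
    """Reduce a raw syscall list to the set of high-level impact categories."""
    table = IMPACT_OF_SYSCALL[platform]
    return {table[name] for name in syscalls if name in table}


# Hypothetical raw traces; syscalls missing from the table are ignored.
win = summarize_impacts("windows", ["NtCreateFile", "NtQueryInformationProcess"])
mac = summarize_impacts("macos", ["open", "getpid", "execve"])
print(win, mac, win == mac)
```

Because both platforms are mapped onto the same category names, the two summaries can be compared directly even though the underlying syscalls differ.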

  47. PlatPal Automated Workflow
      Invocation: PlatPal <file-to-check>
      The same steps run in the Windows VM and the MacOS VM:
      1) Restore Clean Snapshot
      2) Launch Adobe Reader
      3) Attach External Tracer
      4) Open PDF
      5) Drive PDF by Internal Tracer
      6) Dump Traces
      The traces from the two VMs are then compared.
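The workflow can be sketched as a small driver loop. Everything below is a stand-in: `FakeVM` replaces a real virtual-machine controller and simply records which step was requested, the step identifiers are derived from the slide's labels, and the comparison is plain equality of the dumped traces. None of these names come from PlatPal itself.

```python
from dataclasses import dataclass, field

# Step identifiers derived from the slide, in order.
STEPS = (
    "restore_clean_snapshot",
    "launch_adobe_reader",
    "attach_external_tracer",
    "open_pdf",
    "drive_pdf_by_internal_tracer",
    "dump_traces",
)


@dataclass
class FakeVM:
    """Stand-in for a real VM driver; logs steps and returns canned traces."""
    platform: str
    canned_traces: list
    log: list = field(default_factory=list)

    def run_step(self, step, pdf_path):
        self.log.append(step)
        return list(self.canned_traces) if step == "dump_traces" else None


def analyze(vm, pdf_path):
    """Run every step in order on one VM and return the dumped traces."""
    traces = None
    for step in STEPS:
        result = vm.run_step(step, pdf_path)
        if step == "dump_traces":
            traces = result
    return traces


def platpal(pdf_path, vms):
    """Analyze the file on every VM and compare the resulting traces."""
    traces = [analyze(vm, pdf_path) for vm in vms]
    first, *rest = traces
    return "divergence" if any(t != first for t in rest) else "no divergence"


win = FakeVM("windows", ["parse", "render"])
mac = FakeVM("macos", ["parse", "render"])
print(platpal("sample.pdf", [win, mac]))
```

A real driver would run the two VMs concurrently and would need a more tolerant comparison than strict equality; the sketch only fixes the order of the steps.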

  48. Evaluate PlatPal
      • Robustness against benign samples: does a benign document “behave” the same?
      • Effectiveness against malicious samples: does a malicious document “behave” differently?
      • Speed and resource usage

  49. Robustness
      • 1000 samples from Google search.
      • 30 samples that use advanced features in PDF standards from PDF learning sites.

      Sample Type     | Number of Samples | Divergence Detected? (i.e., False Positive)
      Plain PDF       | 966               | No
      Embedded fonts  | 34                | No
      JavaScript code | 32                | No
      AcroForm        | 17                | No
      3D objects      | 2                 | No

  50. Effectiveness
      • 320 malicious samples from VirusTotal with CVE labels.
      • Restricted to CVEs published after 2013.
      • Each sample is run on the most recent version of Adobe Reader available when its CVE was published.

  51. Effectiveness
      Analysis results of 320 maldoc samples:
      • No Divergence: 24%
      • Both Crash: 11%
      • Divergence: 65%

  52. Effectiveness
      Analysis results of 320 maldoc samples: No Divergence 24%, Both Crash 11%, Divergence 65%.
      Breakdown of the 77 potential false positives:
      [Pie chart with slices of 25%, 47%, 3%, and 26%; legend: Targets old versions, Mis-classified by AV vendor, No malicious activity triggered, Unknown.]

  53. Time and Resource Usages

      Average Analysis Time Breakdown (unit: seconds)

      Item              | Windows | MacOS
      Snapshot restore  | 9.7     | 12.6
      Document parsing  | 0.5     | 0.6
      Script execution  | 10.5    | 5.1
      Element rendering | 7.3     | 6.2
      Total             | 23.7    | 22.1

      Resource Usages
      • 2GB memory per running virtual machine.
      • 60GB disk space for Windows and MacOS snapshots, each of which corresponds to one of the 6 Adobe Reader versions.

  54. Evaluation Highlights
      • Confirms our fundamental assumption in general: a benign document “behaves” the same, and a malicious document “behaves” differently
      • PlatPal is subject to the pitfalls of dynamic analysis, i.e., the environment must be prepared to lure out the malicious behaviors
      • Incurs reasonable analysis time, which makes PlatPal practical

  55. Further Analysis
      • What could be the root causes of these divergences?

  57. Diversified Factors across Platforms

      Category           | Factor               | Windows                       | MacOS
      Shellcode Creation | Syscall semantics    | Both the syscall number and the register set used to hold syscall arguments are different
      Shellcode Creation | Calling convention   | rcx, rdx, r8 for first 3 args | rdi, rsi, rdx for first 3 args
      Shellcode Creation | Library dependencies | e.g., LoadLibraryA            | e.g., dlopen
      Memory Management  |                      |                               |
      Platform Features  |                      |                               |
