
Guarding Vulnerable Code: Module 1: Sanitization (Mathias Payer)



  1. Guarding Vulnerable Code: Module 1: Sanitization. Mathias Payer, Purdue University, http://hexhive.github.io

  2. Vulnerabilities everywhere?

  3. Common Languages: TIOBE’18
     | Jul 2018 | Jul 2017 | Change | Language    | Ratings | Change |
     | 1        | 1        |        | Java        | 16.139% | +2.37% |
     | 2        | 2        |        | C           | 14.662% | +7.34% |
     | 3        | 3        |        | C++         | 7.615%  | +2.04% |
     | 4        | 4        |        | Python      | 6.361%  | +2.82% |
     | 5        | 7        | +      | VB .NET     | 4.247%  | +1.20% |
     | 6        | 5        | -      | C#          | 3.795%  | +0.28% |
     | 7        | 6        | -      | PHP         | 2.832%  | -0.26% |
     | 8        | 8        |        | JavaScript  | 2.831%  | +0.22% |
     | 9        | -        | ++     | SQL         | 2.334%  | +2.33% |
     | 10       | 18       | ++     | Objective-C | 1.453%  | -0.44% |

  4. Software is highly complex
     ● Google Chrome: 76 MLoC
     ● Gnome: 9 MLoC
     ● Xorg: 1 MLoC
     ● glibc: 2 MLoC
     ● Linux kernel: 17 MLoC
     Low-level languages (C/C++) trade type safety and memory safety for performance.

  5. Defense: Testing vs. Mitigations
     Software Testing
     ● Discover bugs
     ● Development tool
     ● Result oriented
     Mitigations
     ● Stop exploitation
     ● Always on
     ● Low overhead

  6. Memory Corruption

  7. Memory error: invalid dereference
     ● Dangling pointer (temporal): free(foo); *foo = 23;
     ● Out-of-bounds pointer (spatial): char foo[40]; foo[42] = 23;
     ● Violation iff: pointer is read, written, or freed

  8. Type Confusion

  9. Type confusion through downcasts
     (Diagram: class hierarchy with Base as the parent of Greeter and Exec.)
     Greeter *g = new Greeter();
     Base *b = static_cast<Base*>(g); √ (upcast)
     Exec *e = static_cast<Exec*>(b); X (downcast to the wrong type)

  10. C++ casting operations
      ● static_cast<ToClass>(Object)
        – Compile time check
        – No runtime type information
      ● dynamic_cast<ToClass>(Object)
        – Runtime check
        – Requires Runtime Type Information (RTTI)
        – Not used in performance critical code

  11. Static cast
      Base *b = …;
      a = static_cast<Greeter*>(b);
      Generated code:
        movq -24(%rbp), %rax   # Load pointer
                               # Type "check" (no instruction is emitted)
        movq %rax, -40(%rbp)   # Store pointer

  12. Dynamic cast (O2)
      Base *b = …;
      a = dynamic_cast<Greeter*>(b);
      Generated code:
        leaq _ZTI7Greeter(%rip), %rdx
        leaq _ZTI4Base(%rip), %rsi
        xorl %ecx, %ecx
        movq %rbp, %rdi            # Load pointer
        call __dynamic_cast@PLT    # Type check

  13. Type confusion
      class Base { int x; };
      class Greeter: Base { int y; virtual void Hi(); };
      Base *Bptr = new Base();
      Greeter *Gptr;
      Gptr = static_cast<Greeter*>(Bptr); // Type confusion
      Gptr->y = 0x43; // Memory safety violation!
      Gptr->Hi(); // Control-flow hijacking
      (Diagram: Bptr points at a Base object that holds only x. Seen through Gptr, the Greeter fields vtable* and y fall outside that object.)

  14. Type Confusion Demo

  15. C++ virtual dispatch
      (Diagram: class hierarchy with Base as the parent of Greeter and Exec.)
      class Base { … };
      class Exec: public Base {
      public:
        virtual void exec(const char *prg) { system(prg); }
      };
      class Greeter: public Base {
      public:
        virtual void sayHi(const char *str) { std::cout << str << std::endl; }
      };
      Greeter *greeter = new Greeter();
      greeter->sayHi("Oh, hello there!");

  16. Simple exploitation demo
      int main() {
        Base *b1 = new Greeter();
        Base *b2 = new Exec();
        Greeter *g;
        g = static_cast<Greeter*>(b1);
        g->sayHi("Greeter says hi!"); // g[0][0](str);
        g = static_cast<Greeter*>(b2);
        g->sayHi("/usr/bin/xcalc"); // g[0][0](str);
        delete b1;
        delete b2;
        return 0;
      }
      (Diagram: the vtable* of b1 points to the Greeter vtable, GreeterT; the vtable* of b2 points to the Exec vtable, ExecT.)

  17. Sanitization

  18. Problem: broken abstractions?
      C/C++:
        void log(int a) { printf("Log: "); printf("%d", a); }
        void (*fun)(int) = &log;
        void init() { fun(15); }
      ASM:
        log: ...
        fun: .quad log
        init: ...
          movl $15, %edi
          movq fun(%rip), %rax
          call *%rax

  19. LLVM Sanitization
      ● Test cases detect bugs through assertions, segmentation faults, traps, exceptions
      ● Enforce stronger policies during testing!
        – Address Sanitizer: memory safety
        – Leak Sanitizer: memory leaks
        – Memory Sanitizer: uninitialized memory
        – UBSan: undefined behavior
        – Thread Sanitizer: data races
        – HexVASAN: variadic argument checker
        – HexType: type safety

  20. Type Safety

  21. Type confusion detection*
      ● A static cast is checked only at compile time
        – Fast but no runtime guarantees
      ● Dynamic casts are checked at runtime
        – High overhead, limited to polymorphic classes
      ● HexType design:
        – Conceptually check all casts dynamically
        – Aggressively optimize design and implementation
      * TypeSanitizer: Practical Type Confusion Detection. Istvan Haller, Yuseok Jeon, Hui Peng, Mathias Payer, Herbert Bos, Cristiano Giuffrida, Erik van der Kouwe. In CCS'16
      * HexType: Efficient Detection of Type Confusion Errors for C++. Yuseok Jeon, Priyam Biswas, Scott A. Carr, Byoungyoung Lee, and Mathias Payer. In CCS'17

  22. Making type checks explicit
      ● Enforce runtime check at all cast sites
        – static_cast<ToClass>(Object)
        – dynamic_cast<ToClass>(Object)
        – reinterpret_cast<ToClass>(Object)
        – (ToClass)(Object)
      ● Build global type hierarchy
      ● Keep track of the allocation type of each object
        – Must instrument all forms of allocation
        – Requires disjoint metadata

  23. HexType: design
      (Diagram components: Source; Clang; LLVM Pass; Instrumentation code (type casting verification); Type Hierarchy Information; HexType Runtime Library; Link; HexType Binary.)

  24. HexType: aggressive optimization
      ● Limit tracing to unsafe types
        – Remove tracing of types that are never cast
      ● Limit checking to unsafe casts
        – Remove statically verifiable casts
      ● No more RTTI for dynamic casts
        – Replace dynamic casts with fast lookup

  25. Demo Time!

  26. HexType coverage

  27. Newly discovered bugs
      ● Discovered seven new vulnerabilities:
      (Class hierarchy diagrams. Apache Xerces C++: DOMNode, DOMCharacterData, DOMElement, DOMText, DOMElementImpl, DOMTextImpl, with "Type Confusion!" marked near DOMElementImpl and DOMTextImpl. Qt base library: QMapNodeBase, QMapNode.)

  28. Sanitizer Summary: Type Safety
      ● Type confusion fundamental in today’s exploits
      ● Existing sanitizers are incomplete, partial, slow
      ● HexType
        – (Almost) full coverage (2-6x increase)
        – Reasonable overhead (SPEC CPU: 0-32x improvement, Firefox: 0-0.5x slowdown)
        – Future work: remaining coverage, optimizations

  29. T-Fuzz

  30. Fuzzing Challenges
      ● Challenges
        – Shallow coverage
        – Hard to find “deep” bugs
      ● Root cause
        – Fuzzer-generated inputs cannot bypass complex sanity checks in the target program
        – Existing work limits itself to input generation
      (Diagram: start → check1 → check2 → check3 → bug → end; shallow code paths lie before the checks, deep code paths behind them.)

  31. T-Fuzz: Fuzz the Program!
      ● Option 1: generate input to bypass checks by heavy-weight program analysis techniques
        – Driller (concolic analysis)
        – VUzzer (dynamic taint analysis)
      ● Our idea: remove program’s sanity checks
        – Checks filter orthogonal input, e.g., magic values, checksum, or hashes (Non-Critical Check, NCC)
        – Insight: removing NCCs is safe
      if (strncmp(hdr, "ELF", 3) == 0) {
        // main program logic
      } else {
        error();
      }

  32. Design and Implementation
      ● Fuzzer generates inputs
      ● When “stuck”
        – Detect NCCs*
      ● Transform program
      ● Verify crashes
      (Diagram components: Program; Fuzzer (e.g. AFL); Inputs; Program Transformer; Transformed Programs; Crashing inputs; Crash Analyzer; Bug Reports; False Positives.)
      * Approximation of NCCs: edges in the CFG connecting covered/uncovered nodes

  33. Detecting NCCs
      ● Approximate NCCs as edges connecting covered and uncovered nodes in CFG
        – Over approximate, may contain false positives
        – Lightweight and simple to implement
      (Diagram legend: Covered Node, Uncovered Node, NCC Candidates.)

  34. Program Transformation
      ● Our approach: negate NCCs
        – Simple: static binary rewriting
        – Zero runtime overhead in resulting target program
        – Unchanged CFG
        – Trace in transformed program maps to original program
        – Path constraints of original program can be recovered
      (Diagram: start → check A == B → True branch / False branch → end. With the negated check, the condition becomes A != B and the branches and the rest of the graph are unchanged.)

  35. Comparison to Symbolic Execution
      ● Explores all code paths, tracks constraints
      ● Path explosion, e.g., loops
      ● Each branch doubles the number of code paths
      ● Resource requirement
      ● Theoretically beautiful, limited scalability
      (Diagram: execution tree whose leaves are (Path 1, constraint set 1) ... (Path n, constraint set n).)

  36. Comparison to Concolic Execution
      ● Guided by concrete inputs
      ● Follows single code path, collects constraints for new code paths
      ● Reduced resource requirements
      ● Still an exponential number of paths to explore!
      (Diagram: execution tree driven by an input, branching on C1 / Not C1.)

  37. Comparison to Driller (Fuzz & CE)
      ● Fuzzing until coverage wall
      ● When fuzzing gets “stuck”, concolic execution explores new code paths using fuzzer generated inputs
      ● Limitations
        – “SE & constraints solving” slows down fuzzing
        – Not able to bypass “hard” checks
      (Diagram components: Fuzzer (mutating); Inputs; SE & constraint solving; target program; Crashes.)

  38. T-Fuzz: fuzz first, solve only crashes
      ● Fuzzing/SE decoupled
      ● SE only applied to T-Fuzz detected crashes
      ● For “hard” checks, T-Fuzz detects the guarded bug, but cannot verify it
      (Diagram "T-Fuzz in action", components: Program; Fuzzer; Program Transformation; Crashes; SE & constraints solving.)

  39. Evaluation
      ● Implementation
        – Fuzzer: shellphish fuzzer (python wrapper of AFL)
        – Program Transformer: angr tracer, radare2
        – Crash Analyzer: 2k LoC Python hackery
      ● Evaluation
        – DARPA CGC dataset
        – LAVA-M dataset
        – 4 real-world programs
