Memory Categorization: Separating Attacker-Controlled Data
Matthias Neugschwandtner, Alessandro Sorniotti, Anil Kurmus
IBM Research - Zurich
16th Conference on Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA); Gothenburg, June 19-20, 2019

Memory Safety - Approaches
● Ensure temporal and spatial memory safety
  ○ managed runtimes (Java)
  ○ native code (SoftBound)
  ○ hardware support (MPX)
● Mitigate memory violations
  ○ control flow integrity
  ○ data flow integrity
● Runtime checks cause overhead
  ○ optimizations for performance-critical code
    ■ ASAP, SplitKernel, PartiSan, BinRec
● Optimize based on data!
Memory Categorization - M. Neugschwandtner, A. Sorniotti, A. Kurmus - DIMVA 2019

Memory Categorization
Attacker-Controlled (AC) Data
● Untrusted data
  ○ Input read from network
Non-Attacker-Controlled (nAC) Data
● Program-internal data
  ○ Memory addresses
● Trusted data
  ○ Cryptographic material
  ○ Configuration read from disk

● Separate AC from nAC data
● Attacker only has access to their own data
● Loose form of memory safety by itself
● Enables mitigations based on selective hardening

Memory Categorization
I. Provide: separate allocators
II. Categorize: decide which allocator should be used
III. Instrument: implement decision in program

Separate Allocators
● Stack allocators
  ○ nAC and AC allocators
● Heap allocators
  ○ nAC and AC allocators
  ○ “mixed” allocator
    ■ Complex data structures (list item: metadata + content, packet: header + payload)
    ■ Custom memory managers (single large allocated chunk of memory)
● Allocation sites
  ○ Location where allocator is invoked
  ○ Stack allocations
    ■ limited in scope to current function → intraprocedural
  ○ Heap allocations
    ■ long(er)-lived
    ■ depends on calling context → interprocedural
    ■ allocation wrappers, e.g. xmalloc()
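
To make the idea of per-category allocators concrete, here is a minimal sketch in C. It is not MemCat's allocator: the names `cat_alloc`, `cat_owns`, and `ARENA_SIZE` are hypothetical, and a fixed-size bump arena stands in for a real heap. The point it illustrates is only that AC, nAC, and mixed data are served from disjoint memory regions, so a linear overflow from an AC buffer runs into other AC data rather than nAC data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Categories from the slides: nAC, AC, and "mixed". */
enum category { CAT_NAC, CAT_AC, CAT_MIXED, CAT_COUNT };

#define ARENA_SIZE 4096  /* hypothetical fixed arena size for this sketch */

struct arena {
    _Alignas(16) unsigned char mem[ARENA_SIZE];
    size_t used;
};

/* One disjoint arena per category. */
static struct arena arenas[CAT_COUNT];

/* Bump-allocate `size` bytes from the arena of the given category.
 * Returns NULL for size 0 or when the arena is exhausted. */
void *cat_alloc(enum category cat, size_t size)
{
    assert((unsigned)cat < CAT_COUNT);
    struct arena *a = &arenas[cat];
    size_t aligned = (size + 15) & ~(size_t)15;  /* keep 16-byte alignment */

    if (size == 0 || aligned < size || aligned > ARENA_SIZE - a->used)
        return NULL;
    void *p = a->mem + a->used;
    a->used += aligned;
    return p;
}

/* Returns nonzero if `p` points into the arena of category `cat`. */
int cat_owns(enum category cat, const void *p)
{
    assert((unsigned)cat < CAT_COUNT);
    uintptr_t base = (uintptr_t)arenas[cat].mem;
    uintptr_t addr = (uintptr_t)p;
    return addr >= base && addr - base < ARENA_SIZE;
}
```

A caller would use `cat_alloc(CAT_AC, n)` for a network receive buffer and `cat_alloc(CAT_NAC, n)` for key material; `cat_owns` is only a helper for checking the separation.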

Label Allocation Sites
I. Identify AC data sources
II. Track pointers backwards
III. Find allocation sites

    1 char *cmalloc(int sz) {
    2   if (sz == 0) return NULL;
    3   return (char *)malloc(sz);
    4 }
    5 int main(int argc, char **argv) {
    6   int fd = open(argv[1], O_RDONLY);
    7   char *buf = cmalloc(10);
    8   read(fd, buf, 10);
    9 }

AC allocation site. Context: 7, 3

Static Analysis
● Andersen’s points-to analysis
  ○ field-sensitive, but context- and flow-insensitive
  ○ field-sensitivity required for structs and classes with both AC & nAC fields
  ○ “partitioning” for SVF
● Sparse Value-Flow analysis
  ○ produces mSSA (memory static single assignment) form of the program
  ○ pointer dereference (load of address-taken variable) = USE
  ○ pointer assignment (store of address-taken variable) = DEF + USE
  ○ function call site (for function operating on address-taken variable) = (DEF +) USE
● Sparse Value-Flow Graph
  ○ combines SSA and mSSA into an interprocedural flow graph
  ○ nodes = variable definitions
  ○ edges = value flow dependencies
● Context-sensitive backward traversal through VFG
SVF: https://llvm.org/devmtg/2016-03/Presentations/SVF_EUROLLVM2016.pdf

Dynamic Analysis
● Fills in gaps of static analysis
  ○ e.g., because of dynamically loaded code, limits of points-to analysis
  ○ limited to heap allocations
● Intercept allocators
  ○ unwind call stack to obtain context information
  ○ allocate memory on “limbo” heap, annotate with context
● Intercept memory access
  ○ write access to limbo heap
  ○ categorize allocation context of corresponding memory region based on access

MemCat Compiler Pass
● Clang/LLVM LTO compiler pass
● Client for SVF
  ○ constructs value-flow-graph
  ○ value flows
    ■ direct: top-level pointers
    ■ indirect: address-taken pointers
    ■ interprocedural

     1 char *cmalloc(int sz) {
     2   if (sz == 0) return NULL;
     3   return (char *)malloc(sz);
     4 }
     5 void A() {
     6   int fd = open(...);
     7   char *buf = cmalloc(10);
     8   read(fd, buf, 10);
     9 }
    10 void B(char *foo) {
    11   char *tmp = cmalloc(20);
    12   strcpy(tmp, foo);
    13 }

[Figure: LLVM IR → value flow analysis → value flow graph; the graph nodes correspond to source lines 3, 7, 8 and 11, 12]

MemCat Compilation Pass
● Look for AC data sources
  ○ source function return values / output parameters
  ○ e.g., fgetc, fgets, fread, fscanf, pread, read, recv*
● VFG traversal
  ○ start from node representing source
  ○ worklist-style backward traversal
  ○ label encountered allocation sites
    ■ flag stack allocations
    ■ record context for heap allocations
[Figure: LLVM IR → value flow analysis → value flow graph; AC data source configuration + value flow graph → graph traversal → AC allocation sites]
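
The worklist-style backward traversal can be sketched on a toy graph. The code below is an illustration under stated assumptions, not the MemCat pass: the graph is hand-built rather than produced by SVF, nodes are indexed by the source line numbers of the `cmalloc` example (3, 7, 8, 11, 12), all type and function names are hypothetical, and the graph is assumed to be acyclic so that no visited set is needed. A real implementation would have to memoize (node, context) pairs.

```c
#include <assert.h>
#include <string.h>

#define MAX_NODES 16
#define MAX_PREDS 4
#define MAX_CTX   4
#define MAX_WORK  64

enum node_kind { NODE_OTHER, NODE_STACK_ALLOC, NODE_HEAP_ALLOC };

/* Backward edge: the value flows from node `from` into the owning node.
 * `callsite` is the call site crossed on the way, or -1 if none. */
struct edge { int from; int callsite; };

struct node {
    enum node_kind kind;
    int npreds;
    struct edge preds[MAX_PREDS];
    int ac_stack;                 /* set when a stack allocation is flagged AC */
};

struct context { int depth; int sites[MAX_CTX]; };
struct ac_site { int node; struct context ctx; };

/* Walk backwards from `source`; flag stack allocations and record
 * heap allocation sites together with the call sites crossed. */
int label_allocation_sites(struct node *g, int source,
                           struct ac_site *out, int max_out)
{
    struct ac_site worklist[MAX_WORK];
    int top = 0, nout = 0;

    memset(&worklist[0], 0, sizeof worklist[0]);
    worklist[top++].node = source;

    while (top > 0) {
        struct ac_site cur = worklist[--top];
        struct node *n = &g[cur.node];

        if (n->kind == NODE_STACK_ALLOC)
            n->ac_stack = 1;
        else if (n->kind == NODE_HEAP_ALLOC && nout < max_out)
            out[nout++] = cur;

        for (int i = 0; i < n->npreds; i++) {
            struct ac_site next = cur;
            next.node = n->preds[i].from;
            if (n->preds[i].callsite >= 0 && next.ctx.depth < MAX_CTX)
                next.ctx.sites[next.ctx.depth++] = n->preds[i].callsite;
            assert(top < MAX_WORK);
            worklist[top++] = next;
        }
    }
    return nout;
}

/* Toy graph for the cmalloc example: 3 -> 7 -> 8 and 3 -> 11 -> 12. */
void build_example(struct node g[MAX_NODES])
{
    memset(g, 0, sizeof(struct node) * MAX_NODES);
    g[3].kind = NODE_HEAP_ALLOC;                      /* malloc in cmalloc */
    g[7].npreds = 1;  g[7].preds[0]  = (struct edge){ 3, 7 };
    g[8].npreds = 1;  g[8].preds[0]  = (struct edge){ 7, -1 };
    g[11].npreds = 1; g[11].preds[0] = (struct edge){ 3, 11 };
    g[12].npreds = 1; g[12].preds[0] = (struct edge){ 11, -1 };
}
```

Starting the traversal at node 8 (the `read` call) reaches the allocation at line 3 through call site 7 only, which corresponds to the context "7, 3"; the allocation made through call site 11 is not labeled.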

MemCat Compilation Pass
● Stack
  ○ rewrite allocations
  ○ SafeStack implementation
● Heap
  ○ split basic blocks at contexts’ return sites to be able to reference them at IR level
  ○ embed context in IR so that it is available at runtime
[Figure: LLVM IR → value flow analysis → value flow graph; AC data source configuration + value flow graph → graph traversal → AC allocation sites → rewrite static allocations, embed dynamic context → categorized IR]

MemCat Runtime
● Read categorized allocation sites from the binary
● Intercept allocators
  ○ site known → serve memory from corresponding heap
  ○ site not known → serve from limbo heap
● Intercept limbo heap writes
  ○ categorize based on data source (code) that is writing
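
The allocator dispatch described above amounts to a lookup keyed by allocation context. The following sketch assumes the categorized sites are available as a flat table of context hashes; the table layout and the names `site_entry` and `select_heap` are hypothetical and chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum heap_kind { HEAP_NAC, HEAP_AC, HEAP_MIXED, HEAP_LIMBO };

/* One categorized allocation site, identified by its context hash. */
struct site_entry {
    uint64_t context_hash;
    enum heap_kind heap;
};

/* Known site: return its heap. Unknown site: fall back to the limbo heap. */
enum heap_kind select_heap(const struct site_entry *table, size_t n,
                           uint64_t context_hash)
{
    assert(table != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (table[i].context_hash == context_hash)
            return table[i].heap;
    }
    return HEAP_LIMBO;
}
```

A linear scan keeps the sketch short; a sorted table or hash map would be the obvious choice once the number of sites grows into the thousands.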

MemCat Runtime
● Modified ptmalloc2
  ○ providing three arena pools
  ○ hardened allocator based on mmap + guard pages, mitigates
    ■ uninitialized data leaks
    ■ linear buffer overflows
    ■ double free
● Identifying context
  ○ stack unwinding, depth configurable
  ○ 8-byte context hash for fast matching
  ○ categorization cached across runs on disk
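
An 8-byte context hash with configurable depth could look like the sketch below. The slides do not say which hash function is used, so 64-bit FNV-1a is an assumption made here for illustration, as is the function name `context_hash`. The return addresses are passed in as an array; on glibc they could come from `backtrace()`, which is also an assumption about how unwinding might be done.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hash the first `max_depth` return addresses of a call stack into
 * 8 bytes using 64-bit FNV-1a (an assumed choice, see lead-in). */
uint64_t context_hash(const uintptr_t *frames, size_t nframes, size_t max_depth)
{
    assert(frames != NULL || nframes == 0);
    uint64_t h = 0xcbf29ce484222325ULL;       /* FNV-1a 64-bit offset basis */
    size_t depth = nframes < max_depth ? nframes : max_depth;

    for (size_t i = 0; i < depth; i++) {
        uintptr_t addr = frames[i];
        /* Feed the address byte by byte, least significant byte first. */
        for (size_t b = 0; b < sizeof addr; b++) {
            h ^= (uint64_t)(addr & 0xff);
            h *= 0x100000001b3ULL;            /* FNV 64-bit prime */
            addr >>= 8;
        }
    }
    return h;
}
```

Limiting the depth trades precision for speed: two call stacks that differ only below the configured depth map to the same hash and therefore to the same categorization.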

MemCat Runtime - Limbo Heap
● Limbo heap
  ○ read-only memory mappings
  ○ trap on access
    ■ remove protection
    ■ re-execute faulting instruction
    ■ categorize
    ■ reprotect
● Categorization termination heuristics
  ○ stop at program termination
  ○ stop after N writes
  ○ stop as soon as all bytes have been written
    ■ special handling of memset and bzero
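
The trap mechanism can be sketched with `mmap`, `mprotect`, and a `SIGSEGV` handler on Linux. This is a deliberately reduced illustration, not MemCat's runtime: it manages a single page, only counts the trap instead of categorizing the writer, and never reprotects, which corresponds to "stop after N writes" with N = 1. The names `limbo_init` and `limbo_trap_count` are hypothetical. Calling `mprotect` from a signal handler works in practice on Linux but is not on the POSIX list of async-signal-safe functions.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static unsigned char *limbo_base;           /* start of the limbo page */
static size_t limbo_len;                    /* its length in bytes */
static volatile sig_atomic_t limbo_traps;   /* number of trapped writes */

/* Fault handler: if the fault hit the limbo page, lift the write
 * protection and return, so the faulting store is re-executed. */
static void limbo_fault(int sig, siginfo_t *info, void *uctx)
{
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t base = (uintptr_t)limbo_base;
    (void)sig;
    (void)uctx;

    if (limbo_base == NULL || addr < base || addr - base >= limbo_len) {
        /* Not a limbo access: restore the default action so the
         * re-executed instruction terminates the process as usual. */
        signal(SIGSEGV, SIG_DFL);
        return;
    }
    limbo_traps++;   /* a real runtime would categorize the writer here */
    mprotect(limbo_base, limbo_len, PROT_READ | PROT_WRITE);
}

/* Map one read-only page and install the handler.
 * Returns the page, or NULL on failure. */
volatile unsigned char *limbo_init(void)
{
    struct sigaction sa;
    long page = sysconf(_SC_PAGESIZE);
    void *p;

    assert(limbo_base == NULL);   /* the sketch supports a single region */
    if (page <= 0)
        return NULL;
    p = mmap(NULL, (size_t)page, PROT_READ,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    limbo_base = p;
    limbo_len = (size_t)page;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = limbo_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return NULL;
    return limbo_base;
}

int limbo_trap_count(void)
{
    return (int)limbo_traps;
}
```

The first store to the returned page faults, is counted, and then succeeds on re-execution; later stores go through without trapping. Reprotecting after each write, as listed on the slide, additionally requires a way to regain control after the single re-executed instruction, which this sketch leaves out.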

MemCat Runtime - Indirect Categorization
● Intercept AC data sources
  ○ keep record of caller and targeted memory region
● Additional check on limbo heap traps:
  ○ if caller in a record is part of the context AND
  ○ memory source matches record THEN
  ○ inherit categorization of the original record
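
One possible reading of the check above, written out as code. The record layout, the stored category field, and the names `source_record` and `inherit_category` are assumptions for illustration; the slide only states that the caller and the targeted memory region are recorded. Here "context" is taken to be the unwound call stack at the time of the limbo trap and "memory source" the address the trapped write copies from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum category { CAT_UNKNOWN, CAT_NAC, CAT_AC, CAT_MIXED };

/* Record kept when an AC data source is intercepted. */
struct source_record {
    uintptr_t caller;    /* code address that invoked the data source */
    uintptr_t region;    /* start of the memory region it filled */
    size_t len;          /* length of that region in bytes */
    enum category cat;   /* categorization of the record (assumed field) */
};

/* On a limbo trap: if the source address lies in a recorded region and
 * the record's caller appears in the current context, inherit the
 * record's categorization; otherwise report CAT_UNKNOWN. */
enum category inherit_category(const struct source_record *recs, size_t nrecs,
                               const uintptr_t *context, size_t depth,
                               uintptr_t src)
{
    assert(recs != NULL || nrecs == 0);
    for (size_t r = 0; r < nrecs; r++) {
        int in_region = src >= recs[r].region &&
                        src - recs[r].region < recs[r].len;
        if (!in_region)
            continue;
        for (size_t i = 0; i < depth; i++) {
            if (context[i] == recs[r].caller)
                return recs[r].cat;
        }
    }
    return CAT_UNKNOWN;
}
```

Both conditions must hold: a matching region alone or a matching caller alone leaves the allocation uncategorized.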

Evaluation - Use Cases
Vulnerability   Type              Program    Categorization
CVE-2012-0920   use-after-free    Dropbear   AC
CVE-2014-0160   buffer overread   OpenSSL    mixed
CVE-2016-6309   use-after-free    OpenSSL    AC
CVE-2016-3189   use-after-free    bzip2      AC

Evaluation - Dropbear
● Small SSH server, part of busybox
● CVE-2012-0920
  ○ use-after-free
  ○ allows for RCE by removing limitation on char *forced_command
● MemCat
  ○ configured to consider read() from network as AC
  ○ categorizes 4 allocation sites connected to read_packet() as AC at compile time
  ○ 3 allocation sites categorized at runtime as mixed
  ○ mitigates vulnerability because forced_command allocation resides on nAC heap

Evaluation - OpenSSL
● CLI tool in server mode, perform TLS 1.2 handshake
  ○ performs all relevant operations (key agreement, hashing and (asymmetric) encryption, record parsing and I/O handling)
● MemCat compile time
  ○ 22 data sources providing AC input
  ○ Stack: 551 out of 3648 allocations AC
  ○ Heap: 1724 allocation sites AC
● MemCat runtime
  ○ categorization
    ■ 1st handshake: 1967 limbo, 5 AC, 38 mixed
    ■ 2nd handshake: 4 limbo, 5 AC, 39 mixed
  ○ 2.3% performance overhead on 2nd handshake

Evaluation - OpenSSL
● CVE-2016-6309 use-after-free
  ○ reallocation of the message-receive buffer leaves dangling pointers
  ○ allocation is AC → UAF limited to AC heap data (or entirely prevented)
● CVE-2014-0160 buffer overread (Heartbleed)
  ○ receive buffer is on AC heap → limited to AC (or entirely prevented)