
Weird machines: a model for code-reuse attacks (Sergey Bratus)


  1. Weird machines: a model for code-reuse attacks. Sergey Bratus, Rebecca Shapiro, Anna Shubina. Dartmouth Trust Lab.

  2. Outline. Code re-use: unexpected computation, programming models. Containing computation: coarse intent-based ABI-level semantics / region-describing types. LangSec: co-design of data & code, via constrained input handlers & input languages.

  3. Terminology. "Code {re,ab}use" is unexpected computation. Classes of attacks are more: they are unexpected programming models. Essence of code reuse: code becomes part of an emergent programming model.

  4. Input data is the program. Strings are programs for regexps (DFAs). Tape is the program for Turing machines. "Everything is an interpreter" (Greg Morrisett). "Any complex enough input is indistinguishable from bytecode."
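
To make "strings are programs for DFAs" concrete, here is a minimal sketch of a toy automaton. The transition table, state names, and the function `run_dfa` are illustrative assumptions, not taken from the slides. The table plays the role of the machine; the input string is the program that drives it.

```python
# Toy DFA (hypothetical example): accepts binary strings with an even number of 1s.
# The transition table is the fixed "machine"; the input string is its "program".
TRANSITIONS = {
    ("even", "0"): "even",
    ("even", "1"): "odd",
    ("odd", "0"): "odd",
    ("odd", "1"): "even",
}


def run_dfa(program, start="even", accepting=("even",)):
    """Feed each input symbol to the machine; accept if we end in an accepting state."""
    state = start
    for symbol in program:
        key = (state, symbol)
        if key not in TRANSITIONS:
            return False  # symbol outside the alphabet: reject
        state = TRANSITIONS[key]
    return state in accepting
```

For example, `run_dfa("1010")` walks even, odd, odd, even, even and accepts, while `run_dfa("1")` ends in the `odd` state and rejects.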

  5. Invisible machines: stack. Standard function prologues & epilogues are an automaton distributed through code. Data fragments on the stack are its programs. It implements the control flow graph. Aleph1 > Solar Designer > Newsham > gera > Nergal > ... Return-oriented Programming.
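
The idea that stack data is a program for code fragments already present can be sketched with a toy interpreter. Everything here is an assumption made for illustration: the "addresses" are made-up integers, the gadgets are Python functions rather than real instruction sequences, and `run_chain` is a hypothetical name. It is not a model of any real ABI.

```python
# Toy model only: "addresses" are made-up integers, gadgets are Python functions.
def g_load(state, stack):
    state["acc"] = stack.pop()      # stands in for `pop reg; ret`


def g_add(state, stack):
    state["acc"] += stack.pop()     # stands in for `pop tmp; add reg, tmp; ret`


def g_double(state, stack):
    state["acc"] *= 2               # stands in for `add reg, reg; ret`


GADGETS = {0x10: g_load, 0x20: g_add, 0x30: g_double}


def run_chain(chain):
    """Interpret `chain` (listed in execution order) as attacker-supplied stack contents."""
    stack = list(reversed(chain))   # top of the stack is the end of the list
    state = {"acc": 0}
    while stack:
        addr = stack.pop()          # the "ret": the next address comes from data
        gadget = GADGETS.get(addr)
        if gadget is None:
            break                   # unknown address: the toy machine halts
        gadget(state, stack)
    return state["acc"]
```

The chain `[0x10, 5, 0x20, 7, 0x30]` loads 5, adds 7, and doubles, so `run_chain` returns 24. None of the gadgets was written to compute `(5 + 7) * 2`; the data on the stack composes them.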

  6. Invisible machines: heap. Heap management code is a machine, heap metadata its programs. "Once upon a free" (Phrack 57:8), "Vudo malloc tricks" (Phrack 58:9). ISA: aa4bmo, chunk->flink->blink = chunk->blink. Configured via a series of mallocs: "Heap Feng Shui" (Sotirov 2007), ..., starvation-based machines (Gorenc et al., Recon.cx 2015).

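The unlink "instruction" from the heap slide, chunk->flink->blink = chunk->blink, can be simulated on a flat array of words. This is a sketch under stated assumptions: the two-word chunk layout, the offsets `FLINK` and `BLINK`, and the function names are invented for the example and do not match any real allocator.

```python
FLINK, BLINK = 0, 1  # assumed word offsets of the link fields inside a chunk


def unlink(mem, chunk):
    """Remove `chunk` from a doubly linked list without validating its links."""
    flink = mem[chunk + FLINK]
    blink = mem[chunk + BLINK]
    mem[flink + BLINK] = blink   # chunk->flink->blink = chunk->blink
    mem[blink + FLINK] = flink   # chunk->blink->flink = chunk->flink


def demo():
    mem = [0] * 32                       # toy flat memory, one int per word
    chunk, target, value = 4, 20, 10
    mem[chunk + FLINK] = target - BLINK  # corrupted metadata: "forward" link
    mem[chunk + BLINK] = value           # corrupted metadata: "backward" link
    unlink(mem, chunk)
    return mem, target, value
```

After `demo()`, `mem[20]` holds 10 (the chosen value lands at the chosen target) and `mem[10]` holds 19 (the mirrored write). The second write is why the value itself must be a writable location in this toy, which is one way to read the "almost" in aa4bmo.
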
  7. Invisible machines: signals. Sigreturn-oriented programming (Bosman & Bos, 2014): "portable shellcode" via sigreturn structs. Counterfeit object-oriented programming (COOP, 2015). "Interrupt-oriented programming" (Tan et al., 2014): "bugdoor" via nesting MSP430 interrupts; fixed-entry, timed-exit "un-gadgets".

  8. Symbol-related machines. Dynamic linker (cf. Nergal's RTLD gadget). Ld.so relocation (Shapiro et al., 2013; cf. LOCREATE): ELF relocation entries are Turing-complete "bytecode". DWARF exception handler (helpfully a part of most processes) is Turing-complete (Oakley et al., 2012). Difference between execve() & ld.so: "All you need is GOT" (Bangert et al., 29c3).

  9. The weirdest machine (possibly). The x86 MMU is Turing-complete on GDT+IDT+TSS+Page Tables (Bangert et al., 2013). Arbitrary computation can be compiled into a combination of these tables. No instruction is successfully dispatched. #PF & #DF alternate, acting as clock cycles.

  10. The "weird machine" upshot. Code re-use/code abuse is possible whenever (meta)data guides code into actions. Code re-use likely has an emergent programming model associated with it (a WM). The data that drives it need not be ill-formed or corrupt memory.

  11. A verification problem

  12. Ab Ovo. Proving correctness from axioms, by deductive construction. Cf. construction of types ~ proofs ~ programs.

  13. P { Q } R: Precondition { Code } Result.

  14. The root of weirdness? Assume P { Q } R holds. If P' is not quite right, what will P' do under Q? [Diagram label: P.]

  15. The root of weirdness? What can we make a "correct" Q compute by varying P in ways it wasn't verified for? What is "∆R" given "∆P" for a Q? [Diagram labels: P, ∆P, ∆R.]
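
A small sketch of the P { Q } R question: take code Q that is correct under its precondition P, then enumerate what it computes for inputs outside P. The function names (`q_lookup`, `p_holds`, `delta_r`) and the choice of a table lookup are assumptions made for this example.

```python
def q_lookup(table, i):
    """Q: the code. Assumed verified only under P: 0 <= i < len(table)."""
    return table[i]


def p_holds(table, i):
    """P: the precondition Q was verified for."""
    return 0 <= i < len(table)


def delta_r(table, indices):
    """Collect what Q computes (the 'delta R') for inputs that violate P."""
    outside = {}
    for i in indices:
        if not p_holds(table, i):
            try:
                outside[i] = q_lookup(table, i)
            except IndexError:
                outside[i] = "IndexError"
    return outside
```

With `table = ["a", "b", "c"]`, the indices -3, -2, and -1 violate P yet Q still returns "a", "b", and "c": well-defined behavior that the verification of P { Q } R says nothing about. Indices -4, 3, and 4 fail loudly instead.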

  16. Proof-carrying code FTW? "Weird machines in PCC", Vanegue @ 1st IEEE LangSec S&P Workshop, 2014. PCC doesn't capture additional instructions a machine may execute ("divergent machines"). Proof-carrying code can execute untrusted computations not captured by proofs.

  17. A hypothesis. We need "differential computability": a way to reason easily about "∆R" given "∆P" for a Q. We program not with statements {Q} but, implicitly, with tuples P {Q}, yet we rarely capture P explicitly. Hence bugs & WMs.

  18. Unforeseen preconditions. The "correct" P is rarely obvious, e.g. "well formed" =/=> safe (ELF, MMU). Parser differentials ("master key", X.509). P is influenced by opinion & idea/model of a system. P can't reflect not-yet-discovered threats or state. P may be dependent on composition effects!
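
A parser differential arises when two components read the same bytes and reach different conclusions. The sketch below is a made-up illustration: the `key=value;` format and both parsers are hypothetical and do not reproduce any specific bug named on the slide. One parser keeps the first occurrence of a duplicated key, the other keeps the last.

```python
def _fields(blob):
    """Split 'k=v;k=v' into (key, value) pairs, skipping empty fields."""
    for field in blob.split(";"):
        if field:
            key, _, value = field.partition("=")
            yield key, value


def parse_first_wins(blob):
    out = {}
    for key, value in _fields(blob):
        out.setdefault(key, value)   # keep the first value seen for a key
    return out


def parse_last_wins(blob):
    out = {}
    for key, value in _fields(blob):
        out[key] = value             # later duplicates overwrite earlier ones
    return out


def differential(blob):
    """Return keys on which the two parsers disagree, with both readings."""
    a, b = parse_first_wins(blob), parse_last_wins(blob)
    return {k: (a.get(k), b.get(k)) for k in a.keys() | b.keys() if a.get(k) != b.get(k)}
```

If a checker uses one parser and a consumer uses the other, the input `"name=app;payload=signed;payload=evil"` is "well formed" for both yet means different things to each: `differential` reports `{"payload": ("signed", "evil")}`.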

  19. Constraining Q. If Q is sufficiently "constrained", P doesn't have to be so large. E.g.: P is "input is a formal language of class X". Question: how can we usefully characterize the power of Q, beyond the Chomsky hierarchy of recognizers? [Diagram labels: Languages, Acceptors.]

  20. Coarse types for code & data intents. Control flow enforcement (not quite CFI :) ). ELFbac: sections are types (with very coarse semantics by data access & flow). "Gostak semantics" ("The Gostak distims the doshes"). Dependent typing to enforce intended use of data. Range dependencies, intent by range.

  21. Beyond address ranges. A code section's intended accesses are its type: "You are what you work with/operate on."

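  22. Beyond address ranges. A code section's intended accesses are its type: "You are what you work with/operate on." [Figure: access diagram between code sections and data regions. Code labels as extracted, line breaks lost: "SSL app SSL libpng initialization logic". Access labels: RW, R, R, RW, W, RW. Data labels: input buffer, output buffer, SSL keys.]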
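
The notion that a code section's intended accesses are its type can be sketched as a lookup table from (code section, data region) pairs to permitted accesses. This is only an illustration: the section names and permissions below are hypothetical, they are not read off the slide's figure, and this is not ELFbac's policy format or API.

```python
# Hypothetical policy: names and permissions are illustrative assumptions only.
POLICY = {
    ("parser_code", "input_buffer"): {"r"},
    ("parser_code", "output_buffer"): {"r", "w"},
    ("crypto_code", "key_material"): {"r"},
    ("crypto_code", "output_buffer"): {"r", "w"},
}


def allowed(code_section, data_region, access):
    """Default-deny check: an access is legal only if the policy lists it."""
    return access in POLICY.get((code_section, data_region), set())
```

Under this toy policy, `allowed("parser_code", "key_material", "r")` is false: a parser has no declared intent to touch key material, so the access is rejected regardless of whether the address is otherwise mapped and readable.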

  23. LangSec approach to input. Since all input data are programs driving the code, construct input handling as verifiable recognizer automata. This requires regular or context-free languages to avoid undecidability (e.g., in verifying parser equivalence). Verifying input handlers: big payoff, but underused? Not all bugs are parser bugs, but the latest biggest ones sure were! (Heartbleed, GnuTLS Hello, BERserk, ...)
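
A minimal sketch of "recognize fully, then process": the whole input is checked against a small regular language before any handler logic runs. The message format (`GET` or `PUT` followed by a bounded lowercase name) and the function names are assumptions invented for this example.

```python
import re

# Assumed toy input language (regular): "<VERB> <name>" with a bounded name.
MESSAGE = re.compile(r"(GET|PUT) ([a-z]{1,16})")


def recognize(raw):
    """Return (verb, name) if `raw` is in the language, else None."""
    match = MESSAGE.fullmatch(raw)   # the entire input must match, not a prefix
    return (match.group(1), match.group(2)) if match else None


def handle(raw):
    """Handler logic only ever sees input the recognizer has accepted."""
    parsed = recognize(raw)
    if parsed is None:
        return "rejected"
    verb, name = parsed
    return f"{verb.lower()}:{name}"
```

Here `handle("GET key")` returns `"get:key"`, while inputs with a trailing newline, an unknown verb, or an over-long name are rejected before the handler touches them. The design choice is that the handler never has to cope with input outside the declared language.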

  24. More weird machines?

  25. Code as a "contour/circuit" with a characteristic "frequency response"? How does code react to periodically injected failure? Systems: resource starvation WMs. Networks: packet loss and/or delay. What new behavior patterns can be produced? Protocol implementations exposed to induced periodic packet loss/delay.
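
Periodic failure injection can be sketched against a toy stop-and-wait sender. The sender, the drop predicate, and the `max_tries` cap are assumptions made for illustration; this does not model OpenVPN or any real protocol stack.

```python
def periodic_drop(period):
    """Return a predicate that drops every `period`-th transmission attempt."""
    count = 0

    def should_drop():
        nonlocal count
        count += 1
        return count % period == 0

    return should_drop


def send_all(n_packets, should_drop, max_tries=10):
    """Toy stop-and-wait sender: retransmit each packet until it gets through."""
    tries_per_packet = []
    for _ in range(n_packets):
        tries = 0
        while tries < max_tries:
            tries += 1
            if not should_drop():
                break                # delivered on this attempt
        tries_per_packet.append(tries)
    return tries_per_packet
```

Varying `period` changes the retransmission pattern the sender exhibits: with `period=3` and six packets, the attempt counts come out as `[1, 1, 2, 1, 2, 1]`. Plotting such patterns against the injection period is one crude way to look for a "frequency response".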

  26. Periodic packet drop vs OpenVPN. [Plots, one per cipher: Blowfish-CBC, AES256-CBC, DES-CBC, RC2-CBC.]

  27. Thank you. IEEE Language-theoretic Security Workshop (LangSec SPW), co-located with IEEE S&P Symposium (San Jose). http://spw14.langsec.org http://spw15.langsec.org
