The Geometry of Innocent Flesh on the Bone: Return-into-libc without Function Calls (on the x86) Hovav Shacham CCS ’07 hovav@cs.ucsd.edu
Technical Background ● Gadget: a short instruction sequence (e.g. pop %edx; ret) ● Return-Oriented Programming (ROP): a technique by which an attacker can induce arbitrary behavior in a program whose control flow he has diverted, without injecting any code ● Return-into-libc: a buffer overflow attack that causes the vulnerable program to jump to existing code already loaded in memory
Background: Attacker Model ● Must find some way to subvert the program’s control flow ○ Overwriting a return address on the stack ● Must cause the program to act in the manner of his choosing ○ Injecting code into the process image
Background: Defenses ● Non-executable stack ● W⊕X: marks all writable (“W”) locations in a process’s address space as non-executable (“X”) ● Deployment: Linux (via PaX patches); OpenBSD; Windows; OS X ● Hardware support: Intel “XD” bit, AMD “NX” bit
Background: Existing Problem ● Return-into-libc is considered a limited attack ○ the attacker can execute only straight-line code ■ calling one libc function after another ○ the attacker can be restricted ■ removing certain functions from libc
Building Blocks: Traditional vs New return-into-libc ● Traditional return-into-libc building blocks are functions ○ can be removed by the maintainers of libc ● Our building blocks are short code sequences ○ very difficult to eliminate
Building Blocks We rely on the following: ● x86 instructions are not aligned ○ we can make out more words on the page ● x86 ISA is extremely dense ○ a random byte stream can be interpreted as a series of valid instructions with high probability Our goal: ● Find sequences that end in a return instruction (c3) ○ libc contains many such sequences
Building Blocks: Example
f7 c7 07 00 00 00    test $0x00000007, %edi
0f 95 45 c3          setnzb -61(%ebp)
Starting one byte later:
c7 07 00 00 00 0f    movl $0x0f000000, (%edi)
95                   xchg %ebp, %eax
45                   inc %ebp
c3                   ret
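The byte-offset effect in the example can be explored with a short sketch. It only locates c3 bytes and lists the byte windows that end at them; actually decoding each window would need an x86 disassembler, which is not included here. The function names `ret_offsets` and `candidate_windows` are hypothetical, chosen for illustration.

```python
# Bytes of the two instructions from the example:
#   test $0x00000007, %edi ; setnzb -61(%ebp)
code = bytes.fromhex("f7c7070000000f9545c3")


def ret_offsets(buf: bytes) -> list:
    """Return every offset holding the byte 0xc3, whether or not the
    compiler meant it to be a ret instruction."""
    return [i for i, b in enumerate(buf) if b == 0xC3]


def candidate_windows(buf: bytes, ret_at: int, max_back: int) -> list:
    """Byte strings that end at the c3 byte and start 0..max_back bytes
    before it. Each window is a candidate starting point for decoding."""
    earliest = max(0, ret_at - max_back)
    return [buf[start:ret_at + 1] for start in range(ret_at, earliest - 1, -1)]


offsets = ret_offsets(code)
windows = candidate_windows(code, offsets[0], 8)
for w in windows:
    print(w.hex(" "))
```

The longest window printed starts one byte into the `test` instruction, which is the alternative decoding shown on the slide.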
Building Blocks: Defense ● Apply an instruction alignment scheme (like the one MIPS uses) Cons: ● Code compiled for this scheme cannot call libraries not compiled for it ● May introduce slowdowns
New return-into-libc Techniques ● Short code sequences ● Calls no functions at all ● The code sequences have arbitrary interfaces, unlike functions, whose call interface is standard ● The code sequences used weren’t placed in libc by its authors, so they are not easily removed
Finding Sequences: Useful Instruction Sequences ● Valid instruction sequences that: ○ could be used in our gadgets ○ end in a ret instruction ● None of the instructions should cause the processor to transfer execution away (not reaching the ret) ● ret causes the processor to continue to the next step
Finding Sequences: Recording our Findings ● Any suffix of an instruction sequence is also a useful instruction sequence, e.g. if we find "a; b; c; ret" then "b; c; ret" must also exist ● We care whether a sequence occurs, not how often it does ● We chose to record sequences in a trie ○ the root of the trie is the ret
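A minimal sketch of such a trie, assuming instructions are represented as plain strings. The class and function names (`TrieNode`, `insert_sequence`, `contains`, `count_nodes`) are hypothetical. Because the root is the ret and each sequence is inserted back to front, every suffix of a stored sequence is automatically present, and inserting the same sequence twice adds nothing.

```python
class TrieNode:
    """One instruction; children are the instructions that can precede it."""

    def __init__(self, insn):
        self.insn = insn
        self.children = {}


def insert_sequence(root, insns):
    """Record a sequence given in program order, without the final ret.
    For "a; b; c; ret" pass ["a", "b", "c"]."""
    node = root
    for insn in reversed(insns):  # walk backwards from the ret
        node = node.children.setdefault(insn, TrieNode(insn))
    return node


def contains(root, insns):
    """True if the sequence (program order, without ret) is in the trie."""
    node = root
    for insn in reversed(insns):
        node = node.children.get(insn)
        if node is None:
            return False
    return True


def count_nodes(node):
    return 1 + sum(count_nodes(child) for child in node.children.values())


root = TrieNode("ret")
insert_sequence(root, ["a", "b", "c"])
insert_sequence(root, ["a", "b", "c"])  # duplicate: occurrence count is ignored
```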
Finding Sequences: Producing the Trie The idea: ● Scan backwards from an already found sequence for valid instructions ● Some sequences ending with ret are ignored: ○ leave; ret ○ pop %ebp; ret ○ sequences beginning with a ret or an unconditional jump
Finding Sequences: Implementation
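The backward scan can be sketched as follows. This is an illustration under strong assumptions, not the authors' implementation: the decoder is a toy lookup table covering a handful of real x86 encodings, the trie is a nested dict whose root stands for the ret, and the names `decode`, `is_boring`, `build_from` and `scan` are hypothetical. A real tool would use a full x86 decoder and a much larger maximum instruction length.

```python
MAX_INSN_LEN = 2  # toy limit; real x86 instructions can be far longer

# Toy decoder tables (assumption: only these encodings are recognised).
ONE_BYTE = {
    0x58: "pop %eax", 0x5A: "pop %edx", 0x5D: "pop %ebp",
    0x45: "inc %ebp", 0x95: "xchg %ebp, %eax",
    0xC9: "leave", 0xC3: "ret",
}
TWO_BYTE = {bytes.fromhex("31c0"): "xor %eax, %eax"}


def decode(chunk):
    """Return a mnemonic if chunk is exactly one known instruction, else None."""
    if len(chunk) == 1:
        return ONE_BYTE.get(chunk[0])
    return TWO_BYTE.get(bytes(chunk))


def is_boring(insn, parent_is_ret):
    """Sequences that are recorded but not extended further."""
    if insn == "ret" or insn.startswith("jmp"):  # toy decoder emits no jmp
        return True
    return parent_is_ret and insn in ("leave", "pop %ebp")


def build_from(buf, pos, parent, parent_is_ret):
    """Try every instruction length ending just before pos."""
    for step in range(1, MAX_INSN_LEN + 1):
        if pos - step < 0:
            break
        insn = decode(buf[pos - step:pos])
        if insn is None:
            continue
        child = parent.setdefault(insn, {})  # ensure insn is in the trie
        if not is_boring(insn, parent_is_ret):
            build_from(buf, pos - step, child, False)


def scan(buf):
    """Build the trie of sequences that end at any c3 byte in buf."""
    root = {}  # the root represents the ret itself
    for pos, byte in enumerate(buf):
        if byte == 0xC3:
            build_from(buf, pos, root, True)
    return root


trie = scan(bytes.fromhex("5a58c3"))  # pop %edx; pop %eax; ret
```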
Implementation & Performance Results ● Analyzed 1,189,501 bytes of libc's executable segment ○ yielded a trie with 15,121 nodes ○ took 1.6 sec on a 1.33 GHz PowerPC G4 with 1 GB RAM
Return-Oriented Programming (ROP) ● Stack pointer (%esp) determines which instruction sequence to fetch & execute ● Processor doesn’t automatically increment %esp; but the “ret” at end of each instruction sequence does
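The role of %esp as a program counter can be illustrated with a toy interpreter. Everything here is an assumption made for illustration: the gadget addresses are invented, the gadget bodies are modelled as tuples rather than machine code, and only `pop` and `add` are supported.

```python
# Hypothetical gadget table: address -> operations executed before the ret.
GADGETS = {
    0x1000: [("pop", "edx")],         # pop %edx; ret
    0x2000: [("pop", "eax")],         # pop %eax; ret
    0x3000: [("add", "eax", "edx")],  # add %edx, %eax; ret
}


def run_chain(stack):
    """Execute a chain of gadget addresses and data words laid out on a stack.
    esp is an index into the list; only ret and pop advance it."""
    regs = {"eax": 0, "edx": 0}
    esp = 0
    while esp < len(stack):
        target = stack[esp]  # ret: pop the next address and jump to it
        esp += 1
        body = GADGETS.get(target)
        if body is None:
            break  # not a known gadget: stop the simulation
        for op in body:
            if op[0] == "pop":
                regs[op[1]] = stack[esp]
                esp += 1
            elif op[0] == "add":
                regs[op[1]] = (regs[op[1]] + regs[op[2]]) & 0xFFFFFFFF
    return regs


# Addresses and constants interleaved exactly as they would sit on the stack.
regs = run_chain([0x1000, 40, 0x2000, 2, 0x3000])
```

No step of the loop advances past a gadget on its own; control reaches the next sequence only because each one ends by popping the next address.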
Operations ● Load/Store ○ Loading a Constant ○ Loading from Memory ○ Storing to Memory ● Arithmetic & Logic ○ Add ○ Exclusive OR (XOR) ○ And, Or, Not ○ Shift and Rotate ● Control Flow ○ Unconditional Jump ○ Conditional Jumps ● System Calls
Load a constant into a register
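One way to picture the stack layout for loading a constant with a `pop %edx; ret` sequence is the sketch below. The two addresses are hypothetical placeholders, not real libc offsets, and the helper name `load_constant` is invented for illustration.

```python
import struct

POP_EDX_RET = 0x08041234  # hypothetical address of a "pop %edx; ret" sequence
NEXT_GADGET = 0x08045678  # hypothetical address of whatever should run next


def load_constant(gadget_addr, value, next_addr):
    """Three little-endian 32-bit words: the gadget address, the constant
    that pop will load, and the address the gadget's ret will jump to."""
    return struct.pack("<III", gadget_addr, value, next_addr)


payload = load_constant(POP_EDX_RET, 0xDEADBEEF, NEXT_GADGET)
print(payload.hex(" "))
```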
Add Operation
Control Flow: Unconditional Jump
Control Flow: Conditional Jump Strategy: ● Check whether a value is equal to zero by using neg ○ clears CF if the value is zero, sets CF otherwise ● Produce a word that contains either esp_delta (if the flag is 1) or 0 (if the flag is 0) ○ esp_delta is the amount we’d like to perturb %esp by ● Perturb %esp by the computed amount (esp_delta or 0)
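The arithmetic behind this strategy can be checked with a small sketch. It models only the values involved, not the gadgets that compute them. Turning the 0/1 flag into a mask by two's-complement negation and then ANDing with esp_delta is one way to obtain "esp_delta or 0"; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def neg_carry(value):
    """Carry flag after x86 neg: 0 if the operand is zero, 1 otherwise."""
    return 0 if (value & MASK32) == 0 else 1


def conditional_delta(flag, esp_delta):
    """Map flag 0 -> 0 and flag 1 -> esp_delta.
    Negating 0 gives all-zero bits, negating 1 gives all-one bits."""
    mask = (-flag) & MASK32
    return esp_delta & mask


def perturb_esp(esp, value, esp_delta):
    """New %esp after a conditional jump taken only when value is nonzero."""
    flag = neg_carry(value)
    return (esp + conditional_delta(flag, esp_delta)) & MASK32


print(hex(perturb_esp(0x1000, 0, 0x20)))  # value is zero: no jump
print(hex(perturb_esp(0x1000, 5, 0x20)))  # value is nonzero: jump forward
```

A backward jump works the same way with a negative esp_delta written as a 32-bit two's-complement word.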
System Calls (1/2) ● Syscalls have simple wrappers in libc with the following behavior: ○ move the arguments from the stack to the registers (arguments are loaded into %ebx, %ecx, %edx, %esi, %edi, %ebp, in this order) ○ set the syscall number in %eax ○ trap into the kernel ○ check for errors and translate the return value ● We can invoke any syscall we want by: ○ setting up the parameters ourselves ○ jumping into a wrapper at the point immediately before the lcall
System Calls (2/2)
Return-Oriented Shellcode (1/2) Invokes the execve system call to run a shell. Requirements: ● Set the system call index in %eax to 0xb (execve) ● Set the path of the program to run in %ebx to the string “/bin/sh” ● Set the argument vector argv in %ecx ● Set the environment vector envp in %edx
Return-Oriented Shellcode (2/2) lcall is invoked with arguments: ● %ebx = address of the string “/bin/sh” ● %ecx = addr of argv ● %edx = addr of envp
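The register requirements for execve can be made concrete with a data-layout sketch. The base address is hypothetical, the helper name `build_execve_data` is invented, and the layout (an empty environment that reuses argv's terminating NULL word) is just one simple choice. The sketch also ignores the practical problem of getting NUL bytes into memory through an overflow.

```python
import struct


def build_execve_data(base):
    """Lay out "/bin/sh", argv and envp contiguously starting at `base`.
    Returns the bytes to place in memory and the register values that the
    gadgets must establish before the kernel trap."""
    path = b"/bin/sh\x00"                      # 8 bytes, NUL-terminated
    argv_addr = base + len(path)               # argv = { path, NULL }
    envp_addr = argv_addr + 4                  # envp = { NULL }, shares that NULL
    blob = path + struct.pack("<II", base, 0)
    regs = {
        "eax": 0xB,        # execve system call index
        "ebx": base,       # pointer to "/bin/sh"
        "ecx": argv_addr,  # pointer to argv
        "edx": envp_addr,  # pointer to envp
    }
    return blob, regs


blob, regs = build_execve_data(0x0804A000)  # hypothetical writable address
```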
Catalog of rets: Origin of c3 byte ● Check whether each c3: ○ belongs to a function exported in libc’s SYMTAB section ■ disassemble the function until we discover which instruction includes the c3 ● Out of 975,626 covered bytes, 5,483 are c3 bytes (one in every 178)
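The "one in every 178" figure follows directly from the two counts on the slide, as this short check shows. The variable names are illustrative.

```python
covered_bytes = 975_626  # bytes covered by exported functions
c3_bytes = 5_483         # of those, bytes equal to c3

bytes_per_c3 = covered_bytes / c3_bytes
share = c3_bytes / covered_bytes

print(f"one c3 byte per {bytes_per_c3:.1f} covered bytes ({share:.2%})")
```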
Catalog of rets: avoid spurious rets ● each procedure could have a single exit point (early exits jump to this point) ● %ebx could be avoided as an accumulator for adds ● moves from %eax to %ebx could be avoided (or written using an instruction other than mov) ● instruction placements could be adjusted (avoid offsets containing c3) Drawbacks: ● the compiler would be less transparent and more complicated ● loss of efficiency in the use of registers
Catalog of rets: kinds of returns ● c3 ○ near return ● c2 imm16 ○ near return with stack unwind ● cb ○ far return ● ca imm16 ○ far return with stack unwind ● The last three variants are more difficult to use in the exploits we described
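A small sketch that tags each candidate return byte with its kind and, for the imm16 forms, the little-endian stack-unwind amount. It scans raw bytes with no disassembly, so a byte inside another instruction or inside an immediate can match too. The names `RETURN_OPCODES` and `classify_returns` are hypothetical.

```python
# opcode -> (description, number of immediate bytes that follow)
RETURN_OPCODES = {
    0xC3: ("near return", 0),
    0xC2: ("near return with stack unwind", 2),
    0xCB: ("far return", 0),
    0xCA: ("far return with stack unwind", 2),
}


def classify_returns(buf):
    """List (offset, kind, imm16 or None) for every return opcode byte."""
    found = []
    for i, byte in enumerate(buf):
        entry = RETURN_OPCODES.get(byte)
        if entry is None:
            continue
        kind, imm_len = entry
        if i + imm_len >= len(buf):
            continue  # immediate would run past the end of the buffer
        imm = None
        if imm_len:
            imm = int.from_bytes(buf[i + 1:i + 1 + imm_len], "little")
        found.append((i, kind, imm))
    return found


hits = classify_returns(bytes.fromhex("c3c20800cbca0400"))
for hit in hits:
    print(hit)
```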
Conclusion ● A new way of organising return-into-libc exploits on x86 ● Discovered short instruction sequences ● Showed how to combine such sequences into gadgets
Thank you! Questions? Παπαδόπουλος Παναγιώτης-Ηλίας Παπαδογιαννάκη Ευαγγελία Κλεφτογιώργος Κωνσταντίνος