Binary Stirring: Self-randomizing Instruction Addresses of Legacy x86 Binary Code

University of Crete – Computer Science Department
CS457 – Introduction to Information Systems Security
Binary Stirring: Self-randomizing Instruction Addresses of Legacy x86 Binary Code
Papadaki Eleni 872, Rigakis Nikolaos 2422, Trivyzadakis Zacharias 1466, Zidianaki Ioanna 857

History of attacks
Each attack class is paired with the protection developed against it:
• Directly inject malicious machine code → Write-xor-execute protections (DEP, ExecShield)
• Redirect control flow to dangerous code inside the victim process → ASLR (address space layout randomization)
• Redirect control flow to gadgets inside the binary code (ROP) → IPR (deployment issues), ILR (high performance overhead)

STIR – Self-Transforming Instruction Relocation
• A new technique that gives binary code the ability to self-randomize its instruction addresses each time it is launched.
• Input: binary code without any source code, debug symbols, or relocation information (legacy code).
• Output: a new binary whose basic block addresses are dynamically determined at load time.
• Evaluation on Windows and Linux platforms shows about 1.6% overhead.

STIR – Pros
• Fully transparent: legacy code self-randomizes each time it is launched.
• Easily deployable: apply STIR to a binary and distribute the resulting binary normally.
• Reduced performance overhead: a new static code transformation approach.

Challenges
• Preserving the semantics of computed jumps
• Avoiding randomizing data along with the code
• Disassembly undecidability: static disassemblers rely on heuristics to find the reachable code
• Callbacks: callback pointers are not used as jump targets by any instruction visible to the randomizer
• Position-dependent instructions

Static binary rewriting phase
• Solves code/data interleaving and imperfect disassembly
• Treat all bytes as both data and code
  ◦ Bytes treated as data: keep their initial addresses, in a non-executable section
  ◦ Bytes disassembled into code blocks: any data bytes among them become unreachable code

Load-time phase
• Random stirring of the code-only section by a trusted library statically linked into the new binary.
• This library's initializer code always runs before the target code it stirs.
• Stale pointers: some code pointers continue to point into the data-only segment.
• The static phase translates all computed jump instructions into a short alternative sequence that dynamically detects old pointers and re-points them to new addresses at runtime.
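The stale-pointer check described on this slide can be sketched as a toy simulation. This is only an illustration under stated assumptions: the tag byte value `0xF4`, the `memory` dictionary standing in for the process image, and the helper names `make_tagged_pointer` and `resolve_target` are all hypothetical and are not taken from the slides.

```python
import struct

# Assumed tag byte that marks an overwritten (old) jump target.
# The slides only say the pointer is "tagged"; this value is a placeholder.
TAG = 0xF4


def make_tagged_pointer(new_addr):
    """Encode a 5-byte tagged pointer: 1 tag byte + 4-byte little-endian address."""
    return bytes([TAG]) + struct.pack("<I", new_addr)


def resolve_target(memory, target):
    """Return the address a computed jump should actually transfer to.

    `memory` maps an address to the bytes stored there (a toy process image).
    If the bytes at `target` start with the tag, the pointer is stale: the
    next 4 bytes hold the new address. Otherwise the target is used as-is.
    """
    data = memory.get(target, b"")
    if len(data) >= 5 and data[0] == TAG:
        return struct.unpack("<I", data[1:5])[0]
    return target


# Old (data-only) section: the original target now holds a tagged pointer.
old_section = {0x401000: make_tagged_pointer(0x7A3000)}
# New (code-only) section: arbitrary code bytes at the stirred address.
new_section = {0x7A3000: b"\x55\x89\xe5"}
memory = {**old_section, **new_section}

print(hex(resolve_target(memory, 0x401000)))  # stale pointer is redirected
print(hex(resolve_target(memory, 0x7A3000)))  # fresh pointer is left alone
```

In the rewritten binary this check would be a few inline machine instructions before each computed jump rather than a function call; the Python version only shows the decision logic.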

The architecture of STIR
• Three main components:
  ◦ a conservative disassembler,
  ◦ a lookup table generator, and
  ◦ a load-time reassembler

Disassembler target
• Takes a target binary and transforms it into a randomizable representation.
• An address map of the randomizable representation is encoded into the new binary by the lookup table generator.

Static Rewriting Phase
• Target binaries are first disassembled to assembly code.
• The disassembler interprets all bytes that constitute valid instruction encodings as code.
• The assembly code is partitioned into basic blocks, which can be any contiguous sequence of instructions.
• Once the new code section has been generated, the lookup table generator overwrites all potential computed jump targets in the original code.
• Since each module loads into the virtual address space, it is not possible to place old code within a single virtual address range.

Load-time stirring phase
• The STIR library's initializer code runs when the rewritten program is launched.
• The lookup table in the linking module's section is updated.
• The library that implements stirring is loaded dynamically into the address space at library initialization.
• It is unloaded before the stirred binary runs.
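A minimal sketch of the stirring step, assuming basic blocks are given as `(old_address, size)` pairs and the stirred code is packed contiguously from a base address. The function name `stir`, the block list, and the base address are illustrative assumptions. A real reassembler would also have to fix up direct branches between blocks, which this sketch omits.

```python
import random


def stir(blocks, base, rng):
    """Place basic blocks in random order and return the old->new address map.

    blocks: list of (old_addr, size_in_bytes) pairs (hypothetical input format).
    base:   address where the stirred code-only section starts.
    rng:    a random.Random instance, so callers control the randomness source.
    """
    order = list(blocks)
    rng.shuffle(order)            # a fresh random layout on every launch
    table = {}
    cursor = base
    for old_addr, size in order:
        table[old_addr] = cursor  # lookup-table entry: old address -> new address
        cursor += size
    return table


blocks = [(0x1000, 12), (0x100C, 7), (0x1013, 20), (0x1027, 5)]
table = stir(blocks, base=0x500000, rng=random.Random())

for old_addr, _ in blocks:
    print(hex(old_addr), "->", hex(table[old_addr]))
```

Running the script twice will normally print different mappings, which is the property the load-time phase relies on.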

Special Cases
• Callbacks
• Position Independent Code
• Statically Computed Returns
• Short Functions

Special Cases: Callbacks
• A callback occurs when the OS uses a code pointer previously passed from the program as a computed jump destination.
• Unlike typical computed jumps, callback pointers are not used as jump targets by any instruction visible to the randomizer.
• The only instructions that use them as jump targets are within the OS.

Special Cases: Short Functions
• Our jump table implementation overwrites each computed jump target with a 5-byte tagged pointer.
• This design assumes that nearby computed jump targets are at least 5 bytes apart; otherwise the two pointers would overlap.
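The 5-byte spacing assumption can be checked mechanically. The sketch below assumes the rewriter already has a list of candidate computed jump targets; the function name `overlapping_targets` and the sample addresses are hypothetical.

```python
POINTER_SIZE = 5  # size of one tagged pointer, as stated on the slide


def overlapping_targets(targets):
    """Return adjacent target pairs that are closer than POINTER_SIZE bytes.

    Each returned pair (a, b) means the tagged pointer written at `a`
    would spill into the bytes reserved for the pointer at `b`.
    """
    ordered = sorted(set(targets))
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a < POINTER_SIZE]


conflicts = overlapping_targets([0x1000, 0x1003, 0x1010])
print([(hex(a), hex(b)) for a, b in conflicts])
```

An empty result means every target has room for its own tagged pointer; any pair that is reported needs the special handling this slide refers to.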

EMPIRICAL EVALUATION
• Effectiveness
• Performance Overhead

EMPIRICAL EVALUATION: Effectiveness
• Rewriting Time and Space Overheads
• Gadget Elimination

EMPIRICAL EVALUATION: Effectiveness - Gadget Elimination
[Chart: % of gadgets eliminated; axis from 99.92% to 100.00%]

EMPIRICAL EVALUATION: Performance Overhead
• Windows Runtime Overhead
• Linux Runtime Overhead

Windows Runtime Overhead
[Chart: SPEC2000 Windows runtime overhead; axis from -10% to 20%; benchmarks: gzip, vpr, mcf, parser, gap, bzip2, twolf, mesa, art, equake]

Linux Runtime Overhead
[Chart: Linux runtime overhead; axis from -15% to 5%; programs: base64, cat, cksum, comm, cp, expand, factor, fold, head, join, ls, md5sum, nl, od, paste, sha1sum, sha224sum, sha256sum, sha384sum, sha512sum, shred, shuf, unexpand, wc]

Entropy Discussion
• ASLR
  ◦ $2^{n-1}$ probes, where $n$ is the number of bits of randomness
• STIR
  ◦ $\frac{(2^n)!}{2\,(2^n - g)!}$ probes, where $g$ is the number of gadgets in the payload
  ◦ The attacker must guess where each gadget is with each probe.
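To get a feel for the gap between the two probe counts, they can be evaluated directly. This sketch reads the slide's two expressions as 2^(n-1) for ASLR and (2^n)! / (2 (2^n - g)!) for STIR; the function names and the sample values of n and g are illustrative assumptions, not figures from the evaluation.

```python
import math


def aslr_probes(n):
    """Probes against ASLR with n bits of randomness: 2^(n-1)."""
    return 2 ** (n - 1)


def stir_probes(n, g):
    """Probes against STIR for a payload of g gadgets: (2^n)! / (2 * (2^n - g)!).

    (2^n)! / (2^n - g)! is the number of ordered choices of g distinct
    positions out of 2^n, which math.perm computes without huge factorials.
    """
    return math.perm(2 ** n, g) // 2


n = 16  # sample entropy in bits (assumed for illustration)
for g in (1, 2, 3):
    print(g, aslr_probes(n), stir_probes(n, g))
```

With a single gadget the two expressions coincide, and each additional gadget multiplies the STIR count by roughly another factor of 2^n.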

Conclusion
• First static rewriter to protect against ROP attacks
  ◦ Greatly increases search space
  ◦ Introduces no deployment issues
  ◦ Tested on 100+ Windows and Linux binaries
  ◦ 99.99% gadget reduction on average
  ◦ 1.6% overhead on average
  ◦ 37% process size increase on average
