The S2E Platform
Finding vulnerabilities in Linux and Windows programs
Vitaly Chipounov
https://s2e.systems
Ready-for-use Docker image, demos, tutorials, source code, documentation

Vitaly Chipounov
• Ph.D. in 2014 at EPFL, Switzerland
  DSLAB, George Candea
• S2E started back in 2008 when we wanted to reverse engineer Windows device drivers using symbolic execution
• Realized that we could also use this to find bugs in drivers, do performance analysis, etc. => created the S2E platform
• Co-founder and chief architect at Cyberhaven
• Finalists in the Cyber Grand Challenge

The World's First All-Machine Hacking Tournament
• Evaluate software for vulnerabilities (Attack)
• Defend software against attacks (Defend)
• Keep software running and available (Availability)
Two teams used S2E: Team CodeJitsu, Team Disekt
Cyberhaven, UC Berkeley, Syracuse University

Tutorial Overview (Part 1)
Finding vulnerabilities in user-space apps with S2E
• Overview of S2E and symbolic execution
• Getting started with S2E
• Design and Implementation
• Automated generation of proofs of vulnerability

Tutorial Overview (Part 2)
Testing Windows device drivers with S2E
FAT12/16/32 driver from the WDK (fastfat.sys)
• Dealing with path explosion
  >12 KLOC driver, large input files
• Parallel symbolic execution, concolic execution, fuzzer integration, symbolic fault injection, function models, etc.

S2E is a platform for in-vivo multi-path analysis of software systems
• Platform: extensible, write your own tools
• In-vivo: on real OSes, with real apps, libraries, drivers
• Multi-path: symbolic execution, concolic execution, state merging, fuzzing, …
• Analysis: bug finding, verification, testing, security checking, …
• Software systems: pretty much anything that runs on computers

Dynamic Symbolic Execution
void main(int argc, char **argv) {
    int r = 1, i = 1;
    if (i < argc) {
        if (argv[i][0] == 'n') {
            r = 0;
            ++i;
        }
    }
    …
}
30 GB disk 4 GB RAM?

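The example above has three feasible paths, and a symbolic executor would aim to produce one concrete input per path. The sketch below is not S2E code: it simply re-runs the slide's branch logic on concrete inputs and reports which path was taken. The names `run_example`, `struct result`, and the `PATH_*` identifiers are hypothetical, introduced only for illustration.

```c
#include <assert.h>

/* Hypothetical identifiers for the three feasible paths of the slide's main(). */
enum path {
    PATH_NO_ARG = 0,    /* i < argc is false                     */
    PATH_ARG_NOT_N = 1, /* i < argc, but argv[i][0] != 'n'       */
    PATH_ARG_N = 2      /* i < argc and argv[i][0] == 'n'        */
};

struct result {
    enum path path; /* which path the concrete input followed */
    int r;          /* final value of r                        */
    int i;          /* final value of i                        */
};

/* Same branch structure as the slide, with the taken path recorded. */
static struct result run_example(int argc, char **argv)
{
    struct result res = { PATH_NO_ARG, 1, 1 };

    assert(argc >= 1 && argv != 0);
    if (res.i < argc) {
        if (argv[res.i][0] == 'n') {
            res.r = 0;
            ++res.i;
            res.path = PATH_ARG_N;
        } else {
            res.path = PATH_ARG_NOT_N;
        }
    }
    return res;
}
```

Calling `run_example` with no argument, with an argument such as "x", and with the argument "n" exercises the three paths in turn. A symbolic executor derives such inputs automatically by solving the branch conditions instead of requiring them to be written by hand.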
#include <stdio.h>

int main(int argc, char **argv)
{
    FILE *fp = fopen(argv[1], "rb");
    if (!fp) {
        goto err;
    }

    int data;
    if (fread(&data, sizeof(data), 1, fp) != 1) {
        goto err;
    }

    if (data == 0xdeadbeef) {
        printf("Hello world\n");
    } else {
        *(char *)0xbadcafe = 0;
    }

err:
    if (fp) {
        fclose(fp);
    }

    return 0;
}

Quick Start
ssh -CX userXX@issisp.anu.edu.au
$ wget -O crash.c https://pastebin.com/raw/B1BbapMV
$ gcc -g -O0 -m32 -o ./crash ./crash.c
$ source s2e/s2e_activate
$ s2e new_project ./crash @@
$ cd s2e/projects/crash
$ ./launch-s2e.sh
…

Analysis Output
$ ls -1 s2e/projects/crash/s2e-last/
assembly.ll
debug.txt
ExecutionTracer.dat
info.txt
module.bc
run.stats
tbcoverage-0.json
tbcoverage-1.json
testcase-crash:0x50f:0x804858a-0-_tmp_input
testcase-kill-0-_tmp_input
testcase-kill-1-_tmp_input
warnings.txt

Reproducing crashes with concrete test cases
~/s2e/projects/crash$ gdb --args ./crash s2e-last/testcase-crash:0x50f:0x804858a-0-_tmp_input
(gdb) r
Starting program: /home/vitaly/s2e/projects/crash/crash s2e-last/testcase-crash:0x50f:0x804858a-0-_tmp_input
Program received signal SIGSEGV, Segmentation fault.
0x0804858a in main (argc=2, argv=0xffffcec4) at crash.c:18
18          *(char *)0xbadcafe = 0;
(gdb)

Profiling Path Explosion
$ s2e forkprofile crash
…
01295 crash:0x08048571 1 /home/ubuntu/s2e/env/crash.c:15 (main)

if (data == 0xdeadbeef) {
    printf("Hello world\n");
} else {
    *(char *)0xbadcafe = 0;
}

More on that in the 2nd part of the tutorial…

Displaying Code Coverage
$ s2e coverage lcov --html crash
Overall coverage rate:
  lines......: 84.6% (11 of 13 lines)
  functions..: no data found
Line coverage saved to /home/vitaly/s2e/env/projects/crash/s2e-last/crash.info.
An HTML report is available in /home/vitaly/s2e/env/projects/crash/s2e-last/crash_lcov

Overview
Finding vulnerabilities in user-space apps with S2E
• Overview of S2E and symbolic execution
• Getting started with S2E
• Design and Implementation
• Automated generation of proofs of vulnerability

Check properties on execution paths
Categories: Testing, Profiling, Debugging, Verification, Simulation, Isolation
Tools and techniques: Valgrind, CUTE, ThreadSanitizer, AddressSanitizer, EXE, Manticore, KLEE, SJPF, Dingo, Astrée, Coverity, SFI, Immunity, TEMU, CodeSonar, Nooks, BitBlaze, DTrace, CFI, VMware, Oprofile, Dimmunix, CoreDet, BAP, VeriSoft, SyncFinder, IDApro, Eraser, ESD, Jakstab, SAGE, Parfait, Angr, DART, Saturn, SimOS, Cloud9, Calysto, Pin, BitScope, LLBMC, PinOS, bddbddb, Isabelle, LFI, …

Checking Properties on Execution Paths
int main(argc, argv) {
    if (argc == 2) {
        printf("%c", *argv[2]);
        …
    }
    return 0;
}
./prog: argc != 2
./prog 123: argc == 2 ✖

Enumerating All Execution Paths
int main(argc, argv) {
    void *p = malloc(…);
    if (!p) {
        exit(-1);
    }
    …
    return 0;
}
Stack: Program, Libraries, Kernel, Hardware
Off-limits: ~2^(system size) paths

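The exponential path count comes from forking at every branch whose condition depends on symbolic input. As a minimal sketch, assuming a program made of independent two-way branches (a simplification, since real branches are often correlated or infeasible), the hypothetical helper `count_paths` below follows both sides of each branch and counts the paths that reach the end.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Counts the paths through `remaining_branches` independent two-way
 * branches by exploring both outcomes of each one, the way a symbolic
 * executor forks a state at a symbolic branch.
 * The cap keeps this illustrative recursion cheap.
 */
static uint64_t count_paths(unsigned remaining_branches)
{
    assert(remaining_branches <= 24);

    if (remaining_branches == 0) {
        return 1; /* reached the end of one complete path */
    }
    /* Fork: one state takes the branch, the other does not. */
    return count_paths(remaining_branches - 1) +
           count_paths(remaining_branches - 1);
}
```

Ten such branches already give `count_paths(10) == 1024`. Counting branches in libraries, kernel, and drivers as well as in the program is what makes whole-system enumeration off-limits.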
Environment Modelling
int main(argc, argv) {
    void *p = malloc(…);
    if (!p) {
        exit(-1);
    }
    …
    return 0;
}
Stack: Program, Environment Model
Impractical
~2^(program size) paths
Tools: KLEE, Cloud9, SLAM, LLBMC, Coverity, …

Calling the Environment
int main(argc, argv) {
    void *p = malloc(…);
    if (!p) {
        exit(-1);
    }
    …
    return 0;
}
Stack: Program, Libraries, Kernel, Hardware
False negatives, false positives
~2^(program size) paths
Tools: KLEE, EXE, DART, Fuzzing, …

Analysis Tools Make Trade-offs
• Accuracy vs. performance
  False positives vs. false negatives
• Types of analysis
  Testing, verification, profiling, etc.
• Software stack level
  Applications, libraries, kernel modules, etc.
• Source code vs. binaries

S2E is a Platform that Enables Flexible Trade-offs
• Accuracy vs. performance
  Symbolic execution, state merging, fuzzing, etc.
  You can write models, but don’t have to!
• Types of analysis
  Wide range of analyses
• Software stack level
  In-vivo, at any level of software stack
• Source code vs. binaries
  Works on binaries

Uses of S2E across the stack (Applications, Libraries, Kernel, Hardware)
Categories: Verification, Testing, Security, Reverse engineering, Profiling
• Software router dataplanes [NSDI’14]
• BIOS Testing [WOOT’15]
• Avatar [NDSS’14]
• CHEF [ASPLOS’14]
• Achilles [ASPLOS’13]
• SWIFT [EUROSYS’12]
• SymDrive [OSDI’12]
• SymNet [WRIPE’12]
• CVE-2015-1536
• DDT [USENIX’11]
• Distributed systems
• PROFs [ASPLOS’11]
• RevNIC [EUROSYS’10]
• …

VM: Applications, Libraries, Kernel, Drivers, Virtual Hardware
S2E: Dynamic Binary Translator, Symbolic Execution Engine, Instrumentation Engine
• Path Selection Plugins
  What input to make symbolic
  What input to make concrete
  Search heuristics
• Analysis Plugins
  Check for crashes, vulnerability conditions, performance metrics, etc.

KVM Extensions for Symbolic Execution
• S2E uses QEMU
• S2E and QEMU are completely decoupled
• S2E is contained in libs2e.so
• libs2e.so intercepts and replaces /dev/kvm functionality
• Need a few simple KVM extensions to intercept DMA, disk R/W, and device state snapshotting
• You don’t have to use QEMU with S2E
Diagram: VM (Applications, Libraries, Kernel, Drivers, Virtual Hardware) talks through /dev/kvm, a KVM-compatible interface, to libs2e.so, which contains S2E (Dynamic Binary Translator, Symbolic Execution Engine, Instrumentation Engine, Path Selection Plugins, Analysis Plugins)

Modular Architecture
• We refactored QEMU’s translator to make it standalone
• libcpu, libtcg: code translation and generation libraries
• libs2ecore, libs2eplugins, klee, libvmi, etc.
• You can reuse these in your own projects
• You can swap out the symbolic execution engine with your own if you want
Diagram: VM (Applications, Libraries, Kernel, Drivers, Virtual Hardware) talks through /dev/kvm, a KVM-compatible interface, to libs2e.so, which contains S2E (Dynamic Binary Translator, Symbolic Execution Engine, Instrumentation Engine, Path Selection Plugins, Analysis Plugins)

Dynamic Binary Translation
while (true) {
    tb = translate(cpu->pc);
    tb->func(cpu);
}

0x80000000: mov [ebx], eax

void tb_0x80000000(cpu) {
    tmp1 = cpu->regs[EBX];
    tmp2 = cpu->regs[EAX];
    __stl_mmu(tmp1, tmp2);
}

Dynamic Binary Translation
while (true) {
    tb = translate(cpu->pc);
    tb->func(cpu);
}

0x80000000: mov [ebx], eax

void tb_0x80000000(cpu) {
    tmp1 = cpu->regs[EBX];
    tmp2 = cpu->regs[EAX];
    __stl_mmu(tmp1, tmp2);
}

translate() pipeline:
• Frontend
• Host-independent micro-operations
• Backend (one per target architecture)
• Host instructions (x86, arm, mips, etc.)

Dynamic Binary Translation
while (true) {
    tb = translate(cpu->pc);
    tb->func(cpu);
}

Original translator:
translate(pc) {
    do {
        ins = disassemble(pc);
        emit_uops(ins);
        pc += ins.size;
    } while (ins != jmp);
}

Translator with S2E instrumentation:
translate(pc) {
    do {
        if (s2e_instrument_ins(pc)) {
            emit_uops_s2e();
        }
        ins = disas(pc);
        emit_uops(ins);
        pc += ins.size;
    } while (ins != jmp);
}

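The translate-and-execute loop and the instrumentation hook can be tried out on a toy machine. The sketch below is an illustration under strong assumptions, not QEMU or S2E code: the three-instruction guest ISA, the micro-op set, and every name in it (`wants_instrumentation`, `translate`, `execute`, `run`) are hypothetical. It also interprets the micro-operations instead of compiling them to host instructions, and it retranslates every block instead of caching it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy guest ISA. */
enum guest_op { G_ADD, G_JMP, G_HALT };
struct guest_ins { enum guest_op op; int arg; };

/* Host-independent micro-operations emitted by the translator. */
enum uop_kind { U_ADD, U_SET_PC, U_HALT, U_INSTRUMENT };
struct uop { enum uop_kind kind; int arg; };

#define MAX_UOPS 32
struct tb { struct uop uops[MAX_UOPS]; size_t count; };

struct cpu { int pc; int acc; int halted; int instrumented_hits; };

/* Stand-in for the slide's s2e_instrument_ins(): here only the
   instruction at guest address 1 is of interest. */
static int wants_instrumentation(int pc) { return pc == 1; }

static void emit(struct tb *tb, enum uop_kind kind, int arg)
{
    assert(tb->count < MAX_UOPS);
    tb->uops[tb->count].kind = kind;
    tb->uops[tb->count].arg = arg;
    tb->count++;
}

/* Translates one block: stops after the first jump or halt. */
static struct tb translate(const struct guest_ins *code, int pc)
{
    struct tb tb = { .count = 0 };

    for (;;) {
        struct guest_ins ins = code[pc];

        if (wants_instrumentation(pc)) {
            emit(&tb, U_INSTRUMENT, pc); /* extra micro-op before the instruction */
        }
        if (ins.op == G_ADD) {
            emit(&tb, U_ADD, ins.arg);
            pc += 1;
        } else if (ins.op == G_JMP) {
            emit(&tb, U_SET_PC, ins.arg);
            return tb;
        } else {
            emit(&tb, U_HALT, 0);
            return tb;
        }
    }
}

/* Runs the micro-ops of one translated block against the CPU state. */
static void execute(struct cpu *cpu, const struct tb *tb)
{
    for (size_t i = 0; i < tb->count; i++) {
        const struct uop *u = &tb->uops[i];
        switch (u->kind) {
        case U_ADD:        cpu->acc += u->arg;       break;
        case U_SET_PC:     cpu->pc = u->arg;         break;
        case U_HALT:       cpu->halted = 1;          break;
        case U_INSTRUMENT: cpu->instrumented_hits++; break;
        }
    }
}

/* Main loop from the slide: translate at the current pc, then execute. */
static struct cpu run(const struct guest_ins *code)
{
    struct cpu cpu = { 0, 0, 0, 0 };

    while (!cpu.halted) {
        struct tb tb = translate(code, cpu.pc);
        execute(&cpu, &tb);
    }
    return cpu;
}
```

Because the instrumentation micro-op is emitted at translation time, uninstrumented instructions pay no run-time cost for the check, which is the point of hooking the translator instead of the execution loop.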