Coverage-guided Fuzzing of Individual Functions Without Source Code


  1. Coverage-guided Fuzzing of Individual Functions Without Source Code. Alessandro Di Federico, Politecnico di Milano. October 25, 2018.

  2. Index: • Coverage-guided fuzzing • An overview of rev.ng • Experimental results

  3. Fuzzing

  4. Fuzzing: (1) Generate a lot of different inputs. (2) Feed them to a program. (3) Wait for it to reach an invalid state. (4) Collect a report for the analyst.
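
The four steps above can be sketched as a small blind fuzzing loop. This is only an illustration under stated assumptions: `target`, `fuzz_blind`, the 4-byte input size, and the "first byte is 0x42" failure condition are hypothetical and not taken from the talk. A real fuzzer would normally detect the invalid state through a crash signal rather than a return value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical program under test: returns 1 when it reaches an
 * "invalid state" (here: the first byte equals 0x42), 0 otherwise. */
static int target(const uint8_t *input, size_t size) {
    return size > 0 && input[0] == 0x42;
}

/* Blind fuzzing: generate random inputs, feed them to the target, and
 * stop at the first input that triggers the invalid state.
 * Returns the number of attempts used, or -1 if the budget ran out.
 * On success the offending input is copied into `report`. */
static long fuzz_blind(unsigned seed, long max_attempts, uint8_t report[4]) {
    assert(report != NULL);
    srand(seed);
    for (long attempt = 1; attempt <= max_attempts; attempt++) {
        uint8_t input[4];
        for (size_t i = 0; i < sizeof input; i++)
            input[i] = (uint8_t)(rand() & 0xff);      /* 1. generate */
        if (target(input, sizeof input)) {            /* 2. feed, 3. wait */
            for (size_t i = 0; i < sizeof input; i++) /* 4. report */
                report[i] = input[i];
            return attempt;
        }
    }
    return -1;
}
```

Calling `fuzz_blind(1u, 1000000L, report)` keeps drawing random inputs until one fails. Nothing steers the search, which is why blind fuzzing can need a large amount of resources on targets with deeper conditions.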

  5. Features. Pros: • Easy to set up • It can find subtle bugs. Cons: • It might require a large amount of resources • Semi-decidable (it can show that a bug exists, but not that none does)

  7. A huge leap forward: coverage-guided fuzzing. Prioritize inputs that lead to covering new code paths.

  8. A huge leap forward: int main() { if (A && B) { crash(); } else { all_good(); } }

  9. The control-flow graph. [Diagram: control-flow graph with nodes A and B.]

  10. First run. Input: 0000 0000 0000 0000. [Diagram: control-flow graph with nodes A and B.]

  14. Second run. Input: 0000 0000 0000 0001. [Diagram: control-flow graph with nodes A and B.]

  18. This input is not interesting!

  19. Third run. Input: 0001 0000 0000 0000. [Diagram: control-flow graph with nodes A and B.]

  24. This input is interesting! It led us to discover a new basic block.

  25. Fourth run. Input: 0011 0000 0000 0000. [Diagram: control-flow graph with nodes A and B.]
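
The four runs above can be replayed with a small coverage-guided search. The sketch below is a toy under explicit assumptions: the input is a 16-bit word, condition A is taken to test bit 12 and condition B bit 13 (chosen to match the inputs shown on the slides), mutation is a deterministic single-bit flip, and all names (`run_target`, `fuzz_guided`) are hypothetical. It is not how afl or libfuzzer are implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { BLOCK_A, BLOCK_B, BLOCK_CRASH, BLOCK_ALL_GOOD, BLOCK_COUNT };

/* Hypothetical target mirroring `if (A && B) crash(); else all_good();`.
 * Assumption: A tests bit 12 and B tests bit 13 of the input.
 * Marks each executed basic block in `hit`; returns true on "crash". */
static bool run_target(uint16_t input, bool hit[BLOCK_COUNT]) {
    hit[BLOCK_A] = true;
    if (input & 0x1000) {
        hit[BLOCK_B] = true;
        if (input & 0x2000) {
            hit[BLOCK_CRASH] = true;
            return true;
        }
    }
    hit[BLOCK_ALL_GOOD] = true;
    return false;
}

/* Coverage-guided search: mutate corpus entries one bit at a time and
 * keep only the mutants that execute a basic block never seen before.
 * Returns true and stores the crashing input in *crash_input if found. */
static bool fuzz_guided(uint16_t *crash_input, size_t *executions) {
    uint16_t corpus[BLOCK_COUNT + 1] = { 0x0000 };  /* seed: all zeros */
    size_t corpus_size = 1;
    bool seen[BLOCK_COUNT] = { false };

    assert(crash_input != NULL && executions != NULL);
    (void)run_target(corpus[0], seen);  /* record the seed's coverage */
    *executions = 1;

    for (size_t i = 0; i < corpus_size; i++) {
        for (unsigned bit = 0; bit < 16; bit++) {
            uint16_t mutant = (uint16_t)(corpus[i] ^ (1u << bit));
            bool hit[BLOCK_COUNT] = { false };
            bool crashed = run_target(mutant, hit);
            (*executions)++;
            if (crashed) {
                *crash_input = mutant;
                return true;
            }
            bool interesting = false;
            for (int b = 0; b < BLOCK_COUNT; b++) {
                if (hit[b] && !seen[b]) {
                    seen[b] = true;      /* new basic block discovered */
                    interesting = true;
                }
            }
            if (interesting && corpus_size < (size_t)BLOCK_COUNT + 1)
                corpus[corpus_size++] = mutant;  /* keep for later mutation */
        }
    }
    return false;
}
```

In this toy, the mutant `0x1000` is kept because it reaches the block guarded by A, and a later mutation of that kept input reaches the crash. Mutants such as `0x0001` are discarded because they add no coverage.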

  30. american fuzzy lop (afl): • It made coverage-guided fuzzing popular • Developed by lcamtuf • Performs instrumentation to detect executed basic blocks • Two key modes of operation: source mode and binary mode

  32. Source mode: instrumentation is performed at compiler level. int main() { record(1); if (A && B) { record(2); crash(); } else { record(3); all_good(); } record(4); }
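
What `record` might do behind the scenes can be sketched with a small hit-count map. This is an assumption-laden illustration, not afl's actual instrumentation: `coverage_map`, `MAP_SIZE`, `reset_coverage`, and `blocks_covered` are hypothetical names, and `A` and `B` are turned into parameters so the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAP_SIZE 8  /* hypothetical: room for block IDs 0..7 */

static unsigned char coverage_map[MAP_SIZE];

/* Called at the start of every instrumented basic block. */
static void record(size_t block_id) {
    assert(block_id < MAP_SIZE);
    if (coverage_map[block_id] < 255)
        coverage_map[block_id]++;   /* saturating hit counter */
}

static void crash(void) {}     /* stand-ins for the slide's functions */
static void all_good(void) {}

/* The instrumented program from the slide, with A and B as parameters. */
static void instrumented_main(int A, int B) {
    record(1);
    if (A && B) {
        record(2);
        crash();
    } else {
        record(3);
        all_good();
    }
    record(4);
}

/* Clears the map before a run. */
static void reset_coverage(void) {
    memset(coverage_map, 0, sizeof coverage_map);
}

/* Number of distinct blocks executed since the last reset. */
static size_t blocks_covered(void) {
    size_t n = 0;
    for (size_t i = 0; i < MAP_SIZE; i++)
        n += coverage_map[i] != 0;
    return n;
}
```

After a run, the fuzzer only has to compare the map against what it has seen before to decide whether the input reached a new block.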

  34. Binary mode: an emulator is employed to detect executed basic blocks. • QEMU is the chosen emulator • It incurs a significant slowdown

  35. libfuzzer: • Alternative to afl • It requires the source code to be available • Based on LLVM

  36. What's LLVM? LLVM is a compiler framework, famous for its C/C++ frontend (clang) and its intermediate representation (the LLVM IR).

  37. libfuzzer can be a lot faster: it doesn't fork. int main() { while (true) { char *new_input = random_input(); target(new_input); } }
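
For contrast with the in-process loop above, a fuzzer that isolates every input in its own process pays for a `fork`/`waitpid` pair on each execution. The POSIX sketch below shows that pattern in its simplest form. `target` and `run_in_child` are hypothetical names, and this is not afl's actual fork-server code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical target: aborts when the input starts with 'X'. */
static void target(const uint8_t *data, size_t size) {
    if (size > 0 && data[0] == 'X')
        abort();
}

/* Runs the target on one input in a freshly forked child process.
 * Returns 1 if the child was killed by a signal (a crash), 0 if it
 * exited normally, and -1 if fork or waitpid failed. */
static int run_in_child(const uint8_t *data, size_t size) {
    assert(data != NULL || size == 0);
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {          /* child: run one input, then exit */
        target(data, size);
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? 1 : 0;
}
```

The benefit of this design is that a crash cannot corrupt the fuzzer itself. The cost is one process creation per input, which the in-process loop avoids.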

  40. What is rev.ng? rev.ng is a unified framework for binary analysis based on QEMU and LLVM. Everything you'll see here is architecture-agnostic.

  41. How does QEMU work?

  42. A dynamic binary translator. [Diagram: QEMU IR connected to the architectures AArch64, ARM, Alpha, CRIS, Unicore, SPARC, SPARC64, x86, x86-64, SuperH, SystemZ, PowerPC, PowerPC64, MIPS, MIPS64, XCore, OpenRISC, MicroBlaze, RISC V, and TCI.]

  43. The frontend is a lifter. [Diagram: same architectures and QEMU IR as on the previous slide.]

  45. QEMU translates at run-time; rev.ng translates offline.

  46. rev.ng: a static binary translator. [Diagram: md5sum.arm → collect entry points → lift to QEMU IR → translate to LLVM IR → collect new entry points (repeat) → link runtime functions → md5sum.x86-64.]

  47. [Diagram: Alpha, ARM, CRIS, AArch64, Unicore, RISC V, SPARC64, Hexagon, SPARC, SuperH, x86, SystemZ, x86-64, MicroBlaze, PowerPC, OpenRISC, PowerPC64, MIPS64, MIPS, and XCore, all converging on QEMU IR.]

  48. [Diagram: Alpha, ARM, CRIS, AArch64, Unicore, RISC V, SPARC64, Hexagon, SPARC, SuperH, x86, SystemZ, x86-64, MicroBlaze, PowerPC, OpenRISC, PowerPC64, MIPS64, MIPS, and XCore, all converging on LLVM IR.]

  49. [Diagram: Alpha, ARM, CRIS, AArch64, Unicore, RISC V, SPARC64, Hexagon, SPARC, SuperH, x86, SystemZ, x86-64, MicroBlaze, PowerPC, OpenRISC, PowerPC64, MIPS64, MIPS, and XCore, all converging on rev.ng.]

  52. We produce LLVM IR: we can employ libfuzzer directly.

  54. Steps: (1) Lift the program to LLVM IR. (2) Identify all the functions. (3) Identify a function to fuzz (manual). (4) Create the fuzzing function (manual). (5) Compile the fuzzing function. (6) Instrument using libfuzzer. (7) Launch the fuzzer.

  57. We are significantly faster than QEMU: (1) The LLVM optimizer has a wider view of the code. (2) The translation is performed offline.

  58. [Bar charts: runtime in seconds for rev.ng, native, and QEMU on the benchmarks 458.sjeng, 464.h264ref, 400.perlbench, 471.omnetpp, 462.libquantum, 473.astar, 401.bzip2, 483.xalancbmk, 429.mcf, 403.gcc, 445.gobmk, and 456.hmmer.]

  59. On average, 68% faster than QEMU

  61. A practical case study: we want to fuzz the PCRE library, not directly but embedded in another program (less).

  62. Steps (again): (1) Lift the program to LLVM IR. (2) Identify all the functions. (3) Identify a function to fuzz. (4) Create the fuzzing function. (5) Compile the fuzzing function. (6) Instrument using libfuzzer. (7) Launch the fuzzer.

  64. Fuzzing function (simplified): int LLVMFuzzerTestOneInput(uint8_t *data, size_t size) { char input_string[] = "Test string!"; void *compiled_re; compiled_re = pcre_compile(data); pcre_exec(compiled_re, input_string, strlen(input_string)); pcre_free(compiled_re); return 0; }
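
One detail a non-simplified fuzzing function usually has to handle is that the `data` buffer libfuzzer passes in is not NUL-terminated, while a pattern-compiling function expects a C string. The sketch below shows one common way to bridge the two. `compile_pattern` and `last_pattern_length` are hypothetical stand-ins for the lifted function, not the PCRE API or the author's actual harness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the lifted pattern-compiling function: it
 * expects a NUL-terminated string and just remembers its length. */
static size_t last_pattern_length;
static void compile_pattern(const char *pattern) {
    last_pattern_length = strlen(pattern);
}

/* libFuzzer entry point: copies the raw bytes into a NUL-terminated
 * buffer before handing them to a string-based function. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    assert(data != NULL || size == 0);
    char *pattern = malloc(size + 1);
    if (pattern == NULL)
        return 0;               /* out of memory: skip this input */
    if (size > 0)
        memcpy(pattern, data, size);
    pattern[size] = '\0';
    compile_pattern(pattern);
    free(pattern);
    return 0;
}
```

When the source of the harness is compiled with clang, `-fsanitize=fuzzer` links in libfuzzer's driver, which then calls `LLVMFuzzerTestOneInput` repeatedly in the same process.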

  65. We were able to find a known vulnerability in PCRE

  66. Comparing with afl: are we faster than afl? • afl fuzzing worked directly on PCRE (without less) • Used black-box mode

  67. Performance (execs per second after 1 min / 10 min / 60 min, then total execs after 60 min). afl: 3 582 / 3 495 / 3 682, total 13 187 295. rev.ng: 150 617 / 79 701 / 78 306, total 271 217 728.
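
To make the gap concrete, the ratios between the two rows can be computed directly from the figures on the slide. The helper below is only a convenience sketch; `measurements` and `speedup` are hypothetical names. Bear in mind that, as the previous slide notes, afl ran on PCRE directly in black-box mode, so the two setups are not identical.

```c
#include <assert.h>
#include <stddef.h>

struct measurement {
    const char *label;
    double afl;
    double revng;
};

/* Figures copied from the performance slide. */
static const struct measurement measurements[] = {
    { "execs/s after 1 min",          3582.0,    150617.0 },
    { "execs/s after 10 min",         3495.0,     79701.0 },
    { "execs/s after 60 min",         3682.0,     78306.0 },
    { "total execs after 60 min", 13187295.0, 271217728.0 },
};

enum { MEASUREMENT_COUNT = sizeof measurements / sizeof measurements[0] };

/* How many times more executions rev.ng achieved than afl. */
static double speedup(size_t index) {
    assert(index < MEASUREMENT_COUNT);
    return measurements[index].revng / measurements[index].afl;
}
```

The ratio is roughly 42x after one minute and settles at about 21x over the full hour.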

  68. Summary: • We do not require the source code • We can fuzz any entry point • We are significantly faster than existing techniques

  69. Future work: • Improve performance • Perform symbolic execution (through KLEE)

  70. Backup slides

  71. Very effective!

  72. License: This work is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/ or send a letter to Creative Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041, USA.
