Feedback-directed Fuzzing 101
The loop: start from seed inputs; pick an input from the set of interesting inputs; mutate it; run the program on the mutated input; collect feedback; if the input is interesting, add it to the set, otherwise discard it.
• Feedback: coverage, execution length, well-formedness of the input, ...
• Interesting? New coverage? Longer execution? Valid input? ...
Lots of choices:
1. Which input to pick?
2. How to mutate an input?
3. How many mutants to generate?
4. What kind of feedback?
5. How to decide if an input is interesting?
Resolved using heuristics over a period of 10 years 50
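A minimal sketch of the loop described on this slide, written in Python for brevity. Everything here is an illustrative assumption rather than the design of any particular fuzzer: the toy `target` function, the branch names, the byte-level `mutate`, and the defaults for `iterations` and `mutants_per_pick`. The comments mark where each of the five choices shows up.

```python
import random

def target(data: bytes) -> set:
    """Toy program under test: returns the set of branch IDs it executed."""
    covered = {"entry"}
    if len(data) >= 2:
        covered.add("len>=2")
        if data[0] == ord("<"):
            covered.add("starts_with_<")
            if data[1] == ord("!"):
                covered.add("starts_with_<!")
    return covered

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one random byte, or append one (a deliberately simple mutator)."""
    buf = bytearray(data)
    if buf and rng.random() < 0.8:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        buf.append(rng.randrange(256))
    return bytes(buf)

def fuzz(seeds, iterations=2000, mutants_per_pick=4, seed=0):
    rng = random.Random(seed)
    queue = list(seeds)                # the set of interesting inputs
    seen = set()                       # all coverage observed so far
    for s in queue:
        seen |= target(s)
    for _round in range(iterations):
        parent = rng.choice(queue)                 # choice 1: which input to pick
        for _mutant in range(mutants_per_pick):    # choice 3: how many mutants
            child = mutate(parent, rng)            # choice 2: how to mutate
            coverage = target(child)               # choice 4: what feedback
            if not coverage <= seen:               # choice 5: interesting = new coverage
                seen |= coverage
                queue.append(child)
            # otherwise the mutant is discarded
    return queue, seen
```

Calling `fuzz([b"a"])` returns the saved inputs and the union of covered branches. Swapping out the body of the `if` test is all it takes to try a different notion of "interesting".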
Feedback-directed Fuzzing 101
Fuzzers built around this loop: AFL, AFLFast, libFuzzer, Angora, VUzzer, Steelix, AFLGo, AFLSmart, Nautilus, FairFuzz, PerfFuzz, JQF/Zest, FuzzFactory, RLCheck 52
What Bugs Can Fuzzing Find? • Assertion violations • Segmentation faults • Buffer overflows • Use-after-frees • Integer signedness • etc. … 53
What Bugs Has Fuzzing Found?
• Tons of them ...
• CVE-2014-6277: ShellShock bug in Bash – GNU Bash through 4.3 bash43-026 does not properly parse function definitions in the values of environment ...
• CVE-2014-0160: Heartbleed bug in OpenSSL – a read buffer overflow allowed an attacker to extract information from servers using OpenSSL
• CVE-2016-8677: ImageMagick – imagemagick: memory allocate failure in AcquireQuantumPixels (quantum.c)
• CVE-2014-1564: Firefox – Mozilla Firefox before 32.0, Firefox ESR 31.x before 31.1, and Thunderbird 31.x before 31.1 do not properly initialize memory for GIF rendering
• CVE-2010-0539: Safari Remote Execution – integer signedness error in the window drawing implementation in Apple Java for Mac OS X 10.5 ...
• See http://lcamtuf.coredump.cx/afl/ for an extensive list of bugs and security vulnerabilities found by AFL, a state-of-the-art fuzzer 54
How Good is Fuzzing? 55
What’s Missing? Uneven Coverage
    int process_xml(char * fuzzed_data, int fuzzed_data_len) {
        if (fuzzed_data_len >= 10) {   // hit by 100k+ inputs: code under this branch is well-covered
            // more code
        }
        // ...
        if (starts_with(fuzzed_data, "<!ATTLIST")) {   // hit by 1 input: code under this branch is barely covered
            // ...
        }
        // ...
        return process_result;
    }
Observation: some parts of the program are easier to cover than others 56
FairFuzz: A Targeted Mutation Strategy for Increasing Greybox Fuzz Testing Coverage
Caroline Lemieux, Koushik Sen
University of California, Berkeley
source: https://github.com/carolemieux/afl-rb 57
FairFuzz: Ideas
Keep the feedback-directed fuzzing loop and add 2 heuristics:
1. Identify branches hit by few inputs (rare branches)
2. Identify where an input can be mutated and still hit the branch 60
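The two heuristics can be sketched as follows, again in Python. This is a simplified illustration and not the afl-rb implementation: the toy `branches` target, the branch names, the power-of-two cutoff used to decide what counts as "rare", and the single bit-flip probe per byte are all assumptions made for the example.

```python
def branches(data: bytes) -> set:
    """Toy target: reports which (hypothetical) branches an input reaches."""
    hit = set()
    if len(data) >= 4:
        hit.add("long_enough")
    if data.startswith(b"<!"):
        hit.add("markup_decl")
    return hit

def rare_branches(hit_counts: dict) -> set:
    """Heuristic 1: branches hit by few inputs.
    Assumed rule: 'rare' means hit by at most `cutoff` inputs, where cutoff
    is the smallest power of two >= the lowest hit count."""
    lowest = min(hit_counts.values())
    cutoff = 1
    while cutoff < lowest:
        cutoff *= 2
    return {b for b, n in hit_counts.items() if n <= cutoff}

def branch_mask(data: bytes, branch: str) -> list:
    """Heuristic 2: mask[i] is True if byte i can be changed while the input
    still hits `branch`. Each position is probed once by flipping all its
    bits, which is cheaper but less thorough than trying many values."""
    mask = []
    for i in range(len(data)):
        probe = bytearray(data)
        probe[i] ^= 0xFF
        mask.append(branch in branches(bytes(probe)))
    return mask
```

For the input `b"<!ATTLIST"` and the branch `"markup_decl"`, the mask marks the first two bytes as fixed and the rest as free, so later mutations can be confined to positions that keep the rare branch reachable.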
Summary Results – Coverage Leaders FairFuzz achieves the highest coverage fast, for nearly all benchmarks 62
PerfFuzz : Automatically Generating Pathological Inputs Caroline Lemieux, Rohan Padhye, Koushik Sen, Dawn Song University of California, Berkeley source: https://github.com/carolemieux/perffuzz 63
Performance Problems Have Consequences
• poor user experience
• security vulnerabilities (DoS)
• excessive resource consumption 64
PerfFuzz: Idea
Keep the feedback-directed fuzzing loop; change the heuristic:
1. Feedback: # of times each branch is executed
2. Interesting: longer execution of some branch? 66
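A small Python sketch of this changed heuristic: the feedback is a per-branch execution count, and an input is kept if it pushes the count of at least one branch above the best seen so far. The insertion-sort target and the branch names `outer_loop` and `inner_swap` are hypothetical stand-ins for an instrumented program.

```python
from collections import Counter

def branch_counts(data: bytes) -> Counter:
    """Toy target: insertion sort over the input bytes, counting how often
    each (hypothetical) branch runs."""
    counts = Counter()
    items = list(data)
    for i in range(1, len(items)):
        counts["outer_loop"] += 1
        j = i
        while j > 0 and items[j - 1] > items[j]:
            counts["inner_swap"] += 1
            items[j - 1], items[j] = items[j], items[j - 1]
            j -= 1
    return counts

def update_if_interesting(counts, best: dict) -> bool:
    """Return True if some branch ran more often than in any earlier input,
    and record the new per-branch maxima in `best`."""
    interesting = False
    for branch, n in counts.items():
        if n > best.get(branch, 0):
            best[branch] = n
            interesting = True
    return interesting
```

With an empty `best`, the sorted input `b"abc"` is saved because it is the first to run `outer_loop`; the reversed input `b"cba"` is saved because it maximizes `inner_swap`; `b"bca"` is then discarded since it beats neither maximum. Tracking a maximum per branch, rather than a single total, is what lets the search keep inputs that stress different hot spots.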
Macro-Benchmarks: Maximum Path Length
• Path length: total number of hits of CFG edges by an input
[Figure: maximum path length on libpng, libxml2, libjpeg-turbo, and zlib; one result annotated 24.7x] 68
PerfFuzz: Memory-alloc Fuzzing
Keep the feedback-directed fuzzing loop; change the heuristic:
1. Feedback: # of bytes allocated at each malloc() call
2. Interesting: more bytes allocated at some malloc() call than by any other input? 69
Memory-alloc fuzzing: OOMs and Bombs
• Libpng
1. 100-byte input with large dimensions: reader allocates 2 billion bytes
2. 100-byte input with large color space but fixed dimensions: color table allocated with 4 GB of space
• Libarchive
1. 50-byte zipped file: 4 GB output
2. Memory leaks with LZMA compression (32-byte ZIP leaks 96 bytes) 70
FuzzFactory : Domain-Specific Fuzzing with Waypoints Rohan Padhye and Caroline Lemieux and Koushik Sen and Laurent Simon and Hayawardh Vijayakumar source: https://github.com/rohanpadhye/FuzzFactory 71
Domain-Specific Fuzzers
• Zest [Padhye et al. 2018] – “increase coverage amongst valid inputs”
• SlowFuzz [Petsios et al. 2017] – “increase path length”
• PerfFuzz [Lemieux et al. 2018] – “maximize branch exec counts”
• DifFuzz [Nilizadeh et al. 2019] – “leak more info on the side channel”
• MemFuzz [Coppik et al. 2019] – “access new input-dependent memory locations”
Common strategy: save intermediate inputs (“waypoints”) 72
Can we rapidly create domain-specific fuzzers without touching the underlying search algorithm? 73
Feedback-directed Fuzzing 101
Same loop, with domain-specific feedback:
• Feedback: dsf (a key-value map)
• Interesting? Better value of dsf(k) for some k? Yes: add input; no: discard input 74
Example Fuzzers using FuzzFactory
• CMP
– Goal: test programs whose inputs require magic bytes, checksums, etc.
– Waypoints: inputs which increase progress of strcmp, memcmp, strstr, etc.
• MEM
– Goal: find memory allocation and management related bugs
– Waypoints: inputs which increase args to malloc()
• CMP+MEM
– Goal: find memory mgmt bugs in programs with magic bytes, checksums, etc.
– Waypoints: CMP or MEM 75
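The waypoint idea can be sketched in Python as a key-value feedback map plus a "did any key improve?" test. This is an illustration under stated assumptions, not the FuzzFactory API: the key names, the PNG-like magic bytes, the two-byte width field, and the use of "larger is better" for every key are all made up for the example.

```python
def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes on which a and b agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def run_target(data: bytes) -> dict:
    """Toy instrumented target: returns the domain-specific feedback map
    (dsf) for one run. Keys are (domain, program point) pairs."""
    dsf = {}
    magic = b"\x89PNG"
    # CMP domain: how far did the magic-byte comparison get?
    dsf[("cmp", "magic_check")] = common_prefix_len(data[:4], magic)
    if data[:4] == magic and len(data) >= 6:
        width = int.from_bytes(data[4:6], "big")
        # MEM domain: size a real reader would pass to malloc() here.
        dsf[("mem", "alloc_row")] = width * 4
    return dsf

def is_waypoint(dsf: dict, best: dict) -> bool:
    """Save the input if dsf(k) is better (here: larger) for some key k."""
    improved = False
    for key, value in dsf.items():
        if value > best.get(key, 0):
            best[key] = value
            improved = True
    return improved
```

Because both domains write into the same map, the CMP+MEM combination needs no extra logic: an input matching two of the four magic bytes is saved as a CMP waypoint, and once the magic check passes, inputs that request larger allocations are saved as MEM waypoints.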
Super-Fuzzer: CMP + MEM
• LZ4 bomb (4 GB alloc when decoding 21-byte input)
• PNG bomb (2 GB alloc when reading ~100-byte 20px image) 77
Coverage is Still Low 78
Why Is Coverage Still Low?
✗ Cannot explore “deep states”
✗ Cannot find complex logical bugs
✗ Gets stuck in input parsing stage
✗ Hardly gets 20%-30% code coverage on real-world software
But cheap and simple 79
Time to Bring a Human into the Loop
Approach: the human restricts the set of inputs to be explored by providing a randomized generator, or a precondition on inputs, or ...
Algorithms then search the restricted input space 80
Semantic Fuzzing with Zest Rohan Padhye (UC Berkeley), Caroline Lemieux (UC Berkeley), Koushik Sen (UC Berkeley), Mike Papadakis (U. Luxembourg), Yves Le Traon (U. Luxembourg) source: https://github.com/rohanpadhye/jqf 81
How do I test ...
• a program taking an XML file as input (e.g. Maven, Ant)
• a compiler (e.g. Closure or Rhino compilers for JavaScript)
• in general, a program taking structurally complex inputs 82
Human Writes a Simple Input Generator
    public XMLElement genXML(Random random) {
        // Generate a random tag name
        String name = random.nextString(MAX_TAG_LENGTH);
        XMLElement node = new XMLElement(name);
        // Generate a random number of children
        int n = random.nextInt(MAX_CHILDREN);
        for (int i = 0; i < n; i++) {
            // Generate child nodes recursively
            node.addChild(genXML(random));
        }
        // Maybe insert text inside element
        if (random.nextBoolean()) {
            node.addText(random.nextString(MAX_TEXT_LENGTH));
        }
        return node;
    }
Generates random syntactically valid XML documents
✗ May not conform to a given schema
Example generated: <foo><i>xyz</i><br/></foo> (a tree with root foo, children i and br, and text xyz under i) 83
Zest: Mutate Params to Generator
Same loop, but the fuzzer works on the parameters (params) fed to the generator instead of on the input itself: pick a set of params, mutate the params, run the generator on them to produce an input, and run the program on that input.
• Feedback: coverage, input validity
• Interesting? New coverage? Valid input? Yes: add the params; no: discard 84
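A Python sketch of the parameter idea: the generator makes its "random" choices by reading bytes from a parameter sequence, so mutating those bytes changes the structure of the generated input while keeping it syntactically valid. The `ParamSource` class, the three-letter tag alphabet, and the depth limit are hypothetical simplifications and do not reflect the JQF API.

```python
import io

class ParamSource:
    """Serves generator choices from a fixed byte string.
    Once the bytes run out, every further choice is 0."""
    def __init__(self, params: bytes):
        self.stream = io.BytesIO(params)

    def next_int(self, bound: int) -> int:
        b = self.stream.read(1)
        return (b[0] if b else 0) % bound

def gen_xml(src: ParamSource, depth: int = 0) -> str:
    """Generate a small XML element; every choice comes from `src`."""
    name = "abc"[src.next_int(3)]                      # choose a tag name
    n_children = src.next_int(3) if depth < 3 else 0   # choose number of children
    children = "".join(gen_xml(src, depth + 1) for _ in range(n_children))
    text = "xyz" if src.next_int(2) else ""            # maybe insert text
    if not children and not text:
        return f"<{name}/>"
    return f"<{name}>{children}{text}</{name}>"
```

The parameters `bytes([1, 1, 2, 0, 1])` generate `<b><c>xyz</c></b>`; changing only the last byte to 0 generates `<b><c/></b>`. A byte-level mutation of the parameters therefore acts as a structural mutation of the input, which is why a generic mutation loop can be reused unchanged.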
Zest: New bugs discovered
• Google Closure Compiler: #2842, #2843, #3220, #3173
• OpenJDK: JDK-8190332, JDK-8190511, JDK-8190512, JDK-8190997, JDK-8191023, JDK-8191076, JDK-8191109, JDK-8191174, JDK-8191073, JDK-8193444, JDK-8193877, CVE-2018-3214
• Apache Commons: LANG-1385, COMPRESS-424, COLLECTIONS-714, CVE-2018-11771
• Apache Ant: #62655
• Apache Maven: #34, #57
• Apache PDFBox: PDFBOX-4333, PDFBOX-4338, PDFBOX-4339, CVE-2018-8036
• Apache TIKA: CVE-2018-8017, CVE-2018-12418
• Apache BCEL: BCEL-303, BCEL-307, BCEL-308, BCEL-309, BCEL-310, BCEL-311, BCEL-312, BCEL-313
• Mozilla Rhino: #405, #406, #407, #409, #410 85
Zest finds complex semantic bugs On this JavaScript input, Google’s Closure compiler throws an “ IllegalStateException: Unexpected variable” during optimization passes 86
Efficient Sampling of SAT and SMT Constraints Rafael Dutra, Kevin Laeufer, Jonathan Bachrach, and Koushik Sen EECS Department UC Berkeley source: https://github.com/RafaelTupynamba/quicksampler 88
Human Writes a Pre-condition on Inputs
• An over-approximation of valid inputs, written in SMT (Satisfiability Modulo Theories)
• Restricts the set of inputs to be generated
• Goal: sample inputs from the restricted input space
Example:
(x + y = 4 ∧ x ≥ 0 ∧ x < 4) ∧ (mem’[1] < 0 ∨ mem’[1] ≥ 4), where
x = mem[0],
y = mem[1],
mem’ = store(mem, mem[0], -1 * mem[mem[0]]),
mem ∈ Array(BV[4], BV[4]) 89
Sampling SAT and SMT Constraints
Input: logical constraint (SAT formula)
Goal: quickly generate lots of solutions that satisfy the constraint
Formula (a conjunction of clauses):
(x1 ∨ x4) ∧ (x1 ∨ ¬x3 ∨ ¬x8) ∧ (x1 ∨ x8 ∨ x6) ∧ (x2 ∨ x5) ∧ (¬x7 ∨ ¬x3 ∨ x9) ∧ (¬x7 ∨ x8 ∨ ¬x9) ∧ (x7 ∨ x8 ∨ ¬x10) ∧ (x7 ∨ x10 ∨ ¬x6)
Solutions:
     x1 x2 x3 x4 x5 x6 x7 x8 x9 x10
σ0    1  0  0  0  1  0  0  0  1  0
σ1    0  0  0  1  1  0  0  1  1  0
σ2    1  1  0  0  1  0  0  0  1  0
σ3    0  1  0  1  1  0  0  1  1  0
σ4    1  0  1  0  1  0  0  0  1  0
σ5    1  1  1  0  1  0  0  0  1  0
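As a quick sanity check of what "a solution that satisfies the constraint" means, the Python sketch below evaluates the six assignments from this slide against the clauses. It assumes each clause is a disjunction of its literals (the standard CNF reading); the signed-integer encoding of literals is a convention chosen for the example.

```python
# Each clause is a list of signed variable indices: 3 means x3, -3 means ¬x3.
CLAUSES = [
    [1, 4], [1, -3, -8], [1, 8, 6], [2, 5],
    [-7, -3, 9], [-7, 8, -9], [7, 8, -10], [7, 10, -6],
]

# The assignments from the slide, written as bit strings x1..x10.
SAMPLES = [
    "1000100010",  # σ0
    "0001100110",  # σ1
    "1100100010",  # σ2
    "0101100110",  # σ3
    "1010100010",  # σ4
    "1110100010",  # σ5
]

def satisfies(bits: str, clauses=CLAUSES) -> bool:
    """True if the assignment makes at least one literal true in every clause."""
    value = {i + 1: ch == "1" for i, ch in enumerate(bits)}
    return all(
        any(value[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )
```

All six listed assignments pass this check, while the all-zero assignment fails because it falsifies (x1 ∨ x4).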
QuickSampler
Our goals:
• Generate samples >100x faster than other techniques
• Sampling should be close to uniform
Our approach:
• Compute patterns of bit flips which preserve satisfiability
• Combine those bit flip patterns to generate lots of samples 91
Formula φ(x0,x1,x2,x3,y0,y1,y2,y3)
Random assignment σ’ → MAX-SAT → solution σ → MAX-SAT → σ0, σ1, ...
                        x0 x1 x2 x3 y0 y1 y2 y3
Random assignment σ’     0  0  1  0  1  1  0  0
Solution σ               0  0  1  0  1  1  1  0
σ0                       1  0  1  0  0  1  1  0
σ1                       0  1  1  1  1  0  1  0
... 100
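The bit-flip idea can be sketched in Python on a toy 4-bit formula. This is a simplified illustration under several assumptions, not the QuickSampler implementation: brute-force search stands in for the MAX-SAT calls, the formula `phi` is invented for the example, deltas are combined pairwise with bitwise OR, and every candidate is re-checked against the formula because combining flips is not guaranteed to preserve satisfiability.

```python
from itertools import combinations

N = 4  # number of Boolean variables; bit i of an integer holds variable i

def phi(v: int) -> bool:
    """Toy constraint: x0 == x1 and (x2 or x3)."""
    x = [(v >> i) & 1 for i in range(N)]
    return x[0] == x[1] and bool(x[2] or x[3])

def closest_solution(target: int, must_differ_at=None):
    """Brute-force stand-in for MAX-SAT: the solution with the fewest bits
    differing from `target`, optionally forced to differ at one position."""
    best = None
    for v in range(1 << N):
        if not phi(v):
            continue
        if must_differ_at is not None and not ((v ^ target) >> must_differ_at) & 1:
            continue
        if best is None or bin(v ^ target).count("1") < bin(best ^ target).count("1"):
            best = v
    return best

def quick_samples(sigma_prime: int) -> list:
    sigma = closest_solution(sigma_prime)        # random assignment -> solution
    deltas = []
    for i in range(N):                           # one flip pattern per variable
        neighbour = closest_solution(sigma, must_differ_at=i)
        if neighbour is not None:
            deltas.append(sigma ^ neighbour)     # bits that changed together
    candidates = {sigma} | {sigma ^ d for d in deltas}
    for a, b in combinations(deltas, 2):         # combine flip patterns
        candidates.add(sigma ^ (a | b))
    return sorted(c for c in candidates if phi(c))   # keep only valid samples
```

Starting from the assignment 0, five solver-style searches yield all six solutions of this toy formula. The point of the design is that the combination step is cheap bit arithmetic, so the number of samples can grow much faster than the number of expensive solver calls.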