NEUZZ: Efficient Fuzzing with Neural Program Smoothing. Dongdong She, Kexin Pei, Dave Epstein, Junfeng Yang, Baishakhi Ray, and Suman Jana. Columbia University
Fuzzing: a popular way to uncover bugs [Liang et al. 2019]
Evolutionary Fuzzing. Advantage: easy to implement. Disadvantage: inefficient. • Random mutations are not effective • Often gets stuck in long sequences of wasteful mutations. Hard to find scalable and adaptive heuristics for guided mutation. [Figure: mutation tree from seed to children to grandchildren]
A new approach to fuzzing
Fuzzing: An Optimization Problem. $x \in X$: a program input. $F(x)$: # of bugs found by input $x$. $C(X)$: generate $K$ inputs from input space $X$. Maximize $\sum_{x \in C(X)} F(x)$: find the $C(X)$ that maximizes the total number of bugs. $F(x)$ is discrete and hard to optimize.
Fuzzing: An Optimization Problem. [Plot: $F(x)$, # of bugs, against input $x$, with $x_1$ and $x_2$ marked] Hard to find inputs like $x_1$ and $x_2$ among flat plateaus.
Fuzzing: An Optimization Problem. $x \in X$: a program input. $G(x)$: edge coverage of input $x$. $C(X)$: generate $K$ inputs from input space $X$. Maximize $\sum_{x \in C(X)} G(x)$: find the $C(X)$ that maximizes the total number of edges.
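To make the edge-coverage objective concrete, here is a minimal sketch in Python. The target `toy_target`, its edge names, and the helper functions are all hypothetical and only stand in for an instrumented program. The sketch assumes the total over a set of inputs is counted as distinct edges, and it shows why $G(x)$ is hard to optimize: almost every single-byte change leaves coverage unchanged.

```python
def toy_target(data: bytes) -> set:
    """Hypothetical program under test: returns the set of edges it executed."""
    edges = {"entry"}
    if len(data) >= 4 and data[0] == 0x7F:
        edges.add("magic_ok")
        if data[1] > 0x40:
            edges.add("big_header")
            if data[2] ^ data[3] == 0xFF:
                edges.add("checksum_ok")
    return edges


def G(data: bytes) -> int:
    """Edge coverage of a single input."""
    return len(toy_target(data))


def total_edges(inputs) -> int:
    """Distinct edges covered by a whole set of inputs (assumed reading of the objective)."""
    covered = set()
    for data in inputs:
        covered |= toy_target(data)
    return len(covered)


def improving_single_byte_mutants(seed: bytes) -> int:
    """Count the single-byte variants of seed that raise G above G(seed)."""
    base = G(seed)
    count = 0
    for pos in range(len(seed)):
        for value in range(256):
            if value == seed[pos]:
                continue
            mutant = seed[:pos] + bytes([value]) + seed[pos + 1:]
            if G(mutant) > base:
                count += 1
    return count


seed = b"\x00\x00\x00\x00"
print(G(seed), improving_single_byte_mutants(seed))  # 1 edge; 1 of 1020 variants helps
```

In this toy, only setting the first byte to 0x7F improves coverage, so a random single-byte mutation of the all-zero seed succeeds about once in a thousand tries.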
Fuzzing: An Optimization Problem. [Plot: $G(x)$, # of edges, against input $x$]
Evolutionary optimization. [Plot: $G(x)$, # of edges, against input $x$, with sample points numbered 1 to 5] Random mutation is not efficient.
Gradient-guided Optimization. Smooth Approximation + Gradient-guided Mutation. [Plot: $G(x)$, # of edges, and $H(x)$, a smooth approximation of $G(x)$, against input $x$]
Gradient-guided Optimization. Smooth Approximation + Gradient-guided Mutation. [Plot: $H(x)$, a smooth approximation of $G(x)$, against input $x$, with sample points numbered 1 to 5]
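The idea on this slide can be illustrated with a one-dimensional toy. In the sketch below, `G` is a step function that counts how many hypothetical branch thresholds an input passes, and `H` is a hand-built smooth stand-in made of sigmoids. NEUZZ learns its smooth approximation with a neural network; the closed-form `H`, the thresholds, and the step size here are assumptions chosen only to show why a gradient is useful where `G` is flat.

```python
import math

THRESHOLDS = [20.0, 45.0, 70.0]  # hypothetical branch conditions of the form x >= t


def G(x: float) -> int:
    """Discrete objective: number of branch conditions satisfied by x."""
    return sum(1 for t in THRESHOLDS if x >= t)


def H(x: float, softness: float = 8.0) -> float:
    """Smooth stand-in for G: each step is replaced by a sigmoid."""
    return sum(1.0 / (1.0 + math.exp(-(x - t) / softness)) for t in THRESHOLDS)


def grad_H(x: float, softness: float = 8.0) -> float:
    """Analytic derivative of H with respect to x."""
    total = 0.0
    for t in THRESHOLDS:
        s = 1.0 / (1.0 + math.exp(-(x - t) / softness))
        total += s * (1.0 - s) / softness
    return total


def gradient_guided_walk(x: float, steps: int = 5, step_size: float = 15.0):
    """Move x in the direction given by the sign of the gradient of H."""
    trace = [x]
    for _ in range(steps):
        g = grad_H(x)
        if g == 0.0:
            break
        x += step_size if g > 0 else -step_size
        trace.append(x)
    return trace


trace = gradient_guided_walk(0.0)
print([(x, G(x)) for x in trace])
```

At `x = 0.0` the discrete `G` gives no hint about which direction helps, while `grad_H(0.0)` is positive, so the walk heads toward the thresholds and ends with all three conditions satisfied.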
Smooth Approximation. Problem: how to smoothly approximate $G(x)$? Universal Approximation Theorem: a NN can approximate any continuous function on a compact domain to arbitrary accuracy. NEUZZ solution: use a NN to learn a smooth $H(x)$.
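As a rough illustration of learning a smooth $H(x)$ from program behavior, the sketch below fits the smallest possible feed-forward model, one sigmoid unit per edge, to (input bytes, edge bitmap) pairs with plain gradient descent. The target `run_target`, the two-edge bitmap, the corpus, and all hyperparameters are hypothetical. This is not the network NEUZZ uses; it only shows the shape of the training data and the resulting differentiable predictor.

```python
import math
import random


def run_target(data: bytes):
    """Hypothetical instrumented target: returns a 2-entry edge bitmap."""
    return [1.0 if data[0] > 127 else 0.0,
            1.0 if data[2] < 64 else 0.0]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, biases, x):
    """Smooth prediction of the edge bitmap for normalized input x."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def mean_loss(weights, biases, xs, ys) -> float:
    """Binary cross-entropy summed over edges, averaged over inputs."""
    total = 0.0
    for x, y in zip(xs, ys):
        for p, t in zip(predict(weights, biases, x), y):
            p = min(max(p, 1e-9), 1.0 - 1e-9)
            total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(xs)


def train(xs, ys, epochs: int = 200, lr: float = 0.5):
    """Full-batch gradient descent starting from all-zero parameters."""
    n_in, n_out = len(xs[0]), len(ys[0])
    weights = [[0.0] * n_in for _ in range(n_out)]
    biases = [0.0] * n_out
    for _ in range(epochs):
        gw = [[0.0] * n_in for _ in range(n_out)]
        gb = [0.0] * n_out
        for x, y in zip(xs, ys):
            for j, (p, t) in enumerate(zip(predict(weights, biases, x), y)):
                err = p - t  # gradient of the loss w.r.t. the pre-activation
                gb[j] += err
                for i, xi in enumerate(x):
                    gw[j][i] += err * xi
        for j in range(n_out):
            biases[j] -= lr * gb[j] / len(xs)
            for i in range(n_in):
                weights[j][i] -= lr * gw[j][i] / len(xs)
    return weights, biases


rng = random.Random(0)
corpus = [bytes(rng.randrange(256) for _ in range(4)) for _ in range(200)]
xs = [[b / 255.0 for b in data] for data in corpus]  # bytes scaled to [0, 1]
ys = [run_target(data) for data in corpus]
weights, biases = train(xs, ys)
print(mean_loss(weights, biases, xs, ys))
```

Because `predict` is differentiable in `x`, its gradient with respect to each input byte is available in closed form, which is what the gradient-guided mutation step relies on.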
Gradient-guided Mutation. Why gradient guidance? Gradients indicate the critical parts of the input. What are the critical parts of the input? The parts that affect program branches. How does gradient-guided mutation work? It focuses mutations on the critical parts of the input.
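A minimal sketch of this idea, under stated assumptions: the surrogate below is a single sigmoid unit with hand-picked, hypothetical weights, standing in for a trained model. The gradient of its output with respect to each input byte ranks byte positions, and mutations are applied only to the top-ranked positions, in the direction given by the gradient's sign. The step size and number of rounds are illustrative and not taken from NEUZZ.

```python
import math

WEIGHTS = [6.0, 0.1, -0.2, 0.05]  # hypothetical surrogate weights, one per input byte
BIAS = -3.0


def surrogate(x) -> float:
    """Smooth prediction that some target edge is taken, for normalized input x."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))


def input_gradient(x):
    """Gradient of the surrogate output with respect to each input byte."""
    p = surrogate(x)
    return [p * (1.0 - p) * w for w in WEIGHTS]


def critical_positions(grad, k: int):
    """Indices of the k bytes with the largest absolute gradient."""
    return sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)[:k]


def gradient_guided_mutants(seed: bytes, k: int = 1, step: int = 32, rounds: int = 4):
    """Mutate only the critical bytes, moving each along the sign of its gradient."""
    grad = input_gradient([b / 255.0 for b in seed])
    mutants = []
    for pos in critical_positions(grad, k):
        direction = 1 if grad[pos] > 0 else -1
        for r in range(1, rounds + 1):
            value = min(255, max(0, seed[pos] + direction * step * r))
            mutant = bytearray(seed)
            mutant[pos] = value
            mutants.append(bytes(mutant))
    return mutants


seed = bytes([100, 10, 20, 30])
for m in gradient_guided_mutants(seed):
    print(list(m))
```

With these weights, byte 0 dominates the gradient, so every mutant differs from the seed only at position 0 and the other bytes are left alone.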
Main Idea behind NEUZZ. [Diagram: Program maps Input to Branching Behaviors; Smooth Surrogate (NN) maps Input to Branching Behaviors; Gradient-guided mutation]
A Peek Into NN Model
Generalization to Unseen Branches. Observations: - Real-world program inputs have critical parts - Most branches are affected by the critical parts. NEUZZ solution: - Identify critical parts based on observed branches - Perform more mutations on the critical parts of inputs to explore unseen branches
Design of NEUZZ
Evaluation • 10 real-world programs • LAVA-M and DARPA CGC datasets • Comparison with RNN-based fuzzers • Performance of different model choices
Evaluations: Edge Coverage. NEUZZ vs. state-of-the-art fuzzers on 10 real-world applications for 24 hours. NEUZZ achieves on average 3x more edge coverage than other fuzzers.
Evaluations: Bug Finding. NEUZZ vs. state-of-the-art fuzzers. NEUZZ finds the most bugs and all 5 bug types, including two new CVEs.
Evaluations: LAVA-M and CGC. [Results: LAVA-M dataset; DARPA CGC dataset] NEUZZ outperforms state-of-the-art fuzzers on LAVA-M and CGC.
Evaluations: NEUZZ vs. RNN-based Fuzzer. NEUZZ achieves 6x more edge coverage and needs 20x less training time.
Evaluations: Effect of Different NNs. Edge coverage for 1M mutations. NEUZZ achieves the best performance with NN + incremental learning.
Key Takeaways of NEUZZ ● Use NN gradients to identify the critical locations of program inputs ● Focus mutations on the critical locations ● Minimize runtime overhead by using simple feed-forward neural networks ● Retrain the network incrementally to find new critical locations
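The four takeaways above can be read as one loop: train, mutate where the model points, run, keep what finds new edges, retrain. The sketch below is one possible arrangement, not the NEUZZ implementation. The three callbacks (`run_target`, `train_model`, `propose_mutants`) are hypothetical placeholders for the instrumented program, the feed-forward network training step, and gradient-guided mutation. How `train_model` reuses the previous model is left to the callback.

```python
def fuzz_with_retraining(run_target, train_model, propose_mutants, seeds, rounds=3):
    """Alternate between (re)training a surrogate and mutating the corpus.

    run_target(data) -> set of covered edges            (hypothetical callback)
    train_model(previous_model, corpus) -> model        (hypothetical callback)
    propose_mutants(model, seed) -> iterable of bytes   (hypothetical callback)
    """
    corpus = list(seeds)
    covered = set()
    for data in corpus:
        covered |= run_target(data)

    model = None
    for _ in range(rounds):
        # Retrain on the grown corpus so new critical locations can be learned.
        model = train_model(model, corpus)
        new_inputs = []
        for seed in corpus:
            for mutant in propose_mutants(model, seed):
                edges = run_target(mutant)
                if not edges <= covered:  # the mutant reached at least one new edge
                    covered |= edges
                    new_inputs.append(mutant)
        if not new_inputs:
            break  # nothing new to learn from in this round
        corpus.extend(new_inputs)
    return corpus, covered
```

Only inputs that reach new edges are added to the corpus, so each retraining round sees behavior the previous model had no data for.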
GitHub Repo. NEUZZ is available at https://github.com/Dongdongshe/neuzz