Ankou: Guiding Grey-box Fuzzing towards Combinatorial Difference
Valentin Manès¹, Soomin Kim², Sang Kil Cha²
¹ CSRC, KAIST  ² KAIST
The Success of Grey-box Fuzzing
“OSS-Fuzz has found over 20,000 bugs in 300 open source projects.”
github.com/mrash/afl-cve
Why one more? → Many CVEs $$$$
Grey-box, How?
Fitness Function: if(“interesting”): Add Test Case to seed pool
[Diagram: Fuzzer, Seed Pool (Test Case A, Test Case B, Test Case C), Program, Test Case, Output]
Which Feedback?
Coverage has proved a good tradeoff between cost and benefits.
Ankou: Opportunity to improve?
[Figure: feedback types arranged along a Cost axis, with example fuzzers: zzuf, AFL, Vuzzer, BFF, LibFuzzer, Angora, …]
Coverage-Based Fuzzing

    #include <stdio.h>

    int combinedBranches(char *data) {
      int bits = 0;
      if (data[0] == 'A') bits |= 1;  /* Branch 1 */
      if (data[1] == 'B') bits |= 2;  /* Branch 2 */
      if (data[2] == 'C') bits |= 4;  /* Branch 3 */
      if (bits == 7)
        printf("BINGO\n");
      return 0;
    }

    Test Case   Value   Branch 1   Branch 2   Branch 3
    A           “A”     X
    B           “BB”               X
    C           “AB”    X          X
    D           “ABC”   X          X          X

Fitness Function: if(new branch): Add to seed pool
Seed Pool: Test Case A, Test Case B, Test Case D
A more informative Fitness Function is needed!
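The example on this slide can be replayed with a minimal sketch of a coverage-based fitness function. The names `trace_branches` and `covers_new_branch` are hypothetical, and the three tracked branches are the three character checks from the slide; this is an illustration, not AFL's or Ankou's actual instrumentation.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_BRANCHES 3

/* Records which of the three character checks from the slide succeed.
 * The input must be at least 3 bytes long (pad short inputs with '\0'). */
static void trace_branches(const char *data, bool hit[NUM_BRANCHES]) {
    hit[0] = (data[0] == 'A');  /* Branch 1 */
    hit[1] = (data[1] == 'B');  /* Branch 2 */
    hit[2] = (data[2] == 'C');  /* Branch 3 */
}

/* Coverage-based fitness: returns true (and updates the global coverage
 * map `seen`) only if the test case takes a branch that no earlier test
 * case took. */
static bool covers_new_branch(const char *data, bool seen[NUM_BRANCHES]) {
    bool hit[NUM_BRANCHES];
    bool is_new = false;
    trace_branches(data, hit);
    for (size_t i = 0; i < NUM_BRANCHES; i++) {
        if (hit[i] && !seen[i]) {
            seen[i] = true;
            is_new = true;
        }
    }
    return is_new;
}
```

Fed the values “A”, “BB”, “AB”, “ABC” in that order (padded to three bytes), this sketch keeps A, B, and D. It rejects C because Branch 1 and Branch 2 were each seen before, even though C is the first input to take both together.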
Informative Fitness with Combination
Ankou goal: developing a fitness function taking into account combinations.
1. Quantify the difference between program executions.
2. Make fitness computation fast.
3. Make the fitness adaptive to the program.
Point Representation
[Plot: executions shown as points; x-axis: Branch 1, y-axis: Branch 2]
Distance between Executions
Euclidean Distance
[Plot: x-axis: Branch 1, y-axis: Branch 2]
Distance between Executions
Detects Combinatorial Difference!
[Plot: x-axis: Branch 1, y-axis: Branch 2]
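A minimal sketch of the distance between two executions follows. It assumes each execution is stored as an array of per-branch hit counts, which is one reading of the Branch 1 / Branch 2 axes. The hypothetical helper `squared_distance` returns the squared Euclidean distance: it orders pairs of executions the same way as the Euclidean distance and keeps the arithmetic in integers.

```c
#include <stddef.h>
#include <stdint.h>

/* Squared Euclidean distance between two executions, each given as an
 * array of n per-branch hit counts. The running sum is unsigned and may
 * wrap for extremely large counts; that case is ignored in this sketch. */
static uint64_t squared_distance(const uint32_t *a, const uint32_t *b,
                                 size_t n) {
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        /* Absolute difference, computed without signed overflow. */
        uint64_t diff = a[i] > b[i] ? (uint64_t)(a[i] - b[i])
                                    : (uint64_t)(b[i] - a[i]);
        sum += diff * diff;
    }
    return sum;
}
```

With the points (1, 0), (0, 1), and (1, 1), the third point takes no branch that the first two did not, yet its squared distance to each of them is 1. The combination is visible to the distance even though it is invisible to branch coverage.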
Distance-based Fitness Function
Point-to-Pool = ?
[Plot: Seed Pool points and one new point; x-axis: Branch 1, y-axis: Branch 2]
Distance-based Fitness Function
Point-to-Pool = Minimum Point-to-Point
[Plot: Seed Pool points and one new point; x-axis: Branch 1, y-axis: Branch 2]
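The point-to-pool rule can be sketched as a linear scan over the seeds. The function names and the flat row-major layout of the pool are assumptions made for this illustration, and distances are kept squared so the arithmetic stays in integers.

```c
#include <stddef.h>
#include <stdint.h>

/* Squared Euclidean distance between two arrays of n hit counts. */
static uint64_t squared_distance(const uint32_t *a, const uint32_t *b,
                                 size_t n) {
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t diff = a[i] > b[i] ? (uint64_t)(a[i] - b[i])
                                    : (uint64_t)(b[i] - a[i]);
        sum += diff * diff;
    }
    return sum;
}

/* Point-to-pool distance: the minimum point-to-point distance between
 * `point` and any seed. `pool` holds num_seeds rows of n counts each.
 * Returns UINT64_MAX for an empty pool. */
static uint64_t point_to_pool(const uint32_t *point, const uint32_t *pool,
                              size_t num_seeds, size_t n) {
    uint64_t best = UINT64_MAX;
    for (size_t s = 0; s < num_seeds; s++) {
        uint64_t d = squared_distance(point, pool + s * n, n);
        if (d < best) {
            best = d;
        }
    }
    return best;
}
```

Each call costs one distance computation per seed, which is why the cost of a single distance matters so much on the following slides.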
Cost Sensitivity
The fitness function is computed for every test case.
[Plot: Seed Pool points; x-axis: Branch 1, y-axis: Branch 2]
Problem: Slow Computation
Euclidean Distance = 𝒪(#branch)
Cost Reduction
Euclidean Distance = 𝒪(#branch)
Dimensionality Reduction: see paper for details on the Dynamic PCA.
Euclidean Distance = 𝒪(#“representative branch”)
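To illustrate why dimensionality reduction lowers the cost, the sketch below projects an n-dimensional execution onto k basis vectors and then measures distance in the k-dimensional space. This is not Ankou's Dynamic PCA: how the basis is chosen and updated is exactly what the paper covers, and here the basis is simply taken as given. The names `project` and `squared_distance_reduced` are hypothetical.

```c
#include <stddef.h>

/* Projects an n-dimensional vector x onto k basis vectors.
 * `basis` holds k rows of n coefficients (row-major); `out` receives
 * the k projected coordinates. */
static void project(const double *x, const double *basis,
                    size_t k, size_t n, double *out) {
    for (size_t j = 0; j < k; j++) {
        double dot = 0.0;
        for (size_t i = 0; i < n; i++) {
            dot += basis[j * n + i] * x[i];
        }
        out[j] = dot;
    }
}

/* Squared Euclidean distance in the reduced space: k terms instead of
 * one term per branch. */
static double squared_distance_reduced(const double *a, const double *b,
                                       size_t k) {
    double sum = 0.0;
    for (size_t j = 0; j < k; j++) {
        double diff = a[j] - b[j];
        sum += diff * diff;
    }
    return sum;
}
```

If seeds are stored already projected, each comparison against the pool touches k coordinates rather than every branch.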
Ankou Adaptive Fitness Function
Coverage-based fitness function:
    if(new branch):
        Add test to seed pool
Ankou fitness function:
    if(new branch):
    if(Point-to-Pool distance ??):
        Add test to seed pool
Ankou Adaptive Fitness Function
Ankou fitness function:
    if(new branch):
    if(Point-to-Pool distance > θ_min):
        Add test to seed pool
θ_min ← Minimum inter-seed distance
[Plot: seeds with θ_min marked; x-axis: Branch 1, y-axis: Branch 2]
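The adaptive threshold can be sketched as follows. Two assumptions are made and should be checked against the paper: the two conditions are treated as alternatives (a new branch or a large enough distance is sufficient), and the threshold is recomputed from scratch on each call. All function names are hypothetical, and distances are kept squared, which leaves the comparison against the threshold unchanged.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Squared Euclidean distance between two arrays of n hit counts. */
static uint64_t squared_distance(const uint32_t *a, const uint32_t *b,
                                 size_t n) {
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t diff = a[i] > b[i] ? (uint64_t)(a[i] - b[i])
                                    : (uint64_t)(b[i] - a[i]);
        sum += diff * diff;
    }
    return sum;
}

/* Minimum distance from `point` to any of the num_seeds rows of `pool`. */
static uint64_t point_to_pool(const uint32_t *point, const uint32_t *pool,
                              size_t num_seeds, size_t n) {
    uint64_t best = UINT64_MAX;
    for (size_t s = 0; s < num_seeds; s++) {
        uint64_t d = squared_distance(point, pool + s * n, n);
        if (d < best) best = d;
    }
    return best;
}

/* Threshold: the minimum distance between any two seeds in the pool.
 * Returns UINT64_MAX when the pool has fewer than two seeds, so that no
 * distance can exceed it. */
static uint64_t min_inter_seed_distance(const uint32_t *pool,
                                        size_t num_seeds, size_t n) {
    uint64_t best = UINT64_MAX;
    for (size_t s = 0; s < num_seeds; s++) {
        for (size_t t = s + 1; t < num_seeds; t++) {
            uint64_t d = squared_distance(pool + s * n, pool + t * n, n);
            if (d < best) best = d;
        }
    }
    return best;
}

/* Assumed combination: keep the test case if it covers a new branch, or
 * if it lies farther from the pool than the closest pair of seeds. */
static bool should_add(bool new_branch, const uint32_t *point,
                       const uint32_t *pool, size_t num_seeds, size_t n) {
    if (new_branch) return true;
    return point_to_pool(point, pool, num_seeds, n) >
           min_inter_seed_distance(pool, num_seeds, n);
}
```

Because the threshold comes from the pool itself, it needs no per-program tuning: a tightly clustered pool accepts small differences, and a spread-out pool demands larger ones.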
Benchmark
• Use 24 packages used by CollAFL¹.
• All experiments are 6 × 24-hour runs.
• In total, our experiments constitute 2,682 CPU days.
¹ S. Gan, C. Zhang, X. Qin, X. Tu, K. Li, Z. Pei, and Z. Chen, “CollAFL: Path sensitive fuzzing,” in Proceedings of the IEEE Symposium on Security and Privacy, 2018, pp. 660–677.
Q: Is the New Fitness Function Effective?
Ankou with and without Distance-based
Distance-based finds 44% more crashes.
[Plots: crash ratio (# Crashes) and throughput ratio (Throughput) per subject, log scale]
Q: How does Ankou compare to other grey-box fuzzers?
Ankou vs. AFL
Ankou finds 41% more unique crashes.
[Plot: crash ratio (# Crashes) and coverage ratio (Coverage) per subject, log scale]
Ankou vs. AFL: Speed
Ankou is 35% slower than AFL.
[Plot: throughput ratio per subject, log scale]
Conclusion
1. Coverage-based fuzzers ignore combinations of branches.
2. Ankou's distance-based fitness function quantifies combinatorial difference while being fast and adaptive to programs.
3. While being 35% slower than AFL, Ankou finds 41% more crashes.
Question?