Coverage-based Greybox Fuzzing as Markov Chain (Marcel Böhme et al.)



  1. Coverage-based Greybox Fuzzing as Markov Chain. Marcel Böhme, Van-Thuan Pham, Abhik Roychoudhury (School of Computing, NUS, Singapore). FM Update 2018. Presented by Raveendra Kumar M and Animesh Basak Chowdhury, TCS Research, July 27, 2018. Some of the slides are adapted from the authors' presentation.

  2. Introduction. Fuzz testing is an automated testing technique that uncovers software errors by executing the target program with a large number of randomly generated test inputs. There are three main approaches:
     ◮ Black-box fuzzing: random testing [1].
     ◮ White-box fuzzing: SAGE [2].
     ◮ Grey-box fuzzing: American Fuzzy Lop [3].
     [1] Miller et al., An Empirical Study of the Reliability of UNIX Utilities, CACM, 1990.
     [2] Godefroid et al., Automated Whitebox Fuzz Testing, NDSS, 2008.
     [3] Zalewski, http://lcamtuf.coredump.cx/afl/.

  3. Grey-box fuzzing. Black-box fuzzing → open-loop control system. Grey-box fuzzing → closed-loop control system. Feedback function $H(s)$ ∼ branch-pair coverage (a branch pair is a pair of consecutive nodes in the CFG).
     [Figure: flowchart of the grey-box fuzzing loop. The target program $P$ is instrumented to give $P'$. Generate a new input $t_g$ from some $t \in T_G$, execute $P'$ with $t_g$, and monitor coverage. If the behaviour is interesting, retain $t_g$ ($T_G = T_G \cup \{t_g\}$); otherwise discard $t_g$.]

  4. Grey-box fuzzing – Working example. The fuzzer starts from seed ① "a".
     [Figure: control-flow graph of the program under test, with labels A to F. The program sets i = 0 and cb = 0 and calls read(fd, inp, 20). While inp[i] != '\0', it increments cb if inp[i] == 'b', then increments i. After the loop, it calls abort() if cb ≥ 5 and otherwise returns cb.]
     Branch-pair visit counts ("-" means not visited):
     Id  input  AB  AC  BA  CA  BD  CD  DE  DF
     1   "a"    -   1   -   -   -   1   -   1

  5. Grey-box fuzzing – Working example. Seed ① "a" is mutated into "b" (②), "ab" (③) and "c". The inputs "b" and "ab" cover new branch pairs and are retained; "c" covers nothing new and is discarded.
     Id  input  AB  AC  BA  CA  BD  CD  DE  DF
     1   "a"    -   1   -   -   -   1   -   1
     2   "b"    1   -   -   -   1   -   -   1
     3   "ab"   1   1   -   1   1   -   -   1
         "c"    -   1   -   -   -   1   -   1

  6. Grey-box fuzzing – Working example. Same mutants and table as on slide 5, with "b" (②) and "ab" (③) marked as retained and "c" as discarded.

  7. Grey-box fuzzing – Working example. The retained inputs are fuzzed further, giving the mutants "c", "bb" (④), "aba" (⑤), "abb" and others ("…"). The inputs "bb" and "aba" produce new visit counts and are retained; "abb" produces none and gets no Id.
     Id  input  AB  AC  BA  CA  BD  CD  DE  DF
     1   "a"    -   1   -   -   -   1   -   1
     2   "b"    1   -   -   -   1   -   -   1
     3   "ab"   1   1   -   1   1   -   -   1
     4   "bb"   2   -   1   -   1   -   -   1
     5   "aba"  1   2   1   1   -   1   -   1
         "abb"  2   1   1   1   1   -   -   1

  8. Grey-box fuzzing algorithm
     Algorithm 1: Grey-box fuzzing algorithm
     Require: Program P, initial non-crashing seeds I_s.
     Ensure: Set of crashing inputs T_C and a tree of test inputs T_G for P.
     T_G = I_s
     Run P with I_s and observe visit counts of branch pairs.
     repeat
         t = getNextInput()                          ⊲ t ∈ T_G
         N = assignEnergy(t)
         T_m = fuzzTestInput(t, N)                   ⊲ T_m : { t_g | t_g ∈ MUTATE(t) }
         for all t_g ∈ T_m do
             S_g = run(P, t_g)
             if S_g = ⊥ then                         ⊲ Did t_g cause a crash or hang?
                 T_C.add(t_g)
             else if isInterestingTestInput(t_g, S_g) then
                 T_G.add(t_g)                        ⊲ Retain interesting test input
             end if
         end for
     until user interrupt received.
     return (T_G, T_C)
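The loop in Algorithm 1 can be sketched in Python on the working example from the earlier slides. This is a minimal illustration under stated assumptions, not the AFL implementation: `run` is a hand-written stand-in for the instrumented program, the alphabet `"abc"`, the `mutate` operators, the round-robin seed choice, the constant energy and the execution budget are all hypothetical, and an input counts as interesting when it produces a (branch pair, visit count) combination not seen before.

```python
import random
from collections import Counter

ALPHABET = "abc"  # assumed input alphabet for this sketch


def run(inp):
    """Stand-in for the instrumented program P'.

    Returns branch-pair visit counts, or None if the input crashes.
    """
    trace, cb = [], 0
    for ch in inp[:20]:          # read(fd, inp, 20)
        trace.append("A")
        if ch == "b":
            trace.append("B")
            cb += 1
        else:
            trace.append("C")
    trace.append("D")
    if cb >= 5:                  # abort()
        return None
    trace.append("F")
    return Counter(a + b for a, b in zip(trace, trace[1:]))


def is_interesting(cov, seen):
    """True if cov has a (branch pair, count) combination not seen so far."""
    return not set(cov.items()) <= seen


def mutate(t, rng):
    """Hypothetical mutation: replace, insert or delete one character."""
    chars = list(t)
    op = rng.choice(("replace", "insert", "delete"))
    if op == "insert" or not chars:
        chars.insert(rng.randrange(len(chars) + 1), rng.choice(ALPHABET))
    elif op == "replace":
        chars[rng.randrange(len(chars))] = rng.choice(ALPHABET)
    else:
        del chars[rng.randrange(len(chars))]
    return "".join(chars)[:20]


def fuzz(seeds, budget=2000, energy=16, rng=None):
    rng = rng or random.Random(0)
    queue, crashes, seen = list(seeds), [], set()   # T_G, T_C, coverage so far
    for s in queue:                                 # seeds are assumed non-crashing
        seen.update(run(s).items())
    executed, idx = 0, 0
    while executed < budget:                        # stands in for "until user interrupt"
        t = queue[idx % len(queue)]                 # getNextInput(): round robin
        idx += 1
        for _ in range(energy):                     # assignEnergy(): constant here
            tg = mutate(t, rng)
            executed += 1
            cov = run(tg)
            if cov is None:
                crashes.append(tg)
            elif is_interesting(cov, seen):
                seen.update(cov.items())
                queue.append(tg)
    return queue, crashes
```

Feeding the inputs from the working example through `is_interesting` in slide order retains "a", "b", "ab", "bb" and "aba" and discards "c" and "abb", which matches the Ids shown in the tables.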

  9. N = assignEnergy(t)
     Let $N = 100$.
     Let $N_1 = N \times$ a factor inversely proportional to $t_g$'s execution time (from 0.1 for long execution times to 3 for short ones).
     Let $N_2 = N_1 \times$ a factor based on the number of branch pairs covered by $t_g$ (from 0.25 for low coverage to 3 for high coverage).
     Let $N_3 = N_2 \times$ a factor based on the cycle in which $t_g$ was discovered and the number of times $t$ has been fuzzed (low = 1 to high = 4).
     Let $N_4 = N_3 \times$ a factor based on the depth at which $t_g$ was discovered (low = 1 to high = 5).
     return $N_4$
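The multiplicative structure of assignEnergy can be sketched as follows. How each factor is derived from execution time, coverage, discovery cycle and depth is not given on the slide, so this hypothetical `assign_energy` takes the four factors as arguments and only clamps them to the stated ranges before multiplying.

```python
def clamp(x, lo, hi):
    """Restrict x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def assign_energy(time_factor, coverage_factor, cycle_factor, depth_factor, base=100):
    """Sketch of N = assignEnergy(t): a base energy scaled by four factors.

    The factor values themselves are assumed inputs; only their ranges
    come from the slide.
    """
    n1 = base * clamp(time_factor, 0.1, 3.0)      # execution time
    n2 = n1 * clamp(coverage_factor, 0.25, 3.0)   # branch pairs covered
    n3 = n2 * clamp(cycle_factor, 1.0, 4.0)       # discovery cycle / times fuzzed
    n4 = n3 * clamp(depth_factor, 1.0, 5.0)       # discovery depth
    return int(n4)
```

With these ranges the energy for one test input lies between 2 (all factors at their minimum) and 18000 (all at their maximum).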

  10. Problem Statement
      Listing 1: Program crashes when string s == "bad!"
          void crashme(char *s) {
            if (s[0] == 'b')
              if (s[1] == 'a')
                if (s[2] == 'd')
                  if (s[3] == '!')
                    abort();
          }
      Black-box fuzzing
      ◮ Assumption: an alphabet of $2^8$ characters.
      ◮ Expected number of test cases required to catch the bug: $2^{32}$.
      Coverage-based grey-box fuzzing (CGF)
      ◮ Markov chain modeling of CGF gives the expectation that $2^{12}$ tests are needed to catch the crash.
      ◮ Current CGF algorithms do not assign energy judiciously to interesting test vectors for further fuzzing.

  11. Problem Statement
      Listing 2: Program crashes when string s == "bad!"
          void crashme(char *s) {
            if (s[0] == 'b')
              if (s[1] == 'a')
                if (s[2] == 'd')
                  if (s[3] == '!')
                    abort();
          }
      Black-box fuzzing
      ◮ Assumption: an alphabet of $2^8$ characters.
      ◮ Expected number of test cases required to catch the bug: $2^{32}$.
      Coverage-based grey-box fuzzing (CGF)
      ◮ Markov chain modeling of CGF gives the expectation that $2^{12}$ tests are required to catch the crash.
      ◮ Current CGF algorithms do not assign energy judiciously to interesting test vectors for further fuzzing.
      Objective
      Tune the energy assignment scheme to be close to the ideal.
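The two expectations can be checked with a short calculation. It is a sketch under two assumptions: the black-box fuzzer draws each of the four characters uniformly from $2^8$ values, and the coverage-based fuzzer always mutates the most advanced input found so far by picking one of the four positions uniformly and substituting a uniformly random character.

```latex
\begin{align*}
p_{\text{blackbox}} &= \left(2^{-8}\right)^{4} = 2^{-32}
  &&\Rightarrow\quad \mathbb{E}[\text{tests}] = \frac{1}{p_{\text{blackbox}}} = 2^{32}, \\
p_{\text{step}} &= \frac{1}{4} \cdot 2^{-8} = 2^{-10}
  &&\Rightarrow\quad \mathbb{E}[\text{tests}] = 4 \cdot \frac{1}{p_{\text{step}}} = 4 \cdot 2^{10} = 2^{12}.
\end{align*}
```

Here $p_{\text{step}}$ is the chance that one mutation fixes the next character of "bad!", and the factor 4 counts the four characters that have to be fixed one after another.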

  12. Some terminologies
      Branch pair tuple: $BP_i = \langle bp_i, C_i \rangle$, where $bp_i$ is branch pair $i$ and $C_i$ is its visit count.
      Path: the sequence of branch pair tuples $[BP_i, BP_j, \ldots]$ visited during the execution of the program $P$ on a test vector $t$.
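As an illustration of these definitions, the sketch below turns an execution trace into a path of branch pair tuples. The function name `path_of` and the trace string are assumptions: the trace uses the labels A to F from the working example, and "ACABACDF" is the label sequence inferred for input "aba".

```python
from collections import Counter


def path_of(trace):
    """Return the path as a list of (branch pair, visit count) tuples.

    Tuples appear in the order in which each branch pair is first visited.
    """
    counts = Counter(a + b for a, b in zip(trace, trace[1:]))
    return list(counts.items())


print(path_of("ACABACDF"))
# [('AC', 2), ('CA', 1), ('AB', 1), ('BA', 1), ('CD', 1), ('DF', 1)]
```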

  13. Basic Concepts: Probabilistic Modeling
      Random variable: maps possible outcomes from the sample space to real numbers, $X : \Omega \to \mathbb{R}$.
      Conditional probability: the probability of an event given partial information, $P(B \mid A) = P(B \cap A) / P(A)$.
      Stochastic process: a collection of random variables indexed by time.

  14. Discrete Time Stochastic Process (DTSP)
      A sequence of random variables $X_0, X_1, X_2, \ldots$, denoted by $\{X_n\}$.
      Time: $n = 0, 1, 2, \ldots$
      State space: the set of all values that the $X_n$ can take, written as an m-dimensional vector $s = (s_1, s_2, \ldots, s_m)$. Each $X_n$ takes one of these $m$ values, so $X_n \leftrightarrow s$.

  15. Discrete Time Markov Chain (DTMC)
      A DTSP is a discrete time Markov chain (DTMC) iff it has the Markovian property:
      $$P[X_{n+1} = j \mid X_n = i, X_{n-1} = i_{n-1}, \ldots, X_0 = i_0] = P[X_{n+1} = j \mid X_n = i] = P_{ij}(n)$$
      Markov property: the future state is independent of the past, given that the present state is fully known (observable).
      $P_{ij}(n)$ is the probability of a transition from state $i$ to state $j$ at time $n$. It is also referred to as the one-step transition probability.

  16. Rat Maze Problem as DTMC
      [Figure: A rat maze with cells 1 to 9 arranged in a 3×3 grid. Allowed transitions are horizontal and vertical neighbors.]
      [Figure: Markov chain modeling of the rat maze problem. Each cell is a state; transitions to neighboring cells are labelled 1/2 from corner cells, 1/3 from edge cells and 1/4 from the center cell.]
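The transition probabilities of the rat maze can be generated rather than read off the figure. The sketch below assumes the rat moves to each horizontal or vertical neighbor with equal probability; the function name `rat_maze_matrix` is hypothetical, and cells are indexed 0 to 8 row by row, so index 0 is cell 1 and index 4 is the center cell 5.

```python
from fractions import Fraction


def rat_maze_matrix(rows=3, cols=3):
    """Build the transition matrix of a grid maze.

    From each cell the rat moves to each horizontal or vertical
    neighbor with equal probability.
    """
    n = rows * cols
    P = [[Fraction(0)] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            neighbors = [
                (r + dr, c + dc)
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                if 0 <= r + dr < rows and 0 <= c + dc < cols
            ]
            for nr, nc in neighbors:
                P[r * cols + c][nr * cols + nc] = Fraction(1, len(neighbors))
    return P


P = rat_maze_matrix()
print(P[0][1], P[1][0], P[4][1])  # corner, edge and center transitions
```

Every row of the resulting matrix sums to 1, since the rat always moves to some neighbor.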

  17. Homogeneous DTMC
      A DTMC is homogeneous iff its transition probabilities do not depend on the time $n$, i.e.
      $$P[X_{n+1} = j \mid X_n = i] = P[X_1 = j \mid X_0 = i] = P_{ij}.$$
      Transition matrix of a homogeneous DTMC: $P = [P_{ij}]_{i,j \in E}$, for example with four states
      $$P = \begin{pmatrix} p_{1,1} & p_{1,2} & p_{1,3} & p_{1,4} \\ p_{2,1} & p_{2,2} & p_{2,3} & p_{2,4} \\ p_{3,1} & p_{3,2} & p_{3,3} & p_{3,4} \\ p_{4,1} & p_{4,2} & p_{4,3} & p_{4,4} \end{pmatrix}$$
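Because a homogeneous DTMC uses the same matrix at every time step, a distribution over states can be pushed forward by repeated multiplication with $P$. The sketch below shows this for a made-up two-state matrix; the helper names `is_stochastic` and `step` and the matrix entries are illustrative assumptions.

```python
from fractions import Fraction


def is_stochastic(P):
    """Check that every row is a probability distribution."""
    return all(all(p >= 0 for p in row) and sum(row) == 1 for row in P)


def step(dist, P):
    """Advance a row distribution by one time step: dist' = dist * P."""
    return [sum(dist[i] * P[i][j] for i in range(len(P))) for j in range(len(P[0]))]


# Hypothetical two-state chain, used only to illustrate the computation.
P = [
    [Fraction(1, 2), Fraction(1, 2)],
    [Fraction(1, 4), Fraction(3, 4)],
]
d0 = [Fraction(1), Fraction(0)]   # start in the first state
d1 = step(d0, P)                  # [1/2, 1/2]
d2 = step(d1, P)                  # [3/8, 5/8]
```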

  18. Coverage-Based Fuzzing as Homogeneous DTMC
      Coverage-based greybox fuzzing can be modeled as a time-homogeneous DTMC.
      State space: $S = S^+ \cup S^-$, where $S^+$ is the set of paths already explored by the seeds in $T_G$ and $S^-$ is the set of paths yet to be discovered by fuzzing $t \in T_G$.
      Assumption: the probability of generating, by fuzzing an already generated input $t_j$, an input that exercises the undiscovered path $i$ is the same as the probability of generating $t_j$ by fuzzing a test vector $t_i$ that exercises path $i$.

  19. Coverage-based Greybox Fuzzing as Markov Chain: Example
          void crashme(char* s) {
            if (s[0] == 'b')
              if (s[1] == 'a')
                if (s[2] == 'd')
                  if (s[3] == '!')
                    abort();
          }
      [Figure: Markov chain over the states ****, b***, ba**, bad* and bad!. The forward transitions, each of which discovers the next letter, are labelled $2^{-10}$, and the self-loop on **** is labelled $1 - 2^{-10}$.]
      Defining the coverage-based fuzzer:
      • Start with a seed that is a random 4-letter word.
      • Given a seed, the fuzzer chooses a letter and substitutes it.
      Presented by Marcel Böhme
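The fuzzer defined on this slide can be simulated directly. The sketch below rests on several assumptions: the seed is four zero bytes rather than a random word, coverage is measured by the hypothetical `progress` function (the number of leading characters matching "bad!"), and the fuzzer always mutates the most advanced input found so far, which corresponds to putting all energy on the newest path.

```python
import random

TARGET = b"bad!"


def progress(s):
    """Number of leading characters of s that match TARGET (a coverage proxy)."""
    n = 0
    for got, want in zip(s, TARGET):
        if got != want:
            break
        n += 1
    return n


def fuzz_crashme(seed=b"\0\0\0\0", rng=None, max_tests=10**6):
    """Count the tests needed until an input equal to TARGET is generated.

    Returns None if the crash is not found within max_tests.
    """
    rng = rng or random.Random()
    best = bytes(seed)
    for tests in range(1, max_tests + 1):
        mutant = bytearray(best)
        mutant[rng.randrange(4)] = rng.randrange(256)   # choose a letter, substitute it
        if progress(mutant) == len(TARGET):
            return tests                                # crashme() would abort here
        if progress(mutant) > progress(best):
            best = bytes(mutant)                        # retain the new path
    return None


rng = random.Random(1)
runs = [fuzz_crashme(rng=rng) for _ in range(20)]
print(sum(runs) / len(runs))   # average number of tests over 20 runs
```

Under these assumptions each mutation advances with probability $2^{-10}$, so the average over many runs should be close to $4 \cdot 2^{10} = 2^{12}$ tests.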
