Pruning the Search Space in Path-based Test Generation


  1. Pruning the Search Space in Path-based Test Generation. Sébastien Bardin (sebastien.bardin@cea.fr), CEA-LIST, Software Security Labs. Joint work with Philippe Herrmann.

  2. Context. Automatic test data generation from source code (STDG). The test suite must achieve a global structural coverage objective: all instructions, all branches, etc. We do not consider the oracle generation issue: we assume an external automatic oracle, either a perfect oracle (back-to-back testing) or a partial oracle (assertions / contracts).

  3. Symbolic Execution. Symbolic Execution (SE) is a very fruitful approach for STDG: efficiency, robustness. SE in a nutshell. Constraint-based reasoning: translate a part of the program into a logical formula ϕ, such that a solution of ϕ is a relevant test datum (TD). Path-based approach: focus on a single path at once and enumerate (bounded) paths; the formulas are simple, only conjunctions (no quantifier, no fixpoint). Concolic paradigm: combination of symbolic and dynamic execution, which brings robustness to "difficult-to-model" programming features.

  4. A few prototypes. 2004: PathCrawler (CEA). 2005: Dart (Bell Labs), Cute (Uni. of Illinois / Berkeley). 2006: Exe (Stanford). 2007: Jpf (NASA). 2008: Osmose (CEA), Sage (Microsoft), Pex (Microsoft).

  5. Main Limitations. Two major bottlenecks for Symbolic Execution: (1) constraint solving (along a single path); (2) the number of paths. Path explosion phenomenon: nesting of loops and conditional instructions; inlining of function calls. Moreover, SE requires a user-defined path bound k: things get worse if k is over-estimated, and sometimes very long paths are needed to exhibit specific behaviours. Our goal: lower the path explosion in SE.

  6. Not all Paths are Relevant for STDG. Irrelevant paths: in practice, SE enumerates all k-paths, but the true goal is to cover "items" (instructions, branches). Some paths are very unlikely to improve the current coverage. Idea: detect a priori irrelevant paths and discard them, to lower the path explosion. Our results: (1) three complementary heuristics to prune likely redundant paths; (2) implementation in the Osmose tool and experiments.

  7. Outline: Context; Symbolic Execution; Heuristics; Experiments; Conclusion.

  8. Path Predicate. Let $\pi$ be a finite path of the program $P$, $D$ the input space of $P$, and $V \in D$ an input vector. Path predicate: a path predicate for $\pi$ is a formula $\varphi_\pi$ interpreted on $D$ such that if $V \models \varphi_\pi$ then the execution of $P$ on $V$ exercises $\pi$ at runtime. More formally, let $\pi = \xrightarrow{t_1} \xrightarrow{t_2} \dots \xrightarrow{t_n}$. The greatest path predicate is $\bar{\varphi}_\pi = \mathit{wpre}(t_1, \mathit{wpre}(t_2, \dots \mathit{wpre}(t_n, \top)))$, and a path predicate is any $\varphi_\pi$ such that $\varphi_\pi \Rightarrow \bar{\varphi}_\pi$. A path predicate is typically computed via strongest postcondition.
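
As a small worked example of the definition above, consider a hypothetical three-transition path (not taken from the slides). The sketch assumes the usual weakest-precondition rules, $\mathit{wpre}(\mathtt{assume}(c), \psi) = c \wedge \psi$ and $\mathit{wpre}(x := e, \psi) = \psi[e/x]$, which the slides do not spell out.

```latex
\begin{align*}
\pi &= \xrightarrow{\;t_1:\ \mathtt{assume}(x > 0)\;}
       \xrightarrow{\;t_2:\ y := x + 1\;}
       \xrightarrow{\;t_3:\ \mathtt{assume}(y > 5)\;} \\
\mathit{wpre}(t_3, \top) &= (y > 5) \wedge \top \;\equiv\; y > 5 \\
\mathit{wpre}(t_2,\ y > 5) &= (y > 5)[x + 1 / y] \;=\; x + 1 > 5 \\
\bar{\varphi}_\pi = \mathit{wpre}(t_1,\ x + 1 > 5) &= (x > 0) \wedge (x + 1 > 5) \;\equiv\; x > 4
\end{align*}
```

Any input with $x > 4$ exercises $\pi$. A stronger formula such as $\varphi_\pi = (x = 10)$ is also a path predicate for $\pi$, since $x = 10 \Rightarrow x > 4$, but it is not the greatest one.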

  9. Framework of Symbolic Execution. Path-based test data generation: (1) choose an uncovered ($k$-bounded) path $\pi$; (2) compute one of its path predicates $\varphi_\pi$; (3) solve $\varphi_\pi$: a solution is a TD exercising path $\pi$; (4) update the coverage, and if there is still something to cover then go to 1. Parameter 1, logical theory: not relevant here. Parameter 2, path enumeration strategy: here, standard DFS. Extension: concolic execution.

  10. Symbolic Execution, Basic Procedure (BP): choose a path; compute its path predicate, solve it, and update the coverage; choose the next path by DFS backtracking, and so on.
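
The basic procedure can be sketched in a few lines of Python. This is a minimal illustration under strong assumptions, not the Osmose implementation: the CFG encoding, the names `solve` and `basic_procedure`, and the brute-force search standing in for a constraint solver are all hypothetical, and coverage is counted on nodes.

```python
from typing import Callable, Dict, List, Optional, Sequence, Set, Tuple

Guard = Callable[[int], bool]
# Toy CFG (hypothetical encoding): node -> list of (guard, successor).
CFG = Dict[str, List[Tuple[Guard, str]]]


def solve(guards: Sequence[Guard], domain: Sequence[int]) -> Optional[int]:
    """Brute-force stand-in for a solver: first input satisfying all guards."""
    for v in domain:
        if all(g(v) for g in guards):
            return v
    return None  # path predicate unsatisfiable over the domain


def basic_procedure(cfg: CFG, entry: str, k: int, domain: Sequence[int]):
    """Enumerate k-bounded paths by DFS, solve each one, track node coverage."""
    covered: Set[str] = set()
    tests: List[int] = []

    def dfs(node: str, guards: List[Guard], path: List[str]) -> None:
        if covered >= set(cfg):            # nothing left to cover: stop
            return
        if not cfg[node] or len(path) >= k:
            td = solve(guards, domain)     # solve the path predicate
            if td is not None:
                tests.append(td)           # the solution is a test datum
                covered.update(path)       # update coverage
            return
        for guard, succ in cfg[node]:      # DFS; returning here is the backtrack
            dfs(succ, guards + [guard], path + [succ])

    dfs(entry, [], [entry])
    return tests, covered


cfg: CFG = {
    "entry": [(lambda x: x > 0, "pos"), (lambda x: x <= 0, "neg")],
    "pos": [(lambda x: x % 2 == 0, "even"), (lambda x: x % 2 != 0, "odd")],
    "neg": [], "even": [], "odd": [],
}
tests, covered = basic_procedure(cfg, "entry", k=4, domain=range(-5, 6))
print(tests, sorted(covered))
```

On this toy graph every enumerated path happens to cover a new node, so nothing is wasted; the heuristics discussed in the talk target programs where that is not the case.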

  16. Heuristic 1: Look-Ahead (LA). Procedure BP tries to cover a new path at each iteration, but this new path does not necessarily cover new items: the resolution time is wasted, and more useless paths will be explored from this prefix. On the example, full coverage requires at most 3 TD, while there are $\approx 2^{k+1}$ paths of length $\leq k$. (Figure: CFG of function main, with True / False branches.)

  17. Idea: check whether uncovered items may be reached from the current instruction. If not, solve the current prefix but do not expand it. The check is optimistic and based on the CFG abstraction of the program. The Look-Ahead heuristic enjoys nice properties: soundness (it discards only redundant paths); relative completeness (BP+LA always achieves the same coverage as BP); path reduction (BP+LA always explores fewer paths than BP). Difficulty: efficient computation of the (CFG) reachability set.
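
The look-ahead check can be illustrated as follows. This is a sketch on a hypothetical CFG encoding (node mapped to its successor list), with node coverage standing in for "items"; the function names are assumptions, not the tool's API. Because the CFG ignores path feasibility, the check is optimistic: it may keep a prefix whose uncovered successors are in fact infeasible, but it only discards prefixes from which no uncovered node is reachable at all.

```python
from typing import Dict, List, Set


def reachable(cfg: Dict[str, List[str]], start: str) -> Set[str]:
    """Nodes reachable from `start` in the CFG (including `start`)."""
    seen: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(cfg.get(node, []))
    return seen


def should_expand(cfg: Dict[str, List[str]], node: str, covered: Set[str]) -> bool:
    """Look-ahead: expand the prefix only if an uncovered node is reachable."""
    return not (reachable(cfg, node) <= covered)


cfg = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
covered = {"a", "b", "d"}
print(should_expand(cfg, "b", covered))  # only b and d ahead, both covered
print(should_expand(cfg, "a", covered))  # c is still uncovered
```

This version recomputes the reachability set at every call, which is exactly the cost that the difficulty noted in the slide refers to.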

  18. Reachability Set Computation. Procedure ReachSet : node → Set(node). The standard worklist algorithm has the following problems in our context: all reachability sets are computed at the same time, even if BP will not use all of them; it is not designed for interprocedural or context-sensitive analysis.

  19. Reachability Set Computation (2): efficient interprocedural analysis. Efficient computation: lazy computation; computation cache. Interprocedural analysis: compact representation of sets of nodes (manipulate CFG nodes and Call Graph (CG) nodes); function summaries (propagate reachable CG nodes, from the CG); lazy computation and the computation cache extend to the CG.
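
The "lazy computation plus cache" part can be sketched as below. The class `LazyReach` and its fields are hypothetical names; the sketch is intraprocedural only and leaves out the call-graph nodes and function summaries mentioned in the slide. A set is computed only when first requested, and a traversal reuses any set already cached for a node it meets.

```python
from typing import Dict, FrozenSet, List, Set


class LazyReach:
    """Reachability sets computed on demand and cached per node."""

    def __init__(self, cfg: Dict[str, List[str]]) -> None:
        self.cfg = cfg
        self.cache: Dict[str, FrozenSet[str]] = {}
        self.computed = 0  # number of sets actually computed (for illustration)

    def reach_set(self, node: str) -> FrozenSet[str]:
        if node in self.cache:              # cache hit: no traversal at all
            return self.cache[node]
        seen: Set[str] = set()
        stack = [node]
        while stack:
            n = stack.pop()
            if n in seen:
                continue
            if n in self.cache:             # reuse a complete cached result
                seen |= self.cache[n]
                continue
            seen.add(n)
            stack.extend(self.cfg.get(n, []))
        self.computed += 1
        self.cache[node] = frozenset(seen)
        return self.cache[node]


cfg = {"a": ["b"], "b": ["c"], "c": ["b"], "d": ["a"]}
reach = LazyReach(cfg)
print(sorted(reach.reach_set("b")))
print(sorted(reach.reach_set("a")))  # reuses the cached set of b
print(reach.computed, "d" in reach.cache)
```

Nothing is ever computed for `d` unless the test generator actually reaches it, in contrast with a worklist algorithm that fills in every set up front.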

  20. Reachability Set Computation (3): context-sensitive analysis. The current stack is passed as an argument; if the current node can reach a ret instruction, then the procedure is recursively launched on the top of the stack (the return site). ReachSet-context(node, stack, rset): c := ReachSet(node); r := c ∪ rset; if (stack.empty or ret ∉ c) then return r; else return ReachSet-context(stack.top, stack.tail, r). Remark: the computation cache is still a map from node to set, rather than a map from (node, stack) to set.
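
A direct Python transcription of the ReachSet-context pseudocode might look like this. The encoding is an assumption made for the sketch: the call stack is a tuple of return sites with the top first, `ret_nodes` lists the ret instructions, and `reach_set` is a plain uncached intraprocedural traversal that does not follow calls into callees.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def reach_set(cfg: Dict[str, List[str]], node: str) -> FrozenSet[str]:
    """Intraprocedural reachability (no cache in this sketch)."""
    seen: Set[str] = set()
    todo = [node]
    while todo:
        n = todo.pop()
        if n not in seen:
            seen.add(n)
            todo.extend(cfg.get(n, []))
    return frozenset(seen)


def reach_set_context(
    cfg: Dict[str, List[str]],
    ret_nodes: Set[str],
    node: str,
    stack: Tuple[str, ...],
    rset: FrozenSet[str] = frozenset(),
) -> FrozenSet[str]:
    c = reach_set(cfg, node)
    r = c | rset
    if not stack or not (c & ret_nodes):   # stack.empty or ret not in c
        return r
    # A ret is reachable: continue from the return site on top of the stack.
    return reach_set_context(cfg, ret_nodes, stack[0], stack[1:], r)


cfg = {
    "f.entry": ["f.ret"], "f.ret": [], "f.spin": ["f.spin"],
    "main.after": ["main.x"], "main.x": ["main.ret"], "main.ret": [],
}
rets = {"f.ret", "main.ret"}
print(sorted(reach_set_context(cfg, rets, "f.entry", ("main.after",))))
print(sorted(reach_set_context(cfg, rets, "f.spin", ("main.after",))))
```

Since `reach_set` depends only on the node, a cache placed in front of it can stay keyed by node alone, which is the point of the remark in the slide.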

  21. Heuristic 2: Max-CallDepth (MCD). Nested function calls are often the major source of path explosion. BP explores all the paths in callees, but in unit testing we need to cover only the paths of the top-level function. Example: only two TD are needed to cover the main function, but there are $\approx 2^{k+1}$ paths. (Figure: CFGs of function main and function f, with nodes call f, c =?= 0 (True / False), b := 1, b := 0, b =?= 0 and Return.)

  22. Idea (claim): top-level paths rarely depend only on specific behaviours in deep function calls. MCD heuristic: prevent backtracking in deeply nested function calls. Implementation: a user-defined mcd parameter and a counter depth updated by call and ret; branching is performed only if depth ≤ mcd. Theoretically: take care, the MCD heuristic is not sound. Empirically: experimental results show a very large pruning and no loss in coverage (see the Experiments part).
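
The effect of the depth counter can be illustrated on a toy model. Everything here is a simplifying assumption: the program is flattened into a sequence of events (`call`, `ret`, `branch`), all branches are independent and feasible, the top-level function is at depth 0, and `count_paths` is a hypothetical helper that only counts the paths the exploration would enumerate.

```python
from typing import Iterable


def count_paths(events: Iterable[str], mcd: int) -> int:
    """Number of paths enumerated when branching is allowed only if depth <= mcd."""
    depth = 0
    paths = 1
    for event in events:
        if event == "call":
            depth += 1
        elif event == "ret":
            depth -= 1
        elif event == "branch" and depth <= mcd:
            paths *= 2      # both outcomes are explored (backtracking allowed)
        # a branch deeper than mcd follows a single outcome: no new path
    return paths


# main: one call to f (three branches inside f), then one top-level branch.
events = ["call", "branch", "branch", "branch", "ret", "branch"]
print(count_paths(events, mcd=0))  # branching only in main
print(count_paths(events, mcd=1))  # branching in main and in f
```

With `mcd=0` only the top-level branch is explored, at the price of possibly missing a top-level outcome that depends on a specific behaviour inside `f`, which is why the heuristic is not sound in general.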

  23. Heuristic 3: Solve-First (SF). DFS has two main drawbacks in our context: if the number of TD is limited, DFS focuses only on a deep, narrow portion of the program (slow coverage speed); longer (and more complex?) prefixes are solved first. Example: assume #node = 2n+1, all paths are feasible, and the goal is instruction coverage. Only two TD are necessary, but BP+LA generates n+1 TD. (Figure: chain of branching nodes with true edges.)
