
Symbolic Execution (Saswat Anand, 22/09/2009)


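  1. Symbolic Execution
     Saswat Anand, 22/09/2009

     Limitation of Dataflow Analysis

         if (p < 10)
             i = 10
         x = 1
         if (p > 10) {
             x = x + 1
             j = i + 1
         }

     Is the DU pair involving variable i real? No, because the path is infeasible: the definition i = 10 is reached only when p < 10, and the path to the use j = i + 1 requires p > 10.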

  2. Outline
     • Background
       – feasible and infeasible program paths
       – constraints and constraint satisfiability
     • Symbolic execution
       – basic idea
       – handling of symbolic references
     • Overview of compositional symbolic execution
     • Overview of implementation of symbolic execution
     • Limitations of symbolic execution
     • Summary

     Feasible and Infeasible Paths
     • A path refers to a path in the (inter-procedural) control-flow graph of the program.
     • A path is feasible if there exists an input I to the program that covers the path; i.e., when the program is executed with I as input, the path is taken.
     • A path is infeasible if there exists no input I that covers the path.

  3. Infeasible Paths
     • An infeasible path does not imply dead code; however, dead code implies an infeasible path.
     • In all real software, a very large portion of the total number of paths is infeasible.
     • Automatic test-input generation does not scale when there is a large number of infeasible paths to the target location that needs to be covered.

         if (sameGoto)
             newTarget = ((IfStmt) stmtSeq[5]).getTarget();
         else {
             newTarget = next;
             oldTarget = ((IfStmt) stmtSeq[5]).getTarget();
         }
         …
         if (!sameGoto)
             b.getUnits().insertAfter(…);
         …

     An example of an infeasible path from soot. A path that goes through the then branches of both conditional statements is infeasible.

     Constraints

         X > Y ∧ Y+X ≤ 10

     • X, Y are called free variables.
     • A solution of the constraint is a set of assignments, one for each free variable, that makes the constraint true.
     • {X = 3, Y = 2} is a solution.
     • {X = 6, Y = 5} is not a solution.

     More types of constraints
     1. Linear constraint
        • X > Y ∧ Y+X ≤ 10
     2. Non-linear constraint
        • X * Y < 100
        • X % 3 ∧ Y > 10
        • (X >> 3) < Y
     3. Use of function symbols
        • f(X) > 10 ∧ (forall a. f(a) = a + 10)
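The two candidate assignments for the linear constraint can be checked mechanically. The following is a minimal sketch, not a real constraint solver: `ConstraintDemo`, `holds`, and `solve` are hypothetical names, and `solve` is a naive bounded search that only stands in for what a solver does.

```java
import java.util.Optional;

class ConstraintDemo {
    /** The slide's linear constraint: X > Y and Y + X <= 10. */
    static boolean holds(int x, int y) {
        return x > y && y + x <= 10;
    }

    /**
     * Naive stand-in for a constraint solver (assumption: a bounded search
     * over [lo, hi] for both variables). Returns the first {X, Y} found.
     */
    static Optional<int[]> solve(int lo, int hi) {
        for (int x = lo; x <= hi; x++) {
            for (int y = lo; y <= hi; y++) {
                if (holds(x, y)) {
                    return Optional.of(new int[] {x, y});
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println("{X=3, Y=2} is a solution: " + holds(3, 2));
        System.out.println("{X=6, Y=5} is a solution: " + holds(6, 5));
        solve(0, 10).ifPresent(s ->
            System.out.println("found X=" + s[0] + ", Y=" + s[1]));
    }
}
```

A bounded search like this can find a solution but cannot, in general, prove that none exists outside the searched range.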

  4. Constraints (contd.)
     • A decision procedure is a tool that can decide if a constraint is satisfiable.
     • A constraint solver is a tool that finds satisfying assignments for a constraint, if it is satisfiable.
     • In general, checking constraint satisfiability is undecidable.

     Symbolic Execution
     • Symbolic execution refers to execution of a program with symbols as arguments.
     • Unlike concrete execution, where the taken path is determined by the input, in symbolic execution the program can take any feasible path.
     • During symbolic execution, program state consists of
       – symbolic values for some memory locations
       – path condition
     • The path condition is a condition on the input symbols such that if a path is feasible, its path condition is satisfiable.
     • A solution of the path condition is a test input that covers the respective path.

  5. Symbolic Execution (example)

         1  int x, y;
         2  if (x > y) {
         3      x = x + y;
         4      y = x - y;
         5      x = x - y;
         6      if (x > y)
         7          assert false;
         8  }
         9  printf(x, y);

     Symbolic state and path condition:

         after 1:          x = A,     y = B
         2 (then branch):  x = A,     y = B     A > B
         after 3:          x = A + B, y = B     A > B
         after 4:          x = A + B, y = A     A > B
         after 5:          x = B,     y = A     A > B
         6 (else branch):  x = B,     y = A     A > B ∧ B ≤ A
         6 (then branch):  x = B,     y = A     A > B ∧ B > A    UNSAT!

     • Inputs that cover the else branch at stmt. 2: x = 3, y = 4.
     • Inputs that cover the then branch at 2 and the else branch at 6: x = 5, y = 1. One solution of the constraint A > B ∧ B ≤ A is A = 5, B = 1.
     • Inputs that cover the then branch at 2 and the then branch at 6: do not exist, because A > B ∧ B > A is unsatisfiable.
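The symbolic states in the example above can be reproduced by treating each variable as a linear expression over the input symbols A and B. This is a minimal sketch under that assumption; the names `SwapSymbolic` and `Lin` are hypothetical, and a real symbolic executor would handle arbitrary expressions rather than only linear ones.

```java
class SwapSymbolic {
    /** A linear symbolic value a*A + b*B over the input symbols A and B. */
    static final class Lin {
        final int a;
        final int b;

        Lin(int a, int b) {
            this.a = a;
            this.b = b;
        }

        Lin plus(Lin o) {
            return new Lin(a + o.a, b + o.b);
        }

        Lin minus(Lin o) {
            return new Lin(a - o.a, b - o.b);
        }

        @Override
        public String toString() {
            return a + "*A + " + b + "*B";
        }
    }

    /** Symbolically executes statements 3 to 5 and returns {x, y}. */
    static Lin[] run() {
        Lin x = new Lin(1, 0); // x = A
        Lin y = new Lin(0, 1); // y = B
        x = x.plus(y);         // stmt 3: x = A + B
        y = x.minus(y);        // stmt 4: y = (A + B) - B = A
        x = x.minus(y);        // stmt 5: x = (A + B) - A = B
        return new Lin[] {x, y};
    }

    public static void main(String[] args) {
        Lin[] state = run();
        System.out.println("x = " + state[0]);
        System.out.println("y = " + state[1]);
    }
}
```

After statement 5 the coefficients show x = B and y = A, so the condition at statement 6 becomes B > A, which contradicts the A > B already in the path condition.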

  6. All-paths Symbolic Execution

         statement               symbolic state        path condition (PC)
         int x, y;               x = A,     y = B      true
         if (x > y) {            x = A,     y = B      A > B
             x = x + y;          x = A + B, y = B      A > B
             y = x - y;          x = A + B, y = A      A > B
             x = x - y;          x = B,     y = A      A > B
             if (x > y)          x = B,     y = A      A > B ∧ B > A    UNSAT!
                 assert false;
         }
         printf(x, y);           x = A,     y = B      A ≤ B
                                 x = B,     y = A      A > B ∧ B ≤ A

     Normal execution
     • input: x = 4, y = 3
     • output: 3, 4

     Symbolic execution
     • input: x = A, y = B
     • output: A, B with path condition A ≤ B
     • output: B, A with path condition A > B ∧ B ≤ A

     Handling Symbolic References

          1  class Node {
          2    int elem;
          3    Node next;
          4    foo(Node n1, Node n2) {
          5      if (n1 == null) return;
          6      if (n2 == null) return;
          7      if (n2.elem == 0)
          8        return;
          9      if (n1.next != null)
         10        n1.next.elem = n1.elem - 10;
         11      assert (n2.elem != 0);
         12    }
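The three path conditions from the all-paths example on the swap program can be checked for feasibility with a small search. This is only a sketch: `PathFeasibility` and `witness` are hypothetical names, and the bounded search is an assumption standing in for a constraint solver. Finding no witness in a finite range does not prove unsatisfiability in general, although A > B ∧ B > A is unsatisfiable over all integers.

```java
import java.util.function.BiPredicate;

class PathFeasibility {
    /**
     * Searches [lo, hi] x [lo, hi] for values of the input symbols (A, B)
     * that satisfy the path condition. Returns {A, B}, or null if the
     * searched range contains no witness.
     */
    static int[] witness(BiPredicate<Integer, Integer> pc, int lo, int hi) {
        for (int a = lo; a <= hi; a++) {
            for (int b = lo; b <= hi; b++) {
                if (pc.test(a, b)) {
                    return new int[] {a, b};
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Path 1: else branch of the outer if.
        int[] p1 = witness((a, b) -> a <= b, -5, 5);
        // Path 2: then branch of the outer if, else branch of the inner if.
        int[] p2 = witness((a, b) -> a > b && b <= a, -5, 5);
        // Path 3: then branch of both ifs (reaches "assert false").
        int[] p3 = witness((a, b) -> a > b && b > a, -5, 5);

        System.out.println("path 1 feasible: " + (p1 != null));
        System.out.println("path 2 feasible: " + (p2 != null));
        System.out.println("path 3 feasible: " + (p3 != null));
    }
}
```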

  7. Handling Symbolic References
     • setElem(H,n,e) – updates the elem field of node n in heap H to value e; returns the updated heap
     • getElem(H,n) – returns the value of the elem field of node n in heap H
     • setNext(H,n,e), getNext(H,n) – likewise for the next field

     Invariants:

         forall H, n, v. getElem(setElem(H,n,v),n) = v
         forall H, n, v. getNext(setNext(H,n,v),n) = v

     Handling Symbolic References (example)

          1  class Node {
          2    int elem;
          3    Node next;
          4    foo(Node n1, Node n2) {
          5      if (n1 == null) return;
          6      if (n2 == null) return;
          7      if (n2.elem == 0)
          8        return;
          9      if (n1.next != null)
         10        n1.next.elem = n1.elem - 10;
         11      assert (n2.elem != 0);
         12    }

     Path condition for the path 4-5-6-7-9-10-11:

         n1 ≠ null ∧
         n2 ≠ null ∧
         getElem(H1,n2) ≠ 0 ∧
         getNext(H1,n1) ≠ null ∧
         H2 = setElem(H1, getNext(H1,n1), getElem(H1,n1)-10) ∧
         getElem(H2,n2) = 0
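One concrete heap that satisfies the path condition above has n1.next aliasing n2 and n1.elem equal to 10, so that the write at line 10 sets n2.elem to 0. The sketch below runs that input concretely. `AliasingDemo`, `fooAssertionHolds`, and `run` are hypothetical names, and the method returns a boolean instead of asserting so the outcome can be inspected. It also relies on the usual assumption, not stated on the slide, that updating one node leaves the fields of other nodes unchanged.

```java
class AliasingDemo {
    static final class Node {
        int elem;
        Node next;
    }

    /**
     * Mirrors foo from the slide. Returns false exactly when the final
     * assertion (n2.elem != 0) would fail; early returns count as passing.
     */
    static boolean fooAssertionHolds(Node n1, Node n2) {
        if (n1 == null) return true;
        if (n2 == null) return true;
        if (n2.elem == 0) return true;
        if (n1.next != null) {
            n1.next.elem = n1.elem - 10;
        }
        return n2.elem != 0;
    }

    /** Builds the input heap; with alias == true, n1.next and n2 are the same node. */
    static boolean run(boolean alias) {
        Node n1 = new Node();
        Node n2 = new Node();
        n1.elem = 10;                       // makes n1.elem - 10 equal to 0
        n2.elem = 5;                        // passes the check at line 7
        n1.next = alias ? n2 : new Node();  // aliasing decides the outcome
        return fooAssertionHolds(n1, n2);
    }

    public static void main(String[] args) {
        System.out.println("assertion holds with aliasing:    " + run(true));
        System.out.println("assertion holds without aliasing: " + run(false));
    }
}
```

The symbolic heap functions let a solver discover the aliasing case without enumerating heap shapes by hand.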

  8. Compositional Symbolic Execution

         int abs(int x) {
             if (x >= 0)
                 return x;
             else
                 return -x;
         }

         int sumAbs(int[] a) {
             int sum = 0;
             for (int i = 0; i < 50; i++)
                 sum += abs(a[i]);
             if (sum == 13)
                 error();
             return sum;
         }

     • Goal: generate an input that leads to execution of error().
     • No. of paths to error() = 2^50.
     • Symbolically executing each path and checking its feasibility does not scale!
     • Key idea: compute function summaries to be used at all call sites of the function.

     Compositional Symbolic Execution (contd.)
     • Symbolically execute all paths of the callee function (e.g., abs) and compute a function summary.
     • For each path in the function, the summary encodes the path condition of that path and the value returned on it.
     • When symbolically executing paths in the caller function (e.g., sumAbs), reuse the summary of the callee instead of symbolically executing paths in the callee repeatedly.
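A function summary can be pictured as a list of cases, one per path of the callee, each pairing a path condition with the returned value. The sketch below is a concrete illustration only: `AbsSummary`, `Case`, `applySummary`, and `reachesError` are hypothetical names, and the summary is evaluated on concrete inputs here, whereas a symbolic executor would hand the summary formula to a constraint solver.

```java
import java.util.function.IntPredicate;
import java.util.function.IntUnaryOperator;

class AbsSummary {
    /** One summary case: a path condition on the argument and the value returned on that path. */
    static final class Case {
        final IntPredicate pathCondition;
        final IntUnaryOperator result;

        Case(IntPredicate pathCondition, IntUnaryOperator result) {
            this.pathCondition = pathCondition;
            this.result = result;
        }
    }

    /** Summary of abs: (x >= 0 and abs(x) = x) or (x < 0 and abs(x) = -x). */
    static final Case[] ABS = {
        new Case(x -> x >= 0, x -> x),
        new Case(x -> x < 0, x -> -x)
    };

    /** Uses the summary at a call site instead of re-executing abs. */
    static int applySummary(int x) {
        for (Case c : ABS) {
            if (c.pathCondition.test(x)) {
                return c.result.applyAsInt(x);
            }
        }
        throw new IllegalStateException("summary does not cover " + x);
    }

    /** sumAbs with the summary at each call site; true means error() would be reached. */
    static boolean reachesError(int[] a) {
        int sum = 0;
        for (int i = 0; i < 50; i++) {
            sum += applySummary(a[i]);
        }
        return sum == 13;
    }

    public static void main(String[] args) {
        int[] a = new int[50];
        a[0] = -13; // one input whose absolute values sum to 13
        System.out.println("reaches error(): " + reachesError(a));
    }
}
```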

  9. Compositional Symbolic Execution (contd.)
     For abs and sumAbs as defined on the previous slide:
     • abs has 2 paths to symbolically execute.
     • Summary of the abs function:

           forall x. (x ≥ 0 ∧ abs(x) = x) ∨ (x < 0 ∧ abs(x) = -x)

     • No. of paths that lead to error() without descending into the abs function = 1.
     • Path condition of the path leading to error():

           abs(a[0]) + abs(a[1]) + … + abs(a[49]) = 13
           ∧ forall x. (x ≥ 0 ∧ abs(x) = x) ∨ (x < 0 ∧ abs(x) = -x)

     Implementation of Symbolic Execution
     • Transformation approach
       – transform the program into another program that operates on symbolic values, such that execution of the transformed program is equivalent to symbolic execution of the original program
       – difficult to implement, portable solution, suitable for Java, .NET
     • Instrumentation approach
       – callback hooks are inserted in the program such that symbolic execution is done in the background during normal execution of the program
       – easy to implement for C
     • Customized runtime approach
       – customize the runtime (e.g., JVM) to support symbolic execution
       – applicable to Java, .NET; difficult to implement, flexible, not portable

  10. Implementation of Symbolic Execution for Java (contd.)

      Original program:

          void foo(int x, int y) {
              if (x > y) {
                  x = x + y;
                  y = x - y;
                  x = x - y;
                  if (x > y)
                      assert false;
              }
          }

      Transformed program:

          void foo(Expression x, Expression y) {
              if (_GT(x, y)) {
                  x = _ADD(x, y);
                  y = _SUB(x, y);
                  x = _SUB(x, y);
                  if (_GT(x, y))
                      assert false;
              }
          }

          class Expression {
              int concreteValue;
              Operator op;
              Expression leftOp;
              Expression rightOp;
              …
          }

      Applications of Symbolic Execution
      • Test-input generation
      • Bug finding
      • Program verification
      • Determining functional equivalence
      • Worst-case execution time estimation for real-time software
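The transformed program above needs some definition of `_GT`, `_ADD`, and `_SUB`. The slide does not give their bodies, so the following is one possible sketch, assuming each helper computes the concrete result and builds the matching symbolic expression, and that `_GT` appends the branch outcome to a path condition. The `name` field, the `String` operator, the `pathCondition` list, and the boolean return value of `foo` (in place of `assert false`) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class TransformedFoo {
    /** Symbolic expression carrying a concrete value, loosely following the slide's Expression class. */
    static final class Expression {
        final int concreteValue;
        final String op;          // null for an input symbol
        final Expression leftOp;
        final Expression rightOp;
        final String name;        // symbol name, used only for leaves

        Expression(String name, int value) {
            this.name = name;
            this.concreteValue = value;
            this.op = null;
            this.leftOp = null;
            this.rightOp = null;
        }

        Expression(String op, Expression l, Expression r, int value) {
            this.name = null;
            this.concreteValue = value;
            this.op = op;
            this.leftOp = l;
            this.rightOp = r;
        }

        @Override
        public String toString() {
            return op == null ? name : "(" + leftOp + " " + op + " " + rightOp + ")";
        }
    }

    /** Branch conditions collected along the executed path. */
    static final List<String> pathCondition = new ArrayList<>();

    static Expression _ADD(Expression l, Expression r) {
        return new Expression("+", l, r, l.concreteValue + r.concreteValue);
    }

    static Expression _SUB(Expression l, Expression r) {
        return new Expression("-", l, r, l.concreteValue - r.concreteValue);
    }

    /** Decides the branch on concrete values and records the symbolic condition. */
    static boolean _GT(Expression l, Expression r) {
        boolean taken = l.concreteValue > r.concreteValue;
        String cond = "(" + l + " > " + r + ")";
        pathCondition.add(taken ? cond : "!" + cond);
        return taken;
    }

    /** Transformed foo; returns true if the "assert false" statement would be reached. */
    static boolean foo(Expression x, Expression y) {
        if (_GT(x, y)) {
            x = _ADD(x, y);
            y = _SUB(x, y);
            x = _SUB(x, y);
            if (_GT(x, y)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        pathCondition.clear();
        boolean hit = foo(new Expression("A", 5), new Expression("B", 1));
        System.out.println("reached assert false: " + hit);
        System.out.println("path condition: " + String.join(" AND ", pathCondition));
    }
}
```

The recorded expressions are not simplified, so the second condition is printed over nested sums and differences rather than as B > A.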
