
A Tutorial on Program Analysis - Markus Müller-Olm, Dortmund



  1. A Tutorial on Program Analysis
     Markus Müller-Olm, Dortmund University
     Thanks! Helmut Seidl (TU München) and Bernhard Steffen (Universität Dortmund) for discussions, inspiration, joint work, ...

  2. Dream of Program Analysis
     [Figure: a program and a property specification, e.g. G(Φ → F Ψ), are fed into an analyzer, which produces a result.]
     Purposes of Automatic Analysis
     - Optimizing compilation
     - Validation/Verification: type checking, functional correctness, security properties, ...
     - Debugging

  3. Dream of Program Analysis
     [Figure: program and property specification, e.g. G(Φ → F Ψ), fed into an analyzer that produces a result.]
     Fundamental Limit
     Rice's Theorem [Rice, 1953]: All non-trivial semantic questions about programs from a universal programming language are undecidable.

  4. Two Solutions
     - Weaker formalisms: analyze abstract models of systems, e.g. automata, labelled transition systems, ... Example technique: model checking.
     - Approximate analyses: yield sound but, in general, incomplete results, e.g. an analysis that detects some instead of all constants. Example techniques: flow analysis, abstract interpretation, type checking.
     Weaker Formalisms
     [Figure: a program is mapped (approximately) to an abstract model, which is then handled by an exact analyzer for the abstract model.]

  5. Overview
     - Introduction
     - Fundamentals of Program Analysis
     - Interprocedural Analysis
     - Analysis of Parallel Programs
     - Invariant Generation
     - Conclusion
     Apology for not giving detailed credit!
     Credits
     - Pioneers of Iterative Program Analysis: Kildall, Wegbreit, Kam & Ullman, Karr, ...
     - Abstract Interpretation: Cousot/Cousot, Halbwachs, ...
     - Interprocedural Analysis: Sharir & Pnueli, Knoop, Steffen, Rüthing, Sagiv, Reps, Wilhelm, Seidl, ...
     - Analysis of Parallel Programs: Knoop, Steffen, Vollmer, Seidl, ...
     - And many more: Apology ...

  6. Overview: Introduction, Fundamentals of Program Analysis, Interprocedural Analysis, Analysis of Parallel Programs, Invariant Generation, Conclusion
     From Programs to Flow Graphs
     [Figure: a program text next to its flow graph with program points 0 to 11. Edges are labelled with assignments and guards: x=17, y>63, ¬(y>63), y<99, ¬(y<99), x=x+42, y:=17, y=x+y, x:=10, x=y+1, x:=x+1, y:=11, x:=y+1.]
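One way to make the flow-graph view concrete is to store a graph as a list of labelled edges between program points. The sketch below is only an illustration: the three-statement program, the `EDGES` list, and the helper names `successors` and `nodes` are assumptions made here and are not the graph shown on the slide.

```python
from collections import defaultdict

# Hypothetical flow graph for:  x = 17; if y > 63: y = x + y  else: x = 10
# Each edge is (source node, label, target node); labels are plain strings here.
EDGES = [
    (0, "x = 17", 1),
    (1, "y > 63", 2),        # guard: branch taken
    (1, "not (y > 63)", 3),  # guard: branch not taken
    (2, "y = x + y", 4),
    (3, "x = 10", 4),
]

def successors(edges):
    """Map each node to its outgoing (label, target) pairs."""
    succ = defaultdict(list)
    for u, label, v in edges:
        succ[u].append((label, v))
    return dict(succ)

def nodes(edges):
    """All program points mentioned by some edge, in ascending order."""
    return sorted({n for u, _, v in edges for n in (u, v)})

if __name__ == "__main__":
    print(nodes(EDGES))
    print(successors(EDGES)[1])
```

Putting statements and guards on edges rather than in nodes matches the edge-based constraint systems used later in the tutorial, where each edge carries its own transfer function.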

  7. Dead Code Elimination
     Goal: find and eliminate assignments that compute values which are never used.
     Fundamental problem: undecidability. Use an approximate algorithm, e.g. ignore that guards prohibit certain execution paths.
     Technique:
     1) Perform live variables analysis: variable x is live at program point u iff there is a path from u on which x is used before it is modified.
     2) Eliminate assignments to variables that are not live at the target point.
     Live Variables
     [Figure: the flow graph from the previous slide, annotated at selected program points with "y live" and "x dead".]
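The two-step technique can be sketched directly from the path-based definition of liveness. The code below is a minimal illustration under several assumptions: the edge encoding `(source, assigned variable or None, used variables, target)`, the example graph, and the names `is_live`, `dead_assignments`, and `exit_live` are all hypothetical. Guards are treated as plain uses, so paths that a guard would rule out are still followed, which is the approximation the slide describes.

```python
# Hypothetical edges: (source, assigned variable or None, used variables, target).
EDGES = [
    (0, "x", set(), 1),        # x = 17
    (1, None, {"y"}, 2),       # guard y > 63
    (1, None, {"y"}, 3),       # guard not (y > 63)
    (2, "y", {"x", "y"}, 4),   # y = x + y
    (3, "x", set(), 4),        # x = 10
    (4, "x", {"y"}, 5),        # x = y + 1
]

def is_live(var, node, edges, exit_live=frozenset()):
    """True iff some path from node uses var before modifying it.

    exit_live lists variables assumed to be live at nodes without
    outgoing edges (an assumption about what happens after the program).
    """
    seen, stack = set(), [node]
    while stack:
        u = stack.pop()
        if u in seen:
            continue
        seen.add(u)
        out = [e for e in edges if e[0] == u]
        if not out and var in exit_live:
            return True
        for _, assigned, used, v in out:
            if var in used:
                return True       # used before any modification on this path
            if assigned != var:
                stack.append(v)   # not modified here: keep following the path
    return False

def dead_assignments(edges, exit_live=frozenset()):
    """Assignments whose variable is not live at the edge's target point."""
    return [e for e in edges
            if e[1] is not None and not is_live(e[1], e[3], edges, exit_live)]

if __name__ == "__main__":
    print(dead_assignments(EDGES, exit_live={"x", "y"}))
```

In this example only `x = 10` is reported: the value it writes is overwritten by `x = y + 1` before any use. Searching per variable and per program point is wasteful on larger graphs, which is one motivation for the set-based formulation on the following slides.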

  8. Live Variables Analysis
     [Figure: the flow graph with each program point 0 to 11 annotated with its set of live variables; the annotations are ∅, {y} and {x,y}.]
     Remarks
     - Forward vs. backward analyses
     - (Separable) bitvector analyses
       - forward: reaching definitions, available expressions, ...
       - backward: live/dead variables, very busy expressions, ...
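The term "bitvector analysis" refers to the fact that sets of variables can be stored as bit masks and updated with a few bitwise operations. The following sketch shows this for a single backward step of live variables; the variable list `VARS` and the helper names are assumptions chosen for illustration.

```python
VARS = ["x", "y", "z"]                                # hypothetical variable set
BIT = {name: 1 << i for i, name in enumerate(VARS)}   # one bit per variable

def to_mask(names):
    """Encode a set of variable names as an integer bit mask."""
    mask = 0
    for name in names:
        mask |= BIT[name]
    return mask

def to_names(mask):
    """Decode a bit mask back into a set of variable names."""
    return {name for name in VARS if mask & BIT[name]}

def transfer(live_after, kill, gen):
    """Backward step over one edge: (live_after minus kill) union gen."""
    return (live_after & ~kill) | gen

if __name__ == "__main__":
    # Edge "x = y + 1": kills x, generates y.
    after = to_mask({"x", "z"})
    before = transfer(after, to_mask({"x"}), to_mask({"y"}))
    print(sorted(to_names(before)))   # prints ['y', 'z']
```

The analysis is separable in the sense that each bit of the result depends only on the same bit of the input, so every variable could be analyzed on its own with the same outcome.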

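  9. Partial Order
     Partial order $(L, \sqsubseteq)$: a set $L$ with a binary relation $\sqsubseteq \; \subseteq L \times L$ such that
     - $\sqsubseteq$ is reflexive: $\forall x \in L : x \sqsubseteq x$
     - $\sqsubseteq$ is antisymmetric: $\forall x, y \in L : (x \sqsubseteq y \wedge y \sqsubseteq x) \Rightarrow x = y$
     - $\sqsubseteq$ is transitive: $\forall x, y, z \in L : (x \sqsubseteq y \wedge y \sqsubseteq z) \Rightarrow x \sqsubseteq z$
     For a subset $X \subseteq L$:
     - $\bigsqcup X$: least upper bound (join), if it exists
     - $\bigsqcap X$: greatest lower bound (meet), if it exists
     Complete Lattice
     - Complete lattice $(L, \sqsubseteq)$: a partial order $(L, \sqsubseteq)$ for which $\bigsqcup X$ exists for all $X \subseteq L$.
     - In a complete lattice $(L, \sqsubseteq)$:
       - $\bigsqcap X = \bigsqcup \{ x \in L \mid x \sqsubseteq X \}$, where $x \sqsubseteq X$ means that $x$ is a lower bound of $X$
       - $\bigsqcap X$ exists for all $X \subseteq L$
       - the least element $\bot$ exists: $\bot = \bigsqcap L = \bigsqcup \emptyset$
       - the greatest element $\top$ exists: $\top = \bigsqcap \emptyset = \bigsqcup L$
     - Example: for any set $A$ let $P(A) = \{ X \mid X \subseteq A \}$.
       - $(P(A), \subseteq)$ is a complete lattice.
       - $(P(A), \supseteq)$ is a complete lattice.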
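The powerset example can be checked by brute force on a small set. The sketch below works in $(P(A), \subseteq)$, where join is union and meet is intersection, and confirms on one family that the meet equals the join of all lower bounds. The function names and the example set are assumptions made for illustration.

```python
from itertools import combinations

def powerset(a):
    """All subsets of a, as frozensets."""
    items = sorted(a)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def join(xs):
    """Least upper bound in (P(A), subset order): the union."""
    return frozenset().union(*xs)

def meet(xs, a):
    """Greatest lower bound: the intersection (all of A for the empty family)."""
    result = frozenset(a)
    for x in xs:
        result &= x
    return result

def meet_via_join(xs, a):
    """Meet computed as the join of all lower bounds of xs."""
    lower_bounds = [l for l in powerset(a) if all(l <= x for x in xs)]
    return join(lower_bounds)

if __name__ == "__main__":
    A = {1, 2, 3}
    X = [frozenset({1, 2}), frozenset({2, 3})]
    print(sorted(join(X)), sorted(meet(X, A)), sorted(meet_via_join(X, A)))
```

For the empty family the functions return the bottom element (the empty set) as join and the top element (all of `A`) as meet, matching the identities for $\bot$ and $\top$ above.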

  10. Interpretation in Approximate Program Analysis
      - $x \sqsubseteq y$: $x$ is more precise information than $y$; $y$ is a correct approximation of $x$.
      - $\bigsqcup X$ for $X \subseteq L$: the most precise information consistent with every piece of information $x \in X$.
      Remark: the literature often uses the dual interpretation!
      Example: the lattice for live variables analysis is $(P(\mathit{Var}), \subseteq)$ with $\mathit{Var}$ = set of variables in the program.
      Specifying Live Variables Analysis by a Constraint System
      Compute the (smallest) solution over $(L, \sqsubseteq) = (P(\mathit{Var}), \subseteq)$ of:
        $V^{\#}[\mathit{fin}] \sqsupseteq \mathit{init}$, for $\mathit{fin}$, the termination node
        $V^{\#}[u] \sqsupseteq f_e(V^{\#}[v])$, for each edge $e = (u, s, v)$
      where $\mathit{init} = \mathit{Var}$ and $f_e : P(\mathit{Var}) \to P(\mathit{Var})$, $f_e(x) = (x \setminus \mathit{kill}_e) \cup \mathit{gen}_e$, with
      - $\mathit{kill}_e$ = variables assigned at $e$
      - $\mathit{gen}_e$ = variables used in an expression evaluated at $e$
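To see what it means for an assignment of sets to program points to satisfy this constraint system, the following sketch checks candidate solutions on a small straight-line program. The edge encoding `(u, kill_e, gen_e, v)`, the example program, and the names `f` and `is_solution` are assumptions made for illustration.

```python
VAR = frozenset({"x", "y"})

# Hypothetical edges e = (u, kill_e, gen_e, v) of a straight-line program.
EDGES = [
    (0, {"x"}, set(), 1),        # x = 17
    (1, {"y"}, {"x", "y"}, 2),   # y = x + y
    (2, {"x"}, {"y"}, 3),        # x = y + 1
]
FIN = 3  # termination node

def f(kill, gen, x):
    """Transfer function f_e(x) = (x minus kill_e) union gen_e."""
    return (x - kill) | gen

def is_solution(V):
    """Check V[fin] >= init and V[u] >= f_e(V[v]) for every edge (superset order)."""
    if not V[FIN] >= VAR:
        return False
    return all(V[u] >= f(kill, gen, V[v]) for u, kill, gen, v in EDGES)

if __name__ == "__main__":
    smallest = {0: {"y"}, 1: {"x", "y"}, 2: {"y"}, 3: {"x", "y"}}
    everything = {n: set(VAR) for n in range(4)}
    too_small = {0: {"y"}, 1: {"x", "y"}, 2: set(), 3: {"x", "y"}}
    print(is_solution(smallest), is_solution(everything), is_solution(too_small))
```

Both `smallest` and `everything` satisfy the constraints, but `smallest` claims fewer live variables and therefore justifies more eliminations. `too_small` violates the constraint for the edge `x = y + 1`, since `y` is used there and must be live at node 2.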

  11. Specifying Live Variables Analysis by a Constraint System
      Remarks:
      1. Every solution is "correct".
      2. The smallest solution is called the MFP-solution; it comprises a value $\mathrm{MFP}[u] \in L$ for each program point $u$.
      3. (MFP abbreviates "maximal fixpoint" for traditional reasons.)
      4. The MFP-solution is the most precise one.
      Data-Flow Frameworks
      - Correctness: generic properties of frameworks can be studied and proved.
      - Implementation: efficient, generic implementations can be constructed.
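One standard way to obtain the smallest solution of such a constraint system is to start with the empty set at every program point and re-apply the constraints until nothing changes. The sketch below uses simple round-robin iteration; this is an assumption chosen for brevity, not necessarily the algorithm the tutorial presents, and the example graph and the name `smallest_solution` are hypothetical.

```python
# Hypothetical edges e = (u, kill_e, gen_e, v); the graph contains a loop 1 -> 2 -> 1.
EDGES = [
    (0, {"x"}, set(), 1),        # x = 17
    (1, set(), {"y"}, 2),        # guard y < 99
    (2, {"y"}, {"x", "y"}, 1),   # y = x + y, back to the loop head
    (1, set(), {"y"}, 3),        # guard not (y < 99)
    (3, {"x"}, {"y"}, 4),        # x = y + 1
]

def smallest_solution(edges, fin, variables):
    """Round-robin iteration from the bottom element (empty set everywhere)."""
    nodes = {fin} | {n for u, _, _, v in edges for n in (u, v)}
    V = {n: frozenset() for n in nodes}
    V[fin] = frozenset(variables)            # constraint V[fin] >= init = Var
    changed = True
    while changed:
        changed = False
        for u, kill, gen, v in edges:
            # Enforce V[u] >= f_e(V[v]) by joining in the right-hand side.
            new = V[u] | ((V[v] - kill) | gen)
            if new != V[u]:
                V[u], changed = new, True
    return V

if __name__ == "__main__":
    for node, live in sorted(smallest_solution(EDGES, 4, {"x", "y"}).items()):
        print(node, sorted(live))
```

The loop terminates because the sets only grow and the lattice $P(\mathit{Var})$ is finite. Nothing is added unless a constraint forces it, so the result is the smallest solution, i.e. the MFP-solution in the slide's terminology.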

  12. Questions
      - Do (smallest) solutions always exist?
      - How to compute the (smallest) solution?
      - How to justify that a solution is what we want?
