Modelling, Specification and Formal Analysis of Complex Software Systems
Precise Static Analysis of Programs with Dynamic Memory
Mihaela Sighireanu, IRIF, University Paris Diderot & CNRS
VTSA 2015


  1. Class of Programs: Types
     Basic types: numeric types $DT \in \mathcal{DT}$, on which operations $o \in \mathcal{O}$ and boolean relations $r \in \mathcal{R}$ are defined; $\mathbb{D}$ is the union of the numerical domains.
     Record types: user-defined types are record types $RT \in \mathcal{RT}$, defined by a set of numeric fields $df \in DF$ or reference fields $rf \in RF$ as follows:
         struct RT { ty1 f1; ... tyn fn; };
     with $ty_i ::= DT \mid RT*$ and $f_i \in FS = DF \cup RF$.

  2. Class of Programs: Variables and Procedures
     The language is strongly typed, except for the null constant!
     Variable declaration: numeric variables $dv \in DV$ and reference variables $rv \in RV$ are declared either global or local to some procedure.
     Procedure declaration: explicit syntax for output parameters; all procedures return a result.
         ty P(ty1 v1, ..., tyn out vn) {
           // declarations for local variables
           ty v;
           // sequence of statements
           start_P: ...; v = ...; ...
           return v;
           end_P:
         }

  3. Class of Programs: Expressions and Statements
     Boolean and numeric expressions: fixed evaluation order of arguments; no side effects.
       $be ::= bcst \mid bv \mid r(\vec{de}) \mid rv_1 == rv_2 \mid\ !be \mid be \land be \mid be \lor be$
       $de ::= dcst \mid dv \mid o(\vec{de}) \mid rv \rightarrow df$
     Reference expressions: no arithmetic on references!
       $re ::= null \mid rv \mid rv \rightarrow rf$
     Statements: restricted procedure call; explicit dynamic memory (de)allocation.
       $astmt ::= dv = de \mid rv \rightarrow df = de \mid rv = re \mid rv \rightarrow rf = re \mid bv = be \mid rv = \mathtt{new}\ RT \mid \mathtt{free}(rv) \mid \mathtt{nop}$
       $stmt ::= astmt \mid v = P(\vec{v}) \mid stmt ; stmt \mid \mathtt{if} \dots \mid \mathtt{while} \dots$

  4. Example Revisited
         struct list { int data; list* next; };

         list* search(list* h, int key) {
           list* it; bool b;
           it = h;
           b = false;
           while (!(it == NULL ∨ b)) {
             if (it->data == key)
               b = true;
             else
               it = it->next;
           }
           return it;
         }

  5. Formal Semantics: Program Configurations
     $Mem \triangleq Stacks \times Heaps \ni m$: memory configurations.
     Store-less heap: the absence of arithmetic over addresses permits a store-less semantics, i.e. the heap locations are represented by a domain $(\mathbb{L}, =)$.
     Strongly-typed heap: strong typing permits indexing of heap locations by program types, i.e.
       $\mathbb{L} = \{\boxtimes\} \cup \bigcup_{ty \in RT*} \mathbb{L}_{ty}$
     with $\boxtimes$ (for null) the only untyped value. Then the formal model of the heap is
       $H \in Heaps \triangleq [(\mathbb{L} \times FS) \rightharpoonup (\mathbb{D} \cup \mathbb{L})]$
     with $H(\boxtimes, f)$ undefined for any $H$ and $f$.

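  6. Formal Semantics: Executions
     $CP \ni \ell, \ell'$: control points, with $start_P, end_P \in CP$ for each procedure $P \in \mathcal{P}$
     $Stacks \triangleq [CP \times \mathcal{P} \times (DV \to \mathbb{D} \;\cup\; RV \to \mathbb{L})]^* \ni S$: stack
     $Mem \triangleq Stacks \times Heaps \ni m$: memory
     $Config \triangleq CP \times (Mem \cup \{m_{err}\}) \ni C$: configurations
     Natural semantics predicates:
       $C \vdash stmt \rightsquigarrow C'$
       $m \vdash astmt \rightsquigarrow m' \mid m_{err}$
       $m \vdash be \rightsquigarrow b \mid m_{err}$
       $m \vdash de \rightsquigarrow c \mid m_{err}$
       $m \vdash re \rightsquigarrow a \mid m_{err}$
     with $b \in \{true, false\}$, $c \in \mathbb{D}$, $a \in \mathbb{L}$.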

  7. Natural Semantics: Rules (Some)
     Relations:
       $\dfrac{\forall i.\ m \vdash de_i \rightsquigarrow c_i \qquad r(c_1, \dots, c_n) = true}{m \vdash r(de_1, \dots, de_n) \rightsquigarrow true}$
       $\dfrac{\exists i.\ m \vdash de_i \rightsquigarrow m_{err}}{m \vdash r(de_1, \dots, de_n) \rightsquigarrow m_{err}}$
     (See procedure call in the second part!)
     Field read:
       $\dfrac{m(rv) = a \neq \boxtimes \qquad m(a, df) = c}{m \vdash rv \rightarrow df \rightsquigarrow c}$
       $\dfrac{m(rv) = \boxtimes}{m \vdash rv \rightarrow df \rightsquigarrow m_{err}}$
     Deallocation (no garbage detection):
       $\dfrac{m(rv) = a \neq \boxtimes}{m \vdash \mathtt{free}(rv); \rightsquigarrow m[rv \leftarrow \boxtimes]}$
       $\dfrac{m(rv) = \boxtimes}{m \vdash \mathtt{free}(rv); \rightsquigarrow m_{err}}$
     Allocation (by default initialisation; infinite heap):
       $\dfrac{a \text{ fresh in } \mathbb{L}_{RT*} \qquad \forall rf \in RT.\ m.H(a, rf) = \boxtimes \qquad \forall df \in RT.\ m.H(a, df) = c}{m \vdash rv = \mathtt{new}\ RT; \rightsquigarrow m[rv \leftarrow a]}$
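The field-read, deallocation and allocation rules above can be read as a very small interpreter. The sketch below is only an illustration under stated assumptions: the names `Mem`, `read_field`, `free` and `new_record` are hypothetical, locations are modelled as integers, and the default value `c` for numeric fields (left unspecified by the rule) is assumed to be 0.

```python
from dataclasses import dataclass, field
from itertools import count

NULL = "null"   # stands for the null location (the boxed cross in the rules)
MERR = "merr"   # stands for the error memory configuration

@dataclass
class Mem:
    stack: dict = field(default_factory=dict)    # reference variable -> location
    heap: dict = field(default_factory=dict)     # (location, field) -> value
    fresh: count = field(default_factory=count)  # supply of fresh locations (infinite heap)

def read_field(m, rv, df):
    """Field read: yields the stored value, or MERR when rv is null."""
    a = m.stack[rv]
    if a == NULL:
        return MERR
    return m.heap[(a, df)]

def free(m, rv):
    """Deallocation: sets rv to null, or yields MERR when rv is already null."""
    if m.stack[rv] == NULL:
        return MERR
    m.stack[rv] = NULL   # heap cells are left untouched: no garbage detection
    return m

def new_record(m, rv, ref_fields, data_fields, default=0):
    """Allocation: binds rv to a fresh location with default-initialised fields."""
    a = next(m.fresh)
    for rf in ref_fields:
        m.heap[(a, rf)] = NULL
    for df in data_fields:
        m.heap[(a, df)] = default   # assumed default value for numeric fields
    m.stack[rv] = a
    return m
```

For instance, allocating a `list` cell, freeing it, and then reading `data` through the freed variable yields `MERR`, as does freeing it a second time.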

  14. From Program Text to Graph Model
     Definition: an inter-procedural control flow graph (ICFG) over a set of operations $Op$ is a tuple $\langle V, Op, \to, start, end \rangle$ where:
       $V$ is a finite set of vertices, $start \in V$ is a starting vertex and $end \in V$ is a final vertex,
       $Op$ is a finite set of labels,
       $\to\ \subseteq V \times Op \times V$ is a finite set of edges.
     For our class of programs:
       $Op \ni op ::= be \mid astmt \mid \mathtt{call}\ v = P(\dots) \mid \mathtt{return}\ v = P(\dots)$
     where (recall)
       $be ::= bcst \mid bv \mid r(\vec{de}) \mid rv_1 == rv_2 \mid\ !be \mid be \land be \mid be \lor be$
       $astmt ::= dv = de \mid rv \rightarrow df = de \mid rv = re \mid rv \rightarrow rf = re \mid bv = be \mid rv = \mathtt{new}\ RT \mid \mathtt{free}(rv) \mid \mathtt{nop}$

  15. ICFG for search
         list* search(list* h, int key) {
               list* it; bool b;
           s0: it = h;
           s1: b = false;
           s2: while (!(it == NULL ∨ b)) {
           s3:   if (it->data == key)
           s4:     b = true;
                 else
           s5:     it = it->next;
           s6: }
           s7: return it;
           s8: }
     Edges of the ICFG (from the figure):
       s0 --it=h--> s1
       s1 --b=false--> s2
       s2 --!(it==NULL ∨ b)--> s3
       s2 --it==NULL ∨ b--> s7
       s3 --it->data==key--> s4
       s3 --it->data!=key--> s5
       s4 --b=true--> s6
       s5 --it=it->next--> s6
       s6 --nop--> s2
       s7 --$ret=it--> s8
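The ICFG of `search` is small enough to be written down as plain data. The following sketch is one possible encoding, not the author's: edges are triples `(q, op, q')` with operation labels kept as uninterpreted strings, the disjunction is spelled `or`, and the helper names `successors` and `reachable` are hypothetical.

```python
from collections import deque

# Edges (q, op, q') read off the ICFG of search; labels are plain strings.
SEARCH_EDGES = [
    ("s0", "it=h", "s1"),
    ("s1", "b=false", "s2"),
    ("s2", "!(it==NULL or b)", "s3"),
    ("s2", "it==NULL or b", "s7"),
    ("s3", "it->data==key", "s4"),
    ("s3", "it->data!=key", "s5"),
    ("s4", "b=true", "s6"),
    ("s5", "it=it->next", "s6"),
    ("s6", "nop", "s2"),
    ("s7", "$ret=it", "s8"),
]

def successors(edges, q):
    """Outgoing (op, target) pairs of vertex q."""
    return [(op, r) for (p, op, r) in edges if p == q]

def reachable(edges, start):
    """Vertices reachable from start, by breadth-first search over control paths."""
    seen, todo = {start}, deque([start])
    while todo:
        q = todo.popleft()
        for _, r in successors(edges, q):
            if r not in seen:
                seen.add(r)
                todo.append(r)
    return seen
```

Every vertex s0 to s8 is reachable from s0, and s2 is the only vertex with two outgoing edges besides s3.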

  16. ICFG for search and foo
         list* search(list* h, int key) { ... }   // labelled s0 to s8 as on the previous slide

         int foo() {
           f0: ...
           f1: p = search(l, d);
           f2: ...
           f3: }
     [Figure: the ICFG of search together with the ICFG of foo (vertices f0 to f3). The two graphs are linked by edges labelled "call p=search(l,d)" and "return p=search(l,d)"; foo also carries an edge labelled "p=search(l,d)".]

  17. ICFG for User Assertions
     For "m: assert(be); n:", the ICFG has the edges m --be--> n and m --!be--> BAD.
     For "m: assume(be); n:", the ICFG has the single edge m --be--> n.

  18. ICFG and Labeled Transition Systems
     The ICFG is a finite, syntactic object. Interpreting the ICFG with the natural semantics produces a model of program executions, i.e. an LTS.
     Definition (LTS): a labeled transition system is a tuple $\langle C, Init, Out, \Sigma, \to \rangle$ where:
       $C$ is a set of configurations,
       $Init \subseteq C$ and $Out \subseteq C$ are sets of initial and exit configurations,
       $\Sigma$ is a finite set of actions,
       $\to\ \subseteq C \times \Sigma \times C$ is a set of transitions.
     The LTS is an infinite, semantic object.

  19. Control Paths, Execution Paths, Runs
     A control path is a path in the control flow graph:
       $q_0 \xrightarrow{op_0} q_1 \dots q_k \xrightarrow{op_k} q_{k+1}$
     An execution path is a path in the labeled transition system:
       $(q_0, m_0) \xrightarrow{op_0} (q_1, m_1) \dots (q_k, m_k) \xrightarrow{op_k} (q_{k+1}, m_{k+1})$
     A run is an execution path starting from the initial configuration:
       $(q_{Init}, m_{Init}) \xrightarrow{op_0} (q_1, m_1) \dots (q_k, m_k) \xrightarrow{op_k} (q_{k+1}, m_{k+1})$

  20. Outline
     1 Introduction
     2 Formal Models and Semantics for IMPR
     3 Foundations of Static Analysis by Abstract Interpretation
     4 Application: Programs with Lists and Data
     5 Application: Decision Procedures by Static Analysis
     6 Elements of Inter-procedural Analysis
     7 Application: Programs with Lists, Data, and Procedures
     8 Extension: Programs with Complex Data Structures

  21. Reformulate Our Goal
     Goal: over-approximate the set of configurations reachable from the initial configuration.
     The exact set of reachable configurations is:
       $Post^* = \bigcup_{\sigma : run} \{(q, m) \mid (q, m) \text{ occurs in } \sigma\} = \bigcup_{q_{Init} \xrightarrow{op_0} \dots \xrightarrow{op_k} q} \{q\} \times \big(post_{op_k} \circ \dots \circ post_{op_0}\big)(m_{Init})$
     where $post_{op} : Mem \cup \{m_{err}\} \to Mem \cup \{m_{err}\}$ is defined by the operational semantics, i.e. the standard semantics.
     We focus on forward analysis. Exercise: transpose to backward analysis.

  22. Reformulate Our Goal
     Goal: over-approximate the set of configurations reachable from the initial configuration at each program point.
     Project $Post^*$ on each program point (ICFG vertex):
       $Post^* = \bigcup_{q \in ICFG} \big(q \mapsto Post^*(q)\big)$, with $q \mapsto \emptyset \equiv \emptyset$,
       $Post^*(q) = \bigcup_{q_{Init} \xrightarrow{op_0} \dots \xrightarrow{op_{k-1}} q_k \xrightarrow{op_k} q} \big(post_{op_k} \circ \dots \circ post_{op_0}\big)(\{m_{Init}\})$
     and $post_{op} : \mathcal{P}(Mem \cup \{m_{err}\}) \to \mathcal{P}(Mem \cup \{m_{err}\})$ is the collecting semantics, $post_{op}(M) = \bigcup_{m \in M} post_{op}(m)$.

  23. Reformulate our Goal
     $Post^*$ is called $\overrightarrow{MOP}$, for (forward) "Meet Over All Paths", and is the most precise abstraction of the reachable configurations.
     However, in the presence of control loops, the set of runs is infinite, so $\overrightarrow{MOP}$ is not computable in general!
     Sound solution: over-approximate the initial system of equations over runs by a system of inequations over ICFG edges:
       $Post^*(q_{Init}) \supseteq \{m_{Init}\}$
       $Post^*(q') \supseteq post_{op}(Post^*(q))$ for all $q \xrightarrow{op} q' \in ICFG$
     Do solutions always exist? Yes, see the Knaster-Tarski Fixpoint Theorem!

  25. Complete Lattice
     Definition: a partially ordered set $(L, \sqsubseteq)$ is a complete lattice if every $X \subseteq L$ has both a greatest lower bound $\sqcap X$ and a least upper bound $\sqcup X$ in $(L, \sqsubseteq)$.
     In a complete lattice $(L, \sqsubseteq)$:
       $\sqcup X$ is the most precise information consistent with all $x \in X$;
       $\sqcap X$ is the infimum of $X$, i.e. $\sqcap X = \sqcup \{x \mid \forall y \in X.\ x \sqsubseteq y\}$;
       a least element $\bot$ exists, $\bot = \sqcap L = \sqcup \emptyset$;
       a greatest element $\top$ exists, $\top = \sqcup L = \sqcap \emptyset$.
     Example (powerset lattice): for any set $S$, $(\mathcal{P}(S), \subseteq)$ is a complete lattice.

  26. Knaster-Tarski Fixpoint Theorem
     Definitions: let $(L, \sqsubseteq)$ be a partial order.
       $f : L \to L$ is monotonic iff $\forall x, y \in L.\ x \sqsubseteq y \implies f(x) \sqsubseteq f(y)$.
       $x \in L$ is a fixpoint of $f$ iff $f(x) = x$.
     Theorem (Knaster-Tarski): let $L$ be a complete lattice and $f : L \to L$ a monotonic function. The set of fixpoints of $f$ is also a complete lattice. Consequently, the least fixpoint $lfp(f)$ and the greatest fixpoint $gfp(f)$ exist and:
       $lfp(f) = \sqcap \{x \in L \mid f(x) \sqsubseteq x\}$ (least pre-fixpoint)
       $gfp(f) = \sqcup \{x \in L \mid x \sqsubseteq f(x)\}$ (greatest post-fixpoint)
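On a small powerset lattice the two characterisations in the theorem can be checked by brute force. The sketch below is an illustration only: the set `S`, the monotonic function `f` and the helper names are all made up for the example, and enumerating the powerset is of course only feasible for tiny sets.

```python
from itertools import combinations

S = frozenset(range(6))

def powerset(s):
    items = sorted(s)
    return [frozenset(c) for n in range(len(items) + 1)
            for c in combinations(items, n)]

def f(X):
    # Monotonic on (P(S), subset): always adds 0 and the images x -> (x + 2) mod 6.
    return frozenset({0}) | frozenset((x + 2) % 6 for x in X)

def lfp_tarski(f, s):
    """Meet (intersection) of all X with f(X) included in X."""
    pre = [X for X in powerset(s) if f(X) <= X]
    return frozenset.intersection(*pre)

def gfp_tarski(f, s):
    """Join (union) of all X with X included in f(X)."""
    post = [X for X in powerset(s) if X <= f(X)]
    return frozenset().union(*post)

least = lfp_tarski(f, S)
greatest = gfp_tarski(f, S)
```

Here the least fixpoint is the set of even numbers reachable from 0, the greatest fixpoint is the whole of `S`, and both are indeed fixpoints of `f`.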

  27. Lattice of Fixpoints
     [Picture from: Nielson/Nielson/Hankin, Principles of Program Analysis]

  28. Reformulate Our Goal
     To avoid loss of precision, we focus on the least fixpoint of the system:
       $Post^*(q_{Init}) \supseteq \{m_{Init}\}$
       $Post^*(q') \supseteq post_{op}(Post^*(q))$ for all $q \xrightarrow{op} q' \in ICFG$
     called $\overrightarrow{MFP}$, for (forward) "Maximal Fixpoint".
     How to compute the smallest solution? See Kleene iteration!

  29. Kleene Iteration
     Kleene Fixpoint Theorem: let $(L, \sqsubseteq)$ be a complete partial order and $f : L \to L$ a continuous function (monotonicity alone suffices whenever the chain below becomes stationary). Then $lfp(f)$ is the supremum of the ascending Kleene chain of $f$, i.e.
       $\bot \sqsubseteq f(\bot) \sqsubseteq f(f(\bot)) \sqsubseteq \dots \sqsubseteq f^n(\bot) \sqsubseteq \dots$
     Observe that if $f^i(\bot) = f^{i+1}(\bot)$ for some $i$, then $f^i(\bot)$ is $lfp(f)$.
     Definition: $(L, \sqsubseteq)$ satisfies the ascending chain condition if every ascending chain $x_0 \sqsubseteq x_1 \sqsubseteq \dots$ of elements of $L$ is eventually stationary.
     Termination: $(f^i(\bot))_{i \in \mathbb{N}}$ converges for $(L, \sqsubseteq)$ satisfying the ascending chain condition.
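The Kleene chain translates directly into a loop that applies `f` until two consecutive iterates coincide. The sketch below assumes lattice elements can be compared with `==`; the function name `kleene_lfp`, the `max_steps` guard and the graph-reachability example are assumptions added for illustration.

```python
def kleene_lfp(f, bottom, max_steps=10_000):
    """Iterate bottom, f(bottom), f(f(bottom)), ... until the chain is stationary."""
    x = bottom
    for _ in range(max_steps):
        y = f(x)
        if y == x:          # f^i(bottom) == f^(i+1)(bottom): least fixpoint reached
            return x
        x = y
    raise RuntimeError("chain not stationary within max_steps")

# Example on a finite powerset lattice: vertices reachable from 0 in a small graph.
SUCC = {0: [1], 1: [2], 2: [0], 3: [4], 4: []}

def step(X):
    # Monotonic: contains 0 and the successors of everything already in X.
    return frozenset({0}) | frozenset(s for v in X for s in SUCC[v])

reach = kleene_lfp(step, frozenset())
```

The chain here is the empty set, then {0}, {0, 1}, {0, 1, 2}, at which point it is stationary; vertices 3 and 4 are never added. Termination is guaranteed because the powerset of a finite set satisfies the ascending chain condition.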

  30. Improved Kleene Iteration: Workset Algorithm
          1: W = ∅;
          2: for (all vertex q) { P[q] = ⊥; W = Add(W,q); }
          3: P[qInit] = { mInit };
             /* ∀q. P[q] ⊑ MFP(q) ∧ {mInit} ⊑ P[qInit] ∧
                ∀q′ ∉ W. post_op(P[q]) ⊑ P[q′] with (q, op, q′) edge */
          4: while (W != ∅) {
          5:   q = Extract(W);
          6:   for (all edge (q,op,r)) {
          7:     t = post_op(P[q]);
          8:     if (! (t ⊑ P[r])) {
          9:       P[r] = P[r] ⊔ t;
         10:       W = Add(W,r);
         11:     } }
         12: }
             /* ∀q. P[q] ⊑ MFP(q) ∧ P solution ⟹ P = MFP */
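A direct transcription of the workset algorithm is sketched below. It is parameterised by the lattice operations and the transformer, as the pseudo-code suggests; the parameter names, the FIFO extraction strategy and the toy counter program used as an example are assumptions, not part of the slides.

```python
def workset(vertices, edges, q_init, init, bottom, leq, join, post):
    """Generic workset iteration; returns the map P from vertices to lattice values."""
    P = {q: bottom for q in vertices}
    P[q_init] = init
    W = list(vertices)                    # line 2: every vertex starts in W
    while W:
        q = W.pop(0)                      # Extract: FIFO order (one possible strategy)
        for (src, op, r) in edges:
            if src != q:
                continue
            t = post(op, P[q])
            if not leq(t, P[r]):
                P[r] = join(P[r], t)
                if r not in W:
                    W.append(r)
    return P

# Toy collecting semantics: a memory is just the value of one counter i.
EMPTY = frozenset()
POST = {
    "i=0":   lambda M: frozenset({0}) if M else EMPTY,
    "i<3":   lambda M: frozenset(v for v in M if v < 3),
    "i>=3":  lambda M: frozenset(v for v in M if v >= 3),
    "i=i+1": lambda M: frozenset(v + 1 for v in M),
}
EDGES = [("q0", "i=0", "q1"), ("q1", "i<3", "q2"),
         ("q2", "i=i+1", "q1"), ("q1", "i>=3", "q3")]

P = workset(["q0", "q1", "q2", "q3"], EDGES, "q0", frozenset({7}), EMPTY,
            leq=lambda a, b: a <= b, join=lambda a, b: a | b,
            post=lambda op, M: POST[op](M))
```

On this finite example the concrete powerset lattice satisfies the ascending chain condition, so the loop terminates and `P` holds the exact sets of counter values at each vertex.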

  31. Workset Algorithm Analysis
     Termination:
          8:     if (! (t ⊑ P[r])) {
          9:       P[r] = P[r] ⊔ t;
         10:       W = Add(W,r);
         11:     }
     WS terminates if $(L, \sqsubseteq)$ satisfies the ascending chain condition. For any vertex $r$ the following sequence converges:
       $\bot \sqsubseteq P^1[r] \sqsubseteq P^2[r] \sqsubseteq \dots$
     where $P^k[r]$ is the value of P[r] after visiting $k$ edges with target $r$.
     Otherwise, change the computation at line 9 to obtain convergence of $(P^i[r])_{i \in \mathbb{N}}$ for any $r$, e.g. by widening (see later).
     Variants: different iteration strategies are obtained by changing the selection of the visited vertices (Extract) and edges (line 6).

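  32. Precision of MFP
     For monotonic $F$, $\overrightarrow{MOP}[q] \sqsubseteq \overrightarrow{MFP}[q]$ for any reachable vertex $q$.
     Example, with $(L, \sqsubseteq)$ the flat lattice of constants ($\bot$ below the integers $\dots, -2, -1, 0, 1, 2, \dots$, all below $\top$) and the ICFG:
       q0 --x=3--> q1 --y=2--> q3
       q0 --x=2--> q2 --y=3--> q3
       q3 --z=x+y--> q4
     $MOP[q_4] = (x \mapsto 3, y \mapsto 2, z \mapsto 5) \sqcup (x \mapsto 2, y \mapsto 3, z \mapsto 5) = (x \mapsto \top, y \mapsto \top, z \mapsto 5)$
     $MFP[q_3] = (x \mapsto 3, y \mapsto 2, z \mapsto \bot) \sqcup (x \mapsto 2, y \mapsto 3, z \mapsto \bot) = (x \mapsto \top, y \mapsto \top, z \mapsto \bot)$
     $MFP[q_4] = (x \mapsto \top, y \mapsto \top, z \mapsto \top)$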
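The gap between MOP and MFP on this example can be reproduced in a few lines. The sketch below assumes the flat constant lattice with string markers for bottom and top, an initial environment mapping every variable to bottom, and a hand-written evaluation order (valid here because the graph is acyclic); all names are hypothetical.

```python
BOT, TOP = "bot", "top"

def join(a, b):
    """Join in the flat lattice of constants."""
    if a == BOT:
        return b
    if b == BOT:
        return a
    return a if a == b else TOP

def join_env(e1, e2):
    return {v: join(e1[v], e2[v]) for v in e1}

def add(a, b):
    if a == BOT or b == BOT:
        return BOT
    if a == TOP or b == TOP:
        return TOP
    return a + b

def post(op, env):
    env = dict(env)
    if op == "z=x+y":
        env["z"] = add(env["x"], env["y"])
    else:                          # constant assignment such as "x=3"
        var, val = op.split("=")
        env[var] = int(val)
    return env

def run(path, env):
    for op in path:
        env = post(op, env)
    return env

init = {"x": BOT, "y": BOT, "z": BOT}

# MOP: apply the transformers along each path to q4, then join the results.
mop_q4 = join_env(run(["x=3", "y=2", "z=x+y"], init),
                  run(["x=2", "y=3", "z=x+y"], init))

# MFP: join at the merge point q3 first, then apply z=x+y once.
mfp_q3 = join_env(run(["x=3", "y=2"], init), run(["x=2", "y=3"], init))
mfp_q4 = post("z=x+y", mfp_q3)
```

MOP keeps `z = 5` because the sum is computed separately on each path, whereas MFP has already lost `x` and `y` at q3 and can only conclude `z = top`.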

  33. Complete Lattice: Examples
     Power set: for any set $S$, $(\mathcal{P}(S), \sqsubseteq)$ is a complete lattice where
       $\sqsubseteq\ =\ \subseteq$, $\sqcap = \bigcap$, $\sqcup = \bigcup$, $\bot = \emptyset$, $\top = S$.
     If $S$ is finite then $(\mathcal{P}(S), \sqsubseteq)$ satisfies a.c.c.
     Functions: for any set $S$ and a complete lattice $(L, \sqsubseteq)$, $(S \to L, \sqsubseteq)$ is a complete lattice where
       $f \sqsubseteq g$ if $\forall x \in S.\ f(x) \sqsubseteq g(x)$,
       $\sqcap F = \lambda x.\ \sqcap \{f(x) \mid f \in F\}$, $\sqcup F = \lambda x.\ \sqcup \{f(x) \mid f \in F\}$,
       $\bot = \lambda x.\ \bot$, $\top = \lambda x.\ \top$.
     If $S$ is finite and $(L, \sqsubseteq)$ satisfies a.c.c. then $(S \to L, \sqsubseteq)$ satisfies a.c.c.

  34. Abstraction Principle
                    Concrete                                      Abstract
     ICFG:          $\langle V, Op, \to, start, end \rangle$      $\langle V, Op, \to, start, end \rangle$
     Lattice:       $(L, \sqsubseteq)$                            $(L^\sharp, \sqsubseteq^\sharp)$
     Initial:       $Init \in L$                                  $Init^\sharp \in L^\sharp$
     Semantics:     $post_{op} : L \xrightarrow{mon} L$           $post^\sharp_{op} : L^\sharp \xrightarrow{mon} L^\sharp$
     Result:        $\overrightarrow{MFP}$                        $\overrightarrow{MFP}^\sharp$
     Abstraction:   $\alpha : L \xrightarrow{mon} L^\sharp$
     Soundness:     $\overrightarrow{MFP} \subseteq \overrightarrow{MFP}^\sharp$
     How to systematically ensure the correctness of this principle?
     $\rightarrow$ Abstract Interpretation [Cousot&Cousot,79].

  35. Abstract Interpretation
                    Concrete                                      Abstract
     ICFG:          $\langle V, Op, \to, start, end \rangle$      $\langle V, Op, \to, start, end \rangle$
     Lattice:       $(L, \sqsubseteq)$                            $(L^\sharp, \sqsubseteq^\sharp)$
     Initial:       $Init \in L$                                  $Init^\sharp \in L^\sharp$
     Semantics:     $post_{op} : L \xrightarrow{mon} L$           $post^\sharp_{op} : L^\sharp \xrightarrow{mon} L^\sharp$
                    $CS(post_{op})$                               $CS(post^\sharp_{op})$
     Result:        $\overrightarrow{MFP}$                        $\overrightarrow{MFP}^\sharp$
     Abstraction and concretisation: $\alpha : L \to L^\sharp$, $\gamma : L^\sharp \to L$
     Soundness:     $\alpha(\overrightarrow{MFP}) \sqsubseteq^\sharp \overrightarrow{MFP}^\sharp$
     Conditions:
       Correct abstraction: $(\alpha, \gamma)$ is a Galois connection.
       Correct interpretation: $\alpha(post_{op}(x)) \sqsubseteq^\sharp post^\sharp_{op}(\alpha(x))$

  36. Galois Connection
     Definition: a Galois connection between two lattices $(L, \sqsubseteq)$ and $(L^\sharp, \sqsubseteq^\sharp)$ is a pair of functions $(\alpha, \gamma)$ with $\alpha : L \to L^\sharp$ and $\gamma : L^\sharp \to L$ satisfying, for all $x \in L$ and $y^\sharp \in L^\sharp$:
       $\alpha(x) \sqsubseteq^\sharp y^\sharp$ iff $x \sqsubseteq \gamma(y^\sharp)$
     Vocabulary and intuition:
       $\gamma$ is the concretisation function,
         $\rightarrow$ $\gamma(y^\sharp)$ is the concrete value represented by $y^\sharp$.
       $\alpha$ is the abstraction function,
         $\rightarrow$ $\alpha(x)$ is the most precise abstract value representing $x$,
         $\rightarrow$ the concretisation of $\alpha(x)$ approximates $x$, i.e. $\gamma(\alpha(x)) \sqsupseteq x$.

  37. Galois Connection: Interval Abstraction
     $(L, \sqsubseteq) = (\mathcal{P}(\mathbb{Z}), \subseteq) \underset{\alpha}{\overset{\gamma}{\leftrightarrows}} (Int, \subseteq)$
     $\alpha$: $S \subset \mathbb{Z} \mapsto (\inf(S), \sup(S))$, e.g. $\alpha(\{-1, 2\}) = (-1, 2)$
     $\gamma$: $(\ell, u) \mapsto \{\ell, \ell + 1, \dots, u\}$, e.g. $\gamma((-1, 2)) = \{-1, 0, 1, 2\}$
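For finite sets and finite bounds, the interval abstraction and concretisation can be written out and the two Galois-connection inequalities checked on examples. The sketch below makes exactly that restriction (infinite sets and infinite bounds are not modelled), and represents the empty interval by `None`; both choices are assumptions for illustration.

```python
BOT = None   # abstract bottom: the empty interval

def alpha(S):
    """Abstraction of a finite set of integers: its tightest enclosing interval."""
    if not S:
        return BOT
    return (min(S), max(S))

def gamma(iv):
    """Concretisation of a finite interval: all integers between its bounds."""
    if iv is BOT:
        return set()
    lo, hi = iv
    return set(range(lo, hi + 1))
```

With these definitions, `gamma(alpha({-1, 2}))` is `{-1, 0, 1, 2}`, a strict superset of the original set: abstraction followed by concretisation may lose information but never loses elements.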

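  38. Galois Connection: Sign Abstraction
     Exercise: make explicit the Galois connection for the sign abstraction, i.e.
       $(L, \sqsubseteq) = (\mathcal{P}(\mathbb{R}), \subseteq) \underset{\alpha}{\overset{\gamma}{\leftrightarrows}} (L^\sharp, \sqsubseteq^\sharp)$
     [Figure: Hasse diagram of the sign lattice $(L^\sharp, \sqsubseteq^\sharp)$ with elements ⊤, −0, 0+, −, +, 0 and ⊥.]
     e.g.
       $\alpha(x) = \bot$ if $x = \emptyset$
       $\alpha(x) = +$ if $x \subseteq \{r \mid r > 0\}$
       ...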

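  39. Galois Connection: Characterisation
     Let $(L, \sqsubseteq)$ and $(L^\sharp, \sqsubseteq^\sharp)$ be two lattices. For any two functions $\alpha : L \to L^\sharp$ and $\gamma : L^\sharp \to L$,
       $(L, \sqsubseteq) \underset{\alpha}{\overset{\gamma}{\leftrightarrows}} (L^\sharp, \sqsubseteq^\sharp)$
     iff all of the following hold:
       $x \sqsubseteq \gamma(\alpha(x))$ for any $x \in L$,
       $\alpha(\gamma(y^\sharp)) \sqsubseteq^\sharp y^\sharp$ for any $y^\sharp \in L^\sharp$,
       $\alpha$ is monotonic,
       $\gamma$ is monotonic.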

  40. Correct Semantics Interpretation
     Given a monotonic function $f : L \to L$, consider a monotonic function $g^\sharp : L^\sharp \to L^\sharp$ that is a sound approximation of $f$, i.e.
       $\alpha \circ f(x) \sqsubseteq^\sharp g^\sharp \circ \alpha(x)$
     Theorem: for any monotonic function $f : L \to L$ and any monotonic sound approximation $g^\sharp : L^\sharp \to L^\sharp$ of $f$,
       $lfp(f) \sqsubseteq \gamma\big(lfp(g^\sharp)\big)$
     Definition: a sound approximation of $post_{op}$ is called a (correct) abstract transformer.

  41. Example: Abstract Transformer for Intervals
     Consider the Galois connection $(\mathcal{P}(\mathbb{Z}), \subseteq) \underset{\alpha}{\overset{\gamma}{\leftrightarrows}} (Int, \subseteq)$.
     The concrete transformer for $op \equiv (z \geq 0)$ is
       $post_{z \geq 0}(S) = \{v \in S \mid v \geq 0\}$ for all $S \subseteq \mathbb{Z}$.
     Several abstract transformers may be defined:
       $g^\sharp_{z \geq 0}((\ell, u)) = (\max(0, \ell), u)$, where $(\ell, u) = \bot$ if $\ell > u$
       $h^\sharp_{z \geq 0}((\ell, u)) = (\max(0, \ell), \infty)$
       $f^\sharp_{z \geq 0}((\ell, u)) = \top$
     and notice that $g^\sharp_{z \geq 0} \sqsubseteq^\sharp h^\sharp_{z \geq 0} \sqsubseteq^\sharp f^\sharp_{z \geq 0}$.
     What happens with the lfp (recall, used in $\overrightarrow{MFP}$) of $g^\sharp_{z \geq 0}$, $h^\sharp_{z \geq 0}$, $f^\sharp_{z \geq 0}$?
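The three abstract transformers for the test `z >= 0` can be compared pointwise in code. In the sketch below intervals are pairs with possibly infinite float bounds and `None` stands for the empty interval; treating `g` and `h` as returning bottom on bottom is an assumption, since the slide only defines them on pairs.

```python
INF = float("inf")
BOT = None            # empty interval
TOP = (-INF, INF)

def norm(lo, hi):
    """(lo, hi) denotes bottom when lo > hi."""
    return (lo, hi) if lo <= hi else BOT

def leq(a, b):
    """Interval inclusion."""
    return a is BOT or (b is not BOT and b[0] <= a[0] and a[1] <= b[1])

def g(iv):   # most precise of the three
    return BOT if iv is BOT else norm(max(0, iv[0]), iv[1])

def h(iv):   # forgets the upper bound
    return BOT if iv is BOT else (max(0, iv[0]), INF)

def f(iv):   # forgets everything
    return TOP

samples = [BOT, (-5, 3), (2, 9), (-5, -1), TOP]
ordered = all(leq(g(iv), h(iv)) and leq(h(iv), f(iv)) for iv in samples)
```

For instance `g((-5, 3))` is `(0, 3)` while `h((-5, 3))` is `(0, inf)`, and `g((-5, -1))` is bottom because no value in that interval passes the test.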

  44. Precision of Abstract Transformers
     Theorem: for any two monotonic functions $f$, $g$ on a complete lattice $(L, \sqsubseteq)$, if $f(x) \sqsubseteq g(x)$ for all $x \in L$ then $lfp(f) \sqsubseteq lfp(g)$.
     In the previous example, $g^\sharp$ is better than $h^\sharp$!
     Definition: for any monotonic function $f : L \to L$, the best abstraction of $f$ is the monotonic function $f^\sharp : L^\sharp \to L^\sharp$ defined by:
       $f^\sharp = \alpha \circ f \circ \gamma$
     But the best abstract transformer is difficult to compute!
     $\rightarrow$ e.g., $post^\sharp_{z = e * e}$ for the sign abstraction needs to solve $e * e = 0$.

  47. Recipe to Design an Analysis
     1 Design an abstract complete lattice $(L^\sharp, \sqsubseteq^\sharp)$, simpler than the concrete one $(L, \sqsubseteq)$, and formalise the "meaning" of abstract values by a Galois connection $(L, \sqsubseteq) \underset{\alpha}{\overset{\gamma}{\leftrightarrows}} (L^\sharp, \sqsubseteq^\sharp)$.
       $\rightarrow$ tests for equality with $\top^\sharp$, $\bot^\sharp$, algorithms for $\sqsubseteq^\sharp$, $\sqcup^\sharp$, ...
     2 Design a sound abstract transformer $g^\sharp$ for each $op \in Op$ in the ICFG.
       $\rightarrow$ based on the natural semantics, try to be precise and efficient
     3 Compute $lfp(g^\sharp)$ using some algorithm (e.g., workset) to obtain an over-approximation of $\overrightarrow{MFP}$.
       $\rightarrow$ generic algorithm with parameters $(L^\sharp, \sqsubseteq^\sharp)$, ICFG, and $g^\sharp$

  48. Example: Analysis of Null Aliasing [CC'77]
     Abstract lattice $(L^\sharp, \sqsubseteq^\sharp)$: $\top$ on top, $\boxtimes$ and $\neg\boxtimes$ incomparable in the middle, $\bot$ at the bottom.
     Program: the ICFG of search, with the loop-entry edge s2 --it!=NULL ∧ !b--> s3 and the loop-exit edge s2 --it==NULL ∨ b--> s7.
     Abstract transformers used:
       $post^\sharp_{it=h}(it \mapsto v^\sharp, h \mapsto u^\sharp) = (it \mapsto u^\sharp, h \mapsto u^\sharp)$
       $post^\sharp_{it!=NULL}(it \mapsto v^\sharp, h \mapsto u^\sharp) = (it \mapsto v^\sharp \sqcap^\sharp \neg\boxtimes, h \mapsto u^\sharp)$
       $post^\sharp_{it=it->next}(it \mapsto v^\sharp, h \mapsto u^\sharp) = (it \mapsto \top, h \mapsto u^\sharp)$
     Workset run (values are pairs (it, h) per control point CP; W lists the vertices marked in the workset):
       Step            Updated values                                      W
       Initially       s0 = (⊥, ⊤); all other vertices (⊥, ⊥)              s0
       s0 extracted    s1 = (⊤, ⊤); s2 = (⊤, ⊤)                            s2
       s2 extracted    s3 = (¬⊠, ⊤); s7 = (⊤, ⊤)                           s3, s7
       s3 extracted    s4 = (¬⊠, ⊤); s5 = (¬⊠, ⊤)                          s4, s5, s7
       s4 extracted    s6 = (¬⊠, ⊤)                                        s5, s6, s7
       s5 extracted    s6 = (⊤, ⊤)                                         s6, s7
       s6 extracted    (no change)                                         s7
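The run above can be replayed with a small workset loop over the four-element null lattice. The sketch below is an approximation of the slides rather than a transcription: unreachable vertices are modelled by a separate `None` state instead of the pair of bottoms, the loop-entry guard is the only test that refines `it`, all other operations are treated as the identity on `(it, h)`, and every name is hypothetical.

```python
BOT, NULL, NONNULL, TOP = "bot", "null", "nonnull", "top"

def leq_v(a, b):
    return a == b or a == BOT or b == TOP

def join_v(a, b):
    return b if leq_v(a, b) else a if leq_v(b, a) else TOP

def meet_v(a, b):
    return a if leq_v(a, b) else b if leq_v(b, a) else BOT

def leq(s, t):                       # None = vertex not reached yet
    return s is None or (t is not None and all(leq_v(a, b) for a, b in zip(s, t)))

def join(s, t):
    if s is None:
        return t
    if t is None:
        return s
    return tuple(join_v(a, b) for a, b in zip(s, t))

def post(op, s):
    if s is None:
        return None
    it, h = s
    if op == "it=h":
        return (h, h)
    if op == "it!=NULL and !b":      # loop entry: it is refined to non-null
        it = meet_v(it, NONNULL)
        return None if it == BOT else (it, h)
    if op == "it=it->next":
        return (TOP, h)
    return s                         # remaining operations do not change (it, h)

EDGES = [("s0", "it=h", "s1"), ("s1", "b=false", "s2"),
         ("s2", "it!=NULL and !b", "s3"), ("s2", "it==NULL or b", "s7"),
         ("s3", "it->data==key", "s4"), ("s3", "it->data!=key", "s5"),
         ("s4", "b=true", "s6"), ("s5", "it=it->next", "s6"),
         ("s6", "nop", "s2"), ("s7", "$ret=it", "s8")]

P = {"s%d" % i: None for i in range(9)}
P["s0"] = (BOT, TOP)
W = list(P)
while W:
    q = W.pop(0)
    for src, op, r in EDGES:
        if src == q:
            t = post(op, P[q])
            if not leq(t, P[r]):
                P[r] = join(P[r], t)
                if r not in W:
                    W.append(r)
```

The useful fact established by the analysis is that `it` is non-null at s3, s4 and s5, so the dereference `it->data` in the loop body cannot produce the error configuration.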

  55. Example: Analysis of Heap Separation [CC'77]
     Abstraction idea: partition the set of list variables (except NULL) such that $v_1, v_2$ belong to the same partition if $(v_1 \xrightarrow{next^*}) \cap (v_2 \xrightarrow{next^*})$ may be non-empty; otherwise $v_1, v_2$ are in different partitions.
     $\rightarrow$ the abstraction keeps track of relations between variables.
     [Figure: a heap with variables $v_1, v_2, v_3, v_4$ and $\boxtimes$, abstracted in $(P_4, \sqsubseteq^\sharp)$ by 12/34 or 12/3/4.]

  56. Example: Analysis of Heap Separation [CC'77]
     Order and join: $v^\sharp \sqsubseteq^\sharp u^\sharp$ iff $\forall p \in v^\sharp\ \exists q \in u^\sharp.\ p \subseteq q$; $v^\sharp \sqcup^\sharp u^\sharp$ is based on union-find.
     Program copy(x, out y), ICFG edges (from the figure):
       s0 --xi=x;y=NULL--> s1
       s1 --xi==NULL--> s9
       s1 --xi!=NULL--> s2
       s2 --yt=new(xi->data,NULL)--> s3
       s3 --yl==NULL--> s4
       s3 --yl!=NULL--> s5
       s4 --y=yt--> s6
       s5 --yl->next=yt--> s6
       s6 --yl=yt--> s7
       s7 --xi=xi->next--> s8
       s8 --nop--> s1
     Abstract transformers used:
       $post^\sharp_{xi=x}(v^\sharp) = Extract(xi, v^\sharp) \sqcup^\sharp \{x, xi\}$
       $post^\sharp_{y=NULL}(v^\sharp) = Extract(y, v^\sharp)$
       $post^\sharp_{xi==NULL}(v^\sharp) = post^\sharp_{xi!=NULL}(v^\sharp) = v^\sharp$
       $post^\sharp_{yt=new...}(v^\sharp) = Extract(yt, v^\sharp)$
       $post^\sharp_{y=yt}(v^\sharp) = Extract(y, v^\sharp) \sqcup^\sharp \{y, yt\}$
       $post^\sharp_{yl->next=yt}(v^\sharp) = v^\sharp \sqcup^\sharp \{yl, yt\}$
     Question: x|y ? (are x and y separated?)
     Workset run ($v^\sharp$ per control point CP; blocks are separated by |, variables inside a block by spaces; W lists the vertices marked in the workset):
       Step             Updated values                                              W
       Initially        s0 = x y | xi yl yt; all other vertices ⊥                   s0
       s0 extracted     s1 = x xi | y | yl yt                                       s1
       s1 extracted     s2 = s9 = x xi | y | yl yt                                  s2, s9
       s2 extracted     s3 = x xi | y | yl | yt                                     s3, s9
       s3 extracted     s4 = s5 = x xi | y | yl | yt                                s4, s5, s9
       s4 extracted     s6 = x xi | y yt | yl                                       s5, s6, s9
       s5 extracted     s6 = x xi | y yt yl                                         s6, s9
       s6 extracted     s7 = x xi | y yt yl                                         s7, s9
       s7 extracted     s8 = x xi | y yt yl                                         s8, s9
       s8 extracted     s1 = x xi | y yt yl                                         s1, s9
       s1 extracted     s2 = x xi | y yt yl                                         s2, s9
       ... 2nd tour     s3 = s4 = s5 = x xi | y yl | yt; s9 = x xi | y yl yt        s9
       s9 extracted     (no change)                                                 (empty)
     Result: x|y ✓
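The partition domain used in this run can be prototyped with sets of blocks. The sketch below is an assumption-laden illustration: it uses a simple quadratic block-merging routine in place of a real union-find structure, parses the `x xi|y|yl yt` notation with a hypothetical helper `part`, and names the operations `extract`, `join`, `leq` and `assign` after the slide's vocabulary.

```python
def normalize(blocks):
    """Merge overlapping blocks until all blocks are pairwise disjoint."""
    result = []
    for b in blocks:
        b = set(b)
        rest = []
        for r in result:
            if r & b:
                b |= r          # r shares a variable with b: merge them
            else:
                rest.append(r)
        result = rest + [b]
    return frozenset(frozenset(b) for b in result)

def part(text):
    """Parse the notation 'x xi|y|yl yt' into a partition."""
    return normalize(block.split() for block in text.split("|"))

def join(v, u):
    return normalize(list(v) + list(u))

def leq(v, u):
    return all(any(p <= q for q in u) for p in v)

def extract(var, v):
    """Move var into a block of its own."""
    return normalize([b - {var} for b in v if b - {var}] + [{var}])

def assign(x, y, v):
    """Transformer for x = y: Extract(x, v) joined with the block {x, y}."""
    return join(extract(x, v), [{x, y}])

s0 = part("x y|xi yl yt")
s1 = extract("y", assign("xi", "x", s0))            # xi=x; y=NULL
s3 = extract("yt", s1)                              # yt=new(...)
s6 = join(assign("y", "yt", s3),                    # via s4: y=yt
          join(s3, [{"yl", "yt"}]))                 # via s5: yl->next=yt
```

These four values agree with the first tour of the run: s1 is `x xi|y|yl yt`, s3 is `x xi|y|yl|yt`, and s6 ends up as `x xi|y yt yl`, in which x and y are still in different blocks.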

  70. Example: Numeric Analysis with Intervals
     Recall: termination is not guaranteed because $(Int, \subseteq)$ does not satisfy a.c.c.!
     ICFG edges (from the figure):
       s1 --i=0--> s2
       s2 --i<=99--> s3
       s2 --i>=100--> s5
       s3 --i=i+1--> s4
       s4 --nop--> s2
     Workset run over $(Int, \subseteq)$ (value of i per control point CP; W lists the vertices marked in the workset):
       Step            s1     s2      s3      s4      s5     W
       Initially       ⊤      ⊥       ⊥       ⊥       ⊥      s1
       s1 extracted    ⊤      (0,0)   ⊥       ⊥       ⊥      s2
       s2 extracted    ⊤      (0,0)   (0,0)   ⊥       ⊥      s3
       s3 extracted    ⊤      (0,0)   (0,0)   (1,1)   ⊥      s4
       s4 extracted    ⊤      (0,1)   (0,0)   (1,1)   ⊥      s2
       s2 extracted    ⊤      (0,1)   (0,1)   (1,1)   ⊥      s3
     Recall: $(0, 0) \sqcup^\sharp (1, 1) = (0, 1)$.
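Continuing this run to the end shows how slowly plain iteration converges on intervals. The sketch below replays the loop with a FIFO workset and records every value taken by `P[s2]`; the representation of intervals (pairs with float infinities, `None` for bottom) and all helper names are assumptions. It terminates only because the guard `i<=99` happens to bound the loop, which is not the case in general.

```python
INF = float("inf")
BOT = None                       # empty interval
TOP = (-INF, INF)

def norm(lo, hi):
    return (lo, hi) if lo <= hi else BOT

def leq(a, b):
    return a is BOT or (b is not BOT and b[0] <= a[0] and a[1] <= b[1])

def join(a, b):
    if a is BOT:
        return b
    if b is BOT:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def post(op, iv):
    if iv is BOT:
        return BOT
    lo, hi = iv
    if op == "i=0":
        return (0, 0)
    if op == "i<=99":
        return norm(lo, min(hi, 99))
    if op == "i>=100":
        return norm(max(lo, 100), hi)
    if op == "i=i+1":
        return (lo + 1, hi + 1)
    return iv                    # nop

EDGES = [("s1", "i=0", "s2"), ("s2", "i<=99", "s3"), ("s2", "i>=100", "s5"),
         ("s3", "i=i+1", "s4"), ("s4", "nop", "s2")]

P = {q: BOT for q in ["s1", "s2", "s3", "s4", "s5"]}
P["s1"] = TOP
W = list(P)
history_s2 = []                  # successive values of P[s2]
while W:
    q = W.pop(0)
    for src, op, r in EDGES:
        if src == q:
            t = post(op, P[q])
            if not leq(t, P[r]):
                P[r] = join(P[r], t)
                if r == "s2":
                    history_s2.append(P[r])
                if r not in W:
                    W.append(r)
```

The value at s2 grows by one unit per trip around the loop, from (0,0) and (0,1) up to (0,100), which is what motivates replacing the join at line 9 of the workset algorithm by a widening.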
