Modular Procedure Equivalence with Dynamic Heap Allocation
Tim Wood (1), Shuvendu Lahiri (2), Sophia Drossopoulou (1)
(1) Imperial College London, (2) Microsoft Research
12th April 2016

Introduction
Subject
Procedure equivalence for procedures that dynamically allocate heap memory
Overview
1. What is procedure equivalence in this context?
2. How can we modularly verify such procedure equivalence?
   ◮ How to relate the procedures.
   ◮ Some sound underapproximations that help.
3. Current status of our verification tool.

What we are trying to do
◮ Automatically verify equivalence of procedures
   ◮ unbounded dynamic heap memory allocation
   ◮ unbounded recursion (but not loops)
   ◮ memory-safe / no pointer arithmetic
   ◮ sequential
◮ Use a permissive notion of equivalence that allows for
   ◮ differences in allocation order
   ◮ differences in garbage
   ◮ equivalent rearrangements of existing memory
◮ Apply single program modular verification technology to procedure equivalence verification
   ◮ take advantage of procedure contracts if present

Outline
Introduction
What is procedure equivalence?
   When are procedures that allocate memory equivalent?
   How do changes in allocation order affect equivalence?
   How does garbage affect equivalence?
   How does context affect equivalence?
   Equivalence and recursion.
Verifying procedure equivalence
   Why is it difficult?
   Sound approximations - assuming isomorphic state is equal
   Equivalence and statement ordering
Tool current abilities

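When are procedures equivalent?
◮ Symdiff [1] says that procedures are equivalent iff they produce equal final stores given equal initial stores
◮ We call this equi-equivalence

$$s_1 \text{ equi-equivalent } s_2 \;\overset{\text{def}}{\iff}\; \forall \phi_{1 \ldots 4} :\ \phi_1 = \phi_2 \land \phi_1, s_1 \leadsto \phi_3 \land \phi_2, s_2 \leadsto \phi_4 \implies \phi_3 = \phi_4$$

Here $\phi, s \leadsto \phi'$ means that executing $s$ in initial store $\phi$ yields final store $\phi'$.

[1] Lahiri, Shuvendu K., et al. "Symdiff: A language-agnostic semantic diff tool for imperative programs." CAV. 2012
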
A note on termination
◮ Whether the procedures terminate under the same conditions is important, particularly if a transitive notion of equivalence is required
◮ This talk considers only terminating programs
Related:
◮ Hawblitzel, Chris, et al. "Towards modularly comparing programs using automated theorem provers." CADE-24 (2013)
◮ Elenbogen, Dima, et al. "Proving mutual termination." Formal Methods in System Design 47.2 (2015)

Example - identical procedures

    copy(t,r)
    {
      if(t = null) return;

      r.v := new;

      rl := new;
      copy(t.l, rl);
      r.v.l := rl.v;

      rr := new;
      copy(t.r, rr);
      r.v.r := rr.v;
    }

    copy'(t,r)
    {
      if(t = null) return;

      r.v := new;

      rl := new;
      copy'(t.l, rl);
      r.v.l := rl.v;

      rr := new;
      copy'(t.r, rr);
      r.v.r := rr.v;
    }

◮ Procedures recursively copy a tree
◮ Heap cell used to return result
◮ Recursive calls can allocate an unbounded amount of memory

Example - identical procedures

Bracketed numbers are the addresses allocated at each statement.

    copy(t,r)
    {
      if(t = null) return;

      r.v := new;          [1]

      rl := new;           [2]
      copy(t.l, rl);       [3 ... 3+n]
      r.v.l := rl.v;

      rr := new;           [3+n]
      copy(t.r, rr);
      r.v.r := rr.v;
    }

    copy'(t,r)
    {
      if(t = null) return;

      r.v := new;          [1]

      rl := new;           [2]
      copy'(t.l, rl);      [3 ... 3+n]
      r.v.l := rl.v;

      rr := new;           [3+n]
      copy'(t.r, rr);
      r.v.r := rr.v;
    }

◮ Equi-equivalence depends on the behaviour of the allocator
◮ With a non-deterministic allocator, these procedures are not equi-equivalent
◮ If we assume a deterministic incrementing allocator, then these procedures are equi-equivalent

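The dependence on the allocator can be tried out in a small Python model. This is only a sketch under assumptions that are not in the slides: the store is a dictionary from addresses to field maps, the allocator is modelled as an iterator `fresh` of unused addresses (so a non-deterministic allocator is just a different address sequence), and `new`, `run` and the three-node input tree are hypothetical. Comparing final stores on one input is a test, not a verification of equivalence.

```python
import itertools

def new(store, fresh):
    """Allocate an empty cell at the next address drawn from the iterator `fresh`."""
    addr = next(fresh)
    assert addr not in store, "allocator must return unused addresses"
    store[addr] = {"l": None, "r": None, "v": None}
    return addr

def copy(store, fresh, t, r):
    """Python transcription of the slide's copy(t, r)."""
    if t is None:
        return
    store[r]["v"] = new(store, fresh)
    rl = new(store, fresh)
    copy(store, fresh, store[t]["l"], rl)
    store[store[r]["v"]]["l"] = store[rl]["v"]
    rr = new(store, fresh)
    copy(store, fresh, store[t]["r"], rr)
    store[store[r]["v"]]["r"] = store[rr]["v"]

def run(fresh):
    """Run copy on a fixed tree: root 3 with leaves 1 and 2, result cell 4."""
    store = {
        1: {"l": None, "r": None, "v": None},
        2: {"l": None, "r": None, "v": None},
        3: {"l": 1, "r": 2, "v": None},
        4: {"l": None, "r": None, "v": None},
    }
    copy(store, fresh, 3, 4)
    return store

# Incrementing allocator: 5 is the first free address of the initial store.
incrementing_a = run(itertools.count(5))
incrementing_b = run(itertools.count(5))
# A different allocator choice, standing in for non-determinism.
other_choice = run(iter(range(100, 200)))

print(incrementing_a == incrementing_b)  # True
print(incrementing_a == other_choice)    # False
```

The same procedure run twice yields equal stores only when both runs receive the same address sequence, which is what the incrementing allocator guarantees.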
Example - allocation order

Bracketed numbers are the addresses allocated at each statement.

    copy(t,r)
    {
      if(t = null) return;

      r.v := new;          [1]

      rl := new;           [2]
      copy(t.l, rl);       [3 ... 3+n]
      r.v.l := rl.v;

      rr := new;           [3+n]
      copy(t.r, rr);
      r.v.r := rr.v;
    }

    copy'(t,r)
    {
      if(t = null) return;

      rl := new;           [1]
      copy'(t.l, rl);      [2 ... 2+n]

      rr := new;           [2+n]
      copy'(t.r, rr);

      r.v := new;          [3+n]
      r.v.l := rl.v;
      r.v.r := rr.v;
    }

◮ These procedures are not equi-equivalent with an incrementing allocator
◮ The trees produced by copy' are allocated at different memory addresses than those produced by copy

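The effect of allocation order can be reproduced with a short Python model of the two procedures. As before, this is a sketch under assumptions not stated in the slides: the dictionary store, the helper `new`, the name `copy_prime` for copy', and the three-node input tree are all hypothetical, and the allocator is an incrementing counter starting at the first free address.

```python
import itertools

def new(store, fresh):
    """Allocate an empty cell at the next address drawn from the iterator `fresh`."""
    addr = next(fresh)
    store[addr] = {"l": None, "r": None, "v": None}
    return addr

def copy(store, fresh, t, r):
    """copy from the slide: allocates the result node before recursing."""
    if t is None:
        return
    store[r]["v"] = new(store, fresh)
    rl = new(store, fresh)
    copy(store, fresh, store[t]["l"], rl)
    store[store[r]["v"]]["l"] = store[rl]["v"]
    rr = new(store, fresh)
    copy(store, fresh, store[t]["r"], rr)
    store[store[r]["v"]]["r"] = store[rr]["v"]

def copy_prime(store, fresh, t, r):
    """copy' from the slide: allocates the result node after both recursive calls."""
    if t is None:
        return
    rl = new(store, fresh)
    copy_prime(store, fresh, store[t]["l"], rl)
    rr = new(store, fresh)
    copy_prime(store, fresh, store[t]["r"], rr)
    store[r]["v"] = new(store, fresh)
    store[store[r]["v"]]["l"] = store[rl]["v"]
    store[store[r]["v"]]["r"] = store[rr]["v"]

def initial_store():
    """Tree with root 3 and leaves 1 and 2; cell 4 receives the result."""
    return {
        1: {"l": None, "r": None, "v": None},
        2: {"l": None, "r": None, "v": None},
        3: {"l": 1, "r": 2, "v": None},
        4: {"l": None, "r": None, "v": None},
    }

store1, store2 = initial_store(), initial_store()
copy(store1, itertools.count(5), 3, 4)
copy_prime(store2, itertools.count(5), 3, 4)

print(store1[4]["v"], store2[4]["v"])  # 5 13
print(store1 == store2)                # False
```

Both runs allocate the same number of cells, but the copied root lands at a different address, so the final stores are not equal even though they have the same shape.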
Use a weaker notion of equivalence
◮ Use a weaker equivalence, which requires that stores are related by some weaker relation ≈ rather than =, perhaps:

$$s_1 \approx s_2 \;\overset{\text{def}}{\iff}\; \forall \phi_{1 \ldots 4} :\ \phi_1 \approx \phi_2 \land \phi_1, s_1 \leadsto \phi_3 \land \phi_2, s_2 \leadsto \phi_4 \implies \phi_3 \approx \phi_4$$

◮ what should ≈ be?

Isomorphism
◮ Our programs do not look at the actual values of pointers
◮ Stores should be equivalent whenever they have the same shape
◮ Suggestion: φ1 ≈ φ2 when
   ◮ ∃ a bijection between the allocated addresses in φ1 and φ2
   ◮ the bijection preserves the shape of the stores

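The suggested relation can be made concrete with a brute-force search for a shape-preserving bijection. The following Python sketch rests on assumptions not in the slides: a store is a pair of a stack (variable to address) and a heap (address to field map), every value is a pointer or `None`, and the function name `isomorphic` and the example stores are hypothetical. It tries every permutation, so it is only usable on tiny stores and is not how a verifier would establish the relation.

```python
from itertools import permutations

def isomorphic(stack1, heap1, stack2, heap2):
    """Return a bijection on allocated addresses that preserves shape, or None."""
    addrs1, addrs2 = sorted(heap1), sorted(heap2)
    if len(addrs1) != len(addrs2) or stack1.keys() != stack2.keys():
        return None
    for perm in permutations(addrs2):
        m = dict(zip(addrs1, perm))
        m[None] = None  # null maps to null
        stack_ok = all(m[stack1[x]] == stack2[x] for x in stack1)
        heap_ok = all(
            heap1[a].keys() == heap2[m[a]].keys()
            and all(m[heap1[a][f]] == heap2[m[a]][f] for f in heap1[a])
            for a in addrs1
        )
        if stack_ok and heap_ok:
            del m[None]
            return m
    return None

# Hypothetical stores with the same shape at different addresses.
stack1, heap1 = {"x": 1}, {1: {"f": 3, "g": 2}, 2: {"n": 5}, 3: {"n": 4}, 4: {}, 5: {}}
stack2, heap2 = {"x": 11}, {11: {"f": 7, "g": 14}, 14: {"n": 6}, 7: {"n": 9}, 9: {}, 6: {}}

print(isomorphic(stack1, heap1, stack2, heap2))
# {1: 11, 2: 14, 3: 7, 4: 9, 5: 6}

# Redirecting one pointer changes the shape, so no bijection exists.
heap3 = {11: {"f": 7, "g": 14}, 14: {"n": 9}, 7: {"n": 9}, 9: {}, 6: {}}
print(isomorphic(stack1, heap1, stack2, heap3))  # None
```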
Isomorphism
[Figure: two stores of the same shape, one with objects 1 to 5 and one with objects 6, 7, 9, 11 and 14; labels f, g, t, n, x, y, z]
≈ {1 ↦ 11, 2 ↦ 14, 3 ↦ 7, 4 ↦ 9, 5 ↦ 6}

Example - garbage

    copy(t,r)
    {
      n := new;

      if(t = null) return;

      rl := new;
      copy(t.l, rl);
      n.l := rl.v;

      rr := new;
      copy(t.r, rr);
      n.r := rr.v;

      r.v := n;
    }

    copy'(t,r)
    {
      if(t = null) return;

      rl := new;
      copy'(t.l, rl);

      rr := new;
      copy'(t.r, rr);

      r.v := new;
      r.v.l := rl.v;
      r.v.r := rr.v;
    }

◮ copy creates extra unreachable garbage at the leaves

Isomorphism
◮ Our programs do not resurrect objects from the garbage
◮ Stores could be equivalent whenever their reachable parts have the same shape
◮ φ1 ≈ φ2 when
   ◮ ∃ a bijection between the reachable addresses in φ1 and φ2
   ◮ the bijection preserves the shape of the reachable parts of the stores

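Restricting the bijection to reachable addresses makes the check straightforward: walking both stores in lockstep from the same variables either builds the bijection or finds a mismatch. The Python sketch below assumes, beyond what the slides state, that the roots are the stack variables, that all values are pointers or `None`, and that the names `reachable_isomorphism` and the example stores are hypothetical.

```python
def reachable_isomorphism(stack1, heap1, stack2, heap2):
    """Return a shape-preserving bijection on reachable addresses, or None."""
    if stack1.keys() != stack2.keys():
        return None
    fwd, bwd = {}, {}  # the bijection and its inverse
    work = [(stack1[x], stack2[x]) for x in sorted(stack1)]
    while work:
        a, b = work.pop()
        if a is None or b is None:
            if a is not b:  # null must correspond to null
                return None
            continue
        if a in fwd or b in bwd:
            if fwd.get(a) != b or bwd.get(b) != a:  # sharing must agree
                return None
            continue
        if heap1[a].keys() != heap2[b].keys():
            return None
        fwd[a], bwd[b] = b, a
        work.extend((heap1[a][f], heap2[b][f]) for f in heap1[a])
    return fwd

# Cell 99 is unreachable garbage and is ignored by the traversal.
heap1 = {1: {"l": 2, "r": None}, 2: {"l": None, "r": None}, 99: {"l": None, "r": None}}
heap2 = {7: {"l": 8, "r": None}, 8: {"l": None, "r": None}}
print(reachable_isomorphism({"r": 1}, heap1, {"r": 7}, heap2))  # {1: 7, 2: 8}

# Different sharing: one store aliases both fields to a single object.
shared = {1: {"l": 2, "r": 2}, 2: {}}
unshared = {7: {"l": 8, "r": 9}, 8: {}, 9: {}}
print(reachable_isomorphism({"r": 1}, shared, {"r": 7}, unshared))  # None
```

Because every reachable object is reached along a path of variable and field names, the traversal leaves no choice about which object corresponds to which, unlike a bijection over all allocated addresses.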
Example - garbage

    copy(t,r)
    {
      n := new;

      if(t = null) return;

      rl := new;
      copy(t.l, rl);
      n.l := rl.v;

      rr := new;
      copy(t.r, rr);
      n.r := rr.v;

      r.v := n;
    }

    copy'(t,r)
    {
      if(t = null) return;

      rl := new;
      copy'(t.l, rl);

      rr := new;
      copy'(t.r, rr);

      r.v := new;
      r.v.l := rl.v;
      r.v.r := rr.v;
    }

◮ copy creates extra unreachable garbage at the leaves
◮ This presents a couple of difficulties:
   ◮ The garbage is still reachable at the end of the procedure
   ◮ The stack variables differ

Use a weaker notion of equivalence
◮ The caller cannot observe the stack frame of the callee
◮ So the final stores only need equivalent calling contexts

$$s_1 \approx s_2 \;\overset{\text{def}}{\iff}\; \forall \phi_{1 \ldots 4} :\ \phi_1 \approx \phi_2 \land \phi_1, s_1 \leadsto \phi_3 \land \phi_2, s_2 \leadsto \phi_4 \implies \phi_3^{ctx} \approx \phi_4^{ctx}$$

Context Isomorphism
[Figure: two stores in which the f and g fields are exchanged; objects 1 to 5 on one side and 6, 7, 9, 11 and 14 on the other]
≉ Pointer from t distinguishes objects 2 and 3

Context Isomorphism
[Figure: the same two stores, viewed from the caller context]
≈ In caller context no way to distinguish objects 2 and 3

Context Isomorphism
[Figure: the same two stores (≈), and below them the same stores with an extra pointer m (≉)]
Suppose there was an extra pointer m in the calling context

Context Isomorphism
[Figure: the same two stores (≈), and below them the same stores with an extra pointer m (≉)]
Now pointer 4.m distinguishes objects 2 and 3

Example - context

    noop(x)
    requires ...
    {
    }

    swap(x)
    requires ...
    {
      t := x.f;
      x.f := x.g;
      x.g := t;
    }

◮ These procedures could be equivalent when a precondition is taken into account
   ◮ e.g. requires x.f==x.g
◮ Preconditions may need to talk about the shape of the calling context
◮ Need a way to write about the calling context shape.
◮ It is always sound to assume a worst case calling context that can distinguish any two existing objects.

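The role of the precondition can be seen by running both procedures on concrete stores. This Python sketch makes assumptions beyond the slides: the heap is a dictionary from addresses to field maps, `pre` encodes the example precondition x.f==x.g, and `same_final_store` is a hypothetical helper that compares final heaps for plain equality, which is stronger than the isomorphism-based relation discussed above.

```python
import copy

def noop(heap, x):
    """noop(x): leaves the heap untouched."""
    return None

def swap(heap, x):
    """swap(x): exchanges the f and g fields of x."""
    t = heap[x]["f"]
    heap[x]["f"] = heap[x]["g"]
    heap[x]["g"] = t

def pre(heap, x):
    """Example precondition from the slide: x.f == x.g."""
    return heap[x]["f"] == heap[x]["g"]

def same_final_store(heap, x):
    """Run noop and swap on separate copies of heap and compare the results."""
    h1, h2 = copy.deepcopy(heap), copy.deepcopy(heap)
    noop(h1, x)
    swap(h2, x)
    return h1 == h2

aliased = {1: {"f": 2, "g": 2}, 2: {}}            # satisfies the precondition
distinct = {1: {"f": 2, "g": 3}, 2: {}, 3: {}}    # violates it

print(pre(aliased, 1), same_final_store(aliased, 1))    # True True
print(pre(distinct, 1), same_final_store(distinct, 1))  # False False
```

When the precondition fails, the final heaps differ as dictionaries. Whether they are still related by ≈ depends on whether the calling context can tell objects 2 and 3 apart, which is why preconditions may need to describe the shape of that context.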
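Use a weaker notion of equivalence
◮ The procedures are equivalent given some precondition

$$s_1 \approx_{pre} s_2 \;\overset{\text{def}}{\iff}\; \forall \phi_{1 \ldots 4} :\ \phi_1 \in pre \land \phi_2 \in pre \land \phi_1 \approx \phi_2 \land \phi_1, s_1 \leadsto \phi_3 \land \phi_2, s_2 \leadsto \phi_4 \implies \phi_3^{ctx} \approx \phi_4^{ctx}$$
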
Modularity and recursion
◮ We want to reason modularly (for scalability)
◮ When can we use the knowledge that two procedures are equivalent?