How to get an efficient yet verified arbitrary-precision integer library
Raphaël Rieu-Helft (joint work with Guillaume Melquiond and Claude Marché)
TrustInSoft, Inria
May 28, 2018
1/31
Context, motivation, goals
goal: efficient and formally verified large-integer library
GMP: widely-used, high-performance library
  tested, but hard to ensure good coverage (unlikely branches)
  correctness bugs have been found in the past
idea:
  1 formally verify GMP algorithms with Why3
  2 extract efficient C code
2/31
Outline
  1 Deductive verification with Why3
  2 Reimplementing GMP using Why3
  3 Extracting to idiomatic C code
  4 An example: schoolbook multiplication
  5 Benchmarks, conclusions
3/31
Deductive verification with Why3 4/31
Deductive verification
[diagram: program + specification → verification conditions → proof]
5/31
Kadane’s algorithm
(* | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | *)
(* ......|######## max ########|.............. *)
(* ..............................|### cur #### *)
let maximum_subarray (a: array int): int
  ensures { forall l h: int. 0 ≤ l ≤ h ≤ length a → sum a l h ≤ result }
  ensures { exists l h: int. 0 ≤ l ≤ h ≤ length a ∧ sum a l h = result }
= let max = ref 0 in
  let cur = ref 0 in
  for i = 0 to length a - 1 do
    cur += a[i];
    if !cur < 0 then cur := 0;
    if !cur > !max then max := !cur
  done;
  !max
6/31
Kadane’s algorithm
(* | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | *)
(* ......|######## max ########|.............. *)
(* ..............................|### cur #### *)
let maximum_subarray (a: array int): int
  ensures { forall l h: int. 0 ≤ l ≤ h ≤ length a → sum a l h ≤ result }
  ensures { exists l h: int. 0 ≤ l ≤ h ≤ length a ∧ sum a l h = result }
= let max = ref 0 in
  let cur = ref 0 in
  let ghost cl = ref 0 in
  for i = 0 to length a - 1 do
    invariant { forall l: int. 0 ≤ l ≤ i → sum a l i ≤ !cur }
    invariant { 0 ≤ !cl ≤ i ∧ sum a !cl i = !cur }
    cur += a[i];
    if !cur < 0 then begin cur := 0; cl := i+1 end;
    if !cur > !max then max := !cur
  done;
  !max
7/31
Kadane’s algorithm
(* | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | *)
(* ......|######## max ########|.............. *)
(* ..............................|### cur #### *)
let maximum_subarray (a: array int): int
  ensures { forall l h: int. 0 ≤ l ≤ h ≤ length a → sum a l h ≤ result }
  ensures { exists l h: int. 0 ≤ l ≤ h ≤ length a ∧ sum a l h = result }
= let max = ref 0 in
  let cur = ref 0 in
  let ghost cl = ref 0 in
  let ghost lo = ref 0 in
  let ghost hi = ref 0 in
  for i = 0 to length a - 1 do
    invariant { forall l: int. 0 ≤ l ≤ i → sum a l i ≤ !cur }
    invariant { 0 ≤ !cl ≤ i ∧ sum a !cl i = !cur }
    invariant { forall l h: int. 0 ≤ l ≤ h ≤ i → sum a l h ≤ !max }
    invariant { 0 ≤ !lo ≤ !hi ≤ i ∧ sum a !lo !hi = !max }
    cur += a[i];
    if !cur < 0 then begin cur := 0; cl := i+1 end;
    if !cur > !max then begin max := !cur; lo := !cl; hi := i+1 end
  done;
  !max
8/31
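As a point of comparison with the proved version, the same loop can be sketched in C with the two postconditions checked at run time by brute force. This is only an illustration, not part of the talk: the names `sum_range`, `maximum_subarray` (C version) and `check_postconditions` are hypothetical, elements are assumed to be `int32_t` with sums accumulated in `int64_t`, and the ghost variables `cl`, `lo`, `hi` become ordinary variables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum of a[l..h-1]; plays the role of the logical function `sum a l h`. */
static int64_t sum_range(const int32_t *a, size_t l, size_t h) {
    int64_t s = 0;
    for (size_t k = l; k < h; k++) s += a[k];
    return s;
}

/* Kadane's loop. *lo and *hi stand in for the ghost variables that
   witness the existential postcondition. */
static int64_t maximum_subarray(const int32_t *a, size_t n,
                                size_t *lo, size_t *hi) {
    int64_t max = 0, cur = 0;
    size_t cl = 0;                 /* start of the current suffix */
    *lo = 0; *hi = 0;
    for (size_t i = 0; i < n; i++) {
        cur += a[i];
        if (cur < 0) { cur = 0; cl = i + 1; }
        if (cur > max) { max = cur; *lo = cl; *hi = i + 1; }
    }
    return max;
}

/* Brute-force test of both postconditions on one input.
   Returns 1 if they hold, 0 otherwise. Quadratic number of ranges. */
static int check_postconditions(const int32_t *a, size_t n) {
    size_t lo, hi;
    int64_t r = maximum_subarray(a, n, &lo, &hi);
    for (size_t l = 0; l <= n; l++)
        for (size_t h = l; h <= n; h++)
            if (sum_range(a, l, h) > r) return 0;   /* first ensures */
    return lo <= hi && hi <= n && sum_range(a, lo, hi) == r; /* second */
}
```

Such a check only covers the inputs it is run on, whereas the loop invariants let Why3 establish the same two properties for every array.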
Computing verification conditions
A Hoare triple: { P } e { Q }
  P  precondition (property)
  e  expression
  Q  postcondition (property)
{ P } e { Q } holds if, whenever e is executed in a state that satisfies P, the computation terminates in a state that satisfies Q
Predicate transformer WP(e, Q) (Dijkstra)
  e  expression
  Q  postcondition
computes the weakest precondition P such that { P } e { Q }
9/31
Definition of WP
WP(skip, Q) = Q
WP(t, Q) = Q[result ↦ t]
WP(x := t, Q) = Q[x ↦ t]
WP(e1; e2, Q) = WP(e1, WP(e2, Q))
WP(let v = e1 in e2, Q) = WP(e1, WP(e2, Q)[v ↦ result])
WP(let x = ref e1 in e2, Q) = WP(e1, WP(e2, Q)[x ↦ result])
WP(if t then e1 else e2, Q) = (t → WP(e1, Q)) ∧ (¬t → WP(e2, Q))
WP(assert R, Q) = R ∧ (R → Q)
10/31
Soundness of WP Theorem For any e and Q , the triple { WP ( e , Q ) } e { Q } is valid. Can be proved by induction on the structure of the program e w.r.t. some reasonable semantics (axiomatic, operational, etc.) Corollary To show that { P } e { Q } is valid, it suffices to prove P → WP ( e , Q ) . This is what Why3 does. 11/31
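A small worked instance may help make the rules and the corollary concrete. The program and postcondition below are chosen only for illustration (they are not from the talk): take e to be `if x < 0 then x := -x else skip` and Q to be x ≥ 0, then unfold WP using the rules for `if`, assignment and `skip`.

```latex
\begin{align*}
\mathrm{WP}(e, Q)
  &= \bigl(x < 0 \rightarrow \mathrm{WP}(x := -x,\; x \ge 0)\bigr)
     \wedge \bigl(\neg (x < 0) \rightarrow \mathrm{WP}(\mathtt{skip},\; x \ge 0)\bigr) \\
  &= \bigl(x < 0 \rightarrow (x \ge 0)[x \mapsto -x]\bigr)
     \wedge \bigl(\neg (x < 0) \rightarrow x \ge 0\bigr) \\
  &= (x < 0 \rightarrow -x \ge 0) \wedge (\neg (x < 0) \rightarrow x \ge 0)
\end{align*}
```

Both conjuncts are valid over the integers, so the formula true → WP(e, Q) is provable and, by the corollary, the triple { true } e { x ≥ 0 } is valid. Formulas of this shape are what gets sent to the automated provers.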
Why3 proof session 12/31
Reimplementing GMP using Why3 13/31
General approach
game plan:
  implement the GMP algorithms in WhyML
  verify them with Why3
  extract to C
difficulties:
  preserve all GMP implementation tricks
  prove them correct
  extract to efficient C code
[diagram: file.mlw goes into Why3, which calls the provers Alt-Ergo, CVC4, Z3, etc., and extracts to file.ml and file.c]
14/31
An example: comparison
large integer ≡ pointer to array of unsigned integers a_0 ... a_{n−1} called limbs
$\mathrm{value}(a, n) = \sum_{i=0}^{n-1} a_i \beta^i$, usually $\beta = 2^{64}$
type ptr 'a = ...
exception Return32 int32
let wmpn_cmp (x y: ptr uint64) (sz: int32): int32
= let i = ref sz in
  try
    while !i ≥ 1 do
      i := !i - 1;
      let lx = x[!i] in
      let ly = y[!i] in
      if lx ≠ ly then
        if lx > ly
        then raise (Return32 1)
        else raise (Return32 (-1))
    done;
    0
  with Return32 r → r
  end
15/31
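To make the link between the limb array and its value concrete, here is a minimal C sketch that assumes 8-bit limbs (β = 256 instead of 2^64) so that the value of up to 8 limbs fits in a `uint64_t`. The names `value8`, `cmp8` and `cmp_by_value` are hypothetical; `cmp8` follows the same most-significant-limb-first loop as `wmpn_cmp`, and `cmp_by_value` is the behaviour one would expect a specification written in terms of value to describe.

```c
#include <assert.h>
#include <stdint.h>

/* value(a, n) = sum over i < n of a[i] * 256^i.
   Assumption: n <= 8 so that the result fits in 64 bits. */
static uint64_t value8(const uint8_t *a, int32_t n) {
    assert(0 <= n && n <= 8);
    uint64_t v = 0;
    for (int32_t i = n - 1; i >= 0; i--)
        v = v * 256 + a[i];        /* Horner evaluation, high limb first */
    return v;
}

/* Limb-by-limb comparison, scanning from the most significant limb. */
static int32_t cmp8(const uint8_t *x, const uint8_t *y, int32_t sz) {
    int32_t i = sz;
    while (i >= 1) {
        i = i - 1;
        if (x[i] != y[i])
            return x[i] > y[i] ? 1 : -1;
    }
    return 0;
}

/* Reference result: the sign of value(x) - value(y). */
static int32_t cmp_by_value(const uint8_t *x, const uint8_t *y, int32_t sz) {
    uint64_t vx = value8(x, sz), vy = value8(y, sz);
    return (vx > vy) - (vx < vy);
}
```

With real 64-bit limbs the value no longer fits in a machine word, which is one reason it appears only in specifications and proofs and never in the executable code.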
Memory model
simple memory model, more restrictive than C
type ptr 'a = abstract { mutable data: array 'a ; offset: int }
predicate valid (p:ptr 'a) (sz:int) =
  0 ≤ sz ∧ 0 ≤ p.offset ∧ p.offset + sz ≤ plength p
[figure: p.data drawn as cells 0 to 8, with p.offset pointing into it and a brace marking the cells covered by valid(p,5)]
val malloc (sz:uint32) : ptr 'a (* malloc(sz * sizeof('a)) *)
  ...
val free (p:ptr 'a) : unit (* free(p) *)
  ...
no explicit address for pointers
16/31
Alias control
aliased C pointers ⇔ point to the same memory object
aliased Why3 pointers ⇔ same data field
only way to get aliased pointers: incr
type ptr 'a = abstract { mutable data: array 'a ; offset: int }
val incr (p:ptr 'a) (ofs:int32): ptr 'a (* p+ofs *)
  alias { result.data with p.data }
  ensures { result.offset = p.offset + ofs }
  ...
val free (p:ptr 'a) : unit
  requires { p.offset = 0 }
  writes { p.data }
  ensures { p.data.length = 0 }
Why3 type system: all aliases are known statically ⇒ no need to prove non-aliasing hypotheses
17/31
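The memory model and its aliasing rule can be mimicked at run time with a small C sketch. Everything here is hypothetical and for illustration only (`model_block`, `model_ptr`, `model_valid`, `model_malloc`, `model_incr`, `model_free` are not part of Why3 or of the extracted code): a pointer is a shared block plus an offset, `model_incr` is the only operation that produces an alias, and `model_free` sets the shared length to 0 so that every alias becomes invalid, in the spirit of `ensures { p.data.length = 0 }`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Shared memory object: the counterpart of the `data` field. */
typedef struct {
    uint64_t *elts;
    int64_t length;
} model_block;

/* A pointer is a block plus an offset; there is no explicit address. */
typedef struct {
    model_block *data;
    int64_t offset;
} model_ptr;

/* valid(p, sz): sz cells starting at p.offset lie inside the block. */
static int model_valid(model_ptr p, int64_t sz) {
    return 0 <= sz && 0 <= p.offset && p.offset + sz <= p.data->length;
}

static model_ptr model_malloc(uint32_t sz) {
    model_block *b = malloc(sizeof *b);
    assert(b != NULL);
    b->elts = calloc(sz ? sz : 1, sizeof(uint64_t));
    assert(b->elts != NULL);
    b->length = sz;
    model_ptr p = { b, 0 };
    return p;
}

/* p + ofs: same block (an alias), shifted offset. */
static model_ptr model_incr(model_ptr p, int32_t ofs) {
    model_ptr q = { p.data, p.offset + ofs };
    return q;
}

/* free requires offset 0 and leaves the block with length 0.
   The block header itself is deliberately kept (leaked) so that
   aliases can still observe the zero length in this sketch. */
static void model_free(model_ptr p) {
    assert(p.offset == 0);
    free(p.data->elts);
    p.data->elts = NULL;
    p.data->length = 0;
}
```

In Why3 none of this bookkeeping exists at run time: the type system tracks which pointers share a `data` field, and extraction maps `ptr` to a plain C pointer.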
Example specification: long multiplication
specifications are defined in terms of value
(** [wmpn_mul r x y sx sy] multiplies [(x, sx)] and [(y,sy)] and
    writes the result in [(r, sx+sy)]. [sx] must be greater than
    or equal to [sy]. Corresponds to [mpn_mul]. *)
let wmpn_mul (r x y: ptr uint64) (sx sy: int32) : unit
  requires { 0 < sy ≤ sx }
  requires { valid x sx }
  requires { valid y sy }
  requires { valid r (sy + sx) }
  writes { r.data.elts }
  ensures { value r (sy + sx) = value x sx * value y sy }
Why3 typing constraint: r cannot be aliased to x or y
  simplifies proofs: aliases are known statically
  we need separate functions for in-place operations
18/31
Extracting to idiomatic C code 19/31
Extraction mechanism
goals:
  simple, straightforward extraction (trusted)
  performance: no added complexity, no closures or indirections
  inefficiencies caused by extraction must be optimizable by the compiler
tradeoff: handle only a small, C-like fragment of WhyML
  ✓ loops                       ✗ polymorphism, abstract types
  ✓ references                  ✗ higher order
  ✓ machine integers            ✗ mathematical integers
  ✓ manual memory management    ✗ garbage collection
20/31
Exceptions and return / break statements
return and break are emulated by exceptions in WhyML
recognize the patterns, extract as native return / break
reject all other exceptions

WhyML (try in tail position):
let f (args) =
  ...;
  try
    ...;
    raise (R e)
    ...
  with R v → v end

extracted C:
f (args) {
  ...;
  ...;
  return e;
  ...
}

WhyML:
try
  while ... do
    ...
    raise B
    ...
  done
with B → () end

extracted C:
while ... {
  ...
  break;
  ...
}
21/31
Comparison: extracted C code

extracted C:
int32_t wmpn_cmp(uint64_t * x, uint64_t * y, int32_t sz) {
  int32_t i, o;
  uint64_t lx, ly;
  i = (sz);
  while (i >= 1) {
    o = (i - 1);
    i = o;
    lx = (*(x+(i)));
    ly = (*(y+(i)));
    if (lx != ly) {
      if (lx > ly) return (1);
      else return (-(1));
    }
  }
  return (0);
}

WhyML source:
let wmpn_cmp (x y: ptr uint64) (sz: int32): int32
= let i = ref sz in
  try
    while !i ≥ 1 do
      i := !i - 1;
      let lx = x[!i] in
      let ly = y[!i] in
      if lx ≠ ly then
        if lx > ly
        then raise (Return32 1)
        else raise (Return32 (-1))
    done;
    0
  with Return32 r → r
  end
22/31
An example: schoolbook multiplication 23/31
Schoolbook multiplication
simple algorithm, optimal for smaller sizes
GMP switches to divide-and-conquer algorithms at ∼ 20 words
mp_limb_t mpn_mul (mp_ptr rp,
                   mp_srcptr up, mp_size_t un,
                   mp_srcptr vp, mp_size_t vn)
{
  /* We first multiply by the low order limb. This result can be
     stored, not added, to rp. We also avoid a loop for zeroing
     this way. */
  rp[un] = mpn_mul_1 (rp, up, un, vp[0]);
  /* Now accumulate the product of up[] and the next higher limb
     from vp[]. */
  while (--vn >= 1) {
    rp += 1, vp += 1;
    rp[un] = mpn_addmul_1 (rp, up, un, vp[0]);
  }
  return rp[un];
}
24/31
Why3 implementation
while !i < sy do
  invariant { value r (!i + sx) = value x sx * value y !i }
  ly := get_ofs y !i;
  let c = addmul_limb !rp x !ly sx in
  set_ofs !rp sx c;
  i := !i + 1;
  rp := C.incr !rp 1;
done;
...
25/31
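The structure of the loop and its invariant can be tried out with a self-contained C sketch. To stay portable it assumes 32-bit limbs with 64-bit intermediate products rather than the 64-bit limbs used by GMP and the WhyML code; the names `limb`, `sketch_mul_1`, `sketch_addmul_1` and `sketch_mul` are hypothetical stand-ins for `mpn_mul_1`, `mpn_addmul_1` and `wmpn_mul`, not their actual implementations.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t limb;   /* assumption: 32-bit limbs, beta = 2^32 */

/* r[0..n-1] := low limbs of x * v; returns the high (carry) limb. */
static limb sketch_mul_1(limb *r, const limb *x, int32_t n, limb v) {
    uint64_t carry = 0;
    for (int32_t i = 0; i < n; i++) {
        uint64_t t = (uint64_t)x[i] * v + carry;
        r[i] = (limb)t;
        carry = t >> 32;
    }
    return (limb)carry;
}

/* r[0..n-1] += x * v; returns the carry limb.
   x[i]*v + r[i] + carry is at most 2^64 - 1, so t cannot overflow. */
static limb sketch_addmul_1(limb *r, const limb *x, int32_t n, limb v) {
    uint64_t carry = 0;
    for (int32_t i = 0; i < n; i++) {
        uint64_t t = (uint64_t)x[i] * v + r[i] + carry;
        r[i] = (limb)t;
        carry = t >> 32;
    }
    return (limb)carry;
}

/* Schoolbook product: r[0..sx+sy-1] := x * y, with 0 < sy <= sx.
   r must not overlap x or y, mirroring the Why3 typing constraint. */
static void sketch_mul(limb *r, const limb *x, int32_t sx,
                       const limb *y, int32_t sy) {
    assert(0 < sy && sy <= sx);
    r[sx] = sketch_mul_1(r, x, sx, y[0]);
    /* Loop invariant, as in the WhyML version:
       value r (i + sx) = value x sx * value y i */
    for (int32_t i = 1; i < sy; i++)
        r[i + sx] = sketch_addmul_1(r + i, x, sx, y[i]);
}
```

The first row is stored rather than accumulated, so `r` does not need to be zeroed beforehand, which is the trick described in the GMP comment.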