Formalising Luck: Improved Probabilistic Semantics for Property-Based Generators
Diane Gallois-Wong, supervised by Cătălin Hrițcu, INRIA Paris
September 5th, 2016
Outline: Context · Choice-recording semantics · Integrating backtracking strategies · Conclusion
Property-Based Testing
Property-based testing: running a program on many random inputs to test a property about it (as in QuickCheck).
Example: testing an insertion function for binary search trees:
insertBST : ∀ (tr : int tree) (x : int). isBST (tr) ⇒ isBST (insertBST tr x)
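The testing loop above can be sketched in Python; this is a minimal illustration of QuickCheck-style testing, not Luck itself, and the helper names (is_bst, insert_bst, random_tree) are mine.

```python
import random

def is_bst(tree, low=float("-inf"), high=float("inf")):
    """Check the binary-search-tree invariant; Empty is represented by None."""
    if tree is None:
        return True
    x, left, right = tree
    return low < x < high and is_bst(left, low, x) and is_bst(right, x, high)

def insert_bst(tree, x):
    """Insert x into a BST, preserving the invariant."""
    if tree is None:
        return (x, None, None)
    y, left, right = tree
    if x < y:
        return (y, insert_bst(left, x), right)
    if x > y:
        return (y, left, insert_bst(right, x))
    return tree  # x already present

def random_tree(depth=3):
    """Naive random tree: NOT guaranteed to be a BST."""
    if depth == 0 or random.random() < 0.5:
        return None
    return (random.randint(0, 100), random_tree(depth - 1), random_tree(depth - 1))

# Run the property on many random inputs; the precondition isBST(tr)
# discards every non-BST tree, so most generated inputs are wasted.
for _ in range(1000):
    tr, x = random_tree(), random.randint(0, 100)
    if is_bst(tr):
        assert is_bst(insert_bst(tr, x))
```

The wasted inputs in the final loop are exactly the sparsity problem that motivates dedicated generators on the next slide.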
Property-Based Generators
∀ (tr : int tree) (x : int). isBST (tr) ⇒ isBST (insertBST tr x)
Problem: isBST is sparse, so most random tests are irrelevant (the precondition fails).
Solution: Property-Based Generators (PBGs), which produce only inputs satisfying the precondition.
Problem: PBGs are hard to implement, and easily unbalanced or incomplete (e.g. generating almost only small trees).
Luck: a programming language dedicated to writing PBGs.
Luck: principle
A property-based generator in Luck:
Form of the program: a predicate (boolean function) representing the property.
Running it: give unknowns as arguments; a valuation V is generated for them so that the predicate evaluates to True.
(The program can also be interpreted as an ordinary predicate by giving concrete values as arguments.)
Luck mixes constraint solving and instantiation/backtracking: constraint solving by default, user-controlled instantiation.
Luck: standard basis
A standard lambda calculus with:
pairs
binary sums: case e of (L x → e1) (R y → e2), with injections L_{T1+T2} e and R_{T1+T2} e
recursive types
plus unknowns and special constructs to control instantiation.
Binary search trees generator in Luck

fun (bst : int -> int -> int -> int tree -> bool) size low high tree =
  if size == 0 then tree == Empty
  else case tree of
    | 1 % Empty -> True
    | size % Node x l r ->
        ((low < x && x < high) ! x)
        && bst (size / 2) low x l
        && bst (size / 2) x high r

Applying bst to given integers size, low, high yields a BST generator (a predicate int tree → bool). The weights 1 % and size % control the probability of each branch; ! x forces instantiation of the unknown x.
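For intuition, a hand-written Python analogue of this generator follows; this is my sketch of the generation behaviour (weighted branch choice, then uniform instantiation of x), not a translation of Luck's semantics, and gen_bst is a name I introduce.

```python
import random

def gen_bst(size, low, high):
    """Generate a random BST with keys strictly between low and high.

    Mirrors the Luck program: choose Empty with weight 1 vs Node with
    weight `size`, then pick x uniformly in (low, high) -- the analogue
    of the `! x` instantiation. When no x can exist (high - low <= 1) we
    return Empty, where real Luck would backtrack.
    """
    if size == 0 or high - low <= 1:
        return None  # Empty
    if random.random() < 1 / (1 + size):  # weights: 1 % Empty, size % Node
        return None
    x = random.randint(low + 1, high - 1)  # low < x < high
    return (x, gen_bst(size // 2, low, x), gen_bst(size // 2, x, high))
```

By construction every generated tree satisfies the BST invariant, so no test input is wasted on a failed precondition.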
Choice-recording semantics
A big-step operational probabilistic semantics: a derivation for a generator gen produces a possible output together with the choices made to reach it, recorded in a trace t:
gen ⇑_t output
This semantics does not itself handle backtracking: the output is either an actual valuation V, or ∅ signalling the need to backtrack (an error monad).
Traces of choices
Choice: (m, n, q)
  n: number of possibilities
  m: index of the one actually taken (0 ≤ m < n)
  q: probability of making this choice (q ∈ ℚ, 0 < q ≤ 1)
Trace: sequence of choices.
Probability P(t) of a trace t: product of the probabilities of its choices.
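The definition of P(t) is a one-line product; a minimal sketch using exact rationals (the trace below is taken from the slides' running example, and trace_prob is my name):

```python
from fractions import Fraction

def trace_prob(trace):
    """P(t): product of the probabilities of the choices (m, n, q) in t."""
    p = Fraction(1)
    for m, n, q in trace:
        assert 0 <= m < n and 0 < q <= 1  # well-formedness of a choice
        p *= q
    return p

# Trace of the derivation reaching V1 in the running example:
t = [(0, 2, Fraction(1, 2)), (0, 2, Fraction(1, 3)), (1, 2, Fraction(3, 5))]
print(trace_prob(t))  # 1/10
```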
(Sub)probability distribution
(P(t): product of the probabilities of the choices in t)
π_gen = Σ_{t, output | gen ⇑_t output} [output ↦ P(t)]
Example derivations:
gen ⇑_{[(0,2,1/2); (0,2,1/3); (0,2,2/5)]} ∅
gen ⇑_{[(0,2,1/2); (0,2,1/3); (1,2,3/5)]} V1
gen ⇑_{[(0,2,1/2); (1,2,2/3)]} ∅
gen ⇑_{[(1,2,1/2)]} V2
π_gen = [V1 ↦ 1/10; V2 ↦ 1/2; ∅ ↦ 1/15 + 1/3]
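As a sanity check, π_gen for this example can be recomputed from its four derivations; a minimal Python sketch (the output labels are just strings, and the encoding of derivations as (trace, output) pairs is mine):

```python
from fractions import Fraction as F
from collections import defaultdict

def trace_prob(trace):
    """P(t): product of the choice probabilities along the trace."""
    p = F(1)
    for _, _, q in trace:
        p *= q
    return p

# The four derivations of the running example: (trace, output).
derivations = [
    ([(0, 2, F(1, 2)), (0, 2, F(1, 3)), (0, 2, F(2, 5))], "∅"),
    ([(0, 2, F(1, 2)), (0, 2, F(1, 3)), (1, 2, F(3, 5))], "V1"),
    ([(0, 2, F(1, 2)), (1, 2, F(2, 3))], "∅"),
    ([(1, 2, F(1, 2))], "V2"),
]

# π_gen maps each output to the sum of P(t) over the traces reaching it.
pi = defaultdict(F)
for t, out in derivations:
    pi[out] += trace_prob(t)

assert sum(pi.values()) == 1  # here every execution terminates
print(pi["V1"], pi["V2"], pi["∅"])  # 1/10 1/2 2/5
```

Note that 1/15 + 1/3 = 2/5, matching the slide's value for ∅.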
Advantages of this new semantics
The former semantics directly derives the whole probability distribution in a collecting style: gen ⇑ π_gen.
Our new choice-recording semantics is:
simpler: proofs mechanised in Coq
just as expressive: still able to produce π_gen
better at handling programs that do not always terminate
more informative: it records the per-derivation detail used in the next part.
Integrating backtracking strategies: motivation
π_gen ranges over valuations V but also over ∅ (need to backtrack).
Running a generator in practice means applying a backtracking strategy so as to always output a valuation.
Objective: determine the final distribution ρ^bstrat_gen over valuations of a generator gen when using a backtracking strategy bstrat.
Choice tree of a generator
(Assume all executions terminate.)
The derivations
gen ⇑_{[(0,2,1/2); (0,2,1/3); (0,2,2/5)]} ∅
gen ⇑_{[(0,2,1/2); (0,2,1/3); (1,2,3/5)]} V1
gen ⇑_{[(0,2,1/2); (1,2,2/3)]} ∅
gen ⇑_{[(1,2,1/2)]} V2
share their common prefixes, forming a choice tree: the root has edges (0,2,1/2) to an inner node and (1,2,1/2) to leaf V2; that inner node has edges (0,2,1/3) to a second inner node and (1,2,2/3) to a leaf ∅; the second inner node has edges (0,2,2/5) to a leaf ∅ and (1,2,3/5) to leaf V1.
Markov chains to model the application of a strategy
We use time-homogeneous Markov chains with discrete time and finite state space.
Restart-from-scratch strategy: from an ∅ leaf, return to the initial state Init with probability 1; the V leaves are absorbing.
On the example choice tree this gives an absorbing Markov chain, with absorption probabilities [V1 ↦ 1/6; V2 ↦ 5/6].
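These absorption probabilities can be checked numerically; the sketch below assumes the chain read off the slide's figure (the transient-state names Init, N1, N2 are mine, with the ∅ leaves folded into "back to Init" transitions), and iterates the fixed-point equation B = R + Q·B.

```python
# Restart-from-scratch on the running example, as an absorbing Markov chain.
# Transient states Init, N1, N2; absorbing states V1, V2.
# Init -> N1 (1/2) | V2 (1/2);  N1 -> N2 (1/3) | Init (2/3, via ∅);
# N2 -> V1 (3/5) | Init (2/5, via ∅).
Q = [[0.0, 1 / 2, 0.0],   # transitions among transient states
     [2 / 3, 0.0, 1 / 3],
     [2 / 5, 0.0, 0.0]]
R = [[0.0, 1 / 2],        # transitions to absorbing states (V1, V2)
     [0.0, 0.0],
     [3 / 5, 0.0]]

# Absorption probabilities satisfy B = R + Q B; iterate to the fixed point.
B = [[0.0, 0.0] for _ in range(3)]
for _ in range(1000):
    B = [[R[i][j] + sum(Q[i][k] * B[k][j] for k in range(3)) for j in range(2)]
         for i in range(3)]
print([round(p, 4) for p in B[0]])  # from Init: [0.1667, 0.8333] = [1/6, 5/6]
```

This agrees with the direct argument: each run ends in V1 (1/10), V2 (1/2) or restart (2/5), so P(V1) = (1/10)/(3/5) = 1/6.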
Example of a non-absorbing Markov chain
Backtrack-from-leaf-to-parent strategy: from an ∅ leaf, move back to the parent node, which draws a child again.
Example tree: Init with edges 1/2 to leaf V1 and 1/2 to a node both of whose children (probabilities 1/3 and 2/3) are ∅ leaves.
With probability 1/2 the execution reaches the all-∅ node and then bounces between that node and its ∅ leaves forever: the Markov chain is not absorbing, so it yields no probability distribution over valuations.
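For a finite chain, "absorbing" means every state can reach some absorbing state; a small reachability sketch over the tree assumed above (the state names and dictionary encoding are mine) makes the failure visible:

```python
# Backtrack-to-parent on: Init -> V1 | N;  N -> ∅a | ∅b;  each ∅ leaf
# moves back to its parent N. Edges only (probabilities are irrelevant
# for the reachability check).
transitions = {
    "Init": ["V1", "N"],
    "N": ["∅a", "∅b"],
    "∅a": ["N"],   # backtrack to parent
    "∅b": ["N"],
    "V1": ["V1"],  # absorbing (self-loop only)
}
absorbing = {s for s, ts in transitions.items() if ts == [s]}

def can_reach_absorbing(s):
    """Depth-first search for an absorbing state reachable from s."""
    seen, stack = set(), [s]
    while stack:
        u = stack.pop()
        if u in absorbing:
            return True
        if u not in seen:
            seen.add(u)
            stack.extend(transitions[u])
    return False

print({s: can_reach_absorbing(s) for s in transitions})
```

N and its ∅ leaves cannot reach V1, so the chain is not absorbing.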
More complex strategy I
Remember hopeless nodes, i.e. nodes that lead only to ∅ leaves.
Backtrack to the parent from an ∅ leaf, or from a node whose children are all hopeless.
Restart the execution when more than a set number Bmax of ∅ leaves have been encountered.
Set of states: Nodes × (P(Nodes) × {0, 1, …, Bmax})
(node, (hopeless nodes, number of ∅ leaves seen))
More complex strategy II
(Example choice tree: root A with children B (probability 1/2) and G = V2 (1/2); B with children C (1/3) and F = ∅ (2/3); C with children D = ∅ (2/5) and E = V1 (3/5).)
Transition rules (probability in brackets):
(D, S, n) →[1] (C, S ∪ {D}, n+1)   if n < Bmax
(D, S, Bmax) →[1] (A, S ∪ {D}, 0)
(C, S, n) →[2/5] (D, S, n) and (C, S, n) →[3/5] (E, S, n)   if D, E ∉ S
(C, S, n) →[1] (E, S, n)   if D ∈ S, E ∉ S
(C, S, n) →[1] (B, S ∪ {C}, n)   if D, E ∈ S (impossible here, yet the transition exists)
(G, S, n) →[1] (G, S, n)
Definition of a strategy
A backtracking strategy is a computable function associating to a choice tree a Markov chain such that:
it is time-homogeneous with discrete time;
it has a finite set of states of the form Nodes × M;
it has a single initial state, of the form (root, M0);
its absorbing states are exactly those whose node is a non-∅ leaf;
it is absorbing.
Computability of the final probability distribution
Theorem. The final distribution ρ^bstrat_gen is computable from gen and bstrat.
gen → choice tree: computable (the tree is finite)
choice tree → Markov chain: bstrat is computable (by definition)
Markov chain → ρ^bstrat_gen: write the transition matrix of the absorbing Markov chain in canonical form
  ( Q  R )
  ( 0  I )
The probability of being absorbed in state j starting from non-absorbing state i is the (i, j) entry of
  Σ_{s ≥ 0} Q^s R = (I − Q)^{-1} R
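The formula (I − Q)^{-1} R can be evaluated exactly over the rationals; a sketch on the restart-from-scratch chain of the running example (the matrices and the small Gaussian-elimination solver are my encoding, with transient states Init, N1, N2 and absorbing states V1, V2):

```python
from fractions import Fraction as F

Q = [[F(0), F(1, 2), F(0)],   # transitions among transient states
     [F(2, 3), F(0), F(1, 3)],
     [F(2, 5), F(0), F(0)]]
R = [[F(0), F(1, 2)],         # transitions to absorbing states (V1, V2)
     [F(0), F(0)],
     [F(3, 5), F(0)]]

def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination over exact rationals."""
    n, m = len(A), len(B[0])
    M = [row[:] + Brow[:] for row, Brow in zip(A, B)]  # augmented matrix
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [v / M[col][col] for v in M[col]]      # normalise pivot row
        for r in range(n):
            if r != col and M[r][col] != 0:
                M[r] = [v - M[r][col] * w for v, w in zip(M[r], M[col])]
    return [row[n:] for row in M]

I_minus_Q = [[F(int(i == j)) - Q[i][j] for j in range(3)] for i in range(3)]
B = solve(I_minus_Q, R)  # B = (I - Q)^{-1} R
print(B[0])  # absorption probabilities from Init: [Fraction(1, 6), Fraction(5, 6)]
```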
"Activated only upon backtracking" strategies
A strategy is "activated only upon backtracking" if, for any Markov chain in its image, for any state that is reachable from the initial state without visiting an ∅ leaf, the transitions follow the choice tree.
Consequence: ∀V. ρ^bstrat_gen(V) ≥ π_gen(V)
Restriction on the computation of the MC from the CT
A strategy is a computable function from choice trees to Markov chains: this is very permissive.
Possible restriction: the transitions from a state (node, M) are computable from node, M, the children of node and the edges to them in the choice tree only.
The examples need minor adaptation:
restart-from-scratch: add memorised information that always contains the root;
more complex strategy: add the path of ancestors of the current node to the memorised information.
Evaluation of backtracking strategies
π_gen is usually more intuitive for the user than ρ^bstrat_gen, so ideally bstrat should keep ρ^bstrat_gen close to π_gen.
Extremality result: restart-from-scratch is the most conservative strategy, according to two functions evaluating the closeness of distributions:
Bhattacharyya distance: D_B(P, Q) = −log Σ_i √(P(i) Q(i))
Kullback-Leibler divergence: D_KL(P ‖ Q) = Σ_i P(i) log (P(i) / Q(i))
Future work: add other terms to minimise, such as a time-complexity term (e.g. the expected number of steps in the Markov chain before absorption).
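Both closeness measures are one-liners over a finite support; a minimal sketch (the example distribution rho below is a hypothetical alternative strategy's output, chosen only for illustration):

```python
from math import log, sqrt

def bhattacharyya(P, Q):
    """D_B(P, Q) = -log sum_i sqrt(P(i) Q(i)); 0 iff P = Q."""
    return -log(sum(sqrt(P[i] * Q[i]) for i in P))

def kl_divergence(P, Q):
    """D_KL(P || Q) = sum_i P(i) log(P(i)/Q(i)); terms with P(i)=0 contribute 0."""
    return sum(p * log(p / Q[i]) for i, p in P.items() if p > 0)

# Restart-from-scratch output on the running example (π_gen renormalised
# over valuations) vs a hypothetical alternative strategy's distribution:
pi = {"V1": 1 / 6, "V2": 5 / 6}
rho = {"V1": 0.25, "V2": 0.75}
print(bhattacharyya(pi, rho), kl_divergence(pi, rho))
```

Both functions are 0 when the distributions coincide, which is what makes them usable as penalty terms in the future-work objective.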