Machine learning for instance selection in SMT solving (Work in Progress)

  1. Machine learning for instance selection in SMT solving (Work in Progress). Jasmin Christian Blanchette (1, 2), Daniel El Ouraoui (2), Pascal Fontaine (2), Cezary Kaliszyk (3). (1) Vrije Universiteit Amsterdam, Amsterdam, The Netherlands; (2) University of Lorraine, CNRS, Inria, and LORIA, Nancy, France; (3) University of Innsbruck, Innsbruck, Austria. 9th April 2019

  2. Contents: 1 Introduction; 2 CDCL(T); 3 Instantiation techniques; 4 Machine learning for instance selection; 5 Evaluation; 6 Conclusion

  4. Motivations. Satisfiability modulo theories (SMT): automation for proof assistants, verification conditions, and model checking; solvers include Z3, CVC4, veriT, ... Instantiation: hard for SMT solvers and solved heuristically. Challenge: improve instantiation techniques to solve more problems and be more efficient.

  5. Our tool: Université de Lorraine/UFRN (http://www.verit-solver.org)

  7. Context: b ≠ a ∧ f(a) = f(b) ∧ ∀x y. f(x) = f(y) ⇒ x = y. The conjuncts b ≠ a and f(a) = f(b) form the ground part; the quantified conjunct is handled by instantiation.

  8. Ground problem: how to efficiently check the satisfiability of a ground formula? Example: (f(a, b) = g(a) ∨ d = b) ∧ d = g(b) ∧ d ≠ f(a, b) ∧ b = a ∧ d ≠ g(a). Each atom is labelled with a propositional literal l1, ..., l6, which gives the Boolean abstraction (l1 ∨ ¬l2) ∧ l3 ∧ l4 ∧ l5 ∧ l6.

  12. CDCL(T): the ground solver combines a SAT solver with theory solvers. Formulas are embedded in SAT; the SAT solver produces a Boolean model; theory solvers produce conflict clauses; conflict clauses guide the SAT solver.
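
The loop on this slide can be illustrated on the ground formula of slide 8. The following is a minimal sketch, not veriT's implementation: the SAT solver is replaced by brute-force enumeration, the theory solver is a naive congruence closure, and the conflict clause simply negates the whole Boolean model (real solvers return much smaller explanations). The atom numbering and all function names are assumptions made for the example.

```python
from itertools import product

# Terms: constants are strings, applications are tuples (symbol, arg1, ...).
A, B, D = "a", "b", "d"
FAB, GA, GB = ("f", A, B), ("g", A), ("g", B)

# Atom i+1 is the equality ATOMS[i]; this numbering is ours, not the slide's.
ATOMS = [(FAB, GA), (D, B), (D, GB), (D, FAB), (B, A), (D, GA)]
# Boolean abstraction of the slide-8 formula, as DIMACS-style clauses.
CLAUSES = [[1, 2], [3], [-4], [5], [-6]]

def subterms(t):
    yield t
    if isinstance(t, tuple):
        for arg in t[1:]:
            yield from subterms(arg)

def congruence_closure(equalities, terms):
    """Naive congruence closure; returns a function mapping a term to its class."""
    parent = {t: t for t in terms}
    def find(t):
        while parent[t] != t:
            t = parent[t]
        return t
    for s, t in equalities:
        parent[find(s)] = find(t)
    apps = [t for t in terms if isinstance(t, tuple)]
    changed = True
    while changed:
        changed = False
        for s, t in product(apps, repeat=2):
            if (find(s) != find(t) and s[0] == t[0] and len(s) == len(t)
                    and all(find(x) == find(y) for x, y in zip(s[1:], t[1:]))):
                parent[find(s)] = find(t)
                changed = True
    return find

def theory_check(model):
    """Return None if the Boolean model is theory-consistent, else a conflict clause."""
    terms = {u for s, t in ATOMS for side in (s, t) for u in subterms(side)}
    find = congruence_closure([ATOMS[i] for i, v in enumerate(model) if v], terms)
    for i, v in enumerate(model):
        if not v and find(ATOMS[i][0]) == find(ATOMS[i][1]):
            # Naive conflict clause: forbid this exact Boolean model.
            return [-(j + 1) if w else (j + 1) for j, w in enumerate(model)]
    return None

def sat_models(clauses, n):
    """Brute-force stand-in for the SAT solver."""
    for bits in product([False, True], repeat=n):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            yield bits

def lazy_smt(clauses, n):
    clauses = [list(c) for c in clauses]
    while True:
        model = next(sat_models(clauses, n), None)
        if model is None:
            return "unsat"
        conflict = theory_check(model)
        if conflict is None:
            return "sat"
        clauses.append(conflict)  # the conflict clause guides the next SAT call
```

Calling `lazy_smt(CLAUSES, len(ATOMS))` reports the slide-8 formula as unsatisfiable: b = a and d = g(b) force d = g(a) by congruence, which contradicts d ≠ g(a) in every Boolean model.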

  13. First-order problem: b ≠ a ∧ f(a) = f(b) ∧ ∀x y. f(x) = f(y) ⇒ x = y, where the quantified conjunct requires instantiation.

  14. First-order problem: how to find an instance such that the problem is UNSAT? The formula b ≠ a ∧ f(a) = f(b) ∧ ∀x y. f(x) = f(y) ⇒ x = y is SAT for the ground solver. Instantiating with x ↦ a, y ↦ b gives f(a) ≠ f(b) ∨ a = b. Together with b ≠ a, this yields b ≠ a ∧ f(a) = f(b) ∧ f(a) ≠ f(b), which is UNSAT.
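
The search for a refuting instance can be sketched by brute force on this example. The snippet below is only an illustration under simplifying assumptions: the ground facts are stored as literal sets of equalities and disequalities (no congruence reasoning), and the names `EQUAL`, `DISTINCT`, and `conflicting_instances` are hypothetical. It looks for substitutions under which the ground facts make f(x) = f(y) true and x = y false, so that the corresponding instance contradicts the ground part.

```python
from itertools import product

# Ground facts from the slide: f(a) = f(b) is asserted, and a differs from b.
GROUND_TERMS = ["a", "b"]
EQUAL = {frozenset({"f(a)", "f(b)"})}
DISTINCT = {frozenset({"a", "b"})}

def holds_eq(s, t):
    """True if s = t is syntactically trivial or asserted."""
    return s == t or frozenset({s, t}) in EQUAL

def holds_neq(s, t):
    """True if s and t are asserted to be different."""
    return frozenset({s, t}) in DISTINCT

def conflicting_instances():
    """Substitutions for which the ground facts falsify f(x) = f(y) => x = y."""
    found = []
    for x, y in product(GROUND_TERMS, repeat=2):
        if holds_eq(f"f({x})", f"f({y})") and holds_neq(x, y):
            found.append({"x": x, "y": y})
    return found
```

On this input the search returns the substitution x ↦ a, y ↦ b used on the slide, plus its symmetric variant x ↦ b, y ↦ a.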

  20. First-order CDCL(T). Figure: inside the SMT solver, the ground solver and the instantiation module form a loop; the arrows are labelled "Instances" and "FO model".

  22. State of the art. Conflict-based instantiation: introduced by Reynolds, this technique produces relevant sets of instances. Given a ground model M and a quantified formula ∀(x̄ₙ : τ̄ₙ). ϕ, it searches for a substitution σ such that M ⊨ ¬ϕσ. Congruence Closure with Free Variables (CCFV): introduced by Barbosa et al., it generalizes conflict-based instantiation by reasoning over equivalence classes.

  23. State of the art. Enumerative instantiation: ∀(x : τ). ψ[x] ≡ ⋀ { ψ[t] : t ∈ D_τ }, that is, enumerate all ground terms over the domain of x (a.k.a. the Herbrand universe). Trigger-based instantiation: a trigger T for a quantified formula ∀x̄ₙ. ψ is a set of non-ground terms u₁, ..., uₙ ∈ T(ψ) such that {x̄ₙ} ⊆ FV(u₁) ∪ ... ∪ FV(uₙ). Example: E = {f(a) ≃ g(b), a ≃ g(b)}, Q = ∀x. f(g(x)) ≄ g(x), T = {f(g(x))}; then f(a) E-matches f(g(x)) under x ↦ b.
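
The E-matching example on this slide can be reproduced with a small sketch. It is not how veriT or any other solver implements E-matching: the equivalence classes come from a plain union-find over the two equalities in E (no congruence step is needed here), candidate terms are found by scanning a fixed term list, and the single pattern variable is hard-coded as `"x"`. All names are assumptions made for the example.

```python
# Ground terms as tuples (symbol, args...); constants are 1-tuples.
A, B = ("a",), ("b",)
FA, GB = ("f", A), ("g", B)
TERMS = [A, B, FA, GB]
EQUALITIES = [(FA, GB), (A, GB)]  # E = {f(a) = g(b), a = g(b)}

def classes(terms, equalities):
    """Union-find over the given equalities; returns the class lookup function."""
    parent = {t: t for t in terms}
    def find(t):
        while parent[t] != t:
            t = parent[t]
        return t
    for s, t in equalities:
        parent[find(s)] = find(t)
    return find

def ematch(pattern, term, find, subst):
    """Yield substitutions making `pattern` equal to `term` modulo the classes."""
    if pattern == "x":  # the only pattern variable in this sketch
        bound = subst.get("x")
        if bound is None:
            yield {**subst, "x": term}
        elif find(bound) == find(term):
            yield subst
        return
    # Try every known term in the class of `term` with the pattern's head symbol.
    for cand in TERMS:
        if (find(cand) == find(term) and cand[0] == pattern[0]
                and len(cand) == len(pattern)):
            yield from match_args(pattern[1:], cand[1:], find, subst)

def match_args(patterns, terms, find, subst):
    """Match argument lists left to right, threading the substitution."""
    if not patterns:
        yield subst
        return
    for s in ematch(patterns[0], terms[0], find, subst):
        yield from match_args(patterns[1:], terms[1:], find, s)

find = classes(TERMS, EQUALITIES)
matches = list(ematch(("f", ("g", "x")), FA, find, {}))
```

Here `matches` contains the single substitution x ↦ b. Purely syntactic matching would fail, since the argument of f(a) is not of the form g(_); the match goes through because a and g(b) are in the same class.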

  24. Strategy. Figure: instantiation strategy. CCFV is tried first; when it works, its instances go to the ground solver; when it fails, trigger-based and enumerative instantiation are used instead.

  25. Summary. Conflict-based instantiation and CCFV: (pro) efficient, since a substitution, once found, kills the model; (pro) all generated instances are useful; (con) only finds contradictions that involve a single instance. Enumerative and trigger-based instantiation: (pro) useful when CCFV fails; (con) relies on many heuristics; (con) generates many instances, a lot of them junk. This is what we want to improve!

  28. Problem. How many lemmas are generated to solve a problem? Around 300 for the UF category of SMT-LIB; some problems generate more than 100 000 instances. How many lemmas are needed to solve a problem? Only 10% of this number, and sometimes much less. Question: could we select the good ones?

  30. Our approach (ML-solver): an instance selection step sits between instantiation and the ground solver. Processing: instances are kept in a priority queue; instances are encoded; the predictor is called. Instance selection: several strategies for selection. Figure: the predictor assigns a rank to each of Inst 1, ..., Inst n; a filter passes the selected instances on and delays the others.
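
One possible selection strategy can be sketched with a priority queue. This is a hypothetical illustration, not the strategy used in veriT: the `predictor` callable, the `threshold`, and the `budget` parameter are all assumptions, and instances are treated as opaque values.

```python
import heapq

def select_instances(instances, predictor, threshold=0.5, budget=2):
    """Rank instances with `predictor`; return (selected, delayed) lists."""
    # heapq is a min-heap, so scores are negated to pop the best-ranked first.
    # The index breaks ties and avoids comparing instances directly.
    heap = [(-predictor(inst), idx, inst) for idx, inst in enumerate(instances)]
    heapq.heapify(heap)
    selected, delayed = [], []
    while heap:
        neg_score, _, inst = heapq.heappop(heap)
        if len(selected) < budget and -neg_score >= threshold:
            selected.append(inst)   # passed on to the ground solver
        else:
            delayed.append(inst)    # kept for a later round
    return selected, delayed

# Toy scores standing in for the output of a trained model.
SCORES = {"inst1": 0.9, "inst2": 0.2, "inst3": 0.7, "inst4": 0.6}
selected, delayed = select_instances(list(SCORES), SCORES.get)
```

With these toy scores, `selected` is `["inst1", "inst3"]` and `delayed` is `["inst4", "inst2"]`: inst4 passes the threshold but exceeds the budget, and inst2 falls below the threshold.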

  31. State description. A state is a triple (Model, Formula, Instances) = (l₁, ..., lₙ, ∀x̄ₙ. ψ[x̄ₙ], x₁ ↦ t₁, ..., xₙ ↦ tₙ). Figure: over the instantiation rounds, each row (model, Qformula i, Inst i) is encoded as a numeric row whose first entry is 0 or 1, followed by the entries x_i2, x_i3, ..., x_in.

  32. Experiments. Figure labels, in order: veriT; small proofs; pre-processing; data set; data balancing (over-sampling, under-sampling); features; training and feature importance with XGBoost; XGBoost classification and predictions; model; C code.
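
The data-balancing step named on this slide can be sketched as random over-sampling and under-sampling. The slide does not say which variant or library was used, so the following is only an assumption-laden illustration in plain Python; the function name `balance`, the `(features, label)` sample layout, and the 0/1 labels are hypothetical. Training the XGBoost classifier itself is left out.

```python
import random

def balance(samples, mode="over", seed=0):
    """Balance (features, label) pairs with labels 0/1 by random resampling."""
    rng = random.Random(seed)
    by_label = {0: [], 1: []}
    for features, label in samples:
        by_label[label].append((features, label))
    small, large = sorted(by_label.values(), key=len)
    if not small:
        raise ValueError("both classes need at least one sample")
    if mode == "over":
        # Duplicate random minority samples until the classes have equal size.
        small = small + [rng.choice(small) for _ in range(len(large) - len(small))]
    elif mode == "under":
        # Keep a random subset of the majority class.
        large = rng.sample(large, len(small))
    else:
        raise ValueError(f"unknown mode: {mode}")
    balanced = small + large
    rng.shuffle(balanced)
    return balanced

# Toy data: 8 useless instances (label 0) and 2 useful ones (label 1).
data = [([i], 0) for i in range(8)] + [([i], 1) for i in range(2)]
over = balance(data, mode="over")
under = balance(data, mode="under")
```

Over-sampling keeps every original sample and grows the data set to 16 rows, while under-sampling shrinks it to 4; which one suits a given data set depends on how much majority-class data can be discarded.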

  34. Time evaluation. Experiments run on UF SMT-LIB benchmarks with a 120 s timeout: veriT without learning solves 2923 problems; veriT with learning solves 2939.

  35. Evaluation on test + training set. Figure: comparison of veriT configurations on UF SMT-LIB benchmarks.

  36. Evaluation on test set only. Figure: comparison of veriT configurations on UF SMT-LIB benchmarks.
