13th International Satisfiability Modulo Theories Competition

13th International Satisfiability Modulo Theories Competition (SMT-COMP 2018). Matthias Heizmann, Aina Niemetz, Giles Reger, Tjark Weber. Outline: design and scope; main changes from last year's competition; short presentation of solvers.


  1. 13th International Satisfiability Modulo Theories Competition (SMT-COMP 2018)
     Matthias Heizmann, Aina Niemetz, Giles Reger, Tjark Weber

  2. Outline
     ◮ Design and scope
     ◮ Main changes from last year’s competition
     ◮ Short presentation of solvers: Alt-Ergo, Boolector, Ctrl-Ergo, CVC4, OpenSMT, SMTInterpol, SPASS-SATT, Yices
     ◮ Selected results

  3. Design and Scope

  4. Background
     SMT-COMP is an annual competition between SMT solvers. It was first held in 2005
     ◮ to spur adoption of the common, community-designed SMT-LIB format, and
     ◮ to spark further advances in SMT by stimulating improvement in solver implementations.
     It has evolved into the world’s largest∗ ATP competition.

  5–6. SMT-COMP – Procedure
     ◮ Benchmark users submit benchmarks to SMT-LIB, which is curated by Clark Barrett, Pascal Fontaine, and Cesare Tinelli.
     ◮ The SMT-LIB benchmarks are uploaded to StarExec, which is maintained by Aaron Stump.
     ◮ SMT solver developers upload their solvers to StarExec.
     ◮ StarExec produces the competition results.
     ◮ Names shown at the benchmark submission step for 2018: Martin Bromberger, Aman Goel, Makai Mann, Casey Mulligan, Mathias Preiner, Clifford Wolf.

  7–9. Main Track
     Main Track benchmark:
       (set-logic ... )
       (set-info ... )
       ...
       (declare-sort ... )
       (define-sort ... )
       (declare-fun ... )
       (define-fun ... )
       (assert term0)
       (assert term1)
       (assert term2)
       ...
       (check-sat)
       (exit)
     The body is any number of set-info, declare-sort, define-sort, declare-fun, define-fun, and assert commands, followed by one check-sat command.
     Solver output: sat / unsat. Timeout: 20 min.
     Scoring (multiplied by a weight that varies with the benchmark):
       n = 1 if the solver correctly responds sat or unsat
       e = 1 if the solver incorrectly responds sat or unsat
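The Main Track scoring rule can be illustrated with a minimal Python sketch. The function name `main_track_score`, the `weight` parameter, and the treatment of any other response (for example a timeout) as contributing nothing to either score are assumptions made for illustration; the slides do not say how the weight is computed.

```python
from typing import Optional, Tuple


def main_track_score(expected: str,
                     response: Optional[str],
                     weight: float = 1.0) -> Tuple[float, float]:
    """Return (e, n) for a single Main Track benchmark.

    expected: the benchmark's known status, "sat" or "unsat".
    response: what the solver printed, or None on timeout/crash.
    weight:   hypothetical per-benchmark weight (the slides only say
              that such a weight exists and varies with the benchmark).
    """
    if response not in ("sat", "unsat"):
        # Assumption: no sat/unsat answer means neither score changes.
        return (0.0, 0.0)
    if response == expected:
        return (0.0, weight)   # n = 1, scaled by the weight
    return (weight, 0.0)       # e = 1, scaled by the weight
```

For instance, a correct answer on a benchmark with weight 2.0 would yield `(0.0, 2.0)`, and a wrong answer on the same benchmark would yield `(2.0, 0.0)`.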

  10. Application Track
     Application track benchmarks may contain multiple check-sat commands, as well as push and pop commands.
     Application Track benchmark:
       (set-logic ... )
       ...
       (check-sat)
       ...
       (check-sat)
       ...
       (check-sat)
       ...
       (check-sat)
       (exit)
     The body is any number of set-info, declare-sort, define-sort, declare-fun, define-fun, assert, push, pop, and check-sat commands.
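The difference between the two benchmark shapes can be sketched in Python as a rough classifier. The function name `track_shape` and the regex-based scan are assumptions for illustration; a real tool would parse the SMT-LIB script properly, since this scan would also count a user-declared function that happens to be named push or pop.

```python
import re
from collections import Counter

# Matches "(check-sat", "(push" or "(pop" only when the command name ends
# there, so "(check-sat-assuming ...)" is not counted as check-sat.
COMMAND = re.compile(r"\(\s*(check-sat|push|pop)(?=[\s)])")


def track_shape(script: str) -> str:
    """Classify a script by the commands it contains (rough sketch)."""
    counts = Counter(COMMAND.findall(script))
    if counts["push"] or counts["pop"] or counts["check-sat"] > 1:
        return "application"   # incremental: several queries or push/pop
    if counts["check-sat"] == 1:
        return "main"          # exactly one query, no push/pop
    return "neither"
```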

  11–20. Application Track
     Application track benchmarks are fed to the solver incrementally by a trace executor.
     Application Track benchmark:
       (set-logic ... )
       ...
       (check-sat)
       ...
       (check-sat)
       ...
       (check-sat)
       ...
       (check-sat)
       (exit)
     Solver input: the trace executor first sends (set-option :print-success true), then the benchmark's commands up to and including the first (check-sat). Each further segment, up to the next (check-sat), is sent only after the solver has answered the previous one, and (exit) follows the last answer.
     Solver output: sat / unsat for each (check-sat). Timeout: 40 min.
     Scoring:
       n = # correct sat / unsat responses
       e = 1 if the solver gives an incorrect sat / unsat response
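The interaction between trace executor and solver, together with the scoring rule, can be sketched in Python. This is not the actual trace executor: `run_trace`, the `solver` callback, and the list of expected answers are hypothetical, and the sketch assumes that checking simply continues after an incorrect answer, which the slides do not specify.

```python
from typing import Callable, List, Tuple


def run_trace(commands: List[str],
              expected: List[str],
              solver: Callable[[str], str]) -> Tuple[int, int]:
    """Feed commands one at a time and return (e, n).

    commands: the benchmark's commands, in order.
    expected: the known status of each (check-sat), in order.
    solver:   hypothetical callback that takes one command and returns
              the solver's reply to it.
    """
    e, n = 0, 0
    answers = iter(expected)
    # The option is sent before the benchmark's own commands.
    solver("(set-option :print-success true)")
    for cmd in commands:
        # The next command is released only after this reply arrives.
        reply = solver(cmd)
        if cmd == "(check-sat)":
            want = next(answers)
            if reply == want:
                n += 1          # one more correct sat/unsat response
            elif reply in ("sat", "unsat"):
                e = 1           # an incorrect sat/unsat response
    return e, n
```

With a toy solver that answers sat, unsat, sat on a trace whose expected answers are sat, unsat, unsat, the sketch would return `(1, 2)`.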

  21–22. Unsat-Core Track
     Main Track benchmark (unsat)    Solver input
                                     (set-option :produce-unsat-cores true)
     (set-logic ... )                (set-logic ... )
     (set-info ... )                 (set-info ... )
     ...                             ...
     (declare-sort ... )             (declare-sort ... )
     (define-sort ... )              (define-sort ... )
     (declare-fun ... )              (declare-fun ... )
     (define-fun ... )               (define-fun ... )
     (assert term0)                  (assert (! term0 :named y0))
     (assert term1)                  (assert (! term1 :named y1))
     (assert term2)                  (assert (! term2 :named y2))
     ...                             ...
     (check-sat)                     (check-sat)
                                     (get-unsat-core)
     (exit)                          (exit)
     Solver output: unsat, followed by the unsat core, for example (y0 y2). Timeout: 40 min.
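The rewriting from a Main Track benchmark to Unsat-Core Track solver input can be sketched in Python. The function name `to_unsat_core_input` is hypothetical, and the sketch assumes the benchmark is already split into one command per string, with each assertion written as `(assert <term>)`; it is an illustration of the transformation shown on the slide, not the competition's own tooling.

```python
from typing import List


def to_unsat_core_input(commands: List[str]) -> List[str]:
    """Rewrite Main Track commands into Unsat-Core Track solver input."""
    prefix = "(assert "
    out = ["(set-option :produce-unsat-cores true)"]
    index = 0
    for cmd in commands:
        if cmd.startswith(prefix) and cmd.endswith(")"):
            # Strip "(assert " and the final ")" to recover the term,
            # then wrap it with a fresh name y0, y1, y2, ...
            term = cmd[len(prefix):-1]
            out.append(f"(assert (! {term} :named y{index}))")
            index += 1
        elif cmd == "(check-sat)":
            # Ask for the core right after the satisfiability check.
            out.extend(["(check-sat)", "(get-unsat-core)"])
        else:
            out.append(cmd)
    return out
```

Applied to a benchmark with the assertions `(assert p)` and `(assert (not p))`, the sketch would emit `(assert (! p :named y0))` and `(assert (! (not p) :named y1))`, and place `(get-unsat-core)` between `(check-sat)` and `(exit)`.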
