Sledgehammer Hell: The Day after Judgment



  1. Sledgehammer Hell: The Day after Judgment. Jasmin C. Blanchette, TU München

  2. Larry Paulson, Jia Meng, Kong Susanto, Claire Quigley, Markus Wenzel, Fabian Immler, Philipp Meyer, Sascha Böhme

  8. Sledgehammer [architecture diagram]
     Relevance filter
     ATP translation: E, SPASS, Vampire, SInE; each result reconstructed by Metis into a proof
     SMT translation: Z3, CVC3, Yices; each result reconstructed by Metis or SMT into a proof

  10. rev [a, b] = [b, a]
      by (metis Cons_eq_appendI eq_Nil_appendI rev.simps(2) rev_singleton_conv)

  14. [Pie charts]
      2010, 3 ATPs x 30 s: 46% / 54%
      2010, 3 ATPs x 30 s, nontrivial goals: 66% / 34%
      2011, (4 ATPs + 3 SMTs) x 30 s and (4 ATPs + 3 SMTs) x 30 s, nontrivial goals: 39%, 54%, 46%, 61%

  15. Issue #1: Too Many Facts; Issue #2: Encoding Overhead; Issue #3: Large Terms

  16. Issue #1: Too Many Facts

  21. Conjecture: … c … d … e …
      ✓ Lemma 1: … c … f …
      ✓ Lemma 2: … f … g …
      ✓ Lemma 3: … g … h …
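The conjecture-and-lemmas slides suggest an iterative, symbol-driven selection: a lemma becomes relevant once it shares a symbol with the conjecture or with a lemma already selected. The following is a minimal Python sketch of that idea only. The function name `select_relevant`, the `max_rounds` parameter, and the representation of facts as plain symbol sets are assumptions made for illustration; the sketch ignores any scoring, weighting, or cut-off that the actual Sledgehammer filter may apply.

```python
from typing import Dict, List, Set


def select_relevant(conjecture: Set[str],
                    facts: Dict[str, Set[str]],
                    max_rounds: int = 3) -> List[str]:
    """Iteratively pick facts that share a symbol with what is already relevant.

    Hypothetical helper: facts are modelled as bare sets of symbol names.
    """
    relevant = set(conjecture)   # symbols considered relevant so far
    selected: List[str] = []     # fact names, in order of selection
    for _ in range(max_rounds):
        # Facts not yet chosen that mention at least one relevant symbol.
        newly = [name for name, syms in facts.items()
                 if name not in selected and syms & relevant]
        if not newly:
            break
        selected.extend(newly)
        # Symbols of the new facts become relevant for the next round.
        for name in newly:
            relevant |= facts[name]
    return selected


# The example from the slides: c links the conjecture to Lemma 1,
# f links Lemma 1 to Lemma 2, and g links Lemma 2 to Lemma 3.
facts = {
    "Lemma 1": {"c", "f"},
    "Lemma 2": {"f", "g"},
    "Lemma 3": {"g", "h"},
}
print(select_relevant({"c", "d", "e"}, facts))
```

With a single round only Lemma 1 is picked; each further round pulls in one more lemma, which is how the check marks appear one by one on the slides.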

  31. The More Facts... [Plots of Success Rate (x-axis 0 to 1000), one each for E 1.2, SPASS 3.7, Vampire 1.0, and Z3 2.15]

  33. [Chart: E, SPASS, Vampire, Z3]

  35. How Effective is the Relevance Filter? [Chart: Uses, for the ranges 0-9, 100-109, 200-209, 300-309, 400-409, 490-499]

  44. Experiments
      Z3 Weights: +0.4 pp
      Z3 Triggers: +0.1 pp
      Z3 “Slicing”: +2.1 pp
      E Weights: +1.4 pp

  45. Issue #2: Encoding Overhead

  49. HOL to FOL
      Application Operator: suc(N) becomes app(suc, N)
      Type Information: suc(N) becomes ti(suc(ti(N, nat)), nat)
      Both combined: ti(app(ti(suc, fun(nat, nat)), ti(N, nat)), nat)
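The two encodings on the HOL to FOL slides (an explicit application operator `app` and type tags `ti`) can be combined mechanically. The sketch below reproduces the four strings shown for `suc(N)`. The `Const` and `App` classes and the `encode` function are hypothetical names introduced for illustration, and only unary applications with a constant head are handled in the variant without `app`; this is not the translation code used by Sledgehammer.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Const:
    name: str
    ty: str          # type written as a first-order term, e.g. "fun(nat, nat)"


@dataclass(frozen=True)
class App:
    fun: "Term"
    arg: "Term"
    ty: str          # type of the whole application


Term = Union[Const, App]


def encode(t: Term, explicit_app: bool, type_tags: bool) -> str:
    """Render a term as a first-order string (illustrative sketch)."""
    def tag(s: str, ty: str) -> str:
        # Wrap a subterm in ti(term, type) when type tags are enabled.
        return f"ti({s}, {ty})" if type_tags else s

    if isinstance(t, Const):
        return tag(t.name, t.ty)
    arg = encode(t.arg, explicit_app, type_tags)
    if explicit_app:
        # The function itself becomes an (optionally tagged) argument of app.
        fun = encode(t.fun, explicit_app, type_tags)
        body = f"app({fun}, {arg})"
    else:
        # Without app, the head stays a bare function symbol and is not tagged.
        if not isinstance(t.fun, Const):
            raise ValueError("sketch only supports constant heads without app")
        body = f"{t.fun.name}({arg})"
    return tag(body, t.ty)


suc_n = App(Const("suc", "fun(nat, nat)"), Const("N", "nat"), "nat")
for use_app in (False, True):
    for use_types in (False, True):
        print(encode(suc_n, use_app, use_types))
```

Each option adds wrapper symbols around every subterm, which matches the growth in problem size reported for the combined encoding.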

  51. Problem Size

      |          | No App. | App. |
      |----------|---------|------|
      | No Types | 32K     | 41K  |
      | Types    | 60K     | 107K |

      Solving Time (E 1.2)

      |          | No App. | App.   |
      |----------|---------|--------|
      | No Types | 1 s     | 15 s   |
      | Types    | 172 s   | ???? s |

  54. [Bar chart: E, SPASS, Vampire, Z3; y-axis 0% to 70%; groups Arrow, FFT, FTA, Hoare, Jinja, NS, QE, S2S, SN, All] Claim: Sorts Rock x (Types)

  55. [Same chart restricted to Arrow, Jinja, NS, SN] Claim: Sorts Rock x (Types)

  56. Issue #3: Large Terms

  61. [Three plots over 1 to 10 with curves for E, SPASS, Vampire, Z3]
      1 + … + 1 = (n::nat) [y-axis 0 s to 0.8 s]
      rev [x₁, …, xₙ] = [xₙ, …, x₁] [y-axis 0 s to 30.0 s]
      map f [x₁, …, xₙ] = [f x₁, …, f xₙ] [y-axis 0 s to 10.0 s]
      (but simp can solve all of these within 10 ms…)
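The three goal families on the Large Terms slides are parameterised by a size that the plots vary from 1 to 10. As a rough illustration, the Python sketch below writes out one goal per family for a given size. The function names and the variable naming scheme `x1`, `x2`, … are assumptions; the slides do not show how the benchmark goals were generated.

```python
def sum_goal(n: int) -> str:
    """1 + ... + 1 = (n::nat), with n summands."""
    return " + ".join(["1"] * n) + f" = ({n}::nat)"


def rev_goal(n: int) -> str:
    """rev [x1, ..., xn] = [xn, ..., x1]."""
    xs = [f"x{i}" for i in range(1, n + 1)]
    left = ", ".join(xs)
    right = ", ".join(reversed(xs))
    return f"rev [{left}] = [{right}]"


def map_goal(n: int) -> str:
    """map f [x1, ..., xn] = [f x1, ..., f xn]."""
    xs = [f"x{i}" for i in range(1, n + 1)]
    left = ", ".join(xs)
    right = ", ".join(f"f {x}" for x in xs)
    return f"map f [{left}] = [{right}]"


for n in (1, 2, 3):
    print(sum_goal(n))
    print(rev_goal(n))
    print(map_goal(n))
```

The terms grow only linearly with the size, so the steep timing curves on the slides come from the provers rather than from the size of the goals themselves.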

  65. Future Work
      Isabelle: Improve Relevance Filter; Lighten Translation; Provide Extralogical Info.
      ATP / SMT: Handle Large Axiom Bases; Support Types; Exploit Extralogical Info.

  66. Sledgehammer Hell: The Day after Judgment. Jasmin C. Blanchette, blanchette@in.tum.de
