Integration of general-purpose automated theorem provers in Lean

Gabriel Ebner, Formal Methods in Mathematics, 2020-01-08, Vrije Universiteit Amsterdam


  1. Integration of general-purpose automated theorem provers in Lean
     Gabriel Ebner
     Formal Methods in Mathematics, 2020-01-08
     Vrije Universiteit Amsterdam

  2. • Introduction
     • Premise selection
     • Applicative translation to FOL
     • Monomorphizing translation to HOL
     • Empirical results
     • Demonstration
     • Conclusion

  3. Hammers
     • “Magic button that proves all your theorems”
     • e.g. Sledgehammer for Isabelle/HOL
     • popular, also: HOLyHammer, CoqHammer, etc.
     • User-friendly integration of automated reasoning tools in proof assistants

  4. General idea
       example (x y z : nat) : x.gcd y ∣ (x * z).gcd y :=
       by hammer
     General purpose: should work for anything, no setup

  5. Typical setup
     1. Find already proven lemmas that look “useful” (“premise selection”, “relevance filter”)
     2. Pass lemmas and goal to an efficient external prover (e.g. Vampire, E, etc.)
        • Requires encoding into the logic of the prover
     3. Import the generated proof
        • Popular strategy: mine names of used lemmas, and reconstruct using a slow prover

  6. Introduction • Premise selection • Applicative translation to FOL • Monomorphizing translation to HOL • Empirical results • Demonstration • Conclusion

  7. Features
     (Based on the approach in CoqHammer (Czajka, Kaliszyk 2018))
     Assign to every lemma a set of features based on its type:
     • Every constant c that occurs in the type
     • The pair (f, g) for every subterm f a_1 ... (g ...) ... a_n
     Ignore:
     • eq, and, …
     • Type classes, and type class instance arguments.
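
The feature set described on this slide can be illustrated with a small Python sketch. It is not the talk's implementation (which lives in Lean and C++); the term representation (`Const`, `Var`, `App`), the `IGNORED` set, and the constant names in the example are all assumptions made for illustration. The sketch also assumes that type class arguments have already been stripped from the term, and it reads "(g ...)" as "an argument that is itself an application with head g".

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class App:
    head: "Term"
    args: Tuple["Term", ...]


Term = Union[Const, Var, App]

# Hypothetical ignore list; the slide only says "eq, and, ...".
IGNORED = frozenset({"eq", "and"})


def head_const(t):
    """Name of the head constant of t, or None if the head is a variable."""
    if isinstance(t, Const):
        return t.name
    if isinstance(t, App):
        return head_const(t.head)
    return None


def features(t):
    """Collect constants c and pairs (f, g) for subterms f a_1 ... (g ...) ... a_n."""
    feats = set()

    def visit(u):
        if isinstance(u, Const):
            if u.name not in IGNORED:
                feats.add(u.name)
        elif isinstance(u, App):
            visit(u.head)
            f = head_const(u.head)
            for arg in u.args:
                visit(arg)
                g = head_const(arg) if isinstance(arg, App) else None
                if f and g and f not in IGNORED and g not in IGNORED:
                    feats.add((f, g))

    visit(t)
    return feats


# Simplified shape of the goal  x.gcd y ∣ (x * z).gcd y  (names are illustrative).
x, y, z = Var("x"), Var("y"), Var("z")
gcd, mul, dvd = Const("gcd"), Const("mul"), Const("dvd")
goal = App(dvd, (App(gcd, (x, y)), App(gcd, (App(mul, (x, z)), y))))
print(sorted(features(goal), key=str))
```

For the example goal this yields the constants `dvd`, `gcd`, `mul` and the pairs `(dvd, gcd)` and `(gcd, mul)`.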

  8. Implementation
     • Cosine similarity with TF-IDF (term frequency-inverse document frequency)
     • Common way to calculate similarity between documents (= sequence/set of words) with lots of variations.
     • Here: document = lemma, word = feature.
     1. Assign to every lemma the characteristic function of its feature set, a vector in $\mathbb{R}^{|F|}$
     2. Scale each coordinate by how rarely it occurs globally
     3. Compute similarity of $a$ and $b$ as $\frac{a \cdot b}{\lVert a \rVert \lVert b \rVert}$
     • Implemented in C++ (for performance reasons)
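
The three steps above can be sketched in a few lines of Python. This is only an illustration, not the C++ relevance filter from the talk: the IDF formula `log(N / df)` is one common variant chosen here as an assumption, and the lemma names and feature sets in `library` are made up.

```python
import math
from collections import Counter


def idf_weights(library):
    """Step 2: weight each feature by log(N / df), so rare features count more."""
    n = len(library)
    df = Counter(f for feats in library.values() for f in feats)
    return {f: math.log(n / count) for f, count in df.items()}


def tfidf_vector(feats, idf):
    """Steps 1 and 2: sparse characteristic vector of `feats`, scaled by IDF."""
    return {f: idf[f] for f in feats if f in idf}


def cosine(a, b):
    """Step 3: a . b / (|a| |b|), with 0.0 for zero vectors."""
    dot = sum(w * b[f] for f, w in a.items() if f in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_premises(goal_feats, library, k):
    """Return the names of the k lemmas most similar to the goal."""
    idf = idf_weights(library)
    goal_vec = tfidf_vector(goal_feats, idf)
    scored = sorted(
        ((cosine(goal_vec, tfidf_vector(feats, idf)), name)
         for name, feats in library.items()),
        key=lambda pair: (-pair[0], pair[1]),
    )
    return [name for _, name in scored[:k]]


# Hypothetical toy library: lemma name -> feature set.
library = {
    "lemma_a": {"gcd", "dvd"},
    "lemma_b": {"mul", "add"},
    "lemma_c": {"gcd", "mul"},
    "lemma_d": {"le", "succ"},
}
goal = {"gcd", "dvd", "mul"}
print(select_premises(goal, library, 2))  # ['lemma_a', 'lemma_c']
```

In this toy library, `dvd` occurs in only one lemma and therefore outweighs `gcd` and `mul`, which is why `lemma_a` ranks above `lemma_c`.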

  9. Issue: type classes
       theorem le_of_lt {α} [preorder α] {a b : α} : a < b → a ≤ b := sorry
       theorem le_of_lt' {a b : nat} : a < b → a ≤ b := sorry
       example (a b : nat) : ¬ a < b ∨ a ≤ b := by hammer
     • Should find le_of_lt because it talks about the preorder nat, even though the name preorder does not occur in the goal.
     • Should not prefer le_of_lt' either.

  10. Introduction • Premise selection • Applicative translation to FOL • Monomorphizing translation to HOL • Empirical results • Demonstration • Conclusion

  11. Applicative translation
      Translation to single-sorted first-order logic, like CoqHammer:
      • Binary function a(x, y) for application x y
      • Predicate p(x): (proposition) x is inhabited
      • Relation t(x, y): x has type y
      • Equality is translated as equality.
      • Constant s means Type u.
      For each constant to be exported, we write one formula expressing its type.

  12. Example translation
      Lean:
        theorem nat.le_succ : ∀ (n : nat), @has_le.le.{0} nat nat.has_le n (nat.succ n)
      First-order formula:
        ∀ n, t(n, nat) → p(a(a(a(a(has_le.le, nat), nat.has_le), n), a(nat.succ, n)))
      TPTP output:
        fof(cnat_o_le__succ, axiom, (![Xn_n3]: (t(Xn_n3, cnat) => p(a(a(a(a(chas__le_o_le, cnat), cnat_o_has__le), Xn_n3), a(cnat_o_succ, Xn_n3)))))).
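
As a rough illustration of how such a formula could be assembled, the following Python sketch rebuilds the TPTP line for nat.le_succ. The name mangling in `mangle` (prefix `c`, `_` doubled, `.` replaced by `_o_`) is inferred only from the identifiers visible on this slide and may not match the real encoder in all cases; the helper names `mangle`, `app`, and `type_axiom` are hypothetical, and the variable is simply called `Xn` rather than `Xn_n3`.

```python
def mangle(name):
    """Mangle a Lean constant name as seen in the slide's TPTP output.

    Assumption inferred from the example: prefix 'c', '_' -> '__', '.' -> '_o_'.
    Underscores must be doubled before dots are replaced.
    """
    return "c" + name.replace("_", "__").replace(".", "_o_")


def app(f, *args):
    """Left-nested binary application: app(f, x, y) is a(a(f, x), y)."""
    term = f
    for arg in args:
        term = f"a({term}, {arg})"
    return term


def type_axiom(name, var, var_type, prop):
    """fof axiom stating: for all var of type var_type, prop is inhabited."""
    return (
        f"fof({mangle(name)}, axiom, "
        f"(![{var}]: (t({var}, {var_type}) => p({prop}))))."
    )


n = "Xn"  # simplified variable name; the real output uses Xn_n3
le_succ = app(
    mangle("has_le.le"),
    mangle("nat"),
    mangle("nat.has_le"),
    n,
    app(mangle("nat.succ"), n),
)
print(type_axiom("nat.le_succ", n, mangle("nat"), le_succ))
```

Apart from the variable name, the printed line matches the `fof(...)` formula shown on the slide.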

  14. Unsoundness
      Translation is unsound (= does not preserve unprovability). Two main reasons:
      1. Definitional equality and propositional equality are identified.
      2. Type u and Type (u+1) are identified.
      → “spurious” proofs

  15. Type class coherence
      We often need to show that two type class instances are equal. E.g. if you want to apply le_refl to natural numbers:
        p(a(a(a(a(chas__le_o_le, X_ga_n2), a(a(cpreorder_o_to__has__le, X_ga_n2), X__inst__1_n3)), Xa_n4), Xa_n4))
      vs.
        p(a(a(a(a(chas__le_o_le, cnat), cnat_o_has__le), Xx_n18), Xx_n18))
      → Heuristically add extra equations relating type class instances.

  16. Introduction • Premise selection • Applicative translation to FOL • Monomorphizing translation to HOL • Empirical results • Demonstration • Conclusion

  17. Simply-typed higher-order logic
      Types:
      • Booleans
      • Base types: nat, list nat, etc.
      • Function types: τ₁ → τ₂
      Terms (formulas are terms of Boolean type):
      • Constants: nat.add, etc.
      • Application: t s
      • Lambdas: λx t
      • Variable: x
      (We use closed Lean expressions as names for constants and base types.)

  18. Two phases
      Lean → (abstraction) → HOL → (type instantiation) → HOL
      • sound translation
      • enables provers to do non-first-order reasoning
        • synthesize lambdas
        • induction
        • built-in support for ℕ, ℤ, ℝ, ...
      • solves type class coherence issue
      • mitigates issue with type classes in relevance filter

  19. Abstraction
      • Replace non-HOL subterms by HOL constants.
        • dependent applications
        • pi types
        • types like list nat
        • ...
      • Instance-implicit arguments are also included in the constants.
      Turn
        ∀ {α : Type u} [preorder α] (a : α), a ≤ a
      into
        ∀ a : ‘?m_1’, ‘@has_le.le ?m_1 ?m_2.to_has_le’ a a

  20. Type instantiation
      Turn
        ∀ a : ‘?m_1’, ‘@has_le.le ?m_1 ?m_2.to_has_le’ a a
      into
        ∀ a : ‘nat’, ‘@has_le.le nat nat.preorder.to_has_le’ a a
      • Unify the constants in the HOL terms
        • @has_le.le ?m_1 ?m_2.to_has_le occurs in lemma
        • @has_le.le nat nat.has_le occurs in goal
        → Instantiate lemma by unifying ?m_1 =?= nat and ?m_2.to_has_le =?= nat.has_le.
      • Also solves additional type-class constraints. E.g. a lemma about Archimedean fields might have an assumption archimedean α which does not occur in any constant.

  21. Limitations
      • dependent families: ∀ i, fin i is translated to a base type
      • equality between types: m = n → zmod m = zmod n
        → Bundle the non-type arguments? That is, translate to Σ n, zmod n.
      • proof arguments:
          @roption.get : ∀ α, ∀ o : roption α, o.dom → α
        • Just elide them? (Only affects nonemptiness of α here.)

  22. Introduction • Premise selection • Applicative translation to FOL • Monomorphizing translation to HOL • Empirical results • Demonstration • Conclusion

  23. Implementation
      • Basic relevance filter (in C++)
      • Applicative first-order encoding
        • interfaces with Vampire
      • HOL encoding
        • interfaces with E-HO
      • Proof reconstruction using super
        • Small superposition prover written in (meta-)Lean

  24. Experiment setup
      • 31112 theorems in mathlib + core (everything that’s a declaration.thm)
      • Try tactic at the same point in the file as the theorem.
        • Applicative translation with 10 selected lemmas
        • Monomorphizing translation with 10/100 selected lemmas
        • super with 10 selected lemmas
        • library_search
        • simp
        • refl
      • Time limit of 30s for external provers + try_for 100000
        • longest total runtime is 125s

  25. Success rate
      [Figure: bar chart of the hammer success rate (success_hammer) in % of non-refl theorems (axis 0 to 40), per directory: init, logic, tactic, algebra, order, group_theory, geometry, data, field_theory, ring_theory, category_theory, set_theory, topology, category, linear_algebra, measure_theory, computability, analysis, number_theory.]

  26. Success rate, compared
      [Figure: bar chart in % of non-refl theorems (axis 0 to 40), per directory, comparing the methods super, simp, library_search, and hammer.]

  27. Unique successes (i.e., not by library_search or simp; incl. super)
      [Figure: bar chart of unique_success in % of non-refl theorems (axis 0 to 35), per directory.]

  28. Effect of monomorphization
      [Figure: bar chart in % of non-refl theorems (axis 0 to 35), per directory, with monomorphization True vs. False.]

  29. Robustness of reconstruction
      [Figure: bar chart of additional success in % (axis 0 to 100), per directory; series extra_success_if_reconstruction_always_worked.]

  30. Lots of room for improvements: lemma selection
      [Figure: bar chart in % of non-refl theorems (axis 0 to 50), per directory, with lemmas_extracted_from_proof True vs. False.]

  31. Introduction • Premise selection • Applicative translation to FOL • Monomorphizing translation to HOL • Empirical results • Demonstration • Conclusion
