Learning theorem proving through self-play Stanisław Purgał

The goal
Learn to prove theorems without:
• any proofs
• any theorems
What we get:
• a list of axioms defining the logic

Overview
• AlphaZero (briefly)
• Proving game
• adjusting MCTS for proving game
• some results

Neural black box
Input: game state $S$
Outputs: move policy $\pi \in \mathbb{R}^n$ and expected outcome $v \in \mathbb{R}$

Neural black box
Training samples: $(S_1, \pi_1, v_1), \ldots, (S_n, \pi_n, v_n)$

Monte-Carlo Tree Search
Input: game state $S$
Outputs: move policy $\pi \in \mathbb{R}^n$ and expected outcome $v \in \mathbb{R}$

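Monte-Carlo Tree Search
From a state $S$ with children $S_1, S_2, S_3$, choose a child according to the formula:
$$c \cdot \frac{\sqrt{n}}{n_i} \cdot \pi_i + v_i, \qquad c = \log\left(\frac{n + c_{base} + 1}{c_{base}}\right) + c_{init}$$
The value $v$ is a weighted average; $c_{base} = 19652$ and $c_{init} = 1.25$.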
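The selection rule on this slide can be sketched in Python as follows. This is a minimal illustration, not the author's implementation: the names `Child`, `exploration_rate`, `puct_score` and `select_child` are hypothetical, and the denominator uses `1 + n_i` (as in the published AlphaZero pseudocode) so that unvisited children get a finite score, which is an assumption about the intended formula.

```python
import math
from dataclasses import dataclass

C_BASE = 19652   # c_base from the slide
C_INIT = 1.25    # c_init from the slide


@dataclass
class Child:
    prior: float   # pi_i, the policy probability assigned by the network
    visits: int    # n_i, how often this child has been visited
    value: float   # v_i, the (weighted average) value of this child


def exploration_rate(parent_visits: int) -> float:
    # c = log((n + c_base + 1) / c_base) + c_init
    return math.log((parent_visits + C_BASE + 1) / C_BASE) + C_INIT


def puct_score(child: Child, parent_visits: int) -> float:
    c = exploration_rate(parent_visits)
    # Assumption: 1 + n_i in the denominator to avoid division by zero.
    explore = c * child.prior * math.sqrt(parent_visits) / (1 + child.visits)
    return explore + child.value


def select_child(children: list, parent_visits: int) -> int:
    # Return the index of the child with the highest score.
    return max(range(len(children)),
               key=lambda i: puct_score(children[i], parent_visits))
```

With a well-explored child and an unvisited one, the exploration term dominates and the unvisited child is selected, which is the behaviour the formula is meant to produce early in the search.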

Closing the loop
• play lots of games
• choose moves randomly, according to MCTS policy
• use finished games for training:
  • target value is the result of the game
  • target policy is the MCTS policy
• also add noise to neural network output to increase exploration
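A rough sketch of how finished games could be turned into training samples is given below. It assumes the MCTS policy is the normalised visit count of the root's children and that the game result is stored from the point of view of player `+1`; both are assumptions, as are the function names. The exploration noise mentioned in the last bullet is left out.

```python
import random


def mcts_policy(visit_counts):
    # Assumption: the MCTS policy is proportional to root visit counts.
    total = sum(visit_counts)
    return [n / total for n in visit_counts]


def sample_move(policy, rng):
    # Choose a move at random, weighted by the MCTS policy.
    return rng.choices(range(len(policy)), weights=policy, k=1)[0]


def training_samples(trajectory, result):
    # trajectory: list of (state, policy, player), with player in {+1, -1}.
    # result: outcome of the finished game as seen by player +1.
    # The target value for each state is the result as seen by the
    # player who was to move in that state.
    return [(state, policy, result * player)
            for state, policy, player in trajectory]
```

For example, `mcts_policy([3, 1, 0])` gives `[0.75, 0.25, 0.0]`, and a move with zero visits is never sampled.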

Proving game
[Diagram: a theorem is given; the "Prove the theorem" phase ends in either win or lose]

Proving game
[Diagram: a "Construct a theorem" phase followed by a "Prove the theorem" phase; the possible outcomes are "Adversary wins" and "Prover wins"]

Prolog-like proving
Inference rule:
$$\frac{A \vdash X \qquad A \vdash Y}{A \vdash X \land Y}$$
The same rule as a Prolog clause:
holds(A, and(X, Y)) :- holds(A, X), holds(A, Y)

Prolog-like proving
[X:A ⊢ X ∧ ¬¬X, ...]
A ⊢ X ∧ Y :- A ⊢ X, A ⊢ Y
X:A ⊢ X ∧ ¬¬X :- X:A ⊢ X, X:A ⊢ ¬¬X
[X:A ⊢ X, X:A ⊢ ¬¬X, ...]

Prolog-like proving
[holds(X:A, and(X, not(not(X)))), ...]
holds(A, and(X, Y)) :- holds(A, X), holds(A, Y)
holds(X:A, and(X, not(not(X)))) :- holds(X:A, X), holds(X:A, not(not(X)))
[holds(X:A, X), holds(X:A, not(not(X))), ...]
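The proof step on this slide (unify the first goal with a rule head, then replace the goal by the instantiated rule body) can be sketched as below. This is an illustrative toy, not the author's prover: terms are nested tuples, strings starting with an uppercase letter are treated as variables, the context `X:A` is encoded with a hypothetical `cons` functor, the rule's variables are renamed apart by hand (`A1`, `X1`, `Y1`), and there is no occurs check.

```python
def is_var(t):
    # Convention assumed here: variables are capitalised strings.
    return isinstance(t, str) and t[:1].isupper()


def walk(t, subst):
    # Follow variable bindings until an unbound variable or a term.
    while is_var(t) and t in subst:
        t = subst[t]
    return t


def unify(a, b, subst):
    # Return an extended substitution, or None if a and b do not unify.
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return {**subst, a: b}
    if is_var(b):
        return {**subst, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None


def resolve(t, subst):
    # Apply the substitution throughout a term.
    t = walk(t, subst)
    if isinstance(t, tuple):
        return tuple(resolve(x, subst) for x in t)
    return t


def step(goals, rule):
    # Replace the first goal by the rule body if the head unifies with it.
    head, body = rule
    subst = unify(head, goals[0], {})
    if subst is None:
        return None
    return [resolve(g, subst) for g in body + goals[1:]]


# holds(A1, and(X1, Y1)) :- holds(A1, X1), holds(A1, Y1)
RULE = (("holds", "A1", ("and", "X1", "Y1")),
        [("holds", "A1", "X1"), ("holds", "A1", "Y1")])
# holds(X:A, and(X, not(not(X))))
GOAL = ("holds", ("cons", "X", "A"), ("and", "X", ("not", ("not", "X"))))
```

Calling `step([GOAL], RULE)` yields the two goals shown on the slide, `holds(X:A, X)` and `holds(X:A, not(not(X)))`, in the tuple encoding.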

Prolog-like theorem constructing
[holds(X:A, and(X, not(not(X)))), ...]
holds(X:A, and(X, not(not(X)))) :- holds(X:A, X), holds(X:A, not(not(X)))
holds(A, and(X, Y)) :- holds(A, X), holds(A, Y)
[holds(X:A, X), holds(X:A, not(not(X))), ...]
bad idea

Prolog-like theorem constructing
[holds(A, ♣), ...]
holds(A, ♣) :- holds(A, or(♦, ♥)), holds(A, implies(♦, ♣)), holds(A, implies(♥, ♣))
holds(A, Z) :- holds(A, or(X, Y)), holds(A, implies(X, Z)), holds(A, implies(Y, Z))
[�, �, �, ...]
bad idea

Prolog-like theorem constructing
[T]
holds(A, and(X, Y)) :- holds(A, X), holds(A, Y)
holds(A, and(X, Y)) :- holds(A, X), holds(A, Y)
[holds(A, X), holds(A, Y)]

Prolog-like theorem constructing
T
holds(X:A, and(X, not(not(X))))
holds(x:a, and(x, not(not(x))))

Forcing termination of the game
Step limit:
• ugly extension of game state
• strategy may depend on number of steps left
• even if we hide it, there is a correlation: large term constructed ∼ few steps left ∼ will likely lose

Forcing termination of the game
Sudden death chance:
• game states nicely equal
• no hard limit for length of a theorem
During training playout, randomly terminate game with chance $p_d$.
In MCTS, adjust value: $v' = (-1) \cdot p_d + v \cdot (1 - p_d)$.
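The value adjustment is a simple expectation: with probability $p_d$ the game ends now with value −1, otherwise the original value $v$ stands. A minimal sketch follows; the function names are hypothetical.

```python
import random


def sudden_death_value(v: float, p_d: float) -> float:
    # v' = (-1) * p_d + v * (1 - p_d)
    return (-1.0) * p_d + v * (1.0 - p_d)


def dies_now(p_d: float, rng: random.Random) -> bool:
    # During a training playout, terminate the game with chance p_d.
    return rng.random() < p_d
```

For instance, a certain win (`v = 1.0`) with `p_d = 0.1` is worth about `0.8` after the adjustment, while a certain loss (`v = -1.0`) stays at `-1.0` for any `p_d`.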

Disadvantages of this game
• two different players - if one player starts winning every game, we can't learn much
• proofs use single inference steps - inefficient
• players don't take turns - MCTS is not designed for that situation

Not using maximum

Certainty propagation
recursively:
$v = \min(u, \max(l, a))$
$a = \frac{\hat{v} + \sum_i v_i \cdot n_i}{n + 1}$
$l = \max_i l_i$
$u = \max_i u_i$
for uncertain leaves: $v = \hat{v}$, $a = \hat{v}$, $l = -1$, $u = 1$
for certain leaves: $v = a = l = u = \text{result}$
Here $\hat{v}$ is the value estimate from the neural black box.
when player changes:
• values and bounds flip
• lower and upper bound switch places
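The update rules above can be sketched in Python as follows. This is an interpretation under stated assumptions, not the author's code: `Stats`, `flip` and `backup` are hypothetical names, $n$ is taken to be the sum of the children's visit counts, each leaf counts as one visit, and the children passed to `backup` are assumed to be already expressed from the parent's point of view (apply `flip` first when the player changes).

```python
from dataclasses import dataclass


@dataclass
class Stats:
    v: float    # value used by the search
    a: float    # average value
    l: float    # lower bound on the true outcome
    u: float    # upper bound on the true outcome
    n: int = 1  # visit count (assumption: a leaf counts as one visit)


def uncertain_leaf(v_hat: float) -> Stats:
    # Network estimate only: nothing is known for certain yet.
    return Stats(v=v_hat, a=v_hat, l=-1.0, u=1.0)


def certain_leaf(result: float) -> Stats:
    # Finished game: value and both bounds equal the result.
    return Stats(v=result, a=result, l=result, u=result)


def flip(s: Stats) -> Stats:
    # When the player changes, values and bounds change sign and
    # the lower and upper bound switch places.
    return Stats(v=-s.v, a=-s.a, l=-s.u, u=-s.l, n=s.n)


def backup(v_hat: float, children: list) -> Stats:
    # Assumption: n is the total visit count of the children.
    n = sum(c.n for c in children)
    a = (v_hat + sum(c.v * c.n for c in children)) / (n + 1)
    l = max(c.l for c in children)
    u = max(c.u for c in children)
    v = min(u, max(l, a))
    return Stats(v=v, a=a, l=l, u=u, n=n + 1)
```

If every child is a certain loss, both bounds of the parent collapse to −1 and so does `v`, regardless of the network estimate; if one child is a certain win, the lower bound rises to 1 and `v` becomes 1.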

Learning the proving game
Like AlphaZero, with a few differences:
• using a Transformer (encoder) for the neural black box
• for theorems that the prover failed to prove, show the proper path with additional training samples
• during evaluation, greedy policy and step limit instead of sudden death
• balance training batches to have an even split of won and lost games

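Proving game evaluation
[Diagram: the proving game with an evaluation theorem supplied in place of the "Construct a theorem" phase; the "Prove the theorem" phase and the outcomes "Adversary wins" and "Prover wins" are unchanged]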

Potential problems
Players are not symmetrical:
• Prover could be winning everything
• Adversary could be winning everything
To some extent this is handled by additional training samples.
Can be solved by more exploration.

Uninteresting space of hard theorems
∃x f(x) = y (where f is a one-way function)
• easy to prove if you can choose what y is
• hard to prove if y is fixed
So hard that we can't expect the prover to learn it.
This is stable - more learning and/or exploration won't help.

Results (intuitionistic first-order - sequent calculus)
[Plot: number of solved theorems (axis 0 to 20) against time in hours (axis 0 to 25)]

Results
Solved:
⊢ (∀a ∀b p_c(f_c(a, b)) → ∃d ∃e p_c(f_c(d, e)))
⊢ (¬(p_a(∅) → p_b(∅)) → (p_b(∅) → p_a(∅)))
Unsolved:
⊢ (∃a p_b(a) → ∃c p_b(c))

Results (intuitionistic first-order - sequent calculus)
[Plot: share of games (0% to 100%) ending in "construction failed", "proven" and "not proven", against time (axis 5 to 20)]

Results (intuitionistic first-order - sequent calculus)
unproven theorems - first hour:
A, ⊥ ⊢ C
⊢ (⊥ → B)
(A → B), A ⊢ B
A, B, C, D, E, F, G, H ⊢ H
A, B, C, D, E, F, G, H, I, J, K, L, M ⊢ M
A, B, C, D, E, F, G, H, I ⊢ I

Results (intuitionistic first-order - sequent calculus)
unproven theorems - second hour:
∀a Ωa C ⊢ Ωa C
⊢ (B ∨ (¬⊥ ∨ C))
(A ∧ Ωc Ωe F) ⊢ ∃e Ωc Ωe F
(A ∧ B) ⊢ (D → B)
(A ∧ B) ⊢ (D ∨ A)
⊢ ((B ∧ (C ∧ D)) → C)

Results (intuitionistic first-order - sequent calculus)
unproven theorems - third hour:
∀a (Ωc Ωa E ∧ Ωg (Ωa J ⋆ Ωa L)) ⊢ Ωg (Ωa J ⋆ Ωa L)
A, B, C, D, E, F, G, ((H ∧ ⊥) ∧ I) ⊢ ¬K
A, B, C, D, E, F, G, H, ⊥ ⊢ (J ∨ K)
A, ¬B, C, (D ∧ B) ⊢ (F ∨ G)
∀a (p_b(f_c(f_d(a, ∅), ∅)) ∧ ⊥), ¬¬E ⊢ ∃g Ωg I
A, B, ¬C, D, E, (C ∧ F) ⊢ (H ↔ ¬⊥)

Results (intuitionistic first-order - sequent calculus)
unproven theorems - twelfth hour:
A, B, (∀c Ωe (Ωc H ⋆ ¬¬¬¬Ωj Ωl ¬(¬⊥ ⋆ (¬¬(⊥ ⋆ Ωc Q) ⋆ ¬¬Ωc S))) ↔ A) ⊢ Ωe (Ωc H ⋆ ¬¬¬¬Ωj Ωl ¬(¬⊥ ⋆ (¬¬(⊥ ⋆ Ωc Q) ⋆ ¬¬Ωc S)))
A, B, (∀c X ↔ A) ⊢ X

How to do better
• train longer and/or harder
  costly
• relegate low-level reasoning to some more efficient solver
  need to invent some other mechanism for generating theorems
• allow use of theorems, not only axioms
  action space becomes large and changing over time
all of the above still face the uninteresting theorem space
• use some other objective
  would be nice to find theorems that are useful in proving other theorems, but how exactly would that work?

Thank you for your attention! Stanisław Purgał
