Useful Lemmas in E ATP Proofs
Zarathustra Goertzel and Josef Urban
Czech Technical University in Prague
AITP’19
Outline of talk
● What are lemmas and why do they matter?
● Quantifying lemma usefulness.
● Machine learning to identify lemmas.
● Conclusion.
Lemmas
Lemmas are:
● True statements
● Intermediate results
● Sometimes used in multiple theorems
Why seek lemmas?
● ATPs struggle to find long proofs.
● Conjecturing new (interesting) results.
● Concise presentations of proofs.
Lemmas as Cuts
Given axiom set Γ and conjecture C, we want to prove Γ ⊢ C.
We call L a lemma if the following holds: Γ ⊢ L and Γ ∪ {L} ⊢ C.
* This doesn’t require that L be a “useful lemma”.
Lemmas via Excluded Middle
E is a refutational theorem prover and tries to derive a contradiction: Γ ∪ {¬C} ⊢ ⊥.
By excluded middle (L ∨ ¬L), the problem can therefore be broken into two sub-problems:
● Γ ∪ {¬C, L} ⊢ ⊥ (prove the conjecture using the lemma)
● Γ ∪ {¬C, ¬L} ⊢ ⊥ (prove the lemma)
Lemma Usefulness: Proof Shortening Ratio
The proof shortening ratio psr(L, Γ, C) compares the combined effort E needs to solve the two sub-problems with the effort needed for the original problem.
If the two sub-problems can be solved (by E) with psr(L, Γ, C) < 1, L can be said to be a useful lemma.
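A minimal sketch of how psr could be computed, assuming a hypothetical run_e wrapper that runs E on a list of formula strings and returns the proof-search time; the E flags and output parsing shown are illustrative, and the paper’s exact effort measure may differ:

```python
import subprocess
import tempfile
import time

def run_e(formulas, timeout=60):
    """Run E on a list of CNF formula strings and return the wall-clock
    proof-search time, or None if no proof was found within the limit.
    Hypothetical wrapper: flags and output parsing are illustrative."""
    with tempfile.NamedTemporaryFile("w", suffix=".p", delete=False) as f:
        f.write("\n".join(formulas))
        path = f.name
    start = time.time()
    result = subprocess.run(
        ["eprover", "--auto", f"--cpu-limit={timeout}", path],
        capture_output=True, text=True)
    elapsed = time.time() - start
    return elapsed if "# Proof found!" in result.stdout else None

def psr(gamma, neg_c, lemma, neg_lemma):
    """Proof shortening ratio: combined effort of the two sub-problems
    divided by the effort of the original problem Γ ∪ {¬C} ⊢ ⊥."""
    base = run_e(gamma + [neg_c])
    sub1 = run_e(gamma + [neg_c, lemma])      # Γ ∪ {¬C, L} ⊢ ⊥
    sub2 = run_e(gamma + [neg_c, neg_lemma])  # Γ ∪ {¬C, ¬L} ⊢ ⊥
    if None in (base, sub1, sub2):
        return None  # some problem was not solved within the limit
    return (sub1 + sub2) / base
```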
Dataset: Built From E Proofs
● E is a saturation-based refutational ATP.
● Goal: prove the conjecture from the premises.
● E maintains two sets of clauses:
  ● Processed clauses P (initially empty)
  ● Unprocessed clauses U (negated conjecture and premises)
● Given-clause loop (sketched below):
  ● Select a ‘given clause’ g to add to P.
  ● Apply inference rules to g and all clauses in P.
  ● Process the new clauses; add non-trivial and non-redundant ones to U.
● Proof search succeeds when the empty clause is inferred.
● The proof consists of given clauses.
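A schematic sketch of the given-clause loop described above; select_best, infer, is_trivial, and is_redundant are hypothetical placeholders for E’s actual selection heuristics, calculus rules, and simplification checks:

```python
def given_clause_loop(premises, negated_conjecture):
    """Schematic saturation loop; helper functions are placeholders."""
    processed = set()                                    # P: initially empty
    unprocessed = set(premises) | {negated_conjecture}   # U
    while unprocessed:
        g = select_best(unprocessed)        # pick the next 'given clause'
        unprocessed.remove(g)
        if is_empty_clause(g):
            return "proof found"            # contradiction derived
        processed.add(g)
        for new in infer(g, processed):     # apply inference rules
            if not is_trivial(new) and not is_redundant(new, processed):
                unprocessed.add(new)
    return "saturated: no proof"
```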
Down and Dirty with the Dataset
● 3161 CNF problems from the Mizar 40 dataset
● Proved by a single E strategy
● For each clause of a proof, solve both sub-problems.
● 230528 clauses in total
Lemma Stats
Of the 230528 clauses:
● 98472 are axioms and negated conjectures.
● 87161 are anti-useful lemmas.
● 44895 are useful lemmas.
● 154 have psr(L, Γ, C) = 1.
Lemma Stats
● Best lemma’s psr: 0.0036 (275 times faster)
● Worst lemma: 77 times slower
● Number of lemmas with psr under 0.1: 1509
Lemma Classification
Why?
● To gauge the difficulty of the dataset
● Clear yes/no results compared to regression
Possible use-cases:
● Proof compression for E inference guidance
● Analyze incomplete proof searches to look for lemmas
Clause Vectors
● Treat a clause as a tree; abstract variables and Skolem symbols.
● Features are descending paths of length 3.
Clause Vectors
● Enumerate the features (giving an R^|Features| vector space).
● Count each feature’s occurrences in a clause to form its vector (a sketch follows).
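A minimal sketch of this featurization, assuming clauses are given as nested (symbol, children) term trees; the abstraction markers *VAR and *SKO, the naming conventions, and the helper names are illustrative assumptions, not ENIGMA’s actual implementation:

```python
from collections import Counter

def abstract(symbol):
    """Collapse variables and Skolem symbols to generic markers."""
    if symbol.startswith("X"):    # variable naming convention (assumed)
        return "*VAR"
    if symbol.startswith("esk"):  # Skolem symbol prefix (assumed)
        return "*SKO"
    return symbol

def paths3(term, prefix=()):
    """Yield all descending symbol paths of length 3 in a term tree.
    A term is a (symbol, [subterms]) pair."""
    symbol, children = term
    path = prefix + (abstract(symbol),)
    if len(path) == 3:
        yield path
        path = path[1:]  # slide the window down the tree
    for child in children:
        yield from paths3(child, path)

def clause_vector(clause_terms, feature_index):
    """Count feature occurrences; feature_index maps path -> dimension."""
    counts = Counter(p for t in clause_terms for p in paths3(t))
    vec = [0] * len(feature_index)
    for path, n in counts.items():
        if path in feature_index:
            vec[feature_index[path]] = n
    return vec
```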
ML Methods
● Support Vector Machine Classifier (SVC) from scikit-learn
● XGBoost: gradient-boosted decision trees
● SVC and XGBoost use |Clause ++ Conjecture| ENIGMA features (clause and conjecture feature vectors concatenated); a minimal training sketch follows.
● Graph Attention Networks (GAT):
  ● Assign labels or numbers to nodes via the graph structure.
  ● At each level, a node’s features depend on its neighbors.
  ● Drawback: the graph adjacency matrix causes large memory consumption.
● Question: will the proof-graph structure help identify lemmas?
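A minimal sketch of the XGBoost setup, assuming feature matrices like those built above and 0/1 usefulness labels; the placeholder data and the hyperparameters shown are illustrative, not the paper’s tuned values:

```python
import numpy as np
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score, precision_score, recall_score

# X: concatenated clause ++ conjecture feature vectors,
# y: 1 for useful lemmas (psr < 1), 0 otherwise -- placeholder data here.
rng = np.random.default_rng(0)
X = rng.integers(0, 5, size=(1000, 256)).astype(float)
y = rng.integers(0, 2, size=1000)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.1, random_state=0)

clf = xgb.XGBClassifier(n_estimators=200, max_depth=9, learning_rate=0.2)
clf.fit(X_tr, y_tr)

pred = clf.predict(X_te)
print("F1:", f1_score(y_te, pred),
      "precision:", precision_score(y_te, pred),
      "recall:", recall_score(y_te, pred))
```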
Results
F₁ score = 2 · precision · recall / (precision + recall)
[F-score illustration images courtesy of https://en.wikipedia.org/wiki/F1_score]
Results

           F-score   Precision   Recall   Accuracy
SVC        0.53      0.45        0.64     0.74
GAT        0.55      0.45        0.72     0.55
XGBoost    0.68      0.65        0.72     0.77

Precision and recall are with respect to useful lemmas. Results are on a 10% test set.
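As a sanity check on the table (my arithmetic, not from the slides): for XGBoost, 2 · 0.65 · 0.72 / (0.65 + 0.72) ≈ 0.68, matching the reported F-score; the SVC and GAT rows check out the same way.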
Conclusions
● GAT appears not to scale, and the proof-graph structure is not effectively utilized.
● XGBoost is cheap to train and sufficiently effective to be used in further experiments with E.
Todo:
● Learn more semantic features
● Work on generating lemmas