Using Binary Decision Diagrams to Enumerate Inductive Logic Programming Solutions

Hikaru Shindo*, Masaaki Nishino**, Akihiro Yamamoto*. September 4, 2018. * Graduate School of Informatics, Kyoto University. ** NTT Communication Science Laboratories.


  1. Using Binary Decision Diagrams to Enumerate Inductive Logic Programming Solutions
     Hikaru Shindo*, Masaaki Nishino**, Akihiro Yamamoto*
     September 4, 2018
     * Graduate School of Informatics, Kyoto University
     ** NTT Communication Science Laboratories

  2. Abstract
     • The basic formalization of ILP allows many potential solutions, and we might miss important ones. ⇒ Enumeration is a fundamental technique for avoiding such omissions.
     • A Binary Decision Diagram (BDD) is a directed acyclic graph that compactly represents a Boolean function.
     • Key idea: we use a Binary Decision Diagram for enumeration.
     • We propose an efficient algorithm for enumerating solutions of Inductive Logic Programming problems with Binary Decision Diagrams.
     • We show how to recursively build a Binary Decision Diagram that represents the set of solutions.

  3. Table of contents
     1. Introduction
     2. Binary Decision Diagram and Enumeration of Solutions
     3. Applications
     4. Experiments
     5. Conclusion and Future work

  4. Introduction

  5. Motivation
     • An ILP system generates solutions for given positive and negative examples. From the viewpoint of logic, many candidate solutions may be generated.
     • Every ILP system chooses some appropriate solutions based on some criteria or on its search method.
     Example: $E^+ = \{p(a)\}$, $E^- = \{p(b)\}$, $B = \{\}$ ⇒ $\Sigma = \{p(a)\}$, $\Sigma = \{p(x) \leftarrow q(x),\ q(a)\}$, ...
     We call a solution of an ILP problem a hypothesis.

  6. Fundamental idea: Enumeration of hypotheses
     Enumerating hypotheses means keeping all hypotheses. Merits of the enumeration:
     1. Preventing hypothesis omission: the importance of a hypothesis depends on the case, so algorithms that give only one hypothesis may not return the best hypothesis.
     2. Hypothesis selection: users can select a hypothesis or compare several hypotheses using an evaluation function.
     3. Online learning: we can efficiently perform online learning, i.e., updating the current set of hypotheses when new examples are added.

  7. Approach
     • We assume that a finite set of clauses that can be elements of hypotheses is given explicitly.
     • Even in that finite space, naively enumerating all hypotheses is impractical, because the number of candidate hypotheses grows exponentially with the number of clauses.
     • To handle such large sets of hypotheses, we use Binary Decision Diagrams (BDDs), which give a compressed representation of the hypotheses for enumeration.
     • In this work, we developed an efficient recursive algorithm for constructing a BDD.

  8. Contribution
     • An efficient algorithm for enumerating hypotheses using BDDs.
     • A characterization of the class of ILP problems to which our algorithm can be applied.
     • An efficient algorithm to get the best hypothesis with respect to an evaluation function.
     • An empirical demonstration that our method can be applied to real data.

  9. Binary Decision Diagram and Enumeration of Solutions

  10. Binary Decision Diagrams
      A Binary Decision Diagram (BDD) is a directed acyclic graph that represents a Boolean function. Binary operations between BDDs can be executed efficiently. For example, given two BDDs representing logical functions $F$ and $G$, the BDD representing $H = F \land G$ can be computed in time at most proportional to the product of the sizes of the two BDDs.
      [Figure: BDD that represents $F(x_0, x_1, x_2) = (x_0 \land x_1) \lor x_2$]
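
The binary operation on BDDs can be illustrated with a minimal sketch. This is not the authors' implementation: the tuple encoding of nodes and the helper names (`mk`, `variable`, `apply_op`, `evaluate`, `internal_nodes`) are assumptions made for this example, and the variable order is fixed to $x_0 < x_1 < x_2$. Memoizing on pairs of nodes is what bounds the work by the product of the two BDD sizes.

```python
import itertools
import operator
from functools import lru_cache

# Assumed encoding: terminals are the ints 0 and 1,
# internal nodes are tuples (variable index, low child, high child).

def mk(var, low, high):
    """Reduction rule: drop a node whose two children are identical."""
    return low if low == high else (var, low, high)

def variable(i):
    """BDD for the single Boolean variable x_i."""
    return mk(i, 0, 1)

def _split(u, v):
    """Children of u with respect to variable v (u itself if u skips v)."""
    if isinstance(u, tuple) and u[0] == v:
        return u[1], u[2]
    return u, u

@lru_cache(maxsize=None)  # each pair (f, g) is expanded at most once
def apply_op(op, f, g):
    """Combine two BDDs with a binary Boolean operator on 0/1."""
    if not isinstance(f, tuple) and not isinstance(g, tuple):
        return op(f, g)
    v = min(u[0] for u in (f, g) if isinstance(u, tuple))
    f0, f1 = _split(f, v)
    g0, g1 = _split(g, v)
    return mk(v, apply_op(op, f0, g0), apply_op(op, f1, g1))

def evaluate(u, assignment):
    """Follow one root-to-terminal path for a 0/1 assignment."""
    while isinstance(u, tuple):
        var, low, high = u
        u = high if assignment[var] else low
    return u

def internal_nodes(u, seen=None):
    """Set of distinct internal nodes reachable from u."""
    seen = set() if seen is None else seen
    if isinstance(u, tuple) and u not in seen:
        seen.add(u)
        internal_nodes(u[1], seen)
        internal_nodes(u[2], seen)
    return seen

x0, x1, x2 = variable(0), variable(1), variable(2)
# F(x0, x1, x2) = (x0 AND x1) OR x2, the function shown on the slide.
F = apply_op(operator.or_, apply_op(operator.and_, x0, x1), x2)

for bits in itertools.product([0, 1], repeat=3):
    print(bits, evaluate(F, bits))
```

Because equal tuples compare equal, structurally identical subgraphs are shared automatically, which keeps this toy representation reduced without an explicit unique table.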

  11. Inductive Logic Programming
      In Inductive Logic Programming (ILP), all data, background knowledge, and hypotheses are represented in first-order logic.
      ILP Problem
      Input: finite sets $E^+$, $E^-$, and $B$ of ground atoms.
      Output: a set of definite clauses $\Sigma$ such that
      1. for all $A \in E^+$, $\Sigma \cup B \models A$;
      2. for all $A \in E^-$, $\Sigma \cup B \not\models A$.
      Example: $E^+ = \{p(a)\}$, $E^- = \{p(b)\}$, $B = \{\}$; $\Sigma = \{p(a)\}$, $\{p(x) \leftarrow q(x),\ q(a)\}$, ...
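
The two output conditions can be checked mechanically once the clauses are ground. The sketch below is an illustration under stated assumptions, not part of the slides: clauses are written by hand as ground `(head, body)` pairs over the constants `a` and `b`, and entailment of a ground atom is tested by forward chaining to the least model. The function names `consequences` and `is_solution` are hypothetical.

```python
def consequences(clauses):
    """Least model of ground definite clauses given as (head, body) pairs."""
    derived = set()
    changed = True
    while changed:
        changed = False
        for head, body in clauses:
            # A clause fires once all of its body atoms have been derived.
            if head not in derived and all(b in derived for b in body):
                derived.add(head)
                changed = True
    return derived

def is_solution(sigma, background, e_pos, e_neg):
    """Check conditions 1 and 2 of the ILP problem for ground clauses."""
    facts = [(atom, ()) for atom in background]
    model = consequences(list(sigma) + facts)
    return all(a in model for a in e_pos) and not any(a in model for a in e_neg)

E_POS, E_NEG, B = ["p(a)"], ["p(b)"], []

# Sigma = {p(a)}
sigma_1 = [("p(a)", ())]
# Sigma = {p(x) <- q(x), q(a)}, with the rule grounded over a and b (assumption).
sigma_2 = [("p(a)", ("q(a)",)), ("p(b)", ("q(b)",)), ("q(a)", ())]
# Sigma = {p(a), p(b)} also entails the negative example.
sigma_3 = [("p(a)", ()), ("p(b)", ())]

for name, sigma in [("sigma_1", sigma_1), ("sigma_2", sigma_2), ("sigma_3", sigma_3)]:
    print(name, is_solution(sigma, B, E_POS, E_NEG))
```

The first two candidates are the hypotheses listed in the example; the third is rejected because it derives $p(b)$.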

  12. Using BDDs for enumerating ILP solutions
      • To enumerate ILP hypotheses with BDDs, we introduce Boolean variables, because a BDD is a representation of a Boolean function.
      • Boolean variables make the hypothesis enumeration problem equivalent to the problem of identifying a Boolean function.
      • The hypothesis space $H$ is a finite set of clauses that can be elements of a hypothesis. We assume that $H$ is given explicitly.
      For each clause $C \in H$, we introduce a propositional variable $v_{C \in \Sigma}$ that becomes true if and only if $C \in \Sigma$. For readability, we write $[C \in \Sigma]$ instead of $v_{C \in \Sigma}$:
      $$C \in \Sigma \iff [C \in \Sigma] = T. \tag{1}$$

  13. Building a BDD that represents hypotheses
      We define $F_A$ as a BDD that represents the Boolean function that becomes true if and only if $\Sigma \cup B \models A$. Then, a BDD that represents the set of hypotheses is
      $$\bigwedge_{A \in E^+} F_A \;\land \bigwedge_{A \in E^-} \lnot F_A.$$
      Example
      Given: $E^+ = \{p(a)\}$, $E^- = \{p(b)\}$, $B = \{\}$.
      The BDD to be built: $F_{p(a)} \land \lnot F_{p(b)}$.
      [Figure: the BDD for $F_{p(a)} \land \lnot F_{p(b)}$, shown as the conjunction of two BDDs]

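  14. Solving ILP problem on the BDD
      $I_C$: the BDD that represents the Boolean variable $[C \in \Sigma]$.
      $BK_A$: the BDD that represents a constant that becomes true if and only if $A \in B$.
      Then $F_A$ for $A \in E^+ \cup E^-$ is recursively defined as
      $$F_A = BK_A \lor \bigvee_{\substack{C \in H \\ \exists\theta\; C\theta = A \leftarrow B_1 \land \dots \land B_n}} \Big( I_C \land \bigwedge_{i=1}^{n} F_{B_i} \Big). \tag{2}$$
      The right side of equation (2) represents the fact that $\Sigma \cup B \models A$ if
      1. $A \in B$, or
      2. $A$ is deduced from a clause $C \in \Sigma$ by a substitution $\theta$, i.e. $C\theta = A \leftarrow B_1 \land \dots \land B_n$ and every body atom $B_i$ is itself entailed.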

  15. Solving ILP problem on the BDD
      Example
      Introduced variables: 0: $[p(a) \in \Sigma]$, 1: $[p(b) \in \Sigma]$, 2: $[q(a) \in \Sigma]$, 3: $[q(b) \in \Sigma]$, 4: $[p(x) \leftarrow q(x) \in \Sigma]$
      $$F_{p(a)} = I_{p(a)} \lor \big( I_{p(x) \leftarrow q(x)} \land F_{q(a)} \big)$$
      $$F_{p(b)} = I_{p(b)} \lor \big( I_{p(x) \leftarrow q(x)} \land F_{q(b)} \big)$$
      [Figure: each equation drawn as a combination of single-variable BDDs, over variables 0, 4, 2 for $F_{p(a)}$ and 1, 4, 3 for $F_{p(b)}$]

  16. Solving ILP problem on the BDD
      Problem: $E^+ = \{p(a)\}$, $E^- = \{p(b)\}$, $B = \{\}$, $H = \{p(a),\ p(b),\ q(a),\ q(b),\ p(x) \leftarrow q(x)\}$.
      Introduced variables: 0: $[p(a) \in \Sigma]$, 1: $[p(b) \in \Sigma]$, 2: $[q(a) \in \Sigma]$, 3: $[q(b) \in \Sigma]$, 4: $[p(x) \leftarrow q(x) \in \Sigma]$
      [Figure: the BDD for $F_{p(a)} \land \lnot F_{p(b)}$]
      Enumerated hypotheses: $\Sigma = \{p(a)\}$, $\Sigma = \{q(a),\ p(x) \leftarrow q(x)\}$, ...
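
As a cross-check of this example, the recursive definition of $F_A$ can be evaluated directly on every subset of the five-clause hypothesis space. The sketch below deliberately uses brute force over all $2^5$ subsets instead of a BDD, so it only illustrates what the BDD encodes. The `INSTANCES` table, which lists by hand the ground instances of each clause whose head matches an atom, and the function names are assumptions made for this illustration.

```python
from itertools import product

# Hypothesis space from the slide; clause names are plain strings here.
H = ["p(a)", "p(b)", "q(a)", "q(b)", "p(x)<-q(x)"]

# For each ground atom A: the clauses C with an instance C.theta = A <- B_1,...,B_n,
# stored as (C, (B_1, ..., B_n)). Written out by hand (assumption).
INSTANCES = {
    "p(a)": [("p(a)", ()), ("p(x)<-q(x)", ("q(a)",))],
    "p(b)": [("p(b)", ()), ("p(x)<-q(x)", ("q(b)",))],
    "q(a)": [("q(a)", ())],
    "q(b)": [("q(b)", ())],
}
BACKGROUND = set()  # B = {}

def entails(atom, sigma):
    """Value of F_A for one candidate sigma, following the recursive definition."""
    if atom in BACKGROUND:  # the BK_A term
        return True
    return any(
        clause in sigma and all(entails(b, sigma) for b in body)
        for clause, body in INSTANCES.get(atom, [])
    )

def enumerate_hypotheses(e_pos, e_neg):
    """All subsets of H that entail every positive and no negative example."""
    found = []
    for bits in product([0, 1], repeat=len(H)):
        sigma = frozenset(c for c, bit in zip(H, bits) if bit)
        if all(entails(a, sigma) for a in e_pos) and not any(
            entails(a, sigma) for a in e_neg
        ):
            found.append(sigma)
    return found

hypotheses = enumerate_hypotheses(["p(a)"], ["p(b)"])
for sigma in hypotheses:
    print(sorted(sigma))
```

The recursion terminates here because no clause in this hypothesis space is recursive. The result contains both hypotheses named on the slide, and the satisfying assignments of $F_{p(a)} \land \lnot F_{p(b)}$ are exactly these sets.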

  17. Applications

  18. Search for the best hypothesis
      Example
      Introduced variables: 0: $[p(a) \in \Sigma]$, 1: $[p(b) \in \Sigma]$, 2: $[q(a) \in \Sigma]$, 3: $[q(b) \in \Sigma]$, 4: $[p(x) \leftarrow q(x) \in \Sigma]$
      The hypothesis with the minimum number of atoms: $\Sigma_{best} = \{p(a)\}$. This corresponds to the minimum-weight path, colored red in the figure.
      [Figure: the BDD for $F_{p(a)} \land \lnot F_{p(b)}$ with weighted edges and the minimum-weight path highlighted]
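
The minimum-weight path can be found with a single bottom-up pass over the BDD. The following sketch makes several assumptions that are not stated on the slide: the BDD for $F_{p(a)} \land \lnot F_{p(b)}$ is written out by hand for the variable order 0 < 1 < 2 < 3 < 4, taking the high edge of a variable costs the number of atoms in the corresponding clause, and low edges cost nothing. The names `WEIGHTS`, `ROOT`, and `best` are hypothetical.

```python
from functools import lru_cache

# Variable index -> clause, and the assumed weight (number of atoms in the clause).
NAMES = {0: "p(a)", 1: "p(b)", 2: "q(a)", 3: "q(b)", 4: "p(x)<-q(x)"}
WEIGHTS = {0: 1, 1: 1, 2: 1, 3: 1, 4: 2}

# Hand-built reduced BDD; nodes are (variable, low child, high child),
# terminals are 0 and 1.
NOT_Q_B_WITH_RULE = (3, 1, (4, 1, 0))           # NOT (q(b) AND rule)
WITH_P_A = (1, NOT_Q_B_WITH_RULE, 0)            # p(a) in Sigma, p(b) must be absent
WITHOUT_P_A = (1, (2, 0, (3, (4, 0, 1), 0)), 0)  # needs q(a) and rule, without q(b)
ROOT = (0, WITHOUT_P_A, WITH_P_A)

@lru_cache(maxsize=None)  # every node is solved once, so the pass is linear in BDD size
def best(node):
    """Return (weight, chosen variables) of the lightest path to terminal 1."""
    if node == 1:
        return (0, ())
    if node == 0:
        return (float("inf"), ())
    var, low, high = node
    w_low, s_low = best(low)
    w_high, s_high = best(high)
    w_high += WEIGHTS[var]
    if w_low <= w_high:
        return (w_low, s_low)
    return (w_high, (var,) + s_high)

weight, chosen = best(ROOT)
print(weight, [NAMES[v] for v in chosen])
```

Variables skipped along a path are treated as false, which is safe here because all weights are non-negative.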

  19. Experiments

  20. Classification of natural numbers
      When $n$ is even, $E^+ = \{e(0), e(s^2(0)), \dots, e(s^n(0))\}$ and $E^- = \{e(s(0)), e(s^3(0)), \dots, e(s^{n+1}(0))\}$.
      When $n$ is odd, $E^+ = \{e(0), e(s^2(0)), \dots, e(s^{n+1}(0))\}$ and $E^- = \{e(s(0)), e(s^3(0)), \dots, e(s^n(0))\}$.
      Example: in the case of $n = 1$, $E^+$, $E^-$, $B$, and $H$ are, respectively, $E^+ = \{e(0), e(s^2(0))\}$, $E^- = \{e(s(0))\}$, $B = \emptyset$, and
      $$H = \left\{ \begin{array}{ll} e(0), & e(x), \\ e(s(0)), & e(s(x)), \\ e(s^2(0)), & e(s^2(x)), \\ e(s^2(x)) \leftarrow e(x), & e(s(x)) \leftarrow e(x), \\ e(s^2(x)) \leftarrow e(s(x)), & e(s^2(x)) \leftarrow e(s(x)) \land e(x) \end{array} \right\}.$$
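
Both parity cases amount to splitting the numerals $0, \dots, n+1$ into even and odd ones. A small sketch of that reading follows; the string encoding of terms and the function names `numeral` and `examples` are assumptions for illustration only.

```python
def numeral(k):
    """Term s^k(0) written out in full, e.g. numeral(2) == 's(s(0))'."""
    return "s(" * k + "0" + ")" * k

def examples(n):
    """Positive (even) and negative (odd) examples over the numerals 0 .. n+1."""
    e_pos = [f"e({numeral(k)})" for k in range(n + 2) if k % 2 == 0]
    e_neg = [f"e({numeral(k)})" for k in range(n + 2) if k % 2 == 1]
    return e_pos, e_neg

for n in (1, 2):
    pos, neg = examples(n)
    print(n, pos, neg)
```

For $n = 1$ this reproduces the example sets $E^+ = \{e(0), e(s^2(0))\}$ and $E^- = \{e(s(0))\}$.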

  21. Results
      Table 1: The results of the natural number problem

      | n | variables | nodes | hypotheses | BDD construction time | best hypothesis search time |
      |---|---|---|---|---|---|
      | 1 | 10 | 8 | 28 | 7.56 msec | 0.62 msec |
      | 2 | 19 | 14 | 192 | 9.63 msec | 0.68 msec |
      | 3 | 36 | 27 | 1.25 × 10^7 | 1.90 × 10 msec | 1.02 msec |
      | 4 | 69 | 42 | 1.31 × 10^13 | 3.08 × 10 msec | 1.16 msec |
      | 5 | 134 | 69 | 4.82 × 10^32 | 7.00 × 10 msec | 1.48 msec |
      | 6 | 263 | 101 | 9.77 × 10^63 | 3.50 × 10^2 msec | 2.21 msec |
      | 7 | 520 | 156 | 2.26 × 10^141 | 1.68 × 10^3 msec | 1.68 msec |
      | 8 | 1033 | 219 | 1.80 × 10^308+ | 1.20 × 10^4 msec | 2.66 msec |

  22. Classification of real data
      (1) Soybean(small) [1] and (2) Shuttle Landing Control [2] from the UCI Machine Learning Repository [3]. Target concepts: D1 and no_auto, respectively.
      Table 2: The results of real data problem

      | Problem | variables | nodes | hypotheses | BDD construction time |
      |---|---|---|---|---|
      | Soybean | 2243 | 788498 | 1.80 × 10^308+ | 13495 msec |
      | Shuttle | 117 | 2345 | 6.76 × 10^10 | 3.30 msec |

      One of the best hypotheses found in the Soybean(small) problem is $\Sigma_{best} = \{\mathit{class}(x, D1) \leftarrow \mathit{stem\_canker}(x, \mathit{above\_soil})\}$.
      [1] https://archive.ics.uci.edu/ml/datasets/soybean+(small)
      [2] https://archive.ics.uci.edu/ml/datasets/Shuttle+Landing+Control
      [3] http://archive.ics.uci.edu/ml/index.php

  23. Conclusion and Future work

  24. Conclusion and Future work
      Conclusion
      • We proposed a BDD-based method to enumerate hypotheses of an ILP problem.
      • We showed that users can get the best hypothesis with respect to an evaluation function from the constructed BDD.
      Future Work
      • Enumerating hypotheses that have some errors
      • Combination with other ILP approaches
      • Enumeration with other data structures

  25. Requirements
      The hypothesis space is a finite set of clauses that can be elements of the hypothesis. We assume that the hypothesis space is given explicitly, and that it satisfies the following two requirements.
      Requirement 1: The hypothesis space does not contain any mutually recursive clauses.
      Requirement 2: The hypothesis space is variable-bounded.
