SAT-based Encodings for Optimal Decision Trees with Explicit Paths
Mikoláš Janota 1,2, António Morgado 1
1 INESC-ID/IST, Universidade de Lisboa, Portugal
2 Czech Technical University in Prague, Czech Republic
SAT 2020
Janota and Morgado 1 / 19
Example of Decision Trees
Question: Should I start writing a new paper?
Features:
A - Is the idea for the paper great and innovative?
B - Are the results bad?
C - Is there a very close deadline for the paper?
Samples:
A  B  C  Write?
0  0  0  1
0  0  1  0
0  1  1  0
1  0  1  1
1  0  0  1
1  1  0  0
⇒ Decision tree:
A = 0: test C (C = 0: Write, C = 1: No)
A = 1: test B (B = 0: Write, B = 1: No)
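As a quick sanity check, the tree on this slide can be evaluated against the six samples. The sketch below is illustrative only: the nested-tuple tree representation and the names `TREE`, `classify`, and `is_consistent` are assumptions made here, not part of the talk.

```python
# Samples from the slide: (A, B, C) -> Write?
SAMPLES = [
    ((0, 0, 0), 1),
    ((0, 0, 1), 0),
    ((0, 1, 1), 0),
    ((1, 0, 1), 1),
    ((1, 0, 0), 1),
    ((1, 1, 0), 0),
]

A, B, C = 0, 1, 2  # feature indices

# Assumed representation: an internal node is
# (feature_index, subtree_if_0, subtree_if_1); a leaf is a class label.
TREE = (A, (C, 1, 0), (B, 1, 0))


def classify(tree, sample):
    """Walk from the root to a leaf, following the sample's feature values."""
    while isinstance(tree, tuple):
        feature, if_zero, if_one = tree
        tree = if_one if sample[feature] else if_zero
    return tree


def is_consistent(tree, samples):
    """True if the tree classifies every sample with its given label."""
    return all(classify(tree, x) == y for x, y in samples)


if __name__ == "__main__":
    print(is_consistent(TREE, SAMPLES))
```

The tree tests only two features on every path, so it has depth 2 while still agreeing with all six samples.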
Objectives and Motivation
What and Why: Given a set of samples, find a provably smallest decision tree.
Why smallest? By Occam's razor, smaller trees generalize better.
Why provably smallest? Standard algorithms find small trees only heuristically.
How: Encode into SAT the question "Is there a tree of size N?" Look for the smallest tree iteratively.
Two minimization criteria investigated: depth and size.
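The iterative search can be pictured as a loop around a yes/no oracle. The sketch below minimizes depth and, purely for illustration, uses a brute-force recursive check as the oracle in place of a SAT call; it is not the encoding from the talk. The names `exists_tree_of_depth` and `minimum_depth` are hypothetical, and binary features and a non-empty sample set are assumed.

```python
# Samples from the earlier example slide: (A, B, C) -> Write?
SAMPLES = [
    ((0, 0, 0), 1),
    ((0, 0, 1), 0),
    ((0, 1, 1), 0),
    ((1, 0, 1), 1),
    ((1, 0, 0), 1),
    ((1, 1, 0), 0),
]


def exists_tree_of_depth(samples, depth):
    """Brute-force stand-in for the SAT query: is there a decision tree of
    depth at most `depth` that classifies all samples correctly?"""
    labels = {y for _, y in samples}
    if len(labels) <= 1:  # pure (or empty) set: a single leaf suffices
        return True
    if depth == 0:
        return False
    n_features = len(samples[0][0])
    for f in range(n_features):
        zeros = [(x, y) for x, y in samples if x[f] == 0]
        ones = [(x, y) for x, y in samples if x[f] == 1]
        if (exists_tree_of_depth(zeros, depth - 1)
                and exists_tree_of_depth(ones, depth - 1)):
            return True
    return False


def minimum_depth(samples, oracle=exists_tree_of_depth):
    """Ask the oracle for depth 0, 1, 2, ... and return the first success.
    With binary features no path needs more tests than there are features,
    so the loop stops there; None signals contradictory samples."""
    max_depth = len(samples[0][0])
    for depth in range(max_depth + 1):
        if oracle(samples, depth):
            return depth
    return None


if __name__ == "__main__":
    print(minimum_depth(SAMPLES))
```

Because the bounds are tried in increasing order, the first bound the oracle accepts is minimal by construction; replacing the oracle with a SAT query leaves the loop unchanged.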
Minimum Depth Optimal Decision Tree
Example: benchmark postoperative-patient-data-un 1-un with 50% sampling (approx. 20 features, 40 samples).
[Figure: two decision trees for the same benchmark. sklearn: depth 11, 37 nodes. d-dtfinder: depth 6, 29 nodes.]
Remaining tools time out (after 1000s).
Encoding Decision Trees into SAT
1 Current SAT approaches:
  ◮ model the tree as a DAG
  ◮ impose tree structure
2 Our new approach:
  ◮ model the tree as a set of paths
  ◮ impose conditions on the paths so that they make up a (binary) tree
3 Advantages:
  ◮ explicit control over the tree's depth
  ◮ no need for cardinality constraints over neighbors (DAG)
  ◮ no need for a distinction between internal and leaf nodes
Trees as Paths
A tree expands to O(n) paths.
Consecutive paths overlap until they diverge.
Tree:
A: 0 -> B, 1 -> G
B: 0 -> C, 1 -> D
D: 0 -> E, 1 -> F
Paths:
A -0-> B -0-> C
A -0-> B -1-> D -0-> E
A -0-> B -1-> D -1-> F
A -1-> G
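The expansion of a tree into its root-to-end paths can be sketched as a depth-first walk. The dictionary representation and the names `expand_paths` and `shared_prefix_length` below are assumptions chosen for illustration.

```python
# Assumed representation: node -> (child on 0, child on 1).
# A node that does not appear as a key ends a path.
TREE = {"A": ("B", "G"), "B": ("C", "D"), "D": ("E", "F")}


def expand_paths(tree, root):
    """Return the paths left to right as (visited nodes, direction string)."""
    paths = []

    def walk(node, nodes, directions):
        nodes = nodes + [node]
        if node not in tree:
            paths.append((nodes, directions))
            return
        on_zero, on_one = tree[node]
        walk(on_zero, nodes, directions + "0")
        walk(on_one, nodes, directions + "1")

    walk(root, [], "")
    return paths


def shared_prefix_length(p, q):
    """Number of leading directions two paths have in common."""
    n = 0
    for a, b in zip(p, q):
        if a != b:
            break
        n += 1
    return n


if __name__ == "__main__":
    for nodes, directions in expand_paths(TREE, "A"):
        print(" ".join(nodes), directions)
```

Listing the paths left to right is what makes consecutive paths share a prefix before they diverge, which `shared_prefix_length` measures on the direction strings.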
Paths as Booleans The shape of any path modeled as follows: go right? go right? terminate? terminate? . . . equal to previous? equal to previous? step 1 step S Janota and Morgado 7 / 19
Paths as Booleans, example
Tree:
A: 0 -> B, 1 -> C
Paths:
A -0-> B
A -1-> C
Booleans for the path A -1-> C:
               step 1  step 2  step 3
go right       T       *       *
terminate      F       F       T
equal to prev  T       F       F
Shaping Paths into Trees
[Figure: the direction bits (0 = left, 1 = right) of each path, shown next to a binary tree over nodes N1 to N7; in the second build, nodes N4 and N5 are removed.]
Shaping Paths into Trees, Main Rules
[Figure: a small tree with 0/1 edge labels illustrating the rules.]
1 The first path always goes to the left.
2 The last path always goes to the right.
3 Paths must not cross.
4 To avoid gaps: once a path diverges ...
  ◮ it has to keep going left,
  ◮ the previous path has to keep going right.
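One way to read the four rules is as a check on an ordered list of direction strings, with 0 for left and 1 for right. The sketch below is an interpretation of the slide, not the SAT clauses themselves; the function name `forms_binary_tree` and the string representation are assumptions.

```python
def forms_binary_tree(paths):
    """Check the four rules on direction strings listed left to right."""
    if not paths:
        return False
    # Rule 1: the first path always goes left.
    if set(paths[0]) - {"0"}:
        return False
    # Rule 2: the last path always goes right.
    if set(paths[-1]) - {"1"}:
        return False
    for prev, cur in zip(paths, paths[1:]):
        # Find the first step at which the two paths differ.
        k = 0
        while k < min(len(prev), len(cur)) and prev[k] == cur[k]:
            k += 1
        if k == min(len(prev), len(cur)):
            return False  # the paths never diverge
        # Rule 3: no crossing, so the earlier path turns left there.
        if prev[k] != "0" or cur[k] != "1":
            return False
        # Rule 4: after diverging, the previous path keeps going right
        # and the current path keeps going left.
        if set(prev[k + 1:]) - {"1"} or set(cur[k + 1:]) - {"0"}:
            return False
    return True


if __name__ == "__main__":
    print(forms_binary_tree(["00", "010", "011", "1"]))
    print(forms_binary_tree(["00", "1"]))
```

The second call illustrates a gap: the path 01 is missing, so the first path fails to keep going right after it diverges from its successor.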
Encoding Semantics
1 Each path has a classification class (a single Boolean).
2 Each node is assigned a feature.
3 For each path, determine which samples reach its end.
4 If a positive sample reaches the end of a path, the path must be positive.
5 If a negative sample reaches the end of a path, the path must be negative.
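The semantic conditions can be illustrated by checking a fixed set of paths against the samples of the earlier example. This is a simplified sketch: each path lists its own features, whereas in the encoding features are assigned to nodes, and the names `PATHS`, `reaches_end`, and `classification_errors` are hypothetical.

```python
# Samples from the earlier example slide: (A, B, C) -> Write?
SAMPLES = [
    ((0, 0, 0), 1),
    ((0, 0, 1), 0),
    ((0, 1, 1), 0),
    ((1, 0, 1), 1),
    ((1, 0, 0), 1),
    ((1, 1, 0), 0),
]

A, B, C = 0, 1, 2  # feature indices

# Assumed representation of a path:
# (features tested along the path, directions taken, class of the path)
PATHS = [
    ((A, C), (0, 0), 1),
    ((A, C), (0, 1), 0),
    ((A, B), (1, 0), 1),
    ((A, B), (1, 1), 0),
]


def reaches_end(sample, features, directions):
    """A sample reaches the end of a path if, at every step, its value for
    the tested feature matches the direction the path takes."""
    return all(sample[f] == d for f, d in zip(features, directions))


def classification_errors(paths, samples):
    """Collect every (sample, label, path class) where a sample reaches the
    end of a path whose class differs from the sample's label."""
    errors = []
    for features, directions, path_class in paths:
        for x, y in samples:
            if reaches_end(x, features, directions) and y != path_class:
                errors.append((x, y, path_class))
    return errors


if __name__ == "__main__":
    print(classification_errors(PATHS, SAMPLES))
```

An empty error list corresponds to conditions 4 and 5 holding for every path; in the SAT encoding these become constraints on the Boolean variables rather than a check after the fact.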
Optimizations
Enforcing example matching
Pure features
Quasi-pure features