It's Not What Machines Can Learn, It's What We Cannot Teach
ICML 2020
Gal Yehuda, Moshe Gabel, Assaf Schuster
Applications of machine learning
Example: TSP. Given a graph, we feed it to a model (a GNN) which outputs YES or NO: does a route with cost < C exist?
Prates, Avelar, Lemos, Lamb, Vardi. Learning to Solve NP-Complete Problems: A Graph Neural Network for Decision TSP. AAAI 2019.
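To make the labeling cost concrete, here is a minimal sketch of exact labeling for decision TSP (our illustration, not from the paper or the cited work; `dist` and `C` are assumed inputs). Exact labeling takes exponential time in general, which is exactly the bottleneck discussed on the next slides.

```python
from itertools import permutations

# A minimal sketch (ours): label a decision-TSP instance YES iff some
# tour visiting all cities has total cost < C. Brute force: exponential.
def label_decision_tsp(dist, C):
    n = len(dist)
    for perm in permutations(range(1, n)):        # fix city 0 as the start
        tour = (0,) + perm + (0,)
        cost = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        if cost < C:
            return "YES"                          # found a cheap enough tour
    return "NO"                                   # no tour beats the budget
```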
The machine learning process: propose architecture, features, embedding → generate data → train model → evaluate → success.
Current Data Generation
SotA ML methods are data hungry:
• Need many labeled examples.
Labeling training data is slow:
• Need to solve TSP, check 3-SAT, etc.
Instead, data augmentation:
• Start with a small labeled training set.
• Apply label-preserving transformations, as in the sketch below.
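One hedged illustration of a label-preserving transformation (our example, not necessarily one used in any cited work): renaming the variables of a CNF formula preserves satisfiability, so a single labeled 3-SAT instance yields many "new" instances with the same label.

```python
import random

# Our sketch of a label-preserving transformation for 3-SAT: permuting
# variable names changes the instance but not its satisfiability label.
def rename_variables(clauses, num_vars):
    """clauses: list of tuples of nonzero ints (DIMACS-style literals)."""
    perm = list(range(1, num_vars + 1))
    random.shuffle(perm)                         # new name for each variable
    mapping = {v: perm[v - 1] for v in range(1, num_vars + 1)}
    return [tuple((1 if lit > 0 else -1) * mapping[abs(lit)] for lit in clause)
            for clause in clauses]

# (x1 ∨ ¬x2 ∨ x3) ∧ (¬x1 ∨ x2 ∨ ¬x3) keeps its satisfiability label:
augmented = rename_variables([(1, -2, 3), (-1, 2, -3)], num_vars=3)
```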
Our Main Result
When starting with an NP-hard problem, any efficient data generation or augmentation provably results in an easier subproblem (in NP ∩ coNP).
This creates a catch-22:
• Slow (non-poly-time) data generation → dataset too small.
• Fast (poly-time) data generation or augmentation → easier subproblem.
Case Study: Conjunctive Query Containment
Experiment on a case study, CQC, using a common data sampling + augmentation approach.
The model appears to learn well, but accuracy on the "real" instance space is much lower: up to a 30% drop.
[Figure: accuracy on augmented vs. sampled data.]
Takeaways
Efficient data generation results in an easier subproblem when training, which can cause overestimation of accuracy when testing.
This results in a catch-22:
• Small amounts of training data from the right problem?
• Or large amounts of training data from an easier subproblem?
Let's dive deeper
What exactly did we show?
Let L be an NP-hard language. The binary classification problem: is x ∈ L or not?
A sampler for L is a probabilistic algorithm that generates labeled instances (x, YES/NO).
An efficient sampler for L is a sampler that runs in poly-time. A toy example follows.
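To make the definition concrete, here is a toy poly-time sampler of our own (not from the paper): a planted-solution generator for 3-SAT that decides the label while constructing the instance.

```python
import random

# Our toy efficient sampler: it decides the label by construction,
# in polynomial time, never solving SAT.
def sample_labeled_sat(num_vars=20, num_clauses=85, seed=None):
    rng = random.Random(seed)                    # `seed` plays the role of u
    if rng.random() < 0.5:
        # YES instance: plant an assignment, keep only satisfied clauses.
        assign = [rng.choice([True, False]) for _ in range(num_vars)]
        clauses = []
        while len(clauses) < num_clauses:
            vs = rng.sample(range(num_vars), 3)
            clause = tuple((v + 1) if rng.random() < 0.5 else -(v + 1) for v in vs)
            if any((lit > 0) == assign[abs(lit) - 1] for lit in clause):
                clauses.append(clause)
        return clauses, "YES"
    # NO instance: all 8 sign patterns over (x1, x2, x3); no assignment
    # can satisfy all of them simultaneously.
    contradiction = [(a * 1, b * 2, c * 3)
                     for a in (1, -1) for b in (1, -1) for c in (1, -1)]
    return contradiction, "NO"
```

Note that this sampler knows the label by construction, and its NO instances are trivially recognizable: a deliberately crude design that foreshadows Result 3 below.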
Result 1: All polynomial-time samplers are incomplete.
• There are infinitely many instances a poly-time sampler cannot generate!
[Figure: the problem space seen by an efficient sampler is a strict subset of the original problem space.]
Result 2: A poly-time sampler yields an easier subproblem.
If T_L is a polynomial-time sampler for a language L, then the classification task over the instances T_L generates is in NP ∩ coNP.
The original problem was NP-hard; the resulting problem is in NP ∩ coNP.
Meaning: efficient sampling does not preserve hardness.
Even if we started with an NP-hard problem, what's left after efficient sampling is an easier sub-problem.
[Figure: complexity hierarchy from easier (P, NP ∩ coNP) to harder (NP-complete, NP-hard).]
Proof
NP: easy to verify that x ∈ L. There is a poly-time verifier M such that for all x: x ∈ L ⟺ ∃u, M(x, u) = 1.
coNP: easy to verify that x ∉ L. There is a poly-time verifier M' such that for all x: x ∉ L ⟺ ∃u, M'(x, u) = 1.
Proof
If x was generated by an efficient sampler T_L, we can use the randomness u consumed by the sampler both as a membership certificate and as a non-membership certificate.
To show that x ∈ L, check whether T_L(u) outputs (x, YES) ⟹ the sampled subproblem is in NP.
To show that x ∉ L, check whether T_L(u) outputs (x, NO) ⟹ the sampled subproblem is in coNP.
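A sketch of this certificate check, reusing the toy `sample_labeled_sat` from the earlier slide in the role of T_L (our illustration; the paper's argument is for arbitrary samplers):

```python
# The randomness u (here, a seed) certifies either label in poly time,
# placing the sampled subproblem in NP ∩ coNP: just rerun the sampler.
def verify_yes(x, u):
    return sample_labeled_sat(seed=u) == (x, "YES")   # u certifies x ∈ L

def verify_no(x, u):
    return sample_labeled_sat(seed=u) == (x, "NO")    # u certifies x ∉ L
```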
Result 3: It can get really bad…
We show an L such that:
1. The original L is NP-hard.
2. The output of any polynomial-time sampler for L is trivial to classify: with high probability, the first bit of x is the label.
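Stated as code, the degenerate classifier of Result 3 (a sketch; which bit value maps to which label depends on the construction in the paper):

```python
# Constant time, no learning involved: on efficiently sampled instances
# of the constructed language, the first bit predicts the label w.h.p.
def first_bit_classifier(x: str) -> str:
    return "YES" if x[0] == "0" else "NO"
```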
It can get really bad…
Meaning: any learning algorithm trained on efficiently generated data "thinks" it has 100% accuracy, when in fact it learns nothing about the original problem.
[Figure: the sampled subproblem drops all the way from NP-hard to constant-time.]
Case study: Conjunctive Query Containment
• A conjunctive query q over a database is a first-order formula of the form q(x̄) = ∃ȳ: R₁(z̄₁) ∧ … ∧ Rₘ(z̄ₘ), where each z̄ᵢ draws from x̄ ∪ ȳ.
• The task: given two queries q and p, are the results of q contained in the results of p regardless of the database they run on?
• This is an NP-complete problem.
• It has implications for query optimization, cache management, and more. A brute-force sketch follows.
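As background for the case study, a hedged brute-force sketch of CQC (our illustration, not the paper's model): by the classical Chandra-Merlin theorem, q ⊆ p holds iff there is a homomorphism from p's atoms to q's atoms that maps p's head to q's head. The query encoding below is our assumption (variable-only atoms, safe queries), and the exhaustive search is exponential, consistent with NP-completeness.

```python
from itertools import product

def contained_in(q, p):
    """Return True iff query q is contained in query p (q ⊆ p).

    q, p: (head_vars, body), body a list of (relation, arg_tuple) atoms
    over variables only; safe queries (every head var occurs in the body).
    """
    q_head, q_body = q
    p_head, p_body = p
    p_vars = sorted({v for _, args in p_body for v in args})
    q_vars = sorted({v for _, args in q_body for v in args})
    # Try every map h: vars(p) -> vars(q); q ⊆ p iff some h sends each
    # atom of p to an atom of q and maps p's head onto q's head.
    for image in product(q_vars, repeat=len(p_vars)):
        h = dict(zip(p_vars, image))
        body_ok = all((rel, tuple(h[v] for v in args)) in q_body
                      for rel, args in p_body)
        head_ok = tuple(h[v] for v in p_head) == tuple(q_head)
        if body_ok and head_ok:
            return True
    return False

# q(x) :- R(x, y), R(y, x)  is contained in  p(x) :- R(x, y).
q = (("x",), [("R", ("x", "y")), ("R", ("y", "x"))])
p = (("x",), [("R", ("x", "y"))])
assert contained_in(q, p)
```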
Case study: CQC data generation
Sample from the phase transition → label using a solver (the Vampire theorem prover) → multiply the data with label-preserving transformations (augmentation).
Case study: CQC
Proposed an architecture and trained it to high validation accuracy.
[Figure: validation accuracy climbs above 90% while loss falls, over roughly 15 million training samples.]
Case study: CQC — evaluate
Test set accuracy:
• aug (augmented): 0.942
• all-cqc: 0.804
• µ(10, 8): 0.647
Up to a 30% drop from the augmented test set to the real distribution.
In Summary
• Can we use machine learning to approximately solve NP-hard problems?
• It is not enough to worry about the representation power of the network; also worry about the procedure used to generate the data.
• All poly-time data generators result in easier sub-problems.
• And the sub-problem may be very easy.
• We must be careful when we evaluate our models.
THANK YOU!
We will be happy to discuss the work and answer questions.
ygal@cs.technion.ac.il
mgabel@cs.toronto.edu