Neural Program Synthesis Rishabh Singh, Google Brain
Great Collaborators!
Deep Learning and Evolutionary Progression: from perceptual tasks (vision, speech, language) toward algorithmic tasks (programming)
Neural Program Learning: more complex tasks, generalizability, interpretability
Long-Term Vision: an agent to win programming contests [TopCoder]. Building blocks: program representations, fuzzing/security testing, program repair, program optimization
Neural Program Induction
Differentiable Neural Computer [Graves et al. Nature 2016]
Neural RAM [Kurach et al. ICLR 2016]: 14 modules; an LSTM controller chooses modules and their arguments; differentiable semantics
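To train the controller end-to-end, the module choice itself must be differentiable. Below is a minimal sketch of that idea, assuming a toy set of three scalar modules; the real paper's 14 modules operate on distributions over integers, and the logits here stand in for the LSTM controller's output.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Toy stand-ins for the paper's 14 modules (which operate on
# distributions over integers, not scalars).
MODULES = [lambda a, b: a + b, lambda a, b: a * b, lambda a, b: a - b]

def soft_execute(controller_logits, a, b):
    # The controller emits a softmax over modules; the "executed" result
    # is the probability-weighted blend of every module's output, which
    # keeps the whole step differentiable.
    weights = softmax(controller_logits)
    outputs = np.array([m(a, b) for m in MODULES])
    return weights @ outputs

print(soft_execute(np.array([3.0, 0.0, 0.0]), 3, 4))  # close to 7 ("add")
```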
Spectrum of Program Meta-Induction. J. Devlin, R. Bunel, R. Singh, M. Hausknecht, P. Kohli [NIPS 2017]
| Neural Program Induction | Neural Program Synthesis | Meta-Neural Program Synthesis |
|---|---|---|
| Differentiable memory, stack | Functional abstractions | Functional abstractions |
| Difficult to generalize | Generalizes better | Strong generalization |
| Lots of examples | Lots of examples | Few examples |
| Single-task learning | Single-task learning | Multi-task learning |
| Non-interpretable programs | Interpretable programs | Interpretable programs |
| Examples: NTM, DNC, etc. | Examples: QuickSort | |
Neuro-Symbolic Program Synthesis [ICLR 2017] Emilio Parisotto, Abdelrahman Mohamed, Rishabh Singh, Lihong Li, Dengyong Zhou, Pushmeet Kohli
FlashFill in Excel 2013. Gulwani, Harris, Singh [CACM Research Highlight 2012] (* figure taken from Gulwani, Polozov, Singh, NOW 2017)
FlashFill DSL
Example FlashFill Task

| Input (v) | Output |
|---|---|
| William Henry Charles | Charles, W. |
| Michael Johnson | Johnson, M. |
| Barack Rogers | Rogers, B. |
| Martha D. Saunders | Saunders, M. |

Program: Concat(f1, ConstStr(", "), f2, ConstStr("."))
where f1 = SubStr(v, (Word, -1, Start), (Word, -1, End)) and f2 = SubStr(v, CPos(0), CPos(1))
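To make the program's semantics concrete, here is a minimal Python sketch of what this FlashFill program computes; the helper functions are illustrative stand-ins, not the actual FlashFill implementation.

```python
import re

def f1(v):
    # SubStr(v, (Word, -1, Start), (Word, -1, End)): the last word of v.
    return re.findall(r"\w+", v)[-1]

def f2(v):
    # SubStr(v, CPos(0), CPos(1)): the first character of v.
    return v[0]

def program(v):
    # Concat(f1, ConstStr(", "), f2, ConstStr("."))
    return f1(v) + ", " + f2(v) + "."

for v in ["William Henry Charles", "Michael Johnson",
          "Barack Rogers", "Martha D. Saunders"]:
    print(program(v))  # Charles, W. / Johnson, M. / Rogers, B. / Saunders, M.
```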
General Methodology: a sampler draws random programs from the DSL to produce training data, which is used to train a neural model as the synthesizer (see the sketch below)
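A minimal sketch of the data-generation step, assuming a toy expression grammar rather than the real FlashFill DSL: sample random programs from the grammar, run each on random inputs, and keep the (I/O examples, program) pairs as synthetic training data.

```python
import random

GRAMMAR = {
    "S": [["e", "+", "e"]],
    "e": [["x"], ["1"], ["0"]],
}

def sample(symbol="S"):
    # Expand nonterminals recursively with uniformly random productions.
    if symbol not in GRAMMAR:
        return [symbol]                        # terminal
    rhs = random.choice(GRAMMAR[symbol])
    return [tok for part in rhs for tok in sample(part)]

def make_training_example(num_io=4):
    prog = " ".join(sample())
    ios = []
    for _ in range(num_io):
        x = random.randint(0, 9)
        ios.append((x, eval(prog, {"x": x})))  # toy semantics via eval
    return ios, prog

print(make_training_example())  # e.g. ([(4, 5), (7, 8), ...], 'x + 1')
```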
Synthetic Training Data
Real-world Test Data
Neural Architecture: an encoder over the I/O examples feeds a tree decoder that generates the program
Key Idea: Guided Enumeration
CFG/DSL: S -> e + e; e -> x | 1 | 0. Non-terminals = {S, e}; terminals = {x, 1, 0, +}.
A program is derived by a sequence of production choices: for example, applying a1: S -> e + e, then e -> x, then e -> 1 yields f(x) = x + 1. At each step, every applicable production (e -> x, e -> 1, e -> 0, ...) is a candidate action ai.
Key Idea: Guided Enumeration. Problem: how do we assign a probability to each action ai such that the global state of the partial tree is taken into account?
Neural-Guided Enumeration: a neural model f(partial tree, I/O examples) outputs a probability for each candidate expansion; see the search sketch below
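A minimal sketch of the enumeration loop on a toy grammar, assuming a `score` function that stands in for the learned model f: it assigns a probability to each expansion of the current partial tree given the I/O examples.

```python
import heapq, math

GRAMMAR = {"S": [["e", "+", "e"]], "e": [["x"], ["1"], ["0"]]}

def guided_search(score, io_pairs):
    # Best-first search over partial derivations, ordered by the model's
    # (negative log) probability of the expansion sequence so far.
    frontier = [(0.0, ["S"])]
    while frontier:
        neg_logp, toks = heapq.heappop(frontier)
        nts = [i for i, t in enumerate(toks) if t in GRAMMAR]
        if not nts:
            prog = " ".join(toks)            # complete program: test it
            if all(eval(prog, {"x": x}) == y for x, y in io_pairs):
                return prog
            continue
        i = nts[0]                           # expand leftmost nonterminal
        for rhs in GRAMMAR[toks[i]]:
            p = score(toks, io_pairs, (toks[i], rhs))
            child = toks[:i] + rhs + toks[i + 1:]
            heapq.heappush(frontier, (neg_logp - math.log(p), child))
    return None

# With a uniform stand-in model this is plain enumeration; the learned
# model's job is to rank the consistent program first.
uniform = lambda toks, io, action: 1.0 / len(GRAMMAR[action[0]])
print(guided_search(uniform, [(0, 1), (4, 5)]))  # a consistent program, e.g. '1 + x'
```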
Two Key Challenges: (1) representing the partial program tree; (2) representing the I/O examples
Recursive-Reverse-Recursive Neural Network (R3NN)
Recursive pass. Input: distributed representations of each leaf's symbol. Output: a global root representation.
Reverse-recursive pass. Input: the root representation from the recursive pass. Output: globally informed leaf representations.
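A minimal sketch of the two passes, assuming a binary tree, a tanh combiner, and random weight matrices standing in for the learned per-production parameters.

```python
import numpy as np

D = 8                                          # embedding dim (illustrative)
rng = np.random.default_rng(0)
EMBED = {s: rng.standard_normal(D)             # leaf symbol representations
         for s in ["S", "e", "x", "1", "0", "+"]}
W_up = rng.standard_normal((D, 2 * D)) / D     # stand-ins for the learned
W_down = rng.standard_normal((2 * D, D)) / D   # per-production parameters

class Node:
    def __init__(self, symbol, children=()):
        self.symbol, self.children = symbol, list(children)
        self.up = self.down = None

def recursive(node):
    # Bottom-up pass: leaf symbol embeddings -> one global root vector.
    if not node.children:
        node.up = EMBED[node.symbol]
    else:
        kids = np.concatenate([recursive(c) for c in node.children])
        node.up = np.tanh(W_up @ kids)
    return node.up

def reverse_recursive(node, parent_down=None):
    # Top-down pass: root vector -> globally informed leaf vectors.
    node.down = node.up if parent_down is None else parent_down
    if node.children:
        halves = np.split(np.tanh(W_down @ node.down), 2)
        for child, d in zip(node.children, halves):
            reverse_recursive(child, d)

tree = Node("S", [Node("x"), Node("1")])       # binary tree for simplicity
recursive(tree)
reverse_recursive(tree)
# Every leaf's .down vector now reflects the whole tree; expansion scores
# are computed by comparing leaf vectors against production embeddings.
```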
Conditioning on I/O Examples
1. An LSTM produces a vector for each I/O pair.
2. All per-pair LSTM vectors are combined into a single conditioning vector c.
3. The R3NN takes c as additional input when generating the program tree.
The whole model is trained end-to-end.
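A minimal sketch of steps 1-3, with a simple character featurizer standing in for the per-pair LSTM and mean pooling as one reasonable way to combine the pair vectors.

```python
import numpy as np

def encode_pair(inp, out, D=8):
    # Hypothetical featurizer standing in for the per-pair LSTM encoder.
    v = np.zeros(D)
    for i, ch in enumerate(inp + "|" + out):
        v[i % D] += ord(ch) / 128.0
    return np.tanh(v)

def conditioning_vector(io_pairs):
    # Combine the per-pair vectors into one conditioning vector c.
    return np.mean([encode_pair(i, o) for i, o in io_pairs], axis=0)

c = conditioning_vector([("William Henry Charles", "Charles, W."),
                         ("Michael Johnson", "Johnson, M.")])
# c is appended to the R3NN's inputs at every expansion step, and the
# encoder and R3NN are trained jointly, end-to-end.
```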
Cross-Correlation Encoder
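The cross-correlation encoder is designed to expose substring relationships between input and output. Here is a minimal discrete sketch of the intuition, counting character matches at every relative shift; the actual encoder correlates the two learned LSTM representations rather than raw characters.

```python
import numpy as np

def cross_correlate(inp, out):
    # For every relative offset of `out` against `inp`, count positions
    # where the characters agree; peaks reveal where the output occurs
    # as a substring of the input.
    feats = []
    for shift in range(-len(out) + 1, len(inp)):
        feats.append(sum(1 for j, ch in enumerate(out)
                         if 0 <= shift + j < len(inp)
                         and inp[shift + j] == ch))
    return np.array(feats)

scores = cross_correlate("Michael Johnson", "Johnson")
print(scores.argmax() - (len("Johnson") - 1))  # 8: "Johnson" starts at index 8
```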
Synthetic Data Results (programs with fewer than 13 AST nodes)
FlashFill Benchmarks. Next steps: batching trees to scale to larger programs; R3NN for contextual program embeddings
RobustFill [ICML 2017]. J. Devlin, J. Uesato, S. Bhupatiraju, R. Singh, A. Mohamed, P. Kohli
Multiple I/O Examples
Extended DSL
92% Generalization Accuracy
Robustness with Noise
Incorrect Generalization
Program Induction Model
Induction vs Synthesis
Other Synthesis Domains
More complex DSLs: FlashFill (functional); Karel (imperative with control flow); Python & R scripts (stateful variables); grammar learning (CFGs & CSGs)
Specification modalities: natural language (NL2Excel); partial programs (sketching)
Karel the Robot: given an input grid and an output grid, synthesize the program that transforms one into the other
Karel DSL
Synthesis Architecture: CNNs encode the input/output grids; an LSTM decoder emits the program tokens
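A minimal PyTorch sketch of this architecture; the layer sizes, grid channels, and vocabulary size are illustrative, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class KarelSynthesizer(nn.Module):
    def __init__(self, grid_channels=16, vocab=50, hidden=256):
        super().__init__()
        # CNN encoder over the stacked input/output grids.
        self.encoder = nn.Sequential(
            nn.Conv2d(2 * grid_channels, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.embed = nn.Embedding(vocab, hidden)
        # LSTM decoder conditioned on the grid encoding at every step.
        self.decoder = nn.LSTM(hidden + 64, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, in_grid, out_grid, tokens):
        enc = self.encoder(torch.cat([in_grid, out_grid], dim=1))
        enc = enc.unsqueeze(1).expand(-1, tokens.size(1), -1)
        dec, _ = self.decoder(torch.cat([self.embed(tokens), enc], dim=2))
        return self.out(dec)   # per-step logits over DSL tokens
```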
Supervised Learning

| | Top-1 (%) | Top-5 (%) |
|---|---|---|
| Supervised | 71.91 | 80.00 |
Multiple Consistent Programs: many distinct programs can be consistent with the same input/output grids, which makes exact match against a single reference program a noisy training target
Reinforcement Learning
1. First train with supervised learning.
2. Sample a program from the model.
3. Run the program on the I/O examples.
4. Give a positive reward if the output matches.

| | Top-1 (%) | Top-5 (%) |
|---|---|---|
| Supervised | 71.91 | 80.00 |
| REINFORCE | 71.99 | 74.11 |
| Beam REINFORCE | 77.68 | 82.73 |
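A minimal sketch of one policy-gradient update (steps 2-4), assuming hypothetical helpers `sample_program` (autoregressive sampling that also returns per-token log-probs) and `run_karel` (executes a program on a grid); neither is part of any real library here.

```python
import torch

def reinforce_step(model, optimizer, io_pairs):
    # Step 2: sample a program, keeping each token's log-probability.
    program, log_probs = sample_program(model, io_pairs)   # hypothetical
    # Steps 3-4: execute on every I/O pair; reward 1 only on a full match.
    reward = float(all(run_karel(program, i) == o          # hypothetical
                       for i, o in io_pairs))
    # REINFORCE: scale the sequence log-likelihood by the reward.
    loss = -reward * torch.stack(log_probs).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return reward
```

Roughly, the "Beam REINFORCE" row in the table estimates this gradient from the programs on the decoder's beam rather than from single samples.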
Stanford CS106A Test: 7/16 problems solved (about 44%); comparison of the neural and symbolic approaches
Neural Program Representations for Software Engineering Applications
Fuzzing for Security Bugs: mutate a seed to produce a random input, execute the binary on it, and look for crashes; coverage-guided fuzzing (e.g., AFL) keeps mutated inputs that reach new code
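A minimal sketch of this loop, assuming a hypothetical `run_with_coverage` helper that executes the target binary on an input and reports whether it crashed and which branches it covered (AFL obtains this via compile-time instrumentation):

```python
import random

def mutate(data):
    # Seed mutation: flip a few random bytes.
    data = bytearray(data)
    for _ in range(random.randint(1, 8)):
        data[random.randrange(len(data))] = random.randrange(256)
    return bytes(data)

def fuzz(seed, iterations=10_000):
    corpus, seen = [seed], set()
    for _ in range(iterations):
        candidate = mutate(random.choice(corpus))
        crashed, coverage = run_with_coverage("./target", candidate)  # hypothetical
        if crashed:
            print("Crash!", candidate[:32])
        elif not coverage <= seen:
            corpus.append(candidate)     # new coverage: keep this input
            seen |= coverage
    return corpus
```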
Neural Grammar-Based Fuzzing: more coverage, more bugs! Patrice Godefroid, Hila Peleg, Rishabh Singh. Learn&Fuzz: Machine Learning for Input Fuzzing. ASE 2017
Learning Where to Fuzz: identify useful bytes from past fuzzing runs, yielding more coverage and more crashes. Mohit Rajpal, William Blum, Rishabh Singh. Not all bytes are equal: Neural byte sieve for fuzzing.
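A minimal sketch of the idea with a frequency count standing in for the learned model: past runs record which byte positions, when mutated, produced new coverage, and future mutations are biased toward those positions.

```python
import random
from collections import Counter

def position_scores(history, length):
    # `history` holds, per past run, the byte positions whose mutation
    # gained coverage; a frequency count replaces the learned model here.
    counts = Counter(pos for gained in history for pos in gained)
    return [counts[i] + 1e-6 for i in range(length)]

def biased_mutate(data, scores):
    # Spend the mutation budget where past fuzzing found useful bytes.
    data = bytearray(data)
    pos = random.choices(range(len(data)), weights=scores, k=1)[0]
    data[pos] = random.randrange(256)
    return bytes(data)

history = [{3, 7}, {7}, {12}]            # example past-run records
scores = position_scores(history, length=16)
print(biased_mutate(b"A" * 16, scores))  # byte 7 is most likely mutated
```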
Neural Program Repair Sahil Bhatia, Rishabh Singh. Automated Correction for Syntax Errors in Programming Assignments using RNNs.
Neural Programmer. Rishabh Singh, rising@google.com
Long-term vision: an agent to win programming contests [TopCoder]
Specifications: input/output examples, natural language, partial programs
Neural architectures for program and spec representation
Neural Synthesis [ICLR 2017, ICML 2017] | Program Induction [NIPS 2017] | Neural Repair [ICSE 2018, ICLRW 2018] | Neural Fuzzing [ASE 2017, arXiv 2017]