Neural Program Synthesis from Diverse Demonstration Videos
Shao-Hua Sun* 1, Hyeonwoo Noh* 2, Sriram Somasundaram 1, Joseph J. Lim 1
1 University of Southern California, 2 Pohang University of Science and Technology
*Equal contribution
A program is an interpretable and executable way to describe behaviors
(Illustrations: Human, Robot)
Program Synthesis
"Hey, can you…"
Task specification → Program
Task specification
• Input/output pairs
• Demonstration sequences
Program
def run() if frontIsClear(): move() else: turnLeft() move() turnLeft repeat(2): turnRight() putMarker()
Devlin et al. "RobustFill: Neural program learning under noisy I/O." ICML 2017
Balog et al. "DeepCoder: Learning to write programs." ICLR 2017
Bunel et al. "Leveraging grammar and reinforcement learning for neural program synthesis." ICLR 2018
Problem Formulation
Input: a set of demo videos
Output: a program describing the demonstrated behavior
Diagram: demo videos → ? → Program
Challenges
• Extracting unique behaviors in each demo (Reviewer module)
• Summarizing diverse behaviors as a program (Relation module)
Model Overview
Demos → Encoder (one per demo) → Reviewer Module → Demo features → Relation Module → Program vector → Decoder → Program
Stages
• Extract unique behaviors (Encoder, Reviewer Module)
• Summarize (Relation Module)
• Decode (Decoder)
Reviewer Module
Encoder: each demo (Demo 1, Demo 2, …, Demo k) is a sequence of frames t=1, t=2, …, t=T; each frame passes through a CNN, and the frame features pass through an LSTM
Overall tendency: Demo 1, Demo 2, …, Demo k → Overall Tendency
Review each demo: Overall Tendency + encoded Demo 1 → Reviewer LSTM → Demo feature 1 (and likewise for Demo 2, …, Demo k)
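The two passes of the reviewer module (first form an overall tendency from all demos, then review each demo against it) can be illustrated with a deliberately simplified toy. This is a sketch under loud assumptions, not the model from the slides: mean pooling stands in for the CNN + LSTM encoder, the overall tendency is assumed to be a mean over encoded demos, and a subtraction stands in for the reviewer LSTM. All function names are hypothetical.

```python
from typing import List

Vector = List[float]

def mean(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def encode_demo(frames: List[Vector]) -> Vector:
    """Stand-in for the CNN + LSTM encoder: pool frame features over time."""
    return mean(frames)

def review_demo(frames: List[Vector], tendency: Vector) -> Vector:
    """Stand-in for the reviewer LSTM: re-read one demo given the overall
    tendency and keep what differs from it (an assumption of this toy)."""
    encoded = encode_demo(frames)
    return [e - t for e, t in zip(encoded, tendency)]

def reviewer_module(demos: List[List[Vector]]) -> List[Vector]:
    # Pass 1: encode every demo, then pool into an overall tendency (assumed: mean).
    tendency = mean([encode_demo(d) for d in demos])
    # Pass 2: review each demo against that tendency to get one feature per demo.
    return [review_demo(d, tendency) for d in demos]

demos = [
    [[1.0, 0.0], [1.0, 0.0]],  # demo 1: two frames of 2-d features
    [[0.0, 1.0], [0.0, 1.0]],  # demo 2
]
demo_features = reviewer_module(demos)
```

The point of the toy is the data flow: every demo feature depends on all k demos through the shared tendency, which is what lets a per-demo feature emphasize what is unique to that demo.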
Relation Module
Compare demo pairs to infer branching conditions
Example: Demo 1, Demo 2 → Reviewer Module → if frontIsClear(): move() else: turnLeft() …
Demo pairs (from the Reviewer Module) → g_θ applied to each pair → Program vector
Santoro et al. "A simple neural network module for relational reasoning." NIPS 2017
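The pairwise aggregation in the relation module can be sketched in a few lines. This is a minimal illustration under assumptions: `g_theta` here is a hypothetical hand-written function (a real model would use a learned network), and the pair outputs are averaged, which is one common choice; the slides only show that g_θ is applied to demo pairs and the results are combined into a program vector.

```python
from itertools import product
from typing import Callable, List

Vector = List[float]

def g_theta(a: Vector, b: Vector) -> Vector:
    """Hypothetical pairwise function; a real model would use a learned MLP."""
    return [abs(x - y) for x, y in zip(a, b)]

def relation_module(features: List[Vector],
                    g: Callable[[Vector, Vector], Vector] = g_theta) -> Vector:
    """Apply g to every ordered pair of demo features and average the results."""
    pair_outputs = [g(a, b) for a, b in product(features, repeat=2)]
    n = len(pair_outputs)
    return [sum(col) / n for col in zip(*pair_outputs)]

features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
program_vector = relation_module(features)
```

Because the aggregation runs over all pairs, the resulting vector does not depend on the order in which the demos are supplied, and its size does not depend on the number of demos.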
Decode a Program
Program vector → LSTM → program tokens (def run(), move(), if, …, <end>)
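Token-by-token decoding that stops at `<end>` can be sketched as a greedy loop. This is an illustrative sketch only: `scripted_step` is a hypothetical stand-in for one LSTM step followed by picking the most likely token, and the `max_len` cap is an added safeguard that is not taken from the slides.

```python
from typing import Callable, List, Sequence

END = "<end>"

def greedy_decode(step: Callable[[Sequence[float], List[str]], str],
                  program_vector: Sequence[float],
                  max_len: int = 50) -> List[str]:
    """Emit tokens one at a time until the step function returns END."""
    tokens: List[str] = []
    for _ in range(max_len):
        token = step(program_vector, tokens)
        if token == END:
            break
        tokens.append(token)
    return tokens

def scripted_step(program_vector: Sequence[float], prefix: List[str]) -> str:
    """Hypothetical stand-in for one LSTM step plus argmax over the vocabulary."""
    script = ["def", "run()", "move()", END]
    return script[len(prefix)]

tokens = greedy_decode(scripted_step, [0.0], max_len=10)
```

In a trained decoder the step function would condition on the program vector and on the tokens emitted so far; here the vector is passed through but ignored to keep the example self-contained.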
Experiments
Environments
Karel
def run() if frontIsClear(): move() else: turnLeft() move() turnLeft repeat(2): turnRight() putMarker()
ViZDoom
def run() while(inTarget(HellKnight)): attack() moveForward() if isThere(Demon): moveRight() else: moveLeft() moveBackward()
Richard E. Pattis. "Karel the robot: a gentle introduction to the art of programming." John Wiley & Sons, Inc., 1981
Kempka et al. "ViZDoom: A Doom-based AI research platform for visual reinforcement learning." In CIG, 2016
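To make "executable" concrete, here is a minimal Karel-style interpreter. It is a sketch under assumptions, not the environment used in the experiments: the grid, the wall handling, and the nested-tuple program encoding are all hypothetical, and the example program is an illustrative one whose nesting is chosen here (the nesting of the slide's program is not recoverable from the text). Only the action and perception names (`move`, `turnLeft`, `turnRight`, `putMarker`, `frontIsClear`) come from the slides.

```python
# Directions in clockwise order: north, east, south, west as (dx, dy).
DIRS = [(0, -1), (1, 0), (0, 1), (-1, 0)]

class Karel:
    def __init__(self, width, height, x=0, y=0, facing=1):
        self.width, self.height = width, height
        self.x, self.y, self.facing = x, y, facing
        self.markers = set()

    def frontIsClear(self):
        dx, dy = DIRS[self.facing]
        return 0 <= self.x + dx < self.width and 0 <= self.y + dy < self.height

    def move(self):
        if self.frontIsClear():  # moving into a wall is a no-op in this sketch
            dx, dy = DIRS[self.facing]
            self.x, self.y = self.x + dx, self.y + dy

    def turnLeft(self):
        self.facing = (self.facing - 1) % 4

    def turnRight(self):
        self.facing = (self.facing + 1) % 4

    def putMarker(self):
        self.markers.add((self.x, self.y))

def run(program, agent):
    """Execute a program given as a nested list of statements."""
    for stmt in program:
        if isinstance(stmt, str):        # primitive action
            getattr(agent, stmt)()
        elif stmt[0] == "if":            # ("if", condition, then_body, else_body)
            _, cond, then_body, else_body = stmt
            run(then_body if getattr(agent, cond)() else else_body, agent)
        elif stmt[0] == "repeat":        # ("repeat", count, body)
            _, count, body = stmt
            for _ in range(count):
                run(body, agent)

program = [
    ("if", "frontIsClear", ["move"], ["turnLeft"]),
    ("repeat", 2, ["turnRight", "putMarker"]),
]

open_start = Karel(3, 3, x=0, y=0, facing=1)     # facing east with room to move
run(program, open_start)
blocked_start = Karel(3, 3, x=2, y=0, facing=1)  # facing east against the wall
run(program, blocked_start)
```

Running the same program from the two start states takes different branches, which is why a set of diverse demonstrations is needed to reveal the full program.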
Baselines
• Program synthesis baseline (without reviewer and relation modules): Demos → Encoders → Average vector → Decoder → Program
• Program induction baseline: Demos → Encoders → Average vector; Input: a testing demo frame (State s); Average vector + State s → Decoder → Action a (e.g., move())
Example Result: Karel
Ground truth
def run(): if frontIsClear(): move() else: turnLeft() move() repeat(2): turnRight() putMarker()
Synthesis baseline (misses the if-else statement)
def run(): move() move() turnRight() putMarker() turnRight() putMarker()
Ours
def run(): if frontIsClear(): move() else: turnLeft() move() turnRight() putMarker() turnRight() putMarker()
Example Result: ViZDoom
Ground truth
def run(): if inTarget(Demon): attack() moveLeft() else: moveRight() if isThere(Demon): attack() moveLeft()
Synthesis baseline
def run(): while(inTarget(HellKnight)): attack() if isThere(Demon): moveRight() attack() else: moveLeft()
Ours
def run(): if inTarget(Demon): attack() moveLeft() else: moveRight() if isThere(Demon): attack() moveLeft()
Quantitative Result: Infer Programs
Sequence Accuracy
• Measure the accuracy based on code sequences
Example 1
Synthesized program: def run(): if A(): x() else: y()
Ground truth program: def run(): if not A(): y() else: x()
Example 2
Synthesized program: def run(): if A(): x() else: y()
Ground truth program: def run(): if A(): x() else: y()
Example 3
Synthesized program: def run(): if A(): x() else: while(B()): y() z()
Ground truth program: def run(): if A(): x() else: repeat(5): y() z()
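A code-sequence comparison can be sketched as an exact token match. This is an assumption-level illustration: the slide says only that accuracy is measured on code sequences, so the whitespace tokenization and the strict equality test are choices made here, and the three program pairs are one reading of the flattened slide text.

```python
def tokenize(program: str):
    """Split a program string into whitespace-separated tokens."""
    return program.split()

def sequence_match(synthesized: str, ground_truth: str) -> bool:
    """True only when the two token sequences are identical."""
    return tokenize(synthesized) == tokenize(ground_truth)

def sequence_accuracy(pairs) -> float:
    """Fraction of (synthesized, ground truth) pairs whose code matches exactly."""
    return sum(sequence_match(s, g) for s, g in pairs) / len(pairs)

pairs = [
    ("def run(): if A(): x() else: y()",
     "def run(): if not A(): y() else: x()"),
    ("def run(): if A(): x() else: y()",
     "def run(): if A(): x() else: y()"),
    ("def run(): if A(): x() else: while(B()): y() z()",
     "def run(): if A(): x() else: repeat(5): y() z()"),
]
matches = [sequence_match(s, g) for s, g in pairs]
accuracy = sequence_accuracy(pairs)
```

Under this strict comparison the first pair counts as a mismatch even though the two programs behave identically, since swapping the branches and negating the condition changes the code but not the behavior.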