Neural Program Synthesis from Diverse Demonstration Videos

  1. Neural Program Synthesis from Diverse Demonstration Videos. Shao-Hua Sun*¹, Hyeonwoo Noh*², Sriram Somasundaram¹, Joseph J. Lim¹. ¹University of Southern California, ²Pohang University of Science and Technology. *Equal contribution

  2. A program is an interpretable and executable way to describe behaviors

  3. A program is an interpretable and executable way to describe behaviors [illustration: human]

  4. A program is an interpretable and executable way to describe behaviors [illustration: human, robot]

  5. Program Synthesis Hey, can you…

  6. Program Synthesis. Task specification: input/output pairs ( , ), ( , ). References: Devlin et al. “RobustFill: Neural program learning under noisy I/O.” ICML 2017; Balog et al. “DeepCoder: Learning to write programs.” ICLR 2017; Bunel et al. “Leveraging grammar and reinforcement learning for neural program synthesis.” ICLR 2018. (Section: Motivation)

  7. Program Synthesis. Task specification: input/output pairs ( , ), ( , ). Program: def run(): if frontIsClear(): move() else: turnLeft() move() turnLeft() repeat(2): turnRight() putMarker(). References: Devlin et al. “RobustFill: Neural program learning under noisy I/O.” ICML 2017; Balog et al. “DeepCoder: Learning to write programs.” ICLR 2017; Bunel et al. “Leveraging grammar and reinforcement learning for neural program synthesis.” ICLR 2018

  9. Program Synthesis. Task specification: input/output pairs ( , ), ( , ); demonstration sequences (…). Program: def run(): if frontIsClear(): move() else: turnLeft() move() turnLeft() repeat(2): turnRight() putMarker()

  10. Problem Formulation. Input: a set of demo videos. Diagram: demo videos → ? → Program

  11. Problem Formulation. Input: a set of demo videos. Output: a program describing the demonstrated behavior

  12. Problem Formulation. Input: a set of demo videos. Output: a program describing the demonstrated behavior. Challenges: • Extracting unique behaviors in each demo • Summarizing diverse behaviors as a program

  13. Problem Formulation. Input: a set of demo videos. Output: a program describing the demonstrated behavior. Challenges: • Extracting unique behaviors in each demo (Reviewer module) • Summarizing diverse behaviors as a program (Relation module)

  14. Model Overview. Demos → Encoders with Reviewer Module → demo features → Relation Module → program vector → Decoder → Program. Stage highlighted: extract unique behaviors

  15. Model Overview. Demos → Encoders with Reviewer Module → demo features → Relation Module → program vector → Decoder → Program. Stages highlighted: extract unique behaviors, summarize

  16. Model Overview. Demos → Encoders with Reviewer Module → demo features → Relation Module → program vector → Decoder → Program. Stages highlighted: extract unique behaviors, summarize, decode

  18. Reviewer Module. Encoder: each frame of Demo 1 (t=1, t=2, …, t=T) goes through a CNN, and an LSTM runs over the resulting frame features

  19. Reviewer Module. Encoder: Demo 2 (t=1, t=2, …, t=T) is encoded the same way, CNN per frame followed by the LSTM

  20. Reviewer Module. Encoder: every demo up to Demo k is encoded the same way

  21. Reviewer Module. The encodings of Demo 1, Demo 2, …, Demo k give the overall tendency

  22. Reviewer Module. Review each demo: with the overall tendency available, the frames of Demo 1 (t=1, t=2, …, t=T) are processed again by the CNN and LSTM

  23. Reviewer Module. The LSTM that reviews each demo using the overall tendency is the reviewer LSTM

  24. Reviewer Module. Demo 1 frames (t=1, t=2, …, t=T) → CNN → reviewer LSTM, which also receives the overall tendency

  25. Reviewer Module. For Demo 1, the reviewer LSTM outputs demo feature 1
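
The two-pass data flow of the reviewer slides can be sketched in a few lines of plain Python. This is only a toy illustration under stated assumptions: the CNN and LSTM are swapped for a single scalar recurrent cell (`cell`), each frame is reduced to one number, and the overall tendency is taken to be the mean of the first-pass encodings, which the slides do not specify. The names `cell`, `encode`, and `review` are hypothetical.

```python
import math


def cell(h, x):
    # Toy recurrent update standing in for the CNN + LSTM on the slides.
    return math.tanh(0.5 * h + 0.5 * x)


def encode(frames, h0=0.0):
    # Run the recurrent cell over one demo, starting from state h0.
    h = h0
    for x in frames:
        h = cell(h, x)
    return h


def review(demos):
    # First pass: encode every demo independently.
    first_pass = [encode(frames) for frames in demos]
    # Assumption: the overall tendency is the mean of the first-pass encodings.
    tendency = sum(first_pass) / len(first_pass)
    # Second pass: re-read each demo, starting from the overall tendency.
    return [encode(frames, h0=tendency) for frames in demos]


demo_features = review([[1.0, 0.0], [0.0, 1.0]])
```

Because the second pass starts from a summary of all demos, the feature computed for one demo changes when the other demos change, which is the property the reviewer is meant to provide.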

  26. Model Overview. Demos → Encoders with Reviewer Module → demo features

  28. Relation Module. Demo 1 and Demo 2 pass through the Reviewer Module. Program fragment shown: if frontIsClear(): move() else: turnLeft() …

  29. Relation Module. Demo 1 and Demo 2 pass through the Reviewer Module. Program fragment shown: if frontIsClear(): move() else: turnLeft() … Compare demo pairs to infer branching conditions

  30. Relation Module. Demo pairs formed from the Reviewer Module outputs are fed to the Relation Module. Santoro et al. “A simple neural network module for relational reasoning.” NIPS 2017

  31. Relation Module. Each demo pair goes through g_θ, and the g_θ outputs are combined into the program vector. Santoro et al. “A simple neural network module for relational reasoning.” NIPS 2017
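
The pairwise structure of the relation module can be illustrated with a small plain-Python sketch. It is a sketch under assumptions, not the authors' implementation: `g_theta` here is a single linear layer with ReLU and hand-picked weights, all ordered pairs (including a demo paired with itself) are used, and the pair outputs are averaged. The slides only show that pairs go through g_θ and are combined into a program vector.

```python
from itertools import product


def g_theta(feat_a, feat_b, weights):
    # Toy pairwise function: concatenate two demo features, apply a
    # linear map and a ReLU. The real g_theta is a learned network.
    pair = feat_a + feat_b
    return [max(0.0, sum(w * v for w, v in zip(row, pair))) for row in weights]


def relation_module(demo_features, weights):
    # Assumption: use every ordered pair of demos and average the outputs.
    outputs = [g_theta(a, b, weights) for a, b in product(demo_features, repeat=2)]
    dim = len(weights)
    return [sum(out[i] for out in outputs) / len(outputs) for i in range(dim)]


# Three 2-d demo features and a hypothetical 2x4 weight matrix.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
program_vector = relation_module(features, weights)
```

Averaging over all ordered pairs makes the program vector independent of the order in which the demos are listed, so the summary depends only on the set of demonstrations.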

  33. Decode a Program. Program vector → LSTM, which emits program tokens one at a time (def, run(), move(), if, and so on) and finishes with <end>
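
Token-by-token decoding can be sketched as a simple loop. The sketch below assumes greedy selection of the highest-scoring token and a `<start>` token to begin with; neither is stated on the slide. `greedy_decode`, `step`, and `toy_step` are hypothetical names, and `toy_step` replaces the LSTM with a scripted scorer so the example runs on its own.

```python
def greedy_decode(program_vector, step, max_len=20):
    # `step(state, prev_token)` returns (new_state, {token: score}).
    state, token, tokens = program_vector, "<start>", []
    for _ in range(max_len):
        state, scores = step(state, token)
        token = max(scores, key=scores.get)  # pick the best-scoring token
        if token == "<end>":
            break
        tokens.append(token)
    return tokens


SCRIPT = ["def", "run()", "move()", "<end>"]


def toy_step(state, prev_token):
    # Stand-in for the decoder LSTM: the state is just a position in SCRIPT.
    scores = {tok: 0.0 for tok in SCRIPT}
    scores[SCRIPT[state]] = 1.0
    return min(state + 1, len(SCRIPT) - 1), scores


tokens = greedy_decode(0, toy_step)
```

In a trained model the state would be initialized from the program vector and the scores would come from the network's output layer; the loop itself would stay the same.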

  35. Experiments

  36. Environments. Karel: def run(): if frontIsClear(): move() else: turnLeft() move() turnLeft() repeat(2): turnRight() putMarker() (Richard E. Pattis. “Karel the robot: a gentle introduction to the art of programming.” John Wiley & Sons, Inc., 1981). ViZDoom: def run(): while(inTarget(HellKnight)): attack() moveForward() if isThere(Demon): moveRight() else: moveLeft() moveBackward() (Kempka et al., “ViZDoom: A Doom-based AI research platform for visual reinforcement learning.” In CIG, 2016)
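
To see why one program yields diverse demonstrations, it helps to execute the Karel example from different start states. The following is a minimal sketch with several assumptions: the `World` class is a hypothetical stand-in for the Karel simulator, a blocked `move` is treated as a no-op, the action trace stands in for video frames, and the nesting of the program (which the flattened slide text does not preserve) is a guess.

```python
from dataclasses import dataclass

# Headings as (dx, dy): east, north, west, south. Turning left steps forward.
HEADINGS = [(1, 0), (0, 1), (-1, 0), (0, -1)]


@dataclass
class World:
    width: int
    height: int
    x: int = 0
    y: int = 0
    heading: int = 0  # index into HEADINGS

    def __post_init__(self):
        self.markers = set()
        self.trace = []  # action names, standing in for demo frames

    def front_is_clear(self):
        dx, dy = HEADINGS[self.heading]
        nx, ny = self.x + dx, self.y + dy
        return 0 <= nx < self.width and 0 <= ny < self.height

    def move(self):
        # Simplification: a blocked move leaves the agent in place.
        if self.front_is_clear():
            dx, dy = HEADINGS[self.heading]
            self.x, self.y = self.x + dx, self.y + dy
        self.trace.append("move")

    def turn_left(self):
        self.heading = (self.heading + 1) % 4
        self.trace.append("turnLeft")

    def turn_right(self):
        self.heading = (self.heading - 1) % 4
        self.trace.append("turnRight")

    def put_marker(self):
        self.markers.add((self.x, self.y))
        self.trace.append("putMarker")


def run(world):
    # Assumed nesting of the Karel program shown on the slide.
    if world.front_is_clear():
        world.move()
    else:
        world.turn_left()
    world.move()
    world.turn_left()
    for _ in range(2):  # repeat(2)
        world.turn_right()
        world.put_marker()
    return world.trace


demo_open = run(World(3, 3))          # front is clear at the start
demo_blocked = run(World(3, 3, x=2))  # agent starts facing the east wall
```

The two traces differ only where the `if`/`else` branch is taken, which is the kind of difference between demonstrations that the model has to pick up in order to recover the branching condition.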

  37. Baselines • Program synthesis baseline • Program induction baseline. Program synthesis baseline (without reviewer and relation modules): Demos → Encoders → average vector → Decoder → Program

  38. Baselines • Program synthesis baseline • Program induction baseline. Program induction baseline: Demos → Encoders → average vector; input: a testing demo frame (state s); Decoder output: action a, for example move()

  39. Example Result: Karel. Ground truth: def run(): if frontIsClear(): move() else: turnLeft() move() repeat(2): turnRight() putMarker(). Synthesis baseline: def run(): move() move() turnRight() putMarker() turnRight() putMarker(). The baseline misses the if-else statement. (Section: Evaluation Metrics)

  40. Example Result: Karel. Ground truth: def run(): if frontIsClear(): move() else: turnLeft() move() repeat(2): turnRight() putMarker(). Synthesis baseline: def run(): move() move() turnRight() putMarker() turnRight() putMarker(). Ours: def run(): if frontIsClear(): move() else: turnLeft() move() turnRight() putMarker() turnRight() putMarker()

  41. Example Result: ViZDoom. Ground truth: def run(): if inTarget(Demon): attack() moveLeft() else: moveRight() if isThere(Demon): attack() moveLeft(). Synthesis baseline: def run(): while(inTarget(HellKnight)): attack() if isThere(Demon): moveRight() attack() else: moveLeft()

  42. Example Result: ViZDoom. Ground truth: def run(): if inTarget(Demon): attack() moveLeft() else: moveRight() if isThere(Demon): attack() moveLeft(). Synthesis baseline: def run(): while(inTarget(HellKnight)): attack() if isThere(Demon): moveRight() attack() else: moveLeft(). Ours: def run(): if inTarget(Demon): attack() moveLeft() else: moveRight() if isThere(Demon): attack() moveLeft()

  43. Quantitative Result: Infer Programs Environments and Results

  44. Sequence Accuracy • Measure the accuracy based on code sequences. Example 1, synthesized program: def run(): if A(): x() else: y(); ground truth program: def run(): if not A(): y() else: x(). Example 2, synthesized program: def run(): if A(): x() else: y(); ground truth program: def run(): if A(): x() else: y(). Example 3, synthesized program: def run(): if A(): x() else: while(B()): y() z(); ground truth program: def run(): if A(): x() else: repeat(5): y() z()
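
A comparison of code sequences can be sketched as an exact match on tokens. The sketch below assumes whitespace tokenization and uses hypothetical helper names (`tokenize`, `sequence_match`, `sequence_accuracy`); the three pairs are the examples from the slide, written on one line each.

```python
def tokenize(program):
    # Assumption: tokens are whitespace-separated pieces of the program text.
    return program.split()


def sequence_match(synthesized, ground_truth):
    # A pair counts as correct only if the token sequences are identical.
    return tokenize(synthesized) == tokenize(ground_truth)


def sequence_accuracy(pairs):
    # Fraction of (synthesized, ground truth) pairs that match exactly.
    return sum(sequence_match(s, g) for s, g in pairs) / len(pairs)


pairs = [
    ("def run(): if A(): x() else: y()",
     "def run(): if not A(): y() else: x()"),
    ("def run(): if A(): x() else: y()",
     "def run(): if A(): x() else: y()"),
    ("def run(): if A(): x() else: while(B()): y() z()",
     "def run(): if A(): x() else: repeat(5): y() z()"),
]
accuracy = sequence_accuracy(pairs)
```

Only the second pair matches. The first pair describes the same behavior when executed (the condition and the branches are both swapped) yet still counts as a mismatch, so a purely sequence-based metric can undercount correct programs.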
