Learning from Language
Jacob Andreas

Doing things with language: “Who is left of the truck?” “Go up, then go left.” “The hooded oriole is a large bird with black wings.” “A man with a white shirt and […]”


  1. NMNs and strong generalization [ARDK16a]. Train: “Is there anything left of a circle?”, “Is there anything above a circle?” Test: “Is there anything above and left of a circle?” Bar chart (axis from 50 to 100) comparing CNN + RNN with NMN; values shown: 90.6 and 76.5.

  2. NMNs for other tasks.
     Database question answering [A, Rohrbach, Darrell, Klein; 16b]: “Is Key Largo an island?” Table columns (name, type, coastal, island): Columbia, city, no, no; Cooper, river, yes, no; Charleston, city, yes, no.
     Referring expressions [Hu, Rohrbach, A, Darrell, Saenko; 17] [Cirik, Berg-Kirkpatrick, Morency; 18] [Yu, Lin, Shen, Yang, Lu, Bansal, Berg; 18]: “man in sunglasses walking towards two men”.
     Statement verification [Suhr, Lewis, Artzi; 17]: “There is exactly one black triangle not touching any edge.”

  3. Lessons. Linguistic structure lets us learn composable neural modules from weak supervision. These modules allow us to more accurately interpret new statements, questions and references. (Diagram labels: exists, and, above, red, circle ↦ yes.)

  4. Language; Learning; Reasoning & Belief. A, Klein & Levine. Modular Multitask Reinforcement Learning […]. ICML 17.

  5. Learning classifiers. “Is there a red shape above a circle?” (modules: exists, and, red, above, circle; answer: yes). “What color is the shape right of a circle?” (modules: color, right, circle; answer: blue).
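
The two questions on slide 5 can be illustrated with a toy, purely symbolic stand-in for module composition. This is a minimal sketch under stated assumptions: the modules below are hand-written set filters rather than learned neural modules, and the `Shape` class, the scene contents, and the way the modules are nested are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    kind: str
    color: str
    row: int  # 0 is the top row
    col: int  # 0 is the leftmost column


def find(scene, kind):
    """Select every shape of the given kind."""
    return {s for s in scene if s.kind == kind}


def has_color(scene, color):
    """Select every shape of the given color."""
    return {s for s in scene if s.color == color}


def above(scene, anchors):
    """Select shapes in a higher row than at least one anchor (simplified)."""
    return {s for s in scene if any(s.row < a.row for a in anchors)}


def right_of(scene, anchors):
    """Select shapes in a column to the right of at least one anchor."""
    return {s for s in scene if any(s.col > a.col for a in anchors)}


def both(a, b):
    """Intersect two selections (the 'and' module)."""
    return a & b


def exists(selected):
    return "yes" if selected else "no"


def color_of(selected):
    colors = {s.color for s in selected}
    return colors.pop() if len(colors) == 1 else "unknown"


# Hypothetical scene, chosen so the answers match the ones on the slide.
scene = {
    Shape("square", "red", row=0, col=0),
    Shape("circle", "green", row=1, col=0),
    Shape("triangle", "blue", row=1, col=1),
}

circles = find(scene, "circle")
# "Is there a red shape above a circle?"
answer_1 = exists(both(has_color(scene, "red"), above(scene, circles)))
# "What color is the shape right of a circle?"
answer_2 = color_of(right_of(scene, circles))
print(answer_1, answer_2)
```

Both questions reuse the same `find` result for "circle", which is the point of the composition: a small set of pieces is rearranged per question instead of building one monolithic classifier per question.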

  6. Learning behaviors. Make planks: get wood, then use a saw (get wood, use saw). Make sticks: get wood, then use an axe (get wood, use axe).

  7. Learning from intermediate rewards (r) [Kearns & Singh 02, Kulkarni et al. 16]

  8. Learning from demonstrations [Stolle & Precup 02, Fox & Krishnan et al. 16]

  9. Learning from intermediate rewards. Annotations: -has(wood), +has(wood), +has(plank), +at(saw) [e.g. Sacerdoti 75, Hauskrecht et al. 98]

  10. Learning from sketches. Sketch: get wood, use saw. Policy: π

  11. Learning from sketches. Make planks: get wood, use saw

  12. Learning from sketches. Make sticks: get wood, use axe

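  13. Learning from sketches. Make planks: get wood (π1, ends with STOP), use saw (π2, ends with STOP). Make sticks: get wood (π1), use axe (π3, ends with STOP). The subpolicy π1 is shared by both sketches.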
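
The wiring on slide 13, where each sketch symbol selects a subpolicy that runs until it emits STOP, can be sketched as follows. This is an illustrative toy under stated assumptions: the subpolicies are hand-written rules rather than learned networks, and the primitive action names, the inventory-based state, and the `step` dynamics are all hypothetical.

```python
STOP = "STOP"


# Hand-written stand-ins for the subpolicies pi_1, pi_2, pi_3 (assumed, not learned).
def get_wood(inventory):
    return STOP if "wood" in inventory else "collect wood"


def use_saw(inventory):
    return STOP if "plank" in inventory else "saw wood"


def use_axe(inventory):
    return STOP if "stick" in inventory else "chop wood"


# One subpolicy per sketch symbol; "get wood" is shared by both tasks.
SUBPOLICIES = {"get wood": get_wood, "use saw": use_saw, "use axe": use_axe}

SKETCHES = {
    "make planks": ["get wood", "use saw"],
    "make sticks": ["get wood", "use axe"],
}


def step(inventory, action):
    """Toy environment dynamics (hypothetical)."""
    if action == "collect wood":
        return inventory | {"wood"}
    if action == "saw wood" and "wood" in inventory:
        return inventory | {"plank"}
    if action == "chop wood" and "wood" in inventory:
        return inventory | {"stick"}
    return inventory


def run_sketch(task, max_steps=20):
    """Run each subpolicy named by the sketch in order, switching on STOP."""
    inventory = frozenset()
    trace = []
    for symbol in SKETCHES[task]:
        policy = SUBPOLICIES[symbol]
        for _ in range(max_steps):
            action = policy(inventory)
            trace.append(action)
            if action == STOP:
                break
            inventory = step(inventory, action)
    return inventory, trace


planks_inventory, planks_trace = run_sketch("make planks")
sticks_inventory, sticks_trace = run_sketch("make sticks")
print(planks_trace)
print(sticks_trace)
```

Because the sketch only names subpolicies and never specifies primitive actions, the same `get_wood` function serves both tasks; only the second symbol differs.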

  14. Experiments: crafting game

  16. Experiments: crafting game. Plot of reward against episodes (0 to 3 × 10^6); curve labels: Sketches / Modular, Instruction following, Unsupervised.

  18. Experiments: locomotion

  19. Experiments: locomotion. Plot of log reward against timesteps (0 to 3 × 10^8); curve labels: Sketches / Modular, Instruction following, Unsupervised.

  20. Fast adaptation. What if I don’t get a sketch at test time? ???

  21. Fast adaptation. What if I don’t get a sketch at test time?

  22. Fast adaptation. What if I don’t get a sketch at test time? Candidate: get iron, use axe

  23. Fast adaptation. What if I don’t get a sketch at test time? Candidate: get iron, use saw
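
One way to read slides 22 and 23 is as a search over candidate sequences of already-learned modules when no sketch is supplied. The transcript does not describe the actual adaptation procedure, so the following is only an assumed brute-force illustration: the module list, the fixed sketch length, and the `episode_reward` function are hypothetical.

```python
from itertools import product

# Names of modules assumed to have been learned during training.
MODULES = ["get wood", "get iron", "use saw", "use axe"]


def episode_reward(sketch):
    """Hypothetical test task: rewarded only for getting iron, then using the saw."""
    return 1.0 if tuple(sketch) == ("get iron", "use saw") else 0.0


def best_sketch(modules, length, reward_fn):
    """Try every module sequence of the given length and keep the best-scoring one."""
    candidates = product(modules, repeat=length)
    return max(candidates, key=reward_fn)


best = best_sketch(MODULES, 2, episode_reward)
print(best)
```

Exhaustive enumeration is only practical for short sequences and small module sets; a learned high-level controller over the modules would be the natural replacement for `best_sketch` at larger scale.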

  24. Fast adaptation. What if I don’t get a sketch at test time? Bar chart of average reward (axis from 0 to 100) for Ordinary RL, Unsup. / Modular, and Sketches / Modular; values shown: 76, 42, 1.

  25. Lessons. We can also learn modular behaviors from ungrounded “sketches” of abstract plans. We can use these modules to help reinforcement learning even when sketches are not available. (Diagram labels: get iron, use axe, use saw, get wood, use axe.)

  26. Beyond “tasks”. Localization: “Man in glasses near two men.” Q&A: “How many men?” Policy search: “go near the corner”

  27. Toward a model of everything. Language and learning: “Man in glasses near two men.” “How many men?” “go near the corner”

  28. Language; Learning; Reasoning & Belief.
      A & Klein. Reasoning about Pragmatics with Neural Listeners and Speakers. EMNLP 16.
      A, Drăgan & Klein. Translating Neuralese. ACL 17.
      A & Klein. Analogs of Linguistic Structure in Deep Representations. EMNLP 17.
      Fried, A & Klein. Unified Pragmatic Models for Generating and Following […]. NAACL 18.
