Natural Language Communication with Robots Yonatan Bisk ISI-USC - PowerPoint PPT Presentation

Natural Language Communication with Robots
Yonatan Bisk (ISI-USC)
Joint work with: Deniz Yuret (Koç University), Daniel Marcu (ISI-USC)

Components of Communication: Entity/Spatial Grounding; Understanding; Planning and Plan Recognition; Language Generation; …

Grounding: The third block from the left

Understanding: place the nvidia block east of the hp block .

Plans: Draw the number six with a rigid base and a right diagonal top. Start with a line of 6 blocks in the middle of the table …

Generation: [I need to] move UPS from the left side of the board to just below Starbucks, leaving a small gap.

Goal: Introduce a dataset collection paradigm for Human-Robot Communication: Understanding, Learning, and Generation
1. Easily evaluated + Models to begin addressing understanding
2. Data exists in 3D space
3. Natural language utterances
4. Parallel annotation at differing levels of abstraction
5. Computer Vision can help but is not a prerequisite

Dataset

Action Sequences: Identifiable Sequences …; Random Blank Sequences …

Problem Solution Sequences [figure: timeline with steps 0, 1, 13, 14, 20 and spans labeled Single, Single, Short Seq, Long Seq]
We focus on Single Actions in this work.

Corpus Creation, Simple Actions: Move HP in front of Twitter and slightly to the left

Corpus Creation, Difficult Actions: Remove the block above the right bottom block and place it on top of the left stack of blocks.

Nine Annotations
1. coca cola , hp , nvidia .
2. nvidia , to the right of hp
3. place the nvidia block east of the hp block .
4. move the nvidia block to the right of the hp block
5. place the nvidia block to the east of the hp block .
6. move the nvidia block directly to the right of the hp block .
7. move the nvidia block just to the right of the hp block in line with the mercedes block .
8. put the nvidia block on the right end of the row of blocks that includes the coca cola and hp blocks .
9. put the nvidia block on the same row as the coca cola block, in the first open space to the right of the coca cola block .

V1 Corpus Statistics
Dataset   Actions   Types   Tokens   Ave Len
MNIST     11,870    1,359   ~257K    15 tokens
Random    2,492     1,172   ~84K     23.5 tokens

Natural Language Understanding

Action Understanding
Given: World, Utterance
Goal: Execute a command
Block to Move: $(x, y, z)_S$
Where to Move: $(x, y, z)_T$
Example utterance: place the nvidia block east of the hp block .

World Representation
Options: Images (w/ Occlusion) or Exact Locations
Exact Locations:
Block         x       y     z
Adidas        0.8     0.1   0.76
BMW           -0.3    0.1   -0.4
Burger King   0.5     0.1   0.14
Coke          -0.07   0.1   0.00
…
This Work: 20 x 3 Matrix
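As a rough illustration of the exact-location representation, the sketch below stores the four block positions that are legible on the slide as rows of a matrix. The name `world_matrix` and the (x, y, z) column order are assumptions made for this example; the full world has 20 rows, and the remaining rows are left out here just as they are elided on the slide.

```python
# Sketch only: the helper name and the (x, y, z) column order are assumptions,
# not something stated on the slides.
from typing import Dict, List, Tuple

Position = Tuple[float, float, float]

# The four rows listed on the slide; the real world contains 20 blocks.
KNOWN_BLOCKS: Dict[str, Position] = {
    "Adidas": (0.8, 0.1, 0.76),
    "BMW": (-0.3, 0.1, -0.4),
    "Burger King": (0.5, 0.1, 0.14),
    "Coke": (-0.07, 0.1, 0.00),
}


def world_matrix(blocks: Dict[str, Position]) -> List[List[float]]:
    """Return an N x 3 matrix (list of rows), one row per block, in insertion order."""
    return [list(position) for position in blocks.values()]


matrix = world_matrix(KNOWN_BLOCKS)
# Prints the shape of this partial world; the full world would be 20 x 3.
print(len(matrix), "x", len(matrix[0]))
```

With all 20 blocks filled in, the same helper would yield the 20 x 3 matrix that the slide refers to.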

Evaluation: Euclidean Distance
Block to Move: $\lVert (x, y, z)_S^{\mathrm{Pred}} - (x, y, z)_S^{\mathrm{Gold}} \rVert_2$
Where to Move: $\lVert (x, y, z)_T^{\mathrm{Pred}} - (x, y, z)_T^{\mathrm{Gold}} \rVert_2$
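The metric above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' evaluation script: the function names `euclidean_error` and `mean_error` and the example coordinates are hypothetical.

```python
# Sketch only: function names and example values are hypothetical.
import math
from typing import Sequence


def euclidean_error(pred: Sequence[float], gold: Sequence[float]) -> float:
    """L2 distance between a predicted and a gold (x, y, z) position."""
    return math.dist(pred, gold)


def mean_error(preds: Sequence[Sequence[float]], golds: Sequence[Sequence[float]]) -> float:
    """Average the per-example L2 distance over a set of predictions."""
    if not preds or len(preds) != len(golds):
        raise ValueError("preds and golds must be non-empty and of equal length")
    return sum(euclidean_error(p, g) for p, g in zip(preds, golds)) / len(preds)


# One prediction that is off by (3, 0, 4) and one that is exact.
example_preds = [(3.0, 0.0, 4.0), (0.5, 0.1, 0.14)]
example_golds = [(0.0, 0.0, 0.0), (0.5, 0.1, 0.14)]
print(mean_error(example_preds, example_golds))
```

The same function applies to both quantities on the slide: call it once with source positions (block to move) and once with target positions (where to move).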

Baseline Models
Output: Block to Move $(x, y, z)_S$, Where to Move $(x, y, z)_T$
Random: Random block to move; random block to place it next to
Center: Perfect knowledge of which block to move; always place it in the center of the board
We also perform Human Evaluation.
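The two baselines might look roughly like the sketch below. Several details are assumptions made for illustration: the function names, the use of 0-based block indices, and in particular the board center at (0.0, 0.1, 0.0), which the slides do not specify. The random baseline returns only the two chosen block indices; how "next to" is turned into coordinates is not described on the slide and is not modeled here.

```python
# Sketch only: names, 0-based indices, and the center coordinate are assumptions.
import random
from typing import List, Tuple

Position = Tuple[float, float, float]

# Partial world using the four positions listed on the World Representation slide.
world: List[Position] = [
    (0.8, 0.1, 0.76),
    (-0.3, 0.1, -0.4),
    (0.5, 0.1, 0.14),
    (-0.07, 0.1, 0.00),
]


def random_baseline(world: List[Position], rng: random.Random) -> Tuple[int, int]:
    """Pick a random block to move and a random block to place it next to."""
    source = rng.randrange(len(world))
    reference = rng.randrange(len(world))
    return source, reference


def center_baseline(
    world: List[Position],
    gold_source: int,
    center: Position = (0.0, 0.1, 0.0),  # hypothetical board center
) -> Tuple[Position, Position]:
    """Use the gold source block and always predict the board center as the target."""
    return world[gold_source], center


source_idx, reference_idx = random_baseline(world, random.Random(0))
source_pos, target_pos = center_baseline(world, gold_source=1)
```

Because the center baseline is given the correct source block, only its target error is informative.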

Simple Semantics Model 1: A Discrete World (Source, Direction, Reference)
Example: Move the BMW block in front of the Adidas block
Template: Move the Source block Direction the Reference block
Source ∈ [1,20], Direction ∈ [1,9], Reference ∈ [1,20]
Directions:
NW  N    NE
W   TOP  E
SW  S    SE

Simple Semantics Model 1: A Discrete World (Source, Direction, Reference)
Architecture: Sentence → Embedding → FF → Softmax, with a Forced Semantic Structure (S,D,R):
Source ∈ [1,20] (Sentence → Block IDs)
Direction ∈ [1,9]
Target ∈ [1,20] (Sentence → Block IDs)
(S,D,R) → programmatic conversion to (x,y,z)
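The programmatic conversion from a discrete (Source, Direction, Reference) triple to coordinates could be sketched as follows. The slides do not give the conversion rule, so everything specific here is an assumption: the axis convention (east = +x, north = +z, up = +y), the `step` size of one unit per direction, the 0-based block indices, and the names `DIRECTION_OFFSETS` and `sdr_to_xyz`.

```python
# Sketch only: axis convention, step size, and all names are assumptions.
from typing import Dict, List, Tuple

Position = Tuple[float, float, float]

# Unit offsets (dx, dy, dz) for the nine direction labels on the slide.
DIRECTION_OFFSETS: Dict[str, Tuple[int, int, int]] = {
    "NW": (-1, 0, 1), "N": (0, 0, 1), "NE": (1, 0, 1),
    "W": (-1, 0, 0), "TOP": (0, 1, 0), "E": (1, 0, 0),
    "SW": (-1, 0, -1), "S": (0, 0, -1), "SE": (1, 0, -1),
}


def sdr_to_xyz(
    world: List[Position],
    source: int,
    direction: str,
    reference: int,
    step: float = 1.0,  # hypothetical spacing between adjacent block positions
) -> Tuple[Position, Position]:
    """Map a discrete (S, D, R) triple to (source position, target position).

    Indices are 0-based here; the slides number blocks from 1 to 20.
    """
    rx, ry, rz = world[reference]
    dx, dy, dz = DIRECTION_OFFSETS[direction]
    target = (rx + dx * step, ry + dy * step, rz + dz * step)
    return world[source], target


# Two-block toy world: move block 0 to the east of block 1.
toy_world: List[Position] = [(0.0, 0.1, 0.0), (0.5, 0.1, 0.14)]
source_xyz, target_xyz = sdr_to_xyz(toy_world, source=0, direction="E", reference=1)
```

Under this scheme the classifier only has to choose among 20 x 9 x 20 discrete triples, and the geometry is handled entirely by the fixed lookup table.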

End-to-End Model
Input: Move the BMW block in front of the Adidas block
Output: $(x, y, z)_S^{\mathrm{Pred}}$ or $(x, y, z)_T^{\mathrm{Pred}}$

End-to-End Model
Input: Move the BMW block in front of the Adidas block
Assumed Logic: Direction $(\pm x, \pm y, \pm z)$ + Reference $(x, y, z)$ → $(x, y, z)_T^{\mathrm{Pred}}$
Can we encode this?

End-to-End Model [architecture figure: Encoder → Representation → Grounding → Prediction; hidden layers; Semantics 1, 2, 3 with weights W_1 … W_i … W_n; World (3x20); output $(x, y, z)$]
Trained twice: Source + Target

MNIST Performance (mean error)
Model              Source   Target
Human              0.00     0.53
Simple Semantics   0.14     0.98
End-To-End         0.19     1.05
Center Baseline    -        3.43
Random Baseline    6.49     6.21

Blank Block Performance (mean error)
Model              Source   Target
Human              0.30     1.39
Simple Semantics   5.00     5.57
End-To-End         3.47     3.70
Center Baseline    -        4.06
Random Baseline    4.97     5.44

Common Errors
Multi-relation actions: Place block 20 parallel with the 8 block and slightly to the right of the 6 block.
Geometric Understanding: Continue the diagonal row of 20, 19 and 15 downward with 13.
Grammatical Ambiguity: 19 moved from behind the 8 to under the 18th block.

Summary
This Work:
• Initial Models for Language Understanding
• An environment for exploring grounded phenomena
Moving Forward:
• Language Generation, Planning, …
• Increased task difficulty

Thanks! http://nlg.isi.edu/language-grounding/
