Improved Models and Queries for Grounded Human-Robot Dialog
Aishwarya Padmakumar, Doctoral Dissertation Proposal


  1. Outline • Background • Completed Work – Integrating Learning of Dialog Strategies and Semantic Parsing (Padmakumar et al., 2017) – Opportunistic Active Learning for Grounding Natural Language Descriptions (Thomason et al., 2017) – Learning a Policy for Opportunistic Active Learning (Padmakumar et al., 2018) • Proposed Work • Conclusion

  2. Opportunistic Active Learning for Grounding Natural Language Descriptions [Thomason et al., 2017] (Pipeline diagram: Semantic Understanding → Grounding → Dialog Policy → Natural Language Generation; example input “Bring the blue mug from Alice’s office”, example response “Where should I bring a blue mug from?”)

  3. Opportunistic Active Learning • Asking locally convenient questions during an interactive task. • Questions may not be useful for the current interaction, but are expected to help with future tasks.

  4. Opportunistic Active Learning. User: “Bring the blue mug from Alice’s office.” Robot: “Would you use the word ‘blue’ to refer to this object?” User: “Yes.”

  5. Opportunistic Active Learning. User: “Bring the blue mug from Alice’s office.” Robot: “Would you use the word ‘tall’ to refer to this object?” User: “Yes.”

  6. Opportunistic Active Learning • Still query for labels most likely to improve the model.
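“Most likely to improve the model” is typically operationalized with an active-learning heuristic. A minimal uncertainty-sampling sketch (the function name, toy scores, and the score-near-zero criterion are my own illustration, not the paper’s exact method):

```python
import numpy as np

def most_uncertain_query(scores):
    """Pick the unlabeled object whose classifier decision score is closest
    to the boundary (score 0): a standard uncertainty-sampling heuristic.

    scores: 1-D array of signed decision scores for unlabeled objects.
    Returns the index of the object most worth asking about.
    """
    return int(np.argmin(np.abs(scores)))

# Example: the third object sits closest to the decision boundary.
print(most_uncertain_query(np.array([2.3, -1.7, 0.1, 0.9])))  # -> 2
```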

  7. Opportunistic Active Learning: Why? • The robot may have good models for on-topic concepts. • There may be no useful on-topic queries. • Some off-topic concepts may be more important because they are used in more interactions.

  8. Opportunistic Active Learning - Challenges • Some other object might be a better candidate for the question (e.g., “Purple?”).

  9. Opportunistic Active Learning - Challenges • The question interrupts another task and may be seen as unnatural. User: “Bring the blue mug from Alice’s office.” Robot: “Would you use the word ‘tall’ to refer to this object?”

  10. Opportunistic Active Learning - Challenges • The information needs to be useful for a future task (e.g., “Red?”).

  11. Object Retrieval Task

  12. Object Retrieval Task • User describes an object in the active test set • Robot needs to identify which object is being described

  13. Object Retrieval Task • Robot can ask questions about objects on the sides to learn object attributes

  14. Two Types of Questions (label queries and example queries, illustrated on the slide)

  15. Two Types of Questions

  16. Experimental Conditions. Example description: “This is a yellow bottle with water filled in it.” • Baseline (on-topic): the robot can only ask about “yellow”, “bottle”, “water”, “filled”. • Inquisitive (opportunistic): the robot can ask about any concept it knows, possibly “red” or “heavy”.

  17. Results • The inquisitive robot performs better at understanding object descriptions. • Users find the robot more comprehending, fun, and usable in a real-world setting when it is opportunistic.

  18. Outline • Background • Completed Work – Integrating Learning of Dialog Strategies and Semantic Parsing (Padmakumar et al., 2017) – Opportunistic Active Learning for Grounding Natural Language Descriptions (Thomason et al., 2017) – Learning a Policy for Opportunistic Active Learning (Padmakumar et al., 2018) • Proposed Work • Conclusion

  19. Learning a Policy for Opportunistic Active Learning [Padmakumar et al., 2018] (Pipeline diagram: Semantic Understanding → Grounding → Dialog Policy → Natural Language Generation; example input “Bring the blue mug from Alice’s office”, example response “Where should I bring a blue mug from?”)

  20. Learning a Policy for Opportunistic Active Learning • Goal of this work: learn a dialog policy that decides how many and which questions to ask to improve grounding models. • To learn an effective policy, the agent needs to learn – to identify good queries in the opportunistic setting – when a guess is likely to be successful – to trade off model improvement against task completion.

  21. Task Setup (slide shows the target description)

  22. Task Setup

  23. Task Setup

  24. Grounding Model. The description “A white umbrella” is mapped to the predicates {white, umbrella}; a pretrained CNN extracts image features, and one SVM per predicate (white / not white, umbrella / not umbrella) decides whether the predicate applies.
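A minimal sketch of this per-predicate design, with toy vectors standing in for CNN features (the feature values, labels, and function names are invented for illustration; the slide only specifies CNN features plus one SVM per predicate):

```python
# One linear SVM per predicate over (stand-in) CNN features.
import numpy as np
from sklearn.svm import LinearSVC

# Toy "CNN features" for four objects (rows) and binary labels per predicate.
feats = np.array([[0.9, 0.8], [0.8, 0.1], [0.2, 0.9], [0.1, 0.2]])
labels = {
    "white":    [1, 1, 0, 0],   # first feature loosely tracks whiteness
    "umbrella": [1, 0, 1, 0],   # second feature loosely tracks umbrella-ness
}

classifiers = {p: LinearSVC().fit(feats, y) for p, y in labels.items()}

def grounds(description_predicates, x):
    """An object matches a description iff every predicate classifier fires."""
    return all(classifiers[p].predict([x])[0] == 1 for p in description_predicates)

print(grounds({"white", "umbrella"}, np.array([0.85, 0.9])))
```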

  25. Active Learning • Agent starts with no classifiers. • Labeled examples are acquired through questions and used to train the classifiers. • Agent needs to learn a policy to balance active learning with task completion.

  26. MDP Model (diagram: dialog agent ↔ user). State: the target description, the train and test objects, and the agent’s perceptual classifiers. Actions: label query, example query, guess. Reward: maximize correct guesses while keeping dialogs short.
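The state and reward structure above can be rendered schematically. The field names and the numeric reward values below are my own placeholders; the slide only specifies the qualitative trade-off (correct guesses vs. short dialogs):

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class DialogState:
    description: List[str]                 # predicates in the target description
    train_ids: List[int]                   # objects available for queries
    test_ids: List[int]                    # candidate objects for the guess
    labels: Dict[str, Dict[int, bool]] = field(default_factory=dict)  # acquired labels

def reward(action, guessed_correctly=False, turn_penalty=-1.0, guess_bonus=10.0):
    """Queries cost a small per-turn penalty; a guess earns a bonus only if
    correct, so the policy trades dialog length against guess accuracy."""
    if action == "guess":
        return guess_bonus if guessed_correctly else -guess_bonus
    return turn_penalty  # "label_query" or "example_query"

print(reward("label_query"))   # -1.0
print(reward("guess", True))   # 10.0
```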

  27. Challenges • What information about the classifiers should be represented? • Variable number of actions. • The size of the action space increases over time. • The number of classifiers increases over time. • Very large action space after the initial interactions.

  28. Tackling Challenges • Features based on active learning methods (addresses representing classifiers). • Featurizing state-action pairs (addresses the variable number of actions and classifiers). • Sampling a beam of promising queries (addresses the large action space).

  29. Feature Groups • Query features: active learning metrics used to determine whether a query is useful. • Guess features: features that use the predictions and confidences of the classifiers to determine whether a guess will be correct.
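As an illustration of the two groups, here is one plausible featurization (the specific metrics are my own stand-ins, not the paper’s feature set):

```python
import numpy as np

def query_features(scores):
    """Query features: how close the best query's classifier score is to the
    decision boundary, and how confident the classifier is overall."""
    return np.array([-np.min(np.abs(scores)),   # boundary proximity of best query
                     np.mean(np.abs(scores))])  # overall classifier confidence

def guess_features(candidate_scores):
    """Guess features: the top candidate's score and the margin between the
    best and second-best candidates."""
    top2 = np.sort(candidate_scores)[-2:]
    return np.array([top2[1], top2[1] - top2[0]])

print(guess_features(np.array([0.1, 0.9, 0.4])))  # top score 0.9, margin 0.5
```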

  30. Experiment Setup • Policy learning using REINFORCE. • Baseline: a hand-coded dialog policy that asks a fixed number of questions selected using the same sampling distribution.
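A minimal REINFORCE step for a softmax policy over featurized state-action pairs (which is what makes a single weight vector cover a variable number of actions); the toy features, learning rate, and return value are invented:

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def reinforce_update(theta, feats, action, ret, lr=0.01):
    """One REINFORCE step: move theta along grad log pi(a|s) scaled by the
    return. feats has one row of state-action features per available action."""
    probs = softmax(feats @ theta)
    grad_log_pi = feats[action] - probs @ feats   # phi(s,a) - E_pi[phi(s,.)]
    return theta + lr * ret * grad_log_pi

theta = np.zeros(2)
feats = np.array([[1.0, 0.0], [0.0, 1.0]])        # toy features for 2 actions
theta = reinforce_update(theta, feats, action=0, ret=5.0)
print(softmax(feats @ theta))                     # action 0 becomes more likely
```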

  31. Experiment Phases • Initialization: collect experience using the baseline to initialize the policy. • Training: improve the policy from on-policy experience. • Testing: policy weights are fixed, and we run a new set of interactions, starting with no classifiers, over an independent test set with different predicates.

  32. Results (bar chart, ablations of major feature groups; values 0.44, 0.37, 0.35, 0.29)

  33. Results (bar chart, ablations of major feature groups; values 16, 12.95, 6.16, 6.12)

  34. Summary • We can learn a dialog policy that acquires knowledge of predicates through opportunistic active learning. • The learned policy is more successful at object retrieval than a static baseline, using fewer dialog turns on average.

  35. Outline • Background • Completed Work – Integrating Learning of Dialog Strategies and Semantic Parsing (Padmakumar et al., 2017) – Opportunistic Active Learning for Grounding Natural Language Descriptions (Thomason et al., 2017) – Learning a Policy for Opportunistic Active Learning (Padmakumar et al., 2018) • Proposed Work • Conclusion

  36. Outline • Proposed Work – Learning to Ground Natural Language Object Descriptions Using Joint Embeddings – Identifying Useful Clarification Questions for Grounding Object Descriptions – Learning a Policy for Clarification Questions using Uncertain Models – Bonus Contributions

  37. Perceptual Grounding Using Classifiers (diagram: the description “blue mug” is checked by separate classifiers, blue / not blue and mug / not mug; an object grounds as a blue mug only if both classifiers accept it, and as not a blue mug otherwise).

  38. Grounding Using a Joint Vector Space

  39. Grounding Using a Joint Vector Space • Represent words and images as vectors in the same space. • Words are near images they apply to and vice versa.

  40. Grounding Using a Joint Vector Space • To ground a description, such as “blue mug”, find the image which minimizes the sum of distances to the words.



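Grounding by minimizing summed distance to the description’s words can be sketched directly (the 2-D vectors below are invented for illustration; real embeddings would come from the learned joint space):

```python
import numpy as np

# Toy word and image embeddings in a shared 2-D space.
words = {"blue": np.array([0.0, 1.0]), "mug": np.array([1.0, 1.0])}
images = np.array([[0.5, 1.0],    # a blue mug: near both words
                   [1.0, 0.0],    # a mug that is not blue
                   [0.0, 0.0]])   # neither blue nor a mug

def ground(description, images):
    """Return the index of the image minimizing the summed distance to the
    description's word vectors."""
    dists = [sum(np.linalg.norm(img - words[w]) for w in description)
             for img in images]
    return int(np.argmin(dists))

print(ground(["blue", "mug"], images))  # -> 0
```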

  44. Grounding Using a Joint Vector Space. Related prior work: • Word vectors in learned joint spaces are more useful for many tasks, e.g., semantic relatedness [Lazaridou et al., 2015]. • Neural networks that score an image-description pair perform well at grounding, but use sentence embeddings [Hu et al., 2016; Xiao et al., 2017]. • We expect that words will generalize better than phrases or sentences.

  45. Learning the Joint Space

  46. Learning the Joint Space. Constraints of the form d(f(image), g(blue)) ≤ d(f(image′), g(blue)) and d(f(image), g(blue)) ≤ d(f(image), g(pink)), where f embeds images and g embeds words (the images themselves are elided in the slide). The constraints are captured using a ranking loss.
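A hinge-style ranking loss is one standard way to capture such distance constraints; this sketch uses an invented margin value and my own function name:

```python
def hinge_ranking_loss(d_pos, d_neg, margin=0.1):
    """Penalize whenever the 'should-be-closer' distance d_pos is not
    smaller than d_neg by at least a margin."""
    return max(0.0, margin + d_pos - d_neg)

# "blue" should be closer to a blue image than to a non-blue one.
print(hinge_ranking_loss(d_pos=0.4, d_neg=0.9))  # constraint satisfied -> 0.0
print(hinge_ranking_loss(d_pos=0.9, d_neg=0.4))  # violated -> positive loss
```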

  47. Outline • Proposed Work – Learning to Ground Natural Language Object Descriptions Using Joint Embeddings – Identifying Useful Clarification Questions for Grounding Object Descriptions – Learning a Policy for Clarification Questions using Uncertain Models – Bonus Contributions

  48. Identifying Useful Clarification Questions for Grounding Object Descriptions. User: “Bring the blue mug from Alice’s office.” Robot: “What should I bring?” User: “The blue coffee mug.” Robot: “What should I bring?”

  49. Identifying Useful Clarification Questions for Grounding Object Descriptions. User: “Bring the blue mug from Alice’s office.” Robot: “Is this the object I should bring?” User: “No.”

  50. Recent Related Work [Das et al., 2017; De Vries et al., 2017]

  51. Identifying Useful Clarification Questions for Grounding Object Descriptions • Clarification questions that help narrow down the object being referred to. • More specific than asking for a new description. • More general than showing each possible object. • Provide ground-truth answers to questions at training time to learn human semantics.

  52. Attribute-Based Queries. User: “Bring the blue mug from Alice’s office.” Robot: “Is the object I should bring a cup?” User: “Yes.”

  53. Choosing a Good Query (“blue mug”) • Choose the query that is most likely to reduce the search space. • Choose the attribute with respect to which the dataset has the highest entropy.
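The highest-entropy criterion can be sketched concretely: an attribute whose yes/no answer splits the candidate set most evenly carries the most information. The toy attribute predictions below are invented:

```python
import numpy as np

def attribute_entropy(applies):
    """Binary entropy of an attribute over the candidate set (bits)."""
    p = np.mean(applies)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def best_query(attribute_predictions):
    """Pick the attribute with the highest entropy over the dataset."""
    return max(attribute_predictions,
               key=lambda a: attribute_entropy(attribute_predictions[a]))

# "cup" splits the four candidates 50/50, so asking about it is most informative.
preds = {"cup": [1, 1, 0, 0], "red": [1, 0, 0, 0], "heavy": [1, 1, 1, 1]}
print(best_query(preds))  # -> cup
```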

  54. Challenge • In a joint embedding space, how do you determine whether an attribute is applicable? (“blue mug”)

  55. Possible Solutions • A distance threshold, or clustering, to get classifier-like predictions. • It might be possible to formulate an optimization problem using distances.
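The distance-threshold option is the simplest of these: declare an attribute applicable when the word and image embeddings are close enough. The threshold value and vectors here are invented for illustration:

```python
import numpy as np

def applies(word_vec, image_vec, threshold=0.7):
    """Classifier-like decision from a joint space: the attribute applies
    iff the embeddings are within a distance threshold."""
    return bool(np.linalg.norm(word_vec - image_vec) <= threshold)

print(applies(np.array([0.0, 1.0]), np.array([0.5, 1.0])))  # distance 0.5 -> True
print(applies(np.array([0.0, 1.0]), np.array([2.0, 0.0])))  # far away -> False
```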

  56. Outline • Proposed Work – Learning to Ground Natural Language Object Descriptions Using Joint Embeddings – Identifying Useful Clarification Questions for Grounding Object Descriptions – Learning a Policy for Clarification Questions using Uncertain Models – Bonus Contributions

  57. Learning a Policy for Clarification Questions using Uncertain Models (“blue mug”)
