Agents Learning Interactively from Human Teachers
ALIHT 2011

W. Bradley Knox, Jake Beal, Brenna Argall, Sonia Chernova, Peter Stone, Matt Taylor, Andrea Thomaz

These slides are posted on the ALIHT website’s Program page.

Welcome!

Quick stats
• 14 papers
• 5 invited talks
  • Joanna Bryson (University of Bath)
  • Thomas G. Dietterich (Oregon State)
  • Ian Fasel (University of Arizona)
  • Jan Peters (Max Planck Institute)
  • Dan Roth (University of Illinois at Urbana-Champaign)

Best Presentation Award

Agents Learning Interactively from Human Teachers
• “Interactively”: the human sees an effect of learning before teaching finishes (teach -> observe learning -> teach)
• “from Human Teachers”: implies the human considers the student and communicates intentionally

Outline
• Why?
• Taxonomy
• Discussion points/questions

Why? (grounded answers)
• Programming for non-programmers
• Customization/extension by the end-user
• Faster and/or less costly learning
• “You don’t know something until you teach it.”
• To study how people teach

Why? (speculative answers)
• Interaction may build trust and human understanding of the agent
• Learning creates social connection
• The thrill of teaching
• Human-centered AI

From many contributions, sorting it out

Taxonomy: Purpose of teaching
• Autonomous task completion
• Teaching new tasks
• Customizing existing task solutions
• Improving communication
• Learning through teaching

Taxonomy: Human-to-agent communication modalities
• Demonstration
• Reward/punishment
• Verbal advice/directions
• Curriculum design / Environment shaping
• Gestures
• Unconstrained interaction
• Unintentional signals (e.g., facial expressions)

Taxonomy: Agent-to-human communication modalities
• Observable behavior
• Asking (for help, information, guidance, etc.)
• Belief/prediction statements
• Emotional expression

Taxonomy: Interaction scheme
• Iterations between teacher and student
• Teacher and student act concurrently

Taxonomy: Knowledge representation
• Behavior parameters
• Value functions
• Probabilistic/predictive models
• Logical formulas

Taxonomy: Learning from multiple sources
• Multiple teaching modalities (demonstration and feedback)
• Combining with non-teaching information (e.g., MDP reward for reinforcement learning)

Taxonomy: Evaluation metrics
• Effectiveness (learned performance)
• Efficiency
• Human time
• Training cost by performance
• User satisfaction

Taxonomy
• Purpose of teaching
• Human-to-agent communication
• Agent-to-human communication
• Interaction scheme
• Knowledge representation
• Learning from multiple sources
• Evaluation metrics

Let’s discuss (over the next two days)

Discussion topics: Comparative evaluation
Interactive algorithms often aren’t compared. But we must evaluate relative strengths to move forward.
Standardized challenge task?
• Room for robots?

Discussion topics: Theory
What should we try to prove?
What assumptions must be made? At what cost to applicability?
Perhaps one of our goals should be to provide the correct assumptions.

Discussion topics: Gathering/reusing data
Ease: supervised learning > reinforcement learning > learning interactively from a human
In what situations can data be reused?
Strategies for reducing cost of human data?

Discussion topics: Experimental logistics
Experiments with authors or colleagues as subjects yield narrower results. But technical academic departments often lack infrastructure for facilitating human studies.
Tap our collective experience in creating such infrastructure.

Discussion topics: Publishing venues
• General AI: IJCAI, AAAI
• Machine learning: ICML, ECML, NIPS
• Agents-focused: AAMAS, GECCO, IVA
• Robots/Interaction: HRI, ICRA, IROS, ROMAN, RSS(?)
• HCI/Interfaces: IUI, UMAP, CHI, SIGGRAPH(?)
• Developmental learning: ICDL
• NLP: ACL, CoNLL, EMNLP, NAACL
• Journals: TAMD (and many others)

Discussion topics: Reviewers
ALIHT straddles several areas, and reviewers often come from narrower backgrounds.
Strategies for addressing reviewers’ biases, at both community and individual levels?
(e.g., from the RL community: arguably misplaced standards for theory and extensiveness of experiments, and too much lenience on the number and source of subjects)

Discussion topics: Fundamentals of ALIHT
Is our task to integrate developments from machine learning, psychology, etc.? Or are there fundamental contributions that generalize across the ALIHT subfield?
• Biggest bottlenecks?
• What can we offer our larger communities? And what can we take from each other?

Proposed discussion topics
• Comparative evaluation
• Theory
• Gathering/reusing data
• Experimental logistics
• Publishing venues
• Reviewers
• Fundamentals of ALIHT

Enjoy! (And discuss!)
