On the Feasibility of Learning Human Biases for Reward Inference
Rohin Shah, Noah Gundotra, Pieter Abbeel, Anca Dragan
A conversation amongst IRL researchers

[Ziebart et al, 2008] To deal with suboptimal demos, let's model the human as noisily rational.

[Christiano, 2015] Then you are limited to human performance, since you don't know how the human made a mistake.

[Evans et al, 2016], [Zheng et al, 2014], [Majumdar et al, 2017] We can model human biases: π(a|s) ∝ exp(β · Q(s, a; r)) (sketched in code below), for example:
- Myopia
- Hyperbolic time discounting
- Sparse noise
- Risk sensitivity

[Steinhardt and Evans, 2017] Your human model will inevitably be misspecified.

Hmm, maybe we can learn the systematic biases from data? Then we could correct for these biases during IRL.

[Armstrong and Mindermann, 2017] That's impossible without additional assumptions.
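For concreteness, here is a minimal sketch (ours, not the poster's) of the noisily-rational model quoted above, π(a|s) ∝ exp(β · Q(s, a; r)). The Q-values and β in the example call are made up for illustration.

```python
# Sketch of the Boltzmann (noisily rational) human model: pi(a|s) proportional
# to exp(beta * Q(s, a; r)). Q-values and beta below are illustrative only.
import numpy as np

def boltzmann_policy(q_values, beta=1.0):
    """Action distribution proportional to exp(beta * Q(s, a; r))."""
    logits = beta * np.asarray(q_values, dtype=np.float64)
    logits -= logits.max()              # subtract max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

# Higher-Q actions get more probability, but suboptimal actions keep nonzero
# mass, which is how "noisy" or suboptimal demonstrations are modeled.
print(boltzmann_policy([1.0, 0.5, -2.0], beta=2.0))
```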
Learning a policy isn't sufficient

Biases are a part of cognition: they live in the planning algorithm D that created the policy π, not in the policy π itself. We consider a multi-task setting so that we can learn D from examples.
Architecture

To learn the biased planner, minimize the loss over the planner parameters θ. To perform IRL, minimize the same loss over the reward R.
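The poster does not spell out the architecture in this text, so the sketch below is an assumption: a small tabular MDP with a value-iteration-style differentiable planner in PyTorch, whose parameters θ and reward R can each be trained by gradient descent through the same demonstration loss. The names `differentiable_planner` and `demo_loss` are ours.

```python
# Sketch only: the concrete parameterization (transition logits, discount,
# soft value iteration) is an assumption, not the poster's exact network.
import torch

def differentiable_planner(reward, theta, n_iters=10):
    """Map a reward vector to 'biased' Q-values, differentiably.
    reward: (S,) tensor of per-state rewards (fixed or learnable).
    theta:  dict of learnable planner parameters."""
    T = torch.softmax(theta["transition_logits"], dim=-1)   # (S, A, S)
    discount = torch.sigmoid(theta["discount_logit"])       # scalar in (0, 1)
    v = torch.zeros_like(reward)
    for _ in range(n_iters):
        q = reward.unsqueeze(-1) + discount * torch.einsum("sap,p->sa", T, v)
        v = torch.logsumexp(q, dim=-1)                       # soft max over actions
    return q                                                 # (S, A)

def demo_loss(reward, theta, demo_states, demo_actions):
    """Cross-entropy between the planner's predicted policy and the demos.
    To learn the biased planner, minimize this over theta; to perform IRL,
    minimize the same loss over the reward."""
    log_pi = torch.log_softmax(differentiable_planner(reward, theta), dim=-1)
    return -log_pi[demo_states, demo_actions].mean()
```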
Algorithms

Algorithm 1: Some known rewards
1. On tasks with known rewards, learn the planner.
2. Freeze the planner and learn the reward on the remaining tasks.

Algorithm 2: "Near" optimal
1. Use Algorithm 1 to mimic a simulated optimal agent.
2. Finetune planner and reward jointly on human demonstrations.
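A sketch of the two training schedules, reusing the hypothetical `differentiable_planner` and `demo_loss` helpers from the previous sketch; the task interface, optimizer, and step counts are illustrative assumptions.

```python
# Each task is a (reward, demo_states, demo_actions) triple; rewards that are
# to be inferred are zero-initialized tensors with requires_grad=True.
import torch

def fit(params, tasks, theta, steps=2000, lr=1e-2):
    """Minimize the summed demonstration loss, but only over `params`."""
    opt = torch.optim.Adam(params, lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = sum(demo_loss(r, theta, s, a) for r, s, a in tasks)
        loss.backward()
        opt.step()

def algorithm_1(theta, known_tasks, unknown_tasks):
    """Algorithm 1 (some known rewards)."""
    # 1. On tasks with known rewards, learn the planner.
    fit(list(theta.values()), known_tasks, theta)
    # 2. Freeze the planner and learn the reward on the remaining tasks.
    for p in theta.values():
        p.requires_grad_(False)
    fit([r for r, _, _ in unknown_tasks], unknown_tasks, theta)

def algorithm_2(theta, optimal_known, optimal_unknown, human_tasks):
    """Algorithm 2 ('near' optimal demonstrator)."""
    # 1. Use Algorithm 1 to mimic a simulated optimal agent (initializes theta).
    algorithm_1(theta, optimal_known, optimal_unknown)
    # 2. Finetune planner and reward jointly on the human demonstrations.
    for p in theta.values():
        p.requires_grad_(True)
    fit(list(theta.values()) + [r for r, _, _ in human_tasks], human_tasks, theta)
```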
Experiments

We developed five simulated human biases to test our algorithms.
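The poster names the biases only abstractly in this text, so the snippet below shows just one plausible construction of a simulated biased demonstrator: a myopic agent that plans with a truncated horizon and then acts Boltzmann-rationally on those short-sighted Q-values. All shapes and hyperparameters are assumptions.

```python
# Illustrative simulated bias (myopia): plan with a short horizon, then act
# noisily rationally with respect to the resulting short-sighted Q-values.
import numpy as np

def myopic_q_values(reward, transitions, horizon=2, gamma=0.95):
    """Finite-horizon value iteration. reward: (S,), transitions: (S, A, S)."""
    v = np.zeros(reward.shape[0])
    for _ in range(horizon):
        q = reward[:, None] + gamma * (transitions @ v)   # (S, A)
        v = q.max(axis=-1)
    return q

def myopic_demonstrator(reward, transitions, beta=2.0, horizon=2):
    """pi(a|s) for every state, under the myopic bias."""
    q = beta * myopic_q_values(reward, transitions, horizon)
    p = np.exp(q - q.max(axis=-1, keepdims=True))         # stable softmax
    return p / p.sum(axis=-1, keepdims=True)
```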
(Some) Results

[Figure legend: Optimal, Boltzmann, Known rewards, "Near" optimal]

Our algorithms perform better on average, compared to a learned Optimal or Boltzmann model … but an exact model of the demonstrator does much better, hitting 98%.
Conclusion

Learning systematic biases has the potential to improve reward inference, but differentiable planners need to become significantly better before this will be feasible.