CSE 571: Inverse Optimal Control (Inverse Reinforcement Learning)
Many slides by Drew Bagnell, Carnegie Mellon University

[Figure: learning as a mapping from input X (sensor data) to output Y (path to goal); the optimal-control solution interposes a learned 2-D cost map and a planner between X and Y.]

Mode 1: Training example
Mode 1: Learned behavior
Mode 1: Learned cost map
Mode 2: Training example
Mode 2: Learned behavior
Mode 2: Learned cost map

Cost as a weighted sum of features:

    cost(s) = wᵀ F(s)

where F(s) is the feature vector extracted at state s and w is the learned weighting vector.
(Ratliff, Bagnell, Zinkevich 2005; Ratliff, Bagnell, Zinkevich, ICML 2006; Ratliff, Bradley, Bagnell, Chestnutt, NIPS 2006; Silver, Bagnell, Stentz, RSS 2008)

Learning new features: starting from w = [], F = [], examples the current cost map gets wrong (regions that should be high cost but score low, and vice versa) drive the learning of new features, growing the model to w = [w1], F = [F1], then learning F2, and so on.
(Ratliff, Bradley, Chestnutt, Bagnell 2006; Zucker, Ratliff, Stolle, Chestnutt, Bagnell, Atkeson, Kuffner 2009)
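The linear cost model above is compact enough to show in code. Below is a minimal sketch (all names are mine, not from the slides) that evaluates cost(s) = wᵀ F(s) over a grid of per-cell feature maps, producing the cost map a planner would consume; LEARCH-style systems (Silver, Bagnell, Stentz, RSS 2008) typically exponentiate the linear score to keep costs positive, as noted in a comment.

```python
import numpy as np

def linear_cost_map(w, feature_maps):
    """Evaluate cost(s) = w^T F(s) at every cell of a 2-D map.

    w            -- (k,) learned weighting vector
    feature_maps -- (k, H, W) stack of per-cell feature maps F
    Returns an (H, W) cost map for the planner.
    """
    cost = np.tensordot(w, feature_maps, axes=1)  # weighted sum over features
    # LEARCH-style variants return np.exp(cost) so costs stay positive.
    return cost

# Toy usage: two features (say, vegetation density and slope) on a 4x4 grid.
w = np.array([1.0, 2.5])
F = np.random.rand(2, 4, 4)
C = linear_cost_map(w, F)  # (4, 4) cost map
```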
Learned Cost Function Examples

Learning Manipulation Preferences
• Input: Human demonstrations of preferred behavior (e.g., moving a cup of water upright without spilling)
• Output: Learned cost function that results in trajectories satisfying user preferences
Approach pipeline (built up incrementally over the next several slides):
Demonstration(s) → Graph → Projection → MaxEnt IOC → Learned cost → Discrete sampled paths → Local Trajectory Optimization → Output trajectories
Graph generation
• Goal: Construct a graph in the robot's configuration space providing good coverage (a sketch follows below)

2D obstacle avoidance task
• 2D state: (x, y)

Projection
• Goal: Project the continuous demonstration onto the graph, resulting in a discrete graph path
• Use a modified Dijkstra's algorithm minimizing the sum of:
  – Length of the discrete path (Euclidean)
  – State-dependent features (e.g., distance to obstacles)
  – Distance to the continuous demonstration
  (a sketch follows below)

Learning the cost function
• Goal: Given the projected demonstrations, learn the cost function
• Learn feature weights θ* using softened value iteration on the discrete graph (MaxEnt IOC, Ziebart et al., 2008; a sketch follows below)
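The slides leave the graph construction unspecified; a common choice for good coverage of a configuration space is a PRM-style build. A minimal sketch under that assumption (names and parameters are mine): sample nodes uniformly and connect each to its k nearest neighbors.

```python
import numpy as np

def build_coverage_graph(n_nodes=500, k=8, lo=0.0, hi=1.0, seed=0):
    """PRM-style coverage graph over a 2-D configuration space.

    Samples n_nodes configurations uniformly in [lo, hi]^2 and connects
    each node to its k nearest neighbors (edges are directed here;
    symmetrize if an undirected graph is needed). Collision checking
    against obstacles is omitted for brevity.
    """
    rng = np.random.default_rng(seed)
    nodes = rng.uniform(lo, hi, size=(n_nodes, 2))
    d = np.linalg.norm(nodes[:, None] - nodes[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # no self-edges
    edges = {i: np.argsort(d[i])[:k].tolist() for i in range(n_nodes)}
    return nodes, edges
```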
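A sketch of the projection step, following the bullets above: a Dijkstra variant whose edge cost sums Euclidean length, a state-dependent feature cost, and distance to the continuous demonstration. The blending weights and every name here are assumptions, not the slides' actual implementation.

```python
import heapq
import numpy as np

def project_demonstration(nodes, edges, demo, start, goal,
                          feature_cost, w_len=1.0, w_feat=1.0, w_demo=1.0):
    """Project a continuous demonstration onto a discrete graph path.

    nodes        -- dict node_id -> (x, y) configuration
    edges        -- dict node_id -> list of neighbor node_ids
    demo         -- (T, 2) array of demonstration waypoints
    feature_cost -- callable node_id -> state-dependent feature cost
    """
    def demo_dist(nid):
        # distance from a node to the closest demonstration waypoint
        return np.min(np.linalg.norm(demo - np.asarray(nodes[nid]), axis=1))

    dist, parent = {start: 0.0}, {start: None}
    pq = [(0.0, start)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == goal:
            break
        if d > dist.get(u, np.inf):
            continue  # stale queue entry
        for v in edges[u]:
            step = np.linalg.norm(np.asarray(nodes[u]) - np.asarray(nodes[v]))
            # modified edge cost: path length + features + distance to demo
            c = w_len * step + w_feat * feature_cost(v) + w_demo * demo_dist(v)
            if d + c < dist.get(v, np.inf):
                dist[v], parent[v] = d + c, u
                heapq.heappush(pq, (d + c, v))
    if goal not in parent:
        return None  # goal unreachable on this graph
    path, n = [], goal
    while n is not None:  # backtrack to recover the discrete path
        path.append(n)
        n = parent[n]
    return path[::-1]
```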
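And a sketch of the softened value iteration at the heart of MaxEnt IOC (Ziebart et al., 2008): the Bellman backup's hard min becomes a soft-min (a log-sum-exp), which makes the demonstration log-likelihood differentiable in θ. The outer gradient step on θ (demonstration feature counts minus expected feature counts under the induced policy) is omitted; names are mine.

```python
import numpy as np

def soft_value_iteration(edge_cost, succ, goal, n_iters=200):
    """Softened value iteration on a discrete graph (MaxEnt IOC).

    Replaces min with soft-min: V(s) = -log sum_{s'} exp(-(c(s,s') + V(s'))).
    edge_cost -- dict (s, s') -> cost, e.g. theta^T f(s, s')
    succ      -- dict s -> list of successor states
    goal      -- absorbing goal state with V(goal) = 0
    """
    V = {s: np.inf for s in succ}
    V[goal] = 0.0
    for _ in range(n_iters):
        for s in succ:
            if s == goal:
                continue
            q = np.array([edge_cost[(s, t)] + V[t] for t in succ[s]])
            m = q.min()
            if np.isinf(m):
                continue  # nothing reachable from s yet
            V[s] = m - np.log(np.sum(np.exp(m - q)))  # stable soft-min
    # The induced stochastic policy is P(s'|s) ∝ exp(-(c(s,s') + V(s'))).
    return V
```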
Setup
• Binary state-dependent features (~95), e.g.:
  – Histograms of distances to objects
  – Histograms of end-effector orientation
  – Object-specific features (electronic vs. non-electronic)
  – Approach direction w.r.t. the goal
  (a feature-encoding sketch follows at the end of this section)

Experimental Results
• Comparison against:
  – Human demonstrations
  – Obstacle avoidance planner (CHOMP)
  – Locally optimal IOC approach (similar to Max-Margin Planning, Ratliff et al., 2007)

Laptop task: Demonstration
Laptop task: LTO + Discrete graph path (not part of the training set)
Statistics for Laptop task
Laptop task: LTO + Smooth random path
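The binary histogram features in the Setup list can be read as one-hot bin indicators; the full ~95-dimensional vector would concatenate such blocks across objects and orientation bins. A minimal sketch under that reading (bin edges and names are illustrative, not from the slides):

```python
import numpy as np

def distance_histogram_features(dist, bin_edges):
    """One-hot 'histogram' feature: a 1 in the bin containing dist.

    dist      -- scalar distance from the state to one object
    bin_edges -- increasing array of bin boundaries (num_bins + 1 values)
    """
    f = np.zeros(len(bin_edges) - 1)
    idx = np.searchsorted(bin_edges, dist, side="right") - 1
    if 0 <= idx < len(f):  # distances outside the edges yield all zeros
        f[idx] = 1.0
    return f

# e.g., distance-to-laptop binned at 0-5, 5-10, 10-20, 20-40 cm
phi = distance_histogram_features(0.12, np.array([0.0, 0.05, 0.10, 0.20, 0.40]))
# phi == [0., 0., 1., 0.]
```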