Imitating Latent Policies from Observation
Ashley D. Edwards, Himanshu Sahni, Yannick Schroecker, Charles L. Isbell
Georgia Institute of Technology
Introduction
• Imitation from Observation enables learning from state sequences
• Typical approaches need extensive environment interactions
• Humans can learn policies just by watching
Approach
Given: a sequence of noisy expert observations
Assumption: discrete actions with deterministic transitions
• z is a latent action that caused a transition to occur
• z can correspond to a real action or some other type of transition
[Figure: two example transitions, each labeled "Action: Right", assigned to different latent actions z = 1 and z = 2]
• A latent policy is the probability of taking a latent action in some state
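In symbols (notation ours, not from the slide), given consecutive expert observations s_t and s_{t+1}, two quantities are learned:
• π(z | s_t): the latent policy, a distribution over latent actions z ∈ {1, …, N_z} in state s_t
• G(s_t, z) ≈ s_{t+1}: a latent forward model predicting the next observation if latent action z caused the transition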
Approach: ILPO
1. Given a sequence of observations, learn a latent policy
2. Use a few environment steps to align latent actions with real actions
Approach: ILPO, step 1
(a) Latent policy network
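A minimal PyTorch sketch of this step, assuming low-dimensional states; the sizes, names, and exact loss form below are our assumptions, not details given on the slide:

import torch
import torch.nn as nn

NUM_Z, STATE_DIM, HIDDEN = 4, 8, 64  # illustrative sizes

class LatentPolicyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(STATE_DIM, HIDDEN), nn.ReLU())
        # One next-state prediction per latent action z.
        self.dynamics = nn.Linear(HIDDEN, NUM_Z * STATE_DIM)
        # Latent policy head: logits over latent actions.
        self.policy = nn.Linear(HIDDEN, NUM_Z)

    def forward(self, s):
        h = self.encoder(s)
        preds = self.dynamics(h).view(-1, NUM_Z, STATE_DIM)
        return preds, self.policy(h)

def latent_loss(net, s, s_next):
    preds, logits = net(s)
    err = ((preds - s_next.unsqueeze(1)) ** 2).mean(dim=2)   # [B, NUM_Z]
    # Penalize only the best-matching latent action, so each z
    # specializes to one mode of the observed transitions.
    min_loss = err.min(dim=1).values.mean()
    # The expected prediction under pi(z|s) should also match s',
    # which trains the latent policy without any action labels.
    probs = torch.softmax(logits, dim=1)
    expected = (probs.unsqueeze(2) * preds.detach()).sum(dim=1)
    return min_loss + ((expected - s_next) ** 2).mean()

Both terms use only observation pairs: the min identifies which latent action best explains each transition, and the expectation term fits the latent policy to the expert data.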
Approach: ILPO, step 2
(b) Action remapping network
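A matching sketch of the remapping step, reusing LatentPolicyNet from the previous block; again, all names and sizes are illustrative assumptions:

import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_Z, STATE_DIM, NUM_ACTIONS = 4, 8, 3  # illustrative sizes

remap = nn.Sequential(
    nn.Linear(STATE_DIM + NUM_Z, 64), nn.ReLU(),
    nn.Linear(64, NUM_ACTIONS),
)
opt = torch.optim.Adam(remap.parameters(), lr=1e-3)

def remap_update(latent_net, s, a, s_next):
    """One alignment update from a single real transition (s, a, s_next)."""
    s = torch.as_tensor(s, dtype=torch.float32).unsqueeze(0)
    s_next = torch.as_tensor(s_next, dtype=torch.float32).unsqueeze(0)
    with torch.no_grad():
        preds, _ = latent_net(s)                    # [1, NUM_Z, STATE_DIM]
        # The latent action whose prediction best explains the
        # observed transition serves as the input context.
        z = ((preds - s_next.unsqueeze(1)) ** 2).mean(dim=2).argmin(dim=1)
    logits = remap(torch.cat([s, F.one_hot(z, NUM_Z).float()], dim=1))
    loss = F.cross_entropy(logits, torch.tensor([a]))  # predict the real action taken
    opt.zero_grad(); loss.backward(); opt.step()

At execution time the agent picks a latent action from the latent policy and acts with its remapped real action, so only these few interactions are needed to resolve the latent-to-real correspondence.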
Experiments: Classic Control
• Access to expert observations only
• No reward function used in approach
• Comparison to Behavioral Cloning from Observation [1]

[1] Torabi, Faraz, Garrett Warnell, and Peter Stone. "Behavioral cloning from observation." Proceedings of the 27th International Joint Conference on Artificial Intelligence. AAAI Press, 2018.
Experiments: CoinRun
Thank You!
Room: Pacific Ballroom at 6:30pm (today)
Poster: #33