Learning Latent Dynamics for Planning from Pixels
Danijar Hafner, Timothy Lillicrap, Ian Fischer, Ruben Villegas, David Ha, Honglak Lee, James Davidson
@danijar, danijar.com/planet
Planning with Learned Models
Watter et al., 2015; Banijamali et al., 2017; Zhang et al., 2017; Agrawal et al., 2016; Finn & Levine, 2016; Ebert et al., 2018
Visual Control Tasks
These tasks pose challenges including partial observability, many joints, sparse rewards, contact dynamics, and balance. Some model-free methods can solve these tasks but need up to 100,000 episodes.
We introduce PlaNet, a recipe for scalable model-based reinforcement learning:
1. Efficient planning in latent space with a large batch size
2. Reaches top performance using 200X fewer episodes
Latent Dynamics Model
The model learns to encode images, predict states forward, decode images, and decode rewards.
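The four components above can be sketched as simple functions. This is a minimal illustration, not the PlaNet architecture: the linear maps and all sizes below are hypothetical stand-ins for the learned convolutional and dense networks.

```python
import numpy as np

rng = np.random.default_rng(0)
Z, A, OBS = 4, 2, 16  # hypothetical latent, action, and flattened-image sizes

# Hypothetical linear stand-ins for the four learned networks.
W_enc = 0.1 * rng.normal(size=(OBS, Z))     # encode images -> latent states
W_pred = 0.1 * rng.normal(size=(Z + A, Z))  # predict next latent state
W_dec = 0.1 * rng.normal(size=(Z, OBS))     # decode latent states -> images
w_rew = 0.1 * rng.normal(size=Z)            # decode latent states -> rewards

def encode(image):
    return np.tanh(image @ W_enc)

def predict(state, action):
    return np.tanh(np.concatenate([state, action]) @ W_pred)

def decode_image(state):
    return state @ W_dec

def decode_reward(state):
    return float(state @ w_rew)

image = rng.normal(size=OBS)
state = encode(image)                     # encode images
next_state = predict(state, np.zeros(A))  # predict states
recon = decode_image(next_state)          # decode images
r = decode_reward(next_state)             # decode rewards
```

Because rewards are decoded from latent states, long action sequences can be evaluated entirely in latent space, without generating images.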
Recurrent State Space Model
[Figure: comparison of three latent transition models. A Recurrent Neural Network uses purely deterministic states h_1, h_2, h_3; a State Space Model uses purely stochastic states s_1, s_2, s_3; the Recurrent State Space Model combines both, splitting the state into a deterministic part h_t and a stochastic part s_t.]
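One RSSM transition can be sketched as: update the deterministic path with an RNN step, then sample the stochastic state from a prior conditioned on it. A minimal numpy sketch, with hypothetical linear layers in place of the learned transition networks and all sizes chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
H, S, A = 8, 4, 2  # hypothetical deterministic, stochastic, and action sizes

# Hypothetical linear stand-ins for the learned transition networks.
W_h = 0.1 * rng.normal(size=(H + S + A, H))
W_mean = 0.1 * rng.normal(size=(H, S))
W_std = 0.1 * rng.normal(size=(H, S))

def rssm_step(h, s, a):
    """One transition: deterministic path h_t, then stochastic s_t ~ prior(h_t)."""
    h_next = np.tanh(np.concatenate([h, s, a]) @ W_h)  # deterministic (RNN) path
    mean = h_next @ W_mean                             # prior mean over s_t
    std = np.log1p(np.exp(h_next @ W_std)) + 1e-4      # softplus keeps std positive
    s_next = mean + std * rng.normal(size=S)           # sample stochastic state
    return h_next, s_next

h, s = np.zeros(H), np.zeros(S)
for _ in range(3):
    h, s = rssm_step(h, s, np.zeros(A))
```

The deterministic path lets information persist reliably over many steps, while the stochastic states let the model represent multiple possible futures.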
Unguided Video Predictions by Single Agent
5 frames of context and 45 frames predicted
Planning in Latent Space
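Planning in latent space can be illustrated with the cross-entropy method: sample a population of action sequences, evaluate their predicted returns under the model, and refit a Gaussian to the best ones. A toy sketch with a hypothetical one-dimensional latent dynamics and reward in place of the learned model; all constants are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
HORIZON, POP, ELITES, ITERS, A = 10, 100, 10, 5, 1  # illustrative settings

def dynamics(s, a):
    # Toy stand-in for the learned latent transition.
    return 0.9 * s + a

def reward(s):
    # Toy stand-in for the learned reward decoder: stay near 1.0.
    return -float((s - 1.0) ** 2)

def plan(s0):
    """Cross-entropy method: iteratively refine a Gaussian over action sequences."""
    mean = np.zeros((HORIZON, A))
    std = np.ones((HORIZON, A))
    for _ in range(ITERS):
        # Sample a batch of candidate action sequences.
        actions = mean + std * rng.normal(size=(POP, HORIZON, A))
        returns = np.empty(POP)
        for i in range(POP):
            s, total = s0, 0.0
            for t in range(HORIZON):
                s = dynamics(s, float(actions[i, t, 0]))
                total += reward(s)
            returns[i] = total
        # Refit the Gaussian to the elite sequences.
        elite = actions[np.argsort(returns)[-ELITES:]]
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-6
    return float(mean[0, 0])  # execute only the first action, then replan

first_action = plan(s0=0.0)
```

Because rollouts happen purely in the compact latent space, thousands of candidate sequences can be evaluated in one large batch on a GPU.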
Comparison to Model-Free Agents
Training time: 1 day on a single GPU
Enabling More Model-Based RL Research
- Explore dynamics without supervision
- Distill the planner to save computation
- Value function to extend the planning horizon
Learning Latent Dynamics for Planning from Pixels
Website with code, videos, blog post, animated paper: danijar.com/planet