DeepMDP: Learning Continuous Latent Space Models for Representation Learning

Dec 30, 2023

DeepMDP: Learning Continuous Latent Space Models for Representation Learning
Carles Gelada, Saurabh Kumar, Jacob Buckman, Ofir Nachum, Marc G. Bellemare
Simple Representations for RL
DeepMDP: a latent space model of the MDP, built from neural networks and trained via the following two losses:
Reward Loss
Transition Loss
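
As a rough illustration of the two losses named above, here is a minimal sketch rather than the authors' implementation. It assumes deterministic transitions, so that comparing next-latent distributions reduces to a distance between two points, and it uses linear maps as stand-ins for the neural networks. All names (`phi`, `latent_reward`, `latent_transition`, the dimension constants) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
STATE_DIM, LATENT_DIM, NUM_ACTIONS = 8, 2, 3

# Linear stand-ins for the neural networks (an assumption, not the
# architecture used in the presentation).
W_phi = rng.normal(size=(LATENT_DIM, STATE_DIM))                   # encoder
w_reward = rng.normal(size=(NUM_ACTIONS, LATENT_DIM))              # latent reward model
W_trans = rng.normal(size=(NUM_ACTIONS, LATENT_DIM, LATENT_DIM))   # latent transition model


def phi(state):
    """Embed an environment state into the latent space."""
    return W_phi @ state


def latent_reward(z, action):
    """Reward predicted by the latent model for latent state z."""
    return float(w_reward[action] @ z)


def latent_transition(z, action):
    """Next latent state predicted by the (deterministic) latent model."""
    return W_trans[action] @ z


def reward_loss(state, action, reward):
    """Absolute gap between the observed and the latent-model reward."""
    return abs(reward - latent_reward(phi(state), action))


def transition_loss(state, action, next_state):
    """Distance between the embedded next state and the predicted next latent."""
    predicted = latent_transition(phi(state), action)
    return float(np.linalg.norm(phi(next_state) - predicted))


def deepmdp_loss(state, action, reward, next_state):
    """Unweighted sum of the two losses for one transition."""
    return reward_loss(state, action, reward) + transition_loss(state, action, next_state)


# One made-up transition (s, a, r, s').
s = rng.normal(size=STATE_DIM)
s_next = rng.normal(size=STATE_DIM)
print(deepmdp_loss(s, action=1, reward=0.5, next_state=s_next))
```

In practice the encoder and latent models would be trained jointly by minimizing this quantity over sampled transitions; how the two terms are weighted is a design choice not specified here.
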
Tractable Losses
Deep Policies
Representation Quality
Only Discards: Ferns, N., Panangaden, P., and Precup, D. Metrics for Finite Markov Decision Processes. In Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence, UAI ’04, pp. 162–169, 2004.
Phi as a Representation
Donut World
DeepMDP on Donut World: 2D latent space + DeepMDP losses
DeepMDP on Donut World: Visualization of latent distance
DeepMDP Auxiliary Task: Base C51 agent + DeepMDP losses
● DeepMDPs as Models of the Environment
● Norm-MMD Metrics and their Associated Smoothness
Thanks for listening. Poster #108
