Using State Predictions for Value Regularization in Curiosity Driven Deep Reinforcement Learning
Oliver Richter, Manuel Fritsche, Gino Brunner, Roger Wattenhofer
ETH Zurich, Distributed Computing, www.disco.ethz.ch
Base actions on predictions
Reinforcement learning: Agent ↔ Environment
How to choose the action?
Return value
Value function
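For reference, the standard definitions behind these two slides (not taken from the deck, common RL notation assumed): the return is the discounted sum of future rewards, and the value function is its expectation under the policy.

```latex
% Discounted return from time step t (gamma = discount factor)
G_t = \sum_{k=0}^{\infty} \gamma^{k} \, r_{t+k+1}

% Value function: expected return when starting in state s and following policy pi
V^{\pi}(s) = \mathbb{E}_{\pi}\!\left[ G_t \mid s_t = s \right]
```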
Sparse reward settings: Agent ↔ Environment, but the reward signal is rarely observed
Reward the exploration of novel states
How to find novel states? Make predictions, get surprised when they are wrong.
Curiosity: prediction vs. reality
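A minimal sketch (not from the slides) of how this "surprise" is typically turned into an intrinsic reward: the agent is paid in proportion to its forward-model prediction error. The function name and the scale factor are illustrative.

```python
import numpy as np

def intrinsic_reward(predicted_next_features, next_features, scale=0.5):
    """Curiosity bonus: reward the agent in proportion to how badly its
    forward model predicted the features of the state it actually reached."""
    prediction_error = np.sum((predicted_next_features - next_features) ** 2)
    return scale * prediction_error

# Example: a poorly predicted transition yields a larger exploration bonus.
bonus = intrinsic_reward(np.zeros(4), np.array([0.1, 0.0, 0.9, 0.2]))
```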
Asynchronous Advantage Actor-Critic (A3C) architecture: Feature Extractor → A3C Network
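Not on the slide, but for completeness, the standard losses optimized by the A3C Network box (in the style of Mnih et al., 2016; notation assumed):

```latex
% Advantage estimate from an n-step bootstrapped return R_t
A_t = R_t - V(s_t; \theta_v)

% Policy (actor) loss with an entropy bonus H to encourage exploration
L_{\pi} = -\log \pi(a_t \mid s_t; \theta)\, A_t \;-\; \beta\, H\big(\pi(\cdot \mid s_t; \theta)\big)

% Value (critic) loss
L_V = \tfrac{1}{2}\big(R_t - V(s_t; \theta_v)\big)^2
```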
Adding curiosity: Feature Extractor 1 → A3C Network; Feature Extractor 2 → Forward Model
Learning good features: Feature Extractor 1 → A3C Network; Feature Extractor 2 → Forward Model and Inverse Model (Pathak et al., ICML 2017: A3C + ICM)
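The slide names the ICM components but not their objectives. Roughly, following Pathak et al. (ICML 2017), the two models are trained as below (the weighting between the terms is omitted here):

```latex
% Forward model f: predict next-state features from current features and action
L_{F} = \tfrac{1}{2}\,\big\| \hat{\phi}(s_{t+1}) - \phi(s_{t+1}) \big\|_2^2,
\qquad \hat{\phi}(s_{t+1}) = f\big(\phi(s_t), a_t\big)

% Inverse model g: predict the taken action from consecutive feature encodings,
% which pushes the features phi to encode only what the agent can influence
L_{I} = \mathrm{CE}\big(\hat{a}_t, a_t\big),
\qquad \hat{a}_t = g\big(\phi(s_t), \phi(s_{t+1})\big)
```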
Good features for all: one Feature Extractor → A3C Network, Forward Model and Inverse Model (A3C + Pred)
Adding value prediction: Feature Extractor → A3C Network; Forward Model output → A3C Network (predicted value); Inverse Model (A3C + Pred + VPC)
Value Prediction Consistency
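The slides do not spell out the consistency loss. One plausible reading, given the architecture slide above (forward-model output fed back into the A3C network), is a regularizer that ties the value of the current state to a one-step backup through the value of the predicted next state. A hypothetical PyTorch sketch under that assumption, with all names invented for illustration and no claim to match the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def value_prediction_consistency_loss(value_net, phi_t, phi_next_pred,
                                      reward, gamma=0.99):
    """Hypothetical VPC regularizer (sketch, not the paper's exact loss):
    the value of the current state should be consistent with a one-step
    backup through the value of the *predicted* next-state features.

    value_net:      critic head mapping features to a scalar value (assumed)
    phi_t:          features of the current state s_t (assumed)
    phi_next_pred:  forward-model prediction of the next state's features
    reward:         observed extrinsic + intrinsic reward for the transition
    """
    v_t = value_net(phi_t)                  # V(phi(s_t))
    v_next_pred = value_net(phi_next_pred)  # V(forward_model(phi(s_t), a_t))
    target = reward + gamma * v_next_pred.detach()
    return F.mse_loss(v_t, target)
```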
Let’s see how it works in practice
[Plot: Rewards per episode]
Thinking bigger
[Plot: Rewards per episode]
Doom environment
Doom Setup
[Plot: Rewards per episode]
Questions & Answers?