
Parameter Space Noise for Exploration, Matthias Plappert et al. - PowerPoint PPT Presentation



  1. Parameter Space Noise for Exploration. Matthias Plappert, Rein Houthooft, Prafulla Dhariwal, Szymon Sidor, Richard Y. Chen, Xi Chen, Tamim Asfour, Pieter Abbeel, and Marcin Andrychowicz

  2. “Let the Noise Flo” - Flo Rida

  3. Background – Reinforcement Learning
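The body of this slide is a figure. As standard background (an assumption about what the slide covered, not a quote from it), the reinforcement learning objective the talk builds on is to find a policy π_θ maximizing the expected discounted return:

```latex
% Expected discounted return over trajectories generated by the policy:
J(\theta) = \mathbb{E}_{\tau \sim \pi_\theta}\!\left[\sum_{t=0}^{T} \gamma^{t}\, r(s_t, a_t)\right]
```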

  4. Parameter Space Noise – Motivation

  5. Parameter Space Noise – Formulation. We sample the noise at the beginning of each rollout and keep it fixed for the duration of the rollout.
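A minimal sketch of this per-rollout sampling scheme. The environment interface, noise scale, and parameter layout here are illustrative assumptions, not taken from the slides:

```python
import numpy as np

def perturb_parameters(theta, sigma, rng):
    """Draw one additive Gaussian perturbation for every weight array."""
    return {name: w + rng.normal(0.0, sigma, size=w.shape)
            for name, w in theta.items()}

def run_rollout(env, policy, theta, sigma=0.1, seed=0):
    """The noise is sampled once here and then held fixed, so the
    perturbed policy acts consistently for the whole episode."""
    rng = np.random.default_rng(seed)
    theta_tilde = perturb_parameters(theta, sigma, rng)
    obs, total_reward, done = env.reset(), 0.0, False
    while not done:
        action = policy(theta_tilde, obs)   # act with the perturbed weights
        obs, reward, done = env.step(action)
        total_reward += reward
    return total_reward
```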

  6. Parameter Space Noise – Formulation
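The equation on this slide is an image. Reconstructed from the published paper rather than the slide itself, the perturbed policy weights are the current weights plus spherical Gaussian noise:

```latex
% Spherical Gaussian noise added directly to the policy parameters,
% resampled at the start of each rollout:
\tilde{\theta} = \theta + \epsilon, \qquad \epsilon \sim \mathcal{N}\!\left(0, \sigma^{2} I\right)
```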

  7. Parameter Space Noise – Problems

  8. Parameter Space Noise – Problems

  9. Parameter Space Noise – Problems

  10. Parameter Space Noise – Problem 1. With layer normalization, adding noise to the weights perturbs activations that are normalized to zero mean and unit variance, so each layer has a similar sensitivity to the noise.
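A minimal sketch of this effect. The layer sizes and noise scale are illustrative assumptions:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize activations to zero mean and unit variance.
    return (x - x.mean()) / (x.std() + eps)

rng = np.random.default_rng(0)
sigma = 0.1                                  # one noise scale for the whole net
x = rng.normal(size=32)                      # toy input vector
W = rng.normal(size=(64, 32))                # toy layer weights
W_noisy = W + rng.normal(0.0, sigma, size=W.shape)

# Without normalization, the scale of the perturbed pre-activations
# depends on the layer; with layer norm it is always ~unit variance,
# so the same sigma affects every layer by a comparable amount.
print((W_noisy @ x).std())                   # layer-dependent scale
print(layer_norm(W_noisy @ x).std())         # ~1.0 regardless of the layer
```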

  11. Parameter Space Noise – Problem 2
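Assuming this slide refers to the paper's second problem, choosing the noise scale σ, a minimal sketch of the adaptive scheme the paper describes. The growth factor 1.01 and the distance-threshold rule follow the paper; the function name is illustrative:

```python
def adapt_sigma(sigma, distance, delta, factor=1.01):
    """Grow sigma while the perturbed policy stays within delta of the
    unperturbed one (distance measured in action space, e.g. mean
    action difference for DDPG); shrink it otherwise."""
    return sigma * factor if distance < delta else sigma / factor
```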

  12. Parameter Space Noise – Experiments (1). We test for exploration on a simple but scalable toy environment [1]: chains of length N with a fixed initial state. Each episode lasts N + 9 steps, and the algorithm is successful if it reaches the optimal reward of 10.
 Experiments on DQN with different exploration methods (a sketch of the chain environment follows below).
 [1] “Deep Exploration via Bootstrapped DQN”, Osband et al., 2016
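A minimal sketch of such a chain MDP, in the spirit of the cited Osband et al. environment. The exact reward placement and start state are illustrative assumptions, not copied from the slide:

```python
class Chain:
    """Toy chain MDP. States 1..N in a line; action 1 moves right,
    anything else moves left. Only the rightmost state pays reward, so
    undirected exploration takes exponentially long as N grows."""

    def __init__(self, n):
        self.n = n
        self.horizon = n + 9              # episode length from the slide

    def reset(self):
        self.state, self.t = 1, 0
        return self.state

    def step(self, action):
        self.state = (min(self.state + 1, self.n) if action == 1
                      else max(self.state - 1, 1))
        self.t += 1
        reward = 1.0 if self.state == self.n else 0.0
        done = self.t >= self.horizon
        return self.state, reward, done
```

Under these assumptions, heading straight right reaches state N after about N steps, leaving roughly ten reward-collecting steps before the episode ends, consistent with the optimal return of 10 quoted on the slide.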

  13. Parameter Space Noise – Experiments (2)

  14. Parameter Space Noise – Experiments (3)

  15. Parameter Space Noise – Experiments (4). Evaluation on 7 MuJoCo continuous control problems.
 DDPG with different exploration methods.
 [Figure: exploration with additive Gaussian action noise (left) vs. parameter space noise (right)]
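For contrast, a minimal sketch of the two exploration styles compared on this slide. The function signatures and noise scale are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def act_with_action_noise(policy, theta, obs, sigma_a=0.2):
    # Additive Gaussian action noise: unperturbed weights every step,
    # independent jitter added to each chosen action.
    a = policy(theta, obs)
    return a + rng.normal(0.0, sigma_a, size=a.shape)

def act_with_parameter_noise(policy, theta_tilde, obs):
    # Parameter space noise: theta_tilde was perturbed once at the start
    # of the rollout, so exploration is state-dependent and consistent
    # across the whole episode.
    return policy(theta_tilde, obs)
```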

  16. Parameter Space Noise – Experiments (5)

  17. Parameter Space Noise – Conclusion. Conceptually simple; designed as a drop-in replacement for (or an addition to) action space noise.
 Often leads to better performance due to better exploration.
 Helps most when exploration matters most (e.g. sparse rewards).
 Seems to escape local optima (e.g. HalfCheetah).
 Works for off- and on-policy algorithms, for both discrete and continuous action spaces.

  18. Parameter Space Noise – Related Work. Concurrently with our work, DeepMind proposed “Noisy Networks for Exploration”, Fortunato et al., 2017.
 “Deep Exploration via Bootstrapped DQN”, Osband et al., 2016.
 “Evolution Strategies as a Scalable Alternative to Reinforcement Learning”, Salimans et al., 2017.
 “State-Dependent Exploration for Policy Gradient Methods”, Rückstieß et al., 2008.
 And many other papers on the general topic of exploration in RL.

  19. Thank you!
