Statistics and Samples in Distributional Reinforcement Learning


  1. Statistics and Samples in Distributional Reinforcement Learning. Mark Rowland, Robert Dadashi, Saurabh Kumar, Rémi Munos, Marc G. Bellemare, Will Dabney. ICML 2019. Google Research, Brain team.

  2. Distributional Reinforcement Learning. Distributional RL aims to learn full return distributions. Defined on this slide (the equations themselves are missing from the transcript): the return distribution, the distributional Bellman operator [Bellemare et al., 2017], and the distributional Bellman equation. Statistics and Samples in Distributional Reinforcement Learning - MARK ROWLAND
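The three objects named on slide 2 can be written in the standard notation of the distributional RL literature. This is a sketch under assumptions: the symbols $\eta_\pi$, $\mathcal{T}^\pi$, and $f_{r,\gamma}$ are chosen here for illustration and may differ from what the slide showed. Here $\eta_\pi(x, a)$ denotes the distribution of the discounted return from state-action pair $(x, a)$ under policy $\pi$.

```latex
% Return distribution: the law of the discounted return under policy \pi
\eta_\pi(x, a) = \mathrm{Law}\!\left( \sum_{t=0}^{\infty} \gamma^t R_t \,\middle|\, X_0 = x,\ A_0 = a \right)

% Distributional Bellman operator: shift and scale the next-state return
% distribution by the map f_{r,\gamma}(z) = r + \gamma z, then average over transitions
(\mathcal{T}^\pi \eta)(x, a) = \mathbb{E}_\pi\!\left[ (f_{R_0,\gamma})_{\#}\, \eta(X_1, A_1) \,\middle|\, X_0 = x,\ A_0 = a \right]

% Distributional Bellman equation: the return distribution is a fixed point
\eta_\pi = \mathcal{T}^\pi \eta_\pi
```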

  3. Distributional Reinforcement Learning. In practice, we often work with parametric approximate distributions. Figure panel: Non-parametric.

  4. Distributional Reinforcement Learning. In practice, we often work with parametric approximate distributions. Figure panels: Non-parametric; Categorical [Bellemare et al., 2017].

  5. Distributional Reinforcement Learning. In practice, we often work with parametric approximate distributions. Figure panels: Non-parametric; Categorical [Bellemare et al., 2017]; Dirac deltas [Dabney et al., 2018].
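For reference, the two parametric families named on slide 5 are usually written as below. The symbols ($K$, $z_k$, $p_k$, $N$, $\theta_i$) are assumptions introduced here for illustration and do not come from the slides.

```latex
% Categorical [Bellemare et al., 2017]: fixed support z_1 < ... < z_K, learned probabilities
\eta(x, a) = \sum_{k=1}^{K} p_k(x, a)\, \delta_{z_k},
\qquad p_k(x, a) \ge 0,\quad \sum_{k=1}^{K} p_k(x, a) = 1

% Dirac deltas [Dabney et al., 2018]: fixed equal weights, learned locations
\eta(x, a) = \frac{1}{N} \sum_{i=1}^{N} \delta_{\theta_i(x, a)}
```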

  6. Main Contribution: An Alternative Perspective. Distributional RL algorithms learn statistical functionals of the return distribution: moments, tail probabilities, expectations, etc.

  7. Main Contribution: An Alternative Perspective. Distributional RL algorithms learn statistical functionals of the return distribution: moments, tail probabilities, expectations, etc. Theory: what properties of return distributions can be learnt through dynamic programming? Algorithmic: a general framework for approximate learning of statistics.

  9. A General Framework for Distributional RL Algorithms. Diagram labels, in pipeline order: current statistics; imputation strategy; imputed samples; Bellman-updated distribution; Bellman-updated statistics.
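The diagram labels on this slide suggest a loop: impute samples consistent with the current statistics, apply the Bellman update to those samples, then read new statistics off the updated distribution. The Python sketch below illustrates that loop for one deterministic transition, using quantiles as the statistics. All function names are hypothetical, and the imputation shown (one equally weighted sample per quantile estimate) is an assumption chosen for simplicity, not the authors' implementation.

```python
import math
from typing import Callable, List, Sequence


def midpoint_levels(n: int) -> List[float]:
    """Quantile levels (2i + 1) / (2n) for i = 0, ..., n - 1."""
    return [(2 * i + 1) / (2 * n) for i in range(n)]


def impute_from_quantiles(stats: Sequence[float]) -> List[float]:
    """Imputation strategy (assumed): one equally weighted sample per quantile estimate."""
    return list(stats)


def bellman_update(samples: Sequence[float], reward: float, gamma: float) -> List[float]:
    """Apply z -> reward + gamma * z to every imputed sample."""
    return [reward + gamma * z for z in samples]


def quantile(samples: Sequence[float], tau: float) -> float:
    """tau-quantile of equally weighted samples (generalised inverse CDF)."""
    ordered = sorted(samples)
    # Smallest index k such that (k + 1) / n >= tau.
    k = max(0, math.ceil(tau * len(ordered)) - 1)
    return ordered[k]


def extract_quantiles(samples: Sequence[float]) -> List[float]:
    """Read the midpoint quantiles back off a sample-based distribution."""
    return [quantile(samples, tau) for tau in midpoint_levels(len(samples))]


def framework_step(
    next_stats: Sequence[float],
    reward: float,
    gamma: float,
    impute: Callable[[Sequence[float]], List[float]],
    extract: Callable[[Sequence[float]], List[float]],
) -> List[float]:
    """Current statistics -> imputed samples -> updated distribution -> updated statistics."""
    samples = impute(next_stats)
    updated = bellman_update(samples, reward, gamma)
    return extract(updated)


if __name__ == "__main__":
    new_stats = framework_step(
        [0.0, 1.0, 2.0, 3.0], reward=1.0, gamma=0.5,
        impute=impute_from_quantiles, extract=extract_quantiles,
    )
    print(new_stats)
```

Passing `impute` and `extract` as arguments keeps the statistic pluggable: swapping in a different pair (for example, for expectiles) changes the algorithm without touching the Bellman update step.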

  15. Application: Expectiles. We apply this framework to learn expectiles of return distributions. New deep RL agent: Expectile Regression DQN (ER-DQN), with improved mean performance on Atari-57 relative to QR-DQN.
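As background for slide 15: the expectile of a distribution at level tau is the value q that minimises the asymmetric squared loss E[|tau - 1{Z < q}| (Z - q)^2], and the 0.5-expectile is the mean. The Python sketch below computes an expectile of a finite sample set by bisection on the first-order condition of that loss. The function names are hypothetical, and this is only an illustration of the statistic itself, not the ER-DQN agent or its training procedure.

```python
from typing import Sequence


def expectile_loss(samples: Sequence[float], tau: float, q: float) -> float:
    """Average asymmetric squared loss |tau - 1{z < q}| * (z - q)^2."""
    total = 0.0
    for z in samples:
        weight = (1.0 - tau) if z < q else tau
        total += weight * (z - q) ** 2
    return total / len(samples)


def expectile_condition(samples: Sequence[float], tau: float, q: float) -> float:
    """First-order condition tau * E[(Z - q)+] - (1 - tau) * E[(q - Z)+].

    The value is positive when q lies below the expectile, negative when
    q lies above it, and decreases as q increases.
    """
    above = sum(max(z - q, 0.0) for z in samples)
    below = sum(max(q - z, 0.0) for z in samples)
    return (tau * above - (1.0 - tau) * below) / len(samples)


def expectile(samples: Sequence[float], tau: float, iters: int = 100) -> float:
    """tau-expectile of equally weighted samples, found by bisection (0 < tau < 1)."""
    lo, hi = min(samples), max(samples)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if expectile_condition(samples, tau, mid) > 0:
            lo = mid  # mid is below the expectile
        else:
            hi = mid  # mid is at or above the expectile
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    returns = [0.0, 1.0, 2.0, 3.0]
    for tau in (0.1, 0.5, 0.9):
        print(tau, expectile(returns, tau))
```

Bisection is used here because the first-order condition is monotone in q, so no step size needs tuning; a gradient-based learner would instead take stochastic gradient steps on `expectile_loss`.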

  16. Summary. A new perspective on distributional RL; theoretical progress on what it is possible to learn; a general framework for distributional RL algorithms.

  17. Thank you. Poster #113.
