Kernel-based Reinforcement Learning in Robust Markov Decision Processes - PowerPoint PPT Presentation


  1. Kernel-based Reinforcement Learning in Robust Markov Decision Processes Shiau Hong Lim, Arnaud Autef

  2. Motivation • Robust Markov Decision Process (MDP) framework – Tackles model mismatch and parameter uncertainty – Previously, for state aggregation, the performance bound was improved via robust policies (12/6/2019, Arnaud Autef, ICML 2019)

  3. Contribution: (1) The robust performance bound improvement is extended to the general kernel averager setting. (2) Formulation of a practical kernel-based robust algorithm, with empirical results on benchmark tasks.

  4. Kernel-based approach: (1) An MDP to solve. (2) A kernel averager and representative states to approximate the value function.

  5. Kernel-based approach: (2) Define a non-trivial robust MDP whose states are the representative states. (3) Obtain the optimal robust value in the robust MDP. (4) Derive a policy in the original MDP that is greedy w.r.t. this robust value.

  6. Theoretical Result: a theorem relating the optimal robust value in the robust MDP, the greedy policy w.r.t. that value, and the optimal value in the original MDP. The terms of the bound are annotated "Function approximator limitations" and "Smoothness".

  7. Practical algorithm: (1) A second kernel averager approximates the MDP model from data. (2) Solve with the approximate robust Bellman operator, which takes a robustness parameter.

  8. Experiments: Acrobot

  9. Acrobot

  10. Experiments: Double Pole Balancing

  11. Double Pole Balancing

  12. Conclusion • Theoretical performance guarantees for robust kernel-based reinforcement learning • Significant empirical benefits from robustness, even stronger with model mismatch (real-world settings)

  13. Thank you! Please come to see our poster tonight Shiau Hong Lim, Arnaud Autef
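The kernel averager mentioned on slides 4 and 7 can be illustrated with a minimal Python sketch. The formulas from the slides were not preserved, so this is not the authors' exact construction: it assumes a Gaussian kernel, and the names `gaussian_kernel_weights`, `averaged_value`, and `bandwidth` are hypothetical. It shows only the general idea of a kernel averager, where the value at a query state is a convex combination of the values stored at the representative states.

```python
import numpy as np


def gaussian_kernel_weights(query, rep_states, bandwidth=1.0):
    """Normalized kernel weights of `query` over the representative states.

    Assumption: a Gaussian kernel; the slides do not name the kernel used.
    """
    sq_dists = np.sum((rep_states - query) ** 2, axis=1)
    # Shift by the smallest distance so the largest exponent is exactly 0,
    # which avoids dividing by zero when the bandwidth is very small.
    logits = -(sq_dists - sq_dists.min()) / (2.0 * bandwidth ** 2)
    weights = np.exp(logits)
    return weights / weights.sum()


def averaged_value(query, rep_states, rep_values, bandwidth=1.0):
    """Approximate V(query) as a weighted average of representative values."""
    weights = gaussian_kernel_weights(query, rep_states, bandwidth)
    return float(weights @ rep_values)


# Toy data (made up for illustration): three 1-D representative states.
rep_states = np.array([[0.0], [1.0], [2.0]])
rep_values = np.array([0.0, 1.0, 4.0])
print(averaged_value(np.array([1.5]), rep_states, rep_values, bandwidth=0.5))
```

Because the weights are non-negative and sum to one, the approximated value always lies between the smallest and largest representative values.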
