
Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models

Kurtland Chua, Roberto Calandra, Rowan McAllister, Sergey Levine
University of California, Berkeley


  1. Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models. Kurtland Chua, Roberto Calandra, Rowan McAllister, Sergey Levine. University of California, Berkeley

  2. How Long Does Learning Take? ~50 million frames [Mnih et al. 2015]; ~800,000 grasp attempts [Levine et al. 2017]; ~21 million games [Silver et al. 2017]

  3. Can we speed this up?

  4. Model-Based Reinforcement Learning: Train Dynamics Model, Optimize Policy, Execute Policy

  5-6. Comparative Performance on HalfCheetah

  7-11. Deterministic Neural Nets as Models

  12. Probabilistic Neural Nets as Models

  13-14. Probabilistic Ensembles as Models

  15-24. Trajectory Sampling for State Propagation

  25. Experimental Results

  26. Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models. Poster #165. Code: https://github.com/kchua/handful-of-trials Website: https://sites.google.com/view/drl-in-a-handful-of-trials Data efficient; competitive asymptotic performance; easy to implement. Roberto Calandra, Rowan McAllister, Sergey Levine, Kurtland Chua
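
The slide titles name two ingredients, probabilistic ensembles as models and trajectory sampling for state propagation, but the transcript does not show how they fit together. The following is a minimal sketch under stated assumptions, not the authors' implementation (that is in the repository linked above). It assumes a toy one-dimensional system, replaces the neural networks with bootstrapped linear-Gaussian models so that it runs with NumPy alone, and assigns each particle one fixed ensemble member for the whole rollout, which is only one possible choice. The names `fit_ensemble` and `propagate` are hypothetical.

```python
import numpy as np


def fit_ensemble(states, actions, next_states, n_models=5, seed=0):
    """Fit n_models linear-Gaussian models s' ~ N(w . [s, a, 1], sigma^2).

    Each model sees its own bootstrap resample of the data, so the models
    disagree where data is scarce. Linear models are an assumption made
    here to keep the sketch small; they stand in for neural networks.
    """
    rng = np.random.default_rng(seed)
    features = np.column_stack([states, actions, np.ones_like(states)])
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, len(features), size=len(features))  # bootstrap
        w, *_ = np.linalg.lstsq(features[idx], next_states[idx], rcond=None)
        residual = next_states[idx] - features[idx] @ w
        sigma = np.sqrt(np.mean(residual ** 2) + 1e-8)  # predicted noise scale
        models.append((w, sigma))
    return models


def propagate(models, s0, action_seq, n_particles=20, seed=0):
    """Propagate particles through the ensemble for a given action sequence.

    Each particle is tied to one randomly chosen ensemble member and, at
    every step, draws its next state from that member's predicted Gaussian.
    Returns an array of shape (len(action_seq) + 1, n_particles).
    """
    rng = np.random.default_rng(seed)
    member = rng.integers(0, len(models), size=n_particles)
    weights = np.stack([models[m][0] for m in member])  # (particles, 3)
    sigmas = np.array([models[m][1] for m in member])   # (particles,)
    s = np.full(n_particles, float(s0))
    trajectory = [s]
    for a in action_seq:
        feats = np.column_stack(
            [s, np.full(n_particles, float(a)), np.ones(n_particles)]
        )
        mean = np.sum(weights * feats, axis=1)
        s = mean + sigmas * rng.standard_normal(n_particles)
        trajectory.append(s)
    return np.stack(trajectory)


# Toy data from an assumed system s' = 0.9 s + 0.5 a + noise.
data_rng = np.random.default_rng(42)
demo_s = data_rng.uniform(-1.0, 1.0, size=200)
demo_a = data_rng.uniform(-1.0, 1.0, size=200)
demo_next = 0.9 * demo_s + 0.5 * demo_a + 0.05 * data_rng.standard_normal(200)

demo_models = fit_ensemble(demo_s, demo_a, demo_next)
demo_traj = propagate(demo_models, s0=1.0, action_seq=[0.0] * 10)
print("final-state mean and spread:", demo_traj[-1].mean(), demo_traj[-1].std())
```

The spread of the particles at the final step mixes two sources of uncertainty in this sketch: the noise each model predicts, and the disagreement between bootstrap members. A planner could score candidate action sequences by averaging a reward over the particles, though that step is not shown here.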
