Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models. Kurtland Chua, Roberto Calandra, Rowan McAllister, Sergey Levine. University of California, Berkeley
How Long Does Learning Take? ~50 million frames [Mnih et al. 2015]; ~800,000 grasp attempts [Levine et al. 2017]; ~21 million games [Silver et al. 2017]
Can we speed this up?
Model-Based Reinforcement Learning: a loop of Train Dynamics Model, Optimize Policy, Execute Policy
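The three-stage loop on this slide can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the talk: the 1-D environment `true_step`, the one-parameter linear model, the quadratic cost, and the random-shooting planner are hypothetical stand-ins for a real environment, a neural network model, and a real planner.

```python
import random

def true_step(state, action):
    """Hypothetical 1-D environment, hidden from the agent: s' = s + 0.5 * a."""
    return state + 0.5 * action

def fit_model(data):
    """Train dynamics model: least-squares fit of k in (s' - s) = k * a."""
    num = sum(a * (s2 - s) for s, a, s2 in data)
    den = sum(a * a for s, a, s2 in data)
    return num / den if den > 0 else 0.0

def plan(k, state, rng, horizon=5, candidates=200):
    """Optimize policy: random shooting over action sequences under the model."""
    best_cost, best_first = float("inf"), 0.0
    for _ in range(candidates):
        seq = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        s, cost = state, 0.0
        for a in seq:
            s = s + k * a      # model-predicted next state
            cost += s * s      # assumed cost: squared distance from zero
        if cost < best_cost:
            best_cost, best_first = cost, seq[0]
    return best_first

def run(trials=3, steps=20, seed=0):
    rng = random.Random(seed)
    data, s = [], 2.0
    for _ in range(steps):     # seed the dataset with random actions
        a = rng.uniform(-1.0, 1.0)
        s2 = true_step(s, a)
        data.append((s, a, s2))
        s = s2
    final_states = []
    for _ in range(trials):
        k = fit_model(data)    # train dynamics model on all data so far
        s = 2.0
        for _ in range(steps):
            a = plan(k, s, rng)        # optimize policy (replan every step)
            s2 = true_step(s, a)       # execute policy in the environment
            data.append((s, a, s2))
            s = s2
        final_states.append(s)
    return k, final_states

if __name__ == "__main__":
    print(run())
```

Replanning at every step and executing only the first action keeps the sketch short. Other ways of turning a learned model into a policy would fit the same loop.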
Comparative Performance on HalfCheetah
Deterministic Neural Nets as Models
Probabilistic Neural Nets as Models
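The slide text gives only the title, so the following is a sketch under a common assumption: the model outputs the mean and log-variance of a Gaussian over the next state and is trained by minimizing the negative log-likelihood. The function name `gaussian_nll` and the scalar setting are illustrative choices, not the authors' code.

```python
import math

def gaussian_nll(target, mean, log_var):
    """Negative log-likelihood of target under N(mean, exp(log_var))."""
    var = math.exp(log_var)
    return 0.5 * (math.log(2 * math.pi) + log_var + (target - mean) ** 2 / var)

def average_nll(residuals, log_var):
    """Mean NLL over residuals (target - predicted mean) for one shared log-variance."""
    return sum(gaussian_nll(r, 0.0, log_var) for r in residuals) / len(residuals)

if __name__ == "__main__":
    residuals = [1.0, -1.0]           # empirical variance is 1, so log_var = 0 fits best
    for lv in (-2.0, 0.0, 2.0):
        print(lv, average_nll(residuals, lv))
```

Unlike a squared-error loss, this objective penalizes both overconfident predictions (variance too small) and underconfident ones (variance too large), which is what lets the model represent noise in the dynamics.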
Probabilistic Ensembles as Models
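An ensemble of probabilistic models can be sketched as follows. To stay self-contained, each member is a hypothetical linear-Gaussian model `y ~ N(k * x, var)` rather than a neural network, and members are fit on bootstrap resamples of the data. These choices, along with the helper names, are assumptions for illustration.

```python
import random
import statistics

def fit_member(data):
    """Fit slope k of y = k * x by least squares, plus the residual variance."""
    sxx = sum(x * x for x, _ in data)
    k = sum(x * y for x, y in data) / sxx
    var = sum((y - k * x) ** 2 for x, y in data) / len(data)
    return k, var

def fit_ensemble(data, n_members=5, seed=0):
    """Fit each member on its own bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        boot = [rng.choice(data) for _ in data]
        members.append(fit_member(boot))
    return members

def predict(members, x):
    """Return (mean, aleatoric variance, epistemic variance) at input x."""
    means = [k * x for k, _ in members]
    aleatoric = statistics.fmean(var for _, var in members)  # noise each member predicts
    epistemic = statistics.pvariance(means)                  # disagreement between members
    return statistics.fmean(means), aleatoric, epistemic

if __name__ == "__main__":
    rng = random.Random(1)
    data = [(float(x), 2.0 * x + rng.gauss(0.0, 0.5)) for x in range(1, 21)]
    members = fit_ensemble(data)
    print(predict(members, 1.0))
    print(predict(members, 10.0))
```

The two variance terms separate noise inherent in the data (what each member predicts) from uncertainty due to limited data (how much the members disagree).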
Trajectory Sampling for State Propagation
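One way to read this slide is as particle-based propagation: a set of particles starts at the current state, each particle is assigned an ensemble member, and each step samples the next state from that member's predictive distribution. The sketch below assumes hypothetical linear-Gaussian members `(k, var)` with `s' ~ N(s + k * a, var)`. The `resample_each_step` flag covers both keeping a particle's member fixed and redrawing it at every step, since the slide text does not say which is used.

```python
import random

def propagate(members, s0, actions, n_particles=20,
              resample_each_step=False, seed=0):
    """Propagate particles through the ensemble; returns one particle list per time step."""
    rng = random.Random(seed)
    particles = [s0] * n_particles
    assignment = [rng.randrange(len(members)) for _ in range(n_particles)]
    trajectory = [list(particles)]
    for a in actions:
        if resample_each_step:
            assignment = [rng.randrange(len(members)) for _ in range(n_particles)]
        next_particles = []
        for s, idx in zip(particles, assignment):
            k, var = members[idx]
            # sample the next state from the assigned member's Gaussian
            next_particles.append(rng.gauss(s + k * a, var ** 0.5))
        particles = next_particles
        trajectory.append(list(particles))
    return trajectory

def expected_cost(trajectory, cost_fn):
    """Sum over time steps (excluding the start) of the mean particle cost."""
    return sum(sum(cost_fn(s) for s in step) / len(step) for step in trajectory[1:])

if __name__ == "__main__":
    members = [(0.4, 0.01), (0.5, 0.01), (0.6, 0.01)]
    traj = propagate(members, 0.0, [1.0, 1.0, -1.0])
    print(expected_cost(traj, lambda s: s * s))
```

Averaging the cost over particles gives a score for a candidate action sequence that reflects both member disagreement and predicted noise, so a planner can compare sequences by this score.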
Experimental Results
Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models. Poster #165. Code: https://github.com/kchua/handful-of-trials Website: https://sites.google.com/view/drl-in-a-handful-of-trials Data efficient. Competitive asymptotic performance. Easy to implement. Roberto Calandra, Rowan McAllister, Sergey Levine, Kurtland Chua