Guided Policy Search (Sergey Levine)

  1. Guided Policy Search (Sergey Levine)

  2. Learning on PR2

  3. Shape sorting cube

  4. Visuomotor Policies

  5. Guided Policy Search: trajectory optimization, supervised learning

  6. Expectation under current policy; trajectory distribution(s); Lagrange multiplier

  7. Supervised Learning Objective

  8. Trajectory Optimization (without GPS)

  9. Trajectory Optimization

  10. Trajectory Optimization: new, old [see Levine & Abbeel ’14 for details]

  11. [see Levine et al., NIPS ’14 for details]

  12. Trajectory Optimization (with GPS)

  13. [see Levine et al., NIPS ’14 for details]

  14. Instrumented Training: training time, test time

  15. ~92,000 parameters (Chelsea Finn)

  16. Experimental Tasks

  17. Shape sorting cube

  18. Hanger

  19. Hammer

  20. Bottle

  21. Locomotion (Igor Mordatch): better trajectory optimization + large-scale simulation

  22. Darwin Robot (Igor Mordatch): better trajectory optimization + large-scale simulation + adaptation to real-world dynamics (Mordatch, Mishra, Eppner, Abbeel)

  23. Guided Policy Search Applications: manipulation, dexterous hands (with N. Wagener and P. Abbeel; with V. Kumar and E. Todorov); locomotion, aerial vehicles, tensegrity robot (with G. Kahn, T. Zhang, P. Abbeel; with M. Zhang, K. Caluwaerts, P. Abbeel; with V. Koltun)

  24. DAGGER: β_i is typically 0.0, except when i = 1, where it is 1.0

  25. DAGGER Video: see http://videolectures.net/aistats2011_ross_reduction/

  26. Trajectory Optimization – Dynamics Fitting

  27. [see Levine et al., NIPS ’14 for details]

  28. Learned Motion Skills

  29. More Visuomotor Experiments

  30. Beyond Instrumented Training: training time, test time [Finn, Tan, Duan, Darrell, Levine, Abbeel ’15]

  31. Learning Visual State Spaces

  32. Visual State Space Experiments
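
Slide 24 mentions a DAGGER quantity that is typically 0.0 except at i = 1, where it is 1.0. The sketch below assumes this is the expert-mixing weight of the DAgger algorithm covered in the talk linked on slide 25: on iteration i the system acts with the expert with that probability and with the learned policy otherwise, while the expert labels every visited state. The toy 1-D dynamics, the linear learner, and all function names here are illustrative assumptions, not taken from the slides.

```python
import random

def beta_schedule(i):
    """Assumed mixing weight: 1.0 on the first iteration, 0.0 afterwards."""
    return 1.0 if i == 1 else 0.0

def expert(state):
    """Hypothetical expert for a toy 1-D task: push the state toward zero."""
    return -state

def rollout_states(policy, x0=1.0, steps=5):
    """Run a policy on toy dynamics x <- x + 0.5 * action; return visited states."""
    states, x = [], x0
    for _ in range(steps):
        states.append(x)
        x = x + 0.5 * policy(x)
    return states

def fit_linear(dataset):
    """Least-squares fit of action = k * state on the aggregated dataset."""
    k = sum(s * a for s, a in dataset) / sum(s * s for s, _ in dataset)
    return lambda s: k * s

def dagger(n_iters=3, seed=0):
    rng = random.Random(seed)
    dataset = []
    learner = expert  # placeholder; never queried at i = 1 because beta is 1.0
    for i in range(1, n_iters + 1):
        beta = beta_schedule(i)

        def mixed(state, beta=beta, learner=learner):
            # Act with the expert with probability beta, else with the learner.
            return expert(state) if rng.random() < beta else learner(state)

        states = rollout_states(mixed)
        # The expert labels every visited state; data is aggregated across iterations.
        dataset.extend((s, expert(s)) for s in states)
        learner = fit_linear(dataset)
    return learner, dataset

learner, dataset = dagger()
```

With this schedule, the first iteration collects states purely from the expert, and every later iteration collects states from the learner's own rollouts, so the training data gradually covers the states the learned policy actually visits.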
