
Learning Symmetric and Low-energy Locomotion Wenhao Yu, Greg Turk, - PowerPoint PPT Presentation



  1. Learning Symmetric and Low-energy Locomotion Wenhao Yu, Greg Turk, C. Karen Liu Georgia Institute of Technology

  2. 2 /45

  3. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3591461/ https://www.youtube.com/watch?v=Wz3beWec7D4 2 /45

  4. [Robotis OP2] 3 /45

  5. [Yin et al. 2009] 4 /45

  6. [OptiTrack] [Liu et al. 2015] 5 /45

  7. No FSM No Mocap 6 /45

  8. [Tan et al. 2011] [Tan et al. 2014] 7 /45

  9. [Heess et al. 2017] [Peng et al. 2018] 8 /45

  10. Deep Reinforcement Learning 9 /45

  11. 1.0 m/s Deep Reinforcement Learning 3.0 m/s 9 /45

  12. Deep Reinforcement Learning, Energy Minimization, Gait Symmetry 9 /45

  13. Deep Reinforcement Learning, Energy Minimization, Gait Symmetry 10 /45

  14. Deep Reinforcement Learning State: 11 /45

  15. Deep Reinforcement Learning State: 11 /45

  16. State: 12 /45

  17. State: 12 /45

  18. State: Action: 13 /45

  19. State: Action: 14 /45

  20. State: Action: Transition: 14 /45

  21. State: Action: Transition: Reward: 15 /45

  22. Markov Decision Process: {State, Action, Transition, Reward} Control Policy: $\pi$ Rollout: $[s_0, a_0, s_1, a_1, \ldots, s_T]$ Reinforcement Learning: Deep Reinforcement Learning: 17 /45
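A rollout as listed on slide 22 can be sketched as a simple loop over state, action, transition, and reward. The 1-D dynamics, the reward, and the hand-written policy below are hypothetical stand-ins chosen only to make the loop runnable; none of them come from the talk, which uses a physics simulator and a neural-network policy.

```python
def transition(state, action):
    """Hypothetical 1-D dynamics: the state is a position, the action a velocity."""
    return state + 0.1 * action


def reward(state, action):
    """Hypothetical reward: stay near the origin while using small actions."""
    return -abs(state) - 0.01 * abs(action)


def policy(state):
    """Hypothetical control policy pi: push back toward the origin."""
    return -state


def rollout(policy, s0, horizon):
    """Collect [s_0, a_0, s_1, a_1, ..., s_T] and the summed reward."""
    trajectory = [s0]
    total_reward = 0.0
    state = s0
    for _ in range(horizon):
        action = policy(state)
        total_reward += reward(state, action)
        state = transition(state, action)
        trajectory += [action, state]
    return trajectory, total_reward


trajectory, total_reward = rollout(policy, s0=1.0, horizon=5)
```

With a horizon of 5 the trajectory holds six states interleaved with five actions, matching the alternating layout of the rollout on the slide.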

  23. Policy Gradient: $\dfrac{\partial L_{RL}(\pi)}{\partial \pi}$, $\pi_{\text{new}}$ 18 /45

  24. $T(\cdot) =$ $R(s, a) = -\|v - \bar{v}\| \cdot w_v - \|a\|_1 \cdot w_a - \text{LateralDeviation} \cdot w_L - \text{TorsoRotation} \cdot w_T + \text{AliveBonus}$ 19 /45
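The reward on slide 24 is a weighted sum of penalties plus an alive bonus, which can be sketched as a plain function. This is only an illustration under assumptions: the velocity is treated as a scalar forward speed, the lateral deviation and torso rotation are passed in as non-negative magnitudes, and all default weights and the bonus value are hypothetical placeholders, not the values used in the talk.

```python
def locomotion_reward(v, v_target, action, lateral_deviation, torso_rotation,
                      w_v=1.0, w_a=0.01, w_l=1.0, w_t=1.0, alive_bonus=1.0):
    """Weighted penalties plus an alive bonus (all weights are hypothetical)."""
    velocity_penalty = abs(v - v_target) * w_v          # deviation from target speed
    action_penalty = sum(abs(a) for a in action) * w_a  # L1 norm of the action
    lateral_penalty = lateral_deviation * w_l           # drift away from the heading line
    torso_penalty = torso_rotation * w_t                # torso tilt or twist magnitude
    return (-velocity_penalty - action_penalty
            - lateral_penalty - torso_penalty + alive_bonus)


# A perfectly on-target, zero-effort step earns only the alive bonus.
ideal = locomotion_reward(1.0, 1.0, [0.0, 0.0], 0.0, 0.0)
# A slower, off-course step with nonzero actions is penalized on every term.
penalized = locomotion_reward(0.5, 1.0, [1.0, -1.0], 0.2, 0.1)
```

Raising `w_a` relative to the other weights pushes the learner toward lower-effort actions, which is the energy-minimization lever the talk refers to.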

  25. Deep Reinforcement Learning 20 /45

  26. Deep Reinforcement Learning, Energy Minimization, Curriculum Learning 20 /45

  27. all possible gaits what we want Keep Balance Go Forward Energy Efficient 21 /45

  28. all possible gaits typical RL result what we want Keep Balance Somewhat Energy Efficient Go Forward Energy Efficient 21 /45

  29. search space what we want Keep Balance Go Forward Energy Efficient 22 /45

  30. Keep Balance Go Forward Energy Efficient https://www.youtube.com/watch?v=5BiesFPqYWE https://www.youtube.com/watch?v=5a21xnaqCzk 23 /45 https://www.youtube.com/watch?v=LjnZHbFpDxc

  31. lateral push walls waist support propel forward treadmill 24 /45

  32. lateral push walls waist support propel forward treadmill 24 /45

  33. With Assistance 25 /45

  34. Iteration i 26 /45

  35. Iteration i propel strength time (0,0) 26 /45

  36. Iteration i propel strength time (0,0) 26 /45

  37. Iteration i+1 propel strength time (0,0) 26 /45
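The curriculum slides plot the assistance (for example the propel strength) shrinking toward (0,0) from one iteration to the next. A minimal sketch of such a schedule follows, assuming a simple rule that is not stated on the slides: the assistance is reduced by a fixed step whenever the measured return reaches a threshold. `evaluate`, `ToyLearner`, the step size, and the threshold are all hypothetical.

```python
def run_curriculum(evaluate, start_strength, step, threshold, max_iterations=100):
    """Lower the assistance strength toward zero while the policy keeps performing.

    evaluate(strength) is a hypothetical stand-in for "train, then measure the
    return with this much assistance"; it is not part of the original talk.
    """
    strength = start_strength
    history = [strength]
    for _ in range(max_iterations):
        if strength <= 0.0:
            break  # no assistance left: the curriculum is finished
        if evaluate(strength) >= threshold:
            strength = max(0.0, strength - step)
        history.append(strength)
    return history


class ToyLearner:
    """Hypothetical learner whose skill grows by a fixed amount per evaluation."""

    def __init__(self):
        self.skill = 0.0

    def __call__(self, strength):
        self.skill += 0.25
        return self.skill + strength  # more assistance gives a higher return


history = run_curriculum(ToyLearner(), start_strength=1.0, step=0.5, threshold=1.5)
```

The recorded `history` never increases and ends at zero, mirroring the plotted strength curve that ends at the origin.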

  38. 27 /45

  39. Deep Reinforcement Learning, Energy Minimization, Curriculum Learning, Gait Symmetry 28 /45

  40. Flip 29 /45

  41. Flip 29 /45

  42. Flip left hand height right hand height time 29 /45

  43. left hand height right hand height time 30 /45

  44. Asymmetry = ( ) left hand height right hand height time 30 /45

  45. Reward -= ( ) 31 /45

  46. Reward -= ( ) 31 /45

  47. 32 /45

  48. 32 /45

  49. 32 /45

  50. symmetric symmetric 33 /45

  51. Mirror: state, Mirror(state); $\text{action} = \text{Mirror}(\text{action}_{\text{mirror}})$ 34 /45

  52. minimize: $L_{RL}$ subject to: $\text{action} = \text{Mirror}(\text{action}_{\text{mirror}})$ *weights omitted for clarity 36 /45

  53. minimize: $L_{RL} + \|\text{action} - \text{Mirror}(\text{action}_{\text{mirror}})\|_2^2$ *weights omitted for clarity 36 /45
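The symmetry term on slides 51 to 53 compares the action taken in a state with the mirrored version of the action taken in the mirrored state, and penalizes their squared difference. The sketch below assumes a toy layout where both state and action are `[left, right]` pairs and mirroring just swaps the two sides; the two example policies are hypothetical, and a real character would need a joint-by-joint mirror map.

```python
def mirror_state(state):
    """Hypothetical mirror: the state is [left, right]; swap the two sides."""
    left, right = state
    return [right, left]


def mirror_action(action):
    """Hypothetical mirror: the action is [left, right]; swap the two sides."""
    left, right = action
    return [right, left]


def symmetry_loss(policy, states):
    """Mean of ||action - Mirror(action_mirror)||_2^2 over the sampled states."""
    total = 0.0
    for state in states:
        action = policy(state)
        action_mirror = policy(mirror_state(state))  # action in the mirrored state
        mirrored_back = mirror_action(action_mirror)
        total += sum((a - m) ** 2 for a, m in zip(action, mirrored_back))
    return total / len(states)


def symmetric_policy(state):
    """Treats both sides identically, so the loss is zero."""
    left, right = state
    return [2.0 * left, 2.0 * right]


def asymmetric_policy(state):
    """Uses a larger gain on the right side, so the loss is positive."""
    left, right = state
    return [2.0 * left, 3.0 * right]


states = [[1.0, 0.0], [0.5, -0.5]]
loss_symmetric = symmetry_loss(symmetric_policy, states)
loss_asymmetric = symmetry_loss(asymmetric_policy, states)
```

Adding this term to the reinforcement-learning loss turns the hard constraint of slide 52 into the soft penalty of slide 53, so it can be minimized with the same gradient-based update.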

  54. 37 /45

  55. Deep Reinforcement Learning, Energy Minimization, Curriculum Learning, Gait Symmetry 38 /45

  56. Running Stylized 39 /45

  57. Walking Running Walking back 40 /45

  58. Trotting Galloping 41 /45

  59. Slow Fast 42 /45

  60. Ball Walking 43 /45

  61. What’s Next? - Biomechanics-based models - Extend to more agile motions such as gymnastics - Running on real hardware 44 /45

  62. Thank you! - paper link: https://arxiv.org/abs/1801.08093 - code link: https://github.com/VincentYu68/SymmetryCurriculumLocomotion - my website: wenhaoyu.weebly.com - email: wenhaoyu@gatech.edu 45 /45
