
Representing Movement Primitives as Implicit Dynamical Systems learned from Multiple Demonstrations


  1. Representing Movement Primitives as Implicit Dynamical Systems learned from Multiple Demonstrations. Robert Krug and Dimitar Dimitrov, Center for Applied Autonomous Sensor Systems (AASS), Örebro University, Sweden (robert.krug@oru.se). Robert Krug, ICAR 2013.

  3. Dynamical Movement Primitives (DMP) [Ijspeert et al., 2002]: Feedback controllers in joint/task space, formulated as one dynamical system per DoF: $\dot{x}(t) = f(x(t), s(t))$. A common phase variable $s(t)$ synchronizes the DoF. “On-the-fly” motion profile generation: $x(t) = \int_0^t f(x(\tau), s(\tau))\,d\tau$.
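
The "on-the-fly" integration on this slide can be illustrated with a small forward-Euler rollout. This is only a sketch under stated assumptions: the phase is taken to decay as $\dot{s} = -\alpha_s s$ (the slides only say that a common phase variable exists), each DoF is reduced to a scalar state, and `make_goal_field`, `rollout`, and all gains are hypothetical names and values chosen for illustration.

```python
def make_goal_field(goal, gain=5.0, forcing=2.0):
    """Hypothetical per-DoF dynamics: a pull toward `goal` plus a
    phase-scaled forcing term that fades out as s decays to 0."""
    def f(x, s):
        return gain * (goal - x) + forcing * s
    return f


def rollout(fs, x0s, duration, dt=0.001, alpha_s=4.0):
    """Forward-Euler integration of x_i' = f_i(x_i, s) for every DoF i.

    Assumption: the shared phase follows s' = -alpha_s * s with s(0) = 1;
    the slides only state that a common phase variable exists.
    """
    xs = list(x0s)
    s = 1.0
    history = [tuple(xs)]
    for _ in range(int(round(duration / dt))):
        # Every DoF reads the same phase value, which keeps them in sync.
        xs = [x + dt * f(x, s) for f, x in zip(fs, xs)]
        s += dt * (-alpha_s * s)
        history.append(tuple(xs))
    return history, s


# Two DoF with different goals, driven by one shared phase variable.
fields = [make_goal_field(1.0), make_goal_field(-0.5)]
history, s_end = rollout(fields, x0s=[0.0, 0.0], duration=2.0)
final_state = history[-1]
```

Because the state is integrated step by step, a disturbance can be modeled by overwriting `xs` mid-rollout; the remaining steps then continue from the perturbed state without any re-planning.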

  4. Outline: 1. Motivation; 2. Concept; 3. Results; 4. Contributions & Outlook.

  7. Motivation: Why use primitive motion controllers? They generate desired motions for a platform with many DoF, such as the Shadow Hand & Arm with 24 DoF. Controllers $\dot{x} = f(x, s)$ are state policies: they replace explicit planning, compensate for disturbances, and time-synchronize arbitrarily many DoF. Motions resemble the demonstrations, and the implementation is simple.

  11. Motivation: What’s the problem? DMP [Ijspeert et al., 2002]: a stable spring excited by a learned control input $u$: $\dot{x}(t) = f(x, s) = \underbrace{A x(t)}_{\text{spring}} + \underbrace{B\,u(s; p)}_{\text{learned } p}$, with $x = [q, \dot{q}]^T \in \mathbb{R}^2$. Problem: one-shot learning → undesirable behavior in regions not covered by the demonstration. Solution: capture different dynamics from multiple demonstrations [Ude et al., 2010][Forte et al., 2012]. Presented approach → locally optimal combination: $\dot{x}(t) = A x(t) + B \sum_{d=1}^{D} \lambda_d(t)\, u_d(s; p_d)$.
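
To make the combined system $\dot{x} = Ax + B\sum_d \lambda_d u_d$ concrete, here is a minimal sketch for a single DoF. The slides give no numerical values, so everything specific is an assumption: $A = [[0, 1], [-k, -c]]$ and $B = [0, 1]^T$ describe a critically damped spring ($c = 2\sqrt{k}$), the two inputs are held constant rather than depending on the phase $s$, and the weights are fixed by hand instead of being optimized.

```python
def spring_step(x, lambdas, inputs, dt=0.001, k=25.0, c=10.0):
    """One forward-Euler step of x' = A x + B * sum_d lambda_d * u_d.

    x = (q, q_dot). The spring constants k and c are assumed values
    (critically damped, c = 2 * sqrt(k)); they are not from the slides.
    """
    q, q_dot = x
    # Weighted combination of the per-demonstration control inputs.
    u = sum(lam * u_d for lam, u_d in zip(lambdas, inputs))
    # With B = [0, 1]^T the input only enters the acceleration.
    q_ddot = -k * q - c * q_dot + u
    return (q + dt * q_dot, q_dot + dt * q_ddot)


# Two hypothetical constant inputs whose individual rest positions
# would be q = 1.0 and q = 2.0 (u / k), blended with equal weights.
x = (0.0, 0.0)
inputs = (25.0, 50.0)
lambdas = (0.5, 0.5)
for _ in range(3000):  # 3 s of simulated time at dt = 1 ms
    x = spring_step(x, lambdas, inputs)
```

With equal weights the state settles between the two individual rest positions, which shows how the blended input moves the equilibrium while the spring term alone sets the convergence dynamics.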

  15. Concept: Re-compute the dynamical system online. Optimize the combination of pre-learned control inputs at each time step $k$, $\dot{x}[k] = A x[k] + B \sum_{d=1}^{D} \lambda_d[k]\, u_d[k]$, by minimizing a distance criterion between current and demonstrated states. States evolve “in between” demonstrations, or get “pulled” onto them with dynamics governed by $A$. This encodes different dynamics and is a first step towards Model Predictive Control with state constraints.
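
The slides do not spell out the distance criterion or any constraints on the weights, so the following is only one plausible instance, not the authors' formulation. It assumes $D = 2$ demonstrations, a squared Euclidean distance between the current state and a convex combination of the demonstrated states, and weights that are non-negative and sum to one. Under those assumptions the minimizer has a closed form: project the state onto the line segment between the two demonstrated states. The function name `blend_weights` is hypothetical.

```python
def blend_weights(x, x_demo_a, x_demo_b):
    """Return (lambda_a, lambda_b) minimizing
    || x - (lambda_a * x_demo_a + lambda_b * x_demo_b) ||^2
    subject to lambda_a + lambda_b = 1 and both weights >= 0.

    Assumed criterion and constraints; the slides only mention
    "a distance criterion between current and demonstrated states".
    """
    diff = [a - b for a, b in zip(x_demo_a, x_demo_b)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        return 0.5, 0.5  # demonstrations coincide; any split is optimal
    # Unconstrained minimizer along the line through both demonstrations.
    t = sum((xi - bi) * d for xi, bi, d in zip(x, x_demo_b, diff)) / denom
    # Clamping to [0, 1] enforces non-negative weights.
    lam_a = min(1.0, max(0.0, t))
    return lam_a, 1.0 - lam_a


# States are (q, q_dot) pairs sampled at the same time step k.
between = blend_weights((0.25, 1.0), (1.0, 0.0), (0.0, 0.0))
outside = blend_weights((2.0, 0.3), (1.0, 0.0), (0.0, 0.0))
```

A state lying between the demonstrations receives mixed weights, while a state beyond one of them puts all weight on the nearest demonstration, so the spring term governed by $A$ pulls it back. For more than two demonstrations the same criterion becomes a small constrained least-squares problem that would need a numerical solver.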

  16. Concept: How does it work?

  18. Results: Generalization in simulation.

  19. Results: Disturbance rejection in simulation.

  20. Results: Evaluation on the Shadow Robot platform. Grasp motions were recorded with a sensorized glove and used to learn primitive controllers for the Shadow Hand.

  21. Results: Evaluation on the Shadow Robot platform (continued).

  24. Contributions & Outlook: To sum up. Contributions: learn motion controllers from multiple demonstrations and form a (locally) optimal combination to generate movements; this allows encoding fundamentally different dynamics and gives predictable behavior without explicit, costly motion planning. Future work: optimize over a time window → Model Predictive Control; incorporate spatial & temporal state space constraints (obstacle avoidance ...); reactive on-line planning & control scheme [Anderson et al., 2012].

  25. Contributions & Outlook: That’s it ...

  26. References
      Anderson, S., Karumanchi, S., and Iagnemma, K. (2012). Constraint-based planning and control for safe, semi-autonomous operation of vehicles. In IEEE Intelligent Vehicles Symposium, pages 383–388.
      Forte, D., Gams, A., Morimoto, J., and Ude, A. (2012). On-line motion synthesis and adaptation using a trajectory database. Robotics and Autonomous Systems, 60(10):1327–1339.
      Ijspeert, A., Nakanishi, J., and Schaal, S. (2002). Movement imitation with nonlinear dynamical systems in humanoid robots. In Proc. of the IEEE Int. Conf. on Robotics and Automation, volume 2, pages 1398–1403.
      Ude, A., Gams, A., Asfour, T., and Morimoto, J. (2010). Task-specific generalization of discrete and periodic dynamic movement primitives. IEEE Transactions on Robotics, 26(5):800–815.
