LCCC - Learning and Adaptation for Sensorimotor Control, October 24-26, Lund University, Sweden Measuring Motion Complexity and Its Applications to Learning of Motion Skills Hanyang University, Seoul, Korea October 24, 2018 Il Hong Suh
Contents
1. What motion will be more complex?
2. What motion skill will be better learned first?
3. What and where to attend to learn from demonstrations?
1. What motion will be more complex?
Which motion do you think is complex?
Example motions: line, circle, rectangle, alphabets, random stroke.
[Figure: the motions arranged on an axis from order to disorder, with low complexity marked at both ends and high complexity in between.]
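Neural Complexity Measure (1/2)
Neural Complexity (G. Tononi, Science 1998):
$$C_N(X) = \sum_{k=1}^{n} \left[ \frac{k}{n}\, I(X) - \left\langle I\!\left(X_j^k\right) \right\rangle \right]$$
where $I(X)$ is the integration of the whole system of $n$ elements and $\langle I(X_j^k) \rangle$ is the average integration over all subsets $X_j^k$ of size $k$.
[Figure: Crystal (regular), Ideal gas (random), Liquid (random + regular).]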
Neural Complexity Measure (2/2)
[Figure: three paintings compared with three states of matter.]
• Mondrian: low randomness, simple (crystal)
• Pollock: high randomness, simple (ideal gas)
• Bosch: high randomness, complex (liquid)
[Objective] Calculating motion complexity
[Problem] Neural complexity requires an ensemble average over all possible subsystems, which is computationally intractable for time-varying motion trajectories.
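To illustrate why the ensemble average is costly, here is a minimal sketch that evaluates the neural complexity measure for a small Gaussian system. It assumes the system is fully described by a covariance matrix, so that every integration term has a closed form; the function names are illustrative and do not come from the talk. The loop visits every subset of every size, that is 2^n - 1 subsets in total, which is the part that stops scaling as n grows.

```python
from itertools import combinations

import numpy as np


def integration(cov):
    """I(X) = sum_i H(x_i) - H(X) for a Gaussian with covariance `cov`."""
    cov = np.atleast_2d(cov)
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (np.sum(np.log(np.diag(cov))) - logdet)


def neural_complexity(cov):
    """C_N(X) = sum_k [(k/n) I(X) - <I(X_j^k)>], averaged over all size-k subsets."""
    cov = np.asarray(cov, dtype=float)
    n = cov.shape[0]
    total = integration(cov)
    complexity = 0.0
    for k in range(1, n + 1):
        # Exhaustive enumeration: this is the exponential-cost step.
        subset_values = [
            integration(cov[np.ix_(subset, subset)])
            for subset in combinations(range(n), k)
        ]
        complexity += (k / n) * total - np.mean(subset_values)
    return complexity


# Independent variables carry no integration, so the measure is zero.
independent = neural_complexity(np.eye(4))
# Two variables with correlation 0.5 (assumed toy values).
coupled = neural_complexity(np.array([[1.0, 0.5], [0.5, 1.0]]))
```

For a motion trajectory with many frames and dimensions, the number of subsets makes this direct evaluation impractical, which motivates a cheaper entropy-based measure.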
Motion Complexity and Motion Significance
Example: 'Pouring' task
• Quick pouring of water into a bowl, which has a large-size mouth
• Normal pouring of water into a cup, which has a medium-size mouth
• Slow pouring of water into a bottle, which has a small-size mouth
Each variant is rated high, medium, or low in spatial entropy and in temporal entropy.
- Definitions -
Motion significance indicates the relative significance of each motion frame for accomplishing the goal of a task, at every time index of the human demonstrations.
Motion complexity indicates how complex a whole set of human demonstrations is to learn.
- How to measure -
Motion significance is measured by considering both the spatial entropy and the temporal entropy of a motion frame, based on the analysis of Gaussian mixtures.
Motion complexity is defined as the average amount of motion significance over an entire set of human demonstrations.
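ST-GMM based Motion Complexity/Motion Significance
[Figure: three motion trajectories are encoded by a Gaussian mixture model, from which temporal entropy, spatial entropy, motion significance, and motion complexity are computed.]
Gaussian Mixture Model:
$$P(\Psi) = \sum_{i=1}^{K} \omega_i \cdot \mathcal{N}(\Psi \mid \mu_i, \Sigma_i), \quad \text{where } \mu_i = \begin{bmatrix} \mu_i^{\tau} & \mu_i^{X} \end{bmatrix},\ \Sigma_i = \begin{bmatrix} \Sigma_i^{\tau} & \Sigma_i^{\tau X} \\ \Sigma_i^{X\tau} & \Sigma_i^{X} \end{bmatrix}$$
with $\Sigma_i^{\tau}$ used for temporal entropy and $\Sigma_i^{X}$ for spatial entropy.
Temporal Entropy, for each of the $K$ Gaussians:
$$H_i^{\tau} = \omega_i \cdot \left( -\log \omega_i + \tfrac{1}{2} \log\!\left( 2\pi e\, \Sigma_i^{\tau} \right) \right)$$
Spatial Entropy, for each of the $K$ Gaussians:
$$H_i^{X} = \omega_i \cdot \left( -\log \omega_i + \tfrac{1}{2} \log\!\left( (2\pi e)^{D} \left| \Sigma_i^{X} \right| \right) \right)$$
Motion Significance: $S(t)$ is computed from $\mathrm{zscore}(H^{\tau}(t))$ and $\mathrm{zscore}(H^{X}(t))$.
* $H^{\tau}(t)$, $H^{X}(t)$: interpolated temporal and spatial entropies of all GMMs
Motion Complexity:
$$C = \frac{1}{T} \sum_{t=1}^{T} S(t)$$
[Figure labels: Significance/Complexity, Regularity, Spatial Entropy, Temporal Entropy; Crystal, Liquid, Ideal gas.]
Reference Paper: Il Hong Suh, Sang Hyong Lee, Nam Jun Cho, Woo Young Kwon, Measuring Motion Significance and Motion Complexity, Information Sciences, Vol. 388-389, May 2017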
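The entropy and complexity steps can be sketched as follows. This is a hedged illustration, not the authors' implementation: it assumes each Gaussian's entropy term is its weight times (-log weight + Gaussian differential entropy), that time is the first dimension of each joint covariance, and that the z-score uses the population standard deviation. How the two z-scored entropies are combined into the significance S(t) is not recoverable from the slide, so `motion_complexity` takes S(t) as given.

```python
import numpy as np


def component_entropies(weights, covariances):
    """Return (temporal, spatial) entropy terms for every Gaussian component.

    covariances[i] is the joint covariance of component i; time is assumed
    to be dimension 0 and the D spatial dimensions follow it.
    """
    weights = np.asarray(weights, dtype=float)
    covariances = np.asarray(covariances, dtype=float)
    d = covariances.shape[1] - 1                    # spatial dimension D
    var_t = covariances[:, 0, 0]                    # temporal variance
    _, logdet_x = np.linalg.slogdet(covariances[:, 1:, 1:])
    log_2pie = np.log(2.0 * np.pi * np.e)
    h_t = weights * (-np.log(weights) + 0.5 * (log_2pie + np.log(var_t)))
    h_x = weights * (-np.log(weights) + 0.5 * (d * log_2pie + logdet_x))
    return h_t, h_x


def zscore(values):
    """Standardize values to zero mean and unit (population) variance."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()


def motion_complexity(significance):
    """C = (1/T) * sum over t of S(t)."""
    return float(np.mean(significance))


# Toy mixture: two components with D = 2, the second one more spread out.
h_t, h_x = component_entropies([0.5, 0.5], [np.eye(3), 4.0 * np.eye(3)])
# Hypothetical significance values for three frames.
c = motion_complexity([0.2, 0.4, 0.6])
```

A broader Gaussian yields a larger entropy term in both time and space, so frames covered by tight components stand out once the entropies are standardized.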
Which motion do you think is complex?
[Bar chart: Motion Complexity (axis from 0 to 1.4) for line, circle, rectangle, alphabets, and random stroke.]
[Figure: the motions arranged on an axis from order to disorder, with low complexity marked at both ends and high complexity in between.]
What motion will be more complex and significant?
[Figure: Motion Complexity and Motion Significance panels.]
I. H. Suh et al., “Measuring motion significance and motion complexity,” Information Sciences, 388, 84-98, 2017.
2. What motion skill will be better learned first?
What motion skill will be better learned first in the fitting task?
Shapes: triangle, rectangle, irregular concave hexagon
Objective: When a human demonstrates how to fit one shape, the robot has to learn to fit the other two shapes by using the demonstrated motion together with RL.
Q1) Which fitting motion skill is more complex among triangle-, rectangle-, and hexagon-shaped fitting?
Q2) For effective learning and effective learning transfer, should the complex skill be learned first, or the simpler one?
Overview of Learning Process
• Extracting a set of human demonstrations: reaction force/torque through the F/T sensor, force signals for control, and position/rotation of the end-effector
• Clustering reaction force/torque (calculating motion complexity), which yields subsets of data grouped by the clustering
• Modeling HMMs (1) for recognition and modeling DMPs (2) for control
• Performing PoWER (3): the policy parameters of the DMPs are turned into improved policy parameters of the DMPs, with reaction force/torque obtained from the improved policy
(1) HMM (Hidden Markov Model): to model reaction force/torque according to the directions of inserting pegs
(2) DMP (Dynamic Movement Primitive): to model control signals
(3) PoWER (Policy Learning by Weighting Exploration with the Returns): to improve policy parameters through RL
DMP and PoWER for RL
• Representation of motor skills: Dynamic Movement Primitives
• Extension of Policy Learning by Weighting Exploration with the Returns (PoWER) to optimize and transfer motor skills
• Reward function for RL
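As a reminder of what a DMP policy looks like, the following is a minimal one-dimensional sketch in the common Ijspeert-style formulation: a critically damped spring toward the goal plus a learned forcing term that fades out with a phase variable. The gains, the basis-function placement, and the width heuristic are assumed defaults for illustration, not values from the talk.

```python
import numpy as np


def dmp_rollout(weights, y0, goal, duration=1.0, dt=0.001,
                alpha_z=25.0, beta_z=6.25, alpha_x=4.0):
    """Integrate a 1-D discrete DMP with Euler steps and return the positions."""
    weights = np.asarray(weights, dtype=float)
    n_basis = len(weights)
    # Basis centers spaced along the decaying phase x (assumed heuristic).
    centers = np.exp(-alpha_x * np.linspace(0.0, 1.0, n_basis))
    widths = 1.0 / np.gradient(centers) ** 2
    y, z, x = float(y0), 0.0, 1.0
    positions = []
    for _ in range(int(round(duration / dt))):
        psi = np.exp(-widths * (x - centers) ** 2)
        # Forcing term: normalized weighted basis, scaled by phase and amplitude.
        forcing = (psi @ weights) / (psi.sum() + 1e-10) * x * (goal - y0)
        z_dot = (alpha_z * (beta_z * (goal - y) - z) + forcing) / duration
        y_dot = z / duration
        x_dot = -alpha_x * x / duration
        z += z_dot * dt
        y += y_dot * dt
        x += x_dot * dt
        positions.append(y)
    return np.array(positions)


# With zero weights the DMP is a plain point attractor toward the goal.
plain = dmp_rollout(np.zeros(10), y0=0.0, goal=1.0)
# Nonzero weights (hypothetical values) reshape the path to the same goal.
shaped = dmp_rollout(np.full(10, 100.0), y0=0.0, goal=1.0)
```

In an RL setting such as PoWER, the `weights` vector plays the role of the policy parameters that are perturbed and improved.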
Clustering Reaction F/T Signals in Fitting Task
[Figure: clusters for Triangle, Rectangle, and Hexagon, obtained with only reaction F/T signals. x: initial robot end-effector position.]
Motion Complexity in Fitting Tasks
Shapes: triangle, rectangle, irregular concave hexagon
• Reaction force/torque clustering
• Calculating temporal and spatial entropies in every cluster
• Calculating motion complexity in every cluster
• Calculating motion complexity of a task by summing all motion complexities
[Bar chart: Motion Complexity (axis from 0 to 1.4) for Triangle, Rectangle, and Hexagon; bar values shown: 1.177, 0.877, 0.631.]
* Motion complexity calculated using reaction force/torque signals
Three Sequences of Task Transfer through RL (1/6)
Columns: Known, Unknown, Unknown
[Simple-to-Complex] triangle, rectangle, hexagon
[Complex-to-Simple] triangle, rectangle, hexagon
[Random] hexagon, rectangle, triangle
Three Sequences of Task Transfer through RL (2/6)
[Simple-to-Complex] Known, Unknown, Unknown: triangle, rectangle, hexagon
# of iterations: 190 (A), 178 (B); total 368
Three Sequences of Task Transfer through RL (3/6)
[Complex-to-Simple] Known, Unknown, Unknown: triangle, hexagon, rectangle
# of iterations: 136 (A), 101 (B); total 237
Three Sequences of Task Transfer through RL (4/6)
[Random] Known, Unknown, Unknown: hexagon, rectangle, triangle
# of iterations: 431 (A), 108 (B); total 539
Three Sequences of Task Transfer through RL (5/6)
Columns: Known, Unknown, Unknown
[Simple-to-Complex] triangle, rectangle, hexagon: 190 and 178 iterations, total 368
[Complex-to-Simple] triangle, rectangle, hexagon: 136 and 101 iterations, total 237
[Random] hexagon, rectangle, triangle: 431 and 108 iterations, total 539
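The totals can be checked with a few lines of arithmetic. The snippet below only re-adds the per-transfer iteration counts reported on the slides and picks the sequence with the fewest total iterations; the variable names are illustrative.

```python
# Iterations for the two transfers, (A) and (B), as reported on the slides.
iterations = {
    "Simple-to-Complex": (190, 178),
    "Complex-to-Simple": (136, 101),
    "Random": (431, 108),
}

totals = {name: a + b for name, (a, b) in iterations.items()}
fewest = min(totals, key=totals.get)

print(totals)  # {'Simple-to-Complex': 368, 'Complex-to-Simple': 237, 'Random': 539}
print(fewest)  # Complex-to-Simple
```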
Three Sequences of Task Transfer through RL (6/6)
[Bar chart: # of iterations (axis from 0 to 600) for Simple-to-Complex, Complex-to-Simple, and Random.]
• When a human can provide demonstrations: transfer task skills through the sequence of [Complex-to-Simple].
• When a human cannot provide demonstrations: transfer task skills through the sequence of [Simple-to-Complex].
RL Considering Task Execution Time in Fitting Task
Policy Learning by Weighting Exploration with the Returns
[Figure: stages labeled Imitation Learning, Clustering, Modeling, and RL.]
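For orientation, here is a simplified, episodic sketch in the spirit of PoWER: parameters are perturbed with Gaussian noise, and the update is the return-weighted average of the exploration offsets of the best rollouts seen so far. It is not the variant used in the talk. The toy reward, the noise level, and all names are assumptions, the returns must be non-negative, and no execution-time term is modeled.

```python
import numpy as np


def power_optimize(reward_fn, theta0, n_iterations=300, sigma=0.1,
                   n_best=10, seed=0):
    """Reward-weighted parameter updates over the best rollouts so far."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float).copy()
    history = []  # (return, sampled parameters) for every rollout
    for _ in range(n_iterations):
        sample = theta + sigma * rng.standard_normal(theta.shape)
        history.append((reward_fn(sample), sample))
        # Importance sampling: reuse only the highest-return rollouts.
        best = sorted(history, key=lambda item: item[0], reverse=True)[:n_best]
        returns = np.array([ret for ret, _ in best])
        exploration = np.array([params - theta for _, params in best])
        theta = theta + (returns[:, None] * exploration).sum(axis=0) / (
            returns.sum() + 1e-10
        )
    return theta


# Toy problem (assumed): the return peaks when theta reaches `target`.
target = np.array([1.0, -1.0])


def toy_reward(theta):
    return float(np.exp(-np.sum((theta - target) ** 2)))


start = np.zeros(2)
learned = power_optimize(toy_reward, start)
```

In the fitting task, the sampled parameters would correspond to DMP policy parameters and the return would come from executing the rollout on the robot.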
3. What and where to attend to learn from demonstrations?
Where to Attend? What to Attend?
This ape should be able to find and learn the attentive and significant intentions (joint relations) in the human demonstration.
How can these be found, and by what measure?
Two Paradigms of Existing PbD Approaches
Task-Sequence Learning
Approaches: Relational Learning, Task-sequence Planning, Concept Learning, Symbolic Planning, …
• Sequential behaviors
• Serial order in behavior
• High-level Learning
Motor Skill Learning
Approaches: Deep Visuomotor Policies, Task Parameterized Models, Dynamic Movement Primitives, PoWER, Motor Primitives, Guided Policy Search, …
• Trajectory Learning
• Motion Optimization/Generalization
• Low-level Learning
Task-sequence Learning: Learning Preconditions & Effects
[Figure: subtasks ordered along Time (t); each subtask consists of the three parts below.]
• Precondition (Activation condition)
• Behavior/Action (Motion Primitive)
• Post-condition (Effect)
Task-Sequence Learning/Planning
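The precondition, action, and effect structure of a subtask can be written down as a small data structure. The sketch below is an assumed illustration, not a method from the talk: states are dictionaries of boolean facts, and a greedy sequencer repeatedly runs the first unused subtask whose precondition holds. The pouring subtasks and fact names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, bool]


@dataclass
class Subtask:
    name: str                               # behavior/action (motion primitive)
    precondition: Callable[[State], bool]   # activation condition
    effect: Dict[str, bool]                 # post-condition written to the state


def plan_sequence(subtasks: List[Subtask], state: State,
                  goal: Callable[[State], bool]) -> List[str]:
    """Greedily chain subtasks whose preconditions hold until the goal is met."""
    state = dict(state)
    remaining = list(subtasks)
    sequence = []
    while not goal(state):
        runnable = next((s for s in remaining if s.precondition(state)), None)
        if runnable is None:
            raise ValueError("no subtask is applicable in the current state")
        state.update(runnable.effect)
        sequence.append(runnable.name)
        remaining.remove(runnable)
    return sequence


# Hypothetical pouring task, deliberately listed out of order.
subtasks = [
    Subtask("pour", lambda s: s["above_bowl"], {"poured": True}),
    Subtask("grasp", lambda s: not s["holding"], {"holding": True}),
    Subtask("move", lambda s: s["holding"] and not s["above_bowl"],
            {"above_bowl": True}),
]
initial = {"holding": False, "above_bowl": False, "poured": False}
sequence = plan_sequence(subtasks, initial, goal=lambda s: s["poured"])
print(sequence)  # ['grasp', 'move', 'pour']
```

Learning the task sequence then amounts to estimating the preconditions and effects from demonstrations rather than writing them by hand as done here.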