deep learning approach for pose estimation

Deep Learning Approach for Pose Estimation Talk #23444 MSc Kanter



  1. Deep Learning Approach for Pose Estimation Talk #23444 MSc Kanter van Deurzen

  2. Introduction
     • Delft Robotics
       – Founded 2014
       – Staff of 11
       – Specialized in vision-guided robotics through AI
       – Winner, Amazon Picking Challenge 2016
     • Kanter van Deurzen
       – CTO, Delft Robotics

  3. Application
     • Bin-picking
       – One type of object
       – Multiple objects
       – Restricted area
     • Order-picking
       – Large number of different objects
       – Multiple objects
       – Restricted area
     • Service robots
       – Large number of different objects
       – No area restriction
       – Any orientation possible (6DOF)!!

  4. Typical pipeline
     • Data acquisition
       – 2D
       – 3D
     • Segmentation
       – Deep learning
     • Rough pose estimation
       – RANSAC
       – 4PCS
       – Fast global registration
     • Pose optimization
       – ICP
     • Grasp pose estimation

  5. CAD pipeline
     1. Locate object in 2D/3D
     2. Use CAD model to find global optimum of object pose
     3. Use CAD model to refine pose locally
     4. Determine grasp pose using estimated pose
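The local refinement in step 3 is the kind of job the ICP entry on the pipeline slide does. The following is a minimal sketch, not the speaker's implementation: it is reduced to 2D so the best rotation has a closed form, it matches points by brute-force nearest neighbour, and every function name is made up for illustration. A real 6DOF pipeline would work on 3D point clouds sampled from the CAD model.

```python
import math

def nearest(p, points):
    """Return the point in `points` closest to p (brute force)."""
    return min(points, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

def fit_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired src onto dst."""
    n = len(src)
    csx, csy = sum(p[0] for p in src) / n, sum(p[1] for p in src) / n
    cdx, cdy = sum(p[0] for p in dst) / n, sum(p[1] for p in dst) / n
    s_dot = s_cross = 0.0
    for (sx, sy), (dx, dy) in zip(src, dst):
        ax, ay = sx - csx, sy - csy          # centred source point
        bx, by = dx - cdx, dy - cdy          # centred target point
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    theta = math.atan2(s_cross, s_dot)       # optimal rotation in 2D
    c, s = math.cos(theta), math.sin(theta)
    tx = cdx - (c * csx - s * csy)
    ty = cdy - (s * csx + c * csy)
    return theta, tx, ty

def transform(points, theta, tx, ty):
    """Rotate points by theta about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def icp_2d(model, scene, iterations=10):
    """Iteratively align model points to scene points; returns the moved model."""
    current = list(model)
    for _ in range(iterations):
        matches = [nearest(p, scene) for p in current]   # guess correspondences
        theta, tx, ty = fit_rigid_2d(current, matches)   # best fit for that guess
        current = transform(current, theta, tx, ty)
    return current

def rms_error(a, b):
    """Root-mean-square distance between two equally ordered point lists."""
    total = sum((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 for p, q in zip(a, b))
    return math.sqrt(total / len(a))
```

Because correspondences are re-guessed from the current pose in every iteration, a poor starting pose can lock in wrong matches and settle in a local minimum, which is why the pipeline runs a global (rough) pose estimate first.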

  6. CAD pipeline
     • Advantages:
       – Complete 6DOF pose is known
       – Pose in gripper approximately known
       – Scales to other objects
     • Disadvantages:
       – Symmetry causes ambiguity, risk of local minima
       – Requires CAD model

  7. Non-rigid/CAD-less pipeline
     1. Locate object in 2D/3D
     2. Fit shape of gripper anywhere on object

  8. Non-rigid/CAD-less pipeline
     • Advantages:
       – No CAD model necessary
       – No knowledge of object necessary
     • Disadvantages:
       – Pose in gripper unknown
       – No knowledge of object used (fragility/weight/...)

  9. Why deep learning?
     • Rough pose estimation is slow
     • Rough pose estimation risks local minima

  10. Attempted DL approaches
     • Classification
     • Regression
     • Key-points

  11. Classification
     • Concept
       – Classify pose as one of X classes
     (Figure labels: Class 1, Class 2, Class 3, Class ...)

  12. Classification results
     • Conditions:
       – Rendered images
       – Rotations about 3 axes
       – 10 classes per axis (36 degrees per class)
       – With only 1 axis, works great (Z-axis, 99% accuracy)
     • Conclusions:
       – Scales badly to multiple axes (10 classes per axis, 10*10*10 = 1000 total classes)
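The class count on this slide follows from binning each rotation axis separately and combining the three bin indices into one label. A minimal sketch of that bookkeeping is below. The function names and the bin-centre decoding are illustrative assumptions, not taken from the talk.

```python
def angle_to_bin(angle_deg, bins_per_axis=10):
    """Map an angle in degrees to a bin index; 10 bins gives 36 degrees per bin."""
    width = 360.0 / bins_per_axis
    return int((angle_deg % 360.0) // width)

def pose_to_class(rx, ry, rz, bins_per_axis=10):
    """Combine the three per-axis bins into one class label (0 .. bins**3 - 1)."""
    bx, by, bz = (angle_to_bin(a, bins_per_axis) for a in (rx, ry, rz))
    return (bx * bins_per_axis + by) * bins_per_axis + bz

def class_to_pose(label, bins_per_axis=10):
    """Decode a class label back to the centre angles of its three bins."""
    width = 360.0 / bins_per_axis
    bz = label % bins_per_axis
    by = (label // bins_per_axis) % bins_per_axis
    bx = label // (bins_per_axis ** 2)
    return tuple((b + 0.5) * width for b in (bx, by, bz))

def total_classes(bins_per_axis=10, axes=3):
    """Number of output classes the network would need."""
    return bins_per_axis ** axes
```

Halving the bin width to 18 degrees would give `total_classes(20)`, that is 8000 classes, while the decoded pose can still be off by up to 9 degrees per axis. That cubic growth is the scaling problem the slide points out.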

  13. Regression
     • Treat as a regression problem
     • Regress 4 values (quaternion)
     • Use RGB(D) as input
     (Figure labels: Ground truth, Predicted pose)

  14. Regression results
     • Conclusions:
       – Works well on asymmetric objects
       – Symmetric objects cause ambiguity
       – Difficult to find correct loss function
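To illustrate why the loss function is hard to choose: a quaternion q and its negation -q describe the same rotation, and a symmetric object adds further poses that look identical. The sketch below shows one possible error measure under those assumptions. It is not the loss used in the talk. Quaternions are assumed to be in (w, x, y, z) order, and the `symmetries` list is a hypothetical, hand-supplied set of rotations that leave the object unchanged.

```python
import math

def normalize(q):
    """Scale a quaternion (w, x, y, z) to unit length."""
    n = math.sqrt(sum(c * c for c in q))
    return tuple(c / n for c in q)

def quat_mul(a, b):
    """Hamilton product of two quaternions in (w, x, y, z) order."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return (
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    )

def rotation_error(q_pred, q_true):
    """Angle in radians between two rotations; q and -q count as identical."""
    p, t = normalize(q_pred), normalize(q_true)
    dot = abs(sum(a * b for a, b in zip(p, t)))   # abs() removes the sign ambiguity
    return 2.0 * math.acos(min(1.0, dot))         # clamp guards against rounding

def symmetric_rotation_error(q_pred, q_true, symmetries):
    """Smallest error over all ground-truth poses that look the same."""
    return min(rotation_error(q_pred, quat_mul(q_true, s)) for s in symmetries)
```

For an object that looks the same after a half turn about its Z-axis, `symmetries` would be `[(1, 0, 0, 0), (0, 0, 0, 1)]`. A prediction that is off by exactly that half turn then scores zero error instead of pi. A plain squared difference on the four regressed values has neither property.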

  15. Key-points
     • Concept:
       – Recognize 3 or more keypoints
       – Determine pose based on these keypoints
       – Refine using (local) heuristics

  16. Key-points results
     • Conclusions:
       – Easily integrated end-to-end in network
       – Works only for well-defined keypoints, which is non-trivial
       – Results in good initial guesses
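Three non-collinear keypoints are enough to fix a rigid pose. The sketch below shows one simple way to get a pose from them, assuming the keypoints are already available as 3D coordinates in both the model frame and the camera frame. It builds an orthonormal frame from each triple and composes the two. This is an illustrative construction with made-up function names, not necessarily what the talk used. With noisy or more than three keypoints, a least-squares fit would be the usual choice.

```python
import math

def sub(a, b):
    return [a[i] - b[i] for i in range(3)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def unit(v):
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def frame(p0, p1, p2):
    """Right-handed orthonormal axes [x, y, z] built from three non-collinear points."""
    x = unit(sub(p1, p0))
    z = unit(cross(x, sub(p2, p0)))
    y = cross(z, x)
    return [x, y, z]

def pose_from_keypoints(model_pts, observed_pts):
    """Rotation R (3x3 nested list) and translation t with observed = R * model + t."""
    fm = frame(*model_pts[:3])
    fo = frame(*observed_pts[:3])
    # R maps each model axis onto the matching observed axis: R = sum_k fo[k] fm[k]^T
    R = [[sum(fo[k][i] * fm[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    rotated_p0 = [sum(R[i][j] * model_pts[0][j] for j in range(3)) for i in range(3)]
    t = sub(observed_pts[0], rotated_p0)
    return R, t
```

Since only the first keypoint and two directions are used, detection noise goes straight into the result. That is in line with the slide's point that keypoints give an initial guess that still has to be refined.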

  17. Future research
     • Investigate other DL approaches
     • Process point clouds directly with DL
       – Challenges: data, data, data, lack of research
     • Use other forms of DL (reinforcement learning? GANs?)

  18. Thank you for your attention
     Kanter van Deurzen
     k.vandeurzen@delftrobotics.com
     0651753705
