Deep Learning Approach for Pose Estimation Talk #23444 MSc Kanter van Deurzen
Introduction • Delft Robotics: – Founded 2014 – Staff of 11 – Specialized in vision-guided robotics through AI – Winner of the Amazon Picking Challenge 2016 • Kanter van Deurzen: – CTO, Delft Robotics
Application • Bin-picking: – One type of object – Multiple objects – Restricted area • Order-picking: – Large number of different objects – Multiple objects – Restricted area • Service robots: – Large number of different objects – No area restriction – Any orientation possible (6DOF)!
Typical pipeline • Data acquisition: – 2D – 3D • Segmentation: – Deep learning • Rough pose estimation: – RANSAC – 4PCS – Fast global registration • Pose optimization: – ICP • Grasp pose estimation
CAD pipeline 1. Locate object in 2D/3D 2. Use CAD model to find global optimum of object pose 3. Use CAD model to refine pose locally 4. Determine grasp pose using estimated pose
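Step 3 (local refinement) is commonly done with ICP, as listed in the typical pipeline. The sketch below is a minimal point-to-point ICP in NumPy, assuming the CAD model and the scene are both available as (N, 3) point arrays and that the rough pose has already been applied to the model. The function names `best_rigid_transform` and `icp` are illustrative, not taken from the talk, and the brute-force nearest-neighbour search is only suitable for small point sets.

```python
import numpy as np


def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t with dst ~ src @ R.T + t.

    SVD-based (Kabsch) solution for paired 3D points.
    """
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection (determinant -1) in the SVD solution.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


def icp(model, scene, iterations=20):
    """Point-to-point ICP aligning `model` (N, 3) to `scene` (M, 3).

    Starts from the identity, so the rough pose is assumed to be
    applied to `model` beforehand. Returns the accumulated (R, t).
    """
    R_total, t_total = np.eye(3), np.zeros(3)
    current = np.asarray(model, dtype=float).copy()
    scene = np.asarray(scene, dtype=float)
    for _ in range(iterations):
        # Brute-force nearest neighbour in the scene for every model point.
        d2 = ((current[:, None, :] - scene[None, :, :]) ** 2).sum(axis=2)
        matches = scene[d2.argmin(axis=1)]
        R, t = best_rigid_transform(current, matches)
        current = current @ R.T + t
        # Compose the incremental update with the transform so far.
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total
```

Because every iteration only matches each point to its current nearest neighbour, the result depends on the starting pose; a poor rough pose can leave ICP stuck in a local minimum, which is why the pipeline runs a global step first.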
CAD pipeline • Advantages: – Complete 6DOF pose is known – Pose in gripper approximately known – Scales to other objects • Disadvantages: – Symmetry causes ambiguity, risk of local minima – Requires CAD model
Non-rigid/CAD-less pipeline 1. Locate object in 2D/3D 2. Fit shape of gripper anywhere on object
Non-rigid/CAD-less pipeline • Advantages: – No CAD model necessary – No knowledge of object necessary • Disadvantages: – Pose in gripper unknown – No knowledge of object used (fragility/weight/...)
Why deep learning? • Rough pose estimation is slow • Rough pose estimation risks local minima
Attempted DL Approaches • Classification • Regression • Key-points
Classification • Concept – Classify pose as one of X classes Class 1 Class 2 Class 3 Class ...
Classification results • Conditions: – Rendered images – Rotations about 3 axes – 10 classes per axis (36 degrees per class) • Results: – Works well for a single axis (Z-axis, 99% accuracy) • Conclusions: – Scales badly to multiple axes (10 classes per axis gives 10*10*10 = 1000 classes in total)
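The class-count arithmetic can be made concrete with a small sketch. It assumes the pose is given as three rotation angles in degrees and that the per-axis bins are combined into one joint label; the joint encoding in `pose_class` is a hypothetical choice for illustration, not necessarily the one used in the experiments.

```python
CLASSES_PER_AXIS = 10                       # from the slide
BIN_WIDTH_DEG = 360.0 / CLASSES_PER_AXIS    # 36 degrees per class


def axis_class(angle_deg):
    """Map an angle in degrees to its bin index in [0, CLASSES_PER_AXIS)."""
    index = int((angle_deg % 360.0) // BIN_WIDTH_DEG)
    # Clamp to guard against floating-point wrap-around at exactly 360.
    return min(index, CLASSES_PER_AXIS - 1)


def pose_class(rx_deg, ry_deg, rz_deg):
    """Combine three per-axis bins into one joint label (hypothetical encoding)."""
    ix, iy, iz = axis_class(rx_deg), axis_class(ry_deg), axis_class(rz_deg)
    return (ix * CLASSES_PER_AXIS + iy) * CLASSES_PER_AXIS + iz


def total_classes(num_axes):
    """Number of joint classes when every axis is binned independently."""
    return CLASSES_PER_AXIS ** num_axes
```

With one axis the network chooses among `total_classes(1)` = 10 labels; with three axes it has to separate `total_classes(3)` = 1000 labels at the same 36-degree resolution, and a finer resolution grows the count cubically.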
Regression • Treat pose estimation as a regression problem • Regress 4 values (a quaternion) • Use RGB(D) as input • Figure: ground truth vs. predicted pose
Regression results • Conclusions: – Works well on asymmetric objects – Symmetric objects cause ambiguity – Difficult to find correct loss function
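One reason the loss is hard to choose is that a quaternion q and its negation -q describe the same rotation, so a plain squared error can penalise a perfect prediction. The sketch below contrasts a naive loss with a sign-invariant angular loss. It assumes quaternions are 4-tuples in any fixed component order, and it is not the loss used in the talk; it does not address object symmetry, which would additionally need the set of equivalent ground-truth poses.

```python
import math


def normalize(q):
    """Scale a quaternion (4 numbers) to unit length."""
    n = math.sqrt(sum(c * c for c in q))
    if n == 0.0:
        raise ValueError("cannot normalize a zero quaternion")
    return tuple(c / n for c in q)


def naive_l2_loss(q_pred, q_true):
    """Squared Euclidean distance; treats q and -q as different poses."""
    return sum((a - b) ** 2 for a, b in zip(q_pred, q_true))


def rotation_angle_loss(q_pred, q_true):
    """Angle in radians between the two orientations.

    Uses |<p, t>| so that q and -q give the same (zero) loss.
    """
    p, t = normalize(q_pred), normalize(q_true)
    dot = abs(sum(a * b for a, b in zip(p, t)))
    return 2.0 * math.acos(min(1.0, dot))
```

For `q = (1, 0, 0, 0)` and its negation, `naive_l2_loss` reports 4.0 while `rotation_angle_loss` reports 0.0, even though both quaternions encode the same orientation.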
Key-points • Concept: – Recognize 3 or more keypoints – Determine pose based on these keypoints – Refine using (local) heuristics
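To illustrate the second step, the sketch below recovers a 6DOF pose from exactly three keypoints by building an orthonormal frame on the model keypoints and on the detected keypoints and comparing the two. It assumes the keypoints are available as 3D coordinates (for example from a depth image) and are not collinear. The function names are illustrative, and this is one simple construction rather than the method used in the talk.

```python
import numpy as np


def frame_from_points(p0, p1, p2):
    """Right-handed orthonormal frame (columns x, y, z) from three points."""
    x = p1 - p0
    norm_x = np.linalg.norm(x)
    if norm_x < 1e-9:
        raise ValueError("first two keypoints coincide")
    x = x / norm_x
    z = np.cross(x, p2 - p0)
    norm_z = np.linalg.norm(z)
    if norm_z < 1e-9:
        raise ValueError("keypoints are collinear")
    z = z / norm_z
    y = np.cross(z, x)
    return np.column_stack((x, y, z))


def pose_from_three_keypoints(model_pts, observed_pts):
    """Return (R, t) such that observed ~ R @ model + t.

    Both inputs are (3, 3) arrays holding one 3D keypoint per row,
    in the same order.
    """
    model_pts = np.asarray(model_pts, dtype=float)
    observed_pts = np.asarray(observed_pts, dtype=float)
    frame_model = frame_from_points(*model_pts)
    frame_obs = frame_from_points(*observed_pts)
    R = frame_obs @ frame_model.T     # rotate model frame onto observed frame
    t = observed_pts[0] - R @ model_pts[0]
    return R, t
```

With noisy detections this construction trusts the first keypoint and the first edge more than the rest, so the result is best treated as an initial guess for the refinement step; with more than three keypoints a least-squares fit over all of them is usually preferable.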
Key-points results • Conclusions: – Easily integrated end-to-end in network – Works only for well-defined keypoints, and defining such keypoints is non-trivial – Results in good initial guesses
Future research • Investigate other DL approaches • Process point clouds directly with DL – challenges: data, data, data, lack of research • Use other forms of DL (reinforcement learning? GANs?)
Thank you for your attention Kanter van Deurzen k.vandeurzen@delftrobotics.com 0651753705