Robust Pose Optimization Made Differentiable
Eric Brachmann
5th International Workshop on Recovering 6D Object Pose @ICCV19
Background
Eric Brachmann (@eric_brachmann)
• 2012-2017: PhD (Dr.) at …
• since 2018: Post-Doc at …
• since 2019: Guest at …
Prof. Carsten Rother
Main Research Interests
• Machine learning and projective geometry
• Robust fitting with (differentiable) RANSAC
• Object poses
• Camera poses
• Lines
• Epipolar geometry
Publications: Object Coordinates (ECCV'14), DSAC (CVPR'17), DSAC++ (CVPR'18), NG-RANSAC (ICCV'19)
Goal
RGB(-D) image $I$ → Pose Estimation Pipeline → 6D poses $\hat{\mathbf{h}}_o$
Pipeline stages: Object Detection, Object Classification, Correspondence Prediction, Pose Solver, RANSAC, Pose Scoring, Pose Loss
• "Learning 6D object pose estimation using 3D object coordinates", Brachmann et al., ECCV'14
• "iPose: instance-aware 6D pose estimation of partly occluded objects", Jafari et al., ACCV'18
• "Segmentation-driven 6D Object Pose Estimation", Hu et al., CVPR'19
• "Pix2Pose: Pixel-Wise Coordinate Regression of Objects for 6D Pose Estimation", Park et al., ICCV'19
• "DPOD: 6D Pose Object Detector and Refiner", Zakharov et al., ICCV'19
• …
Why End-to-End?
RGB(-D) image $I$ → Pose Estimation Pipeline → 6D camera pose $\hat{\mathbf{h}}$
Why End-to-End?
[Figure, panels "Indoor" and "Outdoor": Re-Localization Rate at the 5cm, 5° and 2cm, 2° thresholds, and Median Translation Error (cm), each comparing Initialization with End-to-End. Bar labels: 88.1%, 86.5%, 61.7%, 50.9%; 31, 19.]
[Figure: Comparing reprojection error before and after end-to-end training. Colour scale: -10px (Improvement), ±0px, +10px (Degradation).]
[ESAC] "Expert Sample Consensus Applied to Camera Re-Localization", Brachmann and Rother, ICCV'19
[NGRANSAC] "Neural-Guided RANSAC: Learning Where to Sample Model Hypotheses", Brachmann and Rother, ICCV'19
Roadmap
Pipeline stages: Object Detection, Object Classification, Correspondence Prediction, Pose Solver, RANSAC, Pose Scoring, Pose Loss
Pose Loss (RGB-D)
Input: RGB-D
Loss: $\ell(\mathbf{t}, \mathbf{t}^*) + \alpha\,\ell(R, R^*)$ with $\mathbf{h} = (\mathbf{t}, R)$
• Translation term: $\lVert \mathbf{t} - \mathbf{t}^* \rVert$
• Rotation term: angle $\theta$ between $R^*$ and $R$, from $\log(R^* R^{\top})$ with $\log R : \mathbb{R}^{3\times3} \to \mathbb{R}^{3}$
• In OpenCV: cv2.Rodrigues(), incl. gradients
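A minimal NumPy sketch of such a loss, combining the translation distance with a weighted rotation angle. It relies on the fact that the norm of the rotation logarithm equals the rotation angle, and reads that angle off the trace of the relative rotation instead of calling cv2.Rodrigues(). The names `rotation_angle`, `pose_loss_rgbd`, `rot_weight` and `rot_z`, and the choice of radians, are assumptions of this illustration rather than part of the talk.

```python
import numpy as np

def rotation_angle(R_est, R_gt):
    """Angle in radians of the relative rotation R_gt @ R_est.T."""
    R_rel = R_gt @ R_est.T
    # For a rotation matrix, trace = 1 + 2*cos(angle); clip guards against round-off.
    cos_angle = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    return np.arccos(cos_angle)

def pose_loss_rgbd(R_est, t_est, R_gt, t_gt, rot_weight=1.0):
    """Translation distance plus rot_weight times the rotation angle (assumed form)."""
    return np.linalg.norm(t_est - t_gt) + rot_weight * rotation_angle(R_est, R_gt)

def rot_z(angle):
    """Rotation about the z axis, used only to build example poses."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Example: 0.5 units of translation error and 0.2 rad of rotation error.
loss = pose_loss_rgbd(rot_z(0.1), np.array([0.0, 0.0, 1.0]),
                      rot_z(0.3), np.array([0.0, 0.0, 1.5]), rot_weight=2.0)
```

With these example values the loss is 0.5 + 2.0 · 0.2 = 0.9, up to floating-point round-off.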
Pose Loss (RGB)
Input: RGB
$\ell_\pi(\mathbf{h}, \mathbf{h}^*) = \frac{1}{|\mathcal{V}|} \sum_{\mathbf{v} \in \mathcal{V}} \lVert C\mathbf{h}^*\mathbf{v} - C\mathbf{h}\mathbf{v} \rVert$ [Bra16]
$\mathcal{V}$ ... Model vertices
$C$ ... Camera calibration matrix
[Figure: Z-Err: 5cm, 10cm, 20cm]
[Bra16] Brachmann et al., "Uncertainty-driven 6D pose estimation of objects and scenes from a single RGB image", CVPR 2016
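The reprojection loss can be sketched in a few lines of NumPy. The sketch assumes that a pose is a 4x4 matrix mapping model coordinates to camera coordinates and that the calibration matrix is 3x3. Both choices, like the function names `project` and `reprojection_loss` and the example values, are illustrative assumptions.

```python
import numpy as np

def project(C, h, vertices):
    """Project Nx3 model vertices with a 4x4 pose h and a 3x3 calibration matrix C."""
    homog = np.hstack([vertices, np.ones((len(vertices), 1))])
    cam = (h @ homog.T)[:3]            # 3xN points in camera coordinates
    pix = C @ cam
    return (pix[:2] / pix[2]).T        # Nx2 pixel positions

def reprojection_loss(C, h_est, h_gt, vertices):
    """Mean pixel distance between vertices projected with both poses."""
    diff = project(C, h_gt, vertices) - project(C, h_est, vertices)
    return np.linalg.norm(diff, axis=1).mean()

# Example with made-up values: focal length 500 px, object 2 m in front of the camera.
C = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
vertices = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 0.1, 0.0]])
h_gt = np.eye(4)
h_gt[2, 3] = 2.0

h_shift_x = h_gt.copy()
h_shift_x[0, 3] += 0.01                # 1 cm sideways error
h_shift_z = h_gt.copy()
h_shift_z[2, 3] += 0.05                # 5 cm depth error

loss_x = reprojection_loss(C, h_shift_x, h_gt, vertices)
loss_z = reprojection_loss(C, h_shift_z, h_gt, vertices)
```

In this example the 1 cm sideways shift moves every vertex by about 2.5 px (500 · 0.01 / 2), while the five times larger depth shift produces a smaller loss. An image-space loss is therefore comparatively tolerant of depth errors, which an RGB image constrains only weakly.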
Pose Solver (RGB-D)
Input: RGB-D
$\hat{R}, \hat{\mathbf{t}} = \operatorname{argmin}_{R, \mathbf{t} \,|\, R R^{\top} = 1} \sum_i \lVert \mathbf{x}_i - (R\mathbf{y}_i - \mathbf{t}) \rVert^2$
Kabsch Algorithm [Kab76]:
• $\operatorname{cov}(\mathbf{x}_i, \mathbf{y}_i) = \sum_i (\mathbf{x}_i - \bar{\mathbf{x}})(\mathbf{y}_i - \bar{\mathbf{y}})^{\top}$
• $\operatorname{cov}(\mathbf{x}_i, \mathbf{y}_i) = U \Sigma V^{\top}$
• $R = V \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \det(V U^{\top}) \end{pmatrix} U^{\top}$
• $\mathbf{t} = R\bar{\mathbf{y}} - \bar{\mathbf{x}}$
C++ code with PyTorch integration coming soon.
[Kab76] Kabsch, "A solution for the best rotation to relate two sets of vectors", Acta Crystallographica, 1976
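A minimal NumPy sketch of a Kabsch-style solver. Conventions differ on which point set is rotated and how the covariance is arranged, and that changes the order of the SVD factors. This sketch fixes x ≈ R y - t with the covariance summed over (x - mean of x)(y - mean of y)^T, for which the minimiser is R = U D V^T, and checks the result on synthetic data. The function name `kabsch` and all test values are assumptions of this illustration.

```python
import numpy as np

def kabsch(x, y):
    """Find R, t minimising sum_i ||x_i - (R y_i - t)||^2 for Nx3 arrays x and y."""
    x_mean, y_mean = x.mean(axis=0), y.mean(axis=0)
    cov = (x - x_mean).T @ (y - y_mean)     # sum_i (x_i - x_mean)(y_i - y_mean)^T
    U, _, Vt = np.linalg.svd(cov)
    d = np.linalg.det(U @ Vt)               # +1 or -1; -1 would indicate a reflection
    R = U @ np.diag([1.0, 1.0, d]) @ Vt
    t = R @ y_mean - x_mean
    return R, t

# Synthetic check: build x from a known rotation and translation, then recover them.
rng = np.random.default_rng(0)
y = rng.normal(size=(10, 3))
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
R_true = Q * np.sign(np.linalg.det(Q))      # force det = +1 (proper rotation)
t_true = np.array([0.5, -1.0, 2.0])
x = y @ R_true.T - t_true                   # x_i = R y_i - t

R_est, t_est = kabsch(x, y)
```

Every step is a differentiable matrix operation as long as the singular values are distinct, which is why such a solver can sit inside an end-to-end trained pipeline.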
Pose Solver (RGB)
Input: RGB
$\hat{R}, \hat{\mathbf{t}} = \operatorname{argmin}_{R, \mathbf{t}} \sum_i \lVert \mathbf{p}_i - C(R\mathbf{y}_i - \mathbf{t}) \rVert^2$
Solving Perspective-n-Point: Initialization → Gauss-Newton
[Lep09] Lepetit et al., "EPnP: An Accurate O(n) Solution to the PnP Problem", IJCV'09
[Gao03] Gao et al., "Complete Solution Classification for the Perspective-Three-Point Problem", TPAMI'03
Pose Solver (RGB)
Initialization ($\mathbf{h}^0$) → Gauss-Newton ($\mathbf{h}^1$)
Residual vector: $(\mathbf{r}(\mathbf{h}))_i = \lVert \mathbf{p}_i - C\mathbf{h}\mathbf{y}_i \rVert_2$
Update rule: $\mathbf{h}^{t+1} = \mathbf{h}^t - (J_{\mathbf{r}}^{\top} J_{\mathbf{r}})^{-1} J_{\mathbf{r}}^{\top}\, \mathbf{r}(\mathbf{h}^t)$
Jacobian: $[J_{\mathbf{r}}]_{ij} = \frac{\partial (\mathbf{r}(\mathbf{h}^t))_i}{\partial (\mathbf{h}^t)_j}$
Last update: $\hat{\mathbf{h}} = \mathbf{h}^{\infty} - (J_{\mathbf{r}}^{\top} J_{\mathbf{r}})^{-1} J_{\mathbf{r}}^{\top}\, \mathbf{r}(\mathbf{h}^{\infty})$
Gradients: $\frac{\partial}{\partial \mathbf{y}_i} \hat{\mathbf{h}} \approx -(J_{\mathbf{r}}^{\top} J_{\mathbf{r}})^{-1} J_{\mathbf{r}}^{\top} \frac{\partial}{\partial \mathbf{y}_i} \mathbf{r}(\mathbf{h}^{\infty})$
C++ code of [Bra18] online. Version with PyTorch integration coming soon.
[För16] Förstner and Wrobel, "Photogrammetric Computer Vision - Statistics, Geometry, Orientation and Reconstruction", Springer'16
[Bra18] Brachmann and Rother, "Learning less is more - 6D camera localization via 3D surface regression", CVPR'18
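The Gauss-Newton refinement and the gradient approximation from the last update can be sketched as follows. This is an illustration, not the released C++ code. The sketch makes these assumptions:

• The pose is a 6-vector (axis-angle rotation, translation).
• Camera points are computed as R y - t.
• The residual stacks the 2D pixel differences rather than their norms, which keeps it smooth at zero.
• All Jacobians come from central differences.
• The function names and test values are illustrative.

```python
import numpy as np

def rodrigues(w):
    """Rotation matrix from an axis-angle vector w of shape (3,)."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)
    k = w / theta
    K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * K @ K

def project(h, C, y):
    """Pixel positions (Nx2) of 3D points y (Nx3) under pose h and calibration C."""
    cam = y @ rodrigues(h[:3]).T - h[3:]          # R y - t
    pix = cam @ C.T
    return pix[:, :2] / pix[:, 2:3]

def residuals(h, C, p, y):
    """Stacked 2D reprojection differences, shape (2N,)."""
    return (p - project(h, C, y)).ravel()

def jacobian(h, C, p, y, eps=1e-6):
    """Central-difference Jacobian of the residuals w.r.t. the 6 pose parameters."""
    J = np.zeros((2 * len(p), 6))
    for j in range(6):
        step = np.zeros(6)
        step[j] = eps
        J[:, j] = (residuals(h + step, C, p, y) - residuals(h - step, C, p, y)) / (2 * eps)
    return J

def gauss_newton(h0, C, p, y, iterations=10):
    """Repeat h <- h - (J^T J)^-1 J^T r(h) for a fixed number of iterations."""
    h = np.asarray(h0, dtype=float).copy()
    for _ in range(iterations):
        J, r = jacobian(h, C, p, y), residuals(h, C, p, y)
        h = h - np.linalg.solve(J.T @ J, J.T @ r)
    return h

def pose_gradient(h, C, p, y, i, eps=1e-6):
    """Approximate d h / d y_i (6x3) as -(J^T J)^-1 J^T (d r / d y_i)."""
    J = jacobian(h, C, p, y)
    dr_dy = np.zeros((2 * len(p), 3))
    for k in range(3):
        y_plus, y_minus = y.copy(), y.copy()
        y_plus[i, k] += eps
        y_minus[i, k] -= eps
        dr_dy[:, k] = (residuals(h, C, p, y_plus) - residuals(h, C, p, y_minus)) / (2 * eps)
    return -np.linalg.solve(J.T @ J, J.T @ dr_dy)

# Synthetic example: perturb a known pose and refine it back.
rng = np.random.default_rng(0)
C = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
y = rng.uniform(-0.5, 0.5, size=(8, 3))
h_true = np.array([0.1, -0.2, 0.05, 0.1, -0.2, -4.0])
p = project(h_true, C, y)
h_est = gauss_newton(h_true + 0.05, C, p, y)
grad_y0 = pose_gradient(h_est, C, p, y, i=0)      # sensitivity of the pose to point 0
```

Differentiating only the last update avoids back-propagating through every iteration. At a zero-residual solution, as in this synthetic example, the approximation agrees with finite differences of the full solve.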
RANSAC
Image → Correspondence Prediction → Hypothesis Sampling → Scoring → Hypothesis Selection → Result
[Figure: hypotheses $\mathbf{h}_1, \dots, \mathbf{h}_4$ with scores $s(\mathbf{h}_1, \mathbf{y}), \dots, s(\mathbf{h}_4, \mathbf{y})$; reprojection errors of $\mathbf{h}_2$; result $\hat{\mathbf{h}}$]
• Soft Inlier Counting [Bra18]: $s(\mathbf{h}, \mathbf{y}) = \sum_i \operatorname{sig}(\tau - \beta \lVert \mathbf{p}_i - C\mathbf{h}\mathbf{y}_i \rVert)$
• argmax Selection: $\hat{\mathbf{h}} = \operatorname{argmax}_{\mathbf{h}_j} s(\mathbf{h}_j, \mathbf{y})$ (hard decision, non-differentiable)
• Probabilistic Selection [Bra17]: $\hat{\mathbf{h}} = \mathbf{h}_j$, where $j \sim \frac{\exp(s(\mathbf{h}_j, \mathbf{y}))}{\sum_k \exp(s(\mathbf{h}_k, \mathbf{y}))}$ (hard decision, differentiable)
[Bra17] Brachmann et al., "DSAC - Differentiable RANSAC for camera localization", CVPR'17
[Bra18] Brachmann and Rother, "Learning less is more - 6D camera localization via 3D surface regression", CVPR'18
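The two selection rules and the soft inlier count can be compared on made-up reprojection errors. In this NumPy sketch the threshold `tau`, the softness `beta`, the error values and the function names are all assumptions chosen for illustration.

```python
import numpy as np

def soft_inlier_score(errors, tau=10.0, beta=1.0):
    """Sum over correspondences of sigmoid(tau - beta * error)."""
    return np.sum(1.0 / (1.0 + np.exp(-(tau - beta * errors))))

def selection_probabilities(scores):
    """Softmax over hypothesis scores, shifted by the maximum for numerical stability."""
    e = np.exp(scores - scores.max())
    return e / e.sum()

# Reprojection errors in pixels: one row per hypothesis, one column per correspondence.
errors = np.array([[1.0, 2.0, 50.0, 60.0],
                   [1.0, 1.0, 2.0, 3.0],
                   [40.0, 50.0, 60.0, 70.0],
                   [2.0, 30.0, 40.0, 3.0]])
scores = np.array([soft_inlier_score(e) for e in errors])

# argmax selection: deterministic, but the choice itself has no useful gradient.
best = int(np.argmax(scores))

# Probabilistic selection: sample an index from the softmax distribution.
probs = selection_probabilities(scores)
rng = np.random.default_rng(0)
sampled = int(rng.choice(len(scores), p=probs))
```

The sampled index is still a hard decision. What becomes differentiable is the expected loss under `probs`, because the probabilities depend smoothly on the scores.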
Differentiable RANSAC (DSAC)
Hypothesis selection: $\hat{\mathbf{h}} = \mathbf{h}_j$, where $j \sim \frac{\exp(s(\mathbf{h}_j, \mathbf{y}))}{\sum_k \exp(s(\mathbf{h}_k, \mathbf{y}))} = P(j; \mathbf{y})$
Learning objective: $\mathcal{L}(\mathbf{y}) = \mathbb{E}_{j \sim P(j; \mathbf{y})} \left[ \ell(\mathbf{h}_j, \mathbf{h}^*) \right]$
Gradients: $\frac{\partial}{\partial \mathbf{y}} \mathcal{L}(\mathbf{y}) = \mathbb{E}_{j \sim P(j; \mathbf{y})} \left[ \ell(\mathbf{h}_j, \mathbf{h}^*) \frac{\partial}{\partial \mathbf{y}} \log P(j; \mathbf{y}) + \frac{\partial}{\partial \mathbf{y}} \ell(\mathbf{h}_j, \mathbf{h}^*) \right]$
• First term: derivative of selection probability
• Second term: derivative of task loss
C++ code for camera re-localization online. PyTorch code for DSAC line fitting also online.
[Bra17] Brachmann et al., "DSAC - Differentiable RANSAC for camera localization", CVPR'17
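The selection-probability part of this gradient can be checked numerically. The NumPy sketch below treats the hypothesis scores as the free variables and holds the per-hypothesis losses fixed, so the task-loss term vanishes. It then compares the closed-form expectation with finite differences. The function names and all numbers are illustrative assumptions.

```python
import numpy as np

def softmax(scores):
    """Selection probabilities P(j) from hypothesis scores."""
    e = np.exp(scores - scores.max())
    return e / e.sum()

def expected_loss(scores, losses):
    """E_{j ~ P}[loss_j], computed exactly over the finite hypothesis pool."""
    return float(softmax(scores) @ losses)

def expected_loss_grad_scores(scores, losses):
    """E_{j ~ P}[loss_j * d log P(j) / d scores], using d log P(j)/d s_k = [j == k] - P(k)."""
    P = softmax(scores)
    dlogP = np.eye(len(scores)) - P        # row j holds d log P(j) / d scores
    return (P * losses) @ dlogP

scores = np.array([2.0, 0.5, -1.0, 1.0])   # made-up hypothesis scores
losses = np.array([0.1, 1.5, 3.0, 0.7])    # made-up task losses per hypothesis
grad = expected_loss_grad_scores(scores, losses)

# Finite-difference reference for the same gradient.
eps = 1e-6
grad_fd = np.array([
    (expected_loss(scores + eps * np.eye(4)[k], losses)
     - expected_loss(scores - eps * np.eye(4)[k], losses)) / (2 * eps)
    for k in range(4)
])
```

Each gradient entry reduces to P(k) · (loss_k - expected loss), so a descent step lowers the scores of hypotheses with above-average loss and raises the others.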
Differentiable RANSAC (DSAC)
• PoseNet: 149cm, 3.4°
• Active Search: 19cm, 0.5°
• DSAC++: 13cm, 0.4°
[Posenet] "Geometric Loss Functions for Camera Pose Regression with Deep Learning", Kendall and Cipolla, CVPR'17
[Active Search] "Efficient & effective prioritized matching for large-scale image-based localization", Sattler et al., TPAMI'17
[DSAC] "DSAC - Differentiable RANSAC for Camera Localization", Brachmann et al., CVPR'17
[DSAC++] "Learning Less is More - 6D Camera Localization via 3D Surface Regression", Brachmann and Rother, CVPR'18
Correspondence Prediction
Input Image ($\mathbf{w}$) → Dense Correspondences → RANSAC / DSAC
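Neural Guided RANSAC (NG-RANSAC)
Input Image ($\mathbf{w}$) → Dense Correspondences → RANSAC / DSAC
Selecting a scene coordinate: $p(\mathbf{y})$, a function of $(I; \mathbf{w})$
Selecting a hypothesis: $p(\mathbf{h}) = \prod_{i=0}^{4} p(\mathbf{y}_i)$
Selecting a hypothesis pool: $p(\mathcal{H}) = \prod_j p(\mathbf{h}_j)$
Learning objective: $\mathbb{E}_{\mathcal{H} \sim p(\mathcal{H})} \left[ \mathcal{L}(\mathbf{w}) \right] = \mathbb{E}_{\mathcal{H} \sim p(\mathcal{H})} \, \mathbb{E}_{j \sim P(j | \mathcal{H}; \mathbf{w})} \left[ \ell(\mathbf{h}_j, \mathbf{h}^*) \right]$
• Outer expectation: Neural Guidance
• Inner expectation: DSAC
[Figure: Sampling Weight, colour scale from 0 to 1]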
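Guided sampling and the log-probability of a sampled pool can be sketched in NumPy. The sketch assumes that correspondences within a hypothesis are drawn independently, as the product form suggests, and that a network has already produced one non-negative weight per correspondence. The weights, the pool size, the set size of 4 and the function names are illustrative assumptions.

```python
import numpy as np

def sample_pool(weights, pool_size, set_size, rng):
    """Draw pool_size minimal sets of set_size correspondence indices each,
    sampling every index independently from the normalised weights."""
    probs = weights / weights.sum()
    return rng.choice(len(probs), size=(pool_size, set_size), p=probs)

def log_prob_pool(weights, pool):
    """log p(pool): sum of log p(y_i) over all hypotheses and their correspondences."""
    probs = weights / weights.sum()
    return float(np.log(probs[pool]).sum())

# Made-up sampling weights for five correspondences; two are ruled out entirely.
weights = np.array([0.0, 1.0, 3.0, 0.0, 6.0])
rng = np.random.default_rng(0)
pool = sample_pool(weights, pool_size=16, set_size=4, rng=rng)
pool_log_prob = log_prob_pool(weights, pool)
```

The log-probability is the quantity whose gradient with respect to the weights, multiplied by the loss obtained from the pool, gives a sampled estimate of the gradient of the outer expectation. Correspondences with zero weight never enter a hypothesis.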