Geometric and Semantic SLAM Using High-Level Features
Shichao Yang, Michael Kaess, Sebastian Scherer
Autonomous Robots
- Widely used for search, monitoring, mapping, etc. (examples shown: bridge inspection, riverine mapping)
- This work focuses on the monocular camera.
SLAM
- Simultaneous Localization and Mapping
- Examples shown: ORB SLAM, 2015 (R. Mur-Artal et al.); LSD, DSO, 2016 (Jakob et al.)
SLAM not enough?
- Traditional SLAM might fail in challenging low-texture cases (examples shown: DSO, ORB SLAM).
SLAM not enough?
- Applications need objects and planes in addition to points:
  - Virtual reality (placing furniture)
  - Autonomous driving (Mousavian, 2016)
  - Robotic manipulation (Seung-Joon Yi, 2015)
- Scene understanding: reason about the position of 3D objects and layouts (Hedau, Hoiem, 2010).
Methods
- Jointly solve SLAM and scene understanding, demonstrating that they can benefit each other.
- Diagram: SLAM <-> scene understanding, linked by high-level features: lines, planes, objects…
Outline
- Line VO/SLAM
- Plane SLAM
- Object SLAM
Line VO/SLAM
- Lines are an important feature (figure panels: feature points, lines):
  - They exist in low-texture environments.
  - They provide long-range constraints.
- Challenges:
  - Line parameterization
  - Sensitive to occlusion, less reliable than points
Shichao Yang, Sebastian Scherer. "Direct Monocular Odometry Using Points and Lines." ICRA, 2017
Related Work: Points + Line
- Prior work uses only geometric error: point-point, line-line, and point-line geometric error (distance $d$).
- Works shown: Juan, et al. ICCV, 2015; Albert, et al, ICRA, 2017; Manohar, et al. ICRA, 2016
Proposed Line VO
- Points + lines, with two error types:
  - Direct method: point-point photometric error, $I_2 - I_1$
  - Feature-based method: point-line geometric error, distance $d$
- Contributions:
  - Combine points and lines with two types of error, especially suitable for low-texture environments.
  - Provide an uncertainty analysis and probabilistic fusion in tracking and mapping.
  - Real-time VO, outperforming or comparable to existing VO.
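The two residual types named on this slide can be sketched as follows. This is only an illustration, not the authors' implementation: the function names, the integer-pixel image lookup, and the line form a*x + b*y + c = 0 are assumptions made for the example.

```python
import math

def photometric_error(img1, img2, p1, p2):
    """Intensity difference I2(p2) - I1(p1) between two corresponding pixels.

    Images are row-major lists of rows; pixels are (x, y) integer coordinates.
    """
    (x1, y1), (x2, y2) = p1, p2
    return img2[y2][x2] - img1[y1][x1]

def point_line_distance(point, line):
    """Perpendicular distance d from a 2D point to the line a*x + b*y + c = 0."""
    x, y = point
    a, b, c = line
    return abs(a * x + b * y + c) / math.hypot(a, b)

# Toy 2x2 images and one correspondence (hypothetical values).
img1 = [[10, 20], [30, 40]]
img2 = [[12, 20], [30, 35]]
r_photo = photometric_error(img1, img2, (1, 1), (1, 1))

# Distance from the point (3, 4) to the horizontal line y = 1.
r_line = point_line_distance((3.0, 4.0), (0.0, 1.0, -1.0))
```

In a tracking step, both residuals would be stacked into one cost over the camera pose; how they are weighted against each other is not shown here.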
Proposed Line VO
- Pipeline, as an extension of point-based SDVO [1]:
  - Images -> point extraction (high-gradient pixels) and line detection (line pixels)
  - Tracking: estimate the camera pose by minimizing two kinds of errors.
  - Mapping: estimate the keyframe's depth through line regularization.
[1] Semi-dense Visual Odometry for a Monocular Camera, Jakob Engel, et al. ICCV, 2013
Experiments - Line VO
- Relative Position Error (cm/s) on TUM Dataset
Experiments - Line VO
- Datasets with various textures: https://youtu.be/wu4jL2jQEac
Introduction - Plane SLAM
- Manhattan corridors share similar layout structures.
  - Low-texture: few visual features
  - Difficult for traditional v-SLAM
- Figure panels: a texture-less but structured corridor; the sparse and inaccurate map of ORB-SLAM; SLAM with layout planes (IROS, 2016)
Related Work – Layout Understanding
- Hoiem, 2007: decision tree segmentation + pop up
- Hedau, 2009: cuboidal room using vanishing points
- Lee, 2009: fixed corridor models
- These usually work only for Manhattan box environments or fixed corridor configurations and viewpoints.
- Not real time.
Related Work - SLAM + Layout
- Sequential approaches solve the two problems separately:
  - Scene understanding -> SLAM: 3D layout, then post dense mapping (Concha, Alejo, 2015)
  - SLAM -> scene understanding: point cloud, then detect planes and objects (Sid Yingze Bao, 2014)
- Limitation: if one module fails, the other also fails.
Proposed methods
- Diagram: scene understanding <-> SLAM
- Contributions:
  - Jointly optimize scene layouts with camera poses in a SLAM framework and in large environments for the first time.
  - Real-time system applicable to robot navigation.
Plane model from Single Image
- Layout plane extraction from a single image: ground segmentation -> boundary line fitting -> pop-up 3D model
Shichao Yang, Daniel Maturana, Sebastian Scherer. "Real-time 3D scene layout from a single image using convolutional neural networks." ICRA, 2016
Plane model from Single Image
- Generalizes to various environment structures.
- Figure: wall and ground labels on the input images, comparing ours with Hoiem, 2007; Hedau, 2009; Lee, 2009
Plane model from Single Image
- Fast: runs in real time at 60 Hz.
- More accurate and robust to various environments.
- Video (4x speed): https://youtu.be/2CvFHy5jk1c
Plane model from Single Image
- Previous method: ground segmentation -> line fitting -> pop-up
- Improvement: detect edges -> select edges -> pop-up
  - Planes match the true layout and are invariant across frames, so they are suitable for SLAM.
  - More accurate 3D model
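Plane model from Single Image
- Optimal edge set selection (figure panels: ground segmentation, all edges, selected ground edges):
  $$\max_{S \subseteq V} F(S) \quad \text{s.t. } S \in I$$
- Submodular problem with a greedy solution that selects edges one by one:
  $$S \leftarrow S \cup \Big\{ \arg\max_{e \notin S:\, S \cup \{e\} \in I} \Delta(e \mid S) \Big\}$$
  where $\Delta(e \mid S)$ is the marginal gain of adding edge $e$.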
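A greedy selection of this kind can be sketched in a few lines. The sketch below is generic and makes its own assumptions: edges are modeled as half-open ranges of image columns, the score is the number of distinct columns covered (a coverage function, which is submodular), and the constraint is simply a cap on the number of edges. None of these choices is taken from the slides; the actual score and constraint set used by the authors are not given here.

```python
def greedy_select(candidates, gain, feasible):
    """Add the feasible candidate with the largest positive marginal gain until none is left."""
    selected = set()
    while True:
        best, best_gain = None, 0.0
        for e in candidates:
            if e in selected or not feasible(selected | {e}):
                continue
            g = gain(e, selected)
            if g > best_gain:
                best, best_gain = e, g
        if best is None:
            return selected
        selected.add(best)

def covered_columns(edges):
    """Set of image columns covered by edges given as (start, end) half-open ranges."""
    cols = set()
    for start, end in edges:
        cols.update(range(start, end))
    return cols

def coverage_gain(edge, selected):
    """Marginal gain: number of new columns the edge adds to the current coverage."""
    return len(covered_columns([edge]) - covered_columns(selected))

# Hypothetical candidate edges and a hypothetical cap of two selected edges.
edges = [(0, 50), (40, 100), (45, 60), (90, 120)]
chosen = greedy_select(edges, coverage_gain, lambda s: len(s) <= 2)
```

With these toy inputs the first pick is the widest edge, and the second pick is the edge that adds the most uncovered columns rather than the next widest one, which is the behavior the marginal gain is meant to capture.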
Pop-up Plane SLAM
- Factor graph
- Edges
  - Plane measurement $c_i$ from the single-image pop-up process
  - Re-pop to update the measurement after the camera poses change
- Nodes
  - Plane: $\pi_i = \{\pi \in \mathbb{R}^4,\ \|\pi\| = 1\}$, a unit quaternion with 3 DoF as the minimal representation for manifold optimization [1]
  - Camera pose: $x_i$, 6 DoF, SE(3)
Shichao Yang, Yu Song, Michael Kaess, Sebastian Scherer. "Pop-up SLAM: a Semantic Monocular Plane SLAM for Low-texture Environments." IROS, 2016
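The unit-norm plane vector can be illustrated with a small sketch. It assumes the plane is written as n·x + d = 0 and stored as the homogeneous 4-vector (n, d); the helper names are hypothetical. Scaling the 4-vector does not change the plane, which is why only 3 of its 4 numbers are free once the norm is fixed to 1.

```python
import math

def normalize_plane(n, d):
    """Scale the homogeneous plane vector (n, d) to unit length in R^4."""
    norm = math.sqrt(sum(c * c for c in n) + d * d)
    return tuple(c / norm for c in n) + (d / norm,)

def signed_distance(plane, point):
    """Signed distance from a 3D point to the plane a*x + b*y + c*z + d = 0."""
    a, b, c, d = plane
    x, y, z = point
    return (a * x + b * y + c * z + d) / math.sqrt(a * a + b * b + c * c)

# The plane z = 1.5, written with an arbitrary scale (hypothetical values).
raw = (0.0, 0.0, 2.0, -3.0)
unit = normalize_plane(raw[:3], raw[3])

# Both vectors describe the same plane, so point distances agree.
dist_raw = signed_distance(raw, (0.0, 0.0, 2.5))
dist_unit = signed_distance(unit, (0.0, 0.0, 2.5))
```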
Pop-up Plane SLAM
- Data association
  - Uses geometry, not visual features, because of low texture.
  - Angle difference between plane normals $n_1$, $n_2$
  - Overlapping ratio from projecting $\pi_2$ onto $\pi_1$
- Loop closing
  - Bag-of-words place recognition
  - Planes have different appearance and size across frames.
  - Landmarks are merged after being created for some frames, so factors need to be shifted.
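The two geometric tests for data association can be sketched as below. This is a simplified illustration under stated assumptions: plane extents are reduced to 1D intervals along a shared axis instead of projected polygons, the overlap is measured relative to the shorter interval, and the thresholds (10 degrees, 0.5) are hypothetical values chosen for the example.

```python
import math

def normal_angle_deg(n1, n2):
    """Angle in degrees between two plane normals."""
    dot = sum(a * b for a, b in zip(n1, n2))
    norm = math.sqrt(sum(a * a for a in n1)) * math.sqrt(sum(b * b for b in n2))
    cos_angle = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return math.degrees(math.acos(cos_angle))

def overlap_ratio(extent_a, extent_b):
    """Overlap of two (start, end) intervals, relative to the shorter interval."""
    (a0, a1), (b0, b1) = extent_a, extent_b
    intersection = max(0.0, min(a1, b1) - max(a0, b0))
    return intersection / min(a1 - a0, b1 - b0)

def same_plane(n1, n2, extent_1, extent_2, max_angle_deg=10.0, min_overlap=0.5):
    """Associate two planes if their normals are close and their extents overlap enough."""
    return (normal_angle_deg(n1, n2) <= max_angle_deg
            and overlap_ratio(extent_1, extent_2) >= min_overlap)

# Two wall observations whose normals differ by 5 degrees (hypothetical values).
n1 = (0.0, 0.0, 1.0)
n2 = (0.0, math.sin(math.radians(5.0)), math.cos(math.radians(5.0)))
angle = normal_angle_deg(n1, n2)
ratio = overlap_ratio((0.0, 4.0), (1.0, 6.0))
matched = same_plane(n1, n2, (0.0, 4.0), (1.0, 6.0))
```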
Point-plane Fusion
- Plane SLAM alone is sometimes not enough.
- Point SLAM is not accurate in forward corridor motion with low parallax.
- Figure panels: RGB image; pop-up depth map, much better than stereo triangulation
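Point-plane Fusion
- Depth fusion: integrate the LSD depth $d_l$ and the pop-up depth $d_p$ in a filtering approach:
  $$\mathcal{N}\left( \frac{\sigma_l^2 d_p + \sigma_p^2 d_l}{\sigma_l^2 + \sigma_p^2},\ \frac{\sigma_l^2 \sigma_p^2}{\sigma_l^2 + \sigma_p^2} \right)$$
- $\sigma_l^2$ and $\sigma_p^2$ are the variances of the two depth measurements; $\sigma_l^2$ is computed through the error propagation rule.
- Figure: pop-up covariance $\sigma_p$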
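The fusion rule is the standard product of two Gaussian estimates, so the fused depth is a variance-weighted average and the fused variance is smaller than either input. A minimal sketch for a single pixel follows; the function name and the numbers are hypothetical.

```python
def fuse_depth(d_l, var_l, d_p, var_p):
    """Fuse LSD depth (d_l, var_l) with pop-up depth (d_p, var_p).

    Returns the mean and variance of the fused Gaussian estimate.
    """
    total = var_l + var_p
    mean = (var_l * d_p + var_p * d_l) / total
    variance = (var_l * var_p) / total
    return mean, variance

# The pop-up depth is more certain here, so the result lies closer to it.
fused_mean, fused_var = fuse_depth(d_l=2.0, var_l=0.04, d_p=2.2, var_p=0.01)
```

Each depth is weighted by the other measurement's variance, so the less certain source contributes less to the fused value.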
Experiments of Plane SLAM
- On public and collected datasets: https://youtu.be/TOSOWdxmtkw
Experiments of Plane SLAM
- Comparison with LSD and ORB SLAM on the TUM dataset:
  - Existing point SLAM fails.
  Metric: Plane normal error | Depth error | Depth error < 0.1 m
  Value:  2.83°              | 6.2 cm      | 86.8%
Experiments of Plane SLAM
- On our data I
- Figure panels: input image; LSD SLAM; ORB SLAM; our algorithms (Depth Enhanced LSD SLAM, LSD Pop-up SLAM)
Experiments of Plane SLAM
- On our data II
- Figure panels: input image; LSD SLAM; ORB SLAM; our algorithms
- Loop error 0.67%.
Introduction – Object SLAM
- SLAM with objects and planes.
  - Completed work: Plane SLAM
  - Proposed work: Plane and Object SLAM
Related Work – 3D Object Understanding
- Approaches shown: prior CAD model; object aligned with room; keypoint model
- Works shown: Schwing, 2013; Choi, 2013; Murthy, 2017
- Limitation: need a prior object CAD model or shape priors.
Related Work – Object SLAM
- Bao, Sid Yingze, et al. 2012 (only two images)
- SLAM++ (RGBD): Salas-Moreno, et al. 2013
- Dorian Gálvez-López, et al. 2016
- Limitations:
  - Work only in small workspaces
  - Require a known object model
Single image 3D object detection
- Without a 3D CAD or keypoint model.
Object SLAM
- On a TUM sequence (preliminary result):
  - 3D object detection in a single image, without a prior object model
  - Multi-view object SLAM: each object has a 6 DoF pose plus length, width, and height.
  - Existing point SLAM methods all fail.
Conclusion
- SLAM with high-level features from scene understanding: lines, planes, and objects, in addition to points.
- Improves both state estimation and mapping.
- Requires no prior CAD model or room model.
Image modified from Salas-Moreno, 2014
Future work
- More complicated environments? Support relations? (segmentation, intersection, occlusion)
- Points, planes, and objects jointly?
Scene understanding <-> high-level features <-> SLAM