Neural Topological SLAM for Visual Navigation
CVPR 2020
Devendra Singh Chaplot, Ruslan Salakhutdinov, Abhinav Gupta, Saurabh Gupta
Webpage: https://devendrachaplot.github.io/projects/Neural-Topological-SLAM

Semantic Priors and Common-Sense
• Humans use semantic priors and common-sense to explore and navigate every day
• Most navigation algorithms struggle to do so
[Figure: Target Image example with numbered annotations]

Image Goal Task
[Figure: Source Image (I_S) and Goal Image (I_G), with the label "navigate"]
• Agent observations are panoramic images
• Take actions to navigate to the goal location
• Take the 'stop' action at the goal location
• Sequential goals

Prior work
End-to-end Learning (Reinforcement or Imitation Learning)
• High sample complexity
• Ineffective in large environments
[Figure: a neural network maps observations to actions and receives positive or negative reward]
Modular Metric Maps
• Cannot learn semantic priors
• Pose error accumulation

Topological Maps
[Figure: topological map of a home with nodes Dining Room, Kitchen, Living Room, Stairway, Hallway, Master Bedroom, Office, Children's Room, Entrance]

Topological Graph Representation
• Nodes: areas
• Regular nodes: explored areas
• Ghost nodes: unexplored areas
• Edges: spatial relationship between nodes
[Figure legend: Agent's Current Node, Regular Nodes, Ghost Nodes, Selected Ghost Node, Relative Position]
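
The node and edge types listed on this slide could be held in a small graph structure. The following is a minimal Python sketch, not the authors' implementation: the names `Node`, `TopoGraph`, and `promote`, and the choice to store each edge's relative position as a `(direction_bin, distance)` pair, are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    node_id: int
    is_ghost: bool  # True: unexplored area (ghost), False: explored area (regular)


@dataclass
class TopoGraph:
    nodes: Dict[int, Node] = field(default_factory=dict)
    # edges[(src, dst)] holds the position of dst relative to src.
    # The (direction_bin, distance) encoding is an assumption of this sketch.
    edges: Dict[Tuple[int, int], Tuple[int, float]] = field(default_factory=dict)

    def add_node(self, is_ghost: bool) -> int:
        node_id = len(self.nodes)  # ids are assigned sequentially; nodes are never deleted
        self.nodes[node_id] = Node(node_id, is_ghost)
        return node_id

    def add_edge(self, src: int, dst: int, rel_pos: Tuple[int, float]) -> None:
        self.edges[(src, dst)] = rel_pos

    def ghost_nodes(self) -> List[int]:
        return [n.node_id for n in self.nodes.values() if n.is_ghost]

    def promote(self, node_id: int) -> None:
        # A ghost node becomes a regular node once the agent has explored it.
        self.nodes[node_id].is_ghost = False


graph = TopoGraph()
start = graph.add_node(is_ghost=False)          # the agent's current node
ghost = graph.add_node(is_ghost=True)           # an unexplored area seen from it
graph.add_edge(start, ghost, rel_pos=(3, 2.0))  # hypothetical relative position
```

Keeping ghost nodes in the same container as regular nodes means that exploring one only flips a flag, and the edge that led to it stays valid.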

Four learnable functions
• ℱ_G(I_1) = Geometric Prediction: Free directions
• ℱ_S(I_1, I_2) = Semantic Prediction: Closeness to target
• ℱ_L(I_1, I_2) = Localization
• ℱ_R(I_1, I_2) = Relative Pose Prediction

Geometric Prediction
• ℱ_G(I_1) = Geometric Prediction: Free directions
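
One way to read "free directions" is as a per-direction probability that the agent can move that way. The sketch below assumes ℱ_G returns one probability per direction bin, that 12 bins evenly cover 360 degrees, and that a direction counts as free above a threshold of 0.5. None of these details are stated on the slide, and the probability values are made up for the example.

```python
from typing import List, Sequence


def free_directions(probs: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Return the indices of direction bins whose free-space probability exceeds threshold."""
    return [i for i, p in enumerate(probs) if p > threshold]


def bin_to_heading_deg(bin_index: int, num_bins: int) -> float:
    """Convert a bin index to a heading in degrees, assuming evenly spaced bins."""
    return bin_index * 360.0 / num_bins


# Hypothetical output of the geometric prediction for one panorama (12 bins assumed).
probs = [0.9, 0.1, 0.2, 0.8, 0.1, 0.0, 0.3, 0.2, 0.1, 0.0, 0.4, 0.2]
free = free_directions(probs)
headings = [bin_to_heading_deg(i, len(probs)) for i in free]
```

Under these assumptions, each free direction is a candidate location for a ghost node.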

Semantic Prediction
• ℱ_S(I_1, I_2) = Semantic Prediction: Closeness to target

Localization
• ℱ_L(I_1, I_2) = Localization
[Figure: two examples, one with Localization (ℱ_L) output 1 and one with output 0]
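
The 1/0 outputs on this slide suggest a binary decision about whether two images show the same location. The sketch below shows how such a decision could be used to match the current observation against stored nodes. The function `localize`, the 0.5 threshold, and the string stand-ins for images are all hypothetical, and `same_place` is a toy replacement for a learned ℱ_L.

```python
from typing import Callable, Dict, Optional

Obs = str  # hypothetical stand-in for a panoramic image


def localize(
    score_fn: Callable[[Obs, Obs], float],
    current_obs: Obs,
    node_obs: Dict[int, Obs],
    threshold: float = 0.5,
) -> Optional[int]:
    """Return the id of the best-matching node, or None if no score exceeds threshold."""
    best_id, best_score = None, threshold
    for node_id, obs in node_obs.items():
        score = score_fn(current_obs, obs)
        if score > best_score:
            best_id, best_score = node_id, score
    return best_id


def same_place(a: Obs, b: Obs) -> float:
    # Toy stand-in for the learned localization function: 1.0 on an exact match, else 0.0.
    return 1.0 if a == b else 0.0


stored = {0: "hallway", 1: "kitchen"}
match = localize(same_place, "kitchen", stored)
no_match = localize(same_place, "bedroom", stored)
```

A result of `None` is the case where a caller would typically add a new node for the current observation.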

Relative Pose Prediction
• ℱ_R(I_1, I_2) = Relative Pose Prediction
• Angle (ℱ_R)
  Direction label:   0 0 0 1 0 0 0 0 0 0 0 0
  Score predictions: 0.0 0.0 0.0 0.87 0.0 0.0 0.0 0.0 0.0 0.0 0 0
• Distance (ℱ_R)
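
The score vector on this slide can be decoded into a relative position by taking the highest-scoring direction bin and combining it with a distance. The sketch below assumes the 12 bins are evenly spaced over 360 degrees with bin 0 pointing straight ahead. The 2.0 distance is a made-up value, since the slide shows no distance number.

```python
import math
from typing import Sequence, Tuple


def decode_relative_pose(
    direction_scores: Sequence[float], distance: float
) -> Tuple[int, Tuple[float, float]]:
    """Pick the highest-scoring direction bin and convert (bin, distance) to (dx, dy)."""
    num_bins = len(direction_scores)
    best_bin = max(range(num_bins), key=lambda i: direction_scores[i])
    angle = 2.0 * math.pi * best_bin / num_bins  # assumes evenly spaced bins
    return best_bin, (distance * math.cos(angle), distance * math.sin(angle))


# Score predictions copied from the slide; the distance is hypothetical.
scores = [0.0, 0.0, 0.0, 0.87, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
best_bin, (dx, dy) = decode_relative_pose(scores, distance=2.0)
```

With these assumptions the selected bin is index 3, which matches the position of the 1 in the direction label on the slide.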

Neural Topological SLAM
• ℱ_G(I_1) = Geometric Prediction: Free directions
• ℱ_S(I_1, I_2) = Semantic Prediction: Closeness to target
• ℱ_L(I_1, I_2) = Localization
• ℱ_R(I_1, I_2) = Relative Pose Prediction
[Figure: system diagram built up over several slides, with components labeled ℱ_L(I_1, I_2), ℱ_G(I_1), ℱ_S(I_1, I_2), ℱ_R(I_1, I_2) and Δp]
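
To make the interplay of the functions concrete, here is one plausible way to wire three of them into a single decision step. This is a sketch under several assumptions, not the pipeline from the slides: the function `step`, the 0.5 threshold, the per-direction output of `f_s`, and the string stand-ins for images are all hypothetical, and ℱ_R is left out to keep the example short.

```python
from typing import Callable, Dict, List, Optional, Tuple

Obs = str  # hypothetical stand-in for a panoramic image


def step(
    current_obs: Obs,
    goal_obs: Obs,
    node_obs: Dict[int, Obs],
    f_l: Callable[[Obs, Obs], float],
    f_g: Callable[[Obs], List[float]],
    f_s: Callable[[Obs, Obs], List[float]],
    threshold: float = 0.5,
) -> Tuple[int, Optional[int]]:
    """Return (current node id, chosen direction bin or None if no direction is free)."""
    # Localization: reuse an existing node if f_l says the two views match.
    current = next(
        (nid for nid, obs in node_obs.items() if f_l(current_obs, obs) > threshold),
        None,
    )
    if current is None:
        current = len(node_obs)  # otherwise register a new node
        node_obs[current] = current_obs
    # Geometric prediction: keep only directions predicted to be free.
    free = [i for i, p in enumerate(f_g(current_obs)) if p > threshold]
    # Semantic prediction: among free directions, prefer the one closest to the target.
    scores = f_s(current_obs, goal_obs)
    best = max(free, key=lambda i: scores[i]) if free else None
    return current, best


# Toy stand-ins for the learned functions.
nodes = {0: "hallway"}
f_l = lambda a, b: 1.0 if a == b else 0.0
f_g = lambda obs: [0.9, 0.1, 0.8, 0.2]
f_s = lambda obs, goal: [0.3, 0.9, 0.7, 0.1]

cur, best = step("kitchen", "oven", nodes, f_l, f_g, f_s)
cur_again, _ = step("hallway", "oven", nodes, f_l, f_g, f_s)
```

In the toy run, direction 1 has the highest semantic score but is not free, so direction 2 is chosen. This is the kind of interaction between geometric and semantic predictions that the sketch is meant to show.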

Single supervised learning model
• No reinforcement learning, no interaction needed
• Can be trained completely with static data

Demo video
[Video legend: Goal Location, Node Locations, Ghost nodes, Selected Ghost node. Values shown in one frame: 0.09, 0.08, 0.73, 0.23, 0.07, 0.17]

Learning Semantic Priors
[Figure legend: Goal Location, Node Locations, Ghost nodes, Selected Ghost node]
• Values shown in the first example: 0.09, 0.08, 0.73, 0.23, 0.07, 0.17
• Values shown in the second example: 0.20, 0.56, 0.17, 0.13, 0.76, 0.27
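
If the numbers on these slides are read as per-ghost-node semantic scores, then choosing the "Selected Ghost node" could be as simple as taking the highest score. That reading, and the argmax rule itself, are assumptions. The slides only show the values and a legend. A minimal sketch:

```python
from typing import Sequence


def select_ghost_node(scores: Sequence[float]) -> int:
    """Return the index of the ghost node with the highest score."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Values copied from the two example slides.
first_example = [0.09, 0.08, 0.73, 0.23, 0.07, 0.17]
second_example = [0.20, 0.56, 0.17, 0.13, 0.76, 0.27]

first_choice = select_ghost_node(first_example)
second_choice = select_ghost_node(second_example)
```

Under this reading the first example selects the node scored 0.73 and the second selects the node scored 0.76.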

Results
Category               Model                        RGB    RGBD   RGBD (No Noise)   RGBD (No Stop)
End-to-end Learning    LSTM + Imitation             0.10   0.14   0.15              0.18
End-to-end Learning    LSTM + RL                    0.10   0.13   0.14              0.17
Modular Metric Maps    Occupancy Maps + FBE + RL    N/A    0.26   0.31              0.24
Modular Metric Maps    Active Neural SLAM           0.23   0.29   0.35              0.39
Topological Maps       Neural Topological SLAM      0.38   0.43   0.45              0.60
• Robustness to Pose Noise
• NTS is better than occupancy map models; it captures and uses semantic priors.
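
The "Robustness to Pose Noise" note can be checked with simple arithmetic on the table. The sketch below assumes the note refers to the gap between the RGBD and RGBD (No Noise) columns, and that the columns are ordered RGB, RGBD, RGBD (No Noise), RGBD (No Stop), as the flattened slide header suggests. It computes how much each map-based model's score falls, relative to its noise-free score, when noise is present.

```python
# Values copied from the results table; the column assignment is an assumption.
results = {
    "Occupancy Maps + FBE + RL": {"RGBD": 0.26, "RGBD (No Noise)": 0.31},
    "Active Neural SLAM": {"RGBD": 0.29, "RGBD (No Noise)": 0.35},
    "Neural Topological SLAM": {"RGBD": 0.43, "RGBD (No Noise)": 0.45},
}


def relative_drop(row: dict) -> float:
    """Fraction of the noise-free score that is lost when noise is present."""
    clean, noisy = row["RGBD (No Noise)"], row["RGBD"]
    return (clean - noisy) / clean


drops = {name: relative_drop(row) for name, row in results.items()}
for name, drop in drops.items():
    print(f"{name}: {drop:.1%}")
```

On these numbers the two metric-map models lose roughly 16 to 17 percent of their noise-free score, while Neural Topological SLAM loses about 4 percent.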