Place recognition with instance search: from hand-crafted to learning-based methods
Giorgos Tolias
Tutorial on Large-Scale Visual Place Recognition and Image-Based Localization (Tolias, Sattler, Brachmann), ICCV 2019, Seoul

Outline


  1-2. Average precision loss: train with a differentiable approximation of average precision computed over the whole batch; the larger the batch the better, and no pair/triplet sampling is needed [Revaud et al., ICCV'19]
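
A minimal sketch of such a listwise loss, assuming the histogram-binned ("quantized") AP relaxation in the spirit of Revaud et al.; the bin count and tensor layout are illustrative choices, not the paper's exact configuration:

```python
import torch

def soft_ap_loss(sims, labels, nbins=25):
    """Differentiable AP approximation for one query (loss = 1 - AP).

    sims:   (N,) cosine similarities of the query to N batch items, in [-1, 1]
    labels: (N,) float, 1.0 for relevant items, 0.0 otherwise
    Each similarity is soft-assigned to equally spaced bins with a triangular
    kernel, replacing the non-differentiable hard ranking.
    """
    centers = torch.linspace(-1, 1, nbins, device=sims.device)
    delta = 2.0 / (nbins - 1)
    # (nbins, N): soft count of every item in every similarity bin
    w = torch.clamp(1 - (sims[None, :] - centers[:, None]).abs() / delta, min=0)
    pos = (w * labels[None, :]).sum(dim=1)   # soft number of positives per bin
    tot = w.sum(dim=1)                       # soft number of items per bin
    # accumulate from the most-similar bin downwards (flip: centers ascend)
    cum_pos = torch.cumsum(pos.flip(0), dim=0)
    cum_tot = torch.cumsum(tot.flip(0), dim=0)
    precision = cum_pos / cum_tot.clamp(min=1e-8)
    recall_inc = pos.flip(0) / labels.sum().clamp(min=1e-8)
    return 1 - (precision * recall_inc).sum()
```

Within a batch, each image is typically treated in turn as the query against all the others and the per-query losses are averaged; since AP is computed over the whole batch at once, larger batches give a better-behaved training signal.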

  3. Training data

  4. Training data from GPS: negatives. Images whose GPS position is far from the anchor serve as candidate negatives.

  5-6. Training data from GPS: positives. Images near the anchor are only candidate positives, because camera orientation is unknown; descriptor distance resolves the ambiguity by picking the closest candidate [Arandjelovic et al., CVPR'16]
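
A sketch of this selection under assumed radii (the 10 m / 25 m thresholds follow common practice for this kind of weak GPS supervision and are not stated on the slide):

```python
import numpy as np

def gps_training_pair(anchor, descs, gps_dist, r_pos=10.0, r_neg=25.0):
    """Split images by GPS distance to the anchor: beyond r_neg meters they
    are safe negatives; within r_pos meters they are only *candidate*
    positives, since unknown camera orientation means GPS proximity does not
    guarantee visual overlap. The candidate whose current descriptor is
    closest to the anchor's is taken as the positive.

    descs: (N, D) current descriptors; gps_dist: (N,) distances in meters."""
    negatives = np.where(gps_dist > r_neg)[0]
    candidates = np.setdiff1d(np.where(gps_dist <= r_pos)[0], [anchor])
    d = np.linalg.norm(descs[candidates] - descs[anchor], axis=1)
    positive = int(candidates[np.argmin(d)])
    return positive, negatives
```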

  7-8. Training data from SfM: 7.4M images yield 713 training 3D models; within a model, camera orientation and the number of matching inliers are known [Schonberger et al. CVPR'15] [Radenovic et al. ECCV'16]

  9-13. Training data from SfM: hard negatives. Negative examples: images from 3D models different from the query's. Hard negatives: the negative examples closest to the query. Picking only the single most similar image is too limited; naive hard negatives take the top k by CNN descriptor distance, but these often come from the same scene; diverse hard negatives take the top k with at most one image per 3D model [Radenovic et al. PAMI'19]
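
A sketch of diverse hard-negative mining under these rules; the array names and the value of k are illustrative:

```python
import numpy as np

def diverse_hard_negatives(anchor_desc, neg_descs, neg_model_ids, k=5):
    """Scan negatives (images from 3D models other than the query's) in order
    of increasing CNN descriptor distance and keep at most one image per 3D
    model, so the k hard negatives do not all depict the same scene."""
    order = np.argsort(np.linalg.norm(neg_descs - anchor_desc, axis=1))
    picked, seen_models = [], set()
    for i in order:
        if neg_model_ids[i] not in seen_models:
            picked.append(int(i))
            seen_models.add(neg_model_ids[i])
        if len(picked) == k:
            break
    return picked
```

Since the CNN descriptors change during training, the hard negatives are re-mined periodically with the current network.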

  14-18. Training data from SfM: hard positives. Positive examples: images that share 3D points with the query. Hard positives: positive examples not already close to the query in descriptor space. Selection strategies, from easiest to hardest: top 1 by CNN descriptor distance; top 1 by number of inliers (harder positives); random from the top k by inliers [Radenovic et al. PAMI'19]
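
A sketch of the final strategy, "random from the top k by inliers"; the helper name and default k are illustrative:

```python
import numpy as np

def hard_positive(n_inliers, k=20, rng=None):
    """Positives share 3D points with the query; drawing at random among the
    k candidates with the most SfM inliers keeps the pair geometrically
    verified while avoiding always training on the single easiest, most
    similar view.

    n_inliers: (P,) inlier counts of the candidate positives to the query."""
    rng = rng or np.random.default_rng()
    top_k = np.argsort(-np.asarray(n_inliers))[:k]
    return int(rng.choice(top_k))
```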

  19-25. Positive and negative training images: mAP for different combinations of positive and negative selection [Radenovic et al. PAMI'19]

         Positives + negatives                       Oxford 5k   Paris 6k
         Off-the-shelf (no fine-tuning)                 51.6       44.2
         top 1 CNN + top k CNN                          63.1       56.2
         top 1 CNN + top 1 / model CNN                  63.9       56.7
         top 1 inliers + top 1 / model CNN              67.1       59.7
         random(top k inliers) + top 1 / model CNN      67.5       60.2

  26-28. Class labels + cleaning: use classical computer vision (Bag-of-Words retrieval and spatial verification) to clean landmark class labels and collect training data, then compare a classification loss against a ranking loss [Gordo et al. IJCV'18]
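
The comparison boils down to two objectives on the cleaned data; a schematic contrast in PyTorch, with all tensor names hypothetical:

```python
import torch.nn.functional as F

def classification_loss(landmark_logits, landmark_labels):
    # One landmark = one class: cheap labels, but the loss never sees an
    # explicit image-to-image comparison.
    return F.cross_entropy(landmark_logits, landmark_labels)

def ranking_loss(anchor_emb, pos_emb, neg_emb, margin=0.1):
    # Triplet loss on verified image pairs: directly optimizes the embedding
    # distances that are used at retrieval time.
    return F.triplet_margin_loss(anchor_emb, pos_emb, neg_emb, margin=margin)
```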

  29. PlaNet: geolocation as N-way classification over an adaptive partitioning of the globe into k = 26,263 cells. Very compact model (377 MB)! But is it better than instance search? [Weyand et al., ECCV'16]
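
A toy illustration of adaptive partitioning (the paper builds its cells with the S2 geometry library, not with a plain lat/lng quadtree as here):

```python
def adaptive_cells(points, max_per_cell, cell=(-90.0, 90.0, -180.0, 180.0)):
    """Recursively quadtree-split a (lat0, lat1, lng0, lng1) cell until it
    holds at most max_per_cell training photos: photo-dense regions end up
    with fine cells, sparse regions with coarse ones. Each surviving cell
    becomes one class of the N-way classifier."""
    lat0, lat1, lng0, lng1 = cell
    inside = [(la, ln) for la, ln in points
              if lat0 <= la < lat1 and lng0 <= ln < lng1]
    if len(inside) <= max_per_cell or lat1 - lat0 < 1e-3:  # size floor
        return [cell] if inside else []
    mla, mln = (lat0 + lat1) / 2, (lng0 + lng1) / 2
    cells = []
    for sub in ((lat0, mla, lng0, mln), (lat0, mla, mln, lng1),
                (mla, lat1, lng0, mln), (mla, lat1, mln, lng1)):
        cells.extend(adaptive_cells(inside, max_per_cell, sub))
    return cells
```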

  30-33. Revisiting IM2GPS: evaluation on the IM2GPS dataset at scales from fine to coarse: street (1 km), city (25 km), region (250 km), country (750 km), continent (7500 km) [Vo et al., ICCV'17]
         A. Classification with globe partitioning: best at the coarse levels, bad at the fine ones; very compact model.
         B. Descriptors from A used for instance search: improves at the fine levels, but all descriptors must be kept in memory.
         C. Fine-tuning A with a ranking loss, then instance search: no improvement; intra-class variability is high, so the sampled pairs are not challenging.
         D. Global descriptor (MAC) trained with SfM data [Radenovic et al.]: the best at the fine level; all descriptors in memory.

  34. Google landmark recognition challenge: combining a global GeM descriptor with local DELF features aggregated by ASMK
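
A sketch of how such a combination is commonly wired together, global ranking followed by local re-ranking; `local_score` stands in for an ASMK similarity and is not a real API:

```python
import numpy as np

def retrieve(query_global, db_globals, local_score, top_r=100):
    """Rank the whole database with global (e.g., GeM) descriptors, then
    re-rank only the top_r short list with a local-feature similarity
    (e.g., DELF + ASMK), which is costlier but far more precise.

    query_global: (D,), db_globals: (N, D), both L2-normalized."""
    order = np.argsort(-(db_globals @ query_global))        # global stage
    short_list = order[:top_r]
    local = np.array([local_score(i) for i in short_list])  # local stage
    return short_list[np.argsort(-local)]
```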

  35. GeM-based recognition
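
For reference, GeM pooling itself is a one-liner over the CNN feature map; a PyTorch sketch:

```python
import torch

def gem(x, p=3.0, eps=1e-6):
    """Generalized-mean pooling of a (B, C, H, W) activation tensor into a
    (B, C) global descriptor: p = 1 recovers average pooling, p -> infinity
    approaches max pooling, and p can also be learned end to end. The result
    is typically L2-normalized before retrieval."""
    return x.clamp(min=eps).pow(p).mean(dim=(-2, -1)).pow(1.0 / p)
```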
