Geometric VLAD for Large Scale Image Search


  1. Geometric VLAD for Large Scale Image Search
     Zixuan Wang¹, Wei Di², Anurag Bhardwaj², Vignesh Jagadesh², Robinson Piramuthu²
     ICML 2014 workshop on New Learning Frameworks and Models for Big Data

  2. Our Goal
     1) Robust to various imaging conditions
     2) Small memory footprint
     3) Speed (<1 s per query)

  3. Issues with Matching Images (1/2): Photometric Invariance
     • Brightness
     • Exposure

  4. Issues with Matching Images (2/2): Geometric Invariance
     • Rotation
     • Translation
     • Scale

  5. State-of-the-Art: Bag-of-Words (BoW)
     Pipeline: Image Inventory → Keypoint Detection → Descriptor Computation → Codebook Construction → BoW Encoding → Bag-of-Words → Inverted Indices (size = 200k)
     (Slide evolved from Fei-Fei Li's.)
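
To make the encoding step concrete, here is a minimal sketch (not the authors' code) of hard-assignment BoW encoding with NumPy; the codebook is assumed to come from k-means, and the toy data at the bottom is random.

```python
import numpy as np

def bow_encode(descriptors, codebook):
    """Quantize local descriptors to their nearest visual word and
    return an L2-normalized histogram (bag-of-words vector).

    descriptors: (n, d) local descriptors for one image
    codebook:    (k, d) visual-word centroids (e.g. from k-means)
    """
    # Squared Euclidean distance from every descriptor to every centroid
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)                      # hard assignment
    hist = np.bincount(words, minlength=len(codebook)).astype(np.float64)
    return hist / (np.linalg.norm(hist) + 1e-12)

# Toy usage with random data
rng = np.random.default_rng(0)
codebook = rng.normal(size=(200, 64))              # k = 200 visual words
descs = rng.normal(size=(500, 64))                 # 500 local descriptors
print(bow_encode(descs, codebook).shape)           # (200,)
```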

  6. Issues with BoW Matching
     • Weak matching schema
       – for a "small" visual dictionary: too many false matches
       – for a "large" visual dictionary: many true matches are missed
     • Hard to find a vocabulary-size trade-off
     • Large inverted-index size

  7. Recent Approaches for Very Large Scale Indexing
     Pipeline: Image Inventory → Keypoint Detection → Descriptor Computation → Codebook Construction → Vector Encoding → Vector Compression → Nearest Neighbor Search (size = 128)
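
A minimal sketch of the final search stage, assuming each image is represented by a single L2-normalized global vector; real large-scale systems would add compression (e.g. product quantization), which is omitted here.

```python
import numpy as np

def search(query_vec, database, top_k=10):
    """Exact nearest-neighbor search over L2-normalized global descriptors.

    database:  (N, D) matrix of image descriptors (rows L2-normalized)
    query_vec: (D,) L2-normalized query descriptor
    For normalized vectors, the dot product equals cosine similarity.
    """
    scores = database @ query_vec
    order = np.argsort(-scores)[:top_k]
    return order, scores[order]

# Toy usage: a perturbed copy of row 42 should retrieve index 42 first.
rng = np.random.default_rng(0)
db = rng.normal(size=(10_000, 128))
db /= np.linalg.norm(db, axis=1, keepdims=True)
q = db[42] + 0.01 * rng.normal(size=128)
q /= np.linalg.norm(q)
print(search(q, db, top_k=3)[0])
```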

  8. VLAD: Vector of Locally Aggregated Descriptors
     • For a given image, assign each descriptor x to its closest center c_i, then accumulate (sum) the residuals per cell: v_i := v_i + (x − c_i)
     • The residual (x − c_i) adds useful information
     • VLAD dimension D = k × d, with typical k = 64
     • A 128-dimensional VLAD outperforms a 65k-word BoW!
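
A minimal NumPy sketch of the aggregation rule above (v_i := v_i + (x − c_i), then flatten and L2-normalize); the k = 64, d = 128 toy shapes are illustrative only.

```python
import numpy as np

def vlad_encode(descriptors, codebook):
    """Vector of Locally Aggregated Descriptors.

    Assign each descriptor x to its closest centroid c_i, accumulate the
    residual (x - c_i) into cell i, then flatten to a k*d vector and
    L2-normalize.
    """
    k, d = codebook.shape
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    assign = d2.argmin(axis=1)
    v = np.zeros((k, d))
    for i in range(k):
        residuals = descriptors[assign == i] - codebook[i]
        if len(residuals):
            v[i] = residuals.sum(axis=0)           # v_i := v_i + (x - c_i)
    v = v.ravel()
    return v / (np.linalg.norm(v) + 1e-12)

rng = np.random.default_rng(0)
codebook = rng.normal(size=(64, 128))              # k = 64, d = 128
descs = rng.normal(size=(1000, 128))
print(vlad_encode(descs, codebook).shape)          # (8192,) = k * d
```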

  9. Issue with VLAD
     • VLAD: v_i := v_i + (x − c_i)
     • Two different keypoint configurations, each with residual magnitude r from center c_i, both aggregate to r + r + r + r = 4r
     • VLAD fails to capture geometry information

  10. gVLAD: Incorporating Geometry into VLAD
      • Take 2 angle bins: [−30°, 120°) and [120°, 330°)
      • Accumulate v_i := v_i + (x − c_i) separately per angle bin
      • Configuration 1: Bin 1 = r + r + r + r, Bin 2 = 0, so gVLAD = (4r, 0)
      • Configuration 2: Bin 1 = r + r, Bin 2 = r + r, so gVLAD = (2r, 2r)
      • Angle binning captures the different geometric configurations!
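
A hedged sketch of the angle-binned aggregation: the 2-bin split [−30°, 120°) / [120°, 330°) follows the slide's toy example, while the full method uses more bins and a learned bin placement.

```python
import numpy as np

def gvlad_encode(descriptors, angles, codebook, bin_edges=(-30.0, 120.0, 330.0)):
    """Geometric VLAD: a separate VLAD accumulator per keypoint-angle bin.

    descriptors: (n, d) local descriptors
    angles:      (n,) keypoint orientations in degrees
    bin_edges:   angle-bin boundaries (toy 2-bin split from the slide)
    """
    k, d = codebook.shape
    n_bins = len(bin_edges) - 1
    # Map angles into [bin_edges[0], bin_edges[0] + 360) before binning
    a = (np.asarray(angles) - bin_edges[0]) % 360.0 + bin_edges[0]
    bins = np.clip(np.digitize(a, bin_edges) - 1, 0, n_bins - 1)
    assign = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(2).argmin(1)

    v = np.zeros((n_bins, k, d))
    for b in range(n_bins):
        for i in range(k):
            sel = (bins == b) & (assign == i)
            if sel.any():
                v[b, i] = (descriptors[sel] - codebook[i]).sum(axis=0)
    v = v.ravel()                                  # dimension = n_bins * k * d
    return v / (np.linalg.norm(v) + 1e-12)

# Toy usage
rng = np.random.default_rng(0)
cb = rng.normal(size=(64, 64))
descs = rng.normal(size=(500, 64))
angs = rng.uniform(0, 360, size=500)
print(gvlad_encode(descs, angs, cb).shape)         # (8192,) = 2 bins * 64 * 64
```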

  11. Power of Keypoint Angle Features
      Retrieval performance (mAP) using only an angle histogram:
      • Angle Bin (8):   0.15
      • Angle Bin (18):  0.24
      • Angle Bin (36):  0.26
      • Angle Bin (72):  0.27
      • GIST (544):      0.35
      • BoW (20,000):    0.45
      A 72-D angle-bin histogram alone already performs remarkably well!
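
For reference, the angle-only signature in the table is just a normalized histogram of keypoint orientations; the 72-bin setting below mirrors the best angle-bin row above.

```python
import numpy as np

def angle_histogram(angles_deg, n_bins=72):
    """Global image signature built only from keypoint orientations.

    angles_deg: (n,) keypoint angles in degrees.
    n_bins = 72 gives a 72-D descriptor, as in the table's best angle-bin row.
    """
    hist, _ = np.histogram(np.asarray(angles_deg) % 360.0,
                           bins=n_bins, range=(0.0, 360.0))
    hist = hist.astype(np.float64)
    return hist / (np.linalg.norm(hist) + 1e-12)
```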

  12. Datasets & Vocabularies Paris 6K Holidays Oxford 5K 1491 / 500 queries 6412 / 60 queries 5062 / 55 queries q Large Scale Distractors Flickr 100K, Flickr 1M q Vocabulary k-means clustering on SURF descriptors Rotated Holidays with k = 256 on Paris dataset 12 ICML 2014 workshop on New Learning Frameworks and Models for Big Data

  13. Datasets: Holidays & Oxford
      [Example query images from Holidays and Oxford]

  14. Example Distractors – Flickr

  15. gVLAD: Keypoint Detection & Descriptor Extraction
      • Feature descriptor
      • Keypoint angle
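
One possible way to obtain the descriptors and keypoint angles with OpenCV (the image path and detector parameters are placeholders; SURF requires the non-free opencv-contrib build, so ORB is used as a fallback):

```python
import cv2
import numpy as np

# Load a query image (placeholder path) and detect keypoints.
img = cv2.imread("query.jpg", cv2.IMREAD_GRAYSCALE)
try:
    # SURF lives in the non-free opencv-contrib module and may be unavailable.
    detector = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
except AttributeError:
    detector = cv2.ORB_create(nfeatures=2000)       # fallback that also reports angles

keypoints, descriptors = detector.detectAndCompute(img, None)
angles = np.array([kp.angle for kp in keypoints])   # per-keypoint orientation (degrees)
print(descriptors.shape, angles.shape)
```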

  16. gVLAD: Learning Angle Membership
      [Plots: mAP vs. angle-bin offset with 4 bins, on Rotated Holidays (mAP ≈ 0.81–0.855) and Oxford (mAP ≈ 0.55–0.63)]
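
The offset sweep in these plots can be mimicked by a simple grid search; `encode_fn` and `evaluate_map` below are hypothetical hooks standing in for dataset-specific code, not functions from the paper.

```python
def best_angle_offset(encode_fn, evaluate_map, offsets=range(0, 100, 10)):
    """Grid-search sketch for the angle-bin offset: re-encode the dataset
    with each candidate offset and keep the one with the highest mAP.

    encode_fn(offset)  -> index built with gVLAD descriptors for that offset
    evaluate_map(index) -> mAP over the query set (both hypothetical hooks)
    """
    scores = {}
    for off in offsets:
        index = encode_fn(off)
        scores[off] = evaluate_map(index)
    best = max(scores, key=scores.get)
    return best, scores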

  17. gVLAD: Learning Angle Membership
      [Plot: von Mises fit of SURF keypoint angles on Holidays (8,233,763 keypoints)]
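
A minimal sketch of fitting a von Mises density to keypoint angles with SciPy, assuming a single component; the learned angle membership in the paper may be more elaborate (e.g. a mixture), and the angles below are synthetic stand-ins.

```python
import numpy as np
from scipy.stats import vonmises

# Synthetic stand-in for SURF keypoint angles (radians), drawn from a von Mises.
rng = np.random.default_rng(0)
angles_rad = rng.vonmises(mu=1.0, kappa=3.0, size=100_000)

# Fix scale = 1 so the fit estimates only the concentration kappa and the mean
# direction loc; real keypoint angles (degrees) would be converted to radians first.
kappa, loc, _ = vonmises.fit(angles_rad, fscale=1)
print(f"mean direction ~ {np.degrees(loc):.1f} deg, kappa ~ {kappa:.2f}")
```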

  18. gVLAD: Vocabulary Adaptation
      • Adapt existing codebooks with an incremental dataset
      • Alleviates the need for frequent large-scale codebook training
      [Diagram: initial dataset with its initial codebook vs. new dataset with the adapted codebook (clusters k1, k2, k3)]
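
One way to realize this idea is a warm-started k-means pass over the new data; the 50/50 blending rule below is an illustrative assumption, not the paper's exact update.

```python
import numpy as np

def adapt_codebook(old_codebook, new_descriptors, n_iters=5):
    """Warm-started k-means sketch of vocabulary adaptation: start from the
    existing centroids and refine them on descriptors from the new data,
    instead of retraining the codebook from scratch.
    """
    centroids = old_codebook.copy()
    for _ in range(n_iters):
        d2 = ((new_descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(2)
        assign = d2.argmin(axis=1)
        for i in range(len(centroids)):
            members = new_descriptors[assign == i]
            if len(members):
                # Blend the old centroid with the mean of newly assigned points
                centroids[i] = 0.5 * centroids[i] + 0.5 * members.mean(axis=0)
    return centroids
```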

  19. gVLAD: Compute Descriptors
      • Inter-norm: ~17.7% (Rotated Holidays dataset)
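
A normalization sketch following the common VLAD recipe (per-cell L2 followed by global L2); whether this matches the slide's "Inter-norm" step exactly is an assumption.

```python
import numpy as np

def normalize_vlad(v, n_cells, d):
    """L2-normalize each of the n_cells aggregation cells separately
    (intra-normalization), then L2-normalize the flattened vector.
    For gVLAD, n_cells = (number of angle bins) * k.
    """
    v = v.reshape(n_cells, d)
    v = v / (np.linalg.norm(v, axis=1, keepdims=True) + 1e-12)
    v = v.ravel()
    return v / (np.linalg.norm(v) + 1e-12)
```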

  20. gVLAD: PCA Whitening
      • Whitened gVLAD: ~16.6% at lower dimension (Rotated Holidays dataset)

  21. gVLAD: PCA Whitening
      • Dimension reduction of the original gVLAD using PCA
      • Going from 65,536 to 128 dimensions, mAP decreases by only about 1%
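
A NumPy sketch of PCA whitening and dimension reduction as described above; it assumes the number of training vectors exceeds the target dimension (128).

```python
import numpy as np

def fit_pca_whitening(X, out_dim=128, eps=1e-9):
    """Learn a PCA-whitening projection on a training matrix X (n x D).

    Assumes n > out_dim so that at least out_dim principal directions exist.
    Each retained component is divided by the square root of its variance.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)   # Vt rows = principal axes
    eigvals = (S ** 2) / (len(X) - 1)                    # per-component variance
    P = Vt[:out_dim] / np.sqrt(eigvals[:out_dim, None] + eps)
    return mean, P

def apply_pca_whitening(x, mean, P):
    """Project a full-size gVLAD vector down to out_dim and re-normalize."""
    y = P @ (x - mean)
    return y / (np.linalg.norm(y) + 1e-12)
```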

  22. Experiment: Full-Size gVLAD on Holidays & Oxford
      [Table: mAP vs. state-of-the-art results (2003, 2010, 2013); gains of ~16.6% and ~7.1%; best results in bold]
      • Full-size gVLAD descriptors
      • Compared with state-of-the-art results
      • SURF detector and SURF descriptor are used

  23. Experiment: Low-Dimensional gVLAD on Holidays & Oxford
      [Table: mAP vs. state-of-the-art results (2003, 2010, 2012, 2013); gains of ~15.4% and ~15.2%; best results in bold]
      • Low-dimensional descriptors, reduced to 128 dimensions (K = 128)

  24. Experiment: Large-Scale Datasets
      [Table: mAP with 100K/1M Flickr distractors vs. state-of-the-art results (2008, 2013); average gains of ~12.5% and ~16.3%; best results in bold]

  25. Take-Home Message
      [Chart: mAP of gVLAD (ours) vs. VLAD, VLAD+SSR, Improved Fisher, BoW, and MultiVoc+VLAD baselines from 2003–2013]

  26. Thank You

  27. BACKUP

  28. Speed and Memory
      • Speed: ~750 ms per query
      • Memory: 0.5 KB per image for 128-D features
      • 0.5 GB for 1M images
      • 500 GB for 1B images
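
A quick back-of-the-envelope check of these memory figures, assuming 128-D float32 descriptors (4 bytes per component):

```python
# 128-D float32 descriptor = 128 * 4 bytes per image.
dim, bytes_per_component = 128, 4
per_image = dim * bytes_per_component                       # 512 B, i.e. ~0.5 KB
print(per_image * 1_000_000 / 1e9, "GB for 1M images")      # ~0.5 GB
print(per_image * 1_000_000_000 / 1e9, "GB for 1B images")  # ~500 GB
```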

  29. Comparison with CNN-Based Approaches: Neural Codes, MOP-CNN, gVLAD
      • A. Babenko, A. Slesarev, A. Chigorin, and V. Lempitsky, "Neural Codes for Image Retrieval", arXiv, April 2014.
      • Y. Gong, L. Wang, R. Guo, and S. Lazebnik, "Multi-scale Orderless Pooling of Deep Convolutional Activation Features", arXiv, March 2014.
