Geometric VLAD for Large Scale Image Search
Zixuan Wang, Wei Di, Anurag Bhardwaj, Vignesh Jagadesh, Robinson Piramuthu
ICML 2014 Workshop on New Learning Frameworks and Models for Big Data
Our Goal
1) Robust to various imaging conditions
2) Small memory footprint
3) Speed (< 1 s per query)
Issues with Matching Images (1/2): Photometric Invariance
• Brightness
• Exposure
Issues with Matching Images (2/2): Geometric Invariance
• Rotation
• Translation
• Scale
State-of-the-art: Bag-of-Words (BoW)
[Pipeline diagram: keypoint detection → descriptor computation → codebook construction → BoW encoding → inverted indices (size = 200k) over the image inventory. Slide evolved from Fei-Fei Li.]
Issues with BoW Matching
• Weak matching schema:
  – for a "small" visual dictionary: too many false matches
  – for a "large" visual dictionary: many true matches are missed
• Hard to find vocabulary size trade-offs
• Large inverted index size
Recent Approaches for Very Large Scale Indexing
[Pipeline diagram: keypoint detection → descriptor computation → codebook construction → vector encoding → vector compression → nearest neighbor search on compact codes (size = 128) over the image inventory.]
VLAD: Vector of Locally Aggregated Descriptors
• For a given image, assign each descriptor x to its closest center c_i
• Accumulate (sum) the residuals per cell: v_i := v_i + (x - c_i)
• The residual (x - c_i) adds useful information
• VLAD dimension D = k × d, with typical k = 64
• A 128-dimensional VLAD has better performance than a 65k BoW!
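A minimal NumPy sketch of the VLAD aggregation described above; the inputs `descriptors` (n × d local descriptors, e.g. SURF) and `centroids` (k × d k-means codebook) are assumed, and this is an illustrative implementation rather than the authors' exact code.

```python
import numpy as np

def vlad(descriptors, centroids):
    """Aggregate local descriptors into a VLAD vector.

    descriptors: (n, d) array of local descriptors (e.g. SURF).
    centroids:   (k, d) array of k-means cluster centers.
    Returns a flattened (k * d,) VLAD descriptor.
    """
    k, d = centroids.shape
    # Assign each descriptor to its nearest centroid.
    dists = np.linalg.norm(descriptors[:, None, :] - centroids[None, :, :], axis=2)
    assignments = np.argmin(dists, axis=1)

    v = np.zeros((k, d))
    for i in range(k):
        members = descriptors[assignments == i]
        if len(members) > 0:
            # Accumulate residuals: v_i += (x - c_i) for every assigned x.
            v[i] = (members - centroids[i]).sum(axis=0)
    return v.reshape(-1)
```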
Issue with VLAD
[Diagram: two different keypoint configurations around the same center c_i, each with four descriptors at residual r.]
• VLAD: v_i := v_i + (x - c_i)
• Both configurations aggregate to r + r + r + r = 4r
• VLAD fails to capture geometry information
gVLAD: Incorporating Geometry in VLAD
[Diagram: the same two configurations, now split into angle bins Bin 1 and Bin 2.]
• Take 2 angle bins: [-30°, 120°) and [120°, 330°)
• Accumulate v_i := v_i + (x - c_i) separately per angle bin
• Configuration 1: Bin 1 = r + r + r + r, Bin 2 = 0, so gVLAD = (4r, 0)
• Configuration 2: Bin 1 = r + r, Bin 2 = r + r, so gVLAD = (2r, 2r)
• Angle binning captures the different geometric configurations! (see the sketch below)
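A sketch of the per-angle-bin aggregation, reusing the `vlad` helper from the previous sketch; the inputs `angles` (keypoint orientations in degrees) and the default `bin_edges` are illustrative assumptions.

```python
import numpy as np

def gvlad(descriptors, angles, centroids, bin_edges=(-30.0, 120.0, 330.0)):
    """Geometric VLAD: one VLAD block per keypoint-angle bin, concatenated.

    descriptors: (n, d) local descriptors.
    angles:      (n,) keypoint orientations in degrees.
    centroids:   (k, d) codebook.
    bin_edges:   angle bin boundaries, e.g. [-30, 120) and [120, 330).
    """
    k, d = centroids.shape
    # Map angles into [bin_edges[0], bin_edges[0] + 360) so every bin is covered.
    a = (np.asarray(angles) - bin_edges[0]) % 360.0 + bin_edges[0]
    blocks = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        mask = (a >= lo) & (a < hi)
        if mask.any():
            blocks.append(vlad(descriptors[mask], centroids))
        else:
            blocks.append(np.zeros(k * d))
    return np.concatenate(blocks)  # dimension = (#bins) * k * d
```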
Power of Keypoint Angle Features
Retrieval performance using only the angle histogram:

Feature (dim)      mAP
Angle Bin (8)      0.15
Angle Bin (18)     0.24
Angle Bin (36)     0.26
Angle Bin (72)     0.27
GIST (544)         0.35
BoW (20,000)       0.45

Using only a 72-D angle-bin histogram already performs well!
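A sketch of the angle-histogram baseline behind this table: each image is represented by a normalized histogram of its keypoint orientations and images are ranked by histogram distance. The bin count and the L2 distance here are illustrative assumptions, not necessarily the paper's exact setup.

```python
import numpy as np

def angle_histogram(angles_deg, n_bins=72):
    """Represent an image by the normalized histogram of its keypoint orientations."""
    hist, _ = np.histogram(np.asarray(angles_deg) % 360.0,
                           bins=n_bins, range=(0.0, 360.0))
    hist = hist.astype(np.float64)
    return hist / max(hist.sum(), 1e-12)

def rank_by_angle_hist(query_angles, db_angle_lists, n_bins=72):
    """Rank database images by L2 distance between angle histograms (smallest first)."""
    q = angle_histogram(query_angles, n_bins)
    db = np.stack([angle_histogram(a, n_bins) for a in db_angle_lists])
    return np.argsort(np.linalg.norm(db - q, axis=1))
```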
Datasets & Vocabularies
• Holidays: 1491 images / 500 queries
• Paris 6K: 6412 images / 60 queries
• Oxford 5K: 5062 images / 55 queries
• Rotated Holidays
• Large-scale distractors: Flickr 100K, Flickr 1M
• Vocabulary: k-means clustering on SURF descriptors with k = 256, learned on the Paris dataset
Datasets: Holidays & Oxford
[Example query images from the Holidays and Oxford datasets.]
Example Distractors – Flickr
gVLAD: Keypoint Detection & Descriptor Extraction
[Diagram: each detected keypoint yields a feature descriptor and an orientation angle.]
gVLAD: Learning Angle Membership
[Plots: mAP vs. angle-bin offset (0–100) with 4 bins, on Rotated Holidays (mAP ≈ 0.815–0.855) and Oxford (mAP ≈ 0.55–0.63).]
gVLAD: Learning Angle Membership
• von Mises distribution fit to keypoint angles
[Plot: histogram of Holidays SURF keypoint angles (8,233,763 keypoints).]
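A hedged sketch of how angle membership could be scored with von Mises components; the bin centers `mus_deg` and concentration `kappa` below are placeholder values for illustration, not the parameters learned in the paper.

```python
import numpy as np
from scipy.stats import vonmises

def angle_memberships(angles_deg, mus_deg=(0.0, 90.0, 180.0, 270.0), kappa=4.0):
    """Soft membership of each keypoint angle to each angular bin.

    angles_deg: (n,) keypoint orientations in degrees.
    mus_deg:    bin centers in degrees (placeholder values).
    kappa:      von Mises concentration (placeholder value).
    Returns an (n, #bins) matrix of normalized membership weights.
    """
    theta = np.deg2rad(np.asarray(angles_deg))
    mus = np.deg2rad(np.asarray(mus_deg))
    # Evaluate one von Mises density per bin center, then normalize per keypoint.
    dens = np.stack([vonmises.pdf(theta, kappa, loc=mu) for mu in mus], axis=1)
    return dens / dens.sum(axis=1, keepdims=True)
```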
gVLAD: Vocabulary Adaptation
• Adapt existing codebooks with an incremental dataset
• Alleviates the need for frequent large-scale codebook training
[Diagram: an initial codebook learned on the initial dataset is adapted into a new codebook (centers k1, k2, k3) when a new dataset arrives.]
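One plausible way to realize this adaptation, sketched below, is to warm-start k-means on the new descriptors from the existing centroids so the centers only shift slightly; this is an illustrative assumption, not necessarily the exact update rule used in the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

def adapt_codebook(old_centroids, new_descriptors, n_iter=10):
    """Adapt an existing codebook to new data without full retraining.

    old_centroids:   (k, d) previously learned cluster centers.
    new_descriptors: (n, d) descriptors from the incremental dataset.
    """
    k = old_centroids.shape[0]
    # Warm-start k-means at the old centers and run only a few iterations.
    km = KMeans(n_clusters=k, init=old_centroids, n_init=1, max_iter=n_iter)
    km.fit(new_descriptors)
    return km.cluster_centers_
```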
gVLAD: Compute Descriptors
[Chart: effect of descriptor normalization; inter-norm ≈ 17.7% on the Rotated Holidays dataset.]
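Below is a minimal sketch of two standard VLAD normalization schemes, per-block (intra) normalization followed by a global L2 step; exactly which variant the slide's "inter-norm" number refers to is my assumption, so treat this as illustrative.

```python
import numpy as np

def normalize_gvlad(v, d, per_block=True):
    """Normalize a flattened (g)VLAD vector.

    v: flattened descriptor of length (#bins * k * d) or (k * d).
    d: local descriptor dimension (block size per visual word).
    per_block: if True, L2-normalize each length-d block first, then apply global L2.
    """
    v = np.asarray(v, dtype=np.float64).copy()
    if per_block:
        blocks = v.reshape(-1, d)                      # one row per visual word
        norms = np.linalg.norm(blocks, axis=1, keepdims=True)
        blocks /= np.maximum(norms, 1e-12)             # avoid division by zero
        v = blocks.reshape(-1)
    return v / max(np.linalg.norm(v), 1e-12)           # global L2 normalization
```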
gVLAD: PCA Whitening
[Chart: whitened gVLAD at lower dimensionality, ≈ 16.6% on the Rotated Holidays dataset.]
gVLAD: PCA Whitening
• Dimension reduction on the original gVLAD using PCA
• Going from 65,536 → 128 dimensions, the mAP decreases by only about 1%.
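A sketch of that reduction using scikit-learn's whitened PCA; the training matrix `gvlad_train` (one full-size gVLAD descriptor per row) is an assumed input.

```python
import numpy as np
from sklearn.decomposition import PCA

def learn_projection(gvlad_train, out_dim=128):
    """Learn a whitening PCA projection from the full gVLAD dimension to out_dim.

    gvlad_train: (n_images, 65536) matrix of full-size gVLAD descriptors
                 (needs at least out_dim training images).
    """
    pca = PCA(n_components=out_dim, whiten=True)
    pca.fit(gvlad_train)
    return pca

def project(pca, gvlad_vec):
    """Project a single gVLAD descriptor and re-apply L2 normalization."""
    low = pca.transform(gvlad_vec.reshape(1, -1))[0]
    return low / max(np.linalg.norm(low), 1e-12)
```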
Experiment: Full-Size gVLAD on Holidays & Oxford
[Table: mAP compared with state-of-the-art methods (2003, 2010, 2013); improvements of ~16.6% and ~7.1%; best results in bold.]
• Full-size gVLAD descriptors
• Compared with state-of-the-art results
• SURF detector & SURF descriptor are used
Experiment: Low-Dimensional gVLAD on Holidays & Oxford
[Table: mAP compared with state-of-the-art methods (2003, 2010, 2012, 2013); improvements of ~15.4% and ~15.2%; best results in bold.]
• Low-dimensional descriptors, reduced to 128 dimensions
• Comparison with state-of-the-art
Experiment: Large-Scale Datasets
[Table: results with 100K / 1M Flickr distractors vs. state-of-the-art (2008, 2013); average improvements of ~12.5% and ~16.3%; best results in bold.]
• Large-scale data with 100K / 1M distractors
• Comparison with state-of-the-art
Take Home Message
[Chart: mAP (≈ 0.4–0.8) across 11 settings comparing our gVLAD with VLAD, VLAD+SSR, Improved Fisher, BoW, and MultiVoc+VLAD baselines (2003–2013).]
Thank You
BACKUP
Speed and Memory
Speed
• ~750 ms per query
Memory
• 0.5 KB per image for 128-D features
• 0.5 GB for 1M images
• 500 GB for 1B images
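The memory figures follow directly from storing 128 single-precision floats per image; a quick arithmetic check:

```python
# 128-D float32 descriptor per image.
bytes_per_image = 128 * 4
print(bytes_per_image)              # 512 B  ≈ 0.5 KB per image
print(bytes_per_image * 1e6 / 1e9)  # 0.512 GB ≈ 0.5 GB for 1M images
print(bytes_per_image * 1e9 / 1e9)  # 512 GB ≈ 500 GB for 1B images
```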
Comparison with CNN-Based Approaches
• Neural Codes: "Neural Codes for Image Retrieval", A. Babenko, A. Slesarev, A. Chigorin, and V. Lempitsky, arXiv, April 2014.
• MOP-CNN: "Multi-scale Orderless Pooling of Deep Convolutional Activation Features", Yunchao Gong, Liwei Wang, Ruiqi Guo, and Svetlana Lazebnik, arXiv, March 2014.
[Chart: retrieval performance of gVLAD vs. Neural Codes and MOP-CNN.]