Particular Object Retrieval with Integral Max-pooling of CNN Activations
Particular Object Retrieval with Integral Max-pooling of CNN Activations Tolias et al. ICLR 2016 Presented by Jaehyeong Cho
Contents • Introduction • Related works • Main approaches • Results • Conclusion
Introduction • How to find similar images? • Convert each image into a single feature vector (e.g. BoW, VLAD, CNN) • Measure the similarity between feature vectors => The quality of the features strongly affects the retrieval results • Are all parts of an image equally representative? • No, it is better to focus on the important regions only • Main contributions • Encodes several image regions into a single compact feature • Localizes matching objects
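The two retrieval steps above (one feature per image, then a similarity measure) can be sketched as follows. This is a minimal illustration that assumes cosine similarity as the measure and uses a hypothetical helper name, `rank_by_cosine`; the slides do not fix either choice at this point.

```python
import numpy as np

def rank_by_cosine(query, database):
    """Rank database features by cosine similarity to the query feature.

    query:    array of shape (D,)
    database: array of shape (N, D), one feature vector per image
    Returns (indices sorted from most to least similar, their scores).
    """
    q = query / np.linalg.norm(query)
    db = database / np.linalg.norm(database, axis=1, keepdims=True)
    scores = db @ q                 # cosine similarity per database image
    order = np.argsort(-scores)     # descending order of similarity
    return order, scores[order]
```

Because the ranking depends only on these vectors, any weakness in the feature shows up directly in the retrieved list.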
Related works • Retrieval methods considering spatial information • Babenko and Lempitsky, Aggregating Local Deep Features for Image Retrieval, ICCV 2015 • Aggregates convolutional features from various positions in an image • Gives higher weights to features near the center
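The center weighting described on this slide can be illustrated with a weighted sum over spatial positions. The sketch below assumes a Gaussian prior whose width `sigma` is tied to the map size; the function names and the exact `sigma` are illustrative assumptions, not details taken from the slides.

```python
import numpy as np

def center_prior(W, H):
    """Gaussian weight per spatial position, largest at the map center."""
    xs = np.arange(W) - (W - 1) / 2.0
    ys = np.arange(H) - (H - 1) / 2.0
    sigma = min(W, H) / 6.0  # assumed width: a third of the center-to-border distance
    return np.exp(-(xs[:, None] ** 2 + ys[None, :] ** 2) / (2.0 * sigma ** 2))

def weighted_sum_pool(activations):
    """Sum-pool a (W, H, K) activation tensor with center-weighted positions."""
    W, H, K = activations.shape
    weights = center_prior(W, H)
    feature = (activations * weights[:, :, None]).sum(axis=(0, 1))
    norm = np.linalg.norm(feature)
    return feature / norm if norm > 0 else feature
```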
Related works • Retrieval methods considering spatial information • Kalantidis et al., Cross-dimensional Weighting for Aggregated Deep Convolutional Features, ECCV workshop 2016 • Gives different weights according to the channel and location
Related works • Retrieval methods considering spatial information • Xie et al., Image Classification and Retrieval are ONE, ICMR 2015 • Extracts CNN features from object regions • Represents an image with multiple features
Main approaches • Maximum activations of convolutions (MAC) • Proposed by Azizpour et al., 2014 • CNN activations for an image I form a W × H × K tensor • Uses only the maximum activation of each channel, giving a K-dimensional feature • Captures the most representative regions • But lacks location information
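As a minimal sketch of MAC, assuming the activations are stored as a NumPy array of shape (W, H, K) and that the result is l2-normalized (the normalization is an assumption added here for use with cosine similarity):

```python
import numpy as np

def mac(activations):
    """Maximum activations of convolutions for a (W, H, K) tensor.

    Takes the spatial maximum of each of the K channels and
    l2-normalizes the resulting K-dimensional vector.
    """
    feature = activations.max(axis=(0, 1))
    norm = np.linalg.norm(feature)
    return feature / norm if norm > 0 else feature
```

Shifting the activations spatially leaves the output unchanged, which is the loss of location information noted in the last bullet.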
Main approaches • Regional maximum activations of convolutions (R-MAC) • Extracts MAC features from multiple regions => Encodes the location information • Combines them into a single feature by summation
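A simplified sketch of R-MAC under stated assumptions: square regions at a few scales laid out on a regular grid with roughly 50% overlap, each regional MAC l2-normalized before summation. The region layout, the helper names, and the omission of any further post-processing of the regional vectors are simplifications made for this example.

```python
import numpy as np

def l2_normalize(v):
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def grid_starts(dim, side, stride):
    """Start offsets covering [0, dim) with windows of length `side`."""
    starts = list(range(0, dim - side + 1, stride))
    if starts[-1] != dim - side:
        starts.append(dim - side)   # make sure the far border is covered
    return starts

def square_regions(W, H, levels=3):
    """Hypothetical region sampler: smaller squares at each successive level."""
    regions = []
    for level in range(1, levels + 1):
        side = max(1, int(2 * min(W, H) / (level + 1)))
        stride = max(1, side // 2)  # assumed overlap of about 50%
        for x in grid_starts(W, side, stride):
            for y in grid_starts(H, side, stride):
                regions.append((x, y, side))
    return regions

def rmac(activations, levels=3):
    """Sum of l2-normalized regional MAC vectors, l2-normalized again."""
    W, H, K = activations.shape
    total = np.zeros(K)
    for x, y, side in square_regions(W, H, levels):
        region_mac = activations[x:x + side, y:y + side, :].max(axis=(0, 1))
        total += l2_normalize(region_mac)
    return l2_normalize(total)
```

The output keeps the dimensionality K of a single MAC vector, so the multi-region representation costs no extra storage per image.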
Main approaches • Object localization • q : MAC feature from the query object (blue) • Find the region R whose MAC feature $f_R$ maximizes the cosine similarity to q: $\hat{R} = \arg\max_{R} \frac{f_R^\top q}{\|f_R\| \, \|q\|}$ • Fast computation of $f_R$ is required
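To make the search concrete, here is a brute-force sketch that scores every axis-aligned rectangle of a (W, H, K) activation map against the query feature. It only illustrates the objective and shows why a fast way to obtain the regional feature matters; it is not the search strategy used in the paper, and the name `localize` is hypothetical.

```python
import numpy as np

def localize(activations, q):
    """Return ((x0, x1, y0, y1), score) of the best-matching rectangle.

    The rectangle covers activations[x0:x1, y0:y1]; the score is the
    cosine similarity between its MAC feature and the query feature q.
    """
    W, H, K = activations.shape
    q = q / np.linalg.norm(q)
    best_region, best_score = None, -np.inf
    for x0 in range(W):
        for x1 in range(x0 + 1, W + 1):
            for y0 in range(H):
                for y1 in range(y0 + 1, H + 1):
                    f = activations[x0:x1, y0:y1, :].max(axis=(0, 1))
                    norm = np.linalg.norm(f)
                    if norm == 0:
                        continue    # empty response, similarity undefined
                    score = float(f @ q) / norm
                    if score > best_score:
                        best_region, best_score = (x0, x1, y0, y1), score
    return best_region, best_score
```

The number of rectangles grows as O(W²H²), and each one needs a fresh max over its area, so this is practical only for very small maps.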
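Main approaches • Object localization • Approximation of $f_R$ by a generalized mean that can be evaluated with integral images: $\tilde{f}_{R,i} = \left( \sum_{p \in R} \mathcal{X}_i(p)^{\alpha} \right)^{1/\alpha} \approx \max_{p \in R} \mathcal{X}_i(p)$, where $\mathcal{X}_i(p)$ is the activation of channel i at position p • Localization result helps re-ranking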
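The integral max-pooling idea can be sketched as follows: raise the activations to a power `alpha`, build one integral image per channel, and read any rectangle's sum with four lookups. The sketch assumes non-negative activations (for example after a ReLU) and uses `alpha = 10` as an illustrative default; the function names are hypothetical.

```python
import numpy as np

def integral_of_powers(activations, alpha=10.0):
    """Zero-padded integral image of activations**alpha, shape (W+1, H+1, K)."""
    W, H, K = activations.shape
    powered = np.power(activations, alpha)
    integral = np.zeros((W + 1, H + 1, K))
    integral[1:, 1:, :] = powered.cumsum(axis=0).cumsum(axis=1)
    return integral

def approx_region_max(integral, x0, x1, y0, y1, alpha=10.0):
    """Approximate per-channel max over activations[x0:x1, y0:y1].

    Four lookups give the sum of powered activations in the rectangle;
    its alpha-th root approaches the true maximum as alpha grows.
    """
    region_sum = (integral[x1, y1] - integral[x0, y1]
                  - integral[x1, y0] + integral[x0, y0])
    return np.power(np.maximum(region_sum, 0.0), 1.0 / alpha)
```

The approximation never falls below the true maximum and exceeds it by at most a factor of n^(1/alpha) for a region of n positions, so larger `alpha` tightens it at the cost of a wider numeric range.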
Results • Comparison of retrieval accuracy • without post-processing
Results • Comparison of retrieval accuracy • with post-processing
Results • Re-ranking with object localization
Conclusion • Generated an improved feature vector by encoding location information into it • Approximated the max-pooling process for fast computation • Localized the target object and used the localization effectively for re-ranking
References • Babenko, Artem, and Victor Lempitsky. "Aggregating local deep features for image retrieval." Proceedings of the IEEE International Conference on Computer Vision. 2015. • Kalantidis, Yannis, Clayton Mellina, and Simon Osindero. "Cross-dimensional weighting for aggregated deep convolutional features." arXiv preprint arXiv:1512.04065 (2015). • Xie, Lingxi, et al. "Image classification and retrieval are one." Proceedings of the 5th ACM on International Conference on Multimedia Retrieval. ACM, 2015. • Azizpour, Hossein, et al. "From generic to specific deep representations for visual recognition." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. 2015.