Learning Kernel-matrix-based Representation for Fine-grained Image Recognition
Lei Wang, VILA Group
School of Computing and Information Technology, University of Wollongong, Australia
11 December 2019
Introduction
• Fine-grained image recognition
Image courtesy of Wei et al., "Deep learning for fine-grained image analysis: A survey"
Introduction
• Feature: how to represent an image?
– Scale, rotation, illumination, occlusion, deformation, …
– Differences with respect to other classes
– Ultimate goal: "invariant and discriminative"
[Figure: example images of a cat under varying conditions]
1. Before year 2000
• Hand-crafted, global features
– Color, texture, shape, structure, etc.
– An image becomes a feature vector
• Shallow classifiers
– K-nearest neighbor, SVMs, Boosting, …
2. Days of the Bag-of-Features model (2003-2012)
• Local invariant features: invariant to view angle, rotation, scale, illumination, clutter, …
• Interest point detection or dense sampling
• An image becomes "a set of feature vectors"
3. Era of Deep Learning (since 2012)
• Deep local descriptors: a convolutional feature map of size height × width × depth, fed to a classifier (e.g., outputting "Cat")
• Again, an image becomes "a set of feature vectors"
Image(s): a set of points/vectors
• Examples: object recognition, image set classification, action recognition, neuroimaging analysis
• How to pool a set of points/vectors to obtain a global visual representation?
Pooling operation
How to pool a set of local descriptors x_1, x_2, …, x_n?
• Max pooling, average (sum) pooling, etc.
• Covariance pooling (second-order pooling), as sketched below
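A minimal NumPy sketch of covariance (second-order) pooling, assuming the n local descriptors are stacked as rows of an n × d matrix (the shapes below are illustrative):

```python
import numpy as np

def covariance_pooling(X):
    """Pool a set of local descriptors (rows of X) into one d x d matrix."""
    Xc = X - X.mean(axis=0)              # center each feature dimension
    return Xc.T @ Xc / (X.shape[0] - 1)  # d x d second-order representation

# e.g. 196 descriptors of dimension 512, as from a 14 x 14 conv feature map
X = np.random.randn(196, 512)
C = covariance_pooling(X)                # C.shape == (512, 512)
```

Unlike max or average pooling, the pooled representation has size d × d rather than d, which is why the following slides care about what kind of matrix this is and how to compare two of them.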
Outline
• Introduction on Covariance representation
• Our research work
– Moving to Kernel-matrix-based Representation (KSPD)
– End-to-end learning of KSPD in deep neural networks
• Conclusion
Introduction on Covariance representation
• Use a covariance matrix as a feature representation
[Figure: two multivariate distributions compared via their covariance matrices; image from http://www.statsref.com/HTML/index.html?multivariate_distributions.html]
Introduction on Covariance representation
• A covariance matrix belongs to the set of Symmetric Positive Definite (SPD) matrices
• SPD matrices reside on a (Riemannian) manifold instead of the whole Euclidean space
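For reference, a minimal statement of the SPD property (M denotes the matrix, x any nonzero vector):

```latex
M = M^{\top}, \qquad x^{\top} M x > 0 \quad \text{for all } x \neq 0 .
```

A sample covariance $C$ always satisfies $x^{\top} C x = \frac{1}{n-1}\lVert X_c\, x \rVert^2 \ge 0$, so it is at least positive semi-definite; the strict inequality is exactly what can fail in the small-sample setting discussed shortly.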
Introduction on Covariance representation
• How to measure the similarity of two SPD matrices?
Introduction on SPD matrix
Similarity measures for SPD matrices:
• Geodesic distance
• Euclidean mapping
• Kernel method
A sketch of the first two follows below.
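A short Python sketch of the first two options, assuming SPD inputs (the function names are mine, not from a specific library):

```python
import numpy as np
from scipy.linalg import logm

def log_euclidean_distance(A, B):
    """Euclidean-mapping route: compare SPD matrices after the matrix-log
    map, which flattens the SPD manifold into a Euclidean space."""
    return np.linalg.norm(logm(A) - logm(B), ord='fro')

def affine_invariant_distance(A, B):
    """Geodesic route: ||log(A^{-1/2} B A^{-1/2})||_F, computed here via the
    eigenvalues of A^{-1} B, which give the same value without matrix roots."""
    eigvals = np.linalg.eigvals(np.linalg.solve(A, B)).real  # positive for SPD A, B
    return np.sqrt(np.sum(np.log(eigvals) ** 2))
```

The kernel-method route instead defines a positive definite kernel directly on SPD matrices, e.g., a Gaussian kernel built on top of the log-Euclidean distance.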
Introduction on SPD matrix: geodesic distance (2003-2006)
• Förstner W, Moonen B. A metric for covariance matrices. Geodesy: The Challenge of the 3rd Millennium, 2003.
• Fletcher PT. Principal geodesic analysis on symmetric spaces: statistics of diffusion tensors. Computer Vision and Mathematical Methods in Medical and Biomedical Image Analysis, 2004.
• Pennec X, Fillard P, Ayache N. A Riemannian framework for tensor computing. IJCV, 2006.
• Lenglet C. Statistics on the manifold of multivariate normal distributions: theory and application to diffusion tensor MRI processing. Journal of Mathematical Imaging and Vision, 2006.
Introduction on SPD matrix: Euclidean mapping (2005-2008)
• Veeraraghavan A. Matching shape sequences in video with applications in human movement analysis. IEEE TPAMI, 2005.
• Arsigny V. Log-Euclidean metrics for fast and simple calculus on diffusion tensors. Magnetic Resonance in Medicine, 2006.
• Tuzel O. Pedestrian detection via classification on Riemannian manifolds. IEEE TPAMI, 2008.
Introduction on SPD matrix: kernel methods (2011-2014)
• Sra S. Positive definite matrices and the S-divergence. arXiv:1110.1773, 2011.
• Wang R, et al. Covariance discriminative learning: a natural and efficient approach to image set classification. CVPR, 2012.
• Harandi M, et al. Sparse coding and dictionary learning for SPD matrices: a kernel approach. ECCV, 2012.
• Vemulapalli R, Pillai JK, Chellappa R. Kernel learning for extrinsic classification of manifold features. CVPR, 2013.
• Jayasumana S, et al. Kernel methods on the Riemannian manifold of symmetric positive definite matrices. CVPR, 2013.
• Quang MH, et al. Log-Hilbert-Schmidt metric between positive definite operators on Hilbert spaces. NIPS, 2014.
Introduction on SPD matrix: integration with deep learning (2015-2018)
• Lin et al. Bilinear CNN Models for Fine-grained Visual Recognition. ICCV, 2015.
• Ionescu et al. Matrix Backpropagation for Deep Networks with Structured Layers. ICCV, 2015.
• Gao et al. Compact Bilinear Pooling. CVPR, 2016.
• Wang et al. G2DeNet: Global Gaussian Distribution Embedding Network and Its Application to Visual Recognition. CVPR, 2017.
• Li et al. Is Second-order Information Helpful for Large-scale Visual Recognition? ICCV, 2017.
• Lin and Maji. Improved Bilinear Pooling with CNN. BMVC, 2017.
• Huang et al. A Riemannian Network for SPD Matrix Learning. AAAI, 2017.
• Cui et al. Kernel Pooling for Convolutional Neural Networks. CVPR, 2017.
• Koniusz et al. A Deeper Look at Power Normalizations. CVPR, 2018.
Introduction on SPD matrix: integration with deep learning (2018-2019)
• Yu and Salzmann. Statistically-motivated Second-order Pooling. ECCV, 2018.
• Li et al. Towards Faster Training of Global Covariance Pooling Networks by Iterative Matrix Square Root Normalization. CVPR, 2018.
• Wei et al. Kernelized Subspace Pooling for Deep Local Descriptors. CVPR, 2018.
• Lin et al. Second-order Democratic Aggregation. ECCV, 2018.
• Yu et al. Hierarchical Bilinear Pooling for Fine-Grained Visual Recognition. ECCV, 2018.
• Fang et al. Bilinear Attention Networks for Person Retrieval. ICCV, 2019.
• Gao et al. Global Second-order Pooling Convolutional Networks. CVPR, 2019.
• Wang et al. Deep Global Generalized Gaussian Networks. CVPR, 2019.
• Zheng et al. Learning Deep Bilinear Transformation for Fine-grained Image Representation. NeurIPS, 2019.
• Brooks et al. Riemannian Batch Normalization for SPD Neural Networks. NeurIPS, 2019.
Outline
• Introduction on Covariance representation
• Our research work
– Moving to Kernel-matrix-based Representation (KSPD)
– End-to-end learning of KSPD in deep neural networks
• Conclusion
Motivation
• A covariance matrix needs to be estimated from data
Motivation
• The covariance estimate becomes singular with high-dimensional (d) features and a small sample (n), since its rank cannot exceed n − 1 (see the illustration below)
• A covariance matrix only characterises the linear correlation between feature components
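A quick numerical illustration of the rank deficiency (the values of n and d are illustrative):

```python
import numpy as np

n, d = 50, 400                           # fewer samples than dimensions
X = np.random.randn(n, d)
C = np.cov(X, rowvar=False)              # d x d estimate from n samples

print(np.linalg.matrix_rank(C))          # at most n - 1 = 49, far below d = 400
print(np.linalg.eigvalsh(C).min())       # ~ 0: C is singular, so log- and
                                         # inverse-based SPD measures break down
```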
Introduction
Applications with the high-dimension, small-sample issue:
• Small sample: n ≈ 10-300
• High dimensions: d ≈ 50-400
Introduction
Look into the covariance representation. Its (i, j)-th entry couples the i-th and the j-th feature dimensions:
$$C_{ij} = \frac{1}{n-1}\sum_{k=1}^{n}\left(x_{ik}-\mu_i\right)\left(x_{jk}-\mu_j\right) = \frac{1}{n-1}\,\langle \tilde{\mathbf{f}}_i, \tilde{\mathbf{f}}_j \rangle,$$
where $\tilde{\mathbf{f}}_i = (x_{i1}-\mu_i, \ldots, x_{in}-\mu_i)$ stacks the centered values of the i-th feature over the n samples. Up to a constant, each entry is just a linear kernel function evaluated between two feature dimensions!
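This equivalence is easy to verify numerically; the sketch below checks it against NumPy's covariance routine (which uses the same unbiased 1/(n − 1) normalization):

```python
import numpy as np

X = np.random.randn(50, 8)                  # n = 50 samples, d = 8 features
F = (X - X.mean(axis=0)).T                  # one centered row per feature dimension
C_from_kernel = F @ F.T / (X.shape[0] - 1)  # linear kernel between feature rows
assert np.allclose(C_from_kernel, np.cov(X, rowvar=False))
```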
Proposed kernel-matrix representation (ICCV15)
Let's use a kernel function instead: Covariance → Kernel matrix! Replacing the linear kernel with a general kernel $\kappa$ gives $M_{ij} = \kappa(\tilde{\mathbf{f}}_i, \tilde{\mathbf{f}}_j)$, as sketched below.
Advantages:
• Models nonlinear relationships between features;
• For many kernels, M is guaranteed to be nonsingular, no matter what the feature dimension (d) and sample size (n) are;
• Maintains the d × d size of the covariance representation, and therefore the computational load.
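A minimal sketch of the idea with an RBF kernel evaluated between pairs of feature dimensions rather than between samples; the function name, centering step, and bandwidth heuristic here are illustrative rather than the paper's exact recipe:

```python
import numpy as np

def ker_rp_rbf(X, gamma=None):
    """Kernel-matrix representation: an RBF kernel between feature dimensions."""
    n, d = X.shape
    F = (X - X.mean(axis=0)).T                # d rows, one per feature dimension
    sq = ((F[:, None, :] - F[None, :, :]) ** 2).sum(-1)  # d x d squared distances
    gamma = 1.0 / n if gamma is None else gamma           # illustrative bandwidth
    return np.exp(-gamma * sq)                # d x d; SPD, and nonsingular as long
                                              # as no two feature rows coincide

X = np.random.randn(30, 100)                  # n = 30 samples, d = 100 dimensions
M = ker_rp_rbf(X)
print(np.linalg.matrix_rank(M))               # 100: full rank even though n < d
```

Because M keeps the same d × d shape as the covariance matrix, all the downstream SPD machinery (geodesic distances, log-Euclidean mapping, kernels) applies unchanged.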
Application to skeletal action recognition
* Cov_JH_SVM uses a kernel function to map each of the n samples into an infinite-dimensional space and implicitly computes a covariance matrix there.
Application to object recognition (handcrafted features)
Application to scene recognition (extracted deep features)
[Figure: bar chart of classification accuracy (%) on the MIT Indoor Scenes data set, comparing Fisher Vector (CVPR15)*, Cov-RP, and Ker-RP (RBF), each computed on AlexNet (F7) and VGG-19 (Conv5) features; the accuracy axis spans 58-80%.]
* Cimpoi et al., Deep filter banks for texture recognition and segmentation, CVPR2015
Outline
• Introduction on Covariance representation
• Our research work
– Moving to Kernel-matrix-based Representation (KSPD)
– End-to-end learning of KSPD in deep neural networks
• Conclusion
Covariance representation: integration with deep learning
• Bilinear CNN Models for Fine-grained Visual Recognition, Lin et al., ICCV2015 (a sketch of the pooling step follows below)
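A common sketch of the bilinear (second-order) pooling step in PyTorch, with the usual signed square-root and L2 normalization; details such as the averaging over locations and the epsilon are illustrative, not the paper's exact settings:

```python
import torch
import torch.nn.functional as F

def bilinear_pool(feat):
    """feat: (batch, channels, height, width) conv feature map."""
    b, c, h, w = feat.shape
    X = feat.reshape(b, c, h * w)                    # local descriptors as columns
    B = torch.bmm(X, X.transpose(1, 2)) / (h * w)    # (b, c, c) outer-product pool
    B = B.reshape(b, -1)                             # flatten to a long vector
    B = torch.sign(B) * torch.sqrt(torch.abs(B) + 1e-12)  # signed square root
    return F.normalize(B, dim=1)                     # L2 normalization

feats = torch.randn(2, 512, 14, 14)                  # e.g. a VGG conv5 output
z = bilinear_pool(feats)                             # (2, 512 * 512) representation
```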
Covariance representation: integration with deep learning
• Matrix Backpropagation for Deep Networks with Structured Layers, Ionescu et al., ICCV2015
Covariance representation: integration with deep learning
• G2DeNet: Global Gaussian Distribution Embedding Network and Its Application to Visual Recognition, Wang et al., CVPR2017
Covariance representation: integration with deep learning
• Improved Bilinear Pooling with CNN, Lin and Maji, BMVC2017 (sketch below)
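The key step of the improved variant is matrix square-root normalization of the pooled SPD matrix. A hedged PyTorch sketch via eigendecomposition (the paper also considers faster iterative solvers):

```python
import torch

def matrix_sqrt(B, eps=1e-10):
    """Matrix square root of a batch of SPD matrices, shape (b, c, c)."""
    # torch.linalg.eigh is differentiable, so gradients flow through this layer
    vals, vecs = torch.linalg.eigh(B)
    vals = torch.clamp(vals, min=eps).sqrt()
    return vecs @ torch.diag_embed(vals) @ vecs.transpose(-1, -2)
```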
Covariance representation: integration with deep learning
• Is Second-order Information Helpful for Large-scale Visual Recognition?, Li et al., ICCV2017