G²DeNet: Global Gaussian Distribution Embedding Network and Its Application to Visual Recognition
Qilong Wang¹, Peihua Li¹, Lei Zhang²
¹Dalian University of Technology, ²Hong Kong Polytechnic University
Trend of CNN architectures
LeNet-5 → AlexNet-8 → VGG-VD-19 / GoogLeNet-22 → ResNet-152 / Inception-V4
CNN architectures tend to be deeper and wider, and more accurate, built from only convolution, non-linearity (ReLU), and pooling.
Trainable structural layers
Modeling the outputs of the last convolutional layer with trainable structural layers:
- O²P layer (LogCOV) [DeepO2P, ICCV'15]
- Bilinear pooling (COV) [B-CNN, ICCV'15]
- Mean map embedding [DMMs, arXiv'15]
- VLAD coding [NetVLAD, CVPR'16]
Pipeline: Images → Conv. layers → structural layer → Loss
Trainable structural layers
Fine-grained Visual Classification: B-CNN [D,D] (84.1, 84.1, 91.3) vs. VGG-VD16 (76.4, 74.1, 79.8), an improvement of ~8%.
T.-Y. Lin, A. RoyChowdhury, and S. Maji. Bilinear CNN models for fine-grained visual recognition. In ICCV, 2015.
Trainable structural layers
Place Recognition (Pitts30k): NetVLAD (85.6) vs. AlexNet (69.8), an improvement of ~15% (both built on AlexNet).
R. Arandjelovic, P. Gronat, A. Torii, T. Pajdla, and J. Sivic. NetVLAD: CNN architecture for weakly supervised place recognition. In CVPR, 2016.
Trainable structural layers
Scene Categorization (Places205): DMMs + GoogLeNet (49.0) vs. GoogLeNet (47.5).
J. B. Oliva, D. J. Sutherland, B. Póczos, and J. G. Schneider. Deep mean maps. arXiv, abs/1511.04150, 2015.
Trainable structural layers
Takeaway: integrating trainable structural layers into deep CNNs achieves significant improvements in many challenging vision tasks.
Parametric probability distribution modeling
① Models abundant statistics of features.
② Produces fixed-size representations regardless of varying feature sizes.
Promising modeling performance (outperforming coding methods): distributions used include the Gaussian, the Gaussian mixture model, and the Gaussian-Laplacian model [Nakayama et al., CVPR'10; Serra et al., CVIU'15; Wang et al., CVPR'16].
High computational efficiency: closed-form solution for parameter estimation.
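As a concrete illustration of the closed-form estimation noted above, the following NumPy sketch (my own illustration, not code from any cited paper; the function name is hypothetical) fits a global Gaussian to a convolutional feature map:

```python
import numpy as np

def global_gaussian(feature_map):
    """Closed-form estimation of a global Gaussian from an (H, W, d) feature map.

    The H*W spatial positions are treated as N samples of a d-dimensional
    variable; the output size depends only on d, not on H or W.
    """
    d = feature_map.shape[-1]
    X = feature_map.reshape(-1, d)        # N x d sample matrix
    mu = X.mean(axis=0)                   # mean: closed form
    Xc = X - mu
    sigma = Xc.T @ Xc / X.shape[0]        # covariance: closed form, no iteration
    return mu, sigma
```

Because both estimates are single matrix expressions, estimation is cheap compared with iteratively fitting, e.g., a mixture model, and the representation size is fixed no matter how many features the convolutional layer produces.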
Embedding of global Gaussian in a CNN
Pipeline: Images → Conv. layers → Global Gaussian → … → Loss
Global Gaussian distribution embedding network (G²DeNet)
Pipeline: Images → Conv. layers → Global Gaussian embedding layer → Loss
Gaussian embedding: N(μ, Σ) ↦ [Σ + μμᵀ, μ; μᵀ, 1]^(1/2) (a 2×2 block matrix)
The embedding layer consists of two sub-layers:
- Matrix Partition Sub-layer: Y = f_MPL(X) = (1/N)(A X Xᵀ Aᵀ + 2(A X 1 bᵀ)_sym) + B
- Square-rooted SPD Matrix Sub-layer: Z = f_SRL(Y) = Y^(1/2)
A trainable global Gaussian embedding layer for modeling convolutional features: the first attempt to plug a parametric probability distribution into deep CNNs.
Challenges
Forward propagation. Q: How to construct the trainable global Gaussian embedding layer? A: The key is to give an explicit matrix form of the Gaussian, respecting both the Riemannian geometry and the algebraic structure of the space of Gaussians.
Backward propagation. The embedding must be differentiable so that gradients can flow through it.
Gaussian embedding
The space of Gaussians is a Riemannian manifold with a special geometric structure; [TPAMI'17] shows that it is endowed with a Lie group structure.
Via Cholesky decomposition Σ = LLᵀ, the Gaussian N(μ, Σ) corresponds to the positive upper triangular matrix A = [L, μ; 0ᵀ, 1]; via left polar decomposition A = PO, it further corresponds to the SPD matrix P = (AAᵀ)^(1/2) = [Σ + μμᵀ, μ; μᵀ, 1]^(1/2).
Gaussian → positive upper triangular matrix → SPD matrix
[TPAMI'17] Peihua Li, Qilong Wang et al. Local Log-Euclidean Multivariate Gaussian Descriptor and Its Application to Image Classification. TPAMI, 2017.
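The embedding chain above can be checked numerically. This sketch (my own illustration, with hypothetical variable names; not the authors' code) builds the triangular factor from a Cholesky decomposition and confirms that the left polar decomposition of the resulting matrix yields the SPD embedding:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
mu = rng.standard_normal(d)
M = rng.standard_normal((d, d))
sigma = M @ M.T + d * np.eye(d)           # a well-conditioned SPD covariance

# Cholesky factor (NumPy returns the lower-triangular variant; the
# identities below hold for either triangular convention).
L = np.linalg.cholesky(sigma)             # sigma = L @ L.T
A = np.block([[L, mu[:, None]],
              [np.zeros((1, d)), np.ones((1, 1))]])

# A A^T reproduces the block matrix [Sigma + mu mu^T, mu; mu^T, 1]
S = np.block([[sigma + np.outer(mu, mu), mu[:, None]],
              [mu[None, :], np.ones((1, 1))]])
assert np.allclose(A @ A.T, S)

# Left polar decomposition A = P O: P = (A A^T)^{1/2} is the SPD
# embedding of the Gaussian, and O = P^{-1} A is orthogonal.
w, U = np.linalg.eigh(S)
P = U @ np.diag(np.sqrt(w)) @ U.T
O = np.linalg.solve(P, A)
assert np.allclose(O @ O.T, np.eye(d + 1), atol=1e-8)
```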
Global Gaussian embedding layer
Gaussian embedding: N(μ, Σ) ↦ [Σ + μμᵀ, μ; μᵀ, 1]^(1/2)
1. Matrix Partition Sub-layer: Y = f_MPL(X) = (1/N)(A X Xᵀ Aᵀ + 2(A X 1 bᵀ)_sym) + B, where (M)_sym = (M + Mᵀ)/2; Y is a function of the convolutional features X.
2. Square-rooted SPD Matrix Sub-layer: Z = f_SRL(Y) = Y^(1/2), computed via the SVD (eigendecomposition) of Y.
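A minimal NumPy sketch of the two sub-layers (my own re-implementation for illustration, not the authors' code; A, b, B are fixed to the constant selector matrices implied by the block form of the embedding):

```python
import numpy as np

def f_mpl(X):
    """Matrix partition sub-layer: Y = (1/N)(A X X^T A^T + 2(A X 1 b^T)_sym) + B.

    X is d x N (N convolutional features as columns).  With the constant
    matrices below, Y equals the block matrix [Sigma + mu mu^T, mu; mu^T, 1].
    """
    d, N = X.shape
    A = np.vstack([np.eye(d), np.zeros((1, d))])   # (d+1) x d selector
    b = np.zeros(d + 1); b[-1] = 1.0
    B = np.outer(b, b)
    M = np.outer(A @ X @ np.ones(N), b)            # A X 1 b^T
    return (A @ X @ X.T @ A.T + M + M.T) / N + B

def f_srl(Y):
    """Square-rooted SPD matrix sub-layer: Z = Y^{1/2} via eigendecomposition."""
    w, U = np.linalg.eigh(Y)
    return U @ np.diag(np.sqrt(np.maximum(w, 0.0))) @ U.T
```

The top-left d×d block of Y is Σ + μμᵀ = (1/N)XXᵀ, the last column holds μ, and the corner entry is 1, matching the Gaussian embedding above.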
BP for the global Gaussian embedding layer
Pipeline: Images → Conv. layers → Global Gaussian embedding layer (Y = f_MPL(X), Z = f_SRL(Y) = Y^(1/2)) → Loss
The goal is to compute ∂f/∂X; the first step is to compute ∂f/∂Y from ∂f/∂Z.
BP for the square-rooted SPD matrix sub-layer
Compute ∂f/∂Y from ∂f/∂U and ∂f/∂Σ, given the eigendecomposition Y = UΣUᵀ, Σ = diag(σ₁, …, σₙ) [DeepO2P, ICCV'15]:
  dΣ = (Uᵀ dY U)_diag,   dU = U(Kᵀ ∘ (Uᵀ dY U))
  ∂f/∂Y = U((Kᵀ ∘ (Uᵀ ∂f/∂U))_sym + (∂f/∂Σ)_diag)Uᵀ,   where K_ij = 1/(σ_i − σ_j) for i ≠ j and 0 otherwise
[DeepO2P, ICCV'15]: Catalin Ionescu et al. Matrix Backpropagation for Deep Networks with Structured Layers. ICCV, 2015.
BP for the square-rooted SPD matrix sub-layer (cont.)
Compute ∂f/∂U and ∂f/∂Σ from ∂f/∂Z, given Z = f_SRL(Y) = UΣ^(1/2)Uᵀ:
  dZ = 2(dU Σ^(1/2) Uᵀ)_sym + (1/2) U Σ^(−1/2) dΣ Uᵀ
  ∂f/∂U = 2(∂f/∂Z)_sym U Σ^(1/2),   ∂f/∂Σ = (1/2) Σ^(−1/2) Uᵀ (∂f/∂Z) U
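Putting the two steps together, backprop through Z = Y^(1/2) can be sketched and verified against finite differences. This is my own illustration (not the authors' code); it assumes Y is SPD with distinct eigenvalues and takes sym(M) = (M + Mᵀ)/2:

```python
import numpy as np

def sqrt_backward(Y, dZ):
    """Gradient of f w.r.t. Y for Z = Y^{1/2}, given dZ = df/dZ."""
    s, U = np.linalg.eigh(Y)                 # Y = U diag(s) U^T
    G = (dZ + dZ.T) / 2                      # (df/dZ)_sym
    dU = 2 * G @ U @ np.diag(np.sqrt(s))     # df/dU = 2 (df/dZ)_sym U S^{1/2}
    dS = 0.5 * np.diag(1 / np.sqrt(s)) @ U.T @ G @ U   # df/dSigma
    n = len(s)
    K = np.zeros((n, n))                     # K_ij = 1/(s_i - s_j), i != j
    for i in range(n):
        for j in range(n):
            if i != j:
                K[i, j] = 1.0 / (s[i] - s[j])
    P = K.T * (U.T @ dU)                     # K^T ∘ (U^T df/dU)
    inner = (P + P.T) / 2 + np.diag(np.diag(dS))
    return U @ inner @ U.T

# Finite-difference check of df/dY for f(Y) = <G0, Y^{1/2}>
rng = np.random.default_rng(2)
n = 4
R = rng.standard_normal((n, n))
Y = R @ R.T + n * np.eye(n)                  # SPD, distinct eigenvalues
G0 = rng.standard_normal((n, n)); G0 = (G0 + G0.T) / 2

def f(Y):
    w, U = np.linalg.eigh(Y)
    return np.sum(G0 * (U @ np.diag(np.sqrt(w)) @ U.T))

dfdY = sqrt_backward(Y, G0)
E = rng.standard_normal((n, n)); E = (E + E.T) / 2   # symmetric perturbation
h = 1e-6
num = (f(Y + h * E) - f(Y - h * E)) / (2 * h)
assert abs(np.sum(dfdY * E) - num) < 1e-5
```

The per-element loop building K is kept explicit for clarity; a vectorized version would broadcast over the eigenvalue vector.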
BP for the global Gaussian embedding layer
The goal is to compute ∂f/∂X given ∂f/∂Y. From
  Y = f_MPL(X) = (1/N)(A X Xᵀ Aᵀ + 2(A X 1 bᵀ)_sym) + B,
the gradient is
  ∂f/∂X = (2/N) Aᵀ (∂f/∂Y)_sym (A X + b 1ᵀ).
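This gradient can also be sanity-checked numerically. The sketch below (my own illustration, not the authors' code) implements both the forward map and the stated gradient, then compares against finite differences of f(X) = ⟨G, Y(X)⟩:

```python
import numpy as np

def f_mpl(X):
    """Y = (1/N)(A X X^T A^T + 2(A X 1 b^T)_sym) + B, with constant A, b, B."""
    d, N = X.shape
    A = np.vstack([np.eye(d), np.zeros((1, d))])
    b = np.zeros(d + 1); b[-1] = 1.0
    M = np.outer(A @ X @ np.ones(N), b)          # A X 1 b^T
    return (A @ X @ X.T @ A.T + M + M.T) / N + np.outer(b, b)

def mpl_backward(X, dY):
    """df/dX = (2/N) A^T (df/dY)_sym (A X + b 1^T)."""
    d, N = X.shape
    A = np.vstack([np.eye(d), np.zeros((1, d))])
    b = np.zeros(d + 1); b[-1] = 1.0
    G = (dY + dY.T) / 2                          # (df/dY)_sym
    return (2.0 / N) * A.T @ G @ (A @ X + np.outer(b, np.ones(N)))

# Finite-difference check of df/dX for f(X) = <G0, f_mpl(X)>
rng = np.random.default_rng(3)
d, N = 3, 10
X = rng.standard_normal((d, N))
G0 = rng.standard_normal((d + 1, d + 1))

dfdX = mpl_backward(X, G0)
E = rng.standard_normal((d, N))
h = 1e-6
num = (np.sum(G0 * f_mpl(X + h * E)) - np.sum(G0 * f_mpl(X - h * E))) / (2 * h)
assert abs(np.sum(dfdX * E) - num) < 1e-6
```

Since f_MPL is quadratic in X, the central difference is essentially exact here, which makes this a tight check of the formula.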
Global Gaussian distribution embedding network (G²DeNet): summary
Pipeline: Images → Conv. layers → Global Gaussian embedding layer → Loss
Gaussian embedding: N(μ, Σ) ↦ [Σ + μμᵀ, μ; μᵀ, 1]^(1/2), implemented by the Matrix Partition Sub-layer Y = f_MPL(X) followed by the Square-rooted SPD Matrix Sub-layer Z = f_SRL(Y) = Y^(1/2).
Structural backpropagation: ∂f/∂Y and ∂f/∂X.
Experiments on MS-COCO
890k segmented instances from the MS-COCO dataset: 80 classes, ~600k training instances, ~290k validation ones.
Comparison of classification errors (%) on MS-COCO:
  AlexNet (baseline)         25.3
  DeepO2P [ICCV'15]          28.6
  DeepO2P-FC (S) [ICCV'15]   28.9
  DeepO2P-FC [ICCV'15]       25.2
  DMMs-FC [arXiv'15]         24.6
  G²DeNet (Ours)             24.4
  G²DeNet-FC (S) (Ours)      22.6
  G²DeNet-FC (Ours)          21.5
Figure: convergence curve of our G²DeNet-FC with AlexNet on MS-COCO.
Experiments on FGVR: benchmarks
  Birds (CUB-200-2011): 200 classes, 5,994 training / 5,794 test images
  FGVC-Aircraft: 100 classes, 6,667 training / 3,333 test images
  FGVC-Cars: 196 classes, 8,144 training / 8,041 test images
Experiments on FGVR: results
  Methods              CUB-200-2011   FGVC-Aircraft   FGVC-Cars
  FC-CNN                   76.4           74.1           79.8
  FV-CNN                   77.5           77.6           85.7
  VLAD-CNN                 79.0           80.6           85.6
  NetFV [TPAMI'17]         79.9           79.0           86.2
  NetVLAD [CVPR'16]        81.9           81.8           88.6
  B-CNN [ICCV'15]          84.1           84.1           91.3
  G²DeNet (Ours)           87.1           89.0           92.5
Comparison of different counterparts using VGG-VD16 without bounding boxes or part annotations, under the same settings as B-CNN.
NetFV [TPAMI'17]: Lin et al. Bilinear CNNs for Fine-grained Visual Recognition. TPAMI, 2017.