ICML 2020: On Learning Sets of Symmetric Elements. Haggai Maron [1], Or Litany [2], Gal Chechik [1,3], Ethan Fetaya [3]. [1] Nvidia Research, [2] Stanford University, [3] Bar-Ilan University
Motivation and Overview
Set Symmetry. Previous work (DeepSets, PointNet) targeted training a deep network over sets $\{x_1, x_2, \dots, x_m\}$. [Figure: a deep net applied to an input set]
Set + Elements Symmetry. Both the set and its elements have symmetries. Main challenge: what architecture is optimal when the elements of the set have their own symmetries? [Figure: a deep net applied to an input set of images]
Deep Symmetric Sets. [Figure: an input image set mapped to an output]
Set symmetry: order invariance/equivariance. [Figure: permuting the elements of the input set leaves an invariant output unchanged, and permutes an equivariant output accordingly]
Element symmetry: translation invariance/equivariance. [Figure: translating each image in the input set leaves an invariant output unchanged, and translates an equivariant output accordingly]
Applications. Modalities: 1D signals, 2D images, 3D point clouds, graphs.
This paper
• A principled approach for learning sets of complex elements (graphs, point clouds, images)
• Characterizes the maximally expressive linear layers that respect the symmetries (DSS layers)
• Proves universality results
• Experimentally demonstrates that DSS networks outperform baselines
Previous work
Deep Sets [Zaheer et al. 2017]: a Siamese CNN computes features for each element, followed by a Deep Sets block (see the sketch below). [Figure: per-element CNNs (Siamese) feeding features into a Deep Sets block]
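A minimal sketch of this architecture in PyTorch (module names and dimensions are illustrative, not from the paper's code):

```python
import torch
import torch.nn as nn

class DeepSetsBlock(nn.Module):
    """Per-element (Siamese) encoder, permutation-invariant sum pooling,
    then an MLP on the pooled features."""
    def __init__(self, in_dim, hidden_dim, out_dim):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
        self.rho = nn.Linear(hidden_dim, out_dim)

    def forward(self, x):            # x: (batch, n_elements, in_dim)
        feats = self.phi(x)          # same encoder applied to every element
        pooled = feats.sum(dim=1)    # sum over the set axis: order invariant
        return self.rho(pooled)

# Usage: out = DeepSetsBlock(16, 64, 10)(torch.randn(2, 5, 16))
```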
Previous work: information sharing. An information-sharing layer is interleaved between per-element CNNs (Aittala and Durand, ECCV 2018; Sridhar et al., NeurIPS 2019; Liu et al., ICCV 2019).
Our approach
Invariance. Many learning tasks are invariant to natural transformations (symmetries). More formally, let $H \le S_n$ be a subgroup; $f : \mathbb{R}^n \to \mathbb{R}$ is invariant if $f(\tau \cdot x) = f(x)$ for all $\tau \in H$. E.g., image classification: shifted copies of an image all map to "Cat".
Equivariance. Let $H \le S_n$ be a subgroup; $f$ is equivariant if $f(\tau \cdot x) = \tau \cdot f(x)$ for all $\tau \in H$. E.g., edge detection: translating the input translates the output.
Invariant neural networks
• Invariant by construction: a stack of equivariant layers, followed by an invariant layer and fully connected (FC) layers.
Deep Symmetric Sets. Elements $x_1, \dots, x_n \in \mathbb{R}^d$ with symmetry group $G \le S_d$. We want to be invariant/equivariant to both $G$ and the ordering. Formally, the symmetry group is $H = S_n \times G \le S_{nd}$.
Main challenges
• What is the space of linear equivariant layers for a given $H = S_n \times G$?
• Can we compute these operators efficiently?
• Do we lose expressive power? Is there a gap between $H$-invariant networks and the $H$-invariant continuous functions they aim to approximate?
$H$-equivariant layers. Theorem: any linear $S_n \times G$-equivariant layer $L : \mathbb{R}^{n \times d} \to \mathbb{R}^{n \times d}$ is of the form
\[ L(X)_i = L^G_1(x_i) + \sum_{j \ne i} L^G_2(x_j), \]
where $L^G_1, L^G_2$ are linear $G$-equivariant functions. We call these layers Deep Sets for Symmetric elements (DSS) layers.
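A minimal sketch of this layer in PyTorch, shown for the special case of a trivial element group $G$ (so the two $G$-equivariant maps are plain linear maps); names are illustrative:

```python
import torch
import torch.nn as nn

class DSSLinear(nn.Module):
    """L(X)_i = L1(x_i) + sum_{j != i} L2(x_j), implemented with the
    identity sum_{j != i} x_j = (sum_j x_j) - x_i, so the cost is O(n)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        # For a nontrivial G, L1 and L2 must be linear G-equivariant maps.
        self.L1 = nn.Linear(in_dim, out_dim, bias=False)
        self.L2 = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, x):                   # x: (batch, n, in_dim)
        total = x.sum(dim=1, keepdim=True)  # sum over all set elements
        return self.L1(x) + self.L2(total - x)
```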
DSS for images. $x_1, \dots, x_n$ are images and $G$ is the group of circular 2D translations, so the $G$-equivariant maps are convolutions. A single DSS layer therefore has a Siamese part (a convolution applied to each image) and an information-sharing part (a convolution applied to the sum of the other images); see the sketch below. [Figure: per-image CONV (Siamese part) + CONV over aggregated images (information-sharing part)]
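A minimal sketch of the image case, assuming square kernels and circular padding to match the circular-translation group (names are illustrative):

```python
import torch
import torch.nn as nn

class DSSConv2d(nn.Module):
    """DSS layer for image sets: both G-equivariant maps are convolutions."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        pad = k // 2
        self.siamese = nn.Conv2d(in_ch, out_ch, k, padding=pad,
                                 padding_mode='circular')
        self.sharing = nn.Conv2d(in_ch, out_ch, k, padding=pad,
                                 padding_mode='circular')

    def forward(self, x):                    # x: (batch, n, C, H, W)
        b, n, c, h, w = x.shape
        flat = x.reshape(b * n, c, h, w)
        # sum over the other images, again via (sum_j x_j) - x_i
        others = (x.sum(dim=1, keepdim=True) - x).reshape(b * n, c, h, w)
        out = self.siamese(flat) + self.sharing(others)
        return out.reshape(b, n, -1, h, w)
```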
Expressive power. Theorem: if $G$-equivariant networks are universal approximators for $G$-equivariant functions, then DSS networks are universal approximators for $S_n \times G$-equivariant functions.
• Main tool: Noether's theorem (invariant theory): for any finite group $H$, the ring of invariant polynomials $\mathbb{R}[x_1, \dots, x_n]^H$ is finitely generated.
• The generators can be used to create continuous unique encodings for elements of $\mathbb{R}^{n \times d} / H$.
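A concrete instance of this tool (a classical fact from invariant theory, not a result of the paper):

```latex
% Example: H = S_n acting on R^n by permuting coordinates.
% The power sums
\[
  p_k(x) = \sum_{i=1}^{n} x_i^k, \qquad k = 1, \dots, n,
\]
% generate the ring of S_n-invariant polynomials, and the map
% x \mapsto (p_1(x), \dots, p_n(x)) separates orbits, giving a
% continuous unique encoding of elements of R^n / S_n.
```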
Results
Signal classification
Image selection
Shape selection
Conclusions
• A general framework for learning sets of complex elements
• Generalizes many previous works
• Expressivity results
• Works well in many tasks and data types