On the Limitations of Representing Functions on Sets
Edward Wagstaff*, Fabian Fuchs*, Martin Engelcke*, Ingmar Posner, Michael Osborne
Machine Learning Research Group
*Equal contribution
Examples of Permutation-Invariant Problems: Detecting Common Attributes (e.g. Smiling, Blond Hair). CelebA dataset, Liu et al.
The Deep Sets architecture: Input → ϕ → Latent A → + (sum) → Latent B → ρ → Output
The sum-decomposition: f(x_1, …, x_M) = ρ( ϕ(x_1) + … + ϕ(x_M) ). The input set X = {x_1, …, x_M} consists of M real numbers (input space ℝ^M); each element embedding ϕ(x_m) lies in ℝ^N (stacked: ℝ^{N×M}); the summed latent representation Y = Σ_m ϕ(x_m) lies in ℝ^N; the output ρ(Y) lies in ℝ.
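A minimal sketch of this sum-decomposition in PyTorch is given below; the layer sizes, latent dimension, and names (DeepSets, phi, rho) are illustrative assumptions, not the authors' reference implementation.

# Minimal sketch of the sum-decomposition (Deep Sets) architecture described above.
import torch
import torch.nn as nn

class DeepSets(nn.Module):
    def __init__(self, latent_dim: int = 16):
        super().__init__()
        # phi: applied to each set element independently (ℝ → ℝ^N)
        self.phi = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, latent_dim))
        # rho: applied to the summed latent representation (ℝ^N → ℝ)
        self.rho = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, M, 1), a batch of sets with M scalar elements each
        latent = self.phi(x)        # (batch, M, N)
        pooled = latent.sum(dim=1)  # permutation-invariant sum over the set dimension
        return self.rho(pooled)     # (batch, 1)

# Usage: the output is unchanged under any permutation of the M set elements.
model = DeepSets(latent_dim=16)
sets = torch.randn(8, 7, 1)  # 8 sets of size M = 7
out = model(sets)            # shape (8, 1)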
Theorem 1 (Zaheer et al.): This architecture can successfully model any permutation-invariant function, even for latent dimension N = 1.

Proof sketch: Assume that the neural networks ϕ and ρ are universal function approximators. Find a ϕ such that the mapping Φ from the input set X to the latent representation Y is injective: define an enumeration c(x): ℚ → ℕ and then define ϕ(x) = 2^c(x). Since Y then uniquely identifies the input set, everything can be modelled by ρ.
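To make the injectivity step concrete, here is a toy check in Python; the small explicit enumeration of five rationals stands in for the full enumeration c: ℚ → ℕ used in the proof, so it is an illustrative assumption rather than the proof itself.

# Toy illustration: with phi(x) = 2^c(x) for an enumeration c, the summed latent
# value is a binary encoding of which elements are present, so distinct sets
# always produce distinct sums.
from fractions import Fraction
from itertools import combinations

# A small explicit enumeration standing in for c : Q -> N.
domain = [Fraction(0), Fraction(1, 2), Fraction(-1, 3), Fraction(2), Fraction(5, 7)]
c = {x: i for i, x in enumerate(domain)}
phi = lambda x: 2 ** c[x]

# Every distinct set of size 3 yields a distinct sum, so rho can read off the set exactly.
sums = {sum(phi(x) for x in s) for s in combinations(domain, 3)}
assert len(sums) == sum(1 for _ in combinations(domain, 3))  # injective on these sets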
Role of Continuity: We need to take real numbers into account! The construction above relies on an enumeration of the rationals, and the resulting ϕ(x) = 2^c(x) is highly discontinuous; it does not carry over to continuous functions on real-valued inputs.
Theorem 2: If we want to model all (continuous) permutation-invariant functions, it is sufficient and necessary that the latent dimension N is at least as large as the maximum input set size M.

Sketch of proof for necessity: To prove necessity, we only need one function which can't be decomposed with N < M; we pick max(X). We show that, in order to represent max(X), Φ(X) = Σ_x ϕ(x) needs to be injective, and this is not possible with N < M.
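For the sufficiency direction, one standard construction is the power-sum embedding sketched below in LaTeX; whether this is exactly the construction used in the paper is an assumption here, but it shows how an injective, continuous Φ with latent dimension N = M can be built.

% Hedged sketch: the power-sum embedding, one standard way to obtain an injective,
% continuous sum-decomposition with latent dimension N = M.
\[
  \phi(x) = \bigl(x,\; x^{2},\; \dots,\; x^{M}\bigr) \in \mathbb{R}^{M},
  \qquad
  \Phi(X) = \sum_{m=1}^{M} \phi(x_m)
          = \Bigl(\sum_{m} x_m,\; \sum_{m} x_m^{2},\; \dots,\; \sum_{m} x_m^{M}\Bigr).
\]
% By Newton's identities, these M power sums determine the elementary symmetric
% polynomials of x_1, ..., x_M, and hence the multiset {x_1, ..., x_M} itself, so
% \Phi is injective; a suitable \rho defined on the image of \Phi then yields
% f = \rho \circ \Phi for any continuous permutation-invariant f.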
Illustrative Example: Regressing to the Median. {0.1, 0.6, −0.32, 1.61, 0.5, 0.67, 0.3} → median = 0.5
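A minimal sketch of such a median-regression experiment, reusing the DeepSets module from the architecture sketch above, is given below; the data distribution, set size, optimizer, and training length are all assumptions for illustration.

# Sketch: regress to the median of each set using the DeepSets module defined earlier.
import torch

def make_batch(batch_size: int = 128, set_size: int = 7):
    x = torch.randn(batch_size, set_size, 1)               # random sets of scalars
    y = x.squeeze(-1).median(dim=1).values.unsqueeze(-1)   # target: per-set median
    return x, y

model = DeepSets(latent_dim=16)  # try varying latent_dim relative to set_size
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(2000):
    x, y = make_batch()
    loss = torch.nn.functional.mse_loss(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()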
[Figure: left, RMSE vs. latent dimension N for several input set sizes M; right, critical latent dimension N_c vs. input set size M.]
Thank You