On the Limitations of Representing Functions on Sets
  1. On the Limitations of Representing Functions on Sets. Edward Wagstaff*, Fabian Fuchs*, Martin Engelcke*, Ingmar Posner, Michael Osborne. Machine Learning Research Group. *Equal contribution.

  2. Examples of Permutation Invariant Problems: Detecting Common Attributes across a set of images, e.g. Smiling or Blond Hair (CelebA dataset, Liu et al.). A quick check of what permutation invariance means follows below.
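A function on a set is permutation invariant if reordering its inputs never changes its output. A minimal check in Python (NumPy; these example functions are illustrative, not from the talk):

```python
import numpy as np

# f is permutation invariant if f(x_1, ..., x_M) == f(x_s(1), ..., x_s(M))
# for every permutation s of the indices.
x = np.array([0.1, 0.6, -0.32, 1.61, 0.5, 0.67, 0.3])
perm = np.random.default_rng(0).permutation(len(x))

# Sum, max, and median all ignore input order.
for f in (np.sum, np.max, np.median):
    assert np.isclose(f(x), f(x[perm]))
```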

  3.–9. The Deep Sets architecture, built up one component per slide: Input → ϕ (applied to each set element) → Latent A → + (sum pooling) → Latent B → ρ → Output.
  10. The full architecture with notation: each element x_1, …, x_M of the input set X ⊂ ℝ^M is mapped by ϕ: ℝ → ℝ^N to a latent vector ϕ(x_m); stacked, these form a matrix in ℝ^(N×M). Sum pooling gives Y = ϕ(x_1) + ⋯ + ϕ(x_M) ∈ ℝ^N, and ρ: ℝ^N → ℝ produces the output, so that f(x_1, …, x_M) = ρ(ϕ(x_1) + ⋯ + ϕ(x_M)).
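A minimal sketch of this sum decomposition in PyTorch (layer widths and names are my own illustrative choices, not from the talk):

```python
import torch
import torch.nn as nn

class DeepSets(nn.Module):
    """f(X) = rho(sum_m phi(x_m)): permutation invariant by construction."""
    def __init__(self, latent_dim: int = 16):
        super().__init__()
        # phi maps each scalar set element to an N-dimensional latent vector.
        self.phi = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, latent_dim))
        # rho maps the pooled latent vector to the scalar output.
        self.rho = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, M, 1) -- a batch of sets with M scalar elements each.
        latents = self.phi(x)        # (batch, M, N)
        pooled = latents.sum(dim=1)  # (batch, N): order-independent pooling
        return self.rho(pooled)      # (batch, 1)
```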

  11. Theorem 1 (Zaheer et al.): This architecture can successfully model any permutation invariant function, even for latent dimension N = 1.

  12.–17. Proof (built up across slides):
  - Assume that the neural networks ϕ and ρ are universal function approximators.
  - Find a ϕ such that the mapping from the input set X to the latent representation Y is injective; then everything can be modelled, because ρ can recover the set from Y and compute any function of it.
  - Define an enumeration c(x): ℚ → ℕ, then define ϕ(x) = 2^c(x). Each set element contributes a distinct power of two, so the binary representation of the sum identifies the set uniquely, i.e. the mapping to Y is injective.
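A toy numerical check of this encoding over a small finite domain (the enumeration below is an illustrative fragment, not the paper's construction):

```python
from fractions import Fraction

# Enumerate a small rational domain; c(x) gives each value a distinct index.
domain = [Fraction(n, d) for d in range(1, 5) for n in range(-4, 5)]
c = {x: i for i, x in enumerate(dict.fromkeys(domain))}

def encode(xs):
    # phi(x) = 2**c(x): each element sets a distinct bit of an integer,
    # so sum pooling identifies the set uniquely (binary representation).
    return sum(2 ** c[x] for x in set(xs))

a = encode([Fraction(1, 2), Fraction(3, 4)])
b = encode([Fraction(3, 4), Fraction(1, 2)])  # same set, different order
d = encode([Fraction(1, 2), Fraction(1, 4)])  # a different set
assert a == b and a != d
```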

  18. Role of Continuity: We need to take real numbers into account! The enumeration c(x) exists only for countable domains such as ℚ, and the resulting ϕ is highly discontinuous, so the argument does not extend to continuous functions on real inputs, as the sketch below illustrates.
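A toy illustration, reusing the encoding idea from the proof sketch (the two enumeration indices below are hypothetical):

```python
from fractions import Fraction

# Under phi(x) = 2**c(x), numerically close inputs can receive wildly
# different encodings, because c reflects enumeration order, not distance.
c = {Fraction(1, 2): 0, Fraction(499, 1000): 9}  # hypothetical fragment of c

def phi(x):
    return 2 ** c[x]

x1, x2 = Fraction(1, 2), Fraction(499, 1000)
print(float(abs(x1 - x2)))   # inputs differ by 0.001
print(abs(phi(x1) - phi(x2)))  # encodings differ by 511
```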

  19.–20. Theorem 2: If we want to model all permutation invariant functions, it is sufficient and necessary that the latent dimension N is at least as large as the maximum input set size M.

  21.–24. Sketch of proof for necessity:
  - To prove necessity, we only need one function which can't be decomposed with N < M. We pick max(X).
  - We show that, in order to represent max(X), Φ(X) = Σ_x ϕ(x) needs to be injective.
  - This is not possible with N < M.
  For intuition about the sufficiency direction, see the sketch after this list.
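One standard injective map with N = M sends each element to its first M powers, so sum pooling yields the power sums, which determine a size-M multiset uniquely via Newton's identities (a textbook construction, offered here for intuition; the talk does not specify this exact map):

```python
import numpy as np

M = 4   # maximum set size
N = M   # latent dimension equal to the set size

def phi(x: float) -> np.ndarray:
    # Map a scalar to its first M powers: (x, x^2, ..., x^M).
    return np.array([x ** k for k in range(1, M + 1)])

def pooled(xs) -> np.ndarray:
    # Sum pooling yields the power sums p_1, ..., p_M, which determine
    # the multiset {x_1, ..., x_M} uniquely (Newton's identities).
    return sum(phi(x) for x in xs)

a = pooled([1.0, 2.0, 3.0, 4.0])
b = pooled([4.0, 3.0, 2.0, 1.0])  # same set, different order
c = pooled([1.0, 2.0, 3.0, 5.0])  # a different set
assert np.allclose(a, b) and not np.allclose(a, c)
```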

  25.–26. Illustrative Example: Regressing to the Median. Input set: {0.1, 0.6, −0.32, 1.61, 0.5, 0.67, 0.3}. A minimal training sketch follows.
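A minimal training sketch for this task, reusing the illustrative DeepSets module defined earlier (hyperparameters are arbitrary choices, not the paper's experimental setup):

```python
import torch

# Assumes the DeepSets class sketched earlier in this writeup.
M, latent_dim = 7, 16
model = DeepSets(latent_dim=latent_dim)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(2000):
    x = torch.randn(128, M, 1)          # batch of random sets
    target = x.median(dim=1).values     # (128, 1): the per-set median
    loss = torch.nn.functional.mse_loss(model(x), target)
    opt.zero_grad()
    loss.backward()
    opt.step()
```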

  27. Illustrative Example: Regressing to the Median. [Figure: left, RMSE (log scale, 10⁻² to 10⁰) against latent dimension N (log scale, 10⁰ to 10³) for several input set sizes; right, critical latent dimension N_c against input size M (0 to 600). The critical latent dimension grows with the input set size.]

  28. Thank You
