
A Wrapped Normal Distribution on Hyperbolic Space for Gradient Based Learning



  1. ICML'19, Jun 12th, 2019. A Wrapped Normal Distribution on Hyperbolic Space for Gradient Based Learning. Yoshihiro Nagano 1), Shoichiro Yamaguchi 2), Yasuhiro Fujita 2), Masanori Koyama 2). 1) Department of Complexity Science, The University of Tokyo, Japan; 2) Preferred Networks, Inc., Japan. Code: github.com/pfnet-research/hyperbolic_wrapped_distribution. Poster: 6:30-9:00 PM @Pacific Ballroom #7

  2. Motivation. [Figure: Silver+2016] Example hierarchy: Mammal → Primate, Rodent; Primate → Human, Monkey, ...

  3. Motivation. Hierarchical Datasets: [Figure: Silver+2016]; example hierarchy: Mammal → Primate, Rodent; Primate → Human, Monkey, ... Hyperbolic Space: [Image: wikipedia.org]

  4. Motivation. Hierarchical Datasets: [Figure: Silver+2016]; example hierarchy: Mammal → Primate, Rodent; Primate → Human, Monkey, ... Hyperbolic Space: Volume increases exponentially with its radius.
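The exponential-growth remark can be made concrete in the two-dimensional case. The sketch below assumes the hyperbolic plane with curvature -1 (the slide does not state the curvature) and compares a geodesic disc of radius r with its Euclidean counterpart:

```latex
\begin{aligned}
\text{Hyperbolic: } & C(r) = 2\pi \sinh r, \qquad A(r) = 2\pi(\cosh r - 1) \sim \pi e^{r} \quad (r \to \infty) \\
\text{Euclidean: }  & C(r) = 2\pi r, \qquad\quad\; A(r) = \pi r^{2}
\end{aligned}
```

A tree with branching factor b has b^d nodes at depth d, which is also exponential growth, so the available room in hyperbolic space keeps pace with the number of nodes as depth increases.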

  5. Motivation. Hierarchical Datasets: [Figure: Silver+2016]; example hierarchy: Mammal → Primate, Rodent; Primate → Human, Monkey, ... Hyperbolic Space: [Figure: Nickel+2017]

  6. Motivation. Hierarchical Datasets: [Figure: Silver+2016]; example hierarchy: Mammal → Primate, Rodent; Primate → Human, Monkey, ... Hyperbolic Space: [Figure: Nickel+2017]. How can we extend these works to probabilistic inference?

  7. Difficulty: Probabilistic Distribution on Curved Space
     VAEs w/ Riemannian distribution [Ovinnikov2019; Mathieu+2019]:
     - Limited to the Gaussian w/ scalar variance
     - Needs rejection sampling
     ⇒ Construct the distribution through its sampling procedure, to get a flexible density and simple sampling.

  8. Construction of Hyperbolic Wrapped Distribution. Lorentz model: [equation not recovered]. Define a probability distribution on the locally flat tangent space, then project its random variable onto hyperbolic space with parallel transport and the exponential map. The log-density can be obtained analytically by calculating the change in volume under this projection.
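The construction described on this slide can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation (their code is in the repository linked on slide 1). It assumes the standard Lorentz (hyperboloid) model with curvature -1, where points satisfy <z, z>_L = -1 and z_0 > 0, and a diagonal covariance. All function names are hypothetical.

```python
import numpy as np

def lorentz_inner(x, y):
    """Lorentzian inner product <x, y>_L = -x_0*y_0 + sum_{i>=1} x_i*y_i."""
    return -x[..., 0] * y[..., 0] + np.sum(x[..., 1:] * y[..., 1:], axis=-1)

def parallel_transport(v, nu, mu):
    """Move tangent vectors v from T_nu H^n to T_mu H^n along the geodesic."""
    alpha = -lorentz_inner(nu, mu)
    coef = lorentz_inner(mu - alpha * nu, v) / (alpha + 1.0)
    return v + np.asarray(coef)[..., None] * (nu + mu)

def exp_map(mu, u):
    """Project tangent vectors u in T_mu H^n onto the hyperboloid."""
    # Clip before the square root so rounding cannot give a negative norm.
    sq_norm = np.maximum(lorentz_inner(u, u), 1e-12)
    r = np.asarray(np.sqrt(sq_norm))[..., None]
    return np.cosh(r) * mu + np.sinh(r) * u / r

def sample_wrapped_normal(mu, sigma, size, rng):
    """Draw `size` points around mu (a point on H^n, shape (n+1,)).

    sigma holds per-dimension standard deviations, shape (n,);
    the diagonal covariance is an assumption of this sketch.
    """
    n = mu.shape[-1] - 1
    origin = np.zeros(n + 1)
    origin[0] = 1.0
    # 1. Sample in R^n, which is identified with the tangent space at the origin.
    v_tilde = rng.normal(size=(size, n)) * sigma
    v = np.concatenate([np.zeros((size, 1)), v_tilde], axis=1)
    # 2. Parallel-transport the samples to the tangent space at mu.
    u = parallel_transport(v, origin, mu)
    # 3. Map them onto the hyperboloid with the exponential map.
    return exp_map(mu, u)
```

A call such as `sample_wrapped_normal(mu, sigma, 100, np.random.default_rng(0))` returns an array of shape `(100, n + 1)`. Every step is a differentiable function of `mu` and `sigma`, which is what makes the construction compatible with reparameterized gradient estimates.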


  11. Properties of Hyperbolic Wrapped Distribution. Density: [formula not recovered]. Projection: [formula not recovered].
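Since the density formula on this slide did not survive extraction, the sketch below uses the form given in the accompanying paper, which is an assumption here: log p(z) = log N(v; 0, Σ) - (n - 1) log(sinh(r) / r), where u is the inverse exponential map of z at μ, v is u transported back to the origin, and r = ||u||_L. It assumes the Lorentz model with curvature -1, a diagonal Σ, and unbatched inputs. All function names are hypothetical.

```python
import numpy as np

def lorentz_inner(x, y):
    """Lorentzian inner product <x, y>_L = -x_0*y_0 + sum_{i>=1} x_i*y_i."""
    return -x[0] * y[0] + np.dot(x[1:], y[1:])

def parallel_transport(v, nu, mu):
    """Move tangent vector v from T_nu H^n to T_mu H^n along the geodesic."""
    alpha = -lorentz_inner(nu, mu)
    return v + lorentz_inner(mu - alpha * nu, v) / (alpha + 1.0) * (nu + mu)

def exp_map(mu, u):
    """Forward projection; included only so a round trip can be checked."""
    r = np.sqrt(max(lorentz_inner(u, u), 1e-12))
    return np.cosh(r) * mu + np.sinh(r) * u / r

def inv_exp_map(mu, z):
    """Tangent vector u in T_mu H^n such that exp_map(mu, u) == z."""
    # Clip alpha slightly above 1 so z == mu does not divide by zero.
    alpha = max(-lorentz_inner(mu, z), 1.0 + 1e-12)
    return np.arccosh(alpha) / np.sqrt(alpha**2 - 1.0) * (z - alpha * mu)

def log_prob(z, mu, sigma):
    """Log-density of z under the wrapped normal centred at mu.

    sigma holds per-dimension standard deviations, shape (n,).
    """
    n = mu.shape[0] - 1
    origin = np.zeros(n + 1)
    origin[0] = 1.0
    # Undo the projection: back to T_mu, then back to the origin.
    u = inv_exp_map(mu, z)
    v_tilde = parallel_transport(u, mu, origin)[1:]
    r = np.sqrt(max(lorentz_inner(u, u), 1e-12))
    # Gaussian log-density in the tangent space at the origin.
    log_normal = (-0.5 * np.sum((v_tilde / sigma) ** 2)
                  - np.sum(np.log(sigma))
                  - 0.5 * n * np.log(2.0 * np.pi))
    # Volume change of the exponential map; parallel transport contributes none.
    log_det = (n - 1) * (np.log(np.sinh(r)) - np.log(r))
    return log_normal - log_det
```

At z = μ the correction term vanishes and the value reduces to the Gaussian log-density at zero. Away from μ the correction grows with r, reflecting how the exponential map stretches volume.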

  12. Numerical Evaluations

      Variational Autoencoder. Panels: (a) a tree representation of the training dataset, (b) Normal VAE (β = 1.0), (c) Hyperbolic VAE. The Hyperbolic VAE learned not only the true hierarchical structure but also noisy unseen data, without any explicit knowledge of the tree.

      Word embedding. Our model outperformed its Euclidean counterpart on the WordNet nouns dataset: MAP is higher at every n, and Rank is lower (better) at n = 5, 50, and 100.

      | n (dimension) | Euclid MAP   | Euclid Rank | Hyperbolic MAP | Hyperbolic Rank |
      |---------------|--------------|-------------|----------------|-----------------|
      | 5             | 0.296 ± .006 | 25.09 ± .80 | 0.506 ± .017   | 20.55 ± 1.34    |
      | 10            | 0.778 ± .007 | 4.70 ± .05  | 0.795 ± .007   | 5.07 ± .12      |
      | 20            | 0.894 ± .002 | 2.23 ± .03  | 0.897 ± .005   | 2.54 ± .20      |
      | 50            | 0.942 ± .003 | 1.51 ± .04  | 0.975 ± .001   | 1.19 ± .01      |
      | 100           | 0.953 ± .002 | 1.34 ± .02  | 0.978 ± .002   | 1.15 ± .01      |

  13. Conclusion
      - Proposed a projection-based probability distribution on hyperbolic space that is easy to use with gradient-based learning.
      - Constructed the wrapped normal distribution on the Lorentz model by projecting a random variable defined on the locally flat tangent space.
      - Numerically evaluated the performance of our model on various datasets, including MNIST, Atari 2600 Breakout, and WordNet.
      Poster: 6:30-9:00 PM @Pacific Ballroom #7
