From Variational to Deterministic Autoencoders


  1. From Variational to Deterministic Autoencoders or the joys of density estimation in latent spaces. Antonio Vergari. Joint work with: Partha Ghosh, Mehdi S.M. Sajjadi, Bernhard Schölkopf, Michael Black. University of California, Los Angeles. @tetraduzione. 26th August 2020, UCL AI Center Seminars

  2. Why?

  3. Why? learning time

  4. Why? learning time / inference time: the generative modeling paradigm

  5. Variational Autoencoders (VAEs) ⇒ Generative modeling [van den Oord 2017, Tolstikhin 2019, Razavi 2019, ...] ⇒ Density Estimation [Kingma 2014, Rezende 2014, Burda 2015, ...] ⇒ Disentanglement [Higgins 2016, ...] Kingma, Diederik P., and Max Welling. "Auto-Encoding Variational Bayes." ICLR 2014

  6. Variational Autoencoders (VAEs) vs. Regularized Autoencoders (RAEs): a simpler alternative for generative modeling ⇒ Generative modeling [van den Oord 2017, Tolstikhin 2019, Razavi 2019, ...] ⇒ Density Estimation [Kingma 2014, Rezende 2014, Burda 2015, ...] ⇒ Disentanglement [Higgins 2016, ...]

  7. ! disclaimer !

  8. Variational Autoencoders (VAEs)

  9. Variational Autoencoders (VAEs)

  10. Variational Autoencoders (VAEs)
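
The build slides above introduce the standard VAE ingredients: a Gaussian encoder q_φ(z|x), a decoder p_θ(x|z), and a standard normal prior p(z). A minimal sketch of these pieces, assuming a generic fully connected architecture with placeholder sizes (not the one used in the talk):

```python
import torch
import torch.nn as nn

class GaussianVAE(nn.Module):
    """Minimal VAE sketch: Gaussian posterior q(z|x), decoder p(x|z), prior N(0, I)."""
    def __init__(self, x_dim=784, z_dim=16, h_dim=256):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.mu = nn.Linear(h_dim, z_dim)        # posterior mean
        self.logvar = nn.Linear(h_dim, z_dim)    # posterior log-variance
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.ReLU(),
                                 nn.Linear(h_dim, x_dim), nn.Sigmoid())

    def reparameterize(self, mu, logvar):
        # z = mu + sigma * eps with eps ~ N(0, I): differentiable sampling
        eps = torch.randn_like(mu)
        return mu + torch.exp(0.5 * logvar) * eps

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = self.reparameterize(mu, logvar)
        return self.dec(z), mu, logvar
```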

  11. How to train VAEs?

  12. How to train VAEs?

  13. How to train VAEs?

  14. How to train VAEs?
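
For reference, the objective these build slides derive is the evidence lower bound (ELBO), maximized jointly over encoder parameters φ and decoder parameters θ:

```latex
\log p_\theta(x) \;\ge\;
\underbrace{\mathbb{E}_{q_\phi(z \mid x)}\left[\log p_\theta(x \mid z)\right]}_{\text{reconstruction}}
\;-\;
\underbrace{\mathrm{KL}\left(q_\phi(z \mid x) \,\|\, p(z)\right)}_{\text{compression}}
```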

  15. Training VAEs: issues Balancing reconstruction quality and compression [Burda et al. 2015, Tolstikhin et al. 2018, ...] Spurious global optima [Dai et al. 2019] Posterior collapse [van den Oord et al. 2018, ...] Prior/aggregate posterior mismatch [Tolstikhin et al. 2018, Dai et al. 2019, ...]

  16. Issue #1: balancing training

  17. Issue #1: balancing training (one-sample approximation!)

  18. Issue #1: balancing training (weighting the KL term!)
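
Putting slides 17 and 18 together: in practice the reconstruction term is estimated with a single reparameterized sample, and the KL term is re-weighted by a factor β (the β-VAE family), which is exactly the balancing knob the slide points at. A sketch reusing the GaussianVAE above; β is a placeholder hyperparameter:

```python
import torch
import torch.nn.functional as F

def beta_vae_loss(model, x, beta=1.0):
    # One-sample Monte Carlo estimate of the negative ELBO,
    # with the KL term re-weighted by beta.
    x_hat, mu, logvar = model(x)
    rec = F.binary_cross_entropy(x_hat, x, reduction='sum')
    # Closed-form KL between N(mu, sigma^2) and the prior N(0, I)
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + beta * kl
```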

  19. Sampling VAEs

  20. Sampling VAEs

  21. Sampling VAEs

  22. Sampling VAEs

  23. Sampling VAEs: the aggregate posterior should ideally match the prior!
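
The object the slide refers to is the aggregate posterior, the encoder's marginal over the data; sampling from the prior only yields good decodings where the two distributions agree:

```latex
q_\phi(z) \;=\; \mathbb{E}_{x \sim p_{\mathrm{data}}}\!\left[\, q_\phi(z \mid x) \,\right]
\;=\; \frac{1}{N} \sum_{i=1}^{N} q_\phi(z \mid x_i)
\;\;\overset{\text{ideally}}{=}\;\; p(z)
```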

  24. Issue #2: sampling spurious codes (the prior/aggregate posterior mismatch)

  25. Issue #2: sampling spurious codes (the decoder has a hard time “imagining”)

  26. Can we do better?

  27. Simpler VAEs?

  28. Simpler VAEs?

  29. Simpler VAEs?

  30. Simpler VAEs?

  31. Simpler VAEs?

  32. Simpler VAEs?

  33. Simpler VAEs?

  34. Simpler VAEs?

  35. How to have a smooth latent space? Ideally, small changes to a code should yield small changes in the decoded output.
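
One way to formalize this smoothness requirement (my phrasing, not a formula from the slides): the decoder should be Lipschitz in its code, so that nearby codes decode to similar outputs:

```latex
\| D_\theta(z_1) - D_\theta(z_2) \| \;\le\; L \,\| z_1 - z_2 \|
\qquad \text{for all } z_1, z_2, \text{ with a small constant } L
```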

  36. Regularized Autoencoders (RAEs)!

  37. Which regularization for RAEs?

  38. Which regularization for RAEs? Gradient penalization [Gulrajani et al. 2017; Mescheder et al. 2018]

  39. Which regularization for RAEs? Gradient penalization [Gulrajani et al. 2017; Mescheder et al. 2018] Spectral normalization [Miyato et al. 2018]

  40. Which regularization for RAEs? Gradient penalization [Gulrajani et al. 2017; Mescheder et al. 2018] Spectral normalization [Miyato et al. 2018] Weight decay [Bishop et al. 1996]
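
All three options above constrain the decoder's smoothness. Combined with a penalty keeping the codes compact, they give the RAE objective L_REC + β L_Z + λ L_REG of Ghosh et al. 2020. A sketch of the "+L2" (weight decay) variant shown on the next slides; encoder and decoder are deterministic modules, and the coefficients are illustrative placeholders, not the paper's values:

```python
import torch.nn.functional as F

def rae_l2_loss(encoder, decoder, x, beta=1e-4, lam=1e-3):
    z = encoder(x)        # deterministic code: no sampling, no KL term
    x_hat = decoder(z)
    rec = F.mse_loss(x_hat, x, reduction='sum')
    z_reg = 0.5 * z.pow(2).sum()                                    # keep codes compact
    weight_reg = sum(w.pow(2).sum() for w in decoder.parameters())  # L2 weight decay
    return rec + beta * z_reg + lam * weight_reg
```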

  41. RAE for image generation (VAE vs. RAE+L2): RAEs generate equally good or better samples and interpolations

  42. RAE for image generation (VAE vs. RAE+L2 vs. AE): even when regularization is implicit!

  43. Common image benchmarks: MNIST

  44. Common image benchmarks: CIFAR10

  45. Common image benchmarks: CelebA

  46. How do we sample from RAEs…?

  47. Sampling RAEs…?

  48. Ex-Post Density Estimation (XPDE)

  49. Ex-Post Density Estimation (XPDE)

  50. Which density estimator for XPDE?

  51. Which density estimator for XPDE? A SOTA deep generative model, e.g. an autoregressive model or a flow [van den Oord et al. 2019, Razavi et al. 2020]... or another VAE! ⇒ but then the VAE training and sampling issues are still there!

  52. Which density estimator for XPDE? striving for simplicity: just Gaussian Mixture Models
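
Concretely, XPDE with a GMM is a few lines on top of any trained autoencoder: encode the training set, fit the mixture, sample codes, decode. A sketch with scikit-learn; the 10 full-covariance components are one simple choice in the spirit of the slide, not necessarily the talk's exact setting:

```python
import torch
from sklearn.mixture import GaussianMixture

def xpde_sample(encoder, decoder, train_x, n_components=10, n_samples=64):
    # Ex-post density estimation: fit a GMM on the latent codes
    # of the training data, then sample new codes and decode them.
    with torch.no_grad():
        z = encoder(train_x).cpu().numpy()
    gmm = GaussianMixture(n_components=n_components,
                          covariance_type='full').fit(z)
    z_new, _ = gmm.sample(n_samples)
    with torch.no_grad():
        return decoder(torch.as_tensor(z_new, dtype=torch.float32))
```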

  53. Can’t we just do XPDE for VAEs?

  54. Can’t we just do XPDE for VAEs?

  55. Can’t we just do XPDE for VAEs?

  56. Ex-Post Density Estimation (XPDE) XPDE consistently improves sample quality for all VAE variants

  57. Why...does it work?

  58. Why... does it work? ConvNets are very, very, very smooth! [LeCun et al. 1994]

  59. Why... does it work? ConvNets are very, very, very smooth! [LeCun et al. 1994] ...and these datasets are full, full, full of regularities!

  60. What about more challenging data? E.g., generating structured objects like molecules

  61. VAEs for molecules? Molecule VAE [Gómez-Bombarelli et al. 2017] ⇒ GrammarVAE (GVAE) [Kusner et al. 2017] ⇒ Constrained Graph VAE (CGVAE) [Liu et al. 2018, ...] ⇒ ...

  62. GRAE: RAEifying the Grammar VAE. More accurate generation than Kusner et al. 2017.

  63. RAEify your VAEs! VAE RAE

  64. RAEify your VAEs! VAE RAE

  65. RAEify your VAEs! VAE RAE

  66. Is this really simple… and new?

  67. AEs for generative modeling MCMC schemes to sample from Contractive [Rifai et al. 2011] and Denoising Autoencoders [Bengio et al. 2009]

  68. Other flavours of XPDE: Two-Stage VAEs [Dai et al. 2019] use another VAE for XPDE ⇒ but the VAE training and sampling issues are still there! VQ-VAEs [van den Oord et al. 2019, Razavi et al. 2020] use a PixelCNN over discrete latents ⇒ VQ-VAEs are RAEs, not VAEs!

  69. What did we lose?

  70. What did we lose? VAEs: Generative modeling ✓, Density Estimation ✓, Disentanglement ✓. RAEs: Generative modeling ✓, Density Estimation ?, Disentanglement ?

  71. RAEs for density estimation (?) RAEs (and VQ-VAEs) are, like GANs, implicit likelihood models!

  72. RAEs for density estimation (?) RAEs (and VQ-VAEs) are, like GANs, implicit likelihood models! An approximate ELBO can be recovered under some geometric assumptions

  73. RAEs for disentanglement (?)

  74. Conclusions

  75. aiPhones ⇒ Phone capabilities ⇒ aiCloud, aiWatch, aiTunes, ... ⇒ 4k Video, ...

  76. aiPhones vs. RegularPhone ⇒ Phone capabilities ⇒ aiCloud, aiWatch, aiTunes, ... ⇒ 4k Video, ... What is the simplest model that gets you further?

  77. Takeaway #1: RAEify your VAEs! VAE RAE

  78. Takeaway #2: use XPDE! Boost your VAEs by training a density estimator on the latent codes!

  79. Paper https://openreview.net/forum?id=S1g7tpEYDS Code https://github.com/ParthaEth/Regularized_autoencoders-RAE-
