Adaptive Density Estimation for Generative Models


  1. Adaptive Density Estimation for Generative Models
     Thomas Lucas, Konstantin Shmelkov∗, Karteek Alahari, Cordelia Schmid, Jakob Verbeek. (∗ Now at Huawei.)

  2–4. Generative modelling
     Goal: given samples from a target distribution p∗, train a model pθ to match p∗.
     • Maximum likelihood: evaluate training points under the model.
     • Adversarial training [1]: evaluate samples under (an approximation of) p∗.
     [1] Ian Goodfellow et al. (2014). “Generative adversarial nets”. In: NIPS.
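
     The two bullets above correspond to two different training signals. The sketch below contrasts them on toy 1-D data; it is a minimal illustration, not code from the talk, and `Model` and `Disc` are hypothetical stand-ins for a generative model and a GAN discriminator.

```python
# Minimal sketch (not from the talk): the two training signals on toy 1-D data.
import torch
import torch.nn as nn

class Model(nn.Module):
    """Tiny Gaussian model p_theta(x) with a learnable mean and log-std."""
    def __init__(self):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(1))
        self.log_sigma = nn.Parameter(torch.zeros(1))

    def log_prob(self, x):
        return torch.distributions.Normal(self.mu, self.log_sigma.exp()).log_prob(x)

    def sample(self, n):
        return torch.distributions.Normal(self.mu, self.log_sigma.exp()).rsample((n,))

model = Model()
disc = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))  # D(x) -> logit

x_real = torch.randn(64, 1) * 0.5 + 2.0  # samples from the target p*

# Maximum likelihood: evaluate *training points* under the model.
mle_loss = -model.log_prob(x_real).mean()

# Adversarial training: evaluate *model samples* under an approximation of p*
# (the discriminator); non-saturating generator loss shown here.
x_fake = model.sample(64)
adv_loss = nn.functional.binary_cross_entropy_with_logits(
    disc(x_fake), torch.ones(64, 1))
```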

  5. Schematic illustration
     [Figure: model density overlaid on data samples.]

  6–8. Maximum likelihood: consequences
     [Figure: model density vs. data, showing over-generalization.]
     • MLE covers the full support of the data distribution (over-generalization).
     • Produces unrealistic samples.

  9–10. Adversarial training: consequences
     [Figure: mode-dropping.]
     • Produces high-quality samples.
     • Parts of the support are dropped (mode-dropping).

  11–16. Hybrid training approach
     Goal
     • Explicitly optimize both dataset coverage and sample quality.
     • The discriminator can be seen as a learnable inductive bias.
     • Retain a valid likelihood to evaluate support coverage.
     Challenges
     • Trade-off between the two objectives: more flexibility is needed.
     • Limiting parametric assumptions are required for tractable MLE, e.g. Gaussianity, conditional independence.
     • Often no likelihood in pixel space [2].
     [2] A. Larsen et al. (2016). “Autoencoding beyond pixels using a learned similarity metric”. In: ICML.

  17–19. Conditional independence
     $p(x \mid z) = \prod_{i=1}^{N} \mathcal{N}\left(x_i \mid \mu_\theta(z), \sigma I_n\right)$
     [Figure: under this factorization, some regions of the data are strongly penalised by the GAN loss, others strongly penalised by MLE.]
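
     The factorized Gaussian likelihood above reduces to a per-pixel squared error, which is what makes the conditional-independence assumption so restrictive. A minimal sketch of that computation, assuming `mu` is the output of some decoder μθ(z) (names are illustrative, not from the paper):

```python
# Log-likelihood of the conditionally independent Gaussian p(x|z) on the slide:
# it decomposes into a sum of per-pixel squared-error terms.
import math
import torch

def gaussian_log_likelihood(x, mu, sigma=0.1):
    """log p(x|z) under a factorized Gaussian with fixed scale sigma."""
    per_pixel = (-0.5 * ((x - mu) / sigma) ** 2
                 - math.log(sigma * math.sqrt(2 * math.pi)))
    return per_pixel.flatten(1).sum(dim=1)   # sum over pixels, keep batch dim

x  = torch.rand(8, 3, 32, 32)                # a batch of "images"
mu = torch.rand(8, 3, 32, 32)                # decoder output mu_theta(z)
print(gaussian_log_likelihood(x, mu).shape)  # torch.Size([8])
```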

  20–22. Going beyond conditional independence
     Avoiding strong parametric assumptions:
     • Lift reconstruction losses into a feature space.
     • Deep invertible models: a valid density in image space.
     • Retain fast sampling for adversarial training.
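
     One way to realize an invertible feature map with a cheap inverse, as the bullets suggest, is a coupling layer. The sketch below is a generic additive coupling layer, not the authors' architecture; it only illustrates how such a model exposes a log-determinant for the change of variables and a fast inverse for sampling.

```python
# Minimal sketch: one additive coupling layer standing in for an invertible
# feature extractor f_psi. Additive couplings are volume preserving, so
# log|det df/dx| = 0; affine couplings would add a learned log-scale term.
import torch
import torch.nn as nn

class AdditiveCoupling(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim // 2, 64), nn.ReLU(),
                                 nn.Linear(64, dim // 2))

    def forward(self, x):
        """x -> (f(x), log|det df/dx|)."""
        x1, x2 = x.chunk(2, dim=1)
        y2 = x2 + self.net(x1)            # shift one half by a function of the other
        logdet = torch.zeros(x.shape[0])  # additive coupling: unit Jacobian determinant
        return torch.cat([x1, y2], dim=1), logdet

    def inverse(self, y):
        y1, y2 = y.chunk(2, dim=1)
        return torch.cat([y1, y2 - self.net(y1)], dim=1)

f = AdditiveCoupling(dim=8)
x = torch.randn(4, 8)
y, logdet = f(x)
assert torch.allclose(f.inverse(y), x, atol=1e-6)  # exact, cheap invertibility
```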

  23–27. Maximum likelihood estimation with feature targets
     Maximum likelihood part: amortized variational inference in feature space,
     $\mathcal{L}_{\theta,\phi,\psi}(x) = \underbrace{-\,\mathbb{E}_{q_\phi(z \mid x)}\!\left[\ln p_\theta\!\left(f_\psi(x) \mid z\right)\right] + D_{\mathrm{KL}}\!\left(q_\phi(z \mid x) \,\|\, p_\theta(z)\right)}_{\text{evidence lower bound in feature space}} \;-\; \underbrace{\ln\left|\det \frac{\partial f_\psi}{\partial x}\right|}_{\text{change of variables}}$
     Adversarial training with Adaptive Density Estimation: adversarial update using the log-ratio loss,
     $\mathcal{L}_{\mathrm{adv}}(p_{\theta,\psi}) = -\,\mathbb{E}_{p_\theta(z)}\!\left[\ln \frac{D\!\left(f_\psi^{-1}(\mu_\theta(z))\right)}{1 - D\!\left(f_\psi^{-1}(\mu_\theta(z))\right)}\right]$
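
     A minimal sketch of how the two losses above could be computed, assuming `encoder` (qφ), `decoder` (μθ), `flow` (fψ, returning its log-determinant and providing an inverse, as in the coupling-layer sketch earlier) and `disc` (D) are given modules. The names and interfaces are placeholders, not the authors' implementation.

```python
# Hybrid objective sketch: ELBO on feature targets f_psi(x) plus the
# non-saturating log-ratio adversarial loss on decoded samples mapped back
# to image space through f_psi^{-1}.
import torch

def hybrid_losses(x, encoder, decoder, flow, disc, sigma=0.1):
    # --- Maximum likelihood part: ELBO in feature space + change of variables ---
    y, logdet = flow(x)                          # f_psi(x), log|det df_psi/dx|
    mu_z, logvar_z = encoder(x)                  # q_phi(z|x)
    z = mu_z + torch.randn_like(mu_z) * (0.5 * logvar_z).exp()
    recon = decoder(z)                           # mu_theta(z), in feature space
    log_px_z = torch.distributions.Normal(recon, sigma).log_prob(y).flatten(1).sum(1)
    kl = -0.5 * (1 + logvar_z - mu_z ** 2 - logvar_z.exp()).flatten(1).sum(1)
    mle_loss = (-log_px_z + kl - logdet).mean()

    # --- Adversarial part: log-ratio loss on samples from the prior ---
    z_prior = torch.randn_like(mu_z)
    x_fake = flow.inverse(decoder(z_prior))      # f_psi^{-1}(mu_theta(z))
    d = torch.sigmoid(disc(x_fake))
    adv_loss = -(torch.log(d) - torch.log1p(-d)).mean()  # -E[ln D/(1-D)]

    return mle_loss, adv_loss
```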

  28–29. Experiments on CIFAR-10
     [Figure: model samples vs. real images.]

     Model           BPD ↓   IS ↑    FID ↓
     GAN
       WGAN-GP          –     7.9       –
       SNGAN            –     7.4    29.3
       SNGAN (R,H)      –     8.2    21.7
     MLE
       VAE-IAF        3.1    3.8†   73.5†
       NVP            3.5    4.5†   56.8†
     Hybrid
       Ours (v1)      3.8    8.2    17.2
       Ours (v2)      3.5    6.9    28.9
       FlowGAN        4.2    3.9       –
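
     BPD in the table is bits per dimension: the negative log-likelihood in nats divided by the number of pixel dimensions times ln 2. A small worked example with a made-up per-image NLL value:

```python
# Bits-per-dimension (BPD) conversion: BPD = NLL_nats / (num_dims * ln 2).
import math

def bits_per_dim(nll_nats, num_dims):
    return nll_nats / (num_dims * math.log(2))

num_dims = 3 * 32 * 32                 # CIFAR-10 image dimensionality
nll_nats = 7470.0                      # hypothetical per-image NLL in nats
print(round(bits_per_dim(nll_nats, num_dims), 2))  # ≈ 3.51 bits/dim
```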

  30. Samples and real images (LSUN churches, 64×64)
     [Figure: samples at 4.3 BPD vs. real images.]
     Thank you for listening. Come see us at poster 71 :)
