Generative Models I
Ian Goodfellow, Staff Research Scientist, Google Brain
MILA Deep Learning Summer School, Montréal, Québec, 2017-06-27

Density Estimation (Goodfellow 2017)

Sample Generation
Figure: training examples shown alongside model samples.

Maximum Likelihood
\[ \theta^* = \arg\max_{\theta} \, \mathbb{E}_{x \sim p_{\text{data}}} \log p_{\text{model}}(x \mid \theta) \]
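As a minimal sketch of the maximum likelihood objective above, the following toy example replaces the expectation over p_data with an average over samples and picks the best parameter from a grid. The unit-variance Gaussian model, the synthetic data, and the names `avg_log_likelihood`, `grid`, and `theta_star` are illustrative assumptions, not part of the original slides.

```python
import math
import random

def avg_log_likelihood(data, mu, sigma=1.0):
    """Empirical estimate of E_{x ~ p_data} log p_model(x | theta), with theta = mu."""
    const = -0.5 * math.log(2.0 * math.pi * sigma ** 2)
    return sum(const - (x - mu) ** 2 / (2.0 * sigma ** 2) for x in data) / len(data)

random.seed(0)
# Synthetic stand-in for samples from p_data (an assumption for illustration).
data = [random.gauss(2.0, 1.0) for _ in range(1000)]

# Crude arg max over a grid of candidate means in [0, 4] with step 0.01.
grid = [i / 100.0 for i in range(0, 401)]
theta_star = max(grid, key=lambda mu: avg_log_likelihood(data, mu))

# For this model the maximizer is the sample mean, up to grid resolution.
sample_mean = sum(data) / len(data)
print(theta_star, sample_mean)
```

In practice the arg max is found with gradient-based optimization rather than a grid; the grid is used here only to keep the sketch short.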

Taxonomy of Generative Models
…
Maximum Likelihood
  Explicit density
    Tractable density
      - Fully visible belief nets
      - Change of variables models
    Approximate density
      Variational: Variational autoencoder
      Markov Chain: Boltzmann machine
  Implicit density
    Direct: GAN
    Markov Chain: GSN

Fully Visible Belief Nets
• Explicit formula based on chain rule (Frey et al, 1996):
\[ p_{\text{model}}(x) = p_{\text{model}}(x_1) \prod_{i=2}^{n} p_{\text{model}}(x_i \mid x_1, \ldots, x_{i-1}) \]
Figure: directed graphical model over x_1, x_2, x_3, x_4, …, x_n.

Fully Visible Belief Nets
• Disadvantages:
  • O(n) non-parallelizable sample generation runtime
  • Generation not controlled by a latent code
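The chain-rule factorization and the sequential sampling cost of fully visible belief nets can be sketched with a tiny binary model. The logistic form of each conditional, the random weights `W` and biases `b`, and the function names are assumptions chosen for illustration; the slides do not specify a parameterization.

```python
import itertools
import math
import random

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def conditional_p1(x_prev, w_row, bias):
    """p_model(x_i = 1 | x_1, ..., x_{i-1}) under a logistic conditional (assumed form)."""
    return sigmoid(bias + sum(w * x for w, x in zip(w_row, x_prev)))

def log_prob(x, W, b):
    """Chain rule: log p(x) is the sum over i of log p(x_i | x_1, ..., x_{i-1})."""
    total = 0.0
    for i in range(len(x)):
        p1 = conditional_p1(x[:i], W[i], b[i])
        total += math.log(p1 if x[i] == 1 else 1.0 - p1)
    return total

def sample(W, b, rng):
    """Ancestral sampling: n sequential steps, each depending on all earlier ones."""
    x = []
    for i in range(len(b)):
        p1 = conditional_p1(x, W[i], b[i])
        x.append(1 if rng.random() < p1 else 0)
    return x

rng = random.Random(0)
n = 4
W = [[rng.uniform(-1, 1) for _ in range(i)] for i in range(n)]  # row i uses x_1..x_{i-1}
b = [rng.uniform(-1, 1) for _ in range(n)]

# The density is explicit and normalized: probabilities of all 2^n states sum to 1.
total_mass = sum(math.exp(log_prob(list(x), W, b))
                 for x in itertools.product([0, 1], repeat=n))
print(total_mass, sample(W, b, rng))
```

The loop in `sample` cannot be parallelized across i because each step conditions on the values drawn before it, which is the O(n) cost noted on the slide.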

Notable FVBNs (“autoregressive models”)
• NADE (Larochelle et al 2011)
• MADE (Germain et al 2016)
• PixelCNN (van den Oord et al 2016)

Change of Variables
\[ y = g(x) \;\Rightarrow\; p_x(x) = p_y(g(x)) \left| \det\left( \frac{\partial g(x)}{\partial x} \right) \right| \]
e.g. Nonlinear ICA (Hyvärinen 1999)
Disadvantages:
- Transformation must be invertible
- Latent dimension must match visible dimension
Figure: 64x64 ImageNet samples from Real NVP (Dinh et al 2016)
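A one-dimensional sketch of the change of variables formula follows. It assumes a standard normal p_y and an affine map g(x) = (x - mu) / sigma, so the result can be checked against the closed-form Gaussian density. The finite-difference Jacobian and all names here are illustrative assumptions.

```python
import math

def standard_normal_pdf(y):
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

def density_via_change_of_variables(x, g, eps=1e-5):
    """p_x(x) = p_y(g(x)) * |dg/dx|; in 1-D the determinant is just the derivative."""
    jacobian = (g(x + eps) - g(x - eps)) / (2.0 * eps)  # central difference
    return standard_normal_pdf(g(x)) * abs(jacobian)

def gaussian_pdf(x, mu, sigma):
    """Closed-form N(mu, sigma^2) density, used only as a reference value."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))

mu, sigma = 1.5, 0.5

def g(x):
    # Invertible map from x to y (assumed for illustration).
    return (x - mu) / sigma

p_flow = density_via_change_of_variables(2.0, g)
p_ref = gaussian_pdf(2.0, mu, sigma)
print(p_flow, p_ref)
```

Models such as Real NVP use the same identity with a learned, invertible g whose Jacobian determinant is cheap to compute.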

Variational Learning
\[ p_{\text{model}}(x) = \int p_{\text{model}}(x, z) \, dz \]
Latent variable models often have intractable density
Figure: graphical model with latent variable z and observed variable x.
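To make the marginalization integral above concrete, here is a rough Monte Carlo sketch for a toy model in which the answer happens to be known. The model (z ~ N(0, 1), x | z ~ N(z, 1), so x ~ N(0, 2)) and the function names are assumptions for illustration; in the models of interest no closed form is available, which is why the density is called intractable.

```python
import math
import random

def normal_pdf(v, mean, var):
    return math.exp(-(v - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def monte_carlo_marginal(x, num_samples, rng):
    """Estimate p(x) = integral of p(x | z) p(z) dz by averaging p(x | z) over z ~ p(z)."""
    total = 0.0
    for _ in range(num_samples):
        z = rng.gauss(0.0, 1.0)          # prior p(z) = N(0, 1)
        total += normal_pdf(x, z, 1.0)   # likelihood p(x | z) = N(z, 1)
    return total / num_samples

rng = random.Random(0)
x = 1.0
estimate = monte_carlo_marginal(x, 20000, rng)
exact = normal_pdf(x, 0.0, 2.0)  # closed form exists only because the toy model is linear-Gaussian
print(estimate, exact)
```

With a high-dimensional z and a neural network likelihood, this naive estimator needs far too many samples, which motivates the variational bound.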

Variational Bound
\[ \log p(x) \geq \log p(x) - D_{\mathrm{KL}}\left( q(z) \,\|\, p(z \mid x) \right) = \mathbb{E}_{z \sim q} \log p(x, z) + H(q) \]
Variational inference: maximize with respect to q
Variational learning: maximize with respect to parameters of p
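The bound can be checked numerically on a small discrete example. The sketch below assumes a latent z with three states and a made-up table of joint probabilities p(x, z) for one fixed observation x; the numbers and names are hypothetical.

```python
import math

# Hypothetical joint probabilities p(x, z = k) for a fixed observed x, k = 0, 1, 2.
joint = [0.10, 0.25, 0.05]
p_x = sum(joint)                       # p(x) by summing out z
posterior = [j / p_x for j in joint]   # p(z | x)

def elbo(q):
    """E_{z ~ q} log p(x, z) + H(q)."""
    expected_log_joint = sum(qk * math.log(jk) for qk, jk in zip(q, joint) if qk > 0)
    entropy = -sum(qk * math.log(qk) for qk in q if qk > 0)
    return expected_log_joint + entropy

def kl(q, p):
    """D_KL(q || p) for discrete distributions."""
    return sum(qk * math.log(qk / pk) for qk, pk in zip(q, p) if qk > 0)

q = [0.5, 0.3, 0.2]                    # an arbitrary approximate posterior
gap = math.log(p_x) - elbo(q)          # equals D_KL(q || p(z | x))
print(math.log(p_x), elbo(q), gap, kl(q, posterior))
```

Setting q equal to the true posterior closes the gap, which is what variational inference aims for when it maximizes the bound with respect to q.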

Variational Autoencoder (Kingma and Welling 2013, Rezende et al 2014)
Define a neural network that predicts optimal q
Define p(x | z) via another neural network
Whole model can be fit via maximization of a single objective function with gradient-based optimization
Figure: CIFAR-10 samples (Kingma et al 2016)

For more information…
• Max Welling will teach a lesson on variational inference

Deep Boltzmann Machines (Salakhutdinov and Hinton, 2009)

Generative Stochastic Networks (Bengio et al., 2013)

Generative Adversarial Networks (Goodfellow et al., 2014)
• D(x) tries to be near 1 for x sampled from data
• D tries to make D(G(z)) near 0
• G tries to make D(G(z)) near 1
Figure: input noise z passes through a differentiable function G to give x sampled from model; a differentiable function D receives either x sampled from data or x sampled from model.
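The targets for D and G on the GAN slide can be expressed as losses on the discriminator's outputs. This sketch assumes the cross-entropy discriminator loss and the non-saturating generator loss from Goodfellow et al. (2014); the slide itself states only the targets, and the function names are hypothetical.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Low when D(x) is near 1 on data and D(G(z)) is near 0 on model samples."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))

def generator_loss(d_fake):
    """Non-saturating generator loss: low when D(G(z)) is near 1."""
    return -math.log(d_fake)

# d_real = D(x) for x from data, d_fake = D(G(z)); both are probabilities in (0, 1).
confident_d = discriminator_loss(d_real=0.9, d_fake=0.1)
confused_d = discriminator_loss(d_real=0.5, d_fake=0.5)
fooled = generator_loss(d_fake=0.9)
caught = generator_loss(d_fake=0.1)
print(confident_d, confused_d, fooled, caught)
```

In training, both networks are differentiable, so each loss is minimized by gradient steps on its own player's parameters while the other's are held fixed.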

Combining VAEs and GANs: Adversarial Variational Bayes (Mescheder et al, 2017)
Related:
- Adversarial autoencoders
- Adversarially learned inference
- BiGANs

What can you do with generative models?
• Simulated environments and training data
• Missing data
• Semi-supervised learning
• Multiple correct answers
• Realistic generation tasks
• Simulation by prediction
• Learn useful embeddings

Generative models for simulated training data (Shrivastava et al., 2016)

What is in this image? (Yeh et al., 2016)

Generative modeling reveals a face (Yeh et al., 2016)

Supervised Discriminator (Odena 2016, Salimans et al 2016)
Figure: two discriminator networks, each mapping an input through hidden units. One outputs Real / Fake; the other outputs Real cat / Real dog / Fake.

Semi-Supervised Classification: MNIST (Permutation Invariant) (Salimans et al 2016)
Number of incorrectly predicted test examples for a given number of labeled samples
Model                                  20            50           100         200
DGN [21]                               -             -            333 ± 14    -
Virtual Adversarial [22]               -             -            212         -
CatGAN [14]                            -             -            191 ± 10    -
Skip Deep Generative Model [23]        -             -            132 ± 7     -
Ladder network [24]                    -             -            106 ± 37    -
Auxiliary Deep Generative Model [23]   -             -            96 ± 2      -
Our model                              1677 ± 452    221 ± 136    93 ± 6.5    90 ± 4.2
Ensemble of 10 of our models           1134 ± 445    142 ± 96     86 ± 5.6    81 ± 4.3

Semi-Supervised Classification (Salimans et al 2016)
CIFAR-10: test error rate for a given number of labeled samples
Model                          1000            2000            4000            8000
Ladder network [24]            -               -               20.40 ± 0.47    -
CatGAN [14]                    -               -               19.58 ± 0.46    -
Our model                      21.83 ± 2.01    19.61 ± 2.09    18.63 ± 2.32    17.72 ± 1.82
Ensemble of 10 of our models   19.22 ± 0.54    17.25 ± 0.66    15.59 ± 0.47    14.87 ± 0.89
SVHN: percentage of incorrectly predicted test examples for a given number of labeled samples
Model                                  500            1000            2000
DGN [21]                               -              36.02 ± 0.10    -
Virtual Adversarial [22]               -              24.63           -
Auxiliary Deep Generative Model [23]   -              22.86           -
Skip Deep Generative Model [23]        -              16.61 ± 0.24    -
Our model                              18.44 ± 4.8    8.11 ± 1.3      6.16 ± 0.58
Ensemble of 10 of our models           -              5.88 ± 1.0      -

Next Video Frame Prediction (Lotter et al 2016)
What happens next?
Figure panels: Ground Truth, MSE, Adversarial

Next Video Frame Prediction (Lotter et al 2016)
Figure panels: Ground Truth, MSE, Adversarial

iGAN (Zhu et al., 2016)
Video: YouTube

Introspective Adversarial Networks (Brock et al., 2016)
Video: YouTube

Image to Image Translation (Isola et al., 2016)
Figure: columns Input, Ground truth, Output; example tasks Labels to Street Scene (input, output) and Aerial to Map (input, output).

Unsupervised Image-to-Image Translation (Liu et al., 2017)
Figure: day to night

CycleGAN (Zhu et al., 2017)

Text-to-Image Synthesis (Zhang et al., 2016)
Input caption: "This bird has a yellow belly and tarsus, grey back, wings, and brown throat, nape with a black face"
