Unsupervised Learning
• There is no direct ground truth for the quantity of interest
• Autoencoders
• Variational Autoencoders (VAEs)
• Generative Adversarial Networks (GANs)
Autoencoders
Goal: meaningful features that capture the main factors of variation in the dataset
• These are useful for classification, clustering, exploration, generation, …
• We have no ground truth for them
[Diagram: input data → encoder → features]
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
Autoencoders
Goal: meaningful features that capture the main factors of variation, i.e. features that can be used to reconstruct the image
[Diagram: input data x → encoder → features z (latent variables) → decoder → reconstruction x̂; L2 loss: ‖x − x̂‖²]
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
Autoencoders
• A linear transformation for the encoder and decoder gives a result close to PCA
• Deeper networks give better reconstructions, since the basis can be non-linear
[Figure: original images vs. autoencoder and PCA reconstructions]
Image Credit: Reducing the Dimensionality of Data with Neural Networks, Hinton and Salakhutdinov
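The PCA connection can be checked numerically. Below is a minimal NumPy sketch, not from the slides (the data shape, learning rate, and iteration count are arbitrary choices): a linear autoencoder trained with the L2 loss approaches the reconstruction error of the optimal rank-k PCA projection.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 centered samples in 5-D with unequal variance per axis
X = rng.standard_normal((200, 5)) * np.array([3.0, 2.0, 1.0, 0.5, 0.3])
X -= X.mean(axis=0)
k = 2  # latent dimension

# Optimal rank-k linear reconstruction: PCA via SVD (Eckart-Young)
_, _, Vt = np.linalg.svd(X, full_matrices=False)
X_pca = X @ Vt[:k].T @ Vt[:k]
pca_mse = np.mean((X - X_pca) ** 2)

# Linear autoencoder: X_hat = X @ W_enc @ W_dec, trained with the L2 loss
W_enc = rng.standard_normal((5, k)) * 0.1
W_dec = rng.standard_normal((k, 5)) * 0.1
lr = 0.02
for _ in range(10_000):
    Z = X @ W_enc                                # encode
    E = Z @ W_dec - X                            # reconstruction error
    g_dec = 2.0 / len(X) * Z.T @ E               # dL/dW_dec
    g_enc = 2.0 / len(X) * X.T @ (E @ W_dec.T)   # dL/dW_enc
    W_enc -= lr * g_enc
    W_dec -= lr * g_dec
ae_mse = np.mean((X @ W_enc @ W_dec - X) ** 2)
```

Since PCA gives the optimal rank-k linear reconstruction, `ae_mse` is bounded below by `pca_mse` and converges toward it; with non-linear layers this bound no longer applies, which is the point of the second bullet.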
Example: Document Word Probabilities → 2D Code
[Figure: 2D codes of documents, PCA-based vs. autoencoder]
Image Credit: Reducing the Dimensionality of Data with Neural Networks, Hinton and Salakhutdinov
Example: Semi-Supervised Classification
• Many images, but few ground-truth labels
• Start unsupervised: train an autoencoder (encoder + decoder, L2 loss) on the many unlabeled images
• Then supervised fine-tuning: initialize a classifier with the trained encoder and train it on the labeled images with a classification loss (softmax, etc.), comparing the predicted label to the GT label
[Diagram: input data → encoder → features (latent variables) → decoder (pre-training); input data → encoder → features → classifier → predicted label (fine-tuning)]
Slide Credit: Fei-Fei Li, Justin Johnson, Serena Yeung, CS 231n
Autoencoder: geometry.cs.ucl.ac.uk/creativeai
Generative Models
• Assumption: the dataset consists of samples drawn from an unknown distribution p_data(x)
• Goal: create a new sample from p_data(x) that is not in the dataset
[Figure: dataset images … → generated image]
Image credit: Progressive Growing of GANs for Improved Quality, Stability, and Variation, Karras et al.
Generative Models
[Diagram: z ~ p(z), a distribution that is known and easy to sample from → generator with parameters θ → generated sample]
Generative Models
How to measure the similarity of p_data(x) and the generated distribution p_θ(x)?
1) Likelihood of the data in p_θ(x): Variational Autoencoders (VAEs)
2) Adversarial game: a discriminator distinguishes real vs. generated samples, while the generator makes them hard to distinguish: Generative Adversarial Networks (GANs)
[Diagram: z ~ p(z), known and easy to sample from → generator with parameters θ]
Autoencoders as Generative Models?
• A trained decoder transforms some features z to approximate samples from p_data(x)
• What happens if we pick a random z? Decoder = generator?
• We do not know the distribution of features z that decode to likely samples
[Figure: a random point in the feature space / latent space]
Image Credit: Reducing the Dimensionality of Data with Neural Networks, Hinton and Salakhutdinov
Variational Autoencoders (VAEs)
• Pick a parametric distribution p(z) for the features (e.g. a standard normal)
• The generator maps z to an image distribution p_θ(x|z) (where θ are the generator's parameters)
• Train the generator to maximize the likelihood of the data samples x_i in p_θ(x) = ∫ p_θ(x|z) p(z) dz:
  max_θ Σ_i log p_θ(x_i)
[Diagram: z ~ p(z) → generator with parameters θ → p_θ(x|z)]
Outputting a Distribution
• Bernoulli distribution: the generator with parameters θ outputs per-pixel probabilities
• Normal distribution: the generator with parameters θ outputs per-pixel means (and variances)
[Diagram: z → generator with parameters θ → distribution parameters → sample]
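Either output choice makes the likelihood of a data sample computable in closed form. A small sketch of the two log-likelihoods (function names and shapes are illustrative assumptions, not from the slides):

```python
import numpy as np

def bernoulli_log_likelihood(x, p):
    # log p(x) for a pixel-wise Bernoulli: the generator outputs probabilities p
    return np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))

def normal_log_likelihood(x, mu, sigma):
    # log p(x) for a pixel-wise diagonal Normal: the generator outputs mu, sigma
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                  - (x - mu) ** 2 / (2 * sigma**2))
```

The Bernoulli form suits binarized images such as MNIST; the Normal form suits continuous pixel values.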
Variational Autoencoders (VAEs): Naïve Sampling (Monte-Carlo)
• Approximate the integral with Monte-Carlo samples in each iteration
• SGD approximates the sum over the data
Maximum likelihood of the data in the generated distribution:
  max_θ Σ_i log p_θ(x_i),  with  p_θ(x_i) = ∫ p_θ(x_i|z) p(z) dz ≈ (1/M) Σ_m p_θ(x_i|z_m),  z_m ~ p(z)
Variational Autoencoders (VAEs): Naïve Sampling (Monte-Carlo)
• Approximate the integral with Monte-Carlo samples in each iteration
• SGD approximates the expectation over the data
Loss function: L = −log (1/M) Σ_m p_θ(x_i|z_m),  z_m ~ p(z)
[Diagram: random x_i from the dataset; z_m ~ p(z) → generator with parameters θ]
Variational Autoencoders (VAEs): Naïve Sampling (Monte-Carlo)
• Only few z map close to a given x_i, i.e. have non-zero p_θ(x_i|z)
• Very expensive, or very inaccurate (depending on the sample count)
[Diagram: random x_i from the dataset; only a small region of z has non-zero p_θ(x_i|z)]
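The naïve estimator can be checked in a toy 1-D setting where the marginal likelihood is known exactly. In this sketch (the identity generator and the noise level are assumptions made for the toy example), p(x) = ∫ p(x|z) p(z) dz is estimated by averaging p(x|z_m) over samples z_m ~ p(z):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.5   # observation noise of the toy generator (an assumption)
x = 1.2       # the data point whose likelihood we estimate

def p_x_given_z(x, z):
    # Toy generator g(z) = z, so p(x|z) = N(x; z, sigma^2); prior p(z) = N(0, 1)
    return np.exp(-(x - z) ** 2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

# Naive Monte-Carlo estimate: p(x) ~= (1/M) sum_m p(x|z_m), z_m ~ p(z)
M = 200_000
z = rng.standard_normal(M)
estimate = p_x_given_z(x, z).mean()

# Analytic marginal for this toy model: p(x) = N(x; 0, 1 + sigma^2)
var = 1.0 + sigma**2
exact = np.exp(-x**2 / (2 * var)) / np.sqrt(2 * np.pi * var)
```

Most z_m contribute almost nothing to the average (only z near x has non-negligible p(x|z)), which is exactly the inefficiency the encoder on the next slide addresses.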
Variational Autoencoders (VAEs): The Encoder
• During training, another network can learn a distribution q_φ(z|x_i) of good z for a given x_i
• The variance of q_φ(z|x_i) should be much smaller than the variance of p(z)
• Then a single sample z ~ q_φ(z|x_i) is good enough
Loss function: L = −log p_θ(x_i|z),  z ~ q_φ(z|x_i)
[Diagram: x_i → encoder with parameters φ → q_φ(z|x_i) → sample z → generator with parameters θ]
Variational Autoencoders (VAEs): The Encoder
• Can we still easily sample a new x? We need to make sure q_φ(z|x_i) approximates p(z)
• Regularize with the KL-divergence KL(q_φ(z|x_i) ‖ p(z))
Loss function: L = −log p_θ(x_i|z) + KL(q_φ(z|x_i) ‖ p(z)),  z ~ q_φ(z|x_i)
• The negative loss can be shown to be a lower bound for the likelihood, and equivalent if q_φ(z|x_i) = p_θ(z|x_i)
[Diagram: x_i → encoder with parameters φ → sample z → generator with parameters θ]
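For a diagonal Gaussian encoder q_φ(z|x) = N(μ, diag(σ²)) and a standard normal prior p(z) = N(0, I), the KL regularizer has the well-known closed form ½ Σ (μ² + σ² − 1 − log σ²). A sketch:

```python
import numpy as np

def kl_to_standard_normal(mu, sigma):
    # KL( N(mu, diag(sigma^2)) || N(0, I) ), closed form, summed over dimensions
    return 0.5 * np.sum(mu**2 + sigma**2 - 1.0 - np.log(sigma**2))
```

The term vanishes exactly when μ = 0 and σ = 1, i.e. when the encoder's output matches the prior.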
Reparameterization Trick
Example when q_φ(z|x) = N(μ_φ(x), σ_φ(x)²):  z = μ_φ(x) + σ_φ(x) ⊙ ε,  where ε ~ N(0, I)
• ε does not depend on the parameters φ, so backpropagation can flow through μ_φ and σ_φ; without the trick, backprop cannot pass through the sampling node
[Diagram: left, sampling z directly from q_φ blocks backprop; right, z = μ + σ ⊙ ε with the sample ε as an input]
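The trick can be illustrated with a function whose expectation gradient is known: for f(z) = z² and z = μ + σε, E[f(z)] = μ² + σ², so the gradient with respect to μ is 2μ. Estimating it through reparameterized samples (a toy sketch, not from the slides; the values of μ and σ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma = 0.7, 0.3

# Reparameterization: z = mu + sigma * eps with parameter-free eps ~ N(0, 1)
eps = rng.standard_normal(100_000)
z = mu + sigma * eps

# The gradient of E[z^2] w.r.t. mu flows through the samples:
# d(z^2)/dmu = 2 * z * (dz/dmu) = 2 * z, averaged over the samples
grad_mu_est = np.mean(2 * z)

# Analytic check: E[z^2] = mu^2 + sigma^2, so d/dmu = 2 * mu
```

In a VAE the same mechanism lets the reconstruction loss backpropagate into the encoder's μ_φ and σ_φ outputs.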
Feature Space of Autoencoders vs. VAEs
[Figure: latent space of an autoencoder vs. a VAE]
SIGGRAPH Asia Course CreativeAI: Deep Learning for Graphics
Generating Data
[Figure: VAE samples on MNIST and on Frey Faces; z ~ p(z) → generator with parameters θ → sample]
Image Credit: Auto-Encoding Variational Bayes, Kingma and Welling
VAE on MNIST
https://www.siarez.com/projects/variational-autoencoder
Variational Autoencoder: geometry.cs.ucl.ac.uk/creativeai
Generative Adversarial Networks
• Player 1: the generator, which scores if the discriminator can't distinguish its output from real images
• Player 2: the discriminator, which scores if it can distinguish between real and fake
[Diagram: generator output and real image from the dataset → discriminator → real/fake]
Generative Models
How to measure the similarity of p_data(x) and the generated distribution p_θ(x)?
1) Likelihood of the data in p_θ(x): Variational Autoencoders (VAEs)
2) Adversarial game: a discriminator distinguishes real vs. generated samples, while the generator makes them hard to distinguish: Generative Adversarial Networks (GANs)
[Diagram: z ~ p(z), known and easy to sample from → generator with parameters θ]
Why Adversarial?
• If the discriminator approximated the data density p_data(x) directly:
• A generator sample at the maximum of p_data(x) would have the lowest loss
• The optimal generator would have a single mode at that maximum, with small variance
[Figure: a generator collapsing onto the mode of the data density]
Image Credit: How (not) to Train your Generative Model: Scheduled Sampling, Likelihood, Adversary?, Ferenc Huszár
Why Adversarial?
• For GANs, the discriminator instead approximates p_data(x) / (p_data(x) + p_g(x)), which depends on the generator
[Figure: discriminator output for two overlapping densities]
Image Credit: How (not) to Train your Generative Model: Scheduled Sampling, Likelihood, Adversary?, Ferenc Huszár
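This ratio, D*(x) = p_data(x) / (p_data(x) + p_g(x)), is the Bayes-optimal discriminator for a fixed generator. A toy 1-D sketch with assumed Gaussian densities (the means and variances are arbitrary choices for illustration):

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    return np.exp(-(x - mu) ** 2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

def d_star(x):
    # Optimal discriminator for data ~ N(0, 1) and generator ~ N(2, 1)
    p_data = gaussian_pdf(x, 0.0, 1.0)
    p_g = gaussian_pdf(x, 2.0, 1.0)
    return p_data / (p_data + p_g)
```

Where the two densities are equal (here at x = 1), D* = 0.5; deep in the data region it approaches 1, deep in the generator region it approaches 0.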
Why Adversarial?
• VAEs: maximize the likelihood of data samples in the generated distribution p_θ(x)
• GANs: the adversarial game approximately maximizes the likelihood of generator samples in the data distribution
[Figure: the two objectives fit a multi-modal data density differently]
Image Credit: How (not) to Train your Generative Model: Scheduled Sampling, Likelihood, Adversary?, Ferenc Huszár
GAN Objective
• D_ψ(x): probability that x is not fake; G_θ: the generator
• Fake/real classification loss (BCE): L(ŷ, y) = −y log ŷ − (1 − y) log(1 − ŷ)
• Discriminator objective: max_ψ E_{x~p_data}[log D_ψ(x)] + E_{z~p(z)}[log(1 − D_ψ(G_θ(z)))]
• Generator objective: min_θ E_{z~p(z)}[log(1 − D_ψ(G_θ(z)))]
[Diagram: z ~ p(z) → generator G_θ → discriminator D_ψ]
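The two objectives above can be written as per-sample losses on the discriminator's output probabilities. A sketch (scalar probabilities stand in for discriminator outputs; the function names are illustrative):

```python
import numpy as np

def discriminator_loss(d_real, d_fake):
    # BCE with target 1 for a real sample and 0 for a fake one (to be minimized)
    return -np.log(d_real) - np.log(1.0 - d_fake)

def generator_loss_minimax(d_fake):
    # Minimax generator loss: minimize log(1 - D(G(z)))
    return np.log(1.0 - d_fake)
```

Note that when the discriminator confidently rejects a fake (d_fake near 0), the minimax generator loss is nearly flat, which motivates the non-saturating heuristic on the next slides.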
Non-saturating Heuristic
• The generator loss log(1 − D_ψ(G_θ(z))) is a negative binary cross-entropy: poor convergence
[Figure: the negative BCE curve is flat where the discriminator is confident]
Image Credit: NIPS 2016 Tutorial: Generative Adversarial Networks, Ian Goodfellow
Non-saturating Heuristic
• The generator loss log(1 − D_ψ(G_θ(z))) is a negative binary cross-entropy: poor convergence
• Flip the target class instead of flipping the sign of the generator loss: −log D_ψ(G_θ(z)) converges well, like BCE
[Figure: negative BCE vs. BCE with flipped target]
Image Credit: NIPS 2016 Tutorial: Generative Adversarial Networks, Ian Goodfellow
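The difference between the two generator losses is easiest to see in their gradients with respect to the discriminator output d = D_ψ(G_θ(z)). When the discriminator confidently rejects a fake (d ≈ 0), the minimax loss log(1 − d) has gradient magnitude 1/(1 − d) ≈ 1, while the flipped-target loss −log d has gradient magnitude 1/d, which is large exactly when the generator most needs a learning signal:

```python
import numpy as np

d = 0.01  # discriminator output on a fake sample: confidently "fake"

# Saturating (minimax) generator loss: L = log(1 - d); |dL/dd| = 1/(1 - d)
sat_grad = 1.0 / (1.0 - d)

# Non-saturating heuristic: L = -log(d); |dL/dd| = 1/d
nonsat_grad = 1.0 / d
```

Here the non-saturating gradient is roughly a hundred times larger, which is why the heuristic rescues early training, when the discriminator easily wins.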
GAN Training
• Discriminator training, loss: −log D_ψ(x) − log(1 − D_ψ(G_θ(z))),  x from the dataset,  z ~ p(z)
• Generator training, loss: −log D_ψ(G_θ(z))
• Interleave the two updates in each training step
[Diagram: discriminator update on real and fake samples; generator update through the discriminator]
DCGAN
• First paper to successfully use CNNs with GANs
• Due to using components that were novel at the time, such as batch normalization and ReLUs
[Figure: DCGAN generator architecture]
Image Credit: Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, Radford et al.
Generative Adversarial Network: geometry.cs.ucl.ac.uk/creativeai
Conditional GANs (CGANs)
• ≈ learn a mapping between images from example pairs
• Approximate sampling from a conditional distribution p(x|y)
[Figure: image-to-image translation examples]
Image Credit: Image-to-Image Translation with Conditional Adversarial Nets, Isola et al.
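One common way to condition a GAN (class-conditional, as in Mirza and Osindero's cGAN; the image-to-image setting of Isola et al. instead conditions on an input image) is to concatenate an encoding of the condition with the input of both the generator and the discriminator. A sketch of the label case (all names and dimensions are illustrative assumptions):

```python
import numpy as np

def conditional_input(z, label, num_classes):
    # Concatenate the latent code z with a one-hot encoding of the condition,
    # so both generator and discriminator see which class is requested
    one_hot = np.zeros(num_classes)
    one_hot[label] = 1.0
    return np.concatenate([z, one_hot])
```

The same trick applies on the discriminator side, so that it judges "real and consistent with the condition" rather than just "real".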