Conditional Generative Adversarial Networks (and a brief look at image-to-image translation)
Final Presentation
Peter Bromley
Generative Models

What is generative modeling?
- Data: samples from a high-dimensional probability distribution p_real
- Model: approximate p_real with a learned distribution p_fake

https://blog.openai.com/generative-models/
Generative Models (cont.)

- Noise is mapped through a function with learned weights; the resulting p_fake is compared to p_real via a loss function
- Why do it?
  - Data augmentation
  - Similar to the human ability to imagine an image
  - Can map from a noise vector to a high-dimensional probability distribution

https://www.statlect.com/probability-distributions/normal-distribution
https://blog.openai.com/generative-models/
Generative Adversarial Networks (Goodfellow et al., https://arxiv.org/abs/1406.2661)

- Generator: maps noise z ~ 𝒩(0, 1) to a sample from p_fake
- Discriminator: given a sample from p_real or p_fake, predicts real (1) or fake (0)
- Nash equilibrium: the discriminator guesses completely at random (accuracy = 0.5)
- GAN loss: min_G max_D V(D, G) = E_{x~p_real}[log D(x)] + E_{z~p_z}[log(1 - D(G(z)))]
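The minimax loss above can be sketched numerically. A minimal NumPy sketch (the discriminator outputs here are made-up illustrative values, not from the slides):

```python
import numpy as np

# d_real: discriminator outputs D(x) on real samples (should approach 1)
# d_fake: discriminator outputs D(G(z)) on generated samples (should approach 0)
d_real = np.array([0.9, 0.8, 0.95])
d_fake = np.array([0.1, 0.2, 0.05])

# Discriminator maximizes V = E[log D(x)] + E[log(1 - D(G(z)))],
# so its training loss is the negation of V:
d_loss = -(np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake)))

# Generator minimizes log(1 - D(G(z))); in practice the non-saturating
# form -log D(G(z)) from Goodfellow et al. is commonly used:
g_loss = -np.mean(np.log(d_fake))

# At the Nash equilibrium D(x) = 0.5 everywhere, so the value function is
# V = log(0.5) + log(0.5) = -log 4
v_eq = np.log(0.5) + np.log(0.5)
```

At equilibrium the discriminator's accuracy is 0.5, matching the slide's description of random guessing.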
DCGAN (Deep Convolutional GAN) - Model

- Generator: z ~ 𝒩(0, 1) -> fully connected layer -> stack of conv layers -> sample from p_fake
- Discriminator: conv layers -> fake or real? (p_fake vs. p_real)

https://www.safaribooksonline.com/library/view/deep-learning-with/9781787128422/abc1dd74-9e57-4f89-82a5-3014fc35b664.xhtml
http://gluon.mxnet.io/chapter14_generative-adversarial-networks/dcgan.html
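The generator's upsampling stack can be sanity-checked with the standard transposed-convolution output-size formula. A small sketch (the kernel/stride/padding values and layer count are typical DCGAN choices, assumed rather than taken from the slides):

```python
def deconv_out(size, kernel=4, stride=2, pad=1):
    """Output spatial size of a transposed convolution:
    out = (in - 1) * stride - 2 * pad + kernel."""
    return (size - 1) * stride - 2 * pad + kernel

# Typical DCGAN generator: a fully connected layer projects z to a 4x4
# feature map, then transposed convolutions double the resolution each stage.
size = 4
sizes = [size]
for _ in range(3):  # three upsampling stages
    size = deconv_out(size)
    sizes.append(size)
# 4 -> 8 -> 16 -> 32 (e.g. CIFAR10 resolution)
```

With kernel 4, stride 2, padding 1, each stage exactly doubles the spatial size, which is why this configuration is so common in DCGAN implementations.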
Conditional GANs

- Guide the learning process by conditioning the network on a label y
- New loss function*: min_G max_D V(D, G) = E_{x~p_real}[log D(x|y)] + E_{z~p_z}[log(1 - D(G(z|y)))]
  * (for the cGAN model specifically)
- 2N-GAN: the discriminator classifies into 2n classes, where n is the number of classes:
  c = 1 (real, class 1), c = 2 (real, class 2), ..., c = n + 2 (fake, class 2), and so on
- As of 5/10/18, the 2N-GAN model has not been thoroughly explored in the literature

https://arxiv.org/pdf/1606.03657.pdf
https://arxiv.org/abs/1610.09585
https://arxiv.org/pdf/1411.1784.pdf
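Label conditioning and the 2N-label scheme can be illustrated with a few lines of NumPy (a sketch; the batch size, noise dimension, and helper name `two_n_label` are my own illustrative choices):

```python
import numpy as np

n_classes = 10
z = np.random.randn(64, 100)             # noise batch
y = np.random.randint(0, n_classes, 64)  # class labels

# cGAN-style conditioning: concatenate a one-hot label onto the noise vector
one_hot = np.eye(n_classes)[y]
g_input = np.concatenate([z, one_hot], axis=1)  # shape (64, 110)

# 2N-GAN-style target: the discriminator predicts one of 2n classes,
# c = y for a real sample of class y, and c = n + y for a fake one.
def two_n_label(y, is_real, n=n_classes):
    return y if is_real else n + y
```

The discriminator then trains with an ordinary 2n-way classification loss, folding the real/fake decision and the class decision into a single output.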
Project Goals

Compare GAN models (specifically "conditional" GANs) on toy datasets:
- Conceptually (pros and cons)
- Subjectively (my evaluation of the images)
- Empirically (quantitative metric)

Briefly look into image-to-image translation:
- Apply to a novel image domain
- Experiment with the "2N label" model
DCGAN (Not a Conditional Model) - Results

Target images (p_real) vs. generated images (p_fake)
DCGAN makes realistic-looking images, but has no notion of class type
Conditional Models - MNIST / Fashion-MNIST

Generated samples from cDCGAN, ACGAN, and 2N-GAN
Conditional Models - MNIST / Fashion-MNIST (Loss Plots)

Loss plots for cDCGAN, ACGAN, and 2N-GAN on MNIST and Fashion-MNIST
Conditional Models - MNIST (In-Class Variation)

Real samples vs. cDCGAN, ACGAN, and 2N-GAN
- ACGAN and 2N-GAN seem to underfit the "ones" distribution, but not the "sevens"
- cDCGAN captures more variety, but produces more "wrong" numbers
- Note: the "ones" variant seems to be far less common than the "sevens" variant in the real data
Real CIFAR10 Data (for reference)

Classes: plane, car, bird, cat, deer, dog, frog, horse, ship, truck
Conditional Models - CIFAR10 (Results)

Generated samples per class (plane, car, bird, cat, deer, dog, frog, horse, ship, truck) for cDCGAN, ACGAN, and 2N-GAN
Conditional Models - CIFAR10 (Loss and Inception Score)

Inception Score: use a pretrained Inception net to classify generated samples
- Want low entropy for the label prediction on each sample, p(y|x) (confident classifications)
- Want high entropy for the label distribution over the whole generated set (diverse samples)
- Higher is better

Inception Scores:
- cDCGAN: mean 5.3405895, std 0.12261291
- ACGAN: mean 3.5518444, std 0.068273105
- 2N-GAN: mean 6.355787, std 0.1747299
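The score combines the two entropy requirements as IS = exp(E_x[KL(p(y|x) || p(y))]). A minimal NumPy sketch of that formula, applied to made-up classifier outputs rather than real Inception predictions:

```python
import numpy as np

def inception_score(p_yx):
    """p_yx: (num_samples, num_classes) softmax outputs of a pretrained
    classifier on generated images. IS = exp(E_x[KL(p(y|x) || p(y))])."""
    p_y = p_yx.mean(axis=0, keepdims=True)  # marginal label distribution
    kl = np.sum(p_yx * (np.log(p_yx) - np.log(p_y)), axis=1)
    return float(np.exp(kl.mean()))

# Confident AND diverse predictions -> high score (near the class count)
sharp = np.full((4, 4), 0.01)
np.fill_diagonal(sharp, 0.97)

# Uniform, uncertain predictions -> score of 1, the minimum
flat = np.full((4, 4), 0.25)
```

Confident per-sample predictions push KL(p(y|x) || p(y)) up, while a diverse (high-entropy) marginal keeps p(y) spread out; the exponential just maps the averaged KL to a more readable scale.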
Conditional Models - InfoGAN

InfoGAN: Information-Maximizing GAN for unsupervised learning
- Input a predictable latent code as well as noise
- Maximize the mutual information between the code and the output
- Input: z ~ 𝒩(0, 1), c_cat ~ Categorical (10 classes), c1 and c2 ~ Unif(-1, 1)
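The generator input described above can be assembled like this (a sketch; the 62-dimensional noise vector and the batch size are assumptions, chosen so the slide's three code components concatenate into one input):

```python
import numpy as np

rng = np.random.default_rng(0)
batch = 32

z = rng.standard_normal((batch, 62))         # incompressible noise, z ~ N(0, 1)
c_cat = rng.integers(0, 10, size=batch)      # categorical code, 10 classes
c_cont = rng.uniform(-1.0, 1.0, (batch, 2))  # continuous codes c1, c2 ~ Unif(-1, 1)

# Concatenate noise + one-hot categorical code + continuous codes
c_cat_onehot = np.eye(10)[c_cat]
g_input = np.concatenate([z, c_cat_onehot, c_cont], axis=1)  # (32, 74)
```

During training an auxiliary head on the discriminator tries to recover c_cat, c1, and c2 from the generated image, which is what maximizes the mutual information between code and output.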
Conditional Models - More InfoGAN

Samples swept over the continuous codes c1 and c2
Higher-Resolution Dataset: Cat Faces

DCGAN vs. InfoGAN samples

https://github.com/AlexiaJM/Deep-learning-with-cats
Linear Interpolation - Cats
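Latent-space interpolation takes two noise vectors and walks a straight line between them, feeding each intermediate point through the generator. A small sketch (the latent dimension and step count are illustrative):

```python
import numpy as np

def lerp(z1, z2, steps):
    """Linearly interpolate between latent vectors z1 and z2."""
    t = np.linspace(0.0, 1.0, steps)[:, None]
    return (1.0 - t) * z1 + t * z2

z1 = np.random.randn(100)
z2 = np.random.randn(100)
path = lerp(z1, z2, 8)  # 8 latents; decode each with G for a cat-to-cat morph
```

Smooth image transitions along this path suggest the generator has learned a continuous latent manifold rather than memorizing training samples.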
Conditional Model Comparisons - Conclusions

- 2N-GAN: most stable training, highest Inception Score
- cDCGAN: unstable training, but high in-class variability
- ACGAN: stable training, but lowest Inception Score
- InfoGAN: picks up interesting features in an unsupervised manner
A Brief Look into Image-to-Image Translation with CycleGAN

Image-to-image translation: translate an image from domain X to domain Y
Examples:
- Black-and-white photos to color
- Summer landscape images to winter

- The model works with unpaired training data
- CycleGAN: cycle-consistency loss

https://junyanz.github.io/CycleGAN/
https://hardikbansal.github.io/CycleGANBlog/
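The cycle-consistency loss penalizes a round trip that fails to reconstruct the input: with generators G: X -> Y and F: Y -> X, it is L_cyc = lam * (E||F(G(x)) - x||_1 + E||G(F(y)) - y||_1). A toy sketch with stand-in generators (the weight lam = 10 follows the CycleGAN paper; the lambda functions here are trivial placeholders, not real networks):

```python
import numpy as np

def cycle_consistency_loss(x, y, G, F, lam=10.0):
    """L_cyc = lam * (E||F(G(x)) - x||_1 + E||G(F(y)) - y||_1)."""
    forward = np.mean(np.abs(F(G(x)) - x))   # X -> Y -> X round trip
    backward = np.mean(np.abs(G(F(y)) - y))  # Y -> X -> Y round trip
    return lam * (forward + backward)

# Toy "generators": F exactly undoes G, so the loss is (near) zero.
G = lambda a: a + 1.0
F = lambda b: b - 1.0
x = np.random.randn(4, 8)
y = np.random.randn(4, 8)
```

This term is what lets CycleGAN train on unpaired data: without it, G could map every input to any plausible image in Y, since no paired ground truth pins down the correspondence.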
CycleGAN Results: Monet2Photo

Real photo (Real A) -> fake Monet (Fake B); real Monet (Real B) -> fake photo (Fake A)
CycleGAN Results: BrownBear2Panda (My experiment)
Overall Conclusions, Open Problems, and Future Work

- GANs are very hard to evaluate empirically
- It is difficult to capture global coherence in datasets with lots of image variety
- Throwing labels at a model does not always make it better

In the future:
- Further research into the 2N-GAN model
- Experiment with more variables in InfoGAN
- Go more in depth into image-to-image translation
- Text-to-image synthesis
Miscellaneous Failures

- Mode collapse at the 70th epoch
- "Animal Faces"
- "Cartoon2Celebrity"
- Demonic dogs
Thanks for watching!