  1. Wasserstein GAN. Martin Arjovsky, Soumith Chintala, Léon Bottou, ICML 2017. Presented by Yaochen Xie, 12-22-2017

  2. Contents ❖ GAN and its applications [1] ❖ GAN vs. Variational Auto-Encoder [2] ❖ What’s wrong with GAN [3], [4] ❖ JS Divergence and KL Divergence [3], [4] ❖ Wasserstein Distance [4], [5] ❖ WGAN and its Implementation [4]

  3. Take A Look Back at GAN D and G play the following two-player minimax game with the value function V(D, G):
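The value function appeared on the slide as an image; for reference, the standard objective from [1] is:

\[
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]
\]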

  4. Applications of GAN - Image Translation (Conditional GAN, Triangle GAN)

  5. Applications of GAN - Super-Resolution

  6. Applications of GAN - Image Inpainting (figure panels: Real, Input, TV, LR, GAN)

  7. GAN vs. VAE - AutoEncoder

  8. GAN vs. VAE - Variational AutoEncoder • add a constraint on the encoding network that forces it to generate latent vectors that roughly follow a unit Gaussian distribution • generative loss: mean squared error (reconstruction) • latent loss: KL divergence (both written out below)
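The two loss terms were shown only by name; for reference, a standard formulation (assuming an encoder that outputs a mean μ(x) and standard deviation σ(x) per latent dimension) is:

\[
\mathcal{L}(x) = \underbrace{\|x - \hat{x}\|^2}_{\text{generative loss}}
+ \underbrace{D_{\mathrm{KL}}\big(\mathcal{N}(\mu(x), \operatorname{diag}\sigma^2(x)) \,\|\, \mathcal{N}(0, I)\big)}_{\text{latent loss}},
\qquad
D_{\mathrm{KL}} = \tfrac{1}{2}\sum_j \big(\mu_j^2 + \sigma_j^2 - \log \sigma_j^2 - 1\big)
\]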

  9. GAN vs. VAE • VAE - explicit density model; uses MSE (reconstruction error) to judge generation quality • GAN - implicit density model; uses a discriminator to judge generation quality

  10. Drawbacks of GAN: Unstable, not converging; Gradient Vanishing; Mode Collapse

  11. Kullback–Leibler divergence (Relative Entropy). A measure of how one probability distribution diverges from another, defined for both continuous and discrete distributions (see below). Notice that D_KL(P||Q) is not equal to D_KL(Q||P); rigorously, the KL divergence cannot be considered a distance, since it is not symmetric.
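The definitions were shown as images; the standard forms are:

\[
D_{\mathrm{KL}}(P \,\|\, Q) = \int p(x) \log \frac{p(x)}{q(x)} \, dx \quad \text{(continuous)},
\qquad
D_{\mathrm{KL}}(P \,\|\, Q) = \sum_i P(i) \log \frac{P(i)}{Q(i)} \quad \text{(discrete)}
\]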

  12. Jensen-Shannon divergence. A symmetrized and smoothed version of the KL divergence. When two distributions are far from each other (e.g., their supports barely overlap), the JS divergence saturates at the constant log 2.
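For reference, the definition (shown on the slide as an image):

\[
\mathrm{JSD}(P \,\|\, Q) = \tfrac{1}{2} D_{\mathrm{KL}}(P \,\|\, M) + \tfrac{1}{2} D_{\mathrm{KL}}(Q \,\|\, M),
\qquad M = \tfrac{1}{2}(P + Q)
\]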

  13. Where is the loss from? The GAN loss is based on cross entropy. What if p and q belong to continuous distributions? The sums become expectations (see below).
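Written out (the formulas on the slide were images): the cross entropy is H(p, q) = -\sum_x p(x) \log q(x), and with continuous distributions the sums become expectations, so the discriminator's cross-entropy loss reads

\[
L(D) = -\,\mathbb{E}_{x \sim P_r}[\log D(x)] \;-\; \mathbb{E}_{x \sim P_g}[\log(1 - D(x))]
\]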

  14. What’s going wrong? Now we fix G and let D be optimal: the value function becomes 2 times the Jensen-Shannon divergence (minus a constant). So far, optimizing the loss is equivalent to minimizing the JS divergence between Pr and Pg; when Pr and Pg barely overlap, the JS divergence sits at the constant log 2 and provides no useful gradient to the generator: Gradient Vanishing!
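The intermediate steps were images; following [1] and [3], with D fixed at its optimum:

\[
D^*(x) = \frac{P_r(x)}{P_r(x) + P_g(x)},
\qquad
V(D^*, G) = 2\,\mathrm{JSD}(P_r \,\|\, P_g) - 2\log 2
\]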

  15. What’s going wrong? When G is fixed and D is optimal, the commonly used alternative generator loss amounts to minimizing KL(Pg || Pr) while maximizing JSD(Pr || Pg) at the same time, which pulls in opposite directions: Unstable. Moreover, KL(Pg || Pr) is asymmetric: where the generator produces implausible samples (Pg > 0, Pr → 0) the penalty KL → ∞, but where the generator drops modes of the real data (Pg → 0, Pr > 0) the penalty KL → 0, so missing modes go almost unpunished: Mode Collapse.
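Following the analysis in [3], with the alternative generator loss E[-log D(x)] and the optimal discriminator D*:

\[
\mathbb{E}_{x \sim P_g}\!\big[-\log D^*(x)\big]
= D_{\mathrm{KL}}(P_g \,\|\, P_r) - 2\,\mathrm{JSD}(P_r \,\|\, P_g) + \text{const},
\]

where the constant collects terms that do not depend on the generator.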

  16. What’s going wrong?

  17. We need a weaker distance. WHY? 1. We wish the loss, a distance ρ(Pr, Pθ), to be continuous. The most fundamental difference between such distances is their impact on the convergence of sequences of probability distributions: a sequence of distributions (Pt) converges to P if and only if ρ(Pt, P) → 0. A weaker distance makes it easier for such sequences to converge.

  18. We need a weaker distance. WHY? 2. We wish the mapping θ ↦ Pθ to be continuous. Continuity means that when a sequence of parameters θt converges to θ, the distributions Pθt also converge to Pθ. The weaker this distance, the easier it is to define a continuous mapping from θ-space to Pθ-space, since it's easier for the distributions to converge (see the example below).
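Example 1 from [4] illustrates this: let Z ~ U[0, 1], let P_0 be the distribution of (0, Z) (a vertical line segment at x = 0) and P_θ the distribution of (θ, Z). Then

\[
W(P_0, P_\theta) = |\theta|,
\qquad
\mathrm{JSD}(P_0 \,\|\, P_\theta) = \begin{cases} \log 2 & \theta \neq 0 \\ 0 & \theta = 0 \end{cases},
\qquad
D_{\mathrm{KL}}(P_\theta \,\|\, P_0) = \begin{cases} +\infty & \theta \neq 0 \\ 0 & \theta = 0 \end{cases}
\]

so θ ↦ Pθ is continuous under the Wasserstein distance but not under the JS or KL divergence.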

  19. Wasserstein (Earth-Mover) Distance. “If each distribution is viewed as a unit amount of "dirt" piled on a given space, the metric is the minimum "cost" of turning one pile into the other, which is assumed to be the amount of dirt that needs to be moved times the distance it has to be moved.”
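Formally, as defined in [4]:

\[
W(P_r, P_g) = \inf_{\gamma \in \Pi(P_r, P_g)} \mathbb{E}_{(x, y) \sim \gamma}\big[\|x - y\|\big],
\]

where Π(P_r, P_g) denotes the set of all joint distributions γ(x, y) whose marginals are P_r and P_g.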

  20. Wasserstein Distance • KL divergence and JS divergence are too strong for the loss function to be continuous. • The Wasserstein distance is a weaker measurement of distance such that: 1. W(Pr, Pθ) is continuous if the generator gθ is continuous. 2. W(Pr, Pθ) is continuous and differentiable almost everywhere if gθ is locally Lipschitz with a finite expectation of the local Lipschitz constants.

  21. Wasserstein Distance

  22. Optimal Transportation View of GAN Brenier potential

  23. Convex Geometry: Minkowski theorem; Alexandrov theorem; Geometric Interpretation of the Optimal Transport Map

  24. Wasserstein distance in WGAN: Kantorovich-Rubinstein Duality (see https://vincentherrmann.github.io/blog/wasserstein/). The dual form below holds when μ and ν have bounded support, where Lip(f) denotes the minimal Lipschitz constant for f.
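The duality, shown on the slide as an image, reads:

\[
W(\mu, \nu) = \sup_{\mathrm{Lip}(f) \le 1} \Big( \mathbb{E}_{x \sim \mu}[f(x)] - \mathbb{E}_{y \sim \nu}[f(y)] \Big)
\]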

  25. Implementation. Compared with the original GAN, WGAN makes four changes (a training-loop sketch follows below):
  - The Discriminator (with sigmoid activation) becomes a Critic f (without sigmoid).
  - The losses drop the log: L_D = E_{x~Pg}[f(x)] - E_{x~Pr}[f(x)] for the critic, L_G = -E_{x~Pg}[f(x)] for the generator.
  - Truncate (clip) the parameters of the Critic (Discriminator) to a fixed range after each update.
  - Do not use momentum when doing gradient descent (e.g., RMSProp instead of Adam).
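A minimal sketch of the resulting training loop, assuming PyTorch; the tiny MLP critic, generator, and random stand-in data are illustrative placeholders, not the setup from the slides or the paper:

```python
import torch
import torch.nn as nn

z_dim, img_dim = 100, 784          # e.g. flattened 28x28 images (placeholder sizes)
clip_value = 0.01                  # critic weights truncated to [-c, c]
n_critic = 5                       # critic updates per generator update

generator = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, img_dim))
critic = nn.Sequential(nn.Linear(img_dim, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
# no sigmoid on the critic output: it is an unbounded score, not a probability

opt_c = torch.optim.RMSprop(critic.parameters(), lr=5e-5)   # no momentum
opt_g = torch.optim.RMSprop(generator.parameters(), lr=5e-5)

data_loader = [torch.randn(64, img_dim) for _ in range(100)]  # stand-in for real data

for real in data_loader:
    # --- critic updates: minimize L_D = E[f(fake)] - E[f(real)] ---
    # (for brevity the same real batch is reused for all n_critic steps)
    for _ in range(n_critic):
        z = torch.randn(real.size(0), z_dim)
        fake = generator(z).detach()
        loss_c = critic(fake).mean() - critic(real).mean()
        opt_c.zero_grad(); loss_c.backward(); opt_c.step()
        for p in critic.parameters():             # truncation of critic parameters
            p.data.clamp_(-clip_value, clip_value)

    # --- generator update: minimize L_G = -E[f(fake)] ---
    z = torch.randn(real.size(0), z_dim)
    loss_g = -critic(generator(z)).mean()
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```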

  26. Experiments

  27. References
[1] Ian J. Goodfellow et al. Generative Adversarial Nets.
[2] Diederik P. Kingma and Max Welling. Auto-Encoding Variational Bayes.
[3] Martin Arjovsky and Léon Bottou. Towards Principled Methods for Training Generative Adversarial Networks.
[4] Martin Arjovsky, Soumith Chintala, and Léon Bottou. Wasserstein GAN.
[5] Na Lei, Kehua Su, Li Cui, Shing-Tung Yau, and David Xianfeng Gu. A Geometric View of Optimal Transportation and Generative Model.
