Conditional Adversarial Networks (or “mapping from A to B”) CS448V — Computational Video Manipulation, May 22, 2019
Why? Cool! Trendy! Search Google Scholar for Pix2Pix and CycleGAN: hundreds of applications and follow-up works …
Enhancing Transitions
Single-Photo Facial Animation
Text-based Editing
Few-Shot Reenactment
Digital Humans
Overview • Convolutional Neural Networks • Generative Modeling • Pix2Pix (“mapping from A to B”)
Convolutional Neural Network Components?
• 2D Convolution Layers (Conv2D)
• Subsampling Layers (MaxPool, …)
• Non-linearity Layers (ReLU, …)
• Normalization Layers (BatchNorm, …)
• Upsampling Layers (TransposedConv, …)
• …
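A minimal sketch (PyTorch assumed, layer sizes illustrative and not from the slides) instantiating each component type named above:

```python
import torch
import torch.nn as nn

layers = nn.Sequential(
    nn.Conv2d(3, 6, kernel_size=5),      # 2D convolution layer
    nn.ReLU(),                           # non-linearity
    nn.BatchNorm2d(6),                   # normalization
    nn.MaxPool2d(2),                     # subsampling
    nn.ConvTranspose2d(6, 3, kernel_size=2, stride=2),  # upsampling
)
x = torch.randn(1, 3, 32, 32)            # dummy 32x32 RGB image
print(layers(x).shape)                   # torch.Size([1, 3, 28, 28])
```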
Convolution
Start with a 32x32x3 image (height 32, width 32, depth 3).
Convolve a 5x5x3 filter with the image, i.e., “slide over the image spatially, computing dot products.”
Result at each location: 1 number, the dot product between the filter and a small 5x5x3 chunk of the image, i.e., a 5x5x3 = 75-dimensional dot product plus a bias: wᵀx + b.
Convolving (sliding) over all spatial locations yields a 28x28x1 activation map.
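A minimal sketch of exactly this arithmetic (PyTorch assumed, not from the slides): a 5x5x3 filter slid over a 32x32x3 image gives a (32 − 5 + 1) = 28x28 activation map.

```python
import torch
import torch.nn.functional as F

image = torch.randn(1, 3, 32, 32)   # NCHW: one 32x32x3 image
filt  = torch.randn(1, 3, 5, 5)     # one 5x5x3 filter
bias  = torch.zeros(1)

out = F.conv2d(image, filt, bias)   # each output value is w^T x + b
print(out.shape)                    # torch.Size([1, 1, 28, 28])
```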
Convolution
Invariant to? Rotation? Translation? Scaling?
Because the same filter slides over all spatial locations, the activation map is invariant (more precisely, equivariant) to translation, but not to rotation or scaling.
Convolution Layer
Convolving a bank of filters with the 32x32x3 image (sliding each over all spatial locations) produces an activation tensor: one activation map per filter.
Convolutional Neural Network
Stack convolutions and non-linearities:
32x32x3 image → Convolution (6 filters of 5x5x3) + ReLU → 28x28x6 tensor → Convolution (10 filters of 5x5x6) + ReLU → 24x24x10 tensor → ...
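The same three-stage stack as a hedged, runnable PyTorch sketch (not from the slides):

```python
import torch
import torch.nn as nn

# 32x32x3 -> Conv(6 filters, 5x5x3) + ReLU -> 28x28x6
#         -> Conv(10 filters, 5x5x6) + ReLU -> 24x24x10
net = nn.Sequential(
    nn.Conv2d(3, 6, kernel_size=5), nn.ReLU(),
    nn.Conv2d(6, 10, kernel_size=5), nn.ReLU(),
)
x = torch.randn(1, 3, 32, 32)
print(net(x).shape)  # torch.Size([1, 10, 24, 24])
```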
Convolutional Neural Networks [LeNet-5, LeCun et al., 1998]
Feature Hierarchy Learn the features from data instead of hand-engineering them! (If enough data is available)
U-Net Skip connections: “Propagate low-level features directly, helps with details”
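A minimal sketch of a U-Net-style skip connection (architecture details assumed for illustration, not the pix2pix generator itself): encoder features are concatenated onto the decoder path, so low-level detail bypasses the bottleneck.

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Conv2d(3, 16, 3, stride=2, padding=1)  # downsample
        self.dec = nn.ConvTranspose2d(16, 16, 2, stride=2)   # upsample
        self.out = nn.Conv2d(16 + 3, 3, 3, padding=1)        # after skip concat

    def forward(self, x):
        h = self.dec(torch.relu(self.enc(x)))
        h = torch.cat([h, x], dim=1)  # skip connection: low-level features pass through
        return self.out(h)

print(TinyUNet()(torch.randn(1, 3, 64, 64)).shape)  # torch.Size([1, 3, 64, 64])
```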
Overview • Convolutional Neural Networks • Generative Modeling • Pix2Pix
Generative Modeling
Given training data {y_1, …, y_N}, we want to learn the density function q(X) from data, such that we can “sample from it”: new samples x ~ q(X), i.e., “more of the same!”
Generative 2D Face Modeling
Training data {y_1, …, y_N} of faces; new samples x ~ q(X). The world needs more celebrities … or not … ?
3.5 Years of Progress on Faces
https://thispersondoesnotexist.com 2018
StyleGAN - Interpolation
Overview • Convolutional Neural Networks • Generative Modeling • Pix2Pix (“mapping from A to B”)
Image-to-Image Translation
Image-to-Image Translation
Learn a neural network G from paired training data (input y, ground truth z):
arg min_G 𝔼_{y,z} [ L(G(y), z) ]
Paired! The paired data tells the network “What should I do?”; the loss L tells it “How should I do it?” [Zhang et al., ECCV 2016]
Be careful what you wish for!
L(z, ẑ) = ‖z − ẑ‖₂²
Regression to the mean! The L2 loss averages over all plausible outputs, producing blurry results.
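Why the mean? A one-line worked derivation (standard result, not from the slides): setting the gradient of the expected L2 error to zero shows the optimal single prediction is the average of all plausible outputs; e.g., if black (0) and white (1) are equally plausible, the L2-optimal output is gray (0.5).

```latex
% \nabla_{\hat{z}} \, \mathbb{E}_z\!\left[\lVert z - \hat{z} \rVert_2^2\right]
%   = 2\,(\hat{z} - \mathbb{E}[z]) = 0
\hat{z}^{*} = \arg\min_{\hat{z}} \; \mathbb{E}_z\!\left[\lVert z - \hat{z} \rVert_2^2\right] = \mathbb{E}[z]
```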
Automate Design of the Loss? Deep learning got rid of handcrafted features. Can we also get rid of handcrafting the loss function? Universal loss function?
Discriminator as a Loss Function
A discriminator (classifier) that predicts: Real or Fake?
Conditional GAN
Input y → Generator (G) → Output G(y) → Discriminator (D) → Real or Fake?
G tries to synthesize fake images that fool D; D tries to tell real from fake.
Conditional GAN (Discriminator)
D tries to identify the fakes: D(G(y)) should output “1” (fake), and for the ground truth z, D(z) should output “0” (real).
arg max_D 𝔼_{y,z} [ log D(G(y)) + log(1 − D(z)) ]
Conditional GAN (Generator)
G tries to synthesize fake images that fool D: push D(G(y)) toward “0” (real).
arg min_G 𝔼_{y,z} [ log D(G(y)) + log(1 − D(z)) ]
Conditional GAN
G tries to synthesize fake images that fool the best D:
arg min_G max_D 𝔼_{y,z} [ log D(G(y)) + log(1 − D(z)) ]
Conditional GAN
G’s perspective: D is a loss function. Rather than being hand-designed, it is learned jointly!
Conditional Discriminator
D also sees the input y:
arg min_G max_D 𝔼_{y,z} [ log D(y, G(y)) + log(1 − D(y, z)) ]
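A hedged sketch of one conditional-GAN training step under the slides’ convention (D(y, ·) outputs the probability of “fake” via a final sigmoid: target 1 for G(y), 0 for the real z). The names G, D, opt_G, opt_D are illustrative assumptions, not a fixed API.

```python
import torch
import torch.nn.functional as F

def cgan_step(G, D, y, z, opt_G, opt_D):
    # Discriminator update: maximize log D(y, G(y)) + log(1 - D(y, z)).
    with torch.no_grad():
        fake = G(y)                      # detach G from the D update
    p_fake, p_real = D(y, fake), D(y, z)
    d_loss = F.binary_cross_entropy(p_fake, torch.ones_like(p_fake)) \
           + F.binary_cross_entropy(p_real, torch.zeros_like(p_real))
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()

    # Generator update: fool D by pushing D(y, G(y)) toward 0 ("real").
    p = D(y, G(y))
    g_loss = F.binary_cross_entropy(p, torch.zeros_like(p))
    opt_G.zero_grad(); g_loss.backward(); opt_G.step()
    return d_loss.item(), g_loss.item()
```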
Patch Discriminator “Rather than penalizing if the output image looks fake, penalize if each overlapping patch in the output looks fake”
1x1 Pixel Discriminator
Image Discriminator
70x70 Patch Discriminator
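A hedged sketch of a PatchGAN-style discriminator: a few strided convolutions ending in a 1-channel map, so each output value classifies one overlapping input patch as real or fake. Layer sizes here are illustrative; the actual pix2pix 70x70 discriminator is deeper.

```python
import torch
import torch.nn as nn

patch_D = nn.Sequential(
    nn.Conv2d(6, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),   # input: y and z concatenated
    nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(128, 1, 4, padding=1), nn.Sigmoid(),                 # per-patch real/fake score
)
scores = patch_D(torch.randn(1, 6, 256, 256))
print(scores.shape)  # a grid of patch scores: torch.Size([1, 1, 63, 63])
```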
Conditional Discriminator
L_cGAN(G, D) = 𝔼_{y,z} [ log D(y, G(y)) + log(1 − D(y, z)) ]
Reconstruction Loss (L1)
L_L1(G) = 𝔼_{y,z} [ ‖z − G(y)‖₁ ]   (“Stable training + fast convergence”)
Full objective: G* = arg min_G max_D L_cGAN(G, D) + λ L_L1(G), with λ = 100.
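A hedged sketch of the full pix2pix generator objective under the slides’ notation (input y, ground truth z): the adversarial term plus λ times the L1 reconstruction term.

```python
import torch
import torch.nn.functional as F

def generator_loss(G, D, y, z, lam=100.0):
    fake = G(y)
    p = D(y, fake)                                        # D's "fake" probability
    adv = F.binary_cross_entropy(p, torch.zeros_like(p))  # fool D: push toward "real" (0)
    rec = F.l1_loss(fake, z)                              # L1 pulls output toward GT
    return adv + lam * rec
```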
Ablation Study
Results on the Test Split
Results for Hand Drawings
Demo: Pix2Pix
Limitations
1. Paired data is required
2. Temporally unstable if applied per-frame to a video sequence
3. Does not generalize to 3D transformations
CycleGAN
Cycle Consistency
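A hedged sketch of CycleGAN’s cycle-consistency loss, assuming generators G: A→B and F_gen: B→A (F_gen named to avoid clashing with the functional alias): translating forward and back should reproduce the input, which removes the need for paired data.

```python
import torch.nn.functional as F

def cycle_loss(G, F_gen, a, b):
    # a -> G(a) -> F_gen(G(a)) should recover a, and symmetrically for b.
    return F.l1_loss(F_gen(G(a)), a) + F.l1_loss(G(F_gen(b)), b)
```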
CycleGAN
Recycle-GAN
Limitations (revisited)
1. Paired data is required (addressed by CycleGAN’s unpaired training)
2. Temporally unstable if applied per-frame to a video sequence (addressed next by Vid2Vid)
3. Does not generalize to 3D transformations
Vid2Vid