Unpaired Image-to-Image Translation Taesung Park Alexei A. Efros - PowerPoint PPT Presentation

Contrastive Learning for Unpaired Image-to-Image Translation. Taesung Park, Alexei A. Efros, Richard Zhang, Jun-Yan Zhu. UC Berkeley, Adobe Research. ECCV 2020.

What is Unpaired Image-to-Image Translation? [Figure: training set; test-time behavior]

Cycle-consistency loss: CycleGAN (Zhu et al., ICCV'17), DiscoGAN (Kim et al., ICML'17), DualGAN (Yi et al., ICCV'17). Also used in MUNIT (Huang et al., ECCV'18) and DRIT (Lee et al., ECCV'18).

interchangeable differentiated

invariant sensitive

What makes for a good output? Input (horse) → G → Output (zebra)?

Retaining input content. Input (horse) → G → Output (zebra), with a discriminator on the output. Invariant / Sensitive. [Figure: patch features z for an output patch, its corresponding input patch, and other input patches] Corresponding patches should have high similarity.

Patch-based Contrastive Loss. Input (horse) → G → Output (zebra). The cosine similarities between the output patch feature z and the input patch features are divided by τ and passed through a softmax; the target is 1 for the corresponding (positive) patch and 0 for every other (negative) patch. τ = 0.07. InfoNCE loss (Gutmann et al., AISTATS'10; van den Oord et al., 2018), used in MoCo and SimCLR.
• To produce positive pairs:
• Handcrafted data augmentation (MoCo, SimCLR, etc.)
• Input and synthesized image (ours)
MoCo: He et al., CVPR'20; SimCLR: Chen et al., ICML'20
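
The softmax-over-similarities step on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `patch_nce_loss` and the argument layout are assumptions made for the example, and only the temperature τ = 0.07 comes from the slide.

```python
import numpy as np

def patch_nce_loss(query, positive, negatives, tau=0.07):
    """InfoNCE loss for a single query patch feature (illustrative sketch).

    query:     (d,)   feature of an output patch
    positive:  (d,)   feature of the corresponding input patch
    negatives: (n, d) features of other input patches
    """
    def normalize(v):
        return v / np.linalg.norm(v, axis=-1, keepdims=True)

    q, p, n = normalize(query), normalize(positive), normalize(negatives)
    # Cosine similarities scaled by the temperature; the positive sits at index 0.
    logits = np.concatenate(([q @ p], n @ q)) / tau
    # Cross-entropy against the one-hot target (1 for the positive, 0 for negatives).
    logits = logits - logits.max()  # subtract the max for numerical stability
    return float(-logits[0] + np.log(np.exp(logits).sum()))
```

The loss is small when the query is closer to its positive than to any negative, and grows when a negative is more similar than the positive.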

Multilayer, Patchwise Contrastive Loss. The input passes through G_enc and G_dec; the output is passed through G_enc again, and patch features from several encoder layers are compared. + No fixed similarity metric (e.g., L1 or perceptual loss). + One-sided (no inverse mapping needed).
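
A rough sketch of the multilayer, patchwise idea, assuming the encoder feature maps for the input and the output are already available as arrays. The names `layer_nce` and `multilayer_patch_nce`, the number of sampled patches, and the plain averaging over layers are assumptions for illustration; any projection head applied to the features is omitted.

```python
import numpy as np

def layer_nce(feat_src, feat_out, num_patches, tau, rng):
    """Patchwise InfoNCE for one layer. feat_*: (H, W, C) feature maps."""
    h, w, c = feat_src.shape
    # Sample the same spatial locations in both feature maps.
    idx = rng.choice(h * w, size=min(num_patches, h * w), replace=False)
    src = feat_src.reshape(-1, c)[idx]
    out = feat_out.reshape(-1, c)[idx]
    src = src / np.linalg.norm(src, axis=1, keepdims=True)
    out = out / np.linalg.norm(out, axis=1, keepdims=True)
    # (N, N) similarity matrix: the diagonal holds positives, the rest are
    # internal negatives taken from the same image.
    logits = out @ src.T / tau
    logits -= logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

def multilayer_patch_nce(feats_src, feats_out, num_patches=64, tau=0.07, seed=0):
    """Average the per-layer loss over a list of encoder layers."""
    rng = np.random.default_rng(seed)
    losses = [layer_nce(s, o, num_patches, tau, rng)
              for s, o in zip(feats_src, feats_out)]
    return float(np.mean(losses))
```

Because positives and negatives both come from the same pair of feature maps, no inverse generator and no hand-picked similarity metric are needed.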

Internal vs External Patches. Internal patches: negatives come from the input image itself. External patches: MoCo (He et al., CVPR'20) and SimCLR (Chen et al., ICML'20) use a large set of external images as negative samples. External patches make things worse.

Power of Internal Patches. Texture Synthesis by Non-parametric Sampling (Efros & Leung, ICCV'99; Efros & Freeman, SIGGRAPH'01). "Zero-Shot" Super-Resolution using Deep Internal Learning (Shocher, Cohen & Irani, CVPR'18).

Internal vs External Patches. [Figure: input; result with internal patches; result with external patches] Mode collapse!

Identity Loss Regularization. Normally: contrastive loss between X and G(X). Identity loss regularization: contrastive loss between Y and G(Y). DTN (Taigman et al., ICLR'17), CycleGAN (Zhu et al., ICCV'17).

CUT (Contrastive Unpaired Translation) and FastCUT. Contrastive loss weight: λ = 1 for CUT, λ = 10 for FastCUT. Identity loss regularization: used in CUT, not in FastCUT. Conservative, flexible. CUT: faster than CycleGAN. FastCUT: even faster than CUT.
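
The two variants differ only in how the loss terms are weighted, which can be written down as a small configuration sketch. The contrastive weights 1 and 10 come from the slide; the identity weight of 1 for CUT, the class name `LossWeights`, and the function `generator_objective` are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LossWeights:
    lambda_nce: float       # weight on the contrastive loss between X and G(X)
    lambda_identity: float  # weight on the contrastive loss between Y and G(Y)

# Assumed settings: CUT keeps the identity term, FastCUT drops it.
CUT = LossWeights(lambda_nce=1.0, lambda_identity=1.0)
FAST_CUT = LossWeights(lambda_nce=10.0, lambda_identity=0.0)

def generator_objective(gan_loss, nce_x, nce_y, weights):
    """Weighted sum of the adversarial loss and the two contrastive terms."""
    return (gan_loss
            + weights.lambda_nce * nce_x
            + weights.lambda_identity * nce_y)
```

With a zero identity weight, the contrastive term between Y and G(Y) does not need to be computed at all.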

Lighter Footprint. [Chart: training time (sec/iter, lower is better) for CycleGAN, GcGAN, CUT, FastCUT, MUNIT, DRIT, DistanceGAN, and Self-DistGAN; annotation: < 0.5x]

[Figure: input alongside results from CUT, FastCUT, CycleGAN, MUNIT, DRIT, DistanceGAN, and GcGAN]

Dealing with Dataset Bias. Detected pixels: source training set, horse 17.9%; target training set, zebra 36.8%. For the example input (horse 17.9%), detected zebra pixels in the outputs: CUT 30.8%, FastCUT 25.9%, CycleGAN 19.1%.

Cat → Dog, Yosemite Summer → Winter, Apple → Orange, Paris → Burano, GTA → Cityscapes

FID: evaluating the realism of output images (lower is better). [Chart: FID on horse2zebra, cityscapes, and cat2dog for CycleGAN, MUNIT, DRIT, DistanceGAN, SelfDistanceGAN, GCGAN, CUT, and FastCUT]
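
FID is the Fréchet distance between two Gaussians fitted to image features. The sketch below computes that distance for two feature arrays; the feature extractor is omitted, so `feats_a` and `feats_b` are assumed to be precomputed (N, d) arrays, and the function name `frechet_distance` is chosen for this example.

```python
import numpy as np

def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two (N, d) feature sets."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((cov_a cov_b)^(1/2)) equals the sum of square roots of the eigenvalues
    # of cov_a @ cov_b, which are real and non-negative up to rounding error.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(np.sum((mu_a - mu_b) ** 2)
                 + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)
```

Identical feature sets give a distance of zero, and shifting one set moves the distance by the squared norm of the shift, which is why lower values indicate outputs closer to the target distribution.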

Segmentation Score: evaluating correspondences. Mean Intersection-over-Union (%), higher is better. [Diagram: I2I model → segmenter → % match]
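
The mean Intersection-over-Union itself is a short computation once predicted and reference label maps are available. The sketch below assumes integer label maps of equal shape; the function name `mean_iou` and the choice to skip classes absent from both maps are assumptions for this example, and the segmenter that would produce `pred` is not shown.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean Intersection-over-Union in percent for integer label maps."""
    ious = []
    for c in range(num_classes):
        p, t = pred == c, target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent from both maps, so it is skipped
        ious.append(np.logical_and(p, t).sum() / union)
    return 100.0 * float(np.mean(ious))
```

A translation that keeps objects where they were in the input yields label maps that overlap well with the reference, hence a higher score.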

Single Image Translation. Claude Monet painting → G → output; the discriminator D compares the output with a reference photo. Internal contrastive loss is well-suited for single image translation. Also see InGAN (Shocher et al., ICCV'19), SinGAN (Shaham et al., ICCV'19).

[Figure: painting and reference photo, with results from Gatys et al. (CVPR'16), STROTSS (Kolkin et al., CVPR'19), WCT2 (Yoo et al., ICCV'19), CycleGAN, and our translation result]

[Figure: second painting and reference photo, with results from Gatys et al. (CVPR'16), STROTSS (Kolkin et al., CVPR'19), WCT2 (Yoo et al., ICCV'19), CycleGAN, and ours]

Questions or Comments?

Disentanglement? Inter-image, intra-image.

Style, content. MUNIT (Huang, Liu, Belongie, Kautz, ECCV'18)

Structure for each row

Style for each column: dark brown, light brown, white; white, black; uniform, spotted, striped

Extracting style and structure from an image. E: style code and structure code; G. Co-occurrence patch-based discriminator.

Reconstruction: auto-encode with E and G (structure code, style code), judged by D. Swap: E, G, D. Reference patches: real/fake? Patch co-occurrence discriminator D_patch.

style structure

Patch Co-Occurrence Discriminator is a Texture Discriminator. What is Texture? "An image that can be represented by first and second-order statistics." Conjecture by Bela Julesz, 1962. [Figure: two textures that differ by first-order statistics]

Patch Co-Occurrence Discriminator is a Texture Discriminator. What is Texture? "An image that can be represented by first and second-order statistics." Conjecture by Bela Julesz, 1962. [Figure: two textures that differ by second-order statistics, shown as tables of left pixel (dark/bright) against right adjacent pixel (dark/bright)]
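
To make the first-order versus second-order distinction concrete, the sketch below builds two binary textures with the same fraction of bright pixels but different left/right neighbour statistics. The function names and the two toy textures are assumptions chosen for this illustration.

```python
import numpy as np

def first_order(img):
    """First-order statistic: fraction of bright pixels in a binary image."""
    return float(img.mean())

def horizontal_cooccurrence(img):
    """Second-order statistic: 2x2 joint distribution of
    (left pixel, right adjacent pixel) for a binary image."""
    left, right = img[:, :-1].ravel(), img[:, 1:].ravel()
    counts = np.zeros((2, 2))
    np.add.at(counts, (left, right), 1)  # accumulate one count per pixel pair
    return counts / counts.sum()

# Two 4x4 textures, each half dark (0) and half bright (1).
vertical_stripes = np.tile(np.array([0, 1]), (4, 2))  # columns alternate
horizontal_stripes = vertical_stripes.T               # rows alternate
```

Both textures have a bright fraction of 0.5, yet in the vertical stripes a pixel never matches its right neighbour, while in the horizontal stripes it always does. A discriminator that sees only single pixels cannot tell them apart; one that sees pairs can.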

Patch Co-Occurrence Discriminator is a Texture Discriminator. What is Texture? "An image that can be represented by first and second-order statistics." Conjecture by Bela Julesz, 1962. Modeling joint probability is (almost) enough to capture texture. [Figure: two textures that differ by third-order statistics]

Patch Co-Occurrence Discriminator is a Texture Discriminator. [Figure: D applied to a pair of patches outputs "Different Image" or "Same Image"]
