
Contrastive Learning for Unpaired Image-to-Image Translation

Taesung Park, Alexei A. Efros, Richard Zhang, Jun-Yan Zhu (UC Berkeley, Adobe Research). ECCV 2020.


  1. Contrastive Learning for Unpaired Image-to-Image Translation. Taesung Park, Alexei A. Efros, Richard Zhang, Jun-Yan Zhu. UC Berkeley, Adobe Research. ECCV 2020.

  2. What is Unpaired Image-to-Image Translation? [Figure panels: Training Set, Test-time behavior]

  3. Cycle-consistency loss: CycleGAN (Zhu et al., ICCV'17), DiscoGAN (Kim et al., ICML'17), DualGAN (Yi et al., ICCV'17). Also used in MUNIT (Huang et al., ECCV'18) and DRIT (Lee et al., ECCV'18).

  4. interchangeable differentiated

  5. invariant sensitive

  6. What makes for a good output? Input (horse) → G → Output (zebra)?

  7. Retaining input content. Input (horse) → G → Output (zebra). [Diagram label: Discriminator]

  8. Retaining input content. Input (horse) → G → Output (zebra). [Diagram labels: Invariant, Sensitive, patch features z] Corresponding patches should have high similarity.

  9. Patch-based Contrastive Loss. Input (horse) → G → Output (zebra). The cosine similarities between the output patch feature z and the input patch features are divided by τ = 0.07 and passed through a softmax, with target 1 for the corresponding patch and 0 for the other patches. InfoNCE loss (Gutmann et al., AISTATS'10; van den Oord et al., 2018), used in MoCo and SimCLR.
     • To produce positive pairs:
       • Handcrafted data augmentation (MoCo, SimCLR, etc.)
       • Input and synthesized image (ours)
     • MoCo: He et al., CVPR'20; SimCLR: Chen et al., ICML'20

  10. Patchwise contrastive loss. [Diagram label: G]

  11. Patchwise contrastive loss: Multilayer, Patchwise Contrastive Loss. [Diagram labels: G_enc, G_dec, G_enc]

  12. Patchwise contrastive loss: Multilayer, Patchwise Contrastive Loss. [Diagram labels: G_enc, G_dec, G_enc]
      + No fixed similarity metric (e.g., L1 or perceptual loss)
      + One-sided (no inverse mapping needed)
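The patchwise loss on the preceding slides can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names `cosine`, `patch_nce_loss` and `multilayer_patch_nce` are hypothetical, features are plain lists of floats, the temperature default follows the 0.07 shown on slide 9, and summing the per-patch losses over locations and layers is a choice made for this sketch. Negatives come only from other locations of the same input image.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def patch_nce_loss(query, positive, negatives, tau=0.07):
    """Cross-entropy of a softmax over similarities / tau,
    with the positive patch as the target class (InfoNCE form)."""
    logits = [cosine(query, positive) / tau]
    logits += [cosine(query, neg) / tau for neg in negatives]
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum_exp - logits[0]

def multilayer_patch_nce(layers, tau=0.07):
    """Sum the patch loss over all locations of all layers.

    `layers` is a list of (output_feats, input_feats) pairs; index s in
    both lists refers to the same spatial location (assumed layout).
    """
    total = 0.0
    for out_feats, in_feats in layers:
        for s, query in enumerate(out_feats):
            # Internal negatives: every other location of the same input.
            negatives = [f for i, f in enumerate(in_feats) if i != s]
            total += patch_nce_loss(query, in_feats[s], negatives, tau)
    return total

# Toy features for one layer with three patch locations.
in_feats = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aligned = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
shuffled = [aligned[1], aligned[2], aligned[0]]
print(multilayer_patch_nce([(aligned, in_feats)]))
print(multilayer_patch_nce([(shuffled, in_feats)]))
```

When output patches stay close to the input patches at the same location, the loss is small; when the correspondence is shuffled, it grows.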

  13. Internal vs External Patches. [Diagram labels: G, Internal Patches]

  14. Internal vs External Patches. [Diagram labels: G, Internal Patches, External Patches] MoCo (He et al., CVPR'20) and SimCLR (Chen et al., ICML'20) use a large set of external images as negative samples. External patches make things worse.

  15. Power of internal patches: Texture Synthesis by Non-parametric Sampling (Efros & Leung, ICCV'99; Efros & Freeman, SIGGRAPH'01); “Zero-Shot” Super-Resolution using Deep Internal Learning (Shocher, Cohen & Irani, CVPR'18).

  16. Internal vs External Patches. [Figure panels: input, internal patches, external patches. Annotation: Mode Collapse!]

  17. Identity Loss Regularization. X → G → G(X): normally, the contrastive loss is applied between X and G(X). See DTN (Taigman et al., ICLR'17), CycleGAN (Zhu et al., ICCV'17).

  18. Identity Loss Regularization. X → G → G(X): normally, the contrastive loss is applied between X and G(X). Y → G → G(Y): identity loss regularization applies the contrastive loss between Y and G(Y). See DTN (Taigman et al., ICLR'17), CycleGAN (Zhu et al., ICCV'17).

  19. CUT (Contrastive Unpaired Translation) and FastCUT. Contrastive loss weight: λ_X = 1 (CUT), λ_X = 10 (FastCUT). Identity loss regularization. Conservative, flexible, faster than CycleGAN. FastCUT: even faster than CUT.
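How the two variants weight their loss terms can be sketched as a small configuration. This is a hedged illustration: the names `LossWeights`, `CUT`, `FAST_CUT` and `generator_objective` are hypothetical, the contrastive weights 1 and 10 are the ones on slide 19, and the identity weights (1 for CUT, 0 for FastCUT) are taken from the CUT paper rather than from the slide text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LossWeights:
    lambda_x: float  # weight of the contrastive loss between X and G(X)
    lambda_y: float  # weight of the identity contrastive loss between Y and G(Y)

# Assumed settings: contrastive weights from slide 19, identity weights from the paper.
CUT = LossWeights(lambda_x=1.0, lambda_y=1.0)
FAST_CUT = LossWeights(lambda_x=10.0, lambda_y=0.0)

def generator_objective(gan_loss, nce_x, nce_y, weights):
    """Adversarial loss plus the weighted contrastive terms."""
    return gan_loss + weights.lambda_x * nce_x + weights.lambda_y * nce_y

# Placeholder loss values, for illustration only.
print(generator_objective(0.5, 0.2, 0.3, CUT))
print(generator_objective(0.5, 0.2, 0.3, FAST_CUT))
```

With a zero identity weight, the term for Y and G(Y) does not contribute, so a training loop could skip computing G(Y) altogether; that is one plausible reading of why FastCUT is the lighter variant.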

  20. Lighter Footprint. [Bar chart: training time (sec/iter, lower is better) for CycleGAN, CUT and FastCUT. Annotation: < 0.5x]

  21. Lighter Footprint. [Bar chart: training time (sec/iter, lower is better) for CycleGAN, CUT, FastCUT, MUNIT and DRIT. Annotation: < 0.5x]

  22. Lighter Footprint. [Bar chart: training time (sec/iter, lower is better) for CycleGAN, GcGAN, CUT, FastCUT, MUNIT, DRIT, DistanceGAN and Self-DistGAN. Annotation: < 0.5x]

  23. [Figure panels: Input, CUT, FastCUT, CycleGAN, MUNIT, DRIT, DistanceGAN, GcGAN]

  24. Dealing with Dataset Bias. Source training set: horse 17.9%. Target training set: zebra 36.8%.

  25. Dealing with Dataset Bias. [Figure panels: Source training set, Target training set, Input, CUT, FastCUT, CycleGAN] Detected pixels: horse 17.9%; zebra 30.8%, zebra 25.9%, zebra 19.1%; zebra 36.8%.

  26. Cat → Dog, Yosemite Summer → Winter, Apple → Orange, Paris → Burano, GTA → Cityscapes

  27. FID: evaluating the realism of output images (lower is better). [Bar chart: FID on horse2zebra, cityscapes and cat2dog for CycleGAN, MUNIT, DRIT, DistanceGAN, SelfDistanceGAN, GCGAN, CUT and FastCUT]
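FID compares the feature statistics of real and generated images using the Fréchet distance between two Gaussians. The sketch below covers only the one-dimensional special case, where the distance reduces to the squared difference of means plus the squared difference of standard deviations. The actual metric uses multivariate Inception-network features with mean vectors and covariance matrices. The function name `frechet_distance_1d` and the scalar feature lists are hypothetical.

```python
from statistics import fmean, pstdev

def frechet_distance_1d(real_feats, fake_feats):
    """Fréchet distance between 1-D Gaussians fitted to two samples."""
    mu_r, mu_g = fmean(real_feats), fmean(fake_feats)
    sd_r, sd_g = pstdev(real_feats), pstdev(fake_feats)
    return (mu_r - mu_g) ** 2 + (sd_r - sd_g) ** 2

# Made-up scalar features, for illustration only.
real = [0.0, 2.0]
fake = [1.0, 3.0]
print(frechet_distance_1d(real, real))  # identical statistics
print(frechet_distance_1d(real, fake))  # shifted mean, same spread
```

Lower values mean the generated feature statistics sit closer to the real ones, which is why the chart is read as "lower is better".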

  28. Segmentation Score: evaluating correspondences. Mean Intersection-over-Union (%), higher is better. [Diagram labels: I2I Model, Segmenter, % match; bar chart]
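Mean Intersection-over-Union can be sketched as follows. This is a minimal illustration, assuming flattened label maps given as lists of integer class ids; the function name `mean_iou` and the toy labels are hypothetical, and skipping classes that appear in neither map is a convention chosen for this sketch.

```python
def mean_iou(pred, target, num_classes):
    """Mean Intersection-over-Union over classes present in either map."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0

# Toy example: labels predicted by a segmenter on a translated image
# versus the ground-truth labels of the corresponding input.
predicted = [0, 0, 1, 1]
ground_truth = [0, 1, 1, 1]
print(mean_iou(predicted, ground_truth, num_classes=2))
```

A translation that keeps objects where they were in the input lets the segmenter recover the original labels, which raises the score.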

  29. Single Image Translation. [Diagram labels: Claude Monet's painting, G] Internal contrastive loss is well-suited for single image translation. Also see InGAN (Shocher et al., ICCV'19), SinGAN (Shaham et al., ICCV'19).

  30. Single Image Translation. [Diagram labels: Reference photo, Claude Monet's painting, G] Internal contrastive loss is well-suited for single image translation. Also see InGAN (Shocher et al., ICCV'19), SinGAN (Shaham et al., ICCV'19).

  31. Single Image Translation. [Diagram labels: Reference photo, Claude Monet's painting, D, G] Internal contrastive loss is well-suited for single image translation. Also see InGAN (Shocher et al., ICCV'19), SinGAN (Shaham et al., ICCV'19).

  32. Painting

  33. Reference Photo, Painting

  34. Reference Photo, Painting, Gatys et al. (CVPR'16)

  35. Reference Photo, Painting, STROTSS (Kolkin et al., CVPR'19)

  36. Reference Photo, WCT² (Yoo et al., ICCV'19), Painting

  37. Reference Photo Painting Our translation result

  38. Reference Photo Painting CycleGAN

  39. Painting

  40. Reference Photo, Gatys et al. (CVPR'16), Painting

  41. Reference Photo, STROTSS (Kolkin et al., CVPR'19), Painting

  42. Reference Photo, WCT² (Yoo et al., ICCV'19), Painting

  43. Reference Photo Ours Painting

  44. Reference Photo CycleGAN Painting

  45. Reference Photo Our translation result Painting

  46. Reference Photo Our translation result Painting

  47. Reference Photo Our translation result Painting

  48. Reference Photo Our translation result Painting

  49. Questions or Comments?

  50. Disentanglement? inter-image, intra-image

  51. style, content. MUNIT (Huang, Liu, Belongie, Kautz, ECCV'18)

  52. Structure for each row

  53. Style for each column dark brown, light brown, white white, black uniform spotted striped

  54. Extracting style and structure from an image. [Diagram labels: E, style code, G, structure code]

  55. Extracting style and structure from an image. [Diagram labels: E, style code, G, structure code]

  56. Extracting style and structure from an image. [Diagram labels: E, style code, G, structure code]

  57. Extracting style and structure from an image. Co-occurrence Patch-based Discriminator. [Diagram labels: E, style code, G, structure code]

  58. Reconstruction. Auto-encode. [Diagram labels: D, structure code, E, G, style code]

  59. Reconstruction. [Diagram labels: Auto-encode (D, E, G), Swap (E, G, D)]

  60. Reconstruction. [Diagram labels: Auto-encode (D, E, G), Swap (E, G, D), Reference patches, Real/fake?] Patch co-occurrence discriminator D_patch.

  61. style structure

  62. style structure

  63. Patch Co-Occurrence Discriminator is a Texture Discriminator. What is Texture? “An image that can be represented by first and second-order statistics” (conjecture by Bela Julesz, 1962). [Figure: two textures that differ by first-order statistics]

  64. Patch Co-Occurrence Discriminator is a Texture Discriminator. What is Texture? “An image that can be represented by first and second-order statistics” (conjecture by Bela Julesz, 1962). [Figure: two textures that differ by second-order statistics, each with a co-occurrence table of the left adjacent pixel (dark, bright) against the right pixel (dark, bright)]

  65. Patch Co-Occurrence Discriminator is a Texture Discriminator. What is Texture? “An image that can be represented by first and second-order statistics” (conjecture by Bela Julesz, 1962). Modeling joint probability is (almost) enough to capture texture. [Figure: two textures that differ by third-order statistics]
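The difference between first-order and second-order statistics on these slides can be made concrete with a toy example. This is a sketch under stated assumptions: images are small lists of rows with 0 for dark and 1 for bright, the second-order statistic is restricted to horizontally adjacent pixel pairs (as in the left/right co-occurrence tables on slide 64), and the names `first_order`, `second_order`, `fine_stripes` and `wide_stripes` are hypothetical.

```python
from collections import Counter

def first_order(img):
    """Fraction of pixels taking each value (first-order statistics)."""
    pixels = [p for row in img for p in row]
    counts = Counter(pixels)
    return {value: n / len(pixels) for value, n in counts.items()}

def second_order(img):
    """Joint distribution of (left pixel, right pixel) over horizontal neighbours."""
    pairs = [(row[i], row[i + 1]) for row in img for i in range(len(row) - 1)]
    counts = Counter(pairs)
    return {pair: n / len(pairs) for pair, n in counts.items()}

# 0 = dark, 1 = bright. Both textures are half dark and half bright.
fine_stripes = [[0, 1, 0, 1, 0, 1, 0, 1]] * 4
wide_stripes = [[0, 0, 1, 1, 0, 0, 1, 1]] * 4

print(first_order(fine_stripes) == first_order(wide_stripes))
print(second_order(fine_stripes))
print(second_order(wide_stripes))
```

The two textures cannot be told apart by how many dark and bright pixels they contain, but their neighbour co-occurrence tables differ: the fine stripes never place two equal pixels side by side, while the wide stripes do.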

  66. Patch Co-Occurrence Discriminator is a Texture Discriminator D( , ) =

  67. Patch Co-Occurrence Discriminator is a Texture Discriminator D( , ) = Different Image

  68. Patch Co-Occurrence Discriminator is a Texture Discriminator D( , ) =

  69. Patch Co-Occurrence Discriminator is a Texture Discriminator D( , ) = Same Image

  70. Patch Co-Occurrence Discriminator is a Texture Discriminator D( , ) =
