Semantic Image Analogy with a Conditional Single-Image GAN. Jiacheng Li, Zhiwei Xiong, Dong Liu, Xuejin Chen, Zheng-Jun Zha. ACM MM 2020. P ⇒ P′ analogous I ⇒ I′
Image Analogy A : A′ :: B : B′ [Figure: example images A, A′, B, B′.] A. Hertzmann, et al. 2001. Image analogies. ACM Trans. Graph.
Semantic Image Analogy P ⇒ P′ :: I ⇒ I′ Segmentation domain: P ⇒ P′. Image domain: I ⇒ I′.
Semantic Image Analogy P ⇒ P′ :: I ⇒ I′ The change P ⇒ P′ is analogous to the change I ⇒ I′.
Conditional GANs (semantic image synthesis): ADE20k, Cityscapes, COCO, CelebA, … Single-Image GANs (in-the-wild images): retargeting, super-resolution, unconditional sampling, … Can we achieve the best of both worlds? T. Park, et al. 2019. Semantic Image Synthesis With Spatially-Adaptive Normalization. In CVPR. T. Shaham, et al. 2019. SinGAN: Learning a Generative Model From a Single Natural Image. In ICCV.
Can we achieve the best of both worlds? Conditional Single-Image GAN: Self-Supervised Training, Semantic Feature Translation (SFT), Loss Terms
Self-Supervised Learning: Alternating Optimization. Sampling Mode: P_source ⇒ P_aug :: I_source ⇒ I_aug. Reconstruction Mode: P_source ⇒ P_source :: I_source ⇒ I_source.
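The alternation between the two modes can be sketched as below. This is only an illustrative sketch, not the authors' training code: the `augment` function (a horizontal flip), the strict even/odd schedule, and the helper name `make_training_pair` are all assumptions introduced here, and no network is trained.

```python
import numpy as np

def augment(arr):
    """Hypothetical augmentation: a horizontal flip (assumed for illustration)."""
    return arr[:, ::-1]

def make_training_pair(p_source, i_source, step):
    """Return (mode, P_in, P_out, I_in, I_out) for one training step.

    Even steps use sampling mode:       P_source => P_aug    :: I_source => I_aug
    Odd steps use reconstruction mode:  P_source => P_source :: I_source => I_source
    """
    if step % 2 == 0:
        return "sampling", p_source, augment(p_source), i_source, augment(i_source)
    return "reconstruction", p_source, p_source, i_source, i_source

# Toy segmentation map (H x W) and image (H x W x C).
p_source = np.array([[0, 0, 1], [0, 1, 1]])
i_source = np.arange(18, dtype=float).reshape(2, 3, 3)

# Modes alternate step by step.
modes = [make_training_pair(p_source, i_source, s)[0] for s in range(4)]
```

The same augmentation is applied to the segmentation map and the image so that the pair stays aligned, which is what lets a single image supervise itself.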
Self-Supervised Learning: Reconstruction Mode. P_source ⇒ P_source :: I_source ⇒ I_source. [Diagram: the encoder E_seg (shared weights) maps P_source to both F_source and F_aug; the SFT module maps (γ_seg, β_seg) to (γ_img, β_img); the generator G maps I_source to I_source.]
Semantic Feature Translation (SFT) Module. Segmentation features to transformation parameters: γ_seg^l ≈ F_scale^l = F_aug^l / F_source^l, β_seg^l ≈ F_shift^l = F_aug^l − F_source^l. SFT block (Linear): (γ_seg^l, β_seg^l) ⇒ (γ_img^l, β_img^l). Image features: F_img^l ⊙ γ_img^l ⊕ β_img^l.
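A rough NumPy sketch of the scale-and-shift idea on this slide follows. It assumes channel-last feature maps, strictly positive segmentation features (so the division is safe), and a single linear map per parameter inside the SFT block; the weights `w_gamma` and `w_beta`, the `eps` term, and all function names are hypothetical, not taken from the authors' code.

```python
import numpy as np

def seg_transform_params(f_source, f_aug, eps=1e-6):
    """Scale and shift between source and augmented segmentation features."""
    gamma_seg = f_aug / (f_source + eps)   # F_scale = F_aug / F_source (eps is an assumption)
    beta_seg = f_aug - f_source            # F_shift = F_aug - F_source
    return gamma_seg, beta_seg

def sft_block(gamma_seg, beta_seg, w_gamma, w_beta):
    """Hypothetical SFT block: one linear map over channels per parameter."""
    return gamma_seg @ w_gamma, beta_seg @ w_beta

def modulate(f_img, gamma_img, beta_img):
    """Apply the translated parameters to image features: F_img * gamma + beta."""
    return f_img * gamma_img + beta_img

rng = np.random.default_rng(0)
h, w, c = 4, 4, 8
f_source = rng.uniform(0.5, 1.5, size=(h, w, c))
f_img = rng.normal(size=(h, w, c))
identity = np.eye(c)  # stand-in for learned linear weights

# When F_aug equals F_source, gamma_seg is ~1 and beta_seg is 0,
# so (with identity weights) the image features pass through unchanged.
gamma_seg, beta_seg = seg_transform_params(f_source, f_source)
gamma_img, beta_img = sft_block(gamma_seg, beta_seg, identity, identity)
f_out = modulate(f_img, gamma_img, beta_img)
```

With an unchanged layout the modulation reduces to the identity, which is the behaviour the fixed-point condition (γ_img → 1, β_img → 0) asks the learned block to keep.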
Loss Terms. Goals for I_target: homogeneous appearance, aligned semantic layout. [Diagram: the encoder E_seg (shared weights) maps P_source to F_source and P_aug to F_aug; the SFT module maps (γ_seg, β_seg) to (γ_img, β_img); the generator G maps I_source to I_target.]
Loss Terms: Patch Coherence Loss. (1/N) Σ_{V ⊂ I_target} min_{U ⊂ I_source} d(V, U), where V is a patch of I_target and U is a patch of I_source. [Diagram: E_seg (shared weights), SFT, and G as on the previous slide.]
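The patch coherence formula can be tried out with a brute-force sketch like the one below. The slide does not say which distance d is used or how patches are sampled, so the mean squared difference, the dense overlapping patches, and the 3 × 3 patch size are assumptions made here for illustration.

```python
import numpy as np

def extract_patches(img, size):
    """Collect all overlapping size x size patches of an H x W x C image as flat vectors."""
    h, w = img.shape[:2]
    return np.stack([
        img[y:y + size, x:x + size].ravel()
        for y in range(h - size + 1)
        for x in range(w - size + 1)
    ])

def patch_coherence_loss(i_source, i_target, size=3):
    """(1/N) * sum over target patches V of min over source patches U of d(V, U)."""
    u = extract_patches(i_source, size)  # (M, D) source patches
    v = extract_patches(i_target, size)  # (N, D) target patches
    # Pairwise mean squared differences, shape (N, M); this d is an assumed choice.
    d = ((v[:, None, :] - u[None, :, :]) ** 2).mean(axis=2)
    return d.min(axis=1).mean()

rng = np.random.default_rng(0)
i_source = rng.uniform(size=(8, 8, 3))

loss_same = patch_coherence_loss(i_source, i_source)              # every patch matches itself
loss_crop = patch_coherence_loss(i_source, i_source[2:7, 1:6])    # crop reuses source patches
loss_noise = patch_coherence_loss(i_source, rng.uniform(size=(8, 8, 3)))
```

The loss is zero whenever every target patch already exists in the source, regardless of where it is placed, which is why it can encourage a homogeneous appearance without fixing the layout.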
Loss Terms: Semantic Alignment Loss (P_aug vs. P_predict from the Segmentation Network S), GAN Loss and Feature Matching Loss (discriminator D, real / fake), Patch Coherence Loss. [Diagram: E_seg (shared weights), SFT, and G; images I_source, I_aug, I_target.]
Loss Terms: Fixed-Point Loss (γ_img → 1, β_img → 0), Reconstruction Loss, GAN Loss (discriminator D, real / fake). [Diagram: E_seg (shared weights), SFT, and G; images I_source, I_target.]
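As a small illustration of the fixed-point condition, the sketch below penalises any deviation of γ_img from 1 and of β_img from 0. The slide gives only the targets, so the L1 form of the penalty, the mean squared reconstruction error, and both function names are assumptions.

```python
import numpy as np

def fixed_point_loss(gamma_img, beta_img):
    """L1 penalty pulling gamma_img toward 1 and beta_img toward 0 (norm is assumed)."""
    return np.abs(gamma_img - 1.0).mean() + np.abs(beta_img).mean()

def reconstruction_loss(i_source, i_reconstructed):
    """Mean squared error between the source and its reconstruction (assumed choice)."""
    return ((i_source - i_reconstructed) ** 2).mean()

shape = (4, 4, 8)
identity_loss = fixed_point_loss(np.ones(shape), np.zeros(shape))
off_loss = fixed_point_loss(np.full(shape, 1.5), np.full(shape, -0.25))
```

Identity parameters give a loss of zero, while a scale of 1.5 and a shift of -0.25 are penalised by 0.5 + 0.25.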
Evaluation
User Study Interface. Please rank A, B and C by appearance similarity with the left side image. A. Hertzmann, et al. 2001. Image analogies. ACM Trans. Graph. J. Liao, et al. 2017. Visual attribute transfer through deep image analogy. ACM Trans. Graph.
Quantitative Comparisons [Charts: user-study rankings (Rank #1, Rank #2, Rank #3) for IA, DIA, and Ours; Mean IoU and Pixel-wise Accuracy for IA, DIA, and Ours.] A. Hertzmann, et al. 2001. Image analogies. ACM Trans. Graph. J. Liao, et al. 2017. Visual attribute transfer through deep image analogy. ACM Trans. Graph.
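For readers unfamiliar with the two metrics named on this slide, the sketch below computes pixel-wise accuracy and mean IoU between two label maps. How the slide's numbers were obtained (which maps are compared, which classes are counted) is not stated, so the toy inputs and the choice to skip classes absent from both maps are assumptions.

```python
import numpy as np

def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted label equals the target label."""
    return float((pred == target).mean())

def mean_iou(pred, target, num_classes):
    """Intersection over union per class, averaged over classes present in either map."""
    ious = []
    for c in range(num_classes):
        pred_c, target_c = pred == c, target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class absent from both maps, skipped by assumption
        ious.append(np.logical_and(pred_c, target_c).sum() / union)
    return float(np.mean(ious))

# Toy 2 x 4 label maps with two classes; one pixel is mislabelled.
target = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1]])
pred = np.array([[0, 0, 0, 1],
                 [0, 0, 1, 1]])

acc = pixel_accuracy(pred, target)            # 7 of 8 pixels agree
miou = mean_iou(pred, target, num_classes=2)  # mean of 4/5 and 3/4
```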
Comparisons with Previous Image Analogies [Figure labels: Source, Target, Target Layout, IA, DIA, Ours.] A. Hertzmann, et al. 2001. Image analogies. ACM Trans. Graph. J. Liao, et al. 2017. Visual attribute transfer through deep image analogy. ACM Trans. Graph.
Comparisons with Single-Image GANs [Figure labels: Source, Edited Source, Target Layout, IA, SinGAN, Ours.] A. Hertzmann, et al. 2001. Image analogies. ACM Trans. Graph. T. Shaham, et al. 2019. SinGAN: Learning a Generative Model From a Single Natural Image. In ICCV.
Comparisons with Conditional GANs [Figure labels: Source, Target Layout, SPADE, Ours, IA.] A. Hertzmann, et al. 2001. Image analogies. ACM Trans. Graph. T. Park, et al. 2019. Semantic Image Synthesis With Spatially-Adaptive Normalization. In CVPR.
Semantic Manipulation Results [Figure labels: Source, Target #1, Target #2, Target #3.]
Applications
Object Removal Results [Figure labels: P_target, I_target, P_source, I_source.]
Face Editing Results [Figure labels: Source, Target #1, Target #2, Target #3.]
Sketch-to-Image Synthesis Results [Figure labels: P_target, I_target, P_source, I_source.]
Failure Cases [Figure labels: P_target, I_target, P_source, I_source.]
Thank you! P ⇒ P′ analogous I ⇒ I′