Illustration: StyleGAN trained on Portrait by Yuli-Ban CMP722 ADVANCED COMPUTER VISION Lecture #8 – Image Synthesis Aykut Erdem // Hacettepe University // Spring 2019
Image credit: Three Robots (Love, Death & Robots, 2019) Previously on CMP722 • imitation learning • reinforcement learning • why vision? • connecting language and vision to actions • case study: embodied QA
Lecture overview • image synthesis via generative models • conditional generative models • structured vs unstructured prediction • image-to-image translation • generative adversarial networks • cycle-consistent adversarial networks • Disclaimer: Much of the material and slides for this lecture were borrowed from Bill Freeman, Antonio Torralba and Phillip Isola's MIT 6.869 class
Image classification Classifier “Fish” … image X label Y
Image synthesis Generator “Fish” label Y image X
Image synthesis via generative modeling X is high-dimensional! Model of high-dimensional structured data In vision, this is usually what we are interested in!
Generative Model Gaussian noise Synthesized image
Conditional Generative Model “bird” Synthesized image
Conditional Generative Model “A yellow bird on a branch” Synthesized image
Conditional Generative Model Synthesized image
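In distributional terms, the models on the last few slides differ only in what they condition on: nothing (pure noise), a class label, a caption, or another input such as an image. A schematic summary (notation is illustrative, not taken from the slides):
$$x \sim p_\theta(x \mid z) \quad \text{vs.} \quad x \sim p_\theta(x \mid z, c), \qquad z \sim \mathcal{N}(0, I),$$
where $c$ is the conditioning signal (label, caption, or image).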
Data prediction problems ("structured prediction") • Semantic segmentation [Long et al. 2015, …] • Edge detection [Xie et al. 2015, …] • Future frame prediction [Mathieu et al. 2016, …] • Text-to-photo ("this small bird has a pink breast and crown…") [Reed et al. 2014, …]
What’s the object class of the center pixel? “Bird” “Bird” “Sky” “Sky” Each prediction is done independently!
Independent prediction: Input → per-pixel labels. Better: find a configuration of compatible labels. ["Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials", Krahenbuhl and Koltun, NIPS 2011]
Structured prediction Define an objective that penalizes bad structure! (e.g., a graphical model)
Unstructured prediction All learning objectives we have seen in this class so far had this form! Per-datapoint least-squares regression: Per-pixel softmax regression:
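For reference, the two objectives named on the slide have the standard forms (a reconstruction; symbols are illustrative):
$$\mathcal{L}_{\text{LS}} = \sum_i \tfrac{1}{2}\,\big\| f_\theta(x_i) - y_i \big\|_2^2, \qquad \mathcal{L}_{\text{softmax}} = -\sum_i \sum_{h,w} \log p_\theta\big(y_i^{(h,w)} \mid x_i\big).$$
Both factor into independent per-datapoint (or per-pixel) terms; nothing couples different output dimensions, which is exactly what makes them "unstructured".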
Structured prediction with a CRF
Structured prediction with a generative model Model joint configuration of all outputs y
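For concreteness, a typical CRF objective in the spirit of the dense CRF cited above combines unary scores with pairwise compatibility terms (notation is illustrative, not taken from the slides):
$$E(y \mid x) = \sum_i \psi_u(y_i \mid x) + \sum_{i<j} \psi_p(y_i, y_j \mid x),$$
whereas a generative model instead learns the full joint distribution $p(y \mid x)$ over all output pixels at once.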
Challenges in visual prediction 1. Output is a high-dimensional, structured object 2. Uncertainty in the mapping, many plausible outputs
Properties of generative models 1. Model high-dimensional, structured output 2. Model uncertainty; a whole distribution of possible outputs
Image-to-Image Translation: Input → Neural Network → Output, trained on paired training data with an objective function (loss).
Image-to-Image Translation: Input → Output. "What should I do?" "How should I do it?"
Designing loss functions: Input, Output, Ground truth.
$$\mathcal{L}_2(\hat{Y}, Y) = \frac{1}{2} \sum_{h,w} \big\| \hat{Y}_{h,w} - Y_{h,w} \big\|_2^2$$
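A minimal sketch of this per-pixel L2 objective in PyTorch; the function name and the `generator` used in the usage comment are placeholders, not code from the lecture:

```python
import torch


def l2_pixel_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """1/2 * sum of squared per-pixel differences, averaged over the batch."""
    return 0.5 * ((pred - target) ** 2).flatten(1).sum(dim=1).mean()


# usage (shapes are illustrative):
# pred = generator(x)            # (B, 3, H, W)
# loss = l2_pixel_loss(pred, y)  # scalar
# loss.backward()
```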
Designing loss functions: Input vs. Ground truth [Zhang et al. 2016]. Color distribution cross-entropy loss with colorfulness enhancing term.
Designing loss functions Be careful what you wish for!
Designing loss functions Image colorization L2 regression [Zhang, Isola, Efros, ECCV 2016] Super-resolution L2 regression [Johnson, Alahi, Li, ECCV 2016]
Designing loss functions Image colorization Cross entropy objective, with colorfulness term [Zhang, Isola, Efros, ECCV 2016] Super-resolution Deep feature covariance matching objective [Johnson, Alahi, Li, ECCV 2016]
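The "deep feature covariance matching" objective used for super-resolution matches second-order statistics of pretrained-network features rather than raw pixels. A rough sketch with a torchvision VGG backbone; the layer choice, normalization, and weighting are assumptions, not taken from Johnson et al.:

```python
import torch
import torchvision.models as models

# Frozen VGG16 feature extractor (truncated early; input normalization omitted for brevity).
vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features[:16].eval()
for p in vgg.parameters():
    p.requires_grad_(False)


def gram(feat: torch.Tensor) -> torch.Tensor:
    # feat: (B, C, H, W) -> (B, C, C) channel covariance (Gram) matrix
    b, c, h, w = feat.shape
    f = feat.reshape(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)


def feature_covariance_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # Match feature covariances of the prediction and the ground truth.
    return ((gram(vgg(pred)) - gram(vgg(target))) ** 2).mean()
```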
Universal loss? … …
"Generative Adversarial Networks" (GANs): a classifier is trained to tell generated images from real photos. [Goodfellow, Pouget-Abadie, Mirza, Xu, Warde-Farley, Ozair, Courville, Bengio 2014]
Generator
real or fake? Generator vs. Discriminator: G tries to synthesize fake images that fool D; D tries to identify the fakes.
fake (0.9), real (0.1)
real or fake? G tries to synthesize fake images that fool D.
real or fake? G tries to synthesize fake images that fool the best D.
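The adversarial game sketched across these slides is the standard minimax objective of Goodfellow et al. (2014):
$$\min_G \max_D \; \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big].$$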
Loss Function: from G's perspective, D is a loss function. Rather than being hand-designed, it is learned.
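A minimal sketch of the alternating updates this implies, in PyTorch; G, D, the optimizers, and the data tensor are placeholders, not code from the lecture (the generator uses the common non-saturating loss variant):

```python
import torch
import torch.nn.functional as F


def gan_step(G, D, opt_G, opt_D, real: torch.Tensor, z_dim: int = 128):
    """One alternating update. D is assumed to output raw logits."""
    b = real.size(0)
    z = torch.randn(b, z_dim, device=real.device)
    fake = G(z)

    # Discriminator: push real toward 1, fake toward 0 (detach so G is not updated here).
    d_real = D(real)
    d_fake = D(fake.detach())
    d_loss = F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real)) \
           + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake))
    opt_D.zero_grad()
    d_loss.backward()
    opt_D.step()

    # Generator: fool D, i.e. use D as a learned loss on the synthesized images.
    g_out = D(fake)
    g_loss = F.binary_cross_entropy_with_logits(g_out, torch.ones_like(g_out))
    opt_G.zero_grad()
    g_loss.backward()
    opt_G.step()
    return d_loss.item(), g_loss.item()
```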
real or fake?
real! (“Aquarius”)