Representing and Explaining Novel Concepts with Minimal Supervision

Representing and Explaining Novel Concepts with Minimal Supervision. Dr. Zeynep Akata, 2 April 2019

Outline
- Motivating the Importance of Side Information
- (Generalized) Zero-Shot Learning with Side Information
- Deeply Explainable Artificial Intelligence
- Summary and Future Work

Data Distribution in Large-Scale Datasets (Akata et al., TPAMI’14)
[Figure; axis labels: number of images, number of classes]

Attributes as Side Information (Lampert et al., CVPR’09)

class   attributes                                    attribute vector
zebra   black-white, has tail, lives on land, small   [1 0 1 1 0 1]
whale   gray, has tail, lives in water, big           [0 1 1 0 1 0]

Zero-Shot Learning
[Figure: images, each paired with its attributes]
- black-white, has tail, ..., lives on land, small
- black-white, no tail, ..., lives on land, medium
- gray, has tail, ..., lives in water, big
- white, has tail, lives on land, tiny

Multimodal Embeddings (Akata et al., CVPR’13 & TPAMI’16)
[Figure labels: images, image features, class attributes, class labels; examples: zebra, whale, black, white]
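A minimal sketch of how a multimodal embedding could score an image against class attribute vectors, assuming a bilinear compatibility F(x, y) = θ(x)ᵀ W φ(y). The slide does not give the formula, so this form is an assumption here. The 6-dimensional image feature, the identity matrix W, and the function names are hypothetical; in practice W would be learned on seen classes. The attribute vectors are the two shown on the Lampert et al. slide.

```python
import numpy as np

# Class attribute vectors, as on the "Attributes as Side Information" slide.
CLASS_ATTRIBUTES = {
    "zebra": np.array([1, 0, 1, 1, 0, 1], dtype=float),
    "whale": np.array([0, 1, 1, 0, 1, 0], dtype=float),
}

def compatibility(image_feature, class_attributes, W):
    """Bilinear score F(x, y) = theta(x)^T W phi(y) (assumed form)."""
    return float(image_feature @ W @ class_attributes)

def predict(image_feature, W, classes=CLASS_ATTRIBUTES):
    """Return the class whose attribute vector is most compatible with the image."""
    return max(classes, key=lambda name: compatibility(image_feature, classes[name], W))

# Hypothetical setup: W is the identity, so an image feature that already
# resembles an attribute vector is assigned to the matching class.
W = np.eye(6)
zebra_like = np.array([0.9, 0.1, 0.8, 0.9, 0.0, 0.7])
print(predict(zebra_like, W))  # zebra
```

Because prediction only needs an attribute vector per class, a class with no training images can still be scored, which is the point of zero-shot learning with side information.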

Zero-Shot Learning Datasets

Zero-Shot vs Generalized Zero-Shot Learning (Xian et al., CVPR 2017)

                         Zero-Shot Learning   Generalized Zero-Shot Learning
                         CUB     AWA          CUB                  AWA
Method                   u       u            u      s      H      u      s      H
Multimodal Embeddings    54.9    59.9         23.7   62.8   34.4   16.8   76.1   27.5

Supervised Learning: 82.1 (CUB), 96.2 (AWA); the remaining six entries in this row are blank (–).
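In the generalized setting, u and s are the accuracies on unseen and seen classes, and H is their harmonic mean, as defined by Xian et al. The short check below recomputes H for the Multimodal Embeddings row; the function name is only illustrative.

```python
def harmonic_mean(u, s):
    """Harmonic mean of unseen-class accuracy u and seen-class accuracy s."""
    if u + s == 0:
        return 0.0
    return 2 * u * s / (u + s)

# Multimodal Embeddings row of the generalized zero-shot table.
print(round(harmonic_mean(23.7, 62.8), 1))  # CUB: 34.4
print(round(harmonic_mean(16.8, 76.1), 1))  # AWA: 27.5
```

The harmonic mean stays low unless both u and s are high, so a model that does well only on seen classes is penalized.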

Conclusions
Standard image classification models fail when labels are lacking.
1. Zero-shot learning is a challenging task that deserves attention.
2. Side information, e.g. attributes, is required to tackle zero-shot learning.
3. Several sources of side information exist: moving towards free-form text.
Akata et al., IEEE CVPR 2013, 2015, 2016 & IEEE TPAMI 2014, 2016

How to Tackle the Missing Data Problem?
Labels are difficult to obtain; attributes require expert knowledge.
Proposed solution: free text to image synthesis!

Detailed Visual Descriptions (Reed et al., CVPR’16)
[Figure: bird and flower images, each with a free-text description]
- The bird has a white underbelly, black feathers in the wings, a large wingspan, and a white beak.
- This bird has distinctive-looking brown and white stripes all over its body, and its brown tail sticks up.
- This swimming bird has a black crown with a large white strip on its head, and yellow eyes.
- This flower has a central white blossom surrounded by large pointed red petals which are veined and leaflike.
- Light purple petals with orange and black middle green leaves
- This flower is yellow and orange in color, with petals that are ruffled along the edges.

Deep Representations of Text (Reed et al., CVPR’16)
[Figure: sequential encoding and convolutional encoding of the sentence "The beak is yellow and pointed and the wings are blue."]

Text to Image Synthesis
"This large bird has black feet and dark-brown feathers." → ??

GAN¹ Conditioned on Text (Reed et al., ICML’16 & NIPS’16)
Generator network: x := G(z, φ(t)), with z ~ N(0, 1)
Discriminator network: D(x', φ(t))
Example text t: "This flower has small, round violet petals with a dark purple center"
¹ Generative Adversarial Networks [Goodfellow et al., NIPS’14]
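The two formulas on this slide, x := G(z, φ(t)) and D(x', φ(t)), can be sketched as a forward pass. This is only an illustration of the conditioning idea: the layer sizes, the single hidden layer, the random untrained weights, and the random stand-in for the text embedding φ(t) are all assumptions, not the architecture of Reed et al.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the slide does not specify any dimensions.
NOISE_DIM, TEXT_DIM, HIDDEN_DIM, IMAGE_DIM = 16, 32, 64, 8 * 8 * 3

# Randomly initialised (untrained) weights, stand-ins for learned parameters.
G_W1 = rng.normal(0, 0.1, (NOISE_DIM + TEXT_DIM, HIDDEN_DIM))
G_W2 = rng.normal(0, 0.1, (HIDDEN_DIM, IMAGE_DIM))
D_W1 = rng.normal(0, 0.1, (IMAGE_DIM + TEXT_DIM, HIDDEN_DIM))
D_W2 = rng.normal(0, 0.1, (HIDDEN_DIM, 1))

def generator(z, phi_t):
    """x := G(z, phi(t)): map noise plus text embedding to a flat image in [-1, 1]."""
    h = np.maximum(0.0, np.concatenate([z, phi_t]) @ G_W1)  # ReLU hidden layer
    return np.tanh(h @ G_W2)

def discriminator(x, phi_t):
    """D(x, phi(t)): score in (0, 1) for 'x is a real image matching the text'."""
    h = np.maximum(0.0, np.concatenate([x, phi_t]) @ D_W1)
    logit = (h @ D_W2).item()
    return 1.0 / (1.0 + np.exp(-logit))

phi_t = rng.normal(size=TEXT_DIM)   # stand-in for a text-encoder output phi(t)
z = rng.standard_normal(NOISE_DIM)  # z ~ N(0, 1)
x = generator(z, phi_t)
print(x.shape, discriminator(x, phi_t))
```

Both networks receive φ(t), so the discriminator can reject an image that looks realistic but does not match the description.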

Text to Image Synthesis Results
[Figure: for each query, a retrieved image and a generated image]
- a small sized bird that has tones of brown and dark red with a short stout bill
- this is a large black bird with a pointy black beak
- this is a bird with a yellow belly, black head and breast and a black wing
- the bird has a yellow bill, pink webbed feet, a white body with gray wings and gray tail feathers
- a vibrant colored bird of copper color, orange and blue with a very large orange bill
- this bird is all blue, the top part of the bill is blue, but the bottom half is white.

Interpolating Between Sentences
- ‘Blue bird with black beak’ → ‘Red bird with black beak’
- ‘Small blue bird with black wings’ → ‘Small yellow bird with black wings’
- ‘This bird is bright.’ → ‘This bird is dark.’
Further sentences shown on the slide:
- ‘This bird is completely red with black wings’
- ‘A small sized bird that has a cream belly and a short pointed bill’
- ‘This is a yellow bird. The wings are bright blue’
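One way to read the sentence pairs on this slide is as endpoints of a path in text-embedding space. The sketch below assumes plain linear interpolation between two embeddings; the slide does not state the interpolation scheme, and the random vectors are stand-ins for real text-encoder outputs.

```python
import numpy as np

def interpolate_embeddings(phi_a, phi_b, steps):
    """Return `steps` embeddings moving linearly from phi_a to phi_b (inclusive)."""
    alphas = np.linspace(0.0, 1.0, steps)
    return [(1.0 - a) * phi_a + a * phi_b for a in alphas]

rng = np.random.default_rng(0)
phi_blue = rng.normal(size=32)  # stand-in for phi('Blue bird with black beak')
phi_red = rng.normal(size=32)   # stand-in for phi('Red bird with black beak')

path = interpolate_embeddings(phi_blue, phi_red, steps=5)
# Each embedding on the path would be passed to the generator with a fixed z.
print(len(path))  # 5
```

Holding the noise vector fixed while moving along the path isolates the effect of the text on the generated image.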

Generalized Zero-Shot Learning with Synthesized Images

CUB
Data                     u      s      H
Only real data           23.7   62.8   34.4
With generated images    23.8   48.5   31.9

This is not better than having no images!

f-CLSWGAN for Text to Image Feature Synthesis (Xian et al., CVPR’18)
[Figure: a CNN maps real images of seen classes from image space to real CNN features x in ResNet space; for unseen classes, f-CLSWGAN generates synthetic CNN features G(z, a) with z ~ N(0, 1)]
Example attributes a: Head color: red; Back color: black; Crown color: red; Wing shape: short
Example description: "This is a small bird with a brown head and a yellow belly."
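The idea on this slide, generating CNN features G(z, a) for unseen classes and then training an ordinary classifier on real and synthetic features together, can be sketched as follows. Everything concrete here is an assumption for illustration: the generator is a fixed random linear map standing in for a trained f-CLSWGAN, the "real" seen-class features are drawn from the same stand-in, and a nearest-centroid classifier replaces whatever classifier the authors used.

```python
import numpy as np

rng = np.random.default_rng(0)
ATTR_DIM, NOISE_DIM, FEAT_DIM = 4, 8, 16  # hypothetical sizes

# Stand-in for a trained generator G(z, a): a fixed random linear map.
G_A = rng.normal(size=(ATTR_DIM, FEAT_DIM))
G_Z = rng.normal(scale=0.1, size=(NOISE_DIM, FEAT_DIM))

def generate_features(attributes, n):
    """Draw n synthetic features x = G(z, a) with z ~ N(0, 1)."""
    z = rng.standard_normal((n, NOISE_DIM))
    return attributes @ G_A + z @ G_Z

def fit_centroids(features, labels):
    """Nearest-centroid classifier: one mean feature vector per class."""
    return {int(c): features[labels == c].mean(axis=0) for c in np.unique(labels)}

def classify(x, centroids):
    """Return the class whose centroid is closest to x."""
    return min(centroids, key=lambda c: np.linalg.norm(x - centroids[c]))

# Class 0 is "seen" (features available); class 1 is "unseen" (attributes only).
attrs = {0: np.array([1.0, 0.0, 1.0, 0.0]), 1: np.array([0.0, 1.0, 0.0, 1.0])}
real_seen = generate_features(attrs[0], 50)     # toy stand-in for real CNN features
synth_unseen = generate_features(attrs[1], 50)  # synthetic features for the unseen class

features = np.vstack([real_seen, synth_unseen])
labels = np.array([0] * 50 + [1] * 50)
centroids = fit_centroids(features, labels)

test_unseen = generate_features(attrs[1], 1)[0]
print(classify(test_unseen, centroids))
```

Once synthetic features exist for unseen classes, generalized zero-shot learning reduces to standard supervised classification over all classes.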

Generalized Zero-Shot Learning with Synthesized Image Features

CUB
Data                                   u      s      H
Only real data                         23.7   62.8   34.4
With generated images                  23.8   48.5   31.9
With generated features (f-CLSWGAN)    43.7   57.7   49.7

CADA-VAE for Text to Image Feature Synthesis (Schönfeld et al., CVPR’19)
[Figure: two encoder/decoder pairs, E1/D1 and E2/D2; example attributes: red head, pink belly, brown wings, gray beak. Variants shown: CADA-VAE, DA-VAE, CA-VAE. The equations in the figure are the cross-reconstruction loss; the basic VAE loss is not shown.]
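The figure notes mention a cross-reconstruction loss between the two encoder/decoder pairs. A minimal sketch of that idea: encode one modality, decode it with the other modality's decoder, and penalize the reconstruction error. The linear untrained encoders and decoders, the deterministic latent code (no VAE sampling), the L1 distance, and all dimensions are assumptions made for this illustration; the slide itself does not give the formula.

```python
import numpy as np

rng = np.random.default_rng(0)
IMG_DIM, ATTR_DIM, LATENT_DIM = 16, 6, 4  # hypothetical sizes

# Untrained linear stand-ins for encoders E1, E2 and decoders D1, D2.
E1 = rng.normal(scale=0.1, size=(IMG_DIM, LATENT_DIM))   # image features -> latent
E2 = rng.normal(scale=0.1, size=(ATTR_DIM, LATENT_DIM))  # attributes -> latent
D1 = rng.normal(scale=0.1, size=(LATENT_DIM, IMG_DIM))   # latent -> image features
D2 = rng.normal(scale=0.1, size=(LATENT_DIM, ATTR_DIM))  # latent -> attributes

def cross_reconstruction_loss(x_img, x_attr):
    """L1 error when each modality is decoded from the other modality's latent code."""
    img_from_attr = (x_attr @ E2) @ D1  # attributes -> latent -> image features
    attr_from_img = (x_img @ E1) @ D2   # image features -> latent -> attributes
    return float(np.abs(x_img - img_from_attr).sum() + np.abs(x_attr - attr_from_img).sum())

x_img = rng.normal(size=IMG_DIM)
x_attr = np.array([1, 0, 1, 1, 0, 1], dtype=float)
print(cross_reconstruction_loss(x_img, x_attr))
```

Minimizing a loss of this kind pushes both encoders towards a shared latent space, since each decoder has to work on codes produced by the other encoder.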

Generalized Zero-Shot Learning with Synthesized Image Features

CUB
Data                                   u      s      H
Only real data                         23.7   62.8   34.4
With generated images                  23.8   48.5   31.9
With generated features (f-CLSWGAN)    43.7   57.7   49.7
With generated features (CADA-VAE)     63.6   51.6   52.4

f-VAEGAN-D2 for Text to Image Feature Synthesis (Xian et al., CVPR’19)
[Figure: example class "Cape May Warbler"]
- Seen Feature Reconstruction (f-VAE): Encoder (E) and Decoder/Generator (G)
- Novel Feature Generation (f-WGAN): Decoder/Generator (G) and Discriminator1 (D1)
- Transductive Learning (D2): Discriminator2 (D2)

Generalized Zero-Shot Learning with Synthesized Image Features

CUB
Data                                     u      s      H
Only real data                           23.7   62.8   34.4
With generated images                    23.8   48.5   31.9
With generated features (f-CLSWGAN)      43.7   57.7   49.7
With generated features (CADA-VAE)       63.6   51.6   52.4
With generated features (f-VAEGAN-D2)    63.2   75.6   68.9
