Semantic Image Segmentation and Web-Supervised Visual Learning - PowerPoint PPT Presentation

Semantic Image Segmentation and Web-Supervised Visual Learning Florian Schroff Andrew Zisserman University of Oxford, UK Antonio Criminisi Microsoft Research Ltd, Cambridge, UK

Outline  Part I: Semantic Image Segmentation  Goal: automatic segmentation into object regions  Texton-based Random Forest classifier  Part II: Web-Supervised Visual Learning  Goal: harvest class specific images automatically • Use text & metadata from web-pages • Learn visual model  Part III: Learn segmentation model from harvested images

Goal: Classification & Segmentation
[Figure: example images with their classification/segmentation into regions labelled water, cow, grass, sheep]

Goal: Harvest images automatically
• Learn visual models w/o user interaction
• Specify object-class: e.g. penguin
• Pipeline (diagram): Internet → download web-pages and images → images related to penguin → visual model for penguin

Challenges in Object Recognition  Intra-class variations: appearance differences/similarities among objects of the same class  Inter-class variations: appearance differences/similarities between objects of different classes  Lighting and viewpoint

Importance of Context  Context often delivers important cues  Human recognition heavily relies on context  In ambiguous cases context is crucial for recognition Oliva and Torralba (2007)

System Overview
• Treat object recognition as supervised classification problem:
• Train classifier on labeled training data
• Apply to new unseen test images
• Feature extraction/description
• Crucial to have a discriminative feature representation
• Pipeline (diagram): training images → feature extraction → classifier (SVM, NN, Random Forest); unseen test images → feature extraction → classifier

Part I: Image Segmentation
• Supervised classification problem:
• Classify each pixel in the image
• Diagram: each feature vector represents 1 pixel and is passed to the classifier (SVM, NN, Random Forest)

Image Segmentation
• Introduction to textons and single-histogram class models (SHCM)
• Comparison of nearest neighbour (NN) and Random Forest
• Show strength of Random Forests to combine multiple features

Background: Feature Extraction
• Each pixel is represented in Lab colour-space (channels L, a, b)
• A 5x5 pixel neighbourhood around each pixel is used
• 3x5x5 = 75 dim. feature vector per pixel
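
The 75-dimensional per-pixel descriptor can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `neighbourhood_features`, the nested-list image layout and the clamping of coordinates at the image border are all assumptions made for the example.

```python
def neighbourhood_features(lab_image, half=2):
    """Map (row, col) -> feature vector of 3 * (2*half+1)^2 values.

    lab_image: list of rows, each row a list of (L, a, b) tuples.
    Border handling by clamping coordinates is an assumption of this sketch.
    """
    height, width = len(lab_image), len(lab_image[0])
    features = {}
    for r in range(height):
        for c in range(width):
            vec = []
            for channel in range(3):              # L, a, b
                for dr in range(-half, half + 1):
                    for dc in range(-half, half + 1):
                        rr = min(max(r + dr, 0), height - 1)
                        cc = min(max(c + dc, 0), width - 1)
                        vec.append(lab_image[rr][cc][channel])
            features[(r, c)] = vec
    return features


# Toy 6x6 "Lab" image: L = row index, a = column index, b = row + column.
image = [[(float(r), float(c), float(r + c)) for c in range(6)] for r in range(6)]
feats = neighbourhood_features(image)
print(len(feats[(3, 3)]))  # 3 channels x 5 x 5 neighbourhood
```

With `half=2` the window is 5x5, so every pixel yields 3x5x5 = 75 values, matching the slide.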

Background: Texton Vocabulary
• Training images → feature extraction → 75 dim. feature vectors → K-means → texton vocabulary
• V textons (# cluster centres), V = K in K-means
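
As a rough sketch of the vocabulary step, plain Lloyd-style K-means over the feature vectors could look like this. The toy data is 2-dimensional for readability (the slides use 75 dimensions), and the seeding, iteration count and empty-cluster handling are assumptions of the example rather than details from the talk.

```python
import random


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(vectors, k, iterations=20, seed=0):
    """Return k cluster centres (the texton vocabulary, V = k)."""
    rng = random.Random(seed)
    centres = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda i: squared_distance(v, centres[i]))
            clusters[nearest].append(v)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centre
                dim = len(members[0])
                centres[i] = [sum(m[d] for m in members) / len(members)
                              for d in range(dim)]
    return centres


# Two well separated toy blobs; real feature vectors would be 75-dimensional.
vectors = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.1],
           [10.0, 10.1], [10.2, 9.9], [9.9, 10.0]]
vocabulary = sorted(kmeans(vectors, k=2))
print(vocabulary)
```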

Map Features to Textons
• Training images → feature vectors per pixel → map to (pre-clustered) textons → resulting texton-maps
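
Mapping a feature vector to a texton amounts to a nearest-cluster-centre lookup. The sketch below assumes squared Euclidean distance and uses a hypothetical 3-texton vocabulary with 2-dimensional features to keep the example small.

```python
def nearest_texton(feature, vocabulary):
    """Index of the cluster centre closest to the feature vector."""
    return min(
        range(len(vocabulary)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(feature, vocabulary[i])),
    )


def texton_map(feature_rows, vocabulary):
    """Replace every per-pixel feature vector by its texton index."""
    return [[nearest_texton(f, vocabulary) for f in row] for row in feature_rows]


# Hypothetical vocabulary and a 2x2 grid of per-pixel feature vectors.
vocabulary = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
feature_rows = [[[0.1, 0.2], [0.9, 1.2]],
                [[4.0, 6.0], [0.4, 0.4]]]
tmap = texton_map(feature_rows, vocabulary)
print(tmap)  # [[0, 1], [2, 0]]
```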

Texton-Based Class Models
• Learn texton histograms given class regions
• Represent each class as a set of texton histograms
• Commonly used for texture classification (region  whole image) (Leung&Malik ICCV99, Varma&Zisserman CVPR03, Cula&Dana SPIE01, Winn et al. ICCV05)
• Exemplar based class models (Nearest Neighbour or SVM classifier)
[Figure: exemplar histograms for the classes tree, cow, grass]

Single-Histogram Class Models (SHCM)
• Training images → cow models → combined cow model
• Model each class by a single model! (Schroff et al. ICVGIP 06) (rediscovered by Boiman, Shechtman, Irani CVPR 08)
• SHCMs improve generalization and speed
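
One way to picture a single-histogram class model is to pool the texton counts of all training pixels of a class into one normalised histogram. The sketch below does exactly that; the function name and the toy texton-maps and label-maps are hypothetical.

```python
from collections import Counter


def single_histogram_class_models(texton_maps, label_maps, num_textons):
    """Pool texton counts over all training images into one histogram per class."""
    counts = {}
    for textons, labels in zip(texton_maps, label_maps):
        for t_row, l_row in zip(textons, labels):
            for t, label in zip(t_row, l_row):
                counts.setdefault(label, Counter())[t] += 1
    models = {}
    for label, counter in counts.items():
        total = sum(counter.values())
        models[label] = [counter[t] / total for t in range(num_textons)]
    return models


# Two toy 2x2 training images with per-pixel textons and ground-truth labels.
texton_maps = [[[0, 0], [1, 2]],
               [[2, 2], [0, 1]]]
label_maps = [[["cow", "cow"], ["grass", "grass"]],
              [["grass", "grass"], ["cow", "cow"]]]
models = single_histogram_class_models(texton_maps, label_maps, num_textons=3)
print(models["cow"])    # [0.75, 0.25, 0.0]
print(models["grass"])  # [0.0, 0.25, 0.75]
```

Because each class is summarised by one histogram, classification needs one comparison per class instead of one per exemplar.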

Pixelwise Classification (NN)
• Assign textons, then slide a fixed size window over the texton-map to obtain a test histogram h around each pixel
• Compare h to each class model histogram (cow model, sheep model, …) using the Kullback-Leibler divergence

Kullback-Leibler Divergence: Testing
• KL does not penalize zero bins in the test (query) histogram h which are non-zero in the model histogram
• Thus, KL is better suited for single-histogram class models, which have many non-zero bins due to different class appearances
• This better suitability was shown by our experiments
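
The behaviour described above can be checked with a small sketch. It assumes the divergence is taken from the test histogram h to the model histogram q, i.e. KL(h || q) = Σ_i h_i log(h_i / q_i), which is the direction in which empty test bins contribute nothing. The `eps` smoothing of model bins and the toy histograms are assumptions of the example.

```python
import math


def kl_divergence(h, q, eps=1e-10):
    """KL(h || q); bins with h_i == 0 contribute nothing to the sum."""
    return sum(hi * math.log(hi / (qi + eps)) for hi, qi in zip(h, q) if hi > 0)


def classify(h, models):
    """Nearest-neighbour rule: pick the class model with the smallest divergence."""
    return min(models, key=lambda label: kl_divergence(h, models[label]))


# Hypothetical class models over a 4-texton vocabulary.
models = {"cow": [0.5, 0.3, 0.1, 0.1],
          "sheep": [0.1, 0.1, 0.4, 0.4]}
h = [0.6, 0.4, 0.0, 0.0]  # test histogram with two empty bins
print(classify(h, models))  # cow
```

The two empty bins of `h` are non-zero in the cow model but add no cost, so a broad class model is not penalised for covering appearances that are absent from the test window.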

Random Forest: Intro Combine Single Histogram Class Model and Random Forest

Random Forest (Training)  During training each node “selects” the feature from a precompiled feature pool that optimizes the information gain
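
Node selection by information gain can be sketched as below. The feature pool here is a hypothetical dictionary of precomputed responses per training sample, and the gain is the usual entropy reduction of a binary split; the talk does not spell out these details, so treat them as assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(labels, responses, threshold):
    """Entropy reduction when splitting samples by response < threshold."""
    left = [l for l, r in zip(labels, responses) if r < threshold]
    right = [l for l, r in zip(labels, responses) if r >= threshold]
    if not left or not right:
        return 0.0
    n = len(labels)
    return (entropy(labels)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))


def best_node_test(labels, feature_pool, thresholds):
    """Pick the (feature name, threshold) pair with the highest gain."""
    candidates = [(name, thr) for name in feature_pool for thr in thresholds]
    return max(candidates,
               key=lambda nt: information_gain(labels, feature_pool[nt[0]], nt[1]))


labels = ["cow", "cow", "grass", "grass"]
feature_pool = {"red_diff": [0.1, 0.2, 0.8, 0.9],  # separates the classes
                "noise": [0.1, 0.9, 0.2, 0.8]}     # does not
print(best_node_test(labels, feature_pool, thresholds=[0.5]))
```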

Random Forests (Testing)
• Each pixel is classified by passing it down every tree (Tree 1 … Tree n); each node applies a test t_p < λ ?
• Class posteriors are stored in the leaf-nodes
• Combination of independent decision trees
• Empirical class posteriors in leaf nodes are averaged
• Kleinberg, Stochastic Discrimination 90; Amit & Geman, Neural Computation 97; Breiman 01; Lepetit & Fua, PAMI06; Winn et al., CVPR06; Moosmann et al., NIPS06
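
Averaging the leaf posteriors of several trees can be sketched as follows. The nested-dictionary tree layout and the two hand-built trees are hypothetical and only serve to make the example runnable.

```python
def tree_posterior(tree, features):
    """Walk one tree using node tests 'feature < threshold' until a leaf is reached."""
    node = tree
    while "posterior" not in node:
        branch = "left" if features[node["feature"]] < node["threshold"] else "right"
        node = node[branch]
    return node["posterior"]


def forest_posterior(forest, features):
    """Average the class posteriors stored in the reached leaves."""
    total = {}
    for tree in forest:
        for label, p in tree_posterior(tree, features).items():
            total[label] = total.get(label, 0.0) + p
    return {label: p / len(forest) for label, p in total.items()}


tree_1 = {"feature": 0, "threshold": 0.5,
          "left": {"posterior": {"cow": 0.9, "grass": 0.1}},
          "right": {"posterior": {"cow": 0.2, "grass": 0.8}}}
tree_2 = {"feature": 1, "threshold": 0.3,
          "left": {"posterior": {"cow": 0.6, "grass": 0.4}},
          "right": {"posterior": {"cow": 0.1, "grass": 0.9}}}

posterior = forest_posterior([tree_1, tree_2], features=[0.2, 0.1])
print(max(posterior, key=posterior.get))  # cow
```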

Single Histogram Class Model: Nearest Neighbour vs. node-tests
• h: test histogram; q_i: class model histogram (e.g. sheep model, cow model); each histogram counts textons
• Nearest Neighbour: compare h with each class model histogram
• Node-test: combine the class model histograms into a single test t_p < 0 ?

Flexible, learnt rectangles offset  Learning of offset and rectangle shapes/sizes, as well as the channels improves performance

More Feature Types
• Feature channels: RGB, HOG, textons
• Diagram labels: pixel to be classified; difference of HOG responses; weighted sum of textons
• Compute differences over various responses (RGB, textons, HOG)
• Use difference of rectangle responses together with a threshold as node-test t_p < λ ?
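
A rectangle-difference node-test can be sketched with integral images, so that each rectangle response is a sum computed in constant time. The rectangle encoding (channel name plus offsets relative to the pixel), the clipping at the image border and the toy channels are assumptions of this example, not details taken from the slides.

```python
def integral_image(channel):
    """ii[r][c] holds the sum of channel[0:r][0:c]."""
    h, w = len(channel), len(channel[0])
    ii = [[0.0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        for c in range(w):
            ii[r + 1][c + 1] = channel[r][c] + ii[r][c + 1] + ii[r + 1][c] - ii[r][c]
    return ii


def rectangle_sum(ii, top, left, bottom, right):
    """Sum over rows top..bottom-1 and columns left..right-1, clipped to the image."""
    h, w = len(ii) - 1, len(ii[0]) - 1
    top, bottom = max(0, min(top, h)), max(0, min(bottom, h))
    left, right = max(0, min(left, w)), max(0, min(right, w))
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def node_test(integrals, pixel, rect_a, rect_b, threshold):
    """rect = (channel, d_top, d_left, d_bottom, d_right), offsets relative to pixel."""
    r, c = pixel

    def response(rect):
        name, dt, dl, db, dr = rect
        return rectangle_sum(integrals[name], r + dt, c + dl, r + db, c + dr)

    return response(rect_a) - response(rect_b) < threshold


# Toy 4x4 channels: red is 1 everywhere, green is 2 everywhere.
integrals = {"red": integral_image([[1.0] * 4 for _ in range(4)]),
             "green": integral_image([[2.0] * 4 for _ in range(4)])}
rect_a = ("red", -1, -1, 2, 2)   # 3x3 rectangle around the pixel, sum 9
rect_b = ("green", 0, 0, 2, 2)   # 2x2 rectangle below/right, sum 8
print(node_test(integrals, (1, 1), rect_a, rect_b, threshold=2.0))  # True
```

In a learnt forest the channel, offsets, rectangle sizes and threshold would be the quantities chosen per node during training.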

Feature Response: Example  Example of centered rectangle response:  Red-channel  Green-channel  Blue-channel  Example of rectangle difference (red- and green-channel)

Features: HOG Detailed
• Each pixel is described by a “stacked” HOG descriptor with different parameters (cellsize c, gradient bins, blocksize/normalization)
• Difference computed over responses of one gradient bin with respect to a certain normalization and cellsize

Importance of different feature types
[Figure: example segmentations obtained with RGB only, HOG only, and HOG & RGB; labelled classes include bicycle, building, tree]

Conditional Random Field for Cleaner Object Boundaries
• Use global energy minimization instead of the independent per-pixel maximum a posteriori (MAP) estimate

Image Segmentation using Energy Minimization
• Conditional Random Field (CRF)
• Energy minimization using, e.g. Graph-Cut or TRW-S
• Energy terms: unary likelihood; contrast dependent smoothness prior (based on the colour difference vector)
• c_i = binary variable representing label (‘fg’ or ‘bg’) of pixel i
• Labelling problem solved as an s-t cut (Graph Cut)
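
The exact energy is not recoverable from the slide text, so the sketch below assumes a commonly used form: a unary term −log P(c_i) plus, for each pair of 4-connected neighbours with different labels, a contrast dependent penalty γ·exp(−β‖x_i − x_j‖²). The weights `gamma` and `beta`, the grid connectivity and the toy data are all assumptions. The code only evaluates the energy of a given labelling; it does not perform Graph-Cut or TRW-S inference.

```python
import math


def crf_energy(labels, posteriors, colours, gamma=1.0, beta=1.0):
    """Energy of a labelling: unary costs plus contrast dependent smoothness costs."""
    h, w = len(labels), len(labels[0])
    energy = 0.0
    for r in range(h):
        for c in range(w):
            # Unary term: negative log of the class posterior for the chosen label.
            energy += -math.log(posteriors[r][c][labels[r][c]])
            # Pairwise term over the right and lower neighbour (4-connectivity).
            for rr, cc in ((r + 1, c), (r, c + 1)):
                if rr < h and cc < w and labels[r][c] != labels[rr][cc]:
                    diff = sum((a - b) ** 2
                               for a, b in zip(colours[r][c], colours[rr][cc]))
                    energy += gamma * math.exp(-beta * diff)
    return energy


# 1x3 toy image: the first two pixels share a colour, the third differs.
colours = [[(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]]
posteriors = [[{"fg": 0.9, "bg": 0.1},
               {"fg": 0.4, "bg": 0.6},
               {"fg": 0.1, "bg": 0.9}]]
per_pixel = [["fg", "bg", "bg"]]  # independent per-pixel decision
smooth = [["fg", "fg", "bg"]]     # label boundary placed on the colour edge
print(crf_energy(per_pixel, posteriors, colours) > crf_energy(smooth, posteriors, colours))
```

In this toy case the labelling whose boundary follows the colour edge has lower energy than the per-pixel decision, which is the effect the CRF is meant to produce at object boundaries.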

CRF and Colour-Model
• Energy terms: class posteriors from Random Forest; test image specific colour-model (only for 2nd iteration); contrast dependent smoothness prior
• CRF as commonly used (e.g. Shotton et al. ECCV06: TextonBoost)
• TRW-S is used to maximize this CRF
• Perform two iterations: one w/o and one with the colour model

MSRC-Databases
• 9-classes: building, grass, tree, cow, sky, airplane, face, car, bicycle
• 120 training-images, 120 test-images
• Similar: 21-classes
[Figure: example images with groundtruth; labelled regions include tree, building, airplane, bike, grass, sheep, car, cow, face]

Segmentation Results (MSRC-DB) with Colour-Model
[Figure columns: image, groundtruth, classification, classification quality; also shown: result w/o CRF (class posteriors only)]

Segmentation Results (MSRC-DB 21 classes)
[Figure columns: image, classification overlay, classification quality; MAP w/o CRF compared with CRF]

21-class MSRC dataset

VOC2007-Database
• 20 classes: Aeroplane, Bicycle, Bird, Boat, Bottle, Bus, Car, Cat, Chair, Cow, Diningtable, Dog, Horse, Motorbike, Person, Pottedplant, Sheep, Sofa, Train, Tvmonitor
[Figure: example images with groundtruth]

VOC 2007

Results [1] Verbeek et al. NIPS2008; [2] Shotton et al. ECCV2006; [3] Shotton et al. CVPR 2008 (raw results w/o image level prior)  Combination of features improves performance  CRF improves performance and most importantly visual quality

Summary  Discriminative learning of rectangle shapes and offsets improves performance  Different feature types can easily be combined in the random forest framework  Combining different feature types improves performance

Part II: Web-Supervised Visual Learning  Goal: retrieve class specific images from the web  No user interaction (fully automatic)  Images are ranked using a multi-modal approach:  Text & metadata from the web-pages  Visual features  Previous work on learning relationships between words and images:  Barnard et al. JMLR 03 (Matching Words and Pictures)  Berg et al. CVPR 04, CVPR 06

Overview: Harvesting Algorithm
• Learn text ranker once, from manually labeled images & metadata for some object classes
• Pipeline (diagram): Internet → download web-pages & images → images and metadata → text ranker

Overview: Harvesting Algorithm
• User specifies: penguin
• Pipeline (diagram): Internet → download web-pages & images → images and metadata → text ranker → ranked images → images related to penguin → visual model for penguin
