Datasets for object recognition and scene understanding
Slides adapted with gratitude from http://www.cs.washington.edu/education/courses/cse590v/11au/ (Neeraj Kumar and Brian Russell)
1972 Slide credit: A. Torralba
Marr, 1976. Slide credit: A. Torralba
Caltech 101 and Caltech 256
• Caltech 101: 101 object classes, 9,146 images (Fei-Fei, Fergus, Perona, 2004)
• Caltech 256: 256 object classes, 30,607 images (Griffin, Holub, Perona, 2007)
Slide credit: A. Torralba
MSRC: 591 images, 23 object classes, pixel-wise segmentation (J. Winn, A. Criminisi, and T. Minka, 2005)
LabelMe (labelme.csail.mit.edu)
• Tool went online July 1st, 2005
• 825,597 object annotations collected
• 199,250 images available for labeling
B.C. Russell, A. Torralba, K.P. Murphy, W.T. Freeman, IJCV 2008
Quality of the labeling
Average labeling quality per class (25% / 50% / 75% percentiles):
Motorbike: 12 / 22 / 36        Car:     8 / 15 / 22
Boat:       6 /  9 / 14        Person:  7 / 12 / 21
Tree:      16 / 28 / 52        Dog:    11 / 20 / 36
Mug:       13 / 37 / 168       Bird:    6 /  8 / 11
Chair:      7 / 10 / 15        Bottle:  7 /  8 / 11
Street:     5 /  9 / 15        House lamp: 5 / 7 / 12
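As a minimal sketch, assuming the underlying quality measure is something like the number of control points per annotated polygon (an assumption; the slide only shows the resulting quartiles), per-class percentiles like the ones above can be computed with `numpy.percentile`:

```python
# Sketch: per-class 25th/50th/75th percentiles of a labeling-quality measure.
# The choice of "control points per annotated polygon" as the measure is an
# assumption, and the raw values below are made up for illustration.
import numpy as np

points_per_polygon = {               # hypothetical raw measurements per class
    "car": [8, 15, 22, 9, 14, 30, 12, 18],
    "person": [7, 12, 21, 10, 16, 25, 9, 13],
}

for cls, values in points_per_polygon.items():
    q25, q50, q75 = np.percentile(values, [25, 50, 75])
    print(f"{cls:8s} 25%: {q25:.0f}  50%: {q50:.0f}  75%: {q75:.0f}")
```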
Extreme labeling
The other extreme of extreme labeling … things do not always look good…
Testing. Most common labels: "test", "adksdsa", "woiieiie", …
Sophisticated testing. Most common labels: "Star", "Square", "Nothing", …
PASCAL VOC, 2011 version: 20 object classes
• Person: person
• Animal: bird, cat, cow, dog, horse, sheep
• Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
• Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor
The train/val data has 11,530 images containing 27,450 ROI-annotated objects and 5,034 segmentations.
• Three main competitions: classification, detection, and segmentation
• Three "taster" competitions: person layout, action classification, and ImageNet large scale recognition
M. Everingham, L. Van Gool, C. K. I. Williams, J. Winn, A. Zisserman
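A minimal sketch of how one of these ROI annotations can be read, assuming the standard VOC XML annotation layout; the file path is only an illustrative example:

```python
# Sketch: read one PASCAL VOC annotation file (standard VOC XML layout) and list
# the annotated objects with their bounding boxes. The path is an illustrative
# example, not a file guaranteed to exist in every VOC release.
import xml.etree.ElementTree as ET

ANNOTATION_FILE = "VOCdevkit/VOC2011/Annotations/2007_000027.xml"  # example path

root = ET.parse(ANNOTATION_FILE).getroot()
print("image:", root.findtext("filename"))
for obj in root.iter("object"):
    name = obj.findtext("name")                        # one of the 20 VOC classes
    box = obj.find("bndbox")
    xmin, ymin, xmax, ymax = (int(float(box.findtext(tag)))
                              for tag in ("xmin", "ymin", "xmax", "ymax"))
    print(f"  {name}: ({xmin}, {ymin}) - ({xmax}, {ymax})")
```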
80,000,000 Tiny Images
• 7 online image search engines (including Google)
• 75,000 non-abstract nouns from WordNet
• After 1 year of downloading images: 80 million images
A. Torralba, R. Fergus, W.T. Freeman, PAMI 2008
Slide credit: A. Torralba
ImageNet (www.image-net.org)
• An ontology of images based on WordNet
  – 22,000+ categories of visual concepts
  – 15 million human-cleaned images
  – ~10^5+ nodes, ~10^8+ images
[Figure: WordNet hierarchy example: animal → shepherd dog, sheep dog → collie, German shepherd]
Deng, Dong, Socher, Li & Fei-Fei, CVPR 2009
Slide credit: A. Torralba
SUN database
• Collected all the terms from WordNet that describe scenes, places, and environments
• Any concrete noun which could reasonably complete the phrase "I am in a place" or "let's go to the place"
• 899 scene categories
• 130,519 images
• 397 scene categories with at least 100 images
• 63,726 labeled objects
J. Xiao, J. Hays, K. Ehinger, A. Oliva, and A. Torralba, CVPR 2010
All the following slides are from A. Torralba and A. Efros:
Unbiased Look at Dataset Bias
Alyosha Efros (CMU), Antonio Torralba (MIT)
Are datasets measuring the right thing?
• In Machine Learning: Dataset is The World
• In Recognition: Dataset is a representation of The World
• Do datasets provide a good representation?
Visual Data is Inherently Biased
• The Internet is a tremendous repository of visual data (Flickr, YouTube, Picasa, etc.)
• But it is not a random sample of the visual world
Flickr Paris
Google StreetView Paris Knopp, Sivic, Pajdla, ECCV 2010
Sampled Alyosha Efros’s Paris
Sampling Bias • People like to take pictures on vacation
Photographer Bias • People want their pictures to be recognizable and/or interesting vs.
Social Bias “100 Special Moments” by Jason Salavon
Our Question • How much does this bias affect standard datasets used for object recognition?
“ Name That Dataset! ” game __ Caltech 101 __ Caltech 256 __ MSRC __ UIUC cars __ Tiny Images __ Corel __ PASCAL 2007 __ LabelMe __ COIL-100 __ ImageNet __ 15 Scenes __ SUN’09
SVM plays “Name that dataset!”
SVM plays “Name that dataset!” • 12 1-vs-all classifiers • Standard full-image features • 39% performance (chance is 8%)
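A minimal sketch of this experiment, assuming precomputed global image features: the random placeholder features below stand in for the "standard full-image features" on the slide, so the printed accuracy will sit near chance rather than the reported 39%.

```python
# Sketch of "name that dataset": 12 one-vs-all linear SVMs on full-image features.
# The random features are placeholders for real global descriptors (e.g. GIST or
# bag-of-words histograms).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_datasets, n_per_dataset, dim = 12, 500, 512            # placeholder sizes
features = rng.normal(size=(n_datasets * n_per_dataset, dim))
dataset_ids = np.repeat(np.arange(n_datasets), n_per_dataset)

X_train, X_test, y_train, y_test = train_test_split(
    features, dataset_ids, test_size=0.2, random_state=0, stratify=dataset_ids)

# One linear SVM per dataset, each separating "its" dataset from the other 11.
clf = OneVsRestClassifier(LinearSVC(C=1.0, max_iter=10000)).fit(X_train, y_train)

accuracy = clf.score(X_test, y_test)
print(f"name-that-dataset accuracy: {accuracy:.1%} (chance: {1 / n_datasets:.1%})")
```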
SVM plays “Name that dataset!”
Datasets have different goals…
• Some are object-centric (e.g. Caltech, ImageNet)
• Others are scene-centric (e.g. LabelMe, SUN'09)
• What about playing "name that dataset" on bounding boxes?
Similar results Performance: 61% (chance: 20%)
Where does this bias come from?
Some bias is in the world
Some bias is in the world
Some bias comes from the way the data is collected
Mugs from Google image search vs. mugs from LabelMe
Measuring Dataset Bias
Cross-Dataset Generalization
[Figure: a classifier trained on MSRC cars, applied to car images from SUN, LabelMe, PASCAL, ImageNet, Caltech101, and MSRC]
Cross-dataset Performance
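One way to produce such a cross-dataset table is sketched below: train a car-vs-background classifier on each dataset in turn and evaluate it on all of them. The `load_car_features` helper and the feature representation are hypothetical placeholders, not the paper's actual pipeline.

```python
# Sketch of a cross-dataset generalization matrix: rows = training dataset,
# columns = test dataset, entries = average precision of a car-vs-background
# classifier. `load_car_features(name)` is a hypothetical helper; here it just
# returns random placeholder features and labels.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.metrics import average_precision_score

DATASETS = ["SUN", "LabelMe", "PASCAL", "ImageNet", "Caltech101", "MSRC"]
rng = np.random.default_rng(0)

def load_car_features(name):
    # Placeholder: a real run would load images from `name` and compute
    # features plus car/not-car labels.
    X = rng.normal(size=(400, 256))
    y = rng.integers(0, 2, size=400)
    return X, y

ap = np.zeros((len(DATASETS), len(DATASETS)))
for i, train_name in enumerate(DATASETS):
    X_tr, y_tr = load_car_features(train_name)
    clf = LinearSVC(C=1.0, max_iter=10000).fit(X_tr, y_tr)
    for j, test_name in enumerate(DATASETS):
        X_te, y_te = load_car_features(test_name)
        ap[i, j] = average_precision_score(y_te, clf.decision_function(X_te))

# With real data the diagonal ("train and test on the same dataset") is typically
# much higher than the off-diagonal entries -- the signature of dataset bias.
print(np.round(ap, 2))
```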
Dataset Value
Mixing datasets
[Plot: car detection with HOG features, tested on Caltech 101; AP vs. number of training examples, comparing training on Caltech 101 alone with adding additional data from PASCAL]
Mixing datasets
[Plot: tested on PASCAL; AP vs. number of training examples, comparing adding more PASCAL data with adding more data from LabelMe or from Caltech 101]
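A sketch of the mixing experiment under the same caveats: hold the test set fixed, then compare learning curves for training on the test dataset alone versus topping it up with examples from another dataset. All loaders, features, and sizes are placeholders.

```python
# Sketch of a dataset-mixing learning curve: AP on a fixed PASCAL-style test set
# as a function of the number of training examples, comparing "PASCAL only" with
# "PASCAL plus extra examples from another dataset". Data is random placeholder.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.metrics import average_precision_score

rng = np.random.default_rng(0)

def fake_split(n, dim=128):
    # Placeholder for real HOG features and car/not-car labels.
    return rng.normal(size=(n, dim)), rng.integers(0, 2, size=n)

X_test, y_test = fake_split(1000)             # fixed test set
X_pascal, y_pascal = fake_split(2000)         # in-domain training pool
X_other, y_other = fake_split(2000)           # e.g. a LabelMe or Caltech 101 pool

def ap_for(X, y):
    clf = LinearSVC(C=1.0, max_iter=10000).fit(X, y)
    return average_precision_score(y_test, clf.decision_function(X_test))

for n in (100, 200, 500, 1000, 2000):
    in_domain = ap_for(X_pascal[:n], y_pascal[:n])
    mixed = ap_for(np.vstack([X_pascal[:n], X_other[:n]]),
                   np.concatenate([y_pascal[:n], y_other[:n]]))
    print(f"n={n:4d}  in-domain AP={in_domain:.3f}  +other-dataset AP={mixed:.3f}")
```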
Negative Set Bias
• Not all the bias comes from the appearance of the objects we care about
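One hedged way to probe this, in the spirit of the negative-set experiment: train a detector with a dataset's own negatives, then compare its average precision against its own negatives versus negatives pooled from other datasets. Everything below is placeholder data, not the paper's protocol.

```python
# Sketch of a negative-set-bias check: a detector trained with one dataset's own
# negatives is evaluated against negatives pooled from other datasets. A large
# drop would suggest the background/negative set is itself biased.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.metrics import average_precision_score

rng = np.random.default_rng(0)
dim = 128

pos_train, pos_test = rng.normal(size=(500, dim)), rng.normal(size=(500, dim))
own_neg_train = rng.normal(size=(2000, dim))      # negatives from the same dataset
own_neg_test = rng.normal(size=(2000, dim))
pooled_neg_test = rng.normal(size=(2000, dim))    # negatives pooled from other datasets

X_train = np.vstack([pos_train, own_neg_train])
y_train = np.concatenate([np.ones(len(pos_train)), np.zeros(len(own_neg_train))])
clf = LinearSVC(C=1.0, max_iter=10000).fit(X_train, y_train)

def test_ap(negatives):
    X = np.vstack([pos_test, negatives])
    y = np.concatenate([np.ones(len(pos_test)), np.zeros(len(negatives))])
    return average_precision_score(y, clf.decision_function(X))

print("AP vs own negatives:   ", round(test_ap(own_neg_test), 3))
print("AP vs pooled negatives:", round(test_ap(pooled_neg_test), 3))
```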
Summary (from 2011)
• Our best-performing techniques just don't work in the real world
  – e.g., try a person detector on a Hollywood film
  – but newer datasets (PASCAL, ImageNet) are better than older ones (MSRC, Caltech)
• The classifiers are inherently designed to overfit to the type of data they are trained on
  – but larger datasets are getting better
Four Stages of Dataset Grief
1. Denial: "What bias? I am sure that my MSRC classifier will work on any data!"
2. Machine Learning: "Of course there is bias! That's why you must always train and test on the same dataset."
3. Despair: "Recognition is hopeless, it will never work. We will just keep overfitting to the next dataset…"
4. Acceptance: "Bias is here to stay, so we must be vigilant that our algorithms don't get distracted by it."
Lessons that still apply in 2018
• Datasets are bigger but still very biased
• Specific insights about particular datasets are less relevant, but the overall message is still critical
• Also, an exemplary analysis paper!
• Some work since then:
  • Undoing the damage of dataset bias (Khosla et al., https://people.csail.mit.edu/khosla/papers/eccv2012_khosla.pdf)
  • A deeper look at dataset bias (Tommasi et al., https://arxiv.org/pdf/1505.01257.pdf)
  • What makes ImageNet good for transfer learning (Huh et al., https://arxiv.org/pdf/1608.08614.pdf)
  • Work on domain adaptation/transfer learning
  • Work on fairness in machine learning