Best of both worlds: Human-machine collaboration for object annotation
Fei-Fei Li (Stanford U.), Olga Russakovsky (Stanford U.), Li-Jia Li (Snapchat)
CVPR 2015

Strawberry, Flute, Traffic light, Backpack, Matchstick, Bathing cap, Sea lion, Racket

Large-scale recognition
Need benchmark datasets

PASCAL VOC 2005-2012
20 object classes, 22,591 images
• Classification: person, motorcycle
• Detection
• Segmentation
• Action: riding bicycle
Image labels: Person, Motorcycle
Everingham, Van Gool, Williams, Winn and Zisserman. The PASCAL Visual Object Classes (VOC) Challenge. IJCV 2010.

Large Scale Visual Recognition Challenge (ILSVRC) 2010-2014
• PASCAL VOC, for scale: 20 object classes, 22,591 images
• DET: 200 object classes, 517,840 images
• CLS-LOC: 1000 object classes, 1,431,167 images
Image labels: Person, Person, Person, Person, Dog
http://image-net.org/challenges/LSVRC/

ILSVRC types of image annotations

Image classification
• one object class per image
• no bounding boxes
• example label: Steel drum
• 1,000 object classes, 1,331,167 images
• cost: $

Single-object localization
• one object class per image
• bounding boxes around all instances of this class
• example label: Steel drum
• 1,000 object classes, 573,966 images, 657,231 bounding boxes
• cost: $$

Object detection
• all target object classes
• bounding boxes around all instances
• example labels: Person, Car, Motorcycle, Helmet
• 200 object classes, 81,799 images, 228,981 bounding boxes
• cost: $$$

Q: How good is scene understanding with ILSVRC?
An unknown image
• ILSVRC image classification: Table
• ILSVRC single-object localization: Table
• ILSVRC object detection, state-of-the-art output (removing wrong detections): Person, Table, Person, Backpack, Table, TV
• ILSVRC object detection, all instances of the 200 target objects: Cup, Lamp, Lamp, Cup, Potted Plant, Cup, Potted Plant, Person, Tape player, Table, Person, Potted Plant, Couch, Backpack, Table, TV, Couch, Table

One unsolved question: What would it take to recognize all the objects here?

The accuracy/cost tradeoff
Plot axes: Cost vs. Label quantity and quality per image
• Dense manual annotation: high accuracy, many objects, huge cost
• Fully automatic object detection: low cost, low accuracy, few objects
• ☺ (marker on the plot)
• Crowd engineering is improving. Data: humans need short, focused annotation tasks
• Object detectors are improving. Algorithms: object detectors are reasonably accurate on some classes
O Russakovsky et al. Best of both worlds: human-machine collaboration for object annotation. CVPR 2015.

Human-machine collaboration for object annotation
Input image and constraints
Detections: for every box B, class C: P(det(B,C) | Image)
• Pillow (0.8)
• Bed (0.5)
Solicit feedback (multiple types of human input)
• Is this a bed?
• Are there more pillows?
• Is there a fan?
• Outline another bed, if any
• Name this object
• Name another object: pillow, bed, what else?
• Is this an object?
Update state: for every box B, class C: P(det(B,C) | Image, User input)
• Pillow (0.9)
• Bed (0.6)
Output detections
HCI in computer vision: Branson ECCV2010, Jain ICCV2013, Kovashka ICCV2011, Vondrick IJCV 2013, Wah ICCV2011, Wah CVPR2014, Parkash ECCV2012, Vijayanarasimhan IJCV2014, Biswas CVPR2013, Branson CVPR2014
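The detect, solicit feedback, update state loop can be illustrated with a minimal Python sketch. This is not the method from the paper: it assumes only yes/no verification questions, a single hypothetical `worker_accuracy` noise parameter, and a simple "ask about the most uncertain detection" rule. All function names are made up for illustration, and the sketch does not try to reproduce the 0.8 to 0.9 and 0.5 to 0.6 updates shown on the slide.

```python
def update_with_answer(prior, answer_yes, worker_accuracy=0.9):
    """Bayes update of P(detection is correct) after one yes/no answer.

    Assumption (not from the slides): the worker answers truthfully with
    probability `worker_accuracy`, whether or not the detection is correct.
    """
    if answer_yes:
        like_true, like_false = worker_accuracy, 1.0 - worker_accuracy
    else:
        like_true, like_false = 1.0 - worker_accuracy, worker_accuracy
    evidence = like_true * prior + like_false * (1.0 - prior)
    return like_true * prior / evidence


def pick_most_uncertain(probs):
    """Hypothetical selection rule: the detection whose probability is closest to 0.5."""
    return min(probs, key=lambda name: abs(probs[name] - 0.5))


def annotate(probs, ask, budget, worker_accuracy=0.9):
    """Run `budget` rounds of: pick a detection, ask a human, update the state.

    `ask` is a callable taking a detection name and returning True (yes) or
    False (no); here it stands in for a crowd worker.
    """
    probs = dict(probs)  # do not mutate the caller's detections
    for _ in range(budget):
        target = pick_most_uncertain(probs)
        probs[target] = update_with_answer(probs[target], ask(target), worker_accuracy)
    return probs


# Initial detector output, using the probabilities shown on the slide.
detections = {"pillow": 0.8, "bed": 0.5}

# A stand-in "human" who confirms every detection.
updated = annotate(detections, ask=lambda name: True, budget=1)
print(updated)
```

With a budget of one question, the sketch asks about the bed first because its probability is the most uncertain, and leaves the pillow untouched. A fuller system would also need the other question types listed on the slide (drawing boxes, naming objects) and a cost for each.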

Some qualitative results
Computer: Object Detection ...
Computer: Verify-box: Is the yellow box tight around a car?
Human: Answer: No
…
Computer: Draw-box: Draw a box around a person
Human: Answer: Yellow box below
Computer: Final Labeling: Car, Person ...
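The slides do not define what "tight" means in the Verify-box question. One common proxy in object detection is intersection over union (IoU) between the proposed box and a reference box. The sketch below assumes boxes given as `(x_min, y_min, x_max, y_max)` tuples and a hypothetical threshold of 0.7; neither comes from the presentation.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def verify_box(candidate, reference, threshold=0.7):
    """Simulated Verify-box answer: True ("yes") if the candidate is tight enough.

    `threshold` is an assumed value chosen for illustration.
    """
    return iou(candidate, reference) >= threshold


# A candidate shifted halfway off the reference box is rejected.
print(verify_box((0, 0, 10, 10), (5, 0, 15, 10)))
```

A function like this can stand in for the human when simulating the interaction, for example to test an annotation loop against images that already have reference boxes.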
