You Only Look Once: Unified, Real-Time Object Detection

You Only Look Once: Unified, Real-Time Object Detection
Redmon et al., CVPR 2016
Mincheul Kang

Image Retrieval using Scene Graphs
• Develop a novel framework for semantic image retrieval based on the notion of a scene graph
• Use scene graphs as queries
• Introduce a novel dataset of 5K human-generated scene graphs grounded to images
[Figure labels: Measure, Score, Query, Output, Object & Attribute, Relationship]

Contents
1. Background
2. Related work
3. Overview
4. Approach
5. Results
6. Conclusion
7. Q&A

Background
• Object detection
  • Localization: where?
  • Recognition: what?
Figure credit: Fast R-CNN slides, Ross Girshick

Background
• Object detection in applications
  • Image retrieval
  • Robotics
  • Self-driving cars
• These applications need fast and accurate algorithms
http://www.nvidia.com/object/drive-px.html
http://kitschthingoftheday.blogspot.com/2011/06/breakfast-making-robots-at-tum.html

Background
• Progress of object detection
[Plot: PASCAL VOC mean Average Precision (mAP), 0% to 80%, by year, 2006 to 2016. Labeled methods: DPM, R-CNN, Fast R-CNN, Faster R-CNN. Annotation: "After CNN".]
Machine learning + Computer vision

Related work
• R-CNN (Region proposals + CNN)
  • Selective search
  • CNN that extracts a fixed-length feature vector from each region
  • Binary linear SVMs
Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, Ross Girshick et al., CVPR 2014

Related work
• Problems with R-CNN
  • The pipeline runs in several separate stages
  • Training and detection are slow
  • Requires a large amount of storage space

Related work
• Fast R-CNN
  • Training is single-stage, using a multi-task loss
  • Training can update all network layers
  • No disk storage is required for feature caching
Fast R-CNN, Ross Girshick et al., ICCV 2015

Related work
• Faster R-CNN
  • Selective search takes a long time to compute
  • Uses a Region Proposal Network instead
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, Shaoqing Ren et al., NIPS 2015 and Slides

Related work
• Summary
  • Speed and mAP have improved since the introduction of CNNs
  • But these detectors are still not fast enough to run in real time
• YOLO
  • Enables real-time speeds while maintaining high average precision

Overview
• YOLO detection system
1) Resizes the input image to 448 × 448
2) Runs a single convolutional network on the image
3) Thresholds the resulting detections by the model’s confidence
You only look once: Unified, real-time object detection, J. Redmon et al., CVPR 2016

Approach
• Divide the input image into an S × S grid
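The grid division can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `grid_cell` is hypothetical, and S = 7 is the value the cited paper uses for PASCAL VOC rather than something stated on this slide. In the paper, the cell that contains an object's center is the one responsible for detecting that object.

```python
def grid_cell(cx, cy, img_w, img_h, S=7):
    """Return (row, col) of the S x S grid cell containing pixel point (cx, cy).

    Assumption: S=7 follows the cited paper's PASCAL VOC setup.
    """
    # Scale the point into grid units, then clamp so that points lying on the
    # right or bottom image border still fall inside the last cell.
    col = min(int(cx / img_w * S), S - 1)
    row = min(int(cy / img_h * S), S - 1)
    return row, col


# An object centered at (224, 100) in a 448 x 448 image.
print(grid_cell(224, 100, 448, 448))  # (1, 3)
```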

Approach
• Each grid cell predicts bounding boxes and confidence scores for those boxes
• IOU (intersection over union)
• Confidence: Pr(Object) × IOU between the predicted box and the ground truth
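As a rough sketch of how these quantities are computed: the cited paper defines a box's confidence as Pr(Object) multiplied by the IOU between the predicted box and the ground-truth box. The corner-coordinate box format and the function names below are assumptions made for illustration only.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle, if there is one.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def confidence(p_object, box_pred, box_truth):
    """Target confidence: Pr(Object) times IOU of prediction and ground truth."""
    return p_object * iou(box_pred, box_truth)


# Two 2 x 2 boxes overlapping in a 1 x 1 square: IOU = 1 / (4 + 4 - 1) = 1/7.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

If no object falls in a cell, Pr(Object) is zero and so is the target confidence.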

Approach
• Each grid cell also predicts conditional class probabilities
• Class probability: Pr(Class_i | Object)
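In the cited paper, the conditional class probabilities Pr(Class_i | Object) are multiplied by each box's confidence at test time to give class-specific scores, and the network output is an S × S × (B·5 + C) tensor. The sketch below illustrates both under those assumptions; the function names are hypothetical, and the defaults S = 7, B = 2, C = 20 are the paper's PASCAL VOC settings, not values taken from this slide.

```python
def class_scores(class_probs, box_confidence):
    """Class-specific scores: Pr(Class_i | Object) * box confidence, per class."""
    return [p * box_confidence for p in class_probs]


def output_size(S=7, B=2, C=20):
    """Number of values predicted per image.

    Each of the S*S cells predicts B boxes with 5 numbers each
    (x, y, w, h, confidence) plus C conditional class probabilities.
    """
    return S * S * (B * 5 + C)


print(class_scores([0.5, 0.25], 0.8))
print(output_size())  # 1470, i.e. a 7 x 7 x 30 tensor
```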

Approach
• Thresholds the resulting detections by the model’s confidence
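The thresholding step amounts to a simple filter. The sketch below is illustrative only: the dictionary layout, the function name, and the cutoff of 0.2 are hypothetical choices, since the slide does not state a threshold value.

```python
def threshold_detections(detections, min_score=0.2):
    """Keep detections whose score is at least min_score.

    Assumption: each detection is a dict with a "score" key; the default
    cutoff is a placeholder, not a value from the slides.
    """
    return [d for d in detections if d["score"] >= min_score]


detections = [
    {"label": "dog", "score": 0.72},
    {"label": "bicycle", "score": 0.05},
    {"label": "car", "score": 0.31},
]
kept = threshold_detections(detections)
print([d["label"] for d in kept])  # ['dog', 'car']
```

Raising `min_score` trades recall for precision: fewer boxes survive, but the ones that do are those the model is more confident about.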

Approach
• YOLO
  • Enables end-to-end training and real-time speeds
  • Predicts all bounding boxes across all classes for an image simultaneously

Approach
• Training
• Cost function
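The equation itself did not survive in the slide text. For reference, the multi-part sum-squared-error loss in the cited paper has the following form, where the indicator is 1 when box predictor j in cell i is responsible for an object, hatted symbols are the corresponding targets or predictions, and C_i is the box confidence. The paper reports λ_coord = 5 and λ_noobj = 0.5. Treat this as a transcription to be checked against the paper, not as the content of the slide.

```latex
\begin{aligned}
\mathcal{L} ={}& \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
    \mathbb{1}_{ij}^{\text{obj}}
    \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \\
&+ \lambda_{\text{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
    \mathbb{1}_{ij}^{\text{obj}}
    \left[ \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2
         + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right] \\
&+ \sum_{i=0}^{S^2} \sum_{j=0}^{B}
    \mathbb{1}_{ij}^{\text{obj}} \left( C_i - \hat{C}_i \right)^2 \\
&+ \lambda_{\text{noobj}} \sum_{i=0}^{S^2} \sum_{j=0}^{B}
    \mathbb{1}_{ij}^{\text{noobj}} \left( C_i - \hat{C}_i \right)^2 \\
&+ \sum_{i=0}^{S^2} \mathbb{1}_{i}^{\text{obj}}
    \sum_{c \,\in\, \text{classes}} \left( p_i(c) - \hat{p}_i(c) \right)^2
\end{aligned}
```

The square roots on width and height make the same absolute error count for more in small boxes than in large ones, and the smaller λ_noobj keeps the many object-free cells from dominating the gradient.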

Results
• Qualitative results on sample artwork and natural images from the internet

Results
• Real-time speeds while maintaining high average precision
[Results figure not recoverable from the slide text; one value survives: 69.0]

Conclusion
• A single network can be optimized end-to-end directly on detection
• Predicts all bounding boxes across all classes for an image simultaneously
• Real-time speeds while maintaining high average precision
• Limitations
  • Struggles with small objects that appear in groups, such as flocks of birds
  • Incorrect localizations

Q & A
