Object Detection
JunYoung Gwak

Motivation
Image classification
● Input: image
● Output: object class

Limitations of classification
● Multiple classes
● Location
That is, object classification assumes
● A single class of object
● The object occupies the majority of the input image

We need a high-level understanding of the complex world.

Problem Definition
Object Detection
● Input: image
● Output: multiple instances of
○ object location (bounding box)
○ object class

Instance:
● Distinguishes individual objects, in contrast to treating them all as one semantic class

Bounding box:
● Rigid box that encloses the instance
● Multiple possible parameterizations
○ (width, height, center x, center y)
○ (x1, y1, x2, y2)
○ (x1, y1, x2, y2, rotation)
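The first two parameterizations above carry the same information, so converting between them is a matter of arithmetic. The sketch below is an illustration only; the function names are hypothetical, and it assumes (x1, y1) is the top-left corner and (x2, y2) the bottom-right corner.

```python
def center_to_corners(width, height, cx, cy):
    """Convert (width, height, center x, center y) to (x1, y1, x2, y2)."""
    half_w, half_h = width / 2.0, height / 2.0
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def corners_to_center(x1, y1, x2, y2):
    """Convert (x1, y1, x2, y2) to (width, height, center x, center y)."""
    return (x2 - x1, y2 - y1, (x1 + x2) / 2.0, (y1 + y2) / 2.0)


# A 4-wide, 2-tall box centered at (10, 10).
corners = center_to_corners(4, 2, 10, 10)
print(corners)                      # (8.0, 9.0, 12.0, 11.0)
print(corners_to_center(*corners))  # (4.0, 2.0, 10.0, 10.0)
```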

Object class:
● Semantic class of the instance
○ Predicted as a vector of scores, as in the object classification task

Modern Object Detection Architecture (as of 2017)
● Several important works from around 2014-2017 built the basis of the modern object detection architecture
○ R-CNN
○ Fast R-CNN
○ Faster R-CNN
○ SSD
○ YOLO (v2, v3)
○ FPN
○ Fully convolutional
○ ...
⇒ Detectron
Let's dissect the modern (2017) object detection architecture!

Stage 1
● For every output pixel (given by the backbone network)
○ For every anchor box
■ Predict bounding box offsets
■ Predict anchor confidence
● Suppress overlapping predictions using non-maximum suppression
Stage 2 (optional, only in two-stage networks)
● For every region proposal
○ Predict bounding box offsets
○ Predict its semantic class

Fully Convolutional
Every pixel makes a prediction!
● In contrast to previous works in image classification

Key notions
● Conv transpose / unpooling operation: recovers the resolution of the input image

● 1x1 convolution: pixel-wise fully connected layers
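To make the "pixel-wise fully connected layer" reading of a 1x1 convolution concrete, here is a minimal sketch in plain Python. The function name `conv1x1`, the [row][col][channel] layout, and the toy weights are all assumptions made for illustration; real frameworks use tensors and a different memory layout.

```python
def conv1x1(feature_map, weights, bias):
    """Apply a 1x1 convolution to a feature map stored as [row][col][channel].

    weights[o][i] maps input channel i to output channel o. Every pixel is
    passed through the same fully connected layer, independently of its
    neighbors, which is why the spatial resolution is unchanged.
    """
    return [
        [
            [
                sum(w * x for w, x in zip(w_row, pixel)) + b
                for w_row, b in zip(weights, bias)
            ]
            for pixel in row
        ]
        for row in feature_map
    ]


# Toy input: 1 x 2 pixels with 2 channels each, mapped to 3 output channels.
feature_map = [[[1.0, 2.0], [3.0, 4.0]]]
weights = [[1.0, 1.0], [1.0, -1.0], [0.0, 2.0]]
bias = [0.0, 0.0, 1.0]
print(conv1x1(feature_map, weights, bias))
# [[[3.0, -1.0, 5.0], [7.0, -1.0, 9.0]]]
```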

⇒ Every pixel predicts bounding boxes that are centered at its location

Anchor boxes
Neural networks prefer discrete prediction over continuous regression!
⇒ Preselect templates of bounding boxes (anchors) to ease the regression problem
⇒ Let the neural network classify each anchor box and predict a small refinement of it
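As a rough illustration of preselected templates, the sketch below places the same set of anchor boxes at the center of every output pixel. The names `generate_anchors`, `stride`, `sizes`, and `aspect_ratios`, and the values used, are assumptions for this example; the slides do not specify how the templates are chosen.

```python
import math


def generate_anchors(feat_h, feat_w, stride, sizes, aspect_ratios):
    """Return anchors as (cx, cy, w, h) in input-image coordinates.

    Assumes each output pixel covers a stride x stride patch of the input,
    and that aspect ratio means width / height.
    """
    anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Center of the input patch that this output pixel corresponds to.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for size in sizes:
                for ratio in aspect_ratios:
                    w = size * math.sqrt(ratio)
                    h = size / math.sqrt(ratio)
                    anchors.append((cx, cy, w, h))
    return anchors


anchors = generate_anchors(2, 3, stride=16, sizes=[32, 64],
                           aspect_ratios=[0.5, 1.0, 2.0])
print(len(anchors))  # 2 * 3 pixels * 6 templates = 36
```

Keeping width times height equal to size squared for every ratio is one common design choice; it keeps the anchor area fixed while varying its shape.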


Bounding box refinement
Given
● Anchor box size
● Output pixel center location
Predict the bounding box refinement as
● Log-scaled size ratio relative to the anchor
● Center offset relative to the anchor
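The slide does not spell out the formula, so the sketch below uses one common parameterization from the R-CNN family as an assumption: center offsets divided by the anchor size, and the logarithm of the width and height ratios. The function names are hypothetical.

```python
import math


def encode_offsets(box, anchor):
    """Turn a target box into regression targets relative to an anchor.

    Both box and anchor are (cx, cy, w, h).
    """
    cx, cy, w, h = box
    acx, acy, aw, ah = anchor
    return (
        (cx - acx) / aw,    # relative center offset in x
        (cy - acy) / ah,    # relative center offset in y
        math.log(w / aw),   # log-scaled width ratio
        math.log(h / ah),   # log-scaled height ratio
    )


def decode_offsets(offsets, anchor):
    """Invert encode_offsets: apply predicted offsets to an anchor."""
    tx, ty, tw, th = offsets
    acx, acy, aw, ah = anchor
    return (acx + tx * aw, acy + ty * ah, aw * math.exp(tw), ah * math.exp(th))


anchor = (50.0, 50.0, 20.0, 40.0)
target = (55.0, 40.0, 30.0, 40.0)
offsets = encode_offsets(target, anchor)
print(offsets[:2])                      # (0.25, -0.25)
print(decode_offsets(offsets, anchor))  # approximately the target box
```

A box that matches its anchor exactly encodes to all zeros, so the network only has to learn small corrections around zero.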

Bounding box classification
For each predicted bounding box,
● Predict the confidence of the box, e.g. with a binary cross-entropy loss
● (Optional, in one-stage networks) Predict the semantic class of the instance, e.g. with a categorical cross-entropy loss
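For reference, the two example losses named above can be written out as follows. This is a minimal per-box sketch with hypothetical function names; it assumes the confidence is already a probability and the class prediction is a vector of raw scores passed through a softmax.

```python
import math


def binary_cross_entropy(p, label):
    """Loss for box confidence. p: predicted probability, label: 0 or 1."""
    eps = 1e-12  # clamp to avoid log(0)
    p = min(max(p, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(scores, true_class):
    """Loss for the semantic class. scores: raw score per class."""
    return -math.log(softmax(scores)[true_class])


# A confident correct prediction is penalized less than a confident wrong one.
print(binary_cross_entropy(0.9, 1) < binary_cross_entropy(0.1, 1))  # True
print(categorical_cross_entropy([2.0, 0.5, 0.1], true_class=0))
```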

Non-maximum suppression
The resulting prediction contains multiple predictions of the same instance.
Heuristic to remove redundant detections:
● For all predictions, in descending order of prediction confidence
○ If the current prediction heavily overlaps with any of the final predictions:
■ Discard it
○ Else
■ Add it to the final predictions
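The heuristic above can be sketched in a few lines. Two assumptions are made here that the slide leaves open: overlap is measured by intersection over union (IoU), and "heavily overlaps" means IoU above a threshold of 0.5. Boxes are (x1, y1, x2, y2), and the function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_maximum_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes to keep, highest confidence first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep the box only if it does not heavily overlap any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(non_maximum_suppression(boxes, scores))  # [0, 2]
```

In the example, the second box overlaps the first with an IoU of about 0.68, so it is discarded; the third box overlaps nothing and is kept.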

The full pipeline
Stage 1
● For every output pixel
○ For every anchor box
■ Predict bounding box offsets
■ Predict anchor confidence
● Suppress overlapping predictions using non-maximum suppression
Stage 2 (optional, only in two-stage networks)
● For every region proposal
○ Predict bounding box offsets
○ Predict its semantic class
● Suppress overlapping predictions using non-maximum suppression

Two-stage networks
A second network refines the predictions of the first network.
Pro
● Better predictions
○ Better localization
○ Better precision
Con
● Non-standard operations (unfavorable for embedded systems)
● Slower

For every region proposal from the first stage
● Extract a fixed-size feature corresponding to the region proposal
● Using the extracted features,
○ Predict bounding box offsets
○ Predict its semantic class


ROI Align: for every region proposal from the first stage, extract a fixed-size feature.
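The sketch below shows the core idea of extracting a fixed-size feature from an arbitrarily sized region with bilinear interpolation. It is deliberately simplified and rests on several assumptions: a single-channel feature map stored as a list of rows, one sample at the center of each output bin (practical implementations typically average several samples per bin), the feature value at index (i, j) located at coordinate (i, j), and region coordinates already expressed in feature-map units. The function names are hypothetical.

```python
import math


def bilinear(feature, y, x):
    """Bilinearly interpolate a 2D feature map at a fractional (y, x)."""
    h, w = len(feature), len(feature[0])
    y = min(max(y, 0.0), h - 1.0)  # clamp to the feature map
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_align(feature, roi, out_size):
    """Sample an out_size x out_size grid from roi = (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = roi
    bin_w = (x2 - x1) / out_size
    bin_h = (y2 - y1) / out_size
    return [
        [
            # One sample at the center of each bin; no rounding of coordinates.
            bilinear(feature, y1 + (r + 0.5) * bin_h, x1 + (c + 0.5) * bin_w)
            for c in range(out_size)
        ]
        for r in range(out_size)
    ]


feature = [[4 * r + c for c in range(4)] for r in range(4)]
print(roi_align(feature, (0.5, 0.5, 2.5, 2.5), out_size=2))
# [[5.0, 6.0], [9.0, 10.0]]
```

Whatever the size of the region, the output is always out_size x out_size, which is what lets the second stage use ordinary fixed-size layers.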


Bounding box refinement (second stage)
Given
● Region proposal box size
● Output pixel center location
Predict the bounding box refinement as
● Log-scaled size ratio relative to the region proposal
● Center offset relative to the region proposal
