Recent Progress on CNNs for Object Detection & Image Compression

Recent Progress on CNNs for Object Detection & Image Compression Rahul Sukthankar Google Research Confidential + Proprietary

Credits: My Research Group at Google Lifelong Learning Object Detection ++ Learning from Video NN Compression Individual Explorers - Vitto Ferrari (TL) - Kevin Murphy (TL) - Susanna Ricco (TL) - George Toderici (TL) - Chunhui Gu - Danfeng Qin - Alireza Fathi - Alexey Vorobyov - Damien Vincent - Ian Fischer - Hassan Rom - Anoop Korattikara - Bryan Seybold - David Minnen - Mohamad Tarifi - Jasper Uijlings - Chen Sun - Dave Marwood - Joel Shor - Noah Snavely - Stefan Popov - George Papandreou - David Ross - Nick Johnston - Shumeet Baluja - Hyun Oh Song - Sudheendra Vijayanarasimhan - Michele Covell - Jonathan Huang - Saurabh Singh 3D People/VR/AR Part-Time Faculty - Nathan Silberman - Sung Jin Hwang - Chris Bregler (TL) - Abhinav Gupta Event Understanding - Sergio Guadarrama - Avneesh Sud - Irfan Essa - Caroline Pantofaru (TL) - Tyler Zhu - Christian Frueh - Jitendra Malik NN Theorem Proving - Vivek Rathod - Diego Ruspini - Kate Fragkiadaki - Christian Szegedy (TL) - Arthur Wait - Nick Dufour [+ Noah & Vitto] - Alex Alemi - Cheol Park - Nori Kanazawa - Niklas Een - Eric Nichols - Vivek Kwatra - Sarah Loos - Radhika Marvin - Shrenik Lad - Vinay Bettadapura

Part 1: Object Detection
Huang, Rathod, Sun, Zhu, Korattikara, Fathi, Fischer, Wojna, Song, Guadarrama, and Murphy, “Speed/accuracy trade-offs for modern convolutional object detectors,” https://arxiv.org/abs/1611.10012

Object Detection
For a given set of object categories, mark each instance with a bounding box and a category label.
● Example labels: Battery, Bullet
● Can add more object categories (fine-grained recognition), e.g., AA Battery, 7.62x51mm NATO cartridge, 5.56x45mm NATO cartridge
● Becomes very challenging in complex scenes due to object size, clutter, and partial occlusion

Object Detection -- Sampling of Key Ideas
- Dense sliding windows -- searching over x, y, scale
- Neural net based face detection [Rowley et al., 1995]
- Classifier cascade, efficient “integral image” features [Viola & Jones, 2001]
- HoG + SVM for pedestrian detection [Dalal & Triggs, 2005]
- Deformable part models [Felzenszwalb et al., 2010]
- Proposals (selective search) vs. sliding windows [e.g., van de Sande et al., 2011] {overcomes issue of densely sampling x, y, scale + aspect ratio}
- Return of neural nets -- learned feature extractors [Krizhevsky et al., 2012]
- Current generation of object detectors -- pioneered by Multibox and R-CNN
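To make the cost of dense sliding windows concrete, the sketch below enumerates square windows over x, y, and scale. The image size, base window size, stride, and scale factors are illustrative assumptions, not values from the talk.

```python
def sliding_windows(img_w, img_h, base=64, stride=16, scales=(1.0, 2.0, 4.0)):
    """Yield (x_min, y_min, x_max, y_max) square windows over x, y, and scale.

    All parameter defaults are assumptions chosen for illustration.
    """
    for s in scales:
        size = int(base * s)
        step = int(stride * s)  # stride grows with the window, so relative overlap stays constant
        for y in range(0, img_h - size + 1, step):
            for x in range(0, img_w - size + 1, step):
                yield (x, y, x + size, y + size)


windows = list(sliding_windows(640, 480))
print(len(windows))  # number of windows a classifier would have to score
```

Each additional aspect ratio multiplies this count again, which is the sampling issue that proposal methods such as selective search were meant to sidestep.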

Typical Modern Approach: Predict Region Offset & Classify
Classify regions as foreground (object) or background. Predict offset for positive patches. Classify foreground regions into 1 of C classes (e.g., Lizard: 0.8, Frog: 0.1, Dog: 0.1).
● Predicting bounding box offset is a counterintuitive concept
● How to select the initial boxes (often called anchors)?
  ○ External process (R-CNN)
  ○ Clustering ground truth boxes (Multibox)
  ○ Dense grid (now popular)
● Interesting connection to sliding windows and object proposals
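Since predicting an offset rather than a box can feel counterintuitive, here is a minimal sketch of the idea. It assumes one common parameterization (center shift scaled by anchor size, log of the size ratio), as used in the Faster R-CNN family; the talk does not say which encoding is meant, and the function names, grid stride, and anchor size are hypothetical.

```python
import math


def encode(box, anchor):
    """Offsets (tx, ty, tw, th) that map an anchor onto a box; both are (cx, cy, w, h)."""
    cx, cy, w, h = box
    ax, ay, aw, ah = anchor
    return ((cx - ax) / aw, (cy - ay) / ah, math.log(w / aw), math.log(h / ah))


def decode(offsets, anchor):
    """Invert encode(): apply predicted offsets to an anchor to recover a box."""
    tx, ty, tw, th = offsets
    ax, ay, aw, ah = anchor
    return (ax + tx * aw, ay + ty * ah, aw * math.exp(tw), ah * math.exp(th))


def dense_grid_anchors(img_w, img_h, stride=32, size=64):
    """One square anchor centered in each stride x stride cell (assumed layout)."""
    return [(x + stride / 2, y + stride / 2, size, size)
            for y in range(0, img_h, stride)
            for x in range(0, img_w, stride)]


anchors = dense_grid_anchors(128, 96)
gt = (70.0, 50.0, 80.0, 40.0)  # made-up ground-truth box (cx, cy, w, h)

# Simplification: pick the anchor with the nearest center.
# Real matchers typically compare overlap (IoU) instead.
nearest = min(anchors, key=lambda a: (a[0] - gt[0]) ** 2 + (a[1] - gt[1]) ** 2)
offsets = encode(gt, nearest)      # regression target for this anchor
recovered = decode(offsets, nearest)
print(nearest, offsets, recovered)
```

The network only has to learn small, normalized corrections relative to a nearby anchor, which is the link back to sliding windows: the dense grid plays the role of the windows, and the offsets refine them.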

Aside: What is a Neural Network?
Numbers you have -> Magic box -> Numbers you want
Learns from lots of data using gradient and grad student descent

Aside: What is a Neural Network?
Numbers you have (e.g., RGB pixels) -> Magic box -> [0.01, …, 0.76, …, 0.14] (one score per label, e.g., bicycle, building, forest)
Trained on a large labeled dataset like ImageNet

Aside: What is a Convolutional Neural Network?
Cuboid of numbers (X x Y x D) -> CNN -> Cuboid of numbers (X’ x Y’ x D’)
● Patch-to-patch mapping
● Shared weights (shift invariant)
● Retinal connectivity (local support)
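The cuboid-to-cuboid mapping can be illustrated with a single convolutional layer written in plain Python. This is a teaching sketch under stated assumptions (no padding, stride 1, no bias or nonlinearity), not how a production library implements it; the function name and the toy sizes are made up.

```python
def conv2d(inp, kernels):
    """Map an X x Y x D cuboid to an (X-K+1) x (Y-K+1) x D_out cuboid.

    inp:     X x Y x D nested lists
    kernels: K x K x D x D_out nested lists; the same weights are reused at
             every position (shared weights), and each output value depends
             only on a K x K input patch (local support).
    """
    X, Y, D = len(inp), len(inp[0]), len(inp[0][0])
    K, D_out = len(kernels), len(kernels[0][0][0])
    out = []
    for x in range(X - K + 1):
        row = []
        for y in range(Y - K + 1):
            vec = []
            for o in range(D_out):
                acc = 0.0
                for i in range(K):
                    for j in range(K):
                        for d in range(D):
                            acc += inp[x + i][y + j][d] * kernels[i][j][d][o]
                vec.append(acc)
            row.append(vec)
        out.append(row)
    return out


# Toy input: 5 x 4 x 3 cuboid of ones; toy kernels: 3 x 3 x 3 -> 2 output channels.
inp = [[[1.0] * 3 for _ in range(4)] for _ in range(5)]
kernels = [[[[1.0, 0.5] for _ in range(3)] for _ in range(3)] for _ in range(3)]
out = conv2d(inp, kernels)
print(len(out), len(out[0]), len(out[0][0]))  # output cuboid shape X', Y', D'
```

Because the same weights are applied at every position, shifting the input pattern shifts the output by the same amount.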

Components of Modern Object Detection Systems
1. Feature Extractor
   Input: RGB pixels
   Output: a feature vector of numbers for each patch
2. Proposal Generator
   Input: feature vector
   Output: objectness classifier -- foreground or background?
   Output: bounding box regression -- where?
3. Box Classifier -- can be combined with (2)
   Input: features for cropped box
   Output: multi-way classifier -- what class is this object?
   Output: bounding box refinement -- how to adjust box to be on object

Object Detection Meta-Architecture Type 1: Single-Shot Detector (SSD) & variants [Liu et al., 2015]

Object Detection Meta-Architecture Type 2: Faster R-CNN & variants [Ren et al., 2015]

Object Detection Meta-Architecture Type 3: Region-Based Fully Convolutional Networks (R-FCN) [Dai et al., 2016]

Wide Choice of Feature Extractors
[Figure: accuracy on ImageNet vs. model size]

Build Your Own Object Detector -- Lots of Combinations!
Meta Architecture
1. SSD
2. Faster R-CNN
3. R-FCN
Feature Extractor
1. Inception Resnet V2
2. Inception V2
3. Inception V3
4. MobileNet
5. Resnet 101
6. VGG 16
Other Important Choices
● Input: low-res, hi-res
● Match: argmax, bipartite, ...
● Location loss: smooth L1, ...
● Bounding box encoding
● Stride
● # Proposals
● Other hyperparameters...
[Huang et al.] evaluate ~150 combinations in the paper!
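As a rough illustration of how quickly these choices multiply, the sketch below crosses only three of the axes listed on the slide. The dictionary keys are hypothetical names, and this grid is not the set of configurations evaluated in the paper; it only shows how a sweep of this kind might be enumerated.

```python
from itertools import product

meta_architectures = ["SSD", "Faster R-CNN", "R-FCN"]
feature_extractors = [
    "Inception Resnet V2", "Inception V2", "Inception V3",
    "MobileNet", "Resnet 101", "VGG 16",
]
input_resolutions = ["low-res", "hi-res"]

# Full cross product of the three axes; key names are assumptions for illustration.
configs = [
    {"meta_architecture": m, "feature_extractor": f, "input": r}
    for m, f, r in product(meta_architectures, feature_extractors, input_resolutions)
]
print(len(configs))  # 3 * 6 * 2 = 36
```

Adding further axes such as stride or number of proposals grows the grid multiplicatively.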

mAP vs. Computation
Optimality “Frontier”
● Models below the curve are generally dominated, both in accuracy & speed
● Focus discussion on the ones close to the curve
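The notion of a dominated model can be made precise with a small sketch. The model names, timings, and mAP values below are invented purely for illustration and are not results from the talk or the paper.

```python
def pareto_frontier(models):
    """Return the models on the speed/accuracy frontier.

    models: list of (name, time_ms, mAP). A model is dropped if some other
    model is at least as fast and at least as accurate (exact ties keep one
    representative).
    """
    frontier = []
    best_map = float("-inf")
    # Walk from fastest to slowest; among equal times, try the most accurate first.
    for name, time_ms, m in sorted(models, key=lambda r: (r[1], -r[2])):
        if m > best_map:  # strictly more accurate than every faster model seen so far
            frontier.append((name, time_ms, m))
            best_map = m
    return frontier


# Made-up (name, milliseconds per image, mAP) triples.
models = [
    ("model_a", 30, 20.0),
    ("model_b", 50, 19.0),   # slower and less accurate than model_a
    ("model_c", 80, 28.0),
    ("model_d", 120, 27.0),  # slower and less accurate than model_c
    ("model_e", 300, 33.0),
    ("model_f", 400, 33.0),  # same accuracy as model_e but slower
]
frontier = pareto_frontier(models)
print([name for name, _, _ in frontier])
```

Only the surviving models trade speed against accuracy in a meaningful way, which is why the discussion can focus on points near the curve.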

mAP vs. Computation: Meta architecture
● SSD models are fastest
● Faster R-CNN is slow but more accurate
● Dropping #proposals makes Faster R-CNN fast w/o much mAP drop
● R-FCN is close to that sweet spot
