Mask R-CNN: Object Instance Segmentation and Human Pose Estimation

Kaiming He (Research Scientist), Georgia Gkioxari (Postdoc), Piotr Dollár (Research Scientist), Ross Girshick (Research Scientist)
Facebook AI Research (FAIR)

Classic Computer Vision Problems (Source: PASCAL Dataset)
• Image classification: ✓ boat, ✓ person
• Object detection

Semantic Segmentation (Source: PASCAL Dataset)
Semantic segmentation = pixel-level classification (example label: person)

The Instance Segmentation Task (Source: PASCAL Dataset)
• Semantic segmentation (pixel-level classification): all people share one label, "person"
• Instance segmentation (pixel-level detection), our task: each person gets a separate mask (Person 1 to Person 5)
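The difference between the two tasks can be illustrated with a toy example. This is a minimal sketch on made-up data, not anything from the talk: the 3x6 grid, the two masks, and the helper `to_semantic` are all hypothetical. It shows that collapsing per-instance masks into a per-pixel class map loses the boundary between adjacent people.

```python
# Toy 3x6 scene with two people standing side by side (hypothetical data).
# Each instance is a (class_name, binary_mask) pair.
PERSON_1 = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
]
PERSON_2 = [
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
]
instances = [("person", PERSON_1), ("person", PERSON_2)]


def to_semantic(instances, height, width, background="bg"):
    """Collapse per-instance masks into one class label per pixel."""
    semantic = [[background] * width for _ in range(height)]
    for class_name, mask in instances:
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    semantic[y][x] = class_name
    return semantic


semantic = to_semantic(instances, 3, 6)
# The semantic map holds a single "person" region four pixels wide;
# the instance list still keeps the two people apart.
print(semantic[0])
print(len(instances))
```

The semantic map answers "which class is this pixel?", while the instance list also answers "which object does it belong to?".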

Source: COCO Dataset

Source: DAVIS Dataset

Talk Outline
• Mask R-CNN
  • Object instance segmentation
  • Human pose estimation
• Role of Caffe2 in our research
• Conclusions

Object Detection: R-CNN (Region-based Convolutional Neural Network)
Image → region proposals (external algorithm) → per-region classification by a CNN → class/box for each region
Source: Girshick, Donahue, Darrell, Malik. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. CVPR 2014.

Fast R-CNN: A Shared CNN Body
CNN applied to entire image → RoIPool op → shared region-wise subnetwork → class/box for each region
Region proposals come from an external algorithm (same as R-CNN).
Source: Girshick. Fast R-CNN. ICCV 2015.
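The RoIPool op turns a region of arbitrary size on the shared feature map into a fixed-size grid, so that one region-wise subnetwork can process every proposal. The sketch below is a simplified, pure-Python illustration under stated assumptions: a single-channel feature map stored as nested lists, a region given as inclusive integer cell coordinates, and bin boundaries chosen by floor/ceil rounding. The function name `roi_pool` is hypothetical.

```python
def roi_pool(feature, roi, out_size):
    """Max-pool a region of a 2D feature map into an out_size x out_size grid.

    roi = (x0, y0, x1, y1) in inclusive integer feature-map cells (assumption).
    Bin edges are rounded to whole cells, which is the quantization step.
    """
    x0, y0, x1, y1 = roi
    width, height = x1 - x0 + 1, y1 - y0 + 1
    pooled = []
    for by in range(out_size):
        y_start = y0 + (by * height) // out_size
        y_end = y0 + -(-((by + 1) * height) // out_size)  # ceiling division
        row = []
        for bx in range(out_size):
            x_start = x0 + (bx * width) // out_size
            x_end = x0 + -(-((bx + 1) * width) // out_size)
            row.append(max(feature[y][x]
                           for y in range(y_start, y_end)
                           for x in range(x_start, x_end)))
        pooled.append(row)
    return pooled


feature = [[4 * r + c for c in range(4)] for r in range(4)]  # values 0..15
print(roi_pool(feature, (0, 0, 3, 3), 2))  # [[5, 7], [13, 15]]
print(roi_pool(feature, (1, 1, 3, 3), 2))  # [[10, 11], [14, 15]]
```

In the second call the 3x3 region does not divide evenly into a 2x2 grid, so the rounded bins overlap; this coarse spatial rounding is inherent to pooling over whole cells.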

Faster R-CNN: Region Proposal Network
CNN applied to entire image → RoIPool op → shared region-wise subnetwork → class/box for each region
Region proposals come from an in-network Region Proposal Network (RPN).
Source: Ren, He, Girshick, Sun. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. NIPS 2015.

Mask R-CNN for Instance Segmentation: Overview
• An extension of Faster R-CNN
• Surprisingly simple
• Fast: 200 ms / image
• Accurate: state of the art on COCO

Mask R-CNN for Instance Segmentation: Architecture
CNN applied to entire image → RoIAlign → Faster R-CNN outputs plus a mask "head" (a region-wise segmentation subnetwork)
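The slide names RoIAlign without describing it. As a rough illustration, the sketch below assumes the commonly described behaviour: region coordinates are kept as real numbers (no rounding to whole cells) and feature values are read by bilinear interpolation. It is simplified to one sample at the center of each bin on a single-channel map, and it assumes that the value stored at index i sits at coordinate i. The names `bilinear` and `roi_align` are hypothetical, and this is not the authors' implementation.

```python
import math


def bilinear(feature, y, x):
    """Bilinearly interpolate a 2D list at continuous (y, x), clamped to the map."""
    h, w = len(feature), len(feature[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def roi_align(feature, roi, out_size):
    """Sample an out_size x out_size grid from a real-valued region.

    roi = (x0, y0, x1, y1) in continuous feature-map coordinates (assumption).
    One sample is taken at the center of each bin (a simplification).
    """
    x0, y0, x1, y1 = roi
    bin_w = (x1 - x0) / out_size
    bin_h = (y1 - y0) / out_size
    return [[bilinear(feature, y0 + (by + 0.5) * bin_h, x0 + (bx + 0.5) * bin_w)
             for bx in range(out_size)]
            for by in range(out_size)]


# Horizontal ramp: the value at column x equals x, so samples are easy to check.
ramp = [[0.0, 1.0, 2.0, 3.0] for _ in range(4)]
aligned = roi_align(ramp, (0.25, 0.25, 2.25, 2.25), 2)
print(aligned)  # samples fall at x = 0.75 and x = 1.75
```

Because the region is never snapped to whole cells, a quarter-cell shift of the box shifts the sampled values by the same amount, which matters when the output is a pixel-accurate mask.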

Mask R-CNN results on COCO

Quantitative Results: instance segmentation (mask AP)

method            backbone                mask AP  note
MNC               ResNet-101-C4           24.6     2015 COCO winner
FCIS w/ OHEM      ResNet-101-C5-dilated   29.2
FCIS+++ w/ OHEM   ResNet-101-C5-dilated   33.6     2016 COCO winner [seconds per image]
Mask R-CNN        ResNet-101-C4           33.1
Mask R-CNN        ResNet-101-FPN          35.7     our 200 ms version
Mask R-CNN        ResNeXt-101-FPN         37.1
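The slide reports mask AP without defining it. As background, and as an assumption about the metric rather than something stated in the talk, COCO-style AP is built on the intersection-over-union (IoU) between predicted and ground-truth masks, evaluated at IoU thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below covers only the mask IoU and the threshold check, not the full AP computation; `mask_iou` and the toy masks are hypothetical.

```python
def mask_iou(pred, gt):
    """Intersection over union of two binary masks given as nested lists."""
    inter = union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    return inter / union if union else 0.0


# Toy masks: the prediction covers 2 of the 3 ground-truth pixels.
pred = [[1, 1, 0]]
gt = [[1, 1, 1]]
iou = mask_iou(pred, gt)  # 2 / 3

# IoU thresholds 0.50, 0.55, ..., 0.95 (assumed COCO convention).
thresholds = [t / 100 for t in range(50, 100, 5)]
matched_at = [t for t in thresholds if iou >= t]
print(round(iou, 3), matched_at)
```

A prediction with IoU 2/3 counts as a match only at the four loosest thresholds, which is why a metric averaged over thresholds rewards tight masks.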

Mask R-CNN for Human Pose Estimation: Overview
• Keypoint = 1-hot mask
• Human pose = 17 keypoints
• Represent pose as 17 masks
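The "keypoint = 1-hot mask" idea can be sketched as an encode/decode round trip. This is only an illustration: the 8x8 resolution, the made-up pose coordinates, and the helper names are assumptions, and a real model would predict a score map per keypoint rather than an exact one-hot mask.

```python
NUM_KEYPOINTS = 17  # from the slide
MASK_SIZE = 8       # hypothetical resolution, kept small for readability


def encode_keypoint(x, y, size=MASK_SIZE):
    """Return a size x size mask with a single 1 at the keypoint location."""
    mask = [[0] * size for _ in range(size)]
    mask[y][x] = 1
    return mask


def decode_keypoint(score_map):
    """Return the (x, y) location of the highest score (argmax)."""
    best_xy, best = (0, 0), score_map[0][0]
    for y, row in enumerate(score_map):
        for x, value in enumerate(row):
            if value > best:
                best, best_xy = value, (x, y)
    return best_xy


# A made-up pose: one (x, y) location per keypoint.
pose = [(k % MASK_SIZE, (3 * k) % MASK_SIZE) for k in range(NUM_KEYPOINTS)]
masks = [encode_keypoint(x, y) for x, y in pose]   # 17 one-hot masks
decoded = [decode_keypoint(m) for m in masks]
print(decoded == pose)  # True
```

Representing each keypoint this way lets the mask branch be reused for pose with little change: it outputs 17 masks per person instead of one.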

Mask R-CNN results on COCO

Quantitative Results: human pose estimation (keypoint AP)

method                          keypoint AP  note
CMU-Pose+++                     61.8         2016 COCO winner [seconds per image]
G-RMI [w/ extra data]           62.4
Mask R-CNN [keypoint-only]      62.7
Mask R-CNN [keypoint & mask]    63.1         our 200 ms version

Caffe2 Accelerated Research

Caffe2 Object Detection Platform
Rapid idea iteration is a key enabling factor in research.
• Early alpha users starting in May 2016
• Ported py-faster-rcnn from Caffe to Caffe2
• Key design choices
  • Flexible framework for implementing object detection models
  • Parallelize data loading with forward/backward computation
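Overlapping data loading with forward/backward computation is usually done with a producer/consumer queue. The sketch below shows the general pattern using Python's standard `threading` and `queue` modules; it is not the Caffe2 platform's code, and `load_batch`, `train_step`, and `prefetching_loader` are hypothetical stand-ins.

```python
import queue
import threading


def load_batch(index):
    """Stand-in for reading and preprocessing one minibatch (hypothetical)."""
    return [index] * 4


def train_step(batch):
    """Stand-in for the forward/backward computation (hypothetical)."""
    return sum(batch)


def prefetching_loader(num_batches, capacity=2):
    """Yield batches prepared by a background thread while the caller computes."""
    batches = queue.Queue(maxsize=capacity)  # bounded, so the loader cannot run far ahead
    done = object()                          # sentinel that marks the end of the data

    def worker():
        for index in range(num_batches):
            batches.put(load_batch(index))
        batches.put(done)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = batches.get()
        if item is done:
            return
        yield item


losses = [train_step(batch) for batch in prefetching_loader(5)]
print(losses)  # [0, 4, 8, 12, 16]
```

While `train_step` runs on one batch, the worker thread is already preparing the next ones, so the compute loop rarely waits on input as long as loading keeps pace.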

Caffe2 Object Detection Platform
Rapid idea iteration is a key enabling factor in research.
• Sync SGD with 8 GPUs [Tesla M40] in a BigSur server
• Rapid prototyping of Mask R-CNN models in 8-12 hours
• SOTA Mask R-CNN models train in 44 hours
• Previous systems: ~4 days training time [experience from MSRA]
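Synchronous SGD can be sketched without any GPU: each worker computes a gradient on its own shard of the minibatch, the gradients are averaged, and every worker applies the same update. The toy below simulates 8 workers on the CPU for a one-parameter least-squares problem. The data, learning rate, and function names are assumptions for illustration and are unrelated to the actual training setup.

```python
NUM_WORKERS = 8  # one worker per GPU on the slide; simulated here on the CPU


def grad(w, shard):
    """Gradient of the mean of 0.5 * (w * x - y)**2 over a shard, w.r.t. w."""
    return sum((w * x - y) * x for x, y in shard) / len(shard)


def sync_sgd_step(w, shards, lr):
    """One synchronous step: per-worker gradients, averaged, then one update."""
    grads = [grad(w, shard) for shard in shards]  # computed in parallel on real hardware
    averaged = sum(grads) / len(grads)            # the all-reduce step
    return w - lr * averaged


# Toy data generated by y = 3x, split into equal shards of 2 samples each.
data = [(float(x), 3.0 * x) for x in range(1, 17)]
shards = [data[i::NUM_WORKERS] for i in range(NUM_WORKERS)]

w = 0.0
for _ in range(50):
    w = sync_sgd_step(w, shards, lr=0.01)
print(round(w, 6))
```

With equal shard sizes, the averaged gradient equals the gradient over the whole minibatch, so 8 synchronized workers behave like one worker with an 8x larger batch.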

From Research to Mobile with Caffe2

Conclusions
• Simple and effective
• Fast inference
• Box, mask, and pose all-in-one network and method
• Caffe2 enables extremely fast prototyping, critical to our success
