A Fuller Understanding of Fully Convolutional Networks
Evan Shelhamer*, Jonathan Long*, Trevor Darrell
UC Berkeley
in CVPR'15, PAMI'16
pixels in, pixels out
- colorization: Zhang et al. 2016
- monocular depth + normals: Eigen & Fergus 2015
- semantic segmentation
- optical flow: Fischer et al. 2015
- boundary prediction: Xie & Tu 2015
convnets perform classification
in < 1 millisecond, end-to-end learning maps an image to a 1000-dim vector of class scores ("tabby cat")
lots of pixels, little time?
can the same end-to-end learning produce dense, pixelwise output in ~1/10 second?
a classification network ("tabby cat")
becoming fully convolutional
fully connected layers are reinterpreted as convolutions with kernels that cover their entire input regions, so the net takes input of any size and outputs a spatial map of scores
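A minimal sketch of this conversion in pycaffe, in the style of Caffe's net_surgery example; the prototxt paths and layer names (fc6 → fc6-conv, etc.) are assumptions:

```python
import caffe

# original classifier, and a twin whose prototxt declares the fc layers
# as convolutions of matching size (e.g. fc6 -> fc6-conv with 7x7 kernels)
net = caffe.Net('deploy.prototxt', 'weights.caffemodel', caffe.TEST)
net_conv = caffe.Net('deploy_conv.prototxt', 'weights.caffemodel', caffe.TEST)

for fc, conv in zip(['fc6', 'fc7', 'fc8'], ['fc6-conv', 'fc7-conv', 'fc8-conv']):
    # copy the fc weight matrix flat into the conv kernel bank:
    # a 4096 x (512*7*7) matrix becomes 4096 kernels of shape 512x7x7
    net_conv.params[conv][0].data.flat = net.params[fc][0].data.flat
    net_conv.params[conv][1].data[...] = net.params[fc][1].data

net_conv.save('weights_conv.caffemodel')
```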
upsampling output
coarse predictions are interpolated back to pixel resolution by in-network, backwards ("deconvolution") layers
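The upsampling kernel can be initialized to bilinear interpolation and optionally learned; a sketch of that kernel, following the bilinear filter in the released surgery code:

```python
import numpy as np

def upsample_filt(size):
    """Square bilinear interpolation kernel of side length `size`."""
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    og = np.ogrid[:size, :size]
    return ((1 - abs(og[0] - center) / factor) *
            (1 - abs(og[1] - center) / factor))

# a stride-s deconvolution uses a kernel of side 2s,
# e.g. 64x64 for the 32x upsampling of a stride-32 FCN
print(upsample_filt(4))
```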
end-to-end, pixels-to-pixels network
conv, pool, and pixelwise nonlinearities + in-network upsampling + pixelwise output and loss
spectrum of deep features
combine where (local, shallow) with what (global, deep): fuse features into a deep jet (cf. Hariharan et al. CVPR'15 "hypercolumn")
skip layers
skip to fuse layers: interp + sum, learned end-to-end for joint semantics and location in the dense output
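One skip, sketched with Caffe's NetSpec; the blob and layer names (pool4, score_fr, and so on) follow the released nets but are otherwise assumptions:

```python
from caffe import layers as L, params as P

def skip_fuse(n, num_classes=21):
    """Fuse the coarse score_fr stream with pool4 by interp + sum."""
    # score the shallower pool4 features: the "where" stream
    n.score_pool4 = L.Convolution(n.pool4, num_output=num_classes,
                                  kernel_size=1)
    # 2x upsample (interp) the deeper, coarser predictions
    n.upscore2 = L.Deconvolution(n.score_fr,
        convolution_param=dict(num_output=num_classes, kernel_size=4,
                               stride=2, bias_term=False))
    # crop to align the two streams, then fuse by elementwise sum
    n.score_pool4c = L.Crop(n.score_pool4, n.upscore2)
    n.fuse_pool4 = L.Eltwise(n.upscore2, n.score_pool4c,
                             operation=P.Eltwise.SUM)
    return n
```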
skip layer refinement
[figure: input image and ground truth beside stride-32 (no skips), stride-16 (1 skip), and stride-8 (2 skips) outputs]
skip FCN computation
stage 1: 60.0 ms, stage 2: 18.7 ms, stage 3: 23.0 ms
a multi-stream network that fuses features/predictions across layers
[figure: Input, Truth, FCN, and SDS* outputs]
relative to prior state-of-the-art SDS (Hariharan et al. ECCV'14):
- 30% relative improvement for mean IoU
- 286× faster
*Simultaneous Detection and Segmentation
leaderboard
[figure: segmentation leaderboard topped by FCN entries; FCN == segmentation with Caffe]
care and feeding of fully convolutional networks
usage
- train a full image at a time without sampling
- reshape the network to take input of any size (see the sketch below)
- forward time is ~100 ms for 500 × 500 × 21 output (on a Maxwell Titan X)
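Reshaping to an arbitrary input in pycaffe; the blob names (data, score) follow the released models, while the file paths are placeholders:

```python
import numpy as np
import caffe

net = caffe.Net('deploy.prototxt', 'fcn.caffemodel', caffe.TEST)

im = np.random.rand(3, 500, 500).astype(np.float32)  # stand-in CxHxW image
net.blobs['data'].reshape(1, *im.shape)   # any size: the net is reshaped
net.blobs['data'].data[0] = im
net.forward()
scores = net.blobs['score'].data[0]       # 21 x 500 x 500 class scores
labels = scores.argmax(axis=0)            # pixelwise prediction
```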
image-to-image optimization
momentum and batch size
heavy online learning works: small batches with high momentum match larger batches with lower momentum (see the note below)
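Why that holds, under the assumption of roughly constant gradients across updates (this is the heuristic argument from the PAMI version of the paper):

```latex
% With momentum p, the gradient from j updates ago is weighted by p^j;
% with batch size k, each update averages k examples. Two settings
% (p, k) and (p', k') weight past data alike when
\[
  p^{1/k} = {p'}^{1/k'} \quad\Longrightarrow\quad p' = p^{k'/k},
\]
% so batch size 20 with momentum 0.9 is approximated by online
% learning with batch size 1 and momentum $0.9^{1/20} \approx 0.99$.
```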
sampling images? no need!
no improvement from sampling across images
sampling pixels? no need!
no improvement from (partially) decorrelating pixels by uniform or Poisson sampling
context?
- do FCNs incorporate contextual cues?
- accuracy loses 3-4 points when the background is masked
- FCNs can learn from BG/shape alone if forced to!
- standard: 85 IU; BG alone: 38 IU; shape alone: 29 IU
past and future
history of fully convolutional networks
history
- Shape Displacement Network: Matan & LeCun 1992
- Convolutional Locator Network: Wolf & Platt 1994
pyramids
the scale pyramid is a classic multi-resolution representation; fusing multi-resolution network layers is a learned, nonlinear counterpart
[figure: scale pyramid levels 0, 1, 2; Burt & Adelson '83]
jets
the local jet collects the partial derivatives at a point for a rich local description; the deep jet collects layer compositions for a rich, learned description (Koenderink & Van Doorn '87)
extensions
- detection + instances
- structured output
- weak supervision
detection: fully conv. proposals
end-to-end detection by a proposal FCN + RoI classification
Fast R-CNN, Girshick ICCV'15; Faster R-CNN, Ren et al. NIPS'15
fully conv. nets + structured output
Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs. Chen* & Papandreou* et al. ICLR 2015.
fully conv. nets + structured output
Conditional Random Fields as Recurrent Neural Networks. Zheng* & Jayasumana* et al. ICCV 2015.
dilation for structured output
- enlarge the effective receptive field with the same number of parameters
- raise output resolution
- convolutional context model: similar accuracy to a CRF, but non-probabilistic (see the sketch below)
Multi-Scale Context Aggregation by Dilated Convolutions. Yu & Koltun. ICLR 2016
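A sketch of a dilated convolution with Caffe's NetSpec (Caffe's convolution layer takes a dilation parameter); the blob names and sizes here are placeholders:

```python
import caffe
from caffe import layers as L

n = caffe.NetSpec()
n.data = L.Input(shape=dict(dim=[1, 21, 64, 64]))
# a 3x3 kernel with dilation 2 samples a 5x5 window: the receptive
# field grows with no extra parameters, and padding keeps the output
# at full resolution
n.ctx = L.Convolution(n.data, num_output=21, kernel_size=3,
                      pad=2, dilation=2)
print(n.to_proto())
```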
[figure: qualitative comparison; credit: CRF as RNN, Zheng* & Jayasumana* et al. ICCV 2015]
DeepLab: Chen* & Papandreou* et al. ICLR 2015; CRF-RNN: Zheng* & Jayasumana* et al. ICCV 2015
fully conv. nets + weak supervision
FCNs expose a spatial loss map to guide learning: segment from tags by MIL or pixelwise constraints
Constrained Convolutional Neural Networks for Weakly Supervised Segmentation. Pathak et al. ICCV 2015.
fully conv. nets + weak supervision
FCNs expose a spatial loss map to guide learning: mine boxes + feedback to refine masks
BoxSup: Exploiting Bounding Boxes to Supervise Convolutional Networks for Semantic Segmentation. Dai et al. ICCV 2015.
fully conv. nets + weak supervision
FCNs can learn from sparse annotations == sampling the loss
What's the Point? Semantic Segmentation with Point Supervision. Bearman et al. ECCV 2016.
conclusion
fully convolutional networks are fast, end-to-end models for pixelwise problems
- code in Caffe: caffe.berkeleyvision.org, github.com/BVLC/caffe
- models for PASCAL VOC, NYUDv2, SIFT Flow, PASCAL-Context: fcn.berkeleyvision.org
- model, inference, and solving examples (a minimal solving sketch follows)
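For the solving example, a minimal pycaffe sketch of the training recipe above (full images, small batch, high momentum); the paths and iteration count are placeholders, see the released solving example for the real settings:

```python
import caffe

caffe.set_mode_gpu()
# solver.prototxt: batch size 1, high momentum (e.g. 0.99), fixed lr
solver = caffe.SGDSolver('solver.prototxt')
# initialize the FCN from a pretrained classifier's convolutionalized weights
solver.net.copy_from('vgg16-conv.caffemodel')
solver.step(4000)  # placeholder iteration count
```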