Segmentation


  1. Day 4 Lecture 2: Segmentation

  2. Segmentation: Define the accurate boundaries of all objects in an image.

  3. Segmentation: Datasets. Pascal Visual Object Classes: 20 classes, ~5,000 images. Microsoft COCO: 80 classes, ~300,000 images.

  4. Semantic Segmentation: Label every pixel! Don’t differentiate instances (cows). Classic computer vision problem. Slide Credit: CS231n

  5. Instance Segmentation: Detect instances, give category, label pixels. “Simultaneous detection and segmentation” (SDS). Slide Credit: CS231n
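To make the difference between slides 4 and 5 concrete, here is a minimal sketch on a made-up 2 x 6 label grid containing two cows. The grid, the instance ids and the `instance_class` mapping are illustrative assumptions, not taken from the slides.

```python
# Hypothetical instance map: 0 = background, 1 and 2 = two separate cows.
instance_ids = [
    [0, 1, 1, 0, 2, 2],
    [0, 1, 1, 0, 2, 2],
]
# Hypothetical mapping from instance id to category.
instance_class = {0: "background", 1: "cow", 2: "cow"}

# Semantic segmentation keeps only the category of each pixel,
# so both cows collapse into the same "cow" label.
semantic = [[instance_class[i] for i in row] for row in instance_ids]

semantic_labels = sorted({label for row in semantic for label in row})
cow_instances = sorted(
    {i for row in instance_ids for i in row if instance_class[i] == "cow"}
)

print(semantic_labels)     # ['background', 'cow']
print(len(cow_instances))  # 2
```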

  6. Semantic Segmentation: Extract patch, run through a CNN, classify center pixel (e.g. COW). Repeat for every pixel. Slide Credit: CS231n
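The patch-based procedure on slide 6 can be sketched roughly as follows. The `toy_classifier` function is a hypothetical stand-in for the CNN (it just thresholds the patch mean), and the zero padding at the image border is an assumption; the slide does not say how borders are handled.

```python
def extract_patch(image, row, col, half):
    """Return the (2*half+1) x (2*half+1) patch centred on (row, col).

    Pixels outside the image are treated as 0.0 (an assumption).
    """
    h, w = len(image), len(image[0])
    return [
        [image[r][c] if 0 <= r < h and 0 <= c < w else 0.0
         for c in range(col - half, col + half + 1)]
        for r in range(row - half, row + half + 1)
    ]


def toy_classifier(patch):
    """Hypothetical stand-in for the CNN: label by mean patch intensity."""
    values = [v for row in patch for v in row]
    return "cow" if sum(values) / len(values) > 0.5 else "background"


def segment_by_patches(image, half=1):
    """Classify the centre pixel of one patch per pixel of the image."""
    return [
        [toy_classifier(extract_patch(image, r, c, half))
         for c in range(len(image[0]))]
        for r in range(len(image))
    ]


image = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 1.0],
    [0.0, 1.0, 1.0, 1.0],
    [0.0, 1.0, 1.0, 1.0],
]
labels = segment_by_patches(image)
print(labels[0][0], labels[2][2])  # background cow
```

One classifier call per pixel is what makes this approach expensive: neighbouring patches overlap almost entirely, yet nothing is shared between them.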

  7. Semantic Segmentation: Run a “fully convolutional” network (CNN) to get all pixels at once. Smaller output due to pooling. Slide Credit: CS231n
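A rough sketch of why the output is smaller than the input. It assumes a VGG-style network (not specified on the slide) in which convolutions preserve spatial size and each of five 2 x 2, stride-2 pooling layers halves it; the function name and the 224-pixel input are illustrative.

```python
def output_size(input_size, num_pools, pool=2):
    """Spatial size after `num_pools` poolings that each divide by `pool`."""
    size = input_size
    for _ in range(num_pools):
        size //= pool
    return size


# Assumed VGG-like setting: 224 x 224 input, five 2 x 2 poolings.
print(output_size(224, 5))  # 7
```

Under these assumptions the score map is 32 times smaller than the image along each side, which is the gap that upsampling has to close.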

  8. Semantic Segmentation: Learnable upsampling! Long et al. Fully Convolutional Networks for Semantic Segmentation. CVPR 2015. Slide Credit: CS231n

  9. Convolutional Layer: Typical 3 x 3 convolution, stride 1 pad 1. Input: 4 x 4. Output: 4 x 4. Slide Credit: CS231n

  10. Convolutional Layer: Typical 3 x 3 convolution, stride 1 pad 1. Dot product between filter and input. Input: 4 x 4. Output: 4 x 4. Slide Credit: CS231n

  11. Convolutional Layer: Typical 3 x 3 convolution, stride 1 pad 1. Dot product between filter and input. Input: 4 x 4. Output: 4 x 4. Slide Credit: CS231n

  12. Convolutional Layer: Typical 3 x 3 convolution, stride 2 pad 1. Input: 4 x 4. Output: 2 x 2. Slide Credit: CS231n

  13. Convolutional Layer: Typical 3 x 3 convolution, stride 2 pad 1. Dot product between filter and input. Input: 4 x 4. Output: 2 x 2. Slide Credit: CS231n

  14. Convolutional Layer: Typical 3 x 3 convolution, stride 2 pad 1. Dot product between filter and input. Input: 4 x 4. Output: 2 x 2. Slide Credit: CS231n
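The two convolution settings on slides 9 to 14 can be checked with a small pure-Python sketch. It assumes a square single-channel input and zero padding; the all-ones filter and the input values are made up for illustration. The output size follows floor((n + 2 * pad - k) / stride) + 1.

```python
def conv2d(image, kernel, stride=1, pad=0):
    """Slide `kernel` over a zero-padded square `image` (lists of lists)."""
    n, k = len(image), len(kernel)
    size = n + 2 * pad
    # Zero-pad the input on every side.
    padded = [[0.0] * size for _ in range(size)]
    for r in range(n):
        for c in range(n):
            padded[r + pad][c + pad] = image[r][c]
    out_size = (size - k) // stride + 1
    out = [[0.0] * out_size for _ in range(out_size)]
    for i in range(out_size):
        for j in range(out_size):
            # Dot product between the filter and the current input window.
            out[i][j] = sum(
                kernel[a][b] * padded[i * stride + a][j * stride + b]
                for a in range(k)
                for b in range(k)
            )
    return out


image = [[float(4 * r + c) for c in range(4)] for r in range(4)]
kernel = [[1.0] * 3 for _ in range(3)]  # hypothetical all-ones filter

same = conv2d(image, kernel, stride=1, pad=1)
half = conv2d(image, kernel, stride=2, pad=1)
print(len(same), len(same[0]))  # 4 4
print(len(half), len(half[0]))  # 2 2
```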

  15. Deconvolutional Layer: 3 x 3 “deconvolution”, stride 2 pad 1. Input: 2 x 2. Output: 4 x 4. Slide Credit: CS231n

  16. Deconvolutional Layer: 3 x 3 “deconvolution”, stride 2 pad 1. Input gives weight for filter values. Input: 2 x 2. Output: 4 x 4. Slide Credit: CS231n

  17. Deconvolutional Layer: 3 x 3 “deconvolution”, stride 2 pad 1. Input gives weight for filter. Sum where output overlaps. Same as backward pass for normal convolution! Input: 2 x 2. Output: 4 x 4. Slide Credit: CS231n
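A minimal sketch of the operation on slides 15 to 17: every input value scales a copy of the filter, the copies are placed `stride` pixels apart, and overlapping contributions are summed. This scatter-and-sum is what the backward pass of a strided convolution computes with respect to its input. One caveat: with a 3 x 3 filter, stride 2 and pad 1, the plain size formula (n - 1) * stride - 2 * pad + k gives 3 x 3 for a 2 x 2 input. The 4 x 4 output on the slides corresponds to keeping one extra row and column at the bottom and right, modelled here by an `output_padding` argument. That name is borrowed from PyTorch's `ConvTranspose2d`; using it here is an assumption about what the slide intends. The filter and input values are made up.

```python
def conv_transpose2d(inp, kernel, stride=2, pad=1, output_padding=1):
    """Transposed convolution of a square `inp` (lists of lists).

    Assumes 0 <= output_padding <= pad.
    """
    n, k = len(inp), len(kernel)
    full = (n - 1) * stride + k
    canvas = [[0.0] * full for _ in range(full)]
    for i in range(n):
        for j in range(n):
            for a in range(k):
                for b in range(k):
                    # The input value weights the filter; overlaps are summed.
                    canvas[i * stride + a][j * stride + b] += (
                        inp[i][j] * kernel[a][b]
                    )
    # Crop `pad` rows/columns at the top/left and
    # `pad - output_padding` at the bottom/right.
    end = full - (pad - output_padding)
    return [row[pad:end] for row in canvas[pad:end]]


inp = [[1.0, 2.0], [3.0, 4.0]]
kernel = [[1.0] * 3 for _ in range(3)]  # hypothetical all-ones filter

out = conv_transpose2d(inp, kernel)
print(len(out), len(out[0]))  # 4 4
print(out[1])                 # [4.0, 10.0, 6.0, 6.0]
```

The value 10.0 is the pixel where all four filter copies overlap (1 + 2 + 3 + 4). Calling the function with `output_padding=0` yields the 3 x 3 result of the plain formula.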

  18. Deconvolutional Layer: “Deconvolution” is a bad name, already defined as “inverse of convolution”. Better names: convolution transpose, backward strided convolution, 1/2 strided convolution, upconvolution. Im et al. Generating images with recurrent adversarial networks. arXiv 2016. Radford et al. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. ICLR 2016. Slide Credit: CS231n

  19. Skip Connections: “Skip connections” = better results. Long et al. Fully Convolutional Networks for Semantic Segmentation. CVPR 2015. Slide Credit: CS231n
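The idea behind the skip connections on slide 19 can be sketched as follows: a coarse score map from a deep layer is upsampled and summed with a finer score map from a shallower layer. Long et al. use learnable upsampling; the nearest-neighbour upsampling below is a simplified stand-in, and the function names and score values are illustrative assumptions.

```python
def upsample_nearest(scores, factor=2):
    """Repeat every value `factor` times along both axes."""
    return [
        [v for v in row for _ in range(factor)]
        for row in scores
        for _ in range(factor)
    ]


def fuse(coarse, fine, factor=2):
    """Upsample the coarse score map and add the fine one element-wise."""
    up = upsample_nearest(coarse, factor)
    return [[u + f for u, f in zip(up_row, fine_row)]
            for up_row, fine_row in zip(up, fine)]


# Hypothetical single-class score maps: 2 x 2 (deep) and 4 x 4 (shallow).
coarse = [[1.0, 2.0], [3.0, 4.0]]
fine = [
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.5],
]

fused = fuse(coarse, fine)
print(fused[0])  # [1.0, 2.0, 2.0, 2.0]
print(fused[3])  # [3.0, 3.0, 4.0, 4.5]
```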

  20. Semantic Segmentation: Normal VGG followed by an “upside down” VGG. Noh et al. Learning Deconvolution Network for Semantic Segmentation. ICCV 2015. Slide Credit: CS231n

  21. Instance Segmentation: Detect instances, give category, label pixels. “Simultaneous detection and segmentation” (SDS). Slide Credit: CS231n

  22. Instance Segmentation: Similar to R-CNN, but with segments. External segment proposals. Mask out background with mean image. Hariharan et al. Simultaneous Detection and Segmentation. ECCV 2014. Slide Credit: CS231n
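The "mask out background with mean image" step on slide 22 might look roughly like this for a single-channel crop. The function name, the scalar `mean_value` and the pixel values are hypothetical; a real pipeline would typically use a per-channel mean.

```python
def mask_background(region, mask, mean_value):
    """Keep pixels inside the segment; replace the rest with the mean."""
    return [
        [pixel if inside else mean_value
         for pixel, inside in zip(pixel_row, mask_row)]
        for pixel_row, mask_row in zip(region, mask)
    ]


# Hypothetical 3 x 3 crop around one segment proposal.
region = [
    [10.0, 20.0, 30.0],
    [40.0, 50.0, 60.0],
    [70.0, 80.0, 90.0],
]
# 1 = pixel belongs to the proposed segment, 0 = background.
mask = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]

masked = mask_background(region, mask, mean_value=128.0)
print(masked[0])  # [128.0, 20.0, 128.0]
```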

  23. Instance Segmentation: Hariharan et al. Hypercolumns for Object Segmentation and Fine-grained Localization. CVPR 2015. Slide Credit: CS231n

  24. Instance Segmentation: Similar to Faster R-CNN. Region proposal network (RPN). Reshape boxes to fixed size, figure / ground logistic regression. Mask out background, predict object class. Learn entire model end-to-end! Won COCO 2015 challenge (with ResNet). Dai et al. Instance-aware Semantic Segmentation via Multi-task Network Cascades. arXiv 2015. Slide Credit: CS231n

  25. Instance Segmentation: Predictions and ground truth. Dai et al. Instance-aware Semantic Segmentation via Multi-task Network Cascades. arXiv 2015. Slide Credit: CS231n

  26. Resources
      ● CS231n Lecture @ Stanford [slides][video]
      ● Code for Semantic Segmentation
        ○ FCN (Caffe)
      ● Code for Instance Segmentation
        ○ SDS (Caffe)
        ○ SDS using Hypercolumns & sharing conv computations (Caffe)
        ○ Instance-aware Semantic Segmentation via Multi-task Network Cascades (Caffe)
