
An Overview of Semantic Image Segmentation with Deep Learning



  1. An Overview of Semantic Image Segmentation with Deep Learning
     Simone Bonechi

  2. Outline
     - Semantic Image Segmentation
     - Deep Network for Semantic Segmentation
       - FCN (Fully Convolutional Neural Network)
       - DeconvNet
       - PSPNet (Pyramid Scene Parsing Network)
     - Work in progress…

  3. Semantic Image Segmentation

  4. Instance-Level Segmentation
     - Its main purpose is to identify objects of the same class and split them into separate instances

  5. Results on PASCAL VOC 2012

  6. Fully Convolutional Neural Network (FCN)
     Long, J., Shelhamer, E., & Darrell, T. (2015). Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 3431-3440).

  7. FCN Overview
     - Tested with AlexNet, VGG and GoogLeNet
     - Reinterpret standard classification convnets as “fully convolutional” networks (FCN) for semantic segmentation
     - Combine information from different layers for segmentation

  8. Replace FC with Convolutions
     - A classification network
     - Becoming fully convolutional
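The idea behind replacing fully connected (FC) layers with convolutions can be sketched in a few lines. The example below is a minimal single-channel illustration, not the authors' implementation: the helper names `fc_forward` and `conv_forward`, the toy weights, and the 3 x 3 input are all assumptions made for the sketch. It shows that an FC layer trained on a k x k patch gives the same scores as a k x k convolution with the same weights, and that the convolution can then slide over a larger image to produce a score map per class.

```python
def fc_forward(weights, bias, patch):
    """Fully connected layer applied to one flattened k x k patch."""
    flat = [v for row in patch for v in row]
    return [sum(w * x for w, x in zip(w_row, flat)) + b
            for w_row, b in zip(weights, bias)]


def conv_forward(weights, bias, image, k):
    """The same weights viewed as k x k kernels (stride 1, no padding)."""
    h, w = len(image), len(image[0])
    maps = []
    for w_row, b in zip(weights, bias):
        fmap = []
        for i in range(h - k + 1):
            row = []
            for j in range(w - k + 1):
                s = b
                for di in range(k):
                    for dj in range(k):
                        s += w_row[di * k + dj] * image[i + di][j + dj]
                row.append(s)
            fmap.append(row)
        maps.append(fmap)
    return maps


# Toy values (hypothetical): 2 classes, FC layer defined on a 2 x 2 patch.
W = [[1, 0, -1, 2], [0, 1, 1, -1]]
B = [0, 1]
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
top_left = [row[:2] for row in image[:2]]

fc_scores = fc_forward(W, B, top_left)     # one score per class
score_maps = conv_forward(W, B, image, 2)  # one 2 x 2 score map per class
```

On the 2 x 2 patch the two functions agree; on the larger image the convolutional form simply yields one score per spatial position instead of a single vector.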

  9. Upsampling the output

  10. Convolution & Deconvolution
      - Deconvolution
      - Transposed convolution
      - Fractionally strided convolution
      - Backward strided convolution
      - Upconvolution
      - …
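The names listed on this slide refer to the same upsampling operation. As a minimal sketch (1-D, single channel, no padding; the function name `conv_transpose_1d` and the toy inputs are assumptions for illustration), each input value scatters a scaled copy of the kernel into the output, with consecutive copies shifted by the stride:

```python
def conv_transpose_1d(x, kernel, stride):
    """Transposed convolution: every input value adds a scaled copy of
    the kernel to the output, shifted by `stride` per input position."""
    out = [0] * ((len(x) - 1) * stride + len(kernel))
    for i, v in enumerate(x):
        for j, w in enumerate(kernel):
            out[i * stride + j] += v * w
    return out


# Non-overlapping copies: stride equals kernel length.
up_a = conv_transpose_1d([1, 2, 3], [1, 1], stride=2)

# Overlapping copies: contributions are summed where kernels overlap.
up_b = conv_transpose_1d([1, 2], [1, 2, 1], stride=2)
```

The output length is `(len(x) - 1) * stride + len(kernel)`, which is why a stride greater than 1 enlarges the feature map.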

  11. Upsampling the output

  12. FCN Limitations
      - Fixed-size receptive field
        - FCN has a fixed-size receptive field; objects substantially larger or smaller than the receptive field may be fragmented or mislabeled
        - The label map is so small that the detailed structures of objects tend to be lost
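Both limitations follow from the layer stack itself, and a short calculation makes them concrete. The sketch below uses the standard receptive-field recurrence for a chain of convolution and pooling layers. The layer list is an assumed VGG-16-style configuration (3 x 3 convolutions with stride 1, 2 x 2 pooling with stride 2), chosen for illustration rather than taken from the slides.

```python
def receptive_field(layers):
    """Return (receptive field, cumulative stride) for a chain of layers,
    each given as a (kernel_size, stride) pair."""
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the field by (k - 1) input steps
        jump *= stride             # spacing between neighbouring outputs, in input pixels
    return rf, jump


conv, pool = (3, 1), (2, 2)
# Assumed VGG-16-style feature extractor: 2, 2, 3, 3, 3 convolutions, each group followed by pooling.
vgg_like = ([conv] * 2 + [pool]) * 2 + ([conv] * 3 + [pool]) * 3

rf, out_stride = receptive_field(vgg_like)
label_map_side = 224 // out_stride  # coarse label map for a 224 x 224 input
```

Under these assumptions every output unit sees the same fixed window of the input regardless of object size, and the label map is 32 times smaller than the image along each side.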

  13. FCN skip architecture
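The skip architecture combines coarse, deep predictions with finer predictions from a shallower layer. The sketch below illustrates only the fusion step under simplifying assumptions: nearest-neighbour 2x upsampling stands in for the learned upsampling used in FCN, and the helper names `upsample2x` and `fuse` as well as the toy score maps are hypothetical.

```python
def upsample2x(score):
    """Nearest-neighbour 2x upsampling (a stand-in for learned upsampling)."""
    out = []
    for row in score:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def fuse(coarse, fine):
    """Upsample the coarse score map and add it to the finer one."""
    up = upsample2x(coarse)
    return [[a + b for a, b in zip(r_up, r_fine)]
            for r_up, r_fine in zip(up, fine)]


coarse_scores = [[1, 2]]                     # deep layer, low resolution
fine_scores = [[0, 1, 0, 1], [1, 0, 1, 0]]   # shallower layer, 2x resolution
fused = fuse(coarse_scores, fine_scores)
```

The fused map keeps the resolution of the finer layer while carrying the semantic evidence of the deeper one.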

  14. FCN Results
      - Results on PASCAL VOC 2012

  15. DeconvNet
      Noh, H., Hong, S., & Han, B. (2015). Learning deconvolution network for semantic segmentation. In Proceedings of the IEEE International Conference on Computer Vision (pp. 1520-1528).

  16. Pooling & Unpooling
      - Unpooling
        - Retrieve structure of original activation map
        - Activation size is preserved, but still sparse
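Unpooling can be sketched as max pooling that remembers where each maximum came from, followed by writing the pooled values back to those positions. This is a minimal single-channel illustration: the function names, the 2 x 2 window, and the toy input are assumptions, and the input size is assumed to be divisible by the window size.

```python
def max_pool_with_switches(x, k=2):
    """k x k max pooling that also records the location ("switch") of each maximum."""
    pooled, switches = [], []
    for i in range(0, len(x), k):
        p_row, s_row = [], []
        for j in range(0, len(x[0]), k):
            cells = [(x[i + di][j + dj], (i + di, j + dj))
                     for di in range(k) for dj in range(k)]
            value, pos = max(cells, key=lambda c: c[0])
            p_row.append(value)
            s_row.append(pos)
        pooled.append(p_row)
        switches.append(s_row)
    return pooled, switches


def unpool(pooled, switches, height, width):
    """Place each pooled value back at its recorded location; all other cells stay zero."""
    out = [[0] * width for _ in range(height)]
    for p_row, s_row in zip(pooled, switches):
        for value, (i, j) in zip(p_row, s_row):
            out[i][j] = value
    return out


x = [[1, 3, 2, 0],
     [4, 2, 1, 5],
     [0, 1, 7, 2],
     [6, 0, 3, 1]]
pooled, switches = max_pool_with_switches(x)
restored = unpool(pooled, switches, 4, 4)
```

The restored map has the original size and the maxima sit in their original positions, but most entries are zero, which is the sparsity noted on the slide.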

  17. Convolution & Deconvolution
      - Deconvolution
        - Densify sparse activation map

  18. Visualization of activations

  19. Results - Comparisons

  20. PSPNet
      Zhao, Hengshuang, et al. "Pyramid scene parsing network." IEEE Conf. on Computer Vision and Pattern Recognition (CVPR). 2017.

  21. Atrous Convolution
      - Upsample with atrous convolution to compute features densely
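Atrous (dilated) convolution spaces the kernel taps `rate` samples apart, which enlarges the field of view without adding weights or reducing resolution. The sketch below is a 1-D, single-channel illustration without padding; the function name `atrous_conv_1d` and the toy signal are assumptions made for the example.

```python
def atrous_conv_1d(x, kernel, rate):
    """1-D atrous convolution (no padding): kernel taps are `rate` samples apart."""
    span = (len(kernel) - 1) * rate  # distance between the first and last tap
    return [sum(w * x[i + j * rate] for j, w in enumerate(kernel))
            for i in range(len(x) - span)]


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]

dense = atrous_conv_1d(signal, kernel, rate=1)    # ordinary convolution, 3-sample window
dilated = atrous_conv_1d(signal, kernel, rate=2)  # same 3 weights, 5-sample window
```

With `rate=2` the three weights cover `(3 - 1) * 2 + 1 = 5` input samples, so the same kernel sees a wider context at every position.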

  22. PSPNet Results
