An Overview of Semantic Image Segmentation with Deep Learning Simone Bonechi
Outline Ø Semantic Image Segmentation Ø Deep Network for Semantic Segmentation • FCN (Fully Convolutional Neural Network) • DeconvNet • PSPNet (Pyramid Scene Parsing Network) Ø Work in progress…
Semantic Image Segmentation
Instance-Level Segmentation Ø Its main purpose is to identify objects of the same class and split them into different instances
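To make the difference between a semantic label map and an instance label map concrete, here is a toy sketch. It assumes instances of a class never touch, so each 4-connected blob can be treated as one instance; real instance segmentation methods cannot rely on that. The helper name `split_instances` and the tiny label map are hypothetical.

```python
from collections import deque

import numpy as np

def split_instances(semantic, class_id):
    """Give each 4-connected blob of `class_id` its own instance id (1, 2, ...)."""
    h, w = semantic.shape
    instances = np.zeros((h, w), dtype=int)
    next_id = 0
    for r in range(h):
        for c in range(w):
            if semantic[r, c] != class_id or instances[r, c]:
                continue
            # Unvisited pixel of the class: start a new instance and flood-fill it.
            next_id += 1
            instances[r, c] = next_id
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and semantic[ny, nx] == class_id
                            and not instances[ny, nx]):
                        instances[ny, nx] = next_id
                        queue.append((ny, nx))
    return instances

# Semantic map: every pixel of class 1 carries the same label.
semantic = np.array([
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 1],
    [0, 0, 0, 0, 1],
])
# Instance map: the two separate objects of class 1 get ids 1 and 2.
instances = split_instances(semantic, class_id=1)
```

Semantic segmentation stops at the first array; instance-level segmentation has to produce something like the second.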
Results on PASCAL VOC 2012
Fully Convolutional Neural Network (FCN) Long, J., Shelhamer, E., & Darrell, T. (2015). Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 3431-3440).
FCN Overview Ø Tested with AlexNet, VGG and GoogLeNet Ø Reinterpret standard classification convnets as “Fully convolutional” networks (FCN) for semantic segmentation Ø Combine information from different layers for segmentation
Replace FC with Convolutions Ø A classification network Ø Becoming fully convolutional
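Replacing a fully connected layer with a convolution can be checked numerically. The sketch below assumes a toy 4-channel 3x3 feature map and a 5-way FC layer (all sizes and random weights are hypothetical). The FC weights are reshaped into 5 kernels that cover the whole 3x3 map, and the resulting convolution is applied to a larger input.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a 4-channel 3x3 feature map feeding a 5-way FC layer.
C, H, W, K = 4, 3, 3, 5
fc_weight = rng.standard_normal((K, C * H * W))
fc_bias = rng.standard_normal(K)

def fc_layer(feat):
    """Classic fully connected layer on a flattened (C, H, W) feature map."""
    return fc_weight @ feat.reshape(-1) + fc_bias

def conv_valid(feat, kernels, bias):
    """'Valid' cross-correlation, stride 1: (C, H_in, W_in) -> (K, H_out, W_out)."""
    k, _, kh, kw = kernels.shape
    _, h_in, w_in = feat.shape
    out = np.empty((k, h_in - kh + 1, w_in - kw + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            patch = feat[:, i:i + kh, j:j + kw]
            # Contract channel, row and column axes of every kernel with the patch.
            out[:, i, j] = np.tensordot(kernels, patch, axes=3) + bias
    return out

# Same weights, viewed as K kernels of size C x H x W.
conv_kernels = fc_weight.reshape(K, C, H, W)

small = rng.standard_normal((C, H, W))
large = rng.standard_normal((C, H + 2, W + 3))

fc_out = fc_layer(small)                                    # shape (K,)
conv_out_small = conv_valid(small, conv_kernels, fc_bias)   # shape (K, 1, 1)
conv_out_large = conv_valid(large, conv_kernels, fc_bias)   # shape (K, 3, 4)
```

On the original input size the convolution reproduces the FC output exactly. On the larger input it yields a spatial map of class scores, one per window position. This is what lets a classification network produce a coarse segmentation output.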
Upsampling the output
Convolution & Deconvolution Ø Deconvolution, also known as: • Transposed convolution • Fractionally strided convolution • Backward strided convolution • Upconvolution • …
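A minimal 1D sketch of the operation behind these names, assuming no padding and hypothetical kernel and input values. Each input value scatters a scaled copy of the kernel into the output, with copies placed `stride` samples apart, so the output is longer than the input.

```python
import numpy as np

def conv1d(x, w, stride):
    """Strided 'valid' cross-correlation (the forward operation)."""
    k = len(w)
    n_out = (len(x) - k) // stride + 1
    return np.array([np.dot(x[i * stride:i * stride + k], w) for i in range(n_out)])

def transposed_conv1d(y, w, stride):
    """Transposed convolution: each input value adds a scaled kernel copy to the output."""
    k = len(w)
    out = np.zeros((len(y) - 1) * stride + k)
    for i, v in enumerate(y):
        out[i * stride:i * stride + k] += v * w
    return out

w = np.array([1.0, 2.0, 1.0])
y = np.array([1.0, 0.0, 2.0])

up = transposed_conv1d(y, w, stride=2)   # length (3 - 1) * 2 + 3 = 7
# up == [1, 2, 1, 0, 2, 4, 2]
```

The name "transposed" refers to the fact that this operation is the adjoint of the strided convolution with the same kernel: for any `x` of length 7, `np.dot(conv1d(x, w, 2), y)` equals `np.dot(x, transposed_conv1d(y, w, 2))`. It is not an inverse of the convolution.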
Upsampling the output
FCN Limitations Ø Fixed-size receptive field • FCN has a fixed-size receptive field; objects substantially larger or smaller than the receptive field may be fragmented or mislabeled • The label map is so small that detailed object structures tend to be lost
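How large the fixed receptive field is follows directly from the kernel sizes and strides of the layer stack. The sketch below assumes the standard VGG-16 layout (3x3 convolutions with stride 1, 2x2 max pooling with stride 2, and the first FC layer converted to a 7x7 convolution). The layer list is written out by hand here and is not taken from the slides.

```python
def receptive_field(layers):
    """Return (receptive field, cumulative stride) for a list of (kernel, stride) layers."""
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump   # each extra tap reaches `jump` input pixels further
        jump *= stride              # distance in input pixels between adjacent outputs
    return rf, jump

conv, pool = (3, 1), (2, 2)

# Assumed VGG-16 convolutional stack: 2, 2, 3, 3, 3 conv layers, each group followed by pooling.
vgg16 = [conv] * 2 + [pool] + [conv] * 2 + [pool] + ([conv] * 3 + [pool]) * 3

rf_pool5, stride_pool5 = receptive_field(vgg16)           # (212, 32)
rf_fc6, stride_fc6 = receptive_field(vgg16 + [(7, 1)])    # (404, 32)
```

Under these assumptions every output location sees a 404x404 window, and the label map is 32 times smaller than the input along each axis. Both limitations on the slide follow from this.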
FCN skip architecture
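The fusion step of a skip architecture can be sketched as follows, assuming (as in the FCN-16s variant of Long et al.) that the coarse scores are upsampled 2x and summed element-wise with scores predicted from a shallower, finer layer. Nearest-neighbour upsampling stands in for the learned upsampling here, and the shapes and random scores are hypothetical.

```python
import numpy as np

def upsample2x(scores):
    """Nearest-neighbour 2x upsampling of a (classes, H, W) score map (a simplification)."""
    return scores.repeat(2, axis=1).repeat(2, axis=2)

rng = np.random.default_rng(0)
num_classes = 21                                      # PASCAL VOC: 20 classes + background

coarse = rng.standard_normal((num_classes, 4, 4))     # scores from the deepest layer
finer = rng.standard_normal((num_classes, 8, 8))      # scores from a shallower layer

fused = upsample2x(coarse) + finer                    # skip connection: element-wise sum
labels = fused.argmax(axis=0)                         # per-location class prediction
```

The deep layer contributes semantics at low resolution, the shallow layer contributes spatial detail, and the sum is then upsampled to the input size.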
FCN Results Ø Results on PASCAL VOC 2012
DeconvNet Noh, H., Hong, S., & Han, B. (2015). Learning deconvolution network for semantic segmentation. In Proceedings of the IEEE International Conference on Computer Vision (pp. 1520-1528).
Pooling & Unpooling Ø Unpooling • Retrieve structure of original activation map • Activation size is preserved, but still sparse
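A small sketch of pooling with recorded maximum locations ("switches") and the matching unpooling, assuming non-overlapping 2x2 windows on a single 2D activation map. Function names and the example values are hypothetical.

```python
import numpy as np

def max_pool_with_switches(x, size=2):
    """Non-overlapping max pooling on a 2D map; also returns the argmax inside each window."""
    h, w = x.shape
    blocks = x.reshape(h // size, size, w // size, size).transpose(0, 2, 1, 3)
    flat = blocks.reshape(h // size, w // size, size * size)
    return flat.max(axis=2), flat.argmax(axis=2)

def unpool(pooled, switches, size=2):
    """Place each pooled value back at its recorded location; everything else stays zero."""
    ph, pw = pooled.shape
    flat = np.zeros((ph, pw, size * size), dtype=pooled.dtype)
    rows, cols = np.indices((ph, pw))
    flat[rows, cols, switches] = pooled
    blocks = flat.reshape(ph, pw, size, size).transpose(0, 2, 1, 3)
    return blocks.reshape(ph * size, pw * size)

x = np.array([
    [1.0, 2.0, 0.0, 1.0],
    [3.0, 0.0, 5.0, 2.0],
    [0.0, 0.0, 1.0, 0.0],
    [4.0, 1.0, 0.0, 2.0],
])
pooled, switches = max_pool_with_switches(x)   # pooled == [[3, 5], [4, 2]]
restored = unpool(pooled, switches)
# restored:
# [[0, 0, 0, 0],
#  [3, 0, 5, 0],
#  [0, 0, 0, 0],
#  [4, 0, 0, 2]]
```

The restored map has the original size and keeps each maximum at its original position, but three of every four entries are zero. This is the sparsity the slide refers to.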
Convolution & Deconvolution Ø Deconvolution • Densify sparse activation map
Visualization of activations
Results - Comparisons
PSPNet Zhao, Hengshuang, et al. "Pyramid scene parsing network." IEEE Conf. on Computer Vision and Pattern Recognition (CVPR). 2017.
Atrous Convolution Ø Atrous (dilated) convolution upsamples the filter by inserting holes between its taps, so features can be computed densely without further downsampling
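A 1D sketch of atrous convolution, assuming no padding and a hypothetical 3-tap filter. The same three weights are applied with their taps `rate` samples apart, which enlarges the filter's extent without adding parameters or reducing the output resolution.

```python
import numpy as np

def atrous_conv1d(x, w, rate):
    """'Valid' 1D cross-correlation with holes: filter taps are `rate` samples apart."""
    span = (len(w) - 1) * rate + 1            # effective extent of the filter
    n_out = len(x) - span + 1
    return np.array([np.dot(x[i:i + span:rate], w) for i in range(n_out)])

x = np.arange(10, dtype=float)
w = np.array([1.0, 0.0, -1.0])

dense = atrous_conv1d(x, w, rate=1)   # ordinary convolution, extent 3
holes = atrous_conv1d(x, w, rate=2)   # same 3 weights, extent 5

# Equivalent view: insert zeros ("holes") into the filter and convolve normally.
w_up = np.zeros(5)
w_up[::2] = w
same = atrous_conv1d(x, w_up, rate=1)
```

Here `holes` and `same` are identical, which is the sense in which atrous convolution "upsamples the filter". The output still has one value per valid input position, with no striding involved.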
PSPNet Results