
PSConv: Squeezing Feature Pyramid into One Compact Poly-Scale Convolutional Layer (PowerPoint Presentation)



  1. PSConv: Squeezing Feature Pyramid into One Compact Poly-Scale Convolutional Layer
     Duo Li, Anbang Yao, Qifeng Chen

  2. Highlights
     • Investigates multi-scale architecture design through the lens of kernel engineering rather than network engineering
     • Extends the scope of the conventional mono-scale convolution operation with our Poly-Scale Convolution (PSConv)
     • Brings performance improvements on classification, detection, and segmentation tasks with NO extra computational overhead

  3. Motivation: Multi-Scale Architecture Design
     • Single-Scale: AlexNet, VGGNet, ......
     • Multi-Scale: FCN -> skip connection; Inception -> parallel stream; ......
     Long et al., Fully Convolutional Networks for Semantic Segmentation, CVPR 2015
     Szegedy et al., Going Deeper with Convolutions, CVPR 2015

  4. Previous Work: Layer-Level Skip Connection

  5. Previous Work: Filter-Level Parallel Stream
     • Kernel Size
     • Dilation Rate

  6. Previous Work: Filter-Level Feature Pyramid

  7. Motivation: Kernel-Level Feature Pyramid
     (Figure: input feature map and convolutional filter banks; different colors indicate different dilation rates)

  8. Poly-Scale Convolution Method
     (Figure: standard convolution and dilated convolution)
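To make the contrast concrete, here is a hedged sketch of the three operations in formulas (the notation is mine, not copied from the slides): y is the output feature map, x the input, K the kernel, c and k index output and input channels, and (i, j) ranges over spatial kernel offsets.

    % Standard convolution: one fixed sampling grid for every channel
    y_{c,h,w} = \sum_{k} \sum_{(i,j)} K_{c,k,i,j} \, x_{k,\, h+i,\, w+j}

    % Dilated convolution: a single dilation rate d shared by all channels
    y_{c,h,w} = \sum_{k} \sum_{(i,j)} K_{c,k,i,j} \, x_{k,\, h+d\cdot i,\, w+d\cdot j}

    % Poly-scale convolution: the rate d_{c,k} varies cyclically over output
    % channel c and input channel k, packing a pyramid of receptive field
    % sizes into one compact layer
    y_{c,h,w} = \sum_{k} \sum_{(i,j)} K_{c,k,i,j} \, x_{k,\, h+d_{c,k}\cdot i,\, w+d_{c,k}\cdot j}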

  9. Efficient Implementation
     Observation: feature channel indices are interchangeable
     Implementation: group kernels with the same dilation rate together and implement each group with a group convolution
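A minimal PyTorch sketch of this grouping trick (illustrative only: PolyScaleConv2d, its argument names, and the dilation set (1, 2, 4, 8) are my own choices, not the official implementation at https://github.com/d-li14/PSConv):

    import torch
    import torch.nn as nn

    class PolyScaleConv2d(nn.Module):
        """Sketch of a poly-scale convolution layer.

        Kernels sharing a dilation rate are gathered into one contiguous
        channel group, so each scale runs as an ordinary dilated
        convolution and the per-scale outputs are concatenated.
        """

        def __init__(self, in_channels, out_channels, kernel_size=3,
                     dilations=(1, 2, 4, 8)):
            super().__init__()
            n = len(dilations)
            assert in_channels % n == 0 and out_channels % n == 0
            self.in_split = in_channels // n
            self.convs = nn.ModuleList(
                nn.Conv2d(self.in_split, out_channels // n, kernel_size,
                          # 'same' padding for an odd dilated kernel
                          padding=d * (kernel_size - 1) // 2,
                          dilation=d, bias=False)
                for d in dilations)

        def forward(self, x):
            # Channel indices are interchangeable, so the input may be
            # split into one contiguous chunk per dilation rate.
            chunks = torch.split(x, self.in_split, dim=1)
            return torch.cat(
                [conv(c) for conv, c in zip(self.convs, chunks)], dim=1)

Because this only redistributes the same number of kernel weights across dilation groups, the parameter count and FLOPs match a standard convolution with identical channel counts, which is what the no-extra-overhead claim on slide 2 rests on.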

  10. Quantitative Results: ILSVRC 2012
     Comparison to baseline models and state-of-the-art multi-scale architectures on ImageNet

  11. Quantitative Results: MS COCO 2017
     Comparison to baselines with basic/cascade detectors on the COCO detection track

  12. Quantitative Results: MS COCO 2017
     Comparison to baselines with basic/cascade detectors on the COCO segmentation track

  13. Qualitative Results: Scale Allocation
     (Figure: learned scale allocation for PS-ResNet-50 on ImageNet and PS-ResNeXt-29 on CIFAR-100; ■ indicates the starting residual block of each stage)

  14. Conclusion
     • A plug-and-play convolution operation for any deep learning model (see the usage sketch below)
     • Leads to consistent and considerable performance margins across a wide range of vision tasks, without bells and whistles
     • Code available for reproducibility: https://github.com/d-li14/PSConv
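As an illustration of the plug-and-play claim, a hedged usage sketch with the hypothetical PolyScaleConv2d module from above (not the repository's actual API):

    import torch

    # Swap a standard 3x3 convolution for the poly-scale sketch; channel
    # counts and spatial resolution are unchanged.
    layer = PolyScaleConv2d(64, 128, kernel_size=3, dilations=(1, 2, 4, 8))
    x = torch.randn(1, 64, 56, 56)
    y = layer(x)
    print(y.shape)  # torch.Size([1, 128, 56, 56])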

  15. Thanks!
