Chenxi Liu, Liang-Chieh Chen, Florian Schroff, Hartwig Adam, Wei Hua, Alan Yuille, Li Fei-Fei 06/18/2019 @CVPR
Neural Architecture Search for Image Classification
Zoph, Barret, et al. "Learning transferable architectures for scalable image recognition." In CVPR. 2018.
Liu, Chenxi, et al. "Progressive neural architecture search." In ECCV. 2018.
Real, Esteban, et al. "Regularized evolution for image classifier architecture search." In AAAI. 2019.
Liu, Hanxiao, Karen Simonyan, and Yiming Yang. "Darts: Differentiable architecture search." In ICLR. 2019.
Neural Architecture Search for Dense Image Prediction
● Image classification is a good starting point for NAS, but should not be the end point.
● Our paper is one of the first efforts to extend NAS to dense image prediction (semantic segmentation, to be exact).
Challenge 1: Network Level Search Space
Inner Cell Level (automatically search)
Outer Network Level (hand design)
Challenge 2: Need for High Resolution & Efficient NAS
[Figure: a 32x32 "airplane" image compared with a 321x321 image.]
Idea of Differentiable NAS
[Figure: Network\Layer grid with rows #1 to #4 and columns 1, 2, ..., L-1, L; four alternatives are weighted by ɑ1, ɑ2, ɑ3, ɑ4.]
ɑ3 is the largest among the four; the alternatives weighted by ɑ1, ɑ2, and ɑ4 are crossed out (❌).
Liu, Hanxiao, Karen Simonyan, and Yiming Yang. "Darts: Differentiable architecture search." In ICLR. 2019.
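The selection step on this slide can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: it assumes the softmax relaxation used in DARTS, and the ɑ values, the scalar "outputs", and the function names are hypothetical.

```python
import math

def softmax(alphas):
    """Turn raw architecture weights into positive weights that sum to 1."""
    m = max(alphas)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]

def mixed_output(alphas, candidate_outputs):
    """During search: blend all candidates, weighted by softmax(alpha)."""
    weights = softmax(alphas)
    return sum(w * y for w, y in zip(weights, candidate_outputs))

def select_candidate(alphas):
    """After search: keep only the candidate with the largest alpha."""
    return max(range(len(alphas)), key=lambda i: alphas[i])

# Hypothetical values chosen so that the third weight is the largest.
alphas = [0.1, -0.3, 1.2, 0.4]
outputs = [1.0, 2.0, 3.0, 4.0]  # stand-ins for the four candidates' outputs

mixed = mixed_output(alphas, outputs)
best = select_candidate(alphas)  # zero-based index 2, i.e. the third alpha
```

Because the blend is differentiable with respect to the weights, they can be trained by gradient descent together with the network parameters, which is what makes this style of search cheaper than training each candidate separately.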
Network Level Search Space
[Figure: Downsample\Layer grid with layers 1, 2, 3, 4, 5, ..., L-1, L on the horizontal axis and downsample rates 1, 2, 4, 8, 16, 32 on the vertical axis; four ASPP modules are attached to the grid.]
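One way to get a feel for the size of this grid is to count the paths through it. The sketch below rests on assumptions that the slide itself does not state: cells live at downsample rates 4, 8, 16, and 32, the first cell sits at rate 4, and each layer may keep its rate, halve it, or double it. The names `RATES`, `next_rates`, and `count_paths` are hypothetical.

```python
# Assumed downsample rates available to cells (rates 1 and 2 are treated
# here as belonging to a fixed stem, which is an assumption of this sketch).
RATES = (4, 8, 16, 32)

def next_rates(rate):
    """Rates reachable in the next layer: halve, keep, or double (assumed rule)."""
    candidates = (rate // 2, rate, rate * 2)
    return [r for r in candidates if r in RATES]

def count_paths(num_layers, start_rate=4):
    """Count network-level paths of the given depth by dynamic programming."""
    counts = {r: 0 for r in RATES}
    counts[start_rate] = 1
    for _ in range(num_layers - 1):
        new_counts = {r: 0 for r in RATES}
        for rate, n in counts.items():
            for nxt in next_rates(rate):
                new_counts[nxt] += n
        counts = new_counts
    return sum(counts.values())

paths_for_three_layers = count_paths(3)
```

Calling `count_paths(12)` gives the number of paths for a 12-layer grid under these assumptions; the point is only that the count grows quickly with depth, which is why exhaustive enumeration is impractical.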
DeepLabv3
[Figure: DeepLabv3 drawn on the same Downsample\Layer grid (layers 1 to L, downsample rates 1 to 32, with ASPP).]
Chen, Liang-Chieh, George Papandreou, Florian Schroff, and Hartwig Adam. "Rethinking atrous convolution for semantic image segmentation." arXiv preprint arXiv:1706.05587 (2017).
Conv-Deconv
[Figure: Conv-Deconv drawn on the same Downsample\Layer grid (layers 1 to L, downsample rates 1 to 32).]
Noh, Hyeonwoo, Seunghoon Hong, and Bohyung Han. "Learning deconvolution network for semantic segmentation." In ICCV. 2015.
Stacked Hourglass
[Figure: Stacked Hourglass drawn on the same Downsample\Layer grid (layers 1 to L, downsample rates 1 to 32).]
Newell, Alejandro, Kaiyu Yang, and Jia Deng. "Stacked hourglass networks for human pose estimation." In ECCV. 2016.
Experiments
● 321x321 image crops from Cityscapes
● Number of layers L = 12
● 40 epochs; less than 3 days on one P100 GPU
Auto-DeepLab Cell Architecture
[Figure: the searched cell. Inputs are H^(l-2) and H^(l-1); five blocks each add (+) two branches chosen from atrous (atr) and separable (sep) 3x3 and 5x5 convolutions; the results are concatenated (concat) to form H^l.]
Atrous convolution is often used.
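For readers unfamiliar with the "atr" operations in the cell, here is a minimal 1-D sketch of atrous (dilated) convolution in plain Python. It is an illustration only: the cell uses 2-D convolutions, the function name `atrous_conv1d` is hypothetical, and no padding is applied.

```python
def atrous_conv1d(signal, kernel, rate):
    """Valid 1-D atrous convolution (no kernel flip, as in deep learning).

    Kernel taps are applied `rate` samples apart, so a 3-tap kernel with
    rate 2 covers 5 input samples while still using only 3 weights.
    """
    span = (len(kernel) - 1) * rate  # distance between first and last tap
    out = []
    for start in range(len(signal) - span):
        out.append(sum(k * signal[start + i * rate]
                       for i, k in enumerate(kernel)))
    return out

signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 1, 1]

dense = atrous_conv1d(signal, kernel, rate=1)    # ordinary convolution
dilated = atrous_conv1d(signal, kernel, rate=2)  # taps two samples apart
```

With `rate=1` this reduces to an ordinary convolution; larger rates enlarge the receptive field without adding weights or reducing resolution through striding, which is the usual reason atrous convolution is attractive for dense prediction.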
Auto-DeepLab Network Architecture
[Figure: the searched network drawn on the Downsample\Layer grid (layers 1 to L, downsample rates 1 to 32, with ASPP). Two regions are annotated: "General tendency to downsample" and "General tendency to upsample".]
Performance on Cityscapes (Test Set)
Method            ImageNet?   Coarse?   mIOU (%)
GridNet           -           -         69.5
FRRN-B            -           -         71.8
Auto-DeepLab-S    -           -         79.9
Auto-DeepLab-L    -           -         80.4
Auto-DeepLab-S    -           Yes       80.9
Auto-DeepLab-L    -           Yes       82.1
DeepLabv3+        Yes         Yes       82.1
DPC               Yes         Yes       82.7
Fourure, Damien, et al. "Residual conv-deconv grid network for semantic segmentation." In BMVC. 2017.
Pohlen, Tobias, et al. "Full-resolution residual networks for semantic segmentation in street scenes." In CVPR. 2017.
Chen, Liang-Chieh, et al. "Encoder-decoder with atrous separable convolution for semantic image segmentation." In ECCV. 2018.
Chen, Liang-Chieh, et al. "Searching for efficient multi-scale architectures for dense image prediction." In NeurIPS. 2018.
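The mIOU column reports mean intersection-over-union. As a reminder of what that metric measures, here is a simplified sketch on flat label lists. It is not the official Cityscapes evaluation, which accumulates counts over the whole dataset and handles ignored labels; the function name and the toy labels are hypothetical.

```python
def mean_iou(pred, target, num_classes):
    """Average per-class IoU over classes that occur in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0

# Toy example: six pixels, three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
# Per-class IoU here is 1/3, 2/3, and 1/2, so the mean is 0.5.
```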
Thank You
@chenxi116
https://cs.jhu.edu/~cxliu/