
An Overview of Deep Residual Learning. Semih Yagcioglu, 01.03.2016



  1. An Overview of Deep Residual Learning Semih Yagcioglu 01.03.2016

  2. Deep Residual Learning
     • Microsoft Research Asia (MSRA)
     • Kaiming He, Xiangyu Zhang, Shaoqing Ren, & Jian Sun. “Deep Residual Learning for Image Recognition”. arXiv 2015.
     • Shaoqing Ren, Kaiming He, Ross Girshick, & Jian Sun. “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks”. NIPS 2015.
     • ILSVRC & COCO 2015 competitions

  3. MSRA @ ILSVRC & COCO 2015 Competitions
     • 1st place in all five main tracks
     • ImageNet Classification: “Ultra-deep” 152-layer nets
     • ImageNet Detection: 16% better than 2nd
     • ImageNet Localization: 27% better than 2nd
     • COCO Detection: 11% better than 2nd
     • COCO Segmentation: 12% better than 2nd
     * improvements are relative numbers
     Slide Credit: He et al. (MSRA)

  4. Revolution of Depth
     ImageNet Classification top-5 error (%)
     • ILSVRC'10: 28.2 (shallow)
     • ILSVRC'11: 25.8 (shallow)
     • ILSVRC'12, AlexNet: 16.4 (8 layers)
     • ILSVRC'13: 11.7 (8 layers)
     • ILSVRC'14, VGG: 7.3 (19 layers)
     • ILSVRC'14, GoogleNet: 6.7 (22 layers)
     • ILSVRC'15, ResNet: 3.57 (152 layers)
     Slide Credit: He et al. (MSRA)
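The drop in top-5 error on this slide can be expressed as a relative reduction, in the same relative sense as the footnote on the competition slide. The snippet below is only an illustrative sketch: the function name `relative_reduction` is hypothetical, and the comparison it makes (ILSVRC'14 GoogleNet against ILSVRC'15 ResNet) is chosen here for illustration, not taken from the slides.

```python
# Top-5 error (%) values as listed on the slide.
top5_error = {
    "ILSVRC'12 AlexNet": 16.4,
    "ILSVRC'14 VGG": 7.3,
    "ILSVRC'14 GoogleNet": 6.7,
    "ILSVRC'15 ResNet": 3.57,
}


def relative_reduction(previous: float, new: float) -> float:
    """Fraction of the previous error that was removed (hypothetical helper)."""
    return (previous - new) / previous


googlenet = top5_error["ILSVRC'14 GoogleNet"]
resnet = top5_error["ILSVRC'15 ResNet"]
reduction_pct = round(100 * relative_reduction(googlenet, resnet), 1)
print(f"{reduction_pct}% relative reduction in top-5 error")
```

An absolute difference of 3.13 points therefore corresponds to removing a little under half of the previous year's best error.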

  5. Revolution of Depth: Engines of visual recognition
     PASCAL VOC 2007 Object Detection mAP (%)
     • HOG, DPM: 34 (shallow)
     • AlexNet (RCNN): 58 (8 layers)
     • VGG (RCNN): 66 (16 layers)
     • ResNet (Faster RCNN)*: 86 (101 layers)
     * w/ other improvements & more data
     Slide Credit: He et al. (MSRA)

  6. Residual learning reformulates the learning procedure and redirects the information flow in deep neural networks.
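A minimal sketch of this reformulation, assuming the standard form from the He et al. paper, in which a block outputs F(x) + x and its layers fit only the residual F(x) rather than the full mapping. The names `plain_block`, `residual_block`, `linear`, and `relu` are hypothetical, and plain Python lists stand in for tensors.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def relu(v: Vector) -> Vector:
    return [max(0.0, a) for a in v]


def linear(weights: Matrix, v: Vector) -> Vector:
    """Multiply a square weight matrix by a vector."""
    return [sum(w * a for w, a in zip(row, v)) for row in weights]


def plain_block(w1: Matrix, w2: Matrix, x: Vector) -> Vector:
    """Two stacked layers that must learn the whole mapping by themselves."""
    return relu(linear(w2, relu(linear(w1, x))))


def residual_block(w1: Matrix, w2: Matrix, x: Vector) -> Vector:
    """The same two layers learn a residual F(x); a shortcut adds x back."""
    f = linear(w2, relu(linear(w1, x)))
    return relu([fi + xi for fi, xi in zip(f, x)])


# With all-zero weights the plain block loses its input,
# while the residual block passes it through unchanged.
zeros = [[0.0, 0.0], [0.0, 0.0]]
x = [1.0, 2.0]
print(plain_block(zeros, zeros, x))     # [0.0, 0.0]
print(residual_block(zeros, zeros, x))  # [1.0, 2.0]
```

The shortcut gives the input a direct path to the output, which is one way to read "redirects the information flow": a block that has nothing useful to add can fall back to the identity by driving its weights toward zero.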

  7. Revolution of Depth
     AlexNet, 8 layers (ILSVRC 2012)
     • 11x11 conv, 96, /4, pool/2
     • 5x5 conv, 256, pool/2
     • 3x3 conv, 384
     • 3x3 conv, 384
     • 3x3 conv, 256, pool/2
     • fc, 4096
     • fc, 4096
     • fc, 1000
     Slide Credit: He et al. (MSRA)
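The "8 layers" figure counts only layers with learned weights (conv and fc); pooling is written as a suffix on a conv entry and is not counted separately. A small sketch of that convention, using the layer strings from the slide (the helper `count_weighted_layers` is a hypothetical name):

```python
# Layer specs copied from the AlexNet slide, one string per weighted layer.
alexnet = [
    "11x11 conv, 96, /4, pool/2",
    "5x5 conv, 256, pool/2",
    "3x3 conv, 384",
    "3x3 conv, 384",
    "3x3 conv, 256, pool/2",
    "fc, 4096",
    "fc, 4096",
    "fc, 1000",
]


def layer_kind(spec: str) -> str:
    """Return 'conv' or 'fc' from a spec such as '3x3 conv, 384' or 'fc, 4096'."""
    return spec.split(",")[0].split()[-1]


def count_weighted_layers(specs: list) -> int:
    """Count conv and fc entries; pooling suffixes do not add to the depth."""
    return sum(1 for spec in specs if layer_kind(spec) in {"conv", "fc"})


print(count_weighted_layers(alexnet))  # 8
```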

  8. Revolution of Depth
     AlexNet, 8 layers (ILSVRC 2012)
     • 11x11 conv, 96, /4, pool/2
     • 5x5 conv, 256, pool/2
     • 3x3 conv, 384 (×2)
     • 3x3 conv, 256, pool/2
     • fc, 4096 (×2)
     • fc, 1000
     VGG, 19 layers (ILSVRC 2014)
     • 3x3 conv, 64
     • 3x3 conv, 64, pool/2
     • 3x3 conv, 128
     • 3x3 conv, 128, pool/2
     • 3x3 conv, 256 (×3)
     • 3x3 conv, 256, pool/2
     • 3x3 conv, 512 (×3)
     • 3x3 conv, 512, pool/2
     • 3x3 conv, 512 (×3)
     • 3x3 conv, 512, pool/2
     • fc, 4096 (×2)
     • fc, 1000
     GoogleNet, 22 layers (ILSVRC 2014)
     [Network diagram: input, 7x7 Conv, MaxPool and LocalRespNorm stages, then repeated modules of parallel 1x1, 3x3, and 5x5 Conv and MaxPool branches joined by DepthConcat, with AveragePool, FC, and SoftmaxActivation nodes feeding three outputs: softmax0, softmax1, softmax2]
     Slide Credit: He et al. (MSRA)

  9. Revolution of Depth
     AlexNet, 8 layers (ILSVRC 2012)
     • 11x11 conv, 96, /4, pool/2
     • 5x5 conv, 256, pool/2
     • 3x3 conv, 384 (×2)
     • 3x3 conv, 256, pool/2
     • fc, 4096 (×2)
     • fc, 1000
     VGG, 19 layers (ILSVRC 2014)
     • 3x3 conv, 64
     • 3x3 conv, 64, pool/2
     • 3x3 conv, 128
     • 3x3 conv, 128, pool/2
     • 3x3 conv, 256 (×3)
     • 3x3 conv, 256, pool/2
     • 3x3 conv, 512 (×3)
     • 3x3 conv, 512, pool/2
     • 3x3 conv, 512 (×3)
     • 3x3 conv, 512, pool/2
     • fc, 4096 (×2)
     • fc, 1000
     ResNet, 152 layers (ILSVRC 2015)
     • 7x7 conv, 64, /2, pool/2
     • [1x1 conv, 64; 3x3 conv, 64; 1x1 conv, 256] (×3)
     • [1x1 conv, 128; 3x3 conv, 128; 1x1 conv, 512] (×8, first block /2)
     • [1x1 conv, 256; 3x3 conv, 256; 1x1 conv, 1024] (×36, first block /2)
     • [1x1 conv, 512; 3x3 conv, 512; 1x1 conv, 2048] (×3, first block /2)
     • ave pool, fc 1000
     Slide Credit: He et al. (MSRA)
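The 152-layer figure can be checked from the listing: each repeated group is a three-conv bottleneck block (1x1, 3x3, 1x1), and the depth counts those convs plus the initial 7x7 conv and the final fc layer. The sketch below assumes the per-stage block counts 3, 8, 36, and 3 given for ResNet-152 in the He et al. paper; the names `STAGES` and `resnet_depth` are hypothetical.

```python
# (bottleneck width, output width, number of blocks) per stage.
# Block counts 3, 8, 36, 3 are assumed from the ResNet-152 configuration.
STAGES = [
    (64, 256, 3),
    (128, 512, 8),
    (256, 1024, 36),
    (512, 2048, 3),
]

CONVS_PER_BOTTLENECK = 3  # 1x1 conv, 3x3 conv, 1x1 conv


def resnet_depth(stages: list) -> int:
    """Count weighted layers: first 7x7 conv + bottleneck convs + final fc."""
    block_convs = sum(blocks * CONVS_PER_BOTTLENECK for _, _, blocks in stages)
    return 1 + block_convs + 1


print(resnet_depth(STAGES))  # 152
```

Pooling layers and any projection convolutions on shortcut connections are left out of the count, following the same convention as the "8 layers" and "19 layers" labels for AlexNet and VGG.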
