
CENG5030 Part 2-1: Introduction to Convolutional Neural Network (Bei Yu)



  1. CENG5030 Part 2-1: Introduction to Convolutional Neural Network. Bei Yu (Latest update: March 4, 2019), Spring 2019

  2. Overview: CNN Architecture Overview; CNN Energy Efficiency; CNN on Embedded Platform

  3. Overview: CNN Architecture Overview; CNN Energy Efficiency; CNN on Embedded Platform

  4. CNN Architecture Overview
     ◮ Convolution Layer
     ◮ Rectified Linear Unit (ReLU)
     ◮ Pooling Layer
     ◮ Fully Connected Layer
     [Figure: pipeline of repeated CONV, ReLU (max(0, x)), POOL blocks, ending in FC, classifying Hotspot vs. Non-hotspot]

  5. Convolution Layer
     Convolution operation:
     $(I \otimes K)(x, y) = \sum_{i=1}^{c} \sum_{j=1}^{m} \sum_{k=1}^{m} I(i, x - j, y - k) \, K(j, k)$
     [Figure: CONV → ReLU → POOL pipeline, as on the previous slide]
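A minimal NumPy sketch of this operation (not from the slides; the input shape, lack of padding, and stride 1 are assumptions). Note that frameworks usually implement the flipped-index formula above as cross-correlation, which is what this sketch does:

```python
import numpy as np

def conv2d(I, K):
    """Single-kernel convolution over a multi-channel image.

    I: input feature map of shape (c, H, W)
    K: kernel of shape (m, m), shared across the c channels as in the
       slide's formula (frameworks usually give K a channel axis too).
    """
    c, H, W = I.shape
    m = K.shape[0]
    out = np.zeros((H - m + 1, W - m + 1))  # "valid" output: no padding, stride 1
    for x in range(out.shape[0]):
        for y in range(out.shape[1]):
            # sum over channels i and kernel offsets j, k
            out[x, y] = (I[:, x:x + m, y:y + m] * K).sum()
    return out

I = np.random.rand(3, 8, 8)  # 3-channel 8x8 input
K = np.ones((3, 3)) / 9.0    # 3x3 averaging kernel
print(conv2d(I, K).shape)    # (6, 6)
```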

  6. Convolution Layer (cont.)
     Effect of different convolution kernel sizes (figure panels: (a) 7 × 7, (b) 5 × 5, (c) 3 × 3):

     Kernel Size   Padding   Test Accuracy
     7 × 7         3         87.50%
     5 × 5         2         93.75%
     3 × 3         1         96.25%

     In each row the padding is (k − 1)/2, so the convolution preserves the spatial size of the feature map.
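The padding column follows from the standard output-size arithmetic of a convolution (a general fact, not stated on the slide; W denotes width, and the same formula applies to the height):

```latex
% Output width of a k x k convolution with stride s and padding p.
% With s = 1 and p = (k - 1)/2, W_out = W_in, which is why the kernels
% 7x7, 5x5, 3x3 above pair with paddings 3, 2, 1.
W_{\text{out}} = \left\lfloor \frac{W_{\text{in}} + 2p - k}{s} \right\rfloor + 1
```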

  7. Rectified Linear Unit
     [Figure: CONV → ReLU → POOL pipeline, with max(0, x) applied after each convolution]
     ◮ Alleviates overfitting with sparse feature maps
     ◮ Avoids the gradient vanishing problem

     Activation Function   Expression                        Validation Loss
     ReLU                  max{x, 0}                         0.16
     Sigmoid               1 / (1 + exp(−x))                 87.0
     TanH                  (exp(2x) − 1) / (exp(2x) + 1)     0.32
     BNLL                  log(1 + exp(x))                   87.0
     WOAF                  NULL (no activation function)     87.0
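A small NumPy sketch of the four activation functions in the table (illustrative only, not course code; np.tanh and np.log1p are the numerically stable forms of the last two):

```python
import numpy as np

def relu(x):    return np.maximum(x, 0.0)                       # max{x, 0}
def sigmoid(x): return 1.0 / (1.0 + np.exp(-x))                 # 1 / (1 + e^-x)
def tanh(x):    return (np.exp(2*x) - 1) / (np.exp(2*x) + 1)    # equals np.tanh(x)
def bnll(x):    return np.log1p(np.exp(x))                      # log(1 + e^x), i.e. softplus

x = np.linspace(-3.0, 3.0, 7)
print(relu(x))  # negative inputs clamp to 0, giving sparse feature maps
```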

  8. Pooling Layer
     [Figure: CONV → ReLU → POOL pipeline]
     ◮ Extracts local statistical attributes of regions in the feature map
     Example on a 4 × 4 input with a 2 × 2 window:

     Input            (a) MAXPOOL    (b) AVEPOOL
      1  2  3  4       6   8          3.5   5.5
      5  6  7  8      14  16         11.5  13.5
      9 10 11 12
     13 14 15 16
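A minimal NumPy sketch that reproduces the 2 × 2 pooling example above (it assumes, for simplicity, that the input height and width are divisible by the window size):

```python
import numpy as np

def pool2x2(X, op):
    """Non-overlapping 2x2 pooling with stride 2; op is np.max or np.mean."""
    H, W = X.shape
    # expose each 2x2 window on its own axes, then reduce over those axes
    windows = X.reshape(H // 2, 2, W // 2, 2)
    return op(windows, axis=(1, 3))

X = np.arange(1, 17, dtype=float).reshape(4, 4)
print(pool2x2(X, np.max))   # [[ 6.  8.] [14. 16.]]
print(pool2x2(X, np.mean))  # [[ 3.5  5.5] [11.5 13.5]]
```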

  9. Pooling Layer (cont.)
     ◮ Translation invariant
     ◮ Dimension reduction
     Effect of pooling methods:

     Pooling Method   Kernel   Test Accuracy
     Max              2 × 2    96.25%
     Ave              2 × 2    96.25%
     Stochastic       2 × 2    90.00%

  10. Fully Connected Layer
      ◮ The fully connected layer transforms high-dimensional feature maps into a flattened vector.
      [Figure: CONV → ReLU → POOL pipeline ending in FC, classifying Hotspot vs. Non-hotspot]
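A minimal NumPy sketch of flattening followed by one fully connected layer (the 16x16x32 feature-map shape and 512-unit layer are borrowed from the next slide; the weight values are random placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
feature_map = rng.standard_normal((32, 16, 16))   # (channels, H, W)

x = feature_map.reshape(-1)                       # flatten: 32*16*16 = 8192 values
W = rng.standard_normal((512, x.size)) * 0.01     # 512 hidden units
b = np.zeros(512)
h = np.maximum(W @ x + b, 0.0)                    # FC layer followed by ReLU
print(h.shape)                                    # (512,)
```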

  11. Fully Connected Layer (cont.)
      ◮ A percentage of nodes are dropped out (i.e. set to zero)
      ◮ Avoids overfitting
      [Figure: convolutional hidden layers (C5-3, P5) producing a 16x16x32 feature map, followed by fully connected layers]
      [Plot: effect of dropout ratio, accuracy (%) between 90.00 and 100.00 vs. dropout ratio from 0 to 1, with curves for layer sizes 512 and 2048]
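A minimal NumPy sketch of dropout at training time (this is the "inverted dropout" variant, which rescales during training rather than at inference; the drop ratio and layer size are illustrative):

```python
import numpy as np

def dropout(h, drop_ratio, rng):
    """Zero out a random fraction of nodes; rescale so E[output] is unchanged."""
    keep = 1.0 - drop_ratio
    mask = rng.random(h.shape) < keep
    return h * mask / keep

rng = np.random.default_rng(0)
h = rng.standard_normal(512)      # activations of a 512-unit FC layer
h_train = dropout(h, 0.5, rng)    # training: roughly half the nodes set to zero
h_test = h                        # inference: all nodes kept, no rescaling needed
```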

  13. Overview: CNN Architecture Overview; CNN Energy Efficiency; CNN on Embedded Platform

  14. Computer Vision
      ◮ Humans use their eyes and their brains to visually sense the world.
      ◮ Computers use their cameras and computation to visually sense the world.
      Jian Sun, “Introduction to Computer Vision and Deep Learning”.

  15. A Few More Core Problems
      [Figure: core vision tasks by input granularity: Image → Classification, Region → Detection, Pixel → Segmentation, Video → Sequence]

  16. A Bit of History. Jian Sun, “Introduction to Computer Vision and Deep Learning”.

  17. Winter of Neural Networks (mid 1990s – 2006)
      ◮ The rise of SVMs and random forests
      ◮ No theory to play with
      ◮ Lack of training data
      ◮ Insensitive benchmarks
      ◮ Difficulties in optimization
      ◮ Hard to reproduce results
      Curse: “Deep neural networks are no good and could never be trained.”

  18. Renaissance of Deep Learning (2006 – )
      ◮ A fast learning algorithm for deep belief nets [Hinton et al., 2006]
      ◮ Data + Computing + Industry Competition
      ◮ NVIDIA’s GPUs, Google Brain (16,000 CPUs)
      ◮ Speech: Microsoft [2010], Google [2011], IBM
      ◮ Image: AlexNet, 8 layers [Krizhevsky et al., 2012] (top-5 error: 26.2% -> 15.3%)

  19. Revolution of Depth
      AlexNet, 8 layers (ILSVRC 2012):
      11x11 conv, 96, /4, pool/2
      5x5 conv, 256, pool/2
      3x3 conv, 384
      3x3 conv, 384
      3x3 conv, 256, pool/2
      fc, 4096
      fc, 4096
      fc, 1000
      Slide credit: He et al. (MSRA)
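A hedged PyTorch sketch of this 8-layer stack (assumes torch is available; the 227 × 227 input size, ReLU placement, and 3x3/stride-2 overlapping pooling follow the AlexNet paper, while its LRN and dropout details are omitted for brevity):

```python
import torch.nn as nn

# The eight weight layers listed on the slide: five convolutions + three FC.
alexnet = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=11, stride=4), nn.ReLU(),    # 11x11 conv, 96, /4
    nn.MaxPool2d(kernel_size=3, stride=2),                    # pool/2
    nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(),  # 5x5 conv, 256
    nn.MaxPool2d(kernel_size=3, stride=2),                    # pool/2
    nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(), # 3x3 conv, 384
    nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(), # 3x3 conv, 384
    nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(), # 3x3 conv, 256
    nn.MaxPool2d(kernel_size=3, stride=2),                    # pool/2
    nn.Flatten(),                                             # 256 * 6 * 6 = 9216
    nn.Linear(256 * 6 * 6, 4096), nn.ReLU(),                  # fc, 4096
    nn.Linear(4096, 4096), nn.ReLU(),                         # fc, 4096
    nn.Linear(4096, 1000),                                    # fc, 1000 classes
)
```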

  20. Revolution of Depth
      AlexNet, 8 layers (ILSVRC 2012); VGG, 19 layers (ILSVRC 2014); GoogleNet, 22 layers (ILSVRC 2014).
      [Figure: the three networks drawn side by side. The VGG-19 column stacks 3x3 convolutions
      (64, 64, 128, 128, four 256s, four 512s, four 512s, with pool/2 between groups) followed by
      fc 4096, fc 4096, fc 1000. The GoogleNet column is built from Inception modules (parallel
      1x1, 3x3, and 5x5 convolutions plus max pooling, depth-concatenated), with average pooling
      and auxiliary softmax classifiers.]
      Slide credit: He et al. (MSRA)
