
Constructing Fast Network through Deconstruction of Convolution

Yunho Jeon and Junmo Kim, School of Electrical Engineering, Korea Advanced Institute of Science and Technology. NeurIPS 2018.


  1. Constructing Fast Network through Deconstruction of Convolution
     Yunho Jeon and Junmo Kim
     School of Electrical Engineering, Korea Advanced Institute of Science and Technology
     NeurIPS 2018

  2. Goal
     CNNs have achieved outstanding accuracy with deeper and wider networks.
     Can we build a fast CNN that uses fewer resources while retaining accuracy?

  3. How to make a fast network
     • Reduce FLOPs
       – Grouped or depthwise convolution
       – Network pruning

  4. How to make a fast network
     • Reduce FLOPs
       – Grouped or depthwise convolution
       – Network pruning
     But lower FLOPs ≠ faster speed, due to memory access!

  5. How to make a fast network
     • Reduce FLOPs
       – Grouped or depthwise convolution
       – Network pruning
     But lower FLOPs ≠ faster speed, due to memory access!
     • Reduce memory access
       – Reduce spatial convolutions
     • Maximize utilization of accessed memory
       – Use 1x1 convolutions

  6. How to make a fast network
     • Reduce FLOPs
       – Grouped or depthwise convolution
       – Network pruning
     But lower FLOPs ≠ faster speed, due to memory access!
     • Reduce memory access
       – Reduce spatial convolutions
     • Maximize utilization of accessed memory
       – Use 1x1 convolutions
     Key Idea: Deconstruct spatial convolution into atomic operations
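The gap between FLOPs and speed on this slide can be made concrete with a rough count. The sketch below is an illustration, not something taken from the deck: `conv_cost` is a hypothetical helper, the layer sizes are arbitrary, and the "elements touched" figure is a deliberately crude model (read the input once, write the output once, read the weights once) that ignores caches and hardware details.

```python
def conv_cost(h, w, c_in, c_out, k, groups=1):
    """Rough cost of a stride-1, same-size KxK convolution (assumed model).

    Returns (multiply-accumulates, tensor elements touched, MACs per element).
    """
    weights = c_out * (c_in // groups) * k * k
    macs = h * w * weights
    # Crude memory model: read input, write output, read weights, once each.
    touched = h * w * c_in + h * w * c_out + weights
    return macs, touched, macs / touched


# Arbitrary example layer: 56x56 feature map with 128 channels.
standard = conv_cost(56, 56, 128, 128, k=3)               # full 3x3 convolution
depthwise = conv_cost(56, 56, 128, 128, k=3, groups=128)  # depthwise 3x3
pointwise = conv_cost(56, 56, 128, 128, k=1)              # 1x1 convolution

for name, cost in [("standard 3x3", standard),
                   ("depthwise 3x3", depthwise),
                   ("1x1", pointwise)]:
    print(f"{name:14s} MACs={cost[0]:>12,d}  MACs/element={cost[2]:.1f}")
```

Under this model the depthwise layer has by far the fewest multiply-accumulates, but it also does very little work per element it touches, while the 1x1 layer reuses each accessed element across many output channels. That is one way to read the slide's advice to prefer 1x1 convolutions.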

  7. Deconstruction of convolution (1/3)
     Insight
     • Spatial convolution = Summation of 1x1 convolutions

  8. Deconstruction of convolution (2/3)
     Shift inputs instead of filters
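The identity on these two slides can be checked numerically. The NumPy sketch below is an illustration under stated assumptions rather than the authors' code: it uses stride 1, zero padding, an odd kernel size, and the cross-correlation convention common in deep learning; all function names are hypothetical.

```python
import numpy as np


def conv2d_direct(x, w):
    """Naive same-size KxK convolution. x: (C_in, H, W), w: (C_out, C_in, K, K)."""
    c_out, _, k, _ = w.shape
    _, h, wd = x.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    y = np.zeros((c_out, h, wd))
    for i in range(h):
        for j in range(wd):
            patch = xp[:, i:i + k, j:j + k]            # (C_in, K, K)
            y[:, i, j] = np.tensordot(w, patch, axes=3)
    return y


def shift2d(x, dy, dx):
    """Zero-filled shift: out[:, i, j] = x[:, i + dy, j + dx]."""
    _, h, wd = x.shape
    p = max(abs(dy), abs(dx))
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    return xp[:, p + dy:p + dy + h, p + dx:p + dx + wd]


def conv2d_as_shifted_1x1(x, w):
    """Same result, written as a sum of 1x1 convolutions over shifted inputs."""
    c_out, _, k, _ = w.shape
    p = k // 2
    y = np.zeros((c_out,) + x.shape[1:])
    for a in range(k):
        for b in range(k):
            shifted = shift2d(x, a - p, b - p)
            # A 1x1 convolution is a matrix multiply over the channel axis.
            y += np.einsum("oc,chw->ohw", w[:, :, a, b], shifted)
    return y


rng = np.random.default_rng(0)
x = rng.standard_normal((3, 5, 6))
w = rng.standard_normal((4, 3, 3, 3))
assert np.allclose(conv2d_direct(x, w), conv2d_as_shifted_1x1(x, w))
```

Each of the K×K kernel positions contributes one shifted copy of the input and one 1x1 convolution, which is the decomposition the slides describe.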

  9. Deconstruction of convolution (3/3)
     If we can share shifted inputs,
     [Figure: Share shifted inputs]

  10. Deconstruction of convolution (3/3)
     If we can share shifted inputs,
       – Reduce FLOPs & memory access
     [Figure: Share shifted inputs]

  11. Deconstruction of convolution (3/3)
     If we can share shifted inputs,
       – Reduce FLOPs & memory access
       – But expressive power is limited if shifting in only one direction
     [Figure: Share shifted inputs]

  12. Deconstruction of convolution (3/3)
     If we can share shifted inputs,
       – Reduce FLOPs & memory access
       – But expressive power is limited if shifting in only one direction
     Key Challenge: How to shift inputs?
     [Figure: Share shifted inputs]

  13. Our approach
     • Active Shift Layer (ASL)
       1. Use depthwise shift

  14. Our approach
     • Active Shift Layer (ASL)
       1. Use depthwise shift
       2. Introduce new shift parameters for each channel

  15. Our approach
     • Active Shift Layer (ASL)
       1. Use depthwise shift
       2. Introduce new shift parameters for each channel
       3. Expand to non-integer shift using interpolation

  16. Our approach
     • Active Shift Layer (ASL)
       1. Use depthwise shift
       2. Introduce new shift parameters for each channel
       3. Expand to non-integer shift using interpolation
     • Shift values are differentiable!
       Shift values are trained through the network itself
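The three steps above can be sketched as a forward pass in NumPy. This is a minimal illustration, not the authors' implementation: it assumes bilinear interpolation, zeros outside the image, and the sign convention `out[c, i, j] = x[c, i + shift_y[c], j + shift_x[c]]`; the names `active_shift`, `shift_y`, and `shift_x` are hypothetical.

```python
import numpy as np


def active_shift(x, shift_y, shift_x):
    """Per-channel real-valued shift with bilinear interpolation (forward only).

    x: (C, H, W); shift_y, shift_x: length-C sequences of real-valued shifts.
    """
    c, h, w = x.shape
    out = np.zeros((c, h, w), dtype=float)
    ii, jj = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    for ch in range(c):
        sy = ii + shift_y[ch]                 # sampling row for each output pixel
        sx = jj + shift_x[ch]                 # sampling column for each output pixel
        y0 = np.floor(sy).astype(int)
        x0 = np.floor(sx).astype(int)
        fy, fx = sy - y0, sx - x0             # fractional parts in [0, 1)
        # Blend the four integer neighbours with bilinear weights.
        for dy, wy in ((0, 1 - fy), (1, fy)):
            for dx, wx in ((0, 1 - fx), (1, fx)):
                yy, xx = y0 + dy, x0 + dx
                valid = (yy >= 0) & (yy < h) & (xx >= 0) & (xx < w)
                vals = x[ch, np.clip(yy, 0, h - 1), np.clip(xx, 0, w - 1)]
                out[ch] += wy * wx * np.where(valid, vals, 0.0)
    return out


x = np.arange(24, dtype=float).reshape(2, 3, 4)
out = active_shift(x, shift_y=[0.0, 0.0], shift_x=[1.0, 0.5])
# Channel 0 is shifted by a whole pixel; channel 1 averages horizontal neighbours.
assert np.allclose(out[0, :, :3], x[0, :, 1:])
assert np.allclose(out[1, :, :3], 0.5 * (x[1, :, :3] + x[1, :, 1:]))
```

Because the interpolation weights are linear in the fractional part of each shift, the output varies continuously with the shift values, which is what lets them receive gradients. In a framework with automatic differentiation the same computation could be trained end to end; that part is not shown here.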

  17. Example of Learned Shift
     Enlarge receptive fields by shifting inputs

  18. Experiment (ImageNet)
     • Better accuracy with a smaller number of parameters
     • Faster inference time with similar accuracy

  19. Thank you
     For more information, please visit our poster #22.
