
Computation Reuse in DNNs by Exploiting Input Similarity

Marc Riera, Jose Maria Arnau, Antonio González. ISCA 2018, 4/06/2018.


  1. Computation Reuse in DNNs by Exploiting Input Similarity. Marc Riera, Jose Maria Arnau, Antonio González

  2. Sequence Processing Applications: Speech, Audio Signal

  3. Sequence Processing Applications

  4. Sequence Processing Applications

  5. Sequence Processing Applications

  6. Sequence Processing Applications. Speech Recognition: DNN executions to classify a sequence of audio frames into phonemes

  7. Benchmarks

     | DNN Name  | DNN Type | DNN Application      | #Parameters | Accuracy |
     |-----------|----------|----------------------|-------------|----------|
     | Kaldi     | MLP      | Acoustic Scoring     | 4.7M        | 89.04%   |
     | EESEN     | RNN      | Speech Recognition   | 11M         | 68.85%   |
     | C3D       | CNN      | Video Classification | 78M         | 93.48%   |
     | AutoPilot | CNN      | Self-Driving Cars    | 1.6M        | 99.63%   |

  8. Input Similarity. [Bar chart: Input Similarity (%) for Kaldi, C3D, Autopilot, EESEN and Average. Bar labels, listed in descending order rather than benchmark order: 77%, 69%, 61%, 52%, 45%]

  9. Exploiting Temporal Similarity: Example (Baseline). Neuron N with inputs $I_0, I_1, I_2$, weights $w_0, w_1, w_2$ and bias $b$.
     Frame $i$: $O^{i} = I_0^{i} w_0 + I_1^{i} w_1 + I_2^{i} w_2 + b$
     Frame $i+1$: $O^{i+1} = I_0^{i+1} w_0 + I_1^{i+1} w_1 + I_2^{i+1} w_2 + b$

  10. Exploiting Temporal Similarity: Example (Proposal). Only input $I_2$ changes between frame $i$ and frame $i+1$.
      Frame $i$: $O^{i} = I_0^{i} w_0 + I_1^{i} w_1 + I_2^{i} w_2 + b$
      Frame $i+1$: $O^{i+1} = O^{i} + (I_2^{i+1} - I_2^{i}) w_2$
      Number of computations before = 6; number of computations after = 2.
      Note: Subtraction of the inputs is almost negligible since it is performed once per input.
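
The operation counts on this slide can be checked with a small sketch. The weights, bias and input values below are hypothetical, chosen only so that a single input changes between the two frames; the function names are illustrative and do not come from the talk. Following the slide's note, the input subtraction is not counted.

```python
# Hypothetical weights, bias and inputs for one neuron with three inputs.
w = [0.5, -1.0, 2.0]
b = 0.25
frame_i = [1.0, 2.0, 3.0]
frame_i1 = [1.0, 2.0, 3.5]  # only the last input changes between frames


def baseline(inputs, weights, bias):
    """Full dot product. Returns (output, number of multiplies and adds)."""
    out = bias
    ops = 0
    for x, wt in zip(inputs, weights):
        out += x * wt
        ops += 2  # one multiply and one add per input
    return out, ops


def reuse(prev_out, prev_inputs, inputs, weights):
    """Correct the previous output using only the inputs that changed."""
    out = prev_out
    ops = 0
    for x_prev, x, wt in zip(prev_inputs, inputs, weights):
        if x != x_prev:
            # The subtraction is not counted, as in the slide's note.
            out += (x - x_prev) * wt
            ops += 2  # one multiply and one add per changed input
    return out, ops


out_i, _ = baseline(frame_i, w, b)
full_out, full_ops = baseline(frame_i1, w, b)
reuse_out, reuse_ops = reuse(out_i, frame_i, frame_i1, w)
print(full_out, full_ops)    # 5.75 6
print(reuse_out, reuse_ops)  # 5.75 2
```

Both paths produce the same output for frame i+1, but the reuse path performs 2 operations instead of 6.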

  11. Computation Reuse. [Bar chart: Computation Reuse (%) for Kaldi, C3D, Autopilot, EESEN and Average. Bar labels, listed in descending order rather than benchmark order: 79%, 74%, 66%, 55%, 53%]

  12. DNN Processing Unit Tile

  13. FC Execution in the Reuse Accelerator (1)

  14. FC Execution in the Reuse Accelerator (2)

  15. FC Execution in the Reuse Accelerator (3)

  16. Other Supported Layers: Convolutional Neural Network (CNN), Recurrent Neural Network (RNN)

  17. Evaluation Methodology
      • Simulator to evaluate the performance and energy of the accelerator
      • Design Compiler to obtain power and delay of logic modules
      • 28/32nm library from Synopsys and the DesignWare logic modules
      • CACTI used for SRAM and eDRAM memories
      • MICRON LPDDR4 for main memory
      • Accelerator Configuration:

  18. Memory Footprint Overheads. [Bar chart: Memory Increase (%), y-axis from 0 to 20, for the On-Chip IO Buffer and the Off-Chip Main Memory]

  19. Results: Speedup

  20. Results: Energy Savings

  21. Conclusions
      • More than 60% of the inputs remain unmodified with respect to the previous execution
      • Our proposed scheme checks which inputs have changed:
          • Unmodified inputs are ignored, avoiding computations and memory accesses
          • Modified inputs are used to correct the previous output of each neuron
      • On average, 63% energy savings and 3.5x speedup
      • Small area overhead of less than 1%, mainly for additional storage
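
The scheme summarized in these conclusions can be sketched in software for a whole fully connected layer. This is a minimal illustration under stated assumptions, not the accelerator's design: the class name `ReuseFCLayer`, its fields and the example values are hypothetical, and inputs are compared with exact equality. The layer keeps the previous inputs and outputs, skips unmodified inputs, and uses each modified input to correct the stored output of every neuron.

```python
class ReuseFCLayer:
    """Fully connected layer that corrects its previous outputs (illustrative)."""

    def __init__(self, weights, biases):
        self.weights = weights      # weights[n][k]: neuron n, input k
        self.biases = biases
        self.prev_inputs = None     # inputs of the previous execution
        self.prev_outputs = None    # outputs of the previous execution

    def forward(self, inputs):
        """Return (outputs, number of inputs that had to be processed)."""
        if self.prev_inputs is None:
            # First execution: nothing to reuse, compute from scratch.
            outputs = [
                b + sum(w * x for w, x in zip(row, inputs))
                for row, b in zip(self.weights, self.biases)
            ]
            changed = len(inputs)
        else:
            outputs = list(self.prev_outputs)
            changed = 0
            for k, (x_prev, x) in enumerate(zip(self.prev_inputs, inputs)):
                if x == x_prev:
                    continue  # unmodified input: no weight reads, no arithmetic
                delta = x - x_prev  # computed once, shared by all neurons
                changed += 1
                for n, row in enumerate(self.weights):
                    outputs[n] += delta * row[k]
        self.prev_inputs = list(inputs)
        self.prev_outputs = outputs
        return outputs, changed


# Hypothetical layer with two neurons and three inputs.
layer = ReuseFCLayer([[1.0, 2.0, 3.0], [0.5, -1.0, 0.0]], [0.0, 1.0])
out0, changed0 = layer.forward([1.0, 1.0, 1.0])
out1, changed1 = layer.forward([1.0, 2.0, 1.0])  # only input 1 changes
print(out0, changed0)  # [6.0, 0.5] 3
print(out1, changed1)  # [8.0, -0.5] 1
```

With floating-point values, repeated corrections can drift slightly from a full recomputation over a long sequence, so a software version of this sketch might recompute from scratch periodically.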

  22. Computation Reuse in DNNs by Exploiting Input Similarity. Marc Riera, Jose Maria Arnau, Antonio González
