CNVLUTIN: Ineffectual-Neuron-Free DNN Computing
J. Albericio, P. Judd, T. Hetherington*, T. Aamodt*, N. Enright Jerger, A. Moshovos
Please cite the original source.
DNNs = SIMD Heaven
[Figure: DNN layers are hundreds to thousands of identical multiply (x) and add (+) operations, a natural fit for wide SIMD hardware.]
CNVLUTIN: Smarter SIMD
+52% performance and 2x better ED²P, on out-of-the-box networks.
Outline
1. What's a CNN?
2. A wide SIMD design
3. CNVLUTIN: skipping neurons in a wide SIMD design
4. Evaluation
5. Our approach
What's a CNN?
[Figure: an input image flows through tens of layers and yields a classification, e.g., "Korean mask!".]
Each layer combines input neurons with synapses (filters) to produce output neurons.
A typical layer applies three stages: Convolution (inner products of neurons and synapses), ReLU (negatives clamped to 0), and Pooling (data size reduction).
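As a rough illustration of these three stages, here is a toy 1-D sketch; real layers are 3-D and use many filters, and all values below are made up:

```python
# Toy 1-D version of a typical CNN layer's three stages
# (illustrative only; values and filter are invented).

def convolve(neurons, synapses):
    """Slide the filter over the input; each output is an inner product."""
    k = len(synapses)
    return [sum(neurons[i + j] * synapses[j] for j in range(k))
            for i in range(len(neurons) - k + 1)]

def relu(neurons):
    """Clamp negatives to 0: the source of the runtime zeroes."""
    return [max(0, n) for n in neurons]

def max_pool(neurons, width=2):
    """Reduce data size by keeping the max of each window."""
    return [max(neurons[i:i + width]) for i in range(0, len(neurons), width)]

layer_input = [1, -2, 3, 0, -1, 2, -3, 1]
filt = [1, 0, -1]
out = max_pool(relu(convolve(layer_input, filt)))
print(out)  # [0, 4, 2]
```

Note how ReLU already produced several zeroes in this tiny example; that observation drives the rest of the talk.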
~90% of execution time is spent in convolutions.
Lots of Runtime Zeroes
[Chart: fraction of zero neurons in multiplications for Alexnet, Google, NiN, VGG19, VGG_M, VGG_S, and the average.]
These multiplications are a waste of time and energy, and the zeroes are dynamically generated, so they are not statically predictable.
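The source of those zeroes is easy to see: ReLU clamps every negative activation to 0. A minimal sketch with invented activation values:

```python
# Count how often a multiplication operand is zero after ReLU
# (the activation values below are invented for illustration).

def relu(neurons):
    return [max(0, n) for n in neurons]

activations = relu([3, -1, 0, -5, 2, -2, 0, 4, -3, 1])
zero_fraction = activations.count(0) / len(activations)
print(zero_fraction)  # 0.6; every multiply by one of these zeroes is wasted
```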
How to compute DNNs: DaDianNao*
[Figure: each unit reads 16 neuron lanes from an input buffer (NBin) and synapses from an eDRAM synapse buffer (SB); 16 inner-product units (IP0..IP15), one per filter, multiply neurons by synapses, accumulate, apply the activation function f, and write to the output buffer (NBout).]
*Chen et al., MICRO 2014
Processing in DaDianNao
[Figure: 16 neuron lanes feed the 16 synapse lanes of each of 16 filters.]
All lanes advance in lock-step: each cycle, every neuron lane multiplies its neuron by the corresponding synapse element of every filter.
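The lock-step scheme can be sketched as follows, scaled down from 16 lanes and 16 filters to 4 lanes and 2 filters so the numbers are easy to follow (all values invented):

```python
# One DaDianNao-style lock-step step: every lane multiplies its neuron by
# the corresponding synapse of every filter, even when the neuron is 0.

neurons = [0, 1, 1, 2]    # one neuron per lane; lane 0 holds a zero
filters = [
    [1, 2, 3, 4],         # synapses of filter 0
    [4, 3, 2, 1],         # synapses of filter 1
]

partial_sums = [sum(n * s for n, s in zip(neurons, syn)) for syn in filters]
print(partial_sums)  # [13, 7]; the zero in lane 0 still cost a multiply per filter
```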
Zero-skipping in DaDianNao?
[Figure: removing the zeroes from each neuron lane's stream makes the lanes drift apart.]
After zero removal, each lane needs a different synapse each cycle, so the lanes can no longer operate in lock-step!
CNVLUTIN: Decoupling Lanes
[Figure: where DaDianNao has one block of 16 lock-step neuron lanes sharing the synapse lanes of filters 0 through 15, CNVLUTIN has 16 subunits; subunit i pairs neuron lane i with its own synapse lanes for all 16 filters.]
Each neuron lane carries only nonzero values plus their offsets; the offset selects the matching synapse in each filter, so every lane advances independently and zeroes are never multiplied.
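A sketch of the decoupled scheme, reusing the same invented 4-lane example from before: the lane keeps only nonzero (value, offset) pairs, and the offset indexes the synapses of each filter:

```python
# CNVLUTIN-style subunit sketch: zero neurons never reach the multipliers.

neurons = [0, 1, 1, 2]                                     # raw lane contents
pairs = [(v, i) for i, v in enumerate(neurons) if v != 0]  # (value, offset)
filters = [
    [1, 2, 3, 4],
    [4, 3, 2, 1],
]

# The offset picks the synapse that lines up with each surviving neuron.
partial_sums = [sum(v * syn[off] for v, off in pairs) for syn in filters]
print(partial_sums)  # [13, 7]: same result, fewer multiplications
```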
CNVLUTIN: Ineffectual-neuron Filtering
[Figure: between layer i and layer i+1, an encoder packs output neurons on the fly into the Zero-Free (ZF) format; the packed neurons and their offsets are stored in eDRAM, and a dispatcher feeds them to the unit buffers.]
Example: the neuron bricks (7,6,5,0), (0,0,0,0), (0,2,1,0) are stored as value/offset pairs (7,3)(6,2)(5,1), a single (0,0) for the all-zero brick, and (2,2)(1,1).
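The brick packing above can be sketched as an encode/decode pair (brick size 4, as in the example; `zf_encode` and `zf_decode` are illustrative names, not names from the paper):

```python
# Zero-Free (ZF) brick encoding sketch: each 4-neuron brick keeps only its
# nonzero values plus their offsets within the brick; an all-zero brick is
# kept as a single (0, 0) placeholder so decoding stays unambiguous.

BRICK = 4

def zf_encode(brick):
    pairs = [(v, off) for off, v in enumerate(brick) if v != 0]
    return pairs if pairs else [(0, 0)]

def zf_decode(pairs):
    brick = [0] * BRICK
    for v, off in pairs:
        brick[off] = v
    return brick

bricks = [[0, 5, 6, 7], [0, 0, 0, 0], [0, 1, 2, 0]]
encoded = [zf_encode(b) for b in bricks]
# Round-tripping recovers the original bricks exactly.
assert all(zf_decode(e) == b for e, b in zip(encoded, bricks))
```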
CNVLUTIN: Computation Slicing
[Figure: neuron lanes 0 through 15 each process their own packed stream.]
Methodology
• In-house timing simulator: baseline + CNVLUTIN
• Logic + SRAM: synthesis with a 65nm TSMC library
• eDRAM model: Destiny
• DNNs: trained models from the Caffe Model Zoo
Area
Only +4.5% area overhead.
Speedup: ineffectual = 0
[Chart: speedup over the baseline for Alexnet, Google, NiN, VGG19, VGG_M, VGG_S, and the geometric mean; higher is better.]
1.37x performance on average.
Loosening the Ineffectual Neuron Criterion
"If all you have is a hammer, everything looks like a nail" (Maslow's hammer): so far only exact zeroes count as ineffectual, yet near-zero neurons contribute almost nothing to the output.
[Example neuron values: 37 0 13 10 / 15 1 123 0 / 0 7 1 3 / 0 1 20 0 / 18 31 0 33.]
Example: consider a neuron ineffectual if its value < 2.
Speedup: ineffectual >= 0
[Chart: speedup per network with two bars each: skipping only 0's vs. 0's and more; higher is better.]
1.52x performance on average, with no accuracy lost.
Loosening the Ineffectual Neuron Criterion (cont.)
With the same example values, raising the threshold, e.g., ineffectual if value < 8, marks more neurons ineffectual and skips more work.
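The loosened criterion amounts to comparing against a threshold instead of testing for exact zero. A sketch using the example values from the slide (the skip fractions below describe only these 20 values, not a measured network):

```python
# Thresholded ineffectual-neuron criterion: a neuron is skipped when its
# magnitude falls below the threshold, not only when it is exactly zero.

values = [37, 0, 13, 10, 15, 1, 123, 0, 0, 7, 1, 3, 0, 1, 20, 0, 18, 31, 0, 33]

def ineffectual_fraction(vals, threshold):
    """Fraction of neurons skipped at this threshold."""
    return sum(1 for v in vals if abs(v) < threshold) / len(vals)

print(ineffectual_fraction(values, 1))  # 0.3  : zeroes only
print(ineffectual_fraction(values, 2))  # 0.45 : zeroes and ones
print(ineffectual_fraction(values, 8))  # 0.55 : everything below 8
```

Each step up in the threshold exposes more skippable work; the talk's result is that modest thresholds buy extra speedup without losing accuracy.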