Neuro-Inspired Processor Design for On-Chip Learning and Classification with CMOS and Resistive Synapses
Jae-sun Seo, School of ECEE, Arizona State University
The 13th Korea-U.S. Forum on Nanotechnology, September 26, 2016
ML Literature (DNN) vs. Neuromorphic (SNN)
[Images courtesy of Nuance; Song, PLoS Biol. 2005]
DNN:
● Dense connectivity
● Learning done offline
● Back-propagation (requires labeled data)
● MNIST 99.79%, ImageNet 95%
● Open questions: unlabeled data, adaptability to input changes or customization
● Full computation on each layer → high power
SNN:
● Sparse connectivity
● Online learning
● STDP, SRDP, reward-based learning (biological evidence)
● MNIST 99.08%, ImageNet N/A
● Continuous learning & detection
● Sparse spiking, attention → low power
Neuromorphic Core with On-Chip STDP (Seo, CICC, 2011)
[Die photos: 2.05mm x 2.05mm; fully functional; 20X retention mode; variants — base design (64K synapse array), slim neuron (64K synapses), 4-b synapse (256K synapses), low leakage (64K synapses)]
● Under STDP learning, when neuron K spikes, all synapses on row K and column K may update
● Transposable SRAM: single-cycle read & write in both row and column directions
● Efficient pre- and post-synaptic update
● Near-threshold operation
● Pattern recognition
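To make the row/column update concrete, below is a minimal Python sketch (not the chip's transposable-SRAM logic) of what happens when neuron K spikes: its incoming synapses (column K) and outgoing synapses (row K) are both touched in a single event. The matrix size, time constant, and learning rates are assumptions for illustration, using a simple exponential timing rule rather than the core's exact circuit behavior.

```python
import numpy as np

# Minimal sketch of the row/column STDP update (not the chip RTL).
# W[i, j] is the synapse from pre-synaptic neuron i to post-synaptic neuron j,
# so row K holds neuron K's outgoing synapses and column K its incoming ones.
# "last_spike" holds the most recent spike time of every neuron.

N = 256                               # illustrative number of neurons
W = np.zeros((N, N))                  # synaptic weight matrix
last_spike = np.full(N, -np.inf)

A_PLUS, A_MINUS, TAU = 0.01, 0.012, 20.0   # assumed STDP constants

def on_spike(k, t):
    """Neuron k spikes at time t: update column k (k as post) and row k (k as pre)."""
    dt = t - last_spike                     # how long ago each other neuron spiked
    # Column k: pre neurons fired before post k -> potentiation (LTP)
    W[:, k] += A_PLUS * np.exp(-dt / TAU)
    # Row k: post neurons fired before pre k -> depression (LTD)
    W[k, :] -= A_MINUS * np.exp(-dt / TAU)
    np.clip(W, 0.0, 1.0, out=W)             # keep weights in range
    last_spike[k] = t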
Versatile Learning in Neuromorphic Core
[Figure: various STDP learning rules (Feldman, Neuron 2012) — LTP/LTD Δw vs. Δt curves; counter-based array of pre-synaptic neurons N1_0–N1_3, post-synaptic neurons N3_0–N3_3, and synapses wa0–wa3, wb0–wb3]
Counter-based weight update when a neuron spikes:
LTP: w = w + [pre cnt.] + [post cnt.]
LTD: w = w − [post cnt.] − [pre cnt.]
● A versatile neurosynaptic core supports various learning rules, large fan-in/fan-out, and sparse connectivity
● Multi-factor triplet STDP (Pfister, J. of Neuroscience, 2006; Gjorgjieva, PNAS, 2011)
  ● post-pre-post: post-neuron spike & pre-neuron timing & post-neuron timing
  ● pre-post-pre: pre-neuron spike & post-neuron timing & pre-neuron timing
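The counter-based LTP/LTD rule above can be written out directly. The sketch below assumes small saturating activity counters and a 4-bit synapse range; these are illustrative choices, not the core's exact parameters.

```python
# Minimal sketch of the counter-based LTP/LTD rule on this slide.
# Each neuron keeps a saturating activity counter that is set on a spike and
# decays every time step; counter width and weight range are assumed values.

CNT_MAX = 7          # assumed 3-bit activity counter
W_MAX = 15           # assumed 4-bit synapse (as in the 4-b synapse variant)

def decay(counters):
    """Decrement every non-zero counter once per time step."""
    return [max(c - 1, 0) for c in counters]

def ltp(w, pre_cnt, post_cnt):
    """Potentiation: w = w + [pre cnt.] + [post cnt.], saturating at W_MAX."""
    return min(w + pre_cnt + post_cnt, W_MAX)

def ltd(w, pre_cnt, post_cnt):
    """Depression: w = w - [post cnt.] - [pre cnt.], floored at 0."""
    return max(w - post_cnt - pre_cnt, 0)
```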
Feedforward Excitation & Inhibition
[Figure: layer (i) neurons → 1024x256 synapse array with decoder → layer (i+1) neurons; axons carry spike packets with timing info; recurrent connections through inhibitory neurons (TX => inhibitory, inhibitory => RX)] [1] Diehl, Front. of Neuroscience, 2015
● Joint feed-forward excitation and inhibition
● For a small number of inhibitory neurons, add pre=>inh and inh=>post synapses
● Balance excitatory & inhibitory synaptic inputs (Vogels, Science, 2011)
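A minimal sketch of the feed-forward excitation/inhibition idea, assuming illustrative layer sizes and random weights: a small inhibitory pool receives the pre-layer spikes (pre=>inh) and projects onto the post layer (inh=>post), so the net synaptic input is excitation minus inhibition.

```python
import numpy as np

# Minimal sketch of joint feed-forward excitation and inhibition.
# All sizes, weight scales, and the inhibitory firing rule are assumptions.

N_PRE, N_POST, N_INH = 1024, 256, 32            # illustrative dimensions
rng = np.random.default_rng(0)
W_exc = rng.random((N_PRE, N_POST)) * 0.05      # pre => post (excitatory)
W_pre_inh = rng.random((N_PRE, N_INH)) * 0.2    # pre => inh
W_inh_post = rng.random((N_INH, N_POST)) * 0.5  # inh => post (inhibitory)

def post_input(pre_spikes):
    """Net synaptic input to the post layer for one binary spike vector."""
    exc = pre_spikes @ W_exc
    inh_act = pre_spikes @ W_pre_inh                  # inhibitory pool activation
    inh = (inh_act > inh_act.mean()) @ W_inh_post     # inhibitory neurons that fire
    return exc - inh
```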
Neural Spike Sorting Processor (for deep brain sensing & stimulation)
[Figure: raw signal → spike detection & alignment → neuromorphic clustering processor → sorting output; encoder maps 32 samples (8 bits each) to input neurons I_1–I_m, hidden neurons H_1–H_N, and output neurons Z_1–Z_K]
● Signals from invasive electrodes contain spikes from multiple neurons
● Online, unsupervised neuromorphic spike-sorting processor (collaboration with Columbia University, ISLPED 2015)
● Weight update through STDP
● Start with K=2; automatically increase the number of output neurons if the spike difference is large enough (self-organizing map)
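The growing-cluster behavior can be sketched in a few lines of Python (the algorithmic idea, not the STDP hardware): start with K=2 prototypes and add a new output neuron whenever an aligned spike is too different from every existing prototype. The distance threshold and learning rate are assumed values.

```python
import numpy as np

# Minimal self-organizing clustering sketch: grow output neurons on demand.
# THRESHOLD and LR are assumptions for illustration.

THRESHOLD = 4.0     # assumed "spike difference" threshold
LR = 0.05           # assumed learning rate

def cluster_spikes(waveforms, k_init=2):
    """waveforms: array of aligned spikes, each with 32 samples."""
    prototypes = [w.copy() for w in waveforms[:k_init]]   # start with K = 2
    labels = []
    for x in waveforms:
        dists = [np.linalg.norm(x - p) for p in prototypes]
        best = int(np.argmin(dists))
        if dists[best] > THRESHOLD:                 # difference large enough
            prototypes.append(x.copy())             # grow a new output neuron
            best = len(prototypes) - 1
        else:
            prototypes[best] += LR * (x - prototypes[best])  # move winner toward x
        labels.append(best)
    return labels, prototypes
```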
Exp. Results: Clustering Accuracy
[Figures: receptive fields for a dataset containing 4 clusters in 3000 spikes; operating frequency (MHz) vs. VDD (V), with 26 µW/ch and 9.3 µW/ch operating points; clustering accuracy (%) per dataset (D2, D3, D4, D4*, W-D1, W-D2) — proposed avg. 91% vs. Osort-based avg. 69%; spike rates of 70 spikes/s/neuron ([4]) vs. 2.5 spikes/s/neuron; chip layout dominated by synapse array, decoder, and input/output neurons]
● 65nm GP, high-Vth, 0.5x0.5mm² chip
● 9.3µW/ch at 0.3V
● Spike sorting accuracy is more reliable than other low-complexity algorithms such as Osort: avg. accuracy 91% vs. 69%
● Layout and power of the design are dominated by memory elements
Neuromorphic Computing w/ NVMs
● Emerging NVMs (e.g. RRAM) could alleviate the power/area bottleneck of conventional memories
● Read rows in parallel: weighted-sum current
● Peripheral CMOS read: current-to-digital converter
[Figures: 130nm RRAM array + CMOS read circuits (under testing); simulated V_in, RE, and V_spike waveforms (0 to 1.5 V) over 8 ns for a 4ns read timing window]
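A minimal Python sketch of the parallel weighted-sum read, with assumed conductance values and converter resolution: applying read voltages to all selected rows at once produces one summed column current per output, which the peripheral circuit converts to a digital code.

```python
import numpy as np

# Minimal sketch of the parallel weighted-sum read in an RRAM crossbar.
# G[i, j] is the conductance of the cell at row i, column j. All values
# (array size, conductance range, LSB current, bit width) are illustrative.

ROWS, COLS = 64, 16
rng = np.random.default_rng(1)
G = rng.uniform(1e-6, 1e-4, size=(ROWS, COLS))    # cell conductances (S)

def weighted_sum_read(v_in, i_lsb=5e-6, bits=4):
    """v_in: per-row read voltages (V). Returns column currents and digital codes."""
    i_col = v_in @ G                               # Kirchhoff sum: I_j = sum_i V_i * G_ij
    codes = np.clip((i_col / i_lsb).astype(int), 0, 2**bits - 1)
    return i_col, codes

v_in = rng.integers(0, 2, size=ROWS) * 0.5         # 0.5 V read pulse on active rows
currents, codes = weighted_sum_read(v_in)
```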
Summary
● Neuromorphic computing hardware
● 45nm testchip with on-chip STDP learning
● Versatile learning neuromorphic core & architecture
● 65nm spike clustering processor
● Emerging NVM arrays + peripheral read/write circuits
● Future research with circuit-device-architecture co-design and optimization
Collaborators
● ASU
  ● Faculty: Yu Cao, Shimeng Yu, Chaitali Chakrabarti, Sarma Vrudhula, Visar Berisha
  ● Students: Minkyu Kim, Deepak Kadetotad, Shihui Yin, Abinash Mohanty, Yufei Ma
● Intel: Gregory Chen, Ram Krishnamurthy
● Columbia University: Mingoo Seok, Qi Wang