A Compiler for Scalable Placement and Routing of Brain-like Architectures

Narayan Srinivasa, Center for Neural and Emergent Systems, HRL Laboratories LLC, Malibu, CA


  1. RADICAL. A Compiler for Scalable Placement and Routing of Brain-like Architectures. Narayan Srinivasa, Center for Neural and Emergent Systems, HRL Laboratories LLC, Malibu, CA. International Symposium on Physical Design 2013, March 26, 2013, Lake Tahoe, CA.

  2. Computers vs. Mammalian Brains
     Mammalian brain | Computer
     Parallel distributed architecture | Serial architecture
     Spontaneously active | No activity unless instructed
     Composed of noisy components and operates at low speeds (< 10 Hz) | Precision in components and operates at very high speeds (GHz)
     Low power (30W), small footprint (1 liter) | High power (100MW), large footprint (40M liters)
     Asynchronous (no global clock) | Synchronous (global clock)
     Analog computing, digital communication | Digital computing and communication
     Integrated memory and computation | Memory and computation are clearly separated
     Intelligence via learning thru BBE interactions | Intelligence via programmed algorithms/rules

  3. Motivation and Objective
     Problem
     • As compared to biological systems, today’s intelligent machines are less efficient by a factor of a million to a billion in complex environments.
     Program Objective
     • For intelligent machines to be useful, they must compete with biological systems.
     [Figure (Todd Hylton 2008): Machine Complexity (e.g. gates, memory, neurons, synapses, power, size) [log] versus Environmental Complexity (e.g. input combinatorics) [log], running from “simple” to “complex”, with curves for von Neumann machines and neuromorphic machines. Annotations: “A trade between universality and efficiency”; “Human level performance”; “Dawn of a new age”; “Dawn of a new paradigm”.]
     The SyNAPSE program seeks to break the programmable machine paradigm by developing neuromorphic machine technology that scales to biological levels.

  4. Program Structure
     Structure | Period of Performance
     Baseline/Phase 0 | October 7, 2008 - September 6, 2009
     Option 1/Phase 1 | September 7, 2009 - March 28, 2011
     Option 2/Phase 2 | March 29, 2011 - January 27, 2013
     • Performers
       – HRL (prime)
       – Subcontractors
         • University of Michigan
         • Stanford University
         • Neurosciences Institute
         • Boston University
         • University of California, Irvine
         • George Mason University
         • Portland State University
         • SET Corporation
     HRL SyNAPSE Team – Sub

  5. Overall Approach
     [Figure (Todd Hylton 2008): levels of the approach toward Biological Scale Machine Intelligence, from top to bottom: System (SyNAPSE); Modules (e.g. visual cortex); Networks (e.g. cortical column); Circuits (e.g. center-surround); Components (e.g. synapse / neuron); Materials (e.g. memristors). Other labels: Top-down (simulation); Bottom-up (devices); Model; Make; Measure.]
     Attack the problem “bottom-up” and “top-down” and force disciplinary integration with a common set of objectives.

  6. Brain Architecture: Dense Network (Synapses, Neurons)
     The brain is composed of 10^11 neural cells with 10^15 synapses: very high density (10^10 synapses/cm^2) and connectivity (1:10^4).

  7. Architecture Dynamics: Leaky Integrate and Fire Neuron
     Analog Spiking (Mixed Signal)
     [Figure: a pre-neuron and a post-neuron linked through an analog processing block, with 1 wire used per signal (Signal A, Signal B). The spike times t_i, t_i+1 are asynchronous (not quantized) and encode signal information; the inter-spike interval is T_ISI = 1/f_spike. Other labels: E_spike, I_spike, τ_AMPA, τ_GABA.]
     • A single wire is used to represent spike signals, which encode analog information
     • Power is dissipated only during spike events
     • A spiking system is less prone to noise and variations (it only needs to maintain timing information)
     • Cascaded spiking analog processing blocks are less prone to noise accumulation, due to spikes combined with learning and adaptation
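
The relation T_ISI = 1/f_spike on this slide can be illustrated with a minimal leaky integrate-and-fire simulation. This is a sketch under stated assumptions, not the HRL circuit: the function names, the Euler integration, and every parameter value (time constant, threshold, input currents) are hypothetical and chosen only to show that a larger analog input produces a shorter inter-spike interval.

```python
def simulate_lif(input_current, dt=1e-4, tau_m=20e-3, v_rest=0.0,
                 v_thresh=1.0, v_reset=0.0, resistance=1.0):
    """Euler-integrate a leaky integrate-and-fire neuron.

    All parameter values are illustrative assumptions.
    Returns the list of spike times in seconds.
    """
    v = v_rest
    spike_times = []
    for step, i_in in enumerate(input_current):
        # The leak pulls v back toward v_rest; the input current drives it up.
        dv = (-(v - v_rest) + resistance * i_in) / tau_m
        v += dv * dt
        if v >= v_thresh:
            # Threshold crossing: record the spike time and reset the membrane.
            spike_times.append(step * dt)
            v = v_reset
    return spike_times


def mean_isi(spike_times):
    """Mean inter-spike interval T_ISI in seconds, or None with < 2 spikes."""
    if len(spike_times) < 2:
        return None
    return (spike_times[-1] - spike_times[0]) / (len(spike_times) - 1)


steps = 10000  # 1 second of simulated time at dt = 0.1 ms
silent = simulate_lif([0.5] * steps)   # sub-threshold input: no spikes
weak = simulate_lif([1.5] * steps)
strong = simulate_lif([3.0] * steps)

for name, spikes in (("weak", weak), ("strong", strong)):
    t_isi = mean_isi(spikes)
    print(name, "T_ISI =", t_isi, "s, f_spike =", 1.0 / t_isi, "Hz")
```

The analog input level is recoverable from spike timing alone, which is the sense in which a single wire carrying spikes can encode analog information.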

  8. Architecture Dynamics: Synaptic Plasticity
     Electrical → Chemical → Electrical
     Speed, Specificity, Timing
     Spike Timing Dependent Plasticity (STDP) (Markram et al., 1997; Bi and Poo, 1998)
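
The slide names STDP without giving a rule, so the following is only a sketch of the common textbook pair-based form with an exponential window. It is an assumption that this form applies here; the function name, amplitudes, and time constants are hypothetical and are not taken from the presentation or from the HRL hardware.

```python
import math


def stdp_weight_change(dt_ms, a_plus=0.01, a_minus=0.012,
                       tau_plus=20.0, tau_minus=20.0):
    """Pair-based STDP weight change for one pre/post spike pair.

    dt_ms = t_post - t_pre in milliseconds. All constants are
    illustrative assumptions, not measured values.
    """
    if dt_ms > 0:
        # Pre fires before post: potentiation, decaying with the time gap.
        return a_plus * math.exp(-dt_ms / tau_plus)
    if dt_ms < 0:
        # Post fires before pre: depression, decaying with the time gap.
        return -a_minus * math.exp(dt_ms / tau_minus)
    return 0.0


for dt in (-50, -5, 5, 50):
    print(dt, "ms ->", stdp_weight_change(dt))
```

The sign of the change depends only on spike order and its size on the timing gap, which is why the synapse needs nothing more than spike times to learn.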

  9. Architecture Design: Small World Connectivity (Strogatz 2000; Sporns, 2004)
     • Cortex (> 85% of the brain) is organized as a small world network of neurons
     • Dense local connections and sparse long range connections
     • The typical distance or synaptic path length L between two randomly chosen neurons grows as L ∝ log N, where N is the number of neurons in the network
     • Efficient communication despite network complexity – needed for survival
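
The effect of sparse long-range connections on path length can be checked numerically. The sketch below assumes a Newman-Watts style construction (a ring lattice of local connections plus a few random shortcuts), which is one standard small-world model and not necessarily the connectivity model used by the authors; the network size, neighbourhood, shortcut count, and function names are all hypothetical.

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Undirected ring where each node links to its k nearest nodes per side."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for offset in range(1, k + 1):
            j = (i + offset) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def add_shortcuts(adj, count, rng):
    """Add `count` new random edges, skipping self-loops and existing edges."""
    n = len(adj)
    added = 0
    while added < count:
        a, b = rng.randrange(n), rng.randrange(n)
        if a != b and b not in adj[a]:
            adj[a].add(b)
            adj[b].add(a)
            added += 1


def average_path_length(adj):
    """Mean shortest-path length over all ordered pairs (connected graph)."""
    n = len(adj)
    total = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            node = queue.popleft()
            for neighbour in adj[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
    return total / (n * (n - 1))


rng = random.Random(0)
local_only = ring_lattice(200, 2)    # dense local connections only
small_world = ring_lattice(200, 2)
add_shortcuts(small_world, 20, rng)  # plus sparse long-range connections

l_local = average_path_length(local_only)
l_small = average_path_length(small_world)
print("local only:", l_local, "with shortcuts:", l_small)
```

Twenty extra edges on top of 400 local ones are enough to shorten the average path, which is the property the slide credits for efficient communication.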

  10. Large Scale System (Analog Core)
      [Figure: block diagram. Block labels: Brain Architecture (# Neurons, # Synapses, Programmable Connectivity); Front-End Neuromorphic Compiler (focus of this paper), with Routing and Neuron Placement; Set switch states; Analog Core with Cortical Fabric (neurons, synapses); Digital Memory (switch states); Analog Memory (store synaptic conductances). Arrow labels: Store, Retrieve, Acquire.]
      Overall Design Goal: 10^6 neurons and 10^10 synapses per cm^2 consuming 1 W of power

  11. Synaptic Time Multiplexing (STM)
      Direct wire connections between neurons are prohibitive given the required wiring density [3] (Bailey & Hammerstrom, 1988).
      The proposed Synaptic Time Multiplexing scheme overcomes the wiring limitation by trading off circuit speed with wiring density.
      Scalable solution to enable CMOS based neuromorphic chip design.
      [Figure: a 1.0 cm chip with neurons and synapses (10^4 per neuron), set against chips with synapses (4 per neuron) laid out along a Time axis in slots of length Δt_MUX, labeled (1), (2), …, for N_MUX slots.]
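
The speed-for-wiring trade can be made concrete with a little arithmetic and a toy slot assignment. This is a sketch under assumptions: it reads the figure as 10^4 logical synapses per neuron served by 4 physical synapses per neuron, it models only the per-neuron synapse limit (not wire routing), and the function names and greedy assignment are hypothetical rather than the compiler's actual method.

```python
import math
from collections import defaultdict


def required_time_slots(logical_fan_in, physical_synapses):
    """Minimum number of multiplexing slots to cover every logical synapse."""
    return math.ceil(logical_fan_in / physical_synapses)


def schedule_connections(connections, physical_synapses):
    """Greedily assign (pre, post) connections to time slots.

    Each post neuron receives at most `physical_synapses` connections
    per slot. Returns {slot_index: [(pre, post), ...]}.
    """
    assigned = defaultdict(int)  # post neuron -> connections placed so far
    slots = defaultdict(list)
    for pre, post in connections:
        slot = assigned[post] // physical_synapses
        slots[slot].append((pre, post))
        assigned[post] += 1
    return dict(slots)


# Figure values as read above: 10^4 logical vs. 4 physical synapses per neuron.
n_mux = required_time_slots(10**4, 4)
print("time slots needed:", n_mux)

# Toy network: ten presynaptic neurons all targeting neuron 0.
toy = [(pre, 0) for pre in range(1, 11)]
schedule = schedule_connections(toy, physical_synapses=4)
for slot in sorted(schedule):
    print("slot", slot, schedule[slot])
```

Under this reading, each physical synapse is reused across many slots, so the circuit must run correspondingly faster than the biological time scale it emulates.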

  12. Reconfigurable Fabric vs. Crossbar
      Reconfigurable Fabrics
      Time multiplexed Fabric (HRL)
        Advantages
        - Flexible topology
        - High effective density (wires reused for different axons)
      Broadcasting (HRL)
        Advantages
        - Flexible topology
        - High effective density (wires reused for different axons)
        Limitations
        - High multiplexing ratio needed for large networks
      Fixed Fabrics
      Crossbar (SUNY)
        Advantages
        - No multiplexing simplifies synapse design
        Limitations
        - Fixed topology
        - Synapse density limited by wiring (axons not multiplexed)
      Synapse in 2D array, neurons in 1D arrays (HP, IBM)
        Advantages
        - No multiplexing simplifies synapse design
        Limitations
        - Fixed topology
        - Number of neurons scales less than linearly with chip area
        - Synapse density limited by wiring

  13. STM Fabric & Analog Core Chip Architecture
      K. Minkovich, N. Srinivasa, J. M. Cruz-Albrecht, Y. K. Cho and A. Nogin, "Programming Time-Multiplexed Reconfigurable Hardware Using a Scalable Neuromorphic Compiler," IEEE Trans. on Neural Networks and Learning Systems, vol. 23, no. 6, pp. 889-901, June 2012.
      [Figure: chip, node, and switch views. Labels: Chip; Node; Switches; Array of Nodes; Digital Memory; Data I/O and Bias; Axon Routing Channels; Synapse/STDP; Neuron; Analog Memory; Capacitor, Memristor, …; Design to minimize # of switches; 1 node (1 neuron, 1 synapse, M virtual synapses).]
      Time-multiplexing ensures scalability of hardware using conventional CMOS technology

  14. HRL SyNAPSE Fabricated Phase 0 Hardware
      Base Components: Integrate & Fire Neuron; Synapse with STDP
      90nm CMOS; 0.4pJ per spike; < 10nW per neuron
      Jose Cruz-Albrecht, Michael Yung, Narayan Srinivasa, “Energy-Efficient Neuron, Synapse and STDP Integrated Circuits,” IEEE Transactions on Biomedical Circuits and Systems, vol. 6, no. 3, pp. 246-256, June 2012.
