Embedded High Performance Computing (EHPC) and Neuromorphic Computing
November 18, 2014
Mr. Mark Barnell, Senior Computer Scientist, Information Directorate
Air Force Research Laboratory: Integrity, Service, Excellence
DISTRIBUTION A. Approved for public release; distribution unlimited (88ABW-2013-3982, 09 Sep 2013)
Outline
• Challenges
• Approaches
  – Scalable computing & applications
  – Exploit 3D integration
  – New devices and models of computation
  – Move “C4ISR to the edge” with EHPC
Challenges in Big Data Analytics
• An advantage in labor power could convert into global competitive advantages
• The ability to “do more, do better” with more intelligent power and less labor power would be a “force multiplier”
• Big data limits human performance in analysis and decision making
• High-performance information technologies for massive analytics enable the journey from big data to information, knowledge and wisdom
• R&D is needed to achieve trusted autonomous systems that are capable of learning, reasoning, inferencing and interacting with humans
• Computing hardware technologies are reaching physical limits in area, power and performance
• Three-dimensional integrated circuits and systems with optimized performance under size, weight and power (SWaP) constraints
• RDT&E for nano/quantum/neuro device and system technologies
Multi-Tiered Approach to EHPC or HPEC Challenges
• Tech Development
• Trusted Architectures
• Fundamental Science
The Condor Cluster (FY10 DHPI)
Key design considerations: price/performance and performance/Watt
• 1716 Sony PlayStation 3s
  – STI Cell Broadband Engine: PowerPC PPE, 6 SPEs
  – 256 MB RAM
• 84 head nodes
  – 6 gateway access points
  – 78 compute nodes
  – Intel Xeon X5650, dual-socket hexa-core
  – (2) NVIDIA Tesla GPGPUs: 39 nodes – (78) C2050; 39 nodes – (78) C2070/5
  – 48 GB RAM
DISTRIBUTION A. Approved for public release; distribution unlimited (88ABW-2011-3208, 06 Jun 2011)
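The counts on the Condor slide can be cross-checked with a few lines of arithmetic. The sketch below only tallies the figures stated on the slide. The variable names are illustrative, and it assumes that every compute node carries two GPUs and that all six SPEs on every console are counted.

```python
# Figures taken from the Condor Cluster slide; names are illustrative.
PS3_COUNT = 1716          # Sony PlayStation 3 consoles
SPES_PER_PS3 = 6          # SPEs listed per Cell Broadband Engine
GATEWAY_NODES = 6         # gateway access points
COMPUTE_NODES = 78        # GPU-equipped compute nodes
GPUS_PER_NODE = 2         # "(2) NVIDIA Tesla GPGPUs" per node (assumed uniform)
NODES_WITH_C2050 = 39
NODES_WITH_C2070 = 39     # slide lists these as "C2070/5"

total_spes = PS3_COUNT * SPES_PER_PS3
head_nodes = GATEWAY_NODES + COMPUTE_NODES
total_gpus = COMPUTE_NODES * GPUS_PER_NODE
c2050_gpus = NODES_WITH_C2050 * GPUS_PER_NODE
c2070_gpus = NODES_WITH_C2070 * GPUS_PER_NODE

print(f"SPEs across all consoles: {total_spes}")
print(f"Head nodes: {head_nodes}")
print(f"GPUs: {total_gpus} ({c2050_gpus} C2050 + {c2070_gpus} C2070/5)")
```

The totals agree with the slide: 6 gateways plus 78 compute nodes give the 84 head nodes, and each group of 39 nodes accounts for 78 GPUs.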
Hybrid Neuromorphic Model
• Input: character images of the text “Text image”
• Character level: auto-associative neural networks
• Word level: confabulation algorithms based on a knowledge base of weighted links among letters
• Sentence level: confabulation algorithms based on a knowledge base of weighted links among words and phrases
• Output: “Text image”
DISTRIBUTION A. Approved for Public Release [88ABW-2010-6225] Distribution Unlimited
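The character-level stage relies on auto-associative recall: a stored pattern is recovered from a corrupted copy of itself. The slide does not say which network is used, so the sketch below substitutes a small Hopfield-style network with Hebbian weights as one well-known auto-associative memory. The patterns, sizes, and function names are hypothetical and stand in for real character images.

```python
def train(patterns):
    """Build Hebbian weights (zero diagonal) from +1/-1 patterns."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += p[i] * p[j]
    return weights


def recall(weights, state, max_steps=10):
    """Apply synchronous sign updates until the state stops changing."""
    s = list(state)
    for _ in range(max_steps):
        nxt = []
        for i, row in enumerate(weights):
            h = sum(w * x for w, x in zip(row, s))
            # Keep the previous value when the input field is exactly zero.
            nxt.append(s[i] if h == 0 else (1 if h > 0 else -1))
        if nxt == s:
            break
        s = nxt
    return s


# Two hypothetical 8-pixel "characters" encoded as +1/-1 values.
PATTERN_A = [1, 1, 1, 1, -1, -1, -1, -1]
PATTERN_B = [1, -1, 1, -1, 1, -1, 1, -1]

WEIGHTS = train([PATTERN_A, PATTERN_B])

# Corrupt PATTERN_A by flipping its first pixel, then recall it.
noisy = [-1] + PATTERN_A[1:]
restored = recall(WEIGHTS, noisy)
print(restored == PATTERN_A)
```

In a pipeline like the one on the slide, the recalled characters (or several candidates per position) would feed the word-level stage, which is not modeled here.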
RADAR Data Processing for High Resolution Images
R.U.D.I. Cluster
176 Jetson boards (60 TFLOPS @ 2.1 kW)
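The headline figures imply an efficiency that can be worked out directly. The sketch below assumes the slide means 60 TFLOPS of aggregate peak throughput and 2.1 kW of total power, spread evenly over the 176 boards. The slide does not state the numeric precision or what the power figure includes, so the results are rough estimates.

```python
# Figures from the slide; the interpretation (aggregate peak, total power) is an assumption.
TOTAL_FLOPS = 60e12      # 60 TFLOPS
TOTAL_POWER_W = 2.1e3    # 2.1 kW
BOARDS = 176

gflops_per_watt = TOTAL_FLOPS / TOTAL_POWER_W / 1e9
gflops_per_board = TOTAL_FLOPS / BOARDS / 1e9
watts_per_board = TOTAL_POWER_W / BOARDS

print(f"Efficiency: {gflops_per_watt:.1f} GFLOPS/W")
print(f"Per board:  {gflops_per_board:.0f} GFLOPS at {watts_per_board:.1f} W")
```

Under these assumptions the cluster delivers roughly 28.6 GFLOPS per watt, or about 341 GFLOPS and 11.9 W per board.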
3D Integration
• Hybrid Memory Cube (HMC Consortium)
• 3D NAND Flash Memory (Toshiba)
Bio-Inspired Computing Architecture
Figure: Crossbar array of memristors
Figure: Neuromorphic Computing Accelerator (NCA). Block labels: standard patterns; input vectors V_in(k), k = 1...m; R/W control; crossbar arrays M+ and M−; V+(t), V−(t), V(t+1); summing amplifier; comparator; output buffer; ADC; configuration; training signal generation; error detection; arbiter; V_out
Figure: Crossbar-based computation core. Block labels: general-purpose processor, SRAM, I/O, bridges, arbiters, ADCs, and NCAs with I/O and configuration buffers
• Neuromorphic models and algorithms (ANN, inference, etc.)
• Conventional processing
• Neuromorphic computing accelerators
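A memristor crossbar can evaluate a vector-matrix product in one analog step: each column current is the sum of the input voltages weighted by the device conductances (Ohm's and Kirchhoff's laws). The sketch below models that idea under simplifying assumptions: ideal linear devices, a pair of arrays (here called g_plus and g_minus, after the M+ and M− labels on the slide) whose difference encodes signed weights, and a zero-threshold comparator. The function names and values are hypothetical and do not describe the actual NCA circuit.

```python
def crossbar_currents(voltages, conductances):
    """Column currents of an ideal crossbar: I_j = sum_i V_i * G[i][j]."""
    columns = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(columns)
    ]


def differential_output(voltages, g_plus, g_minus):
    """Subtract the two arrays' currents, as a summing amplifier might."""
    i_plus = crossbar_currents(voltages, g_plus)
    i_minus = crossbar_currents(voltages, g_minus)
    return [p - m for p, m in zip(i_plus, i_minus)]


def comparator(values, threshold=0.0):
    """Binarize each output: 1 if above the threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]


# Hypothetical 2x2 arrays in arbitrary conductance units.
g_plus = [[2.0, 0.0], [0.0, 4.0]]
g_minus = [[0.0, 1.0], [2.0, 0.0]]
v_in = [0.5, 1.0]

analog = differential_output(v_in, g_plus, g_minus)
bits = comparator(analog)
print(analog, bits)
```

The effective weight matrix is g_plus minus g_minus, so negative weights are possible even though a physical conductance is never negative.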
Moving ISR to the Tactical Edge
Versatile Intelligent Sensor
Summary
• Future embedded systems are challenged to continue delivering extreme performance in a small space
• But security poses additional challenges for trust, agility and resilience
• Powerful technology drivers are still at hand to meet the challenges
  – Computer architecture innovations
  – Nano and quantum advances
  – 3D stacking
  – Algorithm development and mapping to architectures