
Scalable Multi-Precision Simulation of Spiking Neural Networks on GPU with OpenCL

Scalable Multi-Precision Simulation of Spiking Neural Networks on GPU with OpenCL. Dmitri Yudanov (Advanced Micro Devices, USA) and Leon Reznik (Rochester Institute of Technology, USA). WCCI 2012, IJCNN, June 12.


  1. Scalable Multi-Precision Simulation of Spiking Neural Networks on GPU with OpenCL
     Dmitri Yudanov (Advanced Micro Devices, USA)
     Leon Reznik (Rochester Institute of Technology, USA)
     WCCI 2012, IJCNN, June 12

  2. Agenda
     • Motivation
     • OpenCL: SNN Simulation Platform
     • GPU Device Architecture
     • SNN Simulation Architecture
     • Results: Verification and Performance
     • Next Simulator Architecture
     • Conclusion
     • Q&A

  3. Motivation
     • SNN simulation scalability domains:
       ◦ Network size
       ◦ Connection count
       ◦ SNN component models (neuron, synapse, gap junction, etc.)
       ◦ Simulation methods (event-driven, time-driven, numerical methods)
       ◦ Precision
     • Simulation flexibility and programmability for heterogeneous environments: OpenCL
     • Configuration:
       ◦ GPU: Radeon™ HD 7970 (code-named Tahiti), OpenCL
       ◦ Izhikevich neuron model
       ◦ Parker-Sochacki simulation method

  4. OpenCL: Simulation Platform
     • Open Computing Language
     • Open standard maintained by the Khronos Group
     • Four models:
       ◦ Platform model
       ◦ Memory model
       ◦ Programming model
       ◦ Execution model
     Based on B. Gaster et al., Heterogeneous Computing with OpenCL. Morgan Kaufmann, 2011.

  5. Tahiti GPU Architecture: High-Level View
     Based on the AFDS11 presentation: M. Houston and M. Mantor. (2011, June) Fusion Developer Summit: AMD Graphics Core Next.

  6. Tahiti GPU Architecture: Compute Unit
     Based on the AFDS11 presentation: M. Houston and M. Mantor. (2011, June) Fusion Developer Summit: AMD Graphics Core Next.

  7. Simulation: Computation Flow

  8. Simulation: Update
     The Parker-Sochacki (PS) solver is based on the sequential implementation by R. Stewart and W. Bair, "Spiking neural network simulation: numerical integration with the Parker-Sochacki method," Journal of Computational Neuroscience, vol. 27, no. 1, pp. 115-133, Aug. 2009.
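
The idea behind a Parker-Sochacki update can be sketched as follows. This is a minimal illustration and not the simulator's kernel: it assumes the standard Izhikevich equations (v' = 0.04v² + 5v + 140 − u + I, u' = a(bv − u)) with a constant input current, and it leaves out synaptic conductances, spike detection, and reset. The function name `ps_step` and its parameters are hypothetical. The method expands v and u as power series in time; because the right-hand side is polynomial, each new coefficient follows from the previous ones, with the v² term handled by a Cauchy product.

```python
def ps_step(v, u, i_ext, h, a=0.02, b=0.2, max_order=20, tol=1e-12):
    """Advance the Izhikevich state (v, u) by one step h (ms) using a
    truncated Maclaurin series (Parker-Sochacki style).

    Assumptions of this sketch: constant input current i_ext, no synapses,
    no spike detection or reset.
    """
    vc = [v]  # series coefficients of v(t)
    uc = [u]  # series coefficients of u(t)
    v_new, u_new = v, u
    h_pow = 1.0
    for k in range(max_order):
        # k-th coefficient of v(t)^2 via the Cauchy product
        conv = sum(vc[j] * vc[k - j] for j in range(k + 1))
        # the constant term 140 + I only contributes at order 0
        const = 140.0 + i_ext if k == 0 else 0.0
        v_next = (0.04 * conv + 5.0 * vc[k] + const - uc[k]) / (k + 1)
        u_next = a * (b * vc[k] - uc[k]) / (k + 1)
        vc.append(v_next)
        uc.append(u_next)
        h_pow *= h
        dv = v_next * h_pow
        du = u_next * h_pow
        v_new += dv
        u_new += du
        # adaptive order: stop once the latest terms are negligible
        if abs(dv) < tol and abs(du) < tol:
            break
    return v_new, u_new


if __name__ == "__main__":
    # one 250 us (0.25 ms) step from a resting-like state
    print(ps_step(-65.0, -13.0, 10.0, 0.25))
```

The early exit on `tol` is what makes the series order adaptive: smaller steps or smoother dynamics need fewer terms for the same accuracy.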

  9. Simulation: Expand

  10. Simulation: Sort
      Radix sort example: 1-bit radix, LSD (least-significant-digit) sort.
      Modified from T. Harada and L. Howes. (2011, Dec.) Heterogeneous Compute. [Online]. http://www.heterogeneouscompute.org/wordpress/wpcontent/uploads/2011/06/RadixSort.pdf
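
A 1-bit LSD radix sort of the kind named on this slide can be sketched sequentially as below. This is an illustration of the general technique, not the simulator's OpenCL kernel; the function name `lsd_radix_sort_1bit` and the `key_bits` parameter are hypothetical, and keys are assumed to be non-negative integers. Each pass looks at one bit, starting from the least significant, and uses an exclusive prefix sum over the "bit is zero" flags to compute a stable destination for every key. The prefix sum is the part that parallel implementations typically map onto a scan.

```python
def lsd_radix_sort_1bit(keys, key_bits=32):
    """Sort non-negative integers with a 1-bit-per-pass LSD radix sort."""
    keys = list(keys)
    for bit in range(key_bits):
        flags = [(k >> bit) & 1 for k in keys]

        # exclusive prefix sum: number of zero-bit keys before each position
        zero_offsets = []
        running = 0
        for f in flags:
            zero_offsets.append(running)
            running += 1 - f
        total_zeros = running

        # stable scatter: zero-bit keys first, one-bit keys after them
        out = [0] * len(keys)
        for idx, (k, f) in enumerate(zip(keys, flags)):
            if f == 0:
                dest = zero_offsets[idx]
            else:
                ones_before = idx - zero_offsets[idx]
                dest = total_zeros + ones_before
            out[dest] = k
        keys = out
    return keys


if __name__ == "__main__":
    print(lsd_radix_sort_1bit([5, 3, 7, 0, 2, 6, 1, 4], key_bits=3))
```

Because every pass is stable, the ordering established by lower bits survives the later passes, which is why processing bits from least to most significant yields a fully sorted result.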

  11. Simulation: Address

  12. Results: Verification and Testbench
      • A unit test for each kernel
      • A unified integration test with complete host-device verification
      • A variety of compilation modes
      • C++ preprocessor-driven optimizations
      • XML-driven search script for the best-performing variant
      • User interface:
        ◦ Perl script + XML
        ◦ Microsoft VS

  13. Results: Performance
      • Size-connection scalability in multi-precision networks with per-WF (wavefront) precision allocation
      • 1000 iterations, 250 µs step
      • Randomly connected SNN with only AMPA synapses
      • GPU: Radeon™ HD 7970; CPU: AMD Phenom™ II, 3.2 GHz (single thread)

      Network size (neurons) | Avg. synapses per neuron | Avg. events per step | Avg. spikes per step | Total synapse count (millions) | GPU time per step (ms) | CPU time per step (ms) | Time ratio
      -----------------------|--------------------------|----------------------|----------------------|--------------------------------|------------------------|------------------------|-----------
      2,100,000              | 90                       | 230,000              | 2,522                | 190                            | 13.5                   | 659                    | 48
      131,000                | 1,458                    | 370,000              | 257                  | 191                            | 5.7                    | 279                    | 48
      16,000                 | 11,677                   | 300,000              | 25                   | 191                            | 3.2                    | 283                    | 88
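
The performance figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below copies the three result rows and recomputes derived quantities. The helper name `check_row` is hypothetical. Two assumptions are made that the slide does not state: the time ratio is CPU time divided by GPU time, and each spike produces one event per outgoing synapse, so events per step should be close to spikes per step times synapses per neuron.

```python
# (neurons, synapses per neuron, events per step, spikes per step,
#  total synapses in millions, GPU ms per step, CPU ms per step, reported ratio)
ROWS = [
    (2_100_000, 90, 230_000, 2_522, 190, 13.5, 659, 48),
    (131_000, 1_458, 370_000, 257, 191, 5.7, 279, 48),
    (16_000, 11_677, 300_000, 25, 191, 3.2, 283, 88),
]


def check_row(row):
    """Recompute derived quantities for one row of the results table."""
    neurons, syn_per_neuron, events, spikes, total_m, gpu_ms, cpu_ms, ratio = row
    return {
        # assumption: ratio = CPU time / GPU time
        "speedup": cpu_ms / gpu_ms,
        # total synapses implied by size and fan-out, in millions
        "synapses_millions": neurons * syn_per_neuron / 1e6,
        # assumption: one event per spike per outgoing synapse
        "events_estimate": spikes * syn_per_neuron,
    }


if __name__ == "__main__":
    for row in ROWS:
        print(row[0], check_row(row))
```

Under these assumptions the recomputed speedups agree with the reported ratios once the fractional part is dropped, and the synapse and event estimates land within a few percent of the reported values, which is consistent with the table entries being rounded.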

  14. Simulator: Next Architecture
      • Out-of-order flow with event-based synchronization
      • Target-oriented synaptic matrix partitioning
      • Mixed hybrid and time-driven simulation flows
      • Variety of neuron models
      • STDP
      • Just-in-time spike-to-event expansion

  15. Conclusion
      • Multi-precision, scalable (neurons, connections, precision) parallel SNN simulator
      • OpenCL, Tahiti architecture
      • Fully verified against the original CPU implementation
      • Up to 90x faster than a single thread on the CPU
      Future Work (in order of importance)
      • Object-oriented design
      • Out-of-order execution flows
      • STDP feature
      • Linux support
      • Application examples
      • User interface (possibly a library with extensions to PyNN)
      • APU support
      • Other: root-causing Newton-Raphson divergence, just-in-time spike-to-event expansion, sort radix scalability

  16. Q&A
      Selected Bibliography
      • R. Stewart and W. Bair, "Spiking neural network simulation: numerical integration with the Parker-Sochacki method," Journal of Computational Neuroscience, vol. 27, no. 1, pp. 115-133, Aug. 2009.
      • E. M. Izhikevich, "Simple model of spiking neurons," IEEE Transactions on Neural Networks, vol. 14, pp. 1569-1572, 2003.
      • B. Gaster, D. R. Kaeli, L. Howes, and P. Mistry, Heterogeneous Computing with OpenCL. Morgan Kaufmann, 2011.
      • T. Harada and L. Howes. (2011, Dec.) "Introduction to GPU Radix Sort." Heterogeneous Compute. [Online]. http://www.heterogeneouscompute.org/wordpress/wpcontent/uploads/2011/06/RadixSort.pdf
      • M. Houston and M. Mantor. (2011, June) Fusion Developer Summit: AMD Graphics Core Next. [Online]. http://developer.amd.com/afds
      • D. Yudanov, M. Shaaban, R. Melton, and L. Reznik, "GPU-based simulation of spiking neural networks with real-time performance & high accuracy," in The 2010 International Joint Conference on Neural Networks (IJCNN), 2010, pp. 1-8.
      Code: http://code.google.com/p/neurosim
      Thanks to Lee Howes and Dr. Wu-Chun Feng
