Emerging Non-Volatile Memory: Resistive Memory Technologies


  1. Emerging Non-Volatile Memory

  2. Resistive Memory Technologies
     • Key concept: replace the DRAM cell capacitor with a programmable resistor.
     • 1T-1C (DRAM): charge-based sensing, volatile
     • 1T-1R (STT-MRAM, PCM, RRAM): resistance-based sensing, non-volatile

  3. Leading Contenders
     • STT-MRAM [Halupka, et al. ISSCC’10]
       - Limited to single-level cell
       - 3D un-stackable
       + High endurance (~$10^{15}$ writes)
       + ~4ns switching time
       + ~50uW switching power
     • PCM-RAM [Pronin. EETime’13]
       + Multi-level cell capable
       + $4F^2$ 3D-stackable cell
       - Endurance: ~$10^{9}$ writes
       - ~100ns switching time
       - ~300uW switching power
     • R-RAM [Henderson. InfoTracks’11]
       + Multi-level cell capable
       + $4F^2$ 3D-stackable cell
       - Endurance: $10^{6}$ to $10^{12}$ writes
       + ~5ns switching time
       + ~50uW switching power
     Source: [ITRS’13]

  4. Positioning of Resistive Memories
     [Figure: SRAM, STT, DRAM, PCM, RRAM, FLASH, and HDD positioned along axes labeled Higher Speed, Higher Endurance, Lower Cost, and Capacity]

  5. In-Memory Processing

  6. Example Research Question
     • Can we reduce the cost of data movement between memory and the processor core?
     • How can data movement energy be reduced?
     [Figure: Processor (1X) versus Memory (500X); data source: Nvidia]

  8. Combinatorial Optimization
     • Numerous critical problems in science and engineering can be cast within the combinatorial optimization framework.
     [Figure labels]
     • Application areas: Pharmaceuticals, Communication Networks, Artificial Intelligence, Machine Learning, Data Mining, DNA Analysis
     • Combinatorial optimization problems: Traveling Salesman, Knapsack, Bin Packing, Scheduling
     • Approximate heuristic algorithms: Genetic Algorithms, Ant Colony Optimization, Semi-Definite Programming, Tabu Search, Simulated Annealing

  9. Combinatorial Optimization
     • Numerous critical problems in science and engineering can be cast within the combinatorial optimization framework.
     [Figure labels]
     • Application areas: Pharmaceuticals, Communication Networks, Artificial Intelligence, Machine Learning, Data Mining, DNA Analysis
     • Combinatorial optimization problems: Traveling Salesman, Knapsack, Bin Packing, Scheduling
     • Approximate heuristic algorithms: Genetic Algorithms, Ant Colony Optimization, Semi-Definite Programming, Tabu Search, Simulated Annealing
     • Massively Parallel Boltzmann Machine

  10. The Boltzmann Machine
     • Two-state units connected by real-valued edge weights form a stochastic neural network.
     • Goal: iteratively update the state or weight variables to minimize the network energy ($E$).
     [Figure: The Boltzmann Machine; states $x_0, \dots, x_3$ weighted by $w_{0,j}, \dots, w_{3,j}$ are summed (Σ) at unit $x_j$]

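  11. The Boltzmann Machine
     • Two-state units connected by real-valued edge weights form a stochastic neural network.
     • Goal: iteratively update the state or weight variables to minimize the network energy ($E$).
     • Network energy: $E = -\frac{1}{2} \sum_i \sum_j x_i x_j w_{i,j}$
     • Unit update: $\frac{1}{1 + e^{\delta / C}}$, where $C$ is the control parameter
     • $\delta = (2 x_j - 1) \sum_i x_i w_{i,j}$
     [Figure: The Boltzmann Machine; states $x_0, \dots, x_3$ weighted by $w_{0,j}, \dots, w_{3,j}$ are summed (Σ) at unit $x_j$]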
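The following short derivation shows why $\delta$ can be read as the change in network energy caused by flipping unit $x_j$. It rests on assumptions the slide does not state: symmetric weights, no self-connections, and binary states. Reading the sigmoid as the probability of accepting the flip is likewise an interpretation of the slide, not something it says explicitly.

```latex
% Assumptions (not on the slide): w_{i,j} = w_{j,i}, w_{j,j} = 0, x_i \in \{0, 1\}.
\begin{align*}
E &= -\tfrac{1}{2} \sum_i \sum_j x_i x_j w_{i,j}
   = -x_j \sum_i x_i w_{i,j} + (\text{terms without } x_j) \\
\Delta E_j &= E\big|_{x_j \to 1 - x_j} - E
   = -\big[(1 - x_j) - x_j\big] \sum_i x_i w_{i,j}
   = (2 x_j - 1) \sum_i x_i w_{i,j} = \delta \\
P_j &= \frac{1}{1 + e^{\delta / C}}
\end{align*}
```

Under this reading, $P_j$ approaches 1 when the flip lowers the energy by much more than $C$, equals 1/2 when $\delta = 0$, and approaches 0 when the flip would raise the energy by much more than $C$. A larger $C$ flattens the curve, so uphill moves become more likely.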

  12. Computational Model
     • Network energy is minimized by either adjusting the edge weights or recomputing the states.
     • Iterative matrix-vector multiplication between weights and states is critical to finding the minimal network energy.
     [Figure: memory arrays holding the weights ($w_{0,0}, w_{0,1}, w_{1,0}, \dots$) and states ($x_0, x_1, \dots$); data movement between the memory arrays and the Boltzmann machine functional units (Σ and $\frac{1}{1 + e^{x}}$)]
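The computational model above can be sketched in software as a simulated-annealing style loop. This is a minimal pure-Python illustration, not the implementation from the slides: the function names, the geometric schedule for the control parameter, the sequential update order, and the assumption of symmetric weights with a zero diagonal are all choices made here for the example.

```python
import math
import random


def energy(x, w):
    """Network energy E = -1/2 * sum_i sum_j x_i * x_j * w[i][j]."""
    n = len(x)
    return -0.5 * sum(x[i] * x[j] * w[i][j] for i in range(n) for j in range(n))


def local_fields(x, w):
    """Matrix-vector product between weights and states: sum_i x_i * w[i][j]."""
    n = len(x)
    return [sum(x[i] * w[i][j] for i in range(n)) for j in range(n)]


def flip_probability(delta, c):
    """1 / (1 + e^(delta / C)); the exponent is clamped to avoid overflow."""
    z = max(-60.0, min(60.0, delta / c))
    return 1.0 / (1.0 + math.exp(z))


def anneal(w, x, c_start=2.0, c_end=0.05, sweeps=200, seed=0):
    """Stochastically flip units while lowering the control parameter C.

    Assumes w is symmetric with a zero diagonal (an assumption of this sketch).
    """
    rng = random.Random(seed)
    x = list(x)
    n = len(x)
    for s in range(sweeps):
        # Hypothetical geometric schedule from c_start down to c_end.
        c = c_start * (c_end / c_start) ** (s / max(sweeps - 1, 1))
        for j in range(n):
            # One entry of the matrix-vector product, recomputed after each flip.
            field = sum(x[i] * w[i][j] for i in range(n))
            delta = (2 * x[j] - 1) * field
            if rng.random() < flip_probability(delta, c):
                x[j] = 1 - x[j]
    return x


if __name__ == "__main__":
    weights = [[0, 1, -2], [1, 0, 1], [-2, 1, 0]]
    start = [1, 0, 1]
    final = anneal(weights, start)
    print(start, energy(start, weights), final, energy(final, weights))
```

Every unit update needs a fresh weighted sum over all states, which is why the matrix-vector product dominates the work and why moving weights and states between memory and functional units becomes the bottleneck.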

  13. Resistive Random Access Memory
     • RRAM: Resistive RAM
     • An RRAM cell comprises an access transistor and a resistive switching medium.
     [Figure: RRAM cell with wordline, bitline, and voltage V (source: HP, 2009); RRAM arrays alongside the Boltzmann machine functional units]

  14. Resistive Random Access Memory
     • A read is performed by activating a wordline and measuring the bitline current ($I$).
     • $I = V / R_1$
     [Figure: RRAM arrays with one wordline driven to ‘1’, selecting the cell with resistance $R_1$; the Boltzmann machine functional units]

  15. Memristive Boltzmann Machine
     • Key idea: exploit current summation on the RRAM bitlines to compute a dot product.
     • $I = \sum_i V / R_i$
     [Figure: RRAM arrays with several wordlines driven to ‘1’ at once; the Boltzmann machine functional units]

  16. Memristive Boltzmann Machine
     • Memory cells represent the weights; state variables control the bitlines and wordlines.
     • $I = \sum_i V / R_i$
     [Figure: network of units $X_0, \dots, X_4$, with $X_0$ connected to the others through weights $w_{01}, w_{02}, w_{03}, w_{04}$; RRAM arrays; the Boltzmann machine functional units]

  17. Memristive Boltzmann Machine
     • Memory cells represent the weights; state variables control the bitlines and wordlines.
     • $I = \sum_i X_0 X_i W_{0i}$
     [Figure: RRAM array column storing $w_{01}, w_{02}, w_{03}, w_{04}$, with $X_1, \dots, X_4$ driving the wordlines and $X_0$ driving the bitline; the Boltzmann machine functional units]
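The current-summation idea from the slides above can be illustrated numerically. The sketch below is an idealized model under assumptions that are not in the slides: each weight is stored as a conductance proportional to its value (`G_UNIT` is a hypothetical scaling constant), weights are non-negative, and wire resistance, device variation, and sensing noise are ignored. Since $V / R_i = V \cdot G_i$, the code works with conductances to avoid dividing by zero for zero weights.

```python
G_UNIT = 1e-6  # hypothetical conductance (siemens) representing a weight of 1


def weights_to_conductances(weights):
    """Map non-negative weights to cell conductances G_i = w_i * G_UNIT."""
    if any(w < 0 for w in weights):
        raise ValueError("this idealized sketch only handles non-negative weights")
    return [w * G_UNIT for w in weights]


def bitline_current(v_read, conductances, states, x0=1):
    """Total bitline current I = sum_i V * G_i over cells whose wordline is on.

    states[i] plays the role of X_i (wordline on or off);
    x0 plays the role of X_0 (bitline enabled or not).
    """
    if not x0:
        return 0.0
    return sum(v_read * g for g, x in zip(conductances, states) if x)


def current_to_dot(current, v_read):
    """Recover the weighted sum sum_i X_0 * X_i * W_0i from the measured current."""
    return current / (v_read * G_UNIT)


if __name__ == "__main__":
    g = weights_to_conductances([1, 2, 3, 4])
    i_total = bitline_current(0.2, g, [1, 0, 1, 1])
    print(current_to_dot(i_total, 0.2))
```

All selected cells contribute to the bitline current simultaneously, so the weighted sum is obtained in a single read rather than one multiply-accumulate per weight.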

  18. System Integration
     • DDR3 reads and writes are used for configuration and data transfer.
     • Software configures the on-chip data layout and initiates the optimization by writing to a memory-mapped control register.
       1. Configure the DIMM
       2. Write weights and states
       3. Compute
       4. Read the outcome
     • To maintain ordering, accesses to the accelerator are made uncacheable by the processor.
     [Figure: CPU with DRAM controller connected to DRAM and to the accelerator DIMM]
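The four-step host flow above can be sketched against a software stand-in for the accelerator. Everything in this block is hypothetical: the class `MockAcceleratorDimm`, its method names, and the `CTRL_START` value are invented for illustration and do not describe the real device interface. The compute step is a placeholder greedy descent, not the stochastic hardware algorithm, and it assumes symmetric weights with a zero diagonal so that it terminates.

```python
class MockAcceleratorDimm:
    """Software stand-in for the accelerator DIMM (hypothetical interface)."""

    CTRL_START = 1  # hypothetical value written to the control register

    def __init__(self):
        self.n = None
        self.weights = None
        self.states = None
        self.log = []

    def configure(self, n_units):
        """Step 1: set up the on-chip data layout."""
        self.n = n_units
        self.log.append("configure")

    def write_data(self, weights, states):
        """Step 2: transfer weights and states (DDR3 writes on real hardware)."""
        if len(weights) != self.n or len(states) != self.n:
            raise ValueError("data does not match the configured layout")
        self.weights = [list(row) for row in weights]
        self.states = list(states)
        self.log.append("write")

    def write_control(self, value):
        """Step 3: writing the start value to the control register runs the job."""
        if value == self.CTRL_START:
            self._optimize()
            self.log.append("compute")

    def read_result(self):
        """Step 4: read the outcome back (DDR3 reads on real hardware)."""
        self.log.append("read")
        return list(self.states)

    def _optimize(self):
        # Placeholder: flip any unit whose flip lowers the energy, until none does.
        changed = True
        while changed:
            changed = False
            for j in range(self.n):
                field = sum(self.states[i] * self.weights[i][j] for i in range(self.n))
                if (2 * self.states[j] - 1) * field < 0:
                    self.states[j] = 1 - self.states[j]
                    changed = True


def run_optimization(dimm, weights, states):
    """Host-side sequence; real accesses would need to be uncacheable and ordered."""
    dimm.configure(len(states))
    dimm.write_data(weights, states)
    dimm.write_control(dimm.CTRL_START)
    return dimm.read_result()


if __name__ == "__main__":
    dimm = MockAcceleratorDimm()
    print(run_optimization(dimm, [[0, 1, -2], [1, 0, 1], [-2, 1, 0]], [1, 0, 1]))
    print(dimm.log)
```

The mock runs the steps in program order by construction. On real hardware the same guarantee comes from marking the accelerator's address range uncacheable, as the slide notes, so that the control-register write cannot be reordered or satisfied from a cache.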

  19. Summary of Results
     [Figure: log-log scatter plot. Vertical axis: Execution Time Normalized to the Single Threaded Kernel (0.01 to 1). Horizontal axis: System Energy Normalized to the Single Threaded Baseline (0.01 to 1). Data points: Multi-threaded Kernel, PIM Accelerator, Memristive Accelerator. Annotations: 34x, 60x, 9x, 6x]
