REVISITING THE MEMORY HIERARCHY FOR TOMORROW'S COMPUTING SYSTEMS
Leti Devices Workshop | Elisa Vianello | December 4, 2016
OUTLINE
• Ever Increasing Need for More Memory
• Rethinking the Memory Hierarchy with New NV Memory Technologies
• Rethinking the System Architecture?
EVER INCREASING NEED FOR MORE MEMORY
The amount of data being created is growing 40% a year into the next decade.
[Figure: stored data in billions of GB, analog versus digital; labels: Mobile, Server, IoT. M. Hilbert et al., Science, Vol. 332, 2011]
• 2002: turning point between analog and digital formats
• By 2020 the digital universe is expected to contain nearly as many digital bits as there are stars in the universe.
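To give a feel for what 40% annual growth implies, here is a minimal Python sketch of the compound-growth arithmetic. The function names are illustrative and not from the talk. The only input taken from the slide is the 40% rate, and the sketch assumes that rate stays constant.

```python
import math

ANNUAL_GROWTH = 0.40  # 40% per year, the rate quoted on the slide


def doubling_time_years(rate: float) -> float:
    """Years needed for a quantity growing at `rate` per year to double."""
    return math.log(2) / math.log(1 + rate)


def growth_factor(rate: float, years: float) -> float:
    """Multiplicative growth after `years` of compounding at `rate`."""
    return (1 + rate) ** years


if __name__ == "__main__":
    print(f"doubling time: {doubling_time_years(ANNUAL_GROWTH):.2f} years")
    print(f"growth over 10 years: {growth_factor(ANNUAL_GROWTH, 10):.1f}x")
```

Under this constant-rate assumption, the data volume doubles roughly every two years and grows by close to a factor of 30 over a decade.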
EVER INCREASING NEED FOR MORE MEMORY
The growth in data produced is outpacing the improvements in the density and cost of storage technologies.
• Core count doubling ~ every 2 years; DRAM capacity doubling ~ every 3 years (Source: Liam et al ISCA 2009)
• Compute costs dropping faster than memory costs, logic vs. memory (Source: IBM Deep Computing)
New memory technologies and a rethinking of system architecture focused on data storage and management are needed!
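One consequence of the two doubling periods quoted above can be sketched numerically: if core count doubles every 2 years and DRAM capacity every 3 years, the DRAM capacity available per core shrinks over time. The snippet below is a rough illustration that treats both trends as exact exponentials, which is an assumption. The function name is hypothetical.

```python
CORE_DOUBLING_YEARS = 2.0  # from the slide
DRAM_DOUBLING_YEARS = 3.0  # from the slide


def relative_dram_per_core(years: float) -> float:
    """DRAM capacity per core after `years`, relative to year 0.

    Assumes both trends are exact exponentials with the doubling
    periods quoted on the slide.
    """
    cores = 2 ** (years / CORE_DOUBLING_YEARS)
    dram = 2 ** (years / DRAM_DOUBLING_YEARS)
    return dram / cores


if __name__ == "__main__":
    for years in (0, 6, 12):
        print(years, relative_dram_per_core(years))
```

With these numbers the ratio behaves as 2^(-t/6), so the per-core DRAM capacity halves about every 6 years.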
CONVENTIONAL TYPES OF MEMORIES AND TECHNOLOGIES
Data moves along all levels of the storage hierarchy before and after being processed at the processor.
Hierarchy, from the processor down (cost/bit and speed are highest at the top, memory volume is largest at the bottom):
• Registers: CPU register files
• Caches L1-L4: SRAM
• Main memory: DRAM (figure labels: MIM CAP, 1T 1R)
• Latency gap
• Storage: SSD (NAND Flash), HDD (magnetic)
Can new NVMs (RRAM, PCM, MRAM…) help to reduce the latency gap and limit data movements across the memory hierarchy?
RRAM MEMORIES: PROGRAMMING WINDOW VS. ENDURANCE
[Figure: programming window Roff/Ron [#] versus endurance, cycle number Nc [#], for CBRAM, OxRAM, PCRAM and MRAM. Data-point labels: Cu-GST/CuO/SiO2/BE, Cu/Ta2O5/Pt, Ag/GeS2/HfO2, Cu/PSE/Ru, Ti/TaSiO/Ru, GST-O, Golden GST-N, TaN/GeTe/Cu, GST-Sb, CNT, TaON, Ta2O5/TaOx, WOx, GaSbGe. E. Vianello IEDM 2014; L. Perniola IMW 2016]
• CBRAM: largest Roff/Ron chalco-based and/or bilayers; best endurance oxide-based
• OxRAM: largest Roff/Ron non-polar; best endurance bipolar
• PCRAM: best endurance GST-based
• MRAM: outlier
Universal memory does not exist!
Paper 4.5, Monday afternoon: RRAM Endurance, Retention and Window Margin Trade-off
STORAGE CLASS MEMORIES
Requirements:
• Fast access speed (approaching DRAM)
• Nonvolatile (retain data at power off)
• High endurance (program/erase cycles)
• Solid state (no moving parts)
• Low cost/bit (approaching HDD)
Latency of access (cell labels in the figure: MIM CAP, 1T 1R):
• CPU registers (register files): <0.001µs
• Cache L1-L4 (SRAM): <0.01µs
• DRAM: <0.03µs
• SCM (PCM/RRAM): 0.05µs-100µs?
• SSD (NAND Flash): >100µs
• HDD (magnetic): >10^3 µs
Computing: processing @ SCM, from data compression to query processing
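The latency figures on this slide make the gap that SCM is meant to fill easy to quantify. The sketch below computes the latency ratio between adjacent tiers of the conventional hierarchy, without the SCM tier. The numbers come from the slide, but they are bounds rather than measurements, so the ratios are order-of-magnitude estimates. The names `TIERS`, `adjacent_gaps` and `largest_gap` are illustrative.

```python
# Latency bounds in microseconds, as quoted on the slide. They are upper
# bounds ("<") for the fast tiers and lower bounds (">") for storage, so the
# ratios computed below are order-of-magnitude estimates only.
# The SCM tier (0.05 µs to 100 µs?) is deliberately left out.
TIERS = [
    ("CPU registers", 0.001),
    ("SRAM cache L1-L4", 0.01),
    ("DRAM", 0.03),
    ("SSD (NAND Flash)", 100.0),
    ("HDD (magnetic)", 1000.0),
]


def adjacent_gaps(tiers):
    """Return (faster tier, slower tier, latency ratio) for each adjacent pair."""
    return [
        (fast[0], slow[0], slow[1] / fast[1])
        for fast, slow in zip(tiers, tiers[1:])
    ]


def largest_gap(tiers):
    """Return the adjacent pair with the largest latency ratio."""
    return max(adjacent_gaps(tiers), key=lambda gap: gap[2])


if __name__ == "__main__":
    for fast, slow, ratio in adjacent_gaps(TIERS):
        print(f"{fast} -> {slow}: about {ratio:.0f}x")
```

With these bounds the largest step is between DRAM and NAND Flash, at more than three orders of magnitude, which is the range the slide assigns to SCM.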
« TRANSFORMATIONAL » ABILITY OF PCM
References: H. Y. Cheng et al., IEDM 2012; G. Navarro et al., IEDM 2013; H. Y. Cheng et al., JAP 2014; P. Zuliani et al., SSE 2015; V. Sousa et al., VLSI 2015
12 Mb 90 nm test chip: RESET 30ns, SET 800ns
[Figure: resistance [Ω] versus current [µA] for SET and RESET states; defect density versus cycles for SET and RESET; annotation: 2h bake at 230 °C]
HYBRID MAIN MEMORY
High capacity working memory
Hybrid memory: DRAM as a cache to RRAM or PCM tech to achieve the best of multiple technologies
• fast
• non-volatile
• durable
• high-density
• low-cost
Hierarchy: CPU register files; L1-L4 SRAM; DRAM + hybrid RRAM/PCM; SCM (PCM/RRAM); SSD (NAND Flash); HDD (magnetic)
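The idea of using DRAM as a cache in front of a larger RRAM or PCM array can be illustrated with a toy model. The sketch below is not the architecture from the talk: it assumes a page-granular LRU replacement policy and a 1 µs NVM access latency (the deck only gives a 0.05 µs to 100 µs range for SCM), and it ignores write-back and endurance effects. The class and function names are hypothetical.

```python
from collections import OrderedDict

DRAM_LATENCY_US = 0.03  # DRAM bound quoted in the storage class memory slide
NVM_LATENCY_US = 1.0    # assumed value inside the 0.05-100 µs SCM range


class HybridMemory:
    """Toy model: DRAM as an LRU cache of pages in front of a larger NVM."""

    def __init__(self, dram_pages: int):
        if dram_pages < 1:
            raise ValueError("dram_pages must be at least 1")
        self.dram_pages = dram_pages
        self.dram = OrderedDict()  # page -> None, ordered from LRU to MRU
        self.hits = 0
        self.misses = 0

    def access(self, page: int) -> float:
        """Touch `page` and return the modeled latency in microseconds."""
        if page in self.dram:
            self.dram.move_to_end(page)  # mark as most recently used
            self.hits += 1
            return DRAM_LATENCY_US
        self.misses += 1
        if len(self.dram) >= self.dram_pages:
            self.dram.popitem(last=False)  # evict the least recently used page
        self.dram[page] = None
        # A miss pays the DRAM lookup plus the fetch from NVM.
        return DRAM_LATENCY_US + NVM_LATENCY_US


def average_latency(memory: HybridMemory, trace) -> float:
    """Average modeled latency in microseconds over a page-access trace."""
    trace = list(trace)
    return sum(memory.access(page) for page in trace) / len(trace)


if __name__ == "__main__":
    mem = HybridMemory(dram_pages=2)
    print(average_latency(mem, [1, 2, 1, 3, 2, 1]), mem.hits, mem.misses)
```

In this model the average latency approaches the DRAM figure as the hit rate rises, while capacity and non-volatility come from the NVM tier, which is the trade the slide describes.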
Ti/HfO2 BASED OxRAM
<100ns programming time at 1V with up to 100M cycles
28nm CMOS, ST/Leti
References: E. Vianello IEDM 2014; A. Benoist IRPS 2014
LRS and HRS on 4kbits array (MAD Leti OxRAM)
Paper 4.7, Monday afternoon: Filament-based RRAM
L2-L3 CACHE
Replacing SRAM (L2-L3): best candidate is STT-MRAM (fast and high endurance)
Reduction of stand-by power consumption
4Mbit STT-MRAM: read 3.3ns @ 1.25V, 65nm CMOS [Noguchi, Toshiba, ISSCC 2016]
Hierarchy: CPU; cache L2 L3 with STT-MRAM; DRAM + hybrid PCM/MRAM; SCM (PCM/RRAM); SSD (NAND Flash); HDD (magnetic)
REPLACING SRAM WITH STT-MRAM
STT-MRAM has demonstrated fast writing speed and high endurance; however, retention is still to be fully characterized.
MAD Leti pSTT-MRAM 4kbit array:
• different retention extraction methods are compared
• a trade-off exists between thermal stability and programming speed
Paper 27.3, Wednesday morning: Data Retention Extraction Methodology pSTT-MRAM
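As background for the link between thermal stability and retention, the sketch below uses the standard Néel-Arrhenius relation, in which the mean retention time of a single bit is tau = tau0 * exp(Delta), with Delta the thermal stability factor. This is a textbook single-bit model and not the extraction methodology of paper 27.3. The attempt time tau0 = 1 ns is an assumed, commonly used value, and the function names are hypothetical.

```python
import math

ATTEMPT_TIME_S = 1e-9  # assumed attempt time tau0 (about 1 ns is a common choice)
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def mean_retention_s(delta: float, tau0: float = ATTEMPT_TIME_S) -> float:
    """Mean single-bit retention time in seconds for stability factor `delta`."""
    return tau0 * math.exp(delta)


def required_delta(retention_s: float, tau0: float = ATTEMPT_TIME_S) -> float:
    """Thermal stability factor needed for a given mean retention time."""
    return math.log(retention_s / tau0)


if __name__ == "__main__":
    ten_years = 10 * SECONDS_PER_YEAR
    print(f"Delta for 10-year mean retention: {required_delta(ten_years):.1f}")
```

Because retention depends exponentially on the stability factor, small changes in thermal stability move the retention time by orders of magnitude, which is why a trade-off against programming speed matters for cache-oriented designs.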
RETHINKING THE SYSTEM ARCHITECTURE?
Neuromorphic systems
• Detect and predict patterns in complex data
• Visual or auditory data analysis
RRAM has been promoted by Leti (and many others!) to emulate synaptic plasticity: the ability of synapses to strengthen or weaken over time
• CBRAM: GeS2/Ag
• PCM: GST, GeTe, GST/HfO2
• OxRAM: HfO2/Ti (planar + VRAM)
RRAM TO IMPLEMENT ON-LINE LEARNING
In nature, synapses have a volatile component. Is it useful for the learning process?
Application: decoding of neural signals (highly noisy neurological data) with an ANN with RRAM-based synapses
[Figure: conductance [S] versus time [s], RRAM data and biological data; synaptic change has a volatile component]
The volatile component improves detection in highly noisy input data.
Paper 16.6, Tuesday morning: Short and Long-term Synaptic Plasticity Using OxRAM
CONCLUSION
The data explosion is leading to a corresponding growth in data-centric applications (capture, classify, archive…).
The adoption of new NVMs enables a rethinking of system architecture based on data storage and management.
However, universal memory does not exist: different NVM technologies have to be introduced in the storage hierarchy.
Challenges:
• Design matched to the specific memory technology features
• Non-volatility does not come for free:
  • static power vs. active power
  • memory window vs. endurance
  • memory window vs. data retention
CONCLUSION
Thanks to the new NVM technologies (PCM, RRAM, MRAM), persistent memory is getting closer to the compute center, which avoids wasting energy on data movement.
Leti, technology research institute Commissariat à l’énergie atomique et aux énergies alternatives Minatec Campus | 17 rue des Martyrs | 38054 Grenoble Cedex | France www.leti.fr
EMBEDDED SYSTEMS: TOWARD DISTRIBUTED MEMORY
MCU power challenge: need for ultra-low stand-by power design solution (Source: Renesas ASP-DAC 2014)
[Figure labels: MCU (stand-by), sensor, radio, MCU (active)]
NV Flip-Flop [N. Jovanović ISCAS 2016]
[Figure labels: eFlash (NVM), logic, SRAM; NV logic, nvSRAM]
NVM merges with SRAM and logic: shut down SRAM & registers thanks to distributed NVM
28nm FDSOI + HfO2 based OxRAM
• thermal stability for smart card & automotive
• easy integration with advanced CMOS
X. Dong et al., "A circuit-architecture co-optimization framework for evaluating emerging memory hierarchies", ISPASS 2013