Emerging Memories and Pathfinding for the Era of sub-10nm System-on-Chip
Seung Kang, Qualcomm Technologies, Inc.
IEEE Solid-State Circuits Society Seminar, San Diego, CA, August 8, 2019


  1. Emerging Memories and Pathfinding for the Era of sub-10nm System-on-Chip. Seung Kang, Qualcomm Technologies, Inc. IEEE Solid-State Circuits Society Seminar, San Diego, CA, August 8, 2019

  2. Memory Is Big Business: >> $100 Billions*. Sources: https://www.dw.com/ and http://www.icinsights.com/news/bulletins/Total-Memory-Market-Forecast-To-Increase-10-In-2017/. * Not including embedded memories for AP, SOC, and MCU

  3. Memory Subsystem: Hierarchical memory layers. Layers, from Computing to Data: RF, On-chip Cache, Off-chip Cache, Main Memory, Local Storage, Remote Storage. Axis label: Bit Cost

  4. Memory Subsystem: There is no such thing as a universal memory. Layers: RF; SRAM; “Embedded” DRAM; DRAM; Flash (SSD), HDD; Flash (SSD), HDD, Tape. Priorities: Speed and Endurance at the top of the hierarchy, Density and Retention at the bottom

  5. Problem Statement 1: "Memory Wall". Overall system performance & power governed more by memory subsystem than by CPU subsystem. [Figure: memory hierarchies of an MCU and an SOC/AP, spanning Computing-centric to Data-centric. Labels: ROM, RF, OTP/MTP, CPU, SRAM, eFlash, External Flash, L1, L2 SRAM, L3 Cache, GPU, Custom Memory, Embedded, Standalone, DRAM, Flash Storage/SSD, HDD. Axis label: Cost]

  6. Problem Statement 2: Many-Core Processors. Increasing SRAM area & leakage power overhead. Intel Broadwell-E (14nm node): shared L3 cache of 25 Mbytes (60 Mbytes for 24 cores) • Datacenter applications projecting 120 Mbytes (960 Mb) L3 cache at 10nm and beyond • More expensive at advanced nodes (6T-SRAM: 550 F² at 7 nm vs. 150 F² at 40 nm) • High standby/leakage power (worse at high T)
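
The F² figures on this slide can be turned into rough absolute areas. The sketch below is a back-of-the-envelope check, not the presenter's calculation: it assumes F equals the node name in nm (a simplification, since node names no longer match a physical feature size), uses binary megabits, and ignores peripheral circuitry and array efficiency.

```python
# Rough bitcell-area arithmetic from the slide's 6T-SRAM figures.
# Assumptions (not from the slide): F = node name in nm, binary megabits,
# no peripheral or array-efficiency overhead.
NM2_PER_MM2 = 1e12  # 1 mm^2 = 1e12 nm^2


def cell_area_nm2(f2_factor, node_nm):
    """Bitcell area in nm^2 for a cell of f2_factor * F^2 with F = node_nm."""
    return f2_factor * node_nm ** 2


def array_area_mm2(bits, f2_factor, node_nm):
    """Raw cell-array area in mm^2 for the given number of bits."""
    return bits * cell_area_nm2(f2_factor, node_nm) / NM2_PER_MM2


l3_bits = 120 * 8 * 2**20            # 120 Mbytes = 960 Mb
area_7nm = cell_area_nm2(550, 7)     # 550 F^2 at 7 nm
area_40nm = cell_area_nm2(150, 40)   # 150 F^2 at 40 nm

ideal_shrink = (40 / 7) ** 2         # shrink if the F^2 factor had stayed constant
actual_shrink = area_40nm / area_7nm

print(f"7 nm cell: {area_7nm} nm^2, 40 nm cell: {area_40nm} nm^2")
print(f"ideal shrink {ideal_shrink:.1f}x vs. actual {actual_shrink:.1f}x")
print(f"960 Mb at 7 nm: {array_area_mm2(l3_bits, 550, 7):.1f} mm^2 of raw cells")
```

Under these assumptions the cell still shrinks in absolute terms, but by far less than ideal geometric scaling would give, which is one way to read the slide's "more expensive at advanced nodes" point.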

  7. Problem Statement 3: IOT & Embedded System. Inherent drawbacks caused by memory limitations: • Energy-hungry • Poor form factor • High cost • Security vulnerability. “The IOT is an NVM problem.” Greg Yeric, ARM (2015 IEDM Plenary Talk)

  8. A New Perspective on Energy Efficiency: New Demand and Criteria for Wearable and Bioelectronic Devices. Critical Challenge: Battery Life (Energy Efficiency)

  9. A New Perspective on Security & Privacy: Demand for secure memory and HW primitives (e.g. PUF) at the Endpoint, Gateway, and Cloud

  10. Problems, new requirements, and opportunities demand advanced memories…

  11. Memory Classification (by Device Type)
      Volatile Memory: SRAM, DRAM
      Nonvolatile Memory:
        Charge Modulation: Flash (2D/3D NAND, NOR), FRAM
        Resistance Modulation: PCM; MRAM (STT-MRAM, SOT/SHE, Field MRAM); RRAM (Ox-RAM, CB-RAM, VMCO, CNT, Mott Transition)
      Legend: Mature (mainstream or commoditized); Emerging (currently in small markets)

  12. Phase Change Memory: PCM, PC-RAM, PRAM

  13. PCM: Early History. Neale, Nelson, & Moore, Electronics, 1970: “Nonvolatile and reprogrammable, the read-mostly memory is here” • Density: 256 bits • Die Size: 122-by-131-mil (10.3 mm²) • Read: 2.5 mA, < 5 V • Set: 5 mA, 25 V, 10 ms • Reset: < 200 mA, 25 V, 5 µs

  14. PCM: Basic Concept. Phase-change element: Amorphous (High R) vs. Crystalline (Low R). Source: Samsung (2006) • Chalcogenide alloy (e.g. Ge-Sb-Te/GST) • Programming: Joule heating followed by natural cooling; T > melting point gives the amorphous state, T > crystallization T gives the crystalline state • Relatively simple physics!
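
The programming rule on this slide can be sketched as a two-threshold state model. This is a minimal illustration, not a device model: the class name `PcmCell` and both threshold temperatures are placeholders chosen for the example, and real cells also depend on pulse duration and cooling rate, which are ignored here.

```python
from dataclasses import dataclass

# Illustrative thresholds only (assumed for this sketch, not from the slide).
T_CRYSTALLIZE_C = 150.0
T_MELT_C = 600.0


@dataclass
class PcmCell:
    """Toy phase-change cell: the state is set by the peak pulse temperature."""
    amorphous: bool = False  # start crystalline (low R)

    def apply_pulse(self, peak_temp_c: float) -> None:
        if peak_temp_c > T_MELT_C:
            # Reset: melt the element; cooling leaves it amorphous (high R).
            self.amorphous = True
        elif peak_temp_c > T_CRYSTALLIZE_C:
            # Set: heat above crystallization but below melting (low R).
            self.amorphous = False
        # Below both thresholds (e.g. a read), the state is unchanged.

    def read(self) -> str:
        return "high R" if self.amorphous else "low R"


cell = PcmCell()
cell.apply_pulse(700.0)   # reset pulse
after_reset = cell.read()
cell.apply_pulse(300.0)   # set pulse
after_set = cell.read()
print(after_reset, "->", after_set)
```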

  15. PCM: Cell and Array Architecture. Cell = Access Device + Phase-change Element (1BJT-1R, 1FET-1R, 1Diode-1R). The required characteristics of the access FET, diode, or BJT are largely governed by the upper limit of the reset current (to drive localized melting) at a target cell size. Cross-bar Array

  16. PCM: Evolution of Cell Configuration. Source: H.-L. Lung (ITRS ERD, 2014). >90% of heat is wasted during reset. Improve thermal isolation → Lower reset current/power; improved endurance & retention

  17. PCM: Reliability (Cycling Endurance). Chen et al. (Macronix-IBM, IMW, 2009). Undoped GST: 0 cycles, 10 cycles, 10K cycles, 1M cycles. Doped GST: 0 cycles, 1K cycles, 100M cycles, 1B cycles

  18. PCM: Reliability (Retention). Shih et al. (Macronix-IBM, IEDM, 2008)

  19. PCM: Prototype. Samsung 8Gb PCM (ISSCC, 2012), 4.2F²

  20. PCM: Evolution to 3D
      PCMS, Kau et al. (Intel & Numonyx, IEDM, 2009):
      • Phase-change memory (PCM) coupled with a selector (OTS)
      • OTS: Ovonic Threshold Switch
      • 64 Mb
      • Endurance: 10^6 cycles
      3D XPoint (Intel & Micron, 2016), stacked Selector and Memory layers:
      • 20nm node
      • 128 Gb
      • SLC
      Intel Optane Memory Series (2017). Source: Intel.com

      Chip Density                   16 GB (128 Gb)   32 GB
      Read Latency                   7 µs             9 µs
      Write Latency                  18 µs            30 µs
      Random Read                    190K IOPS        240K IOPS
      Random Write                   35K IOPS         65K IOPS
      Sequential Read                900 MB/s         1350 MB/s
      Sequential Write               145 MB/s         290 MB/s
      Power (Active/Idle)            3.5 W / 1 W
      Endurance (Lifetime Writes)    182.5 TB
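
To put the endurance rating in perspective, the lifetime-writes figure can be converted into full-device overwrites and time to wear-out. This is a rough sketch assuming decimal units, that the 182.5 TB rating applies to the 16 GB part, and a hypothetical workload of 100 GB written per day; none of these assumptions is stated on the slide.

```python
# Back-of-the-envelope endurance arithmetic for the 16 GB column above.
# Assumptions (not from the slide): decimal units, rating applies to the
# 16 GB device, and a hypothetical 100 GB/day write workload.
TB, GB, MB = 1e12, 1e9, 1e6
SECONDS_PER_DAY = 86_400

lifetime_writes = 182.5 * TB   # bytes
capacity = 16 * GB             # bytes
seq_write_bw = 145 * MB        # bytes per second

# Number of times the whole device can be overwritten within the rating.
full_overwrites = lifetime_writes / capacity

# Time to exhaust the rating when writing sequentially at full speed.
days_at_full_rate = lifetime_writes / seq_write_bw / SECONDS_PER_DAY

# Time to exhaust the rating at the hypothetical 100 GB/day workload.
years_at_100gb_per_day = lifetime_writes / (100 * GB) / 365

print(f"{full_overwrites:.0f} full-device overwrites")
print(f"{days_at_full_rate:.1f} days at sustained sequential write")
print(f"{years_at_100gb_per_day:.1f} years at 100 GB/day")
```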

  21. 3D XPoint as Storage Class Memory. It does not replace DRAM or NAND storage; it adds a new layer to improve the subsystem. Source: Intel-Micron, 2015

  22. Magnetoresistive RAM (MRAM); Spin-transfer-torque MRAM (STT-MRAM, ST-MRAM, STT-RAM)

  23. A Building Block: Magnetic Tunnel Junction. Multiple flavors, but perpendicular MTJ. Stack: Free Layer, Tunnel Barrier, Pinned Layer. Electrical resistance varied by relative electron spin alignment: Magnetoresistance (MR). Parallel: Low Resistance (R_P). Antiparallel: High Resistance (R_AP). Relatively small read window. Electrical switching, not magnetic switching
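
The "relatively small read window" can be made concrete with the usual tunnel magnetoresistance ratio, TMR = (R_AP - R_P) / R_P. The sketch below uses made-up resistance values purely for illustration, and a read voltage of 0.1 V as quoted on the MRAM Snapshot slide.

```python
def tmr(r_p: float, r_ap: float) -> float:
    """Tunnel magnetoresistance ratio, (R_AP - R_P) / R_P."""
    return (r_ap - r_p) / r_p


def read_current_window(v_read: float, r_p: float, r_ap: float) -> float:
    """Difference in read current (A) between the P and AP states."""
    return v_read / r_p - v_read / r_ap


# Hypothetical MTJ resistances in ohms (illustrative, not from the slides).
r_p, r_ap = 5_000.0, 10_000.0
v_read = 0.1  # volts

ratio = tmr(r_p, r_ap)
window_ua = read_current_window(v_read, r_p, r_ap) * 1e6

print(f"TMR = {ratio:.0%}, read window = {window_ua:.1f} uA")
```

Even with the resistance doubling between states in this example, the sense amplifier has only about ten microamps to work with at such a low read voltage, which is why the slides stress robust sensing.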

  24. MRAM Snapshot. A new class of memory: Nonvolatile RAM • Fast NVM • High endurance • 3 additional masks over baseline logic • Low voltage (no charge pump) • Scalable. Operation voltage on MTJ: Read: 0.1 V; Write: 0.3 to 0.5 V. Lu et al. (Qualcomm & TDK), IEDM, 2015; Park et al. (Qualcomm & Applied Mat.), IEDM, 2015

  25. MRAM Array Architecture. [Figure: MTJ array with Write Driver, Reference Generator, Read SA, MUX, and SLDP (local data path); bit-line/source-line pairs BL0/SL0 to BL31/SL31; word lines wl<0> to wl<511>; reference MTJ array (Rref) next to the data MTJ array; 2 IOs + Ref] • Use the same bitcell for both data and reference array • Small read window → Design for robust read (sensing) is critical • Balancing switching asymmetry and source generation
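
One way to see why the reference array reuses the data bitcell is that a reference built from real MTJs shifts with the same process variation as the data cells. The sketch below assumes a midpoint reference averaged from one parallel and one antiparallel reference cell; the slide does not say how the reference is actually generated, and all resistance values and the shift factor are hypothetical.

```python
def midpoint_reference(r_p_ref: float, r_ap_ref: float) -> float:
    """Reference resistance halfway between a P and an AP reference cell.

    Assumed scheme for illustration; the slide does not specify one.
    """
    return 0.5 * (r_p_ref + r_ap_ref)


def sense(r_cell: float, r_ref: float) -> str:
    """Classify a cell as antiparallel (AP) or parallel (P) against a reference."""
    return "AP" if r_cell > r_ref else "P"


# Hypothetical nominal resistances (ohms) and a die-to-die resistance shift.
nominal_p, nominal_ap = 5_000.0, 10_000.0
shift = 1.6
p_cell, ap_cell = nominal_p * shift, nominal_ap * shift

fixed_ref = midpoint_reference(nominal_p, nominal_ap)  # designed for nominal
tracking_ref = midpoint_reference(p_cell, ap_cell)     # built from shifted cells

print("fixed reference reads P cell as:", sense(p_cell, fixed_ref))
print("tracking reference reads P cell as:", sense(p_cell, tracking_ref))
```

In this toy case the fixed reference misreads the shifted parallel cell, while the reference made from the same bitcells still separates the two states.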

  26. Challenges for MRAM Design and Reliability. Narrow design window for deeply scaled nodes. Prevent read error: ▪ Low V_Read (0.1V) ▪ High TMR ▪ Fast fall off of RDR slope. Prevent write error: ▪ Low V_Write ▪ Fast fall off of WER slope. Improve barrier reliability: ▪ High V_BD ▪ Contain TDDB

  27. MRAM Device Scalability: I_c. Most important bitcell and design parameter. [Plot: Critical Switching Current (µA) vs. MTJ Diameter (nm). Saida et al., VLSI Symp., 2016; Kang, VLSI Symp., 2014] At small dimensions, dynamic current consumption is becoming comparable with SRAM cell current

  28. MRAM Device Scalability: Endurance. Practically unlimited endurance for cache applications. Kan et al., IEDM, 2016. [Plots: Endurance Requirement (cycles) vs. Millions of accesses per core per second, at 10 years and 50% duty cycle, for L2 SRAM (256 KB), L2 MRAM (1024 KB), L3 SRAM (1.5 MB), and L3 MRAM (6 MB); Cycles to Breakdown and Time to Breakdown (sec) vs. MTJ Voltage (V) for (+) P-AP and (-) AP-P polarity, 50 ns pulse, 1 ppm; Resistance (Ohms) vs. MTJ Voltage (V); 25 nm and 45 nm MTJs] Intrinsically solid. Better with MTJ scaling. In real life, subjected to design robustness & defect control
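
The endurance requirement plotted on this slide follows from simple arithmetic on access rate, lifetime, and duty cycle. The sketch below is an assumption-laden estimate rather than the authors' model: it treats every access as a write cycle and adds a hypothetical `cells_sharing_load` parameter to spread accesses evenly over cache lines (the default of 1 is the worst case where one cell takes every access).

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def required_cycles(accesses_per_s: float,
                    years: float = 10.0,
                    duty_cycle: float = 0.5,
                    cells_sharing_load: int = 1) -> float:
    """Cycles each cell must survive over the product lifetime.

    Assumes every access is a write and that accesses are spread evenly
    over `cells_sharing_load` locations (hypothetical parameter).
    """
    total = accesses_per_s * years * SECONDS_PER_YEAR * duty_cycle
    return total / cells_sharing_load


# 100 million accesses per second, 10 years, 50% duty cycle (slide conditions).
worst_case = required_cycles(100e6)

# Same load spread over a 1024 KB cache with 64-byte lines (line size assumed).
lines = 1024 * 1024 // 64
spread = required_cycles(100e6, cells_sharing_load=lines)

print(f"worst case: {worst_case:.2e} cycles, spread over {lines} lines: {spread:.2e}")
```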

  29. MRAM: Prototypes. Samsung (IEDM, 2016 / 7th MRAM Global Innovation Forum). SK Hynix-Toshiba (IEDM, 2016 / ISSCC, 2017): 4Gb, 9F² (30nm)

  30. MRAM: Qualcomm Demo System. MRAM integrated along with PSRAM and NOR Flash for performance and power benchmarking. Integrated into a demo tablet: 350X faster than Flash, 3X faster than PSRAM. Kang, IMW, 2016. MRAM can unify PSRAM (volatile RAM) and NOR (nonvolatile storage) with PPAC advantages

  31. MRAM In Production

  32. MRAM for Processing-in-Memory CNN Accelerator. A single-chip solution for Mobile and IOT applications. From Gyrfalcon Technologies (2018) • 22nm eMRAM (40 MB) • 9.9 TOPS/W
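
Since TOPS/W is tera-operations per second per watt, it is equivalent to tera-operations per joule, so the efficiency figure converts directly into energy per operation. The workload size in this sketch (one billion operations per inference) is a hypothetical number chosen only to make the arithmetic concrete.

```python
TOPS_PER_WATT = 9.9                    # from the slide
ops_per_joule = TOPS_PER_WATT * 1e12   # TOPS/W is tera-operations per joule

# Energy per operation in picojoules (1 J = 1e12 pJ).
energy_per_op_pj = 1e12 / ops_per_joule


def inference_energy_mj(ops_per_inference: float) -> float:
    """Energy in millijoules for one inference at the quoted efficiency."""
    return ops_per_inference / ops_per_joule * 1e3


# Hypothetical workload: 1e9 operations per inference (not from the slide).
example_mj = inference_energy_mj(1e9)

print(f"{energy_per_op_pj:.3f} pJ per operation")
print(f"{example_mj:.3f} mJ per 1e9-operation inference")
```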

  33. Resistive RAM (RRAM, ReRAM); Conductive Bridge RAM (CB-RAM)

  34. RRAM: Materials. Two-terminal resistive switching elements (excluding PCM and MRAM). Found in numerous combinations of materials. Source: P. Wong (Stanford, 2011)

  35. RRAM: Common Classification. Different materials & switching characteristics
      • Conductive Metal Oxide RRAM, also Vacancy Modulated Conductive Oxide RRAM (VMCO RRAM). Stack: Top Electrode, Tunnel Barrier, Conductive Metal Oxide, Bottom Electrode. Interfacial Switching (2D); Uniform Switching (No forming)
      • Oxide RRAM (Ox-RAM), also Transition Metal Oxide RRAM. Stack: Top Electrode, Metal Oxide, Bottom Electrode. Filamentary Switching (1D)
      • Conductive Bridge RRAM (CB-RAM), also Programmable Metallization Cell (PMC). Stack: Top Electrode, Metal Ion Reservoir, Solid Electrolyte, Bottom Electrode. Filamentary Switching (1D)
