
ADVANCED MEMORY SYSTEMS Mahdi Nazm Bojnordi Assistant Professor - PowerPoint PPT Presentation



  1. ADVANCED MEMORY SYSTEMS
     Mahdi Nazm Bojnordi, Assistant Professor
     School of Computing, University of Utah
     CS/ECE 6810: Computer Architecture

  2. Programmable Controller

  3. Limitations to Existing Memory Controllers
     - Modern memory controllers are performance-critical and complex
     - Demands on the controller:
       - Multiple performance objectives
       - Application-specific optimizations
       - Patches and in-field updates
     - Controller functions: address mapping, power management, command scheduling, QoS maintenance, refresh management
     [Figure: Cores 1-4 and a shared cache on-chip; the memory controller; Banks 1-4 off-chip]

  4. Programmable Memory Controllers
     - Programmability can make a memory controller higher-performance and more flexible
     - Demands on the controller:
       - Multiple performance objectives
       - Application-specific optimizations
       - Patches and in-field updates
     [Figure: Cores 1-4 and a shared cache on-chip; a memory controller with a programmable framework; Banks 1-4 off-chip]

  5. Design Overview
     - Key idea: judicious division of labor between specialized hardware and firmware
       - Request and transaction processing in firmware
       - Configurable timing validation in hardware
     [Figure: PARDIS, with a request processor, a transaction processor, and command logic]

  6. Request Processing
     - A RISC ISA for operating on memory requests
     [Figure labels: memory request (address, metadata, application hints); request processor (ALU, memory, control flow); address mapping]
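
Slide 6 lists address mapping among the jobs done by the request-processing firmware. The sketch below is only an illustration of what such a mapping computes: it splits a block address into DRAM coordinates using a table of bit-field widths. The field names, their order, and their widths are assumptions made for this example and do not come from the slides.

```python
# Hypothetical bit-field layout, lowest-order field first.
# Names and widths are illustrative assumptions, not taken from the slides.
FIELDS = (
    ("column", 10),
    ("channel", 1),
    ("bank", 3),
    ("rank", 1),
    ("row", 15),
)


def map_address(addr, fields=FIELDS):
    """Split a block address into DRAM coordinates, low-order bits first."""
    coords = {}
    for name, width in fields:
        coords[name] = addr & ((1 << width) - 1)  # keep the low `width` bits
        addr >>= width                            # move on to the next field
    return coords


# Build an address from known coordinates, then map it back.
example = 3 | (1 << 10) | (5 << 11) | (0 << 14) | (7 << 15)
print(map_address(example))
```

Passing a different `fields` table changes the mapping without touching the function, which is the kind of flexibility a programmable controller is meant to offer.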

  7. Request Processing
     - Queue management with instruction flags
       - R flag enqueues a request
       - T flag dequeues a transaction
     - An instruction can be annotated with both R and T flags if needed
     [Figure: firmware instructions (ADD, SUB, AND, XOR), some annotated with R or T; request queue, request processor, transaction queue]

  8. Implementation
     - Two five-stage pipelines and one configurable timing validation circuit
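
Slides 5 and 8 place timing validation in a configurable hardware circuit. As a software analogy only, the sketch below checks whether a command may issue to a bank given a configurable table of minimum cycle gaps between command pairs. The command names, the table layout, and the cycle counts are hypothetical placeholders; they are not values from the slides or from any datasheet, and the sketch checks timing gaps only, not bank state.

```python
# Hypothetical constraint table: minimum cycles between a previous command
# and the next command to the same bank. All values are placeholders.
MIN_GAP = {
    ("ACT", "READ"): 11,
    ("ACT", "WRITE"): 11,
    ("ACT", "PRE"): 28,
    ("PRE", "ACT"): 11,
}


class TimingValidator:
    """Tracks the last issue cycle of each command per bank."""

    def __init__(self, min_gap):
        self.min_gap = dict(min_gap)  # reconfigurable by passing another table
        self.last_issue = {}          # (bank, command) -> issue cycle

    def can_issue(self, bank, cmd, now):
        """Return True if no minimum-gap constraint blocks `cmd` at cycle `now`."""
        for (prev, nxt), gap in self.min_gap.items():
            if nxt != cmd:
                continue
            issued = self.last_issue.get((bank, prev))
            if issued is not None and now - issued < gap:
                return False
        return True

    def issue(self, bank, cmd, now):
        """Record `cmd` as issued at cycle `now`, or raise if it is too early."""
        if not self.can_issue(bank, cmd, now):
            raise ValueError(f"{cmd} to bank {bank} violates timing at cycle {now}")
        self.last_issue[(bank, cmd)] = now


validator = TimingValidator(MIN_GAP)
validator.issue(0, "ACT", 0)
print(validator.can_issue(0, "READ", 5))   # too soon after ACT under this table
print(validator.can_issue(0, "READ", 11))  # gap satisfied under this table
```

Keeping the constraints in a table rather than in the checking logic mirrors the "configurable" part of the design: supporting a different device would mean loading different values, not rewriting the validator.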

  9. Emerging Technologies

  10. DRAM Cell Structure
      - One-transistor, one-capacitor (1T-1C) DRAM
        - Realizing the capacitor is challenging
      - Charge-based sensing
      - Volatile

  12. Memory Scaling in Jeopardy
      - Scaling of semiconductor memories is greatly challenged beyond 20 nm
      - Example: DRAM

  13. Memory Scaling in Jeopardy
      - Scaling of semiconductor memories is greatly challenged beyond 20 nm
      - Example: DRAM
      [Figure annotation: A/R < 10]

  14. Why Is DRAM Slow?
      - Logic VLSI process: optimized for better transistor performance
      - DRAM VLSI process: optimized for low cost and low leakage
      - How to reduce the distance?
      [Figure: logic and DRAM chips mounted on a PCB]

  15. Processing-in-Memory
      - Increasing bandwidth by placing processing units on the same die as DRAM
      - Not a new concept!
        - Merged Logic and DRAM (MLD)
          - IBM, Mitsubishi, Samsung, Toshiba, etc.
        - Other efforts
          - FlexRAM
          - IRAM
          - Active Pages
          - …

  16. Historical PIM Challenges
      - Hard to program (no standard interface)
      - Embedding logic on a modified DRAM process
        - Substantially larger transistors
          - Reduced memory capacity
        - Slower logic and lower performance
      - Embedding DRAM on a modified logic process
        - Leaky transistors, high refresh rates, increased cost/bit
        - Increased manufacturing complexity

  17. 3D Die-Stacking
      - Different devices are stacked on top of each other
      - Layers are connected by through-silicon vias (TSVs)
      - Why?
        - Communication between devices is bottlenecked by limited I/O pins
        - Integrating heterogeneous elements on a single wafer is expensive and suboptimal
      [Figure: logic and DRAM side by side on a PCB, compared with DRAM layers stacked on a logic layer]

  18. 3D Stacked Memory
      - Hybrid Memory Cube (HMC)
        - A logic layer at the bottom
      - High Bandwidth Memory (HBM)
        - Silicon interposer at the bottom
      [Figure labels: interface, bank controller, in-package cache controller, DRAM dice, processor die, silicon interposer, package substrate]
