
Working Around the Limits of CMOS - PowerPoint PPT Presentation

Working Around the Limits of CMOS. Mary Jane Irwin, Penn State. NSF Workshop: Electronic Design Automation – Past, Present, and Future


  1. Working Around the Limits of CMOS. Mary Jane Irwin, Penn State. NSF Workshop: Electronic Design Automation – Past, Present, and Future. July 8 and 9, 2009

  2. Abstract
     • The design constraints of improved performance, better energy efficiency, increased reliability, and constrained design costs challenge EDA researchers as silicon technology continues to scale according to Moore’s Law. However, there are functions that our “standard” silicon technology – CMOS – just doesn’t do well. For functions such as global interconnects, on-chip non-volatile memory, and massive (high bandwidth) input/output, technologies other than CMOS combined with 3D integration hold great promise. For example, a network-on-chip in a second layer exploiting optical and/or RF technology can provide high performance, energy efficient, and reliable global interconnects. SRAM/DRAM memory stacking allows massively parallel memory access, helping to mitigate the memory wall and dramatically reducing the large off-chip memory energy consumption. Additionally, stacking emerging non-volatile memory, which is immune to radiation-induced soft errors, can provide on-chip non-volatile storage while consuming zero standby power. Stacked layers of chemoresistive sensors, mass-sensitive nanoresonators, and biologically-selective FETs fabricated via a directed-assembly approach can provide radically new input/output mechanisms.
     • But achieving the promise of 3D integration as a way to sustain Moore’s Law as well as to enable More-than-Moore requires advances by the EDA community working with the design community, as well as interdisciplinary efforts with chemists, biologists, and material scientists. Fundamental research challenges for the designer include determining a functional partitioning that maximizes the benefits of vertical connections while achieving optimal performance and energy efficiency, designing the interface circuitry between the CMOS “brains” and the non-CMOS technologies, and ensuring temperature stability across and between layers. To meet these challenges, the design methodologies and design tools necessary to implement and simulate/validate 3D architectures that integrate these new technologies must be developed.
     –July 22, 2008

  3. [Figure labels: EDA, Designer, Researcher]

  4. Bookkeeping fabric
     • No need to be blazingly fast
     • “Old” CMOS suffices (>250nm)
       ◦ Better reliability
       ◦ Less leakage
       ◦ No or little process variation
       ◦ Existing tools (mostly) work fine

  5. Adding compute power fabric
     • Take advantage of scaling for compute power (<90nm)
     • Now highly susceptible to faults, variation, leakage, etc.
       ◦ Must be dynamically reconfigurable
       ◦ Must have a way to monitor the “health” of the compute elements

  6. Variability, reliability, probabilistic behavior
     [Picture of SET PADOX structure. Source: “Fabrication method for IC-oriented Si single-electron transistors,” Ono, Y.; Takahashi, Y.; Yamazaki, K.; Nagase, M.; Namatsu, H.; Kurihara, K.; Murase, K.]

  7. Monitoring the compute fabric
     • Performance/power/fault “sensors”
       ◦ Hardware counters
       ◦ Temperature sensors
       ◦ …
     • Control knobs
       ◦ Turn off idle and faulty cores & links
       ◦ Apply DVFS
       ◦ …
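
The sensor-to-knob loop on this slide can be sketched as a small policy function. This is only an illustration, not the presenter's method: the `CoreStatus` fields, the two temperature thresholds, and the action names are all hypothetical assumptions.

```python
from dataclasses import dataclass

# Hypothetical thresholds in degrees Celsius; real values are design-specific.
T_THROTTLE = 85.0
T_SHUTDOWN = 105.0


@dataclass
class CoreStatus:
    core_id: int
    temperature_c: float  # reading from an on-die temperature sensor
    utilization: float    # 0.0..1.0, derived from hardware counters
    faulty: bool          # set by a fault or wearout sensor


def choose_action(core: CoreStatus) -> str:
    """Map one core's sensor readings to a control-knob setting."""
    if core.faulty or core.temperature_c >= T_SHUTDOWN:
        return "off"       # turn off faulty or overheated cores
    if core.utilization == 0.0:
        return "off"       # turn off idle cores
    if core.temperature_c >= T_THROTTLE:
        return "dvfs_low"  # apply DVFS: lower voltage and frequency
    return "dvfs_nominal"


def control_step(cores):
    """Run one monitoring pass and return an action per core id."""
    return {c.core_id: choose_action(c) for c in cores}
```

A real controller would also need hysteresis and a way to migrate work off cores before turning them off; this sketch only shows the mapping from sensor readings to knobs.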

  8. NBTI and oxide wearout sensors
     [Figure: sensor circuit layouts. Recoverable labels: DUT, stressed device, temperature calibration device, NAND oscillator, oscillator, level converters, 20b registers, 20b counter, LIFO + control bits; dimension labels include 21.42 µm]

  9. NBTI sensors
     • Other kinds of sensors
     • Number and location of sensors
     • Aggregation of sensor information
     [Figure: ΔVth (mV) versus time (10^3 s), showing a stress phase and a recovery phase; Vstress = 1.7V, temperature = 130C, sampling rate = 10s]
     [Figure: test-chip layout with 144 oxide sensors and 96 NBTI sensors, LIFO + scan control blocks, and discretes; dimension labels include 96 µm, 164 µm, 196 µm, and 1500 µm]
     – David Blaauw, UMich

  10. Adding a communication network
     • Moving from 2D to 3D provides many more close neighbors
     • Design space exploration
       ◦ Many TSVs → higher bandwidth → lower yield
       ◦ …
     • Other interconnect technologies like RF and optical
       ◦ Interface circuitry issues
       ◦ Thermal issues
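
The TSV trade-off on this slide (more TSVs give more bandwidth but lower yield) can be made concrete with a toy model. This is a sketch under strong assumptions that are not from the slides: every TSV fails independently with the same probability, there is no redundancy or repair, and bandwidth scales linearly with TSV count. The function and parameter names are hypothetical.

```python
import math


def tsv_tradeoff(num_tsvs, gbps_per_tsv, tsv_failure_prob):
    """Return (aggregate bandwidth in Gb/s, stack yield) for a TSV count.

    Assumes independent, identical TSV failures and no redundancy, so the
    stack works only if every TSV works.
    """
    bandwidth = num_tsvs * gbps_per_tsv
    stack_yield = (1.0 - tsv_failure_prob) ** num_tsvs
    return bandwidth, stack_yield


def max_tsvs_for_yield(min_yield, tsv_failure_prob):
    """Largest TSV count whose modeled yield stays at or above min_yield."""
    if not (0.0 < min_yield <= 1.0 and 0.0 < tsv_failure_prob < 1.0):
        raise ValueError("min_yield must be in (0, 1], failure prob in (0, 1)")
    good = 1.0 - tsv_failure_prob
    # Closed-form estimate, then adjust for floating-point rounding.
    n = int(math.log(min_yield) / math.log(good))
    while good ** (n + 1) >= min_yield:
        n += 1
    while n > 0 and good ** n < min_yield:
        n -= 1
    return n
```

For example, `tsv_tradeoff(1000, 1.0, 1e-4)` models 1000 TSVs at an assumed 1 Gb/s each, and `max_tsvs_for_yield(0.9, 1e-4)` gives the TSV budget for a 90% yield target under the same assumptions.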

  11. 3D challenges
     [Figure: chart rating the maturity of 3D design areas. Recoverable labels: Architecture/Design/Applications; memory-stacked CMP; 3D applications; 3D uProc; system-level design; analysis tools; thermal simulation; cost-driven 3D EDA tools; design automation; OpenAccess; physical design; 3D EDA tools; clock tree synthesis; scan design; test scheduling; test access; testing and design-for-testability; pre-bond wafer/die test; fault models; process. Legend: research initiated; no solutions; some available, research needed; nearly mature; tools/techniques available]
     – IBM 11/2002

  12. Adding (nonvolatile) memory
     • Use 3D memory stacking to take advantage of the increased bandwidth and reduced latency
     • Will probably require a redesign of the memory organization/interface
     • Stacking NVM (MRAM, PCRAM, …)
       ◦ Instant on/off, rad hardened
       ◦ Extremely low leakage
       ◦ Long write latencies and large write energy

  13. MRAM (L2 cache) stacking
     [Figure: total power comparison, y-axis 0 to 1, for 2M-SRAM-SNUCA, 2M-SRAM-DNUCA, 8M-MRAM-DNUCA, and 8M Hybrid DNUCA on galgel, apsi, equake, fma3d, swim, and streamcluster]
     – Yuan Xie, PSU

  14. Adding I/O
     • “Traditional” I/O with optical device stacking – HP Labs, ISCA 08
     • Nontraditional I/O

  15. Nontraditional I/O
     • Chemical sensors
     • Electrofluidic nanowire and nanobead self assembly
     Credits: Tom Mallouk, Chemistry, PSU; Theresa Mayer, EE, PSU

  16. A nano nose application
     • Input: digitized responses from (100x100) gas sensor array
     • Memory: threshold value storage
     • Compute fabric: neighborhood aggregation (e.g., systolic array)
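
The three stages on this slide (sensor input, stored thresholds, neighborhood aggregation) can be modeled functionally in a few lines. This is a sequential sketch with assumed details: the 3x3 neighborhood, the "count of hits" aggregate, and the function names are hypothetical choices, and a systolic array would compute the same result in parallel hardware rather than with nested loops.

```python
def threshold_map(responses, thresholds):
    """Compare each digitized sensor response with its stored threshold."""
    return [
        [1 if r >= t else 0 for r, t in zip(resp_row, thr_row)]
        for resp_row, thr_row in zip(responses, thresholds)
    ]


def neighborhood_counts(hits):
    """For each cell, count hits in its 3x3 neighborhood (clipped at edges)."""
    rows, cols = len(hits), len(hits[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            total = 0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < rows and 0 <= nj < cols:
                        total += hits[ni][nj]
            out[i][j] = total
    return out


# Usage: a 100x100 array with a small patch of strong responses.
SIZE = 100
responses = [[0] * SIZE for _ in range(SIZE)]
for i, j in [(10, 10), (10, 11), (11, 10), (11, 11)]:
    responses[i][j] = 5
thresholds = [[3] * SIZE for _ in range(SIZE)]  # assumed uniform threshold

hits = threshold_map(responses, thresholds)
counts = neighborhood_counts(hits)
```

Cells whose count is high mark regions where several adjacent sensors responded, which is less sensitive to a single noisy sensor than the raw threshold map.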

  17. More nontraditional I/O
     • Biomedical imaging
       ◦ Measure tiny magnetic fields
     • Use magnetoelectric sensors (Magnetic field → Strain → Electric field)
       ◦ Magnetostriction (materials that change their shape in response to a magnetic field)
       ◦ Piezoelectricity (generate electric potential in response to applied mechanical stress)
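
The transduction chain on this slide (magnetic field → strain → electric field) can be written as a product of two couplings. The following is a linearized, small-signal sketch that assumes the electric field depends on the magnetic field only through the strain; the symbols $H$ (magnetic field), $S$ (strain), $E$ (electric field), and $\alpha_E$ (overall magnetoelectric coefficient) are notation assumed here and do not appear on the slide.

```latex
\alpha_E \;=\; \frac{\partial E}{\partial H}
\;=\;
\underbrace{\frac{\partial S}{\partial H}}_{\text{magnetostriction}}
\cdot
\underbrace{\frac{\partial E}{\partial S}}_{\text{piezoelectricity}},
\qquad
\delta E \;\approx\; \alpha_E \,\delta H
```

Under this model, sensing a tiny field change $\delta H$ calls for materials in which both factors are large.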
