IEE5008 Autumn 2012 Memory Systems: 3D Stacking SRAM



  1. IEE5008, Autumn 2012, Memory Systems
     3D Stacking SRAM
     Anwar, Hossameldin
     Department of Electronics Engineering, National Chiao Tung University
     Eng_hossam123@yahoo.com
     Anwar, Hossameldin, 2012

  2. Outline
     • Introduction
     • 3D Technology Process
     • Physical Characteristics of d2d Vias
     • Planar SRAM Components
     • Planar SRAM Design Techniques
     • 3D Implementations of Banked SRAM Arrays
     • 3D Implementations of Multiported SRAM Arrays
     • Bank and Array-Stacked 3D SRAM Benefits
     • Multiported 3D SRAM Benefits
     • Conclusion
     • References

  3. Introduction
     • The semiconductor industry faces a number of challenges:
       1. Poor scaling of RC delays.
       2. Power consumption.
       3. Manufacturing challenges.
     • 3D integration has the potential to address these challenges.
     • 3D integration can also benefit from advances in traditional planar processes, such as double-gate transistors, tri-gate transistors, FinFETs, strained silicon, and metal gates.

  4. Introduction (continued)
     • 3D fabrication involves stacking two or more die connected with high-density, low-latency interconnect.
     • The increased density and the ability to place and route in 3D provide new opportunities for microarchitecture design.
     • The dense die-to-die (d2d) vias enable 3D SRAM components to be partitioned at the level of individual wordlines or bitlines.
     • The benefits are:
       1. Reduced wire length within the SRAM arrays, which provides simultaneous latency and energy reduction.
       2. Reduced area footprint, which reduces the wiring required for global routing.

  5. 3D Technology Process
     • Several methods have been proposed for 3D integration, such as:
       - Multilayer buried structures (MLBS)
       - Die bonding

  6. Multilayer Buried Structures (MLBS)
     Structure
     • Multiple device layers are fabricated sequentially in a stacked fashion.
     • Layer-to-layer connections are made with interlayer vias or with direct source-drain/drain-source contacts.
     • Local polysilicon wires are used for connection.
     Advantage
     • Vertical 3D vias can potentially scale down with feature size.

  7. Die Bonding
     Structure
     • Uses conventional planar fabrication processes and metal vias to bond the planar die vertically.
     • Vias are deposited on the top metal layers of each of the two die and/or etched through the backside of the die; the two die are then aligned and bonded together.

  8. There are several organizations for bonding multiple die:
     • Face-to-Face (F2F) bonding.
     • Face-to-Back (F2B) bonding.
     • Back-to-Back (B2B) bonding.

  9. Physical Characteristics of d2d Vias
     • Thinning the die reduces the distance that a d2d via must cross to connect the two die.
     • A d2d via is much smaller than planar interconnect.
     • This reduces both its resistance and its capacitance.
     • As a result, the signal propagation delay between the two die is reduced.

  10. Planar SRAM Components: Caches
      Basic design parameters
      • Cache size.
      • Block size.
      • Associativity.
      Features
      • Large capacity.
        1. Caches are organized as banks to increase bandwidth and decrease power consumption.
        2. Caches are subbanked to save power by sharing sense amplifier circuitry among subbanks.
      • Require both tag and data arrays.

  11. Planar SRAM Components: Register Files
      Features
      • Lower capacity.
      • Do not have a tag array.
      • Consist of a regular array of 6T SRAM cells.
      • Typically multiported, with multiple read ports and multiple write ports to satisfy the bandwidth required for data processing.

  12. Features of Planar SRAM Array-Based Components
      • Consist of a regular array of memory cells.
      • Easy to partition across multiple die.
      • An SRAM array can be viewed as a set of wordlines (horizontal) and a set of bitlines (vertical).
      • The row decoder drives the wordlines, which control the access transistors of the data storage cells.
      • The bitlines are read by sense amplifiers at the bottom of the array.

  13. Planar SRAM Design Techniques
      • These techniques are used to increase performance and reduce power consumption in SRAM arrays:
        - Memory banking technique
        - Memory subbanking technique
        - Hierarchical wordline technique

  14. Memory Banking Technique
      Power saving
      • Divides the memory array into multiple modules (banks).
      • Only the bank that contains the required data is accessed.
      Bandwidth enhancement
      • If the requested data values are located in different banks, they can be read out of multiple banks simultaneously, mimicking the effect of a multiported memory array.
      • If multiple addresses target the same bank, a bank conflict occurs.
      • A buffer mechanism is then needed to store and reissue the requests, so that the target bank provides the requested data values in later clock cycles.
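
The conflict-and-reissue behaviour described on this slide can be sketched as a small simulation. This is only an illustrative model, not the design from the lecture: the function name `simulate_banked_access`, the one-access-per-bank-per-cycle limit, the oldest-first reissue order, and the low-order bank mapping are all assumptions.

```python
from collections import deque


def simulate_banked_access(requests_per_cycle, num_banks=2):
    """Serve at most one request per bank per cycle; buffer the rest.

    requests_per_cycle: list of lists of integer addresses, one list per cycle.
    Returns (cycle, bank, address) tuples in the order requests are served.
    """
    pending = deque()  # buffer for requests that lost a bank conflict
    served = []
    cycle = 0
    while cycle < len(requests_per_cycle) or pending:
        # Buffered (older) requests stay ahead of newly arriving ones.
        if cycle < len(requests_per_cycle):
            pending.extend(requests_per_cycle[cycle])
        busy_banks = set()
        still_waiting = deque()
        for addr in pending:
            bank = addr % num_banks  # assumed low-order bank mapping
            if bank in busy_banks:
                still_waiting.append(addr)  # bank conflict: reissue later
            else:
                busy_banks.add(bank)
                served.append((cycle, bank, addr))
        pending = still_waiting
        cycle += 1
    return served
```

With two banks, the requests `[0, 1]` in one cycle hit different banks and are both served in cycle 0, while `[0, 2]` both map to bank 0, so the second is served one cycle later.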

  15. Example
      Higher-order interleaving technique
      • Divides the memory array into banks based on the higher-order address bits.
      • If the array contains 2^N locations and is split into two banks, one bank contains addresses 0 to 2^(N-1) - 1 and the other contains addresses 2^(N-1) to 2^N - 1.
      Lower-order interleaving technique
      • Uses the lower-order address bits to identify the banks (odd and even addresses).
      • If the requested data is located in only one bank, the other banks need not be accessed, so they consume no dynamic power.
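
The two interleaving schemes differ only in which address bits select the bank. A minimal sketch follows; the function names and the `bank_bits` parameter are hypothetical and chosen for illustration.

```python
def high_order_bank(addr, n_bits, bank_bits=1):
    """Bank index taken from the top `bank_bits` bits of an n_bits-wide address."""
    return addr >> (n_bits - bank_bits)


def low_order_bank(addr, bank_bits=1):
    """Bank index taken from the bottom `bank_bits` bits of the address."""
    return addr & ((1 << bank_bits) - 1)


if __name__ == "__main__":
    N = 4  # 2^N = 16 locations, two banks
    for addr in range(2 ** N):
        print(addr, high_order_bank(addr, N), low_order_bank(addr))
```

For N = 4, higher-order interleaving places addresses 0 to 7 in bank 0 and 8 to 15 in bank 1, while lower-order interleaving places even addresses in bank 0 and odd addresses in bank 1.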

  16. Memory Subbanking Technique
      Features
      • A cache block is divided into a number of subbanks.
      • The required word is chosen using the offset bits in the address.
      • The subbank selector selects between the two subbanks and feeds the data from only one subbank into the sense amplifier circuitry.
      • A common set of sense amplifiers can therefore be shared across the subbanks.
      • Data are read out from only one subbank at a time, which cuts down on cache power.
      • Bitline precharge power is also saved, because only the selected subbank needs to be precharged.
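
As a rough behavioural sketch of subbank selection, the function below splits a cache block into subbanks, picks one using the word offset, and reports which subbanks would need precharging. The name `read_word` and the choice that the upper offset bits pick the subbank are assumptions made for illustration, not details from the slides.

```python
def read_word(block, offset, num_subbanks=2):
    """Return (word, precharged) for a read at `offset` within a cache block.

    block: list of words, with length divisible by num_subbanks.
    precharged: one boolean per subbank, True only for the selected subbank.
    """
    if len(block) % num_subbanks != 0:
        raise ValueError("block length must be divisible by num_subbanks")
    words_per_subbank = len(block) // num_subbanks
    # Assumption: the upper offset bits pick the subbank, the lower bits the word.
    selected = offset // words_per_subbank
    index = offset % words_per_subbank
    subbanks = [
        block[i * words_per_subbank:(i + 1) * words_per_subbank]
        for i in range(num_subbanks)
    ]
    # Only the selected subbank is precharged and fed to the shared sense amps.
    precharged = [i == selected for i in range(num_subbanks)]
    return subbanks[selected][index], precharged
```

For example, `read_word([10, 11, 12, 13], 2)` reads from the second subbank only, so the first subbank's bitlines are left unprecharged.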

  17. Hierarchical Wordline Technique (HWL)
      Problems
      • Wordlines are heavily loaded by the access transistors (two per SRAM cell) across the whole row of SRAM cells.
      • Wordlines contribute to the overall delay of an SRAM access.
      HWL structure (solution)
      • Uses global wordlines (GWL) to drive multiple shorter subwordlines.
      • The decoder output is used as the global wordline.
      • The wordline loading and the latency of driving the wordlines are therefore reduced.
      Disadvantage
      • Increases the wiring complexity: the wiring requirement of the wordlines is doubled.
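
The loading argument can be made concrete with simple counting. The sketch below assumes two access transistors per cell (as stated on the slide) and, as an illustrative assumption, one local driver per subwordline hanging off the global wordline; the function name and dictionary keys are hypothetical.

```python
def wordline_loads(num_columns, num_segments):
    """Count the gates each wire has to drive, with and without HWL.

    num_columns: number of SRAM cells in one row.
    num_segments: number of subwordlines the row is divided into.
    """
    if num_columns % num_segments != 0:
        raise ValueError("num_columns must be divisible by num_segments")
    access_transistors_per_cell = 2  # from the slide: two per SRAM cell
    return {
        # A flat wordline drives every access transistor in the row.
        "flat_wordline_gates": access_transistors_per_cell * num_columns,
        # Each subwordline drives only its own segment of the row.
        "subwordline_gates": access_transistors_per_cell * num_columns // num_segments,
        # Assumed: the global wordline drives one local driver per segment.
        "global_wordline_drivers": num_segments,
    }
```

For a 128-column row split into 4 segments, a flat wordline drives 256 access transistors, whereas each subwordline drives 64 and the global wordline drives 4 local drivers.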

  18. 3D Implementations of Banked SRAM Arrays
      • One option for 3D-integrated SRAM array design is to stack banks on top of each other.
      • Another option is to split the arrays across multiple layers.
      • Long metal wires are used to route global signals in banked SRAM arrays.
      The two approaches:
        - 3D bank stacking
        - 3D array splitting

  19. 3D Bank Stacking
      There are two possible orientations for bank stacking:
      • Left-to-Right stacking
      • Top-to-Down stacking
      Notes
      • X is the bank width and Y is the bank height.
      • Assume that X = Y.
      • 67% reduction in the horizontal component of the wiring to and from the banks.
      • The vertical component of the bank wiring is unaffected.
      • The reduction in wire length translates into a reduction in power and delay.

  20. 3D Array Splitting
      Features
      • Partitions individual rows and columns of the SRAM arrays within a bank and stacks them upon themselves.
      • Can reduce the length of either the wordlines or the bitlines, depending on the orientation of the split.

  21. The First Array-Split Configuration
      • Stacks columns on columns.
      • A single long wordline is replaced by a pair of parallel wordlines.
      • The decoder must drive the wordlines on both die, so one d2d via is required per wordline.
      • At the bottom of the array, the column select multiplexors are split across the two die, which requires additional d2d vias.
      • Latency and power are reduced because the wordlines are shorter.

  22. The Second Array-Split Configuration
      • Stacks rows on rows.
      • The row decoder must be partitioned across the two die.
      • The 1-to-n decoder is decomposed into a 1-to-2 decoder and two 1-to-n/2 decoders.
      • The two 1-to-n/2 decoders are stacked on top of each other.
      • The 1-to-2 decoder activates only one of the two stacked decoders at a time, which avoids stacking thermally active components.
      • The length of the bitlines is reduced to half.
      • Latency and power are reduced due to shorter wires at both the array and bank levels.
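
The decoder decomposition can be checked with a small behavioural model. This sketch assumes one-hot outputs and that the most significant part of the row address selects the die; the names `flat_decode` and `split_decode` are hypothetical.

```python
def flat_decode(addr, n):
    """Planar 1-to-n decoder: one-hot list with bit `addr` set."""
    return [1 if i == addr else 0 for i in range(n)]


def split_decode(addr, n):
    """1-to-2 decoder plus two 1-to-n/2 decoders, one per die.

    Returns [die0_wordlines, die1_wordlines], each of length n // 2.
    """
    half = n // 2
    # Assumption: the upper part of the address picks the active die.
    die_select = flat_decode(addr // half, 2)
    row_in_die = addr % half
    dies = []
    for die in range(2):
        lines = flat_decode(row_in_die, half)
        # The 1-to-2 output gates this die's decoder, so only one die is active.
        dies.append([bit & die_select[die] for bit in lines])
    return dies


if __name__ == "__main__":
    n = 8
    for addr in range(n):
        die0, die1 = split_decode(addr, n)
        # Concatenating the two die reproduces the planar decoder output.
        assert die0 + die1 == flat_decode(addr, n)
```

In this model exactly one wordline is asserted across both die for every address, and the die that is not selected has no active wordline.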

  23. 3D Implementation of Multiported SRAM Arrays
      • There are several possible designs for a multiported SRAM array in 3D integration technology:
        - Register Partitioning (RP)
        - Bit Partitioning (BP)
        - Port Splitting (PS)
