  1. Performance-Driven Placement for Design of Rotation and Right Arithmetic Shifters in Monolithic 3D ICs Hao Zhuang, Jingwei Lu, Kambiz Samadi*, Yang Du*, and Chung-Kuan Cheng Dept. Computer Science & Engineering, University of California, San Diego, CA, 92093 USA *Qualcomm Research, San Diego, CA, 92121, USA

  2. Outline • Motivation – Monolithic 3D ICs (M3D) – Our target circuits: • Rotation Shifter • Arithmetic Shifter (right shift) • Optimization Approach of Shifter Designs – Permutation-based Optimization + M3D technology – Efficient Simulated Annealing Solver • Experiment • Conclusions

  3. Motivation: Resume Moore’s Law – 3D ICs • 3D ICs are a promising solution for continued VLSI scaling. • Standard cell height = 1.4 µm [1] • Through-silicon via (TSV)-based 3D ICs – Dies are fabricated separately. – Wafers need to be thinned, aligned and then bonded. – TSVs are large • TSV diameter = 6 µm [1] • Monolithic 3D ICs (M3D) – Tiers are fabricated sequentially. – Monolithic inter-tier vias (MIVs) serve as vertical connections. They are only the size of ordinary metal vias. • MIV diameter = 70 nm [1] [1] Shreepad Panth, et al. ASPDAC2012

  4. Motivation: Monolithic 3D ICs (M3D)/Monolithic Inter-Tier Vias (MIV) The advantages of M3D/MIV: • High-density integration. Avoids the large dimensions and area overhead of TSVs in 3D IC designs. • Copes with interconnect-limited 2D ICs, where most problems are essentially caused by the high interconnect density at gate level. • Inserts vertical connections and shortens the distance between connected modules. • Reduces the total wire length and power, and improves routability and timing behavior.

  5. Motivation: Our target circuits • Shifter Circuits – Indispensable datapath components in MPUs and ASICs. – Have a broad spectrum of applications and can affect system performance on a larger scale. – The wiring inside each shifter module is quite dense, so improving the timing and power behavior of shifters is an important subject. • In this work, our specific shifter targets are: – Rotation Shifter – Arithmetic Shifter (Right shift)

  6. Rotation Shifter (Rotator) This is a linear ordering (LO) design of a rotation shifter (rotator), also known as a cyclic shifter. Rotation requires long wrap-around wires. [Figure: 8-bit rotator in linear order. Inputs D7 to D0 feed three rows of 2:1 MUX cells controlled by select signals S0, S1 and S2; cells are labeled by grid coordinates (x, y) with x = 0..7 and y = 0..3; the outputs are Z7 to Z0.]
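As a behavioral illustration of the rotator on this slide, the Python sketch below models each MUX row as one stage that either passes its input through or rotates it by 2^k positions, depending on select bit S_k. The function name `rotate_right` and the LSB-first list convention are assumptions made for this sketch; it describes logical behavior only, not physical placement.

```python
def rotate_right(bits, amount):
    """Rotate a power-of-two-length word right through log2(n) MUX stages.

    bits[i] is bit i of the word (index 0 is the LSB; assumed convention).
    """
    n = len(bits)
    stages = n.bit_length() - 1          # log2(n) for power-of-two n
    row = list(bits)
    for k in range(stages):
        if (amount >> k) & 1:            # select signal S_k is 1
            shift = 1 << k
            # Output i takes input (i + shift) mod n; the "mod n" is the
            # wrap-around wire that becomes long in a linear order layout.
            row = [row[(i + shift) % n] for i in range(n)]
    return row


# Labels 0..7 stand in for D0..D7 so the routing is visible.
rotated = rotate_right(list(range(8)), 3)
```

With labels in place of data bits, `rotated` shows which input reaches each output for a rotation by 3.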

  8. Right Arithmetic Shifter This is a linear ordering design of a right arithmetic shifter. Vacated positions are filled by extending the original MSB (most significant bit). [Figure: 8-bit right arithmetic shifter in linear order. Inputs D7 to D0 feed three rows of 2:1 MUX cells controlled by select signals S0, S1 and S2; cells are labeled by grid coordinates (x, y) with x = 0..7 and y = 0..3; the outputs are Z7 to Z0.]
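A behavioral sketch of the right arithmetic shifter, under the same assumptions as a plain staged MUX model (LSB-first list, one stage per select bit). The name `arithmetic_shift_right` is hypothetical. The only difference from rotation is that positions vacated at the top are filled with the MSB instead of wrapped-around bits.

```python
def arithmetic_shift_right(bits, amount):
    """Shift right through log2(n) MUX stages, replicating the MSB.

    bits[i] is bit i of the word (index 0 is the LSB; assumed convention).
    """
    n = len(bits)
    stages = n.bit_length() - 1          # log2(n) for power-of-two n
    row = list(bits)
    for k in range(stages):
        if (amount >> k) & 1:            # select signal S_k is 1
            shift = 1 << k
            msb = row[n - 1]             # sign bit, fanned out to the top cells
            row = [row[i + shift] if i + shift < n else msb for i in range(n)]
    return row


# Letters stand in for D0..D7; "h" is the MSB.
shifted = arithmetic_shift_right(list("abcdefgh"), 3)
```

Note that the MSB drives several cells in each stage, which is one source of the heavy wire load in this circuit.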

  9. Objectives • Objectives: – Reduce the longest path to improve timing. – Reduce total wire length to improve power. • The linear order design carries heavy wire loads, caused by the long wrap-around wires. [Figure: 8-bit shifter in linear order. Inputs D7 to D0 feed three rows of 2:1 MUX cells controlled by select signals S0, S1 and S2; cells are labeled by grid coordinates (x, y) with x = 0..7 and y = 0..3; the outputs are Z7 to Z0.]

  10. Approaches Our optimization approach combines two aspects: • M3D/MIV – Inserts vertical connections, which may shorten the distance between connected cells. – By introducing an extra dimension, it reduces the total wire length and dynamic power, and improves routability and timing behavior. • Cell Order Permutations (proposed in our ASPDAC 07 paper [2]) – Idea/Observations: Swapping the physical positions of cells in the shifter reduces the longest path and total wire length. – In some cases, it compensates for the delay penalty of the detoured routes that result from naïvely folding a 2D design into a 3D IC (shown in the right arithmetic shifter experiment). This is the first work to optimize 3D shifters by cell order permutations. (Previous work tends to simply fold the 2D linear design into 3D space, which is not efficient.) [2] Haikun Zhu, et al. ASPDAC2007.

  11. Optimization (Cell Order Permutation) Illustration case of permutation-based optimization (8-bit rotator). [Figure: two 8-bit rotators side by side, each with stages >> 1-bit, >> 2-bit and >> 4-bit, and each cell annotated with the accumulated span of the longest path reaching it. Left: LO design, every row in the order 7 6 5 4 3 2 1 0. Right: an optimized solution with row orders 7 6 5 4 3 2 1 0, then 6 5 4 3 7 2 1 0, then 3 4 2 6 7 5 1 0, then 7 6 5 4 3 2 1 0.] In the LO design, the longest path spans (LPS) 13 MUX cells. In the optimized solution, the LPS is only 8 MUX cells.
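The LPS figures on this slide can be checked with a small dynamic program. The sketch below assumes a simple span model: each cell in row r+1 reads the cells holding bits i and (i + 2^k) mod n in row r, and a wire's span is the horizontal distance between the two columns. The function name `longest_path_span` and the two row-order tables are assumptions of this sketch; the optimized orders are transcribed from the slide's figure and may not match the authors' exact layout.

```python
def longest_path_span(rows):
    """rows[r][c] is the logical bit placed in column c of row r.

    Returns the largest accumulated horizontal span over all
    input-to-output paths, measured in MUX-cell widths.
    """
    n = len(rows[0])
    # col[r][bit] = column of the cell holding `bit` in row r
    col = [{bit: c for c, bit in enumerate(row)} for row in rows]
    best = [0] * n                        # best[bit]: longest span ending here
    for k in range(len(rows) - 1):
        shift = 1 << k                    # stage k rotates by 2^k
        nxt = [0] * n
        for bit in range(n):
            for src in (bit, (bit + shift) % n):
                span = abs(col[k + 1][bit] - col[k][src])
                nxt[bit] = max(nxt[bit], best[src] + span)
        best = nxt
    return max(best)


linear = [[7, 6, 5, 4, 3, 2, 1, 0]] * 4   # LO design: same order in every row
optimized = [                              # orders read off the slide (assumed)
    [7, 6, 5, 4, 3, 2, 1, 0],
    [6, 5, 4, 3, 7, 2, 1, 0],
    [3, 4, 2, 6, 7, 5, 1, 0],
    [7, 6, 5, 4, 3, 2, 1, 0],
]
lps_linear = longest_path_span(linear)        # 13
lps_optimized = longest_path_span(optimized)  # 8
```

Under this model the linear order gives 13 and the permuted rows give 8, in line with the numbers quoted on the slide.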

  12. Optimization (M3D) • Fold the 2D LO design into a 3D linear order design • Cut the long wrap-around wires and use short MIVs instead [Figure: cut the 2D design, then fold it onto different layers (3D).]

  13. Optimization (M3D) (1) Physical placement of cells and (2) connection of the different layers with MIVs for vertical communication.

  14. Cell Order Permutation in 3D space • Extend to 3D Space Permutations – Wires along the x-y direction are not shown. – The MIVs are treated as vertical interconnects. – Assume an MIV connecting adjacent layers spans 5% of the width of a MUX cell. – Swap the two highlighted cells, and so on. [Figure: 3D placement with input and output marked and two cells highlighted for swapping.]
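To make the 5% MIV assumption concrete, the sketch below measures the span of a wire between two cells placed at (column, layer) positions and compares the wrap-around wire of an 8-bit row before and after folding. The mirror-fold scheme in `fold_position` is an assumption chosen for illustration (the slides do not spell out the exact folding), and both function names are hypothetical.

```python
MIV_SPAN = 0.05  # one MIV hop between adjacent layers, in MUX-cell widths


def fold_position(bit, n):
    """Mirror-fold an n-cell linear row onto two layers (assumed scheme).

    Cells 0..n/2-1 stay on layer 0; cells n/2..n-1 go to layer 1 in
    reverse order, so cell 0 and cell n-1 end up stacked.
    """
    half = n // 2
    if bit < half:
        return (bit, 0)
    return (n - 1 - bit, 1)


def span_3d(a, b):
    """Span between cells a and b, each given as (column, layer)."""
    (xa, za), (xb, zb) = a, b
    return abs(xa - xb) + MIV_SPAN * abs(za - zb)


n = 8
wrap_2d = abs(0 - (n - 1))                                        # cell 0 to cell 7 in 2D
wrap_3d = span_3d(fold_position(0, n), fold_position(n - 1, n))   # same wire after folding
```

Here the wrap-around wire shrinks from 7 cell widths to a single MIV hop, which is the effect the folding slides describe.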

  15. Solve the whole optimization via Simulated Annealing-Based Solver (SA) • Use slacks for timing – Take the two fan-out nodes into consideration, rather than only reducing the longest path. – The weight of net $n_i$: $w(n_i) = \left(1 - \frac{\sum_{e \in n_i} \mathrm{slack}_e}{D}\right)^{\theta}$ – Total slack of the network of the shifter: $W_{slack} = \sum_{n_i \in network} w(n_i)$ • Use total wire length as another cost function – Power is proportional to wire length. – $W_{TWL}$ is the total wire length. • Auto-Normalizing Cost Function: $\Delta cost = \gamma \frac{\Delta W_{slack}}{W_{slack}^{prev}} + (1 - \gamma) \frac{\Delta W_{TWL}}{W_{TWL}^{prev}}$, where $\gamma$ is a tuning parameter [3][4]. [3] A. Marquardt, et al. FPGA 2000 [4] K. Eguro, et al. DAC 2008
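The cost terms on this slide can be written out as a few small functions. This is a sketch under stated assumptions: the per-net slacks are combined by summation, D is treated as a normalizing delay (the slide does not define it), and all function and parameter names are hypothetical rather than taken from the authors' solver.

```python
def net_weight(edge_slacks, d_norm, theta):
    """Weight of one net from the slacks of its fan-out edges.

    Assumes the slacks are summed and that their sum does not exceed d_norm.
    """
    return (1.0 - sum(edge_slacks) / d_norm) ** theta


def total_slack_cost(nets, d_norm, theta):
    """W_slack: sum of net weights over every net in the shifter network."""
    return sum(net_weight(slacks, d_norm, theta) for slacks in nets)


def delta_cost(gamma, d_w_slack, w_slack_prev, d_w_twl, w_twl_prev):
    """Auto-normalized cost change for one candidate move.

    Each delta is divided by the previous value of its own term, so the
    timing and wire-length terms are comparable regardless of units.
    """
    return (gamma * d_w_slack / w_slack_prev
            + (1.0 - gamma) * d_w_twl / w_twl_prev)


# Toy numbers only: two nets, each with two fan-out edges.
w_slack = total_slack_cost([[1.0, 1.0], [0.0, 0.0]], d_norm=4.0, theta=2)
change = delta_cost(0.5, -2.0, 10.0, 5.0, 100.0)
```

A negative `change` means the move improves the weighted cost; in the toy call, a 20% timing gain outweighs a 5% wire-length loss at gamma = 0.5.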

  16. Scalable SA optimization solver • Integer Linear Programming (ILP), which was used in [2], is not scalable in our case. • SA is a scalable method for this optimization problem, and it achieves almost the same LPS quality as ILP (shown in Table II, 16-bit rotator cases). • “LPS”: The span of the longest path along the x-/z-directions, measured in the number of MUX cells (the wire span along the y-direction contributes equally in all shifters). • When optimizing a 32-bit rotator in 2 layers, ILP takes days to obtain the solution, while SA takes only minutes. [2] Haikun Zhu, et al. ASPDAC2007.
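A minimal simulated annealing loop over cell orders might look like the sketch below. It is not the authors' solver: for brevity the cost is total wire length only (no slack term), the move is a swap of two cells inside one intermediate row, the input and output rows are kept in linear order, and the cooling schedule and all names are assumptions.

```python
import math
import random


def total_wire_length(rows):
    """Sum of horizontal spans of all wires of a 2D rotator placement."""
    n = len(rows[0])
    col = [{bit: c for c, bit in enumerate(row)} for row in rows]
    total = 0
    for k in range(len(rows) - 1):
        shift = 1 << k
        for bit in range(n):
            for src in (bit, (bit + shift) % n):
                total += abs(col[k + 1][bit] - col[k][src])
    return total


def anneal(rows, cost, steps=20000, t0=5.0, alpha=0.9995, seed=0):
    """Swap-based simulated annealing; returns the best placement seen."""
    rng = random.Random(seed)
    cur = [list(row) for row in rows]
    cur_cost = cost(cur)
    best, best_cost = [list(row) for row in cur], cur_cost
    temp = t0
    for _ in range(steps):
        r = rng.randrange(1, len(cur) - 1)        # intermediate rows only
        i, j = rng.sample(range(len(cur[r])), 2)
        cur[r][i], cur[r][j] = cur[r][j], cur[r][i]
        new_cost = cost(cur)
        delta = new_cost - cur_cost
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cur_cost = new_cost
            if cur_cost < best_cost:
                best, best_cost = [list(row) for row in cur], cur_cost
        else:
            cur[r][i], cur[r][j] = cur[r][j], cur[r][i]  # undo rejected swap
        temp *= alpha                              # geometric cooling
    return best, best_cost


linear = [list(range(8)) for _ in range(4)]        # 8-bit LO rotator
best_rows, best_twl = anneal(linear, total_wire_length)
```

Because the loop keeps the best placement seen, `best_twl` can never be worse than the linear order it starts from. A fuller version would plug the auto-normalized slack and wire-length cost in place of `total_wire_length`.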

  17. Experiment of Shifter Design Optimization • The parameters for performance evaluation • Notation on the following pages: – “SA”: permutation-based optimization by the simulated annealing-based solver. – “LO”: Linear order design in 2D, or folded linear order design in 3D. – “LPS”: The span of the longest path along the x-/z-directions, measured in the number of MUX cells (the wire span along the y-direction contributes equally in all shifters). – Delay and power are measured based on the following methods.
