The Quest for Design Closure

Part IV: The Quest for Design Closure. Olivier Coudert, Monterey Design System. ASP-DAC 2001, tutorial 3.


  1. Part IV: The Quest for Design Closure. Olivier Coudert, Monterey Design System. ASP-DAC 2001, tutorial 3. ASP-DAC'01 Olivier Coudert IV-1

  2. DSM Dilemma
     - SOC (abstraction): time to market, million gates, high density and larger die, higher clock speeds, project management, re-use and IPs, larger database, larger design space. Need abstraction levels to manage complexity.
     - DSM (accuracy): higher resistance, higher cross-coupling, non-linear timing, long wires, power, electromigration, IR drop, inductances, etc. Require detailed analyses to understand physical interactions.

  3. Logic Synthesis Flow
     Behavioral spec -> behavioral synthesis -> RTL -> logic synthesis -> gate-level netlist -> layout. Synthesis is done with WLM and Elmore delay.
     Behavioral spec:
         while (x<a) do
           x1 := x + dx;
           u1 := u - (3*x*u*dx) - (3*y*dx);
           y1 := y + (u*dx);
           x := x1; u := u1; y := y1;
         endwhile
     RTL:
         RC := ALU1(RX, a, comp);
         wait until clock AND RC;
         RX1 := ALU1(RX, RDX, ADD);
         RT1 := MULT1(RU, RX);
         RT2 := MULT2(3, RDX);
         wait until clock;
         RT3 := MULT1(RT1, RT2);
         RT4 := MULT2(RT2, RY);

  4. Limits of Elmore Delay
     - As R increases, Elmore delay becomes inaccurate and cannot be trusted for guiding optimizations
     - $T = R_1 (C_1 + C_2 + C_3 + C_4) + R_2 (C_2 + C_3 + C_4) + R_4 C_4$
     - [Figure: RC tree with resistors R1 to R4 and capacitors C1 to C4]
     - Elmore delay to C4 is independent of R3!
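
The independence from R3 can be checked numerically. The sketch below is a minimal illustration, assuming the tree topology implied by the slide's formula (R1 and R2 in series, then R3 and R4 branching off to C3 and C4). The function name `elmore_delay`, the node names, and the component values are illustrative assumptions, not taken from the slides.

```python
# Assumed topology (inferred from the formula, not stated in the source):
#   source --R1-- n1 --R2-- n2 --R3-- n3
#                            \--R4-- n4
# with grounded capacitors C1..C4 at n1..n4.

def elmore_delay(parent, res, cap, sink):
    """Elmore delay from the source to `sink` in an RC tree.

    parent[n] is the upstream node of n (None if n hangs off the source),
    res[n] is the resistance of the branch feeding n,
    cap[n] is the grounded capacitance at n.
    """
    def depth(n):
        d = 0
        while parent[n] is not None:
            n = parent[n]
            d += 1
        return d

    # Downstream capacitance seen through the branch feeding each node.
    down = dict(cap)
    # Visit the deepest nodes first so children are summed before parents.
    for n in sorted(cap, key=depth, reverse=True):
        if parent[n] is not None:
            down[parent[n]] += down[n]

    # Sum R * (downstream C) along the path from the sink back to the source.
    delay = 0.0
    n = sink
    while n is not None:
        delay += res[n] * down[n]
        n = parent[n]
    return delay


if __name__ == "__main__":
    parent = {"n1": None, "n2": "n1", "n3": "n2", "n4": "n2"}
    cap = {"n1": 1.0, "n2": 1.0, "n3": 1.0, "n4": 1.0}   # arbitrary units
    res = {"n1": 1.0, "n2": 2.0, "n3": 3.0, "n4": 4.0}   # arbitrary units
    print(elmore_delay(parent, res, cap, "n4"))  # 14.0
    res["n3"] = 1000.0                           # change only R3
    print(elmore_delay(parent, res, cap, "n4"))  # still 14.0
```

R3 is not on the path from the source to C4, so it never enters the sum; only C3 does, through the downstream capacitance of R1 and R2.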

  5. Elmore Errors
     1200 delay comparisons from a 0.35 µm CMOS µP, 100 ps rise times.
     [Figure: distribution of % error (horizontal axis, 0 to 160); vertical axis 0 to 700; a "max" marker is annotated]

  6. Wire Load Model
     - Based on number of pins of the nets
     - Good prediction of average/median capacitance…
     - …But very large variance
     [Figure: number of k-pin nets versus capacitance C]
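
To make the mean-versus-variance point concrete, here is a minimal sketch of a wire load table built from per-net samples. The function name `build_wlm` and the sample capacitances are made-up illustrations, not data from the tutorial.

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_wlm(samples):
    """Build a wire load table from (pin_count, capacitance) samples.

    Returns {pin_count: (mean_capacitance, population_std_dev)}.
    """
    by_pins = defaultdict(list)
    for pins, c in samples:
        by_pins[pins].append(c)
    return {k: (mean(v), pstdev(v)) for k, v in by_pins.items()}


if __name__ == "__main__":
    # Hypothetical extracted capacitances (arbitrary units), for illustration only.
    samples = [(2, 5.0), (2, 15.0), (2, 40.0), (3, 10.0), (3, 50.0)]
    for pins, (avg, spread) in sorted(build_wlm(samples).items()):
        print(f"{pins}-pin nets: mean C = {avg:.1f}, std dev = {spread:.1f}")
```

The table returns one capacitance per pin count, which is what a synthesis tool looks up; the spread column shows how far an individual net can sit from that value.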

  7. Transistor vs. Wire Delays
     [Figure: delay (psec, log scale from 1 to 1000) versus technology generation (µ: 1.5, 1, 0.8, 0.6, 0.35, 0.25, 0.18, 0.1), with one curve for the transistor and one for Metal 2 (2mm)]

  8. Timing & Interconnect
     - Wireload models were ALWAYS inaccurate
     - Post-synthesis signoff was possible when interconnect contributed ~20% of the total capacitance
     - But now the interconnect capacitance is becoming the dominant part of the total capacitance with each new process generation

  9. Design Closure
     - Placement/synthesis/routing interaction
     - Congestion
     - Timing optimization
     - Clock design
     - Power design
     - Signal integrity
     - Design signoff
     - Problem size & computational resources

  10. Timing in the pre-DSM flows
      RTL to gate-level performance-driven synthesis, then P&R.
      Flow stages: RTL -> (instantiate architecture and perform technology mapping) -> gate-level netlist -> detailed placed netlist -> (limited sizing) -> placed & routed netlist.
      [Table: netlist flexibility, interconnect information, and timing at each stage. Recoverable entries: netlist flexibility runs from high to low; interconnect information runs from #fanout to detailed topology; timing runs from good estimation to accurate estimation to accurate]

  11. Timing in a DSM flow
      Timing is known after placement: synthesis and P&R cannot be independent.
      Flow stages: RTL -> (instantiate architecture and perform technology mapping) -> gate-level netlist -> detailed placed netlist -> (limited sizing) -> placed & routed netlist.
      [Table: netlist flexibility, interconnect information, and timing at each stage. Recoverable entries: netlist flexibility runs from high to low; interconnect information runs from #fanout to detailed topology; timing runs from large variance to accurate estimation to accurate]

  12. Timing in a DSM flow
      Need enough P&R information, and enough netlist flexibility.
      Flow stages: RTL -> (instantiate architecture and perform technology mapping) -> gate-level netlist -> detailed placed netlist -> (limited sizing) -> placed & routed netlist.
      [Table: netlist flexibility, interconnect information, and timing at each stage. Recoverable entries: netlist flexibility runs from high to medium to low; interconnect information runs from #fanout to detailed topology; timing runs from large variance to good estimation to accurate estimation to accurate]

  13. Placement
      - Quadratic placement
        - fast
        - restricted cost function, e.g., timing-driven placement with net weighting
      - Simulated annealing
        - open cost function
        - extremely slow
      - Force directed
        - semi-open cost function
        - slower than quadratic placement
        - tuning more difficult
      - Bisection (mincut + partitioning)
        - open cost function
        - slower than quadratic placement
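
As an illustration of the quadratic placement entry, the sketch below places cells on a line by minimizing the weighted sum of squared two-pin net lengths, and shows how raising a net's weight (the net-weighting form of timing-driven placement) pulls its cells together. It is a minimal one-dimensional sketch under stated assumptions: the function name `quadratic_place`, the Gauss-Seidel solver, and the tiny example netlist are illustrative choices, not the method of any particular tool.

```python
def quadratic_place(movable, fixed, nets, iters=500):
    """Minimize sum of w * (x_i - x_j)**2 over two-pin nets (1-D).

    movable: list of cell names to place
    fixed:   {name: x} for pads with fixed positions
    nets:    list of (a, b, weight) two-pin connections
    Solved by Gauss-Seidel: each movable cell is repeatedly moved to the
    weighted average of the positions of the cells it connects to.
    """
    x = dict(fixed)
    for c in movable:
        x[c] = 0.0
    for _ in range(iters):
        for c in movable:
            num = 0.0
            den = 0.0
            for a, b, w in nets:
                if a == c:
                    num += w * x[b]
                    den += w
                elif b == c:
                    num += w * x[a]
                    den += w
            if den > 0:
                x[c] = num / den
    return x


if __name__ == "__main__":
    pads = {"L": 0.0, "R": 10.0}
    # Chain L - a - b - R with uniform weights.
    uniform = [("L", "a", 1.0), ("a", "b", 1.0), ("b", "R", 1.0)]
    # Same chain, with net a-b treated as timing critical (weight 4).
    weighted = [("L", "a", 1.0), ("a", "b", 4.0), ("b", "R", 1.0)]
    for nets in (uniform, weighted):
        pos = quadratic_place(["a", "b"], pads, nets)
        print(round(pos["a"], 3), round(pos["b"], 3))
```

The cost function stays quadratic, so timing can only be expressed through the weights; that is the "restricted cost function" trade-off listed for this method.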

  14. Placement/Synthesis/Routing
      [Figure: interaction diagram linking placement, congestion, timing, route, synthesis, and area]
      - Placement is needed to derive routing
      - Routing is needed to derive timing
      - Cell placement and net topology must be flexible to allow synthesis

  15. Netlist Clustering
      - Start placement by building a hierarchical tree of cell-clusters from the netlist (h-Metis, DAC'97)
      - A key to optimal placement is to optimize the size and locations of these clusters
      - Both functional hierarchy and netlist topology need to be considered
      [Figure: netlist grouped into clusters A, B, C, D, E, F]

  16. Placement
      - The clusters are sized and placed within bins and among megacells
      - Minimize:
        - wirelength
        - intra-bin and inter-bin congestion

  17. Placement
      - This process continues to smaller clusters and smaller bins
      - Long wires are probabilistically routed
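
The slide does not say how long wires are probabilistically routed. One common way to picture it, shown here purely as an assumption, is to spread each two-pin net over its two L-shaped routes with equal probability and accumulate the expected wire demand per bin. The function names `l_route` and `expected_demand` and the bin coordinates are hypothetical.

```python
from collections import defaultdict


def l_route(src, dst, horizontal_first):
    """Bins crossed by an L-shaped route between two bins given as (x, y)."""
    (x0, y0), (x1, y1) = src, dst
    xs = range(x0, x1 + 1) if x0 <= x1 else range(x0, x1 - 1, -1)
    ys = range(y0, y1 + 1) if y0 <= y1 else range(y0, y1 - 1, -1)
    if horizontal_first:
        path = [(x, y0) for x in xs] + [(x1, y) for y in ys]
    else:
        path = [(x0, y) for y in ys] + [(x, y1) for x in xs]
    # Drop the duplicated corner bin while preserving order.
    return list(dict.fromkeys(path))


def expected_demand(nets):
    """Expected number of wires crossing each bin.

    Each net is a (src_bin, dst_bin) pair. Its distinct L-shaped routes are
    assumed equally likely (an assumption of this sketch).
    """
    demand = defaultdict(float)
    for src, dst in nets:
        # A straight net has a single route; the set removes the duplicate.
        routes = {tuple(l_route(src, dst, h)) for h in (True, False)}
        for route in routes:
            for b in route:
                demand[b] += 1.0 / len(routes)
    return dict(demand)


if __name__ == "__main__":
    demand = expected_demand([((0, 0), (2, 1)), ((0, 0), (3, 0))])
    for b in sorted(demand):
        print(b, demand[b])
```

Comparing the per-bin demand against a bin's routing capacity gives an early congestion estimate before any detailed route exists.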

  18. Placement
      - Eventually one reaches a cluster and bin size for which timing and congestion are predictable (~1k to 10k gates per bin)
      - Physical prototype:
        - Timing optimization can start at this level ONLY
        - Timing signoff can be done at this level ONLY

  19. Design Closure
      - Placement/synthesis/routing interaction
      - Congestion
      - Timing optimization
      - Clock design
      - Power design
      - Signal integrity
      - Design signoff
      - Problem size & computational resources

  20. Placement & Congestion
      - Cells are non-uniformly distributed within bins
        - Dynamic whitespace allocation addresses congestion at the global level
      - Inter- and intra-bin congestion is predictable at the physical prototype level

  21. Non-Uniform Whitespace Mgmt.
      - Example of whitespace allocation after timing-driven placement and optimization
      [Figure: placement map annotated with "White space added to relieve congestion", "White space removed to help relieve congestion in other areas", and "Movement of cells for timing optimization"]

  22. Congestion Management
      - DSM creates a significant timing/congestion dependency
      - Carefully manage congestion so that there are no surprises at DR stage!
      - Wiring models and congestion estimates are strongly correlated from placement, through GR to DR

  23. Routing Correlation
      - Global routing can utilize the whitespace to avoid long-distance couplings for critical nets
      - Extra spacing, shielding, or space for rip-up and reroute
      - No surprises for the detailed router after GR
      - Advanced N-layer shape-based router
      - Supports gridless and gridded routing
      - Variable wire width for optimal delay constraints
      - Cross-talk avoidance, antenna effects
      - Clock tree sizing for tree balancing
      - Power routing sizing for voltage drop and electromigration

  24. Design Closure
      - Placement/synthesis/routing interaction
      - Congestion
      - Timing optimization
      - Clock design
      - Power design
      - Signal integrity
      - Design signoff
      - Problem size & computational resources

  25. Timing Prediction
      - As the routing models become more precise, so do the timing predictions for the long wires
      - The timing/delay models and analyses are only as precise as the physical information
      - Enforce correlation from front-end to back-end

  26. Timing Optimization
      - The first tech mapping was an approximation, since the wiring capacitances were not known
      - With sufficient physical information at the placement level, we begin timing optimization

  27. Placement/Synthesis/Routing
      - Buffers are inserted for shielding, delay, and attenuation
      - Global routing is used to place the buffers and inverters
      - The design of long nets is "seeded" by buffers, driven by accurate physical information

  28. Incremental Synthesis Requirements
      - "Partial" placement and routing
      - Placement and routing must accommodate the incremental logic changes
      - Local transformation
      - Accurate delay estimation
        - input slope and output capacitance dependent
        - interconnect delay
        - rising and falling signals
        - crosstalk aware
      - Efficient incremental timing analysis
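
The last requirement, efficient incremental timing analysis, can be sketched as follows: after a local change to one gate, arrival times are re-propagated only through that gate's fanout cone, and propagation stops wherever an arrival time is unchanged. This is a minimal sketch assuming a single fixed delay per gate (no slope, load, or rise/fall dependence); the function names, the graph, and the delay values are illustrative assumptions.

```python
from collections import deque


def full_arrival(fanin, delay, order):
    """Arrival time at every node of a DAG; `order` is a topological order."""
    at = {}
    for n in order:
        at[n] = delay[n] + max((at[p] for p in fanin[n]), default=0.0)
    return at


def incremental_update(at, fanin, fanout, delay, changed):
    """Update `at` in place after delay[changed] was modified.

    Only nodes reachable from `changed` are re-evaluated, and a node's fanout
    is queued only if its own arrival time actually moved.
    Returns the list of nodes that were re-evaluated.
    """
    queue = deque([changed])
    visited = []
    while queue:
        n = queue.popleft()
        new = delay[n] + max((at[p] for p in fanin[n]), default=0.0)
        visited.append(n)
        if new != at[n]:
            at[n] = new
            queue.extend(fanout[n])
    return visited


if __name__ == "__main__":
    fanin = {"a": [], "b": ["a"], "c": ["a"], "d": ["b", "c"], "e": [], "f": ["e"]}
    fanout = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": [], "e": ["f"], "f": []}
    delay = {"a": 1.0, "b": 2.0, "c": 5.0, "d": 1.0, "e": 1.0, "f": 1.0}
    order = ["a", "b", "c", "d", "e", "f"]

    at = full_arrival(fanin, delay, order)
    delay["b"] = 6.0  # local transformation: gate b gets slower
    touched = incremental_update(at, fanin, fanout, delay, "b")
    print(touched, at["d"])
```

Here only `b` and `d` are re-evaluated; `a`, `c`, `e`, and `f` are never touched, which is the property that keeps many small optimization moves affordable.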
