
FF-Bond: Multi-bit Flip-flop Bonding at Placement

Chang-Cheng Tsai, Yiyu Shi, Guojie Luo, Iris Hui-Ru Jiang. IRIS Lab, NCTU, MST, PKU. ISPD-13.


  1. FF-Bond: Multi-bit Flip-flop Bonding at Placement
     - Chang-Cheng Tsai, Yiyu Shi, Guojie Luo, Iris Hui-Ru Jiang
     - IRIS Lab, NCTU, MST, PKU
     - ISPD-13

  2. Outline
     - Introduction
     - Preliminaries
     - Problem formulation
     - Algorithm: FF-Bond
     - Experimental results
     - Conclusion

  3. Multi-Bit Flip-Flops (MBFFs)
     - Clock power is critical for modern IC designs: dynamic power $\propto C V^2 f$
     - MBFFs present a smaller load on the clock network
     - Replace FFs with MBFFs
       - Effectively reduces both clock network power and MBFF power
       - Prefer large FFs (high bit number), avoid orphans (single-bit FFs)
       - Avoid impacting timing-critical paths
     - [Figure: a 2-bit MBFF; two master/slave latch pairs (D1 to Q1, D2 to Q2) sharing one clk]
     - Larger bit numbers are more power efficient:

     | Bit number | Normalized power per bit | Normalized area per bit |
     |------------|--------------------------|-------------------------|
     | 1          | 1.00                     | 1.00                    |
     | 2          | 0.86                     | 0.96                    |
     | 4          | 0.78                     | 0.71                    |

  4. Prior Work
     - Relocating flip-flops benefits clock network synthesis
       - [Cheon+, DAC05], [Papa+, Micro11], [Lee/Markov, TCAD12]
     - Replacing flip-flops with MBFFs saves clock power
       - [Yan/Chen, ICGCS10], [Chang+, ICCAD10], [Wang+, ISPD11], [Jiang+, ISPD11], [Liu+, DATE12]
       - Focus on post-placement MBFF clustering
     - Pre-placement: lacks physical information
     - Post-placement: cells are immovable; limited clustering flexibility and quality
     - At-placement: MBFF bonding at placement

  5. One Possible Solution …
     - Directly integrate placement and post-placement MBFF clustering
       - Flow: Netlist → Timing-driven placement → MBFF clustering → End
     - The movement of flip-flops
       - Is constrained by the placement at the current iteration
       - May oscillate among iterations

  6. Ionic Bonding and Flip-flop Bonding
     - Goal: guide flip-flops towards merging-friendly locations
     - Ionic bonding: Na + F → NaF (Na gives up an electron, forming Na⁺ and F⁻)
     - Flip-flop bonding: flip-flops are drawn together into mergeable flip-flop sets
     - Example: MBFF library: 1-bit, 2-bit, 4-bit
     - Source: http://en.wikipedia.org/wiki/Ionic_bond

  7. Post-Placement vs. At-Placement: s38417
     - Post-placement clustering: # MBFFs 4-/2-/1-bit = 35/252/237
     - At-placement bonding: # MBFFs 4-/2-/1-bit = 159/105/35
     - [Figure: the two placements side by side; legend: MBFF, SBFF]
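
To make the comparison on slide 7 concrete, the sketch below weights the two 4-/2-/1-bit counts by the normalized power-per-bit figures from the example table on slide 3. This is only an illustration: the slides do not state that the s38417 experiment used that exact library, and `normalized_ff_power` is a hypothetical helper name.

```python
# Normalized power per bit, copied from the example MBFF table on slide 3.
# Assumption: the same table is applied to the s38417 counts for illustration.
POWER_PER_BIT = {1: 1.00, 2: 0.86, 4: 0.78}


def normalized_ff_power(counts):
    """Return (total bits, flip-flop power relative to an all-1-bit design).

    counts maps a bit number to the number of flip-flop cells of that size.
    """
    total_bits = sum(bits * n for bits, n in counts.items())
    power = sum(bits * n * POWER_PER_BIT[bits] for bits, n in counts.items())
    return total_bits, power / total_bits


post_placement = {4: 35, 2: 252, 1: 237}   # post-placement clustering
at_placement = {4: 159, 2: 105, 1: 35}     # at-placement bonding

for name, counts in (("post-placement", post_placement),
                     ("at-placement", at_placement)):
    bits, ratio = normalized_ff_power(counts)
    print(f"{name}: {bits} bits, normalized flip-flop power {ratio:.3f}")
```

Both configurations cover the same 881 flip-flop bits, so the difference comes only from how many bits end up in 4-bit cells.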

  8. Outline
     - Introduction
     - Preliminaries
     - Problem formulation
     - Algorithm: FF-Bond
     - Experimental results
     - Conclusion

  9. Post-Placement MBFF Clustering
     - Given
       - A placed design
       - MBFF library
       - Timing slacks of flip-flops
     - Replace FFs with MBFFs
       - Minimize flip-flop power
       - Satisfy timing constraints
     - [Figure: single-bit flip-flops (SBFFs) merged into MBFFs by MBFF clustering]

  10. Intersection Graph
     - Define the feasible region of a flip-flop according to its slack
     - Model the overlap of feasible regions by an intersection graph
     - A proper-sized clique corresponds to an MBFF
     - [Figure: feasible region of flip-flop i in the x-y plane, set by the fanin slack towards f_i(i) and the fanout slack towards f_o(i); intersection graph over flip-flops 1 to 8]

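  11. INTEGRA (INTErval GRAph)
     - Perform coordinate transformation
     - Sort starting (s) and ending (e) points of projection in the x' and y' axes
     - [Figure: feasible regions of FF1 to FF8 in the x-y plane]
     - Sorted sequence:

     | TYPE | s | s | s | s | e | e | s | s | s | e | s | e | e | e | e | e |
     |------|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
     | FF#  | 1 | 2 | 3 | 4 | 1 | 2 | 5 | 6 | 7 | 3 | 8 | 4 | 5 | 7 | 6 | 8 |

     - Jiang et al., "INTEGRA: Fast multibit flip-flop clustering for clock power saving," TCAD12, ISPD11.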
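
The coordinate transformation and endpoint sorting can be sketched as follows. The sketch assumes each feasible region is a single Manhattan-distance diamond given as (center x, center y, radius), which is a simplification not spelled out on the slide; a 45-degree rotation then turns each diamond into an axis-aligned square whose projections are plain intervals. All function names are hypothetical.

```python
def rotate(x, y):
    """45-degree coordinate transformation: (x, y) -> (x', y') = (x + y, y - x)."""
    return x + y, y - x


def diamond_to_intervals(cx, cy, r):
    """Project the diamond |x - cx| + |y - cy| <= r onto the x' and y' axes."""
    cxp, cyp = rotate(cx, cy)
    return (cxp - r, cxp + r), (cyp - r, cyp + r)


def intervals_overlap(a, b):
    """Closed intervals overlap when each one starts before the other ends."""
    return a[0] <= b[1] and b[0] <= a[1]


def regions_overlap(d1, d2):
    """Two diamonds intersect iff their projections overlap on both axes."""
    (x1, y1), (x2, y2) = diamond_to_intervals(*d1), diamond_to_intervals(*d2)
    return intervals_overlap(x1, x2) and intervals_overlap(y1, y2)


def sorted_endpoints(diamonds):
    """Return the x'-axis sequence of ('s' | 'e', ff_id), sorted by coordinate.

    Starts sort before ends at equal coordinates, so touching regions
    still count as overlapping.
    """
    points = []
    for ff_id, d in diamonds.items():
        (lo, hi), _ = diamond_to_intervals(*d)
        points.append((lo, 0, "s", ff_id))
        points.append((hi, 1, "e", ff_id))
    points.sort()
    return [(kind, ff_id) for _, _, kind, ff_id in points]


example = {1: (0, 0, 2), 2: (3, 0, 2)}
print(regions_overlap(example[1], example[2]))
print(sorted_endpoints(example))
```

After the rotation, the two-dimensional overlap test reduces to two one-dimensional interval tests, which is why sorting the start and end points on each axis is enough.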

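  12. INTEGRA
     - Find decision points
       - 'se' in the x' axis: a starting point immediately followed by an ending point
     - Retrieve maximal cliques at decision points
       - Check the x' and y' axes
       - {1, 2, 4} or {1, 3, 4}
     - Form MBFFs of proper sizes
       - e.g., {1, 2}
     - [Figure: feasible regions of FF1 to FF8 in the x'-y' plane with their start (s) and end (e) points]
     - Sorted sequence, with decision points at s4|e1, s7|e3, and s8|e4:

     | TYPE | s | s | s | s | e | e | s | s | s | e | s | e | e | e | e | e |
     |------|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
     | FF#  | 1 | 2 | 3 | 4 | 1 | 2 | 5 | 6 | 7 | 3 | 8 | 4 | 5 | 7 | 6 | 8 |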
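
The decision-point scan can be tried directly on the TYPE / FF# sequence from this slide. The sketch below is a minimal reading of the 'se' rule, not the INTEGRA implementation: it walks the x'-axis sequence, keeps the set of currently open intervals, and records that set whenever an 's' is immediately followed by an 'e'. The function name `candidate_sets` is hypothetical, and the y'-axis check that narrows each candidate set to maximal cliques is left out.

```python
# Sorted x'-axis sequence copied from the slide.
TYPES = "s s s s e e s s s e s e e e e e".split()
FF_IDS = [1, 2, 3, 4, 1, 2, 5, 6, 7, 3, 8, 4, 5, 7, 6, 8]


def candidate_sets(types, ff_ids):
    """Return the open-interval set at every 'se' decision point."""
    active, result = set(), []
    for k, (kind, ff) in enumerate(zip(types, ff_ids)):
        if kind == "s":
            active.add(ff)
            # Decision point: this start is directly followed by an end.
            if k + 1 < len(types) and types[k + 1] == "e":
                result.append(sorted(active))
        else:
            active.discard(ff)
    return result


print(candidate_sets(TYPES, FF_IDS))
# → [[1, 2, 3, 4], [3, 4, 5, 6, 7], [4, 5, 6, 7, 8]]
```

The first candidate set, {1, 2, 3, 4}, is the one the slide refines with the y' axis into {1, 2, 4} or {1, 3, 4}.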

  13. Example: INTEGRA
     - Example: MBFF library: 1-bit, 2-bit, 4-bit
     - [Figure: feasible regions of FF1 to FF8 in the x'-y' plane]
     - Clustering result: 2 dual-bit flip-flops and 1 four-bit flip-flop
     - Guide FFs towards merging-friendly locations: 2 four-bit flip-flops

  14. Outline
     - Introduction
     - Preliminaries
     - Problem formulation
     - Algorithm: FF-Bond
     - Experimental results
     - Conclusion

  15. The MBFF Bonding at Placement Problem
     - Given
       - Gate-level netlist
       - MBFF library
       - Timing constraints
     - Find a placement and replace FFs with MBFFs
       - Minimize flip-flop power
       - Satisfy timing constraints

  16. Outline
     - Introduction
     - Preliminaries
     - Problem formulation
     - Algorithm: FF-Bond
     - Experimental results
     - Conclusion

  17. The Overview of FF-Bond
     - Guide flip-flops towards merging-friendly locations at the global placement stage without sacrificing timing
     - Overall flow: Netlist → Global placement (FF-Bond) → Legalization → Detailed placement → Clock tree synthesis → Routing → End
     - FF-Bond flowchart boxes (global placement, fed by a signoff timer):
       - Objective function construction with timing-driven net weighting
       - Gradient-based optimization solver
       - Evenly distributed? (overlap index < d_1; Y/N)
       - Sparse enough? (overlap index < d_2; Y/N)
       - Flip-flop bonding: MBFF clustering, pseudo-net generation
     - Overlap index = overlapped_area / total_cell_area

  18. Example: s38584
     - [Figure: FF-Bond flowchart, as on slide 17]

  19. Example: s38584
     - Spread cells until sparse enough
     - [Figure: FF-Bond flowchart, as on slide 17]

  20. Example: s38584
     - Apply flip-flop bonding
     - [Figure: FF-Bond flowchart, as on slide 17]

  21. Timing-Driven Placement
     - Pure wirelength-driven placement + slack-based net weighting
     - Pure wirelength-driven analytical placement: mPL

       $$\min\; W(x, y) = \sum_{e \in E} \Big( \max_{v_i, v_j \in e,\, i<j} |x_i - x_j| + \max_{v_i, v_j \in e,\, i<j} |y_i - y_j| \Big)$$

       $$\text{s.t.}\; D_{ij} = K,\quad 1 \le i \le m,\; 1 \le j \le n$$

     - Smooth the objective function and the constraints
       - Log-sum-exp approximation

         $$W(x, y) = \eta \sum_{e \in E} \Big( \log \sum_{v_k \in e} \exp\frac{x_k}{\eta} + \log \sum_{v_k \in e} \exp\frac{-x_k}{\eta} + \log \sum_{v_k \in e} \exp\frac{y_k}{\eta} + \log \sum_{v_k \in e} \exp\frac{-y_k}{\eta} \Big)$$

       - Inverse Laplace transformation

         $$\min\; W(x, y) \quad \text{s.t.}\; \psi_{ij} = K,\quad 1 \le i \le m,\; 1 \le j \le n$$

     - Slack-based net weighting

       $$\text{net weight} = \Big( 1 - \frac{\text{slack}}{T_{clk}} \Big)^{\alpha},\quad \alpha > 1$$

       - slack = 0 for the 1st iteration (pure wirelength-driven placement)
     - T. Chen et al., "Multilevel generalized force-directed method for circuit placement," ISPD05.
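
As a rough illustration of the two ingredients on slide 21, the sketch below evaluates the log-sum-exp approximation for one net along one axis and a slack-based net weight. It reads the net-weight formula as (1 − slack / T_clk)^α, which is a reconstruction of the slide's layout, and the function names are hypothetical; it is not the mPL implementation.

```python
import math


def hpwl_1d(coords):
    """Exact one-axis wirelength of a net: max coordinate minus min coordinate."""
    return max(coords) - min(coords)


def _smooth_max(values, eta):
    """eta * log(sum(exp(v / eta))), shifted by the max for numerical stability."""
    m = max(values)
    return m + eta * math.log(sum(math.exp((v - m) / eta) for v in values))


def lse_wirelength_1d(coords, eta):
    """Log-sum-exp wirelength: smooth max(x) plus smooth max(-x)."""
    return _smooth_max(coords, eta) + _smooth_max([-c for c in coords], eta)


def net_weight(slack, t_clk, alpha):
    """Slack-based net weight, assuming slack <= t_clk so the base stays non-negative."""
    return (1.0 - slack / t_clk) ** alpha


pins_x = [0.0, 3.0, 10.0]
print(hpwl_1d(pins_x), lse_wirelength_1d(pins_x, eta=1.0))
print(net_weight(0.0, 10.0, 2.0), net_weight(-5.0, 10.0, 2.0))
```

The smooth value always sits slightly above the exact wirelength and approaches it as η shrinks. With slack = 0 every net gets weight 1, which matches the slide's note that the first iteration is pure wirelength-driven placement.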

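  22. Flip-Flop Bonding
     - Bond flip-flops into perfect-sized cliques
       - Perfect: most power efficient
     - Example (larger bit numbers are more power efficient):

     | Bit number | Normalized power per bit | Normalized area per bit |
     |------------|--------------------------|-------------------------|
     | 1          | 1.00                     | 1.00                    |
     | 2          | 0.86                     | 0.96                    |
     | 4          | 0.78                     | 0.71                    |

     - [Figure: oversized, perfect, and undersized cliques]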

  23. Flip-Flop Bonding
     - Bond flip-flops into perfect-sized cliques
     - Priority of processing maximal cliques: perfect > undersize > oversize
       - Perfect size: preserved
       - Undersize/oversize: try to form a target-sized clique by selecting the nearest flip-flops in a specified search region
       - The target size: the flip-flop configuration that is larger than, nearest to, and more power efficient than the investigated clique size
     - Adjacency inside the search region
       - Physical & timing:

         $$|x_c - x_i| + |y_c - y_i| - \varepsilon_i \times \big( s_{fi(i)} + s_{fo(i)} \big)$$
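
A minimal sketch of the nearest-flip-flop selection, under several assumptions that the slide does not confirm: the adjacency measure is read as |x_c − x_i| + |y_c − y_i| − ε_i × (s_fi(i) + s_fo(i)), a lower value is taken to mean a better candidate, (x_c, y_c) is taken to be a reference point of the clique, and the dictionary keys and function names are hypothetical.

```python
def adjacency_cost(center, ff):
    """Manhattan distance to the clique reference point, discounted by timing slack.

    ff is a dict with keys x, y, eps, s_fi (fanin slack), s_fo (fanout slack).
    Assumption: lower cost means a better candidate.
    """
    xc, yc = center
    distance = abs(xc - ff["x"]) + abs(yc - ff["y"])
    return distance - ff["eps"] * (ff["s_fi"] + ff["s_fo"])


def pick_nearest(center, candidates, needed):
    """Pick the `needed` candidates with the lowest adjacency cost."""
    ranked = sorted(candidates, key=lambda ff: adjacency_cost(center, ff))
    return ranked[:needed]


candidates = [
    {"name": "A", "x": 1.0, "y": 0.0, "eps": 0.1, "s_fi": 0.0, "s_fo": 0.0},
    {"name": "B", "x": 2.0, "y": 0.0, "eps": 0.1, "s_fi": 10.0, "s_fo": 10.0},
]
print([ff["name"] for ff in pick_nearest((0.0, 0.0), candidates, needed=1)])
```

Here B is physically farther than A but has enough slack to be moved, so under this reading it ranks first.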

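  24. Example: Flip-Flop Bonding (1/3)
     - Extract maximal cliques
     - [Figure: flip-flops and their maximal cliques]
     - MBFF library (larger bit numbers are more power efficient):

     | Bit number | Normalized power per bit | Normalized area per bit |
     |------------|--------------------------|-------------------------|
     | 1          | 1.00                     | 1.00                    |
     | 2          | 0.86                     | 0.96                    |
     | 4          | 0.78                     | 0.71                    |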

  25. Example: Flip-Flop Bonding (2/3)
     - Bonding strategy
       - Choose an undersized clique with priority 3 > 2 > 1
         - Select nearest flip-flops to form target-sized cliques
         - 3 → 4
         - 2 → 4
         - 1 → 2
       - Choose an oversized clique
         - Select nearest flip-flops to form target-sized cliques
         - Even → 4X
         - Odd → even
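
The undersized targets (3 → 4, 2 → 4, 1 → 2) follow from the target-size rule "larger than, nearest to, and more power efficient than the investigated clique size". The sketch below is one possible reading of that rule for undersized cliques only, using the example library's power-per-bit table; the oversized cases (even → 4X, odd → even) are not covered, and `target_size` is a hypothetical name.

```python
# Normalized power per bit of the example library (1-bit, 2-bit, 4-bit).
POWER_PER_BIT = {1: 1.00, 2: 0.86, 4: 0.78}


def target_size(clique_size, power_per_bit=POWER_PER_BIT):
    """Return the nearest larger library size that is more power efficient.

    The clique's current efficiency is taken as that of the largest library
    cell that fits in it (an assumption). Returns None when no larger, more
    efficient size exists, i.e. the clique is not undersized.
    """
    sizes = sorted(power_per_bit)
    fitting = [s for s in sizes if s <= clique_size]
    if not fitting:
        return None
    baseline = power_per_bit[max(fitting)]
    for size in sizes:
        if size > clique_size and power_per_bit[size] < baseline:
            return size
    return None


for k in (3, 2, 1):
    print(k, "->", target_size(k))
```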
