LAGRANGIAN RELAXATION FOR GATE IMPLEMENTATION SELECTION
Yi-Le Huang, Jiang Hu and Weiping Shi
Department of Electrical and Computer Engineering


  1. LAGRANGIAN RELAXATION FOR GATE IMPLEMENTATION SELECTION
Yi-Le Huang, Jiang Hu and Weiping Shi
Department of Electrical and Computer Engineering, Texas A&M University

  2. OUTLINE
• Introduction and motivation
• Projection-based descent method for solving the Lagrangian dual problem
• Distribution of Lagrangian multipliers
• Experimental results
• Conclusion

  3. GATE IMPLEMENTATION SELECTION
• Gate implementation options: size, P/N ratio, threshold voltage (Vt), …
• Large problem size: commonly hundreds of thousands of gates, sometimes millions
• Essential for circuit power and performance

  4. PREVIOUS WORK (CONTINUOUS)
• Solved by linear programming, convex programming, or network flow
• Round fractional solutions to integers
• Fast, but suffers rounding errors
• Restrictions on delay/power models
• Difficult to handle P/N ratio unless done at the transistor level

  5. PREVIOUS WORK (DISCRETE)
• No rounding error
• Compatible with different power/delay models
• Sensitivity-based heuristics: simple, quick, greedy
• Dynamic programming-like search: relatively systematic solution search

  6. LAGRANGIAN RELAXATION (LR)
• Handles conflicting objectives or complex constraints
• With continuous optimization: faster convergence (Chen, Chu and Wong, TCAD 1999)
• With dynamic programming-like search: alleviates the curse of dimensionality

  7. OVERVIEW OF LR
Original problem:
  Minimize A(x)
  Subject to: B(x) ≤ 0, C(x) ≤ 0
LR subproblem (λ: Lagrangian multiplier):
  Minimize A(x) + λ·B(x)
  Subject to: C(x) ≤ 0
Lagrangian dual problem:
  Find the λ that maximizes the optimal value of the subproblem
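To make these three boxes concrete, here is a toy Python sketch of the LR flow on a one-variable problem; the problem instance, step size, and iteration count are illustrative assumptions, not the paper's formulation.

    # Toy LR flow: minimize A(x) = x^2 subject to B(x) = 1 - x <= 0 (i.e., x >= 1).

    def solve_subproblem(lam):
        """Minimize A(x) + lam * B(x) = x^2 + lam * (1 - x); minimizer is x = lam / 2."""
        return lam / 2.0

    def solve_dual(step=0.5, iterations=100):
        """Maximize the dual by subgradient ascent; B(x) is a subgradient in lam."""
        lam = 0.0
        for _ in range(iterations):
            x = solve_subproblem(lam)
            lam = max(0.0, lam + step * (1.0 - x))  # keep the multiplier nonnegative
        return x, lam

    x, lam = solve_dual()
    print(f"x = {x:.3f}, lambda = {lam:.3f}")  # approaches x = 1, lambda = 2

The subproblem has a closed-form minimizer here, so the whole loop reduces to ascent on λ; the dual maximizer λ = 2 recovers the constrained optimum x = 1.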

  8. LR FOR GATE IMPLEMENTATION SELECTION
Original problem:
  Min P(x)
  s.t. a_j ≤ A_j for each primary output j
       a_i + D_ij(x) ≤ a_j for each timing arc (i, j)
       a_i ≥ D_i(x) for each gate i fed by a primary input
LR subproblem:
  Min P(x) + Σ_(i,j) λ_ij · (a_i + D_ij(x) - a_j)
x: implementation decision; P: power; D: delay; a: arrival time
Subgradient = -slack
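Because the relaxed objective decomposes over gates for fixed multipliers, each gate can pick its implementation locally. A minimal sketch, assuming a made-up option table of (power, delay) pairs and a single fanout-multiplier sum:

    # Hypothetical sketch: for fixed multipliers, each gate's contribution to
    # P(x) + sum(lam_ij * (a_i + D_ij(x) - a_j)) depends only on its own option,
    # so a greedy pass can pick the option minimizing power + weighted delay.

    OPTIONS = [(1.0, 9.0), (2.0, 6.0), (4.0, 4.0)]  # (power, delay), made-up numbers

    def best_option(options, lam_sum):
        """Pick the option minimizing power + (sum of multipliers on fanout arcs) * delay."""
        return min(options, key=lambda pd: pd[0] + lam_sum * pd[1])

    print(best_option(OPTIONS, 0.1))  # cheap, slow option wins when timing is loose
    print(best_option(OPTIONS, 2.0))  # fast option wins when multipliers are large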

  9. LAGRANGIAN DUAL PROBLEM
• Ideally
  • a piecewise-convex function
  • solved by the subgradient method, a variant of steepest descent
• In practice
  • no guarantee of an optimal subproblem solution
  • the dual problem is no longer convex
• How to solve a non-convex dual problem?
  • not well studied
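For reference, the classical subgradient step the slide builds on, using the slide 8 identity subgradient = -slack; the step size rho is an assumed parameter:

    # Subgradient step for one multiplier: since the subgradient equals -slack,
    # negative slack (a violation) raises lam, positive slack lowers it.
    def subgradient_update(lam, slack, rho):
        return max(0.0, lam - rho * slack)  # project onto lam >= 0

    print(subgradient_update(1.0, -5.0, 0.1))  # 1.5: tighten a violated path
    print(subgradient_update(1.0, 20.0, 0.1))  # 0.0: relax a path with large slack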

  10. KARUSH-KUHN-TUCKER (KKT) CONDITIONS
For a gate i with input arcs 1, 2 and output arcs I, II:
  λ_1i + λ_2i = λ_iI + λ_iII
The multipliers behave like a flow: the sum over a gate's input arcs equals the sum over its output arcs.
Flow conservation (Chen, Chu and Wong, TCAD 1999)
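The condition is easy to state as a checkable predicate; a minimal sketch with made-up multiplier values:

    # KKT flow conservation at a gate: the sum of multipliers on its input arcs
    # must equal the sum on its output arcs (Chen, Chu and Wong, TCAD 1999).
    def flow_conserved(lam_in, lam_out, tol=1e-9):
        return abs(sum(lam_in) - sum(lam_out)) <= tol

    print(flow_conserved([0.4, 0.6], [1.0]))       # True
    print(flow_conserved([0.4, 0.6], [0.5, 0.8]))  # False: violates KKT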

  11. PROBLEM OF THE SUBGRADIENT METHOD
Slack_1 = -5, Slack_2 = 20, Slack_3 = -5
Δλ_1 = 5ρ, Δλ_2 = -20ρ, Δλ_3 = 5ρ (ρ: step size in the subgradient method)
Δλ_1 + Δλ_2 ≠ Δλ_3: the subgradient update does not preserve flow conservation
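Plugging the slide's numbers into the subgradient rule Δλ = -ρ·slack confirms the violation for any nonzero step size ρ:

    # With slacks (-5, 20, -5) and update dlam = -rho * slack, the conservation
    # requirement dlam1 + dlam2 == dlam3 fails.
    rho = 0.1
    dlam1, dlam2, dlam3 = 5 * rho, -20 * rho, 5 * rho
    print(dlam1 + dlam2, dlam3)    # -1.5 vs 0.5
    print(dlam1 + dlam2 == dlam3)  # False: subgradient breaks flow conservation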

  12. PROJECTION-BASED DESCENT METHOD
Projected move direction: smoothed historical gradient
[Figure: subgradient step vs. projected move direction in the (λ_1, λ_2) plane]

  13. PROJECTION ESTIMATION
• Table of (a, λ) pairs from previous iterations
• Gradient history: (a_i - a_{i-1}) / (λ_i - λ_{i-1})
• Projection direction: weighted average of historical gradients, with more weight on recent gradients (see the sketch below)
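A sketch of this estimate, assuming an exponential decay weighting; the slide does not specify the exact weights:

    # Estimate the move direction from the (a, lam) history: finite-difference
    # gradients (a_i - a_{i-1}) / (lam_i - lam_{i-1}), averaged with more weight
    # on recent iterations (exponential decay is an assumed choice).
    def projected_direction(history, decay=0.5):
        grads = []
        for (a0, l0), (a1, l1) in zip(history, history[1:]):
            if l1 != l0:
                grads.append((a1 - a0) / (l1 - l0))
        weights = [decay ** (len(grads) - 1 - k) for k in range(len(grads))]
        return sum(w * g for w, g in zip(weights, grads)) / sum(weights)

    history = [(10.0, 1.0), (8.0, 2.0), (7.4, 2.5)]  # (arrival time, multiplier) pairs
    print(projected_direction(history))  # about -1.47: recent gradient -1.2 outweighs older -2.0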

  14. STEP SIZE
• Δλ = (q - a_cur) / η
• q: required arrival time
• a_cur: current arrival time
• η: projected move direction
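Applied to the direction estimated above (values are illustrative):

    # Step size from slide 14: dlam = (q - a_cur) / eta, i.e., move the multiplier
    # far enough that, along the projected direction eta, the arrival time is
    # predicted to land on the required time q.
    def multiplier_step(q, a_cur, eta):
        return (q - a_cur) / eta

    print(multiplier_step(q=5.0, a_cur=7.4, eta=-1.47))  # about 1.63: raise lam to cut arrival time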

  15. MULTIPLIER UPDATE FLOW
• Multipliers at primary outputs are updated by projection
• They are distributed to the entire circuit in reverse topological order, like a network flow
• Alternatively, distribute from the primary inputs in topological order
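A minimal sketch of one PO update followed by a backward split; the helper names, the even split, and all numbers are illustrative assumptions (slide 16 gives the actual flow-conserving split):

    def project_at_po(lam_po, q, a_cur, eta):
        """Projection-based step at a primary output (slides 12-14)."""
        return max(0.0, lam_po + (q - a_cur) / eta)

    def split_evenly(lam_out_sum, n_inputs):
        """Distribute a gate's outgoing flow across its input arcs;
        an even split is an assumed simplification."""
        return [lam_out_sum / n_inputs] * n_inputs

    # Update the PO first, then push the flow backward (reverse topological order).
    lam_po = project_at_po(lam_po=1.0, q=5.0, a_cur=7.4, eta=-1.5)
    lam_inputs = split_evenly(lam_po, n_inputs=2)
    print(lam_po, lam_inputs)  # PO multiplier first, then its gate's input arcs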

  16. MULTIPLIER DISTRIBUTION
• Ensure flow conservation
• Try to equalize slack: different slacks imply room for power saving
• Given the outgoing flow Σ_{k∈out(j)} λ_jk, find the target arrival time x for gate j's inputs
• Update Δλ_ij = (x - a_i) / η_ij for each input arc (i, j)
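One consistent reading of this rule: choose x so that the updated incoming multipliers sum to the outgoing flow, which yields a closed form for x. A sketch under that assumption, with made-up values:

    # Solve for the target arrival time x so that, after the update
    # dlam_ij = (x - a_i) / eta_ij, incoming multipliers sum to the outgoing flow.
    def distribute(lam_in, a_in, eta_in, lam_out_sum):
        inv = [1.0 / e for e in eta_in]
        x = (lam_out_sum - sum(lam_in) + sum(a * w for a, w in zip(a_in, inv))) / sum(inv)
        dlam = [(x - a) / e for a, e in zip(a_in, eta_in)]
        return x, dlam

    x, dlam = distribute(lam_in=[0.5, 0.5], a_in=[7.0, 9.0],
                         eta_in=[-2.0, -2.0], lam_out_sum=1.4)
    print(x, dlam)  # updated incoming multipliers now sum to lam_out_sum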

  17. EXPERIMENT SETUP
• ISCAS85 benchmarks
• Cell library based on 70nm technology
• Synthesized by SIS, placed by mPL
• Elmore delay and an analytical power model
• LR subproblem solved by a greedy heuristic
• Compare our approach (projection + greedy) with the baseline (subgradient + greedy)

  18. RESULTS WITH TIGHT TIMING CONSTRAINTS

                           Initial               Subgradient method           Our method
  testcase  # of gates   power     slack       power    slack  run time     power   slack  run time
  chain             11     9.3    -295.6        60.8    -13.9       0.0     104.4     0.1       0.4
  c432             289   221.7  -10379.8       832.4    -33.1       0.8     803.7     0.6       1.2
  c499             539   418.8   -5389.7      1545.4    -11.5       1.5    1522.8     1.7       2.2
  c880             340   259.4   -4239.1       515.5    -31.7       0.9     549.4    15.6       1.7
  c1355            579   426.6   -5353.7      1470.0     -5.3       1.7    1403.9     7.6       2.5
  c1908            722   582.8   -7286.4      1452.7    -12.8       2.2    1402.7     5.9       3.2
  c2670           1082   725.1  -16177.1      1465.9    -32.8       2.8    1312.6     9.1       4.1
  c3540           1208   994.5   -7369.0      2650.5   -116.6       3.7    3016.6    20.0       5.5
  c5315           2440  1941.7   -9956.3      3627.4   -199.0       7.4    4088.7     7.8      10.9
  c6288           2342  1819.7  -10476.1      6305.5    -29.4       7.6    5382.4     3.4      11.4
  c7552           3115  2390.0  -21197.9      6875.7    -97.2       9.7    5433.9    20.6      14.8
  Sum                     9790                                    38.38                       57.82
  # of violations           11                    11                           0

  19. RESULTS WITH LOOSE TIMING CONSTRAINTS

                           Initial               Subgradient method           Our method
  testcase  # of gates   power     slack       power    slack  run time     power   slack  run time
  chain             11     9.3    -215.5        27.2      5.0       0.0      27.2     5.0       0.1
  c432             289   221.7   -8033.3       249.8     17.0       0.8     238.7    18.0       1.1
  c499             539   418.8   -4198.2       874.5    614.0       1.5     498.8     7.0       2.2
  c880             340   259.4   -3219.2       327.8    231.0       0.9     279.8     1.0       1.3
  c1355            579   426.6   -4084.3       736.4     38.0       1.7     522.1     3.0       2.5
  c1908            722   582.8   -5716.4       878.0     22.0       2.2     666.7    70.0       3.1
  c2670           1082   725.1  -12969.7       760.0    711.0       2.8     734.2   115.0       4.0
  c3540           1208   994.5   -5873.4      2012.5    718.0       3.7    1147.6     7.0       5.5
  c5315           2440  1941.7   -8156.4      3165.6   1033.0       7.4    2171.9    17.0      10.8
  c6288           2342  1819.7   -7786.7      3951.3    310.0       7.5    2518.8    13.0      11.5
  c7552           3115  2390.0  -16899.7      3897.8   1386.0       9.7    2445.5   107.0      14.3
  Sum                     9790                16881             38.25     11251               56.34
  # of violations           11                     0                           0

  20. SLACK OVER ITERATIONS (C432)
[Figure: slack vs. iteration (1-29) for our algorithm and the subgradient method]

  21. POWER OVER ITERATIONS (C432)
[Figure: power vs. iteration (1-29) for our algorithm and the subgradient method]

  22. CONCLUSIONS AND FUTURE RESEARCH
• Drawbacks of the subgradient method are investigated
• New techniques are proposed to solve the Lagrangian dual problem for gate implementation selection
• They lead to better solutions and faster convergence
• In the future, we will integrate them with dynamic programming-like search
