Outstanding unsolved problems demand new methods for their solution

Topics: SoC Design Issues; Wire Retiming for Global Interconnects; Buffer Insertion for SoC Circuits; Multicore Parallel CAD.

"Outstanding unsolved problems demand new methods for their solution, while powerful new methods beget new problems to be solved."


  1. Wire pipelining
      Prof. Hai Zhou, EECS, Northwestern University: Optimization Algorithms and Parallel Programming in Physical and
      Section: Wire Retiming for Global Interconnects (block models and problem formulation; incremental retiming algorithms for wire pipelining; experimental results).
      VLSI scaling trend: frequency 2X/generation, die size 1.25X/generation.
      Problem: global communication requires multiple clock periods.
      Recent research: insert flip-flops (FFs) on wires based on physical needs (Intel, IBM, etc.).
      How can logical (functional) correctness be maintained?
          FF insertion changes the computation schedule.
          Synchronization among different computation units may be destroyed.

  2. Wire pipelining by retiming
      Retiming [Leiserson and Saxe ’83] relocates FFs without changing functionality.
          It re-schedules computation.
      We extend it to pipeline long wires.
          This re-schedules both computation and communication.
      FFs may be added at the primary inputs (or primary outputs) and then retimed into the circuit.

  3. An SoC design example
      [Figure: example block-level circuit with nodes labeled a, b, c, u, v, w, x, y.]
      Block placement and global routing are given.
      Signal directions and register locations are given, too.

  4. Timing model for a combinational block
      [Figure, panels (a) and (b): a block with pins a, b, c, x, y; arrows are labeled with delays $d_1$ to $d_4$ and sums such as $d_1 + d_2$ and $d_3 + d_2$.]
      Timing arrows represent pin-to-pin path delays.

  5. Timing model for a sequential block
      [Figure, panels (a) and (b): a block with pins a, b, x, y and internal elements labeled $f_1$ and $f_2$; arrows are labeled with delays such as $d_1$, $d_3$, and $d_1 + d_2$.]
      Timing arrows are used for pin-to-pin combinational paths.
      A virtual register is introduced for other paths, i.e. paths starting or ending at registers.

  6. Timing model for a net
      [Figure, panels (a) and (b): a net with nodes x, y, u, v, w.]
      Nodes are introduced for Steiner points.
      Nodes are introduced for entrances and exits of buffer-forbidden areas.

  9. Optimal wire retiming problem
      $G = (V, E)$, $E = E_1 \cup E_2$, $E_1 \cap E_2 = \emptyset$.
      Delay $d(e)$ and number of FFs $w(e)$, $\forall e \in E$.
      $\forall e \in E_2$, $d(e)$ is proportional to its length, since buffers are allowed on $E_2$.
      Find a relocation of FFs such that:
          no FFs are changed on any $e \in E_1$;
          the clock period (= the maximum delay between any two consecutive FFs) is minimized.

  10. Introducing decision variables r and t
      [Figure: moving FFs across a node $u$ corresponds to $r(u) \leftarrow r(u) + 1$ or $r(u) \leftarrow r(u) - 1$.]
      After retiming, an edge $(u, v)$ that carried $w(u, v)$ FFs carries $w_r(u, v) = w(u, v) + r(v) - r(u)$ FFs.
      The arrival time at $v$ is $t(v) = \max(\ldots)$.
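A minimal Python sketch of the retimed FF count $w_r(u, v) = w(u, v) + r(v) - r(u)$. The three-node cycle, the `retime` helper, and the chosen `r` values are hypothetical and not taken from the slides. The sketch illustrates that the $r$ terms cancel around a cycle, so the total number of FFs on the cycle is unchanged.

```python
def retime(w, r):
    """Return the retimed FF counts w_r(u, v) = w(u, v) + r(v) - r(u)."""
    return {(u, v): k + r[v] - r[u] for (u, v), k in w.items()}

# Hypothetical 3-node cycle a -> b -> c -> a with an FF count on each edge.
w = {("a", "b"): 1, ("b", "c"): 0, ("c", "a"): 2}
r = {"a": 0, "b": 1, "c": 1}

w_r = retime(w, r)
print(w_r)                                  # FF count per edge after retiming
print(sum(w.values()), sum(w_r.values()))   # cycle totals before and after
```

Here every retimed count stays non-negative, so this particular `r` would be a legal retiming on wire edges.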

  13. Formal problem formulation
      Minimize $T$ subject to:
      Retiming validity
          $r(u) = r(v)$, $\forall (u, v) \in E_1$   (1)
          $w_r(u, v) = w(u, v) + r(v) - r(u) \ge 0$, $\forall (u, v) \in E_2$   (2)
      Timing validity
          $t(v) \ge t(u) + d(u, v) - w_r(u, v)\,T$, $\forall (u, v) \in E$   (3)
          $0 \le t(v) \le T$, $\forall v \in V$   (4)
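To make the four constraints concrete, here is a small Python checker, offered as a sketch only. The function name `is_valid`, the dictionary-based graph encoding, and the two-edge example are assumptions made for illustration; they do not come from the slides.

```python
def is_valid(e1, e2, d, w, r, t, T):
    """Check constraints (1)-(4) for a candidate retiming r, arrival times t, period T."""
    edges = e1 | e2
    w_r = {(u, v): w[(u, v)] + r[v] - r[u] for u, v in edges}
    ok1 = all(r[u] == r[v] for u, v in e1)                    # (1) block edges
    ok2 = all(w_r[e] >= 0 for e in e2)                        # (2) wire edges
    ok3 = all(t[v] >= t[u] + d[(u, v)] - w_r[(u, v)] * T      # (3) timing on all edges
              for u, v in edges)
    ok4 = all(0 <= t[v] <= T for v in t)                      # (4) bounds on t
    return ok1 and ok2 and ok3 and ok4

# Hypothetical instance: one block edge a -> b and one wire edge b -> a with one FF.
e1, e2 = {("a", "b")}, {("b", "a")}
d = {("a", "b"): 2, ("b", "a"): 3}
w = {("a", "b"): 0, ("b", "a"): 1}
r = {"a": 0, "b": 0}
t = {"a": 0, "b": 2}

print(is_valid(e1, e2, d, w, r, t, T=5))   # True
print(is_valid(e1, e2, d, w, r, t, T=4))   # False: (3) fails on edge b -> a
```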

  14. Algorithmic view of the problem
      [Figure: wire retiming shown in relation to traditional retiming and the maximum cycle ratio (MCR) problem; the regions carry the labels d = 0, d ≥ 0, and d ≥ 0.]

  15. Traditional retiming problem
      $r(u) = r(v)$, $\forall (u, v) \in E_1$   (1)
      $w_r(u, v) = w(u, v) + r(v) - r(u) \ge 0$, $\forall (u, v) \in E_2$   (2)
      $t(v) \ge t(u) + d(u, v)$, $\forall (u, v) \in E : w_r(u, v) = 0$
      $0 \le t(v) \le T$, $\forall v \in V$   (4)
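For a fixed retiming, the constraints above reduce to a longest-path computation over the FF-free edges. The following Python sketch shows that step under stated assumptions: the helper name `clock_period`, the graph encoding, and the example are hypothetical, and the FF-free subgraph is assumed to be acyclic (otherwise `graphlib` raises `CycleError`).

```python
from graphlib import TopologicalSorter

def clock_period(nodes, d, w_r):
    """Smallest T for fixed FF counts: longest delay along paths of FF-free edges."""
    preds = {v: [] for v in nodes}
    for (u, v), k in w_r.items():
        if k == 0:                      # only edges without FFs constrain t
            preds[v].append(u)
    t = {}
    for v in TopologicalSorter(preds).static_order():   # predecessors come first
        t[v] = max((t[u] + d[(u, v)] for u in preds[v]), default=0)
    return max(t.values()), t

# Hypothetical cycle a -> b -> c -> a; the only FF sits on edge c -> a.
d = {("a", "b"): 3, ("b", "c"): 4, ("c", "a"): 2}
w_r = {("a", "b"): 0, ("b", "c"): 0, ("c", "a"): 1}
T, t = clock_period(["a", "b", "c"], d, w_r)
print(T, t)
```

In this traditional model the delay of an edge that carries an FF does not enter the constraints, which is the difference from constraint (3) of the wire retiming formulation.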

  16. Zhou’s algorithm [ASP-DAC’05]
      Solves traditional retiming incrementally, without binary search:
          initialize $T$ with $r = 0$;
          iteratively increment $r(v)$ for $t(v) \ge T$;
          maintain $m$ pointers for optimality checking.
      [Figure: a five-node example ($v_1$ to $v_5$) shown in four panels labeled T = 17, T = 10, T = 10, and T = 10, with nodes marked r = 1 or r = 2 as they are retimed.]

  18. Maximum cycle ratio problem
      Minimize $T$ subject to:
          $t(v) \ge t(u) + d(u, v) - w_r(u, v)\,T$, $\forall (u, v) \in E$   (3)
      Burns’s algorithm [Caltech PhD thesis ’91] solves the MCR problem by iteratively pushing down $T$.
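Constraint (3) is a system of difference constraints, so for a given $T$ it has a solution exactly when no cycle has positive total length under edge lengths $d(u, v) - w_r(u, v)\,T$. The Python sketch below tests this with a Bellman-Ford style relaxation. It is not Burns's algorithm; the name `feasible`, the edge-tuple encoding, and the two-node example are assumptions for illustration.

```python
from fractions import Fraction

def feasible(nodes, edges, T):
    """True if some t satisfies t(v) >= t(u) + d - w*T on every edge (u, v, d, w)."""
    t = {v: Fraction(0) for v in nodes}
    for _ in range(len(nodes)):
        changed = False
        for u, v, d, w in edges:
            cand = t[u] + d - w * T
            if cand > t[v]:
                t[v] = cand
                changed = True
        if not changed:
            return True          # relaxation converged: t is a witness
    return False                 # still changing after |V| passes: positive cycle

# Hypothetical two-node cycle: total delay 5, two FFs, so the cycle ratio is 5/2.
nodes = ["a", "b"]
edges = [("a", "b", 3, 1), ("b", "a", 2, 1)]
print(feasible(nodes, edges, Fraction(5, 2)))   # True
print(feasible(nodes, edges, Fraction(2)))      # False
```

The smallest feasible $T$ is the largest ratio of total delay to total FF count over all cycles, which is what the name "maximum cycle ratio" refers to.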

  23. Idea for solving the wire retiming problem
      Initialize $T$ with $r = 0$.
      Iteratively reduce $T$ while keeping (1)-(4):
          with $r$ unchanged, extend Burns’s algorithm;
          by changing $r$ (retiming), extend Zhou’s algorithm.
      Certify optimality.

  28. Push down T with r unchanged
      Retiming validity ((1) and (2)) is kept.
      Minimize $T$ under timing validity:
          $t(v) \ge t(u) + d(u, v) - w_r(u, v)\,T$, $\forall (u, v) \in E$   (3)
          $0 \le t(v) \le T$, $\forall v \in V$   (4)
      Burns’s algorithm [Caltech PhD thesis ’91] returns the minimal $T$ under (3).
      Extend Burns’s algorithm to incorporate (4).

  32. Burns’s Algorithm
      While (true)
          $E_c \leftarrow \{(u, v) \in E \mid t(v) = t(u) + d(u, v) - w_r(u, v)\,T\}$;
          Return $T$ and $r$, if $E_c$ contains a cycle;
          For $v \in V$ in topological sort order of $G_c = (V, E_c)$ do
              $\Delta(v) \leftarrow 0$, if $v$ is a root in $G_c$;
              $\Delta(v) \leftarrow \max_{\forall (u, v) \in E_c} \{\Delta(v), \Delta(u) + w_r(u, v)\}$;
          $\theta \leftarrow \infty$;
          For each $(u, v) \in E$ do
              If ($\Delta(u) + w_r(u, v) > \Delta(v)$) then
                  $\theta \leftarrow \min\{\theta, \frac{t(v) - t(u) - d(u, v) + w_r(u, v)\,T}{\Delta(u) + w_r(u, v) - \Delta(v)}\}$;
          $T \leftarrow T - \theta$
          For each $v \in V$ do
              $t(v) \leftarrow t(v) + \theta \cdot \Delta(v)$;
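The pseudocode above can be sketched in Python as follows. This is an illustrative reading of the slide, not the author's implementation. Several details are assumptions: the function name `burns_min_period`, the initialization (start from $T$ equal to the sum of all delays and obtain $t$ by longest-path relaxation), the use of exact `Fraction` arithmetic so the equality test for critical edges is reliable, and the requirement that every cycle carries at least one FF.

```python
from fractions import Fraction
from graphlib import TopologicalSorter, CycleError

def burns_min_period(nodes, edges):
    """edges: (u, v, d, w) tuples. Returns the minimal T under constraint (3)."""
    T = Fraction(sum(d for _, _, d, _ in edges))   # assumed start: bounds any cycle ratio
    t = {v: Fraction(0) for v in nodes}
    for _ in nodes:                                # relaxation so that t satisfies (3)
        for u, v, d, w in edges:
            t[v] = max(t[v], t[u] + d - w * T)
    while True:
        crit = [(u, v, w) for u, v, d, w in edges if t[v] == t[u] + d - w * T]
        preds = {v: set() for v in nodes}
        for u, v, _ in crit:
            preds[v].add(u)
        try:
            order = list(TopologicalSorter(preds).static_order())
        except CycleError:
            return T                               # critical cycle: T cannot go lower
        delta = {v: 0 for v in nodes}              # roots of G_c keep delta = 0
        for v in order:
            for a, b, w in crit:
                if b == v:
                    delta[v] = max(delta[v], delta[a] + w)
        steps = [(t[v] - t[u] - d + w * T) / (delta[u] + w - delta[v])
                 for u, v, d, w in edges if delta[u] + w > delta[v]]
        if not steps:
            raise ValueError("no cycle in the graph: T is unbounded below under (3)")
        theta = min(steps)
        T -= theta
        for v in nodes:
            t[v] += theta * delta[v]

# Hypothetical two-node cycle with total delay 5 and two FFs.
print(burns_min_period(["a", "b"], [("a", "b", 3, 1), ("b", "a", 2, 1)]))   # 5/2
```

On this example the sketch lowers $T$ from 5 to 3 and then to 5/2, at which point both edges are critical and form a cycle.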

  34. Modify Burns’s to satisfy (4)
      While (true)
          $E_c \leftarrow \{(u, v) \in E \mid t(v) = t(u) + d(u, v) - w_r(u, v)\,T\}$;
          Return $T$ and $r$, if $E_c$ contains a cycle;
          For $v \in V$ in topological sort order of $G_c = (V, E_c)$ do
              $\Delta(v) \leftarrow 0$, if $v$ is a root in $G_c$;
              $\Delta(v) \leftarrow \max_{\forall (u, v) \in E_c} \{\Delta(v), \Delta(u) + w_r(u, v)\}$;
          $\theta \leftarrow \infty$;
          For each $(u, v) \in E$ do
              If ($\Delta(u) + w_r(u, v) > \Delta(v)$) then
                  $\theta \leftarrow \min\{\theta, \frac{t(v) - t(u) - d(u, v) + w_r(u, v)\,T}{\Delta(u) + w_r(u, v) - \Delta(v)}\}$;
          For each $v \in V$ do
              $\theta \leftarrow \min\{\theta, \frac{T - t(v)}{\Delta(v) + 1}\}$;
          $T \leftarrow T - \theta$
          For each $v \in V$ do
              $t(v) \leftarrow t(v) + \theta \cdot \Delta(v)$;
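The extra bound on $\theta$ can be checked directly from the two update rules of the algorithm. Writing $T' = T - \theta$ and $t'(v) = t(v) + \theta\,\Delta(v)$ for the updated values (the primed names are introduced here only for this derivation), the upper bound in (4) holds after the update when:

```latex
\begin{aligned}
t'(v) \le T'
  &\iff t(v) + \theta\,\Delta(v) \le T - \theta \\
  &\iff \theta\,\bigl(\Delta(v) + 1\bigr) \le T - t(v) \\
  &\iff \theta \le \frac{T - t(v)}{\Delta(v) + 1}.
\end{aligned}
```

The lower bound in (4) needs no extra step: since $\theta \ge 0$ and $\Delta(v) \ge 0$, the update can only increase $t(v)$, so $t'(v) \ge t(v) \ge 0$.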

  35. Modify Burns’s to satisfy (4)
      Theoretical importance: pushes $T$ down to the minimum, with $r$ unchanged.

  39. Push down T by changing r
      Condition: $\exists v$, $t(v) = T$.
      Zhou’s algorithm: $r(v) \leftarrow r(v) + 1$.
          Necessary to get a smaller $T$ if it exists.
      Regain retiming validity ((1) and (2)) through proper $r$ adjustments.
      Run the extended Burns’s algorithm under the new $r$.
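The slides do not spell out what the "proper $r$ adjustments" are. One plausible reading, shown here purely as an assumption, is a worklist propagation: after $r(v)$ is incremented, raise the $r$ value of any node that now violates (1) or (2) until both hold again. The function name `regain_validity`, the data layout, and the example are hypothetical.

```python
def regain_validity(r, e1, w2, start):
    """Raise r values until (1) and (2) hold again after r[start] was incremented.

    e1: set of block edges (u, v); w2: dict mapping wire edges (u, v) to FF counts.
    Assumed propagation rule, not taken from the slides.
    """
    work = [start]
    while work:
        x = work.pop()
        for u, v in e1:                       # (1): r(u) = r(v) on block edges
            if x in (u, v) and r[u] != r[v]:
                low = u if r[u] < r[v] else v
                r[low] = max(r[u], r[v])
                work.append(low)
        for (u, v), w in w2.items():          # (2): w(u, v) + r(v) - r(u) >= 0
            if u == x and w + r[v] - r[u] < 0:
                r[v] = r[u] - w
                work.append(v)
    return r

# Hypothetical circuit: block edge a -> b, wire edges b -> c and c -> a with one FF each.
r = {"a": 1, "b": 0, "c": 0}                  # r(a) has just been incremented
regain_validity(r, {("a", "b")}, {("b", "c"): 1, ("c", "a"): 1}, "a")
print(r)
```

In this example the increment at `a` is propagated to `b` through the block edge, after which the wire edge from `b` to `c` has zero FFs and the wire edge from `c` to `a` has two.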

  40. Criteria for certifying optimality
      Optimality has been reached if:
          a critical cycle is found in Burns’s algorithm;
          an $m$ cycle is found in Zhou’s algorithm;
          $\exists v \in V$, $r(v) > N_{ff}$, where $N_{ff}$ is the total number of FFs in any simple path.

  48. An example
      [Figure: a four-node example (a, b, c, d) with $t$ and $\Delta$ values annotated at each node. Labels recoverable from the panels: starting at T = 5, a step with θ = 1.5 gives T = 3.5; after the retiming step r(c)++, r(b)++ the panel is labeled T = 5 again; two steps with θ = 1 then give T = 4 and T = 3, where a critical cycle is found.]

  52. Computational complexity
      Complexity per iteration: $O(|V|^2 |E|)$.
      Number of iterations: $O(|V| \cdot N_{ff})$, where $N_{ff}$ is the total number of FFs in any simple path.
      Entire algorithm: $O(|V|^3 |E| \cdot N_{ff})$ in the worst case.
      Remarkable efficiency in practice.

  54. Benchmark
      ISCAS-89.
      1st test set: treat gates as blocks.
      2nd test set: circuits with non-complete bipartite blocks.
          Use hMETIS to partition a circuit into groups.
          Treat each group as a block.

  55. Block models
      [Figure: a complete bipartite block model and a non-complete bipartite block model; pins are labeled a, b, c, h, w, x, y, z and arrows carry delays such as $d_1$, $d_5$, $d_1 + d_2$, $d_3 + d_4$, and $d_5 + d_4$.]

  56. Optimal clock period
                                          w/o non-CB        w/ non-CB
      Circuit   |V|     |E|     N_ff      #Step   T_opt     #Par   #Step   T_opt
      s386      519     700     6         13      51.1      50     1       55.0
      s400      511     665     21        120     32.2      50     1       50.6
      s444      557     725     21        289     35.2      40     1       63.2
      s838      1299    1206    32        2       76.0      130    1       84.0
      s953      1183    1515    29        31      60.6      110    2       69.5
      s1488     2054    2780    6         11      70.6      200    1       73.3
      s1494     2054    2792    6         63      76.9      160    1       79.9
      s5378     7205    8603    179       26      111.2     500    1       115.3
      s13207    19816   22999   669       129     239.5     1000   1       292.8
      s35932    46097   58266   1728      68      148.3     2000   1       163.2
      s38584    53473   66964   1452      126     204.0     2000   1       264.0

  57. Running time comparison (in seconds)
      t_bs1 [ICCAD’03], t_bs2 [DATE’04], precision = 0.1
                w/o non-CB blocks             w/ non-CB blocks
      Circuit   t_bs1    t_bs2    t_new       t_bs1      t_bs2    t_new
      s386      1.97     0.01     0.00        3.67       0.01     0.00
      s400      1.64     0.01     0.03        3.38       0.01     0.00
      s444      2.23     0.03     0.09        4.31       0.01     0.00
      s838      8.79     0.03     0.00        33.42      0.02     0.00
      s953      9.76     0.04     17.56       0.07       0.02     0.00
      s1488     35.17    0.08     0.08        98.88      0.05     0.00
      s1494     34.13    0.08     62.86       0.09       0.06     0.00
      s5378     684.6    0.24     0.31        1344.74    0.29     0.00
      s13207    -        1.07     -           206.52     3.46     0.02
      s35932    -        18.63    7.55        -          6.19     0.19
      s38584    -        7.44     -           21992.67   30.17    0.19

  59. Summary
      The scaling trend introduces more interconnects that need multiple clock periods.
      Retiming is a critical technique for wire pipelining that preserves correctness.
      An efficient new algorithm is proposed and tested:
          Without binary search
          Exact optimality
          Polynomial time bounded
          Simple implementation
          Efficient in practice
          Incremental in nature

  60. Outline
      1. SoC Design Issues
      2. Wire Retiming for Global Interconnects
          Block models and problem formulation
          Incremental retiming algorithms for wire pipelining
          Experimental results
      3. Buffer Insertion for SoC Circuits
          Motivation and Problem Formulation
          Efficient Algorithms Based on Network Flow
          Experimental Results
      4. Multicore Parallel CAD
          Multicore Revolution and CAD Challenges
          Nondeterministic Transactional Algorithm
          Mapping Algorithm to Multicore Program
          Experimental Results

  61. Buffers Are Everywhere
      Saxena et al. [TCAD04] projected that as many as 70% of cells could be buffers.
      [Figure: percentage of blocks that are buffers (0 to 100) at the 90nm, 65nm, 45nm, and 32nm technology nodes]

  65. Efficient and Effective Techniques Are Needed
      Most prior research focuses on buffering a single net:
          van Ginneken ISCAS90
          Shi et al. DAC03
      However, how to buffer a whole circuit is the ultimate issue.
      “Budgeting + buffering each net” won’t work, since we do not know the budgeting cost a priori.

  66. Simply Buffering Each Net Optimally is Overkill
      [Figure: circuit with nodes A, B, C, and O; label T_o]
      Minimal delay? No. Budgeting?

  67. Simply Buffering Each Net Optimally is Overkill
      [Figure: circuit with nodes A, B, C, and O; label T_o]
      Budgeting? How?

  70. Our Goal
      Problem: Given a combinational circuit, insert the minimum number of buffers such that the timing constraint is satisfied.
      It is NP-hard [Liu et al. ICCD99].
      So we aim to solve it effectively and efficiently, but not necessarily optimally.

  71. Limited Existing Approaches
      Lagrangian relaxation based [Liu et al. ICCD99, DATE00]
      Path-based [Sze et al. DAC05]

  76. Lagrangian relaxation based [Liu et al. ICCD99, DATE00]
      Objective function: $\alpha \sum_{e \in E} K_e$.
      Objective function after LR: $\sum_{e \in E} (\alpha K_e + \beta_e d_e)$.
      Sensitive to $\alpha$
      How to determine $\alpha$?
      Lagrangian relaxation based: expensive, over-buffering
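      One way to see where the relaxed objective could come from is sketched below. It assumes the timing constraints take the per-edge form $a_i + d_e \le a_j$ for $e = (i,j)$ with multipliers $\beta_e \ge 0$; that form and the symbol $L$ are assumptions made for this sketch, not details taken from Liu et al.

```latex
\begin{align*}
L &= \alpha \sum_{e \in E} K_e
     + \sum_{e=(i,j) \in E} \beta_e \,(a_i + d_e - a_j) \\
  &= \sum_{e \in E} (\alpha K_e + \beta_e d_e)
     + \sum_{v} a_v \Big( \sum_{e \text{ out of } v} \beta_e
                        - \sum_{e \text{ into } v} \beta_e \Big)
\end{align*}
```

      If the multipliers satisfy flow conservation at every vertex, the last sum vanishes and only $\sum_{e \in E} (\alpha K_e + \beta_e d_e)$ remains. A bound of the form $a_t - a_s \le REQ$ can be treated as one more edge from $t$ to $s$ with delay $-REQ$, which contributes a term that does not depend on the buffering. The weight $\alpha$ still trades buffer count against delay, which is consistent with the sensitivity noted on the slide.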

  77. Path based [Sze et al. DAC05]
      Select a set of critical paths
      How to determine the number of critical paths?
      Performance compared with Lagrangian relaxation based?

  78. Problem formulation
      [Figure: circuit graph with primary inputs (PI), primary outputs (PO), source s, and sink t]
      $E = E_P \cup E_F$

  79. Problem formulation

      $$\text{Minimize} \quad \sum_{(i,j) \in E_P} K_{ij} \tag{5}$$
      $$\text{s.t.} \quad a_i + d_{ij} \le a_j \quad \forall (i,j) \in E \tag{6}$$
      $$a_t - a_s \le REQ \tag{7}$$

      where $K_{ij}$ is the number of buffers on edge $(i,j)$, $a_i$ is the arrival time at vertex $i$, $d_{ij}$ is the delay of edge $(i,j)$, and $REQ$ is the timing constraint.
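      The constraints above can be checked for one fixed buffering with a longest-path pass over the circuit graph. The following is a minimal sketch, not the algorithm from the slides: the function names, the dictionary-based graph, and the example delays are all hypothetical, and each delay d_ij is assumed to be already known for the chosen number of buffers K_ij.

```python
from collections import defaultdict, deque

def arrival_times(edges, source):
    """Smallest arrival times satisfying a_i + d_ij <= a_j on a DAG.

    edges maps (i, j) -> d_ij (hypothetical representation).
    """
    succ = defaultdict(list)
    indeg = defaultdict(int)
    nodes = {source}
    for (i, j), d in edges.items():
        succ[i].append((j, d))
        indeg[j] += 1
        nodes.update((i, j))
    a = {v: float("-inf") for v in nodes}
    a[source] = 0.0
    # Process vertices in topological order (Kahn's algorithm).
    queue = deque(v for v in nodes if indeg[v] == 0)
    while queue:
        i = queue.popleft()
        for j, d in succ[i]:
            a[j] = max(a[j], a[i] + d)  # tightest a_j allowed by (6)
            indeg[j] -= 1
            if indeg[j] == 0:
                queue.append(j)
    return a

def check_buffering(edges, buffers, source, sink, req):
    """Return (constraint (7) holds, total buffer count as in objective (5))."""
    a = arrival_times(edges, source)
    return a[sink] - a[source] <= req, sum(buffers.values())

# Hypothetical example: delays already reflect the chosen buffering.
edges = {("s", "A"): 0.0, ("A", "B"): 3.0, ("A", "C"): 1.0,
         ("C", "B"): 1.5, ("B", "t"): 0.0}
buffers = {("A", "B"): 1, ("A", "C"): 0, ("C", "B"): 0}
print(check_buffering(edges, buffers, "s", "t", req=3.0))
```

      This only evaluates a candidate solution. Choosing the K_ij values that minimize (5) is the hard part, since each d_ij changes with the number of buffers placed on that edge.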

  80. Difficulty in buffering
      It’s a discrete optimization problem
