Clock Skew Scheduling A Fast and Effective Approach Ankur Sharma, - PowerPoint PPT Presentation

Lagrangian Relaxation Based Gate Sizing with Clock Skew Scheduling – A Fast and Effective Approach
Ankur Sharma, David Chinnery (Mentor, a Siemens Business)
Chris Chu (Iowa State University, Computer Engineering)

Outline
◼ Motivation
  – Previous work
  – Contribution
◼ Problem statement
◼ Previous approach
◼ Our proposed approach
◼ Experimental results
◼ Conclusion

Motivation
◼ Gate sizing is a key circuit optimization technique.
  – Can trade off area, delay, and power
  – Delay-constrained leakage power minimization
◼ Skewing the clock arrival allows time borrowing between sequential stages. This is known as useful skew.
◼ Time borrowing can be used for:
  – Increasing performance or satisfying delay constraints
  – Gaining timing slack to reduce area or power

Simultaneous Gate Sizing with Skew Scheduling
[Figure: three flip-flops A, B and C joined by two combinational paths labeled Delay = 16 and Delay = 24. Clock period T = 20. Clock arrival times: a_clk,A = 0, 20, 40, … (skew_A = 0); a_clk,B = 4, 24, 44, … (skew_B = 4); a_clk,C = 0, 20, 40, … (skew_C = 0).]
◼ A signal is required to travel from one flip-flop to the next within one clock cycle.
◼ Clock skew alters the required and arrival times.
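The time-borrowing arithmetic in this example can be checked with a short sketch. The helper `setup_slack` is hypothetical, the setup time is taken as zero, and the assignment of the two delays to paths is an assumption: the 24-unit path is taken to end at flip-flop B and the 16-unit path to start there, since that is the assignment under which skew_B = 4 meets timing.

```python
def setup_slack(period, path_delay, skew_launch, skew_capture, setup=0):
    """Setup slack of one path: data launched at skew_launch must arrive
    before the next capture edge at period + skew_capture (minus setup)."""
    required = period + skew_capture - setup
    arrival = skew_launch + path_delay
    return required - arrival

T = 20
# Assumption (not stated in the extracted slide text): A -> B has delay 24,
# B -> C has delay 16.
DELAYS = {("A", "B"): 24, ("B", "C"): 16}

def slacks(skew):
    """Setup slack of every path for a given skew assignment."""
    return {
        (src, dst): setup_slack(T, delay, skew[src], skew[dst])
        for (src, dst), delay in DELAYS.items()
    }

zero_skew = slacks({"A": 0, "B": 0, "C": 0})
useful_skew = slacks({"A": 0, "B": 4, "C": 0})
print(zero_skew)    # A -> B violates timing without skew
print(useful_skew)  # delaying the clock at B lets A -> B borrow time from B -> C
```

Under these assumptions, the zero-skew schedule leaves the longer path with negative slack and the shorter path with the same amount of spare slack; skewing B by 4 moves that spare time to where it is needed.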

Previous Work
◼ [Chuang’95] formulated the primal problem as a linear program.
  – Piecewise-linear approximation of convex delays
◼ [Roy’07] formulated a Lagrangian dual problem (LDP) and solved the Lagrangian sub-problem simultaneously over size and skew.
  – Assumed continuous sizes and convex delays
◼ [Wang’09] transformed the primal problem to eliminate skew variables, then formulated an LDP and maximized the dual.
  – Used a network flow solver to update Lagrange multipliers
  – Optimal for continuous sizes and convex delays
◼ [Shklover’12] formulated an LDP with discrete sizes and skews.
  – Focus on clock tree optimization via dynamic programming

Our Contributions
◼ Integration of a clock skew scheduler inside an LR gate sizer (EGSS).
  – Our LR formulation preserves the acyclic structure of the timing graph.
  – A modified Lagrange multiplier update that accounts for skew
  – A new strategy for solving the Lagrangian sub-problem with skew variables
◼ For comparison, we extended the dual maximization strategy from [Wang’09] to apply to discrete sizes and non-convex delays (NetFlow).
◼ We identify and empirically demonstrate several limitations of realizing primal optimality via dual maximization.

[Wang’09] J. Wang, D. Das, and H. Zhou. Gate sizing by Lagrangian relaxation revisited. IEEE TCAD 28(7):1071–1084, 2009.

Primal Problem Formulation
Minimize total leakage power:
  $\min_{\mathbf{x},\,\mathbf{a},\,\mathbf{w}} \; p(\mathbf{x}, \mathbf{w})$
subject to
Timing constraints:
  $a_i + d_{ij}(\mathbf{x}) \le a_j, \quad \forall (i,j) \in E$
  $a_{d_k} \le T - \mathrm{setup}_k + w_k, \quad \forall k \in FF$
  $w_k + d_{clk,q_k} \le a_{q_k}, \quad \forall k \in FF$
Skew bounds:
  $w_{\min} \le w_k \le w_{\max}, \quad \forall k \in FF$
where
  T: target clock period
  x: cell sizes
  a_i: arrival time at node i
  (i, j): timing arc from node i to node j
  E: set of all timing arcs
  d_ij: delay of timing arc from node i to node j
  w_k: skew at flip-flop k
  FF: set of flip-flops

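Timing Graph
◼ Graphical representation of timing constraints
Timing constraints:
  $a_i + d_{ij}(\mathbf{x}) \le a_j$
  $a_{d_k} + \mathrm{setup}_k - w_k \le T$
  $w_k + d_{clk,q_k} \le a_{q_k}$
[Figure: a circuit next to its timing graph. A timing arc from node i to node j carries weight d_ij. Flip-flop k has pins D_k, Q_k and Clk_k, with arrival times a_{d_k} and a_{q_k} at its D and Q nodes; the arcs around it are labeled setup_k, d_{clk,q_k}, w_k and −w_k. The clock node and the dummy nodes are marked, with a_I = 0 and a_O = T.]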

NetFlow – Skew Elimination
◼ Due to [Wang’09]. We refer to it as NetFlow.
Constraints with skew variables (O and I are dummy nodes, with a_I = 0 and a_O = T):
  $a_{d_k} + \mathrm{setup}_k - w_k \le T$
  $w_k + d_{clk,q_k} \le a_{q_k}$
  $w_{\min} \le w_k \le w_{\max}$
Each skew is therefore bounded on both sides:
  $a_{d_k} + \mathrm{setup}_k - T \le w_k \le a_{q_k} - d_{clk,q_k}$
  $w_{\min} \le w_k \le w_{\max}$
Eliminating w_k leaves no skews, but there are loops in the timing graph:
  $a_{d_k} + \mathrm{setup}_k - T \le w_{\max}$
  $w_{\min} \le a_{q_k} - d_{clk,q_k}$
  $a_{d_k} + \mathrm{setup}_k - T \le a_{q_k} - d_{clk,q_k}$
[Figure: timing graph around flip-flop k before and after skew elimination. Before: arcs labeled setup_k, d_{clk,q_k}, w_k and −w_k. After: arcs labeled setup_k, d_{clk,q_k}, −T, −w_max and w_min, including a new arc between the D and Q nodes; a_I = 0 and a_O = T.]
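The equivalence behind skew elimination can be sketched in a few lines: a feasible skew w_k exists exactly when the three skew-free constraints hold (assuming w_min ≤ w_max). The function names and the numeric values are hypothetical and only illustrate the algebra.

```python
def skew_interval(a_d, a_q, setup, d_clk_q, T, w_min, w_max):
    """Range of skews w_k allowed by the timing constraints and skew bounds.
    The range is empty when the lower end exceeds the upper end."""
    lo = max(w_min, a_d + setup - T)
    hi = min(w_max, a_q - d_clk_q)
    return lo, hi

def eliminated_constraints_hold(a_d, a_q, setup, d_clk_q, T, w_min, w_max):
    """The three constraints that remain once w_k has been eliminated."""
    return (
        a_d + setup - T <= w_max
        and w_min <= a_q - d_clk_q
        and a_d + setup - T <= a_q - d_clk_q
    )

# Hypothetical flip-flop parameters, for illustration only.
params = dict(setup=1, d_clk_q=2, T=20, w_min=-5, w_max=5)

feasible = skew_interval(a_d=22, a_q=6, **params)
infeasible = skew_interval(a_d=22, a_q=4, **params)
print(feasible, eliminated_constraints_hold(a_d=22, a_q=6, **params))
print(infeasible, eliminated_constraints_hold(a_d=22, a_q=4, **params))
```

Once sizing has fixed the arrival times, any value inside a non-empty interval is a valid skew for that flip-flop, which is how skews can be deduced implicitly after the fact.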

NetFlow – Lagrangian Relaxation Formulation
Primal problem:
  $\min_{\mathbf{x},\,\mathbf{a}} \; p(\mathbf{x})$
subject to
  $a_i + d_{ij}(\mathbf{x}) \le a_j, \quad \forall (i,j) \in E$
  $a_{d_k} + \mathrm{setup}_k - T \le w_{\max}, \quad \forall k \in FF$
  $w_{\min} \le a_{q_k} - d_{clk,q_k}, \quad \forall k \in FF$
  $a_{d_k} + \mathrm{setup}_k - T \le a_{q_k} - d_{clk,q_k}, \quad \forall k \in FF$
  $x_g \in X_g, \quad \forall g \in G$
Lagrangian function:
  $L_{\boldsymbol{\lambda}}(\mathbf{x}) = p(\mathbf{x}) + \sum_{(i,j) \in E} \lambda_{ij} \times \mathrm{cost}_{ij}(\mathbf{x})$
Lagrangian relaxation sub-problem (LRS_λ):
  $g(\boldsymbol{\lambda}) = \min_{\mathbf{x}} L_{\boldsymbol{\lambda}}(\mathbf{x})$
Lagrangian dual problem (LDP):
  $\max_{\boldsymbol{\lambda} \ge \mathbf{0}} \; g(\boldsymbol{\lambda})$
subject to flow conservation:
  $\boldsymbol{\lambda} \in \Omega = \Bigl\{ \boldsymbol{\lambda} : \sum_{i \mid (i,u) \in E} \lambda_{iu} = \sum_{i \mid (u,i) \in E} \lambda_{ui}, \; \forall u \in N \Bigr\}$
where
  cost_ij is the cost of arc (i, j), i.e. d_ij, setup_k, etc.
  λ_ij is the Lagrange multiplier for timing arc (i, j).
  N is the set of all nodes in the timing graph.
A network flow solver is used to update λ.
[Figure: timing graph after skew elimination, with arcs labeled d_ij, setup_k, d_{clk,q_k}, −T, −w_max and w_min; a_I = 0 and a_O = T.]
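A short derivation, given here as a sketch rather than as part of the slides, shows why the Lagrangian above contains only arc costs and no arrival times. It assumes every relaxed constraint is written in the arc form a_i + cost_ij(x) ≤ a_j and regroups the arrival-time terms by node:

```latex
\begin{align*}
\sum_{(i,j)\in E} \lambda_{ij}\,\bigl(a_i + \mathrm{cost}_{ij}(\mathbf{x}) - a_j\bigr)
  &= \sum_{(i,j)\in E} \lambda_{ij}\,\mathrm{cost}_{ij}(\mathbf{x})
   + \sum_{u\in N} a_u \Bigl(\sum_{j \mid (u,j)\in E} \lambda_{uj}
                            - \sum_{i \mid (i,u)\in E} \lambda_{iu}\Bigr) \\
  &= \sum_{(i,j)\in E} \lambda_{ij}\,\mathrm{cost}_{ij}(\mathbf{x})
  \qquad \text{when } \boldsymbol{\lambda}\in\Omega .
\end{align*}
```

Each a_u is multiplied by the outgoing minus the incoming multiplier sum at node u, which flow conservation sets to zero, so the sub-problem can be solved over the sizes x alone.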

NetFlow – Dual Maximization
Lagrangian dual problem (LDP):
  $\max_{\boldsymbol{\lambda} \ge \mathbf{0}} \; g(\boldsymbol{\lambda})$ subject to flow conservation constraints on λ
LRS_λ:
  $g(\boldsymbol{\lambda}) = \min_{\mathbf{x}} \; p(\mathbf{x}) + \sum_{(i,j) \in E} \lambda_{ij} \times \mathrm{cost}_{ij}(\mathbf{x})$
Iteratively,
◼ Update λ for given x, subject to flow constraints.
  – Formulated as a min-cost network flow problem. Expensive in runtime.
◼ Update x for given λ.
  – Heuristically solve LRS, a discrete combinatorial optimization problem.
The focus is dual maximization rather than primal feasibility.

NetFlow – Visualizing Dual Maximization
For a single-gate circuit:
  $L_{\lambda}(x) = p(x) + \lambda \times (d(x) - T)$
Equation of the line on the p(x) vs. d(x) − T plane:
  $p(x) = -\lambda_1 \times (d(x) - T) + L_{\lambda_1}(x)$
◼ The slope is −λ.
◼ L_λ(x) is the intercept on the p(x) axis.
◼ To solve LRS_λ, push the line as low as possible while x ∈ X.
◼ Constraint violation, d(x_1) > T ⇒ increase λ.
◼ Updating λ rotates the line around x_1.
◼ At λ_0 = 0, g(0) = p_min.
◼ At λ* = λ_3, g(λ*) = p*.
[Figure: the set X of sizing solutions plotted as p(x) against d(x) − T, with lines of slope −λ for λ_0 = 0, λ_1, λ_2 and λ* = λ_3; the primal feasible side of the d(x) − T axis is marked.]
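The single-gate picture can be reproduced numerically. The sketch below uses hypothetical (power, delay) values for four sizes and solves LRS_λ by plain enumeration; the fixed-step increase of λ is a simplification chosen for illustration, not the network flow update of NetFlow.

```python
T = 10
# Hypothetical size options for one gate: name -> (power, delay).
SIZES = {"x1": (1, 14), "x2": (2, 12), "x3": (4, 9), "x4": (7, 8)}

def lagrangian(power, delay, lam, T):
    """L_lambda(x) = p(x) + lambda * (d(x) - T): the intercept of the line
    with slope -lambda that passes through the point for size x."""
    return power + lam * (delay - T)

def solve_lrs(sizes, lam, T):
    """Return (g(lambda), minimizing size) by enumerating the discrete set X.
    Ties are broken by size name."""
    return min((lagrangian(p, d, lam, T), name) for name, (p, d) in sizes.items())

# Start at lambda = 0 (g(0) = p_min) and raise lambda while timing is violated.
lam = 0.0
g, best = solve_lrs(SIZES, lam, T)
while SIZES[best][1] > T:
    lam += 0.5  # assumed fixed step, for illustration only
    g, best = solve_lrs(SIZES, lam, T)

print(lam, g, best)
```

With these numbers the minimum-power size x1 is selected at λ = 0 and violates timing; raising λ steepens the line until the lowest intercept is reached at a size that meets the delay target.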

NetFlow: Dual Maximization Limitations with Discrete Sizes
◼ Duality gap: the dual optimum may not be equal to the primal optimum, g* < p*.
◼ Primal feasibility: at the dual optimum, multiple sizing solutions are possible, and some do not satisfy the timing constraints.
  – The dual optimal g(λ_3) is realized at x_3 as well as x_4, but only x_4 is primal feasible.
◼ Dual optimality is not guaranteed, as the LRS solver is no longer optimal.
[Figure: p(x) plotted against d(x) − T. Each dot denotes a distinct sizing solution (x_1 to x_5 and the primal optimum x*). Lines are drawn for λ = 0, λ = λ_2, λ = λ_3 and λ = λ_4; p_min, the dual optimal g* and the primal optimal p* are marked on the p(x) axis.]
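A three-point example, with hypothetical numbers, makes the first two limitations concrete: the best dual value sits strictly below the best feasible power, and the dual optimum is attained by both a feasible and an infeasible size. The multiplier grid search is an assumption made to keep the sketch short.

```python
T = 10
# Hypothetical sizing solutions: name -> (power, delay).
SIZES = {"x_a": (2, 14), "x_b": (10, 6), "x_c": (9, 8)}

def dual(lam):
    """g(lambda) = min over sizes of p(x) + lambda * (d(x) - T)."""
    return min(p + lam * (d - T) for p, d in SIZES.values())

def minimizers(lam):
    """All sizes that attain g(lambda)."""
    g = dual(lam)
    return sorted(n for n, (p, d) in SIZES.items() if p + lam * (d - T) == g)

# Coarse grid over lambda in [0, 3]; quarter steps are exact in floating point.
grid = [k / 4 for k in range(13)]
lam_star = max(grid, key=dual)
g_star = dual(lam_star)

# Primal optimum: cheapest size that meets the delay target.
p_star = min(p for p, d in SIZES.values() if d <= T)

print(lam_star, g_star, p_star, minimizers(lam_star))
```

Here x_c is the primal optimum but lies above the line through x_a and x_b, so no multiplier ever selects it; at the dual optimum the sub-problem is indifferent between x_a, which violates timing, and x_b, which does not.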

NetFlow: Dual Maximization Limitations with Discrete Sizes
◼ Three profiles are shown:
  – Primal cost (blue dash)
  – Dual cost (blue dash-dot)
  – Total negative slack (TNS) (red solid)
◼ Dual cost is less than primal cost.
  – The gap is roughly 20% wide; it may partly be due to the duality gap.
◼ TNS does not converge to zero.
  – Oscillations prevent convergence.
◼ Due to discreteness and non-convexity, dual maximization does not guarantee primal feasibility.

Effective Gate Sizer and Skew Scheduler (EGSS)
◼ Seamlessly integrates with a state-of-the-art discrete LR gate sizer
◼ Re-uses the LRS solver from the discrete LR gate sizer
  – Focuses on primal feasibility rather than exact computation of the dual function
  – Extends the LRS solver to iteratively size gates and schedule skews
◼ Explicitly updates skews rather than deducing them implicitly
◼ Modifies and applies a projection-based Lagrange multiplier update
  – Compared to the multiplier update based on a min-cost flow solver:
    – Linear runtime complexity, more than an order of magnitude faster
    – Much better convergence
  – Requires the timing graph to be loop-free
