BonnPlace: A Self-Stabilizing Placement Framework
Ulrich Brenner, Anna Hermann, Nils Hoppmann, Philipp Ochsendorf
Research Institute for Discrete Mathematics, University of Bonn
ISPD 2015
Placement Problem
Instance: Placement area A, blockages B, cells C.
Task: Compute a placement of cells C into A respecting given constraints and optimizing given objectives.
Constraints:
◮ Overlap-free
◮ Respect movebounds
◮ Respect placement area A
◮ Placement in rows
Objectives:
◮ Net length minimization
◮ Timing optimization
◮ Low power consumption
◮ Routability
◮ Low manufacturing costs
Analytical Placement
Minimize analytical global objective function (ignoring most constraints).
Figure: A placement minimizing quadratic netlength on Franziska (633 666 cells, 22 nm).
Task: Work towards an overlap-free placement.
Two ideas: Partitioning-based and force-directed placement.
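As a rough illustration of what minimizing quadratic netlength means, the sketch below solves a one-dimensional toy instance. The function name `quadratic_placement_1d`, the tuple layout of the nets, and the use of Gauss-Seidel sweeps are assumptions made for this example; none of them is taken from BonnPlace.

```python
def quadratic_placement_1d(num_cells, cell_nets, pin_nets, sweeps=200):
    """Approximately minimize the sum of w * (distance)^2 over two-pin nets.

    cell_nets: (i, j, w) tuples connecting movable cells i and j.
    pin_nets:  (i, p, w) tuples connecting movable cell i to a fixed pin at p.
    Both layouts are hypothetical and chosen only for this sketch.
    """
    x = [0.0] * num_cells
    for _ in range(sweeps):
        for i in range(num_cells):
            weighted_sum = 0.0
            total_weight = 0.0
            for a, b, w in cell_nets:
                if a == i:
                    weighted_sum += w * x[b]
                    total_weight += w
                elif b == i:
                    weighted_sum += w * x[a]
                    total_weight += w
            for c, p, w in pin_nets:
                if c == i:
                    weighted_sum += w * p
                    total_weight += w
            if total_weight > 0.0:
                # Zero derivative w.r.t. x[i]: weighted mean of all neighbours.
                x[i] = weighted_sum / total_weight
    return x


# Chain: pin at 0.0, cell 0, cell 1, pin at 3.0, all nets with unit weight.
positions = quadratic_placement_1d(
    num_cells=2,
    cell_nets=[(0, 1, 1.0)],
    pin_nets=[(0, 0.0, 1.0), (1, 3.0, 1.0)],
)
```

In this toy chain the two cells converge towards 1.0 and 2.0. The objective contains no overlap term at all, which is why an analytical solution on a real netlist still has to be spread afterwards.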
Partitioning-Based Placement
Idea:
◮ Partition chip area recursively into regions.
◮ Assign cells to regions they fit into.
◮ Advantages:
  ◮ Very effective and efficient.
  ◮ Many different constraints can be considered accurately (e.g. bounds on density, blockages, movebounds etc.).
◮ Drawbacks:
  ◮ Lack of stability.
  ◮ Hard to reflect standard objective functions during partitioning (e.g. wirelength).
Levels of Partitioning-Based BonnPlace
Figure: Assignments to regions in levels 1 to 3 (upper row) and 4 to 6 (lower row) on Franziska.
Force-Directed Placement
Idea:
◮ Pull cells apart from each other in small steps.
◮ Integrate forces into objective function.
◮ Advantages:
  ◮ Very stable.
  ◮ Overall objective function is always considered.
  ◮ Produces very good results in practice.
◮ Drawbacks:
  ◮ Exact observance of density constraints is difficult.
  ◮ Placement decisions in a fragmented chip area can be arbitrary.
  ◮ Complex objective functions hard to model (e.g. congestion and timing).
  ◮ Significant effort in the legalization may be necessary.
Previous Work
Partitioning-Based Placers:
◮ Grid Warping [Xiu, Rutenbar ’07]:
  ◮ Minimize density violations in non-uniform grid and scale to regular bins.
◮ Partitioning-Based BonnPlace [Struzyna ’13]:
  ◮ Compute cell assignments using flow-based partitioning.
Force-Directed Placers:
◮ NTUPlace4 [Hsu, Chou, Lin, Chang ’11]:
  ◮ Penalize violations of locally smoothed density functions.
◮ SimPL [Kim, Lee, Markov ’12] (incl. SimPLR, Ripple, ComPLx, Maple):
  ◮ Run rough but fast legalization.
  ◮ Pull cells towards legalized positions.
  ◮ Iterate with new analytical placement.
◮ ePlace [Lu, Chen, Chang, Sha, Huang, Teng, Cheng ’14]:
  ◮ Translate density violations to potential energy of an electrostatic system.
Our Approach: Self-Stabilizing BonnPlace
Idea: Integrate a partitioning-based algorithm into a force-directed framework.
◮ Compute forces based on legal partitioning-based placement.
◮ Each iteration produces a competitive placement.
  ⇒ Timing and congestion evaluation possible.
◮ The placements in subsequent iterations are similar.
  ⇒ Transferred information on timing and routability is trustworthy.
◮ Single iteration quite time-consuming.
  ⇒ Only small numbers of iterations affordable.
◮ Incorporate position-based (un-)clustering scheme.
Basic Algorithm
Algorithm: Self-Stabilizing BonnPlace
Input: cells C
Output: positions pos(c) for all cells c ∈ C
iter ← 0
while not BreakCondition(pos, iter) do
    foreach c ∈ C do
        Connect c to a new pin at position pos(c) via a virtual net of weight 0.01 · iter
    Partitioning-based GlobalPlacement with position-based (un-)clustering
    Legalization
    foreach c ∈ C do
        Store current location in pos(c)
    iter ← iter + 1
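The loop structure above can be mimicked on a one-dimensional toy problem. This is only a sketch under strong assumptions: `toy_global_placement` and `toy_legalization` are hypothetical stand-ins for the partitioning-based GlobalPlacement and the Legalization step, each cell is tied to a single fixed target pin, and a fixed iteration count replaces BreakCondition. Only the anchor weight 0.01 · iter comes from the algorithm itself.

```python
def toy_global_placement(targets, anchors, anchor_weight):
    """Stand-in for GlobalPlacement (hypothetical, not the BonnPlace step).

    Each cell minimizes (x - target)^2 + anchor_weight * (x - anchor)^2,
    i.e. one unit-weight net to its target pin plus one virtual anchor net.
    """
    return [(t + anchor_weight * a) / (1.0 + anchor_weight)
            for t, a in zip(targets, anchors)]


def toy_legalization(xs):
    """Stand-in for Legalization: give cells distinct integer slots in order."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    legal = [0.0] * len(xs)
    for slot, cell in enumerate(order):
        legal[cell] = float(slot)
    return legal


def self_stabilizing_loop(targets, max_iter=10):
    pos = [0.0] * len(targets)
    history = []
    iteration = 0
    while iteration < max_iter:  # assumption: fixed count as BreakCondition
        # Virtual nets to the previous legal positions, weight 0.01 * iter.
        anchor_weight = 0.01 * iteration
        global_pos = toy_global_placement(targets, pos, anchor_weight)
        pos = toy_legalization(global_pos)
        history.append((global_pos, list(pos)))
        iteration += 1
    return pos, history


final_pos, history = self_stabilizing_loop([2.0, 2.1, 0.5])
```

In iteration 0 the anchor weight is zero, so the first global placement ignores the initial positions. In later iterations the growing weight pulls each cell towards its last legal slot, so the gap between the global and the legal placement shrinks.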
Forces in Self-Stabilizing BonnPlace
Figure panels: Iteration 0: Quadratic Placement; Iteration 0: Legal Placement; Iteration 0: Forces; Iteration 1: Quadratic Placement.
Figure: The impact of forces on the first iteration on Beate (41 287 cells, 22 nm).
Cell Spreading during Iterations (1)
Figure panels: Iteration 0, Iteration 1, Iteration 2, Iteration 9.
Figure: Cell spreading after Global QP on Beate.
Cell Spreading during Iterations (2)
Figure panels: Iteration 0, Iteration 1, Iteration 2, Iteration 9.
Figure: Cell spreading after Global QP on Renaud (324 595 cells, 45 nm).
Self-Stabilizing Behavior
Figure: Bounding box net length development of QP (red) and legal placement (blue) on Meinolf (392 920 cells, 22 nm). x-axis: iterations 0 to 9; y-axis: net length in m, ticks from 2.0 to 9.0.
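For reference, bounding box net length sums, over all nets, the half perimeter of the smallest axis-parallel rectangle containing the net's pins. The sketch below assumes that every pin sits at its cell's position and that nets are plain lists of cell names; both are simplifications for illustration, and the function name is hypothetical.

```python
def bounding_box_net_length(nets, pos):
    """Sum of half-perimeter bounding box lengths over all nets.

    nets: list of nets, each a list of cell names (assumed layout).
    pos:  dict mapping a cell name to its (x, y) position.
    """
    total = 0.0
    for net in nets:
        xs = [pos[cell][0] for cell in net]
        ys = [pos[cell][1] for cell in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


example_pos = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (1.0, 1.0)}
example_nets = [["a", "b"], ["a", "b", "c"]]
example_length = bounding_box_net_length(example_nets, example_pos)
```

Evaluating this once on the QP result and once on the legalized placement in each iteration would give two curves of the kind shown in the figure.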
Clustering in Self-Stabilizing BonnPlace
◮ Perform partitioning-based GlobalPlacement on clustered netlist.
◮ Compute a new clustering in each iteration of the loop.
◮ Dissolve clusters during the levels of GlobalPlacement.
◮ Methods for both clustering and unclustering are position-based.
Figure: Levels of GlobalPlacement in a single iteration with coarse clustering on Beate.
Position-based BestChoice Clustering
Overview of BestChoice clustering algorithm:
◮ Iteratively unite neighboring clusters u and v maximizing a certain clustering score d(u, v) ∈ ℝ₊.
◮ Stop clustering when a given target ratio α < 1 of cells is reached.
Connectivity-based clustering score [Alpert ’05]:
$$d_c(u, v) = \frac{1}{a(u) + a(v)} \cdot \sum_{N \,:\, u, v \in N} \frac{w_N}{|N|}$$
BonnPlace position-based clustering score:
$$d_p(u, v) = \begin{cases} \dfrac{d_c(u, v)}{\mathrm{BB}(u, v) + s} & \text{if } \mathrm{BB}(u, v) \le h \\ 0 & \text{otherwise} \end{cases}$$
where
  a(u): total area of cluster u
  w_N: weight of net N
  |N|: number of cells in net N
  BB(u, v): half perimeter of the bounding box of all cells in u or v
  s, h: chip-dependent constants
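The two scores can be written down directly from their definitions. In the sketch below the data layout (nets as pairs of a cluster-name set and a weight, clusters as lists of member positions) and all function names are assumptions made for illustration; the values chosen for s and h are arbitrary, since the slides only state that they are chip-dependent.

```python
def connectivity_score(u, v, nets, area):
    """d_c(u, v): sum of w_N / |N| over nets containing u and v,
    divided by the combined cluster area a(u) + a(v)."""
    shared = sum(w / len(members) for members, w in nets
                 if u in members and v in members)
    return shared / (area[u] + area[v])


def half_perimeter(points):
    """BB: half perimeter of the bounding box of the given points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def position_score(u, v, nets, area, members_pos, s, h):
    """d_p(u, v): d_c damped by the joint bounding box, zero beyond h."""
    bb = half_perimeter(members_pos[u] + members_pos[v])
    if bb > h:
        return 0.0
    return connectivity_score(u, v, nets, area) / (bb + s)


# Hypothetical toy data: two nets, three unit-area clusters.
nets = [({"u", "v"}, 1.0), ({"u", "v", "w"}, 3.0)]
area = {"u": 1.0, "v": 1.0, "w": 1.0}
members_pos = {"u": [(0.0, 0.0)], "v": [(1.0, 1.0)], "w": [(9.0, 9.0)]}

d_c = connectivity_score("u", "v", nets, area)
d_p_near = position_score("u", "v", nets, area, members_pos, s=1.0, h=5.0)
d_p_far = position_score("u", "w", nets, area, members_pos, s=1.0, h=5.0)
```

Here u and v share both nets and lie close together, so they keep a positive score, while u and w are connected but too far apart (BB = 18 > h) and score zero, so they would not be united.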
Position-based Unclustering
Common Idea for Unclustering:
◮ Dissolve a cluster if its size is large w.r.t. window size in the current placement level.
◮ Leads to sharp unclustering in certain levels.
BonnPlace position-based unclustering:
◮ Dissolve clusters with members tending into distinct areas.
◮ Keep cells clustered if their respective optimum positions are close together.
◮ Still dissolve clusters being very large w.r.t. window size.
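One way to read these rules is as a per-cluster test at each placement level. The sketch below is a guess at such a test, not the BonnPlace criterion: the function name, the use of the spread of the members' optimum positions as the measure of "tending into distinct areas", and the thresholds `spread_ratio` and `area_ratio` are all hypothetical.

```python
def should_dissolve(member_optima, cluster_area, window_width, window_height,
                    spread_ratio=0.5, area_ratio=0.25):
    """Decide whether a cluster is dissolved at the current level.

    member_optima: (x, y) optimum positions of the cluster members.
    spread_ratio, area_ratio: hypothetical thresholds for this sketch.
    """
    xs = [p[0] for p in member_optima]
    ys = [p[1] for p in member_optima]
    # Members tending into distinct areas: optima far apart w.r.t. window.
    too_spread = (max(xs) - min(xs) > spread_ratio * window_width or
                  max(ys) - min(ys) > spread_ratio * window_height)
    # Classic rule kept as a fallback: cluster very large w.r.t. window.
    too_large = cluster_area > area_ratio * window_width * window_height
    return too_spread or too_large


keep = should_dissolve([(1.0, 1.0), (1.5, 1.2)], 2.0, 10.0, 10.0)
split = should_dissolve([(1.0, 1.0), (9.0, 1.0)], 2.0, 10.0, 10.0)
oversized = should_dissolve([(1.0, 1.0), (1.5, 1.2)], 30.0, 10.0, 10.0)
```

With these thresholds the first cluster stays together, the second is dissolved because its members want to go to opposite sides of the window, and the third is dissolved purely because of its area.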
Complete Algorithm: Timing- and Congestion-Driven
Algorithm: Self-Stabilizing BonnPlace
Input: cells C
Output: positions pos(c) for all cells c ∈ C
iter ← 0
while not BreakCondition(pos, iter) do
    foreach c ∈ C do
        Connect c to a new pin at position pos(c) via a virtual net of weight 0.01 · iter
    Partitioning-based GlobalPlacement with position-based (un-)clustering
    Legalization
    if timing optimization enabled then
        TimingOptimization
    if routability driven placement enabled then
        CongestionAvoidance
    foreach c ∈ C do
        Store current location in pos(c)
    iter ← iter + 1
Timing-Driven BonnPlace
◮ Apply timing optimization steps at the end of each iteration.
Local optimization:
◮ LayerAssignment: Assign critical nets to higher routing layers.
◮ RefinePlace: Locally move cells to straighten timing-critical paths. [Bock, Held, Kämmerling, Schorr: DAC’15]
Global optimization:
◮ Increase net weights on remaining critical paths.
◮ Increase force weights of cells moved by RefinePlace.
Slack Distribution during Iterations
Figure panels: Iteration 0, Iteration 1, Iteration 2, Iteration 9 (color scale: Slack [ps]).
Figure: A legalized placement on Ida (20 617 cells, 22 nm) after selected iterations with cells colored by their slack.
Congestion-Driven BonnPlace
◮ Run a simplified version of BonnRouteGlobal as a congestion estimation at the end of each iteration. [Ahrens, Gester, Klewinghaus, Müller, Peyer, Schulte, Téllez ’15]
◮ Inflate cells in routing-critical areas (for next iteration).
◮ Force router to be quite pessimistic, and forbid larger detours of nets.
◮ Dynamic adaptation of the target congestion:
  ◮ On very congestion-critical chips, the goal is to reduce the congestion to 100 % everywhere.
  ◮ On less critical chips, the target congestion is reduced stepwise to 90 %.
  ◮ Less congestion on uncritical chips helps to reduce routing detours.
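The interplay of target adaptation and cell inflation might look roughly like the sketch below. Only the 100 % and 90 % targets come from the slide; the criticality threshold, the step size, the inflation cap, the per-bin congestion model and both function names are hypothetical choices for illustration.

```python
def next_target_congestion(current_target, peak_congestion,
                           critical_threshold=1.2, step=0.02, floor=0.90):
    """Adapt the target congestion for the next iteration.

    critical_threshold and step are hypothetical; the slide only states the
    targets 100 % (very critical chips) and 90 % (less critical chips).
    """
    if peak_congestion > critical_threshold:
        return 1.0  # very congestion-critical: aim for 100 % everywhere
    return max(floor, current_target - step)  # stepwise reduction to 90 %


def inflate_cells(cell_width, cell_bin, bin_congestion, target,
                  max_factor=1.5):
    """Widen cells lying in bins whose congestion exceeds the target."""
    inflated = {}
    for cell, width in cell_width.items():
        ratio = bin_congestion[cell_bin[cell]] / target
        factor = min(max_factor, max(1.0, ratio))  # never shrink a cell
        inflated[cell] = width * factor
    return inflated


widths = inflate_cells(
    cell_width={"a": 2.0, "b": 2.0},
    cell_bin={"a": 0, "b": 1},
    bin_congestion=[1.2, 0.5],
    target=1.0,
)
```

Cell a sits in a bin at 120 % congestion and is widened accordingly for the next iteration, while cell b in an uncongested bin keeps its width.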
Congestion during Iterations (1)
Figure: Congestion estimation on Renaud in iterations 0 to 2 (upper row) and 3 to 5 (lower row).
Congestion during Iterations (2)
Figure: Internal pessimistic congestion estimation on superblue9 in iterations 0 and 1 (left, upper row) and iterations 2 and 3 (left, lower row); accurate estimation after placement (right).
Self-Stabilizing Behavior
Figure: Linear movement [m] between consecutive iterations (0 → 1 through 8 → 9) on the 22 nm designs Ida (20 617 cells), Leo (31 590 cells), Antonio (103 795 cells) and Benedikt (370 210 cells). All runs with timing optimization. Note the logarithmic scaling (y-axis ticks from 0.1 to 100).