Speaker: Jianchao Lu
Jianchao Lu, Xiaomi Mao, Baris Taskin
VLSI Lab, Electrical & Computer Engineering, Drexel University

Outline
• Preliminaries
• Previous Works
• Methodology
• Experimental Results
• Conclusions

Clock Mesh Network
• Consists of a top-level clock tree, mesh grids, and stub wires.

Power Dissipation on Clock Network
• The clock network is a global network of interconnect wires and buffers.
• Clock signal switching introduces a lot of dynamic power dissipation.
• The clock network consumes more than 40% of the total power.
$P = \alpha C V_{DD}^2 f_{clk}$, where $\alpha$ is the switching factor, $C$ is the capacitance, $V_{DD}$ is the supply voltage, and $f_{clk}$ is the clock frequency.
Switching capacitance $= \alpha C_{total}$, with $C_{total} = C_{grid} + C_{stub} + C_{tree}$.
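The power expression on this slide can be evaluated directly. The following is a minimal sketch; the function name and all numeric values are hypothetical and chosen only to illustrate the arithmetic, not taken from the presentation.

```python
def clock_dynamic_power(alpha, c_grid, c_stub, c_tree, vdd, f_clk):
    """Dynamic power P = alpha * C_total * VDD^2 * f_clk, in watts.

    C_total is the sum of the grid, stub, and tree capacitances (farads).
    """
    c_total = c_grid + c_stub + c_tree
    return alpha * c_total * vdd ** 2 * f_clk


# Hypothetical values for illustration only (not from the slides).
p = clock_dynamic_power(
    alpha=1.0,        # switching factor
    c_grid=20e-12,    # mesh grid capacitance, F
    c_stub=5e-12,     # stub wire capacitance, F
    c_tree=15e-12,    # top-level tree capacitance, F
    vdd=1.0,          # supply voltage, V
    f_clk=500e6,      # clock frequency, Hz
)
print(f"{p * 1e3:.1f} mW")
```

Because the power is linear in the total capacitance, any reduction in grid or stub wire capacitance lowers the clock power proportionally, which is the lever the rest of the talk relies on.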

Most Relevant Previous Works
[1] A. Rajaram and D. Pan, "Meshworks: An efficient framework for planning, synthesis and optimization of clock mesh networks," in Asia and South Pacific Design Automation Conference (ASPDAC), Jan. 2008.
[2] M. R. Guthaus, G. Wilke, and R. Reis, "Non-uniform clock mesh optimization with linear programming buffer insertion," in Proceedings of the ACM/IEEE Design Automation Conference (DAC), June 2010.
[3] M. Cho, D. Z. Pan, and R. Puri, "Novel Binary Linear Programming for High Performance Clock Mesh Synthesis," in Proceedings of the IEEE/ACM Int'l Conference on Computer-Aided Design (ICCAD), San Jose, CA, Nov. 2010.

Meshworks [1]
• Identifies the relationship between grid size and total mesh wire length.
• Chooses the optimal grid size based on skew.
• Mesh reduction.
• Modified buffer driver insertion.
[1] A. Rajaram and D. Pan, "Meshworks: An efficient framework for planning, synthesis and optimization of clock mesh networks," in Asia and South Pacific Design Automation Conference (ASPDAC), pp. 250–257, Jan. 2008.

Non-uniform Mesh [2]
[2] M. R. Guthaus, G. Wilke, and R. Reis, "Non-uniform clock mesh optimization with linear programming buffer insertion," in Proceedings of the ACM/IEEE Design Automation Conference (DAC), pp. 74–79, June 2010.

ILP-Based Mesh Synthesis [3]
• Mesh generation and sink assignment algorithms.
[3] M. Cho, D. Z. Pan, and R. Puri, "Novel Binary Linear Programming for High Performance Clock Mesh Synthesis," in Proceedings of the IEEE/ACM Int'l Conference on Computer-Aided Design (ICCAD), pp. 438–443, Nov. 2010.

Proposed Method
• Optimizing the placement during clock mesh synthesis.
[Figure: diagram with labels 1 to 4]

Step 1: Creating the Feasible Moving Region of Each Register
[Figure labels: Final, Fanout, Fanin, Registers, Initial Register]

Creating Feasible Moving Regions
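The slides do not spell out how a feasible moving region is constructed, so the following is only one plausible model, stated as an assumption: each fanin or fanout cell stays fixed and tolerates a maximum Manhattan distance to the register (a budget that would be derived from the available timing slack), and the region is the set of points that respects every budget. The function name and the data layout are hypothetical.

```python
def in_feasible_region(point, neighbors):
    """Return True if `point` respects every neighbor's distance budget.

    point:     (x, y) candidate register location
    neighbors: list of ((x, y), max_dist) pairs, one per fanin/fanout cell,
               where max_dist is an assumed slack-derived Manhattan budget
    """
    px, py = point
    return all(
        abs(px - nx) + abs(py - ny) <= max_dist
        for (nx, ny), max_dist in neighbors
    )


# Hypothetical example: one fanin cell and one fanout cell, budget 6 each.
neighbors = [((0, 0), 6), ((8, 0), 6)]
print(in_feasible_region((4, 1), neighbors))  # distances 5 and 5
print(in_feasible_region((4, 3), neighbors))  # distances 7 and 7
```

Under this model the region is an intersection of diamond-shaped areas, so it is convex and a membership test costs one pass over the register's fanin and fanout cells.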

Step 2: Mesh Generation
• Registers can be moved within their feasible moving regions without creating negative timing slack.
• Choose the minimum number of mesh tracks such that every register can be moved onto one of them; these tracks form the mesh network.

Mesh Generation Problem
• Problem: Treat each mesh track as a set and each register as an element. Finding the minimum number of sets that together include all the elements (a minimum set cover) is equivalent to finding the minimum number of mesh tracks onto which all the registers can be moved.
• Greedy algorithm: Repeatedly add the candidate mesh track with the minimum cost.
• Cost of each grid wire = (total distance of the registers from the grid wire) / (number of new elements added to the solution set).
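The greedy selection described on this slide can be sketched as a weighted set cover. This is a minimal illustration rather than the authors' implementation: the function name, the dictionary layout, and the track and register names are hypothetical, and the sketch assumes that the "total distance" in the cost is summed over the newly covered registers only.

```python
def greedy_mesh_tracks(reachable, distance):
    """Pick mesh tracks greedily until every register is covered.

    reachable: dict track -> set of registers that can be moved onto it
    distance:  dict (track, register) -> distance from register to track
    Returns the chosen tracks in the order they were selected.
    """
    uncovered = set()
    for regs in reachable.values():
        uncovered |= regs

    chosen = []
    while uncovered:
        best_track, best_cost = None, float("inf")
        for track in sorted(reachable):          # sorted for determinism
            new = reachable[track] & uncovered   # registers newly covered
            if not new:
                continue
            # Assumed cost: total distance of new registers / their count.
            cost = sum(distance[(track, r)] for r in new) / len(new)
            if cost < best_cost:
                best_track, best_cost = track, cost
        chosen.append(best_track)
        uncovered -= reachable[best_track]
    return chosen


# Hypothetical instance: two horizontal tracks and one vertical track.
reachable = {
    "h1": {"r1", "r2"},
    "h2": {"r2", "r3"},
    "v1": {"r1", "r2", "r3"},
}
distance = {
    ("h1", "r1"): 1, ("h1", "r2"): 2,
    ("h2", "r2"): 1, ("h2", "r3"): 1,
    ("v1", "r1"): 4, ("v1", "r2"): 4, ("v1", "r3"): 4,
}
print(greedy_mesh_tracks(reachable, distance))
```

In this instance the single track `v1` would cover every register, but the greedy rule prefers `h2` and then `h1` because their cost per newly covered register is lower. This shows that the cost function trades the number of tracks against the distance registers must travel.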

Step 3: Incremental Register Placement
Objective: minimize the total stub wire length.
Subject to:
• The timing constraints.
• The registers must not overlap (non-overlap constraints).
Variables:
• Register locations.
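To make the objective concrete, the following sketch solves a heavily simplified version of the problem for a single register. It assumes the timing constraints reduce to a rectangular feasible region, that tracks are infinite horizontal or vertical lines, and that the register moves along one axis only; it ignores the non-overlap constraints entirely. All names are hypothetical, and the slides do not state which solver the authors use.

```python
def place_register(region, h_tracks, v_tracks, start):
    """Return (stub_length, (x, y)) with the shortest stub inside `region`.

    region:   (xmin, xmax, ymin, ymax) assumed rectangular feasible region
    h_tracks: y-coordinates of horizontal mesh tracks
    v_tracks: x-coordinates of vertical mesh tracks
    start:    (x, y) initial register location, inside `region`
    At least one track must be given.
    """
    xmin, xmax, ymin, ymax = region
    x0, y0 = start
    candidates = []
    for t in v_tracks:
        x = min(max(t, xmin), xmax)   # feasible x closest to the track
        candidates.append((abs(x - t), (x, y0)))
    for t in h_tracks:
        y = min(max(t, ymin), ymax)   # feasible y closest to the track
        candidates.append((abs(y - t), (x0, y)))
    return min(candidates)


# Hypothetical example: the horizontal track lies inside the region,
# so the register can sit on it and the stub length drops to zero.
result = place_register(
    region=(0, 4, 0, 4), h_tracks=[3.5], v_tracks=[5], start=(2, 2)
)
print(result)
```

A full formulation would optimize all registers jointly, since the non-overlap constraints couple their locations; this per-register clamp only illustrates why moving registers toward the tracks shortens the stub wires.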

Incremental Placement Results (s35932 in ISCAS89)
[Figure panels: Before placement, After placement]

Top-Level Clock Tree Generation
• Insert buffer drivers at the intersections of the mesh grid wires [1][2].
• Generate the top-level clock tree, whose sinks are the buffer drivers of the mesh grid wires (buffered DME).

Experimental Results
Set 1: Compare the proposed method with [2] using different grid sizes.
Set 2: Compare the proposed method with [2] using the same grid sizes.

Grid sizes:
Circuit    Set 1 Proposed   Set 1 [2]   Set 2 Proposed   Set 2 [2]
s13207     6*7              8*8         6*7              6*7
s15850     5*4              8*8         5*4              5*4
s35932     11*7             12*12       11*7             11*7
s38417*    10*9             12*12       10*9             10*9
s38584     12*7             11*11       12*7             12*7

Mesh Wire Reduction
• Set 1 (different grid size): average improvement of 51.9%.
• Set 2 (same grid size): average improvement of 50.8%.

Clock Power Reduction
• Set 1 (different grid size): average improvement of 48.3%.
• Set 2 (same grid size): average improvement of 28.1%.

Skew Results (45nm PTM)
• Set 1 (different grid size): average skew is in the same range.
• Set 2 (same grid size): skew is improved by 0.8 ps.

Trade-off
• The trade-off is the change in logic wirelength caused by the register placement.

Implications of Placement Congestion
[Figure panels: Before Register Placement, After Register Placement]

Routing Congestion
• The timing slack decreases by an average of 22 ps, which is small (about 1.1%) compared to the 2 ns clock period.

Conclusions
• Advantages
  • Significantly reduced power dissipation.
  • Guaranteed timing slack (pre-routing).
• Disadvantages
  • Power density increase.
  • Timing slack decrease.
