ISLPED 2004, 8/10/2004
Power-Optimal Pipelining in Deep Submicron Technology
Seongmoo Heo and Krste Asanović
Computer Architecture Group, MIT CSAIL
Traditional Pipelining
• Goal: Maximum performance
[Figure: pipeline stages clocked by Clk at supply voltage Vdd; each clock period is made up of Clk-Q delay, propagation delay, and setup time]
Pipelining as a Low-Power Tool
• Goal: Low power at fixed throughput
[Figure: pipeline stage timing at supply voltage Vdd; Clk-Q delay, propagation delay, and setup time leave time slack in each clock period, and the slack is traded for power through supply voltage scaling]
Pipelining as a Low-Power Tool
[Figure: power versus delay with the clock frequency fixed. Pipelining adds flip-flop power (the pipelining overhead) and creates time slack; supply voltage scaling then converts the slack into a power saving]
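The slack-for-voltage trade can be written down as a short worked relation. This is only a sketch under assumed models that the slides do not state: the alpha-power-law gate delay (exponent a) and the usual CV²f switching power. V_0, V_1, x, C, and a are illustrative symbols; AF is the activity factor used later in the deck.

```latex
% Assumed models (not taken from the slides):
% alpha-power-law gate delay and CV^2 f switching power.
\begin{align}
t_d(V_{dd}) &\propto \frac{V_{dd}}{(V_{dd}-V_{th})^{a}}, \qquad 1 \le a \le 2,\\
P_{sw} &= AF \cdot C \cdot V_{dd}^{2} \cdot f_{clk}.
\end{align}
% If a stage needs only a fraction x (0 < x \le 1) of the clock period at the
% nominal supply V_0, the supply can drop to the V_1 that satisfies
\begin{equation}
t_d(V_1) = \frac{t_d(V_0)}{x},
\end{equation}
% and, with f_clk, C, and AF unchanged, the switching power scales as
\begin{equation}
\frac{P_{sw}(V_1)}{P_{sw}(V_0)} = \left(\frac{V_1}{V_0}\right)^{2}.
\end{equation}
```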
Power-Optimal Pipelining
• Power reduction from pipelining is limited by the power overhead of the increased number of flip-flops → Power-Optimal Pipelining
[Figure: power versus delay, marking too shallow pipelining, too deep pipelining, and optimal pipelining with the optimal power saving]
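The too-deep / too-shallow trade-off can be illustrated with a small numeric sweep. The sketch below is a toy model, not the authors' simulation setup: it assumes an alpha-power-law delay, models switching power only (no leakage, no clock gating), and every constant (VTH, A, V_NOM, N_BASE, FF_DELAY, FF_COST) is a hypothetical value chosen for illustration. The optimum it finds should not be expected to match the deck's Hspice results.

```python
# Toy sweep of per-stage logic depth N (in FO4) at a fixed clock period.
# Everything numeric here is an illustrative assumption, not the deck's data:
#   - alpha-power-law delay: delay(V) ~ V / (V - VTH)**A
#   - only switching power is modeled: P ~ Vdd**2 * (logic + flip-flops)

VTH = 0.2      # threshold voltage, volts (hypothetical)
A = 1.3        # alpha-power-law exponent (hypothetical)
V_NOM = 1.0    # supply voltage at the baseline depth, volts (hypothetical)
N_BASE = 24    # assumed baseline depth, FO4 (upper end of the deck's N range)
FF_DELAY = 2   # flip-flop delay per stage, FO4 (hypothetical)
FF_COST = 4.0  # power of one flip-flop rank, in FO4-inverter units (hypothetical)


def raw_delay(vdd):
    """Unnormalized alpha-power-law gate delay at supply vdd."""
    return vdd / (vdd - VTH) ** A


def vdd_for_depth(n, tol=1e-7):
    """Lowest Vdd at which a stage of n FO4 plus flip-flop delay still
    fits the clock period that the baseline stage needs at V_NOM."""
    target = raw_delay(V_NOM) * (N_BASE + FF_DELAY) / (n + FF_DELAY)
    lo, hi = VTH + 1e-3, V_NOM  # delay falls monotonically as Vdd rises
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if raw_delay(mid) > target:
            lo = mid  # too slow at this voltage, raise it
        else:
            hi = mid  # meets timing, try a lower voltage
    return hi


def relative_power(n):
    """Switching power relative to the baseline logic alone at V_NOM."""
    vdd = vdd_for_depth(n)
    return (vdd / V_NOM) ** 2 * (1.0 + FF_COST / n)


def optimal_depth(depths=range(2, N_BASE + 1)):
    """Depth with the lowest modeled power among the candidates."""
    return min(depths, key=relative_power)


if __name__ == "__main__":
    best = optimal_depth()
    for n in (2, best, N_BASE):
        print(f"N = {n:2d} FO4: Vdd = {vdd_for_depth(n):.3f} V, "
              f"relative power = {relative_power(n):.3f}")
```

In this model, depths smaller than the value returned by optimal_depth() play the role of too deep pipelining (flip-flop overhead dominates), and larger depths play the role of too shallow pipelining (too little slack for voltage scaling).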
Contribution
• Pipelining is an old idea.
• Research focus has been on the performance impact of pipelining.
• The idea of using pipelining to lower power [Chandrakasan ’92] has not been fully explored in deep submicron technology.
• Analysis and circuit-level simulation of Power-Optimal Pipelining for different regimes of Vth, activity factor, and clock gating
Bottom-to-Top Approach
1. Impact of pipelining on each power component
2. Impact of pipelining on total power (with/without clock-gating)
[Figure: total power versus time over active and inactive periods, shown clock-gated and not clock-gated, built up from switching, leakage, and idle power components]
*Idle power = power consumed when circuit is idle and not clock-gated
Methodology
• Target digital system: fixed throughput, highly parallel computation, logic-dominant
• Test bench
– BPTM (Berkeley Predictive Technology Model) 70nm process: LVT (0.17/-0.2), MVT (0.19/-0.22), HVT (0.21/-0.24)
– Hspice simulation at 100°C, Clock = 2 GHz
[Figure: one pipeline stage, with N FO4 inverters (N = 2 ~ 24) between TG flip-flops; labeled Baseline]
Pipelining and Switching Power: Analytical Trend
• Flip-flop overhead: $O(1/N)$
• Quadratic reduction of logic switching power: $\propto V_{dd}^{2} \propto N^{2}$, i.e. $O(N^{2})$
[Figure: switching power versus number of FO4 per stage, N, marking the optimal FO4 and the optimal switching power saving]
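A rough closed form for the optimum follows if the two trends on this slide are simply added. This is an illustrative assumption rather than the authors' model; a and b are hypothetical fitted constants for the logic term and the flip-flop overhead term.

```latex
% Assumption: the two asymptotic trends add, with constants a, b > 0.
\begin{align}
P_{sw}(N) &\approx a N^{2} + \frac{b}{N},\\
\frac{dP_{sw}}{dN} &= 2aN - \frac{b}{N^{2}} = 0
\;\Longrightarrow\; N^{*} = \left(\frac{b}{2a}\right)^{1/3},\\
P_{sw}(N^{*}) &= a N^{*2} + \frac{b}{N^{*}} = 3a N^{*2}.
\end{align}
```

Under this assumption the flip-flop overhead accounts for two thirds of the switching power at the optimum, since $b/N^{*} = 2aN^{*2}$.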
Pipelining and Leakage Power: Analytical Trend
• Flip-flop overhead: $O(1/N)$
• Superlinear reduction of logic leakage power: $\propto V_{dd}\, e^{\eta V_{dd}} \propto N^{\alpha}$ with $1 < \alpha < 2$, i.e. $O(N^{\alpha})$ (DIBL effect)
[Figure: leakage power versus number of FO4 per stage, N, marking the optimal FO4 and the optimal leakage power saving]
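One way to see where an exponent between 1 and 2 can come from is to take the local log-log slope of the leakage expression. The sketch assumes that Vdd scales roughly in proportion to N at fixed clock frequency (the same scaling the deck uses for switching power) and treats $\eta$ as a constant DIBL-related coefficient; both are simplifications.

```latex
% Assumption: V_dd \propto N at fixed clock frequency, \eta constant.
\begin{align}
P_{leak} &\propto V_{dd}\, e^{\eta V_{dd}},\\
\ln P_{leak} &= \ln V_{dd} + \eta V_{dd} + \text{const},\\
\alpha \;\equiv\; \frac{d \ln P_{leak}}{d \ln N}
  &= \frac{d \ln P_{leak}}{d \ln V_{dd}} = 1 + \eta V_{dd}.
\end{align}
% So 1 < \alpha < 2 whenever 0 < \eta V_dd < 1.
```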
Pipelining and Idle Power: Analytical Trend
• Clock-gating is not always possible
– Increased control complexity
– Insufficient setup time of clock enable signal
• Idle power = leakage power + flip-flop switching power
– Its scaling falls between leakage power scaling and flip-flop switching power scaling, depending on the leakage level
Pipelining and Idle Power: Analytical Trend
• Leakage power scale: $O(N^{\alpha})$ with $1 < \alpha < 2$, alongside an $O(1/N)$ term
• Flip-flop switching power scale: linear reduction of flip-flop switching power, $\propto \frac{1}{N} V_{dd}^{2} \propto N$, i.e. $O(N)$, alongside an $O(1/N)$ term
[Figure: two panels, Leakage Power Scale and Flip-flop Switching Power Scale, each plotting relative power versus number of FO4 per stage, N, and marking the optimal FO4 and the optimal idle power saving]
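The claim that idle power scales somewhere between the two trends can be checked with a short derivation. It is a sketch that assumes pure power laws for both components; w is an illustrative symbol for the leakage share of idle power.

```latex
% Assumption: both components follow pure power laws in N.
\begin{align}
P_{idle} &= P_{leak} + P_{ff}, \qquad
P_{leak} \propto N^{\alpha}, \quad
P_{ff} \propto \frac{1}{N} V_{dd}^{2} \propto N,\\
\frac{d \ln P_{idle}}{d \ln N}
  &= \frac{\alpha P_{leak} + P_{ff}}{P_{leak} + P_{ff}}
   = w\,\alpha + (1 - w), \qquad
w = \frac{P_{leak}}{P_{idle}} \in [0, 1].
\end{align}
```

The effective exponent is a weighted average of 1 and $\alpha$: it approaches $\alpha$ when leakage dominates and 1 when flip-flop switching dominates.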
Simulation Results: Power Components
Fixed Throughput @ 2 GHz

Power component         Switching power          Leakage power                       Idle power
Right-hand side curve   $O(N^{2})$               $O(N^{\alpha})$ ($1 < \alpha < 2$)  $O(N)$ or $O(N^{\alpha})$ ($1 < \alpha < 2$)
Saving*                 79% (HVT) ~ 82% (LVT)    70% (LVT) ~ 75% (HVT)               55% (HVT) ~ 70% (LVT)
N*                      6                        6                                   8

N = number of FO4 inverters per stage (not including flip-flop delay)
N* = optimal N
Saving* = optimal power saving by pipelining
Optimal Power Saving
[Figure: two panels, No Clock Gating and Clock Gating, plotting relative power versus activity factor, with the power broken down into switching, leakage, and idle power and the LVT case highlighted; panel annotations: Optimal FO4 = 6 and Optimal FO4 = 6~8]
*2 GHz
*Flip-flop delay not included in optimal FO4
Discussion
• LVT can be fast and power-efficient
– Enables lower Vdd
• Flip-flop delay is more important than flip-flop power for power-optimal pipelining
Limitation of This Work

Factor                              Effect on optimal logic depth   Effect on optimal power saving
Super-linear growth of flip-flops   ↑                               ↓
Additional memory                   ↑                               ↓
Reduced glitches                    ↓                               ↑
Parasitic wire capacitance          ↑                               ↓
Conclusion
• Pipelining is an effective low-power tool when used to support voltage scaling in digital systems implementing highly parallel computation.
• Optimal logic depth: 6-8 FO4
– About 8-10 FO4 including flip-flop delay
• Optimal power saving: 55-80%
– Depends on Vth, activity factor (AF), and clock gating
• Insights:
– Pipelining is more effective with high AF: it is most effective at saving switching power.
– Pipelining is more effective with lower Vth, except when leakage power is dominant.
– Pipelining is more effective with clock-gating, because of the reduced flip-flop overhead.
Acknowledgments
• Thanks to SCALE group members and anonymous reviewers
• Funded by NSF CAREER award CCR-0093354, NSF ITR award CCR-0219545, and a donation from Intel Corporation.
BACKUP SLIDES