A Mathematical Solution to Power Optimal Pipeline Design by Utilizing Soft Edge Flip Flops
M. Ghasemazar, B. Amelifard, M. Pedram
University of Southern California, Department of Electrical Engineering
August 11, 2008, ISLPED 2008
Outline
• Soft-Edge Flip Flops
• Power Optimal Pipeline Design
• Problem Formulation
• SEFF Modeling
• Experimental Results
• Conclusion
Soft Edge Flip Flop
• Key idea: Allow the data to pass through a flip flop during a transparency window, instead of on a triggering clock edge
• Key advantage: Enable slack passing between adjacent pipeline stages which are separated by (master-slave) flip-flops
• Circuit implementation: Delay the clock of the master latch to create a window during which both the master and slave latches are ON
[Figure: SEFF symbol (D, Q, CLK) and timing diagram of Clk, ClkD, and DATA, marking the transparency window, setup time, and hold time]
SEFF Implementation
[Figure: Conventional (hard-edge) master-slave FF clocked by Clk, compared with the soft-edge master-slave FF, in which a delay element generates ClkD from Clk]
SEFF Characteristics
• Setup and hold times, and clock-to-q delay of a soft-edge flip-flop are all functions of the transparency window width, w
• Simulations show a linear dependency on w:
\[
\begin{cases}
t_{s,i}(w) = a_{i1} w + a_{i0} \\
t_{h,i}(w) = b_{i1} w + b_{i0} \\
t_{cq,i}(w) = c_{i1} w + c_{i0}
\end{cases}
\]
[Figure: Setup/hold time (ps) versus window size (ps) for windows of 40 to 140 ps, with linear fits y = 0.921x - 30.45 and y = -0.651x + 33.54]
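As a rough illustration of the linear model above, the two trend lines printed on the plot can be evaluated at a few window widths. The function name `linear_timing` and the labels "fit A" and "fit B" are hypothetical; the sketch deliberately does not say which line is setup and which is hold, because the extracted slide text does not make that clear.

```python
def linear_timing(w_ps, slope, intercept_ps):
    """Evaluate a linear timing model t(w) = slope * w + intercept (ps)."""
    return slope * w_ps + intercept_ps


# Coefficients of the two trend lines printed on the slide's plot.
FIT_A = (0.921, -30.45)
FIT_B = (-0.651, 33.54)

if __name__ == "__main__":
    # The plot covers window sizes of roughly 40 to 140 ps.
    for w in (40, 80, 120):
        a = linear_timing(w, *FIT_A)
        b = linear_timing(w, *FIT_B)
        print(f"w = {w:3d} ps: fit A = {a:7.2f} ps, fit B = {b:7.2f} ps")
```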
SEFF Characteristics (cont'd)
• Power consumption of a SEFF is monotonically increasing with its window size (w). This is due to:
  – Higher switching activities in the internal nodes in the transparency window
  – Higher dynamic and leakage power consumption in the additional delay generation circuitry
• Experimental evaluation of total power consumption:
\[
P_{FF,i} = d_{i2} w^2 + d_{i1} w + d_{i0}
\]
[Figure: Power dissipation (uW) versus transparency window (ps)]
Pipeline Basics
[Figure: Linear pipeline FF0, C1, FF1, C2, FF2 driven by a common CLK, annotated with t_{cq,i}, d_i, and t_{s,i}]
• Timing constraints for a linear pipeline
\[
d_i + t_{s,i} + t_{cq,i-1} \le T_{clk}, \quad 1 \le i \le N \tag{1}
\]
\[
\delta_i + t_{cq,i-1} \ge t_{h,i}, \quad 1 \le i \le N \tag{2}
\]
• Substitute FFs with SEFFs
  – First and last FFs remain hard-edge ones
    • This is needed to avoid imposing constraints on the sender/receiver of data
  – Intermediate stage FFs may be substituted by SEFFs
\[
d_i \le T_{clk} - t_{s,i}(w_i) - t_{cq,i-1}(w_{i-1}), \quad 1 \le i \le N
\]
\[
\delta_i \ge t_{h,i}(w_i) - t_{cq,i-1}(w_{i-1}), \quad 1 \le i \le N
\]
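The hard-edge constraints (1) and (2) can be checked stage by stage, as in the minimal sketch below. The function name `check_pipeline` and its list-based interface are hypothetical. The example uses the TB3 stage delays and 2.0 GHz clock from the benchmark table, with every flip-flop assumed to have t_s = t_h = t_cq = 30 ps; that pairing is an assumption made only for illustration.

```python
def check_pipeline(d_max, d_min, t_s, t_h, t_cq, t_clk):
    """Check constraints (1) and (2) for an N-stage linear pipeline.

    d_max[i-1], d_min[i-1]: max/min delay of stage i (i = 1..N), in ps.
    t_s[j], t_h[j], t_cq[j]: parameters of flip-flop j (j = 0..N), in ps.
    Returns a list of (stage, setup_ok, hold_ok) tuples.
    """
    results = []
    for i in range(1, len(d_max) + 1):
        # (1): longest path must arrive one setup time before the next edge.
        setup_ok = d_max[i - 1] + t_s[i] + t_cq[i - 1] <= t_clk
        # (2): shortest path must not overwrite data before hold time ends.
        hold_ok = d_min[i - 1] + t_cq[i - 1] >= t_h[i]
        results.append((i, setup_ok, hold_ok))
    return results


if __name__ == "__main__":
    # TB3 delays at nominal voltage; 2.0 GHz corresponds to T_clk = 500 ps.
    ff = [30.0] * 4  # assumed hard-edge FF parameters for FF0..FF3
    for row in check_pipeline([325, 310, 219], [150, 155, 160],
                              ff, ff, ff, 500.0):
        print(row)
```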
Power Optimal Pipeline
• Main idea: Pass the available slack of some stages to more timing-critical stages to provide them with more freedom in power optimization through voltage scaling
• For example, let T_clk = T_clk,min = 560ps and t_s = t_h = t_cq = 30ps
  – If FF1 is replaced with a SEFF with a window size of 50ps
    • the first stage borrows 50ps from the second stage
    • the circuit can be powered with a lower supply voltage level
  – Ideally, 10% V_dd reduction -> 19% power saving
[Figure: Pipeline FF0, C1 (d1=500ps), FF1, C2 (d2=400ps), FF2, C3 (d3=450ps), FF3 driven by a common CLK]
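The arithmetic behind this example can be reproduced with a short sketch. The helper names are hypothetical, and the power estimate assumes that power scales with the square of V_dd, which is consistent with the 19% figure on the slide.

```python
T_CLK = 560.0              # clock period from the example, in ps
T_S = T_CQ = 30.0          # setup time and clock-to-q delay, in ps
D = [500.0, 400.0, 450.0]  # d1, d2, d3 from the figure, in ps


def stage_slack(d, t_clk=T_CLK, t_s=T_S, t_cq=T_CQ):
    """Setup slack of one hard-edge stage: T_clk - (d + t_s + t_cq)."""
    return t_clk - (d + t_s + t_cq)


def quadratic_power_saving(vdd_reduction):
    """Fractional saving if power scales as V_dd squared (an assumption)."""
    return 1.0 - (1.0 - vdd_reduction) ** 2


if __name__ == "__main__":
    for i, d in enumerate(D, start=1):
        print(f"stage {i}: slack = {stage_slack(d):.0f} ps")
    print(f"saving at 10% lower Vdd: {quadratic_power_saving(0.10):.0%}")
```

Stage 1 has no slack at T_clk = 560ps while stage 2 has spare time, which is why a transparency window on FF1 lets the first stage borrow from the second.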
PSLP Problem Statement
• Power-optimal Soft Linear Pipeline Design
  – Goal: Minimize the total power consumption of an N-stage linear pipeline circuit
  – Variables:
    • Optimal supply voltage level (1 variable)
    • Transparency window sizes of the individual soft-edge FF-sets (N-1)
    • Delay elements to avoid hold time violations (N)
  – Constraints:
    • Setup/hold times
    • Window size limits
    • Single supply voltage
\[
\begin{cases}
\text{Min. } P_{total} = \sum_{i=1}^{N} P_{Comb,i}(v) + \sum_{i=1}^{N-1} P_{FF,i}(w_i, v) + \sum_{i=1}^{N} P_{DE,i}(z_i, v) \\
\text{s.t. (I)} \quad d_i(v) \le T_{clk} - t_{s,i}(w_i, v) - t_{cq,i-1}(w_{i-1}, v); \quad 1 \le i \le N \\
\text{(II)} \quad \delta_i(v) + z_i \ge t_{h,i}(w_i, v) - t_{cq,i-1}(w_{i-1}, v); \quad 1 \le i \le N \\
\text{(III)} \quad w_{min} \le w_i \le w_{max}; \quad 1 \le i \le N-1 \\
\text{(IV)} \quad v \in \{V_0, V_1, \ldots, V_{m-1}\}
\end{cases}
\]
SEFF Modeling
• Setup time, hold time, clock-to-q delay, and power dissipation are functions of both voltage and transparency window size
  – Voltage-dependent coefficients are determined from SPICE simulations
\[
\begin{cases}
t_{s,i}(w, v) = a_{i1}(v)\, w + a_{i0}(v) \\
t_{h,i}(w, v) = b_{i1}(v)\, w + b_{i0}(v) \\
t_{cq,i}(w, v) = c_{i1}(v)\, w + c_{i0}(v)
\end{cases}
\]
\[
P_{FF,i} = d_{i2}(v)\, w^2 + d_{i1}(v)\, w + d_{i0}(v)
\]
[Figure: Setup time, hold time, and clk-to-Q delay (ps) versus transparency window (ps), each plotted for Vdd = 0.9V, 1.0V, 1.1V, and 1.2V]
Combinational Circuit Modeling
• Total power consumption at voltage level, v:
\[
P_{comb,i}(v) = P_{dyn,i} \left( \frac{v}{V_0} \right)^{2} + P_{leak,i} \left( \frac{v}{V_0} \right)^{3}
\]
• Max and min combinational logic cell delays (calculated from the alpha power law):
\[
d_i(v) = \left( \frac{V_0 - V_t}{v - V_t} \right)^{\alpha} d_i(V_0)
\]
\[
\delta_i(v) = \left( \frac{V_0 - V_t}{v - V_t} \right)^{\alpha} \delta_i(V_0)
\]
• Power dissipation overhead of a delay element:
\[
P_{DE}(z, v) = k(v) \cdot z
\]
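A minimal sketch of these scaling models follows, using the delay and power expressions as they appear to read on the slide. The function names are hypothetical. V0 = 1.2 V is the nominal supply from the experimental setup, while the threshold voltage `v_t = 0.3` and exponent `alpha = 1.3` are placeholder assumptions, not values from the talk.

```python
def comb_power(v, p_dyn, p_leak, v0=1.2):
    """Combinational power at supply v, given dynamic/leakage power at v0."""
    r = v / v0
    return p_dyn * r ** 2 + p_leak * r ** 3


def scaled_delay(v, delay_at_v0, v0=1.2, v_t=0.3, alpha=1.3):
    """Scale a delay from v0 to v; v_t and alpha are assumed values."""
    return delay_at_v0 * ((v0 - v_t) / (v - v_t)) ** alpha


if __name__ == "__main__":
    # Example: the 320 ps maximum delay of a TB1 stage at lower supplies.
    for v in (1.2, 1.1, 1.0, 0.9):
        print(f"v = {v:.1f} V: delay = {scaled_delay(v, 320.0):6.1f} ps, "
              f"relative power = {comb_power(v, 1.0, 0.0):.3f}")
```

Lowering v reduces power but stretches every stage delay, which is the trade-off the optimization balances against the timing constraints.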
Solving the PSLP
• To solve PSLP
  – Enumerate all possible values for v
  – PSLP with fixed voltage (PSLP-FV)
    • P_comb,i terms drop out of the cost function
    • Voltage constraint (IV) disappears
    • All other timing and power parameters become only dependent on w_i and z_i variables
  – For each fixed v, a quadratic program is set up and solved
    • We must minimize a quadratic cost function subject to linear inequality constraints
    • PSLP-FV can be solved optimally in polynomial time
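The enumerate-then-solve structure can be sketched as follows. This is not the authors' solver: the fixed-voltage subproblem is handled here by a coarse grid search over window sizes instead of a quadratic program, and every coefficient (flip-flop timing and power models, `V_T`, `ALPHA`, `P_DYN`, `P_LEAK`, `K_DE`, the voltage list, the window grid) is a hypothetical placeholder. For simplicity the flip-flop and delay-element coefficients are also treated as voltage-independent, unlike the SPICE-fitted models in the talk. Stage delays are those of TB3 at 2.0 GHz.

```python
from itertools import product

V0, V_T, ALPHA = 1.2, 0.3, 1.3           # V_T and ALPHA are assumed
VOLTAGES = [0.9, 1.0, 1.1, 1.2]          # hypothetical candidate levels
T_CLK = 500.0                            # 2.0 GHz clock, in ps
STAGES = [(325, 150), (310, 155), (219, 160)]  # TB3 (max, min) at V0
W_GRID = range(0, 141, 5)                # hypothetical grid; w = 0 is hard-edge
P_DYN, P_LEAK, K_DE = 1000.0, 100.0, 1.0  # hypothetical coefficients


def scale(v):
    return ((V0 - V_T) / (v - V_T)) ** ALPHA


# Hypothetical linear/quadratic SEFF models (ps and uW).
def t_s(w): return 30.0 - 0.6 * w
def t_h(w): return 30.0 + 0.9 * w
def t_cq(w): return 30.0 + 0.6 * w
def p_ff(w): return 0.005 * w * w + 0.5 * w + 100.0


def solve_fixed_voltage(v):
    """Grid-search stand-in for PSLP-FV: (cost, windows) or None."""
    n, k, best = len(STAGES), scale(v), None
    for inner in product(W_GRID, repeat=n - 1):
        w = (0,) + inner + (0,)          # FF0 and FFN stay hard-edge
        cost, feasible = sum(p_ff(wi) for wi in inner), True
        for i in range(1, n + 1):
            d_max, d_min = (k * x for x in STAGES[i - 1])
            if d_max + t_s(w[i]) + t_cq(w[i - 1]) > T_CLK:  # constraint (I)
                feasible = False
                break
            # Smallest delay element z_i that satisfies constraint (II).
            z = max(0.0, t_h(w[i]) - t_cq(w[i - 1]) - d_min)
            cost += K_DE * z
        if feasible and (best is None or cost < best[0]):
            best = (cost, inner)
    return best


def solve_pslp():
    """Enumerate voltages and keep the lowest-power feasible solution."""
    best = None
    for v in VOLTAGES:
        fixed = solve_fixed_voltage(v)
        if fixed is None:
            continue                     # no window assignment meets timing
        r = v / V0
        total = len(STAGES) * (P_DYN * r ** 2 + P_LEAK * r ** 3) + fixed[0]
        if best is None or total < best["power"]:
            best = {"v": v, "w": fixed[1], "power": total}
    return best


if __name__ == "__main__":
    print(solve_pslp())
```

Because the combinational power is constant once v is fixed, it is added outside the inner search, mirroring how those terms drop out of the PSLP-FV cost function.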
Experimental Setup
• HSPICE simulations were used to extract the parameters that are needed for the problem formulation
  – 65nm Predictive Technology Model (PTM)
  – Nominal supply voltage 1.2V
  – Die temperature 100 °C
• The SIS optimization package was used to synthesize a set of linear pipelines as test-bench circuits
• The MOSEK toolbox was used to solve the mathematical optimization problem
• All results were collected on a 2.4GHz Pentium 4 PC with 2GB memory
Benchmark Spec
Testbench (# of stages)   (max, min) stage delays at nominal voltage (ps)            Clock freq. f (GHz)
TB1 (4)                   (320,140), (332,150), (308,150), (320,170)                 2.0
TB2 (5)                   (320,140), (332,150), (308,150), (280,145), (320,170)      2.0
TB3 (3)                   (325,150), (310,155), (219,160)                            2.0
TB4 (5)                   (275,40), (235,40), (245,60), (275,50), (275,70)           2.5
TB5 (4)                   (310,100), (245,40), (245,50), (245,60)                    2.5