Generalized Geometric Programming for Circuit Design Stephen Boyd Seung Jean Kim 4/4/05 ISPD ’05

Outline
• Basic approach & applications
• Geometric programming & generalized geometric programming
• Digital circuit design applications
• Conclusions

Basic approach
1. formulate circuit design problem as geometric program (GP) or generalized geometric program (GGP), optimization problems with special form
2. solve GP or GGP using specialized, tailored method
• this talk focuses on step 1 (a.k.a. GP modeling)
• step 2 is technology

Applications
• wire and device sizing using Elmore delay
• digital circuit sizing and extensions (focus of this talk)
• analog and mixed signal design
  – opamps, comparators
  – ADCs, DACs, PLLs, SC filters
• RF design
  – CMOS inductors, oscillators
  – LNAs, mixers
• optimal doping profiles

Monomial & posynomial functions
$x = (x_1, \ldots, x_n)$: vector of positive optimization variables
• function $g$ of form
  $g(x) = c\, x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n}$,
  with $c > 0$, $\alpha_i \in \mathbf{R}$, is called monomial
• sum of monomials, i.e., function $f$ of form
  $f(x) = \sum_{k=1}^{t} c_k\, x_1^{\alpha_{1k}} x_2^{\alpha_{2k}} \cdots x_n^{\alpha_{nk}}$,
  with $c_k > 0$, $\alpha_{ik} \in \mathbf{R}$, is called posynomial
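
The two definitions above can be sketched in a few lines of Python. This is only an illustration, not part of the talk: the helper names `monomial` and `posynomial` and the (coefficient, exponents) representation are assumptions made for the example.

```python
import math

def monomial(c, exponents):
    """Return g(x) = c * x_1**a_1 * ... * x_n**a_n as a Python function."""
    if c <= 0:
        raise ValueError("monomial coefficient must be positive")

    def g(x):
        if any(xi <= 0 for xi in x):
            raise ValueError("variables must be positive")
        return c * math.prod(xi ** a for xi, a in zip(x, exponents))

    return g

def posynomial(terms):
    """Return f(x) = sum of monomials; terms is a list of (c, exponents)."""
    monos = [monomial(c, a) for c, a in terms]
    return lambda x: sum(m(x) for m in monos)

# Variables ordered as (x, y, z); exponents may be any real numbers.
g = monomial(3.0, (2, -0.12, 1))        # 3 x^2 y^-0.12 z
f = posynomial([(0.23, (0, 0, 0)),      # constant term 0.23
                (1.0, (1, -1, 0))])     # x / y

g_value = g((1.0, 1.0, 2.0))
f_value = f((2.0, 4.0, 1.0))
```

A negative coefficient is rejected at construction time, which is exactly what separates a posynomial from a general polynomial-like expression such as 2x + 3y − 2z.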

Examples
with $x$, $y$, $z$ variables,
• $0.23$, $2z\sqrt{x/y}$, $3x^2 y^{-.12} z$ are monomials (hence also posynomials)
• $0.23 + x/y$, $2(1 + xy)^3$, $2x + 3y + 2z$ are posynomials
• $2x + 3y - 2z$, $x^2 + \tan x$ are neither

Generalized posynomials
$f$ is a generalized posynomial if it can be formed using addition, multiplication, positive power, and maximum, starting from posynomials
examples:
• $\max\{1 + x_1,\ 2x_1 + x_2^{0.2} x_3^{-3.9}\}$
• $\left(0.1\, x_1 x_3^{-0.5} + x_2^{1.7} x_3^{0.7}\right)^{1.5}$
• $\left(\max\{1 + x_1,\ 2x_1 + x_2^{0.2} x_3^{-3.9}\}\right)^{1.7} + x_2^{1.1} x_3^{3.7}$

Composition rules
• monomials closed under product, division, positive scaling, power, inverse
• posynomials closed under sum, product, positive scaling, division by monomial, positive integer power
• generalized posynomials closed under sum, product, max, positive scaling, division by monomial, positive power

Generalized geometric program (GGP)
  minimize   $f_0(x)$
  subject to $f_i(x) \le 1$, $i = 1, \ldots, m$
             $g_i(x) = 1$, $i = 1, \ldots, p$
$f_i$ are generalized posynomials, $g_i$ are monomials
• called geometric program (GP) when $f_i$ are posynomials
• a highly nonlinear constrained optimization problem

GP example
• maximize volume of box with width $w$, height $h$, depth $d$
• subject to limits on wall and floor areas, aspect ratios $h/w$, $d/w$
  maximize   $hwd$
  subject to $2(hw + hd) \le A_{\mathrm{wall}}$, $wd \le A_{\mathrm{flr}}$
             $\alpha \le h/w \le \beta$, $\gamma \le d/w \le \delta$
in standard GP form:
  minimize   $h^{-1} w^{-1} d^{-1}$
  subject to $(2/A_{\mathrm{wall}})hw + (2/A_{\mathrm{wall}})hd \le 1$, $(1/A_{\mathrm{flr}})wd \le 1$
             $\alpha h^{-1} w \le 1$, $(1/\beta) h w^{-1} \le 1$
             $\gamma w d^{-1} \le 1$, $(1/\delta) w^{-1} d \le 1$
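
As a sanity check on the conversion to standard form, the following Python sketch evaluates both versions of the box constraints at a given point and confirms that they agree. The function names and the numerical limits are assumptions chosen for illustration, and nothing here solves the GP itself.

```python
def box_feasible(h, w, d, A_wall, A_flr, alpha, beta, gamma, delta):
    """Original constraints of the maximum-volume box problem."""
    return (2 * (h * w + h * d) <= A_wall and w * d <= A_flr
            and alpha <= h / w <= beta and gamma <= d / w <= delta)

def gp_constraint_values(h, w, d, A_wall, A_flr, alpha, beta, gamma, delta):
    """Left-hand sides of the standard-form constraints; feasible iff all <= 1."""
    return [
        (2 / A_wall) * h * w + (2 / A_wall) * h * d,  # wall area limit
        (1 / A_flr) * w * d,                          # floor area limit
        alpha * w / h,                                # alpha <= h/w
        (1 / beta) * h / w,                           # h/w <= beta
        gamma * w / d,                                # gamma <= d/w
        (1 / delta) * d / w,                          # d/w <= delta
    ]

def gp_objective(h, w, d):
    """Standard-form objective 1/(hwd); minimizing it maximizes the volume."""
    return 1.0 / (h * w * d)

# Hypothetical limits, used only to exercise the functions.
limits = dict(A_wall=100.0, A_flr=10.0, alpha=0.5, beta=2.0, gamma=0.5, delta=2.0)
inside = (2.0, 3.0, 3.0)    # (h, w, d) satisfying every limit
outside = (2.0, 4.0, 3.0)   # floor area 12 exceeds A_flr = 10

inside_ok = all(v <= 1 for v in gp_constraint_values(*inside, **limits))
outside_ok = all(v <= 1 for v in gp_constraint_values(*outside, **limits))
```

Each two-sided aspect-ratio limit becomes two monomial inequalities, which is why four original constraints turn into six standard-form ones.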

Trade-off analysis
(no equality constraints, for simplicity)
• form perturbed version of original GGP, with changed right-hand sides:
  minimize   $f_0(x)$
  subject to $f_i(x) \le u_i$, $i = 1, \ldots, m$
• $u_i > 1$ ($u_i < 1$) means $i$th constraint is relaxed (tightened)
• let $p(u)$ be optimal value of perturbed problem
• plot of $p$ vs. $u$ is (globally) optimal trade-off surface (of objective against constraints)

Trade-off curves for maximum volume box example
[Figure: log-log plot of maximum volume $V$ (axis 10 to $10^5$) versus $A_{\mathrm{flr}}$ (axis 10 to $10^3$), one curve for each of $A_{\mathrm{wall}} = 100, 1000, 10000$]
• maximum volume $V$ vs. $A_{\mathrm{flr}}$, for $A_{\mathrm{wall}} = 100, 1000, 10000$
• $h/w$, $d/w$ aspect ratio limits $0.5$, $2$

GP and GGP attributes
• after log transform of variables/constraints, they become convex problems
• can convert GGP to GP, e.g., $f(x) + \max\{g(x), h(x)\} \le 1$ becomes
  $f(x) + t \le 1$, $g(x)/t \le 1$, $h(x)/t \le 1$
  where $t$ is new (dummy) variable
• conversion tricks can be automated
  – parser scans problem description, forms GP
  – efficient GP solver solves GP
  – solution transformed back (dummy variables eliminated)
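
The dummy-variable conversion can be checked numerically at a fixed point. In the sketch below, `f`, `g`, `h` stand for the values of the three functions at some given x. The function names are assumptions made for this illustration, and this is not the parser mentioned on the slide.

```python
def ggp_constraint_holds(f, g, h):
    """Original generalized-posynomial constraint f + max{g, h} <= 1."""
    return f + max(g, h) <= 1

def gp_constraints_hold(f, g, h, t):
    """Converted constraints for a given positive dummy variable t."""
    return f + t <= 1 and g / t <= 1 and h / t <= 1

def exists_dummy(f, g, h):
    """t must satisfy t >= max{g, h}, and f + t grows with t, so it is enough
    to try the smallest admissible value t = max{g, h}."""
    return gp_constraints_hold(f, g, h, max(g, h))

feasible_case = (0.3, 0.5, 0.6)     # 0.3 + max{0.5, 0.6} = 0.9 <= 1
infeasible_case = (0.5, 0.6, 0.2)   # 0.5 + max{0.6, 0.2} = 1.1 > 1
```

The converted constraints contain only posynomials (f + t) and monomials (g/t, h/t), so the max has been traded for one extra variable and two extra inequalities.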

How GPs (and GGPs) are solved
the practical answer: none of your business
more politely: you don’t need to know
it’s technology:
• good algorithms are known
• good software implementations are available

How GPs are solved
• work with log of variables: $y_i = \log x_i$
• take log of monomials/posynomials to get
  minimize   $\log f_0(e^y)$
  subject to $\log f_i(e^y) \le 0$, $i = 1, \ldots, m$
             $\log g_i(e^y) = 0$, $i = 1, \ldots, p$
• $\log f_i(e^y)$ are (smooth) convex functions
• $\log g_i(e^y)$ are affine functions, i.e., linear plus a constant
• solve (nonlinear) convex optimization problem above using interior-point method
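
To make the log transform concrete: for a posynomial with terms $c_k \prod_i x_i^{\alpha_{ik}}$, the function $\log f(e^y)$ is a log-sum-exp of the affine functions $\alpha_k^T y + \log c_k$. The Python sketch below evaluates it that way and spot-checks midpoint convexity at two points. The function name and the test points are assumptions for illustration, and this is not an interior-point solver.

```python
import math

def log_posynomial(terms, y):
    """log f(e^y) for f(x) = sum_k c_k * prod_i x_i**a_ik, computed as a
    log-sum-exp of the affine functions a_k . y + log c_k."""
    z = [math.log(c) + sum(a * yi for a, yi in zip(exps, y)) for c, exps in terms]
    zmax = max(z)  # shift by the largest term for numerical stability
    return zmax + math.log(sum(math.exp(zk - zmax) for zk in z))

terms = [(0.23, (0, 0)), (1.0, (1, -1))]   # f(x1, x2) = 0.23 + x1/x2

y_a, y_b = (0.0, 1.0), (2.0, -1.0)
y_mid = tuple((a + b) / 2 for a, b in zip(y_a, y_b))

F_a = log_posynomial(terms, y_a)
F_b = log_posynomial(terms, y_b)
F_mid = log_posynomial(terms, y_mid)

# Direct evaluation of log f at x = e^{y_a}, for comparison.
direct_a = math.log(0.23 + math.exp(0.0) / math.exp(1.0))
```

With a single term the same function reduces to $\alpha^T y + \log c$, which is the affine case that applies to the monomial equality constraints.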

Current state of the art
• basic interior-point method that exploits sparsity, generic GP structure
• approaching efficiency of linear programming solver
  – sparse, 1000 variables, 10000 monomial terms: a few seconds
  – sparse, 10000 variables, 100000 monomial terms: a minute
  – sparse, $10^6$ variables, $10^7$ monomial terms: an hour
(these are order-of-magnitude estimates, on a simple PC)

History
• GP (and term ‘posynomial’) introduced in 1967 by Duffin, Peterson, Zener
• engineering applications from the very beginning
  – early applications in chemical, mechanical, power engineering
  – digital circuit transistor and wire sizing with Elmore delay since 1984 (Fishburn & Dunlap’s TILOS, Sapatnekar, Kang, ...)
  – analog circuit design since 1997 (Hershenson, Boyd, Lee)
  – other applications in statistics, wireless power control, ...
• extremely efficient solution methods since 1994 or so (Nesterov & Nemirovsky)

Gate scaling
[Figure: combinational logic block of gates 1 to 7 between input flip flops (in) and output flip flops (out), sharing a clock]
• combinational logic; circuit topology & gate types given
• gate sizes (scale factors $x_i \ge 1$) to be determined
• scale factors affect total circuit area, power and delay
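
RC gate delay model
[Figure: RC model of gate $i$, showing $V_{dd}$, input capacitance $C_i^{\mathrm{in}}$, driving resistance $R_i$, intrinsic capacitance $C_i^{\mathrm{int}}$, load capacitance $C_i^{L}$]
• input & intrinsic capacitances, driving resistance, load capacitance
  $C_i^{\mathrm{in}} = \bar C_i^{\mathrm{in}} x_i$, $C_i^{\mathrm{int}} = \bar C_i^{\mathrm{int}} x_i$, $R_i = \bar R_i / x_i$, $C_i^{L} = \sum_{j \in \mathrm{FO}(i)} C_j^{\mathrm{in}}$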

RC gate model
• RC gate delay:
  $D_i = 0.69\, R_i \left(C_i^{L} + C_i^{\mathrm{int}}\right) = 0.69\, \bar R_i \bar C_i^{\mathrm{int}} + 0.69\, (\bar R_i / x_i) \sum_{j \in \mathrm{FO}(i)} \bar C_j^{\mathrm{in}} x_j$
• $D_i$ are posynomials (of scale factors)
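
The gate delay formula can be sketched directly in Python. The three-inverter fanout structure below is hypothetical, the parameter values are those of the INV row of the adder example later in the talk, and the function and argument names are assumptions made for illustration.

```python
def gate_delay(i, x, R_bar, C_in_bar, C_int_bar, fanout):
    """RC delay D_i = 0.69 * R_i * (C_L_i + C_int_i) of gate i."""
    R_i = R_bar[i] / x[i]                    # driving resistance shrinks with size
    C_int_i = C_int_bar[i] * x[i]            # intrinsic capacitance grows with size
    C_L_i = sum(C_in_bar[j] * x[j] for j in fanout[i])  # load: fanout input caps
    return 0.69 * R_i * (C_L_i + C_int_i)

# Hypothetical circuit: inverter 0 drives inverters 1 and 2.
fanout = {0: [1, 2], 1: [], 2: []}
R_bar = {0: 0.48, 1: 0.48, 2: 0.48}
C_in_bar = {0: 3.0, 1: 3.0, 2: 3.0}
C_int_bar = {0: 3.0, 1: 3.0, 2: 3.0}

x_unit = {0: 1.0, 1: 1.0, 2: 1.0}
x_big0 = {0: 2.0, 1: 1.0, 2: 1.0}   # only the driver is scaled up

d_unit = gate_delay(0, x_unit, R_bar, C_in_bar, C_int_bar, fanout)
d_big0 = gate_delay(0, x_big0, R_bar, C_in_bar, C_int_bar, fanout)
```

Scaling up the driver leaves the intrinsic term $0.69\,\bar R_i \bar C_i^{\mathrm{int}}$ unchanged and halves the load term, so `d_big0` is smaller than `d_unit`; the cost is a larger input capacitance presented to whatever drives gate 0.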

Path and circuit delay
[Figure: the seven-gate combinational block (gates 1 to 7) with a path highlighted]
• delay of a path: sum of delays of gates on path ... posynomial
• circuit delay: maximum delay over all paths ... generalized posynomial
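
A minimal sketch of the circuit delay computation, assuming the circuit is given as a fanout map of a directed acyclic graph. The seven-gate topology and the delay values below are hypothetical (the wiring of the figure is not recoverable from the text), and the function name is an assumption. The recursion takes the max over fanout gates instead of enumerating paths, which gives the same value because the longest path from a gate is its own delay plus the longest path from one of its fanouts.

```python
from functools import lru_cache

def circuit_delay(gate_delays, fanout):
    """Maximum over all paths of the sum of gate delays along the path."""
    @lru_cache(maxsize=None)
    def longest_from(i):
        # Longest path starting at gate i: own delay plus worst fanout tail.
        tail = max((longest_from(j) for j in fanout[i]), default=0.0)
        return gate_delays[i] + tail

    return max(longest_from(i) for i in fanout)

# Hypothetical 7-gate DAG and gate delays, for illustration only.
fanout = {1: (4,), 2: (4, 5), 3: (5,), 4: (6,), 5: (6, 7), 6: (), 7: ()}
gate_delays = {1: 1.0, 2: 2.0, 3: 1.5, 4: 3.0, 5: 1.0, 6: 2.0, 7: 4.0}

D = circuit_delay(gate_delays, fanout)
```

Only sums and maxima of the gate delays appear, which matches the slide's observation that circuit delay is a generalized posynomial of the scale factors when each gate delay is a posynomial.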

Area & power
• total circuit area: $A = x_1 \bar A_1 + \cdots + x_n \bar A_n$
• total power is $P = P_{\mathrm{dyn}} + P_{\mathrm{stat}}$
  – dynamic power $P_{\mathrm{dyn}} = \sum_{i=1}^{n} f_i \left(C_i^{L} + C_i^{\mathrm{int}}\right) V_{dd}^2$
    $f_i$ is gate switching frequency
  – static power $P_{\mathrm{stat}} = V_{dd} \sum_{i=1}^{n} x_i \bar I_i^{\mathrm{leak}}$
    $\bar I_i^{\mathrm{leak}}$ is leakage current (average over input states) of unit scaled gate $i$
• $A$ and $P$ are linear functions of $x$, with positive coefficients, hence posynomials
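
The area and power expressions can be sketched as follows. The three-inverter chain, the switching frequencies, and the supply voltage are hypothetical values chosen for illustration, and the function names are assumptions.

```python
def total_area(x, A_bar):
    """A = sum_i x_i * A_bar_i."""
    return sum(xi * Ai for xi, Ai in zip(x, A_bar))

def dynamic_power(x, f, C_in_bar, C_int_bar, fanout, V_dd):
    """P_dyn = sum_i f_i * (C_L_i + C_int_i) * V_dd**2."""
    total = 0.0
    for i in range(len(x)):
        C_L = sum(C_in_bar[j] * x[j] for j in fanout[i])
        total += f[i] * (C_L + C_int_bar[i] * x[i]) * V_dd ** 2
    return total

def static_power(x, I_leak_bar, V_dd):
    """P_stat = V_dd * sum_i x_i * I_leak_bar_i."""
    return V_dd * sum(xi * Ii for xi, Ii in zip(x, I_leak_bar))

# Hypothetical chain of three inverters: gate 0 -> gate 1 -> gate 2.
fanout = [[1], [2], []]
x = [1.0, 2.0, 4.0]
A_bar = [3.0, 3.0, 3.0]
C_in_bar = [3.0, 3.0, 3.0]
C_int_bar = [3.0, 3.0, 3.0]
I_leak_bar = [0.006, 0.006, 0.006]
f = [1.0, 1.0, 1.0]   # hypothetical switching frequencies
V_dd = 1.0            # hypothetical supply voltage

A = total_area(x, A_bar)
P = dynamic_power(x, f, C_in_bar, C_int_bar, fanout, V_dd) + static_power(x, I_leak_bar, V_dd)
```

Every capacitance and leakage term is a positive multiple of some $x_j$, so doubling all scale factors doubles both `A` and `P`; that is the linearity the last bullet refers to.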

Basic gate scaling problem
  minimize   $D$
  subject to $P \le P^{\max}$, $A \le A^{\max}$
             $1 \le x_i$, $i = 1, \ldots, n$
... a GGP
extensions/variations:
• minimize area, power, or some combination
• maximize clock frequency subject to area, power limits
• add other constraints
• optimal trade-off of area, power, delay

Example: 32-bit Ladner-Fisher adder
• 451 gates (scale factors), 5 gate types, 64 inputs, 32 outputs
• logical effort gate delay model parameters:

  gate type   $\bar C^{\mathrm{in}}$   $\bar C^{\mathrm{int}}$   $\bar R$   $\bar A$   $\bar I^{\mathrm{leak}}$
  INV         3      3      0.48   3    0.006
  NAND2       4      6      0.48   8    0.007
  NOR2        5      6      0.48   10   0.009
  AOI21       6      7      0.48   17   0.003
  OAI21       6      7      0.48   16   0.003

• time unit is $\tau$, delay of min-size inverter ($0.69 \cdot 0.48 \cdot 3 \approx 1$)
• area (total width) unit is width of NMOS in min-size inverter

Example: 32-bit Ladner-Fisher adder
• typical optimization time: few seconds on PC
[Figure: optimal trade-off curve of area limit $A^{\max}$ (axis 3000 to 16000) versus circuit delay $D$ (axis 45 to 70)]

Extensions
• can use better (GP-compatible) models of delay, area, power, ...
• can distinguish rising/falling transitions, input pins, ...
• can add effect of signal slope
... problem remains a GGP

Statistical parameter variation
• circuit performance depends on random device and process parameters
• hence, performance measures like $P$, $D$ are random variables
• delay $D$ is max of many random variables; often skewed to right
• distributions of $P$, $D$ depend on gate scalings $x_i$
[Figure: histogram of circuit delay (frequency vs. circuit delay, axis marks at 45 and 53)]
• related to (parametric) yield, DFM, DFY ...

Statistical design
• summarize random performance measures by their 95% quantile (say)
  minimize   $Q_{.95}(D)$
  subject to $Q_{.95}(P) \le P^{\max}$, $A \le A^{\max}$
             $1 \le x_i$, $i = 1, \ldots, n$
• extremely difficult stochastic optimization problem; almost no analytic/exact results
• but, (GP-compatible) heuristic method works well
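
To illustrate what $Q_{.95}(D)$ measures, the Python sketch below estimates the 95% quantile of a circuit delay by Monte Carlo sampling. It is not the GP-compatible heuristic mentioned above, which these slides do not describe. The independent Gaussian path delays, the nominal values, and the function names are all assumptions made for the example; in a real circuit, paths share gates and their delays are correlated.

```python
import math
import random

def empirical_quantile(samples, q):
    """Smallest sample value with at least a fraction q of samples at or below it."""
    ordered = sorted(samples)
    k = max(math.ceil(q * len(ordered)) - 1, 0)
    return ordered[k]

def sample_circuit_delay(rng, nominal_path_delays, sigma):
    """One draw of D: max over paths of (hypothetical) Gaussian path delays."""
    return max(rng.gauss(mu, sigma) for mu in nominal_path_delays)

rng = random.Random(0)                      # fixed seed for repeatability
nominal = [50.0, 49.0, 48.5, 50.0, 47.0]    # hypothetical nominal path delays
draws = [sample_circuit_delay(rng, nominal, 2.0) for _ in range(10000)]

q95 = empirical_quantile(draws, 0.95)
median = empirical_quantile(draws, 0.5)
```

Because $D$ is a maximum of several random path delays, its quantiles sit above the largest nominal path delay, so designing against nominal values alone understates the delay that 95% of manufactured circuits will meet.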