Chapter 15: Numerical Strength Reduction
Keshab K. Parhi
• Sub-expression elimination is a numerical transformation of the constant multiplications that can lead to efficient hardware in terms of area, power and speed.
• Sub-expression elimination can only be performed on constant multiplications that operate on a common variable.
• It is essentially the process of examining the shift-and-add implementations of the constant multiplications and finding redundant operations.
• Example: a × x and b × x, where a = 001101 and b = 011011, can be performed as follows:
  – a × x = 000100 × x + 001001 × x
  – b × x = 010010 × x + 001001 × x = ((001001 × x) << 1) + (001001 × x)
  – The term 001001 × x needs to be computed only once.
  – So, the multiplications are implemented using 3 shifts and 3 adds as opposed to 5 shifts and 5 adds.
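The example above can be checked with a short Python sketch. The function names `direct` and `shared` are illustrative only, and integer left shifts stand in for hardware shifters.

```python
def direct(x):
    """a*x and b*x computed separately: 5 shifts and 5 adds in total."""
    ax = (x << 3) + (x << 2) + x              # a = 001101 (13)
    bx = (x << 4) + (x << 3) + (x << 1) + x   # b = 011011 (27)
    return ax, bx


def shared(x):
    """Same products with 001001*x computed once: 3 shifts and 3 adds."""
    t = (x << 3) + x        # 001001 * x, the common sub-expression
    ax = (x << 2) + t       # 000100*x + 001001*x
    bx = (t << 1) + t       # 010010*x + 001001*x
    return ax, bx
```

Both functions return (13x, 27x) for any integer x; only the operation count differs.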
Multiple Constant Multiplication (MCM)
The algorithm for MCM uses an iterative matching process that consists of the following steps:
1. Express each constant in the set using a binary format (such as signed, unsigned, or 2's complement representation).
2. Determine the number of bit-wise matches (non-zero bits) between all of the constants in the set.
3. Choose the best match.
4. Eliminate the redundancy from the best match. Return the remainders and the redundancy to the set of coefficients.
5. Repeat Steps 2-4 until no improvement is achieved.
Example:

Binary representation of constants

  Constant   Value   Unsigned
  a          237     11101101
  b          182     10110110
  c           93     01011101

Updated set of constants, 1st iteration

  Constant       Unsigned
  Rem. of a      10100000
  b              10110110
  Rem. of c      00010000
  Red. of a,c    01001101

Updated set of constants, 2nd iteration

  Constant          Unsigned
  Rem. of a         00000000
  Rem. of b         00010110
  Rem. of c         00010000
  Red. of a,c       01001101
  Red. of Rem a,b   10100000
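The iterative matching steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the reference algorithm: it only considers unshifted bit-wise matches on unsigned constants, it stops when the best match shares fewer than 2 non-zero bits (an assumed stopping rule, since a 1-bit match saves no addition), and the name `iterative_matching` and the `red(...)` labels are hypothetical.

```python
from itertools import combinations


def iterative_matching(constants):
    """Return the updated set of constants after each matching iteration.

    constants: dict mapping a name to an unsigned integer. A matched
    constant keeps its name but holds its remainder afterwards.
    """
    current = dict(constants)
    history = []
    while True:
        # Step 2: count the common non-zero bits of every pair.
        best = None
        for (na, va), (nb, vb) in combinations(current.items(), 2):
            common = va & vb
            score = bin(common).count("1")
            if best is None or score > best[0]:
                best = (score, na, nb, common)
        # Step 5: stop when the best match no longer saves an addition.
        if best is None or best[0] < 2:
            return history
        # Steps 3-4: remove the redundancy and return it to the set.
        _, na, nb, common = best
        current[na] &= ~common
        current[nb] &= ~common
        current[f"red({na},{nb})"] = common
        history.append(dict(current))


steps = iterative_matching({"a": 237, "b": 182, "c": 93})
for i, step in enumerate(steps, start=1):
    print(f"iteration {i}:", {k: format(v, "08b") for k, v in step.items()})
```

For the constants 237, 182 and 93 this sketch stops after two iterations, and the two printed sets correspond to the two updated tables of the example.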
Linear Transformations
• A general form of linear transformation is given as: y = T*x, where T is an m by n matrix, y is a length-m vector and x is a length-n vector. It can also be written as:

  $y_i = \sum_{j=1}^{n} t_{ij} x_j, \quad i = 1, \ldots, m$

• The following steps are followed:
  1. Minimize the number of shifts and adds required to compute the products t_ij x_j by using the iterative matching algorithm.
  2. Form the unique products using the sub-expressions found in the 1st step.
  3. Share the additions that are common among the y_i's. This step is very similar to the MCM problem.
Example:

  $T = \begin{bmatrix} 7 & 8 & 2 & 13 \\ 12 & 11 & 7 & 13 \\ 5 & 8 & 2 & 15 \\ 7 & 11 & 7 & 11 \end{bmatrix}$

• The constants in each column multiply a common variable. For example, x_1 is multiplied by the set of constants [7, 12, 5, 7].
• Applying the iterative matching algorithm, the following table is obtained:

  Column 1   Column 2   Column 3   Column 4
  0101       1000       0010       1001
  0010       1011       0111       0100
  1100                             0010
• Next, the unique products are formed as shown below:
  p_1 = 0101*x_1, p_2 = 0010*x_1, p_3 = 1100*x_1
  p_4 = 1000*x_2, p_5 = 1011*x_2
  p_6 = 0010*x_3, p_7 = 0111*x_3
  p_8 = 1001*x_4, p_9 = 0100*x_4, p_10 = 0010*x_4
• Using these products, the y_i's are as follows:
  y_1 = p_1 + p_2 + p_4 + p_6 + p_8 + p_9;
  y_2 = p_3 + p_5 + p_7 + p_8 + p_9;
  y_3 = p_1 + p_4 + p_6 + p_8 + p_9 + p_10;
  y_4 = p_1 + p_2 + p_5 + p_7 + p_8 + p_10;
• The final step shares the additions that are common among the y_i's. For this, each y_i is represented as a k-bit word (k = 10 here), where each of the k products formed after the 2nd step corresponds to a particular bit position (p_1 is the leftmost bit). Thus,
  y_1 = 1101010110, y_2 = 0010101110, y_3 = 1001010111, y_4 = 1100101101.
• Applying the iterative matching algorithm to reduce the number of additions required for the y_i's, we get:
  y_1 = p_2 + (p_1 + p_4 + p_6 + p_8 + p_9);
  y_2 = p_3 + p_9 + (p_5 + p_7 + p_8);
  y_3 = p_10 + (p_1 + p_4 + p_6 + p_8 + p_9);
  y_4 = p_1 + p_2 + p_10 + (p_5 + p_7 + p_8);
• The total number of additions is reduced from 35 to 20.
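The shared form of the 4 × 4 example can be checked against a plain matrix-vector product. In this sketch the function names are illustrative, and each product p_k is written as an ordinary integer multiplication; in hardware it would be a shift-and-add network.

```python
T = [
    [7, 8, 2, 13],
    [12, 11, 7, 13],
    [5, 8, 2, 15],
    [7, 11, 7, 11],
]


def transform_direct(x):
    """Reference result y = T*x."""
    return [sum(t * v for t, v in zip(row, x)) for row in T]


def transform_shared(x):
    """y = T*x built from the unique products p_1..p_10 and two shared sums."""
    x1, x2, x3, x4 = x
    p = {
        1: 0b0101 * x1, 2: 0b0010 * x1, 3: 0b1100 * x1,
        4: 0b1000 * x2, 5: 0b1011 * x2,
        6: 0b0010 * x3, 7: 0b0111 * x3,
        8: 0b1001 * x4, 9: 0b0100 * x4, 10: 0b0010 * x4,
    }
    s1 = p[1] + p[4] + p[6] + p[8] + p[9]   # shared by y_1 and y_3
    s2 = p[5] + p[7] + p[8]                 # shared by y_2 and y_4
    return [
        p[2] + s1,
        p[3] + p[9] + s2,
        p[10] + s1,
        p[1] + p[2] + p[10] + s2,
    ]
```

For any integer input vector, `transform_shared` and `transform_direct` return the same four outputs.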
Polynomial Evaluation
Evaluating the polynomial: x^13 + x^7 + x^4 + x^2 + x
• Without considering the redundancies, this polynomial evaluation requires 22 multiplications.
• Examining the exponents and considering their binary representations: 1 = 0001, 2 = 0010, 4 = 0100, 7 = 0111, 13 = 1101.
• x^7 can be considered as x^4 × x^2 × x^1. Applying sub-expression sharing to the exponents, the polynomial can be evaluated as follows:
  x^8 × (x^4 × x) + x^2 × (x^4 × x) + x^4 + x^2 + x
• The terms x^2, x^4 and x^8 each require one multiplication, as shown below:
  x^2 = x × x, x^4 = x^2 × x^2, x^8 = x^4 × x^4
• Thus, we require 6 instead of 22 multiplications.
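A small Python sketch of the shared evaluation, with a counter that confirms the number of multiplications. The helper names `evaluate_shared` and `mul` are illustrative.

```python
def evaluate_shared(x):
    """Evaluate x^13 + x^7 + x^4 + x^2 + x; return (value, multiplications)."""
    count = 0

    def mul(a, b):
        nonlocal count
        count += 1
        return a * b

    x2 = mul(x, x)        # x^2
    x4 = mul(x2, x2)      # x^4
    x8 = mul(x4, x4)      # x^8
    x5 = mul(x4, x)       # shared term (x^4 * x)
    value = mul(x8, x5) + mul(x2, x5) + x4 + x2 + x
    return value, count
```

The counter ends at 6: three squarings, one shared product x^4 × x, and one product each for x^13 and x^7.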
Sub-expression Sharing in Digital Filters
• Example of common sub-expression elimination within a single multiplication:
  y = $0.10\bar{1}00010\bar{1}$ × x (CSD digits, where $\bar{1}$ denotes -1).
  This may be implemented as:
  y = (x >> 1) - (x >> 3) + (x >> 7) - (x >> 9).
  Alternatively, this can be implemented as:
  x2 = x - (x >> 2)
  y = (x2 >> 1) + (x2 >> 7)
  which requires one less addition.
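The two implementations can be compared numerically. This sketch assumes exact arithmetic: a right shift by k is modelled as division by 2^k using `fractions.Fraction`, so the truncation of a real fixed-point shifter is ignored. The function names are illustrative.

```python
from fractions import Fraction


def shr(v, k):
    """Model of v >> k as an exact division by 2**k (no truncation)."""
    return Fraction(v) / 2**k


def multiply_direct(x):
    """Four shifted terms, three additions/subtractions."""
    return shr(x, 1) - shr(x, 3) + shr(x, 7) - shr(x, 9)


def multiply_shared(x):
    """x2 = x - (x >> 2) is computed once; two additions/subtractions."""
    x2 = x - shr(x, 2)
    return shr(x2, 1) + shr(x2, 7)
```

Both functions scale x by 1/2 - 1/8 + 1/128 - 1/512 = 195/512.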
• In order to realize the sub-expression elimination transformation, the N-tap FIR filter
  y(n) = c_0 x(n) + c_1 x(n-1) + … + c_{N-1} x(n-N+1)
  must be implemented using the transposed direct-form structure, also called the data-broadcast filter structure, as shown below:
  [Figure: transposed direct-form (data-broadcast) FIR filter structure]
• Represent a filter operation by a table (matrix) {x_ij}, where the rows are indexed by delay i and the columns by shift j, i.e., row i is the coefficient c_i for the term x(n-i), column 0 in row i is the msb of c_i, and column W-1 in row i is the lsb of c_i, where W is the word length.
• The row and column indexing starts at 0.
• The entries are 0 or 1 if 2's complement representation is used and {-1, 0, 1} if CSD is used.
• A non-zero entry in row i and column j represents x(n-i) >> j. It is to be added or subtracted according to whether the entry is +1 or -1.
Example: y(n) = $1.000\bar{1}00000$*x(n) + $0.\bar{1}0\bar{1}010010$*x(n-1) + $0.00010000\bar{1}$*x(n-2), where $\bar{1}$ denotes the CSD digit -1.

       j:   0   1   2   3   4   5   6   7   8   9
  i = 0:    1   0   0   0  -1   0   0   0   0   0
  i = 1:    0  -1   0  -1   0   1   0   0   1   0
  i = 2:    0   0   0   0   1   0   0   0   0  -1

This filter has 8 non-zero terms and thus requires 7 additions. But the sub-expression x1 - (x1[-1] >> 1) occurs 4 times, shifted and delayed by various amounts (and possibly negated). So, the filter requires only 4 adds:
  x2 = x1 - (x1[-1] >> 1)
  y = x2 - (x2 >> 4) - (x2[-1] >> 3) + (x2[-1] >> 8)
An alternative realization is:
  x2 = x1 - (x1 >> 4) - (x1[-1] >> 3) + (x1[-1] >> 8)
  y = x2 - (x2[-1] >> 1).
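Both realizations can be checked against the table form of the filter. This is a sketch under stated assumptions: signals are lists of exact fractions, x1[-d] is a delay by d samples with zero initial state, and a shift by k is an exact scaling by 2^-k (no truncation). The helper names `term`, `add` and `neg` are illustrative.

```python
from fractions import Fraction


def term(sig, delay=0, shift=0):
    """sig delayed by `delay` samples and scaled by 2**-shift."""
    padded = [Fraction(0)] * delay + list(sig)
    return [v * Fraction(2) ** -shift for v in padded[:len(sig)]]


def add(*sigs):
    """Sample-by-sample sum of several signals."""
    return [sum(vals) for vals in zip(*sigs)]


def neg(sig):
    return [-v for v in sig]


# Non-zero table entries as (sign, delay i, shift j).
TERMS = [(+1, 0, 0), (-1, 0, 4),
         (-1, 1, 1), (-1, 1, 3), (+1, 1, 5), (+1, 1, 8),
         (+1, 2, 4), (-1, 2, 9)]


def direct(x1):
    """Seven additions: one per non-zero entry after the first."""
    return add(*[[s * v for v in term(x1, d, k)] for s, d, k in TERMS])


def realization_a(x1):
    x2 = add(x1, neg(term(x1, 1, 1)))
    return add(x2, neg(term(x2, 0, 4)), neg(term(x2, 1, 3)), term(x2, 1, 8))


def realization_b(x1):
    x2 = add(x1, neg(term(x1, 0, 4)), neg(term(x1, 1, 3)), term(x1, 1, 8))
    return add(x2, neg(term(x2, 1, 1)))
```

For any input sequence the three functions produce identical outputs; the two shared realizations each use 4 additions.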
Example: y(n) = $\bar{1}.01010000010$*x(n) + $0.\bar{1}000\bar{1}0\bar{1}0\bar{1}0\bar{1}$*x(n-1) + $0.\bar{1}0010000010$*x(n-2) + $1.0000010\bar{1}000$*x(n-3), where $\bar{1}$ denotes the CSD digit -1.
The substructure matching procedure for this design is as follows:
1. Start with the table containing the coefficients of the FIR filter. An entry with absolute value of 1 in this table denotes an add or subtract of x1. Identify the best sub-expression of size 2.

       j:   0   1   2   3   4   5   6   7   8   9  10  11
  i = 0:   -1   0   1   0   1   0   0   0   0   0   1   0
  i = 1:    0  -1   0   0   0  -1   0  -1   0  -1   0  -1
  i = 2:    0  -1   0   0   1   0   0   0   0   0   1   0
  i = 3:    1   0   0   0   0   0   1   0  -1   0   0   0
2. Remove each occurrence of each sub-expression and replace it by a value of 2 or -2 in place of the first (row major) of the 2 terms making up the sub-expression.

       j:   0   1   2   3   4   5   6   7   8   9  10  11
  i = 0:   -1   0   2   0   1   0   0   0   0   0   2   0
  i = 1:    0   0   0   0   0  -2   0  -1   0   0   0  -2
  i = 2:    0  -2   0   0   0   0   0   0   0   0   0   0
  i = 3:    0   0   0   0   0   0   1   0  -1   0   0   0

3. Record the definition of the sub-expression. This may require a negative value of shift, which will be taken care of later.
  x2 = x1 - (x1[-1] >> (-1))
4. Continue by finding more sub-expressions until done.

       j:   0   1   2   3   4   5   6   7   8   9  10  11
  i = 0:   -1   0   3   0   0   0   0   0   0   0   2   0
  i = 1:    0   0   0   0   0  -3   0   0   0   0   0  -2
  i = 2:    0  -2   0   0   0   0   0   0   0   0   0   0
  i = 3:    0   0   0   0   0   0   1   0  -1   0   0   0

5. Write out the complete definition of the filter.
  x2 = x1 - (x1[-1] >> (-1))
  x3 = x2 + (x1 >> 2)
  y = -x1 + (x3 >> 2) + (x2 >> 10) - (x3[-1] >> 5) - (x2[-1] >> 11) - (x2[-2] >> 1) + (x1[-3] >> 6) - (x1[-3] >> 8).
6. If any sub-expression definition involves a negative shift, then modify the definition and subsequent uses of that variable to remove the negative shift, as shown below:
  x2 = (x1 >> 1) - x1[-1]
  x3 = x2 + (x1 >> 3)
  y = -x1 + (x3 >> 1) + (x2 >> 9) - (x3[-1] >> 4) - (x2[-1] >> 10) - x2[-2] + (x1[-3] >> 6) - (x1[-3] >> 8).
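The filter definitions with and without the negative shift can be checked against the coefficient table. This is a sketch under stated assumptions: the fourth tap is taken to act on x(n-3), consistent with the x1[-3] terms in the definitions; signals are lists of exact fractions with zero initial state; a shift by k is an exact scaling by 2^-k, so a shift of -1 doubles the value. All helper names are illustrative.

```python
from fractions import Fraction


def term(sig, delay=0, shift=0):
    """sig delayed by `delay` samples and scaled by 2**-shift."""
    padded = [Fraction(0)] * delay + list(sig)
    return [v * Fraction(2) ** -shift for v in padded[:len(sig)]]


def add(*sigs):
    return [sum(vals) for vals in zip(*sigs)]


def neg(sig):
    return [-v for v in sig]


# Non-zero CSD table entries as (sign, delay i, shift j).
TERMS = [(-1, 0, 0), (+1, 0, 2), (+1, 0, 4), (+1, 0, 10),
         (-1, 1, 1), (-1, 1, 5), (-1, 1, 7), (-1, 1, 9), (-1, 1, 11),
         (-1, 2, 1), (+1, 2, 4), (+1, 2, 10),
         (+1, 3, 0), (+1, 3, 6), (-1, 3, 8)]


def direct(x1):
    return add(*[[s * v for v in term(x1, d, k)] for s, d, k in TERMS])


def with_negative_shift(x1):
    x2 = add(x1, neg(term(x1, 1, -1)))
    x3 = add(x2, term(x1, 0, 2))
    return add(neg(x1), term(x3, 0, 2), term(x2, 0, 10),
               neg(term(x3, 1, 5)), neg(term(x2, 1, 11)),
               neg(term(x2, 2, 1)), term(x1, 3, 6), neg(term(x1, 3, 8)))


def without_negative_shift(x1):
    x2 = add(term(x1, 0, 1), neg(term(x1, 1, 0)))
    x3 = add(x2, term(x1, 0, 3))
    return add(neg(x1), term(x3, 0, 1), term(x2, 0, 9),
               neg(term(x3, 1, 4)), neg(term(x2, 1, 10)),
               neg(term(x2, 2, 0)), term(x1, 3, 6), neg(term(x1, 3, 8)))
```

Under these assumptions all three functions return the same output sequence; removing the negative shift only rescales x2 and x3 by 1/2 and reduces each later shift of them by 1.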
[Figure: 3-tap FIR filter with sub-expression sharing, for coefficients c_2 = 0.11010010, c_1 = 0.10011010 and c_0 = 0.00101011.] This requires 7 shifts and 9 additions compared to 12 shifts and 11 additions.
[Figure: 3-tap FIR filter with sub-expression sharing requiring 8 additions, as compared to 9 in the previous implementation.]