
Residue arithmetic for designing low-power multiply-add units (PATMOS 2010, Grenoble, France)


  1. 20th International Workshop on Power and Timing Modeling, Optimization and Simulation (PATMOS 2010), Grenoble, France
     Residue arithmetic for designing low-power multiply-add units
     I. Kouretas and V. Paliouras, Electrical and Computer Engineering Dept., University of Patras, Greece

  2. Outline
     - Review of RNS basics
     - Architecture of RNS-based systems
     - Multi-Vdd RNS architecture
     - Structure of processing units
     - Relevance to RNS bases
     - Results
     - Conclusions

  3. Low-power through alternative number representations
     - Sign-magnitude versus two’s-complement
       - Depends on data (signal) statistics
     - Logarithmic number system
       - Choice of representation parameters
       - V. Paliouras, T. Stouraitis, “Low-power properties of logarithmic number system,” IEEE Symposium on Computer Arithmetic, 2001.
     - Residue representations
       - Numerical properties of RNS
       - Inherently parallel structure
     - T. Stouraitis, V. Paliouras, “Considering the alternatives in low power design,” IEEE Circuits and Devices, 2001.

  4. RNS basics
     - RNS maps an integer $X$ to an $N$-tuple of residues $x_i$: $X \xrightarrow{\text{RNS}} \{x_1, x_2, \ldots, x_N\}$, where $x_i = X \bmod m_i$ and $x_i$ is called the $i$-th residue.
     - $m_i$ is a member of a set of pairwise co-prime integers $\{m_1, m_2, \ldots, m_N\}$ called the base.
     - $m_i$ is called a modulus.
     - Dynamic range: $\prod_{i=1}^{N} m_i$.
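
The mapping on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the presentation: the function names `to_rns` and `dynamic_range`, the example base and the example integer are all assumptions chosen for readability.

```python
from math import prod

def to_rns(x, base):
    """Map integer x to its tuple of residues x_i = x mod m_i."""
    return tuple(x % m for m in base)

def dynamic_range(base):
    """Product of all moduli: the number of integers the base can represent."""
    return prod(base)

# Illustrative base of the form {2^n, 2^n - 1, 2^n + 1} with n = 4 (an assumption).
base = (16, 15, 17)
residues = to_rns(1000, base)
print(residues, dynamic_range(base))
```

Every integer in the interval from 0 up to (but excluding) the dynamic range has a unique residue tuple, provided the moduli are pairwise co-prime.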

  5. RNS architecture vs binary architecture
     [Figure: the binary architecture feeds n-bit operands to a single n-bit binary processor that produces n-bit results. The RNS architecture converts the n-bit operands from binary to RNS, processes them in L parallel mod $m_i$ processors of $n_i$ bits each, and converts the results from RNS back to binary.]
     - Data is processed in L parallel independent channels
     - Benefit: $n_i \ll n$
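
The channel independence described on this slide can be illustrated with a small software model: each channel computes its own multiply-add modulo $m_i$, and only the final conversion combines the channels. This is a hedged sketch, not the authors' hardware. The helper names are hypothetical, and the inverse conversion uses the textbook Chinese Remainder Theorem, which the slides do not specify. The base is one of the four-moduli bases listed in the results.

```python
from math import prod

def to_rns(x, base):
    """Forward conversion: binary integer to residues."""
    return tuple(x % m for m in base)

def mac_rns(a, b, c, base):
    """Channel-wise multiply-add: no data crosses between channels."""
    return tuple((ai * bi + ci) % m for ai, bi, ci, m in zip(a, b, c, base))

def from_rns(residues, base):
    """Inverse conversion via the Chinese Remainder Theorem (assumed method)."""
    M = prod(base)
    total = 0
    for r, m in zip(residues, base):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # modular inverse, Python 3.8+
    return total % M

base = (16, 31, 2047, 1025)
a, b, c = 1234, 567, 89
result = from_rns(mac_rns(to_rns(a, base), to_rns(b, base), to_rns(c, base), base), base)
print(result == a * b + c)
```

The result matches ordinary integer arithmetic as long as `a * b + c` stays below the dynamic range of the base.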

  6. Remarks on RNS architecture
     [Figure: RNS architecture with binary-to-RNS conversion, L parallel mod $m_i$ processors of $n_i$ bits each, and RNS-to-binary conversion.]
     - The conversion issue
       - Forward and inverse
     - Implementation of moduli channels is not identical
       - There are fast channels and slow channels

  7. RNS advantages / disadvantages
     - Advantages
       - Parallel multiplication or addition
       - Fault tolerance
       - Reduced power dissipation in filters
     - Disadvantages
       - Difficult comparisons
       - Overflow detection
       - Sign detection
       - Division
       - Scaling / Rounding / Truncation

  8. RNS multi-VDD architecture
     - $P_{dyn} = \sum_{i=1}^{p} C_{L,i} \cdot V_{dd,i}^{2} \cdot f_i \cdot a_i$
     - $p$ is the number of moduli channels
     - Power is quadratically related to voltage
     - Moduli channels are distinguished into Vdd(H) and Vdd(L) channels
     - Benefit: Easy to implement
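
The effect of the quadratic voltage term can be checked numerically. The sketch below evaluates the per-channel sum from this slide under stated assumptions: the slide does not define its symbols, so the code reads $C_{L,i}$ as load capacitance, $f_i$ as clock frequency and $a_i$ as switching activity, and every numeric value is a made-up placeholder rather than data from the presentation.

```python
def dynamic_power(channels):
    """Sum of C_L * Vdd^2 * f * a over all moduli channels."""
    return sum(ch["C_L"] * ch["Vdd"] ** 2 * ch["f"] * ch["a"] for ch in channels)

# Hypothetical two-channel system; all numbers are illustrative assumptions.
VDD_H, VDD_L = 1.0, 0.8
single_vdd = [
    {"C_L": 1e-12, "Vdd": VDD_H, "f": 5e8, "a": 0.2},  # slow channel
    {"C_L": 1e-12, "Vdd": VDD_H, "f": 5e8, "a": 0.2},  # fast channel
]
# Multi-Vdd: the fast channel is moved to the low supply voltage.
multi_vdd = [dict(single_vdd[0]), dict(single_vdd[1], Vdd=VDD_L)]

p_before = dynamic_power(single_vdd)
p_after = dynamic_power(multi_vdd)
print(f"savings: {100 * (p_before - p_after) / p_before:.1f}%")
```

In this toy setting, lowering one of two identical channels from 1.0 V to 0.8 V scales that channel's power by 0.64, which is the quadratic dependence the slide refers to.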

  9. Employed RNS bases
     Bases used in this study:
     - Three-moduli base: $\{2^{n_1}, 2^{n_2}-1, 2^{n_3}+1\}$
     - Four-moduli base: $\{2^{n_1}, 2^{n_2}-1, 2^{n_3}-1, 2^{n_4}+1\}$
     - Five-moduli base: $\{2^{n_1}, 2^{n_2}-1, 2^{n_3}-1, 2^{n_4}+1, 2^{n_5}+1\}$
     - Cases of common choices in the literature: $\{2^{n}, 2^{n}-1, 2^{n}+1\}$ and extended sets with further moduli of the forms $2^{n_i}-1$ and $2^{n_i}+1$
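
A base of these forms is only valid if its moduli are pairwise co-prime, which depends on the chosen exponents. The following sketch builds a base from exponent lists and checks that condition. The function names `make_base` and `is_pairwise_coprime` are hypothetical, and the exponents in the first example were picked to reproduce the base {16, 31, 2047, 1025} that appears in the four-moduli results.

```python
from itertools import combinations
from math import gcd, prod

def make_base(n_pow2, n_minus, n_plus):
    """Build {2^n1} plus moduli 2^n - 1 (n in n_minus) and 2^n + 1 (n in n_plus)."""
    return ([1 << n_pow2]
            + [(1 << n) - 1 for n in n_minus]
            + [(1 << n) + 1 for n in n_plus])

def is_pairwise_coprime(base):
    """True if every pair of moduli has greatest common divisor 1."""
    return all(gcd(a, b) == 1 for a, b in combinations(base, 2))

good = make_base(4, [5, 11], [10])   # [16, 31, 2047, 1025]
bad = make_base(4, [2, 4], [3])      # [16, 3, 15, 9]: 3 divides 15 and 9
print(good, is_pairwise_coprime(good), (prod(good) - 1).bit_length())
print(bad, is_pairwise_coprime(bad))
```

The bit length printed for the valid base is the width of an equivalent binary datapath covering the same dynamic range.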

  10. Architectures of multiply adders
     [Figure: multiply-add unit architectures for binary, modulo-$2^n$, modulo-$(2^n-1)$, and modulo-$(2^n+1)$ arithmetic.]
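
The three modulus forms differ in how the wide product is reduced, which is why the channels are not identical. The sketch below is a software model of the standard reduction identities only ($2^n \equiv 0$, $2^n \equiv 1$ and $2^n \equiv -1$ respectively); it is an assumption-laden illustration with hypothetical function names and does not reproduce the circuit structures shown on the slide.

```python
def mac_mod_pow2(a, b, c, n):
    """(a*b + c) mod 2^n: keep the n low-order bits."""
    return (a * b + c) & ((1 << n) - 1)

def mac_mod_pow2_minus1(a, b, c, n):
    """(a*b + c) mod (2^n - 1): fold high bits back in, since 2^n = 1 (mod m)."""
    m = (1 << n) - 1
    s = a * b + c
    while s > m:
        s = (s & m) + (s >> n)   # end-around carry
    return 0 if s == m else s    # the all-ones pattern also represents zero

def mac_mod_pow2_plus1(a, b, c, n):
    """(a*b + c) mod (2^n + 1): alternate signs of n-bit digits, since 2^n = -1 (mod m)."""
    m = (1 << n) + 1
    mask = (1 << n) - 1
    s, acc, sign = a * b + c, 0, 1
    while s:
        acc += sign * (s & mask)
        sign = -sign
        s >>= n
    return acc % m               # final small correction into the range 0..m-1

print(mac_mod_pow2(13, 7, 5, 4),
      mac_mod_pow2_minus1(13, 7, 5, 4),
      mac_mod_pow2_plus1(13, 7, 5, 4))
```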

  11. Four moduli-bases results

     | Base | Power before (mW) | Power after (mW) | Area (µm²) | Delay (ns) | Power savings (%) |
     |---|---|---|---|---|---|
     | {16, 31, 2047, 1025*} | 3.16 | 2.28 | 12507.15 | 2 | 27.79 |
     | {32, 15, 511*, 4097} | 3.11 | 2.36 | 12056.04 | 2 | 24.05 |
     | {16, 31, 2047*, 1025*} | 3.16 | 2.07 | 14390.08 | 2 | 34.52 |
     | {32, 511*, 2047, 17} | 2.99 | 2.24 | 11585.72 | 2 | 25.01 |
     | {16, 31*, 2047, 1025*} | 3.16 | 2.15 | 12301.9 | 2 | 31.9 |
     | {32, 511*, 2047*, 17} | 2.99 | 2.03 | 13468.65 | 2 | 32.14 |
     | {16, 31*, 2047*, 1025*} | 3.16 | 1.94 | 14184.83 | 2 | 38.63 |
     | {256*, 31, 4095, 17} | 1.82 | 1.68 | 7958.7 | 2 | 7.77 |
     | {16*, 31, 2047, 1025*} | 3.16 | 2.25 | 12265.13 | 2 | 28.89 |
     | {32*, 15, 511*, 4097} | 3.11 | 2.33 | 12082.38 | 2 | 25.08 |
     | {16*, 31, 2047*, 1025*} | 3.16 | 2.03 | 14148.06 | 2 | 35.63 |
     | {32*, 511*, 2047, 17} | 2.99 | 2.21 | 11612.06 | 2 | 26.08 |
     | {16*, 31*, 2047, 1025*} | 3.16 | 2.12 | 12059.88 | 2 | 33 |
     | {32*, 511*, 2047*, 17} | 2.99 | 2 | 13494.99 | 2 | 33.2 |

  12. Five moduli-bases results

     | Base | Power before (mW) | Power after (mW) | Area (µm²) | Delay (ns) | Power savings (%) |
     |---|---|---|---|---|---|
     | {16, 31, 63, 17, 1025*} | 3.3735 | 2.4955 | 13912.0799 | 1.67 | 26.03 |
     | {64, 31, 127, 33*, 65} | 2.1642 | 2.0944 | 10590.7424 | 1.3 | 3.23 |
     | {16, 31, 63, 17*, 1025*} | 3.3735 | 2.4145 | 14082.7567 | 1.67 | 28.43 |
     | {64, 31, 511*, 17, 33} | 3.1573 | 2.4103 | 12844.1152 | 1.73 | 23.66 |
     | {16, 31, 63*, 17, 1025*} | 3.3735 | 2.3016 | 13527.3711 | 1.67 | 31.77 |
     | {64, 31, 511*, 17*, 33} | 3.1573 | 2.3293 | 13014.792 | 1.73 | 26.22 |
     | {16, 31, 63*, 17*, 1025*} | 3.3735 | 2.2206 | 13698.0479 | 1.67 | 34.18 |
     | {64, 63*, 127, 17, 65} | 2.2674 | 2.0735 | 10667.5744 | 1.47 | 8.55 |
     | {16, 31*, 63, 17, 1025*} | 3.3735 | 2.3656 | 13706.8287 | 1.67 | 29.88 |
     | {64, 63*, 127, 17*, 65} | 2.2674 | 1.9925 | 10838.512 | 1.47 | 12.12 |
     | {16, 31*, 63, 17*, 1025*} | 3.3735 | 2.2846 | 13877.5055 | 1.67 | 32.28 |
     | {64, 31*, 511*, 17, 33} | 3.1573 | 2.2804 | 12638.864 | 1.73 | 27.77 |
     | {16, 31*, 63*, 17, 1025*} | 3.3735 | 2.1717 | 13322.1199 | 1.67 | 35.62 |
     | {64, 31*, 511*, 17*, 33} | 3.1573 | 2.1994 | 12809.5408 | 1.73 | 30.34 |
     | {16, 31*, 63*, 17*, 1025*} | 3.3735 | 2.0907 | 13492.7967 | 1.67 | 38.03 |
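
The savings column appears to follow directly from the two power columns as the relative reduction (before minus after, divided by before). The sketch below recomputes it for three rows of the five-moduli table. The formula is inferred from the tabulated numbers rather than stated on the slide, and the function name `power_savings` is an assumption.

```python
def power_savings(before_mw, after_mw):
    """Relative power reduction in percent."""
    return 100.0 * (before_mw - after_mw) / before_mw

# (base, power before, power after) copied from the five-moduli table.
rows = [
    ("{16, 31, 63, 17, 1025*}", 3.3735, 2.4955),
    ("{64, 63*, 127, 17, 65}", 2.2674, 2.0735),
    ("{16, 31*, 63*, 17*, 1025*}", 3.3735, 2.0907),
]
for base, before, after in rows:
    print(f"{base}: {power_savings(before, after):.2f}%")
```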

  13. Conclusions
     - The multi-Vdd low-power technique is suitable for RNS systems.
     - Applying multi-Vdd further reduces power dissipation in RNS systems.
     - High-Vdd and low-Vdd channels can be easily determined.

  14. The End
     Thank you for your attention!
