A New Multiplicative Inverse Architecture in Normal Basis Using Novel Concurrent Serial Squaring and Multiplication
Amin Monfared, Hayssam El-Razouk and Arash Reyhani-Masoleh
Presented by: Arash Reyhani-Masoleh
Department of Electrical and Computer Engineering, Western University, London, Ontario, Canada
24th IEEE Symposium on Computer Arithmetic, 2017
Outline
• Motivation
• Arithmetic operations over $GF(2^m)$ using Gaussian Normal Basis (GNB)
• Proposed digit-level square-multiply architecture
  • It computes $A \times B^{2^e}$
  • The digits of both inputs $A$ and $B$ are entered serially
  • Denoted by Digit-Level Fully Serial-In Square-Multiply (DL-FSISM)
• Proposed inversion architecture
  • It uses the DL-FSISM
• ASIC implementations and comparison
• Conclusions and future work
Motivation: Finite Fields
• Many applications use arithmetic operations over $GF(2^m)$
  • Cryptography: Elliptic Curve, AES
  • Error control coding
    • Reed-Solomon code
• There are different bases to represent a field element.
  • Polynomial basis, normal basis (NB), dual basis, etc.
• In NB, squaring is free in hardware.
Motivation: Gaussian Normal Basis (GNB)
• GNB over $GF(2^m)$ is a special class of NB and exists whenever $m$ is not divisible by 8.
• GNBs have been included in IEEE and NIST standards for ECDSA.
• Any field element $A$ can be represented as $A = \sum_{i=0}^{m-1} a_i \beta^{2^i}$, where $a_i \in \{0,1\}$ and $\{\beta, \ldots, \beta^{2^{m-1}}\}$ is a GNB over $GF(2^m)$.
• In this paper, we consider GNB and propose new digit-level architectures for square-multiply and inversion.
Arithmetic Operations over $GF(2^m)$ using GNB
• Addition
  • Let $A$ and $B$ be two field elements represented in GNB.
  • The addition operation is a bit-wise XOR of the coordinates of the two inputs: $A + B = \sum_{i=0}^{m-1} (a_i + b_i) \beta^{2^i}$
• Squaring
  • Squaring is performed by a right cyclic shift of the coordinates of $A$: $A^2 = \sum_{i=0}^{m-1} a_i \beta^{2^{i+1}}$
  • It is free in hardware if all coordinates are available in parallel.
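The two operations above can be illustrated with a minimal Python sketch. It assumes an element is stored as a list of its $m$ coordinates $[a_0, \ldots, a_{m-1}]$; the function names `gnb_add` and `gnb_square` are hypothetical and chosen only for this illustration.

```python
def gnb_add(a, b):
    """Add two normal-basis elements: bit-wise XOR of their coordinates."""
    if len(a) != len(b):
        raise ValueError("operands must have the same number of coordinates")
    return [x ^ y for x, y in zip(a, b)]


def gnb_square(a):
    """Square a normal-basis element: coordinate i moves to position i + 1,
    and the last coordinate wraps around to position 0 (right cyclic shift)."""
    return a[-1:] + a[:-1]


a = [1, 0, 1, 1, 0]          # a_0 .. a_4, so m = 5 in this toy example
b = [0, 1, 1, 0, 0]
total = gnb_add(a, b)        # [1, 1, 0, 1, 0]
squared = gnb_square(a)      # [0, 1, 0, 1, 1]

# Squaring m times returns the original element, since A^(2^m) = A.
cycled = a
for _ in range(len(a)):
    cycled = gnb_square(cycled)
```

In hardware the shift is only a rewiring of the coordinates, which is why squaring costs nothing when all coordinates are available in parallel.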
Arithmetic Operations using GNB: Multiplication
• Finite field multiplication is more complex than addition and squaring.
• Multiplication can be implemented in digit-level architectures, in which the digit size can be chosen based on the available resources.
• In this paper, we use two different types of digit-level multipliers, namely:
  • Digit-Level Parallel-In Serial-Out (DL-PISO)
  • Digit-Level Parallel-In Parallel-Out (DL-PIPO)
• We also propose a new multiplier/squarer architecture:
  • Digit-Level Fully Serial-In Square-Multiply (DL-FSISM).
Arithmetic Operations using GNB: Inversion
• Based on Fermat's Little Theorem, an inversion can be calculated by
  • $A^{-1} = A^{2^m - 2} \in GF(2^m)$, $A \neq 0$.
• In the Itoh and Tsujii algorithm (ITA) [4], the number of multiplications is reduced by decomposing $2^{m-1} - 1$.
• As an example, for the NIST recommended field $GF(2^{233})$: $2^{232} - 1 = (1 + 2)(1 + 2^2)(1 + 2^4)(1 + 2^8 (1 + 2^8)(1 + 2^{16})(1 + 2^{32} (1 + 2^{32})(1 + 2^{64} (1 + 2^{64}))))$
  • The inversion using ITA takes a total of 10 iterations.
  • Each iteration consists of one single digit-level parallel-in parallel-out (DL-PIPO) multiplication and one free squaring.
[4] T. Itoh and S. Tsujii, "A fast algorithm for computing multiplicative inverses in GF(2^m) using normal bases," Information and Computation, vol. 78, no. 3, pp. 171–177, 1988.
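The decomposition for $m = 233$ can be checked with ordinary Python integers, as in the sketch below. The closed-form count $\lfloor \log_2(m-1) \rfloor + HW(m-1) - 1$ used in `ita_multiplications` is the commonly quoted multiplication count for ITA; it is not stated on the slides and is included here only as a cross-check. All names are hypothetical.

```python
m = 233

# Nested factors of the decomposition, built from the inside out.
inner64 = 1 + 2**64 * (1 + 2**64)
inner32 = 1 + 2**32 * (1 + 2**32) * inner64
inner8 = 1 + 2**8 * (1 + 2**8) * (1 + 2**16) * inner32
chain = (1 + 2) * (1 + 2**2) * (1 + 2**4) * inner8

# The product equals 2^(m-1) - 1, and doubling that exponent gives 2^m - 2.
decomposition_ok = chain == 2**(m - 1) - 1
exponent_ok = 2 * chain == 2**m - 2


def ita_multiplications(m):
    """Multiplication count of ITA: floor(log2(m-1)) + HW(m-1) - 1."""
    n = m - 1
    return (n.bit_length() - 1) + bin(n).count("1") - 1


count_233 = ita_multiplications(233)   # 10
```

Each of the ten factors of the form $(1 + \ldots)$ in the decomposition corresponds to one multiplication, which matches the 10 iterations quoted above.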
Arithmetic Operations using GNB: Inversion (cont'd)
• Our inversion flow (based on ITA) interleaves the computations of a digit-level parallel-in serial-out (DL-PISO) multiplier and our new DL-FSISM architecture.
  • For $GF(2^{233})$, it needs a total of only 5 iterations.
  • Each iteration consists of two single multiplications (and squarings).
• In this paper, we propose a new digit-level fully serial-in parallel-out square-multiply (DL-FSISM) architecture which performs squaring and multiplication concurrently without introducing any delay.
Proposed Digit-Level Fully Serial-In Square-Multiply (DL-FSISM)
• Let $A$ and $B$ be field elements and $e$ be an integer.
• The proposed scheme reads the inputs $A$ and $B$ serially, digit by digit, and concurrently computes $F = A \times B^{2^e}$.
• The composite operations of squaring and multiplication are performed concurrently without introducing any additional delay.
• For a digit size of $d$ bits, it takes $\lceil m/d \rceil$ clock cycles to generate the result $F = A \times B^{2^e}$.
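To make the digit-serial input order concrete, the sketch below lists which coordinate indices would enter in each of the $k = \lceil m/d \rceil$ clock cycles, using the index pattern $d(k-1-i)+j$ from the key formulation (most significant digit first). Marking the positions beyond $m-1$ as padding is an assumption of this sketch, and `digit_schedule` is a hypothetical helper name.

```python
import math


def digit_schedule(m, d):
    """Return, for each cycle i = 0 .. k-1, the coordinate indices
    d*(k-1-i) + j for j = 0 .. d-1. Indices >= m do not exist in an
    m-coordinate element and are reported as None (assumed zero padding)."""
    k = math.ceil(m / d)
    schedule = []
    for i in range(k):
        base = d * (k - 1 - i)
        schedule.append([base + j if base + j < m else None for j in range(d)])
    return schedule


toy = digit_schedule(5, 2)               # [[4, None], [2, 3], [0, 1]]
cycles_233 = len(digit_schedule(233, 8))  # 30 cycles for m = 233, d = 8
```

A larger digit size $d$ shortens the schedule at the cost of processing more coordinates per cycle.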
Proposed DL-FSISM: Key Formulation
Proposition 1: Let $A$ and $B$ be two $GF(2^m)$ elements represented in the GNB $\{\beta, \ldots, \beta^{2^{m-1}}\}$. One can compute $F = AB^{2^e}$ by proceeding from $i = 0$ to $k - 1$; the result $F = F^{(k-1)} = A^{(k-1)} (B^{(k-1)})^{2^e}$ is obtained using a recurrence relation that forms $F^{(i)}$ from $F^{(i-1)}$ and two sums over $j = 0, \ldots, d-1$: one whose terms depend on $(a_{d(k-1-i)+j}, B^{(i)})$ and one whose terms depend on $(b_{d(k-1-i)+j}, A^{(i-1)})$. Each term is a function of a bit $u \in \{0,1\}$ and an element $V = \sum_{i=0}^{m-1} v_i \beta^{2^i} \in GF(2^m)$.
[Equation: recurrence relation for $F^{(i)}$; its exponents are not recoverable from the extracted slide]
Proposed DL-FSISM: Architecture
• Three registers X, Y, and Z are initially cleared.
• Digits of the inputs are entered into X and Y serially, starting from the MSB.
• After $\lceil m/d \rceil$ clock cycles, Z contains $AB^{2^e}$.
[Figure: DL-FSISM architecture; register <X> receives the digits of $A$, register <Y> receives the digits of $B$, and register <Z> accumulates the result according to the recurrence for $F^{(i)}$ in Proposition 1]
Proposed DL-FSISM: Architecture (cont'd)
[Figure: the DL-FSISM architecture repeated, with a detailed view of the block that has inputs in1 and in2]
Proposed Inversion Architecture
• The inversion core is made by serially connecting the DL-PISO and the DL-FSISM.
• The register file stores only values coming from the multipliers.
• A register is added between the two multipliers to shorten the critical path.
• Each iteration selects one of the inputs of the multiplexers and takes $\lceil m/d \rceil + 1$ clock cycles.
[Figure: proposed inversion architecture; one label in the figure reads $\varepsilon = \{2, 8, 32, 64\}$]
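As a rough latency estimate, the per-iteration cost of $\lceil m/d \rceil + 1$ cycles can be multiplied by the number of iterations reported in this deck for the proposed scheme. The sketch below does exactly that; it assumes no extra cycles for loading inputs or reading the result, which the slides do not specify, and the names are hypothetical.

```python
import math

# Iterations of the proposed scheme per field size, as reported in this deck.
PROPOSED_ITERATIONS = {163: 5, 233: 5, 283: 6, 409: 6, 571: 7}


def inversion_cycles(m, d):
    """Estimated clock cycles for one inversion: each iteration takes
    ceil(m / d) + 1 cycles. Setup and read-out cycles are ignored."""
    return PROPOSED_ITERATIONS[m] * (math.ceil(m / d) + 1)


cycles_d8 = inversion_cycles(233, 8)   # 5 * (30 + 1) = 155
cycles_d1 = inversion_cycles(233, 1)   # 5 * (233 + 1) = 1170
```

The estimate shows the usual digit-level trade-off: increasing $d$ cuts the cycle count roughly in proportion, while the area of both multipliers grows.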
Inversion Architecture Comparison (Number of Iterations)
Architecture  Algorithm        Multiplication type      Number of iterations      m = 163  m = 233  m = 283  m = 409  m = 571
[4]           ITA              1 × single               $N_1$                     9        10       11       11       13
[7, 6]        TIT/MTIT         1 × double               $N_2$                     5        9        8        7        8
[8]           Optimal-3 chain  1 × double               $N_3$                     5        7        6        7        7
Proposed      ITA              2 × single, interleaved  $\lceil N_1/2 \rceil$     5        5        6        6        7
• Our proposed inversion architecture matches or reduces the number of iterations of previous works for all listed field sizes.
• The best performance is achieved when $m = 233$.
[4] T. Itoh and S. Tsujii, "A fast algorithm for computing multiplicative inverses in GF(2^m) using normal bases," Information and Computation, vol. 78, no. 3, pp. 171–177, 1988.
[6] J. Hu, W. Guo, J. Wei, and R. Cheung, "Fast and Generic Inversion Architectures Over GF(2^m) Using Modified Itoh–Tsujii Algorithms," IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 62, pp. 367–371, April 2015.
[7] R. Azarderakhsh, K. Jarvinen, and V. Dimitrov, "Fast Inversion in GF(2^m) with Normal Basis Using Hybrid-Double Multipliers," IEEE Trans. Comput., vol. 63, pp. 1041–1047, April 2014.
[8] K. Jarvinen, V. Dimitrov, and R. Azarderakhsh, "A Generalization of Addition Chains and Fast Inversions in Binary Fields," IEEE Trans. Comput., vol. 64, pp. 2421–2432, Sept. 2015.
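The table entries can be cross-checked with a few lines of Python. The sketch below recomputes the proposed iteration count as $\lceil N_1/2 \rceil$ and measures the saving against the better of the two double-multiplier schemes; reading "best performance" as the largest such saving is an interpretation made for this example, not a definition given on the slide.

```python
import math

# Iteration counts copied from the comparison table.
N1 = {163: 9, 233: 10, 283: 11, 409: 11, 571: 13}   # ITA [4]
N2 = {163: 5, 233: 9, 283: 8, 409: 7, 571: 8}       # TIT/MTIT [7, 6]
N3 = {163: 5, 233: 7, 283: 6, 409: 7, 571: 7}       # Optimal-3 chain [8]

# Proposed scheme: two interleaved single multiplications per iteration.
proposed = {m: math.ceil(n1 / 2) for m, n1 in N1.items()}

# Iterations saved relative to the best previous double-multiplier scheme.
saving = {m: min(N2[m], N3[m]) - proposed[m] for m in N1}
best_m = max(saving, key=saving.get)   # field size with the largest saving
```

Under this reading the saving is never negative and is largest for $m = 233$, in line with the two bullets above.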