Single Base Modular Multiplication for Efficient Hardware RNS Implementations of ECC
Karim Bigou and Arnaud Tisserand
CNRS, IRISA, INRIA Centre Rennes - Bretagne Atlantique and Univ. Rennes 1
CHES 2015, Sept. 13 – 16
Context
Design efficient hardware implementations of asymmetric cryptosystems using fast arithmetic techniques:
  RSA [RSA78]
  Discrete logarithm cryptosystems: Diffie-Hellman (DH) [DH76], ElGamal [Elg85]
  Elliptic curve cryptography (ECC) [Mil85] [Kob87]
The residue number system (RNS) is a representation that enables fast computations for cryptosystems requiring large integers or $\mathbb{F}_P$ elements
Residue Number System (RNS) [SV55] [Gar59]
$X$, a large integer of $\ell$ bits ($\ell \approx 160$–$4096$), is represented by:
  $\overrightarrow{X} = (x_1, \ldots, x_n) = (X \bmod m_1, \ldots, X \bmod m_n)$
RNS base $\mathcal{B} = (m_1, \ldots, m_n)$: $n$ pairwise coprime moduli of $w$ bits, with $n \times w \geq \ell$
[Figure: $n$ parallel $w$-bit channels; channel $i$ takes $x_i$ and $y_i$, applies $\pm, \times$ modulo $m_i$, and outputs $z_i$]
RNS relies on the Chinese remainder theorem (CRT)
EMM = $w$-bit elementary modular multiplication in one channel
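The channel-wise arithmetic described on this slide can be sketched in a few lines of Python. This is a toy illustration, not the hardware design: the moduli, the helper names (`to_rns`, `rns_mul`, `from_rns`) and the operand values are assumptions chosen for readability, and the CRT reconstruction uses Python big integers rather than anything a channel would do.

```python
from math import prod

def to_rns(x, base):
    """Residues of x in each channel of the base."""
    return [x % m for m in base]

def rns_mul(xs, ys, base):
    """Channel-wise product: one EMM per channel, no carry between channels."""
    return [(x * y) % m for x, y, m in zip(xs, ys, base)]

def from_rns(residues, base):
    """CRT reconstruction of the unique integer in [0, M), M = product of the moduli."""
    M = prod(base)
    total = 0
    for r, m in zip(residues, base):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # pow(.., -1, m) is the inverse mod m
    return total % M

# Hypothetical toy base: pairwise coprime moduli, M = 53040
base = [13, 15, 16, 17]
x, y = 123, 211                      # x * y = 25953 < M, so the product is exact
z = from_rns(rns_mul(to_rns(x, base), to_rns(y, base), base), base)
```

The product is only recovered exactly because it stays below M; larger results wrap modulo M, which is why a dedicated modular reduction is needed.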
RNS Properties
Pros:
  Carry-free between channels
    each channel is independent
  Fast parallel $+$, $-$, $\times$ and some exact divisions
    computations over all channels can be performed in parallel
    an RNS multiplication requires $n$ EMMs
  Flexibility for hardware implementations
    the number of hardware channels and logical channels can be different
    various area/time trade-offs and multi-size support
  Non-positional number system
    randomization of internal computations (SCA countermeasures)
Cons:
  Non-positional number system
    comparison, modular reduction and division are much harder
    modular reduction: RNS version of Montgomery reduction (MR)
Montgomery and Pseudo-Mersenne Reductions in RNS
Classical binary positional representation:
  in practice, standards use special primes to perform faster reduction: the pseudo-Mersenne primes
  $P = 2^{\ell} - c$, where $c < 2^{\ell/2}$ has a small Hamming weight: fast reduction using $2^{\ell} \equiv c \bmod P$
In RNS, there is no equivalent to pseudo-Mersenne numbers in the state of the art
Approaches in the RNS literature to speed up modular arithmetic:
  reduce the number of MRs (e.g. [BDE13, BT13]): for instance, computing patterns of the form $AB + CD \bmod P$
  improve MR in a specific context (e.g. [Gui10, GLP+12, BT14]): for example RSA or ECC
  carefully choose some parameters of the representation to reduce the internal computation cost of MRs [BKP09, BM14, YFCV14]
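For readers less familiar with the binary case, the fast reduction based on $2^{\ell} \equiv c \bmod P$ can be sketched as follows. The function name `pm_reduce` and the test operands are illustrative assumptions; the sketch assumes a non-negative input and $c < 2^{\ell/2}$, and it uses Python big integers rather than word-level arithmetic.

```python
def pm_reduce(x, ell, c):
    """Reduce x >= 0 modulo P = 2**ell - c by folding high bits with 2**ell = c (mod P)."""
    P = (1 << ell) - c
    mask = (1 << ell) - 1
    # Write x = hi * 2**ell + lo and replace it with hi * c + lo (same value mod P).
    # Each pass strictly decreases x, so the loop terminates.
    while x >> ell:
        x = (x >> ell) * c + (x & mask)
    # Now x < 2**ell, so a final conditional subtraction is enough.
    while x >= P:
        x -= P
    return x

# Two well-known pseudo-Mersenne primes: 2**255 - 19 and 2**521 - 1
a = 0x1234567890ABCDEF1234567890ABCDEF1234567890ABCDEF
b = 0xFEDCBA0987654321FEDCBA0987654321FEDCBA0987654321
r255 = pm_reduce(a * b, 255, 19)
r521 = pm_reduce(a * b * a * b * a * b, 521, 1)
```

The reduction costs only shifts, a multiplication by the small constant c, and additions, which is the property the slides look for an RNS counterpart to.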
RNS Montgomery Reduction (MR) [PP95]
Input: $\overrightarrow{X}$, $\overrightarrow{X'}$ with $X < \alpha P^2 < PM$ and $2P < M'$
Output: $(\overrightarrow{\omega}, \overrightarrow{\omega'})$ with $\omega \equiv X \times M^{-1} \bmod P$ and $0 \leq \omega < 2P$
  $\overrightarrow{Q} \leftarrow \overrightarrow{X} \times \overrightarrow{(-P^{-1})}$   (in base $\mathcal{B}$)
  $\overrightarrow{Q'} \leftarrow \mathrm{BE}(\overrightarrow{Q}, \mathcal{B}, \mathcal{B}')$   ($n \times n$ EMMs)
  $\overrightarrow{S'} \leftarrow \overrightarrow{X'} + \overrightarrow{Q'} \times \overrightarrow{P'}$   (in base $\mathcal{B}'$)
  $\overrightarrow{\omega'} \leftarrow \overrightarrow{S'} \times \overrightarrow{M^{-1}}$   (in base $\mathcal{B}'$)
  $\overrightarrow{\omega} \leftarrow \mathrm{BE}(\overrightarrow{\omega'}, \mathcal{B}', \mathcal{B})$   ($n \times n$ EMMs)
where $M = \prod_{i=1}^{n} m_i$
BE: base extension (i.e. conversion)
MR cost: $2n^2 + O(n)$ EMMs
Note: MM = 1 RNS mult. + MR
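The five steps of MR can be simulated in Python as below. This is a sketch under explicit assumptions: the base extension is done exactly through a CRT reconstruction (real implementations use an approximate BE costing $n \times n$ EMMs), and the bases, the prime `P = 10007`, and all function names are hypothetical toy choices.

```python
from math import prod

def crt(residues, base):
    """Rebuild the integer in [0, product of base) from its residues."""
    M = prod(base)
    return sum(r * (M // m) * pow(M // m, -1, m) for r, m in zip(residues, base)) % M

def base_extend(residues, src, dst):
    """Exact base extension (assumption): CRT in src, then reduce in each dst channel."""
    v = crt(residues, src)
    return [v % m for m in dst]

def rns_mr(X_B, X_Bp, P, B, Bp):
    """RNS Montgomery reduction: returns residues of omega = X * M^-1 mod P (< 2P)."""
    M = prod(B)
    # Q <- X * (-P^-1) in base B
    Q_B = [(x * -pow(P, -1, m)) % m for x, m in zip(X_B, B)]
    # Q' <- BE(Q, B, B')
    Q_Bp = base_extend(Q_B, B, Bp)
    # S' <- X' + Q' * P' in base B'
    S_Bp = [(x + q * P) % m for x, q, m in zip(X_Bp, Q_Bp, Bp)]
    # omega' <- S' * M^-1 in base B' (S is a multiple of M, so this is an exact division)
    w_Bp = [(s * pow(M, -1, m)) % m for s, m in zip(S_Bp, Bp)]
    # omega <- BE(omega', B', B)
    w_B = base_extend(w_Bp, Bp, B)
    return w_B, w_Bp

B = [13, 15, 16, 17]     # toy base, M = 53040
Bp = [19, 23, 29, 31]    # toy second base, M' = 392863, coprime to M
P = 10007                # toy prime with 2P < M'
X = 1234 * 5678          # product of two values below P, so X < P*M
w_B, w_Bp = rns_mr([X % m for m in B], [X % m for m in Bp], P, B, Bp)
omega = crt(w_B, B)
```

With an exact BE the result satisfies $\omega < 2P$ whenever $X < PM$; the bounds on the slide also account for the error introduced by a practical BE.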
Size of Elements Using MM
[Figure: $X$ and $Y$ are each represented over both bases $\mathcal{B}$ and $\mathcal{B}'$; the RNS product $XY$ costs $2n$ EMMs; the RNS Montgomery reduction (MR) costs $2n^2 + O(n)$ EMMs and yields $Z = |XY|_P$]
A New RNS Modular Multiplication
First Step: Changing the Representation
We split field elements into 2 parts of the same size. How?
  using half-bases: $\mathcal{B} = \mathcal{B}_{a|b}$ with $n \times w = \ell$, where $\mathcal{B}_a$ and $\mathcal{B}_b$ each cover $n/2 \times w$ bits
Using $M_a = \prod_{i=1}^{n_a} m_{a,i}$, we split $\overrightarrow{X}$ into $(\overrightarrow{K_x}, \overrightarrow{R_x})$ such that:
  $\overrightarrow{X} = \overrightarrow{K_x} \times \overrightarrow{M_a} + \overrightarrow{R_x}$
$K_x$ and $R_x$ are $\ell/2$ bits long
$\mathbb{F}_P$ elements are now represented by $(K, R)$: we add a little positional information
We call Split the function to get $(\overrightarrow{K_x}, \overrightarrow{R_x})$ from $\overrightarrow{X}$
Decomposition with Split
Algorithm
Input: $\overrightarrow{X_{a|b}}$
Precomp.: $\overrightarrow{(M_a^{-1})_b}$
Output: $\overrightarrow{(K_x)_{a|b}}$, $\overrightarrow{(R_x)_{a|b}}$ with $\overrightarrow{X_{a|b}} = \overrightarrow{(K_x)_{a|b}} \times \overrightarrow{(M_a)_{a|b}} + \overrightarrow{(R_x)_{a|b}}$
  $\overrightarrow{(R_x)_b} \leftarrow \mathrm{BE}(\overrightarrow{(R_x)_a}, \mathcal{B}_a, \mathcal{B}_b)$   ($\frac{n}{2} \times \frac{n}{2}$ EMMs; since $R_x = X \bmod M_a$, $\overrightarrow{(R_x)_a}$ is simply $\overrightarrow{X_a}$)
  $\overrightarrow{(K_x)_b} \leftarrow (\overrightarrow{X_b} - \overrightarrow{(R_x)_b}) \times \overrightarrow{(M_a^{-1})_b}$
  if $\overrightarrow{(K_x)_b} = \overrightarrow{-1}$ then   /* with Kawamura BE correction [KKSS00] */
    $\overrightarrow{(K_x)_b} \leftarrow \overrightarrow{0}$
    $\overrightarrow{(R_x)_b} \leftarrow \overrightarrow{(R_x)_b} - \overrightarrow{(M_a)_b}$
  $\overrightarrow{(K_x)_a} \leftarrow \mathrm{BE}(\overrightarrow{(K_x)_b}, \mathcal{B}_b, \mathcal{B}_a)$   ($\frac{n}{2} \times \frac{n}{2}$ EMMs)
  return $\overrightarrow{(K_x)_{a|b}}$, $\overrightarrow{(R_x)_{a|b}}$
Note: the cost of Split is dominated by the 2 BEs on half-bases: $\frac{n^2}{2} + O(n)$ EMMs when $n_a = n_b = n/2$
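A minimal Python sketch of Split follows. It assumes an exact base extension done by CRT reconstruction, so the Kawamura correction branch (the test against $-1$) never triggers and is left out; the half-bases, the value of `X`, and the helper names are hypothetical.

```python
from math import prod

def crt(residues, base):
    """Rebuild the integer in [0, product of base) from its residues."""
    M = prod(base)
    return sum(r * (M // m) * pow(M // m, -1, m) for r, m in zip(residues, base)) % M

def base_extend(residues, src, dst):
    """Exact base extension (assumption): CRT in src, then reduce in each dst channel."""
    v = crt(residues, src)
    return [v % m for m in dst]

def split(X_a, X_b, Ba, Bb):
    """Return ((K_a, K_b), (R_a, R_b)) with X = K * M_a + R and 0 <= R < M_a."""
    Ma = prod(Ba)
    # R = X mod M_a, so its residues in B_a are exactly those of X.
    R_a = list(X_a)
    # (R_x)_b <- BE((R_x)_a, B_a, B_b)
    R_b = base_extend(R_a, Ba, Bb)
    # (K_x)_b <- (X_b - (R_x)_b) * M_a^-1, an exact division done in B_b
    K_b = [((x - r) * pow(Ma, -1, m)) % m for x, r, m in zip(X_b, R_b, Bb)]
    # (K_x)_a <- BE((K_x)_b, B_b, B_a); exact because K < M_b
    K_a = base_extend(K_b, Bb, Ba)
    return (K_a, K_b), (R_a, R_b)

Ba = [13, 15]   # toy half-base, M_a = 195
Bb = [16, 17]   # toy half-base, M_b = 272
X = 12345       # X < M_a * M_b, so K = X // M_a < M_b
(K_a, K_b), (R_a, R_b) = split([X % m for m in Ba], [X % m for m in Bb], Ba, Bb)
```

The division by $M_a$ is carried out in $\mathcal{B}_b$ because $M_a$ is zero in every channel of $\mathcal{B}_a$ and therefore has no inverse there.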
A New Choice for P
Second step: we propose the form $P = M_a^2 - c$ with $P$ prime and $c$ small
Some remarks:
  $P = M_a^2 - 1$ is never prime
  in practice, we choose $P = M_a^2 - 2$ with $M_a$ odd, i.e. $M_a^2 \equiv 2 \bmod P$
  one can find a lot of $P$ for a given size (probabilistic primality tests using isprime from Maple, for instance generating 10 000 $P$ of 512 bits in 15 s)
  $P$ is an equivalent for RNS to pseudo-Mersenne numbers for the radix-2 standard representation (for instance $P = 2^{521} - 1$)
Our Single Base Modular Multiplication (SBMM) combines:
  $P = M_a^2 - 2$
  the $(K_x, R_x)$ representation
  the Split function
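The search for primes of the form $P = M_a^2 - 2$ can be illustrated at toy scale. The sketch below is not the Maple procedure mentioned on the slide: it uses trial division instead of a probabilistic test, tiny candidate moduli instead of $w$-bit ones, and hypothetical names (`is_prime`, `find_half_bases`).

```python
from itertools import combinations
from math import gcd, prod

def is_prime(n):
    """Trial division; adequate for toy sizes only."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

def find_half_bases(candidates, k):
    """Yield (B_a, P) with k pairwise coprime moduli, M_a odd, and P = M_a**2 - 2 prime."""
    for Ba in combinations(candidates, k):
        if any(gcd(a, b) != 1 for a, b in combinations(Ba, 2)):
            continue  # moduli must be pairwise coprime
        Ma = prod(Ba)
        if Ma % 2 == 0:
            continue  # M_a even would make P even
        P = Ma * Ma - 2
        if is_prime(P):
            yield Ba, P

# Odd candidate moduli from 3 to 31, half-bases of 2 moduli
results = list(find_half_bases(range(3, 32, 2), 2))
first_base, first_P = results[0]
```

Even at this scale several half-bases qualify, which is consistent with the remark that suitable $P$ are easy to find.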
SBMM Algorithm
Parameters: $\mathcal{B}_a$ such that $M_a^2 = P + 2$ and $\mathcal{B}_b$ such that $M_b > 6 M_a$
Input: $\overrightarrow{(K_x)_{a|b}}$, $\overrightarrow{(R_x)_{a|b}}$, $\overrightarrow{(K_y)_{a|b}}$, $\overrightarrow{(R_y)_{a|b}}$ with $K_x, R_x, K_y, R_y < M_a$
Output: $\overrightarrow{(K_z)_{a|b}}$, $\overrightarrow{(R_z)_{a|b}}$ with $K_z < 5 M_a$ and $R_z < 6 M_a$
  $\overrightarrow{U_{a|b}} \leftarrow \overrightarrow{2 K_x K_y + R_x R_y}$
  $\overrightarrow{V_{a|b}} \leftarrow \overrightarrow{K_x R_y + R_x K_y}$
  $(\overrightarrow{(K_u)_{a|b}}, \overrightarrow{(R_u)_{a|b}}) \leftarrow \mathrm{Split}(\overrightarrow{U_{a|b}})$   (the two Splits run in parallel)
  $(\overrightarrow{(K_v)_{a|b}}, \overrightarrow{(R_v)_{a|b}}) \leftarrow \mathrm{Split}(\overrightarrow{V_{a|b}})$
  $(\overrightarrow{(K_z)_{a|b}}, \overrightarrow{(R_z)_{a|b}}) \leftarrow (\overrightarrow{(K_u + R_v)_{a|b}}, \overrightarrow{(2 \cdot K_v + R_u)_{a|b}})$
  return $\overrightarrow{(K_z)_{a|b}}$, $\overrightarrow{(R_z)_{a|b}}$
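The algebra of SBMM can be checked on plain integers, leaving the RNS channels aside. In this sketch Split is modelled by an exact `divmod` by $M_a$ (an assumption; the real Split works on residues with base extensions), and the parameters $M_a = 15$, $P = 223$ are toy values.

```python
def split(x, Ma):
    """Integer model of Split: x = K * Ma + R with 0 <= R < Ma."""
    return divmod(x, Ma)

def sbmm(Kx, Rx, Ky, Ry, Ma):
    """Return (Kz, Rz) with Kz * Ma + Rz = X * Y (mod P), where P = Ma**2 - 2."""
    U = 2 * Kx * Ky + Rx * Ry
    V = Kx * Ry + Rx * Ky
    Ku, Ru = split(U, Ma)   # the two Splits are independent
    Kv, Rv = split(V, Ma)
    return Ku + Rv, 2 * Kv + Ru

Ma = 15
P = Ma * Ma - 2             # 223, which is prime
X, Y = 100, 200
Kx, Rx = split(X, Ma)
Ky, Ry = split(Y, Ma)
Kz, Rz = sbmm(Kx, Rx, Ky, Ry, Ma)
Z = Kz * Ma + Rz
```

Note that the outputs are not fully reduced: they only satisfy the looser bounds $K_z < 5 M_a$ and $R_z < 6 M_a$ stated in the algorithm, which is why $M_b > 6 M_a$ is required.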
SBMM Principle 1/2
[Figure: $X = (K_x, R_x)$ and $Y = (K_y, R_y)$ over $\mathcal{B}_a$ and $\mathcal{B}_b$; a first channel-wise multiplication ($2n$ EMMs) gives $K_x K_y$ and $R_x R_y$; a second one with $Y$ swapped to $(R_y, K_y)$ ($2n$ EMMs) gives $K_x R_y$ and $R_x K_y$]
$XY \equiv 2 K_x K_y + (K_x R_y + K_y R_x) M_a + R_x R_y \equiv U + V M_a \bmod P$
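The congruence on this slide follows from expanding the product of the two split representations and applying $M_a^2 \equiv 2 \bmod P$. Written out step by step, with $U$ and $V$ as defined in the SBMM algorithm:

```latex
\begin{align*}
XY &= (K_x M_a + R_x)(K_y M_a + R_y) \\
   &= K_x K_y M_a^2 + (K_x R_y + R_x K_y) M_a + R_x R_y \\
   &\equiv 2 K_x K_y + R_x R_y + (K_x R_y + R_x K_y) M_a \pmod{P} \\
   &= U + V M_a
\end{align*}
```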
SBMM Principle 2/2
$XY \equiv U + V M_a \equiv (K_u + R_v) M_a + (R_u + 2 K_v) \equiv K_z M_a + R_z \bmod P$
[Figure: $U = 2 K_x K_y + R_x R_y$ and $V = K_x R_y + R_x K_y$ each go through Split; the outputs are combined as $K_u + R_v = K_z$ and $R_u + 2 K_v = R_z$]
Cost of the two Splits: $2 \times \left( \frac{n^2}{2} + O(n) \right) = n^2 + O(n)$ EMMs
SBMM Architecture with $n/2$ Rowers
[Figure: architecture with a CTRL unit, a cox, and $n/2$ rowers (rower 1, rower 2, ..., rower $n/2$); inputs $x_i$, $y_i$ for channels 1 to $n$ arrive on $w$-bit buses; $w$-bit output]