New speed records for point multiplication

D. J. Bernstein

Thanks to:
University of Illinois at Chicago
NSF CCR–9983950
Alfred P. Sloan Foundation

640838 Pentium M cycles to compute a 32-byte secret shared by Dan and Tanja, given Dan’s 32-byte secret key and Tanja’s 32-byte public key.

All known attacks: $> 2^{128}$ cycles.

This is the new speed record for high-security Diffie-Hellman.

Encrypt and authenticate messages using hash of shared secret as key.
Diffie-Hellman is the bottleneck if total message length is short.
640838 Pentium M (695) cycles to compute $x$-coordinate of $n$th multiple of $(x, y)$ on Curve25519, given $x \in \{0, 1, \ldots, 2^{256} - 1\}$ and $n \in 2^{254} + 8\{0, 1, \ldots, 2^{251} - 1\}$.

Curve25519 is the elliptic curve $y^2 = x^3 + 486662x^2 + x$ mod the prime $2^{255} - 19$.

624786 Athlon (622) cycles;
832457 Pentium III (686) cycles;
957904 Pentium 4 (f12) cycles.

I anticipate similar cycle counts for UltraSPARC, PowerPC, etc.
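To make the function above concrete, here is a minimal Python sketch of an $x$-only Montgomery ladder on this curve. It is not the talk's floating-point software. It assumes the standard ladder formulas with the constant $121665 = (486662 - 2)/4$, and it assumes that an input $x \ge 2^{255} - 19$ is simply reduced mod the prime. The names `x_multiple` and `cswap` are hypothetical. Python integers are not constant-time, so this only illustrates the arithmetic.

```python
P = 2**255 - 19      # the prime from the slide
A = 486662           # curve coefficient: y^2 = x^3 + A*x^2 + x
A24 = (A - 2) // 4   # 121665

def cswap(bit, a, b):
    """Swap a and b when bit == 1, using arithmetic rather than a branch."""
    mask = bit * (a ^ b)
    return a ^ mask, b ^ mask

def x_multiple(n, x):
    """Return the x-coordinate of the n-th multiple of a point with x-coordinate x."""
    x1 = x % P                       # assumption: reduce oversized inputs mod P
    x2, z2, x3, z3 = 1, 0, x1, 1     # (x2:z2) = infinity, (x3:z3) = input point
    swap = 0
    for t in reversed(range(255)):   # bits 254 down to 0: 255 iterations
        bit = (n >> t) & 1
        swap ^= bit
        x2, x3 = cswap(swap, x2, x3)
        z2, z3 = cswap(swap, z2, z3)
        swap = bit
        a = (x2 + z2) % P
        b = (x2 - z2) % P
        aa = a * a % P
        bb = b * b % P
        e = (aa - bb) % P
        c = (x3 + z3) % P
        d = (x3 - z3) % P
        da = d * a % P
        cb = c * b % P
        x3 = (da + cb) ** 2 % P
        z3 = x1 * (da - cb) ** 2 % P
        x2 = aa * bb % P
        z2 = e * (aa + A24 * e) % P
    x2, x3 = cswap(swap, x2, x3)
    z2, z3 = cswap(swap, z2, z3)
    return x2 * pow(z2, P - 2, P) % P   # final reciprocal via Fermat

# Two parties with scalars of the form 2^254 + 8k agree on a shared value.
dan = 2**254 + 8 * 12345
tanja = 2**254 + 8 * 67890
shared_by_dan = x_multiple(dan, x_multiple(tanja, 9))
shared_by_tanja = x_multiple(tanja, x_multiple(dan, 9))
```

The base value 9 and the two scalars are arbitrary illustrative choices. Both parties arrive at the same value because the $x$-coordinate of the $ab$-th multiple does not depend on the order in which the two multiplications are applied.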
Immune to timing attacks, including cache-timing attacks, including hyperthreading attacks.
No data-dependent branches; no data-dependent indexing.

Software is in public domain.
16 kilobytes when compiled.
cr.yp.to/ecdh.html

No known patent problems.

For comparison, Brown et al.:
much smaller prime, $2^{192} - 2^{64} - 1$;
780000 PII cycles;
$x, y$ given;
no timing-attack protection.
Where are the cycles going?

Focus today on Pentium M.

Fastest arithmetic on Pentium M uses floating-point operations: fp adds, fp subs, fp mults.
Each Pentium M cycle does $\le 1$ fp op.

Point multiplication: 640838 cycles.
589825 fp ops; 0.92 per cycle.

Understand cycle counts fairly well by simply counting fp ops.
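The per-cycle figure follows directly from the two totals on the slide. The short Python check below reproduces it and also derives the number of cycles that issue no fp op, which is my own arithmetic rather than a figure from the talk.

```python
CYCLES = 640838   # Pentium M cycles for one point multiplication
FP_OPS = 589825   # floating-point operations in that computation

ops_per_cycle = FP_OPS / CYCLES          # about 0.92
idle_cycles = CYCLES - FP_OPS            # cycles with no fp op, given at most 1 op per cycle
idle_share = idle_cycles / CYCLES        # about 0.08

print(f"{ops_per_cycle:.2f} fp ops per cycle")
print(f"{idle_cycles} cycles ({idle_share:.1%}) issue no fp op")
```

Since a cycle can issue at most one fp op, the op count is a lower bound on the cycle count, which is why counting fp ops explains most of the running time.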
Avoiding all time variability to stop timing attacks:

1. For $b \in \{0, 1\}$, compute $x[b]$ as $b\,x[1] + (1 - b)\,x[0]$ or similar.
Avoids data-dependent indexing.
Costs 36210 fp ops (6%).

2. Compute final reciprocal by Fermat, not extended Euclid.
Avoids data-dependent branching.

3. Don’t branch for remainders.
Allow non-least remainders.
No cost: this saves time!
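The first two techniques can be sketched in a few lines of Python. This is an illustration under assumptions, not the talk's code. `select` and `reciprocal` are hypothetical names. The addition chain shown is one well-known way to raise to the power $2^{255} - 21$, and the slides do not say which chain the software uses. Python's big integers are not constant-time, so only the control flow is meant to be instructive.

```python
P = 2**255 - 19

def select(b, x0, x1):
    """Return x1 if b == 1, else x0, without branching or indexing on b."""
    return b * x1 + (1 - b) * x0

def reciprocal(z):
    """Compute z^(P-2) mod P with a fixed sequence of squarings and multiplications.

    Returns (inverse, number_of_squarings, number_of_multiplications).
    The sequence of operations does not depend on the value of z.
    """
    counts = {"sq": 0, "mul": 0}

    def sq(a, times=1):
        for _ in range(times):
            a = a * a % P
        counts["sq"] += times
        return a

    def mul(a, b):
        counts["mul"] += 1
        return a * b % P

    z2 = sq(z)                        # z^2
    z9 = mul(sq(z2, 2), z)            # z^9
    z11 = mul(z9, z2)                 # z^11
    e5 = mul(sq(z11), z9)             # z^(2^5 - 1)
    e10 = mul(sq(e5, 5), e5)          # z^(2^10 - 1)
    e20 = mul(sq(e10, 10), e10)       # z^(2^20 - 1)
    e40 = mul(sq(e20, 20), e20)       # z^(2^40 - 1)
    e50 = mul(sq(e40, 10), e10)       # z^(2^50 - 1)
    e100 = mul(sq(e50, 50), e50)      # z^(2^100 - 1)
    e200 = mul(sq(e100, 100), e100)   # z^(2^200 - 1)
    e250 = mul(sq(e200, 50), e50)     # z^(2^250 - 1)
    inv = mul(sq(e250, 5), z11)       # z^(2^255 - 21) = z^(P - 2)
    return inv, counts["sq"], counts["mul"]

inverse, squarings, mults = reciprocal(12345)
```

By Fermat's little theorem, $z^{p-2} \equiv z^{-1} \pmod p$ for nonzero $z$, so the same straight-line code runs for every input. Extended Euclid, by contrast, takes a number of steps that depends on $z$.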
Main loop: 545700 fp ops (92.5%).
2140 times 255 iterations.

Reciprocal: 43821 fp ops (7.4%).
$41148 = 254 \cdot 162$ for 254 squarings;
$2673 = 11 \cdot 243$ for 11 more mults.

Additional work: 304 fp ops.

Inside one main-loop iteration:
$80 = 8 \cdot 10$ for 8 adds/subs;
55 for mult by 121665;
$648 = 4 \cdot 162$ for 4 squarings;
$1215 = 5 \cdot 243$ for 5 more mults;
142 for $b\,x[1] + (1 - b)\,x[0]$ etc.
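The breakdown can be checked by adding it up. The Python tally below uses only the per-operation costs stated on the slides (10 fp ops per add/sub, 162 per squaring, 243 per general multiplication). The variable names are mine.

```python
ADD, SQUARE, MULT = 10, 162, 243   # fp ops per add/sub, squaring, general mult

per_iteration = {
    "8 adds/subs": 8 * ADD,          # 80
    "mult by 121665": 55,
    "4 squarings": 4 * SQUARE,       # 648
    "5 more mults": 5 * MULT,        # 1215
    "selection b*x[1] + (1-b)*x[0] etc.": 142,
}
iteration_total = sum(per_iteration.values())    # 2140
main_loop = 255 * iteration_total                # 545700
reciprocal_ops = 254 * SQUARE + 11 * MULT        # 43821
additional = 304
total = main_loop + reciprocal_ops + additional  # 589825

selection_total = 255 * per_iteration["selection b*x[1] + (1-b)*x[0] etc."]  # 36210

print(f"main loop:  {main_loop} ({main_loop / total:.1%})")
print(f"reciprocal: {reciprocal_ops} ({reciprocal_ops / total:.1%})")
print(f"selection:  {selection_total} ({selection_total / total:.0%})")
print(f"total:      {total}")
```

The tally reproduces the 589825 fp ops quoted for the whole point multiplication, and the 36210 fp ops (6%) attributed to avoiding data-dependent indexing is exactly the 142-op selection step repeated over 255 iterations.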