
Modern ECC signatures

The Ed25519 signature scheme is EdDSA using conservative choices (Curve25519). Many papers have explored Curve25519/Ed25519 speed: e.g., 2011 Bernstein–Duif–Lange–Schwabe–Yang software; e.g., 2015 Chou software; benchmarks on Intel Sandy Bridge (2011).


  1. Better comparisons (still raising many questions):
     ECDH on Intel Pentium II/III (still not exactly the same):
     1920000 cycles for NIST P-256,
     832457 cycles for Curve25519.
     ECDH on Sandy Bridge:
     374000 cycles for NIST P-256 (from 2013 Gueron–Krasnov),
     159128 cycles for Curve25519.
     Verification on Sandy Bridge:
     529000 cycles for ECDSA-P-256,
     205741 cycles for Ed25519.

  2. For each of these operations, on each of these curves, on each of these CPUs:
     Simplest implementations are much, much, much slower.
     Questions in algorithm design and software engineering:
     How to build the fastest software on, e.g., an ARM Cortex-A8
     for Ed25519 signature verification?
     Answers feed back into crypto design: e.g., choosing fast curves.

  3. Several levels to optimize:
     ECC ops: e.g., verify SB = R + hA. Windowing etc.
     Point ops: e.g., P, Q ↦ P + Q. Faster doubling etc.
     Field ops: e.g., x1, x2 ↦ x1·x2 in F_p. Delayed carries etc.
     Machine insns: e.g., 32-bit multiplication. Pipelining etc.
     Gates: e.g., AND, OR, XOR.

  4. Single-scalar multiplication.
     Fundamental ECC operation: n, P ↦ nP.
     Input n is an integer in, e.g., {0, 1, ..., 2^256 − 1}.
     Input P is a point on an elliptic curve.
     Will build n, P ↦ nP using additions P, Q ↦ P + Q
     and subtractions P, Q ↦ P − Q.
     Later will also look at double-scalar multiplication
     m, P, n, Q ↦ mP + nQ.

  5. Left-to-right binary method.

       def scalarmult(n,P):
         if n == 0: return 0
         if n == 1: return P
         R = scalarmult(n//2,P)
         R = R + R
         if n % 2: R = R + P
         return R

     Two Python notes:
     • n//2 in Python means ⌊n/2⌋.
     • Recursion depth is limited; see sys.setrecursionlimit.
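
     Not part of the slides: a minimal sketch that runs this recursion with
     plain integers standing in for curve points, so that P + P is ordinary
     integer addition and nP should equal n*P; any group with an addition
     would do. The 256-bit scalar is just an arbitrary test value.

       import random, sys
       sys.setrecursionlimit(10000)   # only needed once scalars approach ~1000 bits

       def scalarmult(n, P):          # the recursion from the slide above
           if n == 0: return 0
           if n == 1: return P
           R = scalarmult(n // 2, P)
           R = R + R
           if n % 2: R = R + P
           return R

       n = random.getrandbits(256)
       assert scalarmult(n, 7) == 7 * n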

  6. This recursion computes nP as
     • 2((n/2)P) if n ∈ 2Z, e.g. 20P = 2 · 10P;
     • 2(((n − 1)/2)P) + P if n ∈ 1 + 2Z, e.g. 21P = 2 · 10P + P.
     Base cases in the recursion:
     0P = 0. For Edwards: 0 = (0, 1).
     1P = P. Could omit this case.
     Assuming n ≥ 0 for simplicity.
     Otherwise use nP = −(−n)P.

  7. If 0 ≤ n < 2^b then this algorithm uses ≤ 2b − 2 additions:
     specifically ≤ b − 1 doublings and ≤ b − 1 additions of P.
     Example of worst case:
     31P = 2(2(2(2P + P) + P) + P) + P.
     31 = (11111)_2; b = 5; 4 doublings; 4 more additions.
     Average case is better: e.g.
     35P = 2(2(2(2(2P))) + P) + P.
     35 = (100011)_2; b = 6; 5 doublings; 2 additions.
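
     Not part of the slides: a small counting sketch that tallies the
     doublings and additions the binary method performs for a given scalar,
     reproducing the 31P and 35P counts above.

       def binary_cost(n):
           doublings = additions = 0
           while n > 1:
               if n % 2:            # low bit set: one extra addition of P
                   additions += 1
                   n -= 1
               else:                # even: one doubling
                   doublings += 1
                   n //= 2
           return doublings, additions

       print(binary_cost(31))   # (4, 4): 4 doublings, 4 additions
       print(binary_cost(35))   # (5, 2): 5 doublings, 2 additions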

  8. Non-adjacent form (NAF).

       def scalarmult(n,P):
         if n == 0: return 0
         if n == 1: return P
         if n % 4 == 1:
           R = scalarmult((n-1)//4,P)
           R = R + R
           return (R + R) + P
         if n % 4 == 3:
           R = scalarmult((n+1)//4,P)
           R = R + R
           return (R + R) - P
         R = scalarmult(n//2,P)
         return R + R

  9. Subtraction on the curve is as cheap as addition.
     NAF takes advantage of this.
     31P = 2(2(2(2(2P)))) − P.
     31 = (10000 1̄)_2; 1̄ denotes −1.
     35P = 2(2(2(2(2P)) + P)) − P.
     35 = (10010 1̄)_2.
     "Non-adjacent": ±P ops are separated by ≥ 2 doublings.
     Worst case: ≈ b doublings plus ≈ b/2 additions of ±P.
     On average ≈ b/3 additions.
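
     Not from the slides: a small sketch that computes the NAF digits
     directly (least significant digit first), reproducing the (10000 1̄)
     and (10010 1̄) expansions above and checking that the digits
     reconstruct n.

       def naf(n):
           digits = []
           while n > 0:
               if n % 2 == 0:
                   digits.append(0); n //= 2
               else:
                   d = 2 - (n % 4)     # n % 4 == 1 gives d = 1, n % 4 == 3 gives d = -1
                   digits.append(d); n = (n - d) // 2
           return digits

       print(naf(31))   # [-1, 0, 0, 0, 0, 1]: 31 = 2^5 - 1
       print(naf(35))   # [-1, 0, 1, 0, 0, 1]: 35 = 2^5 + 2^2 - 1
       assert all(sum(d << i for i, d in enumerate(naf(n))) == n for n in range(1000))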

  10. Width-2 signed sliding windows.

        def window2(n,P,P3):
          if n == 0: return 0
          if n == 1: return P
          if n == 3: return P3
          if n % 8 == 1:
            R = window2((n-1)//8,P,P3)
            R = R + R
            R = R + R
            return (R + R) + P
          if n % 8 == 3:
            R = window2((n-3)//8,P,P3)
            R = R + R
            R = R + R
            return (R + R) + P3
          if n % 8 == 5:
            R = window2((n+3)//8,P,P3)
            R = R + R
            R = R + R
            return (R + R) - P3
          if n % 8 == 7:
            R = window2((n+1)//8,P,P3)
            R = R + R
            R = R + R
            return (R + R) - P
          R = window2(n//2,P,P3)
          return R + R

        def scalarmult(n,P):
          return window2(n,P,P+P+P)
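
      Not from the slides: a small sketch of the signed digits this recursion
      implicitly uses. Each odd step consumes a window digit d in {1, 3, −3, −1}
      with n ≡ d (mod 8), followed by two forced zero positions (the three
      doublings); each even step contributes a zero. Nonzero digits are thus
      separated by at least two zeros, which is where the "≈ b/4 additions on
      average" on the next slide comes from.

        def window2_digits(n):
            digits = []                      # least significant digit first
            while n > 0:
                if n % 2 == 0:
                    digits.append(0); n //= 2
                else:
                    d = n % 8
                    if d > 3: d -= 8         # 5 -> -3, 7 -> -1
                    digits.append(d); n = (n - d) // 8
                    digits.extend([0, 0])    # the two forced zero positions
            return digits

        print(window2_digits(31))            # [-1, 0, 0, 0, 0, 1, 0, 0]: 31 = 2^5 - 1
        print(window2_digits(35))            # [3, 0, 0, 0, 0, 1, 0, 0]:  35 = 2^5 + 3
        assert all(sum(d << i for i, d in enumerate(window2_digits(n))) == n
                   for n in range(1000))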

  11. Worst case: ≈ b doublings plus ≈ b/3 additions of ±P or ±3P.
      On average ≈ b/4 additions.
      Width-3 signed sliding windows:
      Precompute P, 3P, 5P, 7P.
      On average ≈ b/5 additions.
      Width 4: Precompute P, 3P, 5P, 7P, 9P, 11P, 13P, 15P.
      On average ≈ b/6 additions.
      Cost of precomputation eventually outweighs savings.
      Optimal: ≈ b doublings plus roughly b/lg b additions.
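
      Not from the slides: plugging b = 256 into these per-method estimates,
      just to put numbers on them. The width-3 figure is close to the
      "≈ 50 additions" used in the double-scalar example below.

        b = 256
        for method, denom in [("binary", 2), ("NAF", 3), ("width 2", 4),
                              ("width 3", 5), ("width 4", 6)]:
            print(method, "average additions ~", b // denom)
        print("optimal ~ b/lg b =", b // 8)   # lg 256 = 8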

  12. Double-scalar multiplication.
      Want to quickly compute m, P, n, Q ↦ mP + nQ.
      e.g. verify a signature (R, S) by computing h = H(R, M),
      computing SB − hA, and checking whether R = SB − hA.
      Obvious approach:
      Compute mP; compute nQ; add.
      e.g. b = 256:
      ≈ 256 doublings for mP,
      ≈ 256 doublings for nQ,
      ≈ 50 additions for mP,
      ≈ 50 additions for nQ.

  13. Joint doublings.
      Do much better by merging 2X + 2Y into 2(X + Y).

        def scalarmult2(m,P,n,Q):
          if m == 0:
            return scalarmult(n,Q)
          if n == 0:
            return scalarmult(m,P)
          R = scalarmult2(m//2,P,n//2,Q)
          R = R + R
          if m % 2: R = R + P
          if n % 2: R = R + Q
          return R
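
      Not from the slides: a counting sketch that runs the joint-doubling
      recursion on a toy "point" class (an integer that tallies its own
      group operations), to see the merged doubling count: roughly b
      doublings in total for two random b-bit scalars, rather than roughly 2b.

        import random

        class ToyPoint:                       # integer standing in for a point
            doublings = additions = 0
            def __init__(self, v): self.v = v
            def __add__(self, other):
                if other is self: ToyPoint.doublings += 1
                else:             ToyPoint.additions += 1
                return ToyPoint(self.v + other.v)

        def scalarmult(n, Q):                 # binary method, as earlier
            if n == 0: return ToyPoint(0)
            if n == 1: return Q
            R = scalarmult(n // 2, Q)
            R = R + R
            if n % 2: R = R + Q
            return R

        def scalarmult2(m, P, n, Q):          # joint doublings, as above
            if m == 0: return scalarmult(n, Q)
            if n == 0: return scalarmult(m, P)
            R = scalarmult2(m // 2, P, n // 2, Q)
            R = R + R
            if m % 2: R = R + P
            if n % 2: R = R + Q
            return R

        m, n = random.getrandbits(256), random.getrandbits(256)
        R = scalarmult2(m, ToyPoint(5), n, ToyPoint(11))
        assert R.v == 5 * m + 11 * n
        print(ToyPoint.doublings, ToyPoint.additions)   # ~256 doublings, ~256 additions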

  14. For example: merge
      35P = 2(2(2(2(2P))) + P) + P,
      31Q = 2(2(2(2Q + Q) + Q) + Q) + Q
      into 35P + 31Q =
      2(2(2(2(2P + Q) + Q) + Q) + P + Q) + P + Q.
      ≈ b doublings (merged!),
      ≈ b/2 additions of P,
      ≈ b/2 additions of Q.
      Combine idea with windows: e.g.,
      ≈ 256 doublings for b = 256,
      ≈ 50 additions using P,
      ≈ 50 additions using Q.

  15. Batch verification.
      Verifying many signatures: need to be confident that
      S1·B = R1 + h1·A1,
      S2·B = R2 + h2·A2,
      S3·B = R3 + h3·A3,
      etc.
      Obvious approach:
      Check each equation separately.
      Much faster approach:
      Check a random linear combination of the equations.

  16. Pick independent uniform random 128-bit z1, z2, z3, ....
      Check whether
      (z1·S1 + z2·S2 + z3·S3 + ···)·B =
      z1·R1 + (z1·h1)·A1 +
      z2·R2 + (z2·h2)·A2 +
      z3·R3 + (z3·h3)·A3 + ···.
      (If ≠: see 2012 Bernstein–Doumen–Lange–Oosterwijk.)
      Easy to prove: forgeries have probability ≤ 2^−128
      of fooling this check.
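
      Not from the slides: a toy sketch of this batched check in which the
      "curve" is just the integers modulo the Ed25519 group order, written
      additively with generator B = 1. Discrete logs are trivial there, so
      this only illustrates the batching algebra, not any security property.

        import secrets

        l = (1 << 252) + 27742317777372353535851937790883648493   # Ed25519 group order
        B = 1                                                      # toy generator

        def toy_signature():
            a = secrets.randbelow(l)          # secret key
            r = secrets.randbelow(l)          # nonce
            h = secrets.randbelow(l)          # stands in for the hash h = H(R, M)
            A = a * B % l                     # public key
            R = r * B % l
            S = (r + h * a) % l               # so that S*B = R + h*A in the toy group
            return R, S, h, A

        batch = [toy_signature() for _ in range(100)]
        z = [secrets.randbelow(1 << 128) for _ in batch]           # independent 128-bit z_i

        def batch_ok(batch, z):
            lhs = sum(zi * S for zi, (R, S, h, A) in zip(z, batch)) % l * B % l
            rhs = sum(zi * R + zi * h * A for zi, (R, S, h, A) in zip(z, batch)) % l
            return lhs == rhs

        assert batch_ok(batch, z)              # all-valid batch passes
        R0, S0, h0, A0 = batch[0]
        batch[0] = (R0, (S0 + 1) % l, h0, A0)  # corrupt one signature
        assert not batch_ok(batch, z)          # fails (except with probability ~2^-128)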

  17. Multi-scalar multiplication.
      Review of asymptotic speeds:
      1939 Brauer (windows):
      ≈ (1 + 1/lg b)·b additions to compute
      P ↦ nP if n < 2^b.
      1964 Straus (joint doublings):
      ≈ (1 + k/lg b)·b additions to compute
      P1, ..., Pk ↦ n1·P1 + ··· + nk·Pk if n1, ..., nk < 2^b.

  18. 1976 Yao:
      ≈ (1 + k/lg b)·b additions to compute
      P ↦ n1·P, ..., nk·P if n1, ..., nk < 2^b.
      1976 Pippenger: similar asymptotics,
      but replace lg b with lg(kb).
      Faster than Straus and Yao if k is large.
      (Knuth says "generalization", as if speed were the same.)

  19. More generally, Pippenger's algorithm computes
      ℓ sums of multiples of k inputs.
      ≈ (min{k, ℓ} + kℓ/lg(kℓb))·b adds
      if all coefficients are below 2^b.
      Within 1 + ε of optimal.
      Various special cases of Pippenger's algorithm
      were reinvented and patented by
      1993 Brickell–Gordon–McCurley–Wilson, 1995 Lim–Lee, etc.
      Is that the end of the story?

  20. No! 1989 Bos–Coster:
      If n1 ≥ n2 ≥ ··· then
      n1·P1 + n2·P2 + n3·P3 + ··· =
      (n1 − q·n2)·P1 + n2·(q·P1 + P2) + n3·P3 + ···
      where q = ⌊n1/n2⌋.
      Remarkably simple;
      competitive with Pippenger for random choices of the ni's;
      much better memory usage.
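
      Not from the slides: a compact heap-based rendering of this rule, with
      integers standing in for points so the result is easy to check. The
      helper scalarmult is the binary method from earlier; for random scalars
      q is almost always 1, so each step costs about one curve addition.

        import heapq, random

        def scalarmult(n, P):                  # binary method, as earlier
            if n == 0: return 0
            if n == 1: return P
            R = scalarmult(n // 2, P)
            R = R + R
            if n % 2: R = R + P
            return R

        def bos_coster(pairs):
            # pairs = [(n1, P1), (n2, P2), ...] with ni >= 1; returns sum of ni*Pi.
            # Scalars are negated because heapq is a min-heap; with real point
            # objects you would add an index tiebreaker to the heap entries.
            heap = [(-n, P) for n, P in pairs]
            heapq.heapify(heap)
            while len(heap) > 1:
                n1, P1 = heapq.heappop(heap); n1 = -n1     # largest scalar
                n2, P2 = heap[0];             n2 = -n2     # second largest
                q = n1 // n2
                heapq.heapreplace(heap, (-n2, scalarmult(q, P1) + P2))  # P2 <- q*P1 + P2
                if n1 - q * n2 > 0:
                    heapq.heappush(heap, (-(n1 - q * n2), P1))          # n1 <- n1 - q*n2
            n, P = heap[0]
            return scalarmult(-n, P)

        ns = [random.randrange(1, 2**64) for _ in range(10)]
        assert bos_coster([(n, 3) for n in ns]) == 3 * sum(ns)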

  21. Example of Bos–Coster:
      000100000 = 32
      000010000 = 16
      100101100 = 300
      010010010 = 146
      001001101 = 77
      000000010 = 2
      000000001 = 1
      Goal: Compute 32P, 16P, 300P, 146P, 77P, 2P, 1P.

  22. Reduce the largest scalar:
      000100000 = 32
      000010000 = 16
      010011010 = 154
      010010010 = 146
      001001101 = 77
      000000010 = 2
      000000001 = 1
      New goal: Compute 32P, 16P, 154P, 146P, 77P, 2P, 1P.
      Plus one addition of 146P to 154P, obtaining 300P.
