
Sorting integer arrays: security, speed, and verification

D. J. Bernstein

Bob’s laptop screen:

From: Alice

Thank you for your submission. We received many interesting papers, and unfortunately your ...

Bob assumes this message is ...


  1. Examples of attack strategies:

     1. Attacker uses buffer overflow in a device driver to control Linux kernel on Alice’s laptop.
     2. Attacker uses buffer overflow in a web browser to control disk files on Bob’s laptop.

     Device driver is in the TCB (trusted computing base). Web browser is in the TCB. CPU is in the TCB. Etc.

     Massive TCB has many bugs, including many security holes. Any hope of fixing this?

  2. Classic security strategy:

     Rearchitect computer systems to have a much smaller TCB. Carefully audit the TCB.

     e.g. Bob runs many VMs: VM A holds Alice data, VM C holds Charlie data, and so on. TCB stops each VM from touching data in other VMs.

     Browser in VM C isn’t in TCB. Can’t touch data in VM A, if TCB works correctly.

     Alice also runs many VMs.

  7. Cryptography

     How does Bob’s laptop know that incoming network data is from Alice’s laptop?

     Cryptographic solution: Message-authentication codes.

     Diagram: Alice’s message plus the key k gives an authenticated message, which crosses the untrusted network. With k, the receiving side turns an authenticated message back into Alice’s message, and turns a modified message into “Alert: forgery!”
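
The last step of the diagram, where Bob's side decides between accepting the message and raising "Alert: forgery!", comes down to comparing the received tag with the tag recomputed from k. Below is a minimal sketch of that comparison only. The name `tags_equal` is hypothetical, and the tag computation itself is assumed to come from a vetted MAC implementation that is not shown. The loop reads every byte and does not branch on tag contents, although a compiler is not obliged to preserve that property.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the two tags match, 0 otherwise.
   Every byte is read, and no branch depends on the tag contents. */
int tags_equal(const uint8_t *received, const uint8_t *expected, size_t len)
{
    uint8_t diff = 0;
    for (size_t i = 0; i < len; i++)
        diff |= (uint8_t)(received[i] ^ expected[i]);
    /* diff is 0 exactly when all bytes matched.
       (diff - 1) wraps to 0xFFFFFFFF only for diff == 0, so bit 8 is
       set only in that case; shift it down to obtain 1 or 0. */
    return (int)(1 & (((uint32_t)diff - 1) >> 8));
}
```

Bob would accept the message only when `tags_equal` returns 1 for the received tag and the recomputed tag.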

  12. Important for Alice and Bob to share the same secret k.

      What if attacker was spying on their communication of k?

      Solution 1: Public-key encryption.

      Diagram labels: private key a, public key aG, ciphertext, k. The public key aG and the ciphertext cross the network; k appears at both ends.

  17. Solution 2: Public-key signatures.

      Diagram labels: m, a, signed message, aG. The signed message and aG cross the network; m appears at both ends.

      No more shared secret k, but Alice still has secret a.

      Cryptography requires TCB to protect secrecy of keys, even if user has no other secrets.

  22. Constant-time software

      Large portion of CPU hardware: optimizations depending on addresses of memory locations.

      Consider data caching, instruction caching, parallel cache banks, store-to-load forwarding, branch prediction, etc.

      Many attacks (e.g. TLBleed from 2018 Gras–Razavi–Bos–Giuffrida) show that this portion of the CPU has trouble keeping secrets.

  27. Typical literature on this topic:

      Understand this portion of CPU. But details are often proprietary, not exposed to security review.

      Try to push attacks further. This becomes very complicated.

      Tweak the attacked software to try to stop the known attacks.

      For researchers: This is great!

      For auditors: This is a nightmare. Many years of security failures. No confidence in future security.

  33. The “constant-time” solution: Don’t give any secrets to this portion of the CPU.

      (1987 Goldreich, 1990 Ostrovsky: Oblivious RAM; 2004 Bernstein: domain-specific for better speed)

      TCB analysis: Need this portion of the CPU to be correct, but don’t need it to keep secrets. Makes auditing much easier.

      Good match for attitude and experience of CPU designers: e.g., Intel issues errata for correctness bugs, not for information leaks.
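
To make "don't give any secrets to this portion of the CPU" concrete, here is a small sketch for one common case, a table lookup at a secret position. The function names are hypothetical and the approach is only one simple way to remove the secret address: read every entry in a fixed order and select the wanted one with a mask. It costs n loads instead of one, and a compiler could in principle reintroduce data-dependent behavior.

```c
#include <stddef.h>
#include <stdint.h>

/* Leaky version: the load address depends on the secret index,
   so caches and similar hardware see the secret. */
uint32_t lookup_leaky(const uint32_t *table, size_t n, size_t secret_index)
{
    (void)n; /* unused; kept so both functions share one signature */
    return table[secret_index];
}

/* Sketch of the constant-time idea: the sequence of addresses is
   table[0], table[1], ..., table[n-1] regardless of the secret. */
uint32_t lookup_scan(const uint32_t *table, size_t n, size_t secret_index)
{
    uint32_t result = 0;
    for (size_t i = 0; i < n; i++) {
        /* x is 0 exactly when i == secret_index */
        uint64_t x = (uint64_t)i ^ (uint64_t)secret_index;
        /* (x | -x) has its top bit set exactly when x != 0 */
        uint64_t nonzero = (x | ((uint64_t)0 - x)) >> 63;
        /* mask is all ones when x == 0, all zeros otherwise */
        uint32_t mask = (uint32_t)0 - (uint32_t)(1 ^ nonzero);
        result |= table[i] & mask;
    }
    return result;
}
```

Both functions return the same value for any in-range index; only the second keeps the index out of the address stream.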

  39. Case study: Constant-time sorting

      Serious risk within 10 years: Attacker has quantum computer breaking today’s most popular public-key crypto (RSA and ECC; e.g., finding a given aG).

      2017: Hundreds of people submit 69 complete proposals to international competition for post-quantum crypto standards.

      Subroutine in some submissions: sort array of secret integers. e.g. sort 768 32-bit integers.

  45. How to sort secret data without any secret addresses?

      Typical sorting algorithms (merge sort, quicksort, etc.) choose load/store addresses based on secret data. Usually also branch based on secret data.

      One submission to competition: “Radix sort is used as constant-time sorting algorithm.”

      Some versions of radix sort avoid secret branches. But data addresses in radix sort still depend on secrets.
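
The point about radix sort can be seen in a single counting pass. The sketch below is illustrative only and is not taken from any competition submission; the function name and the byte-sized digit are assumptions. No branch depends on the data, yet the array slot that gets incremented is chosen by the secret value.

```c
#include <stddef.h>
#include <stdint.h>

/* One counting pass of a byte-wise radix sort (illustrative only).
   shift must be 0, 8, 16 or 24. The loop has no data-dependent
   branch, but the address &count[digit] is chosen by secret data. */
void radix_count_pass(const uint32_t *x, size_t n, unsigned shift,
                      size_t count[256])
{
    for (size_t d = 0; d < 256; d++)
        count[d] = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t digit = (x[i] >> shift) & 0xff; /* derived from the secret */
        count[digit]++;                          /* secret-dependent address */
    }
}
```

An observer of cache behavior can learn which `count` slots were touched, which is the leak the slide describes.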

  55. 13 14 How to sort secret data without any secret addresses? Typical sorting algorithms (merge sort, quicksort, etc.) choose load/store addresses based on secret data. Usually also branch based on secret data. One submission to competition: “Radix sort is used as constant-time sorting algorithm.” Some versions of radix sort avoid secret branches. But data addresses in radix sort still depend on secrets.
      Foundation of solution: a comparator sorting 2 integers. [Diagram: comparator with inputs x, y and outputs min{x, y}, max{x, y}.] Easy constant-time exercise in C. Warning: C standard allows compiler to screw this up. Even easier exercise in asm.
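The "easy constant-time exercise in C" might look like the following sketch. It is one possible branch-free comparator, not necessarily the one used in the talk; the name `minmax` is borrowed from the later slide, and the 64-bit subtraction trick is an assumption chosen for readability. As the slide warns, nothing in the C standard stops a compiler from reintroducing a branch, so the generated machine code still needs to be checked.

```c
#include <assert.h>
#include <stdint.h>

/* Branch-free compare-exchange: afterwards *a <= *b.
   No branch and no memory address depends on the values. */
static void minmax(int32_t *a, int32_t *b)
{
  int32_t x = *a;
  int32_t y = *b;
  /* Difference computed in 64 bits so it cannot overflow;
     it is negative exactly when y < x. */
  int64_t d = (int64_t)y - (int64_t)x;
  /* Sign bit of d turned into a mask: -1 (all ones) if y < x, else 0. */
  int32_t mask = -(int32_t)((uint64_t)d >> 63);
  /* x ^ y if a swap is needed, 0 otherwise. */
  int32_t swap = (x ^ y) & mask;
  *a = x ^ swap;
  *b = y ^ swap;
}
```

For example, calling `minmax(&a, &b)` with `a = 5` and `b = -3` leaves `a = -3` and `b = 5`, executing the same instruction sequence as for inputs already in order.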

  59. 14 15 Foundation of solution: a comparator sorting 2 integers. [Diagram: comparator with inputs x, y and outputs min{x, y}, max{x, y}.] Easy constant-time exercise in C. Warning: C standard allows compiler to screw this up. Even easier exercise in asm.
      Combine comparators into a sorting network for more inputs. Example of a sorting network: [Diagram: sorting network built from comparators.]

  65. 15 16 Combine comparators into a sorting network for more inputs. Example of a sorting network: [Diagram: sorting network built from comparators.]
      Positions of comparators in a sorting network are independent of the input. Naturally constant-time. But (n^2 − n)/2 comparators produce complaints about performance as n increases. Speed is a serious issue in the post-quantum competition. “Cost” is evaluation criterion; “we’d like to stress this once again on the forum that we’d really like to see more platform-optimized implementations”; etc.

  69. 16 17 Positions of comparators in a sorting network are independent of the input. Naturally constant-time. But (n^2 − n)/2 comparators produce complaints about performance as n increases. Speed is a serious issue in the post-quantum competition. “Cost” is evaluation criterion; “we’d like to stress this once again on the forum that we’d really like to see more platform-optimized implementations”; etc.

      void int32_sort(int32 *x,int64 n)
      { int64 t,p,q,i;
        if (n < 2) return;
        t = 1;
        while (t < n - t) t += t;
        for (p = t;p > 0;p >>= 1) {
          for (i = 0;i < n - p;++i)
            if (!(i & p))
              minmax(x+i,x+i+p);
          for (q = t;q > p;q >>= 1)
            for (i = 0;i < n - q;++i)
              if (!(i & p))
                minmax(x+i+p,x+i+q);
        }
      }
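The slide's routine relies on `int32`, `int64` and `minmax`, none of which it defines. A self-contained sketch follows, assuming that `int32` and `int64` mean the fixed-width types from `<stdint.h>` and using a hypothetical branch-free `minmax`; the loop structure is the slide's, while everything around it is an assumption made so the code compiles on its own.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t int32; /* assumed meaning of the slide's int32 */
typedef int64_t int64; /* assumed meaning of the slide's int64 */

/* Hypothetical comparator (not shown on the slide): afterwards *a <= *b. */
static void minmax(int32 *a, int32 *b)
{
  int32 x = *a;
  int32 y = *b;
  int64 d = (int64)y - (int64)x;              /* negative iff y < x */
  int32 mask = -(int32)((uint64_t)d >> 63);   /* -1 if y < x, else 0 */
  int32 swap = (x ^ y) & mask;
  *a = x ^ swap;
  *b = y ^ swap;
}

/* Loop structure as on the slide. Which pairs are compared depends
   only on n, never on the contents of x. */
void int32_sort(int32 *x, int64 n)
{
  int64 t, p, q, i;
  if (n < 2) return;
  t = 1;
  while (t < n - t) t += t;   /* largest power of 2 below n */
  for (p = t; p > 0; p >>= 1) {
    for (i = 0; i < n - p; ++i)
      if (!(i & p))
        minmax(x + i, x + i + p);
    for (q = t; q > p; q >>= 1)
      for (i = 0; i < n - q; ++i)
        if (!(i & p))
          minmax(x + i + p, x + i + q);
  }
}
```

A call such as `int32_sort(x, 7)` on a 7-element array exercises a length that is not a power of 2. The `if (!(i & p))` tests branch on loop indices, which are public, not on the data being sorted.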

  73. 17 18
      void int32_sort(int32 *x,int64 n)
      { int64 t,p,q,i;
        if (n < 2) return;
        t = 1;
        while (t < n - t) t += t;
        for (p = t;p > 0;p >>= 1) {
          for (i = 0;i < n - p;++i)
            if (!(i & p))
              minmax(x+i,x+i+p);
          for (q = t;q > p;q >>= 1)
            for (i = 0;i < n - q;++i)
              if (!(i & p))
                minmax(x+i+p,x+i+q);
        }
      }

      Previous slide: C translation of 1973 Knuth “merge exchange”, which is a simplified version of 1968 Batcher “odd-even merge” sorting networks. ≈ n(log_2 n)^2/4 comparators. Much faster than bubble sort. Warning: many other descriptions of Batcher’s sorting networks require n to be a power of 2. Also, Wikipedia says “Sorting networks ... are not capable of handling arbitrarily large inputs.”
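To make the comparator counts concrete, the sketch below walks the same loops as the merge-exchange routine but only counts how many comparators it would apply, and compares that with the (n^2 − n)/2 of the quadratic network. The function names are hypothetical. For the n = 768 case mentioned earlier in the talk, (n^2 − n)/2 is 294528, while the estimate n(log_2 n)^2/4 is roughly 17600.

```c
#include <assert.h>
#include <stdint.h>

/* Number of comparators the merge-exchange loops apply for length n.
   Same loop structure as the sorting routine, with the comparator
   call replaced by a counter. */
static int64_t merge_exchange_comparators(int64_t n)
{
  int64_t t, p, q, i, count = 0;
  if (n < 2) return 0;
  t = 1;
  while (t < n - t) t += t;
  for (p = t; p > 0; p >>= 1) {
    for (i = 0; i < n - p; ++i)
      if (!(i & p)) ++count;
    for (q = t; q > p; q >>= 1)
      for (i = 0; i < n - q; ++i)
        if (!(i & p)) ++count;
  }
  return count;
}

/* Comparators in the quadratic network: (n^2 - n)/2. */
static int64_t quadratic_comparators(int64_t n)
{
  return n * (n - 1) / 2;
}
```

Because the count depends only on n, it can be computed once per array length, which is another way of seeing that the running time does not depend on the data.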

  77. 18 19 Previous slide: C translation of 1973 Knuth “merge exchange”, which is a simplified version of 1968 Batcher “odd-even merge” sorting networks. ≈ n(log_2 n)^2/4 comparators. Much faster than bubble sort. Warning: many other descriptions of Batcher’s sorting networks require n to be a power of 2. Also, Wikipedia says “Sorting networks ... are not capable of handling arbitrarily large inputs.”
      [Diagram:] This constant-time sorting code → (vectorization, for Haswell) → Constant-time sorting code included in 2017 Bernstein–Chuengsatiansup–Lange–van Vredendaal “NTRU Prime” software release → (revamped for higher speed) → New: “djbsort” constant-time sorting code.

  81. 19 20 [Diagram:] This constant-time sorting code → (vectorization, for Haswell) → Constant-time sorting code included in 2017 Bernstein–Chuengsatiansup–Lange–van Vredendaal “NTRU Prime” software release → (revamped for higher speed) → New: “djbsort” constant-time sorting code.
      The slowdown for constant time. Massive fast-sorting literature. 2015 Gueron–Krasnov: AVX and AVX2 (Haswell) optimization of quicksort. For 32-bit integers: ≈ 45 cycles/byte for n ≈ 2^10, ≈ 55 cycles/byte for n ≈ 2^20. Slower than “the radix sort implemented of IPP, which is the fastest in-memory sort we are aware of”: 32, 40 cycles/byte. IPP: Intel’s Integrated Performance Primitives library.

  84. 19 20 The slowdown for constant time. Massive fast-sorting literature. 2015 Gueron–Krasnov: AVX and AVX2 (Haswell) optimization of quicksort. For 32-bit integers: ≈ 45 cycles/byte for n ≈ 2^10, ≈ 55 cycles/byte for n ≈ 2^20. Slower than “the radix sort implemented of IPP, which is the fastest in-memory sort we are aware of”: 32, 40 cycles/byte. IPP: Intel’s Integrated Performance Primitives library.
      Constant-time results, again on Haswell CPU core:
