Making Byzantine Fault Tolerant Systems Tolerate Byzantine Failures


  1. Making Byzantine Fault Tolerant Systems Tolerate Byzantine Failures. Allen Clement, Mirco Marchetti, Edmund Wong, Lorenzo Alvisi, Mike Dahlin

  2. BFT Systems: PBFT [OSDI 98]; Attested Append Only Memory [SOSP 07]; HQ [OSDI 06]; Beyond 1/3 Faulty in BFT [SOSP 07]; Zyzzyva [SOSP 07]; BASE [OSDI 02]; HT BFT [DSN 04]; SafeStore [USENIX 07]; Q/U [SOSP 05]; Separating Agreement from Execution [SOSP 03]; BFT Under Attack [NSDI 08]; Commit Barrier Scheduling [SOSP 07]; SUNDR [OSDI 04]; Low Overhead BFT [SOSP 07]; ...

  3-5. System throughput in ops/sec (the Aardvark row first appears on slide 5)

         System     Best case   Faulty client   Client flood   Faulty primary   Faulty replica
         PBFT       62k         0               crash          1k               250
         Q/U        24k         0               crash          NA               19k
         HQ         15k         NA              4.5k           NA               crash
         Zyzzyva    65k         0               crash          crash            0
         Aardvark   39k         39k             7.8k           37k              11k

  6. Outline. Robust BFT: the case for a new goal. Aardvark: designing for RBFT. Evaluation: RBFT in action.

  7. Paved with good intentions. No BFT protocol should rely on synchrony for safety. FLP: no consensus protocol can be both safe and live in an asynchronous system! All one can guarantee is eventual progress. “Handle normal and worst case separately as a rule, because the requirements for the two are quite different: the normal case must be fast; the worst case must make some progress” -- Butler Lampson, “Hints for Computer System Design”

  8. Recasting the problem. Maximize performance when the network is synchronous and all clients and servers behave correctly, while remaining safe if at most f servers fail, and eventually live.

  9-12. Recasting the problem. Misguided: it encourages systems that fail to deliver BFT. Dangerous: it encourages fragile optimizations. Futile: it yields diminishing returns on the common case.

  13-15. A New Goal (diagram: network conditions, Asynchronous or Synchronous, set against fault conditions, Failures or No Failures; slide 14 adds a “?” beside the Synchronous label and the Failures label)

  16. Robust BFT. Maximize performance when the network is synchronous and at most f servers fail, while remaining safe if at most f servers fail, and eventually live.

  17. Outline. Robust BFT: the case for a new goal. Aardvark: designing for RBFT. Evaluation: RBFT in action.

  18. Protocol Structure (diagram: Step 1, Step 2, Step 3; legend: “Good” messages, computation steps, “Bad” messages)

  19. Fragile Optimizations

  20-23. Revisiting conventional wisdom. “Signatures are expensive, use MACs”: faulty clients can use MACs to generate ambiguity, so Aardvark requires clients to sign requests. “View changes are to be avoided”: Aardvark uses regular view changes to maintain high throughput despite faulty primaries. “Hardware multicast is a boon”: Aardvark uses separate work queues for clients and individual replicas.

  24-35. Big MAC Attack (animated diagram; the only surviving labels are the client marker c, Faulty Client, and Faulty Primary)
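
The ambiguity that a faulty client can create with MACs can be illustrated with a small sketch. It assumes the PBFT-style arrangement in which a client attaches one MAC per replica (an authenticator) and each replica can check only its own entry; the replica names, keys, and helper functions below are hypothetical and are not taken from the slides.

```python
import hashlib
import hmac


def mac(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag for message under a pairwise shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def make_authenticator(keys: dict, message: bytes, corrupt=frozenset()) -> dict:
    """Build one MAC per replica; a faulty client garbles the entries in `corrupt`."""
    auth = {}
    for replica, key in keys.items():
        tag = mac(key, message)
        if replica in corrupt:
            tag = bytes(len(tag))  # all-zero tag, invalid for this replica
        auth[replica] = tag
    return auth


def replica_accepts(replica: str, keys: dict, message: bytes, auth: dict) -> bool:
    """A replica can verify only its own entry of the authenticator."""
    return hmac.compare_digest(auth[replica], mac(keys[replica], message))


# Hypothetical 4-replica deployment in which r0 is the primary.
keys = {f"r{i}": f"client-key-{i}".encode() for i in range(4)}
request = b"op=write;x=1"

# Faulty client: a valid MAC for the primary, garbage for every other replica.
auth = make_authenticator(keys, request, corrupt={"r1", "r2", "r3"})
verdicts = {r: replica_accepts(r, keys, request, auth) for r in keys}
print(verdicts)
```

Here the primary sees a request it can authenticate while the other replicas cannot, and none of them can tell from the MACs alone whether the client or the primary misbehaved. A signature is a single value that every replica checks in the same way, which removes that ambiguity.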

  36-44. Hybrid MAC/Signatures (animated diagram of client c submitting a request; captions in order of appearance: request submission; “The MAC is good. How is the signature?”; “Signature is good too!”; primary orders request)

  45. Signed Request Filtering (flowchart boxes: Client Blacklisted?; Verify MAC; Verify Signature; Blacklist Client; Process Request)
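
One way to read the flowchart is as a pipeline that runs the cheap checks first: drop blacklisted clients, verify the MAC, verify the signature, and blacklist a client whose MAC is valid but whose signature is not. The sketch below follows that reading; the ordering of the boxes is an interpretation of the labels, and the `Request` type, the `RequestFilter` class, and the toy signature check are hypothetical stand-ins rather than Aardvark's actual code.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Callable, Dict, Set


@dataclass
class Request:
    client_id: str
    payload: bytes
    signature: bytes
    mac_tag: bytes  # MAC over payload + signature, keyed per client


class RequestFilter:
    def __init__(self, mac_keys: Dict[str, bytes],
                 verify_signature: Callable[[str, bytes, bytes], bool]):
        self.mac_keys = mac_keys
        self.verify_signature = verify_signature
        self.blacklist: Set[str] = set()

    def admit(self, req: Request) -> bool:
        """Return True if the request should be processed."""
        if req.client_id in self.blacklist:            # Client blacklisted?
            return False
        key = self.mac_keys.get(req.client_id)
        if key is None:
            return False
        expected = hmac.new(key, req.payload + req.signature, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, req.mac_tag):   # Verify MAC
            return False
        if not self.verify_signature(req.client_id, req.payload, req.signature):
            self.blacklist.add(req.client_id)          # Blacklist client
            return False
        return True                                    # Process request


def toy_verify(client_id: str, payload: bytes, signature: bytes) -> bool:
    # Stand-in for a real public-key verification; NOT a secure signature.
    return signature == hashlib.sha256(b"sig:" + client_id.encode() + payload).digest()


def make_request(client_id: str, payload: bytes, key: bytes, signature: bytes) -> Request:
    tag = hmac.new(key, payload + signature, hashlib.sha256).digest()
    return Request(client_id, payload, signature, tag)


flt = RequestFilter({"c1": b"k1"}, toy_verify)
good_sig = hashlib.sha256(b"sig:c1" + b"op").digest()
ok = flt.admit(make_request("c1", b"op", b"k1", good_sig))     # valid MAC and signature
bad = flt.admit(make_request("c1", b"op", b"k1", b"forged"))   # valid MAC, bad signature
after = flt.admit(make_request("c1", b"op", b"k1", good_sig))  # client is now blacklisted
print(ok, bad, after)
```

The point of the ordering is cost: the MAC check is cheap and ties the message to a specific client, so the expensive signature check runs only for senders that can already be held accountable if it fails.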

  46. Big MAC Attack: PBFT (diagram labels: request submission; primary orders request; replicas agree on the next request; replicas respond to the client)

  47. Big MAC Attack: Zyzzyva (diagram labels: request submission; primary orders request; replicas agree on the next request; execute the request; replicas respond to the client)

  48. Big MAC Attack: Q/U (diagram labels: request submission; “primary” orders request; replicas agree on the next request; execute the request; replicas respond to the client; view change)

  49. Big MAC Attack: HQ (diagram labels: request submission; “primary” orders request; replicas agree on the next request; execute the request; replicas respond to the client; view change)

  50. Slow Primary

  53-56. Adaptive View Changes (animated plot of throughput over time; curves: Observed Throughput, Required Throughput)
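
The plot compares observed throughput against a required level. A minimal sketch of that comparison is given below. How the required level is chosen and raised (a fraction of the recent peak, then a multiplicative bump while the primary keeps up) is an assumption of this sketch, and the class name, method names, and parameter values are hypothetical.

```python
class ThroughputMonitor:
    """Tracks the throughput a replica demands from the current primary."""

    def __init__(self, initial_required: float, raise_factor: float = 1.01):
        self.required = initial_required
        self.raise_factor = raise_factor  # assumed multiplicative increase

    def start_view(self, recent_peaks, fraction: float = 0.9) -> None:
        """Reset the bar to a fraction of the best throughput seen in recent views."""
        self.required = fraction * max(recent_peaks)

    def should_change_view(self, observed: float) -> bool:
        """Return True when the primary falls below the required throughput."""
        if observed < self.required:
            return True
        # The primary kept up, so the bar is raised for the next check.
        self.required *= self.raise_factor
        return False


# A primary that sustains 120 ops/sec while the bar climbs from 100.
monitor = ThroughputMonitor(initial_required=100.0, raise_factor=1.1)
decisions = [monitor.should_change_view(120.0) for _ in range(3)]
print(decisions)
```

Because the bar keeps rising, even a correct primary is eventually replaced, which is what makes the view changes regular rather than exceptional; a slow primary is simply replaced sooner.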

  57. Implementation details: sign client requests; adaptive view change; separate network channels; fair scheduling (clients vs. replicas, replicas vs. replicas); exploit multicore architectures
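
The fair-scheduling item can be sketched as round-robin service over separate queues, one shared by clients and one per replica, so that a flooding source cannot starve the others. The `FairScheduler` class and the message strings are hypothetical, and the sketch leaves out the separate network channels.

```python
from collections import deque
from typing import Deque, Dict, Iterator, Tuple


class FairScheduler:
    """One queue for all clients plus one queue per replica, served round-robin."""

    def __init__(self, replica_ids):
        self.queues: Dict[str, Deque[str]] = {"clients": deque()}
        for rid in replica_ids:
            self.queues[rid] = deque()

    def enqueue(self, source: str, message: str) -> None:
        self.queues[source].append(message)

    def drain(self) -> Iterator[Tuple[str, str]]:
        """Take at most one message per source on each pass until all queues are empty."""
        while any(self.queues.values()):
            for source, queue in self.queues.items():
                if queue:
                    yield source, queue.popleft()


sched = FairScheduler(["r1", "r2"])
for i in range(3):
    sched.enqueue("r1", f"flood-{i}")  # r1 floods the replica
sched.enqueue("r2", "prepare")
sched.enqueue("clients", "request")
order = list(sched.drain())
print(order)
```

Although r1 enqueued its three messages first, the client request and the message from r2 are each served within the first pass.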

  58. Outline. Robust BFT: the case for a new goal. Aardvark: designing for RBFT. Evaluation: RBFT can work.

  59. Throughput vs. Latency (plot: latency in ms, 0 to 7, against throughput in Kops/sec, 0 to 80; curves: HQ, Q/U, Aardvark, PBFT, Zyzzyva)

  60. Aardvark, Incrementally

         System     MAC client request   Sign client request   Adaptive view change
         PBFT       62k                  30k                   -
         Aardvark   58k                  39k                   39k

  61. Performance with failures. Byzantine failures are arbitrary. Good faith effort.

  62. Big MAC Attack (throughput in ops/sec)

         System     Peak   Faulty client
         PBFT       62k    0
         Q/U        24k    0
         HQ         7.6k   -
         Zyzzyva    65k    0
         Aardvark   39k    39k

  63. Slow Primary (throughput in ops/sec)

         System     Peak   1ms delay   10ms delay   100ms delay
         PBFT       62k    5k          5k           1k
         Zyzzyva    65k    28k         5k           crash
         Aardvark   39k    38k         37k          38k
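
The slow-primary numbers are easier to compare as the fraction of peak throughput each system retains. The short calculation below uses only the figures on the slide; counting “crash” as zero throughput is an assumption of this sketch, and the helper name is hypothetical.

```python
def parse_ops(cell: str) -> float:
    """Convert a cell such as '62k' to ops/sec; 'crash' is counted as 0 (assumption)."""
    if cell == "crash":
        return 0.0
    return float(cell.rstrip("k")) * 1000


# Columns: peak, 1ms delay, 10ms delay, 100ms delay (from the slow-primary slide).
slow_primary = {
    "PBFT": ["62k", "5k", "5k", "1k"],
    "Zyzzyva": ["65k", "28k", "5k", "crash"],
    "Aardvark": ["39k", "38k", "37k", "38k"],
}

retained = {
    system: [parse_ops(cell) / parse_ops(cells[0]) for cell in cells[1:]]
    for system, cells in slow_primary.items()
}

for system, fractions in retained.items():
    print(system, [f"{fraction:.1%}" for fraction in fractions])
```

On these figures Aardvark keeps roughly 95% or more of its peak at every delay, while PBFT drops below 10% of its peak as soon as the primary adds a 1ms delay.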

  64. Summary. RBFT: a new goal for BFT systems. Aardvark: rejecting conventional wisdom. Evaluation: it works!
