Failure Detectors: Concurrency Trilogy Part IV

Announcements: Project proposals are due tonight, unless you got an extension. Only a few hours left to submit.


  1. Revisiting RSMs [Figure: machines M0 to M3, each running a KVStore on top of Raft and holding the log set(x, 5), 0, 1; get(x), 1, 2; cas(x,5,4), 2, 2. Labelled steps: g: cas(x, 5, 4); h: AppendEntries, Term = 2, leaderCommit = 2; i: cas(x, 5, 4). Clients sit outside the machines.] Is this correct?

  2. Revisiting RSMs [Figure: same setup as slide 1, with two further labelled steps: j: KeyValue(x, 4); k.]

  3. Configuration Change

  4. Why? • Want to be able to change the set of servers.

  5. Why? • Want to be able to change the set of servers. • Take down servers for maintenance.

  6. Why? • Want to be able to change the set of servers. • Take down servers for maintenance. • Add new servers to replace failed ones.

  7. Why? • Want to be able to change the set of servers. • Take down servers for maintenance. • Add new servers to replace failed ones. • Other reasons.

  8. How? • Use a special log message which contains the set of servers. [Figure: log entry with fields Index, Term, Config]

  9. How? • Use a special log message which contains the set of servers. • Use Raft to replicate this to everyone. [Figure: log entry with fields Index, Term, Config]

  10. How Special? • All peers use configuration as soon as logged. • Why safe? • We know how to revert this change. [Figure: log entry with fields Index, Term, Config]
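The rule on slide 10 can be sketched in a few lines. This is only an illustration, not the course's implementation: the `LogEntry` fields mirror the Index / Term / Config labels on the slide, while the `command` field, the `current_config` helper, and the server names `s0` to `s4` are hypothetical. The sketch shows a peer picking up the newest configuration entry in its log without waiting for commit, and shows that truncating the log reverts the change.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class LogEntry:
    index: int
    term: int
    command: Optional[str] = None            # ordinary state-machine command
    config: Optional[FrozenSet[str]] = None  # set only on configuration entries


def current_config(log: List[LogEntry], initial: FrozenSet[str]) -> FrozenSet[str]:
    """Return the server set of the latest configuration entry in the log.

    Commit status is deliberately ignored: an entry takes effect once logged.
    """
    for entry in reversed(log):
        if entry.config is not None:
            return entry.config
    return initial  # no configuration entry logged yet


# Hypothetical three-server cluster that is being grown to five servers.
initial = frozenset({"s0", "s1", "s2"})
log = [
    LogEntry(0, 0, command="set(x, 5)"),
    LogEntry(1, 0, command="set(x, 6)"),
    LogEntry(2, 0, config=frozenset({"s0", "s1", "s2", "s3", "s4"})),
]

active = current_config(log, initial)        # the five-server set
reverted = current_config(log[:2], initial)  # entry removed: back to `initial`
```

Because the configuration is derived from the log, the usual Raft log-truncation path is also the revert path.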

  11. Protocol [Figure: three server logs of different lengths; entries shown as index/term cells, all term 0, starting set(x, 5), set(x, 6), ...]

  12. Protocol [Figure: same as slide 11]

  13. Protocol [Figure: as slide 11; the longest log now ends with a configuration entry C at index 5]

  14. Protocol [Figure: five server logs, each containing C at index 5]

  15. Protocol [Figure: same as slide 13]

  16. Protocol [Figure: same as slide 13]

  17. Protocol [Figure: as slide 13; one of the shorter logs has gained an entry at index 3 with term 1]

  18. Protocol [Figure: two logs now end at index 3 with a term-1 entry; the third still ends with C at index 5]

  19. Protocol [Figure: the log that held C now ends at index 2; the other two end at index 3 with a term-1 entry]

  20. Protocol [Figure: all three logs end at index 3 with a term-1 entry]

  21. Protocol [Figure: same as slide 20] What happens now?

  22. Protocol [Figure: same as slide 11]

  23. Protocol [Figure: same as slide 11]

  24. Protocol [Figure: as slide 11; the longest log now ends with a configuration entry C-all at index 5]

  25. Protocol [Figure: five server logs; four contain C-all at index 5, one still ends at index 4]

  26. Protocol [Figure: as slide 25; one log now has C-new appended after C-all]

  27. Protocol [Figure: four server logs, each containing C-all followed by C-new]
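The slides do not spell out what quorum rule applies while C-all is in effect. A plausible reading, assumed here, is that C-all plays the role of the joint configuration from the Raft paper: an entry counts as committed only if a majority of the old servers and a majority of the new servers have acknowledged it. The helper names and the server names `s0` to `s4` below are hypothetical.

```python
from typing import FrozenSet, Set


def has_majority(acks: Set[str], members: FrozenSet[str]) -> bool:
    """True if strictly more than half of `members` acknowledged."""
    return len(acks & members) > len(members) // 2


def joint_quorum(acks: Set[str], c_old: FrozenSet[str], c_new: FrozenSet[str]) -> bool:
    """Assumed rule for C-all: majorities of BOTH configurations are required."""
    return has_majority(acks, c_old) and has_majority(acks, c_new)


# Hypothetical change from three servers to five.
c_old = frozenset({"s0", "s1", "s2"})
c_new = frozenset({"s0", "s1", "s2", "s3", "s4"})

old_only = joint_quorum({"s0", "s1"}, c_old, c_new)        # majority of old only
both = joint_quorum({"s0", "s1", "s3"}, c_old, c_new)      # majority of each
new_only = joint_quorum({"s2", "s3", "s4"}, c_old, c_new)  # majority of new only
```

Under this assumption, {s0, s1} (a majority of the old set) and {s2, s3, s4} (a majority of the new set) are disjoint, and neither can decide alone during the transition. That is the hazard a single-step switch would leave open.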

  28. Failure Detectors

  29. What Problem? • We have been depending on random timeouts, etc. to build consensus.

  30. What Problem? • We have been depending on random timeouts, etc. to build consensus. • Based on partial synchrony: the network is not always behaving at its worst.

  31. What Problem? • We have been depending on random timeouts, etc. to build consensus. • Based on partial synchrony: the network is not always behaving at its worst. • Tedious to model (for proofs) and tune (for deployment).

  32. What Problem? • We have been depending on random timeouts, etc. to build consensus. • Based on partial synchrony: the network is not always behaving at its worst. • Tedious to model (for proofs) and tune (for deployment). • Abstract them away with failure detectors.

  33. Failure Detector [Figure: an Application box and a Failure Detector box]

  34. Failure Detector [Figure: Application and Failure Detector] Detector output: • suspect p0 is failed.

  35. Failure Detector [Figure: Application and Failure Detector] Detector output: • suspect p0 is failed. • suspect p0, p1 are failed.

  36. Failure Detector [Figure: Application and Failure Detector] Detector output: • suspect p0 is failed. • suspect p0, p1 are failed. • suspect p1, p2 are failed.

  37. Failure Detector [Figure: Application and Failure Detector] Detector output: • suspect p0 is failed. • suspect p0, p1 are failed. • suspect p1, p2 are failed. • suspect p1 is failed.
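To make the interface on slides 33 to 37 concrete, here is a minimal sketch of one possible detector, assuming a heartbeat-and-timeout design (the slides do not say how the detector is built). The class name, method names, and timing values are hypothetical. The clock is passed in explicitly so the run is deterministic, and the heartbeats are arranged to reproduce the sequence of outputs on slide 37, including suspicions that are later withdrawn.

```python
from typing import Dict, List, Set


class HeartbeatDetector:
    """Suspects any peer whose last heartbeat is older than `timeout`."""

    def __init__(self, peers: Set[str], timeout: float, now: float = 0.0) -> None:
        self.timeout = timeout
        self.last_seen: Dict[str, float] = {p: now for p in peers}

    def heartbeat(self, peer: str, now: float) -> None:
        """Record that a heartbeat from `peer` arrived at time `now`."""
        self.last_seen[peer] = now

    def suspects(self, now: float) -> Set[str]:
        """The set the application sees when it queries the detector."""
        return {p for p, t in self.last_seen.items() if now - t > self.timeout}


fd = HeartbeatDetector({"p0", "p1", "p2"}, timeout=2.0)
outputs: List[Set[str]] = []

fd.heartbeat("p1", 1.0)
fd.heartbeat("p2", 1.0)
outputs.append(fd.suspects(2.5))  # p0 has been silent too long

fd.heartbeat("p2", 3.0)
outputs.append(fd.suspects(3.5))  # now p1 is silent as well

fd.heartbeat("p0", 5.0)           # p0 was only slow, not failed
outputs.append(fd.suspects(5.5))

fd.heartbeat("p2", 6.0)
outputs.append(fd.suspects(6.5))
```

The application only ever calls `suspects`; the timeouts stay hidden inside the detector, which is the abstraction slide 32 asks for.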

  38. Reasoning about Detectors • Completeness • Accuracy

  39. Reasoning about Detectors • Completeness (failed nodes): When are they detected? Who detects them? • Accuracy

  40. Reasoning about Detectors • Completeness (failed nodes): When are they detected? Who detects them? • Accuracy (live nodes): When can they be suspected?

  41. Reasoning about Detectors [Table: columns Completeness, Accuracy; rows Strong, Weak]

  42. Reasoning about Detectors • Strong completeness: Every failed node is eventually detected by all correct nodes.

  43. Reasoning about Detectors • Strong completeness: Every failed node is eventually detected by all correct nodes. • Weak completeness: Every failed node is eventually detected by some correct node.

  44. Reasoning about Detectors • Strong completeness: Every failed node is eventually detected by all correct nodes. • Weak completeness: Every failed node is eventually detected by some correct node. • Strong accuracy: No correct node is ever suspected.

  45. Reasoning about Detectors • Strong completeness: Every failed node is eventually detected by all correct nodes. • Weak completeness: Every failed node is eventually detected by some correct node. • Strong accuracy: No correct node is ever suspected. • Weak accuracy: Some correct node is never suspected by any node.
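The four definitions on slide 45 can be written as predicates over a recorded run. This is a rough sketch under simplifying assumptions: the run is finite, so "eventually" is approximated by the final snapshot `final` (what each correct node suspects at the end), and `ever` records everything each node suspected at any point. All names and the example run are hypothetical.

```python
from typing import Dict, Set


def strong_completeness(failed: Set[str], correct: Set[str],
                        final: Dict[str, Set[str]]) -> bool:
    """Every failed node ends up suspected by all correct nodes."""
    return all(failed <= final[q] for q in correct)


def weak_completeness(failed: Set[str], correct: Set[str],
                      final: Dict[str, Set[str]]) -> bool:
    """Every failed node ends up suspected by at least one correct node."""
    return all(any(f in final[q] for q in correct) for f in failed)


def strong_accuracy(correct: Set[str], ever: Dict[str, Set[str]]) -> bool:
    """No correct node was suspected at any point by anyone."""
    return all(not (suspected & correct) for suspected in ever.values())


def weak_accuracy(correct: Set[str], ever: Dict[str, Set[str]]) -> bool:
    """At least one correct node was never suspected by anyone."""
    return any(all(c not in suspected for suspected in ever.values())
               for c in correct)


# Hypothetical run: d crashed; b wrongly suspected c for a while.
correct, failed = {"a", "b", "c"}, {"d"}
final = {"a": {"d"}, "b": {"d"}, "c": {"d"}}
ever = {"a": {"d"}, "b": {"c", "d"}, "c": {"d"}}

results = (
    strong_completeness(failed, correct, final),
    weak_completeness(failed, correct, final),
    strong_accuracy(correct, ever),
    weak_accuracy(correct, ever),
)
```

In this run the single false suspicion of c breaks strong accuracy, but weak accuracy still holds because a and b were never suspected.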

  46. Reasoning about Detectors [Table: Accuracy; columns Eventual, Not Eventual; rows Strong, Weak]

  47. Reasoning about Detectors • Eventual strong accuracy: Eventually no correct node is ever suspected.

  48. Reasoning about Detectors • Eventual strong accuracy: Eventually no correct node is ever suspected. • Strong accuracy (not eventual): No correct node is ever suspected.

  49. Reasoning about Detectors • Eventual strong accuracy: Eventually no correct node is ever suspected. • Strong accuracy (not eventual): No correct node is ever suspected. • Eventual weak accuracy: Eventually some correct node is never suspected by any node.

  50. Reasoning about Detectors • Eventual strong accuracy: Eventually no correct node is ever suspected. • Strong accuracy (not eventual): No correct node is ever suspected. • Eventual weak accuracy: Eventually some correct node is never suspected by any node. • Weak accuracy (not eventual): Some correct node is never suspected by any node.
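One common way to aim for eventual accuracy under partial synchrony is to let the detector learn from its mistakes: whenever a suspected peer turns out to be alive, its timeout is increased. The sketch below assumes that design (it is not taken from the slides); the class name, the doubling rule, and the timing values are hypothetical. A slow but correct peer `p0` is suspected a couple of times and then never again, while a crashed peer `p1` stays suspected.

```python
from typing import Dict, List, Set


class AdaptiveDetector:
    """Timeout detector that doubles a peer's timeout after a false suspicion."""

    def __init__(self, peers: Set[str], timeout: float, now: float = 0.0) -> None:
        self.timeout: Dict[str, float] = {p: timeout for p in peers}
        self.last_seen: Dict[str, float] = {p: now for p in peers}
        self.suspected: Set[str] = set()

    def heartbeat(self, peer: str, now: float) -> None:
        if peer in self.suspected:
            # The peer was alive after all: withdraw and be more patient.
            self.suspected.discard(peer)
            self.timeout[peer] *= 2
        self.last_seen[peer] = now

    def suspects(self, now: float) -> Set[str]:
        for p, t in self.last_seen.items():
            if now - t > self.timeout[p]:
                self.suspected.add(p)
        return set(self.suspected)


fd = AdaptiveDetector({"p0", "p1"}, timeout=1.0)
history: List[Set[str]] = []

history.append(fd.suspects(2.0))   # both silent for longer than 1.0
fd.heartbeat("p0", 3.0)            # p0 is slow, not dead: timeout becomes 2.0
history.append(fd.suspects(5.5))   # 2.5 > 2.0, p0 suspected again
fd.heartbeat("p0", 6.0)            # timeout becomes 4.0
history.append(fd.suspects(8.5))   # 2.5 <= 4.0, p0 no longer suspected
fd.heartbeat("p0", 9.0)
history.append(fd.suspects(11.5))  # still only the crashed peer
```

The early mistakes about `p0` are allowed by the eventual properties on slide 50; what matters is that they stop once the timeout exceeds the peer's actual heartbeat gap.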

  51. Types of Detectors • Strong completeness, strong accuracy: Perfect detector (P)

  52. Types of Detectors • Strong completeness, strong accuracy: Perfect detector (P) • Strong completeness, weak accuracy: Strong detector (S)

  53. Types of Detectors • Strong completeness, strong accuracy: Perfect detector (P) • Strong completeness, weak accuracy: Strong detector (S) • Strong completeness, eventual strong accuracy: ♢ P

  54. Types of Detectors • Strong completeness, strong accuracy: Perfect detector (P) • Strong completeness, weak accuracy: Strong detector (S) • Strong completeness, eventual strong accuracy: ♢ P • Strong completeness, eventual weak accuracy: ♢ S or Ω

  55. Types of Detectors

  56. Types of Detectors • Weak completeness, strong accuracy: Q

  57. Types of Detectors • Weak completeness, strong accuracy: Q • Weak completeness, weak accuracy: Weak Detector (W)

  58. Types of Detectors • Weak completeness, strong accuracy: Q • Weak completeness, weak accuracy: Weak Detector (W) • Weak completeness, eventual strong accuracy: ♢ Q

  59. Types of Detectors • Weak completeness, strong accuracy: Q • Weak completeness, weak accuracy: Weak Detector (W) • Weak completeness, eventual strong accuracy: ♢ Q • Weak completeness, eventual weak accuracy: ♢ W
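The eight classes on slides 51 to 59 form a two-by-four grid, which can be captured as a lookup table. The enum and function names are hypothetical; the class symbols are copied from the slides.

```python
from enum import Enum


class Completeness(Enum):
    STRONG = "strong"
    WEAK = "weak"


class Accuracy(Enum):
    STRONG = "strong"
    WEAK = "weak"
    EVENTUAL_STRONG = "eventual strong"
    EVENTUAL_WEAK = "eventual weak"


# (completeness, accuracy) -> detector class, as listed on the slides.
DETECTOR_CLASS = {
    (Completeness.STRONG, Accuracy.STRONG): "P",
    (Completeness.STRONG, Accuracy.WEAK): "S",
    (Completeness.STRONG, Accuracy.EVENTUAL_STRONG): "♢P",
    (Completeness.STRONG, Accuracy.EVENTUAL_WEAK): "♢S",
    (Completeness.WEAK, Accuracy.STRONG): "Q",
    (Completeness.WEAK, Accuracy.WEAK): "W",
    (Completeness.WEAK, Accuracy.EVENTUAL_STRONG): "♢Q",
    (Completeness.WEAK, Accuracy.EVENTUAL_WEAK): "♢W",
}


def classify(completeness: Completeness, accuracy: Accuracy) -> str:
    """Look up the class name for a pair of properties."""
    return DETECTOR_CLASS[(completeness, accuracy)]


perfect = classify(Completeness.STRONG, Accuracy.STRONG)
weakest_listed = classify(Completeness.WEAK, Accuracy.EVENTUAL_WEAK)
```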
