
Failure Detectors: Concurrency Trilogy Part IV



  1. Failure Detectors: Concurrency Trilogy Part IV

  2. Announcements
     • Project proposals are due tonight, unless you got an extension.
     • Only a few hours left to submit something or seek an extension.
     • No quiz next week.
     • You should have gotten results for last week's quiz.

  3. RSMs All Over Again

  4. Revisiting RSMs [Diagram: four columns, each showing an Application, an Ordering layer, and a Client.]

  5. Revisiting RSMs [Diagram: the same four columns, with KVStore as the application and Raft as the ordering layer.]

  6. Revisiting RSMs [Diagram: the four Raft peers and their Clients.]

  7. Revisiting RSMs [Diagram: the four Raft peers and their Clients.]

  8. Revisiting RSMs
     • Executing a command at the application is destructive: a command cannot be undone.
     [Diagram: four Application replicas over Raft, each with a Client.]

  9. Revisiting RSMs
     • Requirement: all application replicas end up in the same state.
     [Diagram: four Application replicas over Raft, each with a Client.]
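
The requirement on slide 9 holds when every replica applies the same commands in the same order to a deterministic application. A minimal sketch follows; the `KVStore` class, the tuple encoding of commands, and the `cas(key, expected, new)` argument order are assumptions made for illustration, not the course's code.

```python
from dataclasses import dataclass, field


@dataclass
class KVStore:
    """Toy deterministic key-value state machine (hypothetical)."""
    data: dict = field(default_factory=dict)

    def apply(self, command):
        """Apply one command tuple and return the resulting value."""
        op, *args = command
        if op == "set":
            key, value = args
            self.data[key] = value
            return value
        if op == "get":
            (key,) = args
            return self.data.get(key)
        if op == "cas":
            # Assumed semantics: write `new` only if the key holds `expected`.
            key, expected, new = args
            if self.data.get(key) == expected:
                self.data[key] = new
            return self.data.get(key)
        raise ValueError(f"unknown op: {op}")


# Every replica sees the same ordered log, as the ordering layer guarantees.
log = [("set", "x", 5), ("get", "x"), ("cas", "x", 5, 4)]
replicas = [KVStore() for _ in range(4)]
for replica in replicas:
    for command in log:
        replica.apply(command)

states = [replica.data for replica in replicas]
```

Because `apply` mutates state in place and keeps no undo information, the sketch also reflects slide 8: once a command has run, it cannot be taken back.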

  10. Revisiting RSMs [Diagram: replicas M0 to M3, each a KVStore over Raft with a Client. Step 1: a client issues set(x, 5).]

  11. Revisiting RSMs [Diagram: step 2: AppendEntries is sent to the other three Raft peers.]

  12. Revisiting RSMs [Diagram: steps 3 to 5 follow, including set(x, 5) delivered to a KVStore and a success reply.]

  13. Revisiting RSMs [Diagram: as on slide 12, now showing each replica's log. M0 to M3 all hold the entry set(x, 5), 0, 1 (command, index, term).]

  14. Revisiting RSMs
      • For which replicas is x=5?
      [Diagram: M0 to M3, each with the log entry set(x, 5), 0, 1.]

  15. Revisiting RSMs
      • When?
      [Diagram: M0 to M3, each with the log entry set(x, 5), 0, 1.]

  16. Revisiting RSMs
      • Is this safe?
      [Diagram: Term = 2. Step 6: set(x, 5) is delivered to a KVStore. All four logs hold set(x, 5), 0, 1.]

  17. Revisiting RSMs [Diagram: Term = 2, leaderCommit = -1. Step a: a client issues get(x). All four logs hold set(x, 5), 0, 1.]

  18. Revisiting RSMs [Diagram: step b: AppendEntries. One log now also holds get(x), 1, 2.]

  19. Revisiting RSMs [Diagram: all four logs hold set(x, 5), 0, 1 and get(x), 1, 2. leaderCommit = -1.]

  20. Revisiting RSMs
      • Is this correct?
      [Diagram: step c: get(x) is delivered to a KVStore while leaderCommit = -1.]

  21. Revisiting RSMs [Diagram: leaderCommit = 0. Step c is now set(x, 5) delivered to a KVStore.]

  22. Revisiting RSMs [Diagram: leaderCommit = 1. Step c: set(x, 5); step d: get(x); step e: KeyValue(x, 5) is returned; step f: KeyValue(x, 5) goes to the client.]

  23. Revisiting RSMs
      • Is this correct?
      [Diagram: leaderCommit = 2. Step g: a client issues cas(x, 5, 4); step h: AppendEntries; step i: cas(x, 5, 4) is delivered to a KVStore. All four logs hold set(x, 5), 0, 1, get(x), 1, 2, and cas(x,5,4), 2, 2.]

  24. Revisiting RSMs [Diagram: as on slide 23. Step j: KeyValue(x, 4) is returned; step k: the reply goes to the client.]
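
The walk-through on slides 17 to 24 suggests one rule: a replica hands an entry to the application only once the commit index covers it, and always in log order. The sketch below is one way to express that rule. The `Replica` class, its method names, and the `("KeyValue", key, value)` reply shape are hypothetical; only the `(command, index, term)` entries and the `leaderCommit` value come from the slides.

```python
class Replica:
    """Toy replica that applies log entries only after they are committed."""

    def __init__(self):
        self.log = []            # entries are (command, index, term)
        self.commit_index = -1   # -1: nothing committed yet, as on slide 17
        self.last_applied = -1
        self.store = {}

    def append(self, command, term):
        self.log.append((command, len(self.log), term))

    def advance_commit(self, leader_commit):
        """Apply, in log order, every entry up to leader_commit; return replies."""
        newest = len(self.log) - 1
        self.commit_index = max(self.commit_index, min(leader_commit, newest))
        replies = []
        while self.last_applied < self.commit_index:
            self.last_applied += 1
            command, _index, _term = self.log[self.last_applied]
            replies.append(self._execute(command))
        return replies

    def _execute(self, command):
        op, *args = command
        if op == "set":
            key, value = args
            self.store[key] = value
        elif op == "cas":
            # Assumed semantics: cas(key, expected, new).
            key, expected, new = args
            if self.store.get(key) == expected:
                self.store[key] = new
        elif op == "get":
            (key,) = args
        else:
            raise ValueError(f"unknown op: {op}")
        return ("KeyValue", key, self.store.get(key))


replica = Replica()
replica.append(("set", "x", 5), term=1)
replica.append(("get", "x"), term=2)
uncommitted = replica.advance_commit(-1)   # nothing may run yet
committed = replica.advance_commit(1)      # set, then get, in log order
replica.append(("cas", "x", 5, 4), term=2)
after_cas = replica.advance_commit(2)
```

Under these assumptions, running `get(x)` while `leaderCommit = -1` (slide 20) is exactly what `advance_commit` refuses to do: the earlier `set(x, 5)` has to be applied first, and only after it is committed.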

  25. Configuration Change

  26. Why?
      • Want to be able to change the set of servers.
      • Take down servers for maintenance.
      • Add new servers to replace failed ones.
      • Other reasons.

  27. How?
      • Use a special log message which contains the set of servers.
      • Use Raft to replicate this to everyone.
      [Diagram: a log entry with fields Index, Term, Config.]

  28. How Special?
      • All peers use the configuration as soon as it is logged.
      • Why safe?
      • We know how to revert this change.
      [Diagram: a log entry with fields Index, Term, Config.]
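
One way to read slides 27 and 28: a peer's configuration is whatever the latest configuration entry in its log says, whether or not that entry is committed, so removing the entry from the log reverts the change. The sketch below assumes that reading. `current_config`, the `("config", servers)` command encoding, and the server names M3 and M4 are hypothetical.

```python
def current_config(log, initial_config):
    """Return the server set named by the latest config entry in the log."""
    for command, _index, _term in reversed(log):
        if command[0] == "config":
            return command[1]
    # No config entry logged: fall back to the configuration we started with.
    return initial_config


initial = frozenset({"M0", "M1", "M2"})
log = [(("set", "x", 5), 0, 0), (("set", "x", 6), 1, 0)]
before = current_config(log, initial)

# The config entry takes effect as soon as it is appended, before commit.
bigger = frozenset({"M0", "M1", "M2", "M3", "M4"})
log.append((("config", bigger), 2, 0))
during = current_config(log, initial)

# If the entry is later truncated from the log, the old configuration returns.
del log[2:]
after_truncate = current_config(log, initial)
```

Deriving the configuration from the log, rather than storing it separately, is what makes the revert on slide 28 cheap in this sketch: no extra bookkeeping is needed when an uncommitted entry is discarded.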

  29. Protocol [Diagram: three logs, all entries in term 0. Each starts with set(x, 5), set(x, 6); one of them runs to index 5, where it holds a configuration entry C.]

  30. Protocol [Diagram: five logs, each running to index 5 in term 0 and holding set(x, 5), set(x, 6), and C.]

  31. Protocol
      • What happens now?
      [Diagram: logs that contain a term-1 entry at index 3, next to a log that holds C at index 5 in term 0.]

  32. Protocol [Diagram: the three logs of slide 29, with the configuration entry now labelled C-all.]

  33. Protocol [Diagram: five logs. Four hold C-all at index 5; one stops at index 4.]

  34. Protocol [Diagram: as on slide 33, with one log also holding C-new after C-all.]

  35. Protocol [Diagram: four logs, each holding C-all followed by C-new.]

  36. Failure Detectors

  37. What Problem?
      • We have been depending on random timeouts, etc. to build consensus.
      • Based on partial synchrony: the network is not always behaving at its worst.
      • Tedious to model (for proofs) and tune (for deployment).
      • Abstract them away with failure detectors.

  38. Failure Detector [Diagram: the Failure Detector module gives the Application a changing stream of suspicions: "suspect p0 is failed", "suspect p0, p1 are failed", "suspect p1, p2 are failed", "suspect p1 is failed".]
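
To make the abstraction concrete, here is a minimal sketch of a detector the application can query for its current suspect set. It assumes a heartbeat-and-timeout implementation, which is only one possible way to build a detector. The class name, its methods, and the explicit `now` argument are hypothetical.

```python
class TimeoutFailureDetector:
    """Suspects any peer whose last heartbeat is older than `timeout`."""

    def __init__(self, peers, timeout):
        self.timeout = timeout
        self.last_heard = {peer: 0.0 for peer in peers}

    def heartbeat(self, peer, now):
        """Record that `peer` was heard from at time `now`."""
        self.last_heard[peer] = now

    def suspects(self, now):
        """Return the set of peers currently suspected to have failed."""
        return {
            peer
            for peer, heard in self.last_heard.items()
            if now - heard > self.timeout
        }


fd = TimeoutFailureDetector(["p0", "p1", "p2"], timeout=3.0)
fd.heartbeat("p0", now=1.0)
fd.heartbeat("p1", now=4.0)
fd.heartbeat("p2", now=4.0)
first = fd.suspects(now=5.0)    # p0 has been silent for 4.0 > 3.0

fd.heartbeat("p0", now=6.0)     # p0 was only slow, not crashed
second = fd.suspects(now=6.5)
```

The suspicion of p0 is withdrawn once its heartbeat arrives, matching the diagram, where the suspect set grows and shrinks over time. The application sees only the suspect set, not the timeouts behind it.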

  39. Reasoning about Detectors
      • Completeness (failed nodes): When are they detected? Who detects them?
      • Accuracy (live nodes): When can they be suspected?

  40. Reasoning about Detectors
      • Strong completeness: Every failed node is eventually detected by all correct nodes.
      • Weak completeness: Every failed node is eventually detected by some correct node.
      • Strong accuracy: No correct node is ever suspected.
      • Weak accuracy: Some correct node is never suspected by any node.

  41. Reasoning about Detectors
      • Strong accuracy: No correct node is ever suspected.
      • Eventual strong accuracy: Eventually, no correct node is ever suspected.
      • Weak accuracy: Some correct node is never suspected by any node.
      • Eventual weak accuracy: Eventually, some correct node is never suspected by any node.
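
The definitions above can be checked against a recorded run. The sketch below makes several simplifying assumptions: a trace is a finite list of steps, each step maps a correct observer to its suspect set, "eventually" means "from some step until the end of the trace", and only correct observers are recorded. The real definitions quantify over infinite runs, so this is a rough illustration rather than a faithful model. All function names are hypothetical.

```python
def strong_completeness(trace, correct, failed):
    """By the end, every correct observer suspects every failed node."""
    final = trace[-1]
    return all(failed <= final[obs] for obs in correct)


def weak_completeness(trace, correct, failed):
    """By the end, every failed node is suspected by some correct observer."""
    final = trace[-1]
    return all(any(f in final[obs] for obs in correct) for f in failed)


def strong_accuracy(trace, correct):
    """No correct node is suspected at any step."""
    return all(not (step[obs] & correct) for step in trace for obs in correct)


def weak_accuracy(trace, correct):
    """Some correct node is never suspected by any observer."""
    return any(
        all(node not in step[obs] for step in trace for obs in correct)
        for node in correct
    )


def eventually(accuracy, trace, correct):
    """The accuracy property holds on some suffix of the trace."""
    return any(accuracy(trace[start:], correct) for start in range(len(trace)))


correct = {"p0", "p1"}
failed = {"p2"}
trace = [
    {"p0": {"p1"}, "p1": set()},     # p0 wrongly suspects p1
    {"p0": {"p2"}, "p1": {"p2"}},    # mistake corrected, crash detected
    {"p0": {"p2"}, "p1": {"p2"}},
]
```

On this trace, strong completeness holds and strong accuracy fails because of the early false suspicion of p1. Weak accuracy holds because p0 is never suspected, and eventual strong accuracy holds from the second step on.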

  42. Types of Detectors
      • Strong completeness, strong accuracy: Perfect detector (P)
      • Strong completeness, weak accuracy: Strong detector (S)
      • Strong completeness, eventual strong accuracy: ◇P
      • Strong completeness, eventual weak accuracy: ◇S or Ω
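
The four classes on this slide amount to a lookup from a (completeness, accuracy) pair to a name. A small sketch of that table follows; the dictionary, the `classify` helper, and the string labels are hypothetical.

```python
# Keys are (completeness, accuracy); only the four pairs from the slide appear.
DETECTOR_CLASSES = {
    ("strong", "strong"): "P",
    ("strong", "weak"): "S",
    ("strong", "eventual strong"): "◇P",
    ("strong", "eventual weak"): "◇S or Ω",
}


def classify(completeness, accuracy):
    """Name the detector class, or None if the slide does not list the pair."""
    return DETECTOR_CLASSES.get((completeness, accuracy))


perfect = classify("strong", "strong")
eventually_strong = classify("strong", "eventual weak")
unlisted = classify("weak", "strong")
```

Pairs with weak completeness return `None` here only because the slide does not name them.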
