  1. Agreement in Distributed Systems CS 188 Distributed Systems February 19, 2015 Lecture 12 Page 1 CS 188, Winter 2015

  2. Introduction • We frequently want to get a set of nodes in a distributed system to agree • Commitment protocols and mutual exclusion are particular cases • The approaches we discussed for those work in limited situations • In general, when can we reach agreement in a distributed system?

  3. Basics of Agreement Protocols • What is agreement? • What are the necessary conditions for agreement?

  4. What Do We Mean By Agreement? • In the simplest case, can n processors agree that a variable takes on value 0 or 1? – Only non-faulty processors need agree • More complex agreements can be built from this simple agreement

  5. Conditions for Agreement Protocols • Consistency – All participants agree on the same value and decisions are final • Validity – Participants agree on a value at least one of them wanted • Termination/Progress – All participants choose a value in a finite number of steps

  6. Challenges to Agreement • Delays – In message delivery – In nodes responding to messages • Failures – And recovery from failures • Lies by participants – Or innocent errors that have similar effects

  7. Failures and Agreement • Failures make agreement difficult – Failed nodes don’t participate – Failed nodes sometimes recover at inconvenient times – At worst, failed nodes participate in harmful ways • Real failures are worse than fail-stop

  8. Types of Failures • Fail-stop – A nice, clean failure – The processor stops executing anything • Realistic failures – Partitions – Arbitrary delays • Adversarial failures – Arbitrary bad things happen

  9. Election Algorithms • If you get everyone to agree that a particular node is in charge, future consensus is easy, since that node makes the decisions • How do you determine who’s in charge? – Statically – Dynamically

  10. Static Leader Selection Methods • Predefine one process/node as the leader • Simple – Everyone always knows who’s the leader • Not very resilient – If the leader fails, then what?

  11. Dynamic Leader Selection Methods • Choose a new leader dynamically whenever necessary • More complicated • But failure of a leader is easy to handle – Just elect a new one • Election doesn’t imply voting – Not necessarily majority-based

  12. Election Algorithms vs. Mutual Exclusion Algorithms • Most mutual exclusion algorithms don’t care much about failures • Election algorithms are designed to handle failures • Also, mutual exclusion algorithms only need a winner • Election algorithms need everyone to know who won

  13. A Typical Use of Election Algorithms • A group of processes wants to periodically take a distributed snapshot • They don’t want multiple simultaneous snapshots • So they want one leader to order them to take the snapshot

  14. Problems in Election Algorithms • Some of the nodes may have failed before the algorithm starts • Some of the nodes may fail during the algorithm • Some nodes may recover from failure – Possibly at inconvenient times • What about partitions?

  15. Election Algorithms and the Real Work • The election algorithm is usually overhead • There’s a real computation you want to perform • The election algorithm chooses someone to lead it • Having two leaders while the real computation is going on is bad

  16. The Bully Algorithm • The biggest kid on the block gets to be the leader • But what if the biggest kid on the block is taking his piano lesson? • The next biggest kid gets to be leader – Until the piano lesson is over . . .

  17. Electing a Bully • (Cartoon illustration: the kids come out to play, but Spike’s piano lesson hasn’t let him out yet, so the next biggest kid, Butch, declares himself the leader: “I’m the leader, and we’re playing baseball!”; when Mom ends the lesson, Spike comes out and takes over: “Hey, I’m here, I’m the leader, let’s play tag!”)

  18. Assumptions of the Bully Algorithm • A static set of possible participants – With an agreed-upon order • All messages are delivered within Tm seconds • All responses are sent within Tp seconds of delivery • These last two imply synchronous behavior
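Under these assumptions, a waiting node can bound how long any reply can take: at most Tm for the request to arrive, Tp for the peer to respond, and Tm for the answer to come back. A minimal sketch (the numeric values of Tm and Tp below are illustrative, not from the lecture):

```python
# Worst-case round-trip under the bully algorithm's synchrony assumptions:
# request delivery (T_M) + processing at the peer (T_P) + reply delivery (T_M).
T_M = 0.5  # max message delivery time, seconds (illustrative value)
T_P = 0.2  # max processing/response time, seconds (illustrative value)

def reply_timeout(t_m: float, t_p: float) -> float:
    """A node that hears nothing after this long may treat the peer as failed."""
    return 2 * t_m + t_p

print(reply_timeout(T_M, T_P))  # 1.2
```

This bound is what makes the timeouts on the following slides meaningful: silence past `reply_timeout` can only mean failure, never slowness.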

  19. The Basic Idea Behind the Bully Algorithm • Possible leaders try to take over • If they detect a better leader, they agree to its leadership • Keep track of state information about whether you are electing a leader • Only do real work when you agree on a leader

  20. The Bully Algorithm and Timeouts • Call out the biggest kid’s name – If he doesn’t answer soon enough, call out the next biggest kid’s name – Until you hear an answer – Or the caller is the biggest kid – Then take over, by telling everyone else you’re the leader
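This name-calling procedure can be sketched in a single process (an illustration under simplified assumptions, not the lecture’s exact protocol: the `alive` set stands in for which nodes would answer before their timeout):

```python
# Toy sketch of the bully election. Real implementations exchange messages
# and use the timeout bound from the assumptions slide; here, membership in
# `alive` models "answered before the timeout".

def bully_election(starter: int, nodes: list[int], alive: set[int]) -> int:
    """Return the node every live participant ends up accepting as leader."""
    # Call out every bigger node's name.
    bigger_and_alive = [n for n in nodes if n > starter and n in alive]
    if not bigger_and_alive:
        # Nobody bigger answered: the starter takes over and announces itself.
        return starter
    # Some bigger node answered; it runs the same procedure itself.
    return bully_election(min(bigger_and_alive), nodes, alive)

nodes = [1, 2, 3, 4, 5]
print(bully_election(1, nodes, alive={1, 2, 3, 5}))  # 5: biggest live node wins
print(bully_election(1, nodes, alive={1, 2, 3}))     # 3: 4 and 5 are at piano lessons
```

Whoever starts the election, the outcome is the same: the largest node that answers in time becomes leader.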

  21. The Bully Algorithm At Work • One node is currently the coordinator • It expects a certain set of nodes to be up and participating • The coordinator periodically asks all other nodes whether they’re up • If an expected node doesn’t answer, start an election – Also if it answers in the negative • If an unexpected node answers, start an election
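The triggering rules above can be written as a pure check (a sketch with hypothetical names; the `answers` dict stands in for whatever replies arrived before the timeout):

```python
# Sketch of the coordinator's decision after one polling round.
# answers maps node id -> its reply (True = "I'm up", False = negative answer);
# a node absent from `answers` did not reply before the timeout.

def needs_election(expected: set[int], answers: dict[int, bool]) -> bool:
    """True if the coordinator should start an election."""
    for node in expected:
        if answers.get(node) is not True:  # silent, or answered in the negative
            return True
    # A reply from a node we did NOT expect also triggers an election.
    return any(node not in expected for node in answers)

expected = {2, 3, 4}
print(needs_election(expected, {2: True, 3: True, 4: True}))           # False
print(needs_election(expected, {2: True, 3: True}))                    # True: 4 is silent
print(needs_election(expected, {2: True, 3: True, 4: True, 5: True}))  # True: 5 is unexpected
```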

  22. The Practicality of the Bully Algorithm • The bully algorithm works reasonably well if the timeouts are effective – A timeout occurring really means the site in question is down • And there are no partitions at all – If there are, what happens?

  23. The Invitation Algorithm • More practical than the bully algorithm – Doesn’t depend on timeouts • But its results are not as definitive • An asynchronous algorithm

  24. The Basic Idea Behind the Invitation Algorithm • A current coordinator tries to get all other nodes to agree to his leadership • If more than one coordinator is around, get together and merge groups • Use timeouts only to allow progress, not to make definitive decisions • No set priorities for who will be coordinator

  25. The Invitation Algorithm and Group Numbers • The invitation algorithm recruits a group of nodes to work together – More than one group can exist simultaneously • Group numbers identify the group • Why not identify with coordinator ID? – Because one node can serially coordinate many groups
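One way to see why a bare coordinator ID isn’t enough: the same node may lead several groups over its lifetime, and each needs a distinct name. A sketch (the `(node_id, counter)` encoding is an illustrative choice, not necessarily the algorithm’s exact group-number format):

```python
import itertools

# Sketch: a node that can coordinate many groups in sequence. Pairing its ID
# with a local counter gives each group a unique number even though the
# coordinator is the same.
class Node:
    def __init__(self, node_id: int):
        self.node_id = node_id
        self._seq = itertools.count()  # 0, 1, 2, ... for successive groups

    def new_group_number(self) -> tuple[int, int]:
        """Each reorganization this node leads gets a fresh group number."""
        return (self.node_id, next(self._seq))

n = Node(7)
g1 = n.new_group_number()
g2 = n.new_group_number()
print(g1, g2)    # (7, 0) (7, 1)
print(g1 == g2)  # False: same coordinator, different groups
```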

  26. The Basic Operation of the Invitation Algorithm • Coordinators in a normal state periodically check all other nodes • If any other node is a coordinator, try to merge the groups • If timeouts occur, don’t worry about it – Also don’t worry whether a response to a check came from this request or an earlier one

  27. Merging in the Invitation Algorithm • Merging always requires forming a new group – It may have the same coordinator, but a different group number • The coordinator who initiates the merge asks all other known coordinators to merge – They ask their group members – The original group members are also asked
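A toy, message-free sketch of this merge step (the names and data structures are hypothetical; a real implementation sends Invite messages and tolerates non-responders):

```python
# Sketch: the initiating coordinator folds the other coordinators, their
# members, and its own original members into one brand-new group. The key
# point from the slide: the merged group ALWAYS gets a new group number,
# even if the coordinator stays the same.

def merge_groups(initiator: str, groups: dict[str, set[str]],
                 new_group_number: tuple[str, int]) -> dict:
    """Form one new group from every known group's coordinator and members."""
    members: set[str] = set()
    for coordinator, group_members in groups.items():
        members |= {coordinator} | group_members  # each coordinator brings its members
    return {"coordinator": initiator,
            "group_number": new_group_number,     # always a fresh number
            "members": members}

groups = {"A": {"x", "y"}, "B": {"z"}}  # two groups, coordinated by A and B
merged = merge_groups("A", groups, ("A", 1))
print(sorted(merged["members"]))  # ['A', 'B', 'x', 'y', 'z']
```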

  28. A Simplified Example • UP = {1, 2, 3, 4} • (Diagram: node 1 sends AreYouCoordinator? to the other nodes; node 3 answers Yes, node 4 answers No on behalf of its coordinator) • Node 1 checks for another coordinator, and finds one • Node 1 asks the other coordinator and its old group members to join its group (Invite), and they Accept • Node 1 forms a new group (Ready) • If all members of UP respond, we’re fine

  29. The Reorganization State • Nodes enter the reorganization state after getting their answer • What’s the point of this state? – Why not just start up the group? – After all, we all know who’s going to be a member • Or do we?

  30. Why We Need Another Round of Messages • (Diagram: node 1 sends Invitations to nodes 2, 3, and 4) • Who does 1 think will join the group at this point? 2 and 3 • Assuming no timeouts, 4 will also join • And what if someone crashes, presumably not accepting the invitation? • 2 needs to know that

  31. Timeouts in the Merge • Don’t worry too much about them • Some nodes respond before the timeout – Some don’t • If you don’t catch them this time, you might the next

  32. Straggler Messages • This algorithm is asynchronous – So messages may come in late • What do we do when messages arrive late? • Mostly, reject them • How do we tell? – Messages contain the group number
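Rejecting stragglers by group number can be sketched in a few lines (the message format here is hypothetical):

```python
# Sketch: every message is stamped with the group number it was sent under,
# and a node simply drops anything stamped with a group it no longer belongs to.

def handle(message: dict, current_group: tuple[int, int]) -> bool:
    """Return True if the message is accepted, False if it is a straggler."""
    return message["group"] == current_group

current = (7, 3)  # this node's current group number: (coordinator id, sequence)
print(handle({"group": (7, 3), "body": "ready"}, current))   # True: current group
print(handle({"group": (7, 2), "body": "invite"}, current))  # False: an old group
```

Because group numbers are never reused, a late message from an earlier reorganization can never be mistaken for one from the current group.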
