In Search of an Understandable Consensus Algorithm
Diego Ongaro and John Ousterhout
Stanford University

Overview
• Problem: Consensus; every state machine should be in the same state
• Why Raft when (Multi-)Paxos already exists?
  - Easier to understand
  - Easier to implement and modify

Power of Simplicity
• Decompose logic: Clean separation between logical steps such as leader election, normal operation, and configuration changes.
• Reduce state space: Reduce the number of states in the state machine. Also, reduce non-determinism.
• Delineation of roles: Raft has a single proposer (called the leader); the rest of the servers are passive acceptors.

How it works
• Replicating state machines is equivalent to replicating a log of commands and then applying those commands to the state machine in order.
• So, the problem at hand is: How do we make consistent copies of logs across servers? Also, how do we know when it is safe to execute a command in the state machine?

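The equivalence between replicating a log and replicating a state machine can be illustrated with a small sketch. The `KVStateMachine` class and `replay` helper below are hypothetical names chosen for illustration, and the key-value store is only an assumed example of a deterministic state machine; the slides do not prescribe one.

```python
from dataclasses import dataclass, field


@dataclass
class KVStateMachine:
    """Toy deterministic state machine (assumed example): a key-value store."""
    data: dict = field(default_factory=dict)

    def apply(self, command):
        # A command is an (operation, key, value) tuple.
        op, key, value = command
        if op == "set":
            self.data[key] = value
        elif op == "del":
            self.data.pop(key, None)
        return self.data.get(key)


def replay(log):
    """Apply every command in the log, in order, to a fresh state machine."""
    sm = KVStateMachine()
    for command in log:
        sm.apply(command)
    return sm.data


log = [("set", "x", 1), ("set", "y", 2), ("del", "x", None), ("set", "y", 3)]

# Two servers holding identical logs end up in identical states.
replica_a = replay(log)
replica_b = replay(list(log))
```

Because `apply` is deterministic, agreeing on the log contents and order is enough to make every replica agree on the resulting state.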
Raft in a nutshell
• Only a server whose log is at least as up-to-date as the logs of a majority of servers can win the election
• Client sends command to leader
• Leader appends command to its log
• Leader asks the followers to append the command to their logs
• Once the new entry is committed (stored in the logs of a majority of servers, counting the leader):
  - Leader passes command to its state machine, returns result to client
  - Leader notifies followers of committed entries in subsequent requests to append
• Followers pass committed commands to their state machines
• Crashed/slow followers?
  - Leader retries RPCs until they succeed
• Performance is optimal in the common case:
  - One successful RPC to any majority of servers

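The commit step above can be sketched as a small calculation on the leader. This is a simplified illustration, not the full rule: the function and parameter names are hypothetical, log indexes are assumed to start at 1 with 0 meaning "nothing replicated yet", and the sketch leaves out the additional restriction in the Raft paper that a leader only counts replicas for entries from its own term.

```python
def majority(cluster_size):
    """Smallest number of servers that forms a majority."""
    return cluster_size // 2 + 1


def commit_index(leader_last_index, follower_match_indexes):
    """Highest log index stored on a majority of servers, leader included.

    follower_match_indexes holds, for each follower, the highest log index
    the leader knows that follower has stored.
    """
    indexes = sorted([leader_last_index, *follower_match_indexes], reverse=True)
    # The value at position (majority - 1) is held by at least a majority.
    return indexes[majority(len(indexes)) - 1]


# Five-server cluster: the leader has 5 entries, followers lag by varying amounts.
committed = commit_index(5, [5, 3, 2, 0])
```

In this example entries up to index 3 are stored on three of the five servers, so the leader may apply them; the slow followers catch up later through retried RPCs without delaying the commit.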
Similarities and Differences (With Paxos)
• Similarities
  - Both solve consensus by replicating logs across servers
  - Both use two phases to reach consensus
  - Both provide safety (same log everywhere) in an asynchronous setting and liveness (progress) in a synchronous setting
• Differences
  - Raft requires a leader; Paxos uses a leader to avoid livelocks (which can be solved using other techniques) and to improve performance (avoiding the propose phase)
  - Raft elects only a server whose log is sufficiently up-to-date; Paxos chooses leaders by their IDs
  - Raft treats the log as a single entity it needs to replicate. Paxos treats the multi-value problem as a composition of single-value problems (Synod)
  - Paxos starts with a specific case (single-value) and then uses it to solve the general case (multi-value). Raft directly solves the general case.
  - [I think this is why Raft is easier to understand]

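What "up-to-date" means for leader election is not spelled out on the slide. In the Raft paper, logs are compared by the term of their last entry first and by their length second. A minimal sketch of that comparison follows; the function name and the `(last_term, last_index)` tuple representation are assumptions made for illustration.

```python
def at_least_as_up_to_date(candidate, voter):
    """Return True if the candidate's log is at least as up-to-date as the voter's.

    Each argument is a (last_term, last_index) tuple describing the last
    entry in that server's log. Python compares tuples element by element,
    so the last term decides first and the last index breaks ties.
    """
    return candidate >= voter


# A later last term wins even against a longer log.
newer_term_shorter_log = at_least_as_up_to_date((3, 2), (2, 10))
# With equal last terms, the longer log is more up-to-date.
same_term_shorter_log = at_least_as_up_to_date((2, 5), (2, 6))
```

A voter would grant its vote only when this check passes, which is why a candidate missing committed entries cannot gather a majority.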
Key Findings
• Raft is simple and easy to understand
• [Not as simple as I just explained. I left out a lot of edge cases and details of other logical components like leader election and configuration changes.]
• To measure how easy it is to understand, a quiz was conducted in classes at Stanford and UC Berkeley.
• [Students first watched video lectures on Raft and Paxos and were then graded on each of them]
• Raft's performance is similar to that of other consensus algorithms such as Paxos.
  - Uses the minimum number of messages for replication; a single round-trip from the leader to half the followers
  - Easily supports batching and pipelining

Takeaways
• Raft decomposes logically separate components to make things simpler and easier to explain
• Separation of roles (heterogeneous) vs. homogeneous system (single leader vs. anyone can serve client requests)
• Solving the general case directly vs. deriving the general case from a specific case