RSM & Paxos Consensus Trilogy - Episode II
Replicated State Machine
What is the problem?
Fault Tolerance by Replication
• Replica takes over on failure.
• Or in other scenarios.
• Challenge:
  • Ensure replicas are equivalent.
  • Why?
[Figure: two clients issue set("s0", ...), set("s1", ...), get("s0") and get("s1") against replicated KV-Stores. One get("s1") hits a failed replica (☠); the other still gets an answer.]
Replication Requirements
• Replicas must have the same state/be equivalent.
• A simple way to build this out:
  • Ensure software running at each replica is deterministic.
  • Ensure commands/operations are executed in the same order.
Determinism
• Ensure that equivalent replicas executing the same operation remain equivalent.
• What does it mean to be equivalent?
  • Depends on what you are running.
• What does it mean to be deterministic?
  • Depends on what you are running.
• State machines are an abstraction over these details.
  • Think back to ADTs from linearizability.
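The determinism requirement can be sketched with a toy key-value state machine. This is a minimal illustration, not the KV-Store from the slides: the class name `KVStateMachine`, the tuple encoding of operations, and the sample log are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class KVStateMachine:
    """A deterministic key-value state machine (hypothetical, for illustration)."""
    data: dict = field(default_factory=dict)

    def apply(self, op):
        """Apply one operation; the result depends only on current state and op."""
        kind, key, *rest = op
        if kind == "set":
            self.data[key] = rest[0]
            return None
        if kind == "get":
            return self.data.get(key)
        raise ValueError(f"unknown operation: {kind}")


# An assumed log of operations, in the order every replica must apply them.
log = [("set", "s0", 1), ("set", "s1", 2), ("get", "s0"), ("set", "s1", 3)]

replica_a, replica_b = KVStateMachine(), KVStateMachine()
outputs_a = [replica_a.apply(op) for op in log]
outputs_b = [replica_b.apply(op) for op in log]

# Same start state + same operations in the same order => same state and outputs.
same_state = replica_a.data == replica_b.data
same_outputs = outputs_a == outputs_b
```

Nothing in `apply` reads a clock, a random source, or anything outside the state and the operation, which is what keeps the two replicas equivalent.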
Ordering
What is the Problem?
• In what order should these commands be run?
[Figure: three clients concurrently issue set and get commands, such as set("s0", ...), set("s1", ...), set("s2", ...), get("s0") and get("s1"), to two KV-Store replicas.]
A Possible Solution
[Figure: three clients issue set("s1", 25), set("s1", 42) and set("s1", 1729); the commands are in flight to two KV-Store replicas.]
A Possible Solution
[Figure: both KV-Store replicas end up with the commands queued in the same order: set("s1", 42), set("s1", 25), set("s1", 1729).]
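One way to read the figures above is that a single ordering oracle stamps every command with a slot number and each replica applies commands strictly in slot order. The sketch below assumes that reading; `Sequencer` and `Replica` are hypothetical names, and the oracle here is a single point of failure.

```python
import itertools


class Sequencer:
    """Hypothetical ordering oracle: stamps each command with the next slot."""

    def __init__(self):
        self._next = itertools.count()

    def order(self, command):
        return next(self._next), command


class Replica:
    """Applies commands strictly in slot order, buffering any that arrive early."""

    def __init__(self):
        self.data = {}
        self.pending = {}
        self.next_slot = 0

    def deliver(self, slot, command):
        self.pending[slot] = command
        # Drain every command whose turn has come.
        while self.next_slot in self.pending:
            key, value = self.pending.pop(self.next_slot)
            self.data[key] = value
            self.next_slot += 1


seq = Sequencer()
stamped = [seq.order(("s1", v)) for v in (42, 25, 1729)]

r1, r2 = Replica(), Replica()
for slot, cmd in stamped:            # network delivers in slot order
    r1.deliver(slot, cmd)
for slot, cmd in reversed(stamped):  # network delivers in reverse order
    r2.deliver(slot, cmd)
```

Both replicas finish with the same value for "s1" even though the messages reached them in different orders, because the slot number, not the arrival time, decides the order of execution.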
How to build fault tolerant oracles?
What Do We Need?
• Agreement on operation order.
• Validity to ensure operations executed were actually issued.
Consensus Protocols
• Termination: All correct nodes eventually decide on a value to output.
• Agreement: All decided nodes decide on the same value.
• Validity: The decision must be one of the inputs.
Consensus Protocols
• Termination: All correct nodes eventually decide on a value to output.
• Eventual Agreement: All decided nodes eventually decide on the same value.
• Validity: The decision must be one of the inputs.
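The safety properties can be written down as a small check over a snapshot of a run. This is only a sketch: `check_consensus` and the node names are hypothetical, and termination is left out because a liveness property cannot be confirmed from a finite snapshot.

```python
def check_consensus(inputs, decisions):
    """Check agreement and validity on a snapshot of a run.

    inputs:    node -> value that node proposed
    decisions: node -> value that node decided (only nodes that have decided)
    """
    decided = set(decisions.values())
    agreement = len(decided) <= 1              # all decided nodes picked one value
    validity = decided <= set(inputs.values())  # every decision was some input
    return agreement, validity


inputs = {"n1": "cake", "n2": "ice cream", "n3": "cake"}

ok = check_consensus(inputs, {"n1": "cake", "n3": "cake"})
split = check_consensus(inputs, {"n1": "cake", "n2": "ice cream"})
invented = check_consensus(inputs, {"n1": "cannoli"})
```

Here `ok` satisfies both properties, `split` violates agreement, and `invented` violates validity.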
Welcome to Paxos
Outline
• Going to go over single-decree Paxos.
  • Lamport's paper. Idea is to understand when and why it works.
• Then look at how to apply this idea to build out an RSM.
Single Decree Paxos
Three Types of Participants
• Proposers
• Acceptors
• Learners
Three Types of Participants
• Proposers: Propose values that should be selected from.
• Acceptors: Decide what value is ultimately accepted.
• Learners: Are told what decision was made and can then act on the decision.
Paxos: Requirements
• Validity: Acceptors should only choose values that are proposed.
• Agreement: Only one value should be chosen.
Achieving Agreement
• Relies on both proposers and acceptors.
• Acceptors make sure that a chosen value cannot be forgotten.
  • How?
• Proposers make sure that they don't try to override a chosen value.
  • How?
Paxos Invariants
[Figure: a sequence of proposals written as (id, value): (1, a), (2, b), (3, a), (4, a), (5, a), with (3, a) marked Chosen.]
• Each proposal has a unique ID. [For example, use machine ID to ensure this.]
• Need to make sure proposals are totally ordered.
• If some proposal with ID i and value v is chosen, then
  • all proposals with ID > i must also have value v.
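A sketch of the two ideas on this slide, under assumptions: proposal IDs are modeled as `(number, machine_id)` tuples, which Python compares lexicographically, so they are unique per machine and totally ordered. The helper names `next_proposal_id` and `invariant_holds` are hypothetical.

```python
def next_proposal_id(highest_seen, machine_id):
    """Return an ID strictly greater than `highest_seen`, unique to this machine."""
    number, _ = highest_seen
    return (number + 1, machine_id)


def invariant_holds(proposals, chosen_id):
    """proposals: id -> value. True if every proposal with a higher ID than the
    chosen one carries the chosen value."""
    chosen_value = proposals[chosen_id]
    return all(v == chosen_value for i, v in proposals.items() if i > chosen_id)


# Tuples give a total order: compare the number first, then the machine ID.
ordered = (0, "z") < (1, "a") < (1, "b") < (2, "a")

# The proposals from the figure, keyed by ID, with proposal 3 chosen.
figure = {1: "a", 2: "b", 3: "a", 4: "a", 5: "a"}
figure_ok = invariant_holds(figure, chosen_id=3)

# A history that breaks the invariant: value b shows up after a was chosen.
broken_ok = invariant_holds({1: "a", 2: "b"}, chosen_id=1)
```

Note that the invariant says nothing about proposals with lower IDs, which is why (2, b) in the figure is harmless.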
Paxos Protocol: Phase 1
[Figure: proposer a wants to propose cake and sends prepare (1, a) to three of the four acceptors. All four acceptors hold Proposal: (0, z), Accepted: ∅.]
Proposal ID: (<index>, <Sequence #>)
Prepare Message: prepare <proposal ID>
Paxos Protocol: Phase 1
[Figure: the three contacted acceptors update to Proposal: (1, a), Accepted: ∅ and each reply promise (1, a) ∅ to proposer a. The fourth acceptor still holds Proposal: (0, z), Accepted: ∅.]
Promise Message: promise <proposal ID> <accepted value>
Paxos Protocol: Phase 2
[Figure: proposer a sends accept (1, a) cake to the three acceptors that promised. They hold Proposal: (1, a), Accepted: ∅; the fourth acceptor holds Proposal: (0, z), Accepted: ∅.]
Accept Message: accept <proposal ID> <value>
Paxos Protocol: Phase 2
[Figure: the three acceptors now hold Proposal: (1, a), Accepted: cake and report accepted cake. The fourth acceptor still holds Proposal: (0, z), Accepted: ∅.]
Paxos Protocol: Phase 1
[Figure: proposer b wants to propose ice cream and sends prepare (1, b) to three acceptors. Three acceptors hold Proposal: (1, a), Accepted: cake; the fourth holds Proposal: (0, z), Accepted: ∅.]
Prepare Message: prepare <proposal ID>
Paxos Protocol: Phase 1
[Figure: the three contacted acceptors update to Proposal: (1, b). Two reply promise (1, b) cake and one replies promise (1, b) ∅, so proposer b (😟) learns that cake has already been accepted. The remaining acceptor still holds Proposal: (1, a), Accepted: cake.]
Promise Message: promise <proposal ID> <accepted value>
Paxos Protocol: Phase 2
[Figure: proposer b, which wanted to propose ice cream, sends accept (1, b) cake to the three acceptors that promised. The remaining acceptor still holds Proposal: (1, a), Accepted: cake.]
Accept Message: accept <proposal ID> <value>
Paxos Protocol: Phase 2
[Figure: final state. One acceptor holds Proposal: (1, a), Accepted: cake; the other three hold Proposal: (1, b), Accepted: cake.]
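The cake / ice cream walk-through can be replayed with a small in-memory sketch of single-decree Paxos. It makes several assumptions that are not on the slides: messages are synchronous function calls, nothing fails, proposal IDs are `(number, proposer)` tuples, and the names `Acceptor`, `propose`, `on_prepare` and `on_accept` are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

ProposalID = Tuple[int, str]


@dataclass
class Acceptor:
    promised: ProposalID = (0, "z")            # highest ID promised so far
    accepted_id: Optional[ProposalID] = None   # ID under which `accepted` was taken
    accepted: Optional[str] = None             # accepted value, None plays the role of ∅

    def on_prepare(self, pid):
        """Phase 1: promise if pid is newer than anything promised before."""
        if pid > self.promised:
            self.promised = pid
            return ("promise", pid, self.accepted_id, self.accepted)
        return None

    def on_accept(self, pid, value):
        """Phase 2: accept unless a higher-numbered prepare has been promised."""
        if pid >= self.promised:
            self.promised = pid
            self.accepted_id, self.accepted = pid, value
            return ("accepted", pid, value)
        return None


def propose(pid, wanted, acceptors, quorum_size):
    """Run both phases against `acceptors`; return the chosen value or None."""
    promises = [p for p in (a.on_prepare(pid) for a in acceptors) if p]
    if len(promises) < quorum_size:
        return None
    # Adopt the accepted value with the highest ID, if any acceptor reported one.
    prior = [(aid, val) for _, _, aid, val in promises if aid is not None]
    value = max(prior, key=lambda p: p[0])[1] if prior else wanted
    acks = [r for r in (a.on_accept(pid, value) for a in acceptors) if r]
    return value if len(acks) >= quorum_size else None


acceptors = [Acceptor() for _ in range(4)]   # four acceptors, quorum of three
first = propose((1, "a"), "cake", acceptors[0:3], quorum_size=3)
second = propose((1, "b"), "ice cream", acceptors[1:4], quorum_size=3)
```

In this run proposer b asks for ice cream but ends up proposing cake, because two of its promises report cake as already accepted. The final acceptor states match the last figure: one acceptor at (1, a) and three at (1, b), all with cake.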
Paxos: Some Questions
• Why do proposers need to pick the accepted value returned in Phase 1 (the one with the highest proposal ID) rather than their own?
Paxos: Some Questions
• Is it possible to reach this situation?
[Figure: five acceptors. Two hold Proposal: (1, a), Accepted: cake; three hold Proposal: (1, b), Accepted: cannoli.]
Paxos: Some Questions
[Figure: the same five acceptors. Proposer c wants to propose cake and sends prepare (1, c) to three of them.]
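One way to explore these questions is to compare two rules a proposer could use to pick its value. The sketch assumes proposer c hears back from the two acceptors holding cake under (1, a) and one holding cannoli under (1, b); that choice of quorum is an assumption about the figure. Since three of five acceptors accepted cannoli, cannoli is already chosen. The function names are hypothetical.

```python
from collections import Counter


def value_by_highest_id(promises, own_value):
    """Paxos rule: adopt the accepted value with the highest proposal ID."""
    accepted = [(pid, v) for pid, v in promises if pid is not None]
    if not accepted:
        return own_value
    return max(accepted, key=lambda pv: pv[0])[1]


def value_by_popularity(promises, own_value):
    """A tempting but unsafe rule: adopt the most frequently reported value."""
    values = [v for pid, v in promises if pid is not None]
    if not values:
        return own_value
    return Counter(values).most_common(1)[0][0]


# (accepted_id, accepted_value) pairs returned in the promises to proposer c.
promises = [((1, "a"), "cake"), ((1, "a"), "cake"), ((1, "b"), "cannoli")]

safe_choice = value_by_highest_id(promises, "cake")
unsafe_choice = value_by_popularity(promises, "cake")
```

The highest-ID rule picks cannoli and preserves the chosen value. The popularity rule picks cake, which would let a second value be chosen if c's accept then reached a majority.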
Paxos: Non-Termination
Paxos Protocol: Phase 1
[Figure: proposer a sends prepare (1, a) to all three acceptors, each holding Proposal: (0, z), Accepted: ∅.]
Paxos Protocol: Phase 1
[Figure: all three acceptors update to Proposal: (1, a), Accepted: ∅ and reply promise (1, a) ∅.]
Paxos Protocol: Phase 1
[Figure: proposer b sends prepare (1, b) to all three acceptors, which hold Proposal: (1, a), Accepted: ∅.]
Paxos Protocol: Phase 1
[Figure: all three acceptors update to Proposal: (1, b), Accepted: ∅ and reply promise (1, b) ∅.]
Paxos Protocol: Phase 1
• Accept for (1, a) will fail.
[Figure: all three acceptors hold Proposal: (1, b), Accepted: ∅.]
Paxos Protocol: Phase 1
[Figure: proposer a retries and sends prepare (2, a) to all three acceptors, which hold Proposal: (1, b), Accepted: ∅.]
Paxos Protocol: Phase 1
[Figure: all three acceptors update to Proposal: (2, a), Accepted: ∅ and reply promise (2, a) ∅.]
Paxos Protocol: Phase 1
• Accept for (1, b) will fail.
[Figure: all three acceptors hold Proposal: (2, a), Accepted: ∅.]
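The duel between proposers a and b can be replayed step by step. This is a sketch with assumed details: three acceptors, synchronous calls, a stripped-down hypothetical `Acceptor`, and placeholder values "x" and "y" since the slides do not name the values being proposed.

```python
class Acceptor:
    """Minimal acceptor: tracks the highest promised ID and the accepted value."""

    def __init__(self):
        self.promised = (0, "z")
        self.accepted = None

    def prepare(self, pid):
        if pid > self.promised:
            self.promised = pid
            return True
        return False

    def accept(self, pid, value):
        if pid >= self.promised:
            self.promised, self.accepted = pid, value
            return True
        return False


acceptors = [Acceptor() for _ in range(3)]


def prepare_all(pid):
    """Send prepare to every acceptor; return how many promised."""
    return sum(a.prepare(pid) for a in acceptors)


def accept_all(pid, value):
    """Send accept to every acceptor; return how many accepted."""
    return sum(a.accept(pid, value) for a in acceptors)


trace = [
    prepare_all((1, "a")),      # a gets promises from everyone
    prepare_all((1, "b")),      # b pre-empts a before a's accept arrives
    accept_all((1, "a"), "x"),  # a's accept is rejected everywhere
    prepare_all((2, "a")),      # a retries with a higher ID and pre-empts b
    accept_all((1, "b"), "y"),  # b's accept is rejected everywhere
]
nothing_accepted = all(a.accepted is None for a in acceptors)
```

Every prepare succeeds and every accept fails, so no value is ever accepted. Nothing stops the two proposers from repeating this pattern indefinitely, which is the non-termination the slides describe.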
How to Resolve this Problem?
• Elect a leader.
• Introduce random timeouts to ensure someone eventually wins.
• Leader is the only proposer (by and large).
• Still need acceptors and quorum to make sure future leaders don't forget.
• Elect a new leader in response to failure/timeout/etc.
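The random-timeout idea can be illustrated with a toy model. Everything here is an assumption made for the sketch: two competing proposers, randomized exponential backoff as the timeout policy, and the simplification that a proposer wins once it starts more than one full round (`round_time`) ahead of its rival. It is not a leader election protocol.

```python
import random


def retry_delay(attempt, rng, base=0.05, cap=2.0):
    """Randomized exponential backoff: uniform in [0, min(cap, base * 2**attempt)]."""
    return rng.uniform(0, min(cap, base * 2 ** attempt))


def rounds_until_winner(rng, round_time=0.1, max_rounds=1000):
    """Toy duel: each round both proposers wait a random delay before retrying.

    A proposer wins when it starts more than `round_time` before the other,
    so it can finish both phases without being pre-empted.
    """
    for attempt in range(max_rounds):
        delay_a = retry_delay(attempt, rng)
        delay_b = retry_delay(attempt, rng)
        if abs(delay_a - delay_b) > round_time:
            winner = "a" if delay_a < delay_b else "b"
            return attempt + 1, winner
    return None


result = rounds_until_winner(random.Random(0))
```

With identical fixed timeouts the two proposers would keep colliding. Randomizing the delay makes a long enough gap increasingly likely as the backoff window grows, so one proposer gets through and can act as the leader.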
Extending to State Machine