Implementing Distributed Consensus Dan Lüdtke
What?
● My hobby project of learning about Distributed Consensus
  ○ I implemented a Paxos variant in Go and learned a lot about reaching consensus
  ○ A fine selection of some of the mistakes I made

Why?
● I wanted to understand Distributed Consensus
  ○ Everyone seemed to understand it. Except me.
● I am a hands-on person.
  ○ Doing $stuff > Reading about $stuff

Why talk about it?
● Knowledge sharing
Distributed Consensus
Protocols
● Paxos
  ○ Multi-Paxos
  ○ Cheap Paxos
● Raft
● ZooKeeper Atomic Broadcast
● Proof-of-Work Systems
  ○ Bitcoin
● Lockstep Anti-Cheating

Implementations
● Chubby
  ○ a coarse-grained lock service
● etcd
  ○ a distributed key-value store
● Apache ZooKeeper
  ○ a centralized service for maintaining configuration information, naming, and providing distributed synchronization
Paxos
Paxos Roles
● Client
  ○ Issues requests to a proposer
  ○ Waits for a response from a learner
    ■ Consensus on value X, or
    ■ No consensus on value X
● Proposer (P)
  ○ Advocates a client request
  ○ Asks acceptors to agree on the proposed value
  ○ Moves the protocol forward when there is conflict
● Acceptor (A)
  ○ Also called "voter"
  ○ The fault-tolerant "memory" of the system
  ○ Groups of acceptors form a quorum
● Learner (L)
  ○ Adds replication to the protocol
  ○ Takes action on learned (agreed-on) values
  ○ E.g. responds to the client
● Leader (LD)
  ○ Distinguished proposer
  ○ The only proposer that can make progress
  ○ Multiple proposers may believe to be the leader
  ○ Acceptors decide which one gets a majority
Coalesced Roles
● A single processor can have multiple roles (P+)
  ○ Proposer
  ○ Acceptor
  ○ Learner
● Client talks to any processor
  ○ Nearest one?
  ○ Leader?
Coalesced Roles at Scale
● A P+ system is a complete digraph
  ○ a directed graph in which every pair of distinct vertices is connected by a pair of unique edges
  ○ Everyone talks to everyone
● Let n be the number of processors
  ○ a.k.a. quorum size
● Connections = n * (n - 1)
  ○ Potential network (TCP) connections
Coalesced Roles with Leader
● A P+ system with a leader is a directed graph
  ○ The leader talks to everyone else
● Let n be the number of processors
  ○ a.k.a. quorum size
● Connections = n - 1
  ○ Network (TCP) connections
Coalesced Roles at Scale
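The two connection formulas above can be checked with a quick calculation (a minimal sketch; the function names are my own, not part of Skinny):

```go
package main

import "fmt"

// fullMeshConnections: every one of n processors talks to every
// other, giving n * (n - 1) directed connections.
func fullMeshConnections(n int) int { return n * (n - 1) }

// leaderConnections: only the leader talks to the n - 1 others.
func leaderConnections(n int) int { return n - 1 }

func main() {
	for _, n := range []int{3, 5, 7} {
		fmt.Printf("n=%d  full mesh=%d  with leader=%d\n",
			n, fullMeshConnections(n), leaderConnections(n))
	}
}
```

For a five-instance quorum this is 20 potential connections in the full mesh versus 4 with a leader, which is why leader-based variants scale better.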
Limitations
● Single consensus
  ○ Once consensus has been reached, no more progress can be made
  ○ But: applications can start new Paxos runs
● Multiple proposers may believe to be the leader
  ○ Dueling proposers
  ○ Theoretically an infinite duel
  ○ Practically, retry limits and jitter help
● Standard Paxos is not resilient against Byzantine failures
  ○ Byzantine: lying or compromised processors
  ○ Solution: Byzantine Paxos protocol
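Retry limits and jitter against dueling proposers can be sketched like this (a hypothetical sketch, not Skinny's actual code; proposeOnce is a stand-in for one proposal round):

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
	"time"
)

// proposeOnce stands in for one Paxos proposal round; here it always
// fails so the retry loop below is exercised.
func proposeOnce() error { return errors.New("dueling proposer won") }

// proposeWithRetry retries a bounded number of times, sleeping a
// random jitter between attempts so two dueling proposers are
// unlikely to keep colliding forever.
func proposeWithRetry(maxRetries int) error {
	var err error
	for i := 0; i < maxRetries; i++ {
		if err = proposeOnce(); err == nil {
			return nil
		}
		// Random jitter, growing with the attempt number.
		time.Sleep(time.Duration(rand.Intn(50)*(i+1)) * time.Millisecond)
	}
	return fmt.Errorf("gave up after %d attempts: %w", maxRetries, err)
}

func main() {
	fmt.Println(proposeWithRetry(3))
}
```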
Introducing Skinny ● Paxos-based ● Feature-free ● Educational ● Lock Service
Skinny "Features"
● Easy to understand and observe
● Coalesced roles
● Single lock
  ○ Locks are always advisory!
  ○ A lock service does not enforce obedience to locks.
● Go
● Protocol Buffers
● gRPC
● Do not use in production!
Assuming five instances, one per region...
● Oregon (North America)
● São Paulo (South America)
● London (Europe)
● Taiwan (Asia)
● Sydney (Australia)
How Skinny reaches consensus
SKINNY QUORUM
A client sends "Lock please?" to one instance of the five-instance quorum. All instances start with ID 0, Promised 0, and no holder.

PHASE 1A: PROPOSE
The receiving instance increments its proposal ID, sets Promised 1, and sends Proposal ID 1 to all peers.

PHASE 1B: PROMISE
Every instance records Promised 1 and replies Promise ID 1.

PHASE 2A: COMMIT
The proposer sets ID 1 and Holder Beaver, then sends Commit (ID 1, Holder Beaver) to all peers.

PHASE 2B: COMMITTED
All instances now hold ID 1, Promised 1, Holder Beaver and reply Committed. The client is answered: "Lock acquired! Holder is Beaver."
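The four phases above can be simulated without any networking (a simplified sketch of the walkthrough, not the real Skinny code; all names are mine):

```go
package main

import "fmt"

// instance models the per-node state from the walkthrough: the highest
// promised proposal ID, the committed ID, and the lock holder.
type instance struct {
	promised uint64
	id       uint64
	holder   string
}

// promise implements Phase 1: an instance promises a proposal ID only
// if it is higher than anything it has promised before.
func (in *instance) promise(id uint64) bool {
	if id <= in.promised {
		return false
	}
	in.promised = id
	return true
}

// commit implements Phase 2: an instance accepts the value for a
// proposal at least as high as its promise.
func (in *instance) commit(id uint64, holder string) bool {
	if id < in.promised {
		return false
	}
	in.id, in.holder = id, holder
	return true
}

func main() {
	quorum := []*instance{{}, {}, {}, {}, {}}

	// Phase 1A/1B: propose ID 1 to all instances, count promises.
	promises := 0
	for _, in := range quorum {
		if in.promise(1) {
			promises++
		}
	}
	// Phase 2A/2B: with a majority of promises, commit the holder.
	if promises > len(quorum)/2 {
		for _, in := range quorum {
			in.commit(1, "Beaver")
		}
		fmt.Println("Lock acquired! Holder is", quorum[0].holder)
	}
}
```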
How Skinny deals with Instance Failure
SCENARIO
All five instances agree: ID 9, Promised 9, Holder Beaver.

TWO INSTANCES FAIL
Two instances go down; the three survivors still hold ID 9, Promised 9, Holder Beaver.

INSTANCES ARE BACK BUT STATE IS LOST
The two instances restart with ID 0, Promised 0, and no holder. A client asks one of them "Lock please?", and that instance sends Proposal ID 1 to all peers.

PROPOSAL REJECTED
The other restarted instance promises ID 1, but the three survivors reply "NOT Promised, ID 9, Holder Beaver": they have already promised a higher ID and return their learned state.

START NEW PROPOSAL WITH LEARNED VALUES
The restarted instance adopts the learned state (Holder Beaver) and sends a new, higher Proposal ID 12 to all peers.

PROPOSAL ACCEPTED
All five instances promise ID 12 and reply Promise ID 12.

COMMIT LEARNED VALUE
The instance sends Commit (ID 12, Holder Beaver); every instance sets ID 12, Promised 12, Holder Beaver.

COMMIT ACCEPTED, LOCK NOT GRANTED
All instances reply Committed. The client is answered: "Lock NOT acquired! Holder is Beaver."
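The recovery above hinges on Phase 1B replies carrying the learned state back to a proposer that lost its own. A minimal sketch of that reply path (my own names, not Skinny's API):

```go
package main

import "fmt"

// instance holds the state each node keeps: the highest promised
// proposal ID and the learned lock holder.
type instance struct {
	promised uint64
	holder   string
}

// promiseResponse mirrors a Phase 1B reply: whether the instance
// promised, plus its own ID and holder so a restarted proposer can
// learn the state it lost.
type promiseResponse struct {
	promised bool
	id       uint64
	holder   string
}

// handlePromise rejects proposals at or below the promised ID and
// always returns the instance's current state.
func (in *instance) handlePromise(id uint64) promiseResponse {
	if id <= in.promised {
		return promiseResponse{false, in.promised, in.holder}
	}
	in.promised = id
	return promiseResponse{true, in.promised, in.holder}
}

func main() {
	// A restarted instance (state lost) proposes ID 1 to a survivor
	// that already promised ID 9 with holder Beaver.
	survivor := &instance{promised: 9, holder: "Beaver"}
	resp := survivor.handlePromise(1)
	fmt.Println("promised:", resp.promised, "learned:", resp.id, resp.holder)

	// The proposer adopts the learned values and retries with a
	// higher ID (12 in the walkthrough); the holder is preserved.
	retry := survivor.handlePromise(12)
	fmt.Println("promised:", retry.promised, "holder stays:", retry.holder)
}
```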
Skinny APIs
Skinny APIs
● Lock API
  ○ Used by clients to acquire or release a lock
● Consensus API
  ○ Used by Skinny instances to reach consensus
● Control API
  ○ Used by us (admin) to observe what's happening
Lock API

message AcquireRequest {
  string Holder = 1;
}
message AcquireResponse {
  bool Acquired = 1;
  string Holder = 2;
}
message ReleaseRequest {}
message ReleaseResponse {
  bool Released = 1;
}

service Lock {
  rpc Acquire(AcquireRequest) returns (AcquireResponse);
  rpc Release(ReleaseRequest) returns (ReleaseResponse);
}
Consensus API

// Phase 1: Promise
message PromiseRequest {
  uint64 ID = 1;
}
message PromiseResponse {
  bool Promised = 1;
  uint64 ID = 2;
  string Holder = 3;
}

// Phase 2: Commit
message CommitRequest {
  uint64 ID = 1;
  string Holder = 2;
}
message CommitResponse {
  bool Committed = 1;
}

service Consensus {
  rpc Promise (PromiseRequest) returns (PromiseResponse);
  rpc Commit (CommitRequest) returns (CommitResponse);
}
Control API

message StatusRequest {}
message StatusResponse {
  string Name = 1;
  uint64 Increment = 2;
  string Timeout = 3;
  uint64 Promised = 4;
  uint64 ID = 5;
  string Holder = 6;
  message Peer {
    string Name = 1;
    string Address = 2;
  }
  repeated Peer Peers = 7;
}

service Control {
  rpc Status(StatusRequest) returns (StatusResponse);
}
My Stupid Mistakes My Awesome Learning Opportunities
Reaching Out...
Skinny Instance
● List of peers
  ○ All other instances in the quorum
● Peer
  ○ gRPC client connection
  ○ Consensus API client

// Instance represents a skinny instance
type Instance struct {
	mu sync.RWMutex
	// begin protected fields
	...
	peers []*peer
	// end protected fields
}

type peer struct {
	name    string
	address string
	conn    *grpc.ClientConn
	client  pb.ConsensusClient
}
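Fanning a proposal out to the peers list and counting a majority might look like this (a simplified, network-free sketch; promiseFn stands in for the Consensus API client call, and all names are mine):

```go
package main

import (
	"fmt"
	"sync"
)

// promiseFn stands in for the Promise RPC a peer's Consensus API
// client would expose (hypothetical signature).
type promiseFn func(id uint64) bool

// countPromises asks every peer in parallel, the way an instance with
// a []*peer list fans out calls, and counts the yes-votes.
func countPromises(peers []promiseFn, id uint64) int {
	var (
		mu    sync.Mutex
		votes int
		wg    sync.WaitGroup
	)
	for _, p := range peers {
		wg.Add(1)
		go func(p promiseFn) {
			defer wg.Done()
			if p(id) {
				mu.Lock()
				votes++
				mu.Unlock()
			}
		}(p)
	}
	wg.Wait()
	return votes
}

func main() {
	yes := func(uint64) bool { return true }
	no := func(uint64) bool { return false }
	peers := []promiseFn{yes, yes, no, yes} // 4 peers + self = quorum of 5
	votes := countPromises(peers, 1)
	// The proposer's own vote counts too, so majority needs
	// votes+1 > (n+1)/2.
	fmt.Printf("%d of %d peers promised, majority: %v\n",
		votes, len(peers), votes+1 > (len(peers)+1)/2)
}
```

The mutex around the vote counter matters: each reply arrives on its own goroutine, mirroring the `mu sync.RWMutex` guarding Instance's protected fields above.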