SDPaxos: Building Efficient Semi-Decentralized Geo-replicated State Machines


  1. SDPaxos: Building Efficient Semi-Decentralized Geo-replicated State Machines. Hanyu Zhao*, Quanlu Zhang†, Zhi Yang*, Ming Wu†, Yafei Dai* (*Peking University, †Microsoft Research)

  2. Replication for Fault Tolerance Peking University, Microsoft Research 2

  3. Replication in the Wide Area - Reducing wide-area latency for clients [Figure: client-to-replica latencies labeled 150ms and 20ms]

  4. Keeping the Replicated State Consistent [Figure: one replica holds “Having fun at SoCC!” while another holds “Having fun at OSDI!”: inconsistent]

  5. State Machine Replication (SMR) - Execute the same sequence of commands in the same order [Figure: three replicas, each with the command log A = 1, A = 2, A = 3 and the resulting state A = 3]
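
The property on the SMR slide can be illustrated with a minimal sketch. The function `apply_commands` and the `(key, value)` command shape are assumptions made for illustration; they do not come from the slides.

```python
def apply_commands(commands, state=None):
    """Apply (key, value) write commands in order and return the final state."""
    state = dict(state or {})
    for key, value in commands:
        state[key] = value
    return state


# The same command sequence, delivered in the same order to every replica.
log = [("A", 1), ("A", 2), ("A", 3)]
replicas = [apply_commands(log) for _ in range(3)]

# A replica that applies the same commands in a different order can diverge.
reordered = apply_commands([("A", 3), ("A", 1), ("A", 2)])
```

Every entry of `replicas` ends in the same state, while `reordered` ends with a different value for `A`, which is the inconsistency SMR is meant to rule out.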

  6. Paxos - A distributed agreement protocol - Tolerates F failures given 2F+1 replicas - Choose a single command for each command slot using a Paxos instance [Figure: Paxos instance 1 chooses A = 1 on all three replicas]

  7. Paxos - A distributed agreement protocol - Tolerates F failures given 2F+1 replicas - Choose a single command for each command slot using a Paxos instance [Figure: Paxos instance 2 chooses A = 2 on all three replicas]

  8. Paxos - A distributed agreement protocol - Tolerates F failures given 2F+1 replicas - Choose a single command for each command slot using a Paxos instance [Figure: Paxos instance 3 chooses A = 3 on all three replicas]
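
The "F failures given 2F+1 replicas" arithmetic can be written out as a small sketch. The helper names below are hypothetical and chosen only for this illustration.

```python
def majority(n_replicas):
    """Smallest number of replicas that forms a majority quorum."""
    return n_replicas // 2 + 1


def tolerated_failures(n_replicas):
    """Largest F such that n_replicas >= 2F + 1."""
    return (n_replicas - 1) // 2


def is_chosen(acks, n_replicas):
    """A value counts as chosen once a majority of replicas has accepted it."""
    return acks >= majority(n_replicas)


# With 2F + 1 replicas, a majority is F + 1, so F replicas may fail.
sizes = {n: (majority(n), tolerated_failures(n)) for n in (3, 5, 7)}
```

Because any two majorities of the same replica set overlap in at least one replica, a later proposer always meets at least one replica that knows about an earlier chosen value.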

  9. Centralized SMR - Liveness property of Paxos: there should not be multiple replicas proposing commands in the same instance simultaneously [Figure: replicas propose A = 1, A = 2, and A = 3 in the same instance: conflict]

  10. Centralized SMR - Liveness property of Paxos: there should not be multiple replicas proposing commands in the same instance simultaneously - A stable leader [Figure: the leader proposes A = 1, A = 2, A = 3]

  11. Drawbacks of Centralized SMR - Potential performance bottleneck - Low throughput

  12. Drawbacks of Centralized SMR - Potential performance bottleneck - Low throughput - High wide-area latency [Figure: latencies labeled 20ms and 200ms]

  13. Drawbacks of Centralized SMR - Potential performance bottleneck - Low throughput - High wide-area latency - Centralized SMR: limited performance

  14. Drawbacks of Centralized SMR - Potential performance bottleneck - Low throughput - High wide-area latency - Centralized SMR: limited performance - Decentralized SMR: high performance?

  15. Decentralizing SMR - Replicas should propose commands in different command slots - How to order them? [Figure: replicas R0, R1, R2 holding command A = 0]

  16. Decentralizing SMR - Replicas should propose commands in different command slots - How to order them? [Figure: replicas R0, R1, R2 holding commands A = 0 and A = 1]

  17. Decentralizing SMR - Replicas should propose commands in different command slots - How to order them? [Figure: replicas R0, R1, R2 holding commands A = 0, A = 1, and A = 2]

  18. Static Ordering - The system runs at the speed of the slowest one [Figure: commands A = 1, A = 2, A = 3; a straggler leaves the other replicas blocked]
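
Why a straggler blocks everyone under static ordering can be sketched as follows. The round-robin slot assignment and the names `owner` and `executable_prefix` are assumptions for illustration; the slide does not spell out the assignment scheme.

```python
def owner(slot, n_replicas):
    """Assumed static assignment: slot i belongs to replica i mod N."""
    return slot % n_replicas


def executable_prefix(filled_slots):
    """Count how many slots, starting at 0, can execute without hitting a gap."""
    n = 0
    while n in filled_slots:
        n += 1
    return n


n_replicas = 3
# Replicas 0 and 2 have filled their slots; replica 1 (the straggler) has filled none.
filled = {s for s in range(9) if owner(s, n_replicas) != 1}

blocked_at = executable_prefix(filled)           # stops at the straggler's first slot
after_catch_up = executable_prefix(filled | {1}) # advances only to the next gap
```

Although six of nine slots are filled, execution stops at slot 1, and filling that single slot only moves the frontier to the straggler's next slot.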

  19. Dependency-based Ordering - Ordering overhead under contention [Figure: replicas see conflicting commands A = 1, A = 2, A = 3 in different orders]

  20. Dependency-based Ordering - Ordering overhead under contention [Figure: commands A = 1, A = 2, A = 3]

  21. Drawbacks of Decentralized SMR - Extra coordination for ordering => performance degradation - Lower throughput - Higher latency - Centralized SMR: limited performance - Decentralized SMR: poor performance stability

  22. Drawbacks of Decentralized SMR - Extra coordination for ordering => performance degradation - Lower throughput - Higher latency - Semi-Decentralized SMR (SDPaxos): high performance, strong performance stability

  23. SDPaxos Intuition [Figure: replicas R0, R1, R2 holding commands A = 0, A = 1, and A = 2]

  24. SDPaxos Intuition [Figure: replicas R0, R1, R2 holding commands A = 0, A = 1, and A = 2, alongside a separate ordering of replicas (R2, R1, R0)]

  25. Centralizing Ordering - Dynamic leadership establishment (stragglers won’t block others) - All commands are serialized (no conflicts) - Ordering is more lightweight than replicating [Figure: a replica tells the sequencer “I want to propose a command”; labels R0, R1, R2, R0, R2]
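
The split between decentralized replication and centralized ordering can be sketched as two kinds of logs. This is a simplified illustration, not the protocol from the paper: the function `execution_order`, the log contents, and the rule that the j-th ordering entry for a replica refers to that replica's j-th command are assumptions made here.

```python
from collections import defaultdict


def execution_order(ordering_log, command_logs):
    """Merge per-replica command logs using the sequencer's log of replica IDs."""
    next_index = defaultdict(int)
    order = []
    for replica in ordering_log:
        i = next_index[replica]
        log = command_logs.get(replica, [])
        if i >= len(log):
            break  # assumed rule: a slot whose command is unknown holds back later slots
        order.append(log[i])
        next_index[replica] += 1
    return order


# Each replica replicates its own commands; the sequencer only records who goes next.
command_logs = {"R0": ["a0", "a1"], "R1": ["b0"], "R2": ["c0"]}
ordering_log = ["R2", "R1", "R0", "R0"]

merged = execution_order(ordering_log, command_logs)
partial = execution_order(["R1", "R1", "R0"], command_logs)
```

The ordering log carries only replica IDs rather than command payloads, which is one way to read the claim that ordering is more lightweight than replicating.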

  26. SDPaxos: The Basic Protocol - 1.5 round trips [Figure: message diagram. A client request for command A arrives at R0, which replicates A to the others without an execution order (C-accept (A), C-ACK (A)); R2, the sequencer, assigns A to the next slot (O-accept (R0), O-ACK (R0))]

  27. Reducing Latency for 3 Replicas - R0 and R2 have constituted a majority [Figure: message diagram. A client request for command A arrives at R0, which replicates A to the others without an execution order (C-accept (A), C-ACK (A)); R2, the sequencer, assigns A to the next slot (O-accept (R0), O-ACK (R0))]

  28. Reducing Latency for 3 Replicas - R0 and R2 have constituted a majority - 1 round trip [Figure: message diagram. A client request for command A arrives at R0, which replicates A to the others without an execution order (C-accept (A), C-ACK (A)); R2, the sequencer, assigns A to the next slot (O-accept (R0)); R1 still sends O-ACK (R0)]

  29. Reducing Latency for 5 Replicas - This assignment can be lost if R0 and R2 fail [Figure: replicas R0 to R4 with R2 as sequencer; messages C-accept (A), C-ACK (A), O-accept (R0)]

  30. Reducing Latency for 5 Replicas - Assignments for the sequencer can be seen by a majority in just one round trip [Figure: replicas R0 to R4 with R2 as sequencer; messages C-accept & O-accept, C-ACK & O-ACK]
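
The majority argument on the three-replica and five-replica slides comes down to counting which replicas are known to hold the command and the slot assignment. The sketch below only does that counting; the function `ready` and the sets of accepters are hypothetical and leave out the actual message handling.

```python
def majority(n_replicas):
    """Smallest number of replicas that forms a majority quorum."""
    return n_replicas // 2 + 1


def ready(command_accepters, ordering_accepters, n_replicas):
    """Assumed readiness rule: both the command and its slot assignment
    are known to be held by a majority of replicas."""
    quorum = majority(n_replicas)
    return len(command_accepters) >= quorum and len(ordering_accepters) >= quorum


# After one round trip, R0 knows that itself and the sequencer R2 hold both pieces.
after_one_round_trip = {"R0", "R2"}

three = ready(after_one_round_trip, after_one_round_trip, 3)
five = ready(after_one_round_trip, after_one_round_trip, 5)
```

With three replicas, R0 and R2 already form a majority, so `three` is true. With five replicas the same two replicas are a minority, which matches the slide's remark that the assignment can be lost if R0 and R2 fail.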

  31. Handling Failures for 5 Replicas [Figure: replicas R0 to R4 with one acting as sequencer (Seq); ordering and per-replica entries labeled R0 to R4]

  32. Handling Failures for 5 Replicas [Figure: replicas R0 to R4 with one acting as sequencer (Seq); ordering and per-replica entries labeled R0 to R4, with a further sequence R2, R0, R3, R4, R1]

  33. More Details in the Paper - The detailed protocol and fault tolerance approach - Reads bypassing Paxos - Leveraging the centralized ordering to perform fast and safe reads - Performance optimizations - Lightening the load of ordering - Straggler detection - …

  34. Experimental Setup - Baselines - Multi-Paxos - Mencius - EPaxos - Workload: a replicated key-value store - Testbed: Amazon EC2 m4.large instances - Wide-area experiments: CA, OR, OH, IRE, SEL

  35. Performance Stability against Stragglers [Figure: bar chart of throughput (ops / sec, axis 0 to 120000) for Multi-Paxos, Mencius, SDPaxos-N, and SDPaxos-S; annotations 20.0%, 28.2%, 47.7%, 67.2%, and 1.6x]

  36. Performance Stability against Contention [Figure: throughput (ops / sec, axis 30000 to 75000) versus contention rate (0%, 5%, 25%, 50%, 75%, 100%) for EPaxos-3, EPaxos-5, SDPaxos-3, and SDPaxos-5; annotation 1.35x]

  37. Wide-area Latency [Figure: latency (ms)] - SDPaxos achieves the optimal number of round trips - SDPaxos’s latency depends on the distance to the sequencer (IRE) - SDPaxos’s latency is not impacted by stragglers or contention

  38. Conclusion - The first semi-decentralized SMR protocol - High performance - Strong performance stability - One round trip under realistic configurations (tolerating one or two failures) - High throughput and low latency with stragglers, under contention, or in ideal cases

  39. Q & A
