
CRDTs in Production: Dmitry Martyanov, Software Engineer @ PayPal



  1. CRDTs in Production. Dmitry Martyanov, Software Engineer @ PayPal. QCon, 2018.

  2. Geo-Distributed Datastore

  3. Context: more than 200 countries; regulatory requirements; state machine of compliance status; modified by multiple actors

  4. Shared Mutable State

  5. Shared Mutable State: Mutex

  6. Shared Mutable State: Mutex, Transactions

  7. Geo-Distributed Datastore

  8. Eventual Consistency

  9. Distributed System: Replica A, Replica B, Replica C, Replica D

  10. Distributed System (Replicas A, B, C, D): at t_n, PUT(key, val); at t_n+1, GET(key) = val

  11. Distributed System (Replicas A, B, C, D): at t_n, PUT(key, val); at t_n+1, GET(key) = val on one replica, but GET(key) = ? on another
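
The question mark on slide 11 can be reproduced with a toy model. This is a minimal sketch, assuming two in-memory replicas and a queue standing in for asynchronous replication; the names `replicas`, `pending`, `put`, `get` and `replicate` are hypothetical and do not come from the talk.

```python
from collections import deque

# Hypothetical two-replica store; the queue models asynchronous replication.
replicas = {"A": {}, "B": {}}
pending = deque()  # writes waiting to be shipped to the other replicas

def put(replica: str, key: str, val: str) -> None:
    """Apply the write locally and queue it for every other replica."""
    replicas[replica][key] = val
    for other in replicas:
        if other != replica:
            pending.append((other, key, val))

def get(replica: str, key: str):
    """Read only the local copy; None means the write has not arrived yet."""
    return replicas[replica].get(key)

def replicate() -> None:
    """Deliver all queued writes."""
    while pending:
        target, key, val = pending.popleft()
        replicas[target][key] = val

put("A", "seat", "12F")           # t_n
read_same = get("A", "seat")      # t_n+1 on the replica that took the write
read_other = get("B", "seat")     # t_n+1 on another replica: not there yet
replicate()
read_later = get("B", "seat")     # once replication has caught up
```

Until `replicate()` runs, a reader on replica B simply does not see the write, which is the behaviour the affinity, coordinator and consensus approaches on the following slides each try to hide.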

  12. Affinity Based Approaches: Replica A, Replica B, Replica C, Replica D

  13. Affinity Based Approaches: Replica A, Replica B, Replica C, Replica D

  14. Coordinator Based Approaches: Replica A, Replica B, Replica C, Replica D

  15. Consensus Based Approaches: Paxos, Raft, etc.

  16. Service Stack. Labels: Business Logic; Service Domain; Platform; Service Infrastructure; Domain Data; Datastore; Data Infrastructure. Annotations: what type of documents; business rules; filtering logic, etc.; entity objects; DAO layer; flow control; service discovery; routing & balancing; failover strategy; data model, records; deployment configuration; namespaces & schemas; replication params

  17. Service Stack: Business Logic; Service Domain; Platform; Service Infrastructure; Domain Data; Datastore; Data Infrastructure

  18. Service Stack: Business Logic; Service Domain; Platform; Service Infrastructure; Domain Data; Datastore; Data Infrastructure

  19. Conflict-free Replicated Data Types

  20. CRDTs: convergent vs. commutative. Convergent (replicas exchange state and merge): requires commutativity, associativity and idempotence; does not require exactly-once delivery. Commutative (replicas exchange operations such as f(x1), g(x2), f(x3)): requires commutativity, associativity and exactly-once delivery; does not require idempotence.

  21. Convergent CRDTs. The merge function M is commutative: M(a, b) = M(b, a); associative: M(M(a, b), c) = M(a, M(b, c)); idempotent: M(a, b) = M(M(a, b), b) = M(M(M(a, b), b), b)
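
The three laws on slide 21 can be checked mechanically for a candidate merge function. The sketch below uses set union as a stand-in for M, which is an assumption made for illustration (the slide does not name a concrete data type); `merge`, `check_laws` and `samples` are hypothetical names.

```python
from itertools import product

def merge(a: frozenset, b: frozenset) -> frozenset:
    """Hypothetical merge M: set union."""
    return a | b

def check_laws(m, values) -> bool:
    """Return True if m is commutative, associative and idempotent on values."""
    for a, b, c in product(values, repeat=3):
        if m(a, b) != m(b, a):                # M(a, b) = M(b, a)
            return False
        if m(m(a, b), c) != m(a, m(b, c)):    # M(M(a, b), c) = M(a, M(b, c))
            return False
        if m(a, b) != m(m(a, b), b):          # M(a, b) = M(M(a, b), b)
            return False
    return True

samples = [
    frozenset(),
    frozenset({"12F"}),
    frozenset({"16D"}),
    frozenset({"12F", "10A"}),
]
laws_hold = check_laws(merge, samples)

# Counter-example: tuple concatenation is not commutative, so it fails.
concat_holds = check_laws(lambda a, b: a + b, [(1,), (2,)])
```

A check over a handful of samples is only a smoke test, not a proof, but it catches merge functions that depend on arrival order or on how many times a state is delivered.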

  22. Impacted Components for CRDTs CRDTs Business Logic Service Domain Platform CRDTs Service Infrastructure Domain Data CRDTs Datastore Data Infrastructure

  23. Online Flight Check-in System. At t1, replica a holds seat: 12F and replica b holds seat: 16D. After XDR, at t2, both replicas hold seat: 12F.

  24. Online Flight Check-in System (LWW). At t1, replica a holds seat: 12F (220) and replica b holds seat: 16D (150). After XDR, at t2, both replicas hold seat: 12F (220). M(a, b) for LWW = MAX(a, b)
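
A minimal sketch of the LWW merge from slide 24, assuming the numbers in parentheses (220 and 150) are write timestamps. The `Stamped` type and `lww_merge` function are hypothetical names, and breaking timestamp ties by comparing values is an extra assumption added only to keep the merge deterministic.

```python
from dataclasses import dataclass

@dataclass(frozen=True, order=True)
class Stamped:
    # Field order matters: ordering compares timestamp first, then value.
    timestamp: int
    value: str

def lww_merge(a: Stamped, b: Stamped) -> Stamped:
    """M(a, b) for LWW = MAX(a, b): the write with the larger timestamp wins."""
    return max(a, b)

on_a = Stamped(220, "12F")
on_b = Stamped(150, "16D")

merged_on_a = lww_merge(on_a, on_b)   # replica a receives b's write
merged_on_b = lww_merge(on_b, on_a)   # replica b receives a's write
```

Both replicas end up with 12F, so the state converges, but the 16D write disappears without either client being told.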

  25. Online Flight Check-in System. At t1, replica a holds seat: {a1: 12F} and replica b holds seat: {b1: 16D}. After XDR, at t2, both replicas hold seat: {b1: 16D, a1: 12F}.

  26. Online Flight Check-in System (Add-O Map). At t1, replica a holds seat: {a1: 12F} and replica b holds seat: {b1: 16D}. After XDR, at t2, both replicas hold seat: {b1: 16D, a1: 12F}.
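
The map on slides 25 and 26 can be sketched as a merge that only ever adds entries. This assumes the slide's "Add-O Map" behaves as an add-only map keyed by a unique write id (a1, b1, ...); `merge_maps` is a hypothetical helper, not code from the talk.

```python
def merge_maps(a: dict, b: dict) -> dict:
    """Union of two add-only maps keyed by unique write ids."""
    merged = dict(a)
    for entry_id, value in b.items():
        # Ids are unique per write, so an id already present carries the same value.
        merged.setdefault(entry_id, value)
    return merged

replica_a = {"a1": "12F"}   # state at t1 on replica a
replica_b = {"b1": "16D"}   # state at t1 on replica b

after_xdr_a = merge_maps(replica_a, replica_b)
after_xdr_b = merge_maps(replica_b, replica_a)
```

Unlike LWW, no write is lost: both values survive the merge, and deciding which seat is current is left to whoever reads the map.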

  27. Causality. seat: {b1: 16D, a1: 12F}; Causality Vector (cv): a1;b1

  28. Causality. seat: {b1: 16D, a1: 12F}; Causality Vector (cv): a1;b1. Values shown: 12F, 10A, 5C. 12F is causal to 10A, so we can drop 12F. 10A is causal to 5C, so we can drop 10A.

  29. Causality. seat: {b1: 16D, a1: 12F}; Causality Vector (cv): a1;b1. Values shown: 12F, 10A, 5C. 12F is causal to 10A, so we can drop 12F. 10A is NOT causal to 5C, so we can NOT drop 10A.

  30. Causality. seat: {b1: 16D, a1: 12F}; Causality Vector (cv): a1;b1. Client operations: GET(key): value => GET(key): (value, cv); PUT(key, value) => PUT(key, value, cv)

  31. Causality. seat: {b1: (16D, cv), a1: (12F, cv)}; Causality Vector (cv): a1;b1. Client operations: GET(key): value => GET(key): (value, cv); PUT(key, value) => PUT(key, value, cv)
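
The changed client contract on slides 30 and 31 can be sketched on a single replica. The assumptions here are loud ones: the cv is modelled as a map from replica name to the highest write counter seen (so the slides' "a2b1" would be `{"a": 2, "b": 1}`), entries are never deleted, and superseded values are filtered out at read time. The `Replica` class and its methods are hypothetical.

```python
from typing import Dict, List, Tuple

CV = Dict[str, int]        # replica name -> highest write counter observed
Entry = Tuple[str, CV]     # (value, cv the writer had seen)

class Replica:
    def __init__(self, name: str) -> None:
        self.name = name
        self.counter = 0
        self.entries: Dict[Tuple[str, int], Entry] = {}   # add-only

    def put(self, value: str, cv: CV) -> None:
        """PUT(key, value, cv): store the value with the cv the client read."""
        self.counter += 1
        self.entries[(self.name, self.counter)] = (value, dict(cv))

    def get(self) -> Tuple[List[str], CV]:
        """GET(key): (values, cv). Drop entries that some stored cv covers."""
        live = {}
        for (rep, n), (value, seen) in self.entries.items():
            covered = any(cv.get(rep, 0) >= n for _, cv in self.entries.values())
            if not covered:
                live[(rep, n)] = (value, seen)
        read_cv: CV = {}
        for (rep, n), (_, seen) in live.items():
            for r, c in [(rep, n), *seen.items()]:
                read_cv[r] = max(read_cv.get(r, 0), c)
        return [value for value, _ in live.values()], read_cv

a = Replica("a")
a.put("12F", {})                  # first write, nothing seen yet
values_1, cv_1 = a.get()
a.put("10F", cv_1)                # client saw 12F, so 10F supersedes it
values_2, cv_2 = a.get()
a.put("10A", {})                  # write that carries no cv
values_3, cv_3 = a.get()
```

The last write shows why the cv has to travel with every PUT: without it the store cannot tell that 10A was meant to replace 10F, and both are returned as siblings.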

  32. Aerospike Datastore. Diagram labels: Client, Read Path, Database, Memory, Disk (Flash), Async XDR

  33. Aerospike Datastore. Diagram labels: Client, Read Path, Database, Memory, Disk (Flash), Async XDR. Record: Key, Metadata, Bins (Bin1, Bin2, Bin3)

  34. Aerospike Datastore: User-Defined Functions. A: [bin1: 16B, bin2: 12A]. B: [bin2: 14C, bin3: 10A]. Result after XDR: [bin1: 16B, bin2: (12A or 14C), bin3: 10A]
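
The bin-level outcome on slide 34 can be illustrated in plain Python. This is not Aerospike code (Aerospike UDFs are written in Lua) and it does not claim to show how XDR picks a winner; it assumes record A is [bin1: 16B, bin2: 12A] and record B is [bin2: 14C, bin3: 10A], as read from the flattened slide text, and only shows which bins merge cleanly and which one is ambiguous. `merge_bins` is a hypothetical helper.

```python
def merge_bins(a: dict, b: dict) -> dict:
    """Combine two records bin by bin, keeping every distinct candidate value."""
    result = {}
    for name in sorted(a.keys() | b.keys()):
        candidates = {record[name] for record in (a, b) if name in record}
        result[name] = sorted(candidates)
    return result

record_a = {"bin1": "16B", "bin2": "12A"}
record_b = {"bin2": "14C", "bin3": "10A"}

result = merge_bins(record_a, record_b)
conflicting = [name for name, values in result.items() if len(values) > 1]
```

Only bin2 has more than one candidate, which matches the "(12A or 14C)" on the slide: bins written on one side only merge cleanly, while a bin written on both sides needs a rule that decides between the values.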

  35. Write 12F(_) on replica a, stored as a1:(12F, _).

  36. Write 10D(_) on replica b, stored as b1:(10D, _). Replica a still holds a1:(12F, _).

  37. After XDR, a1:(12F, _) and b1:(10D, _) are both present.

  38. Client reads [12F] with cv (a1), then writes 10F with cv (a1): [12F] -> 10F. Stored on replica a as a2:(10F, a1), alongside a1:(12F, _) and b1:(10D, _).

  39. Client reads [10F, 10D] with cv (a2b1), then writes 5C with cv (a2b1): [10F, 10D] -> 5C. Stored on replica a as a3:(5C, a2b1).

  40. The same history with the read shown as [10D, 10F](a2b1) -> 5C(a2b1). Entries listed in the bins: a1:(12F, _), b1:(10D, _), a2:(10F, a1), a3:(5C, a2b1).
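
The walkthrough on slides 35 to 40 can be replayed with two replicas. This is a sketch under several assumptions that the slides do not confirm: the cv "a2b1" is modelled as `{"a": 2, "b": 1}`, bins are add-only, superseded entries are filtered at read time, XDR is a plain union of entries, and the first read happens before b1 reaches replica a. The functions `put`, `xdr` and `get` are hypothetical.

```python
def put(state: dict, replica: str, value: str, cv: dict) -> None:
    """Append a new entry with the next counter for this replica."""
    n = 1 + max((c for (r, c) in state if r == replica), default=0)
    state[(replica, n)] = (value, dict(cv))

def xdr(src: dict, dst: dict) -> None:
    """Ship entries from src to dst; entries already present are left alone."""
    for entry_id, entry in src.items():
        dst.setdefault(entry_id, entry)

def get(state: dict):
    """Return (sorted live values, cv), hiding entries that some cv covers."""
    live = {}
    for (rep, n), (value, seen) in state.items():
        if not any(cv.get(rep, 0) >= n for _, cv in state.values()):
            live[(rep, n)] = (value, seen)
    read_cv = {}
    for (rep, n), (_, seen) in live.items():
        for r, c in [(rep, n), *seen.items()]:
            read_cv[r] = max(read_cv.get(r, 0), c)
    return sorted(value for value, _ in live.values()), read_cv

a, b = {}, {}
put(a, "a", "12F", {})                 # a1:(12F, _)
put(b, "b", "10D", {})                 # b1:(10D, _)
first_read = get(a)                    # [12F] with cv a1
put(a, "a", "10F", first_read[1])      # a2:(10F, a1)
xdr(b, a)                              # b1 arrives on replica a
second_read = get(a)                   # [10D, 10F] with cv a2b1
put(a, "a", "5C", second_read[1])      # a3:(5C, a2b1)
xdr(a, b)                              # a1, a2, a3 arrive on replica b
final_a, final_b = get(a), get(b)
```

After the last replication both replicas answer with the single value 5C, while the underlying bins still hold all four entries.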

  41. Learnings: • CRDTs allowed us to achieve a convergent, predictable state of our data

  42. Learnings: • CRDTs allowed us to achieve a convergent, predictable state of our data • Education about the right trade-off between consistency and correctness

  43. Learnings: • CRDTs allowed us to achieve a convergent, predictable state of our data • Education about the right trade-off between consistency and correctness • Do not underestimate concurrent data access

  44. Caveat #1: CV Propagation. Components: Data Access (1), Decision Engine, Data Access (2), Mid-layer Service, UX Component

  45. Caveat #1: CV Propagation. Components: Data Access (1), Decision Engine, Data Access (2), Mid-layer Service, UX Component

  46. Caveat #2: Siblings Explosion. Five writes on replica a, each with an empty cv: 10A(_), 12F(_), 10B(_), 12B(_), 5A(_). The stored entries grow with every write, ending at a1:(10A, _), a2:(12F, _), a3:(10B, _), a4:(12B, _), a5:(5A, _)

  47. Caveat #3: Wait, Siblings? The history from slide 40 ([12F] -> 10F, then [10D, 10F] -> 5C), marked with ???. Entries: a1:(12F, _), b1:(10D, _), a2:(10F, a1), a3:(5C, a2b1)

  48. Thanks!
