CRDTs in Practice
Marc Shapiro – Inria & UPMC
Nuno Preguiça – U. NOVA
Cloud to the edge
Social, web, e-commerce: shared mutable data
Scalability ⇒ replication ⇒ consistency issues
[CRDTs in practice — CodeMesh 2015]
Conflict-free replicated data types
Data type
• Encapsulates issues
Replicated
• At multiple nodes
Available
• Update my replica without coordination
• Convergence guaranteed (by mathematical properties)
• Decentralised, peer-to-peer
Why use CRDTs
Availability is king (otherwise stay away)
• ⟹ concurrent updates
Fine-grain mutable shared data
• Registers not sufficient
Mobile computing
In DC
Geo-replication
[All About Consistency — CodeMesh 2015]
CRDT design concepts
Backward-compatible with sequential datatype
If operations commute, they can be concurrent
• add(e); rm(f) ≣ rm(f); add(e) ≣ add(e) || rm(f)
Otherwise, deterministic semantics
• Close to sequential: rm(e); add(e) or add(e); rm(e)
• Don’t lose updates
• Result doesn't depend on order received
• Stable preconditions
bet365
Largest European on-line betting operator
• Bursty load: 2.5 million simultaneous users
• 1 Tb working set
• 1000s servers
• Slow users: transient inconsistency OK
• Availability, read my writes, monotonic reads
• Transparency
Before: SQLserver, doesn't scale, hours to converge
mid 2013: noSQL riak: available, siblings; ad-hoc merge (hard!)
bet365
CRDT experience ≥ Jan. 2014; in anger ≥ Dec. 2014
ORSWOT add-remove set
• Add, remove element; scan for similar
• 100s Gb
Transformational: “CRDTs saved the day”
• Correct by construction
• Stable; partitions fixed quickly, correctly
Future wish list: “Extra guarantees … without impacting availability.”
CRDT Set design space
Many Set operations commute: add(e) / add(f), add(e) / rm(f), etc.
Non-commuting pair: add(e) / rm(e)
• sequential consistency
• last writer wins? { (add(e) < rmv(e) ⟹ e ∉ S) ∧ (rmv(e) < add(e) ⟹ e ∈ S) }
• error state? { ⊥ e ∈ S }
• add wins? { e ∈ S }
• remove wins? { e ∉ S }
All deterministic, satisfy conditions
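To make the "add wins" option concrete, here is a minimal Python sketch of a state-based observed-remove set. It is only an illustration under stated assumptions: the class name `AddWinsSet`, the random UUID tags and the tombstone set `removed` are choices made for this example, and this is not the ORSWOT design mentioned for bet365.

```python
import uuid


class AddWinsSet:
    """Sketch of a state-based observed-remove set (hypothetical names)."""

    def __init__(self):
        self.adds = set()     # (element, unique tag) pairs ever added
        self.removed = set()  # pairs whose removal has been observed

    def add(self, e):
        # Every add gets a fresh tag, so a later add is never confused
        # with an earlier one that was removed.
        self.adds.add((e, uuid.uuid4().hex))

    def remove(self, e):
        # Remove only the tags this replica has seen so far; a concurrent
        # add elsewhere carries an unseen tag and therefore survives.
        self.removed |= {(x, t) for (x, t) in self.adds if x == e}

    def contains(self, e):
        return any(x == e for (x, _) in self.adds - self.removed)

    def merge(self, other):
        # Union is commutative, associative and idempotent.
        self.adds |= other.adds
        self.removed |= other.removed


a, b = AddWinsSet(), AddWinsSet()
a.add("TV")
b.merge(a)
a.remove("TV")   # concurrent with ...
b.add("TV")      # ... this add
a.merge(b)
b.merge(a)
print(a.contains("TV"), b.contains("TV"))
```

Both replicas end in the same state and the element is present, which is the `{e ∈ S}` outcome. Swapping the roles of the two sets would give a remove-wins variant.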
Replicated wedding list
[Figure: “Wedding list” replicas holding the wishes TV, Venice, Ski trip and Books]
Ordered list of “wishes” (strings)
• lookup (wish) ⟶ rank
• add (position, wish)
• rm (position)
Position: “after item”
[Figure: linked list between the markers ⊢ and ⊣; items shown: iDrone, TV, laptop, Venice, ski trip, books, World peace]
Each item points to the next one
• add (pos, item): link item after the one at pos
• rm (item): mark as tombstone
• add (pos, item1) || add (pos, item2): deterministic
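The linked-list design above can be sketched in Python as an operation-based list. This is an illustration only, not the authors' implementation. It assumes every operation is delivered exactly once and in causal order. The identifiers (`ReplicatedList`, `HEAD`, the `(counter, replica)` ids) and the tie-break rule "skip items with a larger id" are assumptions made for the example.

```python
class ReplicatedList:
    """Op-based list sketch: items are never unlinked, only tombstoned."""

    HEAD = (0, "")  # id of the start marker; smaller than any real id

    def __init__(self, replica):
        self.replica = replica
        self.clock = 0
        # Each entry is [id, value, is_tombstone]; the start marker is hidden.
        self.items = [[self.HEAD, None, True]]

    def _index(self, ident):
        return next(i for i, item in enumerate(self.items) if item[0] == ident)

    def add(self, pos, value):
        """Insert value after the item whose id is pos; return the op to send."""
        self.clock += 1
        op = ("add", (self.clock, self.replica), pos, value)
        self.apply(op)
        return op

    def rm(self, ident):
        """Mark the item as a tombstone; return the op to send."""
        op = ("rm", ident)
        self.apply(op)
        return op

    def apply(self, op):
        if op[0] == "add":
            _, ident, pos, value = op
            self.clock = max(self.clock, ident[0])
            i = self._index(pos) + 1
            # Deterministic tie-break for concurrent adds at the same pos:
            # skip over items whose id is larger than the new one.
            while i < len(self.items) and self.items[i][0] > ident:
                i += 1
            self.items.insert(i, [ident, value, False])
        else:
            self.items[self._index(op[1])][2] = True

    def values(self):
        return [value for _, value, dead in self.items if not dead]


a, b = ReplicatedList("a"), ReplicatedList("b")
tv = a.add(ReplicatedList.HEAD, "TV")
b.apply(tv)
op_a = a.add(tv[1], "Venice")     # concurrent with ...
op_b = b.add(tv[1], "Ski trip")   # ... this add at the same position
a.apply(op_b)
b.apply(op_a)
print(a.values(), b.values())
```

Because a removed item stays in place as a tombstone, an add that names it as its position can still be applied at any replica.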
Lowering your expectations
[Figure: several replicas of the wish list with items iDrone, World Peace, TV, Ski trip, Books, Laptop; World Peace appears at different positions]
• lookup (wish) ⟶ rank
• add (pos, wish)
• rm (pos)
• mv (wish, pos1, pos2)
  • add (…, pos2); rm (pos1)
• offer (wish)
Lowering your expectations
[Figure: two replicas, each showing World Peace, iDrone, TV, Ski trip, Books, Laptop, World Peace; World Peace appears twice]
• lookup (wish) ⟶ rank
• add (pos, wish)
• rm (pos)
• mv (wish, pos1, pos2)
  • add (…, pos2); rm (pos1)
• offer (wish)
Lowering your expectations
[Figure: three replicas, each showing iDrone, World Peace, TV, Ski trip, Books, Laptop]
• lookup (wish) ⟶ rank
• add (pos, wish)
• rm (pos)
• mv (wish, pos1, pos2)
  • add (…, pos2); rm (pos1)
• offer (wish)
The problem with invariants
Remove specification
{ true } rm(wish) { tombstone(wish) }
Move, offer: maintain uniqueness invariant
{ ¬offered(wish,_) } offer(wish) { offered(wish, red) }
Precondition stable under concurrent updates?
• If so, invariant guaranteed
• Otherwise, all bets are off
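A small Python sketch can show why the precondition of offer is not stable. Each replica checks ¬offered(wish,_) against its own state only, so two concurrent offers both pass, and the uniqueness invariant is broken once the replicas merge. The class `WishList`, the guest names and the union-based merge are assumptions made for this illustration.

```python
class WishList:
    """Sketch: offers kept as a grow-only set of (wish, guest) pairs."""

    def __init__(self):
        self.offers = set()

    def offer(self, wish, guest):
        # Precondition { not offered(wish, _) } evaluated locally, without
        # coordination; a concurrent offer elsewhere can invalidate it.
        if any(w == wish for (w, _) in self.offers):
            return False
        self.offers.add((wish, guest))
        return True

    def merge(self, other):
        self.offers |= other.offers

    def unique(self):
        # Invariant: each wish is offered by at most one guest.
        wishes = [w for (w, _) in self.offers]
        return len(wishes) == len(set(wishes))


a, b = WishList(), WishList()
ok_a = a.offer("TV", "red")    # concurrent with ...
ok_b = b.offer("TV", "blue")   # ... this offer
a.merge(b)
print(ok_a, ok_b, a.unique())
```

By contrast, rm has the precondition { true }, which no concurrent update can falsify, so marking a tombstone is always safe.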
Lessons learned
Availability ⟹ concurrent updates
• Mask their undesirable effects
Backwards compatible
• Same sequential semantics
• Commute ⟹ same concurrent semantics
• otherwise, “close enough”
Maintaining invariants
• Stable preconditions
Numeric Invariants
Many applications need to enforce conditions like: counter ≥ K
E.g.:
• Number of impressions left ≥ 0
• Virtual money in a game ≥ 0
Valter Balegas et al.– NOVA LINCS, DI, FCT, Universidade NOVA de Lisboa @ RICON'15
Numeric invariants
X ≥ 0
Given X = n, there are n rights to execute dec()
Distribute rights among replicas
• Consume rights for dec()
• Create rights on inc()
CRDT-ish
Execute operations locally without coordination
Peer-to-peer synchronisation
Fail if not enough rights exist
Bounded Counter: API
Create(type, value);
Increment(value);
Decrement(value);
Value();
Transfer(to, qty);
Bounded Counter: increment

Replica R1, Increment(10);
  before                       after
  R    r1  r2  r3    U         R    r1  r2  r3    U
  r1    0   0   0    0         r1   10   0   0    0
  r2    0   0   0    0         r2    0   0   0    0
  r3    0   0   0    0         r3    0   0   0    0

Replica R2, Increment(15);
  before                       after
  R    r1  r2  r3    U         R    r1  r2  r3    U
  r1    0   0   0    0         r1    0   0   0    0
  r2    0   0   0    0         r2    0  15   0    0
  r3    0   0   0    0         r3    0   0   0    0

Replica R3, Increment(8);
  before                       after
  R    r1  r2  r3    U         R    r1  r2  r3    U
  r1    0   0   0    0         r1    0   0   0    0
  r2    0   0   0    0         r2    0   0   0    0
  r3    0   0   0    0         r3    0   0   8    0
Bounded Counter: increment

Replica R1:
  R    r1  r2  r3    U
  r1   10   0   0    0
  r2    0   0   0    0
  r3    0   0   0    0

Replica R2:
  R    r1  r2  r3    U
  r1    0   0   0    0
  r2    0  15   0    0
  r3    0   0   0    0

Replica R3:
  R    r1  r2  r3    U
  r1    0   0   0    0
  r2    0   0   0    0
  r3    0   0   8    0
Bounded Counter: decrement

Replica R1, decrement(15);
  R    r1  r2  r3    U
  r1   10   0   0    0
  r2    0   0   0    0
  r3    0   0   0    0

Replica R2, decrement(5);
  before                       after
  R    r1  r2  r3    U         R    r1  r2  r3    U
  r1    0   0   0    0         r1    0   0   0    0
  r2    0  15   0    0         r2    0  15   0    5
  r3    0   0   0    0         r3    0   0   0    0

Replica R3:
  R    r1  r2  r3    U
  r1    0   0   0    0
  r2    0   0   0    0
  r3    0   0   8    0
Bounded Counter: transfer

Replica R1:
  R    r1  r2  r3    U
  r1   10   0   0    0
  r2    0   0   0    0
  r3    0   0   0    0

Replica R2:
  R    r1  r2  r3    U
  r1    0   0   0    0
  r2    0  15   0    5
  r3    0   0   0    0

Replica R3, transfer(r1, 4);
  before                       after
  R    r1  r2  r3    U         R    r1  r2  r3    U
  r1    0   0   0    0         r1    0   0   0    0
  r2    0   0   0    0         r2    0   0   0    0
  r3    0   0   8    0         r3    4   0   8    0
Bounded Counter: transfer

Replica R1:
  R    r1  r2  r3    U
  r1   10   0   0    0
  r2    0   0   0    0
  r3    0   0   0    0

Replica R2:
  R    r1  r2  r3    U
  r1    0   0   0    0
  r2    0  15   0    5
  r3    0   0   0    0

Replica R3:
  R    r1  r2  r3    U
  r1    0   0   0    0
  r2    0   0   0    0
  r3    4   0   8    0
Bounded Counter: merge
Each replica only touches its own line.
Merge by taking max of each cell.

Replica R1:
  R    r1  r2  r3    U
  r1   10   0   0    0
  r2    0   0   0    0
  r3    0   0   0    0

Replica R2, merge(r1, r2);
  R    r1  r2  r3    U
  r1    0   0   0    0
  r2    0  15   0    5
  r3    0   0   0    0

Replica R3:
  R    r1  r2  r3    U
  r1    0   0   0    0
  r2    0   0   0    0
  r3    4   0   8    0
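The Bounded Counter slides can be replayed with a short Python sketch. It is an illustration under assumptions, not the authors' code. The lower bound is fixed at 0 and `Create(type, value)` is omitted. `R[i][i]` holds the increments of replica i, `R[i][j]` the rights that i transferred to j, and `U[i]` the decrements of i. The class name `BoundedCounter` and the helper `local_rights` are hypothetical.

```python
class BoundedCounter:
    """Sketch of a counter that keeps value >= 0 without coordination."""

    def __init__(self, me, replicas):
        self.me = me
        self.R = {i: {j: 0 for j in replicas} for i in replicas}
        self.U = {i: 0 for i in replicas}

    def local_rights(self):
        # Rights created here, plus rights received, minus rights given
        # away, minus rights already consumed by local decrements.
        i = self.me
        received = sum(self.R[j][i] for j in self.R if j != i)
        sent = sum(self.R[i][j] for j in self.R if j != i)
        return self.R[i][i] + received - sent - self.U[i]

    def increment(self, n):
        self.R[self.me][self.me] += n

    def decrement(self, n):
        if n > self.local_rights():
            return False          # fail: not enough rights at this replica
        self.U[self.me] += n
        return True

    def transfer(self, to, n):
        if n > self.local_rights():
            return False
        self.R[self.me][to] += n  # only this replica's own line is written
        return True

    def value(self):
        return sum(self.R[i][i] for i in self.R) - sum(self.U.values())

    def merge(self, other):
        # Cells only ever grow, so the cell-wise max is a safe merge.
        for i in self.R:
            for j in self.R:
                self.R[i][j] = max(self.R[i][j], other.R[i][j])
            self.U[i] = max(self.U[i], other.U[i])


ids = ["r1", "r2", "r3"]
r1, r2, r3 = (BoundedCounter(i, ids) for i in ids)
r1.increment(10)
r2.increment(15)
r3.increment(8)
refused = r1.decrement(15)   # r1 holds only 10 rights
r2.decrement(5)
r3.transfer("r1", 4)
r1.merge(r2)
r1.merge(r3)
print(refused, r1.value(), r1.local_rights())
```

In this sketch a decrement is refused whenever the local replica lacks rights, even if the global value would allow it. Transferring rights between replicas is what lets such a decrement succeed later.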