
Programming Distributed Systems 07: Consistency, Annette Bieniusa



  1. Programming Distributed Systems 07 Consistency Annette Bieniusa AG Softech FB Informatik TU Kaiserslautern Summer Term 2018 Annette Bieniusa Programming Distributed Systems Summer Term 2018 1/ 48

  2. Motivation: One of the most important abstractions in distributed computing is shared state. Problematic: communication is typically slow and/or unreliable; strong consistency, low latency, and availability cannot all be achieved at the same time. All material and graphics in this section are based on material by Sebastian Burckhardt (Microsoft Research) [1].

  3. Consistency in Database Systems: The distributed systems and database communities use the same word, consistency, with different meanings. Distributed systems: “consistency” refers to the observable behaviour of a data store. Databases: roughly the same concept is called “isolation”, whereas the term “consistency” refers to the property that application code is sequentially safe (the C in ACID).

  4. “Single-Value Register”: Operations rd() → v and wr(v) → ok. System architecture: (figure not preserved).

  5. Implementation 1: Single-copy Register. A single replica holds the shared register; all read and write requests are forwarded to it.

  6. Implementation 2: Epidemic Register. Each replica stores a timestamped value. Reads return this value; writes update this value, stamped with the current time (e.g. a logical clock). At random times, replicas send their stored timestamped value to random recipients. When receiving a timestamped value, a replica replaces its locally stored value if the incoming timestamp is later.
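
The epidemic register described on this slide can be sketched in a few lines of Python. This is a minimal in-process simulation, not the lecture's implementation: the class name `Replica`, the helper `gossip_round`, and the choice of a `(logical clock, replica id)` pair as timestamp (the id only breaks ties) are assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Replica:
    """One replica of the epidemic register (hypothetical class name)."""
    replica_id: int
    value: object = None      # None stands in for the initial 'undef' value
    stamp: tuple = (0, 0)     # assumed format: (logical clock, replica id)
    clock: int = 0            # Lamport-style logical clock

    def rd(self):
        # Reads return the locally stored value without any communication.
        return self.value

    def wr(self, v):
        # Writes overwrite the local value, stamped with the current logical time.
        self.clock += 1
        self.value, self.stamp = v, (self.clock, self.replica_id)
        return "ok"

    def receive(self, value, stamp):
        # Advance the logical clock, then keep the incoming value
        # only if its timestamp is later than the stored one.
        self.clock = max(self.clock, stamp[0])
        if stamp > self.stamp:
            self.value, self.stamp = value, stamp


def gossip_round(replicas, rng):
    # A randomly chosen replica sends its timestamped value to a random recipient.
    sender, recipient = rng.sample(replicas, 2)
    recipient.receive(sender.value, sender.stamp)


rng = random.Random(0)
replicas = [Replica(i) for i in range(3)]
replicas[0].wr(2)                 # stamped (1, 0)
replicas[1].wr(8)                 # stamped (1, 1), which is later than (1, 0)
stale = replicas[2].rd()          # replica 2 has not received any update yet
for _ in range(200):
    gossip_round(replicas, rng)
final_values = [r.rd() for r in replicas]
```

Before any gossip, replica 2 still reads the initial value, which a single-copy register would not allow after a completed write. After enough gossip rounds, all replicas hold the value with the latest timestamp.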

  7. Question: Can clients observe a difference between the two implementations (single-copy vs. epidemic)? Assumptions: asynchronous communication; fairness of transport; “randomly” generated values.

  8. Notions of consistency. Single-copy register: linearizability. Epidemic register: sequential consistency. When generalized to a key-value store, the epidemic variant guarantees eventual consistency (if a randomly selected tuple is sent in each message) or causal consistency (if all tuples are sent in each message).

  9. Consistency model. Required for any type of storage (system) that processes more than one operation at a time. Unless the consistency model is linearizability (= single-copy semantics), applications observe non-sequential behaviors, called anomalies. The set of possible behaviors, and conversely of possible anomalies, constitutes the consistency model of the data store.

  10. Consistency specifications

  11. What is a replicated shared object / service? Different names and examples: REST service, file system, key-value store, counters, registers, ... Formally specified by a set of operations Op and either a sequential semantics S or a concurrent semantics F.

  12. Sequential semantics: $S : Op \times Op^{*} \to Val$, where the first argument is the operation to be performed, the second argument is the sequence of all prior operations (“current state”), and the result is the returned value. Example (register): $S(rd, \epsilon) = \mathit{undef}$ (read returns initial value); $S(rd, wr(2) \cdot wr(8)) = 8$ (read returns last value written); $S(wr(3), rd \cdot wr(2) \cdot wr(8)) = ok$ (write always returns ok).
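
The register's sequential semantics can be written directly as a function of the operation and the sequence of prior operations. The sketch below is one possible encoding, assuming operations are represented as tuples `("rd",)` and `("wr", v)` and the initial value as the string `"undef"`; these representations are illustrative choices, not part of the slides.

```python
UNDEF = "undef"   # assumed encoding of the register's initial value

rd = ("rd",)      # assumed encoding of a read operation


def wr(v):
    # Assumed encoding of a write operation carrying value v.
    return ("wr", v)


def S(op, prior):
    """Sequential semantics of a register.

    op    -- the operation to be performed
    prior -- the sequence of all prior operations (the "current state")
    """
    if op[0] == "wr":
        return "ok"                          # a write always returns ok
    if op[0] == "rd":
        writes = [o for o in prior if o[0] == "wr"]
        # A read returns the last value written, or the initial value.
        return writes[-1][1] if writes else UNDEF
    raise ValueError(f"unknown operation: {op!r}")


result_empty = S(rd, [])                     # read on the empty sequence
result_read = S(rd, [wr(2), wr(8)])          # read after two writes
result_write = S(wr(3), [rd, wr(2), wr(8)])  # write after a read and two writes
```

The three results correspond to the three example equations on the slide: the initial value, the last value written (8), and `ok`.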

  13. Histories. A history records all the interactions between clients and the system: operations performed; an indication whether each operation successfully completed, and its return value; relative order of concurrent operations; session of an operation (corresponds to client / connection).

  14. Classically, histories are represented as sequences of calls and returns [2]. ⇒ Generalize this to event graphs.


  23. Event graphs. An event graph represents an execution of a system. Vertices: events. Attributes: labels for vertices with information on the corresponding event (e.g. which operation, parameters, return values). Relations: orderings or groupings of events. Definition: An event graph $G$ is a tuple $(E, d_1, \ldots, d_n)$ where $E \subseteq Events$ is a finite or countably infinite set of events, and each $d_i$ is an attribute or relation over $E$.

  24. Histories as event graphs. A history is an event graph $(E, op, rval, rb, ss)$ where $op : E \to Op$ associates an operation with each event; $rval : E \to Values \cup \{\nabla\}$ gives the return values ($\nabla$ denotes that the operation never returns); $rb$ is the returns-before order; $ss$ is the same-session relation.
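
To make the definition concrete, the sketch below builds a small history as an event graph in Python. It assumes that each event carries a call time and a return time, from which the returns-before order is derived, and that the same-session relation is an equivalence relation (hence reflexive). The field names, the timestamps, and the marker `NEVER` for the never-returns value are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional

NEVER = "∇"  # assumed marker for an operation that never returns


@dataclass(frozen=True)
class Event:
    eid: str                 # event identifier (hypothetical field names)
    session: str             # session (client / connection) issuing the operation
    op: tuple                # the operation, e.g. ("wr", 1) or ("rd",)
    rval: object             # return value, or NEVER
    call: float              # assumed call time
    ret: Optional[float]     # assumed return time; None if the operation never returns


def returns_before(events):
    """rb: a is related to b iff a returned before b was called."""
    return {(a.eid, b.eid) for a in events for b in events
            if a.ret is not None and a.ret < b.call}


def same_session(events):
    """ss: relates events issued in the same session."""
    return {(a.eid, b.eid) for a in events for b in events
            if a.session == b.session}


events = [
    Event("a", "s1", ("wr", 1), "ok", call=0, ret=2),
    Event("b", "s2", ("rd",), 1, call=1, ret=3),
    Event("c", "s1", ("rd",), 1, call=4, ret=5),
    Event("d", "s2", ("wr", 7), NEVER, call=6, ret=None),
]
rb = returns_before(events)
ss = same_session(events)
```

Events `a` and `b` overlap in time, so neither returns before the other; they are unordered in `rb`. Event `d` never returns, so it has no successors in `rb`.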

  25. Hands-on: Timeline diagram vs. event graph


  27. When is a history valid? Common approach: require linearizability. Insert linearization points between the begin and end of each operation. The semantics of operations must hold with respect to these linearization points. Linearization points serve as justification / witness for a history. Here: consistency semantics beyond linearizability!
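
The idea of linearization points as a witness can be illustrated with a brute-force check for a register. The sketch assumes that every operation has completed, that a history is given as a list of `(op, rval, call_time, return_time)` tuples, and that operations are encoded as `("rd",)` and `("wr", v)`; all of these are illustrative choices. It tries every total order and is exponential in the number of operations, so it is only suitable for tiny examples.

```python
from itertools import permutations


def register_result(op, prior):
    # Sequential register semantics: writes return "ok",
    # reads return the last value written (or "undef" initially).
    if op[0] == "wr":
        return "ok"
    writes = [o for o in prior if o[0] == "wr"]
    return writes[-1][1] if writes else "undef"


def is_linearizable(history):
    """Search for a witness order of a history of completed operations.

    history -- list of (op, rval, call_time, return_time) tuples (assumed format)
    """
    for order in permutations(history):
        # The witness must respect real time: an operation that returned
        # before another was called may not be placed after it.
        respects_time = all(
            not (later[3] < earlier[2])
            for i, earlier in enumerate(order)
            for later in order[i + 1:]
        )
        if not respects_time:
            continue
        # Every return value must match the sequential semantics.
        prior, matches = [], True
        for op, rval, _, _ in order:
            if register_result(op, prior) != rval:
                matches = False
                break
            prior.append(op)
        if matches:
            return True
    return False


# The read overlaps the write and already sees the new value: a witness exists.
overlapping = [(("wr", 1), "ok", 0, 3), (("rd",), 1, 1, 2)]
# The read starts after the write returned but still sees the initial value.
stale_read = [(("wr", 1), "ok", 0, 1), (("rd",), "undef", 2, 3)]

overlapping_ok = is_linearizable(overlapping)
stale_read_ok = is_linearizable(stale_read)
```

The second history is the kind of behavior an epidemic register can produce and a single-copy register cannot.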

  28. Specifying the Consistency Semantics. History: defines what client interaction is observable. Specification: is a “test” on histories. But how do we specify such a “test” / predicate? Execution: is an account of what happened when executing the implementation. Operational consistency model: provides an abstract reference implementation whose behaviors provide the specification; well-studied methodology for proving correctness (e.g. simulation relations or refinement). Problem: typically close to a specific concrete implementation technique.

  29. Specifying the Consistency Semantics. Abstract execution: an account of the “essence” of what happened; applicable to many implementations. Correctness criterion: a history is valid if it is consistent with an abstract execution satisfying some consistency guarantees. Concrete execution: an account of what happened when executing a particular actual implementation. Axiomatic consistency model: uses logical conditions to define valid behaviors; allows combining different aspects (here: consistency guarantees).

  30. Decomposing abstract executions. The essence of what happened can be traced back to two basic responsibilities of the underlying protocol: 1. Update propagation: all operations must eventually become visible everywhere. 2. Conflict resolution: conflicting operations must be arbitrated consistently.

  31. Visibility. Relation that determines the subset of operations “visible” to an operation; reflects the relative timing of update propagation and operations. $a \xrightarrow{vis} b$: the effect of operation a is visible to the client performing b. Updates are concurrent if they are not ordered by visibility (i.e. if they cannot see each other).

  32. Arbitration. Used for resolution of update conflicts (i.e. concurrent updates that do not commute). $a \xrightarrow{ar} b$: total order on operations. Often solved in practice by using timestamps.
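
As a small illustration of visibility and arbitration, the sketch below tests whether two updates are concurrent and arbitrates updates by timestamp. It assumes that visibility is given as a set of pairs and that each update carries a `(timestamp, replica id)` pair, with the replica id used only to break ties; the function names and the example data are hypothetical.

```python
def concurrent(a, b, vis):
    # Two updates are concurrent if neither is visible to the other.
    return (a, b) not in vis and (b, a) not in vis


def arbitrate(stamps):
    """Return update ids in arbitration order.

    stamps -- dict mapping update id to (timestamp, replica id); assumed format.
    Sorting by this pair yields a total order as long as the pairs are unique.
    """
    return sorted(stamps, key=lambda u: stamps[u])


vis = {("a", "c")}                      # only a's effect is visible to c
stamps = {"a": (1, "r1"), "b": (1, "r2"), "c": (2, "r1")}

a_b_concurrent = concurrent("a", "b", vis)
a_c_concurrent = concurrent("a", "c", vis)
order = arbitrate(stamps)
```

Updates `a` and `b` cannot see each other and carry the same clock value, so the replica id decides their relative position in the arbitration order.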

  33. Abstract Executions. An abstract execution is an event graph $(E, op, rval, rb, ss, vis, ar)$ such that $(E, op, rval, rb, ss)$ is a history, $vis$ is acyclic, and $ar$ is a total order.
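
The two structural conditions on `vis` and `ar` can be checked mechanically for a finite set of events. The sketch below assumes that relations are represented as sets of pairs over the event set and that "total order" is read as a strict total order (irreflexive, transitive, and relating every pair of distinct events in exactly one direction); the function names are hypothetical, and the check that the first five components form a history is left out.

```python
def is_acyclic(events, rel):
    # Depth-first search; reaching an event that is still on the stack means a cycle.
    successors = {e: [b for (a, b) in rel if a == e] for e in events}
    state = {}  # 1 = on the DFS stack, 2 = fully explored

    def visit(e):
        if state.get(e) == 1:
            return False
        if state.get(e) == 2:
            return True
        state[e] = 1
        ok = all(visit(n) for n in successors[e])
        state[e] = 2
        return ok

    return all(visit(e) for e in events)


def is_strict_total_order(events, rel):
    irreflexive = all((e, e) not in rel for e in events)
    transitive = all((a, d) in rel
                     for (a, b) in rel for (c, d) in rel if b == c)
    # Exactly one direction must hold for every pair of distinct events.
    total = all(((a, b) in rel) != ((b, a) in rel)
                for a in events for b in events if a != b)
    return irreflexive and transitive and total


def has_valid_vis_and_ar(events, vis, ar):
    return is_acyclic(events, vis) and is_strict_total_order(events, ar)


events = {"w1", "w2", "r"}
vis = {("w1", "r")}
ar = {("w1", "w2"), ("w2", "r"), ("w1", "r")}

valid = has_valid_vis_and_ar(events, vis, ar)
cyclic_vis = has_valid_vis_and_ar(events, {("w1", "r"), ("r", "w1")}, ar)
partial_ar = has_valid_vis_and_ar(events, vis, {("w1", "w2"), ("w2", "r")})
```

The second call fails because visibility contains a cycle, and the third because the arbitration relation leaves `w1` and `r` unordered.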

