
Virtualized Physical Clocks - PowerPoint PPT Presentation



  1. Virtualized Physical Clocks

  2. What do we use Clocks for? • When did something happen? When will it happen? • This class starts at 3pm • How long does something take? • This class lasts for 1 hour 20 minutes • What happened first and what happened later? • The class started before it ended

  3. Clocks in Distributed Systems • We use clocks for similar things in distributed systems • Take a backup at 5pm/Restore to the backup at 5pm • Take a backup every hour • Ensure that a resource is released by process 1 before process 2 accesses it

  4. Application of Clocks to Order Events • Consider a multi‐version database system • When a new version is created, we add it to the existing versions • A transaction (or the system on behalf of the transaction) can determine which version to read • Each version has a timestamp • Suppose you have a perfectly synchronized clock and a very fast processor • Treat every transaction as if it were instantaneous • Assign a timestamp, say T, to the transaction • Each read and write of the transaction would have time T

  5. Application of Clocks to Order Events • Example: • T1 has timestamp 100 • It reads x, which has versions at times 0, 50, 75, 90 • T1 would read the version at time 90 • It creates a new version of x • The new version would have a timestamp of 100 • T2 has timestamp 110 • Assuming no transactions other than T1 and T2, if T2 reads x, it should read the x written by T1 • Advantages • Determining the state of the system at time 100 is trivial • Read‐only transactions are never aborted • Problems • If T2 ran concurrently with T1 and read x before T1 had written x, aborting T1 or T2 may be necessary • …
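The T1/T2 example can be sketched in a few lines of Python. This is only an illustration under the slide's idealized assumptions (perfectly synchronized clocks, instantaneous transactions); the class `MultiVersionStore` and its method names are hypothetical and do not come from the presentation.

```python
import bisect


class MultiVersionStore:
    """Hypothetical multi-version store: every write adds a timestamped version."""

    def __init__(self):
        self._timestamps = {}  # key -> sorted list of version timestamps
        self._values = {}      # key -> list of values, parallel to _timestamps

    def write(self, key, ts, value):
        stamps = self._timestamps.setdefault(key, [])
        values = self._values.setdefault(key, [])
        idx = bisect.bisect_right(stamps, ts)  # keep versions sorted by timestamp
        stamps.insert(idx, ts)
        values.insert(idx, value)

    def read(self, key, ts):
        """Return the latest version with timestamp <= ts, or None if none exists."""
        stamps = self._timestamps.get(key, [])
        idx = bisect.bisect_right(stamps, ts)  # number of versions visible at ts
        if idx == 0:
            return None
        return self._values[key][idx - 1]


store = MultiVersionStore()
for ts in (0, 50, 75, 90):
    store.write("x", ts, f"x@{ts}")

t1_read = store.read("x", 100)     # T1 (timestamp 100) sees the version at 90
store.write("x", 100, "x@100")     # T1 creates a new version at time 100
t2_read = store.read("x", 110)     # T2 (timestamp 110) sees T1's version
snapshot = store.read("x", 80)     # state of x at time 80
```

Because old versions are never overwritten, a read at any past timestamp is just a lookup, which is why read-only transactions never need to abort in this model.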

  6. But.. • Our clocks are not perfectly synchronized • Problems caused by loosely synchronized clocks • Suppose we have transactions T3 and T4 such that • T3 wrote x • T4 read x • T3 finished before T4 started • Then, T4 must be ordered later than T3 in serialization order • i.e., T4 must read x written by T3 (or some later transaction) • Loose synchronization may, however, permit T3’s timestamp to be higher than T4’s timestamp, which would prevent T4 from reading the value of x written by T3 • To prevent this problem, Google Spanner introduces the notion of commit‐wait • Force T4 to delay, thereby guaranteeing that its timestamp is higher than that of T3
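A small numeric sketch makes the T3/T4 anomaly concrete. The skew values, the real times, and the helpers `local_timestamp` and `read_version` are assumptions chosen for illustration; the delay shown is a simplified stand-in for the waiting idea and is not Spanner's actual protocol.

```python
def local_timestamp(real_time, skew):
    """Timestamp a machine assigns: true time plus that machine's clock skew."""
    return real_time + skew


def read_version(versions, ts):
    """Return the value of the latest version with timestamp <= ts, or None."""
    visible = [v for v in versions if v <= ts]
    return versions[max(visible)] if visible else None


# T3 commits at real time 100 on a machine whose clock runs 5 units fast.
t3_ts = local_timestamp(100, skew=5)
versions = {90: "old x", t3_ts: "x written by T3"}

# T4 starts later (real time 102) on a machine with an accurate clock,
# so its timestamp is lower than T3's even though T3 finished first.
t4_ts = local_timestamp(102, skew=0)
stale_read = read_version(versions, t4_ts)

# Delaying T4 until its local clock has passed T3's timestamp restores the order.
t4_delayed_ts = local_timestamp(106, skew=0)
fresh_read = read_version(versions, t4_delayed_ts)
```

Here `stale_read` is the old version and `fresh_read` is T3's write; the price of correctness in this approach is the time T4 spends waiting.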

  7. Why did this happen? • We anticipated/wanted that temporal dependency would translate into causal dependency • T3 finished before T4 started • We wanted T3 to impact T4 • The notion of causality captures which events can (potentially) affect other events

  8. Causality • Causality (happened before) captures the information flow • Event a happened before b iff • a and b are on the same process and a occurred before b, • a is a send event and b is the corresponding receive event, or • there exists an event c such that • a happened before c and • c happened before b • Lamport’s logical clocks assign a timestamp to each event such that • a happened before b ⇒ l.a < l.b • Vector clocks assign a (vector) timestamp to each event such that • a happened before b ⇔ vc.a < vc.b
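The comparison vc.a < vc.b is a partial order on vectors, not a numeric comparison. A minimal sketch of the standard definition follows; the function names `vc_less` and `concurrent` are illustrative choices, and the vectors are assumed to have one entry per process in a fixed order.

```python
def vc_less(a, b):
    """True if vector timestamp a is strictly less than b.

    Every component of a must be <= the matching component of b,
    and at least one component must be strictly smaller.
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def concurrent(a, b):
    """True if neither timestamp precedes the other (and they differ)."""
    return a != b and not vc_less(a, b) and not vc_less(b, a)


before = vc_less((1, 0, 0), (2, 1, 0))         # first event precedes the second
unrelated = concurrent((1, 0, 0), (0, 1, 0))   # no information flow either way
```

Unlike a scalar logical clock, this comparison can report that two events are concurrent, at the cost of one integer per process in every timestamp.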

  9. Causality (Continued) • Implementation of Logical Clocks • When process j sends message m • l.j = l.j + 1 • l.m = l.j • For receive event where message m is received • l.j = max(l.j, l.m) + 1 • Property of logical clocks • a happened before b ⇒ l.a < l.b • l.a = l.b ⇒ a is concurrent with b • Useful to take a consistent snapshot
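The send and receive rules translate almost directly into code. The sketch below follows the two rules on the slide; the class name `LamportClock` and the choice to start the counter at 0 are assumptions.

```python
class LamportClock:
    """Logical clock for one process j; self.l plays the role of l.j."""

    def __init__(self):
        self.l = 0  # assumed initial value

    def send(self):
        """Send rule: increment l.j and stamp the message with it (l.m = l.j)."""
        self.l += 1
        return self.l

    def receive(self, l_m):
        """Receive rule: l.j = max(l.j, l.m) + 1."""
        self.l = max(self.l, l_m) + 1
        return self.l


p, q = LamportClock(), LamportClock()
m1 = p.send()         # timestamp of the send event on p
r1 = q.receive(m1)    # timestamp of the matching receive event on q
m2 = q.send()         # a later event on q
```

The send happened before the receive, and the timestamps reflect that (`m1 < r1 < m2`), but none of these numbers says anything about wall-clock time.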

  10. How would logical clocks be different? • Given the expected dependency between T3 and T4 • Assign timestamp of T4 to be higher than that of T3 • Waiting not involved since it is a logical clock

  11. Let’s review what we wanted to do with (logical) clocks • When did something happen? When will it happen? • This class starts at 3pm • NO • How long does something take? • This class takes 1 hour 20 minutes • NO • What happened first and what happened later? • The class started before it ended • YES/NO

  12. What is the problem? • Logical clock values carry no information about actual real/physical time

  13. Goals • Problem: Given a distributed system, assign each event e a timestamp l.e, such that 1. e hb f => l.e < l.f 2. Space requirement of l.e is O(1) integers 3. l.e is represented with bounded space 4. l.e is close to pt.e i.e. |l.e – pt.e| is bounded.

  14. Naïve Algorithm • Logical Clocks: • When process j sends message m • l.j = l.j + 1 • l.m = l.j • For receive event where message m is received • l.j = max(l.j, l.m) + 1 • Naïve Algorithm: • When process j sends message m • l.j = l.j + 1 • l.j := max(l.j, pt.j) • l.m = l.j • For receive event where message m is received • l.j = max(l.j, l.m) + 1 • l.j := max(l.j, pt.j)

  15. Naïve Algorithm • Satisfies first two requirements: 1. e hb f => l.e < l.f 2. Space requirement of l.e is O(1) integers • Fails these requirements (we will ignore proof) 1. l.e is represented with bounded space 2. l.e is close to pt.e i.e. |l.e – pt.e| is bounded. • Unbounded drift caused by • l.j := max (l.j+1, pt.j), and • l.j := max(l.j+1, l.m+1, pt.j)

  16. [Figure: example showing that the drift between l and pt can increase in an unbounded fashion]
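The unbounded drift of the naïve algorithm can be reproduced with a short simulation. The scenario is an assumption made for illustration, not the slide's own example: two processes exchange messages in a ping-pong pattern, and their (perfectly synchronized) physical clock advances by only one tick per round trip, so each round trip contains four events.

```python
def naive_send(l, pt):
    """Naive send rule: l.j := max(l.j + 1, pt.j)."""
    return max(l + 1, pt)


def naive_receive(l, l_m, pt):
    """Naive receive rule: l.j := max(l.j + 1, l.m + 1, pt.j)."""
    return max(l + 1, l_m + 1, pt)


def drift_after(rounds):
    """Return l - pt at process A after the given number of round trips."""
    l_a = l_b = 0
    pt = 0
    for _ in range(rounds):
        pt += 1                          # assumed: one physical tick per round trip
        l_a = naive_send(l_a, pt)        # A sends to B
        l_b = naive_receive(l_b, l_a, pt)
        l_b = naive_send(l_b, pt)        # B replies to A
        l_a = naive_receive(l_a, l_b, pt)
    return l_a - pt


drifts = [drift_after(n) for n in (1, 10, 100)]
```

Every event adds at least 1 to l while pt gains only 1 per four events, so in this scenario the gap grows by 3 per round trip and never shrinks.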

  17. Problem with Naïve algorithm • Drift between l.e and pt.e is not bounded • Why is this a problem? • Consider the case where the user wants a snapshot of a database at time t • Since no process knows the precise physical time, the snapshot provided will not be precisely at physical time t • It will be at time t’ (that is hopefully close to t) • If clock skew is ε, the best we can do is to let t’ be in [t‐ε, t+ε]

  18. Algorithm for Hybrid Logical Clocks • Naïve Algorithm: • When process j sends message m • l.j = l.j + 1 • l.j := max(l.j, pt.j) • l.m = l.j • Revised Algorithm: • When process j sends message m • l.j’ = l.j • l.j := max(l.j, pt.j) • If (l.j = l.j’) c.j = c.j + 1 • Else c.j = 0 • l.m = l.j, c.m = c.j

  19. Algorithm for Hybrid Logical Clocks (Continued) • Naïve Algorithm: • Upon receiving m at j • l.j = max(l.j, l.m) + 1 • l.j := max(l.j, pt.j) • Revised Algorithm: • Upon receiving m at j • l.j’ := l.j; • l.j := max(l.j’, l.m, pt.j); • If (l.j = l.j’ = l.m) then c.j := max(c.j, c.m) + 1 • Elseif (l.j’ = l.j) then c.j := c.j + 1 • Elseif (l.j = l.m) then c.j := c.m + 1 • Else c.j := 0
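The revised send and receive rules can be sketched as a small Python class. The rules follow the slides; the class name `HLC`, passing the physical time `pt` in as an argument (rather than reading a system clock), and the initial values of 0 are assumptions made to keep the example deterministic.

```python
class HLC:
    """Hybrid logical clock for one process j: self.l is l.j, self.c is c.j."""

    def __init__(self):
        self.l = 0  # assumed initial values
        self.c = 0

    def send(self, pt):
        """Send rule; pt is the sender's current physical time (pt.j)."""
        l_old = self.l
        self.l = max(l_old, pt)
        # c counts events that share the same l value; reset when l advances.
        self.c = self.c + 1 if self.l == l_old else 0
        return (self.l, self.c)  # (l.m, c.m) carried by the message

    def receive(self, l_m, c_m, pt):
        """Receive rule for a message stamped (l.m, c.m) at local time pt."""
        l_old = self.l
        self.l = max(l_old, l_m, pt)
        if self.l == l_old == l_m:
            self.c = max(self.c, c_m) + 1
        elif self.l == l_old:
            self.c += 1
        elif self.l == l_m:
            self.c = c_m + 1
        else:
            self.c = 0
        return (self.l, self.c)


fast, slow = HLC(), HLC()
msg = fast.send(pt=10)              # sender's physical clock reads 10
recv = slow.receive(*msg, pt=1)     # receiver's physical clock is behind
later = slow.send(pt=2)             # l stays at 10, c keeps counting
caught_up = slow.send(pt=13)        # physical time passes l, so c resets
```

In this run l never moves ahead of the largest physical time seen so far, and c only grows while the local physical clock lags behind l.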

  20. HLC Algorithm • [Figure: example run of the HLC algorithm on four processes (0 to 3). Events are labeled with triples that read as (pt.j, l.j, c.j), for example 10,10,0; 1,10,1; 2,10,2; 2,10,3; 3,10,4; 3,10,5; 4,10,6. Messages carry l.m = 10 with c.m = 0, 2, or 4. Annotations mark the send rule l.j := max(l’.j, pt.j), the receive rule l.j := max(l’.j, l.m, pt.j), the two counter updates c.j := c.m + 1 and c.j := c.j + 1, and a reset of c at events 13,13,0 and 14,14,0.]

  21. Properties of HLC • Logical clock property: • e hb f => (l.e, c.e) < (l.f, c.f) (lexicographical comparison) • |l.f – pt.f| <= ε • pt.e <= l.e <= pt.e + ε • The value c.e is bounded • c.e <= N * (number of events that can be created on a process within ε) • In practice, it is very small (in single digits)

  22. Let’s review what we wanted to do with (hybrid logical) clocks • When did something happen? When will it happen? • The class started at 3pm • Yes. Choose the l value to be within epsilon of 3pm. This is the best we can do anyway. • How long does something take? • The class took 1 hour 20 minutes • Look at the difference between the l values • What happened first and what happened later? • The class started before it ended • Use lexicographic ordering (guarantees consistency with causal order)
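Both answers can be read straight off a pair of HLC timestamps. In this sketch a timestamp is a plain `(l, c)` tuple, which Python already compares lexicographically; the sample values (minutes since midnight) and the helper names are assumptions for illustration.

```python
def ordered_before(ts_e, ts_f):
    """Lexicographic comparison of (l, c) pairs, consistent with causal order."""
    return ts_e < ts_f


def approx_duration(ts_start, ts_end):
    """Difference of the l values; each l is within epsilon of physical time."""
    return ts_end[0] - ts_start[0]


class_start = (900, 0)   # l = 900 minutes after midnight, i.e. about 3pm
class_end = (980, 2)     # l = 980 minutes after midnight

started_first = ordered_before(class_start, class_end)
minutes = approx_duration(class_start, class_end)
tie_broken_by_c = ordered_before((980, 1), (980, 2))
```

Since each l value may be off by up to epsilon, the computed duration is accurate only to within about two epsilon.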

  23. Revisiting Multiversion Database

  24. Review the earlier example • Problems caused by loosely synchronized clocks • Suppose we have transactions T3 and T4 such that • T3 wrote x • T4 read x • T3 finished before T4 started • Then, T4 must be ordered later than T3 in serialization order • i.e., T4 must read x written by T3 (or some later transaction) • Loose synchronization may, however, permit T3’s timestamp to be higher than T4’s timestamp, which would prevent T4 from reading the value of x written by T3 • To prevent this problem, Google Spanner introduces the notion of commit‐wait • Force T4 to delay, thereby guaranteeing that its timestamp is higher than that of T3

  25. Other Choices • Alternate choice • Increase the (physical) time of the machine running T4 • Unacceptable, as it would cause problems for other applications (e.g., sleep function) as well as NTP synchronization • A better choice • Create a new HLC timestamp for T4 that is higher than that of T3 • Leave physical time unchanged • Change the l value of the timestamp (and, if necessary, the c value) • The c value is still bounded
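One way to picture the better choice is to treat T3's commit timestamp as if it had arrived in a message and apply the HLC receive rule on T4's machine. The sketch below does exactly that; the function name `timestamp_after` and all numeric values are hypothetical, and how T4 learns T3's timestamp is left out.

```python
def timestamp_after(local, observed, pt):
    """Return an HLC (l, c) pair greater than both local and observed.

    local    -- current (l, c) of the machine running T4
    observed -- (l, c) timestamp of T3
    pt       -- physical time on T4's machine, which is only read, never changed
    """
    l_old, c_old = local
    l_obs, c_obs = observed
    l_new = max(l_old, l_obs, pt)
    if l_new == l_old == l_obs:
        c_new = max(c_old, c_obs) + 1
    elif l_new == l_old:
        c_new = c_old + 1
    elif l_new == l_obs:
        c_new = c_obs + 1
    else:
        c_new = 0
    return (l_new, c_new)


t3_ts = (105, 0)                   # T3 committed on a machine with a fast clock
t4_ts = timestamp_after(local=(98, 3), observed=t3_ts, pt=100)
```

T4 receives a timestamp above T3's immediately, without waiting and without adjusting the machine's physical clock; only the l and c components move.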

  26. Other Applications of HLC • Causally Consistent Data Store • Rollback on Key‐Value store • Runtime monitoring of partially synchronous distributed systems

  27. Moral Questions?
