Verteilte Systeme (Distributed Systems) Karl M. Göschka Karl.Goeschka@tuwien.ac.at http://www.infosys.tuwien.ac.at/teaching/courses/VerteilteSysteme/
Lecture 6: Clocks and Agreement
- Synchronization of physical clocks
- Logical clocks and ordering
- Distributed mutual exclusion
- Election
- Global state
Clock Synchronization
When each machine has its own clock, an event that occurred after another event may nevertheless be assigned an earlier time. Time is so basic to the way people think!
Physical Clocks (1)
[Figure: Computation of the mean solar day.]
Time and Clocks
- Historically, time has been measured astronomically: the solar day (transit of the sun) and the solar second, defined as 1/86400 of a solar day.
- Earth's rotation is not constant (core turbulence) and is slowing down (tidal friction, atmospheric drag), which leads to the mean solar second (GMT).
- Atomic second: 9,192,631,770 transitions of cesium-133, the basis of International Atomic Time (TAI) at the BIH.
- Coordinated Universal Time (UTC): the UTC second equals the TAI second, but leap seconds keep UTC in phase with solar time.
Physical Clocks (2)
[Figure: TAI seconds, solar seconds, and UTC seconds side by side, with a leap second inserted into UTC.]
TAI seconds are of constant length, unlike solar seconds. Leap seconds are introduced when necessary to keep UTC in phase with the sun.
Timer
- A timer is a counter that counts clock ticks.
- Crystal oscillator
- Battery-backed CMOS RAM (initial setting)
- Clock offset, skew, drift (definitions differ in the literature!)
- UTC is provided, e.g., by the National Institute of Standards and Technology (NIST): WWV, GEOS, GPS, ...
- Real-time systems need actual clock time:
  - synchronize with real-world time (external)
  - synchronize with each other (internal)
Clock Synchronization Algorithms
[Figure: The relation between clock time and UTC when clocks tick at different rates.]
The maximum drift rate determines the required re-synchronization interval.
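The link between drift rate and re-synchronization interval can be made concrete with a small calculation. The sketch below assumes the usual worst case in which two clocks each drift by at most a rate rho (seconds per second) in opposite directions, so they diverge by up to 2 * rho per second. The function name and the numbers are illustrative assumptions, not values from the lecture.

```python
def resync_interval(max_skew_s: float, max_drift_rate: float) -> float:
    """Longest time between synchronizations that keeps two clocks within max_skew_s.

    Assumption: each clock drifts by at most max_drift_rate (seconds per second),
    possibly in opposite directions, so the pair diverges at up to 2 * max_drift_rate.
    """
    if max_skew_s <= 0 or max_drift_rate <= 0:
        raise ValueError("max_skew_s and max_drift_rate must be positive")
    return max_skew_s / (2 * max_drift_rate)


# Hypothetical numbers: drift bound of 1e-5 s/s, allowed skew of 1 ms.
interval = resync_interval(max_skew_s=1e-3, max_drift_rate=1e-5)
print(f"re-synchronize at least every {interval:.1f} s")
```

A tighter skew bound or a worse oscillator shortens the interval proportionally.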
Network Time Protocol (1)
[Figure: Getting the current time from a time server. Labels: T2' = T2 - θ, T3' = T3 - θ.]
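The offset θ in the figure is usually estimated from four timestamps. The sketch below assumes the common convention that the client sends at T1, the server receives at T2 and replies at T3, and the client receives at T4, with T1 and T4 read from the client clock and T2 and T3 from the server clock. It also assumes the delay is roughly symmetric. The function name is hypothetical.

```python
def ntp_offset_and_delay(t1: float, t2: float, t3: float, t4: float):
    """Estimate the server-minus-client clock offset and the round-trip delay.

    t1: client send time (client clock)    t2: server receive time (server clock)
    t3: server send time (server clock)    t4: client receive time (client clock)
    Assumption: propagation delay is about the same in both directions.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


# Hypothetical exchange: client is 5 units behind, 2 units each way, 1 unit server processing.
offset, delay = ntp_offset_and_delay(t1=100, t2=107, t3=108, t4=105)
print(offset, delay)
```

If the two directions have different delays, the error in the offset estimate is bounded by half the round-trip delay, which is why low-delay samples are preferred.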
Network Time Protocol (2)
- Time must never run backward.
- All nodes adjust (advance or slow down) their clocks locally.
- Estimate or measure the propagation delay.
- Estimate the offset and compute the accuracy.
- Take the best (minimum delay) of eight measurements.
- Use multiple sources to improve accuracy.
- Hierarchical precision (strata)
- ~ms (WAN), ~µs (LAN), ~ns (with hardware support, e.g., IEEE 1588)
- Security?
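Selecting the minimum-delay sample out of the most recent eight can be sketched as follows. The class name `SampleFilter` and its methods are hypothetical, and this is only an illustration of the selection rule, not the filter of any particular NTP implementation.

```python
from collections import deque


class SampleFilter:
    """Keeps the most recent (offset, delay) samples and picks the lowest-delay one."""

    def __init__(self, window: int = 8):
        # Older samples fall out automatically once the window is full.
        self.samples = deque(maxlen=window)

    def add(self, offset: float, delay: float) -> None:
        self.samples.append((offset, delay))

    def best(self):
        """Return the (offset, delay) pair with the smallest round-trip delay."""
        if not self.samples:
            raise ValueError("no samples yet")
        return min(self.samples, key=lambda sample: sample[1])
```

The sample with the smallest delay is the one least affected by queueing, so its offset estimate is taken as the most trustworthy.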
Network Time Protocol (3)
[Figure: NTP precision levels, Stratum 0 through Stratum 3.]
Attacking time synchronization
The Berkeley Algorithm
a) The time daemon asks all the other machines for their clock values.
b) The machines answer.
c) The time daemon tells everyone how to adjust their clock.
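Steps a) to c) can be sketched in a few lines. The sketch assumes the daemon simply averages all reported values including its own and ignores message delays. The function name and the sample values (minutes since midnight) are illustrative assumptions.

```python
def berkeley_adjustments(daemon_time: float, reported_times: list) -> list:
    """Return the clock adjustment for each machine, daemon first.

    Assumption: the daemon averages its own clock value with all reported
    values and ignores the time the messages spend in transit.
    """
    all_times = [daemon_time] + list(reported_times)
    average = sum(all_times) / len(all_times)
    # A positive value means "advance your clock", a negative one "slow down".
    return [average - t for t in all_times]


# Hypothetical clocks: daemon at 3:00, two machines at 3:25 and 2:50.
print(berkeley_adjustments(180, [205, 170]))
```

Note that the result is an internal agreement only: the average need not match UTC.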
Clock Synchronization in Wireless (1)
- E.g., sensor networks
- Nodes are resource constrained.
- Multihop routing is expensive.
- Optimize algorithms for energy consumption.
- RBS: Reference Broadcast Synchronization
  - internal synchronization (no absolute clock)
  - only receivers synchronize (based on receipt of a reference message)
  - signal propagation time is roughly constant (without multihop routing)
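Because only receivers synchronize, two receivers can estimate their mutual offset from the local times at which each received the same reference broadcasts. The sketch below assumes the offset is taken as the plain average of the differences in receipt times. The function name and the sample values are hypothetical.

```python
def rbs_offset(receipts_p: list, receipts_q: list) -> float:
    """Estimate the clock offset of receiver p relative to receiver q.

    receipts_p[k] and receipts_q[k] are the local times at which p and q
    received the k-th reference broadcast.
    Assumption: propagation time to both receivers is about the same.
    """
    if not receipts_p or len(receipts_p) != len(receipts_q):
        raise ValueError("need the same non-zero number of receipts from both nodes")
    differences = [tp - tq for tp, tq in zip(receipts_p, receipts_q)]
    return sum(differences) / len(differences)


# Hypothetical receipt times for three reference broadcasts.
print(rbs_offset([10.0, 20.5, 30.0], [8.0, 18.0, 28.5]))
```

Averaging over several broadcasts smooths out variations in how quickly each receiver timestamps an incoming message.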
Clock Synchronization in Wireless (2)
[Figure: (a) The usual critical path in determining network delays. (b) The critical path in the case of RBS.]
Time vs. Order (logical time)
- Synchronous system: algorithms are easier to model, but clock synchronization is needed.
- Asynchronous system: today's reality, but many design problems cannot be solved with deterministic algorithms.
- However, often neither a global clock nor clock synchronization is needed: it is sufficient to agree on the order of events (logical clocks). Time is relative, anyway.
- Then, some events are ordered and some are "concurrent" (partial order).
Making clocks move forward
[Figure: two panels labeled "This situation must be prevented" and "Fixed!".]
In many cases, wall clock time does not matter. All we care about is relative time. (L. Lamport)
(This is not true in some real-time systems.)
Happened-before (1)
Logical clocks are defined on the basis of the happened-before relation, which orders events sequentially in a distributed system:
- Events in one process are ordered (local clock).
- A message send happens before the corresponding message receive.
- Happened-before is transitive.
- Events that are not ordered are concurrent (partial ordering).
- Similar to physical causality, therefore also called potential causal ordering.
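Happened-before (2)
[Figure: processes p1 (events a, b), p2 (events c, d), and p3 (events e, f) along physical time, with message m1 from p1 to p2 and message m2 from p2 to p3.]
- Feynman (space-time) diagrams document causality.
- The relationship is transitive: a happened-before f.
- It imposes a partial order (not total): a → b → c → d → f
- e || (a, b, c, d), but e → f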
Logical clock implementation
Captures the happened-before ordering numerically (Lamport timestamps). Each node keeps a counter (LC):
1. Increment LC before each event (computation, send, receive).
2. On message send, piggyback LC.
3. On message receive, set the local LC to max(local LC, received LC) (time can only move forward), and then apply rule 1 for the receipt (+1).
A total order is obtained by adding the process ID.
a → b ⇒ L(a) < L(b), but the converse is not true!
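The three rules map directly onto a small class. This is a minimal sketch: the class name `LamportClock` and its method names are hypothetical, and message transport is left out entirely.

```python
class LamportClock:
    """Per-process Lamport counter following rules 1 to 3 above."""

    def __init__(self, process_id: int):
        self.process_id = process_id
        self.time = 0

    def tick(self) -> int:
        # Rule 1: increment before each event.
        self.time += 1
        return self.time

    def send(self) -> int:
        # Rule 2: a send is an event; the returned value is piggybacked on the message.
        return self.tick()

    def receive(self, received_time: int) -> int:
        # Rule 3: never move backward, then count the receipt itself as an event.
        self.time = max(self.time, received_time)
        return self.tick()

    def total_order_key(self):
        # Ties between equal timestamps are broken by the process ID.
        return (self.time, self.process_id)


p1, p2 = LamportClock(1), LamportClock(2)
p1.tick()                     # local event on p1
stamp = p1.send()             # p1 sends a message carrying its timestamp
print(p2.receive(stamp))      # p2's clock jumps past the sender's timestamp
```

Comparing `total_order_key()` tuples gives a total order that is consistent with happened-before, although it also orders events that are in fact concurrent.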
Lamport clocks in middleware
[Figure: The positioning of Lamport's logical clocks in distributed systems.]
Example: Inconsistent replication
- The problem is due to message delays and the lack of global time.
- If (non-commutative) updates arrive in different orders at the two sites, the databases will become inconsistent.
- We could require all messages to arrive at all nodes in the same order (which may be too strong; see also causal ordering).
Synchronizing multicast messages
- Assume data is replicated on several servers.
- Updates to data are performed by clients.
- An update request is multicast to all servers.
- Multicast messages arrive in different orders at different servers.
- How to ensure consistency of data at all servers?
- Order message deliveries at servers...
- Differentiate between receipt and delivery.
Totally-Ordered Multicast
- Clients multicast their updates with a (Lamport) timestamp (FIFO, reliable).
- Upon receipt, the message is put into a local queue ordered by timestamp.
- The server acknowledges receipt of requests by multicast (for total ordering).
- Eventually all processes will have the same copy of the local queue.
- A message that is at the head of the queue and has been acknowledged by all processes is delivered to the server process (the respective ACKs are deleted).
- Updates may not be done in "correct (?) order", but they are done in the same order at all nodes.
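The distinction between receipt and delivery can be sketched as a hold-back queue at one server. The class and method names below are hypothetical. The sketch assumes FIFO, reliable channels as stated above, breaks timestamp ties by sender ID, and leaves out the Lamport clock updates and the network layer.

```python
import heapq


class HoldBackQueue:
    """Local queue of one server: messages are received first, delivered later."""

    def __init__(self, group):
        self.group = frozenset(group)   # IDs of all processes that must acknowledge
        self.heap = []                  # entries: (timestamp, sender, payload)
        self.acks = {}                  # (timestamp, sender) -> set of acknowledging IDs

    def on_update(self, timestamp, sender, payload):
        # Receipt: queue the message ordered by (timestamp, sender).
        heapq.heappush(self.heap, (timestamp, sender, payload))
        self.acks.setdefault((timestamp, sender), set())

    def on_ack(self, timestamp, sender, acker):
        # An acknowledgement may arrive before the message it refers to.
        self.acks.setdefault((timestamp, sender), set()).add(acker)

    def deliverable(self):
        """Deliver head-of-queue messages that every group member has acknowledged."""
        delivered = []
        while self.heap:
            timestamp, sender, payload = self.heap[0]
            if not self.group <= self.acks.get((timestamp, sender), set()):
                break                   # head not fully acknowledged: nothing may pass it
            heapq.heappop(self.heap)
            del self.acks[(timestamp, sender)]
            delivered.append(payload)
        return delivered
```

A fully acknowledged message still waits behind an earlier-stamped message that lacks acknowledgements, which is what makes every server deliver in the same order.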
Vector Clocks - Principle
- Logical clocks order events related by happened-before (→); nothing can be said about unrelated events.
- Problem with Lamport timestamps: L(a) < L(b) does not imply a → b.
- Rather: L(a) < L(b) ⇒ (a → b) or (a || b) (too restrictive).
[Figure: Concurrent message transmission using logical clocks.]
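Vector Clocks - Example
[Figure: vector timestamps along physical time, with message m1 from p1 to p2 and message m2 from p2 to p3.]
- p1: a = (1,0,0), b = (2,0,0)
- p2: c = (2,1,0), d = (2,2,0)
- p3: e = (0,0,1), f = (2,2,2)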
Vector Clocks - Algorithm
1. Initially, V_i[j] = 0 for all j.
2. Before P_i timestamps an event, V_i[i] := V_i[i] + 1.
3. P_i includes V_i in every message it sends.
4. When P_i receives a timestamp t in a message, it sets V_i[j] := max(V_i[j], t[j]) for all j (merge operation), and then applies rule 2 for the receipt.
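The four rules can be sketched as a small class. The names `VectorClock`, `tick`, `send`, and `receive` are hypothetical, process indices are 0-based here, and message transport is left out.

```python
class VectorClock:
    """Vector clock of process i in a system of n processes (0-based index)."""

    def __init__(self, n: int, i: int):
        self.i = i
        self.v = [0] * n                      # Rule 1: all entries start at 0.

    def tick(self) -> tuple:
        self.v[self.i] += 1                   # Rule 2: increment own entry per event.
        return tuple(self.v)

    def send(self) -> tuple:
        return self.tick()                    # Rule 3: the returned vector travels with the message.

    def receive(self, t) -> tuple:
        # Rule 4: component-wise maximum (merge), then count the receipt as an event.
        self.v = [max(own, other) for own, other in zip(self.v, t)]
        return self.tick()


# Three processes exchanging two messages, p1 -> p2 -> p3.
p1, p2, p3 = VectorClock(3, 0), VectorClock(3, 1), VectorClock(3, 2)
a = p1.tick()
b = p1.send()          # message m1
c = p2.receive(b)
d = p2.send()          # message m2
e = p3.tick()
f = p3.receive(d)
print(a, b, c, d, e, f)
```

The run uses the same pattern of events as a three-process exchange with one local event on p3 before it receives m2.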
Vector Clocks - Usage
- V_i[i] is the number of events P_i has timestamped.
- V_i[j] (j ≠ i) is the number of events that occurred at P_j on which P_i may causally depend.
- Comparison of vector clocks:
  - V = V' iff V[j] = V'[j] for all j
  - V ≤ V' iff V[j] ≤ V'[j] for all j
  - V < V' iff V ≤ V' and V ≠ V'
- Now, V(a) < V(b) ⇔ a → b.
- Disadvantage: more storage and message payload (optimizations exist).
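The comparison rules translate into three short predicates. The function names are hypothetical, and timestamps are represented as plain tuples of equal length.

```python
def leq(v, w) -> bool:
    """V <= W iff every component of V is <= the matching component of W."""
    return all(a <= b for a, b in zip(v, w))


def happened_before(v, w) -> bool:
    """V < W iff V <= W and V != W."""
    return leq(v, w) and tuple(v) != tuple(w)


def concurrent(v, w) -> bool:
    """Two distinct timestamps are concurrent if neither is less than the other."""
    return tuple(v) != tuple(w) and not leq(v, w) and not leq(w, v)


print(happened_before((1, 0, 0), (2, 2, 2)))   # causally ordered pair
print(concurrent((0, 0, 1), (2, 2, 0)))        # neither dominates the other
```

Unlike Lamport timestamps, these predicates can tell a causally ordered pair apart from a concurrent one.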
Causal ordering using vector timestamps
[Figure: vector timestamps at P1, P2, and P3, all starting at (0,0,0) and advancing through (0,1,0), (0,1,1), and (0,2,1).]
Mutual Exclusion
- Coordinate activities, share resources.
- Critical section (monitor, semaphore): locally assisted by the OS, which is in turn assisted by hardware in order to guarantee atomic operations.
- Distributed mutex: based solely on message passing.
  - Safety: at most one process may execute in the critical section at a time.
  - Liveness: requests to enter and exit the critical section eventually succeed (no deadlock, no starvation).
  - Ordering: happened-before (fairness).
A Centralized Algorithm
a) Process 1 asks the coordinator for permission to enter a critical region. Permission is granted.
b) Process 2 then asks permission to enter the same critical region. The coordinator does not reply.
c) When process 1 exits the critical region, it tells the coordinator, which then replies to 2.
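The coordinator's behavior in steps a) to c) can be sketched with a holder field and a FIFO queue of deferred requests. The class and method names are hypothetical, and return values stand in for the reply messages.

```python
from collections import deque


class Coordinator:
    """Grants access to one critical region; waiting requests are served in FIFO order."""

    def __init__(self):
        self.holder = None        # process currently inside the critical region
        self.waiting = deque()    # processes whose request has not been answered yet

    def request(self, pid) -> bool:
        """Return True if permission is granted at once; otherwise queue (no reply)."""
        if self.holder is None:
            self.holder = pid
            return True
        self.waiting.append(pid)
        return False

    def release(self, pid):
        """Called when pid leaves the region; returns the next process granted, if any."""
        if pid != self.holder:
            raise ValueError("only the current holder may release")
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder


coordinator = Coordinator()
print(coordinator.request(1))   # a) process 1 is granted permission
print(coordinator.request(2))   # b) process 2 is queued, no reply yet
print(coordinator.release(1))   # c) process 1 exits, process 2 is granted
```

The FIFO queue gives fairness in arrival order at the coordinator, but the coordinator is a single point of failure and a potential bottleneck.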