Time, and the lack thereof



  1. Distributed Systems, Fall 2009 (5DV020)
     Time and Synchronization

     Outline
     • Introduction
     • Basic definitions
     • Synchronization algorithms
       – Synchronous systems
       – Cristian's algorithm
       – Berkeley algorithm
       – Network Time Protocol
     • Summary

     Time, and the lack thereof
     • A global notion of the correct time would be tremendously useful. Why?
       – Consistency of distributed data, transactions, authenticity checks (ticket lifetimes), duplication detection, distributed debugging and garbage detection, etc.
     • Why do we not have global time?
       – Clocks drift, are inaccurate, may fail arbitrarily, etc.
       – Time is relative, and depends on the observer of the timed events
     • Causal relationships (cause and effect) may not be violated

     Basic definitions
     • The distributed system is P, consisting of N processes: $p_i$, i = 1, 2, …, N
     • Each process has state $s_i$
     • Processes communicate only via message passing (network)
     • Events e occur in processes
       – Internal events
       – Send events
       – Receive events

  2. Basic definitions
     • Events are ordered within a process by the relation $\rightarrow_i$: $e^0 \rightarrow_i e^1 \rightarrow_i e^2$
     • Define a history of $p_i$ as the events as described by $\rightarrow_i$: $history(p_i) = h_i = \langle e_i^0, e_i^1, e_i^2, \ldots \rangle$
     • Clock skew
       – Instantaneous difference between readings of any two clocks
     • Clock drift
       – Variations in how clocks count time (oscillations in a crystal), which cause divergence between clocks
     • Clock drift rate
       – Change in offset between clock and a perfect clock
     • Consumer level clocks: $10^{-6}$ seconds/second, roughly 1 second for each 11.6 days

     Computer clocks
     • Hardware clock $H(t)$
       – Gives “raw” time reading
     • Software clock $C(t) = \alpha H(t) + \beta$
       – Scaled by OS to give accurate time
       – Used for timestamps

     Time sources
     • Coordinated Universal Time (abbreviated UTC, thanks to the French)
       – Atomic clocks
       – Used for synchronization of all kinds of equipment (e.g. your computer, GPS, fancy radio-controlled clocks, etc.)

     Synchronization types
     • External synchronization
       – Processes are synchronized to an external time source (e.g. UTC)
     • Internal synchronization
       – “Correct time” exists only within a group of processes
       – Need not be synchronized to an external source
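The drift-rate figure and the software clock formula from the slides above can be checked with a small Python sketch. The function names are hypothetical, and the drift rate is assumed to be constant over time, which real crystals only approximate.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def days_until_offset(drift_rate: float, offset_seconds: float = 1.0) -> float:
    """Days until a clock with a constant drift rate (in seconds per second)
    has accumulated the given offset from a perfect clock."""
    return offset_seconds / drift_rate / SECONDS_PER_DAY


def software_clock(hardware_reading: float, alpha: float, beta: float) -> float:
    """C(t) = alpha * H(t) + beta: scale and shift the raw hardware reading.
    alpha and beta stand for whatever correction the OS has chosen."""
    return alpha * hardware_reading + beta


if __name__ == "__main__":
    # A consumer-level clock drifting at 1e-6 s/s needs about 11.6 days per second.
    print(round(days_until_offset(1e-6), 1))
    # A hardware counter of 1000 ticks, scaled by 0.5 and shifted by 10.
    print(software_clock(1000.0, alpha=0.5, beta=10.0))
```

Keeping the correction as two parameters mirrors the slide's formula: the OS can adjust the rate (alpha) and the offset (beta) without touching the hardware counter.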

  3. Correctness and monotonicity
     • Correctness (drift is bounded): $(1 - p)(t' - t) \le H(t') - H(t) \le (1 + p)(t' - t)$
       – Forbids “jumps” in hardware clocks beyond the bound p
     • Monotonicity (ever-increasing): $t' > t \Rightarrow C(t') > C(t)$
       – Note: only deals with software clock
       – Simpler, and often sufficient

     Synchronization algorithms
     • Internal synchronization
       – In synchronous systems (trivial case)
       – Berkeley algorithm
     • External synchronization
       – Cristian's algorithm
       – Network Time Protocol (NTP)

     Clock synchronization in synchronous systems (Internal)
     • Synchronous systems define bounds on all relevant parts
       – Clock drift
       – Message transmission delays
       – Process execution step requirements
     • Send request, get response back
     • Only uncertainty is the actual current transmission delay, which lies between min and max: $u = max - min$
       – Set time to (time in response) + min + u/2, that is, (time in response) + (max + min)/2
       – For N processes, optimum bound is $u(1 - 1/N)$

     Cristian's algorithm (External)
     • S is connected to time source
     • p requests ($m_r$) and receives ($m_t$) time
       – S records time t as soon before transmitting the message as possible
       – p knows total round-trip time $T_{round}$
       – Simply set time to $t + T_{round}/2$?
     • Only if on the same LAN! But then, if minimum transmit time ($t_{min}$) is known:
       – Earliest time S could have placed the time in $m_t$ was $t_{min}$ after p dispatched $m_r$, and the latest was $t_{min}$ before p received $m_t$
       – Time at S when $m_t$ arrives is in the range $[t + t_{min},\ t + T_{round} - t_{min}]$
       – Width of range is $T_{round} - 2t_{min}$, so accuracy is $\pm(T_{round}/2 - t_{min})$
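As a minimal sketch of the Cristian's algorithm arithmetic above, the following Python function takes the timestamp t from the server's reply, the measured round-trip time and the known minimum transmit time, and returns the estimate together with its accuracy and the range of possible server times. The function and parameter names are hypothetical, and all values are assumed to be in seconds.

```python
def cristian_estimate(server_time: float, round_trip: float, min_transit: float = 0.0):
    """Return (estimate, accuracy, (earliest, latest)) for Cristian's algorithm.

    server_time: time t that server S placed in its reply
    round_trip:  T_round, measured by the client p on its own clock
    min_transit: t_min, minimum one-way transmit time (0 if unknown)
    """
    if round_trip < 2 * min_transit:
        raise ValueError("round trip cannot be shorter than two minimum transits")
    estimate = server_time + round_trip / 2       # t + T_round / 2
    accuracy = round_trip / 2 - min_transit       # +/- (T_round / 2 - t_min)
    earliest = server_time + min_transit          # t + t_min
    latest = server_time + round_trip - min_transit  # t + T_round - t_min
    return estimate, accuracy, (earliest, latest)


if __name__ == "__main__":
    # Reply carries t = 100 s, round trip took 20 ms, minimum transit is 4 ms.
    estimate, accuracy, bounds = cristian_estimate(100.0, 0.020, 0.004)
    print(estimate, accuracy, bounds)
```

The estimate is the midpoint of the range, which is why the accuracy is half the range width.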

  4. Cristian's algorithm (External)
     • Single point of failure!
     • Crashing server?
       – Multicast to group of servers
     • Fake servers?
       – Establish cryptographic authentication
     • Arbitrarily failing servers?
       – Have enough correct ones to achieve agreement

     Berkeley algorithm (Internal)
     • Uses Cristian's methods
     • Master/Slave relationship
     • Master polls slaves
       – Gets current time in each slave
       – Sends the offset from own time to each slave
     • Master fails?
       – Crash: elect a new one!
       – Arbitrary failure? Oops…

     Network Time Protocol (External)
     • Unlike the others, designed for WAN rather than LAN use
       – Time servers close to the time source are more trusted
       – Redundant paths → survives disconnects
       – Massively scalable
       – Authentication of time servers to avoid propagation of arbitrary failures
     • Synchronization subnets
       – Primary level (stratum) is directly connected to time source
       – Secondary level syncs to primary, tertiary to secondary, etc.
     • High stratum number means less reliable
       – Dynamically reconfigurable: if time source goes down, primary level becomes secondary level
     • Multicast mode
       – “Time is X” between LAN nodes
         • Only as accurate as LAN allows
         • Used only for unimportant nodes
     • Procedure-call mode
       – Similar to Cristian's algorithm
       – More accurate than multicast mode
     • Symmetric mode
       – Pairs of messages
       – Used in lower strata
     • All messages sent over UDP
     • For procedure-call and symmetric mode, messages contain
       – Local times when the previous NTP message between the nodes was sent and received
       – Local time of current message transmission
     • Receiver notes local time when message is received
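The Berkeley polling round described above can be sketched in Python as follows. The slides only say that the master collects the slaves' times and sends each one an offset. This sketch assumes the usual formulation in which the target is the plain average of all clocks, and it ignores the round-trip compensation that the master would apply using Cristian's method. The function name and the process names are hypothetical.

```python
def berkeley_adjustments(master_time: float, slave_times: dict[str, float]):
    """Return (master_adjustment, {slave: adjustment}) for one polling round.

    Assumption: every clock is steered toward the plain average of all
    readings, master included. Each adjustment is the amount the process
    should add to its own clock.
    """
    readings = [master_time, *slave_times.values()]
    average = sum(readings) / len(readings)
    slave_adjustments = {name: average - t for name, t in slave_times.items()}
    return average - master_time, slave_adjustments


if __name__ == "__main__":
    # Clock readings in minutes past midnight: master 3:00, slaves 3:25 and 2:50.
    master_adj, slave_adj = berkeley_adjustments(180.0, {"a": 205.0, "b": 170.0})
    print(master_adj, slave_adj)
```

Sending an adjustment rather than an absolute time means the message delay on the way back to the slave does not distort the correction.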

  5. Network Time Protocol (External)
     • Denote transmission time of m as t, and that of m' as t'
     • Delay in Server B may be non-negligible
     • Messages may be lost along the way
     • For each message pair calculate
       – $o_i$: estimated offset between clocks
       – $d_i$: total transmission time (delay)
     • True offset is denoted o (without the index)
     • The timestamps satisfy
       $T_{i-2} = T_{i-3} + t + o$
       $T_i = T_{i-1} + t' - o$
       which leads to
       $d_i = t + t' = T_{i-2} - T_{i-3} + T_i - T_{i-1}$
       and also
       $o = o_i + (t' - t)/2$, where $o_i = (T_{i-2} - T_{i-3} + T_{i-1} - T_i)/2$
     • Since $t, t' \ge 0$, we know that $o_i - d_i/2 \le o \le o_i + d_i/2$
     • Or, in English: $o_i$ is an estimate of the offset, and $d_i$ is a measure of its accuracy
     • Pairs are retained for quality calculations
     • NTP peers communicate with many other peers, to decrease error

     Summary
     • We do not have universal time
       – But we can synchronize clocks “reasonably well” anyway
     • Internal vs. external synchronization
     • Real-time systems must use more sophisticated algorithms than what we have seen during this lecture!
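The offset and delay formulas for an NTP message pair can be tried out with a short Python sketch. The function name is hypothetical. The four arguments are the timestamps $T_{i-3}$, $T_{i-2}$, $T_{i-1}$ and $T_i$, each read on the local clock of the node where the send or receive happened, and the sample values are made up so that the true offset is known in advance.

```python
def ntp_offset_delay(t_i3: float, t_i2: float, t_i1: float, t_i: float):
    """Return (o_i, d_i) for one message pair.

    t_i3: m sent    (sender's clock)     t_i2: m received  (receiver's clock)
    t_i1: m' sent   (receiver's clock)   t_i:  m' received (sender's clock)
    """
    offset = (t_i2 - t_i3 + t_i1 - t_i) / 2   # o_i
    delay = (t_i2 - t_i3) + (t_i - t_i1)      # d_i = t + t'
    return offset, delay


if __name__ == "__main__":
    # Made-up scenario: true offset o = 5, t = 2, t' = 4 (arbitrary units).
    true_offset, t, t_prime = 5.0, 2.0, 4.0
    t_i3 = 100.0
    t_i2 = t_i3 + t + true_offset        # T_{i-2} = T_{i-3} + t + o
    t_i1 = 110.0
    t_i = t_i1 + t_prime - true_offset   # T_i = T_{i-1} + t' - o
    o_i, d_i = ntp_offset_delay(t_i3, t_i2, t_i1, t_i)
    print(o_i, d_i)
    # The true offset lies within d_i / 2 of the estimate.
    print(o_i - d_i / 2 <= true_offset <= o_i + d_i / 2)
```

In this scenario the estimate misses the true offset by $(t' - t)/2 = 1$, which stays inside the bound $d_i/2 = 3$.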

  6. Summary
     • Algorithms
       – Synchronous system (trivial)
       – Cristian's algorithm
         • Used in many others
       – Berkeley algorithm
         • Master/Slave application of Cristian's for internal synchronization
       – Network Time Protocol
         • Suitable for WANs
         • Message pairs

     Next lecture
     • Logical time
     • Global states
     • Distributed debugging
