Slide 1: DISTRIBUTED SYSTEMS [COMP9243]

Lecture 7 (A): Synchronisation and Coordination, Part 1

➀ Distributed Algorithms
➁ Time and Clocks
➂ Global State
➃ Concurrency Control

Slide 2: DISTRIBUTED ALGORITHMS

Algorithms that are intended to work in a distributed environment

Used to accomplish tasks such as:
➜ Communication
➜ Accessing resources
➜ Allocating resources
➜ Consensus
➜ etc.

Synchronisation and coordination inextricably linked to distributed algorithms
➜ Achieved using distributed algorithms
➜ Required by distributed algorithms

Slide 3: SYNCHRONOUS VS ASYNCHRONOUS DISTRIBUTED SYSTEMS

Timing model of a distributed system

Affected by:
➜ Execution speed/time of processes
➜ Communication delay
➜ Clocks & clock drift

Slide 4

Synchronous Distributed System: Time variance is bounded

Execution: bounded execution speed and time
Communication: bounded transmission delay
Clocks: bounded clock drift (and differences in clocks)

Effect:
➜ Can rely on timeouts to detect failure
✔ Easier to design distributed algorithms
✘ Very restrictive requirements
  • Limit concurrent processes per processor (Why?)
  • Limit concurrent use of network (Why?)
  • Require precise clocks and synchronisation
Slide 5

Asynchronous Distributed System: Time variance is not bounded

Execution: different steps can have varying duration
Communication: transmission delays vary widely
Clocks: arbitrary clock drift

Effect:
➜ Allows no assumption about time intervals
✘ Cannot rely on timeouts to detect failure
✘ Most asynch DS problems hard to solve
✔ Solution for asynch DS is also a solution for synch DS
➜ Most real distributed systems are hybrid synch and asynch

Slide 6: EVALUATING DISTRIBUTED ALGORITHMS

Key Properties:
➀ Safety: Nothing bad happens
➁ Liveness: Something good eventually happens

General Properties:
➜ Performance
  • number of messages exchanged
  • response/wait time
  • delay, throughput: 1 / (delay + execution time)
  • complexity: O()
➜ Efficiency
  • resource usage: memory, CPU, etc.
➜ Scalability
➜ Reliability
  • number of points of failure (low is good)

Slide 7: SYNCHRONISATION AND COORDINATION

Important: Doing the right thing at the right time.

Two fundamental issues:
➜ Coordination (the right thing)
➜ Synchronisation (the right time)

Slide 8: COORDINATION

Coordinate actions and agree on values.

Coordinate Actions:
➜ What actions will occur
➜ Who will perform actions

Agree on Values:
➜ Agree on global value
➜ Agree on environment
➜ Agree on state
Slide 9: SYNCHRONISATION

Ordering of all actions
➜ Total ordering of events
➜ Total ordering of instructions
➜ Total ordering of communication
➜ Ordering of access to resources
➜ Requires some concept of time

Slide 10: MAIN ISSUES

Time and Clocks: synchronising clocks and using time in distributed algorithms
Global State: how to acquire knowledge of the system's global state
Concurrency Control: coordinating concurrent access to resources

Slide 11: TIME AND CLOCKS

Slide 12: TIME

Global Time:
➜ 'Absolute' time
  • Einstein says no absolute time
  • Absolute enough for our purposes
➜ Astronomical time
  • Based on earth's rotation
  • Not stable
➜ International Atomic Time (TAI)
  • Based on oscillations of Cesium-133
➜ Coordinated Universal Time (UTC)
  • Leap seconds
  • Signals broadcast over the world
Slide 13

Local Time:
➜ Relative not 'absolute'
➜ Not synchronised to Global source

Slide 14: USING CLOCKS IN COMPUTERS

Timestamps:
➜ Used to denote at which time an event occurred

Synchronisation Using Clocks:
➜ Performing events at an exact time (turn lights on/off, lock/unlock gates)
➜ Logging of events (for security, for profiling, for debugging)
➜ Tracking (tracking a moving object with separate cameras)
➜ Make (edit on one computer build on another)
➜ Ordering messages

Slide 15: PHYSICAL CLOCKS

Based on actual time:
➜ C_p(t): current time (at UTC time t) on machine p
➜ Ideally C_p(t) = t
✘ Clock differences cause clocks to drift
➜ Must regularly synchronise with UTC

Computer Clocks:
➜ Crystal oscillates at known frequency
➜ Oscillations cause timer interrupts
➜ Timer interrupts update clock

Clock Skew:
➜ Crystals in different computers run at slightly different rates
➜ Clocks get out of sync
➜ Skew: instantaneous difference
➜ Drift: rate of change of skew

Slide 16: SYNCHRONISING PHYSICAL CLOCKS

Internal Synchronisation:
➜ Clocks synchronise locally
➜ Only synchronised with each other

External Synchronisation:
➜ Clocks synchronise to an external time source
➜ Synchronise with UTC every δ seconds

Time Server:
➜ Server that has the correct time
➜ Server that calculates the correct time
Slide 17: BERKELEY ALGORITHM

[Figure: a time daemon and two other machines on a network. (a) The daemon's clock reads 3:00, the other machines read 2:50 and 3:25. (b) The offsets relative to the daemon are 0, -10 and +25. (c) The daemon sends the adjustments +5, +15 and -20, so all clocks read 3:05.]

Accuracy: 20-25 milliseconds

When is this useful?

Slide 18: CRISTIAN'S ALGORITHM

Time Server:
➜ Has UTC receiver
➜ Passive

Algorithm:
➜ Clients periodically request the time
➜ Don't set time backward (Why not?)
➜ Take propagation and interrupt handling delay into account
  • (T1 − T0) / 2
  • Or take a series of measurements and average the delay
➜ Accuracy: 1-10 millisec (RTT in LAN)

What is a drawback of this approach?

Slide 19: NETWORK TIME PROTOCOL (NTP)

Hierarchy of Servers:
➜ Primary Server: has UTC clock
➜ Secondary Server: connected to primary
➜ etc.

Synchronisation Modes:
Multicast: for LAN, low accuracy
Procedure Call: clients poll, reasonable accuracy
Symmetric: between peer servers, highest accuracy

Slide 20

Synchronisation:
➜ Estimate clock offsets and transmission delays between two nodes
➜ Keep estimates for past communication
➜ Choose offset estimate for lowest transmission delay
➜ Also determine unreliable servers
➜ Accuracy 1 - 50 msec
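The delay correction in Cristian's algorithm and the averaging step of the Berkeley algorithm can each be written in a few lines. The following is a minimal sketch rather than a reference implementation: the function names and the millisecond and minute units are assumptions made for illustration, and the Berkeley part ignores transmission delays and faulty clocks.

```python
def cristian_estimate(t0, t1, server_time):
    """Estimate the current time at the moment the server's reply arrives.

    t0: client clock when the request was sent
    t1: client clock when the reply was received
    server_time: time reported by the server in its reply
    Assumes the one-way delay is half the round trip, (T1 - T0) / 2.
    """
    one_way_delay = (t1 - t0) / 2
    return server_time + one_way_delay


def berkeley_adjustments(daemon_clock, other_clocks):
    """Return the adjustment for each clock, time daemon first.

    The daemon averages all clock values (its own included) and tells
    every machine how far to move to reach that average.
    """
    clocks = [daemon_clock] + list(other_clocks)
    average = sum(clocks) / len(clocks)
    return [average - clock for clock in clocks]


# Cristian: request sent at 1000 ms, reply received at 1020 ms,
# server reported 5000 ms, so the estimated one-way delay is 10 ms.
estimate = cristian_estimate(1000, 1020, 5000)

# Berkeley: clock values from the figure, in minutes relative to 3:00
# (daemon 3:00, other machines 2:50 and 3:25).
adjustments = berkeley_adjustments(0, [-10, 25])
print(estimate, adjustments)
```

With the figure's values the adjustments come out as +5, +15 and -20 minutes, matching panel (c). A real client would usually slow down or speed up its clock gradually rather than apply a negative adjustment in one step, which is the point of "Don't set time backward".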
Slide 21: LAMPORT

➜ Safety, Liveness
➜ Logical clocks and vector clocks
➜ Snapshots
➜ Byzantine generals
➜ Paxos consensus
➜ TLA+, LaTeX
➜ Turing Award 2013

Comments about his papers: Google "lamport my writings"

Slide 22: LOGICAL CLOCKS

Event ordering is more important than physical time:
➜ Events (e.g., state changes) in a single process are ordered
➜ Processes need to agree on ordering of causally related events (e.g., message send and receive)

Local ordering:
➜ System consists of N processes p_i, i ∈ {1, ..., N}
➜ Local event ordering →_i:
  If p_i observes e before e′, we have e →_i e′

Global ordering:
➜ Leslie Lamport's happened before relation →
➜ Smallest relation, such that
  1. e →_i e′ implies e → e′
  2. For every message m, send(m) → receive(m)
  3. Transitivity: e → e′ and e′ → e′′ implies e → e′′

Slide 23

The relation → is a partial order:
➜ If a → b, then a causally affects b
➜ We consider unordered events to be concurrent:
  a ↛ b and b ↛ a implies a ∥ b

Example:

[Figure: space-time diagram with events E11, E12, E13, E14 on process P1 and events E21, E22, E23, E24 on process P2, plotted against real time.]

➜ Causally related:
  E11 → E12, E13, E14, E23, E24, ...
  E21 → E22, E23, E24, E13, E14, ...
➜ Concurrent: E11 ∥ E21, E12 ∥ E22, E13 ∥ E23, E11 ∥ E22, E13 ∥ E24, E14 ∥ E23, ...

Slide 24

Lamport's logical clocks:
➜ Software counter to locally compute the happened-before relation →
➜ Each process p_i maintains a logical clock L_i
➜ Lamport timestamp:
  • L_i(e): timestamp of event e at p_i
  • L(e): timestamp of event e at process it occurred at

Implementation:
➀ Before timestamping a local event p_i executes L_i := L_i + 1
➁ Whenever a message m is sent from p_i to p_j:
  • p_i executes L_i := L_i + 1 and sends L_i with m
  • p_j receives L_i with m and executes L_j := max(L_j, L_i) + 1
    (receive(m) is annotated with the new L_j)

Properties:
➜ a → b implies L(a) < L(b)
➜ L(a) < L(b) does not necessarily imply a → b
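The update rules for Lamport's logical clocks translate almost directly into code. The sketch below is illustrative only: the class name LamportClock and its method names are assumptions, and message passing is simulated by handing the sender's timestamp to the receiver.

```python
class LamportClock:
    """Logical clock L_i of one process p_i."""

    def __init__(self):
        self.time = 0

    def local_event(self):
        # Rule 1: L_i := L_i + 1 before timestamping a local event.
        self.time += 1
        return self.time

    def send(self):
        # Rule 2 (sender): L_i := L_i + 1, then L_i travels with the message.
        self.time += 1
        return self.time

    def receive(self, sender_time):
        # Rule 2 (receiver): L_j := max(L_j, L_i) + 1.
        self.time = max(self.time, sender_time) + 1
        return self.time


p1, p2 = LamportClock(), LamportClock()

a = p1.local_event()       # local event on p1
sent = p1.send()           # p1 sends a message m carrying its timestamp
c = p2.local_event()       # local event on p2, concurrent with a and send(m)
received = p2.receive(sent)  # p2 receives m

print(a, sent, c, received)
```

Here a → send(m) → receive(m), and the timestamps increase along that chain as the first property requires. The event c on p2 has a smaller timestamp than send(m) on p1 even though the two are concurrent, which illustrates why L(a) < L(b) does not imply a → b.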