19: Distributed Coordination
Last Modified: 7/3/2004 1:50:34 PM

Last Time
• We talked about the potential benefits of distributed systems
• We also talked about some of the reasons they can be so difficult to build
• Today we are going to tackle some of these problems!

Recall
• Distributed systems
  - Components can fail (not fail-stop)
  - Network partitions can occur in which each portion of the distributed system thinks they are the only ones alive
  - Don't have a shared clock
  - Can't rely on hardware primitives like test-and-set for mutual exclusion
  - …

Distributed Coordination
• To tackle this complexity we are going to build distributed algorithms for:
  - Event Ordering
  - Mutual Exclusion
  - Atomicity
  - Deadlock Handling
  - Election Algorithms
  - Reaching Agreement

Event Ordering
• Problem: distributed systems do not share a clock
  - Many coordination problems would be simplified if they did ("first one wins")
• Distributed systems do have some sense of time
  - Events in a single process happen in order
  - Messages between processes must be sent before they can be received
• How helpful is this?

Happens-before
• Define a Happens-before relation (denoted by →).
  - 1) If A and B are events in the same process, and A was executed before B, then A → B.
  - 2) If A is the event of sending a message by one process and B is the event of receiving that message by another process, then A → B.
  - 3) If A → B and B → C then A → C.
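The three happens-before rules can be sketched as reachability in a graph. This is only an illustrative sketch: the helper names `direct_edges` and `happens_before`, and the two-process example with a single message, are assumptions made for the illustration and do not come from the slides.

```python
def direct_edges(process_events, messages):
    """Build the direct edges given by rules 1 and 2."""
    edges = {}
    # Rule 1: consecutive events in the same process are ordered.
    for events in process_events.values():
        for earlier, later in zip(events, events[1:]):
            edges.setdefault(earlier, set()).add(later)
    # Rule 2: sending a message comes before receiving it.
    for send, recv in messages:
        edges.setdefault(send, set()).add(recv)
    return edges


def happens_before(edges, a, b):
    """Rule 3 (transitivity): a -> b if b is reachable from a."""
    stack, seen = [a], {a}
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, ()):
            if nxt == b:
                return True
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


# Hypothetical example: two processes, one message sent at P1 and received at Q2.
process_events = {"P": ["P0", "P1", "P2"], "Q": ["Q0", "Q1", "Q2"]}
messages = [("P1", "Q2")]
edges = direct_edges(process_events, messages)

p0_before_q2 = happens_before(edges, "P0", "Q2")  # ordered via P0 -> P1 -> Q2
q0_before_p2 = happens_before(edges, "Q0", "P2")  # no path
p2_before_q0 = happens_before(edges, "P2", "Q0")  # no path
```

In this example Q0 and P2 are not ordered in either direction, which is why happens-before alone is only a partial ordering.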
Total ordering?
• Happens-before gives a partial ordering of events
• We still do not have a total ordering of events

Partial Ordering
[Figure: event timelines for processes P, Q and R, with messages between them]
• Pi → Pi+1; Qi → Qi+1; Ri → Ri+1
• R0 → Q4; Q3 → R4; Q1 → P4; P1 → Q2

Timestamps
• Assume each process has a local logical clock that ticks once per event and that the processes are numbered
  - Clocks tick once per event (including message send)
  - When you send a message, send your clock value
  - When you receive a message, set your clock to MAX(your clock, timestamp of message + 1)
    • Thus sending comes before receiving
    • The only visibility into actions at other nodes happens during communication; communication synchronizes the clocks
• If the timestamps of two events A and B are the same, then use the process identity numbers to break ties.
• This gives a total ordering!

Total Ordering?
• P0, P1, Q0, Q1, Q2, P2, P3, P4, Q3, R0, Q4, R1, R2, R3, R4
• P0, Q0, Q1, P1, Q2, P2, P3, P4, Q3, R0, Q4, R1, R2, R3, R4
• P0, Q0, P1, Q1, Q2, P2, P3, P4, Q3, R0, Q4, R1, R2, R3, R4

Distributed Mutual Exclusion (DME)
• Problem: We can no longer rely on just an atomic test-and-set operation on a single machine to build mutual exclusion primitives
• Requirement
  - If Pi is executing in its critical section, then no other process Pj is executing in its critical section.

Solution
• We present three algorithms to ensure the mutually exclusive execution of processes in their critical sections.
  - Centralized Distributed Mutual Exclusion (CDME)
  - Fully Distributed Mutual Exclusion (DDME)
  - Token passing
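Since the mutual exclusion algorithms build on timestamps, it may help to see the logical clock rules from the Timestamps slide as code. This is a minimal sketch under one reading of the slide: a receive counts as an event, so the clock ticks and is then pushed past the message timestamp. The class name `LogicalClock` and its methods are hypothetical names chosen for the illustration.

```python
class LogicalClock:
    """Per-process logical clock; stamps are (time, process id) pairs."""

    def __init__(self, pid):
        self.pid = pid
        self.time = 0

    def local_event(self):
        # Clocks tick once per event.
        self.time += 1
        return (self.time, self.pid)

    def send(self):
        # Sending is an event too; the clock value travels with the message.
        self.time += 1
        return (self.time, self.pid)

    def receive(self, message_time):
        # Tick for the receive event, and make sure the result is later
        # than the sender's timestamp, so sending comes before receiving.
        self.time = max(self.time + 1, message_time + 1)
        return (self.time, self.pid)


p = LogicalClock(pid=0)
q = LogicalClock(pid=1)

a = p.local_event()        # event on process 0
b = q.local_event()        # concurrent event on process 1, same clock value
s = p.send()               # process 0 sends a message
r = q.receive(s[0])        # process 1 receives it

# Sorting (time, pid) tuples breaks timestamp ties by process id,
# which yields one total order over all events.
total_order = sorted([r, b, s, a])
```

Events `a` and `b` carry the same clock value, so only the process number decides their relative position in `total_order`.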
CDME: Centralized Approach
• One of the processes in the system is chosen to coordinate the entry to the critical section.
  - A process that wants to enter its critical section sends a request message to the coordinator.
  - The coordinator decides which process can enter the critical section next, and it sends that process a reply message.
  - When the process receives a reply message from the coordinator, it enters its critical section.
  - After exiting its critical section, the process sends a release message to the coordinator and proceeds with its execution.
• 3 messages per critical section entry

Problems of CDME
• Electing the master process? Hardcoded?
• Single point of failure? Electing a new master process?
• Distributed Election algorithms later…

DDME: Fully Distributed Approach
• When process Pi wants to enter its critical section, it generates a new timestamp, TS, and sends the message request(Pi, TS) to all other processes in the system.
• When process Pj receives a request message, it may reply immediately or it may defer sending a reply back.
• When process Pi receives a reply message from all other processes in the system, it can enter its critical section.
• After exiting its critical section, the process sends reply messages to all its deferred requests.

DDME: Fully Distributed Approach (Cont.)
• The decision whether process Pj replies immediately to a request(Pi, TS) message or defers its reply is based on three factors:
  - If Pj is in its critical section, then it defers its reply to Pi.
  - If Pj does not want to enter its critical section, then it sends a reply immediately to Pi.
  - If Pj wants to enter its critical section but has not yet entered it, then it compares its own request timestamp with the timestamp TS.
    • If its own request timestamp is greater than TS, then it sends a reply immediately to Pi (Pi asked first).
    • Otherwise, the reply is deferred.

Problems of DDME
• Requires complete trust that other processes will play fair
  - Easy to cheat just by delaying the reply!
• The processes need to know the identity of all other processes in the system
  - Makes the dynamic addition and removal of processes more complex.
• If one of the processes fails, then the entire scheme collapses.
  - Dealt with by continuously monitoring the state of all the processes in the system.
• Constantly bothering people who don't care
  - Can I enter my critical section? Can I?

Token Passing
• Circulate a token among processes in the system
• Possession of the token entitles the holder to enter the critical section
• Organize processes in the system into a logical ring
  - Pass the token around the ring
  - When you get it, enter the critical section if you need to, then pass it on when you are done (or just pass it on if you don't need it)
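Looking back at the DDME slides, the three-factor reply rule at process Pj can be sketched as a single function. This is an illustrative sketch, not the lecture's code: the names `State` and `should_defer` are hypothetical, and requests are modeled as (timestamp, process id) pairs so that equal timestamps are broken by process number, as in the timestamp scheme.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"        # does not want the critical section
    WANTING = "wanting"  # has sent its own request, not yet entered
    IN_CS = "in_cs"      # currently in the critical section


def should_defer(state, own_request, incoming_request):
    """Return True if Pj should defer its reply to the incoming request.

    Requests are (timestamp, pid) tuples; own_request is only
    consulted when Pj is in the WANTING state.
    """
    if state is State.IN_CS:
        return True    # reply only after leaving the critical section
    if state is State.IDLE:
        return False   # not competing, reply immediately
    # Both want in: the earlier request wins. If our own request is
    # later (greater), the requester asked first, so reply immediately.
    return own_request < incoming_request


in_cs = should_defer(State.IN_CS, None, (5, 1))
idle = should_defer(State.IDLE, None, (5, 1))
we_asked_first = should_defer(State.WANTING, (3, 2), (5, 1))
they_asked_first = should_defer(State.WANTING, (7, 2), (5, 1))
# Equal timestamps: exactly one side defers, thanks to the pid tie-break.
tie_at_p2 = should_defer(State.WANTING, (5, 2), (5, 1))
tie_at_p1 = should_defer(State.WANTING, (5, 1), (5, 2))
```

With the tie-break, two processes that request at the same logical time never both defer and never both reply, so one of them gets all its replies first.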
Compare: Number of Messages?
• CDME: 3 messages per critical section entry
• DDME: The number of messages per critical-section entry is 2 x (n - 1)
  - Request/reply for everyone but myself
• Token passing: Between 0 and n messages
  - Might luck out and ask for the token while I have it or when the person right before me has it
  - Might need to wait for the token to visit everyone else first

Problems of Token Passing
• If the machine holding the token fails, how do we regenerate a new token?
• A lot like electing a new coordinator
• If a process fails, need to repair the break in the logical ring

Compare: Starvation
• CDME: Freedom from starvation is ensured if the coordinator uses FIFO
• DDME: Freedom from starvation is ensured, since entry to the critical section is scheduled according to the timestamp ordering. The timestamp ordering ensures that processes are served in first-come, first-served order.
• Token Passing: Freedom from starvation if the ring is unidirectional
• Caveats
  - Network is reliable (i.e. machines not "starved" by inability to communicate)
  - If machines fail they are restarted or taken out of consideration (i.e. machines not "starved" by nonresponse of coordinator or another participant)
  - Processes play by the rules

Why DDME?
• Harder
• More messages
• Bothers more people
• Coordinator just as bothered

Atomicity
• Recall: Atomicity = either all the operations associated with a program unit are executed to completion, or none are performed.
• A distributed system may have multiple copies of the data; replicas are good for reliability/availability
• PROBLEM: How do we atomically update all of the copies?

Replica Consistency Problem
• Imagine we have multiple bank servers and a client desiring to update their bank account
• How can we do this?
• Allow a client to update any server, then have that server propagate the update to other servers
  - Simple and wrong!
  - Simultaneous and conflicting updates can occur at different servers
• Have the client send the update to all servers
  - Same problem: a race condition. Which of the conflicting updates will reach each server first?
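The race in the replica consistency problem can be shown with a toy example. This is only a sketch under assumed conditions: two clients each send a conflicting "set balance" update to both servers, and the network happens to deliver them in opposite orders. The function name `apply_updates` and the amounts are made up for the illustration.

```python
def apply_updates(initial_balance, updates):
    """Apply 'set balance' updates in arrival order; the last one wins."""
    balance = initial_balance
    for new_balance in updates:
        balance = new_balance
    return balance


# Two conflicting updates to the same account (hypothetical amounts).
update_a = 150
update_b = 80

# Each server sees both updates, but in a different arrival order.
server1_balance = apply_updates(100, [update_a, update_b])
server2_balance = apply_updates(100, [update_b, update_a])

replicas_agree = server1_balance == server2_balance
```

Every server received every update, yet the replicas end up with different balances, because nothing forced them to apply the updates in the same order.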