Fault Tolerance and Inconsistent Information in Distributed Systems
Dr Vladimir Z. Tosic
Term 2 2020
Complete your myExperience and shape the future of education at UNSW.
• Click the link in Moodle or log in to myExperience.unsw.edu.au (use z1234567@ad.unsw.edu.au to log in)
• The survey is confidential; your identity will never be released
• Survey results are not released to teaching staff until after your results are published
MAIN TOPICS IN THE LAST LECTURE… (NOT IN THE BEN-ARI TEXTBOOK!)
• Ricart-Agrawala algorithm demo in DAJ
• Revision of message-passing using CSP channels
• The actor model for message-passing concurrency
• Brief overview of some other distributed message-passing and distributed shared memory paradigms
• Notes on some other concepts for programming asynchronous distributed systems
MAIN TOPICS IN THIS LECTURE… (BEN-ARI TEXTBOOK CHAPTER 12)
• Fault tolerance and inconsistent information in distributed systems – the problem of consensus
• Byzantine Generals algorithm explanation
• Byzantine Generals algorithm examples and demo in DAJ
• King algorithm explanation and examples
CONSENSUS – INTRODUCTION
From Chapter 12 in Ben-Ari's textbook and additional materials
FAIL-SAFE AND FAULT-TOLERANT DISTRIBUTED SYSTEM – DEFINITIONS
• Reliability has several meanings
• We focus on 2 aspects:
1. Fail-safe – 1 or more failures do not damage the system or its users
2. Fault-tolerant – the system continues to fulfill its requirements even when 1 or more failures happen
• E.g. the Ricart-Agrawala algorithm for distributed mutual exclusion is NOT fault-tolerant because it will deadlock when 1 node fails
CRASH FAILURES VS. BYZANTINE FAILURES – DEFINITIONS
• Crash failure – failed node(s) stop sending messages
• Assume we know a node crashed (e.g. a timeout occurred)
• Byzantine failure – failed/malfunctioning node(s) can send arbitrary messages, possibly even according to a malicious plan
• We must account for the worst possible scenario, i.e. the biggest negative impact of messages from this failed node
• The name comes from the Byzantine Empire (Eastern Roman Empire, 395–1453), which had many civil wars and treasons
EXAMPLE ARCHITECTURE OF A RELIABLE DS USING REPLICATION
• Many complications are possible
• E.g. different sensors give somewhat different data
• E.g. all CPUs run software with the same bug
• No absolute reliability in a DS, always some limits
REPLICATION, PARTITIONING, REDUNDANCY – DEFINITIONS
• Replication – multiple nodes doing the same work
• Apart from replication, there are other ways to achieve reliability
• Notably: partitioning (each node does an independent subset of processing) with redundancy (additional information that enables discovering and fixing some failures)
• E.g. parity/CRC in RAID – see the sketch below
• Many uses of these methods, e.g. in cloud computing
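As an illustration of redundancy (not from the textbook), the following minimal Java sketch shows the idea behind RAID-style parity: a parity block computed as the XOR of the data blocks allows any single lost block to be reconstructed. Class and method names are only illustrative.

```java
// Minimal illustration of parity-based redundancy (as in RAID):
// the XOR of all data blocks is stored as a parity block, so any
// single lost block can be reconstructed from the remaining ones.
public class ParityDemo {
    // Compute the parity block as the bytewise XOR of all data blocks.
    static byte[] parity(byte[][] blocks) {
        byte[] p = new byte[blocks[0].length];
        for (byte[] block : blocks)
            for (int i = 0; i < p.length; i++)
                p[i] ^= block[i];
        return p;
    }

    // Recover a single missing block by XOR-ing the parity block
    // with every surviving data block.
    static byte[] recover(byte[][] survivingBlocks, byte[] parityBlock) {
        byte[] missing = parityBlock.clone();
        for (byte[] block : survivingBlocks)
            for (int i = 0; i < missing.length; i++)
                missing[i] ^= block[i];
        return missing;
    }

    public static void main(String[] args) {
        byte[][] data = { {1, 2, 3}, {4, 5, 6}, {7, 8, 9} };
        byte[] p = parity(data);
        // Suppose data[1] is lost; rebuild it from the rest plus parity.
        byte[] rebuilt = recover(new byte[][] { data[0], data[2] }, p);
        System.out.println(java.util.Arrays.toString(rebuilt)); // [4, 5, 6]
    }
}
```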
BROADER CONTEXT – AUTONOMIC / SELF-MANAGING SYSTEMS
• Automating the work of network/system administrators
• Self-management: self-healing, self-adaptation, …
• Autonomic computing – an IBM self-management initiative
• Analogy with the human autonomic nervous system
• Use of various artificial intelligence (AI) techniques to make decisions using incomplete or inconsistent information
• Not only technical, but also business information (e.g. costs and benefits of various options)
CONSENSUS – PROBLEM DESCRIPTION
• Each node chooses an initial value
• E.g. the result of a measurement or computation
• It is required that all nodes in the system decide to use the same value – 1 of the initial choices of these nodes
• If there are no faults: each node sends its choice to every other node and then a decision is made using some algorithm (e.g. majority voting) to obtain the consensus value – see the sketch below
• All nodes have the same data and run the same decision algorithm, so they all decide upon the same value
• If there are faults: … [to be discussed in this lesson!]
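A minimal sketch of the fault-free case, assuming binary initial values and that every node already holds all initial choices after an all-to-all exchange: each node applies the same deterministic rule (here majority voting with a fixed tie-break), so all nodes reach the same consensus value. Names are illustrative, not from the textbook.

```java
import java.util.List;

// Fault-free consensus sketch: every node ends up with the full list of
// initial choices and applies the same deterministic rule, so all nodes
// decide on the same value.
public class MajorityConsensus {
    // Deterministic decision rule: majority vote over binary choices,
    // with ties resolved to false so every node breaks ties identically.
    static boolean decide(List<Boolean> initialChoices) {
        long yes = initialChoices.stream().filter(v -> v).count();
        return yes > initialChoices.size() - yes;
    }

    public static void main(String[] args) {
        System.out.println(decide(List.of(true, false, true))); // true
    }
}
```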
(CONSENSUS EXAMPLE) COMMITMENT – PROBLEM DESCRIPTION (1/2)
• n agents collaborate on a database transaction
• Each of the agents has done its share of the transaction
• They want to come to an agreement on whether to commit the transaction results for later use by other transactions
• Each agent has formed an initial vote but has not yet made the final decision
• All that remains to be done is to ensure that no two agents make different decisions
(CONSENSUS EXAMPLE) COMMITMENT – PROBLEM DESCRIPTION (2/2)
• All agents that reach a decision reach the same one
• If there are no failures and all agents voted to commit, then the decision reached is to commit
• If an agent decides to commit, this means that all agents voted to commit
• Failure model: only agents can fail, and if they do then they crash
(CONSENSUS EXAMPLE) COMMITMENT SOLUTION – 2-PHASE COMMIT
• A distinguished agent, e.g. #1, collects the other agents' votes
• If all votes (incl. #1's) are "commit", then #1 tells all other agents to commit
• Otherwise (if any agent voted "abort" or any agent did not send its vote, e.g. it crashed), #1 tells all other agents to abort
• "All or nothing" – see the sketch below
• 2-Phase Commit solves the commitment problem but may fail to terminate if processes fail
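A minimal sketch of the coordinator's (agent #1's) decision step in 2-Phase Commit, under the simplifying assumption that the phase-1 votes have already been collected and that a crashed agent simply shows up as a missing vote. Logging and recovery, which a real 2PC protocol needs, are omitted; all names are illustrative.

```java
import java.util.List;
import java.util.Optional;

// Simplified coordinator-side view of 2-phase commit. Phase 1: collect votes
// from all agents (an empty Optional models a missing vote, e.g. a crashed
// agent). Phase 2: commit only if every vote arrived and every vote is
// "commit"; otherwise abort.
public class TwoPhaseCommitSketch {
    enum Decision { COMMIT, ABORT }

    static Decision phaseTwo(List<Optional<Boolean>> votes) {
        boolean allCommit = votes.stream()
                .allMatch(v -> v.isPresent() && v.get());
        return allCommit ? Decision.COMMIT : Decision.ABORT;
    }

    public static void main(String[] args) {
        // One agent voted abort -> the transaction is aborted ("all or nothing").
        System.out.println(phaseTwo(List.of(
                Optional.of(true), Optional.of(false), Optional.of(true))));
        // A missing vote (crash before voting) also forces abort.
        System.out.println(phaseTwo(List.of(
                Optional.of(true), Optional.<Boolean>empty(), Optional.of(true))));
    }
}
```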
CONSENSUS – THE NEED FOR SYNCHRONY
• Theorem: It is impossible for a set of processes in an asynchronous distributed system to agree on a binary value, even if only a single process is subject to an unannounced crash
• Proof by contradiction (sketch): Assume the algorithm makes correct decisions; its result depends on some process – but if this process crashes, then the other processes must choose arbitrarily and will sometimes make wrong decisions; contradiction with the assumption ∎
• Conclusion: Some synchrony is needed to reach consensus in the presence of faults; it also helps tolerate some Byzantine failures
THE BYZANTINE GENERALS ALGORITHM
From Chapter 12 in Ben-Ari's textbook
(CONSENSUS EXAMPLE) BYZANTINE GENERALS – PROBLEM DESCRIPTION (1/2)
• Several Byzantine generals (each with own army) decide whether to attack some enemy or to retreat (to avoid defeat)
• To win, they must ALL attack together; if they do not attack all together, they will be defeated
• There are reliable messengers delivering messages between the generals
• Some of the generals might be traitors working towards defeat
• Devise an algorithm so all loyal generals come to a consensus plan based on a majority vote of initial choices and, if tied, choose retreat
(CONSENSUS EXAMPLE) BYZANTINE GENERALS – PROBLEM DESCRIPTION (2/2)
• Analogy with real-life distributed systems:
• General – potentially failed/malfunctioning node
• Traitor – failed/malfunctioning node
• Messenger – reliable communications channel
• The BG algorithm executes concurrently with the underlying computation
• Messages of the BG algorithm are disjoint from computation messages
• Messages of the BG algorithm are synchronous: request with reply
• In send/receive statements, message types are omitted
BYZANTINE GENERALS ALG. 1-ROUND VERSION – PSEUDOCODE
• Note: planType = {A, R} for attack and retreat
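The slide's pseudocode itself is not reproduced in this text; below is a rough Java sketch of one general's decision step in the 1-round version, assuming the message exchange has already filled the plan map: send own plan to all, receive the others' plans, then take the majority, with ties broken towards retreat. Names are illustrative.

```java
import java.util.Map;

// Sketch of one general's logic in the one-round Byzantine Generals
// algorithm: send your own plan to every other general, collect their
// plans into "plan", then vote; ties are broken towards retreat.
// Message passing is abstracted away: "plan" is assumed to already hold
// the received values (with this general's own entry filled in).
public class ByzantineGeneralsOneRound {
    enum Plan { A, R }   // attack, retreat

    static Plan majority(Map<String, Plan> plan) {
        long attacks = plan.values().stream().filter(p -> p == Plan.A).count();
        long retreats = plan.size() - attacks;
        return attacks > retreats ? Plan.A : Plan.R;  // tie -> retreat
    }

    public static void main(String[] args) {
        System.out.println(majority(Map.of(
                "Zoe", Plan.A, "Leo", Plan.R, "Basil", Plan.A))); // A
    }
}
```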
BYZANTINE GENERALS ALG. 1-ROUND VERSION – ERROR SCENARIO
• Zoe (attack) and Leo (retreat) are loyal; Basil (attack) is a traitor
• Basil sends A to Leo, but then crashes before sending to Zoe
• Leo chooses A, Zoe chooses R – no consensus by the loyal generals
BYZANTINE GENERALS ALG. 1-ROUND VERSION – ERROR DISCUSSION
• The 1-Round Algorithm cannot tolerate 1 crash failure among 3 generals
• Because it does not use the fact that certain generals are loyal
• This scenario can be extended to an arbitrary number of generals
• Even if just a few generals crash, they can cause no consensus if the vote is very close in the 1-Round Algorithm
• Idea: Relay received messages in a further round
BYZANTINE GENERALS ALGORITHM – MAIN IDEAS AND DATA STRUCTURES
• First round: Each general sends its own plan to all other generals and receives plans from them
• After it, the array plan holds the plans of all generals
• Subsequent round(s): Each general sends all other generals what it received from other generals about their plans, and receives such reports from the other generals
• Loyal generals always relay exactly what they have received
• Matrix cell reportedPlan[G, G'] stores the plan that G reported receiving from G' – see the sketch below
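A rough sketch (illustrative names, message passing abstracted away) of how the plan array and the reportedPlan matrix can be combined after the relay round: for each other general G', the majority is taken over what G' announced directly and what every other general relayed about G', with ties broken towards retreat; the final plan is then a vote over these per-general values.

```java
import java.util.Map;

// Sketch of the data structures in the multi-round Byzantine Generals
// algorithm. After round 1, plan.get(G) holds the plan G announced to me.
// After round 2, reportedPlan.get(G).get(G2) holds the plan that G relayed
// as having received from G2. A loyal general then takes, for each other
// general G2, the majority over G2's own announcement and all relayed
// reports about G2, and finally votes over these per-general values.
public class ByzantineGeneralsRelay {
    enum Plan { A, R }

    // Majority of reports about general g2: g2's own announcement plus
    // what every other general relayed about g2 (tie -> retreat).
    static Plan majorityFor(String g2, Map<String, Plan> plan,
                            Map<String, Map<String, Plan>> reportedPlan) {
        long attacks = plan.get(g2) == Plan.A ? 1 : 0;
        long total = 1;
        for (Map<String, Plan> reports : reportedPlan.values()) {
            if (reports.containsKey(g2)) {
                if (reports.get(g2) == Plan.A) attacks++;
                total++;
            }
        }
        return attacks > total - attacks ? Plan.A : Plan.R;
    }

    public static void main(String[] args) {
        // One general's view after the relay round (illustrative data).
        Map<String, Plan> plan = Map.of("Zoe", Plan.A, "Basil", Plan.R);
        Map<String, Map<String, Plan>> reported = Map.of(
                "Zoe", Map.of("Basil", Plan.R),
                "Basil", Map.of("Zoe", Plan.A));
        System.out.println(majorityFor("Zoe", plan, reported));   // A
        System.out.println(majorityFor("Basil", plan, reported)); // R
    }
}
```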