Group Communication Shan-Hung Wu and DataLab CS, NTHU

Outline
• Group Communication
• Basic Abstraction
  – Perfect Point to Point Link
  – Perfect Failure Detection
• Reliable Broadcast
  – Best Effort Broadcast
  – Reliable Broadcast
  – Uniform Reliable Broadcast
• Consensus
  – Regular Consensus
  – Total Order Broadcast
• Paxos
  – Basic Paxos
  – Zab
  – Other Variants: Multi-Paxos, Fast Paxos, and Generalized Paxos

Group Communication
• Group communication provides multipoint-to-multipoint communication
  – It guarantees certain properties

Difficulties in Group Communication
• Challenges
  – Message delay or loss
  – Out-of-order delivery
  – Node failure
  – Link failure
    • In practice, it is difficult to tell whether the node or the link has failed

Perfect Point to Point Link
• How to cope with message loss?
  – Retransmit messages and eliminate duplicates

[Figure: a message to be sent from p1 to p2; message loss on the link]

Perfect Point to Point Link
• Properties
  – Reliable delivery: if neither the sender nor the receiver crashes, then the receiver eventually delivers every message sent by the sender
    • Keep retransmitting the message until an ACK is received
  – No duplication: a receiver may receive a message many times, but delivers it at most once
    • Sequence number
  – No creation: if a message is delivered, it must have been sent by some process
    • Checksum
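
The first two properties can be illustrated with a small simulation. This is a minimal sketch, not the lecture's implementation: `LossyChannel`, `Receiver`, and `send_reliably` are hypothetical names, the channel is assumed to drop packets independently at random, and the checksum used for "no creation" is not modelled.

```python
import random


class LossyChannel:
    """Hypothetical unreliable channel that drops each packet with probability `loss`."""

    def __init__(self, loss, seed=0):
        self.loss = loss
        self.rng = random.Random(seed)

    def transmit(self, packet):
        # Return the packet if it gets through, or None if it is lost.
        return None if self.rng.random() < self.loss else packet


class Receiver:
    def __init__(self):
        self.seen = set()      # sequence numbers already delivered
        self.delivered = []    # messages handed to the application

    def on_packet(self, packet):
        seq, payload = packet
        if seq not in self.seen:   # no duplication: deliver once per sequence number
            self.seen.add(seq)
            self.delivered.append(payload)
        return seq                 # always ACK, even for duplicates


def send_reliably(messages, channel, receiver, max_tries=1000):
    """Retransmit each message until its ACK makes it back to the sender."""
    transmissions = 0
    for seq, payload in enumerate(messages):
        for _ in range(max_tries):
            transmissions += 1
            packet = channel.transmit((seq, payload))
            if packet is None:
                continue           # message lost: retransmit
            ack = channel.transmit(receiver.on_packet(packet))
            if ack == seq:
                break              # ACK received: move on to the next message
            # ACK lost: retransmit; the receiver will drop the duplicate
        else:
            raise RuntimeError("gave up after max_tries transmissions")
    return transmissions


receiver = Receiver()
count = send_reliably(["a", "b", "c"], LossyChannel(loss=0.5, seed=1), receiver)
print(receiver.delivered, "delivered using", count, "transmissions")
```

Even when an ACK is lost and the sender retransmits, the sequence-number check keeps the receiver from delivering the same message twice.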

Perfect Point to Point Link
• A simplified implementation without ACKs
  – Retransmit all messages periodically

Perfect Failure Detection
• How to detect a node failure?
  – Use timeouts on heartbeats
  – If no heartbeat is received from a process p for a long time, deem that p has crashed

Perfect Failure Detection
• Uses:
  – PerfectPointToPointLink
• Properties
  – Strong completeness: eventually every crashed process is detected by every correct process
    • Achieved by broadcasting which nodes have failed, or by having every process detect failures by itself
  – Strong accuracy: if a process p is detected by any process, then p has crashed
    • A process is detected as failed iff it has crashed

Perfect Failure Detection
• Send heartbeat messages to all processes
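
The heartbeat-and-timeout idea can be sketched as follows. This is a minimal illustration under assumptions: `HeartbeatFailureDetector` and its methods are hypothetical names, time is passed in explicitly instead of read from a clock, and strong accuracy only holds if `timeout` really bounds the heartbeat period plus the message delay.

```python
class HeartbeatFailureDetector:
    """Timeout-based detector: a process is detected once it stays silent for too long."""

    def __init__(self, processes, timeout):
        self.timeout = timeout
        self.last_heartbeat = {p: 0.0 for p in processes}  # time of latest heartbeat
        self.detected = set()                              # detections are permanent

    def on_heartbeat(self, process, now):
        self.last_heartbeat[process] = now

    def check(self, now):
        """Return the processes newly detected as crashed at time `now`."""
        newly = {
            p for p, last in self.last_heartbeat.items()
            if p not in self.detected and now - last > self.timeout
        }
        self.detected |= newly
        return newly


fd = HeartbeatFailureDetector(["p1", "p2", "p3"], timeout=3.0)
for now in (1.0, 2.0, 3.0, 4.0):
    fd.on_heartbeat("p1", now)   # p1 and p2 keep sending heartbeats
    fd.on_heartbeat("p2", now)   # p3 is silent from the start
    print(now, fd.check(now))
```

In this run p3 is reported once its silence exceeds the timeout, and it is never reported a second time.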

Broadcast
• A broadcast abstraction enables a process to send a message to all processes in a system, including itself
• A naïve approach
  – Try to broadcast the message to as many nodes as possible

[Figure: best effort broadcast among p1, p2, p3, and p4]

Best Effort Broadcast
• Uses:
  – PerfectPointToPointLink
  – PerfectFailureDetection
• Properties
  – Best-effort validity
    • For any two processes p_i and p_j, if both are correct, then every message broadcast by p_i is eventually delivered by p_j
  – No duplication
  – No creation

Best Effort Broadcast
• How to achieve best effort broadcast?
  – For the first property, the sender uses PerfectPointToPointLink to send the message to every receiver that has not been detected as failed by PerfectFailureDetection
  – The other two properties are provided by PerfectPointToPointLink
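
A minimal sketch of this construction, under stated assumptions: `Process` and `best_effort_broadcast` are hypothetical names, a send over PerfectPointToPointLink is modelled as a direct call to `deliver`, and the `crash_after` parameter is an illustrative device for making the sender crash partway through the loop.

```python
class Process:
    def __init__(self, name):
        self.name = name
        self.crashed = False
        self.delivered = []

    def deliver(self, sender, message):
        if not self.crashed:
            self.delivered.append((sender, message))


def best_effort_broadcast(sender, message, processes, detected=(), crash_after=None):
    """Send `message` once to every process not detected as failed.

    If `crash_after` is given, the sender crashes after that many sends.
    """
    sent = 0
    for receiver in processes:
        if receiver.name in detected:
            continue                      # skip processes reported by the failure detector
        if crash_after is not None and sent == crash_after:
            sender.crashed = True         # sender dies mid-broadcast
            return
        receiver.deliver(sender.name, message)
        sent += 1


group = [Process(f"p{i}") for i in range(1, 5)]
best_effort_broadcast(group[0], "m1", group)                 # everyone delivers m1
best_effort_broadcast(group[0], "m2", group, crash_after=2)  # only p1 and p2 get m2
print({p.name: p.delivered for p in group})
```

The second call shows the weakness of the abstraction: when the sender crashes mid-broadcast, p3 and p4 never see `m2` even though they are correct.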

Is This Reliable?
• Is best effort broadcast enough to have every correct process receive the message?
  – No. If the sender fails partway through the broadcast, some of the remaining correct processes may never deliver the message

Reliable Broadcast
• Reliable broadcast ensures that all correct processes deliver the same messages even if the sender fails
• How?
  – If the sender is detected to have crashed, the other processes relay its messages to all

[Figure: reliable broadcast among p1, p2, p3, and p4; p1 crashes, the crash is detected, and the message is relayed]

Reliable Broadcast
• Uses:
  – BestEffortBroadcast
  – PerfectFailureDetection
• Properties
  – Validity
    • If a correct process p_i broadcasts a message m, then p_i eventually delivers m
  – No duplication
  – No creation
  – Agreement
    • If a message m is delivered by some correct process p_i, then m is eventually delivered by every correct process p_j

Reliable Broadcast
• Log the broadcast message
• Relay all broadcast messages coming from the failed process
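
The log-and-relay idea can be sketched as below. This is an illustrative simulation, not the slide's code: `RBProcess` and its methods are hypothetical names, message passing is modelled as direct method calls, and the optional `targets` argument is an assumed device for modelling a sender that crashes before reaching everyone.

```python
class RBProcess:
    def __init__(self, name, network):
        self.name = name
        self.network = network    # shared dict: name -> RBProcess
        self.crashed = False
        self.detected = set()     # processes this node believes have crashed
        self.log = {}             # origin -> set of messages received from it
        self.delivered = []

    def beb_send(self, origin, message, targets=None):
        # Best effort broadcast; `targets` limits the recipients to model a crash.
        for name in (targets if targets is not None else list(self.network)):
            self.network[name].on_receive(origin, message)

    def broadcast(self, message, targets=None):
        self.beb_send(self.name, message, targets)

    def on_receive(self, origin, message):
        if self.crashed or message in self.log.get(origin, set()):
            return                                        # duplicate or dead: ignore
        self.log.setdefault(origin, set()).add(message)   # log the broadcast message
        self.delivered.append((origin, message))
        if origin in self.detected:                       # origin already known crashed
            self.beb_send(origin, message)

    def on_crash_detected(self, origin):
        self.detected.add(origin)
        for message in list(self.log.get(origin, ())):    # relay everything from origin
            self.beb_send(origin, message)


network = {}
for name in ("p1", "p2", "p3", "p4"):
    network[name] = RBProcess(name, network)

network["p1"].broadcast("m", targets=["p1", "p2"])  # p1 reaches only itself and p2
network["p1"].crashed = True
for name in ("p2", "p3", "p4"):
    network[name].on_crash_detected("p1")           # failure detector fires everywhere
print({n: p.delivered for n, p in network.items()})
```

Once p2 detects the crash it relays `m`, so p3 and p4 deliver it as well, and the log check keeps every process from delivering it twice.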

Reliable Broadcast Meets Database
• Can it be used for GC-based eager replication?
  – To broadcast the effects of committed txs
• Problems:
  – A process may deliver a message too early
  – If this process crashes, the other processes may never see the message
• Fails to ensure durability in the DB world
  – Some committed txs are not propagated

Uniform Reliable Broadcast
• Ensures that a failed node has not delivered a message that the other nodes never learn about
• A process can deliver a message only when it knows that all the other correct processes have received the message and returned an ACK

[Figure: uniform reliable broadcast among p1, p2, p3, and p4]

Uniform Reliable Broadcast
• Uses:
  – BestEffortBroadcast
  – PerfectFailureDetection
• Properties
  – Validity
  – No duplication
  – No creation
  – Uniform agreement
    • If a message m is delivered by some process p_i (whether correct or faulty), then m is also eventually delivered by every correct process p_j

Uniform Reliable Broadcast
• Deliver the message only after receiving ACKs from all correct processes
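
One way to realize this rule is to let each process relay a message the first time it sees it and to treat that relay as its ACK. The sketch below assumes that design; `URBProcess`, `run`, and the shared FIFO `network` queue are hypothetical names introduced for the simulation.

```python
from collections import deque


class URBProcess:
    def __init__(self, name, names, network):
        self.name = name
        self.correct = set(names)  # processes not yet detected as crashed
        self.network = network     # shared queue of (dest, relayer, origin, msg)
        self.pending = set()       # (origin, msg) pairs seen and relayed
        self.acks = {}             # (origin, msg) -> relayers heard from
        self.delivered = []

    def _beb(self, origin, msg):
        for dest in sorted(self.correct):
            self.network.append((dest, self.name, origin, msg))

    def broadcast(self, msg):
        self.pending.add((self.name, msg))
        self._beb(self.name, msg)

    def on_receive(self, relayer, origin, msg):
        key = (origin, msg)
        self.acks.setdefault(key, set()).add(relayer)
        if key not in self.pending:
            self.pending.add(key)
            self._beb(origin, msg)     # relaying doubles as this process's ACK
        self._try_deliver()

    def on_crash_detected(self, name):
        self.correct.discard(name)
        self._try_deliver()            # fewer ACKs may now be enough

    def _try_deliver(self):
        for key in sorted(self.pending):
            acked = self.acks.get(key, set())
            if key not in self.delivered and self.correct <= acked:
                self.delivered.append(key)


def run(processes, network, crashed=()):
    """Drain the queue, dropping packets addressed to crashed processes."""
    while network:
        dest, relayer, origin, msg = network.popleft()
        if dest not in crashed:
            processes[dest].on_receive(relayer, origin, msg)


names = ["p1", "p2", "p3"]
network = deque()
procs = {n: URBProcess(n, names, network) for n in names}
procs["p1"].broadcast("m")
run(procs, network)
print({n: p.delivered for n, p in procs.items()})
```

Unlike the plain reliable broadcast, the sender does not deliver its own message as soon as it receives it; it waits until every process it still considers correct has relayed the message back.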

Consensus
• Consensus: all participants want to agree on a single value
• Specified in terms of two primitives: propose and decide
  – Each process has an initial value that it proposes for the agreement through the primitive propose

Consensus
• Uses:
  – BestEffortBroadcast
  – PerfectFailureDetection
• Properties
  – Termination
    • Every correct process eventually decides some value
  – Validity
    • If a process decides v, then v was proposed by some process
  – Integrity
    • No process decides twice
  – Agreement
    • No two correct processes decide differently

How?

Flooding Consensus
• A consensus instance requires two rounds:
  – Round 1
    • Every process proposes a value and broadcasts it to the others
    • A process can decide once it knows it has seen all proposed values that will be considered by correct processes for a possible decision
    • The decision is computed by a deterministic function
    • It is fine for many processes to make the decision, since their decisions are all the same
  – Round 2
    • The process that made the decision broadcasts it to all

[Figure: flooding consensus among p1, p2, p3, and p4, which propose 2, 3, 5, and 7. A process can decide upon arrival of all proposals of the processes in the current view, e.g. Decide(2 = min(2, 3, 5, 7)). A process that holds only (3, 5, 7) when a crash is detected cannot decide and starts another round; the remaining processes end with Decide(2).]

Flooding Consensus
• Decide upon arrival of all proposals of the processes in the current view
• Relay the decision
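
The decision rule can be explored with a small round-based simulation. This is a sketch under simplifying assumptions rather than the lecture's algorithm verbatim: rounds are synchronous, a process crashes only while broadcasting its proposals (reaching just the processes listed in the hypothetical `crashes` argument), deciders do not crash while relaying the decision, and `min` stands in for the deterministic decision function.

```python
def flooding_consensus(proposals, crashes=None):
    """Simulate flooding consensus.

    proposals: {process: proposed value}
    crashes:   {process: (round, reached)}; the process crashes in `round`
               after its broadcast reached only the processes in `reached`.
    Returns ({surviving process: decision}, number of proposal rounds).
    """
    crashes = crashes or {}
    alive = set(proposals)
    known = {p: {v} for p, v in proposals.items()}        # proposals seen so far
    heard_prev = {p: set(proposals) for p in proposals}   # round 0: everyone
    rounds = 0
    while alive:
        rounds += 1
        crashing = {p for p in alive if p in crashes and crashes[p][0] == rounds}
        heard = {p: set() for p in alive}
        merged = {p: set() for p in alive}
        for sender in alive:
            reached = crashes[sender][1] if sender in crashing else alive
            for dest in reached:
                if dest in alive:
                    heard[dest].add(sender)
                    merged[dest] |= known[sender]
        alive -= crashing              # the failure detector reports the crashes
        deciders = []
        for p in alive:
            known[p] |= merged[p]
            if heard[p] == heard_prev[p]:   # same senders as last round: nothing missing
                deciders.append(p)
            heard_prev[p] = heard[p]
        if deciders:
            value = min(known[deciders[0]])       # deterministic decision function
            return {p: value for p in alive}, rounds  # the decision is relayed to all
    return {}, rounds


proposals = {"p1": 2, "p2": 3, "p3": 5, "p4": 7}
print(flooding_consensus(proposals))                         # no failures
print(flooding_consensus(proposals, {"p1": (1, {"p2"})}))    # p1 reaches only p2
print(flooding_consensus(proposals, {"p1": (1, set())}))     # p1 reaches nobody
```

In the second scenario p2 has seen every proposal and decides 2 in the first round, then relays the decision to p3 and p4. In the third scenario nobody received p1's proposal, so the survivors need one more round and decide 3.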

Any Alternative?
• Processes could fail during Round 1 and Round 2
• Why not use reliable broadcast?
  – All correct processes would receive all the proposals!
  – Every process decides (deterministically) the same value
  – No need for round 2 any more!
• However, if any process fails, the rest need to relay the proposals
• Why not just relay the decision?
  – This is exactly the purpose of round 2!

Performance of Flooding Consensus
• Regular: 2 steps
• Each failure causes the start of a new round
• Best case (no failures)
  – Single communication step in round 1
• Worst case (failure in every step)
  – At most N steps, where N is the number of processes
• Each step requires O(N^2) messages to be exchanged

Is This Enough for a Deterministic Database System?

Total Order Broadcast
• Total order broadcast is a reliable broadcast communication abstraction which ensures that all processes deliver messages in the same order
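
One common way to obtain this ordering is to build it on consensus: processes collect messages in whatever order reliable broadcast hands them over, repeatedly agree on a batch, and deliver each agreed batch in a fixed order. The sketch below assumes that design, which this section does not spell out. `TOBProcess`, `consensus`, and `run_round` are hypothetical names, and `consensus` is a stand-in that simply picks one proposed batch deterministically.

```python
def consensus(proposals):
    """Stand-in for a consensus instance: every process gets the same proposed batch."""
    return min(proposals, key=lambda batch: sorted(batch))


class TOBProcess:
    def __init__(self):
        self.unordered = set()   # received via reliable broadcast, not yet ordered
        self.delivered = []      # totally ordered delivery sequence

    def on_rb_deliver(self, msg):
        if msg not in self.delivered:
            self.unordered.add(msg)

    def propose(self):
        return frozenset(self.unordered)

    def on_decide(self, batch):
        for msg in sorted(batch):          # same deterministic order at every process
            if msg not in self.delivered:
                self.delivered.append(msg)
        self.unordered -= batch


def run_round(processes):
    proposals = [p.propose() for p in processes if p.unordered]
    if proposals:
        batch = consensus(proposals)
        for p in processes:
            p.on_decide(batch)


p, q = TOBProcess(), TOBProcess()
for m in ("b", "c"):
    p.on_rb_deliver(m)       # p sees b and c first
q.on_rb_deliver("a")         # q sees a first
run_round([p, q])
p.on_rb_deliver("a")         # late arrivals at each process
for m in ("c", "b"):
    q.on_rb_deliver(m)
run_round([p, q])
print(p.delivered, q.delivered)
```

Although the two processes receive the messages in different orders, both end up with the same delivery sequence, because the order is fixed by the agreed batches and not by arrival time.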
