Failure Modes Coordinator Node 1 Node 2 “Prepare” “Commit” “Commit” Node “Commit” unreceived: (1) Re-sent “Prepare” can be ignored. (2) Node still able to abort. 55
Failure Modes Coordinator Node 1 Node 2 “Prepare” “Commit” “Commit” CRASH! Node 2 crashes after responding: Restart from log 56
Failure Cases Coordinator Node 1 Node 2 “Prepare” “Commit” “Commit” We are go for Commit “Commit” “ACK” Coordinator “Commit” unreceived: Commit must happen, coordinator resends 57
Failure Cases Coordinator Node 1 Node 2 “Prepare” “Commit” “Commit” We are go for Commit “Commit” CRASH! “ACK” Node 2 crash: Restart. Already logged “Commit” message, so all is well. 58
Failure Cases Coordinator Node 1 Node 2 “Prepare” “Commit” “Commit” We are go for Commit “Commit” “ACK” “ACK” Node “Ack” unreceived: Ok. Resent “Commit” ignored by node 59
Failure Cases Coordinator Node 1 Node 2 “Prepare” “Commit” “Commit” We are go for Commit “Commit” “ACK” “ACK” CRASH! Node crash after “Ack”: Ok. Log already recorded commit 60
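The failure cases above all rest on two properties: the node logs each step before replying, and it treats re-sent messages as no-ops. A minimal Python sketch of that idea follows. The names `Participant`, `crash_and_restart`, and `send_until_acked` are hypothetical, the in-memory `log` list only stands in for durable storage, and re-acknowledging a duplicate “Commit” is an assumption of this sketch rather than something the slides specify.

```python
class Participant:
    """Hypothetical 2PC node: `log` models a durable log, `state` is volatile."""

    def __init__(self):
        self.log = []      # survives a crash in this sketch
        self.state = {}    # txn -> "PREPARED" | "COMMITTED"; lost on crash

    def on_prepare(self, txn):
        # A re-sent "Prepare" changes nothing: the vote is already logged.
        if txn not in self.state:
            self.log.append((txn, "PREPARED"))
            self.state[txn] = "PREPARED"
        return "Commit"    # the node's yes-vote

    def on_commit(self, txn):
        # A re-sent "Commit" is applied at most once, but still acknowledged.
        if self.state.get(txn) != "COMMITTED":
            self.log.append((txn, "COMMITTED"))
            self.state[txn] = "COMMITTED"
        return "ACK"

    def crash_and_restart(self):
        # Volatile state is gone; replay the log to rebuild it.
        self.state = {}
        for txn, record in self.log:
            self.state[txn] = record


def send_until_acked(node, txn, ack_delivered):
    """Coordinator side: resend "Commit" until an ACK gets through.

    `ack_delivered` is a list of booleans simulating whether each ACK
    survives the network.
    """
    for attempt, delivered in enumerate(ack_delivered, start=1):
        reply = node.on_commit(txn)
        if delivered and reply == "ACK":
            return attempt
    raise RuntimeError("no ACK received")


node = Participant()
node.on_prepare("t1")
node.crash_and_restart()                 # crash after voting: restart from log
state_after_first_crash = node.state["t1"]
attempts = send_until_acked(node, "t1", [False, True])  # first ACK is lost
node.crash_and_restart()                 # crash after ACK: commit is in the log
```

In this run the first “ACK” is dropped, so the coordinator sends “Commit” twice, yet the log holds a single commit record and both restarts recover the correct state.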
Replication • Mode 1 : Periodic Backups • Copy the replicated data nightly/hourly. • Mode 2 : Log Shipping • Only send changes (replica serves as the log). 61
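The difference between the two modes can be sketched in a few lines of Python. `Primary`, `Replica`, `full_backup`, and `ship_log` are hypothetical names, and the log is simplified to an in-memory list of key/value changes.

```python
class Primary:
    def __init__(self):
        self.data = {}
        self.log = []                    # ordered (key, value) changes

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))


class Replica:
    def __init__(self):
        self.data = {}
        self.applied = 0                 # log records already applied

    def full_backup(self, primary):
        # Mode 1: copy the whole data set, however little has changed.
        self.data = dict(primary.data)
        self.applied = len(primary.log)
        return len(primary.data)         # items copied

    def ship_log(self, primary):
        # Mode 2: send and apply only the records the replica has not seen.
        new_records = primary.log[self.applied:]
        for key, value in new_records:
            self.data[key] = value
        self.applied += len(new_records)
        return len(new_records)          # records shipped


primary, replica = Primary(), Replica()
primary.write("a", 1)
primary.write("b", 2)
copied = replica.full_backup(primary)    # copies both items
primary.write("a", 3)
shipped = replica.ship_log(primary)      # ships only the one new change
```

After the single extra write, log shipping moves one record while a second backup would have copied the full data set again.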
Replication • Ensuring durability • Ensuring write-consistency under 2PC • Ensuring read-consistency without 2PC 62
Ensuring Durability When is a replica write durable? 63
Ensuring Durability Never. The question you should be asking is: how much durability do you need? 64
Ensuring Durability To survive N failures, you need N+1 replicas (assuming failure = crash). 65
Ensuring Write Consistency Coordinator Node 1 “Prepare” “Commit” Node 1 asserts that the commit is durable! What if Node 1 fails? 66
Ensuring Write Consistency Coordinator Node 1 Replica “Prepare” “Prepare” “Commit” “Commit” Waiting for Node 1 to replicate is slooooow! Let the coordinator take over! 67
Ensuring Write Consistency Coordinator Node 1 Replica “Prepare” “Commit” “Commit” Like 2PC… … but better. We may not need to wait for the replica 68
Ensuring Write-Consistency Coordinator Coordinator Alice Bob A: Prepare A: Prepare A: Prepare B: Prepare B: Prepare B: Prepare Replica 1 Replica 2 Replica 3 69
Ensuring Write-Consistency Coordinator Coordinator Alice Bob B: Prepare B: Prepare A: Prepare A: Prepare A: Prepare B: Prepare Replica 1 Replica 2 Replica 3 70
Ensuring Write-Consistency Coordinator Coordinator Alice Bob B: Prepare B: Prepare A: Prepare Commit! Commit! Replica 1 Replica 2 Replica 3 71
Ensuring Write-Consistency Majority Vote: with N replicas, ⌊N/2⌋ + 1 votes are needed. 72
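The majority rule can be checked with a short Python sketch. `votes_needed` and `majority_winner` are hypothetical helper names. The example mirrors the Alice/Bob slides, where Bob's “Prepare” reaches two of the three replicas first.

```python
from collections import Counter


def votes_needed(n_replicas):
    # Strict majority: floor(N / 2) + 1.
    return n_replicas // 2 + 1


def majority_winner(accepted, n_replicas):
    """Return the coordinator holding a majority of replicas, or None.

    `accepted` lists, per replica, which coordinator's "Prepare" it accepted.
    """
    for coordinator, votes in Counter(accepted).items():
        if votes >= votes_needed(n_replicas):
            return coordinator
    return None


winner = majority_winner(["Bob", "Bob", "Alice"], 3)   # Bob holds 2 of 3
split = majority_winner(["Alice", "Bob"], 4)           # nobody holds 3 of 4
```

Because two majorities of the same N replicas must overlap in at least one replica, two coordinators can never both collect enough votes for conflicting writes.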
Ensuring Read Consistency Forget transactions; let’s go back to reads & writes. Can we do better than 2PC if we don’t need transactions? 73
(1) Alice writes ‘A’ (3) Bob reads ‘A’ Replica 1 W(A = 3) R(A) Replica 2 Replica 3 (2) Alice tells Bob 76
(1) Alice writes ‘A’ (2) Alice tells Bob (3) Bob reads ‘A’ Replica 1 W(A = 42) R(A) Replica 2 Replica 3 What can we do to guarantee that Bob will see the 42? 77
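The problem in this scenario can be reproduced with a small Python sketch. It assumes, as one reading of the diagram, that Alice's write has reached only Replica 1 when Bob reads from a different replica. The `write` and `read` helpers are hypothetical.

```python
replicas = [{}, {}, {}]          # Replica 1, 2, 3 as plain dictionaries


def write(replica_ids, key, value):
    # Apply the write only at the listed replicas (0-based indices).
    for i in replica_ids:
        replicas[i][key] = value


def read(replica_id, key):
    # Read from a single replica; None means "never saw a write".
    return replicas[replica_id].get(key)


write([0], "A", 42)              # (1) Alice's write lands on Replica 1 only
fresh = read(0, "A")             # Bob happens to ask Replica 1: sees 42
stale = read(2, "A")             # Bob asks Replica 3: the write is missing
```

Without some rule tying together the replicas that a write reaches and the replicas that a read consults, Bob's result depends on which replica he happens to contact.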