Distributed Systems and Databases of the Globe Unite! The Cloud, the Edge and Blockchains
Amr El Abbadi, University of California, Santa Barbara
Divy Agrawal, Mohammad Amiri, Sujaya Maiyya, Faisal Nawab (UCSC), Victor Zakhary
OPODIS 2018
Protocols Supporting the Cloud
• Scalability: shard or partition the data
  • Commit protocols: 2PC
• Fault‐tolerance and fast access: replicate the data
  • State machine replication and consensus protocols: Paxos
Google's Spanner
[Diagram: an Application Access Tier and an Application Execution Tier sit above a Storage Tier spanning Datacenter A, Datacenter B, … Datacenter Z. Transactions: 2PL + 2PC. Abstract replication: Paxos.]
A Path for Unification
PAXOS
Paxos: No‐Failure Case
• Leader Election: initially, a leader is elected by a majority quorum.
• Replication: the leader replicates new updates to a majority quorum.
• Decision: the decision is propagated to all nodes asynchronously.
[Diagram: phases Leader Election, Fault‐Tolerant Agreement, Decision; labels: Leader, Proposer, Majority, Asynchronous]
Paxos: Failure Case
• Leader Election: if the leader fails, a new leader is elected.
• The election also performs Value Discovery, in case agreement has already been reached.
[Diagram: phases Leader Election, Fault‐Tolerant Agreement, Decision; label: A Majority]
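The value discovery step can be sketched as follows. This is a minimal illustration of the standard Paxos rule (a new leader adopts the value with the highest accepted ballot reported by a majority), not code from the talk; `Promise`, `discover_value`, and the sample data are hypothetical names.

```python
from typing import NamedTuple, Optional, Sequence


class Promise(NamedTuple):
    """Reply of one node to a leader election request (hypothetical shape)."""
    node: str
    accepted_ballot: Optional[int]  # None if the node has accepted nothing yet
    accepted_value: Optional[str]


def discover_value(promises: Sequence[Promise], cluster_size: int, own_value: str) -> str:
    """Return the value the newly elected leader must propose."""
    # The election only counts if a majority of distinct nodes replied.
    if len({p.node for p in promises}) <= cluster_size // 2:
        raise ValueError("promises do not form a majority quorum")
    accepted = [p for p in promises if p.accepted_ballot is not None]
    if not accepted:
        # Nothing was accepted before, so the leader is free to use its own value.
        return own_value
    # Otherwise adopt the value with the highest accepted ballot.
    return max(accepted, key=lambda p: p.accepted_ballot).accepted_value


promises = [Promise("n1", 3, "x"), Promise("n2", None, None), Promise("n3", 5, "y")]
chosen = discover_value(promises, cluster_size=5, own_value="z")
```

Here the new leader proposes "y" rather than its own value "z", because a previous leader may already have reached agreement on "y".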
Atomic Commitment
Two Phase Commit: No‐Failure Case
• Leader: initially, a coordinator is chosen by the transaction manager.
• Value Discovery: the coordinator collects votes from ALL cohorts.
  • If all vote yes, Decision = Commit; if any cohort votes no or has failed, Decision = Abort.
• Fault‐Tolerance: make the decision persistent on disk.
• Decision: send the decision to all cohorts.
[Diagram: phases Value Discovery, Make decision (fault‐tolerant by storing on disk), Decision; labels: Coordinator, All cohorts]
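The decision rule above can be written down in a few lines. This is a minimal in-memory sketch under stated assumptions: `Vote`, `Decision`, `decide`, and `run_coordinator` are hypothetical names, a missing vote stands for a failed cohort, and appending to a list stands in for forcing the decision to disk.

```python
from enum import Enum
from typing import Dict, Iterable, List


class Vote(Enum):
    YES = "yes"
    NO = "no"


class Decision(Enum):
    COMMIT = "commit"
    ABORT = "abort"


def decide(votes: Dict[str, Vote], cohorts: Iterable[str]) -> Decision:
    """Commit only if every cohort voted yes; a missing vote counts as a failure."""
    for cohort in cohorts:
        if votes.get(cohort) is not Vote.YES:
            return Decision.ABORT
    return Decision.COMMIT


def run_coordinator(votes: Dict[str, Vote], cohorts: List[str], log: List[Decision]) -> Dict[str, Decision]:
    """Value discovery, persistence, then dissemination of the decision."""
    decision = decide(votes, cohorts)
    log.append(decision)  # stand-in for making the decision persistent on disk
    return {cohort: decision for cohort in cohorts}  # message sent to each cohort


cohorts = ["c1", "c2", "c3"]
log: List[Decision] = []
all_yes = run_coordinator({c: Vote.YES for c in cohorts}, cohorts, log)
one_missing = run_coordinator({"c1": Vote.YES, "c2": Vote.YES}, cohorts, log)
```

The second call aborts because cohort `c3` never voted, which the sketch treats the same as a failure.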
Three Phase Commit
• 2PC can block.
• Solution: Three Phase Commit (3PC).
• Replicate the decision to other cohorts (like Paxos) to avoid blocking on site failure.
[Diagram: phases Value Discovery, Fault‐Tolerant Agreement, Decision; labels: Coordinator, Majority, All Cohorts]
Three Phase Commit: Termination
• If the leader fails or is partitioned, elect a new leader and execute the termination protocol.
[Diagram: phases Leader Election & Value Discovery, Fault‐Tolerant Agreement, Decision; labels: Leader, Majority, Majority, Cohorts]
Common phases observed?
• Paxos and 2PC/3PC are leader‐based protocols.
• Agreement on a single value is the main goal.
• Both protocols ensure fault tolerance of the decided value.
• Both disseminate the decision, typically asynchronously.
Consensus & Commitment (C&C) Framework
[Diagram: four phases in sequence: Leader Election, Value Discovery, Fault‐tolerant Agreement, Decision]
Paxos Atomic Commitment (PAC)
• Any process can terminate a transaction, via leader election.
• No separate termination case (like Paxos).
[Diagram: phases Leader Election & Value Discovery, Fault‐Tolerant Agreement, Decision; labels: Leader, Majority, All Cohorts]
2PC/State Machine Replication (SMR)
• Alternative approach to achieving fault‐tolerance
• Replicate the state of each process for persistence
• Spanner, and Gray and Lamport 2006
• Layered architecture: 2PC on top of SMR
  • 2PC among the leaders of the coordinator and cohorts
  • SMR among each shard leader and its replicas
[Diagram: the Coordinator Leader and each Cohort Leader replicate to a Majority of their Replicas for Fault‐Tolerant Persistence; the leaders run Value Discovery and Decision among themselves]
Generalized‐PAC (G‐PAC)
• Follows the abstractions of C&C
• Flattened architecture:
  • No notion of cohort leader and replica
  • Saves one communication round trip
• Related to other protocols that consolidate consensus and commitment, such as TAPIR [Zhang SOSP 2015] and Janus [Mu OSDI 2016]
• Restrictive assumptions
[Diagram labels: Coordinator; all identical replicas]
G‐PAC (Generalized Paxos Atomic Commit)
[Diagram: Coordinator, Cohort 1 Replicas, Cohort 2 Replicas across the phases Leader Election + Value Discovery, Fault‐Tolerant Agreement, Decision. Quorum annotations: "a majority of replicas from ALL cohorts" and "a majority of replicas from a majority of cohorts"; labels: Majority, All]
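The two quorum shapes named in the diagram can be made concrete with a short sketch. It only encodes the two definitions ("a majority of replicas from ALL cohorts" and "a majority of replicas from a majority of cohorts"); which phase uses which quorum is not asserted here, and all function names and the sample cluster are hypothetical.

```python
from typing import Dict, Set


def has_majority(responders: Set[str], replicas: Set[str]) -> bool:
    """True if more than half of `replicas` are among the responders."""
    return len(responders & replicas) > len(replicas) // 2


def majority_from_all_cohorts(responders: Set[str], cohorts: Dict[str, Set[str]]) -> bool:
    """A majority of replicas from ALL cohorts."""
    return all(has_majority(responders, replicas) for replicas in cohorts.values())


def majority_from_majority_of_cohorts(responders: Set[str], cohorts: Dict[str, Set[str]]) -> bool:
    """A majority of replicas from a majority of cohorts."""
    satisfied = sum(has_majority(responders, replicas) for replicas in cohorts.values())
    return satisfied > len(cohorts) // 2


# Hypothetical cluster: three cohorts with three replicas each.
cohorts = {
    "cohort1": {"a1", "a2", "a3"},
    "cohort2": {"b1", "b2", "b3"},
    "cohort3": {"d1", "d2", "d3"},
}
responders = {"a1", "a2", "b1", "b2", "d1"}
weaker_ok = majority_from_majority_of_cohorts(responders, cohorts)
stronger_ok = majority_from_all_cohorts(responders, cohorts)
```

With these responders, two of the three cohorts have a majority of their replicas present, so the second quorum shape is satisfied while the first is not.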
Consensus & Commitment (C&C) Framework
[Diagram: four phases in sequence: Leader Election, Value Discovery, Fault‐tolerant Agreement, Decision]
• Useful for modeling many existing data management protocols and for proposing new ones
Consensus for Edge Data Management
The future of web/cloud applications
• Emerging technologies
  • Business Analytics
  • Virtual/Augmented Reality
  • Data Science
  • Sensors/IoT
The Cloud
• Big potential, but bigger challenges
• Application requirements:
  • Real‐time (low latency)
  • Continuous data flows (high throughput)
• Challenge 1: the cloud is far away (100s of milliseconds to seconds)
Is there a principled approach to decentralize the cloud for large scale replication?
Edge Data Management
We are making the world a better place through Paxos algorithms
Flexible Paxos [Howard et al. OPODIS 2016]
• Majority quorums for BOTH Leader Election AND Replication are too conservative
[Diagram: nodes 1 to 7 with two overlapping majority quorums]
Flexible Paxos
• Generalized Quorum Condition: only Leader Election Quorums and Replication Quorums must intersect.
• Decouple Leader Election Quorums from Replication Quorums
• Replication Quorums can be arbitrarily small, as long as every Leader Election Quorum intersects every Replication Quorum
• No changes to the Paxos algorithm
[Diagram: nodes 1 to 7 with a large Leader Election Quorum and a small Replication Quorum]
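The generalized quorum condition can be checked by brute force on the seven-node example. This is a minimal sketch assuming size-based quorums (every subset of a given size is a quorum); the helper names are hypothetical. For such quorums the condition reduces to election size + replication size > number of nodes.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, List


def quorums_of_size(nodes: Iterable[int], size: int) -> List[FrozenSet[int]]:
    """Every subset of `nodes` with exactly `size` members."""
    return [frozenset(q) for q in combinations(nodes, size)]


def every_pair_intersects(election_quorums, replication_quorums) -> bool:
    """Generalized quorum condition: each election quorum meets each replication quorum."""
    return all(e & r for e in election_quorums for r in replication_quorums)


nodes = range(1, 8)  # nodes 1..7, as in the slide

# Election quorums of 6 nodes allow replication quorums of only 2 nodes (6 + 2 > 7).
small_replication_ok = every_pair_intersects(quorums_of_size(nodes, 6), quorums_of_size(nodes, 2))

# Election quorums of 5 nodes are too small for that (5 + 2 = 7): {1..5} misses {6, 7}.
too_small = every_pair_intersects(quorums_of_size(nodes, 5), quorums_of_size(nodes, 2))
```

The trade-off is visible in the sizes: shrinking the replication quorum from 4 to 2 nodes makes the common case cheaper, at the price of a larger leader election quorum.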
Back to Edge Data Management
• Edge persistence: edge datacenters store copies of data
• Storage offloading: data is placed at the edge, near users
• A zone:
  • A set of nodes; zones are mutually exclusive
  • A datacenter plus edge nodes
  • Or edge nodes only
Beirut 2018
An edge‐aware Paxos
• Direct application of Flexible Paxos to zones.
• Elect a leader zone rather than a leader node.
Paxos:
• Replicate updates to a majority of all nodes
• Leader election: a majority of all nodes
Edge Paxos:
• Replicate updates to a majority of nodes in the leader zone
• Leader election: a majority from within all zones
[Diagram: Zone 1 to Zone 4, one of which is the Leader Zone]
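A small calculation shows what the comparison means in quorum sizes. This sketch reads "a majority from within all zones" as a majority of the nodes in every zone, which is an interpretation of the slide rather than something it states; the zone layout and function names are hypothetical.

```python
from typing import Dict


def majority(n: int) -> int:
    """Smallest number of nodes that forms a majority of n nodes."""
    return n // 2 + 1


def paxos_quorum_sizes(zone_sizes: Dict[str, int]) -> Dict[str, int]:
    """Classic Paxos ignores zones: both quorums are majorities of all nodes."""
    total = sum(zone_sizes.values())
    return {"replication": majority(total), "leader_election": majority(total)}


def edge_paxos_quorum_sizes(zone_sizes: Dict[str, int], leader_zone: str) -> Dict[str, int]:
    """Replication stays inside the leader zone; election needs a majority in every zone."""
    return {
        "replication": majority(zone_sizes[leader_zone]),
        "leader_election": sum(majority(size) for size in zone_sizes.values()),
    }


# Hypothetical deployment: four zones with three nodes each.
zone_sizes = {"zone1": 3, "zone2": 3, "zone3": 3, "zone4": 3}
classic = paxos_quorum_sizes(zone_sizes)
edge = edge_paxos_quorum_sizes(zone_sizes, leader_zone="zone1")
```

In this layout replication drops from 7 nodes spread across zones to 2 nodes inside the leader zone, while leader election grows from 7 to 8 nodes and must reach every zone.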
An edge‐aware, mobile Paxos
[Diagram: Zone 1 to Zone 4; one Leader Election followed by three Local replication steps]
CAN WE DO BETTER???
Expanding Quorums
• Dynamic, Expanding Leader Election Quorums:
  • A leader announces the Replication Quorum it will use
  • Future Leader Election Quorums need to intersect only the announced quorums
• Implementation
  • Intent Replication Quorums are piggybacked on the leader election phase
  • To detect Intents, Leader Election Quorums must intersect each other
  • If an announcement is detected, the Leader Election Quorum expands to intersect the announced Intent Replication Quorums
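The expansion rule can be sketched as a completion check for a leader election. This is a simplified model, not the protocol from the talk: intents are tracked per zone, a replication quorum is assumed to be a majority of the announced zone, and the base election quorum is assumed to be a majority of all nodes so that any two elections intersect. All names and the sample layout are hypothetical.

```python
from typing import Dict, Set


def majority_of(responders: Set[str], members: Set[str]) -> bool:
    """True if more than half of `members` are among the responders."""
    return len(responders & members) > len(members) // 2


def detected_intents(promises: Dict[str, Set[str]]) -> Set[str]:
    """Union of the intent zones piggybacked on the election replies."""
    intents: Set[str] = set()
    for announced in promises.values():
        intents |= announced
    return intents


def election_complete(promises: Dict[str, Set[str]], zones: Dict[str, Set[str]]) -> bool:
    """Base quorum reached, and the quorum has expanded into every announced zone."""
    responders = set(promises)
    all_nodes = set().union(*zones.values())
    if not majority_of(responders, all_nodes):
        return False
    return all(majority_of(responders, zones[z]) for z in detected_intents(promises))


# Five zones with three nodes each; promises map a node to the intents it has seen.
zones = {f"zone{i}": {f"z{i}n{j}" for j in range(1, 4)} for i in range(1, 6)}
promises = {n: set() for n in ["z2n1", "z2n2", "z2n3", "z3n1", "z3n2", "z3n3", "z4n1", "z4n2"]}
promises["z2n1"] = {"zone1"}  # an earlier leader announced that it replicates in zone 1
before_expansion = election_complete(promises, zones)

promises["z1n1"] = set()
promises["z1n2"] = set()
after_expansion = election_complete(promises, zones)
```

The election is incomplete while no node of zone 1 has replied, and completes once a majority of zone 1 joins the quorum.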
Expanding Quorums example
[Diagram: Zone 1 to Zone 5. Leader Election {intent: zone 1}, then Local replication; Leader Election {intent: zone 5}, with Leader Election expansion (marked X), then Local replication steps]
Leader Zone Expanding Quorums
• Can we design smaller Leader Election Quorums?
• Leader Zone: assign one zone as the Leader Manager Zone
• Leader Election Quorums: a majority of nodes in the Leader Manager Zone
  • All Leader Election Quorums intersect
• Use Intent Quorums to expand Leader Election Quorums.
• Especially useful if the aspiring leaders are close to each other
Allowing for Mobility: Leader Handoff
• Treat leadership as a logical role rather than a physical one
• Relinquish leadership to another node when the user moves
• Note: the node that hosted the previous leader remains functional
[Diagram: Zone A, Zone B, Zone C; the Leader hands off to a new Leader via relinquish(), passing the current state and slots []]
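The handoff in the diagram can be pictured with a toy in-memory model. The slide only names `relinquish()`, the current state, and `slots []`; everything else here (the `Node` class, its fields, the sample data) is a hypothetical illustration with no messaging or failure handling.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    name: str
    zone: str
    is_leader: bool = False
    state: Dict[str, str] = field(default_factory=dict)  # current replicated state
    slots: List[str] = field(default_factory=list)       # log slots known to the leader

    def relinquish(self, successor: "Node") -> None:
        """Hand the logical leader role, with state and slots, to `successor`."""
        if not self.is_leader:
            raise RuntimeError(f"{self.name} is not the leader")
        successor.state = dict(self.state)
        successor.slots = list(self.slots)
        self.is_leader = False       # the old host stays up, just without the role
        successor.is_leader = True


old_leader = Node("a1", zone="Zone A", is_leader=True, state={"k": "v"}, slots=["put k v"])
new_leader = Node("b1", zone="Zone B")
old_leader.relinquish(new_leader)  # the user moved from Zone A to Zone B
```

Because the previous host is still functional, the role can be transferred directly, without waiting for failure detection.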
Dynamic Paxos: a natural marriage with Edge Computing
Blockchains
• Many interesting (controversial?) problems in new guises.
• Distributed Systems: consensus, replication, etc.
• Data Management: transactions, replication, commitment, etc.
Origins of Blockchain: Traditional Banking Systems
Bitcoin