

  1. Distributed Systems and Databases of the Globe Unite! The Cloud, the Edge and Blockchains. Amr El Abbadi, University of California, Santa Barbara. Divy Agrawal, Mohammad Amiri, Sujaya Maiyya, Faisal Nawab (UCSC), Victor Zakhary. OPODIS 2018

  2. Protocols Supporting the Cloud
  • Scalability: shard or partition the data → commit protocols → 2PC.
  • Fault tolerance and fast access: replicate the data → state machine replication and consensus protocols → Paxos.

  3. Google's Spanner
  [Architecture diagram: an application access tier and an application execution tier run transactions with 2PL + 2PC across datacenters A through Z; the storage tier provides abstract replication via Paxos.]

  4. A Path for Unification

  5. Paxos

  6. Paxos: No-Failure Case
  • Leader election: initially, a leader is elected by a majority quorum.
  • Replication: the leader replicates new updates to a majority quorum.
  • Decision: propagate the decision to all, asynchronously.
  [Phase diagram: Leader Election (leader/proposer, majority) → Fault-Tolerant Agreement (majority) → Decision (asynchronous)]
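Below is a minimal, illustrative sketch of this failure-free path (not the talk's code): acceptors are plain in-memory objects, and a single `run_round` helper plays the leader, first gathering promises from a majority and then replicating one value to a majority before treating it as decided.

```python
from dataclasses import dataclass

@dataclass
class Acceptor:
    promised: int = -1       # highest ballot this acceptor has promised
    accepted: tuple = None   # (ballot, value) last accepted, if any

    def prepare(self, ballot):            # phase 1: leader election
        if ballot > self.promised:
            self.promised = ballot
            return True, self.accepted
        return False, None

    def accept(self, ballot, value):      # phase 2: replication
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted = (ballot, value)
            return True
        return False

def run_round(acceptors, ballot, value):
    majority = len(acceptors) // 2 + 1
    # Leader election: gather promises from a majority quorum.
    promises = [a.prepare(ballot) for a in acceptors]
    if sum(ok for ok, _ in promises) < majority:
        return None
    # Replication: the leader replicates the value to a majority quorum.
    acks = sum(a.accept(ballot, value) for a in acceptors)
    if acks < majority:
        return None
    # Decision: value is chosen; propagation to all is asynchronous (omitted).
    return value

print(run_round([Acceptor() for _ in range(5)], ballot=1, value="x"))  # -> x
```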

  7. Paxos: Failure Case
  • Leader election: if the leader fails, a new leader is elected.
  • Value discovery: the new leader must also discover any value for which agreement may already have been reached.
  [Phase diagram: Leader Election → Fault-Tolerant Agreement (majority) → Decision]
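A small sketch of the value-discovery rule a new leader applies, assuming promises arrive as `(accepted_ballot, accepted_value)` tuples or `None`; the function name is illustrative.

```python
def discover_value(promises, my_value):
    """promises: one (accepted_ballot, accepted_value) or None per acceptor in the quorum."""
    accepted = [p for p in promises if p is not None]
    if not accepted:
        return my_value                              # nothing possibly chosen: free choice
    return max(accepted, key=lambda p: p[0])[1]      # must re-propose this value

# A quorum member already accepted "x" at ballot 2, so the new leader must keep "x".
print(discover_value([None, (1, "y"), (2, "x")], my_value="z"))  # -> x
```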

  8. Atomic Commitment

  9. Two-Phase Commit: No-Failure Case
  • Leader: initially, a coordinator is chosen by the transaction manager.
  • Value discovery: the coordinator collects votes from ALL cohorts. If all vote yes, Decision = Commit; if any votes no (or fails), Decision = Abort.
  • Fault tolerance: make the decision persistent on disk.
  • Decision: send the decision to all cohorts.
  [Phase diagram: Value Discovery (coordinator, all cohorts) → Decision made fault-tolerant by storing on disk → Decision sent to all]
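A minimal sketch of the coordinator side of these phases, with cohorts modeled as vote callables and the persistence step reduced to appending to an in-memory log; all names are illustrative.

```python
def two_phase_commit(cohorts, log):
    # Value discovery: every cohort must vote yes; any "no" or failure means abort.
    try:
        votes = [cohort() for cohort in cohorts]
    except Exception:            # a failed cohort counts as a no vote
        votes = [False]
    decision = "COMMIT" if all(votes) else "ABORT"
    log.append(decision)         # fault tolerance: force the decision to stable storage
    return decision              # decision phase: send the outcome to all cohorts (omitted)

log = []
print(two_phase_commit([lambda: True, lambda: True], log))   # -> COMMIT
print(two_phase_commit([lambda: True, lambda: False], log))  # -> ABORT
```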

  10. Three-Phase Commit
  • 2PC can block.
  • Solution: three-phase commit.
  • Replicate the decision to other cohorts (as in Paxos) to avoid blocking on site failure.
  [Phase diagram: Value Discovery (coordinator, all cohorts) → Fault-Tolerant Agreement (majority) → Decision]
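A hedged sketch of the extra step this Paxos-like reading of 3PC adds: the tentative decision must reach a majority of cohorts before it can safely be finalized; the ack count here is simulated, not measured.

```python
def three_phase_commit(votes, precommit_acks, n_cohorts):
    # Value discovery: all cohorts must vote yes to commit.
    decision = "COMMIT" if all(votes) else "ABORT"
    # Fault-tolerant agreement: the tentative decision must reach a majority
    # of cohorts before being finalized, so a new leader can recover it.
    if precommit_acks < n_cohorts // 2 + 1:
        return "RETRY-AGREEMENT"   # cannot finalize yet; keep replicating or run termination
    return decision                # decision phase: send the final outcome to all cohorts

print(three_phase_commit([True, True, True], precommit_acks=2, n_cohorts=3))  # -> COMMIT
```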

  11. Three-Phase Commit: Termination
  • If the leader fails or is partitioned away, elect a new leader and execute the termination protocol.
  [Phase diagram: Leader Election & Value Discovery (leader, majority) → Fault-Tolerant Agreement (majority of cohorts) → Decision]

  12. Common Phases Observed
  • Paxos and 2PC/3PC are leader-based protocols.
  • Agreement on a single value is the main goal.
  • Both protocols make the decided value fault-tolerant.
  • Both disseminate the decision, typically asynchronously.

  13. Consensus & Commitment (C&C) Framework
  [Phase diagram: Leader Election → Value Discovery → Fault-Tolerant Agreement → Decision]
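The framework can be read as four parameterized phases. The sketch below (illustrative names, not taken from the paper) shows how Paxos and 2PC instantiate the same skeleton with different quorums.

```python
from dataclasses import dataclass

@dataclass
class Phase:
    name: str
    quorum: str   # which participants must respond in this phase

paxos = [Phase("Leader Election", "majority"),
         Phase("Value Discovery", "piggybacked on the election majority"),
         Phase("Fault-Tolerant Agreement", "majority"),
         Phase("Decision", "all, asynchronously")]

two_pc = [Phase("Leader Election", "coordinator chosen by transaction manager"),
          Phase("Value Discovery", "ALL cohorts vote"),
          Phase("Fault-Tolerant Agreement", "coordinator's disk"),
          Phase("Decision", "all cohorts")]

for proto, phases in [("Paxos", paxos), ("2PC", two_pc)]:
    print(proto, "->", [(p.name, p.quorum) for p in phases])
```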

  14. Paxos Atomic Commitment (PAC)
  • Any process can terminate a transaction: leader election.
  • No separate termination case (as in Paxos).
  [Phase diagram: Leader Election & Value Discovery (leader, majority) → Fault-Tolerant Agreement (majority) → Decision (all cohorts)]
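A heavily simplified, hypothetical sketch of how a PAC-style leader might combine election and value discovery in one round; the reply format and the abort rule are assumptions for illustration, and a real implementation must handle partially replicated decisions with more care.

```python
def pac_terminate(replies, n_cohorts):
    """replies: (prior_decision or None, vote) from the cohorts that elected this leader."""
    if len(replies) < n_cohorts // 2 + 1:
        return None                                 # not elected: cannot terminate
    prior = [d for d, _ in replies if d is not None]
    if prior:
        return prior[0]                             # value discovery: adopt the prior decision
    votes = [v for _, v in replies]
    # No prior decision found in the quorum: commit only with yes votes from ALL
    # cohorts; otherwise it is safe (in this simplified model) to abort.
    return "COMMIT" if len(replies) == n_cohorts and all(votes) else "ABORT"

print(pac_terminate([(None, True), ("COMMIT", True)], n_cohorts=3))  # -> COMMIT
```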

  15. 2PC / State Machine Replication (SMR)
  • Alternative approach to achieve fault tolerance: replicate the state of each process for persistence.
  • Used by Spanner, and by Gray and Lamport [2006].
  • Layered architecture: 2PC on top of SMR.
  • 2PC runs among the coordinator and cohorts.
  • SMR runs among each shard's leader and its replicas.
  [Diagram: coordinator leader and cohort leaders, each backed by fault-tolerant replicas for persistence]

  16. 2PC / State Machine Replication (SMR)
  • Alternative approach to achieve fault tolerance: replicate the state of each process for persistence.
  • Used by Spanner, and by Gray and Lamport [2006].
  • Layered architecture: 2PC on top of SMR.
  • 2PC runs among the coordinator leader and the cohort leaders.
  • SMR runs among each shard leader and its replicas.
  [Diagram: value discovery and decision flow between the coordinator leader and the cohort leaders; each leader persists its state to a majority of its fault-tolerant replicas]
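A sketch of the layered idea under simplifying assumptions: the 2PC steps run only among shard leaders, and each leader calls an illustrative `replicate()` helper standing in for its shard's SMR layer.

```python
def replicate(shard_log, entry, n_replicas):
    """Stand-in for the SMR layer: append the entry and pretend a replica majority acked."""
    shard_log.append(entry)
    return n_replicas // 2 + 1

def layered_commit(shard_votes, shard_logs, n_replicas=3):
    # 2PC layer, among shard leaders only: each vote is first made fault-tolerant.
    for vote, log in zip(shard_votes, shard_logs):
        replicate(log, ("VOTE", vote), n_replicas)
    decision = "COMMIT" if all(shard_votes) else "ABORT"
    # The decision is likewise replicated within every shard before it is final.
    for log in shard_logs:
        replicate(log, ("DECISION", decision), n_replicas)
    return decision

logs = [[], []]
print(layered_commit([True, True], logs), logs)
```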

  17. Generalized PAC (G-PAC)
  • Follows the abstractions of C&C.
  • Flattened architecture: no notion of cohort leader and replica (coordinator → all identical replicas).
  • Reduces one round trip of communication.
  • Related to other protocols that consolidate consensus and commitment, such as TAPIR [Zhang, SOSP 2015] and Janus [Mu, OSDI 2016].
  • These related protocols make restrictive assumptions.

  18. G-PAC (Generalized Paxos Atomic Commit)
  [Diagram: a coordinator with Cohort 1 replicas and Cohort 2 replicas; phases Leader Election + Value Discovery → Fault-Tolerant Agreement → Decision; quorum labels: "a majority of replicas from ALL cohorts" and "a majority of replicas from a majority of cohorts"]
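A sketch of the two quorum conditions, under one reading of the slide's labels: value discovery needs a majority of replicas from every cohort, while fault-tolerant agreement needs a majority of replicas from a majority of cohorts. Here `acks` is an assumed per-cohort count of replica responses, and the helper names are illustrative.

```python
def majority(n):
    return n // 2 + 1

def value_discovery_quorum(acks, replicas_per_cohort):
    # A majority of replicas from ALL cohorts must respond.
    return all(a >= majority(replicas_per_cohort) for a in acks)

def agreement_quorum(acks, replicas_per_cohort):
    # A majority of replicas from a majority of cohorts suffices.
    cohorts_ok = sum(a >= majority(replicas_per_cohort) for a in acks)
    return cohorts_ok >= majority(len(acks))

acks = [2, 2, 1]                        # 3 cohorts, 3 replicas each
print(value_discovery_quorum(acks, 3))  # False: cohort 3 has no replica majority
print(agreement_quorum(acks, 3))        # True: 2 of 3 cohorts have a majority
```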

  19. Consensus & Commitment (C&C) Framework
  [Phase diagram: Leader Election → Value Discovery → Fault-Tolerant Agreement → Decision]
  • Useful for modeling many existing data management protocols, as well as for proposing new ones.

  20. Consensus for Edge Data Management

  21. The Future of Web/Cloud Applications
  • Emerging technologies:
    • Business analytics
    • Virtual/augmented reality
    • Data science
    • Sensors/IoT

  22. The Cloud
  • Big potential, but bigger challenges.
  • Application requirements: real-time (low latency), continuous data flows (high throughput).
  • Challenge 1: the cloud is far away, hundreds of milliseconds to seconds away.

  23. Is there a principled approach to decentralize the cloud for large-scale replication?

  24. Edge Data Management

  25. We are making the world a better place through Paxos algorithms

  26. Flexible Paxos [Howard et al., OPODIS 2016]
  • Majority quorums for BOTH leader election AND replication are too conservative.
  [Diagram: 7 nodes, with two overlapping majority quorums]

  27. Flexible Paxos
  • Generalized quorum condition: only leader election quorums and replication quorums must intersect.
  • Decouple leader election quorums from replication quorums.
  • Replication quorums can be arbitrarily small, as long as every leader election quorum intersects every replication quorum.
  • No changes to the Paxos algorithm.
  [Diagram: 7 nodes, with a large leader election quorum and a small replication quorum]
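With size-based quorums over n nodes, the generalized condition reduces to |Q_election| + |Q_replication| > n. A tiny check, with example quorum sizes chosen for illustration:

```python
def fpaxos_safe(n, election_quorum_size, replication_quorum_size):
    # Two quorums of these sizes over n nodes are guaranteed to intersect
    # exactly when their sizes sum to more than n.
    return election_quorum_size + replication_quorum_size > n

n = 7
print(fpaxos_safe(n, 4, 4))  # True:  classic Paxos, two majorities always intersect
print(fpaxos_safe(n, 6, 2))  # True:  FPaxos, tiny replication quorum, large election quorum
print(fpaxos_safe(n, 4, 3))  # False: a size-4 and a size-3 quorum may be disjoint
```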

  28. Back to Edge Data Management
  • Edge persistence: edge datacenters store copies of the data.
  • Storage offloading: data is placed in the edge, near users.

  29. A zone:
  • a mutually exclusive set of nodes
  • a datacenter plus edge nodes, or edge nodes only

  30. An Edge-Aware Paxos
  • Direct application of Flexible Paxos to zones.
  • Elect a leader zone rather than a leader node.
  • Paxos: replicate updates to a majority of all nodes; leader election requires a majority of all nodes.
  • Edge Paxos: replicate updates to a majority of nodes in the leader zone; leader election requires a majority from within all zones.
  [Diagram: a leader zone and zones 1-4]
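A sketch of these two quorum rules, with zones modeled simply as node counts (an assumption for illustration):

```python
def majority(n):
    return n // 2 + 1

def replication_ok(acks_in_leader_zone, leader_zone_size):
    # Updates only need a majority of the leader zone's nodes.
    return acks_in_leader_zone >= majority(leader_zone_size)

def election_ok(acks_per_zone, zone_sizes):
    # Leader election needs a majority of nodes inside EVERY zone, so any
    # election quorum intersects any replication quorum in the leader zone.
    return all(a >= majority(s) for a, s in zip(acks_per_zone, zone_sizes))

zones = [3, 3, 3, 3]                    # four zones of three edge nodes each
print(replication_ok(2, zones[0]))      # True: local majority in the leader zone
print(election_ok([2, 2, 2, 1], zones)) # False: the last zone lacks a majority
```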

  31. An Edge-Aware, Mobile Paxos
  [Diagram: zones 1-4; one leader election, then local replication in different zones over time]

  32. CAN WE DO BETTER???

  33. Expanding Quorums
  • Dynamic expanding leader election quorums:
    • A leader announces the replication quorum it will use.
    • Future leader election quorums need intersect only the announced quorums.
  • Implementation:
    • Intent replication quorums are piggybacked on the leader election phase.
    • To detect intents, leader election quorums must intersect each other.
    • If an announcement is detected, the leader election quorum expands to intersect the announced intent replication quorums.
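One possible reading of this mechanism, sketched with zones as dictionary keys (the data structures are illustrative): a candidate's leader election quorum starts from its base zone and expands to cover a majority of every zone named in a detected intent.

```python
def majority(n):
    return n // 2 + 1

def election_quorum(base_zone, detected_intents, zone_sizes):
    """Return the per-zone node counts a candidate leader must reach."""
    needed = {base_zone: majority(zone_sizes[base_zone])}
    for zone in detected_intents:          # expand to cover every announced intent
        needed[zone] = majority(zone_sizes[zone])
    return needed

zone_sizes = {"zone1": 3, "zone5": 5}
# First election in zone1, no intents detected yet:
print(election_quorum("zone1", [], zone_sizes))          # {'zone1': 2}
# A later election detects zone1's announced intent and must expand:
print(election_quorum("zone5", ["zone1"], zone_sizes))   # {'zone5': 3, 'zone1': 2}
```

Fixing `base_zone` to a single designated zone gives roughly the Leader Manager Zone variant on slide 35, where all leader election quorums intersect by construction.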

  34. Expanding Quorums: Example
  [Diagram: zones 1-5; a leader election with {intent: zone 1} followed by local replication, a later leader election with {intent: zone 5}, a leader election quorum expansion, and further local replication]

  35. Leader Zone Expanding Quorums
  • Can we design smaller leader election quorums?
  • Leader zone: assign one zone as the Leader Manager Zone.
  • Leader election quorums: a majority of nodes in the Leader Manager Zone.
  • All leader election quorums therefore intersect.
  • Use intent quorums to expand leader election quorums.
  • Especially useful if the aspiring leaders are close to each other.

  36. Allowing for Mobility: Leader Handoff
  • Treat leadership as a logical role instead of a physical one.
  • Relinquish leadership to another node when the user moves.
  • Note: the node hosting the previous leader is still functional.
  [Diagram: zones A-C; the current leader calls relinquish(), handing its current state and slots[] to the new leader]
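A sketch of the handoff as a plain state transfer, mirroring the slide's relinquish(current state, slots[]); the class and field names are illustrative, not from the talk.

```python
from dataclasses import dataclass, field

@dataclass
class Leader:
    node_id: str
    state: dict = field(default_factory=dict)
    slots: list = field(default_factory=list)   # log slots still in flight

    def relinquish(self, new_node_id):
        """Hand the logical leader role to another node without a new election."""
        successor = Leader(new_node_id, dict(self.state), list(self.slots))
        self.state, self.slots = {}, []          # the old node steps down but stays functional
        return successor

old = Leader("zoneA-n1", {"x": 1}, slots=[("slot7", "pending")])
new = old.relinquish("zoneB-n4")
print(new.node_id, new.state, new.slots)
```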

  37. Dynamic Paxos: a natural marriage with Edge Computing

  38. Blockchains
  • Many interesting (controversial?) problems in new guises.
  • Distributed systems: consensus, replication, etc.
  • Data management: transactions, replication, commitment, etc.

  39. Origins of Blockchain: Traditional Banking Systems

  40. Bitcoin
