Distributed OLTP Databases (Part II)
Lecture #23, Database Systems (15-445/15-645), Andy Pavlo, Carnegie Mellon University, Fall 2018
  1. Distributed OLTP Databases (Part II), Lecture #23, Database Systems. Andy Pavlo, Computer Science 15-445/15-645, Carnegie Mellon Univ., Fall 2018

  2. LAST CLASS: System Architectures → Shared-Memory, Shared-Disk, Shared-Nothing. Partitioning/Sharding → Hash, Range, Round Robin. Transaction Coordination → Centralized vs. Decentralized. CMU 15-445/645 (Fall 2018)

  3.–5. DECENTRALIZED COORDINATOR [diagram, three steps: the Application Server sends a Begin Request to partition P1, which coordinates partitions P1–P4; it then sends its queries to the partitions; finally it sends a Commit Request, asking the partitions "Safe to commit?"]

  6. OBSERVATION: We have not discussed how to ensure that all nodes agree to commit a txn, and then how to make sure it does commit if we decide that it should. → What happens if a node fails? → What happens if our messages show up late? → What happens if we don't wait for every node to agree?

  7. TODAY'S AGENDA: Atomic Commit Protocols; Replication; Consistency Issues (CAP); Federated Databases

  8. ATOMIC COMMIT PROTOCOL: When a multi-node txn finishes, the DBMS needs to ask all of the nodes involved whether it is safe to commit. Examples: → Two-Phase Commit → Three-Phase Commit (not used) → Paxos → Raft → ZAB (Apache ZooKeeper) → Viewstamped Replication

  9.–14. TWO-PHASE COMMIT (SUCCESS) [diagram, six steps: the Application Server sends a Commit Request to the Coordinator (Node 1); Phase 1: the Coordinator sends Prepare to the Participants (Nodes 2 and 3); each Participant votes OK; Phase 2: the Coordinator sends Commit; each Participant acknowledges with OK; the Coordinator returns Success! to the Application Server]

  15.–19. TWO-PHASE COMMIT (ABORT) [diagram, five steps: the Application Server sends a Commit Request to the Coordinator (Node 1); Phase 1: the Coordinator sends Prepare; a Participant (Node 2) votes ABORT!; the Coordinator immediately returns Aborted to the Application Server; Phase 2: the Coordinator sends Abort to the Participants, which acknowledge with OK]
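The success and abort sequences above can be sketched in code. This is a minimal single-process sketch, not the lecture's implementation: the `Participant` class, its `can_commit` flag, and the in-memory `coordinator_log` list are all stand-ins for real nodes, local validation, and a stable-storage log.

```python
class Participant:
    """A minimal in-memory participant; `can_commit` stands in for the
    node's local validation (locks held, constraints satisfied, ...)."""
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase 1: vote OK only if the txn can be made durable locally.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def finish(self, commit):
        # Phase 2: apply the coordinator's decision and acknowledge.
        self.state = "committed" if commit else "aborted"
        return "OK"


def two_phase_commit(coordinator_log, participants):
    # Phase 1 (Prepare): every participant must vote OK to commit.
    votes = [p.prepare() for p in participants]
    decision = all(votes)
    coordinator_log.append("commit" if decision else "abort")  # stable log write
    # Phase 2 (Commit/Abort): deliver the decision, wait for every ack.
    acks = [p.finish(decision) for p in participants]
    assert all(a == "OK" for a in acks)
    return decision
```

A single abort vote flips the decision for everyone: the coordinator logs "abort" and all participants, including those that voted OK, roll back.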

  20. 2PC OPTIMIZATIONS: Early Prepare Voting → If you send a query to a remote node that you know is the last one you will execute there, then that node can return its vote for the prepare phase along with the query result. Early Acknowledgement After Prepare → If all nodes vote to commit a txn, the coordinator can send the client an acknowledgement that their txn was successful before the commit phase finishes.

  21.–25. EARLY ACKNOWLEDGEMENT [diagram, five steps: the Application Server sends a Commit Request to the Coordinator (Node 1); Phase 1: the Coordinator sends Prepare to the Participants (Nodes 2 and 3); once both vote OK, the Coordinator returns Success! to the Application Server; only then does it run Phase 2, sending Commit and collecting the Participants' OK acknowledgements]
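A hypothetical sketch of this optimization: the coordinator replies to the client as soon as every participant has voted OK in Phase 1, and runs Phase 2 in a background thread. The `Node` class and `reply` callback are stand-ins, not the lecture's API.

```python
import threading

class Node:
    """Stand-in participant that always votes OK in Phase 1."""
    def __init__(self):
        self.state = "active"

    def prepare(self):
        self.state = "prepared"
        return True

    def commit(self):
        self.state = "committed"


def commit_with_early_ack(nodes, reply):
    # Phase 1: Prepare. Safe to ack early because every node has promised
    # it can commit; the decision is already durable at the coordinator.
    if all(n.prepare() for n in nodes):
        reply("Success!")  # client hears back before Phase 2 finishes
        worker = threading.Thread(target=lambda: [n.commit() for n in nodes])
        worker.start()
        return worker
    reply("Aborted")
    return None
```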

  26. TWO-PHASE COMMIT: Each node has to record the outcome of each phase in a stable storage log. What happens if the coordinator crashes? → Participants have to decide what to do. What happens if a participant crashes? → The coordinator assumes that the participant responded with an abort if it hasn't sent an acknowledgement yet.
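The second rule on this slide can be sketched with a timeout: a missing prepare acknowledgement counts as an abort vote once the coordinator stops waiting. The queue-based transport and the timeout value are illustrative assumptions.

```python
import queue

def collect_votes(vote_queue, expected, timeout_s=0.05):
    """Gather `expected` prepare votes; a participant that never answers
    (e.g., it crashed) is counted as having voted to abort."""
    votes = []
    for _ in range(expected):
        try:
            votes.append(vote_queue.get(timeout=timeout_s))
        except queue.Empty:
            votes.append(False)  # silence is treated as an abort vote
    return all(votes)
```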

  27.–28. PAXOS: Consensus protocol where a coordinator proposes an outcome (e.g., commit or abort) and then the participants vote on whether that outcome should succeed. Does not block if a majority of participants are available, and has provably minimal message delays in the best case. [photo: Leslie Lamport]

  29.–35. PAXOS [diagram, seven steps: the Application Server sends a Commit Request to the Proposer (Node 1); the Proposer sends Propose to the Acceptors (Nodes 2–4); Node 2 fails (X), but the surviving majority reply Agree; the Proposer sends Commit; the Acceptors reply Accept; the Proposer returns Success! to the Application Server]

  36.–44. PAXOS [timeline diagram with two competing Proposers and a group of Acceptors: the first Proposer sends Propose(n) and the Acceptors reply Agree(n); a second Proposer then sends Propose(n+1); when the first Proposer sends Commit(n), the Acceptors reply Reject(n, n+1) because they have already agreed to the higher-numbered proposal; they send Agree(n+1) to the second Proposer, which then sends Commit(n+1) and receives Accept(n+1)]
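The timeline above can be sketched in a few lines, using the slide's message names (Propose/Agree/Commit/Accept/Reject) rather than the classical Prepare/Promise/Accept terminology. Acceptors track the highest proposal number they have agreed to, which is why Commit(n) is rejected once a competing Propose(n+1) has been seen. This is a single-decree sketch, not a complete Paxos implementation (it omits, e.g., adopting a previously accepted value).

```python
class Acceptor:
    def __init__(self):
        self.promised = -1    # highest proposal number agreed to so far
        self.accepted = None  # (n, value) of the accepted proposal, if any

    def on_propose(self, n):
        # Agree only to proposal numbers above any prior agreement.
        if n > self.promised:
            self.promised = n
            return ("Agree", n)
        return ("Reject", self.promised)

    def on_commit(self, n, value):
        # Accept the value only if no higher-numbered proposal intervened.
        if n >= self.promised:
            self.accepted = (n, value)
            return ("Accept", n)
        return ("Reject", self.promised)


def propose(acceptors, n, value):
    """Run one round; the outcome is chosen only with a majority."""
    agreed = [a for a in acceptors if a.on_propose(n)[0] == "Agree"]
    if len(agreed) * 2 <= len(acceptors):
        return False  # no majority agreed; retry with a higher n
    accepts = [a for a in agreed if a.on_commit(n, value)[0] == "Accept"]
    return len(accepts) * 2 > len(acceptors)
```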

  45. MULTI-PAXOS: If the system elects a single leader that is in charge of proposing changes for some period of time, then it can skip the Propose phase. → Fall back to full Paxos whenever there is a failure. The system has to periodically renew the leader.
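A toy sketch of the fast path, under the assumption that leadership is granted for a fixed lease: while the lease holds, the leader appends values without a Propose/Agree round; once it expires, a full Paxos round would be needed to re-elect. The `Leader` class and lease mechanics are illustrative, not the lecture's design.

```python
import time

class Leader:
    """Holds leadership for `lease_seconds`; proposes on the fast path."""
    def __init__(self, lease_seconds):
        self.lease_expiry = time.monotonic() + lease_seconds
        self.log = []

    def propose(self, value):
        if time.monotonic() >= self.lease_expiry:
            # Lease lapsed: must fall back to a full Paxos round.
            raise RuntimeError("lease expired: re-elect via full Paxos")
        # Fast path: append directly, skipping Propose/Agree messages.
        self.log.append(value)
        return len(self.log) - 1
```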

  46. 2PC VS. PAXOS: Two-Phase Commit → Blocks if the coordinator fails after the prepare message is sent, until the coordinator recovers. Paxos → Non-blocking as long as a majority of participants are alive, provided there is a sufficiently long period without further failures.

  47. REPLICATION: The DBMS can replicate data across redundant nodes to increase availability. Design Decisions: → Replica Configuration → Propagation Scheme → Propagation Timing
