Database Systems 15-445/15-645 (Fall 2018)
Andy Pavlo, Computer Science, Carnegie Mellon Univ.

Lecture #23: Distributed OLTP Databases (Part II)
LAST CLASS
System Architectures
→ Shared-Memory, Shared-Disk, Shared-Nothing
Partitioning/Sharding
→ Hash, Range, Round Robin
Transaction Coordination
→ Centralized vs. Decentralized
DECENTRALIZED COORDINATOR

(Diagram: an application server sends its begin request to one of the partitions (P1-P4), which serves as the coordinator. The application's queries are routed to whichever partitions hold the data they touch. At the end, the application sends the commit request, and the coordinator asks the other partitions whether it is safe to commit.)
OBSERVATION

We have not discussed how to ensure that all nodes agree to commit a txn, and then how to make sure the txn actually commits if we decide that it should.
→ What happens if a node fails?
→ What happens if our messages show up late?
→ What happens if we don't wait for every node to agree?
TODAY'S AGENDA

Atomic Commit Protocols
Replication
Consistency Issues (CAP)
Federated Databases
ATOMIC COMMIT PROTOCOL

When a multi-node txn finishes, the DBMS needs to ask all of the nodes involved whether it is safe to commit. Examples:
→ Two-Phase Commit
→ Three-Phase Commit (not used)
→ Paxos
→ Raft
→ ZAB (Apache Zookeeper)
→ Viewstamped Replication
TWO-PHASE COMMIT (SUCCESS)

(Diagram: the application server sends its commit request to Node 1, the coordinator; Nodes 2 and 3 are participants. Phase 1 (Prepare): the coordinator asks each participant whether it is safe to commit, and each replies OK. Phase 2 (Commit): the coordinator tells each participant to commit; each replies OK, and the coordinator returns "Success!" to the application server.)
TWO-PHASE COMMIT (ABORT)

(Diagram: same setup. Phase 1 (Prepare): the coordinator asks each participant whether it is safe to commit, but one participant replies ABORT. The coordinator immediately tells the application server that the txn is aborted, then sends Phase 2 (Abort) messages to the participants, each of which acknowledges with OK.)
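A minimal sketch of the coordinator's side of both outcomes, assuming hypothetical send_to/recv_from messaging helpers; the lecture does not prescribe an implementation, and a real coordinator also logs each phase's outcome to stable storage before sending it.

```python
# Sketch of a 2PC coordinator; send_to/recv_from are assumed helpers.

def two_phase_commit(participants, txn_id, send_to, recv_from):
    # Phase 1 (Prepare): ask every participant whether it can commit.
    for node in participants:
        send_to(node, ("PREPARE", txn_id))
    votes = [recv_from(node) for node in participants]

    if all(vote == "OK" for vote in votes):
        # Phase 2 (Commit): unanimous OK votes, so tell everyone to commit.
        for node in participants:
            send_to(node, ("COMMIT", txn_id))
        for node in participants:
            recv_from(node)          # collect the commit acknowledgements
        return "SUCCESS"

    # Any ABORT vote forces Phase 2 (Abort) at every participant.
    for node in participants:
        send_to(node, ("ABORT", txn_id))
    for node in participants:
        recv_from(node)              # collect the abort acknowledgements
    return "ABORTED"
```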
2PC OPTIMIZATIONS

Early Prepare Voting
→ If you send a query to a remote node that you know will be the last one you execute there, then that node returns its vote for the prepare phase along with the query result.

Early Acknowledgement After Prepare
→ If all nodes vote to commit a txn, the coordinator can send the client an acknowledgement that the txn was successful before the commit phase finishes (illustrated below).
EARLY ACKNOWLEDGEMENT

(Diagram: the application server sends its commit request; the coordinator runs Phase 1 (Prepare) and collects OK votes from both participants. As soon as all the votes are in, it returns "Success!" to the application server, and only then runs Phase 2 (Commit) and collects the final OKs.)
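The early-acknowledgement optimization only reorders the coordinator's steps relative to the sketch above; a variant under the same assumptions:

```python
# Early acknowledgement after prepare: the client learns the outcome
# as soon as the votes are in, before Phase 2 finishes.

def two_phase_commit_early_ack(participants, txn_id,
                               send_to, recv_from, notify_client):
    for node in participants:
        send_to(node, ("PREPARE", txn_id))        # Phase 1 (Prepare)
    votes = [recv_from(node) for node in participants]

    decision = "COMMIT" if all(v == "OK" for v in votes) else "ABORT"
    notify_client(decision)        # the outcome is already decided here
    for node in participants:
        send_to(node, (decision, txn_id))         # Phase 2
    for node in participants:
        recv_from(node)            # final acknowledgements arrive later
```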
TWO-PHASE COMMIT

Each node has to record the outcome of each phase in a stable storage log.

What happens if the coordinator crashes?
→ Participants have to decide what to do.

What happens if a participant crashes?
→ The coordinator assumes that it responded with an abort if it has not sent an acknowledgement yet.
PAXOS

Consensus protocol where a coordinator proposes an outcome (e.g., commit or abort) and then the participants vote on whether that outcome should succeed.

Does not block if a majority of participants are available, and has provably minimal message delays in the best case.

(Pictured: Leslie Lamport.)
PAXOS

(Diagram: the application server sends its commit request to Node 1, the proposer; Nodes 2-4 are acceptors. The proposer sends Propose to the acceptors; a majority of them reply Agree (the protocol does not need every acceptor to respond). The proposer then sends Commit; the acceptors reply Accept, and the proposer returns "Success!" to the application server.)
PAXOS

(Timeline diagram with two proposers and a set of acceptors: Proposer 1 sends Propose(n) and the acceptors reply Agree(n). Proposer 2 then sends Propose(n+1). When Proposer 1 follows up with Commit(n), the acceptors answer Reject(n, n+1) because they have already seen the higher proposal number, and they send Agree(n+1) to Proposer 2. Proposer 2's Commit(n+1) is then answered with Accept(n+1).)
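A sketch of the acceptor rule that produces the Reject(n, n+1) behavior above: an acceptor promises the highest proposal number it has seen and rejects anything older. The class and its method names are illustrative, not an API from the lecture; the message names follow the diagram.

```python
class Acceptor:
    """Illustrative Paxos acceptor state for one consensus instance."""

    def __init__(self):
        self.promised_n = -1    # highest proposal number promised so far
        self.accepted = None    # (n, value) of the last accepted proposal

    def on_propose(self, n):
        # Agree only to a proposal numbered higher than any seen before.
        if n > self.promised_n:
            self.promised_n = n
            return ("AGREE", n, self.accepted)
        return ("REJECT", n, self.promised_n)    # e.g., Reject(n, n+1)

    def on_commit(self, n, value):
        # Accept unless a higher-numbered proposal has intervened.
        if n >= self.promised_n:
            self.promised_n = n
            self.accepted = (n, value)
            return ("ACCEPT", n)
        return ("REJECT", n, self.promised_n)
```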
MULTI-PAXOS

If the system elects a single leader that is in charge of proposing changes for some period, then it can skip the Propose phase.
→ Fall back to full Paxos whenever there is a failure.

The system has to periodically renew who the leader is.
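A sketch of the Multi-Paxos fast path under these assumptions, reusing the hypothetical Acceptor above: a stable leader that already holds promises for round n sends only the Commit step.

```python
def leader_commit(acceptors, n, value):
    # Stable leader already holds promises for round n: skip Propose.
    acks = [acc.on_commit(n, value) for acc in acceptors]
    if sum(1 for ack in acks if ack[0] == "ACCEPT") > len(acceptors) // 2:
        return "CHOSEN"
    # A rejection means another proposer outbid the leader: fall back
    # to full Paxos (Propose phase and all) to recover.
    return "RUN_FULL_PAXOS"
```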
2PC VS. PAXOS

Two-Phase Commit
→ Blocks if the coordinator fails after the prepare message is sent, until the coordinator recovers.

Paxos
→ Non-blocking as long as a majority of participants are alive, provided there is a sufficiently long period without further failures.
REPLICATION

The DBMS can replicate data across redundant nodes to increase availability.

Design Decisions:
→ Replica Configuration
→ Propagation Scheme
→ Propagation Timing
REPLICA CONFIGURATIONS

Approach #1: Master-Replica
→ All updates go to a designated master for each object.
→ The master then propagates those updates to its replicas.
→ Read-only txns may be allowed to access replicas.
→ If the master goes down, then hold an election to select a new master.

Approach #2: Multi-Master
→ Txns can update data objects at any replica.
→ Replicas synchronize with each other.
REPLICA CONFIGURATIONS

(Diagram: in the master-replica configuration, all writes for partition P1 go to the master, which propagates them to the P1 replicas; reads can be served by any copy. In the multi-master configuration, Node 1 and Node 2 each hold P1 and each accepts both writes and reads, synchronizing with the other.)
K-SAFETY

K-safety is a threshold for determining the fault tolerance of the replicated database. The value K represents the number of replicas per data object that must exist at all times. If the number of replicas goes below this threshold, then the DBMS halts execution and takes itself offline.
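A sketch of the k-safety rule as stated; replica_counts is a hypothetical map from each data object to its number of currently live replicas.

```python
def enforce_k_safety(replica_counts, k):
    for obj, count in replica_counts.items():
        if count < k:
            # Below the threshold: the DBMS halts and takes itself offline.
            raise SystemExit(f"k-safety violated for {obj}: "
                             f"{count} replicas < k={k}")

enforce_k_safety({"P1": 2, "P2": 3}, k=2)   # fine: every object has >= 2
```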
PROPAGATION SCHEME

When a txn commits on a replicated database, the DBMS has to decide whether it must wait for that txn's changes to propagate to other nodes before it can send the acknowledgement to the application.

Propagation levels:
→ Synchronous
→ Asynchronous
→ Semi-Synchronous
PROPAGATION SCHEME

Approach #1: Synchronous
→ The master sends updates to replicas and then waits for them to acknowledge that they fully applied (i.e., logged) the changes.
(Diagram: the client asks "Commit?"; the master sends "Flush?" to the replicas, waits for each one's "Ack" after it flushes ("Flush!"), and only then acknowledges the client.)

Approach #2: Asynchronous
→ The master immediately returns the acknowledgement to the client without waiting for replicas to apply the changes.
(Diagram: the client asks "Commit?"; the master acknowledges at once and sends "Flush?" to the replicas in the background.)
PROPAGATION SCHEME

Approach #3: Semi-Synchronous
→ Replicas immediately send acknowledgements without logging the changes first.
(Diagram: the client asks "Commit?"; the master sends "Flush?" to the replicas, which reply "Ack" right away and flush ("Flush!") afterwards.)

Applications can trade off protecting the integrity of the database against performance.
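A minimal sketch of how the three levels differ on the master's commit path, assuming hypothetical send_to and wait_for_ack helpers: the only thing that changes is what the master waits for before acknowledging the client.

```python
def commit_on_master(replicas, log_records, scheme, send_to, wait_for_ack):
    for r in replicas:
        send_to(r, log_records)              # ship the txn's changes

    if scheme == "synchronous":
        for r in replicas:
            wait_for_ack(r, flushed=True)    # replica logged the changes
    elif scheme == "semi-synchronous":
        for r in replicas:
            wait_for_ack(r, flushed=False)   # replica received, not logged
    # asynchronous: wait for nothing; replicas catch up in the background

    return "ACK"                             # now answer the client
```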
PROPAGATION TIMING

Approach #1: Continuous
→ The DBMS sends log messages immediately as it generates them.
→ It also needs to send a commit/abort message.

Approach #2: On Commit
→ The DBMS only sends the log messages for a txn to the replicas once the txn commits.
→ Does not waste time sending log records for aborted txns.
→ Assumes that a txn's log fits entirely in memory.
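A sketch contrasting the two timings; the ship() callback standing in for the network path to the replicas is an illustrative assumption, not the lecture's design.

```python
class ReplicationStream:
    def __init__(self, ship, timing):
        self.ship = ship              # sends a list of records to replicas
        self.timing = timing          # "continuous" or "on_commit"
        self.buffer = []              # current txn's records (on-commit)

    def log(self, record):
        if self.timing == "continuous":
            self.ship([record])       # stream each record as generated
        else:
            self.buffer.append(record)  # assumes the log fits in memory

    def commit(self):
        if self.timing == "continuous":
            self.ship(["COMMIT"])     # still need the commit marker
        else:
            self.ship(self.buffer + ["COMMIT"])
        self.buffer = []

    def abort(self):
        if self.timing == "continuous":
            self.ship(["ABORT"])      # replicas must undo what they got
        self.buffer = []              # on-commit: aborted txns send nothing
```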
ACTIVE VS. PASSIVE

Approach #1: Active-Active
→ A txn executes at each replica independently.
→ Need to check at the end whether the txn ends up with the same result at each replica.

Approach #2: Active-Passive
→ Each txn executes at a single location and propagates the changes to the replicas.
→ Not the same as master-replica vs. multi-master.
CAP THEOREM

Proposed by Eric Brewer: it is impossible for a distributed system to always be all three of the following. Pick two! (Sort of…) Proved in 2002.
→ Consistent: linearizability.
→ Always Available: all up nodes can satisfy all requests.
→ Network Partition Tolerant: still operate correctly despite message loss.
CAP CONSISTENCY

(Diagram: a master and a replica, both starting with A=1, B=8, connected over the network. An application server issues "Set A=2" at the master, which applies it, propagates A=2 to the replica, and returns an ACK. A second application server then issues "Read A" at the replica and gets back A=2.)

If the master says the txn committed, then it should be immediately visible on replicas.
CAP AVAILABILITY

(Diagram: the master and the replica each hold A=1, B=8, and then one node goes down. Availability requires that the application servers' requests still be answered by the node that is up: "Read B" returns B=8 and "Read A" returns A=1.)
CAP PARTITION TOLERANCE

(Diagram: a network partition splits the master and the replica, and the replica promotes itself, so both halves now act as master. One application server runs "Set A=2" on one side while another runs "Set A=3" on the other; both receive an ACK, leaving the two masters with conflicting values A=2 and A=3.)
CAP FOR OLTP DBMSs

How a DBMS handles failures determines which elements of the CAP theorem it supports.

Traditional/NewSQL DBMSs
→ Stop allowing updates until a majority of nodes are reconnected.

NoSQL DBMSs
→ Provide mechanisms to resolve conflicts after nodes are reconnected.
OBSERVATION

We have assumed that the nodes in our distributed systems are running the same DBMS software. But organizations often run many different DBMSs in their applications. It would be nice if we could have a single interface for all our data.
FEDERATED DATABASES

Distributed architecture that connects together multiple DBMSs into a single logical system. A query can access data at any location.

This is hard, and nobody does it well:
→ Different data models, query languages, limitations.
→ No easy way to optimize queries.
→ Lots of data copying (bad).
FEDERATED DATABASE EXAMPLE

(Diagram: an application server sends query requests to a middleware layer, which uses connectors / foreign data wrappers to talk to multiple back-end DBMSs.)
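To make the data-copying problem concrete, here is a toy middleware join across two back-ends; the connector objects and their scan() method are hypothetical, and real federated systems (e.g., foreign data wrappers) are far more involved.

```python
def federated_join(connectors, left, right, key):
    # Pull both tables out of their back-ends into the middleware.
    left_rows = list(connectors[left].scan(left))            # copy table 1
    right_index = {row[key]: row for row in connectors[right].scan(right)}
    # No cross-engine optimizer exists, so the join runs locally,
    # after copying everything over the network.
    return [(row, right_index[row[key]])
            for row in left_rows if row[key] in right_index]
```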
CONCLUSION

We assumed that the nodes in our distributed DBMS are friendly. Blockchain databases assume that the nodes are adversarial, which means they must use different protocols to commit transactions.
NEXT CLASS

Distributed OLAP Systems