Faster MySQL replication using dependencies Abhinav Sharma - PowerPoint PPT Presentation

Faster MySQL replication using dependencies Abhinav Sharma Facebook Inc.

Motivation Replication lag is bad news • Stale reads on slaves • Potential data-loss 2

Motivation Can we eliminate replication lag? No! • There will always be a finite replication lag • Network delay (bounded by speed of light) • Transaction processing on the slave • Minimum lag = Network lag + Transaction time 3

Motivation Can we have bounds on replication lag? Yes! • Increasing lag is a function of difference in throughput between master and slave • Given master and the slave run on similar hardware and the slave is able to utilize it at least as much as the master we should never see unbounded lag 4

Problem Log shipping • Parallel transactions on the master are serialized in a log • Slave sees a linear stream of transactions that it can apply one-by- one to obtain a serializable schedule • Master is multithreaded, slave is single threaded • Throughput differential causes unbounded lag 5

Problem Log shipping Master Slave Tx3 Tx1 Tx2 Tx3 Tx2 Tx1 Tx1 Tx2 Tx3 6

Solution Log shipping with batching information • Record information about batching at the master in the log • Trivially, all transactions that commit together at the master can be executed in parallel on the slave (group commit) 7

Solution Log shipping with batching information Master Slave Tx1 Tx5 Tx3 Tx1 Tx2 Tx4 Tx2 Tx3 Tx1 Tx2 Tx3 8

Problem Log shipping with batching information • This still misses some independent transactions because we’re checking transactions that “committed” together not “executing” together • Batch size gets smaller with each hop in chain replication 9

Solution Tracking dependencies between transactions • By definition we just need to satisfy RW and WW conflicts • Logs are awesome for shipping but they’re one dimensional • Row based format of MySQL already has all the data required to find RW and WW conflicts! • Convert the log into a DAG of conflicts 10

Solution Tracking dependencies between transactions Master Slave Tx1 Tx2 Tx1 Tx5 Tx2 Tx3 Tx4 Tx3 Tx1 Tx3 Tx4 Tx5 Tx2 Tx4 Tx5 11

Problem Tracking dependencies between transactions • Are we done? Not yet… • We’re serializing transactions even if a single row is common between them • On the master two conflicting transactions can run partially in parallel, on the slave we’re serializing them completely • Extreme example: A workload where all transactions conflict will be serialized on a single thread on the slave 12

Solution Tracking fine grained dependencies • Track conflicts between individual row events • Dispatch events to workers immediately and pause execution on the workers for conflicting events • It can be proved that this is a necessary and sufficient condition to achieve bounded lag 13

Solution Tracking fine grained dependencies Transaction granularity Row granularity Tx1 Tx2 Tx1 vs. Tx2 14

Solution Tracking fine grained dependencies Master Slave Tx1 Tx3 Tx5 Tx1 Tx3 Tx5 Tx2 Tx4 Tx2 Tx4 Tx3 Tx1 Tx5 Tx4 Tx2 15

Solution Tracking fine grained dependencies • What about ordering of commits? • Commit ordering is sometimes important for reads on the slave • Two ways to solve this: • Never commit a transaction before it’s preceding transaction is committed. Trivial, easy to implement but can increase commit latency • Commit out of order and take periodic checkpoints at a prefix of the log. All clients read from the checkpoint. Caveat: This will still produce slave logs that are out of order • If we treat commit order as a kind of conflict, the 1 st solution is analogous to conflict serializable and the 2nd to view serializable 16

Solution Tracking fine grained dependencies: Commit ordering Checkpointing Force Order Next Checkpoint Prev Checkpoint Tx1 Tx2 Execution done Wait for commit Commit done Commit out of order Execute out of order but commit after checkpoint RO transactions always see prefix of the log 17

Solution sysbench write-only sysbench write-only 30000 25000 ~1.3x 20000 TPS 15000 Master Slave Old Slave ~4-5x 10000 5000 0 0 50 100 150 200 250 300 Threads 18

Solution Fully conflicting workload Fully conflicting workload 10000 9000 8000 7000 6000 TPS 5000 Master Slave 4000 Old Slave 3000 2000 MySQL 8.x WRITE_SET 1000 will also behave like this 0 0 50 100 150 200 250 300 Threads 19

Solution Tracking fine grained dependencies Can we do any better? Yes! • Eliminate 2PL • If replication workers are the only ones writing we can disable row locking altogether because we take care of conflicts explicitly • Replication threads will need to run in READ UNCOMMITTED isolation level • Rely on MVCC for consistent client reads • Transaction rewriting • Tx1: update tb1 set a = 2 where a = 1 • Tx2: update tb1 set a = 3 where a = 2 • Rewrite as Tx1 + Tx2: update tb1 set a = 3 where a = 1 20

Problems Tracking fine grained dependencies Some implementation problems • Coordinator/scheduler is the bottleneck. Anything more we do to make transaction processing faster is useless right now • Unpacking row events to find keys is expensive • Interaction between IO and SQL thread make things worse 21

MySQL 8.x WRITE_SET • MySQL 8.x supports WRITE_SET replication which uses transaction level dependencies extracted at the master • Workers are simple in WRITE_SET, both conflict detection and resolution happens in the coordinator • We use finer granularity than WRITE_SET and don’t sacrifice anything on the master • We also separate conflict detection (in the coordinator) and conflict resolution (in the workers) which allows better parallelization 22

Future Plans • Getting the best of both worlds • Use WRITE_SET to give hints about dependencies • All independent transactions go straight to the workers • Parse out finer row dependencies only for conflicting transactions 23

Resources Code: https://github.com/facebook/mysql-5.6 Relevant variables: • mts_dependency_replication = {NONE|TBL|STMT} • mts_dependency_order_commits = {0|1} • slave_tx_isolation = {keep it at READ-COMMITTED or lower} 24

Rate My Session 25

Faster MySQL replication using dependencies Abhinav Sharma - PowerPoint PPT Presentation

Faster MySQL replication using dependencies Abhinav Sharma Facebook Inc. Motivation Replication lag is bad news Stale reads on slaves Potential data-loss 2 Motivation Can we eliminate replication lag? No! There will always

MySQL Group Replication & MySQL InnoDB Cluster Production Ready? Kenny Gryp MySQL Practice

MySQL Replication Update MySQL 5.5 (GA) & MySQL 5.6.2 (Dev. Milestone) Lars Thalmann

Performance Guide for MySQL Cluster Mikael Ronstrm, Ph.D Senior MySQL Architect Sun

MySQL Proxy meets: binlogs Jan Kneschke MySQL Enterprise Tools mailto: jan@mysql.com What is

MySQL Proxy Making MySQL more flexible Jan Kneschke jan@mysql.com MySQL Proxy proxy-servers

MySQL Replication and HA at Facebook Part-II Jeff Jiang Production Engineer Facebook, Inc

New features in MySQL Replication Lars Thalmann, Development Manager, Replication & Backup

MySQL Replication Tutorial Mats Kindahl Senior Software Engineer Replication Technology Lars

MySQL Cluster und MySQL Proxy Alles Online Diese Slides gibt es auch unter:

Not Your Grandpas Replication The New Wave of MySQL Replication and How It Helps Your

Asynchronous Replication and Bayou Asynchronous Replication and Bayou Asynchronous Replication

More on gdb for MySQL DBAs or Using gdb to study MySQL internals and as a last resort Valerii

Reducing Risk When Upgrading Your MySQL Environment Kenny Gryp MySQL Practice Manager My

PHP and MySQL Dr. E. Benoist Winter Term 2006-2007 PHP and MySQL 1 PHP and MySQL Introduction

An API for Reading the MySQL Binary Log Lars Thalmann Mats Kindahl Development Director, MySQL

Why MySQL Replication Fails, and How to Get it Back September, 26, 2017 Sveta Smirnova Sveta

CS 6320 - Advanced Database Systems Immanuel Trummer Course Organization Lectures

A Graphical representation for identifier structure in application logs Ari Rabkin, Wei Xu, Avani

INTRODUCTION TO TRANSACTION PROCESSING CHAPTER 21 (6/E) CHAPTER 17 (5/E) CHAPTER 21 OUTLINE

Introduction to Transaction Processing (1) Dr Janusz R. Getta School of Computing and

Outline Temporal and Real-Time Temporal database Databases: A survey Real-time database

CS 147: Computer Systems Performance Analysis Workload Selection 1 / 39 Overview CS147

CMPSCI 645 Database Design & Implementation Instructor: Gerome Miklau Overview of Databases

NetStore Leveraging Network Optimizations to Improve Distributed Transaction Processing

Sambuz

Useful Links

Newsletter

Mail Us

Faster MySQL replication using dependencies Abhinav Sharma - PowerPoint PPT Presentation

Faster MySQL replication using dependencies Abhinav Sharma Facebook Inc. Motivation Replication lag is bad news Stale reads on slaves Potential data-loss 2 Motivation Can we eliminate replication lag? No! There will always

MySQL Group Replication &amp; MySQL InnoDB Cluster Production Ready? Kenny Gryp MySQL Practice

MySQL Replication Update MySQL 5.5 (GA) &amp; MySQL 5.6.2 (Dev. Milestone) Lars Thalmann

Performance Guide for MySQL Cluster Mikael Ronstrm, Ph.D Senior MySQL Architect Sun

MySQL Proxy meets: binlogs Jan Kneschke MySQL Enterprise Tools mailto: jan@mysql.com What is

MySQL Proxy Making MySQL more flexible Jan Kneschke jan@mysql.com MySQL Proxy proxy-servers

MySQL Replication and HA at Facebook Part-II Jeff Jiang Production Engineer Facebook, Inc

New features in MySQL Replication Lars Thalmann, Development Manager, Replication &amp; Backup

MySQL Replication Tutorial Mats Kindahl Senior Software Engineer Replication Technology Lars

MySQL Cluster und MySQL Proxy Alles Online Diese Slides gibt es auch unter:

Not Your Grandpas Replication The New Wave of MySQL Replication and How It Helps Your

Asynchronous Replication and Bayou Asynchronous Replication and Bayou Asynchronous Replication

More on gdb for MySQL DBAs or Using gdb to study MySQL internals and as a last resort Valerii

Reducing Risk When Upgrading Your MySQL Environment Kenny Gryp MySQL Practice Manager My

PHP and MySQL Dr. E. Benoist Winter Term 2006-2007 PHP and MySQL 1 PHP and MySQL Introduction

An API for Reading the MySQL Binary Log Lars Thalmann Mats Kindahl Development Director, MySQL

Why MySQL Replication Fails, and How to Get it Back September, 26, 2017 Sveta Smirnova Sveta

CS 6320 - Advanced Database Systems Immanuel Trummer Course Organization Lectures

A Graphical representation for identifier structure in application logs Ari Rabkin, Wei Xu, Avani

INTRODUCTION TO TRANSACTION PROCESSING CHAPTER 21 (6/E) CHAPTER 17 (5/E) CHAPTER 21 OUTLINE

Introduction to Transaction Processing (1) Dr Janusz R. Getta School of Computing and

Outline Temporal and Real-Time Temporal database Databases: A survey Real-time database

CS 147: Computer Systems Performance Analysis Workload Selection 1 / 39 Overview CS147

CMPSCI 645 Database Design &amp; Implementation Instructor: Gerome Miklau Overview of Databases

NetStore Leveraging Network Optimizations to Improve Distributed Transaction Processing

Sambuz

Useful Links

Newsletter

Mail Us

MySQL Group Replication & MySQL InnoDB Cluster Production Ready? Kenny Gryp MySQL Practice

MySQL Replication Update MySQL 5.5 (GA) & MySQL 5.6.2 (Dev. Milestone) Lars Thalmann

New features in MySQL Replication Lars Thalmann, Development Manager, Replication & Backup

CMPSCI 645 Database Design & Implementation Instructor: Gerome Miklau Overview of Databases