MySQL Backup and Restore at Facebook Scale
…and how it’s not rocket science
Ola Berjak
Production Engineer at MySQL Infrastructure, Facebook London
When do we need backups?
How do we perform backups?
How do we restore backups?
When do we need backups?
MALICIOUS ATTACKER, HARDWARE FAILURE, HUMAN ERROR
FULL DUMPS, DIFFS
How do we perform backups?
Every database, every day
                              LOGICAL BACKUPS   PHYSICAL BACKUPS
CUSTOMER LOGIC DEBUGGING      Easy              Complex
SINGLE TABLE RESTORE          Easy              Complex
PORTABILITY                   Easy              Complex
                              Consistent        Inconsistent
BACKUP AND RESTORE DURATION   Long              Short
Technical setup: FULL DUMPS
• mysqldump
• number of rows for each table
• zstd compression
• trailing index
mysqldump --single-transaction --skip-lock-tables (...) | compress and add index | upload
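The dump, compress, upload chain above can be driven from a script as well as from a shell. The following is a minimal sketch, assuming the stages are separate commands connected by pipes; `dump_command` and `run_pipeline` are hypothetical helper names, the database argument is an assumption, and the flags elided as `(...)` on the slide are left out rather than guessed.

```python
import subprocess


def dump_command(database):
    """Build the mysqldump invocation from the slide for one database.

    The database argument is an assumption; the slide elides further flags.
    """
    return ["mysqldump", "--single-transaction", "--skip-lock-tables", database]


def run_pipeline(commands):
    """Chain commands like a shell pipeline and return the last stdout as bytes."""
    procs = []
    upstream = None
    for cmd in commands:
        proc = subprocess.Popen(cmd, stdin=upstream, stdout=subprocess.PIPE)
        if upstream is not None:
            # Close our copy so the upstream process sees a broken pipe
            # if the downstream process exits early.
            upstream.close()
        upstream = proc.stdout
        procs.append(proc)
    output, _ = procs[-1].communicate()
    for cmd, proc in zip(commands, procs):
        if proc.wait() != 0:
            raise subprocess.CalledProcessError(proc.returncode, cmd)
    return output
```

A call might look like `run_pipeline([dump_command("mydb"), ["zstd", "-c"]])`, with the upload step appended as whatever command the storage backend provides. Plain `zstd -c` only compresses; the "add index" part of the slide needs extra tooling.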
Trailing index
{ "size": 7331, "offset": 1337, "table_name": "foo" },
{ "size": 223, "offset": 8668, "table_name": "bar" }
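In the index entries on the slide, the offset of "bar" (8668) equals the offset of "foo" plus its size (1337 + 7331), so each entry points at one table's byte range inside the backup file. A minimal sketch of how such an index enables single-table extraction follows. The file layout (per-table chunks, then a JSON index, then an 8-byte index length) is an assumption, the helper names are hypothetical, and `zlib` stands in for zstd so the example runs on the standard library alone.

```python
import json
import struct
import zlib


def write_archive(tables):
    """Pack {table_name: dump_bytes} into one blob with a trailing index."""
    body = bytearray()
    index = []
    for name, dump in tables.items():
        chunk = zlib.compress(dump)  # stand-in for zstd compression
        index.append({"size": len(chunk), "offset": len(body), "table_name": name})
        body += chunk
    index_bytes = json.dumps(index).encode("utf-8")
    # Assumed footer: 8-byte big-endian length of the index that precedes it.
    return bytes(body) + index_bytes + struct.pack(">Q", len(index_bytes))


def read_index(archive):
    """Locate and parse the trailing index without touching the table data."""
    (index_len,) = struct.unpack(">Q", archive[-8:])
    start = len(archive) - 8 - index_len
    return json.loads(archive[start:len(archive) - 8])


def extract_table(archive, table_name):
    """Decompress only the byte range that belongs to one table."""
    for entry in read_index(archive):
        if entry["table_name"] == table_name:
            begin = entry["offset"]
            return zlib.decompress(archive[begin:begin + entry["size"]])
    raise KeyError(table_name)
```

Because the index sits at the end, the dump can be written as a stream and the index appended once every table's size is known.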
Open-source
• mysqldump: https://github.com/facebook/mysql-5.6
• zstd: https://github.com/facebook/zstd
Open source tooling will get the job done
• automysqlbackup: scheduling, email notifications, custom backup rotation
• monitoring tools: alerting
Technical setup: DIFFS
• full dump format
• 2 files for a single diff backup: rows removed; rows inserted and updated
[Diagram] DiffDatabase: most recent dump backup (“base dump”) + new dump → diff1, diff2. The two files shown are in full dump format (CREATE TABLE foo; INSERT INTO foo; -- rows for foo: N; CREATE TABLE bar; INSERT INTO bar), with row counts for foo of 1337 and 7331.
[Diagram] MergeDatabase: base dump + diff1 + diff2 → new dump. The two files shown are in full dump format (CREATE TABLE foo; INSERT INTO foo; -- rows for foo: 1337; CREATE TABLE bar; INSERT INTO bar).
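The DiffDatabase and MergeDatabase steps can be illustrated on a single table. This is a minimal sketch, assuming each dump is modeled as a mapping from primary key to row; `diff_table` and `merge_table` are hypothetical names, and the real tools work on dump files rather than in-memory dictionaries.

```python
def diff_table(base, new):
    """Return the two parts of a diff backup: rows removed, rows inserted or updated."""
    removed = {pk: row for pk, row in base.items() if pk not in new}
    upserted = {
        pk: row for pk, row in new.items()
        if pk not in base or base[pk] != row
    }
    return removed, upserted


def merge_table(base, removed, upserted):
    """Rebuild the new dump from the base dump plus the two diff parts."""
    merged = {pk: row for pk, row in base.items() if pk not in removed}
    merged.update(upserted)
    return merged
```

The property that matters for restores is the round trip: merging the base dump with the diff derived from it reproduces the new dump exactly.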
F D D D D F D D D D F
(F = full dump, D = diff)
2-3x+ less space used
Explore the open source tooling
• automysqlbackup: scheduling, email notifications, custom backup rotation, differential backups
Due diligence checklist
• verify the size
• set up alerting
• store checksums and metadata
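Two of the checklist items, verifying the size and storing checksums and metadata, can be sketched in a few lines. This is an illustration under assumptions: the metadata fields, the SHA-256 choice, and the 50% size tolerance are not from the talk, and the function names are hypothetical.

```python
import hashlib


def backup_metadata(name, payload):
    """Record the size and checksum of a finished backup."""
    return {
        "name": name,
        "size": len(payload),
        "sha256": hashlib.sha256(payload).hexdigest(),
    }


def verify_backup(payload, metadata):
    """Check a downloaded backup against its stored metadata."""
    return (
        len(payload) == metadata["size"]
        and hashlib.sha256(payload).hexdigest() == metadata["sha256"]
    )


def size_looks_sane(size, previous_size, tolerance=0.5):
    """Flag backups whose size moved more than `tolerance` relative to the last one."""
    if previous_size == 0:
        return size == 0
    return abs(size - previous_size) <= tolerance * previous_size
```

A failed `size_looks_sane` check is a natural input for the alerting item on the same checklist, since an unexpectedly small dump often means a truncated or empty backup.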
MALICIOUS ATTACKER, HARDWARE FAILURE, HUMAN ERROR
Technical setup: BINARY LOGS
• all transactions for all databases from master
• compressed using zstd
• metadata stored
Due diligence checklist
• verify the size
• set up alerting
• store checksums and metadata
• detect gaps in transactions backed up
• monitor the “backup lag”
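The last two checklist items lend themselves to a small worked example. The sketch below assumes each binary log backup records an inclusive range of transaction sequence numbers (for example, taken from the stored metadata) and that "backup lag" means the age of the newest backed-up transaction; both are assumptions for illustration, and the function names are hypothetical.

```python
def find_gaps(ranges):
    """Return inclusive ranges of sequence numbers missing between backed-up ranges."""
    gaps = []
    expected = None  # next sequence number that should be covered
    for first, last in sorted(ranges):
        if expected is not None and first > expected:
            gaps.append((expected, first - 1))
        expected = last + 1 if expected is None else max(expected, last + 1)
    return gaps


def backup_lag_seconds(newest_backed_up_ts, now_ts):
    """Seconds between now and the newest transaction known to be backed up."""
    return max(0.0, now_ts - newest_backed_up_ts)
```

For instance, `find_gaps([(1, 100), (101, 250), (300, 400)])` reports `(251, 299)` as missing, and overlapping ranges are tolerated without producing false gaps.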
How do we restore backups?
Continuous restore pipeline
• Warchief (Loadbalancer, Scheduler)
• DB
• Workers (Worker, Worker, Worker, Worker), each with MySQL
SELECT → DOWNLOAD → LOAD → CHECKSUM → VERIFY → REPLAY
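A restore job that walks through the stages named on these slides can be modeled as a simple sequential state machine. This is a minimal sketch, assuming the stages run strictly in the order listed and that a job stops at the first failing stage; the handler functions and the result dictionary are hypothetical, not the pipeline's actual interface.

```python
STAGES = ["SELECT", "DOWNLOAD", "LOAD", "CHECKSUM", "VERIFY", "REPLAY"]


def run_restore(handlers, job):
    """Run one handler per stage in order and report where the job stopped."""
    completed = []
    for stage in STAGES:
        try:
            handlers[stage](job)
        except Exception as exc:
            return {
                "status": "FAILED",
                "failed_stage": stage,
                "completed": completed,
                "error": str(exc),
            }
        completed.append(stage)
    return {
        "status": "SUCCEEDED",
        "failed_stage": None,
        "completed": completed,
        "error": None,
    }
```

Recording the failed stage alongside the error makes it possible to alert differently on, say, a corrupted download versus a dump that fails to load.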
Start small and build up
DEVELOPMENT TIME, DATA RESILIENCE, BUSINESS PRIORITIES, BUSINESS CONTINUITY
Today’s agenda
• Why do we need backups?
• Backups and restores made easy
• How to make sure our backups don’t go to “/dev/null”?
“The best outages are the ones that don’t happen.”
PRETTY MUCH EVERY PRODUCTION ENGINEER I KNOW
Thank you
Ola Berjak
aberjak@fb.com
@Lexxzor