continuous mysql restores
play

Continuous MySQL Restores Divij Rajkumar (divij@fb.com) Production - PowerPoint PPT Presentation

Continuous MySQL Restores Divij Rajkumar (divij@fb.com) Production Engineer, MySQL Infrastructure, Facebook Continuous Restores Why? Verify backup integrity Havent tested your backups? You dont have them Understand resource


  1. Continuous MySQL Restores Divij Rajkumar (divij@fb.com) Production Engineer, MySQL Infrastructure, Facebook

  2. Continuous Restores – Why? •Verify backup integrity • Haven’t tested your backups? You don’t have them •Understand resource requirements for restores • Time, server capacity, network capacity •Test restore orchestration • Don’t panic during a disaster event

  3. Backups

  4. Backups •Everything, every day

  5. server Shard mysq mysq 86 User 1 l l User 2 shar shar . d d . User shar shar 1000 d d shar shar d d shar shar d d shar shar d d shar shar d d

  6. Backups •Everything, every day •Tens of thousands of servers, many, many shards •Stored in HDFS •3 types: •Full dumps •Differential backups •Binary Logs

  7. Backups – Full dumps •mysqldump •--single-transaction •--set-gtid-purged=COMMENTED, stored in metadata •github.com/facebook/mysql-5.6 •Per shard, not per instance •Shards aren’t pinned to an instance •Easier to locate

  8. Backups – Full dumps •Logical, not physical •Easy single-table restore •No indexes backed up, reduced size •Easy to debug •Better compression •Every 5 days

  9. Backups – Differential backups •Diff between 2 full dumps •2 files generated - Rows deleted, Rows inserted •Based off of existing dump in HDFS •Take full dump -> compute diff -> upload diff

  10. Backups – Differential backups

  11. Backups – Differential backups •Diff between 2 full dumps •2 files generated - Rows deleted, Rows inserted •Based off of existing dump in HDFS •Take full dump -> compute diff -> upload diff •Possible because of logical full dumps

  12. Backups – Binary Logs •Raw binlogs from mysql •Record Previous-GTIDs in metadata •Using mysqlbinlog: • --start-position=4 • --stop-position=121 •Continuously uploaded •Binlog Server to interface

  13. Backups •Everything, every day •Tens of thousands of servers, many, many shards •Stored in HDFS •3 types: •Full dumps •Differential backups •Binary Logs

  14. Restore Infrastructure

  15. Restore Infrastructure OR C Reque st Warchief CRT Schedul e ORC DB Syn c Peon Peon Peon MySQ MySQ MySQ L L L

  16. ORC - Peons

  17. Restore Infrastructure - Peons •Python process server peo peo •Exposes Thrift n n interface mysq mysq •Manages MySQL l l restored restored db db Instance restored restored db db •Multiple peons per host restored restored db db •Work on restore jobs restored db •1 database == 1 job

  18. Restore Jobs – State Machine 1. SELECT – Select backup to restore 2. DOWNLOAD – Download backup to disk 3. LOAD – Load tables in parallel 4. VERIFY – Verify loaded full/diff dump 5. REPLAY – Replay binlogs + FAILED states

  19. Restore Jobs – mysqldump index •Custom index added to mysqldump output •Locate tables using per-table byte offsets •Preserved across diff backups

  20. Restore Jobs – SELECT •Pre-defined search range per job •1 day == restore most recent backup only •Use metadata to ignore “bad” backups •Only full/diff backups selected here

  21. Restore Jobs – DOWNLOAD •For full dumps: •Download mysqldump from HDFS to disk •For diff backups: •Stream diffs + full dump to perform 3-way merge •Store recreated mysqldump to disk •Single mysqldump output at the end

  22. Restore Jobs – LOAD •Split backup into per-table streams using index •Load tables in parallel

  23. Restore Jobs – VERIFY •Perform sanity checks on loaded data •Possible verification methods: •Compare checksums •Compare number of tables loaded •Best verification: replay binlogs

  24. Restore Jobs – REPLAY •Replay X seconds/minutes/hours worth of binlogs •Enables point in time restores •Verify loaded data can be written to

  25. Restore Jobs – State Machine LOAD TABLE LOAD TABLE SELECT DOWNLOAD LOAD TABLE VERIFY REPLAY LOAD TABLE LOAD TABLE

  26. Restore Jobs – Pipeline •Allow processing many jobs concurrently •Peon has 1 handler per job state •Limit per-state concurrency

  27. Restore Jobs – Pipeline DOWNLOA LOAD VERIF REPLA SELEC LOA D TABLE Y Y T D SLOT 1 SLOT 2 SLOT 3 SLOT 4 SLOT 5

  28. Restore Jobs – Pipeline DOWNLOA LOAD VERIF REPLA SELEC LOA D TABLE Y Y T D SLOT db1963 db142 db165 db982 db7 – t1 db2362 1 SLOT db1000 db3794 db1340 db7 db7 – t2 db651 2 SLOT db982 – 3 t3 SLOT db982 – 4 t9 SLOT db7 – t5 5

  29. Restore Jobs – Pipeline •Allow processing many jobs concurrently •Peon has 1 handler per job state •LOAD -> LOAD + LOAD_TABLE •Limit per-state concurrency •Slot capacity can be tuned

  30. Restore Jobs – Selecting Binlogs •Full/diff backups can be taken from slaves •Binlogs always taken from master •Need to uniquely identify transactions… •GTIDs!

  31. Restore Jobs – Selecting Binlogs •Use GTIDs purged from mysqldump: • --set-gtid-purged=COMMENTED •Use GTIDs purged from mysqlbinlog: • --start-position=4 • --stop-position=121 •These are GTIDs purged , not GTIDs contained

  32. Restore Jobs – Selecting Binlogs •GTIDs purged from LOAD state ( dump_gtids ): •GTIDs purged in binlogs ( binlog_gtids ): Contains 451-529 Contains 530-774 Contains 775-? First binlog to First replay superset

  33. Restore Jobs – Replaying Binlogs •Filter binlog events with mysqlbinlog: • --database • --skip-gtids • --skip-empty-trans • github.com/facebook/mysql-5.6 •Last transaction to replay: • --stop-datetime

  34. Restore Job – Lifecycle DOWNLOA LOAD VERIF REPLA SELEC LOA D TABLE Y Y T D SLOT db100 – db100 – db100 db100 1 t1 t2 –t3 SLOT 2 SLOT 3 SLOT 4 SLOT 5

  35. CRT

  36. Restore Infrastructure OR C Reque st Warchief CRT Schedul e ORC DB Syn c Peon Peon Peon MySQ MySQ MySQ L L L

  37. CRT – Continuous Restore Tier •Create restore jobs for shards •Periodically poll backup metadata to find backups •Thrift call to Warchief to issue restore job •Monitor restore progress •Peons expose job and health stats via Thrift •Detailed stats written to Scuba

  38. CRT – Continuous Restore Tier •Failure handling •Retry on transient errors •Mark “unstable” •Notify oncall •Orchestration component •Resolve dependency graph •Request restores in order of dependency

  39. On-demand Restores

  40. “oops, I accidentally ran UPDATE without WHERE”

  41. “oops, I thought I was on my dev database”

  42. “oops”

  43. Restore Infrastructure OR C Reque st Warchief CRT Reque Schedul st e UI/CLI ORC DB Syn c Peon Peon Peon MySQ MySQ MySQ L L L

  44. Discover Edge Cases

  45. War Stories

  46. War Stories – Collation Change • COLLATE=latin1_bin •Required for MyRocks •Schema change deployed •Most diff backup restores fail overnight •Thanks CRT Monitoring!

  47. 200x increase in DOWNLOAD_FAILED !

  48. War Stories – Collation Change •DifferentialBackup used default collation •Sorting order broken in diff files •Consistent sorting is required for 3-way merge •3-way merges failed for affected backups •Wouldn’t have been caught without continuous restores •DifferentialBackup fixed to understand table collation

  49. War Stories – RBR Binlogs •Row-based replication deployed •Lots of REPLAY failures overnight “ The database used for the current transaction has changed since BEGIN. This is not supported! ” •2 common patterns: •Shards belonged to same database tier •Failure only happened after master promotion

  50. War Stories – RBR Binlogs •SBR binlogs: •Each BEGIN contains session database •RBR binlogs: •On master, each BEGIN contains session database •On slave, BEGIN has no session database

  51. War Stories – RBR Binlogs •With mysqlbinlog --skip-gtids --skip-empty-trans •Expected: session databases don’t change across events •Binlogs not rotated after master promotion •For some part of binlog, instance is a slave •For some part of binlog, session database is empty •For some part of binlog, session databases don’t match “The database used for the current transaction has changed since BEGIN. This is not supported!”

  52. War Stories – RBR Binlogs •Possible solutions: •Rotate binlogs during promotion •Fix MySQL server behaviour •Patch mysqlbinlog •???

  53. Thank s! Divij Rajkumar (divij@fb.com)

Recommend


More recommend