The Backup Methods Available for MongoDB, by Adamo Tonete


  1. The Backup Methods Available for MongoDB Adamo Tonete

  2. Agenda: Backup importance for companies and backup plans. Available methods: Disk Snapshot; mongodump; rsync or copy; Point in time backup from Percona; MongoDB Cloud / Ops Manager backup (on-prem); Hot Backup. Q&A.

  3. Replica-set and Shard Concepts 101

  4. Replica-set and Shard Concepts

  5. Replica-set and Shard Concepts

  6. Why is Backup Important?

  7. Why is Backup Important? Data is usually a company's most valuable asset. A company that suffers severe data loss may never return to business. Could you imagine a bank losing all its data, or an e-commerce site being offline for a week?

  8. Why is Backup Important? Data loss can occur in four main situations: 1) Human error 2) DB failure/corruption 3) System failure/collapse 4) Security breach

  9. Backup Plan

  10. Backup Plan/Disaster Recovery Plan. Choose the best RPO and RTO for your company: Recovery Point Objective (RPO) and Recovery Time Objective (RTO).

  11. Why is Backup Important? ● RTO is how long the company can accept being "offline". ● How long should it take to have my application back online?

  12. Why is Backup Important? ● RPO is the point in time the backups must reach back to when a data loss or incident occurs. ● It is an extremely important metric, because it determines how often a backup needs to be made.
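
The link between RPO and backup frequency can be made concrete with a little arithmetic. The sketch below is only an illustration and rests on a stated assumption that is not in the slides: a backup reflects the data as of its start time and is usable only once it has finished, so the worst-case loss is the interval between backups plus the backup duration. The function name and the 10-minute duration are hypothetical.

```python
def max_backup_interval(rpo_minutes, backup_duration_minutes):
    """Longest gap between backup starts that still meets the RPO.

    Assumption: a backup captures the data as of its start time and can be
    used only after it finishes, so worst-case loss = interval + duration.
    """
    interval = rpo_minutes - backup_duration_minutes
    if interval <= 0:
        raise ValueError("the backup takes longer than the RPO allows")
    return interval


# RPO of 30 minutes, with a backup assumed to take 10 minutes to complete
print(max_backup_interval(30, 10))  # 20
```

Under this model, a backup that takes longer than the RPO can never satisfy it, which is one reason to split the data set into smaller, separately scheduled backups.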

  13. Backup Plan/Disaster Recovery Plan. 1TB replica-set.

  14. Backup Plan/Disaster Recovery Plan. RTO = 20 minutes, RPO = 30 minutes. 1TB replica-set.

  15. Backup Plan/Disaster Recovery Plan. RTO = 20 minutes, RPO = 30 minutes. 1TB replica-set: 95% reads, 5% writes.

  16. Backup Plan/Disaster Recovery Plan. RTO = 20 minutes, RPO = 30 minutes. 1TB replica-set: 2000 inserts/day, 3000 reviews/day.

  17. Backup Plan/Disaster Recovery Plan. We have 1TB of data, of which: 5 GB is for user login; 2 GB/day is new writes; ~900 GB is reviews; and 40 GB is the favorites (90% of the traffic). Favorites are updated asynchronously every 20 minutes.

  18. Backup Plan/Disaster Recovery Plan. 90% traffic, 10% data: Login, Favorites, Comment/upvote. 10% traffic, 90% data: Historical data/non fav.

  19. Backup Plan/Disaster Recovery Plan ● Back up the user database every 30 minutes ● Back up the favorite topics every 20 minutes (right after the sync) ● Back up the new comments incrementally (filtering on created_at > last backup) ● Back up the aged history/non-favorites collection once per day
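
The incremental step in the plan above (created_at > last backup) could be expressed as a mongodump query. This is a minimal sketch, not the presenter's script: the database name `forum`, the collection name `comments`, and the output path are hypothetical, and the date is written in Extended JSON form, which is what recent mongodump versions expect for `--query`. It only builds the argument list; running it is left to the caller.

```python
import json
from datetime import datetime, timezone


def incremental_dump_args(db, collection, last_backup, out_dir):
    """Build a mongodump argument list for documents newer than last_backup."""
    # Extended JSON date understood by mongodump's --query option
    stamp = last_backup.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    query = {"created_at": {"$gt": {"$date": stamp}}}
    return [
        "mongodump",
        f"--db={db}",
        f"--collection={collection}",  # --query requires a single collection
        f"--query={json.dumps(query)}",
        f"--out={out_dir}",
    ]


# Hypothetical names and paths, for illustration only
last = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
args = incremental_dump_args("forum", "comments", last, "/backups/comments")
print(args)
```

The list can be passed to `subprocess.run(args, check=True)`. A filter on `created_at` only picks up new documents; updates and deletes of older documents are not captured this way.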

  20. Backup Plan/Disaster Recovery Plan. 5 GB user: every 30 minutes. Comments: every hour, 500 MB. 40 GB favorites: every 20 minutes. 900 GB: non-favorite data.

  21. Backup Plan/Disaster Recovery Plan. What feature should have priority in a recovery situation?

  22. Backup Plan/Disaster Recovery Plan. Login, Favorites, Comment/upvote: 90% traffic, 10% data.

  23. Replica-sets and Shard concepts ● With 10% of the data restored, the environment is handling 90% of the requests while the old data is slowly recovered. ● Not all companies consider this a full RTO, but others do. It depends on the expectations.
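
A rough calculation shows why restoring the high-traffic data first helps with the 20-minute RTO. The sizes come from the slides (5 GB login plus 40 GB favorites, and about 900 GB of historical data); the 100 MB/s restore throughput is purely an assumption for illustration, and real restores also pay for index builds and other overhead.

```python
def restore_minutes(size_gb, throughput_mb_per_s):
    """Estimated restore time in minutes at a constant throughput."""
    return size_gb * 1024 / throughput_mb_per_s / 60


THROUGHPUT = 100  # MB/s, an assumed figure, not a measurement
RTO_MINUTES = 20

hot = restore_minutes(5 + 40, THROUGHPUT)  # login + favorites
cold = restore_minutes(900, THROUGHPUT)    # historical / non-favorite data

print(f"hot data:  {hot:.1f} min, within RTO: {hot <= RTO_MINUTES}")
print(f"cold data: {cold:.1f} min, within RTO: {cold <= RTO_MINUTES}")
# hot data:  7.7 min, within RTO: True
# cold data: 153.6 min, within RTO: False
```

Under these assumptions the application could serve most requests well inside the RTO, while the bulk of the data takes a couple of hours to come back.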

  24. Disk Snapshot

  25. Disk Snapshot. A disk snapshot is a full copy of the data currently on a disk. The snapshot process may take a while, but the advantage is that when a restore is needed the files are already in the form the database uses. There is no need to create indexes or run a file restore, so recovery is fast.

  26. Disk Snapshot. Advantages: Straightforward approach: take a copy of what is on the disk and that's all.

  27. Disk Snapshot. Disadvantages: May slow down the database while the snapshot is being created. Can take several hours, depending on the disk speed. No "partial" restore: all or nothing.

  28. Disk Snapshot. Backup type: binary copy. Time to backup: high. Complexity: low. Time to recover: low.

  29. Rsync or scp to a different host

  30. Rsync or SCP ● Consists of copying the entire data folder to a different machine/disk while the mongod process is stopped or all writes are stopped. ● It was very common with MMAP and is still possible with WiredTiger.

  31. Rsync or SCP. Advantages: Data is ready to be used in the target folder. Just start the mongod process using the backup folder.

  32. Rsync or SCP. Disadvantages: Needs a secondary to be stopped or writes to be locked. May affect performance. Restore is all or nothing.

  33. Rsync or SCP. Backup type: binary. Time to backup: high. Complexity: medium. Time to recover: low.
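
The "lock writes, copy, unlock" variant of this method might look like the sketch below. It assumes a pymongo-style client (anything exposing `client.admin.command`) and uses the server's `fsync` command with `lock` plus `fsyncUnlock`; the host names and paths in the usage comment are hypothetical. The `finally` block is there so that the node is unlocked even if the copy fails.

```python
from contextlib import contextmanager


@contextmanager
def writes_locked(client):
    """Flush pending writes and block new ones while a file copy runs."""
    client.admin.command("fsync", lock=True)
    try:
        yield
    finally:
        # Always release the lock, even if the copy raised an error
        client.admin.command("fsyncUnlock")


def rsync_args(dbpath, target):
    """Argument list that copies the contents of dbpath to target."""
    # -a keeps permissions and timestamps; the trailing slash copies contents
    return ["rsync", "-a", dbpath.rstrip("/") + "/", target]


# Usage sketch (hypothetical host and paths, requires pymongo and rsync):
#   import subprocess
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://secondary.example:27017")
#   with writes_locked(client):
#       subprocess.run(rsync_args("/var/lib/mongodb", "backup-host:/backups/mongodb"),
#                      check=True)
```

Running this against a secondary rather than the primary keeps the write lock away from application traffic.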

  34. Mongodump

  35. mongodump. mongodump is bundled with MongoDB and is the preferred tool for backing up a MongoDB database. It is important to mention that there are 2 steps in a disaster recovery when using mongodump: 1) create the dump file 2) restore the dump file with mongorestore

  36. mongodump. Use mongodump to create backups per: ● Database ● Collection ● Specific value (query) ● Point in time (when using replica-sets)

  37. mongodump. Although the mongodump tool is very versatile, having a backup file alone doesn't mean you are safe. Dump files need to be processed by mongorestore to rebuild the database. An error in the dump file may break the entire restore process.

  38. mongodump. [Diagram: the dump process writing the backup files]

  39. mongodump. Collection (start time, end time): users (T, T+10); logins (T, T+20); favorites (T+10, T+30); other (T+20, T+40). [Diagram: backup files]

  40. mongorestore. [Diagram: backup files]

  41. Mongodump. [Diagram: the dump process writing the backup files together with the oplog]

  42. Mongodump. Collection (start time, end time, oplog): users (T, T+10, T+50); logins (T, T+20, T+40); favorites (T+10, T+30, T+20); messages (T+20, T+40, T+0). [Diagram: backup files and oplog]

  43. Mongodump. It is easy to achieve a point-in-time backup of a replica-set with mongodump. However, the same is not true for sharding: how can we guarantee that all the backups end at the same time? https://github.com/Percona-Lab/mongodb_consistent_backup
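
For the replica-set case, the point-in-time behaviour comes from mongodump's `--oplog` option, which records the operations that happen during the dump, and mongorestore's `--oplogReplay` option, which applies them again on restore. The sketch below only builds the two argument lists; the connection string and directory are hypothetical. Note that `--oplog` works on a dump of the whole instance, not on a single database or collection.

```python
def oplog_dump_args(uri, out_dir):
    """mongodump arguments for a full dump that also captures the oplog."""
    return ["mongodump", f"--uri={uri}", "--oplog", f"--out={out_dir}"]


def oplog_restore_args(uri, dump_dir):
    """mongorestore arguments that replay the captured oplog after the data."""
    return ["mongorestore", f"--uri={uri}", "--oplogReplay", dump_dir]


# Hypothetical replica-set URI and backup directory, for illustration only
uri = "mongodb://db1.example:27017/?replicaSet=rs0"
print(oplog_dump_args(uri, "/backups/full"))
print(oplog_restore_args(uri, "/backups/full"))
```

Either list can be handed to `subprocess.run(..., check=True)`. On a sharded cluster each shard would still finish at a different moment, which is the gap the Percona script linked above tries to close.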

  44. Mongodump + Percona Scripts. Percona point-in-time backup is a beta tool from Percona that backs up an entire cluster at a single point in time. It relies on mongodump and ensures that all the dumps end at the same time, generating a point-in-time backup of the cluster. Full backup only, not partial.

  45. Mongodump + Percona Scripts

  46. Mongodump + Percona Scripts. Advantages: Highly flexible tool to generate backups. Default logical backup method offered by MongoDB.

  47. Mongodump + Percona Scripts. Disadvantages: Default behavior is not point in time. Restore can take longer, as indexes need to be rebuilt. Backup files need to be tested.

  48. Mongodump + Percona Scripts. Backup type: logical. Time to backup: depends. Complexity: low to high. Time to recover: depends, usually high.

  49. MongoDB Atlas

  50. MongoDB Atlas. Fully managed backup service offered by MongoDB. It is possible to back up using cloud provider snapshots or continuous backup. Only an agent needs to be installed and it is all done. The configuration is done through a website. No technical skills needed.

  51. MongoDB Atlas. Backup type: logical/snapshots. Time to backup: low. Complexity: (unknown). Time to recover: (unknown); I would say fast, as the data is in the same DC.

  52. Percona Hot Backup

  53. Percona Hot Backup. A lightweight binary backup method that copies the database to a different folder/disk without affecting the instance's performance. Available in WiredTiger only. Acts very much like a disk snapshot, but at the database level. Generates a point-in-time copy of the database.

  54. Percona Hot Backup. Backup type: binary. Time to backup: medium. Complexity: low. Time to recover: low.
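
In Percona Server for MongoDB a hot backup is started with the `createBackup` admin command, which takes a `backupDir` on the server's own filesystem. A minimal sketch follows, assuming a pymongo-style client; the function name, host, and directory are hypothetical, and the directory has to be one the mongod process can write to.

```python
def hot_backup(client, backup_dir):
    """Ask a Percona Server for MongoDB node to copy its data files to backup_dir."""
    # Sends {createBackup: 1, backupDir: <backup_dir>} to the admin database
    return client.admin.command("createBackup", backupDir=backup_dir)


# Usage sketch (hypothetical host and path, requires pymongo):
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://psmdb.example:27017")
#   hot_backup(client, "/backups/hot/2024-01-01")
```

Because the result is a copy of the data files, a restore would amount to pointing a stopped mongod at the copied directory rather than running mongorestore.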

  55. Questions

  56. Rate My Session

  57. Thank You Sponsors!!
