mysql at scale at square
play

MySQL at Scale at Square Daniel Nichter Percona Live 2018 1 April - PowerPoint PPT Presentation

MySQL at Scale at Square Daniel Nichter Percona Live 2018 1 April 23, 2018 An honest financial network for everyone Square Global: USA, Canada, UK, Japan, Australia Payment transaction data stored in MySQL We are hiring


  1. MySQL at Scale at Square Daniel Nichter Percona Live 2018 1 April 23, 2018

  2. ● An honest financial network for everyone Square ● Global: USA, Canada, UK, Japan, Australia ● Payment transaction data stored in MySQL ● We are hiring (square.com/careers) 2 About Square

  3. The Numbers ● > 4,000 unique MySQL instances ○ Percona Server 5.6, RBR, GTID ● 99.95% SLA ○ 43s day, 5m week, 21m month, 4h 23m year ● ~ 1 PB of data ● ~ 800 physical servers ● ~ 300 microservices (apps) 3 MySQL at Square by the Numbers

  4. Theme Pebbles, Not Boulders 4 Theme: Pebbles, Not Boulders

  5. Theme Pebbles, Not Boulders 5 Theme: Pebbles, Not Boulders

  6. Topics 1. Setup and Replication 2. High Availability 3. Backups 4. Access and Security 5. Monitoring and Alerting 6. Advice 6 Topics of Discussion

  7. 1. Setup and Replication 2. High Availability 3. Backups 4. Access and Security 5. Monitoring and Alerting 6. Advice

  8. A-side B-side 01 02 US West US East 03 04 8 Production Database Cluster

  9. A-side B-side ● It works ● Simple, fewest moving parts ● Balance cost, redundancy, complexity 01 02 ● Can add more read replicas US West ● Inactive side for: ○ Backups ○ Maintenance ○ Upgrade/patch US East 03 04 ○ Ad hoc queries by humans ○ No SLA ● Shard and scale out (app) 9 Benefits of this Topology

  10. Theme Pebbles, Not Boulders A-side B-side 01 02 US West 03 04 US East 10 Theme: Pebbles, Not Boulders

  11. 1. Setup and Replication 2. High Availability 3. Backups 4. Access and Security 5. Monitoring and Alerting 6. Advice

  12. A-side B-side RW CNAME RW SIP 01 02 03 04 12 RW CNAME and SIP

  13. A-side B-side RW SIP RW CNAME 01 02 03 04 13 RW CNAME and SIP flipped

  14. A-side B-side RW CNAME RW SIP 01 02 Clone from 03 03 04 Clone from 02 14 Node 02 Failure

  15. A-side B-side RW CNAME RW SIP 01 02 03 04 15 B-side Rebuilt

  16. A-side B-side 01 02 Clone from 02 Clone from 04 RW CNAME RW SIP 03 04 GTID 16 Primary DC Failure

  17. A-side B-side 01 02 RW CNAME RW SIP 03 04 17 Cluster Running In DR DC

  18. A-side B-side RW CNAME RW SIP 01 02 03 04 18 DR Successover Back To Primary DC

  19. Scaling Out 01 02 01 02 01 02 03 04 03 04 03 04 01 02 01 02 01 02 03 04 03 04 03 04 01 02 01 02 01 02 03 04 03 04 03 04 19 9 Production MySQL Clusters

  20. Theme Pebbles, Not Boulders 01 02 01 02 03 04 03 04 01 02 01 02 03 04 03 04 20 Theme: Pebbles, Not Boulders

  21. 1. Setup and Replication 2. High Availability 3. Backups 4. Access and Security 5. Monitoring and Alerting 6. Advice

  22. Backups Percona XtraBackup 22 Percona XtraBackup for Backups

  23. Backups ● Percona XtraBackup ● Full daily on all inactive nodes ● Encrypted ● Monitored (we know if backups do not run) ● Most stored locally on separate RAID array; some uploaded to cloud ● Ruby script to wrap XtraBackup, monitoring, etc. ● Restore verified via rebuilding failed nodes 23 Backups at Square

  24. ulimit -n 262144 && umask 0227 && \ innobackupex \ --defaults-extra-file=#{backup_my.cnf} \ --slave-info \ --safe-slave-backup \ --databases=<db list file> \ --stream=xbstream \ --parallel=4 \ #{tmpdir} \ | pigz --fast -p 1 -c \ | gpg2 --batch --no-options -o #{dst} \ | tee >(md5sum > #{file}.md5) \ | sudo -u mysql split -a 2 --bytes=4096m - #{file}- 24 innobackupex command

  25. ulimit -n 262144 && umask 0227 && \ innobackupex \ --defaults-extra-file=#{backup_my.cnf} \ --slave-info \ --safe-slave-backup \ --databases=<db list file> \ --stream=xbstream \ --parallel=4 \ #{tmpdir} \ | pigz --fast -p 1 -c \ | gpg2 --batch --no-options -o #{dst} \ | tee >(md5sum > #{file}.md5) \ | sudo -u mysql split -a 2 --bytes=4096m - #{file}- 25 innobackupex command

  26. ulimit -n 262144 && umask 0227 && \ innobackupex \ --defaults-extra-file=#{backup_my.cnf} \ --slave-info \ --safe-slave-backup \ --databases=<db list file> \ --stream=xbstream \ --parallel=4 \ #{tmpdir} \ | pigz --fast -p 1 -c \ | gpg2 --batch --no-options -o #{dst} \ | tee >(md5sum > #{file}.md5) \ | sudo -u mysql split -a 2 --bytes=4096m - #{file}- 26 innobackupex command

  27. ulimit -n 262144 && umask 0227 && \ innobackupex \ --defaults-extra-file=#{backup_my.cnf} \ --slave-info \ --safe-slave-backup \ --databases=<db list file> \ --stream=xbstream \ --parallel=4 \ #{tmpdir} \ | pigz --fast -p 1 -c \ | gpg2 --batch --no-options -o #{dst} \ | tee >(md5sum > #{file}.md5) \ | sudo -u mysql split -a 2 --bytes=4096m - #{file}- 27 innobackupex command

  28. ulimit -n 262144 && umask 0227 && \ innobackupex \ --defaults-extra-file=#{backup_my.cnf} \ --slave-info \ --safe-slave-backup \ --databases=<db list file> \ --stream=xbstream \ --parallel=4 \ #{tmpdir} \ | pigz --fast -p 1 -c \ | gpg2 --batch --no-options -o #{dst} \ | tee >(md5sum > #{file}.md5) \ | sudo -u mysql split -a 2 --bytes=4096m - #{file}- 28 innobackupex command

  29. ulimit -n 262144 && umask 0227 && \ innobackupex \ --defaults-extra-file=#{backup_my.cnf} \ --slave-info \ --safe-slave-backup \ --databases=<db list file> \ --stream=xbstream \ --parallel=4 \ #{tmpdir} \ | pigz --fast -p 1 -c \ | gpg2 --batch --no-options -o #{dst} \ | tee >(md5sum > #{file}.md5) \ | sudo -u mysql split -a 2 --bytes=4096m - #{file}- 29 innobackupex command

  30. ulimit -n 262144 && umask 0227 && \ innobackupex \ --defaults-extra-file=#{backup_my.cnf} \ --slave-info \ --safe-slave-backup \ --databases=<db list file> \ --stream=xbstream \ --parallel=4 \ #{tmpdir} \ | pigz --fast -p 1 -c \ | gpg2 --batch --no-options -o #{dst} \ | tee >(md5sum > #{file}.md5) \ | sudo -u mysql split -a 2 --bytes=4096m - #{file}- 30 innobackupex command

  31. 1. Setup and Replication 2. High Availability 3. Backups 4. Access and Security 5. Monitoring and Alerting 6. Advice

  32. Access and Security ● App-specific MySQL clusters 32 Access and Security

  33. Access and Security ● App-specific MySQL clusters App: foo 01 02 01 02 01 02 03 04 03 04 03 04 foo-001 foo-002 bar-001 33 Access and Security

  34. Access and Security ● App-specific MySQL clusters ● TLS for all connections and auth (no passwords) 34 Access and Security

  35. Access and Security ● App-specific MySQL clusters ● TLS for all connections and auth (no passwords) GRANT ALL PRIVILEGES ON *.* TO 'daniel'@'%' REQUIRE ISSUER '/CN=Square/O=Square Inc./ST=California/L=San Francisco' SUBJECT '/CN=daniel/serialNumber=1' 35 Access and Security

  36. Access and Security ● App-specific MySQL clusters ● TLS for all connections and auth (no passwords) ● Rotating accounts: human, app, system 36 Access and Security

  37. Access and Security ● App-specific MySQL clusters ● TLS for all connections and auth (no passwords) ● Rotating accounts: human, app, system ● Isolated security zones: directional, no cross-talk, no hardware reuse 37 Access and Security

  38. Access and Security ● App-specific MySQL clusters ● TLS for all connections and auth (no passwords) ● Rotating accounts: human, app, system ● Isolated security zones: directional, no cross-talk, no hardware reuse Blue Zone Red Zone INBOUND TCP 3306 OUTBOUND 01 02 01 02 03 04 03 04 38 Access and Security

Recommend


More recommend