
Herd of Containers: PostgreSQL in containers at BlaBlaCar. Saâd DIF, Database Engineer. pgDay Paris, Mar 15, 2018.


  1. Herd of Containers

  2. Saâd DIF, Database Engineer

  3. Herd of Containers: PostgreSQL in containers at BlaBlaCar. pgDay Paris, Mar 15, 2018

  4. Today’s agenda: BlaBlaCar overview; PostgreSQL usage at BlaBlaCar; switching to a new implementation

  5. BlaBlaCar Overview

  6. Facts and Figures: 30 million members; 60 million mobile app downloads (iPhone and Android); founded in 2006; 15 million travellers in the past year; 1 million tonnes less CO2; currently in 22 countries: France, Spain, UK, Italy, Poland, Hungary, Croatia, Serbia, Romania, Germany, Belgium, India, Mexico, The Netherlands, Luxembourg, Portugal, Ukraine, Czech Republic, Slovakia, Russia, Brazil and Turkey.

  7. Core Data Ecosystem: 1. MySQL, the main database (MariaDB 10.0+, Galera Cluster); 2. Cassandra, column-oriented and distributed; 3. Redis, in-memory key-value store with optional durability.

  8. Core Data Ecosystem: 4. ElasticSearch, JSON documents, full-text search, distributed; 5. PostgreSQL, ORDBMS, extensibility, stability.

  9. Containers. Why containers? Resource allocation; deployment speed; on premise; skills already there; cost.

  10. Rkt. Why rkt over Docker? CoreOS Container Linux as the Linux distribution; simple and secure; only runs containers; Fleet orchestration, shipped by default with CoreOS.

  11. GGN: generates systemd units for the containers. Dgr: builds and configures App Container Images. Pods: aggregate images in one shared environment.

  12. Containers (architecture diagram): service pods (e.g. front1 with php, nginx and nerve; pgsql-main1 with pgsql, nerve, synapse and monitoring) are built from the service codebase with dgr and stored in the container registry; ggn generates the units and the fleet/etcd cluster ("distributed init system") runs them with rkt; service discovery goes through ZooKeeper; the hardware is a single type of bare-metal server running CoreOS, with 3 disk profiles.
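
A minimal sketch of that build/ship/run flow, assuming standard dgr and fleetctl invocations; the exact subcommands, the push step and the unit name are illustrative, not taken from the slides:

cd ~/build-tools/aci/aci-postgresql-bdr
dgr build                              # build the App Container Image for the service
dgr push                               # push the image to the internal container registry (assumed step)
fleetctl start pgsql-main1.service     # fleet schedules the generated unit on the CoreOS cluster
fleetctl list-units | grep pgsql       # check which host the pod landed on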

  13. Service Discovery. Why? 1. Get rid of DNS internally; adapt to change.

  14. Service Discovery. 2. ZooKeeper: key-value store; reliable, fast, scalable.

  15. Service Discovery. 3. Report: Go-Nerve runs the health checks, writes ephemeral keys, and is present on each pod.

  16. Service Discovery. 4. Discover: Go-Synapse watches ZooKeeper and updates the HAProxy configuration.

  17. Service Discovery (diagram): on the backend pod, go-nerve runs the health checks and reports to ZooKeeper in service keys (e.g. /database/node1); on the client pod, go-synapse watches the ZooKeeper service keys and reloads HAProxy when changes are detected; applications hit their local HAProxy to reach the backends.
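
A hedged way to observe that chain from a client pod, assuming a reachable ZooKeeper ensemble and the /database service key from the diagram; host names, ports, user and database names are illustrative:

zkCli.sh -server zookeeper:2181 ls /database   # ephemeral keys registered by go-nerve for the healthy nodes
psql -h 127.0.0.1 -p 5432 -U app main          # the app only talks to its local HAProxy; go-synapse keeps
                                               # that HAProxy pointed at the backends listed in ZooKeeper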

  18. PostgreSQL usage at BlaBlaCar

  19. Usage: third-party applications; home-made tools; prerequisite: confidence; spatial workloads with PostGIS.

  20. PostGIS (map): corridoring versus point-to-point for a travel company, e.g. a Paris to Lyon ride passing near Rambouillet and Le Creusot.

  21. 3,685 rides passed by Amiens last month; 1M meeting points; 50k row reads per minute.
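
As an illustration of the kind of corridoring query behind these numbers, a hedged PostGIS sketch; the rides and meeting_points tables, their columns and the 5 km radius are hypothetical:

# Rides whose route passes within 5 km of the Amiens meeting point.
psql -d main -c "
  SELECT r.id
  FROM   rides r
  JOIN   meeting_points m ON m.name = 'Amiens'
  WHERE  ST_DWithin(r.route::geography, m.location::geography, 5000);"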

  22. Operate: manual interventions, not friendly; streaming replication, painful failover and recovery. Change!

  23. Target: scale writes; ease deployments; maximum availability; expandable resources; get rid of dedicated slaves and failovers.

  24. Possibilities: Postgres-XC (X2), Postgres-XL, Bucardo, Slony, PgLogical, Londiste.

  25. Switching to a new implementation

  26. BDR (Bi-Directional Replication): open-source project by 2ndQuadrant; multi-master asynchronous replication; 2 to 48 nodes; optimal for geo-distributed databases.

  27. BDR: the confirmation. All nodes support reads and writes; no failovers; no other processes or nodes needed; partition tolerant.

  28. BDR: caveats. Modified version of PostgreSQL 9.4 (BDR 2.0 with PostgreSQL 9.6 for 2ndQuadrant support customers); replication lag; conflicts; DDL lock; some statements are not replicated; some statements are not supported yet.
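
A hedged sketch of checks one can run to keep an eye on those caveats, using the BDR catalog tables named on these slides and the stock 9.4 statistics views; connection details are illustrative:

# Node membership and status as seen by BDR.
psql -d main -c "SELECT node_name, node_status FROM bdr.bdr_nodes;"

# Apply lag per peer, in bytes, from the standard statistics view.
psql -d main -c "
  SELECT application_name,
         pg_xlog_location_diff(pg_current_xlog_location(), replay_location) AS lag_bytes
  FROM   pg_stat_replication;"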

  29. Implementation
[~/build-tools/aci/aci-postgresql-bdr] $ tree
.
├── Jenkinsfile
├── aci-manifest.yml
├── attributes
│   ├── base.yml
│   └── postgresql.yml
├── files
│   └── tmp
│       └── postgresql
│           ├── environment
│           ├── pg_ctl.conf
│           ├── pg_ident.conf
│           └── start.conf
├── runlevels
│   ├── build
│   │   └── 00.install.sh
│   └── build-late
│       └── 00.clean.sh
└── templates
    └── dgr
        └── runlevels
            └── prestart-late
                ├── 00.init-instance.sh.tmpl
                └── 01.init-database.sh.tmpl
Slide annotations on the prestart-late templates: run, check whether the node already has entries in the bdr_nodes table and, if yes, skip the init; otherwise initialise the instance and the database.
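
A hedged sketch of that check as it might appear in a prestart-late template; the NODE_NAME variable and the database name are illustrative:

# Skip initialisation when this node is already registered in the BDR group.
already_init=$(psql -At -d main -c \
  "SELECT count(*) FROM bdr.bdr_nodes WHERE node_name = '${NODE_NAME}';" 2>/dev/null || echo 0)
if [ "${already_init:-0}" -gt 0 ]; then
  echo "node ${NODE_NAME} already present in bdr_nodes, skipping init"
  exit 0
fi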

  30. Implementation (init). If the node has no "donor" attribute: init as a new group. When the node has "donor" attributes, for a new fresh node: 1. retrieve the user definitions from the donor (pg_dumpall -g); 2. join the BDR group; 3. create the minimum objects if not present. For a node that is already referenced but changed host or lost its data: 1. part the local node on the donor; 2. delete its entries on the donor (bdr_nodes and bdr_connections).
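
A hedged sketch of the join-from-donor path using the BDR 9.4 SQL API; the host names, the "main" database and running everything as postgres are assumptions for the example:

DONOR=pgsql-main1
LOCAL=pgsql-main2

# 1. Retrieve the user (role) definitions from the donor.
pg_dumpall -g -h "$DONOR" -U postgres | psql -h "$LOCAL" -U postgres

# 2. Join the existing BDR group through the donor, then wait until the node is ready.
psql -h "$LOCAL" -U postgres -d main <<'SQL'
SELECT bdr.bdr_group_join(
  local_node_name   := 'pgsql-main2',
  node_external_dsn := 'host=pgsql-main2 dbname=main',
  join_using_dsn    := 'host=pgsql-main1 dbname=main'
);
SELECT bdr.bdr_node_join_wait_for_ready();
SQL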

  31. Monitoring and Alerting: an exporter exposes the metrics; Prometheus for smart monitoring; Grafana for beautiful visualizations; PagerDuty as the incident manager.

  32. Monitoring key principles: usage and saturation.

  33. BDR exporter specifics
$ cat aci-prometheus-postgresql-exporter/templates/queries.tmpl.yaml
{{ if .use_bdr }}
# Template values for the BDR specifics
pg_replication_bdr_count:
  query: "select (select count(*) from bdr.bdr_nodes) as bdr_nodes, (select count(*) from bdr.bdr_connections) as bdr_connections;"
  metrics:
    - bdr_nodes:
        usage: "GAUGE"
        description: "Number of rows in the bdr_nodes table"
    - bdr_connections:
        usage: "GAUGE"
        description: "Number of rows in the bdr_connections table"
{{ end }}
# Extend metrics to all PostgreSQL needs
pg_replication_count:
  query: "select (select count(*) from pg_stat_replication) as stat_repli, (select count(*) from pg_replication_slots where active=true) as rep_slots;"
  metrics:
    - stat_repli:
        usage: "GAUGE"
        description: "Number of rows in the pg_stat_replication table"
    - rep_slots:
        usage: "GAUGE"
        description: "Number of rows in the pg_replication_slots table with the active status"
[...]
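
For context, a custom queries file like this is typically handed to postgres_exporter through its extend-queries flag; the DSN, path and flag spelling below are assumptions about the upstream exporter, not taken from the slides:

export DATA_SOURCE_NAME="postgresql://monitoring@127.0.0.1:5432/main?sslmode=disable"
postgres_exporter --extend.query-path=/etc/postgres_exporter/queries.yaml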

  34. Backup and Recovery: 1. retrieve the dumps (pg_dump); 2. alter the structure dump; 3. load the structure and data dumps.
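
A hedged sketch of that three-step flow; the host names, database name and what exactly gets altered in the structure dump are illustrative:

# 1. Retrieve a structure dump and a data dump from a live node.
pg_dump -h pgsql-main1 -U postgres --schema-only -f structure.sql main
pg_dump -h pgsql-main1 -U postgres --data-only   -f data.sql      main

# 2. Alter the structure dump as needed (e.g. strip BDR-specific objects) before restoring.

# 3. Load the structure, then the data, into the target instance.
psql -h pgsql-restore -U postgres -d main -f structure.sql
psql -h pgsql-restore -U postgres -d main -f data.sql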

  35. Backup and Recovery
$ cat pod-mysql-backup/aci-backup/templates/opt/backup-main.tmpl.sh
function startbackup {
  begin_unixtime=$(date +%s)
  cat <<EOF | curl --data-binary @- http://prometheus-gw:9091/metrics/job/backup_{{.env}}/target/$node/service/$service/type/{{.backup.type}}
# HELP backup_begin_unixtime
# TYPE backup_begin_unixtime counter
backup_begin_unixtime $begin_unixtime
EOF
}
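
The alert on the next slide keys off backup_end_unixtime, so the script presumably pushes that metric the same way when the backup finishes; this counterpart mirrors the pattern above and is a hedged reconstruction, not taken verbatim from the slides:

function endbackup {
  end_unixtime=$(date +%s)
  cat <<EOF | curl --data-binary @- http://prometheus-gw:9091/metrics/job/backup_{{.env}}/target/$node/service/$service/type/{{.backup.type}}
# HELP backup_end_unixtime
# TYPE backup_end_unixtime counter
backup_end_unixtime $end_unixtime
EOF
}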

  36. Alerting
$ cat prometheus-rules/alert.postgresql.rules
# Alert: backups that have not completed for more than a day
ALERT BackupsTooOld
  IF ( time() - backup_end_unixtime{exported_service=~".*postgresql.*"} ) > ( 3600 * 24 )
  LABELS {
    severity="warning",
    stack="backups",
    team="data_infrastructure"
  }
  ANNOTATIONS {
    summary="Backup {{ $labels.type }} on {{ $labels.exported_service }}.{{ $labels.target }} is too old.",
    dashboard="https://grafana.blabla.car/dashboard/db/db-backups",
  }
Slide annotations: the PromQL expression finds the unhealthy services; the labels route the alert to Slack and PagerDuty; the annotations use templating to give clear descriptions and link to dashboards and ops runbooks.

  37. Feedback: clearly satisfied with the availability; sanity checks; reactive community; BDR 3.0 coming soon! Know what your needs are.

  38. What’s next?
