Automatic MySQL Schema Management with Skeema Evan Elias Percona Live, April 2017
What is Schema Management? • Organize table schemas in a repo • Execution of all DDL, on the correct MySQL instances, with the correct OSC flags • Not to be confused with specific OSC tools! Schema management is a separate concept / higher layer 2
Prior Art 3
Introducing Skeema • CLI tool for schema management • Designed specifically for MySQL / InnoDB • Free and open source (Apache 2.0 license) http://github.com/skeema/skeema • Written in Go 4
Location in the MySQL toolchain Skeema’s functionality combines/replaces several types of tools: + + Schema Schema DDL / OSC dumper diff-er executor 5
DESIGN PRINCIPLES
Manage schemas like code • Top-level directory for each pool/cluster • Subdirectory for each database schema • One file for each table, containing CREATE TABLE statement • Config files may be placed in any dir / subdir, and they “stack” 7
Declarative schema management • Filesystem defines what is the ideal state of schemas • Tool automatically figures out how to reach this state on a DB No need to ever create migrations or DDL by hand! 8
None of this… 9
or this… 10
DEFINITELY not this… 11 XML?!? NOPE NOPE NOPE
Repo is CREATE TABLE all the way down git diff skeema diff 12
Very very agnostic • OSC tool • Service discovery • Sharding scheme • SCM tool • Application language 13
Environments • Configure multiple environments (dev, test, stage, prod; any arbitrary name) • Also support patterns such as “every engineer has their own dev DB on localhost” • Ability to migrate one environment at a time • Visibility into the state of an environment via diff 14
Flexible configuration Config powerful enough for diverse use-cases: • Ability to override settings at multiple levels: environment, cluster, or schema • Options for conditionally controlling behavior based on table size 15
Configuration examples “In production, use pt-osc when altering tables over 1GB, or use MySQL 5.6 built-in online DDL for anything smaller” [production] alter-wrapper="/usr/local/bin/pt-online-schema-change --execute --alter {CLAUSES} D={SCHEMA},t={TABLE},h={HOST},P={PORT},u={USER},p={PASSWORDX}" alter-wrapper-min-size=1g alter-algorithm=inplace alter-lock=none 16
Configuration examples “In dev, always just use standard ALTERs” [development] skip-alter-wrapper skip-alter-algorithm skip-alter-lock 17
Configuration examples “In everything but dev, apply extra safety checks to destructive DDL, but only for tables with at least one row. In dev, always allow destructive DDL without any extra safety checks. safe-below-size=1 [dev] allow-unsafe 18
Configuration examples “For a specific schema, apply different flags to pt-osc” In /var/schemas/.skeema: alter-wrapper="/usr/local/bin/pt-online-schema-change …” In /var/schemas/some-cluster/some-schema/.skeema: alter-wrapper="/usr/local/bin/pt-online-schema-change —extra-flag …” 19
Configuration examples “When interacting with any schema on a particular cluster, increase innodb_lock_wait_timeout and use different sql_mode” In /var/schemas/special-cluster/.skeema: connect-options=“innodb_lock_wait_timeout=60,sql_mode= ‘STRICT_ALL_TABLES,ALLOW_INVALID_DATES'” 20
USAGE
Dumping schemas to filesystem Command: skeema init [environment] + + Schema Schema DDL / OSC dumper diff-er executor 22
skeema init • Point it at a MySQL instance • For each database schema on the instance, creates a subdir • For each table in each schema, creates one .sql file, containing a CREATE TABLE statement • In each dir, a .skeema option file is created, storing the host/ port (at top level) or schema name (in each database subdir) 23
skeema init: options Same option-handling as the standard MySQL client! • -h host • -u user • -P port • -ppassword • -S socket # if host=localhost • -p # prompt for password If ~/.my.cnf exists, automatically parsed for user and password. Default output dirname based on host/port; override with --dir 24
skeema init Example above assumes user and password obtained from ~/.my.cnf Port is non-standard due to use of Docker 25
skeema init 26
skeema init 27
Updating files from out-of-band changes Command: skeema pull [environment] + + Schema Schema DDL / OSC dumper diff-er executor 28
skeema pull 29
Linting schemas Command: skeema lint + + Schema Schema DDL / OSC dumper diff-er executor 30
skeema lint • Identifies invalid SQL • Normalizes format of table files to match MySQL’s SHOW CREATE TABLE • Future versions may also warn on redundant indexes, lack of PK, etc • Exit code 0=no changes, 1=reformatted something, 2+=error 31
skeema lint Original CREATE TABLE, with any arbitrary formatting 32
skeema lint 33
skeema lint After reformatting 34
skeema lint Another example: catching typos / invalid SQL 35
skeema lint 36
Diff-ing schemas Command: skeema diff [environment] + + Schema Schema DDL / OSC dumper diff-er executor 37
skeema diff • Compares state of database(s) to that of filesystem, generating DDL as output • Generated DDL, if executed, would cause database(s) to match the filesystem • Usable as a “dry run” before executing schema changes 38
skeema diff Starting with the above table as an example 39
skeema diff Let’s say we want to add a new col and index. Just edit the file. 40
skeema diff skeema diff shows, but does not run, the corresponding DDL 41
skeema diff: use in pipelines • STDOUT is SQL, STDERR is human log messages • Exit code 0=no diffs, 1=some diffs, 2+=error • Use --quick option to only output instances with at least one diff. (Good for finding shards where a schema change didn’t complete!) • If an external OSC tool is configured, the output will include the full command-line 42
Executing DDL Command: skeema push [environment] + + Schema Schema DDL / OSC dumper diff-er executor 43
skeema push • Exact same behavior as skeema diff , but actually executes the DDL! • Recommended workflow: run skeema diff first to preview changes, then skeema push to execute them • Can double-check by running skeema diff again after; should show no remaining differences 44
skeema push 45
Destructive actions and safety checks 46
ADVANCED CONFIG
Shell-outs and variables • Several options configure external command-lines • These support use of variable placeholders • Conn variables: {HOST}, {PORT}, {USER}, etc • DDL-specific vars: {TYPE}, {CLAUSES}, {SIZE}, etc • Automatic escaping of quotes in variable values 48
Online schema change • alter-wrapper: external command for ALTERs • ddl-wrapper: external command for ALTER, CREATE, DROP • alter-algorithm and alter-lock: force use of MySQL 5.6 online DDL (if not using an external tool) • alter-wrapper-min-size: Only apply alter-wrapper to tables that are at least this size, in bytes 49
Online schema change example “In production, use pt-osc when altering tables over 1GB, or use MySQL 5.6 built-in online DDL for anything smaller” [production] alter-wrapper="/usr/local/bin/pt-online-schema-change --execute --alter {CLAUSES} D={SCHEMA},t={TABLE},h={HOST},P={PORT},u={USER},p={PASSWORDX}" alter-wrapper-min-size=1g alter-algorithm=inplace alter-lock=none 50
Online schema change example 51
Service discovery and sharding • Ordinarily, host option specifies a hostname or IP address (or comma-separated list of them) • For service discovery, host-wrapper option allows host to represent a cluster name instead • host-wrapper defines a command-line to shell out to. Command’s STDOUT will be parsed and interpreted as one or more hosts (or host:port) to use 52
Service discovery example The executed script should be capable of doing lookups (e.g. in ZooKeeper, etcd, or Consul) such as "return the master of pool foo" or "return all shard masters for sharded pool xyz" host-wrapper=/some/script.sh /mysql/{ENVIRONMENT}/{HOST} 53
Sharding behavior • For a sharded topology, ensure that host-wrapper returns all shard masters • diff and push default to applying to all shards, one at a time • first-only option to just run against first returned shard • concurrent-instances option to run against multiple at once • pull always runs against first shard only 54
WORKFLOWS
Local dev DB per engineer • Engineers run skeema on dev box when interacting with dev DB • Configure dev environment in .skeema file to use localhost [development] host=localhost socket=/path/to/mysql.sock allow-unsafe skip-alter-wrapper skip-alter-algorithm skip-alter-lock 56
Recommend
More recommend