Open Source SQL databases enters millions queries per second era Alexander Korotkov, Sveta Smirnova Postgres Professional, Percona 2016 Alexander Korotkov, Sveta Smirnova Open Source SQL databases enters millions queries per second era 1 / 33
Russian developers of PostgreSQL: PostgreSQL CORE Open Source SQL databases enters millions queries per second era Alexander Korotkov, Sveta Smirnova Extensions Alexander Korotkov, Teodor Sigaev, Oleg Bartunov GiST(KNN), GIN, SP-GiST 2 / 33 ▶ Locale support ▶ PostgreSQL extendability: ▶ Full Text Search (FTS) ▶ NoSQL (hstore, jsonb) ▶ Indexed regexp search ▶ Access method extendability ▶ Speakers at PGCon, PGConf: 20+ talks ▶ intarray ▶ GSoC mentors ▶ pg_trgm ▶ 3 PostgreSQL major contributors + 1 committer ▶ ltree ▶ Conference organizers ▶ hstore ▶ 50+ years of PostgreSQL expertship: dev., audit, consult. ▶ plantuner ▶ Postgres Professional company co-founders ▶ jsquery
Sveta Smirnova support Alexander Korotkov, Sveta Smirnova Open Source SQL databases enters millions queries per second era 3 / 33 ▶ MySQL Support engineer for more than 10 years ▶ Author of book MySQL Troubleshooting ▶ JSON UDF functions: design prototype for built-in JSON ▶ Pluggable FILTER clause for MySQL ▶ Speaker at Percona Live, OOW, Fosdem, DevConf, ... ▶ http://www.slideshare.net/SvetaSmirnova ▶ https://twitter.com/svetsmirnova ▶ https://github.com/svetasmirnova
Scalability benchmark What do we have? Open Source SQL databases enters millions queries per second era Alexander Korotkov, Sveta Smirnova for both MySQL 5.7 and PostgreSQL 9.6. We want to run 4 / 33 ▶ Outstanding scalability improvements in PostgreSQL 9.6 and MySQL 5.7. ▶ Nice benchmarks for MySQL 5.7 made by Dimitri Kravtchuk. ▶ https://goo.gl/aw0sM6 ▶ https://goo.gl/xc8cp8 ▶ https://goo.gl/7dwkoY ▶ Some benchmarks for PostgreSQL 9.6 ▶ https://goo.gl/RNWYxb ▶ https://goo.gl/3WrOAH ▶ Access to 72-cores server for testing. ▶ same tests ▶ on the same machine ▶ using the same tool
Try #1: synchronizing tests statements. Alexander Korotkov, Sveta Smirnova Open Source SQL databases enters millions queries per second era 5 / 33 ▶ For PostgreSQL standard is pgbench ▶ For MySQL SysBench is widely used ▶ It is scriptable ▶ Easy to communicate with MySQL developers via bugs database, email and so on ▶ SysBench has built-in PostgreSQL support. ▶ I converted pgbench tests into Lua ▶ (open-database-bench) ▶ In PostgreSQL world, it’s standard to run small SQL-queries as prepared
sysbench with prepared statements: try 1 - Open Source SQL databases enters millions queries per second era Alexander Korotkov, Sveta Smirnova switch (stmt->bound_param[i].type) { continue; if (stmt->bound_param[i].is_null && *(stmt->bound_param[i].is_null)) + if (stmt->bound_param[i].is_null) { for (i = 0; i < (unsigned)pgstmt->nparams; i++) /* Convert SysBench bind structures to PgSQL data */ invalid input syntax for integer: "" FATAL: PQexecPrepared() failed: 7 ERROR: (last message repeated 7 times) FATAL: failed to execute function `event': 3 6 / 33 ▶ Problem: NULL handling is broken in sysbench for PostgreSQL. ▶ Fix. Pull request was merged by Alexey Kopytov.
sysbench with prepared statements: try 2 81604 3,6 0,0 0:00.47 postgres 93128 korotkov 20 0 32,904g 77936 76536 S 3,6 0,0 0:00.46 postgres 93130 korotkov 20 0 32,904g 80204 S 80720 3,6 0,0 0:00.47 postgres 93146 korotkov 20 0 32,904g 81112 79704 S 3,6 0,0 0:00.46 postgres .............................................................................. Alexander Korotkov, Sveta Smirnova Open Source SQL databases enters millions queries per second era 79312 S 0 32,904g 20 0:00.47 postgres prepared statements. 93087 korotkov 20 0 9289440 3,718g 2964 S 242,6 0,1 0:32.82 sysbench 93161 korotkov 20 0 32,904g 81612 80208 S 4,0 0,0 93116 korotkov 20 0 32,904g 93121 korotkov 0:00.47 postgres 0,0 3,6 79020 S 80424 20 93118 korotkov 0:00.46 postgres 0,0 3,6 79424 S 80828 0 32,904g 7 / 33 ▶ Problem 2: sysbench can’t load PostgreSQL when using ▶ ...give up with sysbench, let’s use pgbench!
pgbench OLTP read-only script SELECT c FROM sbtest WHERE id = :id1; Open Source SQL databases enters millions queries per second era Alexander Korotkov, Sveta Smirnova SELECT DISTINCT c FROM sbtest WHERE id BETWEEN :r4l AND :r4u; SELECT c FROM sbtest WHERE id BETWEEN :r3l AND :r3u ORDER BY c ; SELECT SUM (K) FROM sbtest WHERE id BETWEEN :r2l AND :r2u; SELECT c FROM sbtest WHERE id BETWEEN :r1l AND :r1u; SELECT c FROM sbtest WHERE id = :id10; ............................................................... \ set r4u :r4l + :range_size \ set table_size 10000000 \ set r4l random(1, :table_size) ............................................................... \ set r1u :r1l + :range_size \ set r1l random(1, :table_size) \ set id10 random(1, :table_size) ............................................................... \ set id1 random(1, :table_size) \ set range_size 100 8 / 33
pgbench OLTP read-write script UPDATE sbtest SET k = k + 1 WHERE id = :u1; Open Source SQL databases enters millions queries per second era Alexander Korotkov, Sveta Smirnova I’ve to implement sb_rand_str() in server side C. COMMIT ; INSERT INTO sbtest (id, k, c , pad ) VALUES (:u3, :u4, sb_rand_str('###########-###########-###########-###########-###########-###########-###########-###########-###########-###########'), sb_rand_str('###########-###########-###########-###########-###########')) ON CONFLICT DO NOTHING ; DELETE FROM sbtest WHERE id = :u3; UPDATE sbtest SET c = sb_rand_str('###########-###########-###########-###########-###########-###########-###########-###########-###########-###########') WHERE id = :u2; SELECT DISTINCT c FROM sbtest WHERE id BETWEEN :r4l AND :r4u; \ set table_size 10000000 ............................................................... SELECT c FROM sbtest WHERE id = :id1; BEGIN ; \ set u4 random(1, :table_size) \ set u3 random(1, :table_size) \ set u2 random(1, :table_size) \ set u1 random(1, :table_size) ............................................................... 9 / 33
How to run this? It’s on the github and reproducible! $ git clone https://github.com/postgrespro/pg_oltp_bench.git $ psql DB -f oltp_init.sql $ pgbench -c 100 -j 100 -M prepared -f oltp_ro.sql -T 300 -P 1 DB Alexander Korotkov, Sveta Smirnova Open Source SQL databases enters millions queries per second era 10 / 33 $ cd pg_oltp_bench $ make USE_PGXS=1 $ sudo make USE_PGXS=1 install $ psql DB -c "CREATE EXTENSION pg_oltp_bench;" $ pgbench -c 100 -j 100 -M prepared -f oltp_rw.sql -T 300 -P 1 DB
Inequal comparison! Alexander Korotkov, Sveta Smirnova Open Source SQL databases enters millions queries per second era 11 / 33
Benchmark: Point selects Alexander Korotkov, Sveta Smirnova Open Source SQL databases enters millions queries per second era 12 / 33
Benchmark: OLTP RO results Alexander Korotkov, Sveta Smirnova Open Source SQL databases enters millions queries per second era 13 / 33
Benchmark: OLTP RW results Alexander Korotkov, Sveta Smirnova Open Source SQL databases enters millions queries per second era 14 / 33
That was close... Alexander Korotkov, Sveta Smirnova Open Source SQL databases enters millions queries per second era 15 / 33
Pin/UnpinBufger in lockless manner Before ”touching”any block of data, backend have to ”pin”correcponding bufger. Pin/UnpinBufger – very frequent operation. Before: S_LOCK(bufHdr); bufHdr->pinCount++; S_UNLOCK(bufHdr); Large: atomic_increment(buf_hdr->pinCount); See commit details: https://goo.gl/LLCvR8 . Alexander Korotkov, Sveta Smirnova Open Source SQL databases enters millions queries per second era 16 / 33
Reduce ProcArrayLock contention snapshot requires shared ProcArrayLock. Committing transaction requires exclusive ProcArrayLock. See commit details: https://goo.gl/ZxiilI . Alexander Korotkov, Sveta Smirnova Open Source SQL databases enters millions queries per second era 17 / 33 ▶ Snapshot contains list of running transaction ids. Getting ▶ Transaction commit clears its id from shared memory. ▶ High TPS leads to high ProcArrayLock contention. ▶ Solution: clear transaction id in group.
Reduce CLogControlLock contention Setting transaction status requires exclusive CLogControlLock. Reading new CLOG page requires exclusive CLogControlLock. transaction status. Number of demanded transactions is also high. have to read CLOG pages rarely. See commit details: https://goo.gl/aaPYsJ . Alexander Korotkov, Sveta Smirnova Open Source SQL databases enters millions queries per second era 18 / 33 ▶ Getting transaction status requires shared CLogControlLock. ▶ On modern multicore systems, backends frequently get ▶ Solution: increase CLOG bufgers from 32 to 128. We would
What is PostgreSQL bottleneck? Alexander Korotkov, Sveta Smirnova Open Source SQL databases enters millions queries per second era 19 / 33
What ARE PostgreSQL bottlenecks? Alexander Korotkov, Sveta Smirnova Open Source SQL databases enters millions queries per second era 20 / 33
Recommend
More recommend