Measuring and Reducing Postgres Transaction Latency (updated version)
Fabien Coelho
MINES ParisTech, PSL Research University
pgDay Paris – March 23, 2017

Talk Outline

1 Introduction
    Subject
    Typical Web Application
    Transaction Performance Definitions
    pgbench
2 Performance Comparisons
    General Approach
    Two Connection Costs
    Latency Pitfalls
    Throughput and Latency Control
    Three Storage Options
    Two Protocol Impacts
    Four Query Combination Tricks
    Reducing Server Distance
    Performance Scalability
    Miscellaneous Settings
3 Conclusion
    Latency and Throughput Wrap-Up
    Lessons Learned
    Contributions to Postgres

Subject

  Small OLTP: OnLine Transaction Processing
  CRUD queries: ... WHERE pk=?
  data fit in shared buffers: small, few GB
  RW, RO: pgbench builtins

Focus and Motivation

  performance with emphasis on latency: interactive web app
  experiment & measure: do not assume!
  latency performance: RW × 63, RO × 219

Typical Web Application: 3-Tier Architecture

  Client: user acts on user-agent, sends to
  Server: process request, database operations to
  Database: stores and retrieves data

  [Diagram: User, Client, Server, Database]

Database Operations

  Connection: TCP/IP, SSL & AAA
  Request-Response cycles: transfer, parse, plan, execute, transfer back

Transaction Performance Definitions: time & operations

  Throughput: operations per time unit (tx/s)
    usual approach, load measured in tps
  Latency: time for one operation (ms/tx)
    must fit application requirements

Comments

  correlated and contradictory
  max vs enough and vice-versa
  sensitive to many settings: net, soft & hard
  throughput bottleneck & latency additivity: deep voodoo!

Postgres Performance Swiss Army Knife: pgbench

Available Features

  input: SQL-like scripts with minimal client-side language
  options: time to run, prepared, reconnections, ...
  parallelism: threads, clients, asynchronous calls
  output: statistical performance data

Caveats

  long enough: warm-up, checkpoint and vacuum
  several times: reproducibility
  pedal-to-the-metal: max speed test not representative

Default TPC-B-like Transaction: pgbench -b tpcb-like

TPC-B-like banking transaction
Pattern: 3 updates, 1 insert, 1 select

    -- random ids and amount
    \set aid random(1, 100000 * :scale)
    \set bid random(1, 1 * :scale)
    \set tid random(1, 10 * :scale)
    \set delta random(-5000, 5000)
    -- actual transaction
    BEGIN;
    UPDATE pgbench_accounts
      SET abalance = abalance + :delta WHERE aid = :aid;
    SELECT abalance
      FROM pgbench_accounts WHERE aid = :aid;
    UPDATE pgbench_tellers
      SET tbalance = tbalance + :delta WHERE tid = :tid;
    UPDATE pgbench_branches
      SET bbalance = bbalance + :delta WHERE bid = :bid;
    INSERT INTO pgbench_history (tid, bid, aid, delta, mtime)
      VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
    END;

General Approach

Experiment & Measure (RW or RO)

  one-client runs: unless otherwise stated
  independent tests: one change at a time
  final wrap up: cumulative changes

Exploration (RW or RO)

  two connection costs
  latency pitfalls
  throughput & latency control
  three storage options
  two protocol impacts
  four query combinations
  reducing server distance
  scalability and misc. stuff

Performance Comparisons: Two Connection Costs

Connection Costs: pgbench -C

  Client (pgbench): 8 cores, 16 GB
  LAN: 1 Gbps
  Server (postgres): 16 cores, 32 GB, HDD
  Postgres 9.6.1

Initialization and Benchmarks

  pgbench -i -s 100                                    1.5 GB
  pgbench -T 2000 -C "host=server sslmode=require"     36.1 tps
  pgbench -T 2000 -C "host=server sslmode=disable"     56.4 tps
  pgbench -T 2000 "host=server sslmode=disable"        105.4 tps

  connection AAA                 8.2 ms
  SSL negotiation               10.0 ms
  transfers and transactions     9.5 ms

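The three latency figures appear to follow from the measured throughputs: with a single client, the time per transaction is the inverse of the tps, and each cost is the difference between two runs. The sketch below redoes that arithmetic. The function and variable names are illustrative, and the attribution of each difference to one step is an assumption read off the slide.

```python
def latency_ms(tps):
    """Per-transaction time in ms for a single client running at `tps`."""
    return 1000.0 / tps

# Throughputs reported on the slide (one client, 2000 s runs).
ssl_reconnect = latency_ms(36.1)     # -C with sslmode=require
plain_reconnect = latency_ms(56.4)   # -C with sslmode=disable
persistent = latency_ms(105.4)       # no -C, sslmode=disable

# Assumed attribution: each difference isolates one step.
breakdown = {
    "SSL negotiation": ssl_reconnect - plain_reconnect,
    "connection AAA": plain_reconnect - persistent,
    "transfers and transactions": persistent,
}

for step, cost in breakdown.items():
    print(f"{step}: {cost:.1f} ms")
```

The subtraction only holds because the runs are single-client and sequential, so the costs add up along one transaction's path.
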
Performance Comparisons: Latency Pitfalls

Latency Comparison – 9.5 vs 9.6: pgbench -j 4 -c 8

                       Version 9.5.5    Version 9.6.1
  throughput           329.4 tps        326.4 tps
  average latency      24.3 ms          24.4 ms
  latency std. dev.    79.5 ms          20.3 ms

  [Figure: histograms of thousand transactions (0 to 600) against transaction latency in seconds (0 to 5), one panel per version]

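Throughput and average latency are nearly identical across the two versions because, with a fixed number of always-busy clients, one largely determines the other (Little's law). The sketch below checks this against the slide's figures. It assumes all 8 clients stay busy for the whole run, and `expected_tps` is a hypothetical helper, not a pgbench function.

```python
def expected_tps(clients, avg_latency_ms):
    """Throughput implied by Little's law for a closed loop of busy clients."""
    return clients / (avg_latency_ms / 1000.0)

# Figures from the slide: 8 clients (-c 8) and the reported average latencies.
runs = [("9.5.5", 24.3, 329.4), ("9.6.1", 24.4, 326.4)]

for version, latency, measured in runs:
    estimate = expected_tps(8, latency)
    print(f"{version}: estimated {estimate:.1f} tps, measured {measured} tps")
```

The estimates land within about half a percent of the measured values, so the average hides the difference between versions. Only the standard deviation (79.5 ms against 20.3 ms) reveals it.
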
Latency Comparison – 9.5 vs 9.6: Instant TPS

  [Figure: tps (0 to 500) against run seconds sorted by tps (0 to 2000), one panel for version 9.5.5 and one for version 9.6.1]

What is happening? Buy Now, Pay Later!

  transaction surges are absorbed in-memory + WAL
  then data are written to disk: checkpoint

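A "run seconds sorted by tps" curve can be rebuilt from a per-transaction log such as the one `pgbench -l` writes. This is a minimal sketch, not the script used for the talk. It assumes the 9.6 log layout `client_id transaction_no time script_no time_epoch time_us`, and `sorted_instant_tps` is a hypothetical helper name.

```python
from collections import Counter

def sorted_instant_tps(log_lines):
    """Count transactions per wall-clock second, then sort the counts."""
    per_second = Counter()
    for line in log_lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # ignore blank or truncated lines
        # Assumed layout: client_id transaction_no time script_no time_epoch time_us
        per_second[int(fields[4])] += 1
    if not per_second:
        return []
    first, last = min(per_second), max(per_second)
    # Seconds with no completed transaction count as 0 tps.
    return sorted(per_second.get(s, 0) for s in range(first, last + 1))

# Made-up sample lines, for illustration only.
sample = [
    "0 1 9500 0 1490000000 100000",
    "0 2 9400 0 1490000000 600000",
    "0 3 9600 0 1490000002 100000",
]
print(sorted_instant_tps(sample))
```

Counting empty seconds as zero matters here: a stall during a checkpoint shows up as a run of zeros at the left of the sorted curve rather than disappearing from it.
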
Latency Comparison – 9.5 vs 9.6: Checkpointing

Postgres 9.5 Checkpoint

  data writes spread over some time: random I/O
  OS chooses when to actually write: 30s delay on Linux
  until fsync is called...: I/O storm, on low-end HDD

Postgres 9.6 Checkpoint

  sorted data writes spread over some time: sequential I/O
  flush instructions sent regularly (256 kB): checkpoint_flush_after
  when fsync is called: ok!

Performance Comparisons: Throughput and Latency Control

Rate (tps) and Limit (ms): pgbench -R 100 -L 100 -N

                                        slow & skipped    latency
  Pg 9.5 basic checkpoint               24.0%             15.6 ± 158.3 ms
  Pg 9.6 sorted checkpoint              2.7%              3.6 ± 24.6 ms
  Pg 9.6 sorted & flushed checkpoint    0.5%              2.6 ± 13.8 ms

  [Figure: tps (0 to 150) against run seconds sorted by tps (0 to 2500), one panel per configuration]
