pgcon 2020 online 2020 05 26 database systems 2002 2005
play

PGCon-2020 Online 2020-05-26 Database systems: - PowerPoint PPT Presentation

PGCon-2020 Online 2020-05-26 Database systems: 2002-2005: since 2005: Worked on XML data type and functions (2005-2007) Long-term community activist #RuPostgres, 2000+ members Conferences Program


  1. PGCon-2020 Online 2020-05-26

  2. Database systems: ○ ○ 2002-2005: ○ since 2005: ○ Worked on XML data type and functions (2005-2007) ○ Long-term community activist – #RuPostgres, 2000+ members Conferences Program Committee ○ etc. ○ Current business:

  3. - — boost development of fast-growing PostgreSQL-based projects using thin cloning and high level of automation Clients and partners: CHEWY.COM CHEWY.COM

  4. - Get an overview of numerous tools and methods in Postgres ecosystem - Focus on remembering what can be done rather than how - Learn about the new methodology: “Seamless SQL Optimization” to build powerful and scalable optimization process in your organization - Bonus: PostgreSQL and tools developers might find some new (or old) ideas

  5. ● SQL Performance analysis. Methodologies Macro-analysis: tools, metrics, visualization ● ● Switching from Macro- to Micro-analysis (and back) Micro-analysis: tools, metrics, visualization ● ● How do we optimize? Where? 3 key principles of Seamless SQL Optimization methodology ●

  6. bit.ly/pgcon2020 Just launch it, do not follow the steps (yet). Watch what I’m doing

  7. (inherited from Brendan Gregg’s talks on Performance Analysis) - “Zero methodology”: lack of tools and analysis / optimization activities.

  8. (inherited from Brendan Gregg’s talks on Performance Analysis) - “Zero methodology”: lack of tools and analysis / optimization activities. - “Street Light” anti-method: - Pick query that is known to be slow (e.g., random slow query from logs) - Use tools and metrics that are familiar, without understanding the whole picture - Avoid query analysis, focus on config tuning, scaling-in, or partitioning / sharding

  9. (inherited from Brendan Gregg’s talks on Performance Analysis) - “Zero methodology”: lack of tools and analysis / optimization activities. - “Street Light” anti-method. - “Drunk man” anti-method. Tune random things. For example: - Increase shared_buffers . Or work_mem . Or random / seq_page_cost . Or effective_cache_size - Build indexes for all columns - Put VACUUM FULL to cron. - Pick random inefficient query and optimize it, then iterate - Let’s use Google, StackOverflow, etc. … until the problems go away

  10. (inherited from Brendan Gregg’s talks on Performance Analysis) - “Zero methodology”: lack of tools and analysis / optimization activities. - “Street Light” anti-method. - “Drunk man” anti-method. - “Blame someone else” anti-method: - “Database is slow again!” - “It’s not database, it’s network!” - “Our configuration is fine, it’s AWS/GCP/… problem!”

  11. - USE (Utilization, Saturation, Errors) by Brendan Gregg - RED (Requests Rate, Errors, Duration) by Tom Wilkie - The Four Golden Signals (RED + Saturation) by Rob Ewaschuk - Problem statement method - Resource analysis - CPU and Off-CPU FlameGraph analysis (perf, eBPF) - Wait Events analysis - Transaction and lock analysis - Tuple and operation stats analysis - Workload analysis (pg_stat_statements, log-based) - Single query analysis (EXPLAIN, auto_explain). Variations: - Planner numbers only - Time-centric - Focused on cardinality/selectivity (row counts) - Buffers-centric Database Experiments -

  12. Macro-analysis: - analyze workload as a whole - split to segments (e.g., group queries removing parameters) - find “heavy” segments (“heavy” may have various meanings here) - apply macro- or micro-optimization to speed up whole workload or significant part of it - pro step: ensure that no parts of workload have slowed down Micro-analysis: - analyze a single query, forget about all others – run EXPLAIN to obtain the plan - find bottlenecks, inefficient nodes in the plan - apply micro- or macro-optimization to improve performance - pro step: ensure that all queries have not slowed down

  13. postgres-checkup MACRO-ANALYSIS pgwatch2 MICRO-ANALYSIS PEV2 EXPLAIN (BUFFERS, ANALYZE)

  14. postgres-checkup MACRO-ANALYSIS pgwatch2 MICRO-ANALYSIS PEV2 EXPLAIN (BUFFERS, ANALYZE)

  15. postgres-checkup MACRO-ANALYSIS pgwatch2 MICRO-ANALYSIS PEV2 EXPLAIN (BUFFERS, ANALYZE)

  16. postgres-checkup MACRO-ANALYSIS pgwatch2 MICRO-ANALYSIS PEV2 EXPLAIN (BUFFERS, ANALYZE)

  17. postgres-checkup MACRO-ANALYSIS pgwatch2 MICRO-ANALYSIS PEV2 EXPLAIN (BUFFERS, ANALYZE)

  18. - pg_stat_activity - pid, query (usually limited to default 1024 bytes), ... , xact_start, query_start, state_change, …., state, …, wait_event, wait_event_type, … - pg_locks - pg_stat_statements - queryid, query (entire text w/o params), calls, total_time, mean_time, …, shared_blk_***, … - Non-standard: pg_stat_kcache, pg_qualstats, pg_wait_sampling - PostgreSQL logs - Slow queries: - pid, query (full), duration (if exceeds log_min_duration_statement), ... - Blocked queries, with blockers (when blocking state exceeds deadlock_timeout) - Canceled queries (deadlocks, statement_timeout, idle_in_transaction_session_timeout)

  19. pg_stat_statements (extension, in core) pg_stat_activity ┌─────────────────────┬──────────────────┐ ┌──────────────────┬──────────────────────────┐ │ Column │ Type │ │ Column │ Type │ ├─────────────────────┼──────────────────┤ ├──────────────────┼──────────────────────────┤ │ userid │ oid │ │ datid │ oid │ │ dbid │ oid │ │ datname │ name │ │ queryid │ bigint │ │ pid │ integer │ │ query │ text │ │ usename │ name │ │ calls │ bigint │ │ application_name │ text │ │ total_time │ double precision │ │ client_addr │ inet │ │ min_time │ double precision │ │ client_hostname │ text │ │ max_time │ double precision │ │ client_port │ integer │ │ mean_time │ double precision │ │ backend_start │ timestamp with time zone │ │ stddev_time │ double precision │ │ xact_start │ timestamp with time zone │ │ rows │ bigint │ │ query_start │ timestamp with time zone │ │ shared_blks_hit │ bigint │ │ state_change │ timestamp with time zone │ │ shared_blks_read │ bigint │ │ wait_event_type │ text │ │ shared_blks_dirtied │ bigint │ │ wait_event │ text │ │ shared_blks_written │ bigint │ │ state │ text │ │ local_blks_hit │ bigint │ │ backend_xid │ xid │ │ local_blks_read │ bigint │ │ backend_xmin │ xid │ │ local_blks_dirtied │ bigint │ │ query │ text │ │ local_blks_written │ bigint │ │ backend_type │ text │ │ temp_blks_read │ bigint │ └──────────────────┴──────────────────────────┘ │ temp_blks_written │ bigint │ │ blk_read_time │ double precision │ │ blk_write_time │ double precision │ └─────────────────────┴──────────────────┘

Recommend


More recommend