Query Optimizer – MySQL vs. PostgreSQL Percona Live, Santa Clara (USA), 24 April 2018 Christian Antognini @ChrisAntognini antognini.ch/blog BASEL BERN BRUGG DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. GENEVA HAMBURG COPENHAGEN LAUSANNE MUNICH STUTTGART VIENNA ZURICH
@ChrisAntognini Senior principal consultant, trainer and partner at Trivadis christian.antognini@trivadis.com http://antognini.ch Focus: get the most out of database engines Logical and physical database design Query optimizer Application performance management Author of Troubleshooting Oracle Performance (Apress, 2008/14) OakTable Network, Oracle ACE Director Query Optimizer – MySQL vs. PostgreSQL 2 2018-04-24
Agenda 1. Introduction 2. Configuration 3. Statistics about Data 4. Data Dictionary Metadata 5. Single-Table Access Paths 6. Joins and Sub-queries 7. Conclusion 8. References Query Optimizer – MySQL vs. PostgreSQL 3 2018-04-24
Introduction Query Optimizer – MySQL vs. PostgreSQL 4 2018-04-24
Compared Products MySQL Community Server 8.0.11 PostgreSQL 10.3 Release date: 19 April 2018 Release date: 1 March 2018 Only the InnoDB engine is covered Query Optimizer – MySQL vs. PostgreSQL 5 2018-04-24
Terminology In PostgreSQL the query optimizer is called planner Query Optimizer – MySQL vs. PostgreSQL 6 2018-04-24
Disclaimer No performance tests were performed No comparison between the execution plans generated by the two query optimizers were performed To compare and to evaluate the two query optimizers only the availability of key features and the ability of the query optimizer to correctly recognize common data patterns was considered Query Optimizer – MySQL vs. PostgreSQL 7 2018-04-24
Aim of the Query Optimizer SQL is a declarative language B SQL doesn’t specify how to access data, only which data to process The strategy used to access data is delegated to the query optimizer The aim of the query optimizer is to find out the optimal way to execute a A given SQL statement Query Optimizer – MySQL vs. PostgreSQL 8 2018-04-24
Inputs Considered to Produce an Execution Plan SQL statement Configuration Query Metadata Others? Optimizer Object statistics Execution plan Query Optimizer – MySQL vs. PostgreSQL 9 2018-04-24
How Is Data in Tables Stored? InnoDB uses a B-tree index Heap table Query Optimizer – MySQL vs. PostgreSQL 10 2018-04-24
Controlling the Query Optimizer Query Optimizer – MySQL vs. PostgreSQL 11 2018-04-24
Configuration Three system variables control the Two dozen of parameters control the behavior of the query optimizer behavior of the query optimizer Limit the number of evaluated Limit the number of evaluated plans (2) plans (2) Control specific features Control specific features (15) (1 parameter for 20 features) Control the genetic optimizer (7) The default (system) values can be The default (system) values can be overwritten at session level and at overwritten at session level the statement level Query Optimizer – MySQL vs. PostgreSQL 12 2018-04-24
Configuration – Cost Model A cost model database contains cost A number of parameters provide cost estimate information for a number of estimate information for a number of operations (8) operations (11) The default values can be changed The default (system) values can be at the system level only overwritten at session level Query Optimizer – MySQL vs. PostgreSQL 13 2018-04-24
hints_modifiers.sql Hints hints_index.sql hints_optimizer.sql SELECT statement modifiers (4) (Available in EDB Advanced Server ) Impact statement syntax Index hints (3) Impact statement syntax Cause error when index missing Optimizer hints (21) Similar to Oracle Database hints Global, query block and object-level Cause warning when syntax is wrong Query Optimizer – MySQL vs. PostgreSQL 14 2018-04-24
Statistics about Data Query Optimizer – MySQL vs. PostgreSQL 15 2018-04-24
Gathering Statistics The ANALYZE statement gathers The ANALYZE statement gathers and, by default, stores statistics in and stores statistics in the data the data dictionary dictionary By default, an asynchronous By default, the autovacuum daemon automatic statistics recalculation recalculate statistics of modified takes place tables Persistent (default) as well as non- persistent statistics exist Query Optimizer – MySQL vs. PostgreSQL 16 2018-04-24
statistics.sql Table Statistics Clustered index size (pages) Table size (pages) Number of rows Number of rows Number of pages marked all-visible Query Optimizer – MySQL vs. PostgreSQL 17 2018-04-24
statistics.sql Column Statistics Data distribution (optional) Fraction of values that are null Including fraction of entries that Average column width (bytes) are null Number of distinct values Statistical correlation between physical and logical row ordering Data distribution (optional) Most common values and their frequency (optional) Query Optimizer – MySQL vs. PostgreSQL 18 2018-04-24
statistics.sql Cross-Column Statistics Functional dependencies (optional) Number of distinct values (optional) Query Optimizer – MySQL vs. PostgreSQL 19 2018-04-24
statistics.sql Index Statistics Index size (pages) Index size (pages) Number of leaf pages Number of indexed rows Number of distinct keys Several values are stored E.g. for index “ a,b,c ” “a”, “ a,b ”, “ a,b,c ”, “ a,b,c,PK ” Query Optimizer – MySQL vs. PostgreSQL 20 2018-04-24
Data Dictionary Metadata Query Optimizer – MySQL vs. PostgreSQL 21 2018-04-24
constraints_pk_uk.sql Constraints – Primary Key and Unique Key Because of the clustered index, PK No particular precedence is given to has precedence over other indexes predicates based on PK/UK Predicates based on UK take Statistical correlation between precedence over other indexes physical and logical row ordering determines which index is used Equality predicates based on PK/UK are probed Query Optimizer – MySQL vs. PostgreSQL 22 2018-04-24
constraints_fk.sql Constraints – Foreign Key No evidence about the usage of FK No evidence about the usage of FK to avoid unnecessary loss-less joins to avoid unnecessary loss-less joins has been observed has been observed Query Optimizer – MySQL vs. PostgreSQL 23 2018-04-24
constraints_not_null.sql Constraints – NOT NULL NOT NULL constraints are used to By default the usage of NOT NULL verify the validity of predicates constraints to verify the validity of predicates is enabled for specific cases only constraint_exclusion = partition Statistics are used instead Query Optimizer – MySQL vs. PostgreSQL 24 2018-04-24
constraints_check.sql Constraints – CHECK No evidence about the usage of By default the usage of CHECK CHECK constraints to verify the constraints to verify the validity of validity of predicates has been predicates is enabled for specific observed cases only Statistics are used instead constraint_exclusion = partition Statistics are used instead Query Optimizer – MySQL vs. PostgreSQL 25 2018-04-24
Single-Table Access Paths Query Optimizer – MySQL vs. PostgreSQL 26 2018-04-24
indexes_expression.sql Available Index Types indexes_partial.sql indexes_invisible.sql indexes_prefix.sql indexes_nulls.sql Supported index types Supported index types B-tree (default) B-tree (default) R-tree (for spatial indexes) Hash, GiST, SP-GiST, GIN, BRIN For string columns, indexes can be Indexes can be created on created on the leading part of expressions as well as on a subset column values of the rows B-tree indexes store NULL values B-tree indexes store NULL values Support for invisible indexes Query Optimizer – MySQL vs. PostgreSQL 27 2018-04-24
indexes_order_by.sql Optimization of ORDER BY, MIN and MAX indexes_min_max.sql B-tree indexes can be used to B-tree indexes can be used to optimize ORDER BY, MIN and MAX optimize ORDER BY, MIN and MAX Index scans can be performed in Index scans can be performed in both directions both directions Keys are stored according to the Keys are stored according to the specified order specified order No NULLS FIRST/LAST support NULLS FIRST/LAST supported Query Optimizer – MySQL vs. PostgreSQL 28 2018-04-24
indexes_merge.sql Merging Indexes Two or more B-tree indexes can be When appropriate, B-tree indexes merged at runtime to evaluate are dynamically converted to multiple predicates combined with bitmaps in memory AND or OR One utilization of this feature is to merge indexes to evaluate multiple predicates combined with AND or OR Query Optimizer – MySQL vs. PostgreSQL 29 2018-04-24
partitioning_range.sql (Declarative) Partitioning partitioning_list.sql partitioning_hash.sql Available methods: Available methods: Multi-column range and list Multi-column range Single-column hash Single-column list Sub-partitioning by hash Sub-partitioning Only local indexes (PK/UK must Only local indexes (no PK/UK) contain partition key) FK not supported FK not supported Partition exclusion (pruning) Partition pruning Query Optimizer – MySQL vs. PostgreSQL 30 2018-04-24
Joins and Sub-queries Query Optimizer – MySQL vs. PostgreSQL 31 2018-04-24
Recommend
More recommend