improve query performance with the query log analyzer
play

Improve Query Performance with the Query Log Analyzer Kees Vegter - PowerPoint PPT Presentation

Improve Query Performance with the Query Log Analyzer Kees Vegter Field Engineer Query Log Analyzer kees@neo4j.com Query Log dbms.logs.query.enabled=true # If the execution of query takes more time than this threshold, # the query is


  1. Improve Query Performance with the Query Log Analyzer Kees Vegter Field Engineer Query Log Analyzer kees@neo4j.com

  2. Query Log dbms.logs.query.enabled=true # If the execution of query takes more time than this threshold, # the query is logged. If set to zero then all queries dbms.logs.query.threshold=100ms dbms.logs.query.parameter_logging_enabled=true dbms.logs.query.time_logging_enabled=true dbms.logs.query.allocation_logging_enabled=true dbms.logs.query.page_logging_enabled=true dbms.track_query_cpu_time=true dbms.track_query_allocation=true 2

  3. Query Log Analyzer Query Analysis 3

  4. Query Log Analyzer Query Log: Filter 4

  5. Query Log Analyzer Query Log: Highlight 5

  6. Query Log Analyzer Query Timeline 6

  7. Cypher Query Processing Cypher Cypher Physical Execution Plan Planning Execution Use query parameters! uses db-statistics Physical Logical Execute Physical Plan in Query String Parse Execution Plan Cypher Runtime Plan Query Execute Physical Plan in Plan Query String Cypher Runtime Cache Use repeatable statements! 7

  8. Cypher Cypher Query Planning Execution Load MATCH p=(ah:AccountHolder {fullName :$accountName }) -[:HAS_BANKACCOUNT]->(ba)-[:SEND*2..16]->() WITH p, [x in nodes(p) WHERE x:BankAccount] AS mts UNWIND mts AS mt MATCH p2=(mt)-[:FROM]->()-[:IN_COUNTRY]->() Query Log Analyzer RETURN p, p2 SKIP 0 LIMIT 1000 8

  9. Cypher Planning Cypher Planning 1775 queries analysed, 302 distinct queries found. Parameter Usage ● 1775 queries analysed, 1775 distinct queries found. Check the tool header ○ MATCH (ah:AccountHolder) Check for parameter usage in WHERE ah.fullName = "John Smith" ○ ... your queries RETURN ah MATCH (ah:AccountHolder) WHERE ah.fullName = $fullName ... RETURN ah Planning time ● 9

  10. Cypher Execution Cypher Execution Page Cache (data cache) 24 % : read from Cache ● 76 % : read from Disk Waiting for Locks ● Memory Footprint ● 10

  11. Query Load Query Load Locking ● Concurrent Load ● Big Result Sets ● 11

  12. Query Tuning Query Tuning Tips 12

  13. Query Tuning Use Explain and Profile Things to check: Index usage ● Eager ● NodeByLabelScan ● AllNodesScan ● 13

  14. Query Tuning Avoid Cartesian Products MATCH (a), (b), (c) … … RETURN a, b, c OPTIONAL MATCH UNWIND arrA as a OPTIONAL MATCH UNWIND arrB as b OPTIONAL MATCH UNWIND arrC as c ... ... Use WITH and COLLECT and DISTINCT to reduce the intermediate results Use Pattern Comprehension when applicable: MATCH (a) RETURN { a:a, blist : [ (a)-->(b) | {b:b, clist : [(b)-->(c) | c ]], dlist : [ (a)-->(d) | {d:d, elist : [(d)-->(e) | e ]], flist : [ (a)-->(f) | f] } 14

  15. Query Tuning Reduce the query working set as soon as possible Can I move a DISTINCT to an earlier ● point in the query? Can I move a LIMIT to an earlier point ● in the query? Can I use COLLECT on places in the ● query to reduce the amount of rows to be processed? 15

  16. Query Tuning Query Tuning Query Execution Try to send ‘repeatable’ statements ● MERGE (author1:Author {id: 1}) MERGE (author:Author {id: $authorId }) MERGE (author2:Author {id: 2}) MERGE (book:Book {title: $bookTitle }) ... MERGE (author)-[:WROTE]->(book) MERGE (book1:Book {title: "title 1"}) MERGE (book2:Book {title: "title-2"}) ... MERGE (author1)-[:WROTE]->(book1) MERGE (author2)-[:WROTE]->(book2) ... 16

  17. Query Tuning Query Tuning Query Execution Reduce the amount of statements you send to Neo4j by using 'batch' statements ● MERGE (author:Author {id: $authorId }) FOR EVERY ENTRY IN LIST MERGE (book:Book {title: $bookTitle }) WITH AUTHORS AND BOOKS MERGE (author)-[:WROTE]->(book) FIRE A STATEMENT TO NEO4J { authorId : 1, bookTitle : "title1" } UNWIND $inputList as row FOR EVERY 100 ENTRIES IN MERGE (author:Author {id: row.authorId LIST WITH AUTHORS AND BOOKS }) FIRE A STATEMENT TO NEO4J MERGE (book:Book {title: row.bookTitle }) MERGE (author)-[:WROTE]->(book) { inputList : [ { authorId : 1, bookTitle : "title1" } , { authorId : 2, bookTitle : "title2" } ,...] } 17

  18. Query Tuning Query Tuning Query Execution Use apoc.periodic.iterate with the config parameter iterateList : true ! ● CALL apoc.periodic.iterate( 'CALL apoc.load.jdbc("mydb","SELECT authorId, bookTitle FROM AuthorBooks") YIELD row RETURN row' ,'MERGE (author:Author {id: row.authorId }) MERGE (book:Book {title: row.bookTitle }) MERGE (author)-[:WROTE]->(book)' ,{batchSize : 100, iterateList: true } ) kettle also uses this 'batch' approach ● 18

  19. Tool Usage The Query Log Analyzer is meant to be used during development and testing! ● When you have only a command prompt available on a neo4j server you can also ● use the following tool to do a quick analysis of the query.log file: https://neo4j.com/developer/kb/an-approach-to-parsing-the-query-log/ This tool wil list the top 10 most expensive queries based upon planning, cpu and waiting time. 19

  20. Next Version Supports Neo4j version 4 (multi db) ● List Current queries ● List Query Stats (version 3.5.4 and higher) ● Explain Plan ● 20 Still under development

  21. Multi db support 21 preview, still under development

  22. Current Queries 22 preview, still under development

  23. Queries Stats 23 preview, still under development

  24. Explain Plan 24 preview, still under development

  25. Useful links Introducing the Query Log Analyzer https://medium.com/neo4j/meet-the-query-log-analyzer-30b3eb4b1d6 Cypher Query Optimisations https://medium.com/neo4j/cypher-query-optimisations-fe0539ce2e5c Script to get the top 10 most expensive queries from the command line https://neo4j.com/developer/kb/an-approach-to-parsing-the-query-log/ 25

  26. Q & A Hunger Games Questions for " Improve Query Performance with Query Log Analyzer " 1. Easy : What does Avg Waiting stand for? a. Waiting to execute query b. Waiting to execute query + waiting for locks c. Waiting for locks 2. Medium : What is the correct order of steps in The Cypher Query Processing a. Query Text > Logical Plan > Parse > Physical Execution Plan > Execute Physical Plan in Cypher Runtime b. Query Text > Parse > Logical Plan > Physical Execution Plan > Execute Physical Plan in Cypher Runtime c. Cache > Physical Execution Plan > Execute Physical Plan in Cypher Runtime 3. Hard : What is the name of config parameter in apoc.periodic.iterate to make batch updates possible? Answer here: r.neo4j.com/hunger-games

  27. Query Log Analyzer install https://install.graphapp.io/ 27

Recommend


More recommend