apollo
play

APOLLO AUTOMATIC DETECTION AND DIAGNOSIS OF PERFORMANCE REGRESSIONS - PowerPoint PPT Presentation

APOLLO AUTOMATIC DETECTION AND DIAGNOSIS OF PERFORMANCE REGRESSIONS IN DATABASE SYSTEMS Jinho Jung , Hong Hu, Joy Arulraj, Taesoo Kim, Woonhak Kang* * APOLLO Holistic toolchain for debugging DBMS 1 AUTOMATICALLY FIND SQL QUERIES


  1. APOLLO AUTOMATIC DETECTION AND DIAGNOSIS OF PERFORMANCE REGRESSIONS IN DATABASE SYSTEMS Jinho Jung , Hong Hu, Joy Arulraj, Taesoo Kim, Woonhak Kang* *

  2. APOLLO • Holistic toolchain for debugging DBMS 1 AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS 2 AUTOMATICALLY DIAGNOSE THE ROOT CAUSE OF PERFORMANCE REGRESSIONS 2 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  3. MOTIVATION: DBMS COMPLEXITY PostgreSQL SQLite 60 7x 47.7 50 increase 40 Code 26.4 Size 30 (MB) 20 8.7 6.1 10 4.4 1.4 0 2000 2010 Present Release Year 3 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  4. MOTIVATION: PERFORMANCE REGRESSIONS CHALLENGING TO BUILD SYSTEM WITH PREDICTABLE PERFORMANCE 4 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  5. MOTIVATION: PERFORMANCE REGRESSIONS CHALLENGING TO BUILD SYSTEM WITH PREDICTABLE PERFORMANCE • Scenario: User upgrades a DBMS installation ▫ Query suddenly takes 10 times longer to execute ▫ Due to unexpected interactions between different components ▫ Refer to this behavior as a performance regression • Performance regression can hurt user productivity ▫ Can easily convert an interactive query to an overnight one 5 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  6. MOTIVATION: PERFORMANCE REGRESSIONS > 10,000x SELECT R0.S_DIST_06 slowdown FROM PUBLIC.STOCK AS R0 WHERE (R0.S_W_ID < CAST(LEAST(0, 1) AS INT8 )) LATEST VERSION OF POSTGRESQL • Due to a recent optimizer update ▫ New policy for choosing the scan algorithm ▫ Resulted in over-estimating the number of rows in the table ▫ Earlier version: Fast bitmap scan ▫ Latest version: Slow sequential scan 6 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  7. MOTIVATION: DETECTING REGRESSIONS 1 HOW TO DISCOVER QUERIES EXHIBITING REGRESSIONS? SELECT NO FROM ORDER AS R0 SELECT NO FROM ORDER AS R0 WHERE EXISTS ( WHERE EXISTS ( SELECT CNT FROM SALES AS R1 SELECT CNT FROM SALES AS R1 WHERE EXISTS ( WHERE EXISTS ( Query runs SELECT ID FROM HISTORY AS R2 SELECT ID FROM HISTORY AS R2 slower on WHERE (R0.INFO IS NOT NULL )); WHERE (R0.INFO IS NOT NULL )); latest version 7 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  8. MOTIVATION: REPORTING REGRESSIONS 2 HOW TO SIMPLIFY QUERIES FOR REPORTING REGRESSION? SELECT NO FROM ORDER AS R0 SELECT NO FROM ORDER AS R0 WHERE EXISTS ( WHERE EXISTS ( SELECT CNT FROM SALES AS R1 SELECT CNT FROM SALES AS R1 WHERE EXISTS ( WHERE EXISTS ( Query runs SELECT ID FROM HISTORY AS R2 SELECT ID FROM HISTORY AS R2 slower on WHERE (R0.INFO IS NOT NULL )); WHERE (R0.INFO IS NOT NULL )); latest version 8 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  9. MOTIVATION: DIAGNOSING REGRESSIONS 3 HOW TO DIAGNOSE THE ROOT CAUSE OF THE REGRESSION? SELECT NO FROM ORDER AS R0 SELECT NO FROM ORDER AS R0 WHERE EXISTS ( WHERE EXISTS ( SELECT CNT FROM SALES AS R1 SELECT CNT FROM SALES AS R1 WHERE EXISTS ( WHERE EXISTS ( Query runs SELECT ID FROM HISTORY AS R2 SELECT ID FROM HISTORY AS R2 slower on WHERE (R0.INFO IS NOT NULL )); WHERE (R0.INFO IS NOT NULL )); latest version 9 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  10. APOLLO TOOLCHAIN 1 HOW TO DISCOVER QUERIES EXHIBITING REGRESSIONS? SQLFUZZ: FEEDBACK-DRIVEN FUZZING BUG APOLLO TOOLCHAIN REPORTS OLD - Query VERSION SQLFUZZ SQLMIN SQLDEBUG - Commit - File list NEW - Function VERSION 10 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  11. APOLLO TOOLCHAIN 2 HOW TO SIMPLIFY QUERIES FOR REPORTING REGRESSION? SQLMIN: BI-DIRECTIONAL QUERY REDUCTION ALGORITHMS BUG APOLLO TOOLCHAIN REPORTS OLD - Query VERSION SQLMIN SQLDEBUG - Commit SQLFUZZ - File list NEW - Function VERSION 11 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  12. APOLLO TOOLCHAIN 3 HOW TO DIAGNOSE THE ROOT CAUSE OF THE REGRESSION? SQLDEBUG: STATISTICAL DEBUGGING + COMMIT BISECTION BUG APOLLO TOOLCHAIN REPORTS OLD - Query VERSION SQLMIN SQLDEBUG - Commit SQLFUZZ - File list NEW - Function VERSION 12 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  13. TALK OVERVIEW APOLLO TOOLCHAIN BUG REPORTS OLD - Query VERSION SQLFUZZ SQLMIN SQLDEBUG - Commit - File list NEW - Function VERSION 13 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  14. #1: SQLFUZZ — DETECTING REGRESSIONS OLD NEW VERSION VERSION SQLFuzz Candidate Random Queries queries queries Query Query Bug 2 3 1 exhibiting Executor Generator Validator performance regression Update SQL grammar probability table 14 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  15. #1: SQLFUZZ — DETECTING REGRESSIONS 1 QUERY GENERATOR: RANDOM QUERY GENERATION Retrieve Check schema complexity Query Generator Queries Valid for fuzzing queries SELECT 0.3 LEFT JOIN 0.3 SQL grammar probability table LIMIT 0.2 CAST 0.2 15 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  16. #1: SQLFUZZ — DETECTING REGRESSIONS 2 QUERY EXECUTOR: FEEDBACK-DRIVEN FUZZING SELECT R0.S_DIST_06 FROM PUBLIC.STOCK AS R0 WHERE (R0.S_W_ID < OLD NEW VERSION VERSION CAST ( LEAST (0, 1) AS INT8 )) Found Update table Regression? CASE LEFT JOIN Query LIMIT CAST +0.1 Executor SQL grammar probability table 16 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  17. #1: SQLFUZZ — DETECTING REGRESSIONS 3 REGRESSION VALIDATOR: REDUCING FALSE POSITIVES Filtering rules 1 Non-deterministic behavior? 2 Non-executed plan? Usage of catalog statistics? 3 Report 4 Enough memory? Regression 5 Limit statement? Query Developers 6 Query is too complex? 7 … Updated filtering rules 17 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  18. TALK OVERVIEW APOLLO TOOLCHAIN BUG REPORTS OLD - Query VERSION SQLFUZZ SQLMIN SQLDEBUG - Commit - File list NEW - Function VERSION 18 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  19. #2: SQLMIN — REPORTING REGRESSIONS • Bottom-up Query Reduction ▫ Extract valid sub-query • Top-down Query Reduction ▫ Iteratively removes unnecessary expressions 19 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  20. #2: SQLMIN — REPORTING REGRESSIONS SELECT S1.C2 FROM ( SELECT CASE WHEN EXISTS ( SELECT S0.C0 FROM ORDER AS R1 WHERE (( S0.C0 = 10 ) AND ( S0.C1 IS NULL)) ) THEN S0.C0 END AS C2 , FROM ( SELECT R0.I_PRICE AS C0 , R0.I_DATA AS C1 , (SELECT ID FROM ITEM ) AS C2 FROM ITEM AS R0 WHERE R0.PRICE IS NOT NULL OR ( R0.PRICE IS NOT S1.C2 ) LIMIT 1000 ) AS S0 ) AS S1 ; 20 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  21. #2: SQLMIN — REPORTING REGRESSIONS SELECT S1.C2 BOTTOM-UP FROM ( SELECT REDUCTION CASE WHEN EXISTS ( EXTRACT SUB-QUERY SELECT S0.C0 FROM ORDER AS R1 WHERE (( S0.C0 = 10 ) AND ( S0.C1 IS NULL)) ) THEN S0.C0 END AS C2 , FROM ( SELECT R0.I_PRICE AS C0 , R0.I_DATA AS C1 , (SELECT ID FROM ITEM ) AS C2 Remove FROM ITEM AS R0 dependencies WHERE R0.PRICE IS NOT NULL OR ( R0.PRICE IS NOT S1.C2 ) LIMIT 1000 ) AS S0 ) AS S1 ; 21 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  22. #2: SQLMIN — REPORTING REGRESSIONS SELECT S1.C2 TOP-DOWN FROM ( REDUCTION SELECT REMOVE ELEMENTS CASE WHEN EXISTS ( SELECT S0.C0 FROM ORDER AS R1 WHERE (( S0.C0 = 10 ) AND ( S0.C1 IS NULL)) Remove condition ) THEN S0.C0 END AS C2 , FROM ( Remove columns SELECT R0.I_PRICE AS C0 , R0.I_DATA AS C1 , (SELECT ID FROM ITEM ) AS C2 Remove sub-queries FROM ITEM AS R0 WHERE R0.PRICE IS NOT NULL OR ( R0.PRICE IS NOT S1.C2 ) Remove clause LIMIT 1000 ) AS S0 ) AS S1 ; 22 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  23. #2: SQLMIN — REPORTING REGRESSIONS SELECT CASE WHEN EXISTS ( SELECT S0.C0 FROM ORDER AS R1 WHERE (( S0.C0 = 10 )) ) THEN S0.C0 END AS C2 , FROM ( SELECT R0.I_PRICE AS C0, FROM ITEM AS R0 WHERE R0.PRICE IS NOT NULL) AS S0 ) AS S1 ; 23 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  24. TALK OVERVIEW APOLLO TOOLCHAIN BUG REPORTS OLD - Query VERSION SQLFUZZ SQLMIN SQLDEBUG - Commit - File list NEW - Function VERSION 24 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  25. #3: SQLDEBUG — DIAGNOSING REGRESSIONS Control-flow Partially BUG SQLDEBUG Graphs Reduced DBMS REPORTS (Traces) queries Slow - Query Statistical SQLMIN - Commit Debugger Fast - File list Regression - Function query Commit bisection First commit exhibiting regression? 25 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  26. #3: SQLDEBUG — DIAGNOSING REGRESSIONS 1 COMMIT BISECTION: FIND EARLIEST PROBLEMATIC COMMIT COMMIT 1 OLD VERSION (FAST QUERY EXECUTION) COMMIT 2 PROBLEM BEGINS HERE! COMMIT 3 NEW VERSION (SLOW QUERY EXECUTION) COMMIT 5 26 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  27. #3: SQLDEBUG — DIAGNOSING REGRESSIONS 2 QUERY REDUCTION: PARTIALLY REDUCED QUERIES Partially reduced Minimized Original query queries query SELECT NO FROM ORDER AS R0 WHERE SELECT CNT EXISTS ( SELECT FROM SALES CNT WHERE CNT > ID FROM SALES AS R1 WHERE EXISTS ( SELECT ID FROM Ready to use statistical debugging? Collect set of queries 27 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  28. #3: SQLDEBUG — DIAGNOSING REGRESSIONS 3 CONTROL-FLOW GRAPH COMPARISON: ALIGN TRACES Functions int func(){ if (cond1) work; OLD VERSION } int func(){ if (cond1) work; NEW VERSION } 28 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  29. #3: SQLDEBUG — DIAGNOSING REGRESSIONS 3 CONTROL-FLOW GRAPH COMPARISON: ALIGN TRACES Functions Traces int func(){ 0x400 0x420  TRUE if (cond1) work; OLD VERSION } int func(){ 0x500 0x520  FALSE if (cond1) work; NEW VERSION } 29 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

  30. #3: SQLDEBUG — DIAGNOSING REGRESSIONS 3 CONTROL-FLOW GRAPH COMPARISON: ALIGN TRACES Trace Alignment Functions Traces int func(){ 0x400 func + 0x0 0x420  TRUE func + 0x20  TRUE if (cond1) work; OLD VERSION } int func(){ 0x500 func + 0x0 0x520  FALSE func + 0x20  FALSE if (cond1) work; NEW VERSION } 30 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

Recommend


More recommend