an index advisor using deep reinforcement learning
play

An Index Advisor Using Deep Reinforcement Learning Hai Lan 1 , - PowerPoint PPT Presentation

An Index Advisor Using Deep Reinforcement Learning Hai Lan 1 , Zhifeng Bao 1 , Yuwei Peng 2 1 RMIT University 2 Wuhan University 1 Index Selection Problem (ISP) 2 Index Selection Problem (ISP) Choosing the right indexes to build is one of


  1. An Index Advisor Using Deep Reinforcement Learning Hai Lan 1 , Zhifeng Bao 1 , Yuwei Peng 2 1 RMIT University 2 Wuhan University 1

  2. Index Selection Problem (ISP) 2

  3. Index Selection Problem (ISP) • Choosing the right indexes to build is one of the central issues in database tuning. • Problem Definition: • Select a set of indexes (index configuration) to be built to maximize the performance of the given workload with some constraints. • Constraints: storage usage, index number, and so on. 2

  4. Index Selection Problem (ISP) • Choosing the right indexes to build is one of the central issues in database tuning. • Problem Definition: • Select a set of indexes (index configuration) to be built to maximize the performance of the given workload with some constraints. • Constraints: storage usage, index number, and so on. • Index interaction : an interaction exists between an index a and an index b if the benefit of a is affected by the existence of b and vice - versa . SELECT * FROM t WHERE a < 10 OR b < 10; (1) An index on a ✗ (2) An index on b ✗ (3) An index on a and an index on b ✓ 2

  5. Prior Work 3

  6. Prior Work Category Work Cost Index type Alog Cons IIA ✗ Non - Learning AutoAdmin [ VLD ’97] Estimated cost S / M Greedy index number method ✗ ILP [ ICDE ’07] Estimated cost S / M ILP storage ✓ ISRM [ ICDE 19] Estimated cost S / M Greedy storage Learning - based AI Meet AI [ SIGMOD ’19] Learning - model S / M Not sure Greedy index number method Estimated cost ✓ Welborn et al [ arxiv ’19] Not mention S / M DQN no ✓ DRL - Index [ ICDEW ’20] Estimated cost S DQN Not mention IIA means index interaction . Cons means constraints . Alog means search algorithm . S means single column index . M means multi - column index . Welborn ’ s work only focuses on single table . 3 DRL - index is not implemented yet .

  7. Prior Work Category Work Cost Index type Alog Cons IIA ✗ Non - Learning AutoAdmin [ VLD ’97] Estimated cost S / M Greedy index number method ✗ ILP [ ICDE ’07] Estimated cost S / M ILP storage ✓ ISRM [ ICDE 19] Estimated cost S / M Greedy storage Learning - based AI Meet AI [ SIGMOD ’19] Learning - model S / M Not sure Greedy index number method Estimated cost ✓ Welborn et al [ arxiv ’19] Not mention S / M DQN no ✓ DRL - Index [ ICDEW ’20] Estimated cost S DQN Not mention IIA means index interaction . Cons means constraints . Alog means search algorithm . S means single column index . M means multi - column index . Welborn ’ s work only focuses on single table . 3 DRL - index is not implemented yet .

  8. Prior Work Category Work Cost Index type Alog Cons IIA ✗ Non - Learning AutoAdmin [ VLD ’97] Estimated cost S / M Greedy index number method ✗ ILP [ ICDE ’07] Estimated cost S / M ILP storage ✓ ISRM [ ICDE 19] Estimated cost S / M Greedy storage Learning - based AI Meet AI [ SIGMOD ’19] Learning - model S / M Not sure Greedy index number method Estimated cost ✓ Welborn et al [ arxiv ’19] Not mention S / M DQN no ✓ DRL - Index [ ICDEW ’20] Estimated cost S DQN Not mention Our Goal : (1) Handle complex queries on multiple tables (2) Recommend multi - column indexes (3) Capture the index interaction IIA means index interaction . Cons means constraints . Alog means search algorithm . S means single column index . M means multi - column index . Welborn ’ s work only focuses on single table . 3 DRL - index is not implemented yet .

  9. Our Method - Overview 4

  10. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance 4

  11. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance 4

  12. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance workload 4

  13. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance Index configuration before starting the step t workload 4

  14. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance Index configuration before starting the step t workload The algorithm to select an index from candidates according to current workload and index configuration 4

  15. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance Index configuration before starting the step t workload steps The algorithm to select an index from candidates according to current workload and index configuration 4

  16. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance Index configuration before starting the step t workload steps The algorithm to select an index from candidates according to current workload and index configuration • T is determined by the constraints . 4

  17. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance Index configuration before starting the step t workload steps The algorithm to select an index from candidates according to current workload and index configuration • T is determined by the constraints . • Framework 4

  18. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance Index configuration before starting the step t workload steps The algorithm to select an index from candidates according to current workload and index configuration • T is determined by the constraints . • Framework Index Candidates Rules Workload Sample 4

  19. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance Index configuration before starting the step t workload steps The algorithm to select an index from candidates according to current workload and index configuration • T is determined by the constraints . • Framework Index Candidates Rules Workload Sample action DQN 4 Agent

  20. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance Index configuration before starting the step t workload steps The algorithm to select an index from candidates according to current workload and index configuration • T is determined by the constraints . Create • Framework transform Index Candidates Rules Workload Sample action DQN 4 Agent

  21. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance Index configuration before starting the step t workload steps Environment DB The algorithm to select an index from candidates according to What - If Caller current workload and index configuration • T is determined by the constraints . Create • Framework transform Index Candidates Rules Workload Sample action DQN 4 Agent

  22. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance Index configuration before starting the step t workload steps Environment DB The algorithm to select an index from candidates according to What - If Caller current workload and index configuration • T is determined by the constraints . Create • Framework Reward Next state transform Index Candidates Rules Workload Sample action DQN 4 Agent

  23. Our Method - Overview • Formulate Index Selection as a reinforcement learning problem • Maximize the Performance Index configuration before starting the step t workload steps Environment DB The algorithm to select an index from candidates according to What - If Caller current workload and index configuration • T is determined by the constraints . Create • Framework Reward Next state transform Index Candidates Rules Workload Sample action DQN 4 Agent

  24. Our Method - Rules 5

  25. Our Method - Rules 5

  26. Our Method - Rules 5

  27. Our Method - Rules SELECT t 1. a 7 FROM t 1, t 2 WHERE t 1. a 1 = t 2. b 1 AND t 1. a 2 = t 2. b 2 AND t 1. a 3 = 4 AND t 2. b 3 < 10 ORDER BY t 1. a 5, t 1. a 6 5

  28. Our Method - Rules SELECT t 1. a 7 FROM t 1, t 2 WHERE t 1. a 1 = t 2. b 1 AND t 1. a 2 = t 2. b 2 AND t 1. a 3 = 4 AND t 2. b 3 < 10 ORDER BY t 1. a 5, t 1. a 6 J : t 1. a 2, t 1. a 3, t 2. b 2, t 2. b 3 EQ : t 1. a 3 RANGE : t 2. b 3 O : t 1. a 5, t 1. a 6 USED : t 1.( a 1- a 7), t 2.( b 1- b 3) 5

  29. Our Method - Rules SELECT t 1. a 7 FROM t 1, t 2 WHERE t 1. a 1 = t 2. b 1 AND t 1. a 2 = t 2. b 2 AND t 1. a 3 = 4 AND t 2. b 3 < 10 ORDER BY t 1. a 5, t 1. a 6 t 1. a 1, t 1. a 2, t 1. a 3, t 2. b 1, t 2. b 2, t 2. b 3 Rule 1 J : t 1. a 2, t 1. a 3, t 2. b 2, t 2. b 3 EQ : t 1. a 3 RANGE : t 2. b 3 O : t 1. a 5, t 1. a 6 USED : t 1.( a 1- a 7), t 2.( b 1- b 3) 5

  30. Our Method - Rules SELECT t 1. a 7 FROM t 1, t 2 WHERE t 1. a 1 = t 2. b 1 AND t 1. a 2 = t 2. b 2 AND t 1. a 3 = 4 AND t 2. b 3 < 10 ORDER BY t 1. a 5, t 1. a 6 t 1. a 1, t 1. a 2, t 1. a 3, t 2. b 1, t 2. b 2, t 2. b 3 Rule 1 J : t 1. a 2, t 1. a 3, t 2. b 2, t 2. b 3 Rule 2 EQ : t 1. a 3 ( t 1. a 5, t 1. a 6) RANGE : t 2. b 3 O : t 1. a 5, t 1. a 6 USED : t 1.( a 1- a 7), t 2.( b 1- b 3) 5

Recommend


More recommend