advanced database systems
play

ADVANCED DATABASE SYSTEMS Self-Driving Database Management - PowerPoint PPT Presentation

Lect ure # 25 ADVANCED DATABASE SYSTEMS Self-Driving Database Management Systems @ Andy_Pavlo // 15- 721 // Spring 2019 CMU 15-721 (Spring 2019) 2 Autonomous DBMS History Self-Driving DBMSs Learned Components CMU 15-721 (Spring 2019) 3


  1. Lect ure # 25 ADVANCED DATABASE SYSTEMS Self-Driving Database Management Systems @ Andy_Pavlo // 15- 721 // Spring 2019

  2. CMU 15-721 (Spring 2019) 2 Autonomous DBMS History Self-Driving DBMSs Learned Components

  3. CMU 15-721 (Spring 2019) 3 M OTIVATIO N Personnel is ~50% of the TOC of a DBMS. Average DBA Salary (2017): $89,050 The scale and complexity of DBMS installations have surpassed humans. Source: https://www.bls.gov/oes/current/oes151141.htm Source: https://www.highbeam.com/doc/1P3- 1149052351.html

  4. CMU 15-721 (Spring 2019) 4 SELF- ADAPTIVE DATABASES (19 70 s- 19 9 0 s) Index Selection Partitioning / Sharding Keys Data Placement

  5. CMU 15-721 (Spring 2019) 4 SELF- ADAPTIVE DATABASES (19 70 s- 19 9 0 s) SELECT * FROM A JOIN B ON A.ID = B.ID WHERE A.VAL > 123 AND B.NAME LIKE 'XY%' Admin A.ID A.VAL B.ID Tuning B.NAME Algorithm +100 +200 +50

  6. CMU 15-721 (Spring 2019) 4 SELF- ADAPTIVE DATABASES (19 70 s- 19 9 0 s) SELECT * FROM A JOIN B ON A.ID = B.ID WHERE A.VAL > 123 AND B.NAME LIKE 'XY%' Admin A.ID A.VAL B.ID Tuning B.NAME Algorithm +100 +200 +50

  7. CMU 15-721 (Spring 2019) 4 SELF- ADAPTIVE DATABASES (19 70 s- 19 9 0 s) SELECT * FROM A JOIN B ON A.ID = B.ID WHERE A.VAL > 123 AND B.NAME LIKE 'XY%' Admin A.ID A.VAL B.ID Tuning B.NAME Algorithm +100 +200 +50

  8. CMU 15-721 (Spring 2019) 5 SELF- TUN ING DATABASES (19 9 0 s- 20 0 0 s) SELECT * FROM A JOIN B ON A.ID = B.ID WHERE A.VAL > 123 AND B.NAME LIKE 'XY%' Admin A.ID A.VAL B.ID Tuning B.NAME Algorithm Optimizer Cost Model AutoAdmin

  9. CMU 15-721 (Spring 2019) 5 SELF- TUN ING DATABASES (19 9 0 s- 20 0 0 s) SELECT * FROM A JOIN B ON A.ID = B.ID WHERE A.VAL > 123 AND B.NAME LIKE 'XY%' Admin A.ID A.VAL B.ID Tuning B.NAME Algorithm Optimizer Cost Model AutoAdmin

  10. CMU 15-721 (Spring 2019) 6 SELF- TUN ING DATABASES (19 9 0 s- 20 0 0 s) Number of Configuration Knobs Per Release 600 541 Number of Knobs 400 291 200 75 53 0 2000 2004 2008 2012 2016 Source: Dana Van Aken

  11. CMU 15-721 (Spring 2019) 7 CLO UD- M AN AGED DATABASES (20 10 S) Initial Placement Tenant Migration

  12. CMU 15-721 (Spring 2019) 7 CLO UD- M AN AGED DATABASES (20 10 S)

  13. CMU 15-721 (Spring 2019) 7 CLO UD- M AN AGED DATABASES (20 10 S)

  14. CMU 15-721 (Spring 2019) 8 O BSERVATIO N People have been working on autonomous database systems for 45 years. Why is this previous work insufficient?

  15. CMU 15-721 (Spring 2019) 9 PREVIO US WO RK Problem #1: Human Judgements → User has to make final decision on whether to apply recommendations. Problem #2: Reactionary Measures → Can only solve previous problems. Cannot anticipate upcoming usage trends / issues. Problem #3: No Transfer Learning → Tunes each DBMS instance in isolation. Cannot apply knowledge learned about one DBMS to another.

  16. CMU 15-721 (Spring 2019) 10 O BSERVATIO N Just like there are different levels of autonomy in cars, there are different levels for databases. → SAE (J3016) Automation Levels We need to reason about the autonomous systems to understand their capabilities and limitations. → This will help us reason about how much a human needs to be involved in its administration.

  17. CMU 15-721 (Spring 2019) 11 AUTO N O M O US DBM S TAXO N O MY System only does what Level #0: Manual humans tell it to do.

  18. CMU 15-721 (Spring 2019) 11 AUTO N O M O US DBM S TAXO N O MY Level #0: Manual Recommendation tools that Level #1: Assistant suggest improvements. Human makes final decisions.

  19. CMU 15-721 (Spring 2019) 11 AUTO N O M O US DBM S TAXO N O MY Level #0: Manual Level #1: Assistant DBMS and humans work together to mange the system. Level #2: Mixed Management Human guides the process.

  20. CMU 15-721 (Spring 2019) 11 AUTO N O M O US DBM S TAXO N O MY Level #0: Manual Level #1: Assistant Level #2: Mixed Management Subsystems can adapt without human guidance. Level #3: Local Optimizations No higher-level coordination.

  21. CMU 15-721 (Spring 2019) 11 AUTO N O M O US DBM S TAXO N O MY Level #0: Manual Level #1: Assistant Level #2: Mixed Management Level #3: Local Optimizations Human only provides high- level direction + hints. Level #4: Direct Optimizations System can identify when it needs to ask humans for help.

  22. CMU 15-721 (Spring 2019) 11 AUTO N O M O US DBM S TAXO N O MY Level #0: Manual Level #1: Assistant Level #2: Mixed Management Level #3: Local Optimizations Level #4: Direct Optimizations Level #5: Self-Driving

  23. CMU 15-721 (Spring 2019) 12 SELF- DRIVIN G DATABASE A DBMS that can deploy, configure, and tune itself automatically without any human intervention. → Select actions to improve some objective function (e.g., throughput, latency, cost). → Choose when to apply an action. → Learn from these actions and refine future decision making processes. SELF- DRIVING DATABASE MANAGEMENT SYSTEMS CIDR 2 2017

  24. CMU 15-721 (Spring 2019) 13 ARCH ITECTURE OVERVIEW SQL Statements Workload Forecasts Modeling Component Models Internal Metrics Deploy & Search & Observe Planning Actions Where to Deploy? When to Deploy? How to Deploy? Why?

  25. CMU 15-721 (Spring 2019) 14 SELF- DRIVIN G EN GIN EERIN G Environment Observations → How the DBMS collects training data. Action Meta-Data → How the DBMS implements and exposes methods for controlling and modifying the system's configuration. Action Engineering → How the DBMS deploys actions either for training or optimization. MAKE YOUR DATABASE DREAM OF ELECTRIC SHEEP: ENGINEERING FOR SELF- DRIVING OPERATION UNPUBLISHED M MANUSCRIPT 2019

  26. CMU 15-721 (Spring 2019) 15 EN VIRO N M EN T O BSERVATIO N S Logical Workload History → SQL queries with their execution context. → Need to compress to reduce storage size. Runtime Metrics → Internal measurements about the DBMS's runtime behavior. Database Contents → Succinct representation/encoding of the database tables.

  27. CMU 15-721 (Spring 2019) 16 SUB- CO M PO N EN T M ETRICS If the DBMS has sub-components that are tunable, then it must expose separate metrics for those components. Bad Example:

  28. CMU 15-721 (Spring 2019) 17 SUB- CO M PO N EN T M ETRICS RocksDB Column Family Knobs

  29. CMU 15-721 (Spring 2019) 17 SUB- CO M PO N EN T M ETRICS RocksDB Column Family Knobs Column Family Metrics Missing: Reads Writes

  30. CMU 15-721 (Spring 2019) 17 SUB- CO M PO N EN T M ETRICS RocksDB Column Family Knobs Global Metrics Aggregated Metrics

  31. CMU 15-721 (Spring 2019) 18 ACTIO N M ETA- DATA Configuration Knobs → Untunable flags → Value ranges Dependencies → No hidden dependencies → Dynamic actions (i.e., an action creates new actions).

  32. CMU 15-721 (Spring 2019) 19 UN TUN ABLE KN O BS Anything that requires a human value judgement should be marked as off-limits to autonomous components. → File Paths → Network Addresses → Durability / Isolation Levels

  33. CMU 15-721 (Spring 2019) 20 KN O B H IN TS The autonomous components need hints about how to change a knob. → Min/max ranges. → Separate knobs to enable/disable a feature. → Non-uniform deltas.

  34. CMU 15-721 (Spring 2019) 20 KN O B H IN TS The autonomous components need hints about how to change a knob. → Min/max ranges. → Separate knobs to enable/disable a feature. → Non-uniform deltas. 1 KB 1 MB 1 GB 1 TB +10 KB +10 MB +10 GB

  35. CMU 15-721 (Spring 2019) 21 ACTIO N EN GIN EERIN G No Downtime Notifications Replicated Training

  36. CMU 15-721 (Spring 2019) 22 N O DOWN TIM E The DBMS must be able to deploy any action without incurring downtime. → Restart vs. Unavailability Without this, the system has to include the downtime in its cost model estimations. → Bad Example: MySQL Log File Size

  37. CMU 15-721 (Spring 2019) 23 N OTIFICATIO N S Provide a notification to indicate when an action starts and when it completes. → Need to know whether degradation is due to deployment or bad decision. Harder for changes that can be used before the action completes.

  38. CMU 15-721 (Spring 2019) 24 REPLICATED TRAIN IN G ML models need lots of training data. But getting this data is expensive in a DBMS. → We don't want to slow down a production DBMS. → Building a simulator for the DBMS is too hard. Ongoing Research: How to use the DBMS's replicas to explore configurations and train its models.

  39. CMU 15-721 (Spring 2019) 25 REPLICATED TRAIN IN G Actions Master Replica Self-Driving Manager Actions Replica

  40. CMU 15-721 (Spring 2019) 25 REPLICATED TRAIN IN G Actions Actions Master Replica Self-Driving Manager Actions Actions Replica

  41. CMU 15-721 (Spring 2019) 25 REPLICATED TRAIN IN G ??? Wait Time Revert Actions Actions Master Replica Self-Driving Manager Actions Replica

  42. CMU 15-721 (Spring 2019) 25 REPLICATED TRAIN IN G Writes Master Replica Reads Writes Self-Driving Manager App Server Writes Replica

  43. CMU 15-721 (Spring 2019) 25 REPLICATED TRAIN IN G ??? SQL Statements Physical Log Master Replica Reads Writes Self-Driving Manager App Server ??? SQL Statements Replica Physical Log

Recommend


More recommend