15 721
play

15-721 DATABASE SYSTEMS Lecture #10 Query Execution & - PowerPoint PPT Presentation

15-721 DATABASE SYSTEMS Lecture #10 Query Execution & Scheduling Andy Pavlo / / Carnegie Mellon University / / Spring 2016 2 TODAYS AGENDA Process Models Query Parallelization Data Placement Scheduling 10 Crack Commandments


  1. 15-721 DATABASE SYSTEMS Lecture #10 – Query Execution & Scheduling Andy Pavlo / / Carnegie Mellon University / / Spring 2016

  2. 2 TODAY’S AGENDA Process Models Query Parallelization Data Placement Scheduling 10 Crack Commandments CMU 15-721 (Spring 2016)

  3. 3 MULTI-USER DATABASE APP STACK End Users Client Server CMU 15-721 (Spring 2016)

  4. 3 MULTI-USER DATABASE APP STACK SQL PL/SQL End Users Client Server CMU 15-721 (Spring 2016)

  5. 3 MULTI-USER DATABASE APP STACK REST SQL SOAP PL/SQL End Users Client Server CMU 15-721 (Spring 2016)

  6. 3 MULTI-USER DATABASE APP STACK FRONT-END BACK-END APPLICATION APPLICATION DBMS REST SQL SOAP PL/SQL End Users Client Server CMU 15-721 (Spring 2016)

  7. 4 QUERY EXECUTION A query plan is comprised of operators . An operator instance is an invocation of an operator on some segment of data. A task is the execution of a sequence of one or more operator instances. CMU 15-721 (Spring 2016)

  8. 5 PROCESS MODEL A DBMS’s process model defines how the system is architected to support concurrent requests from a multi-user application. A worker is the DBMS component that is responsible for executing tasks on behalf of the client and returning the results. ARCHITECTURE OF A DATABASE SYSTEM Foundations and Trends in Databases 2007 CMU 15-721 (Spring 2016)

  9. 6 PROCESS MODELS Approach #1: Process per DBMS Worker Approach #2: Process Pool Approach #3: Thread per DBMS Worker CMU 15-721 (Spring 2016)

  10. 7 PROCESS PER WORKER Each worker is a separate OS process. → Relies on OS scheduler. → Use shared-memory for global data structures. → Examples: IBM DB2, Postgres, Oracle Dispatcher Worker CMU 15-721 (Spring 2016)

  11. 7 PROCESS PER WORKER Each worker is a separate OS process. → Relies on OS scheduler. → Use shared-memory for global data structures. → Examples: IBM DB2, Postgres, Oracle Dispatcher Worker CMU 15-721 (Spring 2016)

  12. 7 PROCESS PER WORKER Each worker is a separate OS process. → Relies on OS scheduler. → Use shared-memory for global data structures. → Examples: IBM DB2, Postgres, Oracle Dispatcher Worker CMU 15-721 (Spring 2016)

  13. 7 PROCESS PER WORKER Each worker is a separate OS process. → Relies on OS scheduler. → Use shared-memory for global data structures. → Examples: IBM DB2, Postgres, Oracle Dispatcher Worker CMU 15-721 (Spring 2016)

  14. 7 PROCESS PER WORKER Each worker is a separate OS process. → Relies on OS scheduler. → Use shared-memory for global data structures. → Examples: IBM DB2, Postgres, Oracle Dispatcher Worker CMU 15-721 (Spring 2016)

  15. 8 PROCESS POOL A worker uses any process that is free in a pool → Still relies on OS scheduler and shared memory. → Bad for CPU cache locality. → Examples: IBM DB2 Dispatcher Worker Pool CMU 15-721 (Spring 2016)

  16. 8 PROCESS POOL A worker uses any process that is free in a pool → Still relies on OS scheduler and shared memory. → Bad for CPU cache locality. → Examples: IBM DB2 Dispatcher Worker Pool CMU 15-721 (Spring 2016)

  17. 9 THREAD PER WORKER Single process with multiple worker threads. → DBMS has to manage its own scheduling. → May or may not use a dispatcher thread. → Examples: IBM DB2, MSSQL, MySQL, Oracle (Newer) Worker Threads CMU 15-721 (Spring 2016)

  18. 10 PROCESS MODELS Using a multi-threaded architecture has several advantages: → Less overhead per context switch. → Don’t have to manage shared memory. The thread per worker model does not mean that you have intra-query parallelism. I am not aware of any new DBMS built in the last 7-8 years that doesn’t use threads. CMU 15-721 (Spring 2016)

  19. 11 SCHEDULING For each query plan, the DBMS has to decide where, when, and how to execute it. → How many tasks should it use? → How many CPU cores should it use? → What CPU core should the tasks execute on? → Where should a task store its output? The DBMS always knows more than the OS. CMU 15-721 (Spring 2016)

  20. 12 INTER-QUERY PARALLELISM Improve overall performance by allowing multiple queries to execute simultaneously. → Provide the illusion of isolation through concurrency control scheme. The difficulty of implementing a concurrency control scheme is not significantly affected by the DBMS’s process model. CMU 15-721 (Spring 2016)

  21. 13 INTRA-QUERY PARALLELISM Improve the performance of a single query by executing its operators in parallel. Approach #1: Intra-Operator (Horizontal) → Operators are decomposed into independent instances that perform the same function on different subsets of data. Approach #2: Inter-Operator (Vertical) → Operations are overlapped in order to pipeline data from one stage to the next without materialization. CMU 15-721 (Spring 2016)

  22. 14 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 π ⨝ σ σ A B CMU 15-721 (Spring 2016)

  23. 14 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 π ⨝ σ σ A B CMU 15-721 (Spring 2016)

  24. 14 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 π ⨝ σ σ A 1 A 2 A 3 A B CMU 15-721 (Spring 2016)

  25. 14 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 π ⨝ σ σ A 1 A 2 A 3 A B 1 2 3 CMU 15-721 (Spring 2016)

  26. 14 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 π ⨝ σ σ A 1 A 2 A 3 A B 1 2 3 CMU 15-721 (Spring 2016)

  27. 14 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 π ⨝ σ σ σ σ σ A 1 A 2 A 3 A B 1 2 3 CMU 15-721 (Spring 2016)

  28. 14 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 π ⨝ Build HT Build HT Build HT σ σ σ σ σ A 1 A 2 A 3 A B 1 2 3 CMU 15-721 (Spring 2016)

  29. 14 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 π Exchange ⨝ Build HT Build HT Build HT σ σ σ σ σ A 1 A 2 A 3 A B 1 2 3 CMU 15-721 (Spring 2016)

  30. 14 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 π Exchange ⨝ Build HT Build HT Build HT σ σ σ σ σ A 1 A 2 A 3 A B 1 2 3 CMU 15-721 (Spring 2016)

  31. 14 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 π Exchange ⨝ Build HT Build HT Build HT σ σ σ σ σ A 1 A 2 A 3 B 1 B 2 A B 1 2 3 4 5 CMU 15-721 (Spring 2016)

  32. 14 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 π Exchange Exchange ⨝ Build HT Build HT Build HT Build HT Build HT σ σ σ σ σ σ σ A 1 A 2 A 3 B 1 B 2 A B 1 2 3 4 5 CMU 15-721 (Spring 2016)

  33. 14 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 π Exchange Exchange ⨝ Build HT Build HT Build HT Build HT Build HT σ σ σ σ σ σ σ A 1 A 2 A 3 B 1 B 2 A B 1 2 3 4 5 CMU 15-721 (Spring 2016)

  34. 14 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 ⨝ π Exchange Exchange ⨝ Build HT Build HT Build HT Build HT Build HT σ σ σ σ σ σ σ A 1 A 2 A 3 B 1 B 2 A B 1 2 3 4 5 CMU 15-721 (Spring 2016)

  35. 14 INTRA-OPERATOR PARALLELISM SELECT A.id, B.value Exchange FROM A, B WHERE A.id = B.id 1 2 3 4 AND A.value < 99 AND B.value > 100 Probe HT Probe HT Probe HT Probe HT ⨝ π Exchange Exchange ⨝ Build HT Build HT Build HT Build HT Build HT σ σ σ σ σ σ σ A 1 A 2 A 3 B 1 B 2 A B 1 2 3 4 5 CMU 15-721 (Spring 2016)

  36. 15 INTER-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 π ⨝ σ σ A B CMU 15-721 (Spring 2016)

  37. 15 INTER-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 π ⨝ σ σ A B CMU 15-721 (Spring 2016)

  38. 15 INTER-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id AND A.value < 99 AND B.value > 100 π ⨝ σ σ for r 1 ∊ outer: 1 ⨝ for r 2 ∊ inner: A B emit (r 1 ⨝ r 2 ) CMU 15-721 (Spring 2016)

  39. 15 INTER-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id 2 π AND A.value < 99 AND B.value > 100 for r ∊ incoming: π emit ( π r) ⨝ σ σ for r 1 ∊ outer: 1 ⨝ for r 2 ∊ inner: A B emit (r 1 ⨝ r 2 ) CMU 15-721 (Spring 2016)

  40. 15 INTER-OPERATOR PARALLELISM SELECT A.id, B.value FROM A, B WHERE A.id = B.id 2 π AND A.value < 99 AND B.value > 100 for r ∊ incoming: π emit ( π r) ⨝ σ σ for r 1 ∊ outer: 1 ⨝ for r 2 ∊ inner: A B emit (r 1 ⨝ r 2 ) CMU 15-721 (Spring 2016)

Recommend


More recommend