the mixed workload ch benchmark hybrid oltp olap database
play

The mixed workload CH-BenCHmark Hybrid OLTP&OLAP Database - PowerPoint PPT Presentation

Chair of Informatics III: Database Systems The mixed workload CH-BenCHmark Hybrid OLTP&OLAP Database Systems y y Real-Time Business Intelligence Analytical information at your fingertips Analytical information at your


  1. Chair of Informatics III: Database Systems The mixed workload CH-BenCHmark � Hybrid OLTP&OLAP Database Systems y y � Real-Time Business Intelligence � Analytical information at your fingertips � Analytical information at your fingertips Richard Cole (ParAccel), Florian Funke (TU München), ( ), ( ), Leo Giakoumakis (Microsoft), Wey Guy (Microsoft), Alfons Kemper (TU München), Stefan Krompass (TU p ( ), p ( München), Harumi Kuno (HP Labs), Raghunath Nambiar (Cisco), Thomas Neumann (TU München), Meikel Poess ( ), ( ), (Oracle), Kai-Uwe Sattler (TU Ilmenau), Eric Simon (SAP), Florian Waas (Greenplum)

  2. Chair of Informatics III: Database Systems O t Outcome of the Dagstuhl Seminar f th D t hl S i Fall 2010 � Robust Query Processing � Organized by Goetz Graefe et al. g y � Breakout Working Group � Workload Management � Workload Management � Headed by: Harumi Kuno

  3. Chair of Informatics III: Database Systems State of the Art Separate Transaction State of the Art: Separate Transaction (OLTP) and Query (OLAP) Systems

  4. Chair of Informatics III: Database Systems G Goal: Real Time Business Intelligence l R l Ti B i I t lli � Querying the Transactional Data

  5. Chair of Informatics III: Database Systems H Hasso Plattner (SAP): Keynote at SIGMOD 09 Pl tt (SAP) K t t SIGMOD 09

  6. Chair of Informatics III: Database Systems U Use cases for low latency analytics [Curt f l l t l ti [C t Monash‘s Blog (April 11, 2011), Teradata] � Operational reporting � BI dashboards � Claims processed p � 7 X 24 real time � 7 X 24 real time operations � Inventory instant status � Financial peak � Financial peak � Machine generated � Machine generated periods data � Month end, quarter end � Month end quarter end � Rapid response � Rapid response � Fast analytics � Cyber Security � Short and long term Sh t d l t threats Frankly, I think low-latency monitoring is going to be one of the hot areas over y y g g g the next few years. “Real-time” is cool, and big monitors with constantly changing graphics are cooler yet. [C.M.]

  7. Chair of Informatics III: Database Systems The Best of Both Worlds … Th B t f B th W ld …. one size fits all – again?? BestOfBothWorlds BestOfBothWorlds ++ OLAP ++ OLTP MonetDB / M tDB / V ltDB / VoltDB / Vectorwise/ TimesTen / TREX/ Vertica TREX/ Vertica P*Time P Time -- OLTP -- OLAP

  8. TPC-C and TPC-H d TPC H Missing in TPC-C Schemas TPC C Chair of Informatics III: Database Systems

  9. Chair of Informatics III: Database Systems C&H BenCHmark schema

  10. Mixed OLTP&OLAP Workload Mixed OLTP&OLAP Workload Chair of Informatics III: Database Systems

  11. Chair of Informatics III: Database Systems Re-use existing TPC-C-Benchmark Kit Re-use existing TPC-C-Benchmark Kit

  12. Chair of Informatics III: Database Systems Re-use existing TPC-C-Benchmark Kit Re-use existing TPC-C-Benchmark Kit •Multiple OLTP clients •No wait-time in between requests N it ti i b t t •Deviating from original TPC-C •High throughput for smaller DB •High throughput for smaller DB

  13. Chair of Informatics III: Database Systems No Keying/Think-Time � Clients generate one � Cli N K i /Thi k Ti t t request after another as fast as possible 0 sec 0 sec. 0 sec 0 sec. •10 clients=terminals per Warehouse •10 clients=terminals per Warehouse

  14. Chair of Informatics III: Database Systems Re-use existing TPC-C-Benchmark Kit Re-use existing TPC-C-Benchmark Kit N ti Nation (25) S Supplier li 100k Region (5)

  15. Chair of Informatics III: Database Systems Re-use existing TPC-C-Benchmark Kit Re-use existing TPC-C-Benchmark Kit •No updates because new data is generated by OLTP data is generated by OLTP •Modified TPC-H queries •Different schema N ti Nation (25) S Supplier li 100k Region (5)

  16. Chair of Informatics III: Database Systems All 5 TPC C T All 5 TPC-C Transactions (no waiting time) ti ( iti ti ) • New-Order • Order-Status • Payment P t • Stock-Level St k L l • Delivery All 22 TPC-H Queries e.g., Query 5 : Intra Country – Revenue by local Suppliers within a Region per Nation within a Region, per Nation

  17. Chair of Informatics III: Database Systems Complete Query Suite Complete Query Suite Q1: Generate orderline overview � Q2: Most important supplier/item-combinations (those � that have the lowest stock level for certain parts in a certain region) certain region) Q3: Unshipped orders with highest value for customers � within a certain state within a certain state Q4: Orders that were partially shipped late � Q5: Revenue volume achieved through local suppliers Q5: Revenue volume achieved through local suppliers � � Q6: Revenue generated by orderlines of a certain � quantity quantity Q7: Bi-directional trade volume between two nations � Q8: Market share of a given nation for customers of a Q8: Market share of a given nation for customers of a � � given region for a given part type

  18. Chair of Informatics III: Database Systems Complete Query Suite Complete Query Suite Q9: Profit made on a given line of parts,broken out by � supplier nation and year li ti d Q10: Customers who received their ordered products late � Q11: Most important (high order count compared to the Q11 M t i t t (hi h d t d t th � sum of all ordercounts) parts supplied by suppiers of a particular nation particular nation Q12: Determine whether selecting less expensive modes � of shipping is negatively affecting the critical-priority of shipping is negatively affecting the critical priority orders by causing more parts to be received late by customers Q13: Relationships between customers and the size of � their orders Q14: Market response to a promotion campaign �

  19. Chair of Informatics III: Database Systems Complete Query Suite Complete Query Suite Q15: Determines the top supplier � Q16: Number of suppliers that can supply particular parts � Q17: Average yearly revenue that would be lost if orders � were no longer filled for small quantities of certain parts l fill d f ll titi f t i t Q18: Rank customers based on their placement of a large � q antit order quantity order Q19: Machine generated data mining (revenue report for � disjunctive predicate) disjunctive predicate) Q20: Suppliers in a particular nation having selected parts � that may be candidates for a promotional offer that may be candidates for a promotional offer Q21: Suppliers who were not able to ship required parts in � a timely manner a timely manner Q22: Geographies with customers who may be likely to � make a purchase

  20. Chair of Informatics III: Database Systems Performance and Quality Metrics Performance Quality � OLTP Throughput g p � Isolation Level � NewOrder Tx per minute � Serializable for OLTP � Except Stock-Level � Query Response Times � Query isolation level � Geometric Mean One query stream � Read uncommitted (dirty) � Multiple query streams Multiple query streams � � � Read committed R d itt d � Query Throughput � Serializable � Multiple parallel streams � Multiple parallel streams � Snapshot � Snapshot Freshness of the Snapshot � #Queries per hour � In #missed transactions � � Response time guarantees R ti t � derived from TPC-C

  21. Chair of Informatics III: Database Systems Fi First Results from PostgreSQL to Demonstrate t R lt f P t SQL t D t t the Reporting (out of the box � no fine-tuning)

  22. Chair of Informatics III: Database Systems Fi First Results from PostgreSQL: „Powertest“ t R lt f P t SQL P t t“

  23. Chair of Informatics III: Database Systems Fi First Results from PostgreSQL: „OLTP centric“ t R lt f P t SQL OLTP t i “

  24. Chair of Informatics III: Database Systems Fi First Results: „balanced OLTP & OLAP“ t R lt b l d OLTP & OLAP“

  25. l “ First Results: „Queries only“ Chair of Informatics III: Database Systems i Q lt t R Fi

  26. Chair of Informatics III: Database Systems T Tuning Dimensions i Di i • challenge for workload management •Multi-objective control •Admission control •Resource allocation •Memory M •cores

  27. Chair of Informatics III: Database Systems H How to Gain Performance for Mixed t G i P f f Mi d Workload Processing: Snapshotting and Main Memory DBMS Snapshot Snapshot OLAP Workload OLTP Workload Most current Most current database state •Versioning: run OLAP on time versions of the data V i i OLAP ti i f th d t •Twin block: run OLAP on Tx-consistent snapshot •Shadowing •Tuple level •Tuple level •Page level � exploit hardware support for for Virtual Memory Snapshot (HyPer)

  28. Chair of Informatics III: Database Systems F t Future Work W k � Fine-tune (tighten) the benCHmark specification � Query Parameters � Performance metrics � Performance metrics � Account for dynamically growing database cardinality � Isolation levels � Freshness guarantees � Get TPC.org interested to follow up � Get TPC.org interested to follow up � Industry representatives

  29. in HyPer How it works … How it works Chair of Informatics III: Database Systems

  30. Chair of Informatics III: Database Systems H Hardware Supported Data Access and d S t d D t A d Copy on Update

Recommend


More recommend