Fjord Bridging the between Gap OLAP SQL and Jens-Peter Dittrich 1, * Donald Kossmann 1,2 Alexander Kreutz 2 1 ETH Zurich, Switzerland *Affiliation 2003-2004: SAP BW OLAP Technology 2 i-TV-T AG, Germany
OLAP: The CEO’s View 08/30/05 Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi 2
OLAP: The Ph.D.’s View Not true! 1. OLAP is just another application on top of existing DBMS. Not true! 2. DBMS performance for OLAP is great. Not true! 3. The relational model is well suited for OLAP. Not true! 4. SQL is a great language for doing OLAP. OLAP Client OLAP Client ? OLAP Engine RDBMS/SQL RDBMS/SQL 08/30/05 Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi 3
What do OLAP Engines do?... 1. Result Formatting Good! OLAP Client 2. Query Processing 1. Joins OLAP 2. Aggregations Bad! Engine 3. Pivot and Cube Computation 4. Caching 3. Misc RDBMS/SQL 1. Currency Conversions 2. Summarizability Checks ...OLAP engines 3. Authorization bridge the gap between OLAP 4. ... and SQL! 08/30/05 Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi 4
The Gap Relational Model Pivot Tables Bridging the Gap OLAP SQL Closing the Gap 08/30/05 Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi 5
Relational State Customer Product Profit Input S1 C1 P1 1.0 S1 C1 P2 1.0 Model S1 C2 P1 1.0 S1 C2 P2 1.0 S2 C1 P1 1.0 S2 C1 P2 1.0 S2 C2 P1 1.0 S2 C2 P2 1.0 Result of SQL Rollup 1. April 2005 Dr. Jens-Peter Dittrich/Institut für Informationssysteme
Relational Model: NULL-values Take care NULL has two different meanings in SQL: 1. Aggregate (from Rollup operation) 2. Value does not exist (e.g. from outer joins) The semantics of a NULL-value can be obtained calling GROUPING(). Result of SQL Rollup 1. differentiate between different semantics of NULL 08/30/05 Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi 7
Relational Model: order on rows The result table is ordered lexicographically. Result of SQL Rollup 1. differentiate between different semantics of NULL 2. assume order on result table 08/30/05 Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi 8
Relational Model: multi columns Result of Interpreting NULL- SQL Rollup values as multi columns 1. differentiate between different semantics of NULL 2. assume order on result table 3. merge columns containg NULLs to form multi column cells 08/30/05 Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi 9
Relational Model: multi rows Result of Interpreting NULL- Interpreting adjacent similar SQL Rollup values as multi columns values as multi rows 1. differentiate between different semantics of NULL 2. assume order on result table 3. merge columns containg NULLs to form multi column cells 4. merge adjacent rows containing similar values to form multi row cells 08/30/05 Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi 10
Relational Model: order on columns These cells look different if we chose a different order on the dimensions, e.g. Customer, State, Product 1. differentiate between different semantics of NULL 2. assume order on result table 3. merge columns containg NULLs to form multi column cells 4. merge adjacent rows containing similar values to form multi row cells 5. interpret order on columns as hierarchy 08/30/05 Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi 11
Relational Model: Pivot Tables Pivot attribute values become attribute names Pivot operation moves at least one of the attributes to the columns Some of the sums of the pivot are not part of the rollup e.g. ( ∑∑ ,P1), ( ∑∑ ,P2). We have to use CUBE() here. Should we still call this a “table”? 08/30/05 Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi 12
Relational Model: Summary State Customer Product Profit S1 C1 P1 1.0 S1 C1 P2 1.0 S1 C2 P1 1.0 S1 C2 P2 1.0 S2 C1 P1 1.0 S2 C1 P2 1.0 S2 C2 P1 1.0 S2 C2 P2 1.0 SQL to Pivot Recipe 1. differentiate between different semantics of NULL 2. assume order on result table 3. merge columns containg NULLs to form multi column cells 4. merge adjacent rows containing similar values to form multi row cells 5. interpret order on columns as hierarchy 6. let attribute values become attribute names 08/30/05 Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi 13
The Gap Bridging the Gap Closing the Gap 08/30/05 Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi 14
Example: BTell’s Operator Model Client Client Client BTell has two different merged data Pivot types of operators: x xy y HTML XML Excel Split Convert Convert Convert 1S: returns one output stream x xy y Cache [Graefe: Volcano] Align x xy y Drill-down Filter 3S: returns three output x xy y Group Link streams: x xy y Sort x xy y Format 1. one for the x-axis Cache x xy y 2. one for the y-axis Filter Cell Merge x xy y Observable Result Cache 3. one for the xy-axis x xy y Btell OLAP Fetch Engine x, xy, y 3S operators RDBMS 1S operators 08/30/05 Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi 15
More Examples 1. Caching (Special Caching Operator) 2. Pivot Computation (based on 3S operator model) 3. Check for Computability of Aggregates see paper 08/30/05 Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi 16
The Gap Bridging the Gap Closing the Gap 08/30/05 Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi 17
How to reach OLAP Heaven? 3 possible paths to follow: 1. Add even more OLAP stuff to SQL Open questions: How to handle non-relational data? (nested relations?) 08/30/05 Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi 18
How to reach OLAP Heaven? 3 possible paths to follow: 1. Add even more OLAP stuff to SQL e l b a b o Open questions: r p m I How to handle non-relational data? (nested relations?) 08/30/05 Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi 19
How to reach OLAP Heaven? 3 possible paths to follow: 1. Add even more OLAP stuff to SQL e l b a b o Open questions: r p m I How to handle non-relational data? (nested relations?) 2. Make a new query language Hard to agree upon Works only for part of the market (see e.g. MDX) 08/30/05 Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi 20
How to reach OLAP Heaven? 3 possible paths to follow: 1. Add even more OLAP stuff to SQL e l b a b o Open questions: r p m I How to handle non-relational data? (nested relations?) 2. Make a new query language e l b a b o Hard to agree upon r p m I Works only for part of the market (see e.g. MDX) 08/30/05 Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi 21
How to reach OLAP Heaven? 3 possible paths to follow: 1. Add even more OLAP stuff to SQL e l b a b o Open questions: r p m I How to handle non-relational data? (nested relations?) 2. Make a new query language e l b a b o Hard to agree upon r p m I Works only for part of the market (see e.g. MDX) 3. Ride the XML wave Recent proposal for ‘group by’ in XQuery from IBM [SIGMOD 2005] 08/30/05 Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi 22
How to reach OLAP Heaven? 3 possible paths to follow: 1. Add even more OLAP stuff to SQL e l b a b o Open questions: r p m I How to handle non-relational data? (nested relations?) 2. Make a new query language e l b a b o Hard to agree upon r p m I Works only for part of the market (see e.g. MDX) 3. Ride the XML wave ! e l b a b Recent proposal for ‘group by’ in XQuery from IBM [SIGMOD 2005] o r P 08/30/05 Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi 23
Why XML/XQuery? 1. Great data model 2. Powerful query language 3. Highly extensible 08/30/05 Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi 24
Example: Pivot Result 08/30/05 Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi 25
Recommend
More recommend