Data Warehouse and OLAP II Data Warehouse and OLAP II Week 6 1
Team Homework Assignment #8 Team Homework Assignment #8 • Using a data warehousing tool and a data set, play four OLAP Using a data warehousing tool and a data set, play four OLAP operations (Roll ‐ up (drill ‐ up), Drill ‐ down (roll down), Slice and dice, Pivot (rotate)) and show the results. • Exercise 3.11, 3,12 and 3.13. i 3 3 2 d 3 3 • Due date – beginning of the lecture on Friday March11 th . beginning of the lecture on Friday March11 th
T Typical OLAP Operations i l OLAP O ti • Roll ‐ up (drill ‐ up) Roll up (drill up) • Drill ‐ down (roll down) • Slice and dice • Pivot (rotate) • Drill ‐ across • Drill ‐ through Drill through
R ll Roll-up • Perform aggregation on a data cube by – Climbing up a concept hierarchy for a dimension Climbing up a concept hierarchy for a dimension – Dimension reduction
5 Ro ll-up Ro ll up
Drill-down Drill down • Drill ‐ down is the reverse of roll ‐ up • Navigates from less detailed data to more detailed data by – Stepping down a concept hierarchy for a dimension – Introducing additional dimensions
7 Drill-do wn Drill do wn
Slice and Dice Slice and Dice • The slice operation performs a selection on one dimension of the given cube, resulting in a sub ‐ cube • The dice operation defines a sub ‐ cube by performing a selection on two or more dimensions 8
9 Slic e Slic e
10 Dic e Dic e
Pivot (Rotate) Pivot (Rotate) • Visualization operation that rotate the data axes in view in order to provide an alternative presentation of the data 11
12 Pivo t Pivo t
Drill-across Drill across • An additional drilling operation • Executes queries involving (i e across) more than one fact • Executes queries involving (i.e., across) more than one fact table 13
Drill-through Drill through • An additional drilling operation • Uses relational SQL facilities to drill through the bottom level • Uses relational SQL facilities to drill through the bottom level of a data cube down to its back ‐ end relational tables 14
15 ure 3.10. E E xample s o o f T ypic al O OL AP o pe r ratio ns o n F ig mu ultidime nsio o nal data c ube , c o m mmo nly us e d fo r da ta ware ho o using
Motivation for Building Data Warehouse Motivation for Building Data Warehouse • Building and using a data warehouse is a complex, difficult, and long ‐ term task • The construction of a large and complex information system • The construction of a large and complex information system can be viewed as the construction of large and complex building
Data Warehouse Project Process (1) D t W h P j t P (1) • Top ‐ down, bottom ‐ up approaches or a combination of both – Top ‐ down: Starts with overall design and planning (mature) – Bottom ‐ up: Starts with experiments and prototypes (rapid)
Data Warehouse Project Process (2) Data Warehouse Project Process (2) • Typical data warehouse design process – Choose a business process to model, e.g., orders, invoices, etc etc. – Choose the grain ( atomic level of data ) of the business process – Choose the dimensions that will apply to each fact table record – Choose the measure that will populate each fact table Choose the measure that will populate each fact table record
Th Three Data Warehouse Models D t W h M d l • Enterprise warehouse • Enterprise warehouse – Collects all of the information about subjects spanning the entire organization • Data mart – A subset of corporate ‐ wide data that is of value to a specific groups of users Its scope is confined to specific selected groups such as users. Its scope is confined to specific, selected groups, such as marketing data mart • Independent vs. dependent (directly from warehouse) data mart • Virtual warehouse – A set of views over operational databases – Only some of the possible summary views may be materialized – Only some of the possible summary views may be materialized
Data Ware ho use De ve lo pme nt: A R A Re c o mme nde d Appro ac h d d A h ig ure 3.13 A re c o mme nde d appro ac h fo r data ware ho use F de ve lo pme nt.
ig ure 3.12 A thre e -tie r data ware ho using arc hite c ture . F
OLAP S OLAP Server Architectures A hit t • Relational OLAP (ROLAP) • Multidimensional OLAP (MOLAP) Multidimensional OLAP (MOLAP) • Hybrid OLAP (HOLAP)
ROLAP ROLAP • Advantages – Can handle large amounts of data C h dl l f d – Can leverage functionalities inherent in the relational database database • Disadvantages – Performance can be slow – Limited by SQL functionalities 23
MOLAP MOLAP • Advantages – Excellent performance – Can perform complex calculations • Disadvantages – Limited in the amount of data it can handle – Requires additional investment
HOLAP HOLAP • HOLAP technologies attempt to combine the advantages of MOLAP and ROLAP. 25
Data Warehouse Vendors Data Warehouse Vendors • IBM – http://www ‐ 306.ibm.com/software/data/informix/redbrick/ • Microsoft – http://www.microsoft.com/sql/solutions/bi/default.mspx p // / q / / / p • Oracle – http://www.oracle.com/siebel/index.html • Business Objects • Business Objects – http://www.businessobjects.com/
Data Warehouse Vendors (cont’d) Data Warehouse Vendors (cont d) • Microstrategy – http://www microstrategy com/ – http://www.microstrategy.com/ • Cognos – http://www.cognos.com/ • Informatica f – http://www.informatica.com/ • Actuate – http://www.actuate.com/home/index.asp
Open Source Data Warehousing Tools Open Source Data Warehousing Tools • MySQL ‐ based data warehouse • Open data warehouse Open data warehouse
D t W Data Warehouse Usage (1) h U (1) • Information processing – supports querying, basic statistical analysis, reporting using supports querying basic statistical analysis reporting using cross ‐ tabs, tables, charts and graphs • Analytical processing • Analytical processing – multidimensional analysis of data warehouse data – supports basic OLAP operations, slice ‐ dice, drilling, supports basic OLAP operations slice dice drilling pivoting
D t W Data Warehouse Usage (2) h U (2) • Data mining • Data mining – knowledge discovery from hidden patterns – Supports associations, constructing analytical models, Supports associations constructing analytical models performing classification and prediction, and presenting the mining results using visualization tools presenting the mining results using visualization tools
From OLAP t OLAM to OLAM • On ‐ Line Analytical Mining y g – High quality of data in data warehouses – Available information processing infrastructure surrounding data warehouses – OLAP ‐ based exploratory data analysis – On ‐ line selection of data mining functions O li l ti f d t i i f ti
An inte gr rate d OL AM and OL AP F ig g ure 3.18 arc c hite c ture e .
Recommend
More recommend