FedDW Global Schema Architect
UML-based Design Tool for the Integration of Data Mart Schemas
Dr. Stefan Berger
Department of Business Informatics – Data & Knowledge Engineering
Johannes Kepler University Linz
ACM 15th DOLAP ’12, November 2, 2012
Outline
1. FedDW Approach
2. Tool Support: FedDW Tool Suite
Stefan Berger (Univ. Linz), FedDW Global Schema Architect, DOLAP, Nov. 2, 2012
1. FedDW Approach
   - General overview of FedDW
   - Integrating heterogeneous multidimensional schemata
2. Tool Support: FedDW Tool Suite
General overview of FedDW: problem definition and our contribution
Problem: similar autonomous data marts/DWs, but heterogeneous schemata and/or data
- Business collaboration
- Mergers and acquisitions
⇒ Preexisting DW data across autonomous organizations
Contribution: comprehensive tool suite for integration of autonomous data marts/DWs
- Visual integration of multidimensional schemas
- OLAP front-end prototype, based on SQL-MDi [Berger and Schrefl, 2006]
General overview of FedDW: motivating example
Telecommunications sector: sample, heterogeneous conceptual data mart schemas (red and blue), each with a connections fact.
[Figure: conceptual schemas of the red and blue data marts. Labels recoverable from the figure: dates, customers, products, category, promotion; date levels date/hr, date, month, quarter, year; duration, dur_min, turnover, tn_tel, tn_misc; cust_name, age_grp, contract_type; prod_name, p_name, regular_fee, base_fee; promo, promo_type.]
Heterogeneities between the two schemas:
- Dimensionality (extra dimension blue.promotion)
- Hierarchy of date dimensions
- Decorations of product dimensions
- Measures of connections facts
Integrating heterogeneous multidimensional schemata: conflict classification I
[Figure: conflicts classified along two axes. Modeling scope: schema conflicts, instance conflicts, schema-instance conflicts. Model entity: dimension (instances are "members") and cube (instances are "cells").]
Integrating heterogeneous multidimensional schemata: conflict classification II

Facts: conflicts             | Relevant operator of FedDW
-----------------------------|-----------------------------------------------------------
Schema-instance              | Merge measures: PIVOT MEASURES (Fact)
                             | Split measures: PIVOT SPLIT MEASURES (Fact)
Dimensionality               | Choose attributes: add DIM reference (Cube)
Different measures           | Choose measures: add MEASURE reference (Cube)
Domain (measures)            | Convert domain: CONVERT MEASURES APPLY ... (Measure)
Naming of attributes         | Rename attributes: operator "-> ..." (Measure, Dimension)
Base levels                  | Roll-up dimension attributes: ROLLUP TO LEVEL ... (Dimension)
Cube cells (fact extensions) | Join cubes: MERGE CUBES (n-ary)
                             | Derive measure values: AGGREGATE MEASURE (n-ary)
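To make the schema-instance row more concrete, the following is a minimal Python sketch of what merging and splitting measures could mean on plain fact rows. It is not SQL-MDi syntax and not the FedDW implementation. The measure names `tn_tel`, `tn_misc`, `turnover` and the dimension name `category` are taken from the motivating example; the sample rows, the function names and the assumption that the two turnover measures collapse into one `turnover` measure plus a `category` dimension are illustrative only.

```python
from collections import defaultdict

# Hypothetical fact rows: turnover is modeled as two separate measures.
wide_cells = [
    {"date": "2012-11-01", "customer": "c1", "tn_tel": 10.0, "tn_misc": 2.5},
    {"date": "2012-11-02", "customer": "c1", "tn_tel": 4.0, "tn_misc": 0.0},
]


def merge_measures(cells, measures, new_dim, new_measure):
    """Replace several measure columns by one measure plus a context dimension."""
    out = []
    for cell in cells:
        # Everything that is not one of the pivoted measures is kept as-is.
        dims = {k: v for k, v in cell.items() if k not in measures}
        for m in measures:
            out.append({**dims, new_dim: m, new_measure: cell[m]})
    return out


def split_measures(cells, dim, measure):
    """Inverse direction: one measure column per member of `dim`."""
    grouped = defaultdict(dict)
    for cell in cells:
        # Group rows that agree on all remaining dimensions.
        key = tuple(sorted((k, v) for k, v in cell.items() if k not in (dim, measure)))
        grouped[key][cell[dim]] = cell[measure]
    return [{**dict(key), **values} for key, values in grouped.items()]


long_cells = merge_measures(wide_cells, ["tn_tel", "tn_misc"], "category", "turnover")
round_trip = split_measures(long_cells, "category", "turnover")
```

In this sketch the two functions are inverses of each other on the sample data, which is one way to read the slide's pairing of "merge measures" and "split measures" as the two directions of resolving a schema-instance conflict.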
Integrating heterogeneous multidimensional schemata: conflict classification III

Dimensions: conflicts          | Relevant operator of FedDW
-------------------------------|-----------------------------------------------------------
Hierarchies                    | Map corresponding levels: add level reference [...] (Dimension)
Domain (levels / decorations)  | Convert domain: CONVERT ATTRIBUTES APPLY ... (Dimension)
Naming (levels)                | Rename attributes: operator "-> ..." (Level)
Naming (decorations)           | Map decorations: MATCH ATTRIBUTES (under Merge Dimensions, n-ary)
Members (dim. extensions)      | Merge sets of members: MERGE DIMENSIONS (n-ary)
Roll-up functions              | Overwrite hierarchies: RELATE Expression (under Merge Dimensions clause, n-ary)
Decoration values              | Correct values: add RENAME function (under Merge Dimensions clause, n-ary)
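The member-level rows of this table (merging member sets and correcting decoration values) can be illustrated with a small Python sketch. This is an assumption-laden illustration, not the behavior of the MERGE DIMENSIONS operator itself: the member keys, the sample values, the function name `merge_dimensions` and the rule that later marts overwrite earlier ones are all hypothetical. Only the decoration name `prod_name` comes from the motivating example.

```python
# Hypothetical product members of two data marts, keyed by product id.
red_products = {
    "p1": {"prod_name": "Flat Rate"},
    "p2": {"prod_name": "Prepaid"},
}
blue_products = {
    "p2": {"prod_name": "Pre-paid"},
    "p3": {"prod_name": "Business"},
}


def merge_dimensions(marts, rename=None):
    """Union the member sets of several marts into one global dimension.

    `rename` maps a conflicting decoration value to its corrected value.
    Assumption of this sketch: on a key clash, the mart listed later wins.
    """
    rename = rename or {}
    merged = {}
    for members in marts:
        for key, decorations in members.items():
            corrected = {attr: rename.get(value, value) for attr, value in decorations.items()}
            merged.setdefault(key, {}).update(corrected)
    return merged


products = merge_dimensions(
    [red_products, blue_products],
    rename={"Pre-paid": "Prepaid"},  # hypothetical value correction
)
```

Here member `p2` occurs in both marts with differently spelled decoration values; the rename mapping reconciles them so that the merged dimension holds one consistent member.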
Integrating heterogeneous multidimensional schemata: integration workflow
Establish a federation of autonomous data marts:
1. Import data mart schemas (CWM supported)
   (Optional: enrich roll-up hierarchies ⇒ minimum match integration strategy)
2. Design global multidimensional schema (canonical model)
3. Define semantic mappings, following the both-as-view paradigm [see McBrien and Poulovassilis, 2003]
   (a) Resolve schema–instance conflicts
   (b) Intensional integration: map conceptual schemata
       - Fact tables
       - Dimension tables + hierarchies
   (c) Extensional integration: consolidate data
1. FedDW Approach
2. Tool Support: FedDW Tool Suite
   - FedDW Global Schema Architect
   - FedDW Query Tool
Overview of FedDW tool support I
Java- and Eclipse-based interactive tool suite (EMF, GMF, UML2)
- Visual data mart integration: FedDW Global Schema Architect (GSA)
- OLAP front-end prototype: FedDW Query Tool [Berger and Schrefl, 2009]
- Auxiliary components: Metadata Dictionary, Dimension Repository
Overview of FedDW tool support II
[Figure: architecture of the Federated DW System. Labels recoverable from the figure: OLAP application, user query (SQL), Query Tool, SQL-MDi Parser, SQL-MDi Processor, Global Schema Architect, meta-data dictionary (global schema, mappings in SQL-MDi), dimension repository, import schemas, data marts DM 1, DM 2, ..., DM n.]
FedDW Global Schema Architect: overview of FedDW GSA
Visual design environment for multidimensional schemas
- Schema Editor: nested UML diagrams
  - Import schemas
  - Global schema
- Mapping Editor: graphical, high-level code editor (Master–Detail layout)
  - Import mappings: unary operators (Fact, Dimension entities), intensional
  - Global mappings: n-ary operators, extensional
FedDW Global Schema Architect: sample GSA workflow
1. Import local, autonomous connections schemas
2. Design global connections schema
3. Create import mappings
4. Create one global mapping file
5. Export the mappings to metadata repository
6. Export fact and dimension metadata
GSA: Step 1, Import Wizard I
GSA: Step 1, Import Wizard II
Wizard suggests appropriate UML stereotypes (based on PK/FK constraints). [Screenshot]
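The slide does not state which rule the wizard applies. As a rough illustration of how stereotypes could be derived from key constraints, the Python sketch below uses a common star-schema heuristic: a table whose primary key consists only of foreign-key columns is suggested as a fact, and a table referenced by some foreign key is suggested as a dimension. The heuristic, the stereotype labels `"Fact"` and `"Dimension"`, the `Table` class and the sample column names are assumptions of this sketch, not the GSA's actual logic or API.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    name: str
    primary_key: list
    # Maps a foreign-key column to the name of the table it references.
    foreign_keys: dict = field(default_factory=dict)


def suggest_stereotypes(tables):
    """Suggest a stereotype per table from its PK/FK constraints (heuristic)."""
    referenced = {target for t in tables for target in t.foreign_keys.values()}
    suggestions = {}
    for t in tables:
        if t.foreign_keys and set(t.primary_key) <= set(t.foreign_keys):
            # Primary key made up entirely of foreign keys: typical fact table.
            suggestions[t.name] = "Fact"
        elif t.name in referenced:
            # Referenced by another table's foreign key: typical dimension table.
            suggestions[t.name] = "Dimension"
        else:
            suggestions[t.name] = "Unclassified"
    return suggestions


# Hypothetical relational schema loosely modeled on the connections example.
schema = [
    Table("connections", ["date_id", "customer_id", "product_id"],
          {"date_id": "dates", "customer_id": "customers", "product_id": "products"}),
    Table("dates", ["date_id"]),
    Table("customers", ["customer_id"]),
    Table("products", ["product_id"]),
]
suggestions = suggest_stereotypes(schema)
```

A suggestion of this kind is only a starting point; the wizard described on the slide presents its suggestions for the user to confirm or change.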
GSA: Step 1, Import Wizard III
Initialized class diagram of red.connections. [Screenshot]
GSA: Step 2, Global Schema Editor
Global Schema wizard:
- Conveniently create the global schema as a copy of one import schema
- Edit the schema later on