OUTLINE CHAPTER 10 � Recursive Hierarchies Table of contents Recursive Hierarchies and Bridges � Reporting Challenge � Flattening a Recursive Hierarchy � The Hierarchy Bridge � Double Counting � Resolving Many to Many Dinesh Maharjan G20169202 � Impact of Changes to Hierarchical bridge Intelligent Information System Lab � Ripple Effect University of Seoul 2017.05.08 � Variation of Hierarchical Bridge 2 Recursive Hierarchies Rows Referring to other Rows Table of contents Table of contents � Attribute Hierarchies describes relationships between columns in � Instance hierarchy is a set of relationship among individual a dimension table. rows of a table. � For eg: in product dimension, each product have brand, brand � This is recursive, because it is self-referring relationship have category. � For eg one company may own another company, and another � Such hierarchy can be defined within the table or separate table. company may own other companies. � But Instance Hierarchy cannot be defined within single table. � Instance Hierarchy describes relationships between rows of a dimension 3 4
Recursive Hierarchy Balanced Vs Unbalanced Hierarchy Table of contents Table of contents � Balanced contains fixed number of levels � All attribute hierarchies are balanced � For eg: product dimension has fixed level of hierarchies. Product, brand, category � Instance hierarchies are unbalanced hierarchy as it rarely contains fixed number of levels � For eg: Company dimension has no fixed level of hierarchy. 5 6 The Reporting Challenge The Reporting Challenge Table of contents Table of contents � Looking Down: to summarize the facts � For eg: if we want to generate total sales grouping particular member in recursive report at company 2 i.e. Total sales made by dimension 2,3,4 � Looking Up: summarizes the facts up the � Cannot use SQL to answer this question. recursive dimension using certain member � GROUP BY function is difficult to be used as � For this traversing the recursive dimension is traversal of series of recursive relationships is required which is difficult required to summarize the fact 7 8
The Reporting Challenge Flattening a Recursive Hierarchy Table of contents Table of contents � Flattened hierarchy looks and behaves like attribute hierarchy � Requires creation of new attributes that represents fixed number of levels � This is done during ETL process � None need to traverse the recursive relationship � Facts can be summarized at various levels by grouping at desired level 9 10 Flattening a Recursive Hierarchy Drawbacks of Flattening Table of contents Table of contents � Flattening creates artificial attribute hierarchy � Backfilling is Necessary by its own identity � It allows query to group data without losing any transaction � Looking up and Looking down is still difficult � Fixed number of levels only � If company 8 acquires another company, cannot accommodate 11 12
Hierarchy Bridge Structure of Hierarchy Bridge Table of contents Table of contents � Allows to aggregate the facts at any point of hierarchy � two foreign keys, one superior_company_key represents upper level company � No need of subqueries � Subordinate_company_key represents lower level company � Looking up and Looking down within the recursive hierarchy � Optional column levels removed describes number of levels � Captures recursive instance relationship instead of dimension separated table 13 14 Structure of Hierarchy Bridge Content of Hierarchical Bridge Table of contents Table of contents � Captures relationship between rows in dimension table � Done during ETL process � One row for each of its direct subordinates � One row for each of its indirect subordinates � One row for itself 15 16
Content of Hierarchical Bridge Looking Down Table of contents Table of contents � Aggregates at its level and levels below it � Subordinate key will join fact table � Superior key join the dimension table � For eg if we want to aggregate order down the company 5 � Select sum(order_facts.order_dollars) � Where bridge.superior=5 and ………. 17 18 Looking Down Looking Up Table of contents Table of contents � Aggregates at its level and levels up it � Subordinate key will join dimension table � Superior key join the fact table � For eg if we want to aggregate order up the company 6 � Select sum(order_facts.order_dollars) � Where bridge.subordinate=6 and …… 19 20
Looking Up Double Counting Table of contents Table of contents � Many to Many relationship between fact and bridge table � So possibility of double counting � For eg, we want to aggregate orders at or below 5, then companies 7 and 8 are subordinates for both 5 and 6. so they will be counted twice. Miscalculation occurs � Looking down- avoided by selecting a single upper level member or by grouping results by upper level members. ( for 6 look down group by 5 and 1) 21 22 Double Counting Double Counting Table of contents Table of contents � Looking Up- avoided by selecting a single lower level member or by grouping results by lower level members � For eg: if we want to look up at 6 then put constraint where subordinate company key=6 or � Put Group by 7, 8 23 24
Hiding the Bridge from Novice Hierarchy Bridge to RDBMS Table of contents Table of contents � High probability of bad queries by novice user � Many to many relationship between facts and bridge table � To avoid this, bridge table can be withhold from such users. � Most of RDBMS cant accommodate such relationship. � One to many relationship is created between dimension and fact table � So, to deal with such condition, dimension table is kept � Can generate reports without hierarchy between facts table and hierarchical bridge table � One to many relationship is created between dimension and facts table using primary key at dimension and foreign at fact � Similarly one to many between dimension and bridge 25 26 Hierarchy Bridge to RDBMS Hierarchy Bridge to Business Intelligence Table of contents Table of contents � This has still problem in hierarchical report generation � Business intelligence requires, meaningful query results from SQL � For this we alias company table twice � One for superior company and one for subordinate company � Subordinate kept between fact and bridge while superior kept beside bridge � Here superior alias is subject of query i.e. point to start query � Superior company is linked with superior key of bridge � Subordinate company is linked with subordinate key of bridge 27 28
Hierarchy Bridge to Business Intelligence Hierarchy Bridge to Business Intelligence Table of contents Table of contents 29 30 Looking Up without many to many Looking Up without many to many Table of contents Table of contents � Superior company is kept between bridge and fact table � Subordinate is kept beside bridge � Superior company is linked with superior key of bridge and fact � Subordinate company is linked with subordinate key of bridge � Here subordinate company represents subject of query or point from which query starts 31 32
Type 1 Change in Dimension Type 1 Change in Hierarchy Table of contents Table of contents � Bridge organization must respond to changes in data sources � If change in hierarchy occurs, delete all the rows in the bridge table relating to the change � Otherwise, it wont generate correct and updated reports � And replace them with a new set of rows � Type 1 change does not require to preserve historic context � When type 1 occurs, simply update the dimension row � There is not impact of type 1 change on bridge and query techniques 33 34 Type 2 Change in Dimension Mechanism of Type 2 change Response Table of contents Table of contents � Create new row in dimension table for changed member � Response to the type 2 change creates a version of � Create new row for all other members in hierarchy history � Create new rows in bridge table for all these new dimension rows � It means there will be row for old data and new row for � For eg: company E has moved its headquarter location new data. � A query can ignore the history version by using natural key � Ripple effect occurs. It means change in one member of hierarchy affects all members of hierarchy 35 36
Mechanism of Type 2 Response Reason for Ripple Effect Table of contents Table of contents � If we only change E to E-1 and want to look down from E, then � Even though E only changed, two version of companies exists. � we will count orders of the subordinates F, G, H for both E and E-1 twice and occurs double count 37 38 Type 2 Changes to Hierarchy Reason for Ripple Effect Table of contents Table of contents � When hierarchy changes, create new rows in dimension table for all � Suppose company H is sold and if we still associate company the members H-1 with same hierarchy � Create new rows in bridge table reflecting new status of hierarchy � Then looking up from H, can provide the transaction mad by A,E,F that occurred after H is sold � When we only want orders made by H-1, we cannot avoid the orders of H as they both use same natural key 39 40
Recommend
More recommend