Towards a Formal Model for View Maintenance in Data Warehouses
D. Agrawal, A. El Abbadi, A. Mostéfaoui, M. Raynal and M. Roy
Univ. Santa Barbara, California
IRISA, Rennes, France
Summary
- The Data Warehouse Problem
  - Definitions
  - Existing protocols
- A Formal Definition of the Problem
  - Formal Definition of Data Objects
  - Abstract Definition of View Management
- The Protocol
  - A Virtual Topology
  - A Pipelining Technique
The Data Warehouse Problem
- A set of databases $x_1, x_2, \ldots, x_n$
- How to efficiently query a database aggregate?
- By adding a Data Warehouse
[Figure: data sources x1, x2, x3, x4, x5; Data Warehouse; Query]
Data Warehouse: Definition
- The Data Warehouse maintains a DB summary, a Select-Project-Join (SPJ) expression: $F = x_1 * x_2 * \cdots * x_n$ (where $*$ denotes the join)
- Data Warehouse (DWH) problem: computation of a "simple" distributed function with changing Data Sources.
Extremal Solutions
- The DWH maintains the total aggregation of all Data Sources:
  - costly in space
  - unnecessary network usage
- The DWH stores no datum and forwards queries to the Data Sources (the DWH acts as a proxy):
  - high latency
  - unnecessary network usage
Proposed Solutions
- The DWH maintains the SPJ expression $F$.
- Periodically, it calculates the increment $\Delta F$.
- Major problem: asynchrony of updates on the Data Sources, which leads to Error Terms.
Major Difficulties
- Asynchrony and distribution of the model:
  - Consistency issues
  - Performance issues: network usage, memory/disk usage on the DWH.
- Complexity of proposed protocols:
  - unproved algorithms
  - need for a formal definition of the problem.
Formal Definitions (data)
- Data Objects, denoted $x_i$
  - a data manager is associated with each $x_i$
  - $x_i$ can be updated and read using the query/update primitives
- Timeline: the successive values of $x_i$ are denoted $x_i^0, x_i^1, \ldots$
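A minimal sketch of a data manager with query/update primitives and a value timeline may make these definitions concrete. The class name `DataObject`, its methods, and the use of integers as stand-ins for database contents are all assumptions made for illustration; the slides do not specify an interface.

```python
class DataObject:
    """Hypothetical data manager for one data object x_i (illustrative only)."""

    def __init__(self, initial):
        # history[k] plays the role of the k-th successive value of x_i.
        self.history = [initial]

    def update(self, delta):
        """Apply an update and return the index of the new value."""
        self.history.append(self.history[-1] + delta)
        return len(self.history) - 1

    def query(self):
        """Return (index, value) for the current value."""
        return len(self.history) - 1, self.history[-1]


x = DataObject(5)
k = x.update(2)          # the value moves from index 0 to index 1
current = x.query()
```

Keeping the whole history is only for exposition; a real data manager would normally keep just the current value and a version counter.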
Formal Definitions (operations)
Data Operations:
- add/remove tuples, denoted $+$: associative, commutative.
- a join operation, denoted $*$: associative, commutative, distributive over $+$.
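Written out, the algebraic properties listed above read as follows. The symbols $+$ (add/remove tuples) and $*$ (join) are notational assumptions of this sketch, and $a$, $b$, $c$ stand for arbitrary data values.

```latex
\begin{align*}
(a + b) + c &= a + (b + c), & a + b &= b + a,\\
(a * b) * c &= a * (b * c), & a * b &= b * a,\\
a * (b + c) &= (a * b) + (a * c).
\end{align*}
```

Distributivity is the property that lets a change to one source be pushed through the join without recomputing the whole expression.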
Formal Definitions (dwh)
The Data Warehouse calculates $F = x_1 * \cdots * x_n$ such that:
- consistency is mandatory at any time;
- up-to-dateness is only eventual, for performance reasons.
Abstract Def. of View Management
- Validity: any query on the DWH returns an $F = x_1^{k_1} * \cdots * x_n^{k_n}$.
- Order Consistency: for any two queries $q$ and $q'$ returning $x_1^{k_1} * \cdots * x_n^{k_n}$ and $x_1^{k'_1} * \cdots * x_n^{k'_n}$, if $q$ was issued before $q'$, then $k_j \le k'_j$ for all $j$.
- Up-to-Dateness: for any $x_i^k$, an infinite sequence of queries will return at least an $x_1^{k_1} * \cdots * x_n^{k_n}$ with $k_i \ge k$.
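The order-consistency property can be checked mechanically on a trace of query answers. The sketch below assumes that each answer is tagged with a vector of version indices, one per data object, and that "consistent" means these vectors never decrease componentwise from one query to the next. Both the tagging and the function name `order_consistent` are assumptions of this example.

```python
def order_consistent(answers):
    """answers: version vectors (one index per data object), in issue order.

    Returns True when every answer is componentwise at least as recent
    as the answer of the query issued just before it.
    """
    return all(
        all(a <= b for a, b in zip(prev, nxt))
        for prev, nxt in zip(answers, answers[1:])
    )


good_trace = [(0, 0, 0), (1, 0, 0), (1, 2, 0)]
bad_trace = [(1, 0), (0, 1)]   # the second answer goes back in time on the first object
```

Comparing consecutive answers is enough because the componentwise order is transitive.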
The Protocol: a single update
- Suppose that $F = x_1 * x_2 * x_3 * x_4$.
- If $x_1$ is updated to $x_1 + \delta_1$, then the corresponding $\Delta F$ is: $\Delta F = \delta_1 * x_2 * x_3 * x_4$
[Figure: sources x1, x2, x3, x4 around the DWH; the increment is built in steps δ1, δ1*x2, δ1*x2*x3, and finally ΔF = δ1*x2*x3*x4]
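The single-update case can be tried on toy relations. This is a sketch under stated assumptions, not the authors' implementation: relations are modelled as dictionaries from rows to signed multiplicities, `add` stands for the add/remove-tuples operation, `join` for a natural join, and the helper `rel` and all sample data are invented for the example.

```python
from collections import defaultdict
from functools import reduce


def rel(*rows):
    """Build a relation: each row (a dict) gets multiplicity 1."""
    out = defaultdict(int)
    for row in rows:
        out[frozenset(row.items())] += 1
    return dict(out)


def add(r, s):
    """Signed multiset sum; rows whose count reaches 0 disappear."""
    out = defaultdict(int)
    for relation in (r, s):
        for row, count in relation.items():
            out[row] += count
    return {row: c for row, c in out.items() if c != 0}


def join(r, s):
    """Natural join on shared attribute names; multiplicities multiply."""
    out = defaultdict(int)
    for row_r, c_r in r.items():
        for row_s, c_s in s.items():
            a, b = dict(row_r), dict(row_s)
            if all(a[k] == b[k] for k in a.keys() & b.keys()):
                out[frozenset({**a, **b}.items())] += c_r * c_s
    return {row: c for row, c in out.items() if c != 0}


x1 = rel({"a": 1, "b": 1})
x2 = rel({"b": 1, "c": 1}, {"b": 2, "c": 2})
x3 = rel({"c": 1, "d": 1}, {"c": 2, "d": 2})
x4 = rel({"d": 1, "e": 1}, {"d": 2, "e": 2})
view = reduce(join, [x1, x2, x3, x4])          # F before the update

delta1 = rel({"a": 9, "b": 2})                 # tuples added to x1
delta_f = reduce(join, [delta1, x2, x3, x4])   # increment built source by source
new_view = add(view, delta_f)                  # F after integrating the increment
recomputed = reduce(join, [add(x1, delta1), x2, x3, x4])
```

Here `new_view` and `recomputed` agree, which is the distributivity argument applied to concrete data: only the increment travels, not the whole view.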
The Protocol: Concurrent Updates
- Now, suppose that both $x_i$ and $x_j$ are updated.
- Complexity increases with concurrency.
- Two solutions:
  1. compute error terms
  2. order the updates
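A worked instance shows where an error term comes from. The choice of four sources, of $x_1$ and $x_2$ as the updated objects, and of $+$ and $*$ as the notation for add/remove and join are assumptions made for this illustration. By distributivity, the view after both updates is:

```latex
\begin{align*}
F' &= (x_1 + \delta_1) * (x_2 + \delta_2) * x_3 * x_4 \\
   &= F + \delta_1 * x_2 * x_3 * x_4 + x_1 * \delta_2 * x_3 * x_4
        + \delta_1 * \delta_2 * x_3 * x_4 .
\end{align*}
```

If each increment is evaluated against the already updated value of the other source, the two increments are $\delta_1 * (x_2 + \delta_2) * x_3 * x_4$ and $(x_1 + \delta_1) * \delta_2 * x_3 * x_4$. Their sum equals $(F' - F) + \delta_1 * \delta_2 * x_3 * x_4$, so the cross term is counted twice. That extra $\delta_1 * \delta_2 * x_3 * x_4$ is an error term that must either be computed and subtracted, or avoided by ordering the updates.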
The Protocol: a Virtual Topology
- the star topology (center: DWH, periphery: data source nodes) is seen as a ring
- a token perpetually moves on the ring
- it generates a natural order on updates
[Figure: x1, x2, x3, x4 around the DWH]
The Protocol: Pipelining Updates
- The token generates a global time (a count of steps) on the data objects.
- When an update has made a total rotation, it can be integrated into the data warehouse.
- The token can carry several updates that are in their commitment phase.
[Figure: successive values of a data object, with markers "current" and "last committed"]
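The ring and pipelining ideas can be simulated in a few lines. This is a toy sketch, not the protocol from the slides: integers stand in for database contents (integer `+` and `*` are associative, commutative and distributive, like the operations defined earlier), and the function `run_ring`, its parameters, and the rule "fold the current value into in-flight increments, then pick up one local update" are assumptions of this example.

```python
from collections import deque
from math import prod


def run_ring(values, updates, max_steps=1000):
    """values: initial source values; updates: (step, source, delta) triples."""
    n = len(values)
    x = list(values)
    pending = [deque() for _ in range(n)]   # updates waiting at each source
    schedule = sorted(updates)
    view = prod(x)                          # the DWH starts consistent
    history = [view]
    in_flight = []                          # token payload: [origin, partial increment]
    pos, step, nxt = 0, 0, 0
    while step < max_steps:
        # Updates issued up to this step are queued at their source.
        while nxt < len(schedule) and schedule[nxt][0] <= step:
            _, src, delta = schedule[nxt]
            pending[src].append(delta)
            nxt += 1
        # 1. An increment back at its origin has made a total rotation:
        #    it is integrated into the warehouse view.
        for origin, partial in in_flight:
            if origin == pos:
                view += partial
                history.append(view)
        in_flight = [item for item in in_flight if item[0] != pos]
        # 2. The other in-flight increments absorb this source's current value.
        for item in in_flight:
            item[1] *= x[pos]
        # 3. At most one local update is applied and joins the token.
        if pending[pos]:
            delta = pending[pos].popleft()
            x[pos] += delta
            in_flight.append([pos, delta])
        if nxt == len(schedule) and not in_flight and not any(pending):
            break
        pos = (pos + 1) % n
        step += 1
    return x, view, history


final_x, final_view, history = run_ring(
    [2, 3, 5, 7], [(0, 0, 1), (0, 1, 2), (3, 2, -1)]
)
```

In this run two increments travel in the token at the same time, yet every value the warehouse exposes equals the product of some past state of the sources, because the token fixes one order in which updates take effect.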
The Protocol: Code (1)
when the token arrives at $x_i$ with sequence number sn:
  1. let [...];
  2. if ([...]) then send incr [...] to DWH endif;
  3. [...];
  4. do [...] enddo;
  5. [...];
  6. send token [...] sn [...] to next data [...]
The Protocol: Code (2)
when update [...] is received by [...]:
  1. [...];
  2. [...]
when incr [...] is received by DWH:
  1. wait ([...]);
  2. [...];
  3. [...]
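The warehouse-side handler suggests that increments are integrated in sequence-number order. The sketch below illustrates that idea only. It replaces the blocking wait with a buffer for early arrivals, and the class `Warehouse`, its fields, and the integer-valued view are assumptions of this example rather than a reconstruction of the slide's code.

```python
class Warehouse:
    """Hypothetical DWH that integrates increments in sequence-number order."""

    def __init__(self, view):
        self.view = view
        self.last_committed = 0   # highest sequence number integrated so far
        self._early = {}          # increments that arrived ahead of their turn

    def on_incr(self, sn, incr):
        """Handle an increment tagged with sequence number sn."""
        self._early[sn] = incr
        # Integrate every increment whose predecessor is already committed.
        while self.last_committed + 1 in self._early:
            self.last_committed += 1
            self.view += self._early.pop(self.last_committed)


dwh = Warehouse(210)
dwh.on_incr(2, 210)                  # arrives early: held back
held_back = (dwh.view, dwh.last_committed)
dwh.on_incr(1, 105)                  # unblocks both increments
```

Buffering keeps the handler non-blocking, so a queried view always reflects a prefix of the update order.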