Multigranular Attributes for Relational Database Systems

Stephen J. Hegner, Umeå University, Sweden (retired); Hegner Consulting, LLC, USA
M. Andrea Rodríguez, University of Concepción, Chile
The Relational Model of Data

• In the relational model, the data are stored in tables.

Employee
FName     MInit  LName   SSN        BDate       Address                   Sex  Salary  Super_SSN  DNo
John      B      Smith   123456789  1965-01-09  731 Fondren, Houston, TX  M    30000   333445555  5
Franklin  T      Wong    333445555  1955-12-08  638 Voss, Houston, TX     M    40000   888665555  5
Alicia    J      Zeyala  999887777  1968-01-19  3321 Castle, Spring, TX   F    25000   987654321  4

Attributes: The columns are defined by attributes (shown in green on the slide).
Domain: The domain of each attribute is the set of possible values.
• Dom(Sex) = {M, F}.
• Dom(SSN) = strings of exactly 9 digits.
• Dom(BDate) = dates in YYYY-MM-DD format.
Operations: In general, the only intra-domain operations supported are simple comparisons (including equality).
Examples: 333445555 < 888665555; 1955-12-08 < 1965-01-09.
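The three example domains and the two comparison examples can be sketched in a few lines of Python. This is only an illustration: the predicate names `in_dom_sex`, `in_dom_ssn`, and `in_dom_bdate` are hypothetical and not part of the slides, and values are assumed to be stored as strings.

```python
import re
from datetime import date

# Hypothetical membership tests for three attribute domains of Employee.

def in_dom_sex(value: str) -> bool:
    # Dom(Sex) = {M, F}
    return value in {"M", "F"}

def in_dom_ssn(value: str) -> bool:
    # Dom(SSN) = strings of exactly 9 digits
    return re.fullmatch(r"[0-9]{9}", value) is not None

def in_dom_bdate(value: str) -> bool:
    # Dom(BDate) = dates in YYYY-MM-DD format that denote a real calendar day
    if re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", value) is None:
        return False
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True

# Simple comparison is the only intra-domain operation used here.
# For fixed-length digit strings and ISO dates, string order agrees
# with numeric and chronological order respectively.
ssn_less = "333445555" < "888665555"
bdate_less = "1955-12-08" < "1965-01-09"
```

Nothing beyond equality and order is used on the values, which matches the slide's point that ordinary relational domains carry very little internal structure.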
The Idea of Multigranular Attributes

Place           Time     Births
Concepción_cdd  Y2016Q1  $b_1$
Concepción_cmn  Y2016Q1  $b_2$
Concepción_prv  Y2016Q1  $b_3$
Concepción_cmn  Y2016    $b_4$

Spatio-temporal attributes: Place, Time. Thematic attributes: Births.
cdd = ciudad/city; cmn = comuna/county; prv = provincia/province.

Granules: The domain values are called granules.
Granular order: The granules of spatial and temporal attributes have an inherent order structure.
Spatial containment: $\text{Concepción\_cdd} \sqsubseteq \text{Concepción\_cmn} \sqsubseteq \text{Concepción\_prv}$
Temporal interval containment: $\text{Y2016Q1} \sqsubseteq \text{Y2016}$
Typical constraints: the functional dependency (FD) $\{\text{Place}, \text{Time}\} \to \text{Births}$; births are monotonic w.r.t. space and time, so $b_1 \le b_2 \le b_3$ and $b_2 \le b_4$.
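The monotonicity constraint can be checked mechanically once the containment orders are available. The following is a minimal sketch, assuming the orders are given as finite lists of (finer, coarser) pairs; the function names and the birth counts are illustrative assumptions, not data from the slides. Storing the rows in a dictionary keyed by (Place, Time) enforces the FD by construction.

```python
# (finer, coarser) pairs taken from the containments stated on the slide.
PLACE_PAIRS = [("Concepción_cdd", "Concepción_cmn"),
               ("Concepción_cmn", "Concepción_prv")]
TIME_PAIRS = [("Y2016Q1", "Y2016")]

def closure(pairs, elements):
    """Reflexive-transitive closure of a finite relation."""
    rel = set(pairs) | {(e, e) for e in elements}
    changed = True
    while changed:
        changed = False
        for (a, b) in list(rel):
            for (c, d) in list(rel):
                if b == c and (a, d) not in rel:
                    rel.add((a, d))
                    changed = True
    return rel

def monotonicity_violations(rows, place_pairs, time_pairs):
    """rows maps (place, time) -> births; returns pairs of keys that
    violate 'finer place and finer time implies no more births'."""
    places = {p for p, _ in rows} | {x for pr in place_pairs for x in pr}
    times = {t for _, t in rows} | {x for pr in time_pairs for x in pr}
    p_le = closure(place_pairs, places)
    t_le = closure(time_pairs, times)
    bad = []
    for (p1, t1), n1 in rows.items():
        for (p2, t2), n2 in rows.items():
            if (p1, p2) in p_le and (t1, t2) in t_le and n1 > n2:
                bad.append(((p1, t1), (p2, t2)))
    return bad

# Made-up birth counts playing the roles of b_1 .. b_4.
ROWS = {
    ("Concepción_cdd", "Y2016Q1"): 100,  # b_1
    ("Concepción_cmn", "Y2016Q1"): 150,  # b_2
    ("Concepción_prv", "Y2016Q1"): 400,  # b_3
    ("Concepción_cmn", "Y2016"): 600,    # b_4
}
```

With these values `monotonicity_violations(ROWS, PLACE_PAIRS, TIME_PAIRS)` returns an empty list; lowering the yearly count below the quarterly one makes the corresponding pair of rows show up as a violation.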
Lattice-Like Operations on Granules

Place           Time     Births
Arauco_prv      Y2016Q1  $b_1$
BíoBío_prv      Y2016Q1  $b_2$
Concepción_prv  Y2016Q1  $b_3$
Ñuble_prv       Y2016Q1  $b_4$
BíoBío_rgn      Y2016Q1  $b_5$

Join: The four provinces join to the region.
$\text{BíoBío\_rgn} = \bigsqcup \{\text{Arauco\_prv}, \text{BíoBío\_prv}, \text{Concepción\_prv}, \text{Ñuble\_prv}\}$
Meet: Distinct provinces are disjoint (six possibilities in all).
$\bigsqcap \{\text{Arauco\_prv}, \text{BíoBío\_prv}\} = \bot$
Disjoint join: The four provinces join disjointly to the region.
$\text{BíoBío\_rgn} = \bigsqcup^{\bot} \{\text{Arauco\_prv}, \text{BíoBío\_prv}, \text{Concepción\_prv}, \text{Ñuble\_prv}\}$
Consequence: $\sum_{i=1}^{4} b_i = b_5$.
Observation: These lattice-like operations are partial.
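The consequence of a disjoint join for an additive measure such as Births can be checked as follows. This is a sketch under stated assumptions: the function name `additive_violations` is hypothetical, the disjoint join is declared by hand rather than derived, and the counts are invented for illustration.

```python
# Declared disjoint join from the slide: the region is the disjoint join
# of its four provinces.
DISJOINT_JOINS = {
    "BíoBío_rgn": ["Arauco_prv", "BíoBío_prv", "Concepción_prv", "Ñuble_prv"],
}

def additive_violations(rows, disjoint_joins):
    """rows maps (place, time) -> births.
    Returns (whole, time, sum_of_parts, stored_value) for every stored
    whole whose parts are all present but do not add up to it."""
    bad = []
    for whole, parts in disjoint_joins.items():
        for (place, time), total in rows.items():
            if place != whole:
                continue
            # Only check when every part has a row for the same time granule.
            if all((p, time) in rows for p in parts):
                parts_sum = sum(rows[(p, time)] for p in parts)
                if parts_sum != total:
                    bad.append((whole, time, parts_sum, total))
    return bad

# Made-up counts playing the roles of b_1 .. b_5.
ROWS = {
    ("Arauco_prv", "Y2016Q1"): 10,
    ("BíoBío_prv", "Y2016Q1"): 20,
    ("Concepción_prv", "Y2016Q1"): 30,
    ("Ñuble_prv", "Y2016Q1"): 40,
    ("BíoBío_rgn", "Y2016Q1"): 100,
}
```

The check is only sound because the join is disjoint: with overlapping parts, births in the overlap would be counted more than once.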
Granularities: Organizing Granules

[Figure: two granularity hierarchies. Time: Day, Week, Month, Quarter, Year, $\top$. Place (administrative and electoral): City, County, Province, Region, Chile, NatlPark, ElecTable, ElecConst, District, SenConst, $\top$.]

• The granules of each attribute are partitioned into a hierarchy of granularities.
Disjointness: Distinct granules of the same granularity are disjoint.
Order: $G_1 \le G_2 \iff ((\forall g_1 \in \mathrm{Granules}\langle G_1 \rangle)(\exists g_2 \in \mathrm{Granules}\langle G_2 \rangle)(g_1 \sqsubseteq g_2))$.
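The order on granularities is derived entirely from the order on granules, which a toy example makes concrete. The sketch below assumes that each granule is represented by a set of day numbers and that granule subsumption is set inclusion; the granule names and the tiny four-day domain are invented for illustration.

```python
# Toy granules over a four-day domain {1, 2, 3, 4} (illustrative only).
GRANULES = {
    "Day":   {"d1": {1}, "d2": {2}, "d3": {3}, "d4": {4}},
    "Week":  {"w0": {1}, "w1": {2, 3}, "w2": {4}},
    "Month": {"m1": {1, 2}, "m2": {3, 4}},
}

def subsumed(g1: set, g2: set) -> bool:
    # Granule order modelled as set inclusion (an assumption of this sketch).
    return g1 <= g2

def glty_le(G1: str, G2: str, granules=GRANULES) -> bool:
    """G1 <= G2 iff every granule of G1 is subsumed by some granule of G2."""
    return all(
        any(subsumed(g1, g2) for g2 in granules[G2].values())
        for g1 in granules[G1].values()
    )
```

Here `glty_le("Day", "Week")` and `glty_le("Day", "Month")` hold, while `glty_le("Week", "Month")` fails because the week `w1` straddles both months, mirroring the familiar fact that weeks do not nest inside months.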
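Formalizing Granularity Schemata

Granularity schema: $S = (\mathrm{Glty}\langle S \rangle, \mathrm{Gnle}\langle S \rangle, \Pi_{\mathrm{Gnle}\langle S \rangle})$
Granularity preorder: $\mathrm{Glty}\langle S \rangle = (\mathrm{Glty}\langle S \rangle, \le_{\mathrm{Glty}\langle S \rangle}, \top_{\mathrm{Glty}\langle S \rangle})$
Granule preorder: $\mathrm{Gnle}\langle S \rangle = (\mathrm{Granules}\langle S \rangle, \sqsubseteq_S, \top_S, \bot_S)$
Granule partition: $\Pi_{\mathrm{Gnle}\langle S \rangle} = \{\mathrm{Granules}\langle S \mid G \rangle \mid G \in \mathrm{Glty}\langle S \rangle\}$, a partition of the non-bottom granules $\mathrm{Granules}\langle S \rangle \setminus \{\bot_S\}$.

Additional properties:
The top granularity consists only of the top granules:
$\mathrm{Granules}\langle S \mid \top_{\mathrm{Glty}\langle S \rangle} \rangle = [\top_S]_S$ (where $[-]_S$ denotes the equivalence class under $\sqsubseteq_S$)
Distinct granules of the same granularity are never equivalent:
$(g_1 \ne g_2 \in \mathrm{Granules}\langle S \mid G \rangle) \Rightarrow ([g_1]_S \ne [g_2]_S)$
Distinct granules of the same granularity have nothing in common:
$(g_1 \ne g_2 \in \mathrm{Granules}\langle S \mid G \rangle) \Rightarrow (\mathrm{GLB}_{\mathrm{Gnle}\langle S \rangle}\langle \{g_1, g_2\} \rangle = \bot_S)$
Granularity order and granule order:
$(G_1 \le_{\mathrm{Glty}\langle S \rangle} G_2) \iff ((\forall g_1 \in \mathrm{Granules}\langle S \mid G_1 \rangle)(\exists g_2 \in \mathrm{Granules}\langle S \mid G_2 \rangle)(g_1 \sqsubseteq_S g_2))$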
Equivalence of Granularities

[Figure: granularity hierarchy for Place (administrative and electoral): City, County, Province, Region, Chile, NatlPark, ElecTable, ElecConst, District, SenConst, $\top$.]

Question: Why not make the granularity order partial?
Answer: Some distinct granularities might become identical (with respect to granules) at other points in time.
Near partial order: Require the order instead to be near partial:
$(G_1 \le_{\mathrm{Glty}\langle S \rangle} G_2 \le_{\mathrm{Glty}\langle S \rangle} G_1) \Rightarrow (G_1 \cong G_2)$.
Formalization of Granule Structure

• A granule structure is a model for the constraints imposed by the granularity schema.
• $\sigma = (\mathrm{Dom}\langle \sigma \rangle, \mathrm{GnletoDom}_\sigma)$
Domain: $\mathrm{Dom}\langle \sigma \rangle$ is a (not necessarily finite) set.
Granule semantics function: $\mathrm{GnletoDom}_\sigma : \mathrm{Granules}\langle S \rangle \to 2^{\mathrm{Dom}\langle \sigma \rangle}$.
$\bot_S$ maps to $\emptyset$: $\mathrm{GnletoDom}_\sigma(\bot_S) = \emptyset$.
Granule subsumption maps to set inclusion:
$(g_1 \sqsubseteq_S g_2) \Rightarrow (\mathrm{GnletoDom}_\sigma(g_1) \subseteq \mathrm{GnletoDom}_\sigma(g_2))$.
Distinct granules of the same granularity are disjoint:
$(\forall G \in \mathrm{Glty}\langle S \rangle \setminus \{\top_{\mathrm{Glty}\langle S \rangle}\})(\forall g_1, g_2 \in \mathrm{Granules}\langle S \mid G \rangle)\,((g_1 \ne g_2) \Rightarrow (\mathrm{GnletoDom}_\sigma(g_1) \cap \mathrm{GnletoDom}_\sigma(g_2) = \emptyset))$.
Two granules have the same semantics iff they are equivalent under $\sqsubseteq_S$:
$(\mathrm{GnletoDom}_\sigma(g_1) = \mathrm{GnletoDom}_\sigma(g_2)) \iff ([g_1]_S = [g_2]_S)$.
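For a finite schema, the first three conditions on a granule structure can be verified by brute force. The following sketch assumes that the semantics function is given as a dictionary from granule names to sets and that subsumption is listed as explicit pairs; the function `check_structure`, its parameters, and the toy data are hypothetical. The fourth condition (same semantics iff equivalent) is deliberately left out to keep the example short.

```python
from itertools import combinations

def check_structure(gnle_to_dom, subsumption, granularity_of,
                    top_granularity, bottom):
    """Return a list of violated conditions; an empty list means that the
    three conditions checked here all hold."""
    errors = []
    # 1. The bottom granule maps to the empty set.
    if gnle_to_dom[bottom]:
        errors.append(("bottom", bottom))
    # 2. Subsumption maps to set inclusion.
    for g1, g2 in subsumption:
        if not gnle_to_dom[g1] <= gnle_to_dom[g2]:
            errors.append(("subsumption", g1, g2))
    # 3. Distinct granules of one non-top granularity are disjoint.
    by_granularity = {}
    for g, G in granularity_of.items():
        by_granularity.setdefault(G, []).append(g)
    for G, members in by_granularity.items():
        if G == top_granularity:
            continue
        for g1, g2 in combinations(members, 2):
            if gnle_to_dom[g1] & gnle_to_dom[g2]:
                errors.append(("disjoint", g1, g2))
    return errors

# Toy structure: two days, one month, plus top and bottom (illustrative).
GNLE_TO_DOM = {"bot": set(), "d1": {0}, "d2": {1}, "m1": {0, 1}, "top": {0, 1}}
SUBSUMPTION = [("bot", "d1"), ("d1", "m1"), ("d2", "m1"), ("m1", "top")]
GRANULARITY_OF = {"d1": "Day", "d2": "Day", "m1": "Month", "top": "Top"}

ERRORS = check_structure(GNLE_TO_DOM, SUBSUMPTION, GRANULARITY_OF, "Top", "bot")
```

Note that the bottom granule is assigned to no granularity in the toy data, in line with the partition covering only the non-bottom granules.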
Examples of Granule Structure

Example: $\sigma_{\text{Place}}$ for the granularity schema of space.
• $\mathrm{Dom}\langle \sigma \rangle = \mathbb{R}^2$.
• $\mathrm{GnletoDom}_{\text{Place}}(\text{Some\_entity})$ = the geographic region defining that entity.

Example: $\sigma_{\text{Time}}$ for the granularity schema of time.
• Model all days starting with 1970-01-01.
• $\mathrm{Dom}\langle \sigma \rangle = \mathbb{N}$.
Number days consecutively with 1970-01-01 as day zero:
$\mathrm{GnletoDom}_{\text{Time}}(\text{yyyy-mm-dd}) = \{\text{number of days yyyy-mm-dd is after 1970-01-01}\}$.
All other granules consist of a set of days:
$\mathrm{GnletoDom}_{\text{Time}}(X) = \bigcup \{\mathrm{GnletoDom}_{\text{Time}}(d) \mid d \in X\}$.

Common properties:
Subsumption: Recaptures the usual notion of spatial/temporal subsumption.
Disjointness: Recaptures the notion for granules of the same granularity only.
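The time structure translates almost directly into code using Python's standard `datetime` module. The sketch below is one possible rendering, not the authors' implementation; the helper names `day_number`, `span`, `month`, `quarter`, and `year` are assumptions introduced for illustration.

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)  # day zero

def day_number(d: date) -> int:
    """Number of days that d lies after 1970-01-01."""
    return (d - EPOCH).days

def span(first: date, last: date) -> set:
    """Union of the day granules from first to last, inclusive."""
    return set(range(day_number(first), day_number(last) + 1))

def month(y: int, m: int) -> set:
    # First day of the following month, minus one day, is the last day.
    first_of_next = date(y + 1, 1, 1) if m == 12 else date(y, m + 1, 1)
    return span(date(y, m, 1), first_of_next - timedelta(days=1))

def quarter(y: int, q: int) -> set:
    days = set()
    for m in range(3 * q - 2, 3 * q + 1):
        days |= month(y, m)
    return days

def year(y: int) -> set:
    return span(date(y, 1, 1), date(y, 12, 31))

# Subsumption is set inclusion; disjointness is empty intersection.
q1_in_year = quarter(2016, 1) <= year(2016)
jan_feb_disjoint = not (month(2016, 1) & month(2016, 2))
```

Because every granule is a set of natural numbers, subsumption and disjointness questions about time reduce to ordinary set operations.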
Canonical Primitive Rules and Their Semantics

Question: How are constraints which are not part of the basic granularity schema modelled?
Rules: All additional constraints are expressed in terms of rules.
Examples:
• Disjointness of granules of different granularities.
• Join constraints: $g \sqsubseteq_S \bigsqcup_S S$; $g = \bigsqcup_S S$; $g = \bigsqcup^{\bot}_S S$.

Canonical primitive rules: All rules are defined in terms of those which are of the following two forms.
Basic subsumption rule: $g \sqsubseteq_S \bigsqcup_S S$. (Here the argument $S$ is a finite, nonempty set of granules; the subscript $S$ is the schema.)
Convention: Regard $g \sqsubseteq_S g'$ as $g \sqsubseteq_S \bigsqcup_S \{g'\}$.
Basic disjointness rule: $\bigsqcap_S \{g_1, g_2\} = \bot_S$.

Semantics: The semantics of these rules are defined with respect to a granule structure $\sigma$ using $\bigsqcup \mapsto \bigcup$, $\bigsqcap \mapsto \bigcap$, $\sqsubseteq \mapsto \subseteq$, and $= \mapsto =$.
• $\sigma \in \mathrm{ModelsOf}\langle g \sqsubseteq_S \bigsqcup_S S \rangle$ iff $\mathrm{GnletoDom}_\sigma(g) \subseteq \bigcup_{s \in S} \mathrm{GnletoDom}_\sigma(s)$.
• $\sigma \in \mathrm{ModelsOf}\langle \bigsqcap_S \{g_1, g_2\} = \bot_S \rangle$ iff $\mathrm{GnletoDom}_\sigma(g_1) \cap \mathrm{GnletoDom}_\sigma(g_2) = \emptyset$.
Basic Rules and Their Semantics

Basic join rule: $g = \bigsqcup_S S$ is defined as the conjunction
$(g \sqsubseteq_S \bigsqcup_S S) \wedge (\bigwedge_{s \in S} (s \sqsubseteq_S g))$.
Basic disjoint join rule: $g = \bigsqcup^{\bot}_S S$ is defined as the conjunction
$(g = \bigsqcup_S S) \wedge (\bigwedge_{s_1 \ne s_2 \in S} (\bigsqcap_S \{s_1, s_2\} = \bot_S))$.
Basic disjoint subsumption rule: $g \sqsubseteq_S \bigsqcup^{\bot}_S S$ is defined as the conjunction
$(g \sqsubseteq_S \bigsqcup_S S) \wedge (\bigwedge_{s_1 \ne s_2 \in S} (\bigsqcap_S \{s_1, s_2\} = \bot_S))$.

• These rules, together with the canonical primitive rules
  • $g \sqsubseteq_S \bigsqcup_S S$
  • $g \sqsubseteq_S g'$
  • $\bigsqcap_S \{g_1, g_2\} = \bot_S$
are the only ones used in this work.
BaRules$\langle S \rangle$: This combined collection is denoted $\mathrm{BaRules}\langle S \rangle$.
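Since each basic rule is a conjunction of the two primitive forms, its truth in a given granule structure can be decided with set operations alone. The sketch below assumes a structure is given as a dictionary `sem` from granule names to sets; all function names, the numeric domain, and the granule `Some_park` are hypothetical and serve only as illustration.

```python
from itertools import combinations

def holds_subsumption(sem, g, S):
    """Primitive rule: g is subsumed by the join of S (S nonempty)."""
    union = set().union(*(sem[s] for s in S))
    return sem[g] <= union

def holds_disjointness(sem, g1, g2):
    """Primitive rule: the meet of g1 and g2 is bottom."""
    return not (sem[g1] & sem[g2])

def pairwise_disjoint(sem, S):
    return all(holds_disjointness(sem, a, b) for a, b in combinations(S, 2))

def holds_join(sem, g, S):
    # g below the join of S, and every s in S below g
    # (s below g is the subsumption rule with the singleton set {g}).
    return holds_subsumption(sem, g, S) and all(
        holds_subsumption(sem, s, [g]) for s in S
    )

def holds_disjoint_join(sem, g, S):
    return holds_join(sem, g, S) and pairwise_disjoint(sem, S)

def holds_disjoint_subsumption(sem, g, S):
    return holds_subsumption(sem, g, S) and pairwise_disjoint(sem, S)

# Toy structure: each province covers two abstract points of the region.
SEM = {
    "Arauco_prv": {1, 2},
    "BíoBío_prv": {3, 4},
    "Concepción_prv": {5, 6},
    "Ñuble_prv": {7, 8},
    "BíoBío_rgn": {1, 2, 3, 4, 5, 6, 7, 8},
    "Some_park": {4, 5},  # hypothetical park straddling two provinces
}
PROVINCES = ["Arauco_prv", "BíoBío_prv", "Concepción_prv", "Ñuble_prv"]
```

In this toy structure the region is the disjoint join of the four provinces, and the park is disjointly subsumed by `BíoBío_prv` and `Concepción_prv` without being their join, which shows why the subsumption and join rules need to be kept apart.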
Expression of Constraints

Question: How are constraints expressed in a multigranular attribute?
Two solutions:
Definition by structure: Choose a single granule structure $\sigma$, and then take exactly those constraints which hold in $\sigma$ to be the true ones.
Definition by constraint satisfaction: Given a set $\Phi$ of constraints, the constraints which hold are precisely those which hold in every structure in which $\Phi$ is satisfied.
• The choice depends upon the multigranular attribute.
• Definition by structure works best for Time.
• Definition by constraint satisfaction works best for Place.
Definition by Structure

[Figure: granularity hierarchy for Time: Day, Week, Month, Quarter, Year, $\top$.]

Idea of definition by structure: The constrained granularity schema $S$ is modelled as a single structure $\sigma_S$.
True rules: The rules which are true are precisely those of $\mathrm{ModelsOf}\langle \sigma_S \rangle$.
False rules: All other rules are taken to be false.
Complete information: There is complete information about which rules are true and which are false.

Example: The granular attribute Time is well suited to definition by structure.
• Recall model from Slide 8.
Man made: With a formal, mathematical structure.
Complete information: It is an exact model, not a partial one.
Recall Structure of Granular Attribute Time

[Figure: granularity hierarchy for Time: Day, Week, Month, Quarter, Year, $\top$.]

Example: $\sigma_{\text{Time}}$ for the granularity schema of time.
• Model all days starting with 1970-01-01.
• $\mathrm{Dom}\langle \sigma \rangle = \mathbb{N}$.
Number days consecutively with 1970-01-01 as day zero:
$\mathrm{GnletoDom}_{\text{Time}}(\text{yyyy-mm-dd}) = \{\text{number of days yyyy-mm-dd is after 1970-01-01}\}$.
All other granules consist of a set of days:
$\mathrm{GnletoDom}_{\text{Time}}(X) = \bigcup \{\mathrm{GnletoDom}_{\text{Time}}(d) \mid d \in X\}$.