Peer Data Management Systems Concepts and Approaches Armin Roth HPI, Potsdam, Germany Nov. 10, 2010 Armin Roth (HPI, Potsdam, Germany) Peer Data Management Systems Nov. 10, 2010 1 / 28
Agenda
1. Large-scale Information Sharing
2. PDMS Architecture
3. System Characteristics
4. Comparison of Approaches
5. Conclusion + Future Research
Large-scale Information Sharing
[Figure: network of cooperating peers, among others regional hospitals (north, east, south), a capital hospital, the Red Cross headquarters, a medication logistics company, a government control center, a national medication inventory, and a national pharmacies association. Legend: peer, peer schema, peer data, peer mapping.]
PDMS Architecture: PDMS
[Figure: the peer network from the previous slide (hospitals, Red Cross headquarters, medication logistics company, government control center, inventories, pharmacies association). Legend: peer, peer schema, peer data, peer mapping.]
– Heterogeneity
– Peer autonomy
– Mediator: queries passed to neighbors
– Flexibility
– High redundancy
– Information loss
PDMS Architecture: Distributed Information Systems
[Figure: classification of distributed information systems along three axes: distribution, autonomy, heterogeneity [OV99]. Systems placed in this space: DBMS, distributed DBMS, data warehouse, federated DBMS, mediator-based information system, P2P system, PDMS.]
PDMS Architecture: General System Model
PDMS: set P of peers P_i with P_i = {G_i, S_i, L_i, M_i}:
– Peer schema G_i
– Local schema S_i
– Local mappings L_i
– Peer mappings M_i / M_j
Peer mappings m ∈ M_i ∪ M_j are assertions
  φ_{G_i} ⇝ φ_{G_j}  resp.  φ_{G_j} ⇝ φ_{G_i}
with queries φ_{G_i} and φ_{G_j} of different arity
[Figure: two peers P_i and P_j, each with peer schema G, local schema S, and local mappings L, connected by peer mappings M_i.]
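The system model above can be sketched as a small data structure. This is only an illustration under assumptions: the class names `Peer` and `PeerMapping`, the helper `neighbors`, and the representation of queries as plain strings are hypothetical and not taken from the slides.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PeerMapping:
    """One assertion phi_source ~> phi_target between two peer schemas."""
    source_peer: str
    source_query: str   # query over the source peer schema, kept as text
    target_peer: str
    target_query: str   # query over the target peer schema, kept as text


@dataclass
class Peer:
    """P_i = {G_i, S_i, L_i, M_i} from the system model."""
    name: str
    peer_schema: set                                     # G_i: relation names
    local_schema: set                                    # S_i: relation names
    local_mappings: list = field(default_factory=list)   # L_i: (local, peer) pairs
    peer_mappings: list = field(default_factory=list)    # M_i


def neighbors(peer):
    """Names of all peers reachable from `peer` through one peer mapping."""
    result = set()
    for m in peer.peer_mappings:
        result.update({m.source_peer, m.target_peer})
    result.discard(peer.name)
    return result


# Hypothetical two-peer system; the mapping is shared by both peers (M_i and M_j).
m = PeerMapping("P1", "q1(x) :- R1(x, y)", "P2", "q2(x) :- R2(x)")
p1 = Peer("P1", {"R1"}, {"S1"}, [("S1", "R1")], [m])
p2 = Peer("P2", {"R2"}, {"S2"}, [("S2", "R2")], [m])
```

Keeping the mappings at the peers, rather than in a global catalog, mirrors the idea that each peer only knows its direct neighbors.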
PDMS Architecture: Peer Mappings
Different peers P_i, P_j are heterogeneous in
– Data model
– Schema
– Query language
– Data schema interplay [BCHL05]
– Intensional/extensional completeness
The language of mapping assertions φ_{G_i} ⇝ φ_{G_j} must bridge all these types of heterogeneity [MBDH02]
PDMS Architecture: Example
[Figure: example PDMS with peers P1, P2, P4, P5, P6, their relations, and global-as-view and local-as-view mappings between them.]
Relations (tuple counts as given in the figure):
– P1 (source): R1(a,b,c,d,e), 20 tuples
– P2: R2(a,b,c), 60 tuples; R3(c,d,e), 60 tuples
– P5: R5(a,b,c,d,e), 90 tuples
– P6: R6(c,d,e); P4: R4 (the figure shows a count of 10 tuples next to these two relations)
Mapping assertions shown in the figure:
– R5(a,b,c,d,e) ⊆ R1(a,b,c,d,e), b = 'US'
– R2(a,b,c), R3(c,d,e) ⊆ R1(a,b,c,d,e)
– R5(a,b,c,d,e) ⊆ R6(c,d,e)
– R6(c,d,e) ⊆ R3(c,d,e), d > 10
– R4(a,b,c,d) ⊆ R2(a,b,c), R3(c,d,e), d > 1
PDMS Architecture: Semantics of PDMS Query Answering [CGLR04]
Special case: all queries in mapping assertions ∈ CQ
– Semantics of an individual peer: FOL theory T_{P_i}
– (Global) source database D
– Set of all models of PDMS P wrt. D:
  sem_D(P) = { I | I is a model of all T_{P_i} based on D ∧ I satisfies all M_i }
– The meaning of "I satisfies M_i" varies between the different approaches to peer mappings
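The definition of sem_D(P) can be read as a filter over candidate interpretations. The sketch below assumes a finite, explicitly enumerated list of candidates, which is a simplification (the real set of models is generally infinite). It also computes certain answers in the standard sense: the tuples a query returns in every model. All names (`models`, `certain_answers`, the toy relations and constraints) are hypothetical.

```python
def models(candidates, peer_theories, mappings):
    """Keep the interpretations that satisfy every peer theory and every mapping."""
    return [
        interp for interp in candidates
        if all(theory(interp) for theory in peer_theories)
        and all(mapping(interp) for mapping in mappings)
    ]


def certain_answers(query, model_list):
    """Tuples that the query returns in every model."""
    if not model_list:
        # Without any model, every tuple is trivially certain; this sketch
        # refuses to answer instead of enumerating the whole domain.
        raise ValueError("no models: the PDMS is inconsistent wrt. D")
    answer_sets = [set(query(interp)) for interp in model_list]
    return set.intersection(*answer_sets)


# Toy source database D and three candidate interpretations (assumed data).
D = {"R1": {(1, "US"), (2, "DE")}}
I1 = {"R1": set(D["R1"]), "R5": {(1, "US")}}
I2 = {"R1": set(D["R1"]), "R5": {(1, "US"), (3, "US")}}
I3 = {"R1": set(D["R1"]), "R5": set()}          # violates the mapping below

# Peer theory: the interpretation of R1 contains the stored tuples (sound source).
theory_p1 = lambda interp: D["R1"] <= interp["R1"]
# Peer mapping: every R1 tuple with b = 'US' must also appear in R5.
mapping_15 = lambda interp: {t for t in interp["R1"] if t[1] == "US"} <= interp["R5"]

sem = models([I1, I2, I3], [theory_p1], [mapping_15])
certain = certain_answers(lambda interp: interp["R5"], sem)
```

Here `I3` is rejected by the mapping, and only the tuple present in both remaining models counts as certain.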
PDMS Architecture: Applications for PDMS
– Fusion of organisations
– Semantic Web [HIMT03, HHNR05]
– Disaster management [HIST03]
– Groupware [ANR07]
In general: large, loosely coupled integrated information systems
System Characteristics: System Model [HRZ+08]
Category            Possible alternatives
Data model          Relational; XML (incl. web services); RDF
Topology            Arbitrary; arbitrary without cycles
Mapping language    GLaV; subset of FOL; mapping tables; data schema interplay (e.g., HePToX)
System Characteristics: Semantics
Expressiveness and interpretation of the mapping language determine the semantics of
– query answering
– data exchange
2 principal approaches:
1. Global reasoning: mappings are interpreted as material logical implication
2. Local reasoning: only certain answers are exchanged
System Characteristics: Autonomy/Modularity
Important category in distributed systems with many stakeholders
Types:
– Design autonomy (modeling, naming)
– Communication autonomy (deciding about cooperation)
– Execution autonomy (scheduling of requests)
Influenced by
– Semantics
– Functional requirements (e.g., update propagation, global catalog)
Comparison of Approaches: Piazza [HIST03]
Data model                     Relational, XML
Mapping language               GLaV, definitional mappings
Query language                 CQ
Peer autonomy                  Global catalog
Semantics of query answering   Open-world wrt. certain peer
Query optimization             Containment-based pruning at query planning time
Comparison of Approaches: Hyper [CGL+04, CGLR04]
Data model                     Relational
Mapping language               GLaV
Query language                 CQ
Peer autonomy                  Preserved
Semantics of query answering   Based on epistemic logic, exchange of certain answers
Query optimization             none
Other                          Inconsistency tolerance
Comparison of Approaches: Hyperion [AKK+03, KAM03]
Data model                     Relational (others also possible)
Mapping language               Generalization of GLaV
Query language                 CQ, value search
Peer autonomy                  Preserved
Semantics of query answering   Open-world and closed-world possible
Query optimization             unknown
Other                          Update propagation
Comparison of Approaches: Hyperion
Highly dynamic and scalable
Schema mapping expressions
Mapping tables:
– Correspondences between data values
– Many-to-many mappings
– Automatically inferring new entries
– Respect autonomy of the peers
– Support value search (point queries)
Comparison of Approaches: Hyperion: Semantics of Mapping Tables
Mapping table: X → Y with sets of attribute values resp. variables X, Y (many-to-many)

                    Open-world     Closed-world
present X-value     Any Y-value    indicated Y-values
missing X-value     Any Y-value    no Y-value

Semantics of practical interest: closed-open-world, closed-closed-world
Influences combination of mapping tables
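The two semantics of practical interest can be illustrated with a lookup function. The reading assumed here is that the first word refers to X-values present in the table and the second to X-values missing from it, so closed-open-world means "indicated Y-values for present X-values, any Y-value for missing ones". The function name `allowed_y`, the `ANY` sentinel, and the pair-list representation are hypothetical; the identifiers are GDB/SwissProt values of the kind used in the deck's Hyperion example.

```python
ANY = object()  # sentinel: the X-value may be associated with any Y-value


def allowed_y(table, x, semantics="closed-open"):
    """Return the Y-values an X-value may map to under the given semantics.

    table: iterable of (x_value, y_value) pairs, many-to-many.
    semantics: "closed-open" or "closed-closed" (assumed reading, see lead-in).
    """
    if semantics not in ("closed-open", "closed-closed"):
        raise ValueError(f"unknown semantics: {semantics}")
    indicated = {y for (tx, y) in table if tx == x}
    if indicated:
        # Present X-value: closed world in both semantics.
        return indicated
    # Missing X-value: open world allows anything, closed world nothing.
    return ANY if semantics == "closed-open" else set()


gdb_to_swissprot = [
    ("GDB:120231", "P21359"),
    ("GDB:120231", "O00662"),
    ("GDB:120232", "P35240"),
]

present = allowed_y(gdb_to_swissprot, "GDB:120231")
missing_open = allowed_y(gdb_to_swissprot, "GDB:999999", "closed-open")
missing_closed = allowed_y(gdb_to_swissprot, "GDB:999999", "closed-closed")
```

The choice matters when tables are combined: under closed-closed-world a missing X-value blocks the translation, while under closed-open-world it leaves it unconstrained.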
Comparison of Approaches: Hyperion: Example

GDB id        SwissProt id   MIM id
GDB:120231    P21359         162200
GDB:120231    O00662         193520
GDB:120232    P35240         101000

GDB id        SwissProt id
GDB:120231    O00662

GDB id        MIM id
GDB:120233    162030
Comparison of Approaches: Logical Relational Model [SGMB03]
– Domain relation: any subset of dom_i × dom_j
– Relational space: set of local databases and a domain relation
– Coordination formula:
  CF ::= i : φ | CF → CF | CF ∧ CF | CF ∨ CF | ∃ i : x . CF | ∀ i : x . CF    (i ∈ set of peers)
– Example:
  ∀ (Doc : fn, ln, pn, gender, pr) . (Doc : Patient(1234, fn, ln, pn, gender, pr) →
      Hospital : ∃ (hid, n, a) . (Patient(hid, 1234, n, gender, a, Davis, pr) ∧ n = concat(fn, ln)))
– Query answering: coordination formulas as deductive rules
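The grammar of coordination formulas maps naturally onto a small abstract syntax tree. The following is a minimal sketch, not the representation used in [SGMB03]: the class names, the string encoding of local formulas φ, and the helper `peers_in` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    """i : phi, a local formula phi evaluated at peer i."""
    peer: str
    formula: str


@dataclass(frozen=True)
class Binary:
    """CF -> CF, CF and CF, CF or CF."""
    op: str          # "->", "and", or "or"
    left: object
    right: object


@dataclass(frozen=True)
class Quantified:
    """exists i:x . CF  or  forall i:x . CF."""
    quantifier: str  # "exists" or "forall"
    peer: str
    variables: tuple
    body: object


def peers_in(cf):
    """Collect the names of all peers a coordination formula refers to."""
    if isinstance(cf, Atom):
        return {cf.peer}
    if isinstance(cf, Binary):
        return peers_in(cf.left) | peers_in(cf.right)
    if isinstance(cf, Quantified):
        return {cf.peer} | peers_in(cf.body)
    raise TypeError(f"not a coordination formula: {cf!r}")


# The Doc/Hospital example, with the local formulas kept as plain text.
example = Quantified(
    "forall", "Doc", ("fn", "ln", "pn", "gender", "pr"),
    Binary(
        "->",
        Atom("Doc", "Patient(1234, fn, ln, pn, gender, pr)"),
        Atom("Hospital",
             "exists hid, n, a . (Patient(hid, 1234, n, gender, a, Davis, pr)"
             " and n = concat(fn, ln))"),
    ),
)
```

A traversal such as `peers_in(example)` shows which peers take part in evaluating a formula, which is the kind of information a peer would need when coordination formulas are used as deductive rules.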
Comparison of Approaches: Logical Relational Model
Data model                     Relational
Mapping language               Coordination formulas: subset of FOL (implication, conjunction, disjunction, universal and existential quantification wrt. different domains)
Query language                 Equal to mapping language
Peer autonomy                  Preserved (recursive local reasoning)
Semantics of query answering   Local reasoning (satisfiability of coordination formulas)
Query optimization             unknown
Other                          Update propagation (using coordination formulas)
Comparison of Approaches: Humboldt Peers [Rot07]
Data model                     Relational
Mapping language               Extensionally sound GaV: ∀x̄ ∀ȳ (φ_S(x̄, ȳ) → ∃z̄ g(x̄, z̄))
                               Extensionally sound LaV: ∀x̄ ∀ȳ (s(x̄, ȳ) → ∃z̄ φ_G(x̄, z̄))
Query language                 CQ with semi-interval selections
Peer autonomy                  Highly preserved
Semantics of query answering   Exchange of certain answers
Query optimization             Completeness-driven pruning, limitation of resource consumption
Other                          Cardinality estimation based on query feedback