
Mapping Data in Peer-to-Peer Systems: Semantics and Algorithmic Issues
Anastasios Kementsietsidis, Marcelo Arenas, Renée J. Miller
ACM SIGMOD International Conference on Management of Data, 2003
Rolando Blanco, CS856, Winter 2005

Overview
• Data Sharing in P2P systems
• Mapping table approach
• Conclusions/Discussion

Data Sharing in P2P
• Between autonomous structured data sources
• Data sources may use different schemas
• Sources may not be willing to share schema
• Data and schemas overlap or are related
Different schemas ⇒ semantic issues!

Example [Bernstein02]
Peer1: Toronto General Hospital (TGHDB)
  Patients (TGH#, OHIP#, Name, FamilyDr, Sex, Age, …)
  Treatments (TreatID, TGH#, Date, TreatDesc, PhysID)
Peer2: Dr Davis, Family Dr (DavisDB)
  Patients (OHIP#, FName, LName, Phone#, Sex, …)
  Events (OHIP#, Date, Description)
• Patient visits hospital ⇒ load data from DavisDB
• Patient receives treatment ⇒ update Events at DavisDB
• A pharmacist DB may update the Events relation at DavisDB as well
How to implement data sharing? Note the global key OHIP# and the similarities between attribute names.
[Bernstein02] Bernstein et al., “Data management for peer-to-peer computing: A vision”. Workshop on the Web and Databases, WebDB 2002.

Data Sharing
• Traditional approach: mediated schemas (“semantic tree”)
  – global-as-view
  – local-as-view
  [Diagram: a Mediated Schema above TGHDB and DavisDB]
• P2P: schema mappings
  [Diagram: TGHDB, DavisDB and ClinicDB (Victoria Walking Clinic); TGHDB and DavisDB are linked by map(TGHDB) and map(DavisDB), DavisDB and ClinicDB by map(DavisDB) and map(ClinicDB)]
  The graph of interconnected schemas forms a semantic network/topology.
• Variations [Tatarinov03]: mediating peers
  [Diagram: a mediating peer between TGHDB and DavisDB, labelled with the TGHDB schema and the DavisDB schema, and a mediating peer between DavisDB and ClinicDB, labelled with the DavisDB schema and the ClinicDB schema]
[Tatarinov03] Igor Tatarinov et al., “The Piazza Peer Data Management System”. ACM SIGMOD Record, Volume 32, Issue 3 (September 2003).

Data Sharing
More variations [Löser03]: super-peers store schema mappings between super-peers, and between super-peers and regular neighbour peers.
[Löser03] Alexander Löser et al., “Information Integration in Schema-Based Peer-To-Peer Networks”. 15th Conference on Advanced Information Systems Engineering (CAiSE'03).

“… The true novelty lies in the PDMS ability to exploit transitive relationships among peers’ schemas …” [Halevy04]
[Figure: “From” and “To” panels]
[Halevy04] Alon Halevy et al., “Schema Mediation for Large-Scale Semantic Data Sharing”. VLDB Journal, 2004.

How to create schema mappings
• Machine learning techniques: GLUE [Doan03]
  – Correspondences between taxonomies
  – “Similarity” between concepts based on probability distributions
• Gossiping [Aberer03]:
  – Propagation of queries toward nodes for which no direct mapping exists (“semantic gossiping”)
  – Analyse results and create/adjust mappings
  – Goal: incremental development of global agreement (semantics == form of agreement)
• On the fly (PeerDB [Ng03]):
  – No shared/distributed schema
  – Attributes have associated words (e.g. desc → description, characteristics, features, functions)
  – Selection of candidate relations using IR techniques (flooding + TTL)
  – User confirms selections, system remembers
• Don’t query, subscribe!
[Aberer03] Karl Aberer et al., “The Chatty Web: Emergent Semantics Through Gossiping”. Proceedings International WWW Conference, 2003.
[Doan03] AnHai Doan et al., “Learning to Match Ontologies on the Semantic Web”. VLDB Journal, vol. 12, no. 4, 2003.
[Ng03] Wee Siong Ng et al., “PeerDB: A P2P-based System for Distributed Data Sharing”. 19th International Conference on Data Engineering, 2003.
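The on-the-fly matching idea can be illustrated with a minimal sketch. It assumes each attribute advertises a set of descriptive words and that candidate relations are scored by keyword overlap (Jaccard similarity). The relation names, keyword sets, threshold, and the `rank_candidates` helper are hypothetical; the ranking PeerDB actually uses may differ.

```python
def jaccard(a, b):
    """Jaccard similarity between two keyword sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_candidates(query_keywords, remote_relations, threshold=0.2):
    """Rank remote relations by their best keyword overlap with the query.

    remote_relations maps a relation name to {attribute: keyword set}.
    Returns (relation, score) pairs with score >= threshold, best first.
    """
    ranked = []
    for relation, attributes in remote_relations.items():
        score = max(
            (jaccard(query_keywords, words) for words in attributes.values()),
            default=0.0,
        )
        if score >= threshold:
            ranked.append((relation, score))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical metadata advertised by neighbouring peers.
remote = {
    "ProdGroups": {"ProdGroupDesc": {"description", "features", "functions"}},
    "Invoices": {"Total": {"amount", "price"}},
}
# Words associated with the local attribute "desc" (from the slide).
query = {"description", "characteristics", "features", "functions"}
candidates = rank_candidates(query, remote)
```

In a full system the user would then confirm or reject each candidate, and the confirmed matches would be remembered for later queries.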

Schema Mappings - Interesting Problems
• Schema composition
• Minimal composition
• Semantical redundancy
• Semantical partition

Are schema mappings enough?
Peer1: ABC Rentals (ABC)
  ProdClasses (ProdClassID, ProdClassDesc, …)
Peer2: The Rental Store (TRS)
  ProdGroups (ProdGroupID, ProdGroupDesc, …)
A customer of ABC Rentals wants to rent a product; ABC Rentals subrents from TRS if none is available.
Schema mapping:
  ABC.ProdClassID ≅ TRS.ProdGroupID
  ABC.ProdClassDesc ≅ TRS.ProdGroupDesc
ABC’s ProdClasses:
  C001  “Air Compressors 2-4 CFM”
  C002  “Air Compressors 5-7 CFM”
  C003  “Air Compressors 8-10 CFM”
TRS’s ProdGroups:
  A001-31  “Air Comp. 2-6 CFM”
  A001-32  “Air Comp. 7-10 CFM”
• Unless there is a global ID, different IDs imply different “meaning”
• Query: customer wants an air compressor of at least 5 CFM
• Assume no “capacity” column. This is a real-world example.
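A small sketch of why the schema mapping alone falls short here. It assumes a query is a simple equality selection and that applying a schema mapping means renaming attributes; `translate_selection` and `select` are hypothetical helpers, not part of any system described in the slides.

```python
# Schema mapping from the slide: it relates attribute names only.
schema_mapping = {
    "ProdClassID": "ProdGroupID",
    "ProdClassDesc": "ProdGroupDesc",
}

# TRS's ProdGroups relation, as listed on the slide.
trs_prod_groups = [
    {"ProdGroupID": "A001-31", "ProdGroupDesc": "Air Comp. 2-6 CFM"},
    {"ProdGroupID": "A001-32", "ProdGroupDesc": "Air Comp. 7-10 CFM"},
]


def translate_selection(selection, mapping):
    """Rename the attributes of an equality selection; values stay untouched."""
    return {mapping[attr]: value for attr, value in selection.items()}


def select(rows, selection):
    """Return the rows that agree with every attribute/value pair in selection."""
    return [
        row for row in rows
        if all(row[attr] == value for attr, value in selection.items())
    ]


# ABC asks for its own class C002 ("Air Compressors 5-7 CFM").
abc_query = {"ProdClassID": "C002"}
trs_query = translate_selection(abc_query, schema_mapping)
result = select(trs_prod_groups, trs_query)  # empty: TRS has no ID "C002"
```

The attribute names line up after translation, but the value "C002" means nothing at TRS, so the translated query returns no rows.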

Data Mappings
ABC’s ProdClasses:
  C001  “Air Compressors 2-4 CFM”
  C002  “Air Compressors 5-7 CFM”
  C003  “Air Compressors 8-10 CFM”
TRS’s ProdGroups:
  A001-31  “Air Comp. 2-6 CFM”
  A001-32  “Air Comp. 7-10 CFM”
Mapping table:
  ProdClassID  ProdGroupID
  C001         A001-31
  C002         A001-32
  C003         A001-32
• Represent knowledge, created/maintained by experts
• Semantically “richer”/more specific than schema mappings (but complementary)
• Note mapping is unidirectional (schema mapping is typically bi-directional)
• But still transitivity!
• Peer network logically defined by mappings among peers
• The way data sharing is done today in many applications
• Goals (paper’s):
  (1) Specification of different semantics for data mappings
  (2) Inference/validation of new data mappings
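A minimal sketch of how such a mapping table could be used to translate values from ABC to TRS. It assumes the table holds constants only and is stored as a list of pairs; `build_index` and `translate_values` are hypothetical helper names.

```python
from collections import defaultdict

# Mapping table from the slide: ABC.ProdClassID -> TRS.ProdGroupID.
ABC_TO_TRS = [
    ("C001", "A001-31"),
    ("C002", "A001-32"),
    ("C003", "A001-32"),
]


def build_index(table):
    """Index a mapping table by its source value (one source may map to many targets)."""
    index = defaultdict(set)
    for source, target in table:
        index[source].add(target)
    return index


def translate_values(values, table):
    """Translate a set of source values into the set of associated target values."""
    index = build_index(table)
    translated = set()
    for value in values:
        translated |= index.get(value, set())
    return translated


# An ABC expert picks the classes that cover "at least 5 CFM".
abc_classes = {"C002", "C003"}
trs_groups = translate_values(abc_classes, ABC_TO_TRS)
```

The lookup goes in one direction only, which mirrors the note that a data mapping is unidirectional.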

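Definitions
Mapping table $MP_{A \to B}$: given tables $A(a_1, a_2, \dots, a_n)$ and $B(b_1, b_2, \dots, b_m)$, and $MP_{A \to B}(c_1, \dots, c_i, c_{i+1}, \dots, c_j)$ with $\{c_1, \dots, c_i\} \subseteq \{a_1, \dots, a_n\}$ and $\{c_{i+1}, \dots, c_j\} \subseteq \{b_1, \dots, b_m\}$, $MP_{A \to B}$ is a mapping table from $A$ to $B$ if, for every $t \in MP_{A \to B}$, $t[c_k]$ is one of:
• a value in $dom(a_l)$,
• a variable $v$, or
• $v - S$, where $S$ is a subset of $dom(a_l)$
(assuming $c_k$ corresponds to $a_l$).
Restriction: $v$ can appear one or more times, but in one and only one tuple of $MP_{A \to B}$.
Is this definition sound? Assume $v$ can take values in $dom(a_l)$.
[Diagram: relational-algebra expression trees for the containment $MP_{A \to B} \subseteq \pi_{\{c_1, \dots, c_j\}}(\dots)$. The tree for a plain variable $v$ is a union of $\sigma_{c_k <> v}(MP_{A \to B})$ with a projection $\pi_{\{c_1, \dots, c_{k-1}, c_k, c_{k+1}, \dots, c_j\}}$ over a cross product involving $\sigma_{c_k = v}(MP_{A \to B})$ and $\pi_{c_k}$. The tree for $v - S$, with $S = \{val_1, val_2, \dots, val_z\}$, adds the selection $\sigma_{a_l <> val_1 \wedge a_l <> val_2 \wedge \dots \wedge a_l <> val_z}$. The leaves are $MP_{A \to B}$, $A$ and $B$.]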
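One way to picture the three kinds of cell values is the sketch below. It assumes a mapping table is a list of rows whose cells are either constants or variables, and it models $v - S$ as a variable with an exclusion set. The `Var` class, both helper functions, and the third example row are hypothetical illustrations, not the paper's notation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A variable cell; `excluded` models "v - S" (empty means plain v)."""
    name: str
    excluded: frozenset = frozenset()


def row_matches(row, values):
    """True if the concrete `values` are an instance of mapping-table `row`."""
    if len(row) != len(values):
        return False
    bindings = {}
    for cell, value in zip(row, values):
        if isinstance(cell, Var):
            if value in cell.excluded:
                return False
            # A variable repeated within a row must take the same value.
            if bindings.setdefault(cell.name, value) != value:
                return False
        elif cell != value:
            return False
    return True


def check_variable_restriction(table):
    """Each variable may occur several times, but in one row only."""
    first_row = {}
    for i, row in enumerate(table):
        for cell in row:
            if isinstance(cell, Var) and first_row.setdefault(cell.name, i) != i:
                return False
    return True


table = [
    ("C001", "A001-31"),
    ("C002", "A001-32"),
    # Hypothetical row: any other ID maps to itself (v - {C001, C002}, v).
    (Var("v", frozenset({"C001", "C002"})), Var("v")),
]
```

Here `row_matches(table[2], ("X9", "X9"))` holds because both cells bind `v` to the same value, while `("C001", "C001")` is rejected by the exclusion set.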

More definitions
What about values of $\pi_{\{c_1, \dots, c_i\}}(A)$ that are not in $\pi_{\{c_1, \dots, c_i\}}(MP_{A \to B})$?
• Closed world semantics:
  – such data cannot be associated with any value in $B$
• Open world semantics:
  – such data can be associated with any value in $B$, ≅ $v - \{\pi_{\{c_w\}}(MP_{A \to B})\}$ with $c_w$ an attribute of $B$
  – represents partial knowledge
• Tuple satisfies mapping table: given a mapping $MP_{A \to B}(c_1, \dots, c_i, c_{i+1}, \dots, c_j)$, a tuple $t$ with attributes $\{r_1, \dots, r_w\} \supseteq \{c_1, \dots, c_j\}$ satisfies $MP_{A \to B}$ if $t[c_1, \dots, c_i, c_{i+1}, \dots, c_j] \in MP_{A \to B}$
• Mapping constraint: assume attribute sets $A' = \{c_1, \dots, c_i\}$, $B' = \{c_{i+1}, \dots, c_j\}$ and mapping $MP_{A \to B}(c_1, \dots, c_i, c_{i+1}, \dots, c_j)$. Then $\mu$ is a mapping constraint over $A' \cup B'$, from $A'$ to $B'$ (written $\mu : A' \xrightarrow{MP} B'$). A tuple $t$ with attributes $\supseteq \{c_1, \dots, c_i, c_{i+1}, \dots, c_j\}$ satisfies $\mu$ ($t \models \mu$) if $t[c_1, \dots, c_i, c_{i+1}, \dots, c_j] \in MP_{A \to B}$.
• Relation satisfies mapping constraint: a relation $R$ with attributes $\{r_1, \dots, r_w\} \supseteq \{c_1, \dots, c_j\}$ satisfies $\mu$ ($R \models \mu$) if for every tuple $t$ in $R$, $t \models \mu$
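The difference between the two semantics can be sketched as follows. The sketch assumes a constants-only mapping table over one source and one target attribute, stored as a set of pairs; the `open_world` flag, both function names, and the ID "C004" are hypothetical.

```python
# Mapping table from the earlier slide, as a set of (ProdClassID, ProdGroupID) pairs.
MAPPING_TABLE = {
    ("C001", "A001-31"),
    ("C002", "A001-32"),
    ("C003", "A001-32"),
}


def tuple_satisfies(pair, table, open_world=False):
    """Check one (source, target) pair against a constants-only mapping table.

    Closed world: only pairs listed in the table are allowed.
    Open world: a source value the table never mentions may pair with anything.
    """
    source, _target = pair
    if pair in table:
        return True
    mentioned = any(row_source == source for row_source, _ in table)
    return open_world and not mentioned


def relation_satisfies(relation, table, open_world=False):
    """A relation satisfies the constraint if every one of its tuples does."""
    return all(tuple_satisfies(pair, table, open_world) for pair in relation)


# "C004" is a hypothetical ABC class that the table does not mention.
relation = [("C001", "A001-31"), ("C004", "A001-31")]
closed_ok = relation_satisfies(relation, MAPPING_TABLE)
open_ok = relation_satisfies(relation, MAPPING_TABLE, open_world=True)
```

Under the closed-world reading the unmentioned "C004" cannot be associated with anything, so the relation is rejected; under the open-world reading it is accepted.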

Inference/Consistency Problem
• Inference problem: given a set of formulas $\Sigma$, can $f$ be deduced from $\Sigma$ ($\Sigma \models f$)?
  – Deductive calculus: prove $\neg \exists t : t \models \Sigma \cup \{\neg f\}$ (consistency problem: is there any $t$ that satisfies $\Sigma$?)
  – Note: if you have an algorithm that solves the consistency problem, you can use it to solve the inference problem as well.
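The reduction from inference to consistency can be sketched by brute force. The sketch assumes small finite domains and represents each constraint as a Python predicate over a tuple; this is an illustration of the reduction only, not the algorithm from the paper, and all names are hypothetical.

```python
from itertools import product

TABLE = {("C001", "A001-31"), ("C002", "A001-32"), ("C003", "A001-32")}
DOMAINS = [["C001", "C002", "C003"], ["A001-31", "A001-32"]]


def is_consistent(constraints, domains):
    """True if some tuple over the finite domains satisfies every constraint."""
    return any(
        all(constraint(t) for constraint in constraints)
        for t in product(*domains)
    )


def entails(constraints, f, domains):
    """Sigma |= f  iff  Sigma together with (not f) is inconsistent."""
    negated = lambda t: not f(t)
    return not is_consistent(list(constraints) + [negated], domains)


sigma = [lambda t: t in TABLE]

# f: "C003 is only ever associated with A001-32".
f = lambda t: t[0] != "C003" or t[1] == "A001-32"
# g: "every tuple is associated with A001-32" (C001 is a counterexample).
g = lambda t: t[1] == "A001-32"

f_follows = entails(sigma, f, DOMAINS)
g_follows = entails(sigma, g, DOMAINS)
```

`entails` never inspects `f` directly; it only asks the consistency checker whether a counterexample tuple exists.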

One more definition
• Cover of a set of constraints:
  – Consider a semantic path $P_1, \dots, P_n$ where peer $P_i$ has the set of attributes $A_i$. Assume $\Sigma_i$ is the set of mapping constraints for peer $P_i$ in $P_1, \dots, P_n$. A constraint $\mu : A_1 \xrightarrow{MP'} A_n$ is the cover of a set of constraints $\Sigma$ iff: $\forall \mu' : \Sigma \models \mu'$ iff $ext(\mu) \subseteq ext(\mu')$
  – Argument:
    - If an algorithm can compute the cover $\mu$, then the consistency problem is solved (by testing $\mu <> \emptyset$)
    - To show that a mapping constraint $\mu'$ can be inferred from $\Sigma$ we just need to show $ext(\mu) \subseteq ext(\mu')$
  – Are the arguments valid? What type of things can be shown to be deduced from $\Sigma$?
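A rough sketch of the argument, under strong simplifying assumptions: every hop on the path has exactly one constants-only mapping table, and the composition of those tables is used as a stand-in for the cover. The paper's cover computation is more general. The second table, the third peer's values, and all function names are hypothetical.

```python
from functools import reduce

# First hop (from the slides): ABC.ProdClassID -> TRS.ProdGroupID.
ABC_TO_TRS = {("C001", "A001-31"), ("C002", "A001-32"), ("C003", "A001-32")}
# Second hop (hypothetical): TRS.ProdGroupID -> a third peer's category code.
TRS_TO_XYZ = {("A001-31", "SMALL"), ("A001-32", "LARGE")}


def compose(first, second):
    """Join two constants-only mapping tables on the shared middle attribute."""
    return {(a, c) for a, b in first for b2, c in second if b == b2}


def cover_along_path(tables):
    """Compose the tables of consecutive hops into one end-to-end table."""
    return reduce(compose, tables)


def is_inferable(cover, candidate):
    """mu' follows from Sigma if ext(mu) is contained in ext(mu')."""
    return cover <= candidate


cover = cover_along_path([ABC_TO_TRS, TRS_TO_XYZ])
consistent = len(cover) > 0  # the "mu <> empty" test from the slide

weaker = cover | {("C004", "LARGE")}  # allows everything the cover allows
stricter = {("C001", "SMALL")}        # forbids pairs the cover allows
```

With these sets, `is_inferable(cover, weaker)` holds and `is_inferable(cover, stricter)` does not, which matches the containment test stated in the argument.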
