
iTrails: Pay-as-you-go Information Integration in Dataspaces - PowerPoint PPT Presentation



  1. iTrails: Pay-as-you-go Information Integration in Dataspaces
     Authors: Marcos Antonio Vaz Salles, Jens-Peter Dittrich, Shant Kirakos Karakashian, Olivier René Girard, Lukas Blunschi
     ETH Zurich, 8092 Zurich, Switzerland
     CSE718. Advanced Topics in Database Systems
     Outline: Introduction; Data & Query models (Data model, Query model); iTrails; iTrails query processing (Matching, Transformation, Merging); Multiple trails; Experiments


  4. Querying heterogeneous data sources
     1. Schema-first approach (SFA)
        - Semantically integrated view over a set of data sources
        - Mappings between source schemas and the mediated schema
        - Queries have clearly defined semantics
        - Expensive to construct and maintain
     2. No-schema approach (NSA)
        - Keyword search
        - Requires good result ranking methods
        - Performs no integration
     3. Dataspaces
        - Start with NSA
        - Gradually approach SFA by means of hints (trails)

  5. Dataspaces. Motivation



  11. Motivation. Possible queries
      Query 1: Retrieve all PDF documents that were added or modified yesterday.
      State of the art: select all PDF documents that
        - Email server: are attachments to emails with the attribute received set to yesterday;
        - DBMS: are pointed to by rows whose value of the lastmodified column is set to yesterday;
        - Network file server, laptop: have an attribute lastmodified set to yesterday.
      Goal: provide a method that allows the same query to be specified by typing the keywords pdf yesterday. Exploit hints (trails) to provide partial schema knowledge:
        1. The yesterday keyword is mapped to a query for values of the date attribute equal to yesterday's date.
        2. The date attribute is mapped to the lastmodified attribute.
        3. The date attribute is mapped to the received attribute.
        4. The pdf keyword is mapped to a query for elements whose names end in pdf.
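
The four hints on this slide can be read as rewrite rules applied to a keyword query. The following is a minimal Python sketch of that reading, not the iTrails implementation: the dictionary-based resource views, the sample file names, the fixed date, and every function name (`attribute_equals`, `keyword_trails`, `run_keyword_query`) are assumptions made for illustration. It also assumes that a keyword matches either directly on the name or through its trail.

```python
from datetime import date, timedelta

# Hints 2 and 3: the generic date attribute maps to source-specific attributes.
ATTRIBUTE_TRAILS = {"date": ["lastmodified", "received"]}


def attribute_equals(attr, value):
    """Predicate: attr, or any attribute it is mapped to, equals value."""
    candidates = [attr] + ATTRIBUTE_TRAILS.get(attr, [])
    return lambda view: any(view["tuple"].get(a) == value for a in candidates)


def keyword_trails(today):
    """Hints 1 and 4: keywords map to structured predicates."""
    yesterday = today - timedelta(days=1)
    return {
        "yesterday": attribute_equals("date", yesterday),
        "pdf": lambda view: view["name"].endswith("pdf"),
    }


def matches(keyword, view, trails):
    """Assumed semantics: plain keyword match on the name OR the trail predicate."""
    plain = keyword in view["name"]
    via_trail = keyword in trails and trails[keyword](view)
    return plain or via_trail


def run_keyword_query(query, views, today):
    """Return the names of the views that satisfy every keyword of the query."""
    trails = keyword_trails(today)
    return [v["name"] for v in views
            if all(matches(k, v, trails) for k in query.split())]


# Hypothetical data: one view per kind of source mentioned on the slide.
today = date(2008, 4, 18)
yesterday = today - timedelta(days=1)
views = [
    {"name": "report.pdf", "tuple": {"lastmodified": yesterday}},   # file server
    {"name": "invoice.pdf", "tuple": {"received": yesterday}},      # email attachment
    {"name": "notes.txt", "tuple": {"lastmodified": yesterday}},    # not a PDF
    {"name": "old.pdf", "tuple": {"lastmodified": today - timedelta(days=30)}},
]

result = run_keyword_query("pdf yesterday", views, today)
print(result)  # ['report.pdf', 'invoice.pdf']
```

Keeping the attribute mappings separate from the keyword mappings mirrors the slide: hints 2 and 3 are reused by any query that mentions the date attribute, not only by the yesterday keyword.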


  14. Motivation. Possible queries
      Query 2: Retrieve all information about the current work on project PIM.
      State of the art: issue the following queries to the search engine
        - Email server: //mike/personalIM
        - Laptop: //projects/PIM (but not //papers/PIM)
        - Network file server: //mike/research/PIM
      Goal: provide a method of specifying the query by typing //projects/PIM, using the following hints:
        1. Queries for the path //projects/PIM should also consider the path //mike/research/PIM.
        2. Queries for the path //projects/PIM should also consider the path //mike/personalIM.
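
The two path hints can be sketched as a lookup that widens a path query into a union of paths. This is an illustrative Python sketch only: `PATH_TRAILS` and `expand_path_query` are hypothetical names, and paths are treated as plain strings rather than parsed queries.

```python
# Hypothetical path trails: a queried path -> additional paths to consider.
PATH_TRAILS = {
    "//projects/PIM": ["//mike/research/PIM", "//mike/personalIM"],
}


def expand_path_query(path):
    """Return the original path followed by every path added by its trails."""
    return [path] + PATH_TRAILS.get(path, [])


expanded = expand_path_query("//projects/PIM")
print(expanded)
# ['//projects/PIM', '//mike/research/PIM', '//mike/personalIM']

# No trail mentions //papers/PIM, so it is neither expanded nor pulled in.
print(expand_path_query("//papers/PIM"))
# ['//papers/PIM']
```

The original path stays in the result, which matches the wording "should also consider": a trail adds alternatives instead of replacing what the user typed.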

  15. Data model
      Definition: all data is represented by a logical graph G = (RV, E).
        - RV is the set of nodes {V_1, ..., V_n}, each of which is termed a resource view.
        - E is a sequence of ordered pairs (V_i, V_j) of resource views representing directed edges from V_i to V_j.
        - V_i ⇝ V_j denotes that V_j is reachable from V_i by traversing the edges in E.
        - A resource view V_i has three components: name, tuple, and content.
      Components of V_i:
        - V_i.name: name (string) of the resource view
        - V_i.tuple: set of attribute-value pairs (⟨att_0, value_0⟩, ⟨att_1, value_1⟩, ...)
        - V_i.content: finite byte sequence of content (e.g. text)
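
One way to make the definition concrete is the small Python sketch below. It is an assumption-laden illustration and not the authors' code: the class names `ResourceView` and `Graph`, the identifiers `V1` to `V3`, and the sample edges are all hypothetical. Reachability is taken to mean "one or more edges".

```python
from dataclasses import dataclass, field


@dataclass
class ResourceView:
    name: str                                   # V_i.name
    tuple: dict = field(default_factory=dict)   # V_i.tuple: attribute -> value
    content: bytes = b""                        # V_i.content


@dataclass
class Graph:
    views: dict = field(default_factory=dict)   # RV, keyed by a view identifier
    edges: list = field(default_factory=list)   # E: ordered pairs (id_i, id_j)

    def reachable(self, src, dst):
        """True if dst can be reached from src by following one or more edges."""
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            for a, b in self.edges:
                if a != node or b in seen:
                    continue
                if b == dst:
                    return True
                seen.add(b)
                stack.append(b)
        return False


# Hypothetical three-node graph: V1 -> V2 -> V3.
g = Graph(
    views={
        "V1": ResourceView("home", {"owner": "root"}),
        "V2": ResourceView("mike", {"owner": "root"}),
        "V3": ResourceView("notes.txt", {"owner": "mike"}, b"some text"),
    },
    edges=[("V1", "V2"), ("V2", "V3")],
)

print(g.reachable("V1", "V3"))  # True
print(g.reachable("V3", "V1"))  # False
```

Edges are stored as a list of pairs to stay close to the definition of E as a sequence; an adjacency map would be the usual choice for larger graphs.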

  16. Example
      X1 = { .name = 'home',
             .tuple = { .owner = 'root', .lastmodified = '05.01.2000' },
             .content = '' }
      X2 = { .name = 'mike',
             .tuple = { .owner = 'root', .lastmodified = '04.17.2008' },
             .content = '' }
      ...
      X5 = { .name = 'SIGMOD42.pdf',
             .tuple = { size = 10k, .owner = 'mike', .lastmodified = '04.01.2007' },
             .content = '@PDF...' }
      ...

  17. Query model
      Query expression: a query expression Q selects a subset of nodes R := Q(G) ⊆ G.RV.
        Example: //mike/papers
      Component projection: a component projection C ∈ {.name, .tuple.⟨att_i⟩, .content} obtains a projection of the set of resource views selected by a query expression Q, i.e. a set of components R′ := {V_i.C | V_i ∈ Q(G)}.
        Example: //mike//PIM/*.tuple.lastmodified
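
The two definitions can be sketched in Python as a path evaluator plus a projection step. The slide does not spell out the path semantics, so this sketch assumes an XPath-like reading: `//x` selects descendants named x, `/x` selects direct children named x, `*` matches any name, and the first step may match any view in the graph. The sample graph, its identifiers, and the helper names are hypothetical, and the projection is passed as an argument rather than parsed from the query string.

```python
import re


def parse(path):
    """Split a path such as //mike//PIM/* into (axis, name) steps."""
    return re.findall(r"(//|/)([^/]+)", path)


def children(edges, node):
    return {b for a, b in edges if a == node}


def descendants(edges, node):
    """All nodes reachable from node via one or more edges."""
    seen, stack = set(), [node]
    while stack:
        for child in children(edges, stack.pop()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def evaluate(path, views, edges):
    """Query expression Q: return the set of selected view ids, R = Q(G)."""
    current = None
    for axis, name in parse(path):
        if current is None:
            candidates = set(views)  # assumption: the first step matches anywhere
        elif axis == "//":
            candidates = set().union(*(descendants(edges, n) for n in current))
        else:
            candidates = set().union(*(children(edges, n) for n in current))
        current = {n for n in candidates
                   if name == "*" or views[n]["name"] == name}
    return current or set()


def project(result, views, component, attribute=None):
    """Component projection C: R' = {V_i.C | V_i in Q(G)}."""
    if component == "tuple":
        return {views[n]["tuple"].get(attribute) for n in result}
    return {views[n][component] for n in result}


# Hypothetical graph: home -> mike -> {papers, research -> PIM -> draft.tex}
views = {
    "X1": {"name": "home", "tuple": {}, "content": ""},
    "X2": {"name": "mike", "tuple": {}, "content": ""},
    "X3": {"name": "papers", "tuple": {}, "content": ""},
    "X4": {"name": "research", "tuple": {}, "content": ""},
    "X5": {"name": "PIM", "tuple": {}, "content": ""},
    "X6": {"name": "draft.tex", "tuple": {"lastmodified": "04.17.2008"}, "content": "..."},
}
edges = [("X1", "X2"), ("X2", "X3"), ("X2", "X4"), ("X4", "X5"), ("X5", "X6")]

print(evaluate("//mike/papers", views, edges))  # {'X3'}
selected = evaluate("//mike//PIM/*", views, edges)
print(project(selected, views, "tuple", "lastmodified"))  # {'04.17.2008'}
```

Passing the projection separately avoids guessing where a path ends and a component begins when names themselves contain dots, as file names usually do.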
