Provenance Information in the Web of Data
Olaf Hartig
Humboldt-Universität zu Berlin
http://olafhartig.de/foaf.rdf#olaf
● Provenance of a data item: information about the history Olaf Hartig - Provenance Information in the Web of Data 2
Outline
● Towards a model of Web data provenance
● Provenance information in the Web of data today
● Upcoming tasks
Existing Provenance Research
● Main research areas: (scientific) workflows, DBMSs
● General focus: data creation
Web data provenance comprises two dimensions:
● Data Creation
● Data Access
Basics of the Provenance Model
● Provenance graph describes provenance of a data item
● Nodes: provenance elements – pieces of provenance info
● Edges: relate provenance elements to each other
● Subgraphs for related data items possible
Basics of the Provenance Model
● Provenance model defines:
    ● Types of provenance elements
    ● Relationships
● High level of abstraction (only main element types)
Basics of the Provenance Model
● General differentiation:
    ● Actors
    ● Executions
    ● Artifacts
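The model basics above (a graph whose nodes are provenance elements of kind actor, execution, or artifact, connected by relationship edges) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the example elements, and the edge labels "performed" and "used" are hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    """The three general kinds of provenance elements named on the slide."""
    ACTOR = "actor"
    EXECUTION = "execution"
    ARTIFACT = "artifact"


@dataclass(frozen=True)
class Element:
    """A node of the graph: one piece of provenance information."""
    name: str
    kind: Kind


@dataclass
class ProvenanceGraph:
    """Provenance of one data item: elements (nodes) plus labeled edges."""
    data_item: str
    elements: set = field(default_factory=set)
    edges: list = field(default_factory=list)  # (source, label, target) triples

    def relate(self, source: Element, label: str, target: Element) -> None:
        """Register both elements and add a labeled edge between them."""
        self.elements.update((source, target))
        self.edges.append((source, label, target))

    def of_kind(self, kind: Kind) -> list:
        """Return the sorted names of all elements of the given kind."""
        return sorted(e.name for e in self.elements if e.kind is kind)


# Hypothetical example; names and edge labels are illustrative only.
graph = ProvenanceGraph(data_item="http://example.org/data/item1")
creator = Element("Alice", Kind.ACTOR)
creation = Element("creation of item1", Kind.EXECUTION)
source = Element("source dataset", Kind.ARTIFACT)
graph.relate(creator, "performed", creation)
graph.relate(creation, "used", source)

print(graph.of_kind(Kind.ACTOR))
print(len(graph.edges))
```

Keeping the element kind as an attribute of each node, rather than as separate node classes, mirrors the high level of abstraction the slides describe: finer-grained element types could be added later without changing the graph structure.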
Data Access Dimension
[Diagram. Elements: Data Item, Data Accessor (Non-Human), Information Resource, Access Time, Data Access, Data Providing Service (Non-Human), Service Provider, Data Publisher (Human). Labels: contains, controls, uses, Relation to the provided Information Resource.]
Data Access Dimension cont.
[Diagram. Elements: Public Key, Signer, Data Integrity Assurance, Digital Signature, Verification Result, (Signed) Artifact. Labels: owns, signs, Relation to the signed Artifact.]
Data Creation Dimension
[Diagram. Elements: Provenance Information, Source Data, Creation Time, Creation Guidelines, Data Creator (Human or Non-human), Data Creation, Data Creating Device (e.g. Sensor), Data Creating Service (e.g. Software Agent), Data Creating Entity (e.g. Person, Group, Organization), Data Item, (Encompassing) Data Item. Labels: {complete,disjoint}, part of, responsible for, Relation to the created Data.]
Provenance information in the Web of data today
Provenance-related Vocabularies
● DC – Dublin Core Metadata Terms
● FOAF – Friend of a Friend
● SIOC – Semantically-Interlinked Online Communities
● SWP – Semantic Web Publishing vocabulary
● WOT – Web of Trust schema
● OMV – Ontology Metadata Vocabulary
● PML – Proof Markup Language
● Changeset vocabulary
● Ouzo Provenance Ontology
Main Issues Today
● Vocabularies:
    ● Partly unsuitable
    ● Lack of certain features
    ● Coverage of provenance model impossible
Provenance-related Vocabularies
DC – Dublin Core Metadata Terms

Property          Occurrences*
dc:creator        about 24,284
dc:contributor    476
dc:source         about 3,631
dc:created        about 82,720
dc:modified       about 12,020
dc:provenance     7

*Measured by querying Sindice on Feb. 7, 2009 (at that time Sindice indexed about 48.99 million documents)
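To make the Dublin Core properties counted above concrete, here is a minimal sketch of what publishing such metadata could look like. It writes three statements (creator, created, source) as N-Triples using only the Python standard library. The dataset, creator, and source IRIs and the date are hypothetical placeholders, and the helper names are made up for this illustration; only the Dublin Core terms namespace is real.

```python
# Namespace of the DCMI Metadata Terms (dc:creator, dc:created, dc:source, ...).
DCTERMS = "http://purl.org/dc/terms/"


def iri(value: str) -> str:
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value: str) -> str:
    """Render a plain string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def describe(dataset: str, creator: str, created: str, source: str) -> list:
    """Return N-Triples lines stating who created a dataset, when, and from what."""
    subject = iri(dataset)
    return [
        f"{subject} {iri(DCTERMS + 'creator')} {iri(creator)} .",
        f"{subject} {iri(DCTERMS + 'created')} {literal(created)} .",
        f"{subject} {iri(DCTERMS + 'source')} {iri(source)} .",
    ]


# Hypothetical values, for illustration only.
triples = describe(
    dataset="http://example.org/dataset",
    creator="http://example.org/people#alice",
    created="2009-02-07",
    source="http://example.org/original-dataset",
)
print("\n".join(triples))
```

Statements like these cover only a small part of the data creation dimension; nothing here describes data access, which is one reason the slides call the existing vocabularies partly unsuitable.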
Main Issues Today
● Vocabularies:
    ● Partly unsuitable
    ● Lack of certain features
    ● Coverage of provenance model impossible
● General lack of provenance-related metadata on the Web of data
Possible Reasons
● Lack of suitable vocabularies
● Lack of usable tools
● Ignorance / lack of sensitization
Upcoming tasks
Address the Issues
● Let's develop a vocabulary for Web data provenance
    ● Proposal: refine the presented provenance model
    ● Integrate existing vocabularies for specific types of provenance elements
● Let's develop usable tools for data providers
    ● Edit and publish provenance-related metadata
    ● Automatic generation if possible
● Let's raise awareness of data providers
    ● Probably the hardest task
    ● Maybe voiD can help
Thank you!
Olaf Hartig
Humboldt-Universität zu Berlin
http://olafhartig.de/foaf.rdf#olaf
These slides have been created by Olaf Hartig
http://olafhartig.de
This work is licensed under a Creative Commons Attribution-Share Alike 3.0 License ( http://creativecommons.org/licenses/by-sa/3.0/ )
Attribution:
● http://www.flickr.com/photos/adrenalin/3032734/
● http://www.hasslefreeclipart.com
● http://www.flickr.com/photos/dullhunk/428079229/
● http://www.flickr.com/photos/darwinbell/1337963794/
● http://www.flickr.com/photos/alandd/2780700767/
● http://www.flickr.com/photos/simeon_barkas/2872099696/
● http://www.flickr.com/photos/robinh00d/122544491/
● http://www.flickr.com/photos/adrenalin/3032747/