
Approaches Towards Unified Models for Integrating Web Knowledge Bases



  1. Approaches Towards Unified Models for Integrating Web Knowledge Bases Maria Koutraki Joint work with: Nicoleta Preda, Dan Vodislav Paris, 26/10/2016


  3. Motivation – Unstructured Data

  4. Motivation – Unstructured Data
     • Text representation
     • Lack of structure
     • No entity resolution
     • No entity disambiguation

  5. Motivation – Structured Data
     What is structured data?
     • RDF – Resource Description Framework
     • W3C standard for describing web resources
     • Triple = statement of the form (subject, property, object)
     Example triples: (Rodin, type, Artist), (Rodin, notableWork, The Thinker), (The Thinker, type, Sculpture), (Rodin, influences, Artist1)
     [Figure: the triples drawn as a graph; an interestedIn edge to Sculpture also appears.]
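The triple model on this slide can be made concrete with a minimal Python sketch. It stores the four triples that are legible on the slide as plain tuples and matches them against a pattern with wildcards. The `Triple` type and the `match` helper are hypothetical names introduced for illustration, not part of any RDF library.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    prop: str
    obj: str


# The four triples that are legible on the slide.
TRIPLES = [
    Triple("Rodin", "type", "Artist"),
    Triple("Rodin", "notableWork", "The Thinker"),
    Triple("The Thinker", "type", "Sculpture"),
    Triple("Rodin", "influences", "Artist1"),
]


def match(triples, subject: Optional[str] = None,
          prop: Optional[str] = None, obj: Optional[str] = None):
    """Return the triples matching the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (prop is None or t.prop == prop)
        and (obj is None or t.obj == obj)
    ]


# Everything stated about Rodin, and every resource typed as Sculpture.
rodin_facts = match(TRIPLES, subject="Rodin")
sculptures = [t.subject for t in match(TRIPLES, prop="type", obj="Sculpture")]
```

A real deployment would more likely use an RDF store and SPARQL; the tuple list only illustrates the (subject, property, object) shape.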

  6. Motivation – Structured Data
     Linked Open Data Cloud domains (topic: share):
     • Government: 18.05%
     • Publications: 9.47%
     • Life Sciences: 8.19%
     • User-generated content: 4.73%
     • Cross-domain: 4.04%
     • Media: 2.17%
     • Geographic: 2.07%
     • Social Web: 51.28%
     [Chart: axis from 0 to 1800 in steps of 300]
     • Exponential increase of datasets and triples
     • > 30 billion triples
     • Automatically constructed KBs

  7. Motivation – Structured Data
     [Figure: graph fragments labeled DBpedia and Museum_Rodin. Node and edge labels: bronze, sculpturer, style, type, Artist1, Artist2, influences, date, 1902, createdBy, born, 1840.]

  8. Motivation – Structured Data
     Complementary
     [Figure: graph fragments labeled DBpedia and Museum_Rodin. Node and edge labels: bronze, sculpturer, style, type, Artist1, Artist2, influences, date, 1902, createdBy, born, 1840.]

  9. Motivation – Structured Data
     [Figure: graph fragments labeled Freebase and Museum_Rodin. Node and edge labels: bronze, sculpturer, style, type, Artist_3, Artist_4, Artist_5, Artist_6, mentor, date, 1902, createdBy.]


  11. Motivation – Structured Data
      Diverse schemas for representation in LOD
      • ~576 schemas/vocabularies used for representation
      • Diverse quality of schemas [1]
      • Duplicate representation of similar concepts/classes and relations
      • Lack of explicit alignment between classes/relations (with only up to 2%) [2]
      [1] Aimilia Magkanaraki, Sofia Alexaki, Vassilis Christophides, Dimitris Plexousakis: Benchmarking RDF Schemas for the Semantic Web. International Semantic Web Conference 2002: 132-146
      [2] Max Schmachtenberg, Christian Bizer, Heiko Paulheim: Adoption of the Linked Data Best Practices in Different Topical Domains. International Semantic Web Conference (1) 2014: 245-260

  12. Motivation – Web services

  13. Motivation – Web services
      [Figure: graph fragment labeled Museum_Rodin. Node and edge labels: bronze, style, date, 1902, createdBy.]

  14. Motivation – Web services
      [Figure: graph fragments labeled DBpedia and Museum_Rodin, with an owl:sameAs link between their two bronze nodes. Other labels: style, sculpture, contains, date, 1902, createdBy.]

  15. Motivation – Web services
      [Figure: graph fragments labeled DBpedia and Museum_Rodin, with an owl:sameAs link between their two bronze nodes. Other labels: style, sculpture, contains, date, 1902, createdBy.]
      Web service call MuseumExhibitions(Paris) returns:
      <exhibitions> <museum>Louvre</museum> <museum>Rodin</museum> </exhibitions>

  16. Motivation – Web services
      [Figure: graph fragments labeled DBpedia and Museum_Rodin. Labels: bronze, bronze, style, sculpture, contains, date, 1902, createdBy.]
      Web service call MuseumExhibitions(Paris) returns:
      <exhibitions> <museum>Louvre</museum> <museum>Rodin</museum> </exhibitions>

  17. Motivation – Web services
      More than 12000 APIs* from various domains:
      • Search (3200 APIs)
      • Social (3000 APIs)
      • Traveling (1200 APIs)
      • Music (1000 APIs)
      • Financial (1200 APIs), Science (600 APIs), Weather (300 APIs)
      *Source: ProgrammableWeb.com

  18. Context & Objectives
      ¤ PART I – DORIS: Deriving Intensional Descriptions for Web Services
        [Diagram labels: Knowledge Base, DORIS, Web Service]
      ¤ PART II – SOFYA: Online Relation Alignment on Linked Datasets
        [Diagram labels: Knowledge Base, SPARQL endpoint, SOFYA, SPARQL endpoint, Knowledge Base]

  19. Part I: Deriving Intensional Descriptions for Web Services [CIKM’15, ISWC’15, BDA’15]

  20. Web Services
      What is a Web service?
      ¤ Way of publishing/exporting data
      ¤ A Web service (WS) is a function
      ¤ Consider WSs implementing REST: interfaces to data sources
      ¤ Call a WS:
        ¤ URL address of WS
        ¤ Input value
      Example: “get artworks by artist name”, exported by DORIS_museums
      ¤ Call for input “Rodin”: http://doris_museums.com?artist=Rodin
      ¤ Output: XML document
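The call pattern on this slide (URL address plus input value, XML document back) can be sketched in Python as follows. The URL and the `artist` parameter come from the slide. The XML sample is an assumption pieced together from node labels on later slides (root, item, t, d, a, n, b), and `build_call_url` and `parse_result` are hypothetical helpers. No network request is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BASE_URL = "http://doris_museums.com"  # example address from the slide


def build_call_url(artist: str) -> str:
    """Build the REST call URL for one input value."""
    return f"{BASE_URL}?{urlencode({'artist': artist})}"


# Assumed shape of a call result; tag meanings (t = title, d = date,
# a = artist, n = name, b = birthdate) are a guess from the slide labels.
SAMPLE_RESULT = (
    "<root>"
    "<item><t>The Thinker</t><d>1902</d><a><n>Rodin</n><b>1840</b></a></item>"
    "<item><t>The Kiss</t><d>1889</d><a><n>Rodin</n><b>1840</b></a></item>"
    "</root>"
)


def parse_result(xml_text: str):
    """Extract (title, date) pairs from the XML call result."""
    root = ET.fromstring(xml_text)
    return [(item.findtext("t"), item.findtext("d")) for item in root.findall("item")]


url = build_call_url("Rodin")
artworks = parse_result(SAMPLE_RESULT)
```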

  21. Objective
      Uniform access to Web services!
      Local-as-view approach:
      • We consider as target source a given Knowledge Base (RDF)
      • Infer a mapping function (transform XML call results → RDF)
      • Infer a description (parameterized query over the target KB)
      [Diagram labels: Knowledge Base, Web Service, Web Services]

  22. Mapping function (σ)
      Web service: “get artworks by artist”
      [Figure: σ maps the WS call result (XML) R: getArtWorksByArtist(Rodin) to a KB fragment (RDF) σ(R).
       XML side: a root node with two item nodes; node labels t, d, a, n, b; leaf values The Thinker, The Kiss, 1902, 1889, 1840, Rodin.
       RDF side: nodes URI1 to URI5; properties name, birthdate, works, shownAt, date; literals Rodin, 1840, The Thinker, 1902, The Kiss, 1889.]
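A mapping function of the kind pictured on this slide can be sketched as a Python function from an XML call result to a set of RDF-style triples. This is a hand-written illustration, not the mapping DORIS infers. The XML shape, the meaning of the tags, the direction of the `works` edge and the `ex:` URIs are all assumptions; the slide's `shownAt` edges are left out because the sample XML carries no museum data.

```python
import xml.etree.ElementTree as ET

# Assumed call result for getArtWorksByArtist(Rodin), rebuilt from the slide labels.
CALL_RESULT = (
    "<root>"
    "<item><t>The Thinker</t><d>1902</d><a><n>Rodin</n><b>1840</b></a></item>"
    "<item><t>The Kiss</t><d>1889</d><a><n>Rodin</n><b>1840</b></a></item>"
    "</root>"
)


def sigma(xml_text: str) -> set:
    """Transform an XML call result into (subject, property, object) triples."""
    triples = set()
    for item in ET.fromstring(xml_text).findall("item"):
        title = item.findtext("t")
        artist_name = item.findtext("a/n")
        # Hypothetical URI scheme: one resource per artwork and per artist.
        work = "ex:work/" + title.replace(" ", "_")
        artist = "ex:artist/" + artist_name.replace(" ", "_")
        triples.add((artist, "name", artist_name))
        triples.add((artist, "birthdate", item.findtext("a/b")))
        triples.add((artist, "works", work))
        triples.add((work, "name", title))
        triples.add((work, "date", item.findtext("d")))
    return triples


kb_fragment = sigma(CALL_RESULT)
```

Because the output is a set, the artist's name and birthdate appear once even though the XML repeats them under every item.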

  23. Parameterized Query
      Schema of the parameterized query: the KB schema
      [Figure: σ(getArtworksByArtist(Rodin)) as a KB fragment. Nodes URI1 to URI5; properties name, birthdate, works, shownAt, date; literals Rodin, 1840, The Thinker, 1902, The Kiss, 1889.]

  24. Parameterized Query
      Schema of the parameterized query: the KB schema
      [Figure: the KB fragment σ(getArtworksByArtist(Rodin)) next to its generalization σ(getArtworksByArtist(?IO)), where the URIs and literals are replaced by the variables ?x, ?y, ?z, ?l1, ?l3, ?l4 and the input by ?IO. Properties: name, birthdate, works, shownAt, date.]
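The step from a concrete fragment to a parameterized query can be illustrated with a small Python sketch: a list of triple patterns with variables, evaluated over a toy KB once the input variable ?IO is bound. The patterns and the toy KB are assumptions. They reuse the slide's property names and URIs, but the exact edges between URIs are not recoverable from the transcript. `evaluate` is a naive backtracking matcher written for this example, not a SPARQL engine.

```python
# Toy KB: URI and property names follow the slide; the edge layout is assumed.
KB = [
    ("URI1", "name", "Rodin"),
    ("URI1", "birthdate", "1840"),
    ("URI1", "works", "URI3"),
    ("URI1", "works", "URI5"),
    ("URI3", "name", "The Thinker"),
    ("URI3", "date", "1902"),
    ("URI5", "name", "The Kiss"),
    ("URI5", "date", "1889"),
]

# Hypothetical parameterized query: artworks (name, date) of the artist named ?IO.
QUERY = [
    ("?x", "name", "?IO"),
    ("?x", "works", "?y"),
    ("?y", "name", "?l3"),
    ("?y", "date", "?l4"),
]


def evaluate(patterns, triples, binding):
    """Yield every variable binding that satisfies all patterns."""
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        matched = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against its earlier binding.
                if candidate.setdefault(term, value) != value:
                    matched = False
                    break
            elif term != value:
                matched = False
                break
        if matched:
            yield from evaluate(rest, triples, candidate)


results = list(evaluate(QUERY, KB, {"?IO": "Rodin"}))
titles = sorted(b["?l3"] for b in results)
```

Binding ?IO to another artist name reuses the same query, which is what makes the description parameterized.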


  26. Overview – DORIS system
      Input: (1) Web service, (2) Knowledge Base
      Instance-based solution:
      1. Probing
         • Call WS with top entities from KB
         • Obtain call results (samples)
      2. Compute alignments between WS and KB
         • Path Alignments
         • Class/Relation Alignments
      Output: (1) Mapping Function, (2) Parameterized Query

  27. Path Alignments
      ¤ Relevant WS call result to an input entity (Rodin)
      ¤ Leaf nodes in call result encode attributes for input entity
      ¤ Linear XML paths in WS call result correspond to input entity – literal paths
      [Figure: the call result getArtWorksByArtist(Rodin) next to the yago fragment for Rodin.
       XML side: a root node with two item nodes; node labels t, d, a, n, b; leaf values The Thinker, The Kiss, 1902, 1889, 1840, Rodin.
       KB side: nodes yago:Rodin, yago:Rodin_Museum, yago:Pantheon, yago:The_Thinker; properties name, birthdate, shownAt, works, date; literals Rodin, 1840, The Thinker, 1902.]
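The linear XML paths mentioned on this slide can be enumerated with a short Python sketch that walks the call result and groups leaf values by their root-to-leaf label path. The XML sample is an assumption rebuilt from the slide's node labels, and `leaf_paths` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Assumed call result for getArtWorksByArtist(Rodin).
CALL_RESULT = (
    "<root>"
    "<item><t>The Thinker</t><d>1902</d><a><n>Rodin</n><b>1840</b></a></item>"
    "<item><t>The Kiss</t><d>1889</d><a><n>Rodin</n><b>1840</b></a></item>"
    "</root>"
)


def leaf_paths(xml_text: str) -> dict:
    """Map each root-to-leaf label path to the list of leaf values found under it."""
    values = defaultdict(list)

    def walk(node, prefix):
        path = prefix + (node.tag,)
        children = list(node)
        if not children:
            # A leaf: record its text under the label path that leads to it.
            values["/".join(path)].append((node.text or "").strip())
        for child in children:
            walk(child, path)

    walk(ET.fromstring(xml_text), ())
    return dict(values)


paths = leaf_paths(CALL_RESULT)
```

Each key is a candidate WS path whose value list can then be compared with the values reached by a KB path from the same input entity.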

  28. Path Alignments
      Path pairs: a KB path starting at the KB input (labels shown: shownAt, works, name) is paired with the XML path root, item, t.
      [Figure: the call result getArtWorksByArtist(Rodin) next to the yago fragment for Rodin.
       XML side: a root node with two item nodes; node labels t, d, a, n, b; leaf values The Thinker, The Kiss, 1902, 1889, 1840, Rodin.
       KB side: nodes yago:Rodin, yago:Rodin_Museum, yago:Pantheon, yago:The_Thinker; properties name, birthdate, shownAt, works, date; literals Rodin, 1840, The Thinker, 1902.]

  29. Metrics for Path Alignments
      1. Overlapping: align two paths if the results of one overlap the results of the other above a threshold α. (#x: number of samples)
      2. Inclusions: align two paths if the results of one are included in the results of the other above a threshold α.
         ¤ Compute inclusions both ways: KB path ⇆ WS path
         ¤ Partial completeness assumption: “a source knows either all or none of the p-attributes of some x”
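The slide's formulas did not survive in the transcript, so the Python sketch below is only one plausible reading of the two metrics. Each path is represented by a dictionary from a sample input x to the set of values the path returns for x. The overlap score is the fraction of samples whose two value sets intersect, and the inclusion score is the fraction of samples whose source values all appear in the target. Skipping samples where either side returns nothing is an assumed way to apply the partial completeness assumption. All names and the toy data are made up for illustration.

```python
def overlap_score(ws_values: dict, kb_values: dict) -> float:
    """Fraction of shared samples whose WS and KB value sets have a common value."""
    samples = ws_values.keys() & kb_values.keys()
    if not samples:
        return 0.0
    hits = sum(1 for x in samples if ws_values[x] & kb_values[x])
    return hits / len(samples)


def inclusion_score(source: dict, target: dict) -> float:
    """Fraction of samples whose source values are all contained in the target values.

    Samples where either side is empty are skipped (assumed reading of the
    partial completeness assumption).
    """
    samples = [x for x in source if source[x] and target.get(x)]
    if not samples:
        return 0.0
    hits = sum(1 for x in samples if source[x] <= target[x])
    return hits / len(samples)


# Toy samples: input entity -> values returned by the WS path and by the KB path.
ws_path = {"e1": {"v1", "v2"}, "e2": {"v3"}, "e3": {"v4"}}
kb_path = {"e1": {"v1", "v2", "v5"}, "e2": {"v6"}, "e3": set()}

ALPHA = 0.5  # hypothetical threshold
overlap = overlap_score(ws_path, kb_path)
ws_in_kb = inclusion_score(ws_path, kb_path)
kb_in_ws = inclusion_score(kb_path, ws_path)
aligned = ws_in_kb >= ALPHA or kb_in_ws >= ALPHA
```

Computing the inclusion in both directions mirrors the KB path ⇆ WS path bullet: here the WS path is included in the KB path for half of the usable samples, while the reverse inclusion never holds.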

  30. Class & Relation Alignments
      Problem: identify XML nodes representing entities
      ¤ Idea: starting from the right-most side, align functional sub-paths (paths selecting one value)
      ¤ Assumption: the XML call result encodes at least one functional property per class of entities
      [Figure: the XML path root, item, t and a KB path from the KB input (labels: works, shownAt, name), with each step annotated by a 1 or n cardinality.]
      → “item” nodes correspond to artworks


  32. Class & Relation Alignments
      Compute Functionality
      ¤ KB: “A relation r(x,y) is called functional if for each x there is not more than one y.”
      ¤ XML: “A path is functional if there are no two sibling nodes sharing the same label.”
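Both functionality definitions can be checked mechanically, as in the Python sketch below. It reads the XML definition as "at no step of the path does a node have two children carrying that step's label", which is an interpretation of the slide's wording. The sample data and both helper names are assumptions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Assumed call result and toy KB, reusing the labels from the slides.
CALL_RESULT = (
    "<root>"
    "<item><t>The Thinker</t><d>1902</d><a><n>Rodin</n><b>1840</b></a></item>"
    "<item><t>The Kiss</t><d>1889</d><a><n>Rodin</n><b>1840</b></a></item>"
    "</root>"
)
KB = [
    ("Rodin", "birthdate", "1840"),
    ("Rodin", "works", "The Thinker"),
    ("Rodin", "works", "The Kiss"),
]


def is_functional_relation(triples, relation: str) -> bool:
    """True if every subject has at most one object for the relation."""
    objects = defaultdict(set)
    for subject, prop, obj in triples:
        if prop == relation:
            objects[subject].add(obj)
    return all(len(objs) <= 1 for objs in objects.values())


def is_functional_path(start, path) -> bool:
    """True if, at each step of the path, no node has two children with that label."""
    frontier = [start]
    for tag in path:
        next_frontier = []
        for node in frontier:
            children = node.findall(tag)
            if len(children) > 1:
                return False
            next_frontier.extend(children)
        frontier = next_frontier
    return True


root = ET.fromstring(CALL_RESULT)
first_item = root.find("item")
root_to_title = is_functional_path(root, ["item", "t"])   # two sibling item nodes
item_to_title = is_functional_path(first_item, ["t"])
item_to_artist_name = is_functional_path(first_item, ["a", "n"])
```

On this sample the path from an item to its title is functional while the path from the root to the items is not, which is consistent with reading item nodes as the entities (artworks).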

  33. Overview
      Input: (1) Web service, (2) Knowledge Base
      DORIS: Discovering I/O Dependencies
      Output: (1) Mapping Function, (2) Parameterized Query
