Information Integration on the WEB with RDF, OWL and SPARQL
Introduction and Overview
Grant Weddell
September 10, 2013
Data on the WEB
Consider the HTML associated with the URI http://en.wikipedia.org/wiki/Resource_Description_Framework
To humans (with browser): [figure: the page as rendered in a browser]
To machines: [figure: the page as seen by a machine]
Data on the WEB
Consider the HTML associated with the URI http://en.wikipedia.org/wiki/Resource_Description_Framework
To usefully integrate this information, machines must
1. understand natural languages, and
2. have domain-specific understandings of the world.
RDF and OWL are a solution:
1. Add HTML that encodes data and metadata in the form of subject / predicate / object triples.
2. Add two new WEB functions
   Fetch : URI → HTML
   SelectRDF : HTML → HTML
   FilterRDF : HTML → HTML
Information Integration
[Diagram: SQL over a Conceptual Schema over Source 1 ... Source n (each an RDB) ⇒ SPARQL over an Ontology (OWL) over URI 1 ... URI n (each RDF)]
Relational setting:
◮ Source mappings, option 1: local as view (LAV)
◮ Source mappings, option 2: global as view (GAV)
◮ Query evaluation: optimization / compilation then plan execution
◮ Metadata operates as constraints
◮ Closed world
Information Integration
WEB setting:
◮ Object identity: literal values + uniform resource identifiers (URIs)
◮ Data and metadata: resource description framework (RDF)
◮ RDF: collection of subject / predicate / object triples
◮ Integration of information: ⋃_i SelectRDF(Fetch(URI_i))
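The integration step, in which SelectRDF is applied to each fetched URI and the results are combined, can be sketched in Python. This is only an illustration under loud assumptions: `FAKE_WEB`, `fetch`, `select_rdf` and `integrate` are hypothetical names, documents are plain text rather than HTML, a line of the form `s/p/o` stands in for an embedded triple, and `select_rdf` returns a set of triples rather than HTML so that the union is easy to see.

```python
from typing import Iterable, NamedTuple, Set


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical stand-in for the WEB: URI -> document text (not real HTML).
FAKE_WEB = {
    "uri:1": "some prose\nJohn/age/32\nJohn/type/student",
    "uri:2": "more prose\nMary/type/student\nJohn/type/student",
}


def fetch(uri: str) -> str:
    """Stand-in for Fetch : URI -> HTML (no network access)."""
    return FAKE_WEB[uri]


def select_rdf(document: str) -> Set[Triple]:
    """Stand-in for SelectRDF: keep only lines shaped like s/p/o."""
    triples = set()
    for line in document.splitlines():
        parts = line.strip().split("/")
        if len(parts) == 3 and all(parts):
            triples.add(Triple(*parts))
    return triples


def integrate(uris: Iterable[str]) -> Set[Triple]:
    """Union over i of select_rdf(fetch(uri_i))."""
    result: Set[Triple] = set()
    for uri in uris:
        result |= select_rdf(fetch(uri))
    return result


integrated = integrate(["uri:1", "uri:2"])
```

Because the result is a set, the triple `John/type/student` that appears in both sources is kept once, which is the behaviour one would expect from a union.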
Information Integration
WEB setting (cont’d):
◮ Query evaluation: optimization / compilation then plan execution
◮ Metadata can infer new data
◮ Open world
Information Integration
Example:
(data)                (metadata)
John/age/32           student/subclass/human
John/type/student     human/exists-property/age
Mary/type/student     age/range/integer
                      age/type/functional-property
Query: known humans and their ages
select ?x, ?y where { ?x type human. ?x age ?y }
Result:
◮ With basic RDF entailment: { }
◮ With OWL 2 direct semantics entailment: {{ ?x = John, ?y = 32 }}
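A small Python sketch may help show why the two results differ. It is not an OWL 2 reasoner: it hard-codes one inference rule (propagating `type` along `subclass`), which happens to be enough for this query. The names `humans_with_ages` and `subclass_closure` are hypothetical, and triples are modelled as plain tuples.

```python
DATA = {
    ("John", "age", "32"),
    ("John", "type", "student"),
    ("Mary", "type", "student"),
}
METADATA = {
    ("student", "subclass", "human"),
    ("human", "exists-property", "age"),
    ("age", "range", "integer"),
    ("age", "type", "functional-property"),
}


def humans_with_ages(triples):
    """Answer: select ?x, ?y where { ?x type human. ?x age ?y }."""
    humans = {s for (s, p, o) in triples if p == "type" and o == "human"}
    return {(s, o) for (s, p, o) in triples if p == "age" and s in humans}


def subclass_closure(triples):
    """Add (x type D) whenever (x type C) and (C subclass D), to a fixpoint."""
    closed = set(triples)
    while True:
        new = {
            (x, "type", d)
            for (x, p1, c) in closed if p1 == "type"
            for (c2, p2, d) in closed if p2 == "subclass" and c2 == c
        }
        if new <= closed:
            return closed
        closed |= new


graph = DATA | METADATA
no_inference = humans_with_ages(graph)                      # nobody is asserted to be human
with_inference = humans_with_ages(subclass_closure(graph))  # John, with age 32
```

Mary is also inferred to be human, but no value for her age is known, so `?y` cannot be bound for her and she does not appear in the answer.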
Information Integration
Example:
(data)                (metadata)
John/age/32           student/subclass/human
John/type/student     human/exists-property/age
Mary/type/student     age/range/integer
                      age/type/functional-property
Query: known humans that have an age
select ?x where { ?x type concept-intersection(human, exists-property(age)) }
Relies on OWL 2’s ability to express complex concepts: the intersection of the set of humans and the set of things having an age property
Result:
◮ With basic RDF entailment: { }
◮ With OWL 2 direct semantics entailment: {{ ?x = John }, { ?x = Mary }}
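The following Python sketch illustrates how Mary can satisfy the complex concept even though no age value is recorded for her. It assumes that `human/exists-property/age` means "every human has some age", which is consistent with the result shown above but is an interpretation rather than something the slide states. The helper names are hypothetical, and the code hand-rolls two rules rather than implementing OWL 2 direct semantics.

```python
TRIPLES = {
    ("John", "age", "32"),
    ("John", "type", "student"),
    ("Mary", "type", "student"),
    ("student", "subclass", "human"),
    ("human", "exists-property", "age"),
    ("age", "range", "integer"),
    ("age", "type", "functional-property"),
}


def superclasses(cls, triples):
    """Return cls together with every class reachable via subclass edges."""
    seen, frontier = {cls}, [cls]
    while frontier:
        current = frontier.pop()
        for (s, p, o) in triples:
            if s == current and p == "subclass" and o not in seen:
                seen.add(o)
                frontier.append(o)
    return seen


def inferred_types(x, triples):
    """All classes x belongs to, asserted or inherited."""
    types = set()
    for (s, p, o) in triples:
        if s == x and p == "type":
            types |= superclasses(o, triples)
    return types


def has_some(x, prop, triples):
    """x is in exists-property(prop): an explicit value, or a class that guarantees one."""
    explicit = any(s == x and p == prop for (s, p, o) in triples)
    guaranteed = any(
        (c, "exists-property", prop) in triples for c in inferred_types(x, triples)
    )
    return explicit or guaranteed


def humans_having(prop, triples):
    """Members of concept-intersection(human, exists-property(prop))."""
    individuals = {s for (s, p, o) in triples if p == "type"}
    return {
        x for x in individuals
        if "human" in inferred_types(x, triples) and has_some(x, prop, triples)
    }


answers = humans_having("age", TRIPLES)
```

John qualifies through his explicit age; Mary qualifies only because the metadata guarantees that every human has one.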
Overview
There are three main topics:
1. RDF storage engines
2. SPARQL query evaluation
3. Ontology-based data access (OBDA)
It will also be useful to review
◮ complexity theory, to understand the difficulty of query evaluation,
◮ first-order logic, which underlies both the relational and WEB settings, and
◮ description logics (DLs), in particular the dialect SHROIQ(D).
Topics not covered:
1. schema integration
2. fact extraction
3. inductive reasoning