RDF Storage and Retrieval Systems Jan Pettersen Nytun, UiA 1
Following Slides - Ref.: Chapter 4, Semantic Web application architecture, in Semantic Web for the Working Ontologist, Second Edition: Effective Modeling in RDFS and OWL, May 20, 2011, by Dean Allemang and James Hendler.
Jan Pettersen Nytun, UiA, Ontologies, page 2
RDF STORE
• An RDF store typically includes a query engine (SPARQL).
• RDF allows easy merging of data sets (in contrast to relational data stores).
• RDF stores come in many flavors: custom-programmed database solutions, fully supported off-the-shelf products, …
Conceptually the simplest relational implementation of a triple store
• A triple store is designed to store and retrieve collections of triples (i.e., Subject-Predicate-Object statements), here represented as strings. Each RDF statement is stored as a single row in a three-column table.
• Since this fits a relational database representation, it can be accessed using conventional relational database tools such as SQL.
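A minimal sketch of the three-column table, using Python's built-in sqlite3 module. The table name, column names, and the example statements (ex:Hamlet and so on) are assumptions made for illustration; they do not come from the slides.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")

# Hypothetical example statements, written with prefixed names for brevity.
statements = [
    ("ex:Hamlet", "dc:title", "Hamlet"),
    ("ex:Hamlet", "dc:creator", "ex:Shakespeare"),
    ("ex:Shakespeare", "rdfs:label", "William Shakespeare"),
]

# Each RDF statement becomes exactly one row.
conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", statements)

# Retrieve everything stated about one subject with ordinary SQL.
rows = conn.execute(
    "SELECT predicate, object FROM triples WHERE subject = ? ORDER BY predicate",
    ("ex:Hamlet",),
).fetchall()
print(rows)
```

The point of the sketch is only that no RDF-specific machinery is needed at the storage level: plain INSERT and SELECT statements are enough.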
RDF data standards and interoperability of RDF stores
• Relational data stores: Transferring a whole database from one system to another is a difficult process.
• RDF stores: All share the underlying RDF data model and support RDF/XML and/or Turtle, so an RDF data set is easy to transfer.
The common standards also simplify the issue of federating data that are housed in multiple RDF stores, possibly coming from different vendor sources.
RDF query engines
• The SPARQL query language includes a protocol for communicating queries and results so that a query engine can act as a web service.
• SPARQL endpoints provide access to large amounts of structured RDF data.
• It is even possible to provide SPARQL access to databases that are not triple stores, e.g., SPARQL translated to SQL.
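As a rough illustration of a query engine acting as a web service, the sketch below builds the HTTP GET request that the SPARQL Protocol defines, with the query text passed in the "query" parameter. The endpoint URL is a hypothetical placeholder, and the request is only constructed, not sent, so the sketch runs offline.

```python
from urllib.parse import urlencode, urlsplit, parse_qs
from urllib.request import Request

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint URL (assumption)

query = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"

# The SPARQL Protocol lets a client send a query via HTTP GET,
# URL-encoded in the "query" parameter.
url = ENDPOINT + "?" + urlencode({"query": query})

# Ask for results in the SPARQL JSON results format.
request = Request(url, headers={"Accept": "application/sparql-results+json"})

# urllib.request.urlopen(request) would send it; omitted to stay offline.
print(request.full_url)
```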
DATA FEDERATION
• The RDF data model was designed from the beginning with data federation in mind. Information from any source is converted into a set of triples so that data federation of any kind (spreadsheets and XML, database tables and web pages) is accomplished with a single mechanism.
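The single mechanism can be sketched in a few lines: once every source has been converted to triples, merging is a set union. The two sources and all identifiers below are hypothetical, and the sketch assumes there are no blank nodes, which would need to be kept distinct between sources when merging.

```python
# Triples as (subject, predicate, object) tuples; all names are hypothetical.
from_spreadsheet = {
    ("ex:Hamlet", "dc:title", "Hamlet"),
    ("ex:Hamlet", "dc:creator", "ex:Shakespeare"),
}
from_database = {
    ("ex:Hamlet", "dc:creator", "ex:Shakespeare"),  # same statement, second source
    ("ex:Shakespeare", "ex:bornIn", "ex:Stratford"),
}

# Merging two graphs is the set union of their triples;
# a statement asserted by both sources appears only once.
merged = from_spreadsheet | from_database
print(len(merged))
```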
SPARQL 1.1 Federated Query Extension - SERVICE
• This extension allows a query author to direct a portion of a query to a particular SPARQL endpoint.
• Results are returned to the federated query processor and are combined with results from the rest of the query.
PREFIX :   <http://example/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?a
FROM <mybooks.rdf>
{
  ?b dc:title ?title .
  SERVICE <http://sparql.org/books>
    { ?s dc:title ?title . ?s dc:creator ?a }
}
RDF Store (Merge)
More Details
Following Slides - Ref. article: RDF Storage and Retrieval Systems, by Alice Hertel, Jeen Broekstra, and Heiner Stuckenschmidt. Found in: Steffen Staab and Rudi Studer (Eds.), Handbook on Ontologies, Second Edition.
Possible Generic Architecture of an RDF Store
Admin Module
• Provides functionality for adding data to and deleting data from the RDF store.
• Loading data from files requires parsing and validating RDF; consequently, an RDF parser and an RDF validator are usually part of the admin module.
Query Module and Export Module
• The query module handles queries to the RDF store. It implements a parser and handler for one query language.
• The export module allows a dump of the RDF store into files for data exchange with other systems.
The modules can be accessed locally or remotely, e.g. using SOAP or RMI. This is why the highest layer in the middleware contains protocol handlers that can manage different access modes.
Storing RDF data in a relational database requires an appropriate table design. There are different approaches that can be classified into generic schemas, i.e. schemas that do not depend on the ontology, and ontology specific schemas.
Simplest Generic Schema: One table with three columns named Subject, Predicate and Object
Advantage: No restructuring is required if the ontology changes (e.g., new classes are realized by a simple INSERT command in the table).
Disadvantage: Performing a query means searching the whole database, and queries involving joins become very expensive.
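To see where the joins come from, consider a question that spans two statements. In the sketch below (hypothetical data, assumed table and column names), asking where the creator of each work was born requires joining the single triples table with itself; each additional triple pattern in a query adds another self-join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("ex:Hamlet", "dc:creator", "ex:Shakespeare"),
        ("ex:Macbeth", "dc:creator", "ex:Shakespeare"),
        ("ex:Shakespeare", "ex:bornIn", "ex:Stratford"),
    ],
)

# Two triple patterns (work -> creator, creator -> birthplace)
# translate into one self-join of the triples table.
sql = """
SELECT t1.subject, t2.object
FROM triples AS t1
JOIN triples AS t2 ON t2.subject = t1.object
WHERE t1.predicate = 'dc:creator' AND t2.predicate = 'ex:bornIn'
ORDER BY t1.subject
"""
birthplaces = conn.execute(sql).fetchall()
print(birthplaces)
```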
Normalized triple store - a more advanced generic schema
• Requires significantly less storage space than the simple three-column table, since URIs and literals are stored once in separate tables and referenced by ID from the triples table.
• One may also split the Triples table into several tables based on the RDFS properties, e.g., a separate table for rdfs:subClassOf.
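A minimal sketch of the normalization idea, assuming a single lookup table for both URIs and literals (a simplification) and hypothetical table, column, and function names. Each distinct string is stored once and the triples table holds only integer IDs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE resources (id INTEGER PRIMARY KEY, value TEXT UNIQUE);
CREATE TABLE triples (subject INTEGER, predicate INTEGER, object INTEGER);
""")

def resource_id(value):
    """Return the integer ID for a URI or literal, creating it if needed."""
    conn.execute("INSERT OR IGNORE INTO resources (value) VALUES (?)", (value,))
    row = conn.execute("SELECT id FROM resources WHERE value = ?", (value,)).fetchone()
    return row[0]

def add_triple(s, p, o):
    """Store a statement as three integer references."""
    conn.execute(
        "INSERT INTO triples VALUES (?, ?, ?)",
        (resource_id(s), resource_id(p), resource_id(o)),
    )

# Hypothetical data: two statements sharing a predicate and an object.
add_triple("ex:Hamlet", "dc:creator", "ex:Shakespeare")
add_triple("ex:Macbeth", "dc:creator", "ex:Shakespeare")

# Six string positions in the statements, but only four stored strings.
resource_count = conn.execute("SELECT COUNT(*) FROM resources").fetchone()[0]
print(resource_count)
```

The saving grows with the data set, because long URIs that recur in many statements are replaced by short integers.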
Ontology Specific Schemas
• Ontology specific schemas change when the ontology changes, i.e. when classes or properties are added or removed.
• The basic schema consists of one table with one column for the instance ID, one for the class name and one for each property in the ontology. Thus, one row in the table corresponds to one instance.
• Variants use some sort of one-table-per-property and/or one-table-per-class schema.
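The basic schema and its sensitivity to ontology changes can be sketched as follows. The ontology (a Book class with title and creator properties), the table name, and the column names are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Basic ontology specific schema: instance ID, class name,
# and one column per property in the (hypothetical) ontology.
conn.execute(
    "CREATE TABLE instances (id TEXT PRIMARY KEY, class_name TEXT, title TEXT, creator TEXT)"
)
# One row corresponds to one instance.
conn.execute(
    "INSERT INTO instances VALUES ('ex:Hamlet', 'ex:Book', 'Hamlet', 'ex:Shakespeare')"
)

# Adding a property to the ontology forces a change to the table structure,
# unlike the generic schema, where it would be just another row.
conn.execute("ALTER TABLE instances ADD COLUMN publisher TEXT")

columns = [row[1] for row in conn.execute("PRAGMA table_info(instances)")]
print(columns)
```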