1 / 14 Scalable End-user Access to Big Data
Hellenic Republic, National and Kapodistrian University of Athens
2 / 14 The Problem of Data Access
Diagram labels: Engineer; Application; predefined queries; answers
3 / 14 When does this Go Wrong?
"I need to find all rock samples where my Company had at least a 30% share of the licence at the time the sample was taken. I'm sure the information is there but there are so many concepts involved that I can't find it in the application."
"I need all wellbores with a pore pressure of over 14ppg, but lower than 12ppg further down the hole. I can't say this to the application."
"I need to find all rock samples for this oil field, including the ones in this Excel sheet from Dinoco. The application doesn't know about this data."
4 / 14 What then happens?
Where is this information stored, and what is it called?
Can you hand-craft a query for my information need?
Can you include data from this spreadsheet in the db?
May take weeks to respond.
Takes several years to master data stores and user needs.
30–70% of domain expert time spent looking for and assessing the quality of the data found.
5 / 14 The Problem of Data Access
Diagram labels: Engineer; Application; IT-expert; information need; specialised query; answers
6 / 14 Data Access, with a Data Warehouse
Diagram labels: Engineer; Application; Data Warehouse; queries; answers; ETL (twice)
7 / 14 Data Access: The Optique Solution
Diagram labels: Engineer; Application; Ontology; Mappings; ontology-based query; translated query; answers
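The step from "ontology-based query" to "translated query" can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration and does not come from the slides: the tables `equipment` and `fault_log`, their rows, the `MAPPINGS` dictionary and the `unfold` helper are hypothetical, and the ontology terms are borrowed from the generator example later in the deck. The sketch only shows the general idea that each ontology term is mapped to an SQL query over the source, so a query over ontology terms can be unfolded into plain SQL.

```python
import sqlite3

# Hypothetical source schema and rows; not part of the slides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE equipment (id TEXT PRIMARY KEY, kind TEXT);
    CREATE TABLE fault_log (fault_id TEXT PRIMARY KEY, equipment_id TEXT, category TEXT);
    INSERT INTO equipment VALUES ('g1', 'GEN'), ('p7', 'PUMP');
    INSERT INTO fault_log VALUES ('f1', 'g1', 'COND'), ('f2', 'p7', 'SEAL');
""")

# Assumed mappings: each ontology term is defined by an SQL query over the source.
MAPPINGS = {
    "Generator": "SELECT id AS x FROM equipment WHERE kind = 'GEN'",
    "hasFault": "SELECT equipment_id AS x, fault_id AS y FROM fault_log",
    "CondenserFault": "SELECT fault_id AS x FROM fault_log WHERE category = 'COND'",
}


def unfold(mappings):
    """Translate the fixed ontology-level query
    Generator(g) AND hasFault(g, f) AND CondenserFault(f)
    into SQL by substituting each term's mapping as a subquery."""
    return (
        "SELECT gen.x AS g, hf.y AS f "
        f"FROM ({mappings['Generator']}) AS gen "
        f"JOIN ({mappings['hasFault']}) AS hf ON hf.x = gen.x "
        f"JOIN ({mappings['CondenserFault']}) AS cf ON cf.x = hf.y"
    )


translated_query = unfold(MAPPINGS)
rows = conn.execute(translated_query).fetchall()
print(rows)  # [('g1', 'f1')]
```

The engineer only ever sees the ontology terms; table and column names stay inside the mappings, which is where an IT-expert would maintain them.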
8 / 14 Optique Architecture
[Architecture diagram. Recoverable labels: End-user; IT-expert; Query Formulation; Visualisation & Analysis; Ontology & Mapping Management; Queries; Mappings; Ontology; Std. ontologies; Data models; Query Transformation; Query Planning; Query Execution (three instances); results; data sources: central repository, static data, temporal data, streaming data, …]
9 / 14 OBDA: Example (based on slides by Ian Horrocks)
Engineer's question: Generators with a turbine fault?
Data:
  Generator(g1)          g1 is a Generator
  hasFault(g1, f1)       g1 has fault f1
  CondenserFault(f1)     f1 is a CondenserFault
Answer from the data alone: ∅
Ontology:
  Condenser ⊑ CoolingDevice ⊓ ∃isPartOf.Turbine
  CondenserFault ≡ Fault ⊓ ∃affects.Condenser
  TurbineFault ≡ Fault ⊓ ∃affects.(∃isPartOf.Turbine)
In words:
  Condenser is a CoolingDevice that is part of a Turbine
  Condenser Fault is a Fault that affects a Condenser
  Turbine Fault is a Fault that affects part of a Turbine
Answer with the ontology: g1
10 / 14 Query Rewriting
Given:
  q: a query
  T (Terminology): the ontology, domain model
  A (Assertions): the database
we want ans(q, (T, A)): the answers of the query given knowledge in T and A.
In general expensive to compute.
In certain cases possible by rewriting: q′ := rewrite(q, T) such that ans(q′, (∅, A)) = ans(q, (T, A)).
Query answering with empty ontology is cheap (same as SQL).
11 / 14 Rewriting Example
q = Generator(g) ∧ hasFault(g, f) ∧ TurbineFault(f)
T:
  Condenser ⊑ CoolingDevice ⊓ ∃isPartOf.Turbine
  CondenserFault ≡ Fault ⊓ ∃affects.Condenser
  TurbineFault ≡ Fault ⊓ ∃affects.(∃isPartOf.Turbine)
A:
  Generator(g1)
  hasFault(g1, f1)
  CondenserFault(f1)
Rewrite with T:
  rewrite(q, T) = q′ = (Generator(g) ∧ hasFault(g, f) ∧ CondenserFault(f)) ∨ ···
Answers from q′:
  ans(q′, (∅, A)) = {⟨g1, f1⟩}
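The rewriting example can be replayed in a few lines of Python. This is a minimal sketch under stated assumptions, not how Optique or any particular OBDA system implements rewriting: the subsumption CondenserFault ⊑ TurbineFault, which follows from T, is written down by hand in `SUBCLASSES` instead of being derived by a reasoner, only direct subclasses are considered, and all names (`SUBCLASSES`, `ABOX`, `rewrite`, `answer_cq`, `answer_ucq`) are hypothetical.

```python
from itertools import product

# Assumption: subsumptions between named classes that follow from T, listed by
# hand. CondenserFault is subsumed by TurbineFault because a condenser fault
# affects a Condenser, and every Condenser is part of a Turbine.
SUBCLASSES = {"TurbineFault": ["CondenserFault"]}

# A: the assertions from the slide, as (predicate, arguments) facts.
ABOX = {
    ("Generator", ("g1",)),
    ("hasFault", ("g1", "f1")),
    ("CondenserFault", ("f1",)),
}

# q = Generator(g) AND hasFault(g, f) AND TurbineFault(f); all arguments are variables.
QUERY = [("Generator", ("g",)), ("hasFault", ("g", "f")), ("TurbineFault", ("f",))]


def rewrite(query, subclasses):
    """Return a union of conjunctive queries in which each atom is either kept
    or replaced by an atom over one of its direct subclasses."""
    choices = [
        [(pred, args)] + [(sub, args) for sub in subclasses.get(pred, [])]
        for pred, args in query
    ]
    return [list(cq) for cq in product(*choices)]


def answer_cq(cq, facts, variables):
    """Evaluate one conjunctive query over the facts alone (empty ontology)."""
    bindings = [{}]
    for pred, args in cq:
        extended = []
        for binding in bindings:
            for fact_pred, fact_args in facts:
                if fact_pred != pred or len(fact_args) != len(args):
                    continue
                candidate = dict(binding)
                # Bind each variable, rejecting facts that clash with earlier bindings.
                if all(candidate.setdefault(a, v) == v for a, v in zip(args, fact_args)):
                    extended.append(candidate)
        bindings = extended
    return {tuple(b[v] for v in variables) for b in bindings}


def answer_ucq(ucq, facts, variables):
    """Evaluate a union of conjunctive queries by collecting all answers."""
    answers = set()
    for cq in ucq:
        answers |= answer_cq(cq, facts, variables)
    return answers


without_ontology = answer_cq(QUERY, ABOX, ("g", "f"))
with_rewriting = answer_ucq(rewrite(QUERY, SUBCLASSES), ABOX, ("g", "f"))
print(without_ontology)  # set()
print(with_rewriting)    # {('g1', 'f1')}
```

Evaluating q directly over A finds nothing because no TurbineFault is asserted; evaluating the rewritten union over the same facts, still with an empty ontology, returns the pair ⟨g1, f1⟩ from the slide.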
12 / 14 That Sounds Simple?
OBDA is well researched, many publications in last 10 years.
So why a 4 year EU research project?
Some important bits are missing:
Usability
  How do end-users formulate queries? In first-order logic?
  Need a user interface for 'query formulation'
  Ontology management
  Mapping management, analysis and evolution
Scope
  What about queries with time? Or geology? Or chemistry?
  Need to extend bare-bones query rewriting
Efficiency
  SQL databases not good at queries from OBDA
  Big Data is maybe not best stored in an SQL database
  Optimize rewritten queries and storage layer
www.optique-project.eu