Knowledge Representation, Ontologies, and Semantic Web


  1. Knowledge Representation, Ontologies, and Semantic Web
     Georg Gottlob, Carsten Lutz

  2. KR + DB
     Knowledge Representation: build an ontology / knowledge base capturing general knowledge of the application domain
     + Databases: build systems for managing / querying data from the application domain
     Result: a tool for incomplete and heterogeneous data and for data integration
     ontology = schema? ontology = constraints? Yes and no!

  3. Culture Shock
     Shocking languages with strange names and syntax: description logics
     Shocking data models: relations of arity at most two
     Good news: common query languages (CQs, UCQs, RPQs, etc.); no objections against higher arity, sometimes have it
     This part of the tutorial: a bit of DL history, area overview, help with understanding

  4. Description Logic Jungle
     [Timeline figure]
     1981: KL-ONE
     1991: ALC family (formalization)
     2005: DL-Lite family, EL family (scalability/data)
     2009: OWL2 QL, OWL2 EL, OWL2 DL (semantic web / standardization)
     Further labels in the figure: deprecated stuff, no recursion, no disjunction

  5. ALC Family
     Operators available in ALC (attribute concept language with complement):
       DL syntax              FO translation
       A                      A(x)
       ¬C, C ⊓ D, C ⊔ D       ¬C(x), C(x) ∧ D(x), C(x) ∨ D(x)
       ∃r.C                   ∃y (r(x, y) ∧ C(y))
       ∀r.C                   ∀y (r(x, y) → C(y))
     Ontology:
       C ⊑ D                  ∀x (C(x) → D(x))
       C ≡ D                  ∀x (C(x) ↔ D(x))
     For example:
       Movie ⊑ Comedy ⊔ Drama ⊔ HorrorMovie
       Director ≡ Person ⊓ ∃directed.(Movie ⊔ TVseries)
       ForeignMovie ≡ ∀producedIn.¬US ⊓ ∃language.¬English
     Modeling strongly concentrates on classes (= unary relations)
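
The FO reading of the ALC operators on this slide can be tried out on a small finite interpretation. The following is a minimal sketch, not part of the original slides: the tuple encoding of concepts, the function name `extension`, and the toy data are all assumptions made for illustration. It computes the set of domain elements that satisfy a concept such as ∀producedIn.¬US ⊓ ∃language.¬English.

```python
# Concepts are encoded as nested tuples (an illustrative, hypothetical encoding):
#   ("atom", "A"), ("not", C), ("and", C, D), ("or", C, D),
#   ("exists", "r", C), ("forall", "r", C)

def extension(concept, domain, classes, roles):
    """Return the set of domain elements that satisfy `concept`."""
    op = concept[0]
    if op == "atom":
        return set(classes.get(concept[1], set())) & domain
    if op == "not":
        return domain - extension(concept[1], domain, classes, roles)
    if op == "and":
        return (extension(concept[1], domain, classes, roles)
                & extension(concept[2], domain, classes, roles))
    if op == "or":
        return (extension(concept[1], domain, classes, roles)
                | extension(concept[2], domain, classes, roles))
    if op in ("exists", "forall"):
        _, role, filler = concept
        pairs = roles.get(role, set())
        filler_ext = extension(filler, domain, classes, roles)
        if op == "exists":
            # ∃r.C: some r-successor lies in C
            return {x for x in domain
                    if any(a == x and b in filler_ext for (a, b) in pairs)}
        # ∀r.C: every r-successor lies in C (vacuously true without successors)
        return {x for x in domain
                if all(b in filler_ext for (a, b) in pairs if a == x)}
    raise ValueError(f"unknown operator: {op!r}")


# Toy interpretation (made-up data).
domain = {"m1", "m2", "us", "fr", "en", "french"}
classes = {"US": {"us"}, "English": {"en"}}
roles = {
    "producedIn": {("m1", "fr"), ("m2", "us")},
    "language": {("m1", "french"), ("m2", "en")},
}

# Right-hand side of the ForeignMovie definition.
foreign_movie = (
    "and",
    ("forall", "producedIn", ("not", ("atom", "US"))),
    ("exists", "language", ("not", ("atom", "English"))),
)

print(extension(foreign_movie, domain, classes, roles))  # prints {'m1'}
```

Note that elements without any producedIn-successor satisfy the ∀-part vacuously; only the ∃-part excludes them here.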

  6. ALC Family
     Despite the focus on classes, DLs are viewed as FO fragments, related e.g. to the guarded fragment
     Precise characterization:
       Theorem. An FO-sentence is equivalent to an ALC-ontology iff it is invariant under global bisimulation and disjoint union.
     Sloppily: the fragment of FO that “speaks about trees”
     DL kind of syntax is actually not that unusual: the same type is used in temporal logic, mu-calculus, PDL

  7. ALC Family
     How does this give rise to an ALC family of logics?
       speaking also about the inverse of relations: + I
       counting the number of successors of a node: + Q
       constants: + O
     ~five most frequently used modifiers, plus several minor ones
     Of course, these additions might change the model theory (e.g. no longer purely trees)
     Most DL people agree: DL names can be ugly! (heard of ALCHQIO_R+ ?)

  8. EL and DL-Lite Families
     EL family (existential language): positive-existential-conjunctive fragment of ALC, e.g.:
       Scientist ⊓ ∃participatesIn.DagstuhlSeminar ⊑ ∃customerOf.TaxiCompany
     EL ontology ≈ monadic datalog program + ∃ in rule heads, tree-shaped rule bodies, no EDB/IDB separation
     DL-Lite family: essentially inclusion dependencies + projection + fundeps, e.g.:
       Movie ⊑ ∃hasDirector
       ∃hasISSN ⊑ SerialPublication

  9. Ontologies vs Schemas
     Ontologies are not quite like a schema in several ways:
       often supposed to be general purpose and universally useful
       result of expensive modeling effort, often rather large
     Two examples:
       schema.org: ontology for the web by Bing, Google, Yahoo!, Yandex; ~700 classes, ~1000 binary relations, ~30 contributors
       SNOMED CT: international standard for electronic health records; ~400,000 classes, ~36 binary relations, ~40 engineers
     There are various ontology repositories containing hundreds of ontologies

  10. Basic DL Research
      No data, just an ontology.
      Reasoning helps to construct / maintain / verify the ontology:
        satisfiability: check consistency of concepts
        implication/subsumption: make ontology consequences explicit
      Studied all the way from theory to systems
      There is now serious tool support: editors (such as Protégé), reasoners (Konclude, ELK, many many more)
      We are quite good at solving ExpTime-complete problems in practice, e.g. satisfiability in (extensions of) ALC (choice of logics helps!)

  11. Ontology Reasoning
      Some other “data-free” lines of research:
        systems & optimization
        summary / uniform interpolation
        ontology revision
        non-monotonic ontologies
        conservative extensions, modularity
        concept matching and unification
        probabilistic ontologies
        learning/mining ontologies
        temporal ontologies
        concrete domains (= data values)
        explanation
        ontology “diff” and debugging
        ontology decomposition

  12. Sample: Conservative Extensions + Modularity
      Let Σ be a signature (the subdomain of interest).
      O₂ ⊇ O₁ is a Σ-conservative extension of O₁ if for every model I₁ of O₁, there is a model I₂ of O₂ with I₁|Σ = I₂|Σ.
      Good for managing ontologies, e.g. modularity:
        M ⊆ O is a self-contained Σ-module if O is a Σ-c.e. of M.
        M ⊆ O is a depleting Σ-module if O \ M is a Σ-c.e. of ∅.
      Bad news: Theorem. Conservative extensions are undecidable in ALC (and below).
      Good news: there are very good replacements!

  13. Sample: Conservative Extensions + Modularity
      Overapproximation: deductive conservative extensions
        O₂ ⊇ O₁ is a deductive Σ-conservative extension of O₁ if O₂ ⊨ C ⊑ D implies O₁ ⊨ C ⊑ D whenever C, D use only Σ-relations.
        Equivalent: O₂ ⊇ O₁ is a Σ-conservative extension of O₁ if for every model I₁ of O₁, there is a model I₂ of O₂ with I₁|Σ = I₂|Σ up to bisimulation.
        This recovers decidability, e.g. via automata; 2ExpTime-complete for ALC. [GhilardiL__Wolter]
      Underapproximation: ∅-conservative extensions
        O₂ ⊇ O₁ is a ∅-conservative extension of O₁ if from every model I₁ of O₁, we get a model of O₂ by making non-Σ-symbols empty.
        Can be reduced to satisfiability, thus ExpTime-complete for ALC.
        Can be syntactically (under)approximated, giving rise to polytime module extraction algorithms that work very well in practice [CuencaGrauHorrocksKazakovSattler]

  14. Adding Data
      First seriously considered in [RoussetLevy96, CalvaneseEtAl98], now very mainstream
      Ontology used at querying time for inferencing, unlike constraints
      Open world / certain answer semantics

  15. Adding Data
      Essentially two kinds of scenarios:
      Web data / Semantic Web (AI’ish)
        very large scale, very incomplete
        ontologies tend to be general purpose and pragmatic
        sometimes ontology even given as part of data
      Data Integration / Ontology-Based Data Access (DB’ish)
        ontology provides class-centric global view / conceptual model
        mappings (typically GAV) connect ontology with data sources
        ontologies typically application dependent and custom tailored

  16. Implementation Approaches
      Query rewriting to get rid of ontology:
        target query languages include SQL = UCQs = non-recursive Datalog, Datalog, linear Datalog, monadic Datalog
      Combined approach:
        materialization of ontology consequences in data: often becomes infinite because of existential quantifiers
        finite representation used instead that is unsound
        soundness regained by limited query rewriting
      Incremental maintenance related to FOIES / DynFO
      Implementations: Oracle Semantic Technologies, RDFox, Combo

  17. DL-Lite
      In DL-Lite family: FO-rewritings…
        always exist in (DL-Lite, UCQ) [CalvaneseEtAl]
        can be superpolynomial unless NP ⊆ P/poly [ZakharyaschevEtAl]
        are polynomial under mild assumptions [GottlobSchwentick]
        are small in practice when data comes from classical DB [Calvanese, Rodriguez-Muro]
      Implementations include OnTop, Clipper, Rapid, Requiem, Presto/Mastro
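
To illustrate what an FO-rewriting looks like in the simplest case, here is a minimal sketch restricted to atomic queries and positive inclusions between basic concepts (class names, ∃r, ∃r⁻). It is not the algorithm of any of the systems listed above; the encoding, the function names `rewrite_atomic` and `evaluate`, the class `Journal`, and the data are assumptions made for illustration. The query is rewritten into a union of atomic queries, which is then evaluated directly over the data without consulting the ontology again.

```python
# Basic concepts (illustrative encoding):
#   ("class", "A")       the class A
#   ("exists", "r")      ∃r   (has some r-successor)
#   ("exists_inv", "r")  ∃r⁻  (has some r-predecessor)

def rewrite_atomic(target, inclusions):
    """Collect every basic concept B with B ⊑* target.

    `inclusions` is a list of pairs (sub, sup) read as sub ⊑ sup.
    The result is the list of disjuncts of the rewritten query.
    """
    result = {target}
    frontier = [target]
    while frontier:
        current = frontier.pop()
        for sub, sup in inclusions:
            if sup == current and sub not in result:
                result.add(sub)
                frontier.append(sub)
    return result


def evaluate(disjuncts, class_facts, role_facts):
    """Evaluate the union of atomic queries over plain data (no ontology)."""
    answers = set()
    for kind, name in disjuncts:
        if kind == "class":
            answers |= {x for (c, x) in class_facts if c == name}
        elif kind == "exists":
            answers |= {x for (r, x, _y) in role_facts if r == name}
        elif kind == "exists_inv":
            answers |= {y for (r, _x, y) in role_facts if r == name}
    return answers


# Ontology: ∃hasISSN ⊑ SerialPublication (from the slides) and
# Journal ⊑ ∃hasISSN (hypothetical, added for illustration).
inclusions = [
    (("exists", "hasISSN"), ("class", "SerialPublication")),
    (("class", "Journal"), ("exists", "hasISSN")),
]

# Made-up data.
class_facts = {("Journal", "j1"), ("SerialPublication", "s1")}
role_facts = {("hasISSN", "m1", "issn42")}

query = rewrite_atomic(("class", "SerialPublication"), inclusions)
print(sorted(evaluate(query, class_facts, role_facts)))  # prints ['j1', 'm1', 's1']
```

The rewritten query corresponds to SerialPublication(x) ∨ ∃y hasISSN(x, y) ∨ Journal(x), which is a UCQ and hence expressible in SQL. Conjunctive queries with several atoms and negative inclusions need more machinery than this sketch provides.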

  18. EL
      In EL family:
        FO-rewritings are not guaranteed to exist because of recursion:
          ∃r.A ⊑ A + query A(x) = reachability of an A-point on an r-path
        existence of FO-rewritings is related to monadic datalog boundedness, but often simpler (PSpace, ExpTime, 2ExpTime depending on setup) [BienvenuHansenL__Wolter]
        FO-rewritings exist in almost all cases and can be computed efficiently (as non-recursive Datalog programs): Grind system [HansenL__Wolter]
        combined approach always applicable, polynomial query rewriting [L__TomanWolter]
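
The recursion example on this slide can be made concrete. Under the ontology ∃r.A ⊑ A, the certain answers to the query A(x) are exactly the individuals from which an A-labelled individual is reachable along r-edges, which no fixed FO query can express. A minimal sketch, assuming the data is given as a set of A-facts and a set of r-edges (the function name and the data are made up for illustration), computes the answers as a least fixpoint of the Datalog rule A(x) ← r(x, y), A(y):

```python
def certain_answers_A(a_facts, r_edges):
    """Least fixpoint of the rule A(x) <- r(x, y), A(y)."""
    derived = set(a_facts)
    changed = True
    while changed:
        changed = False
        for x, y in r_edges:
            if y in derived and x not in derived:
                derived.add(x)
                changed = True
    return derived


# Made-up data: an r-path a -> b -> c -> d with A(d), plus an unrelated edge.
a_facts = {"d"}
r_edges = {("a", "b"), ("b", "c"), ("c", "d"), ("e", "f")}

print(sorted(certain_answers_A(a_facts, r_edges)))  # prints ['a', 'b', 'c', 'd']
```

The number of iterations grows with the length of the r-path in the data, which is the intuition behind the missing FO-rewriting.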

  19. ALC
      Connection to CSP, also related to natural questions such as:
        can we classify the complexity of all OMQs in, e.g., (ALC, UCQ)?
        how does the expressive power of OMQs relate to traditional QLs?
      Emerging picture is very interesting:
        (ALC, AQ) = coCSP
        (ALC, UCQ) = coMMSNP = monadic disjunctive Datalog
        (GF, UCQ) = coGMSNP = frontier-guarded disjunctive Datalog
        [BienvenuTenCateL__Wolter]
      Can also be used to clarify the complexity of FO- and Datalog-rewritability:
        NExpTime for (ALC, AQ), 2NExpTime for (ALC, UCQ). [BienvenuTenCateL__Wolter, FeierKuusistoL__]

  20. Other Things People are Interested in
        systems & optimization
        dynamic & temporal aspects
        OMQ emptiness and containment
        uncertain / probabilistic databases
        consistent query answering
        OMQ expressive power
        supporting data analytics
        partial closed world assumption
        explanation
        updates
        privacy / confidentiality
      DL is a rather active subarea of KR
      Annual workshop with ~80 participants, next year 30th edition

  21. Higher Arity Relations
      Why again binary? Mix of syntax and desired universality of relations.
      Sometimes we don’t need more:
        DLs as language for graph DBs [CalvaneseOrtizSimkus]
        RDF!?
      Possible but not so elegant workaround: mappings / reification
      Native solutions:
        n-ary language DLR already in [LenzeriniEtAl98]
        more concise proposal DL_FU1 [Kuusisto16]:
          C ::= A | ¬C | (C₁ ⊓ C₂) | ∃R.(C₁, ..., Cₙ)
          R ::= S | ε | ¬R | (R₁ ⊓ R₂) | σR
          more from surjection [n] → [m], n arity of R
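
The reification workaround mentioned on this slide can be sketched as follows. Each tuple of an n-ary relation becomes a fresh node that is an instance of a class named after the relation, with one binary relation per argument position. This is a minimal sketch of one common encoding; the naming scheme (`Screening_0`, `Screening_arg1`, ...) and the example relation are assumptions made for illustration.

```python
def reify(relation, tuples):
    """Encode n-ary tuples using only classes and binary relations.

    Returns (class_facts, role_facts):
      class_facts: set of (class_name, node)
      role_facts:  set of (role_name, node, argument_value)
    """
    class_facts, role_facts = set(), set()
    for index, tup in enumerate(tuples):
        node = f"{relation}_{index}"  # fresh node standing for the tuple
        class_facts.add((relation, node))
        for position, value in enumerate(tup, start=1):
            role_facts.add((f"{relation}_arg{position}", node, value))
    return class_facts, role_facts


# Hypothetical ternary relation Screening(movie, cinema, year).
class_facts, role_facts = reify("Screening", [("m1", "cinemaA", "2016")])

print(sorted(class_facts))
print(sorted(role_facts))
```

The price of the encoding is visible in the output: one ternary fact turns into one class fact and three binary facts, and queries over the original relation have to join through the fresh node.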
