
Yielding Ontologies for Transition-Based Organization



  1. Yielding Ontologies for Transition-Based Organization
     ICT-211423, February 2008
     Intelligent Content and Semantics

  2. KYOTO (ICT-211423) Overview
     • Title: Yielding Ontologies for Transition-Based Organization
     • Funded:
       – 7th Framework Program-ICT of the European Union: Intelligent Content and Semantics
       – Taiwan and Japan funded by national grants
     • Goal:
       – Platform for knowledge sharing across languages and cultures
       – Transition of knowledge and information across different target groups, crossing linguistic, cultural and geographic boundaries
       – Open text mining and deep semantic search
       – Wiki environment that allows people in the field to maintain their knowledge and agree on meaning without knowledge engineering skills
     • Duration:
       – March 2008 – March 2011
     • Effort:
       – 364 person months of work
     General presentation, February 2008 ICT-211423

  3. KYOTO (ICT-211423) Overview
     • Languages:
       – English, Dutch, Italian, Spanish, Basque, Chinese, Japanese
     • Domain:
       – Environmental domain, BUT usable in any domain
     • Global:
       – Both European and non-European languages
     • Available:
       – Free: as open source system and data (GPL)
     • Future perspective:
       – Content standardization that supports worldwide communication
       – Global Wordnet Grid

  4. Consortium
     1. Vrije Universiteit Amsterdam (Amsterdam, The Netherlands)
     2. Consiglio Nazionale delle Ricerche (Pisa, Italy)
     3. Berlin-Brandenburg Academy of Sciences and Humanities (Berlin, Germany)
     4. Euskal Herriko Unibertsitatea (San Sebastian, Spain)
     5. Academia Sinica (Taipei, Taiwan)
     6. National Institute of Information and Communications Technology (Kyoto, Japan)
     7. Irion Technologies (Delft, The Netherlands)
     8. Synthema (Rome, Italy)
     9. European Centre for Nature Conservation (Tilburg, The Netherlands)
     • Subcontractors:
       – World Wide Fund for Nature (Zeist, The Netherlands)
       – Masaryk University (Brno, Czech Republic)

  5. [Figure: KYOTO system overview. Users: citizens, governors, companies, environmental organizations. Components: Domain Wiki, Global Wordnet Grid, Capture, Universal Ontology, Wordnets, Concept Mining, Fact Mining, Dialogue, Search, Index, Kybots. Sources: docs, URLs, experts, images. Ontology layers: Top (Θ, Abstract, Physical), Middle (Process, Substance), Domain (water pollution, CO2 emission).]

  6. Generic Knowledge & Language Layer
     [Figure: a language-independent central ontology (Top, Middle and Domain Ontology; type hierarchy and axioms) linked to language-dependent wordnets. Ontology sources: Dolce, Sumo, Milo, Meaning, others. Other sources: Wikipedia, Gemet, GEO DB.]

  7. Ontologize synsets
     • (Semi-)rigid type hierarchy in the ontology:
       – Canine => PoodleDog; NewfoundlandDog; DalmatianDog, etc.
     • Wordnet consists of names for (semi-)rigid dog-types and other words for dogs with roles:
       – NAMES for TYPES: { poodle } EN, { poedel } NL, { pudoru } JP ⇔ (instance x PoodleDog)
       – LABELS for ROLES: { watchdog } EN, { waakhond } NL, { banken } JP ⇒ ((instance x Canine) and (role x GuardingProcess))
     • Type hierarchy remains compact and pure
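The distinction on this slide between names for types and labels for roles can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Synset` class, the `to_expression` and `ontology_types` helpers, and the toy data are hypothetical and are not part of the KYOTO system; only the words and ontology expressions come from the slide.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Synset:
    """Hypothetical record for one wordnet synset mapped to the ontology."""
    lang: str
    words: Tuple[str, ...]
    ontology_type: str            # rigid type the synset points to
    role: Optional[str] = None    # set only when the synset labels a role


def to_expression(synset: Synset) -> str:
    """Render the mapping in the notation used on the slide."""
    if synset.role is None:
        # Name for a type: equivalent to an instance of that type.
        return f"(instance x {synset.ontology_type})"
    # Label for a role: an instance of an existing type playing a role.
    return f"((instance x {synset.ontology_type}) and (role x {synset.role}))"


def ontology_types(synsets) -> list:
    """Types the hierarchy needs; role labels reuse existing types."""
    return sorted({s.ontology_type for s in synsets})


SYNSETS = [
    Synset("EN", ("poodle",), "PoodleDog"),
    Synset("NL", ("poedel",), "PoodleDog"),
    Synset("JP", ("pudoru",), "PoodleDog"),
    Synset("EN", ("watchdog",), "Canine", role="GuardingProcess"),
    Synset("NL", ("waakhond",), "Canine", role="GuardingProcess"),
    Synset("JP", ("banken",), "Canine", role="GuardingProcess"),
]

for s in SYNSETS:
    print(s.lang, s.words[0], to_expression(s))
print(ontology_types(SYNSETS))
```

Six synsets across three languages require only two types here, which is one way to read the claim that the type hierarchy stays compact.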

  8. Ontologize
     – "theewater" (water for making tea), Dutch
     • (exists (?A ?W)
         (and
           (instance ?W Water)
           (hasPurposeForAgent ?W
             (exists (?T)
               (and
                 (instance ?T Tea)
                 (part ?W ?T))))))

  9. Ontologize
     • Ontologize concepts from a specific wordnet:
       – Only disjunct types need to be added (Fellbaum and Vossen 2007).
       – For example, CO2 is a type of substance, but greenhouse gas does not represent a different type of gas or substance; it refers to substances that play a specific role in specific circumstances.
     • All languages can contribute
     • Knowledge is shared among all participating languages through the mapping of the different wordnets to the ontology.

  10. Knowledge mining
      • Concept mining:
        – Extract terms and relations in a language
        – Map the terms to an existing wordnet
        – Ontologize terms to concepts and axioms
      • Fact mining:
        – Define logical patterns
        – Define expression rules in a language

  11. Concept mining
      [Figure: Source Documents pass through Linguistic Processors, giving the morpho-syntactic analysis [[the emission] NP [of greenhouse gases] PP [in agricultural areas] PP ] NP. Concept Miners build a term hierarchy (area, agricultural area; gas, greenhouse gas; emission) that is mapped to the English Wordnet synsets substance:1, location:3, natural process:1, region:3, emission:2, emission:3, gas:1, geographical area:1, area:1, greenhouse gas:1, rural area:1, farmland:2.]
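One simple way to derive a term hierarchy like the one in this example is to place each multiword term under the term that remains after stripping its leading modifier. The sketch below assumes English head-final noun phrases and already lemmatized terms; `build_term_hierarchy` is a hypothetical helper and is not claimed to be the method the Concept Miners actually use.

```python
def build_term_hierarchy(terms):
    """Map each term to its more specific terms.

    Assumption: the head of an English noun phrase is its last word, so
    stripping one leading modifier gives a more general term
    ("greenhouse gas" -> "gas").
    """
    children = {}
    for term in terms:
        words = term.split()
        children.setdefault(term, set())
        # Strip one leading modifier at a time until only the head is left.
        while len(words) > 1:
            parent = " ".join(words[1:])
            children.setdefault(parent, set()).add(" ".join(words))
            words = words[1:]
    return {parent: sorted(kids) for parent, kids in children.items()}


# Lemmatized terms from the example phrase
# "the emission of greenhouse gases in agricultural areas".
TERMS = ["emission", "greenhouse gas", "agricultural area"]
HIERARCHY = build_term_hierarchy(TERMS)

for parent, kids in sorted(HIERARCHY.items()):
    print(parent, "->", kids)
```

Each resulting term would then still have to be mapped to a wordnet synset, which this sketch does not attempt.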

  12. Concept integration
      [Figure: English Wordnet synsets (natural process:1, substance:1, emission:3, gas:1, emission:2, greenhouse gas:1) are ontologized into an ontology extended for the domain: Θ, Abstract, Physical, Substance, Process, Chemical Reaction, H2O, CO2, GreenhouseGas, GlobalWarming, CO2Emission, WaterPollution. Axiomatize: (instance s1 Substance) (instance e1 Warming) (katalyist s1 e1).]

  13. Fact mining
      • KYBOT = Knowledge Yielding Robot
      • Logical expression
        – (instance, e1, Burn) (instance, e2, Warming) (cause, e1, e2)
        – (instance, s1, CO2) (instance, e1, GlobalWarming) (katalyist, s1, e1)
      • Expression rules per language:
        – [N[s1]V[e1]] S
        – [N[e1]N[s1]] N
        – [[N[e1]][prep][N[s2]]] NP
      • Ontology * Wordnets
        – Capabilities
        – Conditions: WNT -> adjectives, WNT -> nouns
        – Causes: WNT -> verbs, WNT -> nouns
        – Process: DamageProcess, ProduceProcess
      • Kybot compiler
        – kybots = logical pattern + ontology + WN[Lx] + ER[Lx]
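To make the idea of a Kybot as "logical pattern + ontology + wordnet + expression rule" more concrete, here is a minimal Python sketch of the noun-preposition-noun rule. Everything in it is an assumption for illustration: the `Token` class, the toy `LEXICON` (standing in for the wordnet-to-ontology mapping), the `patient` relation name (borrowed from the "Patient" label in the fact-mining figure), and `match_process_patient` are hypothetical and do not reproduce the actual Kybot compiler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    lemma: str
    pos: str  # "N", "V" or "prep" in this toy tag set


# Hypothetical stand-in for wordnet + ontology:
# lemma -> (ontology class, top-level category).
LEXICON = {
    "emission": ("Emission", "Process"),
    "greenhouse gas": ("GreenhouseGas", "Substance"),
}


def match_process_patient(tokens):
    """Apply the rule N[e1] prep N[s2] with ontological constraints.

    The first noun must denote a Process and the second a Substance.
    Returns the facts for the first match, or an empty list.
    """
    for i in range(len(tokens) - 2):
        first, prep, second = tokens[i:i + 3]
        if (first.pos, prep.pos, second.pos) != ("N", "prep", "N"):
            continue
        e_class, e_category = LEXICON.get(first.lemma, (None, None))
        s_class, s_category = LEXICON.get(second.lemma, (None, None))
        if e_category == "Process" and s_category == "Substance":
            return [
                ("instance", "e1", e_class),
                ("instance", "s2", s_class),
                ("patient", "e1", "s2"),
            ]
    return []


PHRASE = [
    Token("emission", "N"),
    Token("of", "prep"),
    Token("greenhouse gas", "N"),
]
FACTS = match_process_patient(PHRASE)
for fact in FACTS:
    print(fact)
```

Because the constraints are stated over ontology categories rather than words, the same logical pattern could in principle be paired with a different lexicon and expression rule for another language.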

  14. Fact mining
      [Figure: Source Documents pass through Linguistic Processors, giving the morpho-syntactic analysis [[the emission] NP [of greenhouse gases] PP [in agricultural areas] PP ] NP. Fact analysis combines the ontology, logical expressions, and wordnets & linguistic expressions: [the emission] NP is Process: e1; [of greenhouse gases] PP is Patient: s2; [in agricultural areas] PP is Location: a3. Ontology: generic layer (Θ, Abstract, Physical, Substance, Process, Chemical Reaction, H2O, CO2) and domain layer (CO2 emission, water pollution).]

  15. Wiki for knowledge sharing
      • Uses the XFLOW workflow engine as underlying mechanism;
      • Easy interface tailored to domain experts who don't know the underlying complex data model (ontology plus multi-grid wordnet);
      • Simplified wiki syntax that is much easier to use for non-technical users than e.g. HTML;
      • Web-based interface;
      • Rollback mechanism: each change to the content is versioned;
      • Search functions: synset;
      • Automatic downloading of information from web resources, e.g. Wikipedia;
      • Support for collaborative editing and consensus achievement, such as discussion forums and a list of last updates;
      • Role-based user management.

  16. Wiki for knowledge sharing
      • Manage the underlying complex data model in order to keep it consistent:
        – "water pollution" is inserted into a language-specific wordnet by a domain expert
        – a new entry will be automatically inserted in the ontology extension and in every wordnet
        – list all dummy entries to be filled in
        – English used as the common ground language to support the extension and propagation of changes between the different wordnets and the ontology
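The propagation behaviour described on this slide can be sketched as follows. This is a toy model under stated assumptions, not the Wiki's implementation: the `WordnetGrid` class, its methods, and the concept identifier `WaterPollution` as a key are hypothetical, and real entries would carry far more structure than a single label.

```python
class WordnetGrid:
    """Toy model: one ontology extension shared by several wordnets."""

    def __init__(self, languages):
        # language -> {concept id -> label, or None for a dummy entry}
        self.wordnets = {lang: {} for lang in languages}
        self.ontology_extension = set()

    def insert(self, lang, label, concept_id):
        """Insert a term in one wordnet and propagate it everywhere else."""
        self.ontology_extension.add(concept_id)
        for other, entries in self.wordnets.items():
            if other == lang:
                entries[concept_id] = label
            else:
                # Create a dummy entry unless the language already has one.
                entries.setdefault(concept_id, None)

    def dummy_entries(self):
        """List (language, concept id) pairs that still need a label."""
        return sorted(
            (lang, concept_id)
            for lang, entries in self.wordnets.items()
            for concept_id, label in entries.items()
            if label is None
        )


grid = WordnetGrid(["EN", "IT", "NL"])
grid.insert("EN", "water pollution", "WaterPollution")
print(grid.dummy_entries())
```

A later `grid.insert("NL", ...)` for the same concept would fill the Dutch dummy entry and leave only the Italian one outstanding.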

  17. Evaluation
      • Wordnets and ontologies are evaluated across linguistic partners;
      • Language and ontology experts will use the Wiki system to build the basic ontology and wordnet layers needed for the extension to the domain;
      • Domain experts will use the top layer and middle layer of wordnets and ontologies plus the Wiki system to encode the knowledge in their domains and reach consensus;
      • The system is tested by integration in a retrieval system.

  18. Evaluation
      • Cross-lingual portal:
        – show the effects of deep semantic processing for user scenarios
        – match queries across languages and cultures
      • User queries processed by Kybots and matched with deep semantic patterns:
        – polluting substance and polluted substance

  19. Knowledge sharing
      • Domains share the generic:
        – Generic knowledge from the wordnets and the ontology is reused and shared in various domains
        – Generic Kybots (knowledge yielding miners) are reused and shared in various domains
      • Languages share the knowledge:
        – Ontologies (both generic and domain-specific) are shared across languages
        – Kybots (both generic and domain-specific) are reused and shared across languages

  20. Kybot sharing
      [Figure: Kybots combine the ontology, logical expressions, and wordnets with linguistic expressions. Generic layer: Θ, Abstract, Physical, Substance, Process, Chemical Reaction, H2O, CO2, with generic words. Domain layer: CO2 emission, water pollution, with domain words.]
