Natural Language and the Semantic Web: a crucial symbiosis



  1. Natural Language and the Semantic Web: a crucial symbiosis • Philipp Cimiano, Web Information Systems, TU Delft, The Netherlands • 7/16/09 • Delft University of Technology: Challenge the future

  2. Aims and Not-Aims • Aims • Overview • Raise Questions • Entertain and Encourage Not-Aims • Present my own work (only a bit ;-) • Present solutions or answers Semantic Web Summer School (SWSS09), Cercedilla 2

  3. Structure • The relation between ontologies and natural language • Applications at the ontology-language interface • Principled approaches to the language-ontology interface • The LexInfo model • Conclusion

  4. Symbiosis • The term symbiosis commonly describes close and often long-term interactions between different biological species.

  5. Different types of symbiotic relations • Mutualism is a biological interaction between two organisms in which each individual derives a fitness benefit, for example increased survivorship. • Commensalism is a class of relationship between two organisms in which one organism benefits but the other is unaffected. • Parasitism is a type of symbiotic relationship between two different organisms in which one organism, the parasite, benefits at the expense of the host, sometimes for a prolonged time. What type of relation exists between ontologies (as building blocks of the Semantic Web) and natural language?

  6. Fitness Benefit • What fitness benefit do ontologies derive from natural language?

  7. Symbol Grounding • Symbol Grounding Problem (Harnad 1990): it is very difficult (if not impossible) to express the meaning of a symbol within the system itself. We need anchoring to some external system. • In the case of ontologies, this external system is language. • We define local names as part of URIs: http://www.example.org#car • We specify labels for these URIs: rdfs:label(http://www.example.org#car, 'car') • We add natural language definitions of the classes and properties we define (e.g. using rdfs:comment): “A car is a wheeled motor vehicle used for transporting passengers, which also carries its own engine or motor.”
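
The anchoring described on this slide can be sketched in a few lines of Python. This is only an illustration, not the author's tooling: the helper name `describe` and the choice of N-Triples as output format are assumptions, and the URI, label and definition are the ones from the slide.

```python
# Minimal sketch: anchor a class URI to natural language via rdfs:label and
# rdfs:comment. `describe` is a hypothetical helper, not part of any library.
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def describe(uri, label, comment, lang="en"):
    """Return N-Triples lines that attach a label and a definition to `uri`."""

    def literal(text):
        # Escape backslashes and double quotes so the literal stays well-formed.
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"@{lang}'

    return [
        f"<{uri}> <{RDFS}label> {literal(label)} .",
        f"<{uri}> <{RDFS}comment> {literal(comment)} .",
    ]


triples = describe(
    "http://www.example.org#car",
    "car",
    "A car is a wheeled motor vehicle used for transporting passengers, "
    "which also carries its own engine or motor.",
)
for line in triples:
    print(line)
```

The URI by itself carries no meaning for a human reader; the two literals are what ground it in language.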

  8. Further benefits • Ontologies benefit from language: • Grounding of meaning for humans • Population of ontologies from textual data (massively available) • Language-based interaction with knowledge (e.g. querying by way of natural language) • Reading documents describing how humans perceive the world to support ontology engineering (consulting domain-specific literature is an important step in most knowledge engineering methodologies)

  9. A commensal or parasitic relation? • So is the relation commensal or even parasitic, in the sense that ontologies need language (to ground the meaning of symbols) but language does not profit from ontologies?

  10. NLP benefits from ontologies • In formal semantics it is assumed that meaning can be captured by a logical formalism (typically FOL) which supports reasoning and the drawing of inferences (humans clearly do so). • The meaning of the sentence “Vincent is married to Mia” is: marriedTo(vincent, mia) • But what do these symbols mean in the logical system in terms of the conclusions we can draw? (Is marriedTo symmetric? Timeless?) • What are the legal symbols that we can use? (a question of ontology)

  11. There are a number of ways in which meaning can be represented, e.g. for “Vincent is married to Mia.”: • marriedTo(vincent, mia) • ∃x marriage(x) ∧ partner(x, vincent) ∧ partner(x, mia) • ∃x marriage(x) ∧ partner(x, vincent) ∧ partner(x, mia) ∧ holdsDuring(x, interval) ∧ overlap(interval, now)

  12. Word Sense Disambiguation (WSD) • It is well known that words have different senses (at least 10 according to WordNet!) • There is no limit to the senses that we can consider (very fine-grained)

  13. Named Entity Recognition • Named entity recognition recognizes entities of a certain type in textual data: <painter>Rembrandt Harmenszoon van Rijn </painter> was born on <date> July 15, 1606 </date> in <city> Leiden </city>, <country> the Netherlands </country>. He was the ninth child born to <person> Harmen Gerritszoon van Rijn </person> and <person> Neeltgen Willemsdochter van Zuytbrouck </person>. • Arbitrary number of possible types and granularity (tag people as person or according to their profession etc.)
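
To make the granularity point concrete, here is a small Python sketch that reads inline annotations like those on the slide and then coarsens the types. It only parses text that is already tagged; it is not a recognizer. The names `extract_entities` and `COARSE_TYPE`, and the mapping from painter to person, are assumptions made for illustration.

```python
import re

# Matches <type> text </type> with the same tag name on both sides.
TAG = re.compile(r"<(\w+)>\s*(.*?)\s*</\1>")

# Hypothetical coarsening: which fine-grained types collapse into which type
# is exactly the kind of decision an ontology would fix.
COARSE_TYPE = {"painter": "person"}


def extract_entities(annotated_text):
    """Return (type, surface form) pairs for every inline annotation."""
    return [(m.group(1), m.group(2)) for m in TAG.finditer(annotated_text)]


def coarsen(entities):
    """Map fine-grained types to coarser ones, leaving unknown types as-is."""
    return [(COARSE_TYPE.get(etype, etype), text) for etype, text in entities]


annotated = (
    "<painter>Rembrandt Harmenszoon van Rijn </painter> was born on "
    "<date> July 15, 1606 </date> in <city> Leiden </city>, "
    "<country> the Netherlands </country>."
)
entities = extract_entities(annotated)
print(entities)
print(coarsen(entities))
```

The same mention ends up as a painter or as a person depending on the type inventory chosen, which is the choice the slide attributes to the ontology.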

  14. Semantic Normalization • The Liffey flows through Dublin. => flowsThrough(Liffey,Dublin) • Dublin lies at the Liffey. => lies_at(Dublin,Liffey) • Dublin is located at the Liffey. => located_at(Dublin,Liffey) • The Liffey passes Dublin. => passes(Liffey,Dublin)
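
One way to read this slide is that all four surface predicates should collapse into a single ontological relation. The Python sketch below assumes that `flowsThrough(river, city)` is the canonical form and that a hand-written table decides whether arguments are swapped; both are illustrative assumptions, not something the slides prescribe.

```python
# Surface predicate -> (canonical predicate, swap arguments?).
# The table is hypothetical; in practice it would be derived from a lexicon
# or learned from examples.
NORMALIZATION = {
    "flowsThrough": ("flowsThrough", False),
    "lies_at": ("flowsThrough", True),
    "located_at": ("flowsThrough", True),
    "passes": ("flowsThrough", False),
}


def normalize(predicate, first, second):
    """Rewrite a surface-level fact into the canonical ontological relation."""
    canonical, swap = NORMALIZATION[predicate]
    return (canonical, second, first) if swap else (canonical, first, second)


surface_facts = [
    ("flowsThrough", "Liffey", "Dublin"),
    ("lies_at", "Dublin", "Liffey"),
    ("located_at", "Dublin", "Liffey"),
    ("passes", "Liffey", "Dublin"),
]
normalized = {normalize(*fact) for fact in surface_facts}
print(normalized)
```

All four sentences yield the same fact, so the resulting set has a single element.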

  15. Ontologies are crucial for the analysis of natural language • Ontologies define and axiomatize a vocabulary. • They define the meaning of symbols so that we can reason with them (e.g. marriedTo is symmetric and bound to a certain time interval). • They define the granularity for WSD, NER and other tasks. • They support normalization. • They help to constrain the task of interpreting language for a specific purpose, domain, application etc.
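
As a small illustration of what axiomatizing a symbol buys us, the sketch below treats symmetry of marriedTo as the only axiom and derives the inverse fact. The function name `symmetric_closure` and the triple layout (predicate, subject, object) are assumptions; temporal scoping is left out.

```python
def symmetric_closure(facts, symmetric_predicates):
    """Add the inverse of every fact whose predicate is declared symmetric."""
    closed = set(facts)
    for predicate, subject, obj in facts:
        if predicate in symmetric_predicates:
            closed.add((predicate, obj, subject))
    return closed


facts = {("marriedTo", "vincent", "mia")}

# With the axiom, marriedTo(mia, vincent) becomes derivable.
with_axiom = symmetric_closure(facts, {"marriedTo"})

# Without it, the symbol licenses no further conclusions.
without_axiom = symmetric_closure(facts, set())

print(sorted(with_axiom))
print(sorted(without_axiom))
```

The string `marriedTo` is the same in both runs; only the declared axiom changes which conclusions follow.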

  16. Applications at the interface between language and ontologies • Information Extraction / Ontology Population • Ontology-based Question Answering • Ontology Engineering • Ontology Verbalization

  17. Scenario • Assume we have an ontology about artists modeling: • Name • Birth and death dates • Birth and death places • Marriages, children • Paintings with their creation date • Influences by other artists • Etc. There are many artists, so it is hard to add all relevant instances manually. Textual data is massively available, so what about extracting information from textual data to populate the ontology automatically? This process is typically referred to as ontology population (as opposed to ontology learning, which tries to learn the actual schema).

  18. 1. Ontology Population/Information Extraction • “Claude Monet was born on 14 November 1840 on the fifth floor of 45 rue Laffitte, in the ninth arrondissement of Paris.” -> birthplace(Claude Monet, Paris) -> birthdate(Claude Monet, 14.11.1840) • “Monet lived from December 1871 to 1878 at Argenteuil, a village on the Seine near Paris.” -> type(stay_Monet_Paris, Stay) -> artist(stay_Monet_Paris, Claude_Monet) -> place(stay_Monet_Paris, Argenteuil) -> during(stay_Monet_Paris, interval_1871_1878)
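
A hand-written pattern is enough to reproduce the first example and to show why the approach is brittle. The sketch below is an assumption-laden illustration: the regular expression, the helper `extract_birth_facts` and the ISO date format are choices made here, not the method of any system named in these slides.

```python
import re

MONTHS = {
    name: number
    for number, name in enumerate(
        ["January", "February", "March", "April", "May", "June", "July",
         "August", "September", "October", "November", "December"],
        start=1,
    )
}

# "<Name> was born on <day> <month> <year> ... of <Place>."
# Deliberately narrow: it covers this one phrasing and nothing else.
BORN = re.compile(
    r"(?P<name>[A-Z]\w+(?: [A-Z]\w+)*) was born on "
    r"(?P<day>\d{1,2}) (?P<month>[A-Z][a-z]+) (?P<year>\d{4})"
    r".*?\bof (?P<place>[A-Z]\w+)\."
)


def extract_birth_facts(sentence):
    """Return birthplace/birthdate triples, or an empty list if nothing matches."""
    match = BORN.search(sentence)
    if match is None or match.group("month") not in MONTHS:
        return []
    artist = match.group("name").replace(" ", "_")
    date = "{:04d}-{:02d}-{:02d}".format(
        int(match.group("year")), MONTHS[match.group("month")], int(match.group("day"))
    )
    return [
        ("birthplace", artist, match.group("place")),
        ("birthdate", artist, date),
    ]


sentence = (
    "Claude Monet was born on 14 November 1840 on the fifth floor of "
    "45 rue Laffitte, in the ninth arrondissement of Paris."
)
birth_facts = extract_birth_facts(sentence)
print(birth_facts)
```

Any rewording of the sentence defeats the pattern, which is why capturing variants usually falls to learned extractors rather than hand-written rules.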

  19. Challenges for Ontology Population • Normalization (different variants map to the same ontological representation) • Capture different variants (learn from examples using machine learning techniques, sparseness) • Ontology-sensitive processing • Granularity of word senses that need to be distinguished • Use ontology for disambiguation • Ignore constituents which are not relevant to the ontology

  20. 2. Question Answering • Ontologies model relevant world knowledge in a certain domain. • As humans, we are interested in accessing this knowledge, preferably in an intuitive way (e.g. by means of natural language). • Many systems have been designed in the past to meet this need: • Aqualog [Lopez and Motta 2004] • ORAKEL [Cimiano et al. 2008] • GINO [Bernstein and Kaufmann 2006] • And many more… • E.g. Who is a professor at the Knowledge Media Institute? • Prof. Enrico Motta • Prof. John Domingue • Prof. Stefan Rueger

  21. Ontology-based Question Answering (e.g. Aqualog [Lopez and Motta 2004]) • Who is a Professor at the Knowledge Media Institute? => (Who,is,professor) (professor,at,Knowledge Media Institute) => RSS => <typeOf ?x Professor-in-Academia> & <works-in-unit ?x KMi>
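
The two-step mapping on this slide (linguistic triples first, ontology triples second) can be mimicked with dictionary lookups. This sketch is not Aqualog's implementation: the three lexicons, the functions `to_query` and `answer`, and the toy knowledge base are assumptions, populated only with names that appear in these slides.

```python
# Hypothetical lexicons linking words to ontology terms.
CLASS_LEXICON = {"professor": "Professor-in-Academia"}
RELATION_LEXICON = {"at": "works-in-unit"}
ENTITY_LEXICON = {"knowledge media institute": "KMi"}

# Toy knowledge base of (subject, predicate, object) triples.
KB = {
    ("Enrico_Motta", "typeOf", "Professor-in-Academia"),
    ("Enrico_Motta", "works-in-unit", "KMi"),
    ("John_Domingue", "typeOf", "Professor-in-Academia"),
    ("John_Domingue", "works-in-unit", "KMi"),
    ("Stefan_Rueger", "typeOf", "Professor-in-Academia"),
    ("Stefan_Rueger", "works-in-unit", "KMi"),
}


def to_query(linguistic_triples):
    """Translate linguistic triples into (predicate, object) patterns over ?x."""
    patterns = []
    for subject, relation, obj in linguistic_triples:
        if subject.lower() == "who" and relation == "is":
            patterns.append(("typeOf", CLASS_LEXICON[obj.lower()]))
        else:
            patterns.append((RELATION_LEXICON[relation], ENTITY_LEXICON[obj.lower()]))
    return patterns


def answer(patterns, kb):
    """Return every subject that satisfies all patterns, sorted by name."""
    subjects = {s for s, _, _ in kb}
    return sorted(s for s in subjects if all((s, p, o) in kb for p, o in patterns))


question = [
    ("Who", "is", "professor"),
    ("professor", "at", "Knowledge Media Institute"),
]
query = to_query(question)
print(query)
print(answer(query, KB))
```

A real system has to rank candidate mappings by similarity instead of relying on exact dictionary hits, so the lookups here stand in for the hard part.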

  22. Ontology-based Question Answering • Who is PC member of the ISWC conference? => (Who,is,PC Member) (PC Member,of,ISWC Conference) => (?x,PCMemberOf,ISWC)

  23. 3. Language in Ontology Engineering • Ontology languages (OWL and RDFS) are hard to grasp, both semantically and syntactically. • RDF/XML and OWL/XML syntaxes are hard for humans to read • OWL Abstract Syntax is only for experts (logicians) • Manchester syntax is more intuitive, but not for “casual users”: Pizza AND NOT (hasTopping SOME FishTopping) AND NOT (hasTopping SOME MeatTopping) • The idea has been to allow people to model ontological knowledge using natural language. Most approaches along these lines rely on “controlled natural language”.
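
To show what the Manchester syntax expression on this slide asks for, the sketch below checks it against a handful of individuals. Two caveats apply: the individuals and their toppings are invented purely for illustration, and the check is closed-world (a missing topping counts as absent), whereas OWL reasoning is open-world and would need explicit closure axioms to reach the same verdict.

```python
# Hypothetical toy data; none of these individuals come from the slides.
TYPES = {
    "margherita": {"Pizza"},
    "napoletana": {"Pizza"},
    "mozzarella": {"CheeseTopping"},
    "anchovies": {"FishTopping"},
}
HAS_TOPPING = {
    "margherita": {"mozzarella"},
    "napoletana": {"mozzarella", "anchovies"},
}


def has_some(individual, filler_class):
    """hasTopping SOME filler_class, evaluated over the asserted toppings only."""
    toppings = HAS_TOPPING.get(individual, set())
    return any(filler_class in TYPES.get(topping, set()) for topping in toppings)


def matches(individual):
    """Pizza AND NOT (hasTopping SOME FishTopping) AND NOT (hasTopping SOME MeatTopping)."""
    return (
        "Pizza" in TYPES.get(individual, set())
        and not has_some(individual, "FishTopping")
        and not has_some(individual, "MeatTopping")
    )


for name in sorted(TYPES):
    print(name, matches(name))
```

Even this short expression needs a reading of SOME and of negation, which is part of why it is not a notation for casual users.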
