
Ontology Maintenance Support: Text, Tools, and Theories



  1. Ontology Maintenance Support: Text, Tools, and Theories
     Chris Welty, IBM Research

  2. Outline
     • Opening joke
     • Motivation
     • Maintenance
     • Support
       – Tools
       – Theories
       – Text Analysis

  3. Motivation
     • Given: Ontologies matter
       – Does quality matter?

  4. Does quality matter?
     • Good quality ontologies cost more
       – Coverage, correctness, richness, commitment [Kashyap, 2003]
       – Organization, meta-level consistency [Guarino & Welty, 2000] [Rector, 2002]
       – Required for some applications
     • Improvements in quality can improve performance [Welty, et al, 2004]
       – 18% f-improvement in search
       – Cleanup cost ~ 1mw/3000 classes
       – BUT … low quality ontology still improved base

  5. Motivation
     • Given: Ontologies matter
       – Does quality matter? Sometimes
     • Problem: How to create them
     • Bigger problem: how to maintain them
       – From SE: 80% of the cost is maintenance [Schrobe, 1996]

  6. Software Maintenance
     • Fixing Bugs
     • Testing
     • Enhancing

  7. Ontology Maintenance
     • Fixing Bugs
       – Inconsistent
       – Inaccurate
       – Inefficient
     • Testing
       – Regression tests
       – Test Suites
       – Meta tag sets for test content
       – Ablation tests
     • Enhancing
       – Tweaking
         • Richness
         • Correctness
         • Organization
         • Meta-level consistency
         • Efficiency
       – Extending
         • Improving coverage
         • Extending commitment
         • Integration
       – Refactoring
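One way to picture the regression tests and test suites listed on this slide is as stored question/expected-answer pairs that are re-run after every change to the ontology. The sketch below is only an illustration under that assumption; `run_regression`, `toy_query`, and the toy class data are hypothetical names and values, not anything described in the talk.

```python
def run_regression(query, test_suite):
    """Return the questions whose answers differ from the stored expectation.

    query:      callable mapping a question string to an iterable of answers
    test_suite: dict mapping question -> expected answers
    """
    failures = {}
    for question, expected in test_suite.items():
        actual = set(query(question))
        if actual != set(expected):
            failures[question] = {"expected": set(expected), "actual": actual}
    return failures


# Hypothetical toy ontology: class name -> direct subclasses.
toy = {"Computer": ["Laptop", "Desktop"]}


def toy_query(question):
    # Here a "question" is simply a class name; the answer is its subclasses.
    return toy.get(question, [])


# Expected answers recorded while the ontology was in a known-good state.
suite = {"Computer": ["Laptop", "Desktop"]}

before = run_regression(toy_query, suite)   # no edits yet
toy["Computer"].append("Disk Drive")        # a faulty edit (part modeled as subclass)
after = run_regression(toy_query, suite)    # the stored question now fails
```

Running the suite after each edit flags the faulty change without anyone having to inspect the taxonomy by hand.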

  8. A looming problem
     • Prediction
       – Ontology maintenance will become the significant problem as ontologies become more mainstream
       – Will follow the SE model (80% of cost)
     • Observation/Conjecture
       – High quality ontologies are easier to maintain

  9. Tool Support
     • Hierarchical view of classes
     • Hierarchical view of properties
     • Consistency Reasoning
       – But….no “segmentation faults”
     • Inferential Reasoning
     • View non-tree taxonomies
     • View relations between classes
     • Global axioms
     • View meta-level
     • Basic Upper-level Theories
       – Space, Time, Parts, …
     • Assistance for integration

  10. Theory Support
      • Meta-level analysis
        – OntoClean [Guarino & Welty, 2000]
      • Good organizing principles
        – R-Normalization [Rector, 2002]
      • Well-founded upper levels
        – Dolce [Gangemi, et al., 2003]
        – DAML-Time [Hobbs, 2003]
        – RCC [Randell, Cui & Cohn, 1992]

  11. OntoClean
      • Draw fundamental notions from Formal Ontology
      • Establish a set of useful meta-properties, based on behavior wrt above notions
      • Explore the way these meta-properties combine to form relevant property kinds
      • Explore the taxonomic constraints imposed by these property kinds
        – Expose common modeling pitfalls

  12. Overloading Subsumption
      Common modeling pitfalls
      • Instantiation
      • Constitution
      • Composition
      • Disjunction
      • Polysemy

  13. Instantiation
      [Diagram: ThinkPad Model, T21, My ThinkPad (s# xx123)]
      Does this ontology mean that My ThinkPad is a ThinkPad Model? Ooops…
      Question: What ThinkPad models do you sell?
      Answer should NOT include My ThinkPad -- nor yours.
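The instantiation pitfall can be sketched in a few lines of code. The example assumes the slide's diagram places T21 under ThinkPad Model and My ThinkPad under T21; the `Taxonomy` class and its fields are hypothetical, introduced only for illustration. When the individual laptop is attached with the same transitive subsumption link as the model, a query for models returns it; when instantiation is kept as a separate relation, it does not.

```python
from dataclasses import dataclass, field


@dataclass
class Taxonomy:
    # child class -> parent class (subsumption, "is a kind of")
    subclass_of: dict = field(default_factory=dict)
    # individual -> class (instantiation, "is an instance of")
    instance_of: dict = field(default_factory=dict)

    def subclasses(self, cls):
        """All classes transitively subsumed by cls."""
        found = set()
        frontier = [cls]
        while frontier:
            current = frontier.pop()
            for child, parent in self.subclass_of.items():
                if parent == current and child not in found:
                    found.add(child)
                    frontier.append(child)
        return found


# Overloaded: the individual machine is linked by subsumption as well.
overloaded = Taxonomy(
    subclass_of={"T21": "ThinkPad Model", "My ThinkPad (s# xx123)": "T21"}
)

# Separated: the individual machine is an instance of T21, not a subclass.
separated = Taxonomy(
    subclass_of={"T21": "ThinkPad Model"},
    instance_of={"My ThinkPad (s# xx123)": "T21"},
)

# "What ThinkPad models do you sell?"
models_overloaded = overloaded.subclasses("ThinkPad Model")
models_separated = separated.subclasses("ThinkPad Model")
```

With the overloaded taxonomy the answer wrongly contains the individual laptop; with the separated one it contains only T21.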

  14. Composition
      [Diagram: Computer, Disk Drive, Memory, Micro Drive]
      Question: What Computers do you sell?
      Answer should NOT include Disk Drives or Memory.

  15. Disjunction
      [Diagram: Computer has-part Computer Part; Disk Drive, Memory, Micro Drive; Camera-15 has-part Flashcard-110]
      Unintended model: flashcard-110 is a computer-part
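The unintended model on this slide can be reproduced with a small sketch. It assumes that Computer Part is defined as the disjunction "Disk Drive or Memory" and that flashcard-110 is a Memory installed in camera-15; both are readings of the diagram, not statements from the talk. The individuals `microdrive-7` and `computer-3` and both function names are hypothetical.

```python
# Type of each individual part (flashcard-110 being a Memory is an assumption).
types = {"flashcard-110": "Memory", "microdrive-7": "Disk Drive"}

# has-part, stored from the part's side: part -> whole.
part_of = {"flashcard-110": "camera-15", "microdrive-7": "computer-3"}

# Type of each whole.
kind_of_whole = {"camera-15": "Camera", "computer-3": "Computer"}


def is_computer_part_by_disjunction(x):
    """Computer Part defined as 'Disk Drive OR Memory' (the pitfall)."""
    return types[x] in {"Disk Drive", "Memory"}


def is_computer_part_by_relation(x):
    """Computer Part defined through has-part: x must be part of a Computer."""
    whole = part_of.get(x)
    return whole is not None and kind_of_whole.get(whole) == "Computer"


flashcard_disjunction = is_computer_part_by_disjunction("flashcard-110")
flashcard_relation = is_computer_part_by_relation("flashcard-110")
microdrive_relation = is_computer_part_by_relation("microdrive-7")
```

Under the disjunctive definition the camera's flashcard counts as a computer part; under the relational definition it does not, while the drive inside a computer still does.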

  16. Polysemy
      [Diagram: Physical Object, Abstract Entity, Book, …]
      Question: How many books do you have on Hemingway?
      Answer: 5,000

  17. Constitution
      [Diagram: Entity; Amount of Matter, Physical Object; Clay, Metal; Computer]
      Question: What types of matter will conduct electricity?
      Answer should NOT include computers.

  18. Text Analysis Support
      • Document Classification
        – Subject hierarchies
        – Identify relevant concepts
      • Information Extraction
        – Find individuals
        – Glossary extraction [Park, 2004]

  19. Concept-specific Ontology Building through Search
      • Human expert knows what she is interested in: anchor concept
      • Find relations and other related concepts for the anchor concept
      • Active acquisition of knowledge sources through search
        – Concept-defining knowledge source: glossaries or dictionaries
        – Up-to-date knowledge source: web documents
      • Very useful for recognizing missing terms

  20. Domain Term Recognition
      • Nominal Expressions
        – acute radiation syndrome
        – intercontinental and submarine-launched ballistic missile
        – highly enriched uranium
      • New Domain Word Identification
        – agroterrorism, astrobiology, biocomputation
      • Generic Premodifier Filtering
        – average radial first harmonic runout
        – absolute amazement/zero
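Generic premodifier filtering can be sketched as stripping leading words that carry no domain meaning from a candidate term. The version below is a minimal illustration, not the method used in the talk: the word list contains only the two premodifiers shown on the slide, and the `protected` set reflects one possible reading of the "absolute amazement/zero" example, namely that "absolute" is generic in "absolute amazement" but belongs to the term in "absolute zero". The function and parameter names are hypothetical.

```python
# Only the two premodifiers from the slide; a real list is assumed to be longer.
GENERIC_PREMODIFIERS = frozenset({"average", "absolute"})

# Terms in which the premodifier is part of the domain term (assumed reading).
PROTECTED_TERMS = frozenset({"absolute zero"})


def filter_premodifiers(term, generic=GENERIC_PREMODIFIERS, protected=PROTECTED_TERMS):
    """Drop leading generic premodifiers unless the whole term is protected."""
    if term.lower() in protected:
        return term
    words = term.split()
    # Always keep at least one word so the term never becomes empty.
    while len(words) > 1 and words[0].lower() in generic:
        words = words[1:]
    return " ".join(words)


runout = filter_premodifiers("average radial first harmonic runout")
amazement = filter_premodifiers("absolute amazement")
zero = filter_premodifiers("absolute zero")
```

A fixed protected list is the simplest way to keep terms such as "absolute zero" intact; a real system would more likely decide this from corpus statistics or a glossary.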
