Generic Ontology Learners on Application Domains Francesca Fallucchi - PowerPoint PPT Presentation

Generic Ontology Learners on Application Domains Francesca Fallucchi 1 Maria Teresa Pazienza 1 Fabio Massimo Zanzotto 1 1 DISP University of Rome Tor Vergata Rome, Italy {fallucchi,pazienza,zanzotto}@info.uniroma2.it LREC 2010, Malta, May 2010


  1. Generic Ontology Learners on Application Domains. Francesca Fallucchi, Maria Teresa Pazienza, Fabio Massimo Zanzotto (DISP, University of Rome Tor Vergata, Rome, Italy), {fallucchi,pazienza,zanzotto}@info.uniroma2.it. LREC 2010, Malta, May 2010.

  2. [Motivations | Probabilistic Ontology Learning | Experimental Evaluation | Conclusions] Motivation: Learning methods require large general corpora and knowledge repositories. In specific domains, ontologies are extremely poor. Manually building ontologies is a very time-consuming and expensive task. Automatically creating or extending ontologies needs large corpora and existing structured knowledge to achieve reasonable performance.

  4. Motivation. Problems: existing ontologies cover few domains, and there are no relevant existing ontologies to expand for the target domain. Solution: We propose a model that can be used in different specific knowledge domains with little adaptation effort. Our model is learned from a generic domain and can be exploited to extract new information in a specific domain.

  5. Outline: (1) Motivations; (2) Probabilistic Ontology Learning: Corpus Analysis, A Probabilistic Model, Logistic Regression; (3) Experimental Evaluation: Experimental Set-Up, Agreement, Results; (4) Conclusions and Future Works.

  6. Our Learner Model: The model exploits the information learned in a background domain to extract information in an adaptation domain. It is based on a probabilistic formulation. It takes into consideration corpus-extracted evidence over a list of training pairs. It is used to estimate the probabilities of new instances by computing a new feature space.

  11. Corpus Analysis. [Figure: a context found in the corpus, ... “dog” , as “animal” ..., yields the instance (dog, animal). The features extracted from that context (",", "as", and ", as") are each given the value 1 in the feature space.]
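
The feature extraction suggested by this slide can be sketched as follows. This is only an illustration: the function name `extract_features`, the `max_len` cut-off, and the choice to emit every contiguous n-gram of the tokens between the two words are assumptions, not details given in the slides.

```python
def extract_features(tokens, x, y, max_len=3):
    """Return the features found between x and y in one tokenized context.

    Assumption: a feature is any contiguous n-gram (n = 1 .. len) of the
    tokens lying strictly between x and y, provided that span has at most
    max_len tokens. Unigrams play the role of bag-of-words features.
    """
    try:
        i = tokens.index(x)
        j = tokens.index(y, i + 1)  # y must occur after x
    except ValueError:
        return set()
    between = tokens[i + 1:j]
    if not between or len(between) > max_len:
        return set()
    features = set()
    for n in range(1, len(between) + 1):
        for start in range(len(between) - n + 1):
            features.add(" ".join(between[start:start + n]))
    return features


# The context from the slide, already tokenized (tokenization is assumed).
context = ["dog", ",", "as", "animal"]
print(sorted(extract_features(context, "dog", "animal")))
```

Each returned feature would then be marked with a 1 for the instance (dog, animal), which matches the three features shown on the slide.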

  16. Corpus Analysis. [Figure: each corpus context links a pair of words through features, e.g. $X_1\ f_1\ f_2\ Y_1$. The instances $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ are set against the features $f_1, f_2, f_3, \ldots, f_m$ in a grid, with a dot wherever a feature was observed for an instance.]

  18. Instances Matrix and Evidences Matrix. [Figure: corpus contexts such as $X_1\ f_1\ f_2\ Y_1$ fill a grid of the instances $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$ against the features $f_1, f_2, f_3, \ldots, f_m$, with a dot wherever a feature was observed for an instance.] The evidences matrix is $E = (\vec{e}_1 \ldots \vec{e}_n)$.
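
A minimal sketch of how such an instance-by-feature grid might be assembled, one evidence vector per instance. The function name, the input format, and the use of binary indicators rather than counts are assumptions made for illustration; the second pair and the "is a" feature are invented sample data.

```python
from collections import defaultdict


def build_evidence_vectors(observations):
    """Turn (pair, feature) observations into one binary vector per pair.

    observations: iterable of ((x, y), feature) tuples, one for each time
    a feature was seen in a corpus context linking x and y.
    Returns (pairs, features, vectors) where vectors[i][j] is 1 if
    features[j] was observed for pairs[i], else 0.
    """
    pairs = []     # instances (X_i, Y_i), in order of first appearance
    features = []  # features f_j, in order of first appearance
    seen = defaultdict(set)
    for pair, feature in observations:
        if pair not in seen:
            pairs.append(pair)
        if feature not in features:
            features.append(feature)
        seen[pair].add(feature)
    vectors = [[1 if f in seen[p] else 0 for f in features] for p in pairs]
    return pairs, features, vectors


# Hypothetical observations; only (dog, animal) comes from the slides.
observations = [
    (("dog", "animal"), ", as"),
    (("dog", "animal"), "as"),
    (("dog", "animal"), ","),
    (("cat", "animal"), "as"),
    (("cat", "animal"), "is a"),
]
pairs, features, vectors = build_evidence_vectors(observations)
for pair, vector in zip(pairs, vectors):
    print(pair, vector)
```

Each entry of `vectors` plays the role of one $\vec{e}_i$; stacking them gives the evidences matrix in whichever orientation the later formulas require.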

  19. A Probabilistic Model: A probabilistic model for learning ontologies from corpora. An ontology is seen as a set $O$ of relations $R$ over pairs, $R_{i,j}$. If $R_{i,j}$ is in $O$, $i$ is a concept and $j$ is one of its generalizations. Goal: estimate the posterior probability $P(R_{i,j} \in O \mid E)$, where $E$ is a set of evidences extracted from the corpus.

  22. Logistic Regression. Logit: given two variables $Y$ and $X$, the probability $p$ of $Y$ being 1 given that $X = x$ is $p = P(Y = 1 \mid X = x)$, with $Y \sim \mathrm{Bernoulli}(p)$. The logit is $\operatorname{logit}(p) = \ln\left(\frac{p}{1-p}\right)$, and it is modelled as $\operatorname{logit}(p) = \beta_0 + \beta_1 x_1 + \ldots + \beta_k x_k$. Given the regression coefficients, the probability is $p(x) = \frac{\exp(\beta_0 + \beta_1 x_1 + \ldots + \beta_k x_k)}{1 + \exp(\beta_0 + \beta_1 x_1 + \ldots + \beta_k x_k)}$.
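
The two formulas on this slide are inverses of each other, which a few lines of code can make concrete. The function names and the sample coefficients below are illustrative assumptions only.

```python
import math


def logit(p):
    """logit(p) = ln(p / (1 - p)), defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def logistic_probability(betas, x):
    """p(x) = exp(z) / (1 + exp(z)) with z = b0 + b1*x1 + ... + bk*xk.

    betas holds [b0, b1, ..., bk]; x holds [x1, ..., xk].
    The two branches are algebraically identical and only avoid overflow.
    """
    z = betas[0] + sum(b * xi for b, xi in zip(betas[1:], x))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical coefficients and feature values.
betas = [0.5, 2.0, -1.0]
x = [1.0, 3.0]
p = logistic_probability(betas, x)
print(p, logit(p))  # logit(p) recovers the linear score z
```

Applying `logit` to the returned probability gives back the linear combination of the coefficients, which is why the estimation problem can be posed on the logits.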

  24. Estimating Regression Coefficients: We estimate the regressors $\beta_0, \beta_1, \ldots, \beta_k$ of $x_1, \ldots, x_k$ by maximum likelihood estimation, $\operatorname{logit}(p) = \beta_0 + \beta_1 x_1 + \ldots + \beta_k x_k$, solving a linear problem $\overrightarrow{\operatorname{logit}(p)} = E\beta$, where
      \[ E = \begin{pmatrix} 1 & e_{11} & e_{12} & \cdots & e_{1n} \\ 1 & e_{21} & e_{22} & \cdots & e_{2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & e_{m1} & e_{m2} & \cdots & e_{mn} \end{pmatrix} \]

  25. Background Ontology Learner: uses a logistic regressor based on the Moore-Penrose pseudo-inverse matrix (Fallucchi and Zanzotto, RANLP 2009): $\hat{\beta} = X^{+}_{C_B}\, l$, where $X^{+}_{C_B}$ is the pseudo-inverse matrix of the evidences matrix $X_{C_B}$ obtained from a generic corpus $C_B$, and $l$ is the logit vector ($\overrightarrow{\operatorname{logit}(p)}$).
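
A rough sketch of the estimation step, under a simplifying assumption: when the evidences matrix has full column rank, the Moore-Penrose pseudo-inverse equals $(X^{T}X)^{-1}X^{T}$, so the coefficients can be obtained by solving the normal equations. The code below does exactly that in plain Python and is not the authors' implementation; a general pseudo-inverse (for example `numpy.linalg.pinv`) would be needed for rank-deficient matrices. The slides do not say how the training logit vector is built, so the sample logits are invented.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def solve(a, b):
    """Solve a * x = b for square a (Gaussian elimination, partial pivoting)."""
    n = len(a)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: full column rank is assumed")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def estimate_betas(evidence_rows, logits):
    """Return [b0, b1, ..., bk] = (X^T X)^{-1} X^T l.

    evidence_rows: one evidence vector per training pair (no intercept).
    logits: the logit value assigned to each training pair.
    """
    x = [[1.0] + list(row) for row in evidence_rows]  # intercept column
    xt = transpose(x)
    xtx = [[sum(a * b for a, b in zip(r, c)) for c in xt] for r in xt]
    xtl = [sum(v * l for v, l in zip(col, logits)) for col in xt]
    return solve(xtx, xtl)


# Hypothetical training data: four pairs, two binary features.
rows = [[0, 0], [1, 0], [0, 1], [1, 1]]
training_logits = [-2.0, 1.0, -1.0, 2.0]
print(estimate_betas(rows, training_logits))
```

The sample logits were generated from the coefficients (-2, 3, 1), so the call should recover them up to rounding.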

  27. Estimator for Application Domain: The logit of the testing pairs is $l' = \alpha X_{C_A} \hat{\beta}$, where $\alpha$ is a parameter used to adapt the model given by the $\beta$ vector to the new domain, $X_{C_A}$ is the inverse evidence matrix obtained from an adaptation domain corpus $C_A$, and $\hat{\beta}$ is the regressors vector. Then, pair by pair, the probability of each testing pair is $p_i = \frac{\exp(l'_i)}{1 + \exp(l'_i)}$.
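
The scoring step can be sketched as below. The sketch reads $X_{C_A}$ as a matrix whose rows are the evidence vectors of the testing pairs, with the intercept handled separately; that reading, the function names, and all numeric values are assumptions for illustration.

```python
import math


def sigmoid(z):
    """exp(z) / (1 + exp(z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score_testing_pairs(x_ca, betas, alpha):
    """Return p_i for each testing pair, using l' = alpha * X_CA * beta.

    x_ca: one evidence vector per testing pair (adaptation corpus).
    betas: [b0, b1, ..., bk] learned on the background corpus.
    alpha: scalar adaptation parameter.
    """
    probabilities = []
    for row in x_ca:
        linear = betas[0] + sum(b * e for b, e in zip(betas[1:], row))
        probabilities.append(sigmoid(alpha * linear))
    return probabilities


# Hypothetical coefficients and two hypothetical testing pairs.
betas = [-2.0, 3.0, 1.0]
x_ca = [[1, 1], [0, 0]]
print(score_testing_pairs(x_ca, betas, alpha=1.0))
print(score_testing_pairs(x_ca, betas, alpha=0.5))
```

With a scalar $\alpha$, the ranking of the testing pairs is unchanged for any $\alpha > 0$; smaller values pull every probability toward 0.5.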

  29. Experimental Set-Up. (1) Target Ontologies. Training: pairs that are in the hyperonym relation in WordNet (about 600,000 pairs of words). Testing: pairs in the Earth Observation domain (about 404 pairs of words). (2) Corpus. Training: English Web as Corpus, ukWaC (Ferraresi, 2008) (about 2,700,000 web pages). Testing: corpus related to the Earth Observation domain (about 8,300 web pages). (3) Feature Spaces: bag-of-words and n-gram windows of length 3 tokens (about 280,000 features).
