Evaluation of Textual Knowledge Acquisition Tools: A Challenging Task
Haïfa Zargayouna, Adeline Nazarenko
LIPN, Université Paris 13 - CNRS (UMR 7030)
99, avenue Jean-Baptiste Clément - F-93430 Villetaneuse, France
firstname.lastname@lipn.univ-paris13.fr

Abstract
A large effort has been devoted to the development of textual knowledge acquisition (KA) tools, but it is still difficult to assess the progress that has been made. The lack of well-accepted evaluation protocols and data hinders the comparison of existing tools and the analysis of their advantages and drawbacks. From our own experiments in evaluating terminology and ontology acquisition tools, it appeared that the difficulties and solutions are similar for both tasks. We propose a general approach for the evaluation of textual KA tools that can be instantiated in different ways for various tasks. In this paper, we highlight the major difficulties of KA evaluation and then present our proposal for the evaluation of terminology and ontology acquisition tools, along with the associated experiments. The proposed protocols take into consideration the specificity of this type of evaluation.
1. Introduction
A large effort has been devoted to the development of textual knowledge acquisition (KA) tools, but it is still difficult to assess the progress that has been made. The results produced by these tools are hard to compare, due to the heterogeneity of the proposed methods and of their goals. Various experiments have been conducted to evaluate terminological and ontological tools; some took the form of evaluation challenges, while others focused on the application context. Some challenges related to terminology have been set up (e.g. NTCIR¹ and CESART (Mustafa El Hadi et al., 2006)), but they did not gain the popularity they deserved and were not renewed. Even though the evaluation of ontology acquisition tools has its own workshop (EON²), no challenge has been organized and there are still no well-accepted evaluation protocols and data.
Application-based evaluations were carried out in order to assess the impact of the acquired knowledge in practice, e.g. for document indexing and retrieval (Névéol et al., 2006; Wacholder and Song, 2003; Köhler et al., 2006), automatic translation (Langlais and Carl, 2004), and query expansion (Bhogal et al., 2007). Nonetheless, none of these experiments gave a global idea of the impact of these semantic resources on the applications in which they were exploited.
These experiments show that, in terminology as well as in ontology acquisition, it remains difficult to compare existing tools and to analyse their advantages and drawbacks. From our own experiments in evaluating terminology and ontology acquisition tools, it appeared that the difficulties
and solutions are similar for both tasks. We propose a unified approach for the evaluation of textual KA tools that can be instantiated in different ways for various tasks. The main originality of this approach lies in the way it takes into account the subjectivity of evaluation and the relativity of gold standards. The output of a system is automatically tuned to the chosen gold standard instead of being compared to several human judgements, as is done for the evaluation of machine translation.
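To make this idea concrete, the following minimal sketch shows what such a tuning step could look like for terminology evaluation. It assumes that the system output is a flat list of candidate terms and that the gold standard is a list of reference terms; the helper functions normalise, aligns and evaluate are hypothetical, and the normalisation, string similarity and 0.8 threshold are illustrative assumptions, not the exact measures defined by our protocol.

from difflib import SequenceMatcher

def normalise(term: str) -> str:
    """Reduce superficial variation (case, hyphens, extra spaces)."""
    return " ".join(term.lower().replace("-", " ").split())

def aligns(candidate: str, reference: str, threshold: float = 0.8) -> bool:
    """Treat two terms as equivalent when they are similar enough
    after normalisation (the threshold is a hypothetical choice)."""
    a, b = normalise(candidate), normalise(reference)
    return a == b or SequenceMatcher(None, a, b).ratio() >= threshold

def evaluate(candidates: list[str], gold: list[str]) -> tuple[float, float]:
    """Precision and recall of the candidates against the gold standard,
    counting aligned terms rather than exact string matches."""
    matched_gold = {g for g in gold if any(aligns(c, g) for c in candidates)}
    matched_cand = {c for c in candidates if any(aligns(c, g) for g in gold)}
    precision = len(matched_cand) / len(candidates) if candidates else 0.0
    recall = len(matched_gold) / len(gold) if gold else 0.0
    return precision, recall

# Example: "Knowledge-acquisition" and "gold standards" align with the
# gold standard despite surface differences; "corpus" does not.
gold = ["knowledge acquisition", "gold standard", "ontology"]
candidates = ["Knowledge-acquisition", "gold standards", "corpus"]
print(evaluate(candidates, gold))  # approximately (0.67, 0.67)

The point of the tuning step is that a candidate term is not penalised for superficial variation with respect to the chosen gold standard; only genuinely missing or spurious terms affect the scores.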
In this paper, we highlight the major difficulties of KA evaluation and then present a unified proposal for the evaluation of terminology and ontology acquisition tools, together with the associated experiments. The proposed protocols take into consideration the specificity of this type of evaluation.

¹ http://research.nii.ac.jp/ntcir
² Evaluation of Ontologies for the Web
2. Why are KA tools difficult to evaluate?
Various difficulties can explain the fact that no comprehensive and global framework has yet been proposed.

Complexity of artifacts. The KA tasks themselves are difficult to delimit because their outputs are complex artifacts. For instance, terminology and ontology acquisition