Motivation/ Background The semantic similarity task Hypothesis/ Contribution
Different methods of using the judgements of natural language speakers on a semantic similarity task
Irma Cornelisse
Institute for Logic, Language and Computation
December 13, 2010

Outline
• Motivation/ Background
• The semantic similarity task
• Hypothesis/ Contribution

Problem description
The problem that will be addressed in the paper is the following: how can the quality of semantic similarity judgements be evaluated?
• I will discuss 2 methodologies:
  • Gold standards
  • Judgements of natural language users

Gold standards
• Are usually incomplete.
• Often don't give information on how similar a term is to the target term.
• Don't reflect that synonymy is a matter of degree.
• Don't take into account that judgements of synonymy are not strict: there are borderline cases.
• Do not necessarily reflect the judgements of natural language users.

Judgements of natural language users
• Getting them is time-consuming.
• There are different ways of getting judgements from natural language users, which lead to different results:
  • Spontaneously producing terms
    • We know humans have trouble producing terms spontaneously: they don't have access to all their knowledge.
  • Judging given terms: where do we get these terms from?
    • From your model (this evaluates only precision, not recall)
    • From a gold standard
    • From an earlier spontaneously producing task by natural language users
Judgements of natural language users are used a lot, but there is no insight into how the results of these different methods relate to each other, i.e. what are the consequences of the different design choices?

Research question
The main goal of this research is to obtain insight into how the information obtained by the different methods relates to each other:
• Where and how does it differ?
• Where and how does it coincide?
There is probably not one right way, but by characterizing the different methods we can argue which method fits which purpose best.

Method
Subjects
• Approximately 60 first-year Beta Gamma students.
Distributional model
• Cornetto Dutch Set Demo (CDSD) (http://www.let.rug.nl/erikt/bin/setdemo.cgi)
Gold standard
• Van Dale 'Synoniemenwoordenboek' (thesaurus)

The task
10 Dutch terms:
• 5 nouns
• 5 adverbs
Randomly chosen, satisfying the following criteria:
• No polysemy (according to the Van Dale dictionary)
• Only one POS tag possible (according to the Van Dale dictionary)
• 3 or more synonyms according to the Van Dale thesaurus
• 2 or more possible synonyms given by CDSD
• No two terms are synonyms of each other according to the Van Dale thesaurus
• No two terms are connected to each other by CDSD

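The selection criteria above can be sketched as a filter followed by a random draw. This is only an illustration, not the procedure actually used: the `Candidate` record, its field names, and the `select_terms` helper are hypothetical, and the dictionary, thesaurus, and CDSD lookups are assumed to have been done beforehand.

```python
import random
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record for one candidate target term."""
    term: str
    senses: int               # number of dictionary senses (1 means no polysemy)
    pos_tags: set             # POS tags the dictionary allows for the term
    thesaurus_synonyms: set   # synonyms listed in the thesaurus
    model_neighbours: set     # possible synonyms returned by the model


def eligible(c):
    """Per-term criteria: no polysemy, one POS tag, >= 3 synonyms, >= 2 neighbours."""
    return (c.senses == 1
            and len(c.pos_tags) == 1
            and len(c.thesaurus_synonyms) >= 3
            and len(c.model_neighbours) >= 2)


def unrelated(a, b):
    """Pairwise criteria: not thesaurus synonyms and not connected by the model."""
    return (b.term not in a.thesaurus_synonyms
            and a.term not in b.thesaurus_synonyms
            and b.term not in a.model_neighbours
            and a.term not in b.model_neighbours)


def select_terms(candidates, n, seed=0):
    """Draw up to n eligible terms in random order, skipping related ones."""
    pool = [c for c in candidates if eligible(c)]
    random.Random(seed).shuffle(pool)
    chosen = []
    for c in pool:
        if all(unrelated(c, d) for d in chosen):
            chosen.append(c)
        if len(chosen) == n:
            break
    return chosen
```

Calling `select_terms` once on the noun candidates and once on the adverb candidates, each with `n=5`, would give the 10 terms. The greedy draw is a design shortcut: it may return fewer than `n` terms even when a valid set of `n` exists.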
Subjects: 4 conditions
The spontaneously producing task.
• Come up with a word that is as similar as possible to the target term.
The judging subjects task.
• Classify the terms produced by the previous task into:
  1. the meaning is the same as the meaning of the target term
  2. the meaning is very similar to the meaning of the target term
  3. the meaning is reasonably similar to the meaning of the target term
  4. the meaning is a bit similar to the meaning of the target term
  5. the meaning is not similar to the meaning of the target term
The judging gold-standard task.
• Classify the synonyms of the target terms given by the Van Dale thesaurus as above.
The judging model task.
• Classify the nearest neighbours of the target terms given by CDSD as above.

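One way the five-point classifications for a single candidate term could be summarized across subjects is sketched below. The `summarize` helper and the short category labels are assumptions made for illustration; the slides do not specify how the ratings are aggregated.

```python
from collections import Counter
from statistics import median

# Short labels for the five categories of the judging tasks (labels are illustrative).
SCALE = {
    1: "same",
    2: "very similar",
    3: "reasonably similar",
    4: "a bit similar",
    5: "not similar",
}


def summarize(ratings):
    """Summarize the ratings (integers 1..5) that subjects gave to one term."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r not in SCALE for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    counts = Counter(ratings)
    return {
        "n": len(ratings),
        "median": median(ratings),
        "counts": {category: counts.get(category, 0) for category in SCALE},
    }
```

The median is used rather than the mean because the categories are ordered but not necessarily equally spaced; the per-category counts keep the full distribution visible for borderline cases.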
Hypothesis
1. The spontaneously producing task gives output similar to that of the judging subjects task when the group of subjects is large enough.
2. Considering only the best synonyms from the judging subjects task will not give sufficiently graded results.
3. The judging gold-standard task will come up with terms not produced in the spontaneously producing task, and vice versa.
4. Not all terms produced by subjects, the thesaurus and the computer model will be judged as similar to the target term by the subjects.
5. There are terms produced by the computer model which are not in the thesaurus (and vice versa).

The interesting results
The interesting contributions of the results of this experiment are:
• They give insight into the precision and recall of the different methods, i.e.
  • Which method returns terms that are judged as not similar by most subjects?
  • Which method doesn't return terms that are judged as similar by most subjects?
• They give insight into the way different methods return a similarity rating (gradation) of the similar terms, i.e.
  • How similar is the term to the target term?

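The precision and recall questions above can be made concrete with a small sketch. It rests on two assumptions that are not stated in the slides: a term counts as "judged as similar by most subjects" when more than half of its ratings fall in categories 1 to 4 (the `cutoff` parameter), and recall is measured against all rated terms that meet this criterion. The function names are hypothetical.

```python
def judged_similar(ratings, cutoff=4):
    """True if more than half of the ratings (1..5) are at or below the cutoff."""
    similar = sum(1 for r in ratings if r <= cutoff)
    return similar * 2 > len(ratings)


def precision_recall(returned, ratings_by_term, cutoff=4):
    """Precision and recall of one method's returned terms for one target term.

    returned        -- set of terms the method returned
    ratings_by_term -- dict mapping every rated term to its list of ratings
    """
    relevant = {t for t, r in ratings_by_term.items() if judged_similar(r, cutoff)}
    hits = returned & relevant
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall
```

Low precision flags a method that returns terms most subjects judge as not similar; low recall flags a method that misses terms most subjects judge as similar. Lowering `cutoff` (for example to 2) restricts the analysis to the strongest synonyms.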
Discussion
The main goal of this research is to obtain insight into how the information obtained by the different methods relates to each other:
• Where and how does it differ?
• Where and how does it coincide?
There is probably not one right way, but by characterizing the different methods we can argue which method fits which purpose best.