Cross-Lingual Word Sense Disambiguation using WordNets and Context Mapping
Priyank Jaini, Ankit Agrawal
{pjaini,ankitag}@iitk.ac.in
Department of Mathematics and Statistics, IIT Kanpur
Advisor: Prof. Amitabha Mukerjee
Date: March 21, 2013

What is Word Sense Disambiguation (WSD)?
- Assigning the correct sense (meaning/context) to a word in a sentence when the word can have multiple meanings
- Example:
  -> I was standing on the bank of river Ganga.
  -> Mr. Bank owns a bank.

Importance and Motivation
- Machine translation, lexicography, semantic interpretation, information retrieval, etc.
- Hindi lacks resources
- Can be used to create/enrich sense-tagged data

Our Approach
● Parallel Corpus and Alignment
● English WSD
● Synset Mapping
● Transfer to Hindi

Methodology and Algorithms Used
1) Parallel corpus for Hindi-English (EMILLE)
2) Alignment of text using the Church and Gale Algorithm
3) English WSD on the English text
4) Synset mapping using [10]
5) Transfer senses to Hindi text
Figure taken from [1]

The English Word Sense Disambiguation (Step 3)
- We shall use "WordNet::SenseRelate::AllWords"
- Uses the Lesk Algorithm for disambiguation
- English WordNet would be used for English WSD
- After this step, we would have a sense-tagged English text

Synset Mapping (Step 4)
- Takes an English synset as input and produces the best matching Hindi synset as output
- Uses the fact that in WordNet, the first word in a synset best represents the sense of the synset
- The hypernymy relation is the basis for finding the best match
- Over the hypernymy hierarchies, a weighted formula given in [10] is used to determine the best synset

Synset Mapping
- Candidate synsets: obtained by finding the Hindi translations of the first word in the input synset, and then finding the Hindi synsets that contain one or more of these translations
- The hypernymy hierarchies of these candidate synsets are found. They are called candidate hierarchies
- The hypernymy hierarchy of the input English synset is also obtained
- For each synset in the English hypernymy hierarchy, the Hindi translations of all the words occurring in it are found
- These Hindi words are searched for in the candidate hierarchies. If a match is found, the weight of that candidate synset is increased. Initially, all weights are zero
- The total weight for each candidate hierarchy is obtained, and the candidate with the highest weight is mapped to the English synset
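The mapping steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the bilingual dictionary, the Hindi synsets (romanised here), the hypernym links, and the function names are all hypothetical toy data, and every match adds a uniform weight of 1, whereas the actual weighted formula is the one given in [10].

```python
# Toy resources; every entry below is hypothetical illustration data.
EN_HI_DICT = {
    "bank": ["kinara", "baink"],
    "slope": ["dhalan"],
    "incline": ["dhalan"],
    "geological_formation": ["bhu_aakriti"],
    "object": ["vastu"],
}
HINDI_SYNSETS = {  # synset id -> member words
    "hi_kinara": ["kinara", "tat"],
    "hi_baink": ["baink"],
    "hi_dhalan": ["dhalan"],
    "hi_bhu": ["bhu_aakriti"],
    "hi_sanstha": ["sanstha"],
    "hi_vastu": ["vastu"],
}
HINDI_HYPERNYM = {  # child synset id -> parent synset id
    "hi_kinara": "hi_dhalan",
    "hi_dhalan": "hi_bhu",
    "hi_bhu": "hi_vastu",
    "hi_baink": "hi_sanstha",
}


def hypernym_chain(synset_id):
    """Return the ancestors of a Hindi synset, nearest first."""
    chain = []
    current = HINDI_HYPERNYM.get(synset_id)
    while current is not None:
        chain.append(current)
        current = HINDI_HYPERNYM.get(current)
    return chain


def map_synset(english_synset, english_hypernyms):
    """english_synset: list of words, first word taken as representative.
    english_hypernyms: list of synsets (word lists) above it."""
    # Candidates: Hindi synsets containing a translation of the first word.
    head = set(EN_HI_DICT.get(english_synset[0], []))
    candidates = [s for s, words in HINDI_SYNSETS.items() if head & set(words)]
    weights = {s: 0 for s in candidates}  # all weights start at zero
    for en_hyper in english_hypernyms:
        # Hindi translations of every word in this English hypernym synset.
        hindi_words = {t for w in en_hyper for t in EN_HI_DICT.get(w, [])}
        for cand in candidates:
            for hi_hyper in hypernym_chain(cand):
                if hindi_words & set(HINDI_SYNSETS[hi_hyper]):
                    weights[cand] += 1  # assumed uniform weight per match
    best = max(weights, key=weights.get) if weights else None
    return best, weights


best, weights = map_synset(
    ["bank"], [["slope", "incline"], ["geological_formation"], ["object"]]
)
print(best, weights)
```

With this toy data the river-bank candidate collects a match at every level of its hierarchy, while the financial candidate collects none, so the former is selected.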

We are expecting:
- Since a parallel aligned corpus is used, we should achieve better accuracy
- Better results for scenarios where:
  - an English word is polysemous and its Hindi equivalent is also polysemous
  - an English word is monosemous and its Hindi equivalent is polysemous

Limitations
- Valid only for nouns
- Not trained for morphological handling

References
1) Debasri Chakrabarti, Dipak Kumar Narayan, Prabhakar Pandey, Pushpak Bhattacharyya. Experiences in building the Indo WordNet: A WordNet for Hindi.
2) Bahareh Sarrafzadeh, Nikolay Yakovets, Nick Cercone, Aijun An. Cross Lingual Word Sense Disambiguation for Languages with Scarce Resources.
3) Els Lefever and Veronique Hoste. SemEval-2010 Task 3: Cross-Lingual Word Sense Disambiguation.
4) Michael Lesk. Automatic sense disambiguation using machine readable dictionaries: how to tell a pine cone from an ice cream cone. In SIGDOC'86: Proceedings of the 5th annual international conference on Systems documentation, pages 24-26, New York, NY, USA, 1986. ACM.
5) Satanjeev Banerjee and Ted Pedersen. Extended gloss overlaps as a measure of semantic relatedness. In IJCAI'03, pages 805-810, 2003.
6) Els Lefever and Veronique Hoste. Examining the validity of Cross-Lingual Word Sense Disambiguation.
7) http://wordnet.princeton.edu/
8) Roberto Navigli. Word Sense Disambiguation: A Survey.
9) Roberto Navigli and Simone Paolo Ponzetto. BabelNet: The automatic construction, evaluation and application of a wide-coverage multilingual semantic network.
10) Ramanand, Akshay Ukey, Brahm Kiran Singh, and Pushpak Bhattacharyya. Mapping and structural analysis of multi-lingual wordnets.

Thank You!!

Church and Gale Algorithm
- A method for aligning sentences based on a statistical (probabilistic) model of character lengths
- Uses the fact that longer sentences in one language tend to be translated into longer sentences in the other language, and shorter sentences into shorter ones
- The algorithm is a two-step process:
  1) Paragraph alignment
  2) Sentence alignment
- It is language independent, though it would have to be tested on Hindi-English
Ref: William A. Gale and Kenneth W. Church. A Program for Aligning Sentences in Bilingual Corpora.
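A minimal sketch of the length-based sentence alignment step is given below. It is not the authors' implementation: the constants C and S2 and the bead priors are the values usually quoted from the Gale and Church paper and are treated here as assumptions (they would need re-estimating for Hindi-English), the function names are hypothetical, the 2-2 bead is left out for brevity, and the paragraph alignment step is skipped.

```python
import math

C = 1.0    # assumed: expected target characters per source character
S2 = 6.8   # assumed: variance of that ratio per source character
# Assumed prior probability of each bead type (source count, target count).
PRIORS = {(1, 1): 0.89, (1, 0): 0.0099, (0, 1): 0.0099,
          (2, 1): 0.089, (1, 2): 0.089}


def match_cost(len_src, len_tgt, bead):
    """Negative log probability of aligning spans with these character lengths."""
    mean = (len_src + len_tgt / C) / 2.0
    if mean == 0:
        return -math.log(PRIORS[bead])
    delta = (len_src * C - len_tgt) / math.sqrt(mean * S2)
    # Two-tailed probability of a deviation at least this large under N(0, 1).
    p_delta = max(math.erfc(abs(delta) / math.sqrt(2.0)), 1e-300)
    return -math.log(PRIORS[bead] * p_delta)


def align(src_lens, tgt_lens):
    """Dynamic programming over sentence lengths (in characters).
    Returns a list of (source indices, target indices) beads."""
    n, m = len(src_lens), len(tgt_lens)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj in PRIORS:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                step = match_cost(sum(src_lens[i:ni]), sum(tgt_lens[j:nj]), (di, dj))
                if cost[i][j] + step < cost[ni][nj]:
                    cost[ni][nj] = cost[i][j] + step
                    back[ni][nj] = (i, j)
    # Follow back-pointers from the end to recover the cheapest path.
    beads, i, j = [], n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        beads.append((list(range(pi, i)), list(range(pj, j))))
        i, j = pi, pj
    beads.reverse()
    return beads


# Two short source sentences merge into one target sentence (a 2-1 bead).
print(align([10, 12, 40], [23, 41]))
```

Only character counts enter the model, which is why the method is language independent in principle; the constants are the part that depends on the language pair.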

English WSD: WordNet::SenseRelate::AllWords
- Each target word is centered in a balanced window whose size is decided by the user
- Each possible sense of the word is compared, pairwise, with the senses of the surrounding words in the window, and a similarity score is computed for each pair
- The sense with the highest total after summing its pairwise scores is taken as the sense of the word
- For similarity it uses the 10 measures of relatedness provided by WordNet::Similarity [http://wn-similarity.sourceforge.net]

Lesk Algorithm
- Assigns a sense to a word by comparing the glosses of the surrounding words with the glosses of the various senses of the target word
- The sense whose gloss has the largest number of overlaps is assigned
- The Extended Lesk Algorithm also compares the glosses of synsets related through the WordNet hierarchy (for example hypernyms and hyponyms) to improve accuracy
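The windowed, pairwise scoring described above can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not the WordNet::SenseRelate::AllWords implementation: the sense inventory and glosses are hypothetical toy data rather than WordNet entries, the function names are made up, and plain gloss overlap (basic Lesk) stands in for the WordNet::Similarity measures.

```python
# Toy sense inventory: word -> {sense id: gloss}. Hypothetical glosses.
SENSES = {
    "bank": {
        "bank#river": "sloping land beside a body of water such as a river",
        "bank#money": "financial institution that accepts deposits of money",
    },
    "river": {"river#1": "large natural stream of water flowing to the sea"},
    "deposit": {"deposit#1": "money placed in a financial institution"},
    "water": {"water#1": "clear liquid found in a river or the sea"},
}
STOPWORDS = {"a", "of", "the", "that", "to", "in", "or", "such", "as"}


def gloss_overlap(gloss_a, gloss_b):
    """Number of distinct non-stopword tokens shared by two glosses."""
    words_a = set(gloss_a.split()) - STOPWORDS
    words_b = set(gloss_b.split()) - STOPWORDS
    return len(words_a & words_b)


def disambiguate(tokens, index, window=2):
    """Pick the sense of tokens[index] with the highest summed overlap
    against every sense of the known words within the window."""
    lo = max(0, index - window)
    hi = min(len(tokens), index + window + 1)
    context = [tokens[k] for k in range(lo, hi)
               if k != index and tokens[k] in SENSES]
    scores = {}
    for sense, gloss in SENSES[tokens[index]].items():
        scores[sense] = sum(
            gloss_overlap(gloss, ctx_gloss)
            for word in context
            for ctx_gloss in SENSES[word].values()
        )
    return max(scores, key=scores.get), scores


print(disambiguate(["river", "bank", "water"], 1))
print(disambiguate(["deposit", "bank"], 1))
```

Swapping gloss_overlap for another relatedness function changes the measure without touching the window logic, which mirrors how the tool lets the user choose among several measures.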
