  1. Language Modelling Makes Sense: Propagating Representations through WordNet for Full-Coverage Word Sense Disambiguation
  Daniel Loureiro, Alípio Jorge
  ACL – Florence, 31 July 2019

  2. Sense Embeddings
  Exploiting the latest Neural Language Models (NLMs) for sense-level representation learning.
  • Beat SOTA for English Word Sense Disambiguation (WSD).
  • Full WordNet in NLM-space (over 100K common-sense concepts).
  • Concept-level analysis of NLMs.
  (Talk outline: Introduction · Related Work · Our Approach · Performance · Applications · Conclusions)

  3. Related Work

  4. Related Work
  Three families of approaches:
  • Bag-of-Features Classifiers (SVM)
  • Deep Sequence Classifiers (BiLSTM)
  • Sense-level Representations, matched with k-NN over NLM representations
  [Zhong and Ng (2010)] [Iacobacci et al. (2016)] [Raganato et al. (2017)] [Luo et al. (2018a)] [Luo et al. (2018b)] [Vial et al. (2018)] [Melamud et al. (2016)] [Yuan et al. (2016)] [Peters et al. (2018)]

  5. Bag-of-Features Classifiers
  It Makes Sense (IMS) [Zhong and Ng (2010)]:
  • POS tags, surrounding words, local collocations.
  • One SVM per word type seen in training (e.g. “glasses”).
  • Fallback: Most Frequent Sense (MFS).
  • Improved with word embedding features. [Iacobacci et al. (2016)]
  • Still competitive (!)
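
A minimal sketch of this kind of bag-of-features setup, assuming scikit-learn; the feature template, data format, and helper names are illustrative and not the original IMS implementation:

```python
from collections import Counter, defaultdict
from sklearn.feature_extraction import DictVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

def context_features(tokens, i, window=2):
    # Surrounding-word features only; IMS also uses POS tags and collocations.
    return {f"w[{j - i}]={tokens[j].lower()}": 1.0
            for j in range(max(0, i - window), min(len(tokens), i + window + 1))
            if j != i}

def train(instances):
    """instances: iterable of (tokens, target_index, sense_key) training examples."""
    by_lemma = defaultdict(list)
    for tokens, i, sense in instances:
        by_lemma[tokens[i].lower()].append((context_features(tokens, i), sense))
    models, mfs = {}, {}
    for lemma, examples in by_lemma.items():
        feats, senses = zip(*examples)
        mfs[lemma] = Counter(senses).most_common(1)[0][0]
        if len(set(senses)) > 1:   # train an SVM only where the lemma is ambiguous
            models[lemma] = make_pipeline(DictVectorizer(), LinearSVC()).fit(list(feats), list(senses))
    return models, mfs

def predict(models, mfs, tokens, i):
    lemma = tokens[i].lower()
    if lemma in models:
        return models[lemma].predict([context_features(tokens, i)])[0]
    return mfs.get(lemma)          # MFS fallback for lemmas without a classifier
```

Each ambiguous lemma gets its own classifier; anything unseen in training falls back to MFS, which is exactly the dependence the full-coverage approach later removes.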

  6. Deep Sequence Classifiers
  Bi-directional LSTMs (BiLSTMs):
  • Better with:
    • Attention (like everything else).
    • Auxiliary losses (POS, lemmas, lexnames). [Raganato et al. (2017)]
    • Glosses, via co-attention mechanisms. [Luo et al. (2018)]
  • Still must fall back on MFS.
  • Not that much better than bag-of-features… [Raganato et al. (2017)]
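
A minimal PyTorch sketch of a BiLSTM sense tagger with one auxiliary head; the sizes, the single POS auxiliary task, and the 0.1 loss weight are illustrative assumptions, not the architectures from the cited papers:

```python
import torch
import torch.nn as nn

class BiLSTMSenseTagger(nn.Module):
    """Per-token sense classifier with an auxiliary POS head trained jointly,
    a simplified stand-in for sequence-tagging WSD with auxiliary losses."""
    def __init__(self, vocab_size, n_senses, n_pos, emb_dim=300, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.sense_head = nn.Linear(2 * hidden, n_senses)
        self.pos_head = nn.Linear(2 * hidden, n_pos)

    def forward(self, token_ids):
        states, _ = self.lstm(self.embed(token_ids))   # (batch, seq, 2 * hidden)
        return self.sense_head(states), self.pos_head(states)

# Toy forward/backward pass with random ids and labels, just to show the shapes
# and the weighted auxiliary loss.
model = BiLSTMSenseTagger(vocab_size=50_000, n_senses=20_000, n_pos=18)
tokens = torch.randint(1, 50_000, (2, 12))             # 2 sentences, 12 tokens each
sense_logits, pos_logits = model(tokens)
sense_gold = torch.randint(0, 20_000, (2, 12))
pos_gold = torch.randint(0, 18, (2, 12))
loss = (nn.functional.cross_entropy(sense_logits.flatten(0, 1), sense_gold.flatten())
        + 0.1 * nn.functional.cross_entropy(pos_logits.flatten(0, 1), pos_gold.flatten()))
loss.backward()
```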

  7. Contextual k-NN
  Matching contextual word embeddings:
  • Produce sense embeddings from NLMs (by averaging).
  • Sense embeddings can be compared with contextual embeddings. [Ruder (2018)]
  • Disambiguation = Nearest Neighbour search (1-NN).
  • Sense embeddings limited to annotated senses, so MFS fallback still required.
  • Promising, but early attempts.
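
A minimal NumPy sketch of the 1-NN matching step, assuming sense embeddings and the target token's contextual embedding already live in the same NLM space (function and variable names are illustrative):

```python
import numpy as np

def disambiguate_1nn(context_emb, sense_embs):
    """context_emb: contextual embedding of the target token (1-D array).
    sense_embs: dict mapping sense keys to embeddings in the same space.
    Returns the sense key with the highest cosine similarity (1-NN)."""
    query = context_emb / (np.linalg.norm(context_emb) + 1e-12)
    def cosine(key):
        vec = sense_embs[key]
        return float(vec @ query) / (np.linalg.norm(vec) + 1e-12)
    return max(sense_embs, key=cosine)
```

If a sense never occurs in the annotated data it has no embedding and can never win this search, which is why these early attempts still needed the MFS fallback.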

  8. Our Approach

  9. Our Approach
  • Expand the k-NN approach to full coverage of WordNet.
  • Matching senses becomes trivial; no MFS fallbacks needed.
  • The full set of sense embeddings in NLM-space is useful beyond WSD.
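
For a concrete sense-inventory lookup, NLTK's WordNet interface can enumerate a lemma's candidate sense keys (a small illustrative snippet; it assumes the `wordnet` corpus has been downloaded):

```python
from nltk.corpus import wordnet as wn  # requires: nltk.download("wordnet")

# Candidate sense keys for a target word; at full coverage every one of these
# keys has an embedding, so the 1-NN search above always has a valid answer.
candidates = [lemma.key() for lemma in wn.lemmas("cost", pos="n")]
print(candidates)
```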

  10. Challenges

  11. Challenges
  • Overcome very limited sense annotations (only 16% of senses are covered).
  • Infer missing senses correctly so that task performance improves.
  • Rely only on sense embeddings, no lemma or POS features.
  Pipeline: Bootstrap (Annotated Dataset) → Propagate (WordNet Ontology) → Enrich (WordNet Glosses) → Reinforce (Morphological Embeddings).

  12. Bootstrapping Sense Embeddings
  Example sentences from the sense-annotated corpus:
  • Can your insurance company aid you in reducing administrative costs?
  • Would it be feasible to limit the menu in order to reduce feeding costs?

  13. Bootstrapping Sense Embeddings
  The same sentences with their sense annotations:
  • Can your insurance company aid you in reducing administrative costs?
    insurance_company%1:14:00:: aid%2:41:00:: reduce%2:30:00:: administrative%3:01:00:: cost%1:21:00::
  • Would it be feasible to limit the menu in order to reduce feeding costs?
    feasible%5:00:00:possible:00 limit%2:30:00:: menu%1:10:00:: reduce%2:30:00:: feeding%1:04:01:: cost%1:21:00::

  14. Bootstrapping Sense Embeddings
  Each annotated token is paired with the contextual embedding of that token in its sentence: the sense keys in the first sentence (d1) each receive a contextual embedding c_d1, and the sense keys in the second sentence (d2) each receive a contextual embedding c_d2.

  15. Bootstrapping Sense Embeddings
  reduce%2:30:00:: and cost%1:21:00:: occur in both sentences, so each of these senses now has contextual embeddings from both d1 and d2.

  16. Bootstrapping Sense Embeddings
  Each sense embedding is the average of the contextual embeddings of all its annotated occurrences:
  v_reduce%2:30:00:: = (c_d1 + c_d2 + … + c_dn) / n
  v_cost%1:21:00:: = (c_d1 + c_d2 + … + c_dn) / n
  Outcome: 33,360 sense embeddings (16% coverage)
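
A minimal NumPy sketch of this averaging step, assuming the contextual embeddings for each sense-annotated token have already been extracted from the NLM (the input format is illustrative):

```python
from collections import defaultdict
import numpy as np

def bootstrap_sense_embeddings(annotated_tokens):
    """annotated_tokens: iterable of (sense_key, context_vector) pairs, one per
    sense-annotated token, where context_vector is that token's contextual
    embedding from the NLM run over its sentence.
    Returns sense_key -> v_s = (1/n) * (c_d1 + c_d2 + ... + c_dn)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for sense_key, vec in annotated_tokens:
        sums[sense_key] = sums[sense_key] + np.asarray(vec, dtype=np.float32)
        counts[sense_key] += 1
    return {key: sums[key] / counts[key] for key in sums}
```

Running this over the annotated dataset is what yields the 33,360 vectors above; every other WordNet sense still has no embedding at this point.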

  17. Propagating Sense Embeddings
  WordNet's units, synsets, represent concepts at different levels.
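
One way to exploit those levels is to let uncovered senses inherit the average of the available embeddings from increasingly general neighbourhoods. The sketch below uses NLTK and NumPy with illustrative names and a simplified synset → hypernym → lexname fallback; it is an approximation of the idea, not necessarily the exact procedure from the paper:

```python
from collections import defaultdict
import numpy as np
from nltk.corpus import wordnet as wn  # requires: nltk.download("wordnet")

def propagate(sense_embs):
    """sense_embs: sense key -> vector learned from annotations (partial coverage).
    Returns a copy extended to uncovered keys by averaging whatever is available
    at increasingly general levels: same synset, then hypernyms, then lexname."""
    def avg(keys):
        vecs = [sense_embs[k] for k in keys if k in sense_embs]
        return np.mean(vecs, axis=0) if vecs else None

    # Last-resort fallback: one average vector per lexical category (lexname).
    by_lexname = defaultdict(list)
    for synset in wn.all_synsets():
        for lemma in synset.lemmas():
            if lemma.key() in sense_embs:
                by_lexname[synset.lexname()].append(sense_embs[lemma.key()])
    lexname_avg = {ln: np.mean(vs, axis=0) for ln, vs in by_lexname.items()}

    full = dict(sense_embs)
    for synset in wn.all_synsets():
        keys = [lemma.key() for lemma in synset.lemmas()]
        synset_vec = avg(keys)
        hyper_vec = avg([l.key() for h in synset.hypernyms() for l in h.lemmas()])
        for key in keys:
            if key in full:
                continue
            vec = synset_vec if synset_vec is not None else hyper_vec
            if vec is None:
                vec = lexname_avg.get(synset.lexname())
            if vec is not None:
                full[key] = vec
    return full
```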
