

  1. PoKED: A Semi-Supervised System for Word Sense Disambiguation. Feng Wei, Uyen Trang Nguyen, EECS, York University, Canada. July 12-18, 2020 @ ICML 2020

  2. Objective ▪ How position-wise embeddings (unsupervised) can help the downstream WSD task. ▪ How information from descriptive linguistic knowledge graphs (WordNet) can be incorporated into neural network architectures to solve the WSD task and improve performance on it.

  3. Contributions & Highlights ▪ Propose a semi-supervised neural system named Position-wise Orthogonal Knowledge-Enhanced Disambiguator (PoKED), supporting attention-driven, long-range dependency modeling. ▪ Incorporate position-wise encoding into an orthogonal framework and apply a knowledge-based attentive neural model to solve the WSD problem. ▪ Propose to use the semantic relations in WordNet by extracting semantic-level inter-word connections from each document-sentence pair in the WSD dataset. ▪ PoKED achieves better performance than state-of-the-art knowledge-based WSD systems on standard benchmarks.

  4. Human Semantic Knowledge Human semantic knowledge is essential to WSD. Document is a hypernym of information; equivalently, information is a hyponym of document. [Figure: example from the SemEval-15 dataset]

  5. PoNet (Unsupervised Language Model) ▪ Humans decide the sense of a polyseme by first understanding the context in which it occurs [Harris, 1954]. ▪ Two stages: PoNet abstracts the context as embeddings; KED classifies over the pre-trained context embeddings.

  6. Position-wise Encoding ▪ Input: a sequence of N words from vocabulary V [Watcharawittayakul et al., 2018; Wei et al., 2019]. [Figure: position-wise encoding]

  7. Position-wise Encoding ▪ Generate augmented encoding codes by concatenating two codes computed with two different forgetting factors. ▪ Represent both short-term and long-term dependencies. ▪ Maintain sensitivity to both nearby and faraway context. ▪ Example sentence: "Back in the day, we had an entire bank of computers devoted to this problem." [Figure: position-wise codes of the left and right context of the target word, each computed with forgetting factors β1 and β2] A minimal sketch of this encoding follows below.
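
A minimal sketch of the augmented encoding, assuming a FOFE-style recursion z_t = β·z_(t-1) + e_t over one-hot word vectors (following Watcharawittayakul et al. [2018]); the function names, the example forgetting factors β1 = 0.5 and β2 = 0.9, and the inward reading order of the right context are illustrative assumptions, not the paper's exact scheme:

```python
import numpy as np

def fofe_code(onehots, beta):
    """FOFE-style code of a word sequence: z_t = beta * z_(t-1) + e_t."""
    z = np.zeros(onehots.shape[1])
    for e in onehots:          # accumulate with ordinal forgetting
        z = beta * z + e
    return z

def positionwise_code(word_ids, target_pos, vocab_size, betas=(0.5, 0.9)):
    """Concatenate left- and right-context codes under two forgetting
    factors, capturing both short-term and long-term dependencies."""
    E = np.eye(vocab_size)[word_ids]      # one-hot rows, shape (N, |V|)
    left = E[:target_pos]                 # left context in sentence order
    right = E[target_pos + 1:][::-1]      # right context reversed, so the
                                          # nearest word is weighted most
    return np.concatenate([fofe_code(ctx, b)
                           for ctx in (left, right) for b in betas])

# Example: code for the word at position 2 of a 5-word sentence
code = positionwise_code([3, 1, 4, 1, 5], target_pos=2, vocab_size=8)
print(code.shape)  # (32,): four |V|-dimensional codes concatenated
```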

  8. Orthogonal Framework ▪ Introduce a linear orthogonal projection to reduce the dimensionality of the raw high-dimensional data, then use a finite mixture distribution to model the extracted features. ▪ Each hidden layer can be viewed as an orthogonal model composed of a feature-extraction stage and a data-modeling stage. [Zhang et al., 2016; Wei et al., 2020] A minimal sketch of such a layer follows below.
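
A minimal sketch of one such layer, assuming the orthogonal projection comes from a QR decomposition and the finite mixture is Gaussian; the dimensions, the number of mixture components, and the use of scikit-learn's GaussianMixture are illustrative assumptions:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 128))      # raw high-dimensional features

# Feature-extraction stage: a linear orthogonal projection.
# QR gives W with orthonormal columns (W.T @ W = I), so projecting
# reduces 128-d inputs to 16-d without distorting local geometry.
W, _ = np.linalg.qr(rng.normal(size=(128, 16)))
Z = X @ W

# Data-modeling stage: fit a finite mixture distribution
# to the extracted low-dimensional features.
mixture = GaussianMixture(n_components=4, random_state=0).fit(Z)
print(mixture.score(Z))               # average log-likelihood under the model
```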

  9. Context Embeddings The held-out layer is retained as the context embeddings, which provide an effective representation of the surrounding context of a given target word.

  10. KED (Supervised Knowledge-based Attentive Model) [Figure: a vanilla recurrent neural network, unfolded]

  11. Data Enrichment with WordNet For each word w in a document-sentence pair, obtain a set A_w containing the positions of the document words that w is semantically connected to. [Figure: long short-term memory cell]

  12. Data Enrichment with WordNet ▪ Legend: directly-involved synsets vs. indirectly-involved synsets. [Figure: WordNet subgraph linking parrot.n.01, bird.n.01, feather.n.01, and keratin.n.01 via hyponym, part-holonym, and substance-holonym edges] A sketch of the connection extraction follows below.
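
A minimal sketch of this extraction using NLTK's WordNet interface; the exact relation set, the one-hop expansion, and the helper name connected_positions are illustrative assumptions rather than the paper's procedure:

```python
from nltk.corpus import wordnet as wn   # requires nltk.download('wordnet')

# Semantic relations used to link synsets (an assumed one-hop relation set)
RELATIONS = ('hypernyms', 'hyponyms', 'part_holonyms',
             'substance_holonyms', 'member_holonyms')

def connected_positions(word, doc_tokens):
    """Positions of document words semantically connected to `word`
    through a shared or one-hop-related WordNet synset."""
    neighbors = set()
    for syn in wn.synsets(word):
        neighbors.add(syn)                        # directly-involved synsets
        for rel in RELATIONS:                     # indirectly-involved synsets
            neighbors.update(getattr(syn, rel)())
    return [i for i, tok in enumerate(doc_tokens)
            if any(s in neighbors for s in wn.synsets(tok))]

# On the slide's example, this should link 'feather' to 'birds', since
# wn.synset('feather.n.01').part_holonyms() contains Synset('bird.n.01').
print(connected_positions('feather', ['parrots', 'are', 'birds']))
```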

  13. KED (Supervised Knowledge-based Attentive Model) [Figure: KED architecture, from bottom to top: Lexicon Embedding Layer, Context Embedding Layer, Coarse-grained Memory Layer, Fine-grained Memory Layer, Sense Prediction Layer; the recurrent component is shown unfolded] A simplified skeleton of this stack follows below.
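
A heavily simplified skeleton of the five-layer stack in PyTorch; the layer names follow the slide, but every internal choice (an LSTM for context encoding, multi-head attention for the two memory layers, all sizes) is an illustrative assumption, not the paper's design:

```python
import torch
import torch.nn as nn

class KED(nn.Module):
    """Sketch of the KED stack; layer names from the slide, internals assumed."""
    def __init__(self, vocab_size, emb_dim=300, hid_dim=256, n_senses=10):
        super().__init__()
        self.lexicon_emb = nn.Embedding(vocab_size, emb_dim)                    # Lexicon Embedding Layer
        self.context_enc = nn.LSTM(emb_dim, hid_dim, batch_first=True)          # Context Embedding Layer
        self.coarse_mem = nn.MultiheadAttention(hid_dim, 4, batch_first=True)   # Coarse-grained Memory Layer
        self.fine_mem = nn.MultiheadAttention(hid_dim, 4, batch_first=True)     # Fine-grained Memory Layer
        self.sense_out = nn.Linear(hid_dim, n_senses)                           # Sense Prediction Layer

    def forward(self, token_ids):
        x = self.lexicon_emb(token_ids)     # (batch, seq, emb_dim)
        h, _ = self.context_enc(x)          # contextualized hidden states
        c, _ = self.coarse_mem(h, h, h)     # coarse attention over the context
        f, _ = self.fine_mem(c, c, c)       # finer-grained refinement
        return self.sense_out(f)            # per-token sense scores

scores = KED(vocab_size=5000)(torch.randint(0, 5000, (2, 12)))
print(scores.shape)  # torch.Size([2, 12, 10])
```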

  14. Experiments and Results

  15. Experiments and Results ▪ Ablation study on knowledge enhancement. [Table: removing the knowledge enhancement drops performance (%) by -4.5, -3.9, -4.4, -3.8, and -5.4 across the benchmarks]

  16. Experiments and Results ▪ Effectiveness of general knowledge extraction. [Table legend: #average is the average number of inter-word connections per word; bold font marks the best performance]

  17. Experiments and Results ▪ Quantitative analysis of the hunger for data. MFS baseline: the Most Frequent Sense heuristic computed on the SemCor corpus for each dataset. [Table: statistics about the datasets used for this work]

  18. PoKED: A Semi-Supervised System for Word Sense Disambiguation. Feng Wei, Uyen Trang Nguyen, EECS, York University, Canada. Thank You
