
Analysing Lexical Semantic Change with Contextualised Word Representations



  1. GeCKo, 18 May 2020
     Integrating Generic and Contextual Knowledge
     Analysing Lexical Semantic Change with Contextualised Word Representations
     Mario Giulianelli, Marco Del Tredici, Raquel Fernández
     University of Amsterdam

  2. Types, senses, usages
     Type: highlighter
     Senses: highlighter-pen, highlighter-makeup
     Usages: contextualised representations (… <s> ... highlighter ... <\s> …)
     The number of usage types is lexeme-specific and induced from language use.
     Usage vectors are characterised by contexts of occurrence, not by lists of nearest neighbouring words.

  3. Method
     For each word of interest w:
     (1) extract contextualised representations for all occurrences of w in the corpus, using a language model (e.g., BERT or ELMo);
     (2) cluster all representations of w into usage types, automatically selecting the optimal number of clusters (e.g., K-Means + silhouette score, or Affinity Propagation);
     (3) organise usage clusters into diachronic usage distributions (frequency-based or probability-based);
     (4) quantify the degree of change by comparing representations and usage distributions.
     [Figure: PCA visualisation of all contextualised representations for the word "users" as it occurs in COHA (Davies, 2012).]
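Step (2) of the method can be illustrated with a minimal pure-Python sketch of K-Means combined with the silhouette score. This is not the authors' code: the 2-D toy points stand in for contextualised representations, the farthest-first initialisation is an assumption made here to keep the example deterministic, and the function names are hypothetical. In practice a library implementation (for example scikit-learn's `KMeans` and `silhouette_score`) would normally be used.

```python
import math

def farthest_first(points, k):
    """Deterministic initialisation (an assumption of this sketch, not from the slides)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, iters=50):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centroids = farthest_first(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # An empty cluster keeps its previous centroid.
            updated.append([sum(d) / len(members) for d in zip(*members)] if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters score 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return -1.0
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)  # mean intra-cluster distance
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)   # nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

def cluster_usages(points, max_k=5):
    """Pick the number of usage types with the highest silhouette score."""
    best_k = max(range(2, max_k + 1), key=lambda k: silhouette(points, kmeans(points, k)))
    return best_k, kmeans(points, best_k)

# Toy "usage vectors": three well-separated groups of four points each.
toy_vectors = [[x + dx, y + dy]
               for x, y in [(0, 0), (10, 0), (0, 10)]
               for dx, dy in [(0, 0), (0, 1), (1, 0), (1, 1)]]
best_k, usage_labels = cluster_usages(toy_vectors)
```

On this toy input the selected number of clusters matches the three groups, so the number of usage types is induced from the data rather than fixed in advance.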

  4. Method (same four steps as on slide 3)
     [Figure: contextualised representations (left) and usage type distributions (right) for the word "users" as it occurs in COHA (Davies, 2012). The right panel shows usage types A to F on a 0 to 1 scale per decade, 1900 to 2000. Labels on the plot: digital services, resources, Suez Canal, drugs, digital products, non-digital products.]
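A frequency-based usage distribution, as in step (3), can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name is hypothetical, and the (period, usage type) pairs are invented toy data, assuming each occurrence has already been assigned a cluster label.

```python
from collections import Counter, defaultdict

def usage_distributions(occurrences, n_usage_types):
    """Map each time period to the relative frequency of every usage type.

    `occurrences` is an iterable of (period, usage_type) pairs, where
    usage_type is an integer cluster label in range(n_usage_types).
    """
    counts = defaultdict(Counter)
    for period, usage_type in occurrences:
        counts[period][usage_type] += 1
    distributions = {}
    for period in sorted(counts):
        total = sum(counts[period].values())
        distributions[period] = [counts[period][u] / total for u in range(n_usage_types)]
    return distributions

# Invented toy data: four occurrences per decade, three usage types.
toy_occurrences = [
    (1900, 0), (1900, 0), (1900, 1), (1900, 0),
    (2000, 1), (2000, 1), (2000, 2), (2000, 1),
]
toy_distributions = usage_distributions(toy_occurrences, n_usage_types=3)
# toy_distributions == {1900: [0.75, 0.25, 0.0], 2000: [0.0, 0.75, 0.25]}
```

Each list sums to 1, so distributions from periods with different numbers of occurrences stay comparable.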

  5. Method (same four steps as on slide 3)
     Measures of change for step (4), computed between two time periods or averaged over pairs of time periods:
     Jensen-Shannon Divergence
     Entropy Difference
     Average Pairwise Distance
     [Figure: usage type distributions (usage A to F, 0 to 1 scale) for the word "users" per decade, 1900 to 2000.]
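The three measures named on this slide can be sketched in a few lines of Python. The slide does not give the exact formulas, so several choices here are assumptions: entropy is plain Shannon entropy in bits (no normalisation), the entropy difference is taken as later minus earlier, and the average pairwise distance uses Euclidean distance. Function names and the toy distributions are hypothetical.

```python
import math

def entropy(p):
    """Shannon entropy in bits; zero-probability usage types contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def jensen_shannon_divergence(p, q):
    """JSD between two usage distributions; with log base 2 it lies in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy(m) - (entropy(p) + entropy(q)) / 2

def entropy_difference(p, q):
    """Entropy of the later distribution q minus that of the earlier distribution p."""
    return entropy(q) - entropy(p)

def average_pairwise_distance(vectors_a, vectors_b):
    """Mean Euclidean distance over all cross-period pairs of usage vectors."""
    total = sum(math.dist(a, b) for a in vectors_a for b in vectors_b)
    return total / (len(vectors_a) * len(vectors_b))

# Invented toy usage distributions for two time periods.
early = [0.75, 0.25, 0.0]
late = [0.0, 0.75, 0.25]
jsd = jensen_shannon_divergence(early, late)
ed = entropy_difference(early, late)
apd = average_pairwise_distance([[0.0, 0.0]], [[3.0, 4.0], [6.0, 8.0]])
```

In this toy case the two distributions have the same entropy, so the entropy difference is zero while the Jensen-Shannon divergence is positive. The first two measures compare usage distributions, whereas the average pairwise distance works directly on the representations and needs no clustering.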

  6. Are the resulting usage clusters interpretable?
     Labels on the slide: polysemy and homonymy; literal vs metaphorical; syntactic functionality; entity names; affixation.
     ceiling: ‘the ceiling of a church’, ‘ceiling prices’, ‘breaking through the ceiling’, ‘prefer the open sky to a ceiling’
     curious: ‘the most curious reading’, ‘full of questions, intensely curious’, ‘a curious sense of gratitude’, ‘half fearful, half curious’
     refuse: ‘refuse to hire’, ‘refuse or neglect to perform’, ‘refuse a draft’, ‘refuse, and you die’, ‘the refuse of the theater’
     wireless: ‘wireless device’, ‘wireless network’, ‘verizon wireless schools’, ‘wirelessly’

  7. What types of lexical change are detected?
     narrowing: “tenure”
       employment and tenure // minority faculty in tenure
       tenure of office
       tenure-track faculty position
       reasons for short term leases and insecurity of tenure
     shift: “coach”
       you can always go coach // stage coach
       cinderella - here comes your coach
     broadening (incl. metaphorisation): “curtain”
       I hung colored lights around my curtainless windows
       inflatable curtain-type head-protection bags
       raising the curtain on its [...] tax-reform program
       bureaucracies [...] on both sides of the curtain
     new syntactic role: “download”
       to download
       a download
     further examples, “disk”:
       the polished disk // a disk on a rigid backing
       floppy and hard-disk drives // portable disk-radio
     [Figures: usage type distributions on a 0 to 1 scale, per decade (1910 to 2000) from COHA (Davies, 2012) and per year (1990 to 2015) from COCA (Davies, 2010).]

  8. Correlation with human judgements
     Data: GEMS (Gulordava & Baroni, 2011), 100 words with shift scores.
     Shift score: average human judgement on a word’s meaning change between 1960 and 2000 (on a 4-point scale).
     Metric: Spearman rank correlation between the annotated change score and our three measures of change.
     Measure                                       Spearman correlation
     Frequency difference                          0.068
     Entropy difference (max)                      0.278
     Jensen-Shannon divergence (max)               0.276
     Average pairwise distance (Euclidean, max)    0.285
     Gulordava and Baroni (2011)                   0.386
     Frermann and Lapata (2016)                    0.377
     but wait for it…
     Results from Kutuzov and Giulianelli (2020):
     Algorithm                         English   German    Latin     Swedish
     Word2vec CBOW, cosine similarity baseline
       Incremental                     0.210     0.145     0.217     -0.012
       Procrustes                      0.285     0.439*    0.387*    0.458*
     Fine-tuned contextualised embeddings (top layer)
       ELMo cosine similarity          0.254     0.740*    0.360*    0.252
       ELMo average pairwise distance  0.605*    0.560*    -0.113    0.569*
       BERT cosine similarity          0.225     0.590*    0.561*    0.185
       BERT average pairwise distance  0.546*    0.427*    0.372*    0.254
     Diachronic Usage Pair Similarity
     NEW DATASET: DUPS
     A crowdsourced dataset of similarity judgements for more than 3K English word usage pairs (16 lemmas) from different time periods.
     Significant rank correlation between averaged human similarity judgements and BERT similarity scores for 10 out of 16 words.
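The evaluation metric on this slide, Spearman rank correlation, can be sketched in pure Python as the Pearson correlation of average ranks. This is only an illustration of the metric: the function names are hypothetical, and the scores below are invented toy values, not figures from GEMS or from the tables on the slide.

```python
import math

def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Invented toy values: human shift scores and a change measure for four words.
human_shift_scores = [1.0, 2.5, 3.0, 4.0]
measured_change = [0.10, 0.20, 0.60, 0.90]
rho = spearman(human_shift_scores, measured_change)
```

Because only ranks matter, a change measure can correlate perfectly with the human scores without being on the same scale, which suits comparing a 4-point annotation scale with unbounded distance measures.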

  9. References
     Davies, M. (2010). The Corpus of Contemporary American English. Literary & Linguistic Computing.
     Davies, M. (2012). The 400-Million Word Corpus of Historical American English. Corpora.
     Devlin, J., Chang, M. W., Lee, K., and Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of NAACL.
     Frermann, L., and Lapata, M. (2016). A Bayesian Model of Diachronic Meaning Change. TACL.
     Gulordava, K., and Baroni, M. (2011). A Distributional Similarity Approach to the Detection of Semantic Change in the Google Books Ngram Corpus. In Proceedings of GEMS.
     Kutuzov, A., and Giulianelli, M. (2020). UiO-UvA at SemEval-2020 Task 1: Contextualised Embeddings for Lexical Semantic Change Detection. Forthcoming.
