
Understanding the Downstream Instability of Word Embeddings - PowerPoint PPT Presentation



  1. Understanding the Downstream Instability of Word Embeddings. Megan Leszczynski, Avner May, Jian Zhang, Sen Wu, Chris Aberger, Chris Ré. Stanford University.

  2. Motivation. Model freshness is necessary for user satisfaction in many products: recommend new content, learn new words, detect the latest spam. Why retrain? Changing distribution of popular videos; out-of-vocabulary words; new spam techniques.

  3. Google retrains its Google Play app store models every day [1], and Facebook retrains search models every hour [2]. [1] Baylor et al. TFX: A TensorFlow-Based Production-Scale Machine Learning Platform. KDD, 2017. [2] Hazelwood et al. Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective. HPCA, 2018.

  4. But model training can be unstable… [Diagram: Data 1 → Predictions 1; Data 1 + ∆ → Predictions 2.] A small change in the training data can cause unnecessary prediction changes, known as prediction churn [1]. [1] Cormier et al. Launch and Iterate: Reducing Prediction Churn. NeurIPS, 2016.

  5. Challenges of Instability: debugging; consistent user experience; model dependencies; research reliability.

  6. Problem Setting: Embedding Server. [Diagram: changing data triggers an embedding refresh on an embeddings server, which feeds the downstream tasks: named entity recognition (NER), question answering, sentiment analysis, relation extraction.] Embeddings are shared among downstream tasks. How does the embedding instability propagate to these tasks?

  7. Key takeaway: stability-memory tension. With the right understanding, we can improve stability by over 30% in the same amount of memory.

  8. Outline. Q: How do we define downstream instability? A: % prediction disagreement. Q: What embedding hyperparameters impact downstream instability? A: Hyperparameters related to memory. Q: How can we theoretically understand downstream instability? A: Using our eigenspace instability measure (EIS). Q: How can we select embedding hyperparameters to minimize instability? A: Using the EIS (or k-NN) measures.

  9. Definition: Downstream Instability. [Diagram: Data 1 → Emb 1 (X) → Predictions 1; Data 1 + ∆ → Emb 2 (X̃) → Predictions 2; downstream instability compares the two sets of predictions.] Downstream instability = % prediction disagreement between models trained on a pair of embeddings. Metrics like instability are important for modularity.
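The definition on this slide can be sketched in a few lines. This is a minimal illustration, assuming both downstream models are evaluated on the same examples and emit discrete labels; the function name `prediction_disagreement` is hypothetical and not taken from the talk or its code release.

```python
import numpy as np


def prediction_disagreement(preds_a, preds_b):
    """Percentage of examples on which two models predict different labels.

    preds_a, preds_b: label sequences of equal length, produced by models
    trained on two different embeddings and evaluated on the same test set.
    """
    a = np.asarray(preds_a)
    b = np.asarray(preds_b)
    if a.shape != b.shape:
        raise ValueError("both prediction arrays must cover the same examples")
    # Fraction of positions that differ, expressed as a percentage.
    return 100.0 * float(np.mean(a != b))


# Two of the four predictions differ.
disagreement = prediction_disagreement([1, 0, 1, 1], [1, 1, 1, 0])
```

Note that the metric compares the two models with each other, not with the ground truth, so two models with identical accuracy can still disagree on many examples.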

  10. Outline. Q: How do we define downstream instability? A: % prediction disagreement. Q: What embedding hyperparameters impact downstream instability? A: Hyperparameters related to memory. Q: How can we theoretically understand downstream instability? A: Using our eigenspace instability measure (EIS). Q: How can we select embedding hyperparameters to minimize instability? A: Using the EIS (or k-NN) measures.

  11. Hyperparameters that Impact Memory. Dimension: # features / word. Precision: # bits / feature. Together they determine memory, which in turn affects downstream instability. [Figure: uniform quantization from 32-bit to 1-bit on the interval [-0.1, 0.1]: 0.04 → 0.1, -0.03 → -0.1, -0.08 → -0.1.] [1] May et al. On the downstream performance of compressed word embeddings. NeurIPS, 2019.
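The quantization example on this slide can be reproduced with a short sketch. It assumes deterministic round-to-nearest onto 2^bits evenly spaced levels in [-r, r]; the quantizer used in the talk's experiments may differ (for example in how the interval is chosen), and `uniform_quantize` is a hypothetical name.

```python
import numpy as np


def uniform_quantize(emb, bits, r):
    """Round each entry to the nearest of 2**bits evenly spaced levels in [-r, r]."""
    levels = 2 ** bits
    step = 2.0 * r / (levels - 1)          # spacing between adjacent levels
    clipped = np.clip(emb, -r, r)          # values outside the interval saturate
    index = np.round((clipped + r) / step) # integer level index, 0 .. levels - 1
    return index * step - r


# 1-bit quantization on [-0.1, 0.1] keeps only the sign-like levels -0.1 and 0.1.
values = np.array([0.04, -0.03, -0.08])
quantized = uniform_quantize(values, bits=1, r=0.1)
```

With `bits=1` the three slide values map to 0.1, -0.1 and -0.1, matching the figure.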

  12. Impact of Dimension. [Figure: downstream instability vs. dimension; panels: Sentiment Analysis, NER.]

  13. Impact of Precision. [Figure: downstream instability vs. precision; panels: Sentiment Analysis, NER.]

  14. Stability-Memory Tradeoff. [Figure: downstream instability vs. memory; panels: Sentiment Analysis, NER; annotation: 11%.]

  15. Outline. Q: How do we define downstream instability? A: % prediction disagreement. Q: What embedding hyperparameters impact downstream instability? A: Hyperparameters related to memory. Q: How can we theoretically understand downstream instability? A: Using our eigenspace instability measure (EIS). Q: How can we select embedding hyperparameters to minimize instability? A: Using the EIS (or k-NN) measures.

  16. Goal: Embedding distance measure. [Diagram: Data 1 → Emb 1 (X) → Predictions 1; Data 1 + ∆ → Emb 2 (X̃) → Predictions 2; Distance(Emb 1, Emb 2) is set against the downstream instability.] The measure must relate the distance between the embeddings to the downstream instability.

  17. Eigenspace Instability Measure (EIS). Key insight: the predictions of a linear regression model trained on an embedding X depend on the left singular vectors of X [1]. Singular value decomposition: X = U S V^T. [1] May et al. On the downstream performance of compressed word embeddings. NeurIPS, 2019.

  18. Eigenspace Instability Measure (EIS). EIS measures the similarity of the left singular vectors of two embeddings: for embeddings X and X̃ with left singular vectors U and Ũ, EIS(X, X̃) = similarity(U, Ũ). It can be computed in time O(nd²), where n is the size of the vocabulary and d is the dimension.

  19. Eigenspace Instability Measure (EIS). Theorem (informal): EIS is equal to the expected mean-squared difference between the predictions of the linear models trained on X and X̃. This gives a direct theoretical connection between the EIS measure and the downstream instability.
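The quantity in the informal theorem can be illustrated numerically. The sketch below is a simplification under stated assumptions: labels are random with identity covariance, both embeddings have full column rank, and predictions are taken on the training rows. The paper's EIS definition may weight or normalise the terms differently, and all function names here are hypothetical. Under these assumptions the least-squares predictions on X are U Uᵀ y, so the expected mean-squared difference is (d + d̃ − 2‖UᵀŨ‖²_F) / n.

```python
import numpy as np


def left_singular_vectors(emb):
    """Thin SVD; returns U with shape (n, d) for a full-column-rank (n, d) embedding."""
    u, _, _ = np.linalg.svd(emb, full_matrices=False)
    return u


def ols_train_predictions(emb, y):
    """Fit least squares on (emb, y) and predict on the same rows."""
    w, *_ = np.linalg.lstsq(emb, y, rcond=None)
    return emb @ w


def expected_sq_pred_diff(emb_a, emb_b):
    """E_y[ ||U U^T y - V V^T y||^2 ] / n for y with identity covariance (assumption)."""
    u = left_singular_vectors(emb_a)
    v = left_singular_vectors(emb_b)
    n = emb_a.shape[0]
    # Only d x d products are formed, so no n x n matrix is ever built.
    overlap = np.linalg.norm(u.T @ v, "fro") ** 2
    return (u.shape[1] + v.shape[1] - 2.0 * overlap) / n


rng = np.random.default_rng(0)
emb_1 = rng.normal(size=(50, 5))   # toy "vocabulary" of 50 words, 5 dimensions
emb_2 = rng.normal(size=(50, 5))
score = expected_sq_pred_diff(emb_1, emb_2)
```

Because only the column space of each embedding enters the formula, multiplying an embedding by any invertible d × d matrix leaves the score unchanged, which mirrors the key insight that linear-model predictions depend on the left singular vectors.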

  20. Outline. Q: How do we define downstream instability? A: % prediction disagreement. Q: What embedding hyperparameters impact downstream instability? A: Hyperparameters related to memory. Q: How can we theoretically understand downstream instability? A: Using our eigenspace instability measure (EIS). Q: How can we select embedding hyperparameters to minimize instability? A: Using the EIS (or k-NN) measures.

  21. Embedding measure for downstream instability? Candidates: EIS measure; k-NN measure [1,2,3]; semantic displacement (SD) [4]; PIP loss [5]; eigenspace overlap (EO) [6]. [1] Hellrich & Hahn, COLING, 2016; [2] Antoniak & Mimno, TACL, 2018; [3] Wendlandt et al., NAACL-HLT, 2018; [4] Hamilton et al., ACL, 2016; [5] Yin & Shen, NeurIPS, 2018; [6] May et al., NeurIPS, 2019.
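Of the candidates listed, the k-NN measure is the simplest to sketch. The version below is one common formulation, assumed here for illustration: for each query word, take its k nearest neighbors by cosine similarity in each embedding and average the fraction of shared neighbors. The cited papers differ in details such as the choice of k and of query words, and `knn_overlap` is a hypothetical name.

```python
import numpy as np


def knn_overlap(emb_a, emb_b, query_ids, k=5):
    """Average fraction of shared k nearest neighbors (cosine) over the query words."""
    query_ids = np.asarray(query_ids)

    def neighbors(emb):
        normed = emb / np.linalg.norm(emb, axis=1, keepdims=True)
        sims = normed[query_ids] @ normed.T
        # A word is not its own neighbor.
        sims[np.arange(len(query_ids)), query_ids] = -np.inf
        return np.argsort(-sims, axis=1)[:, :k]

    near_a = neighbors(emb_a)
    near_b = neighbors(emb_b)
    shared = [len(set(a) & set(b)) / k for a, b in zip(near_a, near_b)]
    return float(np.mean(shared))


rng = np.random.default_rng(0)
emb_1 = rng.normal(size=(200, 16))                 # toy embedding, rows aligned by word id
emb_2 = emb_1 + 0.5 * rng.normal(size=(200, 16))   # perturbed copy of the same vocabulary
overlap = knn_overlap(emb_1, emb_2, query_ids=np.arange(20), k=5)
```

A higher overlap means the two embeddings are more similar, which is why the later slides report 1 - k-NN when comparing it with distance-like measures.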

  22. Correlation with Downstream Instability. [Figure: bar chart of Spearman correlation (axis 0 to 0.9) with downstream instability for Eigenspace Instability Measure (EIS), 1 - k-NN, Semantic Displacement, PIP Loss, 1 - Eigenspace Overlap.] EIS and k-NN measures strongly correlate with downstream instability.

  23. Selection Task Setup. Use an embedding distance measure to select hyperparameters for a fixed memory budget. Record the difference in downstream instability to the oracle hyperparameters. [Figure: % disagreement vs. memory; the choices at one budget are 100-dim, 1-bit; 50-dim, 2-bit; 25-dim, 4-bit; the oracle choice is marked.]
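The selection procedure on this slide can be written out as a small sketch. The three (dimension, precision) pairs come from the slide and share the same budget of 100 bits per word; every measure and instability value below is a made-up placeholder for illustration only, not a result from the talk, and the function names are hypothetical.

```python
def memory_bits_per_word(dim, bits):
    """Memory per word is the number of features times the bits per feature."""
    return dim * bits


def select_config(measure_by_config):
    """Pick the configuration with the smallest embedding distance measure."""
    return min(measure_by_config, key=measure_by_config.get)


def gap_to_oracle(selected, instability_by_config):
    """Extra downstream instability incurred versus the best possible choice."""
    oracle = min(instability_by_config.values())
    return instability_by_config[selected] - oracle


# Keys are (dimension, precision in bits). All values are placeholders.
measure = {(100, 1): 0.30, (50, 2): 0.20, (25, 4): 0.25}
instability = {(100, 1): 6.0, (50, 2): 5.0, (25, 4): 4.5}

chosen = select_config(measure)
gap = gap_to_oracle(chosen, instability)
```

The measure is computed from the embeddings alone, so the selection step needs no downstream training; the downstream models are only trained here to evaluate how far the choice lands from the oracle.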

  24. Selection Task Results. [Figure: bar chart of difference to oracle (absolute %, axis 0 to 3.5) for EIS, 1 - k-NN, SD, PIP, 1 - EO on SST-2, MR, SUBJ, MPQA, CoNLL-2003.] EIS and k-NN measures outperform other measures as selection criteria.

  25. Our theoretically grounded measure improves stability by up to 34% over a full-precision baseline in the same amount of memory.

  26. Stability-Memory Tension on KG (knowledge graph) Embeddings. [Figure: downstream instability vs. memory.]

  27. Conclusion. Exposed a stability-memory tradeoff for word embeddings. Proposed the EIS measure to understand downstream instability. Evaluated measures for hyperparameter selection to minimize instability. Check out the paper for extended experiments with more embedding algorithms and downstream tasks! Code: http://bit.ly/embstability. Comments or questions: mleszczy@stanford.edu
