
Mikolov's Language Models: Distributed Representations of Sentences and Documents - PowerPoint PPT Presentation



  1. Mikolov's Language Models: Distributed Representations of Sentences and Documents. Recurrent Neural Language Model. Tomas Mikolov (Google Inc.), May 16, 2014.

  2. Table of contents: 1 Motivation, 2 Introduction and Background, 3 Paragraph Embeddings, 4 Performance, 5 Linguistic Regularities in Continuous Space Word Representations.

  3. Motivation. Quoth Tomas Mikolov (http://www.fit.vutbr.cz/~imikolov/rnnlm/google.pdf): Statistical language models assign probabilities to word sequences. Meaningful sentences should be more likely than ambiguous ones. Language modeling is an artificial intelligence problem.

  4. Classical N-gram Models. Figure: Text Modeling using Markov Chains, Claude Shannon (1948).
  $$\max P(w_i \mid w_{i-1}, \ldots) \qquad (1)$$
  where each $w_i$ is represented as a 1-of-N (one-hot) encoding.
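A minimal sketch (not from the slides) of what equation (1) means in practice: estimate $P(w_i \mid w_{i-1})$ from bigram counts and pick the most likely next word. The toy corpus and function name are illustrative.

```python
from collections import Counter, defaultdict

# Toy corpus; in practice this would be a large text collection.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count bigrams, grouped by the preceding word.
bigram_counts = defaultdict(Counter)
for prev, curr in zip(corpus, corpus[1:]):
    bigram_counts[prev][curr] += 1

def next_word_probs(prev):
    """Maximum-likelihood estimate of P(w_i | w_{i-1})."""
    counts = bigram_counts[prev]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

print(next_word_probs("the"))   # {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
# The classical n-gram model simply takes the argmax of these probabilities.
```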

  5. Neural Representation of Words. Neural Language Model, Bengio et al., 2006. Figure: Word2Vec, Tomas Mikolov.
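A minimal sketch of training word embeddings with the gensim Word2Vec implementation (an independent library, not part of the slides); the toy sentences and hyperparameters are illustrative, and gensim 4.x is assumed.

```python
from gensim.models import Word2Vec

# Toy corpus: a list of tokenized sentences.
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
    ["cats", "and", "dogs", "are", "animals"],
]

# Skip-gram (sg=1) with small vectors, purely for illustration.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1, epochs=50)

print(model.wv["cat"][:5])            # a learned 50-dimensional vector (first 5 values)
print(model.wv.most_similar("cat"))   # nearest neighbours in embedding space
```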

  6. Beyond Word Embeddings: Recursive Deep Tensor Models, Socher et al. Figure: Recursive Tree Structure, Richard Socher 2013.

  7. Beyond Word Embeddings: Recurrent Neural Network Language Model, Mikolov et al. Figure: Recurrent NN, Tomas Mikolov 2010.

  8. Beyond Word Embeddings: Character-Level Recognition. Figure: Text Understanding from Scratch, Zhang and LeCun 2015.

  9. Algorithm Overview. Figure: Paragraph Embedding Learning Model, Tomas Mikolov 2013.

  10. Algorithmic Overview, Part I: word embeddings. Given a sequence of training words $w_1, w_2, w_3, \ldots, w_T$, maximize the average log probability
  $$\frac{1}{T} \sum_{t=k}^{T-k} \log p(w_t \mid w_{t-k}, \ldots, w_{t+k}) \qquad (2)$$
  where the prediction task is a softmax over the vocabulary:
  $$p(w_t \mid w_{t-k}, \ldots, w_{t+k}) = \frac{e^{y_{w_t}}}{\sum_i e^{y_i}} \qquad (3)$$

  11. Algorithmic Overview. Parameters for Part I: the softmax weights $U$ and bias $b$,
  $$y = b + U\,h(w_{t-k}, \ldots, w_{t+k}; W) \qquad (4)$$
  where each $y_i$ is the unnormalized log-probability of output word $i$, $W$ is the word embedding matrix, and $h$ is built by concatenating or averaging the context word vectors.
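A minimal NumPy sketch of equations (2)-(4), under the assumption that $h$ averages the context word vectors; the dimensions and random values are illustrative, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

N, p = 1000, 100              # vocabulary size, embedding dimension
W = rng.normal(size=(p, N))   # word embedding matrix (one column per word)
U = rng.normal(size=(N, p))   # softmax weights
b = np.zeros(N)               # softmax bias

def predict(context_ids):
    """Compute p(w_t | w_{t-k}, ..., w_{t+k}) as in equations (3)-(4)."""
    h = W[:, context_ids].mean(axis=1)                      # h: average of context word vectors
    y = b + U @ h                                           # equation (4): unnormalized log-probabilities
    return np.exp(y - y.max()) / np.exp(y - y.max()).sum()  # softmax, equation (3)

probs = predict([3, 17, 42, 7])   # ids of the surrounding words
print(probs.shape, probs.sum())   # (1000,) 1.0
```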

  12. Algorithmic Overview, Part II: joint word and paragraph embeddings.
  $$y = b + U\,h(w_{t-k}, \ldots, w_{t+k}; W, D) \qquad (5)$$
  with word matrix $W \in \mathbb{R}^{p \times N}$ and paragraph matrix $D \in \mathbb{R}^{p \times M}$, i.e. $p \times (M + N)$ embedding parameters in total.
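A minimal sketch of the joint word/paragraph model using gensim's Doc2Vec, an independent implementation rather than the authors' code; dm=1 selects the Distributed Memory variant, and the toy documents and hyperparameters are illustrative.

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# Each paragraph gets a tag; its learned vector plays the role of a column of D.
docs = [
    TaggedDocument(words=["the", "cat", "sat", "on", "the", "mat"], tags=["doc0"]),
    TaggedDocument(words=["the", "dog", "chased", "the", "cat"], tags=["doc1"]),
]

# dm=1: Distributed Memory model (paragraph vector + context words predict the target word).
model = Doc2Vec(docs, dm=1, vector_size=50, window=2, min_count=1, epochs=100)

print(model.dv["doc0"][:5])                                      # learned paragraph vector (first 5 values)
print(model.infer_vector(["a", "cat", "on", "a", "mat"])[:5])    # vector inferred for an unseen paragraph
```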

  13. Algorithm Overview. Figure: Distributed Memory Model.

  14. Algorithm Overview. Figure: Distributed Bag of Words Model.
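For contrast with the Distributed Memory sketch above, gensim's Doc2Vec selects the Distributed Bag of Words variant with dm=0 (again an independent implementation, shown only as an illustration).

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

docs = [
    TaggedDocument(words=["the", "cat", "sat", "on", "the", "mat"], tags=["doc0"]),
    TaggedDocument(words=["the", "dog", "chased", "the", "cat"], tags=["doc1"]),
]

# dm=0: Distributed Bag of Words -- the paragraph vector alone is trained to
# predict words sampled from the paragraph, ignoring word order in the input.
dbow_model = Doc2Vec(docs, dm=0, vector_size=50, min_count=1, epochs=100)
print(dbow_model.dv["doc0"][:5])
```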

  15. Sentiment Analysis. Figure: Stanford Sentiment Treebank Dataset.

  16. Sentiment Analysis. Figure: IMDB Dataset.

  17. Model. Figure: Recurrent NN, Tomas Mikolov 2010.

  18. Components:
  $$\text{input:}\quad x(t) = w(t) + s(t-1)$$
  $$\text{hidden:}\quad s_j(t) = f\Big(\sum_i x_i(t)\, u_{ji}\Big)$$
  $$\text{output:}\quad y_k(t) = g\Big(\sum_j s_j(t)\, v_{kj}\Big)$$
  where $f$ is the sigmoid and $g$ is the softmax function.
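A minimal NumPy sketch of one forward step of the recurrent model above. It assumes the usual reading of $x(t)$ as the 1-of-N encoded current word joined with the previous hidden state (the "+" in the slide is interpreted as concatenation); matrix shapes and random weights are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, hidden_size = 1000, 100

U = rng.normal(scale=0.1, size=(hidden_size, vocab_size + hidden_size))  # input -> hidden
V = rng.normal(scale=0.1, size=(vocab_size, hidden_size))                # hidden -> output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def step(word_id, s_prev):
    """One time step: x(t) from w(t) and s(t-1), s(t) = f(U x), y(t) = g(V s)."""
    w = np.zeros(vocab_size)
    w[word_id] = 1.0                     # 1-of-N encoding of the current word
    x = np.concatenate([w, s_prev])      # input: current word plus previous hidden state
    s = sigmoid(U @ x)                   # hidden state
    y = softmax(V @ s)                   # distribution over the next word
    return s, y

s = np.zeros(hidden_size)
s, y = step(42, s)
print(y.shape, y.sum())                  # (1000,) 1.0
```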

  19. Spatial Meaning: the Vector Offset Method for answering linguistic analogy questions:
  $$y = x_b - x_a + x_c$$
  $$w^* = \arg\max_w \frac{x_w \cdot y}{\|x_w\|\,\|y\|}$$
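A minimal NumPy sketch of the vector offset method: given embeddings for a, b, c, form $y = x_b - x_a + x_c$ and return the word whose vector has the highest cosine similarity to $y$. The toy 2-d embeddings and the function name are purely illustrative.

```python
import numpy as np

def solve_analogy(embeddings, a, b, c):
    """Answer 'a is to b as c is to ?' via the vector offset method."""
    y = embeddings[b] - embeddings[a] + embeddings[c]
    best_word, best_score = None, -np.inf
    for w, x_w in embeddings.items():
        if w in (a, b, c):               # exclude the question words themselves
            continue
        score = x_w @ y / (np.linalg.norm(x_w) * np.linalg.norm(y))  # cosine similarity
        if score > best_score:
            best_word, best_score = w, score
    return best_word

# Toy embeddings chosen so that the gender offset is consistent.
emb = {
    "king":  np.array([1.0, 1.0]),
    "queen": np.array([1.0, 0.0]),
    "man":   np.array([0.2, 1.0]),
    "woman": np.array([0.2, 0.0]),
}
print(solve_analogy(emb, "man", "woman", "king"))   # expected: 'queen'
```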

  20. Results.
