Lecture 22: Representation Learning


  1. Lecture 22: Representation Learning. Kai-Wei Chang, CS @ University of Virginia. kw@kwchang.net. Course webpage: http://kwchang.net/teaching/NLP16

  2. Feature Representations. Pipeline: input → feature representation (e.g., Color_red, Shape_round, Has_leaf, …) → learning algorithm.

  3. Feature Representation. E.g., a conditional random field (CRF): $P(\mathbf{y} \mid \mathbf{x}) \propto \prod_i \exp\big( \sum_k \mu_k \, g_k(y_i, \mathbf{x}) + \sum_j \theta_j \, h_j(y_i, y_{i-1}, \mathbf{x}) \big)$, with node features $g(y_i, \mathbf{x})$ and edge features $h(y_i, y_{i-1}, \mathbf{x})$ defined over a chain of output variables $y_1, \ldots, y_4$ and inputs $x_1, \ldots, x_4$.
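A minimal sketch of how such a CRF scores one tag sequence, assuming the weighted feature sums have already been collapsed into node and edge score tables (the random values below are illustrative stand-ins, not learned weights):

    import numpy as np

    def sequence_score(y, node_scores, edge_scores):
        """Unnormalized log-score of tag sequence y: sum of node and edge potentials."""
        score = sum(node_scores[t, y[t]] for t in range(len(y)))             # node features g
        score += sum(edge_scores[y[t - 1], y[t]] for t in range(1, len(y)))  # edge features h
        return score  # P(y|x) = exp(score) / Z(x), where Z sums exp(score) over all sequences

    rng = np.random.default_rng(0)
    node_scores = rng.normal(size=(4, 3))   # 4 tokens, 3 candidate tags (illustrative)
    edge_scores = rng.normal(size=(3, 3))   # tag-to-tag transition scores (illustrative)
    print(sequence_score([0, 2, 1, 1], node_scores, edge_scores))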

  4. Feature Representation: high-order feature combinations via the kernel trick.
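To make the kernel trick concrete, here is a sketch (toy vectors, purely illustrative) showing that a degree-2 polynomial kernel scores the same inner product as an explicit feature map containing all pairwise feature products, without ever materializing that map:

    import numpy as np

    def poly2_kernel(x, z):
        # Implicitly scores all pairwise feature combinations.
        return (x @ z + 1.0) ** 2

    def poly2_features(x):
        """Explicit map: all products x_i*x_j, plus sqrt(2)*x_i terms and a constant 1."""
        pairs = np.outer(x, x).ravel()
        return np.concatenate([pairs, np.sqrt(2.0) * x, [1.0]])

    x = np.array([1.0, 2.0, 3.0])
    z = np.array([0.5, -1.0, 2.0])
    assert np.isclose(poly2_kernel(x, z), poly2_features(x) @ poly2_features(z))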

  5. Tree Kernel: how to measure the similarity between two parse trees?
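A simplified sketch of one such similarity: count the production rules the two trees share. (The full Collins–Duffy subset-tree kernel counts all common tree fragments; the nested-tuple tree format here is an assumption for illustration.)

    from collections import Counter

    def productions(tree, out):
        """Collect (parent label, child labels) production rules from a nested-tuple tree."""
        if isinstance(tree, tuple):      # internal node: (label, child1, child2, ...)
            label, *children = tree
            out[(label, tuple(c[0] if isinstance(c, tuple) else c for c in children))] += 1
            for c in children:
                productions(c, out)

    def tree_kernel(t1, t2):
        """Similarity = number of matching production pairs (a simplified tree kernel)."""
        p1, p2 = Counter(), Counter()
        productions(t1, p1)
        productions(t2, p2)
        return sum(p1[r] * p2[r] for r in p1 if r in p2)

    # (S (NP the cat) (VP sat)) vs. (S (NP the dog) (VP sat))
    t1 = ("S", ("NP", "the", "cat"), ("VP", "sat"))
    t2 = ("S", ("NP", "the", "dog"), ("VP", "sat"))
    print(tree_kernel(t1, t2))  # 2: S -> NP VP and VP -> sat match; the NP rules differ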

  6. Learning representations via neural networks
     - Identify high-order feature combinations
     - NN architectures for encoding language structures
     - Learn hierarchical representations
     - Representations for tokens/phrases/sentences

  7. How to represent words?
     - Tokens, bi-grams, n-grams (one-hot features)
     - Word embeddings (see the sketch below)
     - Task-specific word embeddings, e.g., for sentiment analysis
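A small numpy sketch contrasting the two representations (the toy vocabulary and random matrix are illustrative only): an embedding lookup is simply the word's one-hot vector multiplied by the embedding matrix.

    import numpy as np

    vocab = {"the": 0, "cat": 1, "sat": 2}   # hypothetical toy vocabulary
    V, d = len(vocab), 4

    # One-hot: a V-dimensional vector with a single 1 -- sparse, no notion of similarity.
    one_hot = np.zeros(V)
    one_hot[vocab["cat"]] = 1.0

    # Embedding: each word maps to a dense d-dimensional row of a matrix E,
    # which in practice is learned (e.g., by word2vec or a downstream task).
    rng = np.random.default_rng(0)
    E = rng.normal(size=(V, d))
    cat_vec = E[vocab["cat"]]                # embedding lookup = one_hot @ E
    assert np.allclose(cat_vec, one_hot @ E)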

  8. How to represent phrases/sentences?
     - Recursive NN [Socher, Manning & Ng 11] (sketch below)
     - Many follow-up approaches
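A minimal composition sketch in the spirit of a recursive NN, assuming random untrained weights and a binarized parse (not Socher et al.'s trained model): each phrase vector is built from its two children via p = tanh(W[c1; c2] + b), recursing up the tree.

    import numpy as np

    d = 4
    rng = np.random.default_rng(1)
    W = rng.normal(scale=0.1, size=(d, 2 * d))   # composition matrix (learned in practice)
    b = np.zeros(d)

    def compose(node, embed):
        """node: a word string, or a (left, right) pair following the binary parse tree."""
        if isinstance(node, str):
            return embed[node]                   # leaf: word embedding
        left, right = (compose(child, embed) for child in node)
        return np.tanh(W @ np.concatenate([left, right]) + b)

    embed = {w: rng.normal(size=d) for w in ["the", "cat", "sat"]}
    sentence_vec = compose((("the", "cat"), "sat"), embed)   # ((the cat) sat)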

  9. Unsupervised feature learning & deep learning (Andrew Ng)

  10. Auto-encoder and auto-decoder
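A minimal autoencoder sketch in PyTorch (layer sizes and the random training batch are illustrative): the encoder compresses the input to a small code, the decoder reconstructs the input from it, and the bottleneck activations serve as the learned representation.

    import torch
    import torch.nn as nn

    class AutoEncoder(nn.Module):
        def __init__(self, n_in=100, n_hidden=10):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(n_in, n_hidden), nn.Tanh())
            self.decoder = nn.Linear(n_hidden, n_in)

        def forward(self, x):
            return self.decoder(self.encoder(x))

    model = AutoEncoder()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    x = torch.randn(32, 100)                      # stand-in batch of feature vectors
    for _ in range(100):                          # minimize reconstruction error
        loss = nn.functional.mse_loss(model(x), x)
        opt.zero_grad(); loss.backward(); opt.step()
    code = model.encoder(x)                       # learned low-dimensional representation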

  11. Sequence-to-sequence models [Sutskever, Vinyals & Le 14]
     - Have been shown effective in machine translation, image captioning, and many other structured tasks (sketch below)
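A bare-bones sketch of the encoder–decoder idea (dimensions and vocabulary sizes are assumptions; real systems add attention, beam search, and more): an encoder LSTM reads the source, and its final state initializes a decoder LSTM over the target.

    import torch
    import torch.nn as nn

    class Seq2Seq(nn.Module):
        def __init__(self, src_vocab, tgt_vocab, d=64):
            super().__init__()
            self.src_emb = nn.Embedding(src_vocab, d)
            self.tgt_emb = nn.Embedding(tgt_vocab, d)
            self.encoder = nn.LSTM(d, d, batch_first=True)
            self.decoder = nn.LSTM(d, d, batch_first=True)
            self.out = nn.Linear(d, tgt_vocab)

        def forward(self, src, tgt_in):
            _, state = self.encoder(self.src_emb(src))               # encode the source
            dec_out, _ = self.decoder(self.tgt_emb(tgt_in), state)   # decode, conditioned on it
            return self.out(dec_out)                                 # per-step target logits

    model = Seq2Seq(src_vocab=1000, tgt_vocab=1000)
    logits = model(torch.randint(0, 1000, (2, 7)), torch.randint(0, 1000, (2, 5)))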

  12. Structured Prediction ∩ Representation Learning
     - NLP problems are structured: output variables are inter-correlated, so joint predictions are needed
     - Traditional approaches:
       - Graphical-model approaches, e.g., probabilistic graphical models, structured perceptron
       - Sequence-of-decisions approaches, e.g., incremental perceptron, L2S, transition-based methods

  13. Recent trends: the landscape of methods in Deep ∩ Structured
     - Deep learning / hidden representations, e.g., seq2seq, RNNs
     - Deep features fed into factors, with traditional factor-graph inference, e.g., LSTM+CRF, graph transformer networks (decoding sketch below)
     - Globally optimized transition-based approaches, e.g., beam-search seq2seq, SyntaxNet
     - …
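A sketch of the LSTM+CRF pattern from the list above: per-token emission scores (random stand-ins here for BiLSTM outputs) are combined with a tag-transition matrix, and Viterbi decoding recovers the jointly best tag sequence.

    import numpy as np

    def viterbi(emissions, transitions):
        """emissions: (T, K) per-token tag scores; transitions: (K, K) tag-pair scores."""
        T, K = emissions.shape
        score = emissions[0].copy()
        back = np.zeros((T, K), dtype=int)
        for t in range(1, T):
            cand = score[:, None] + transitions + emissions[t]   # (prev tag, next tag)
            back[t] = cand.argmax(axis=0)                        # best predecessor per tag
            score = cand.max(axis=0)
        tags = [int(score.argmax())]                             # best final tag
        for t in range(T - 1, 0, -1):                            # follow backpointers
            tags.append(int(back[t, tags[-1]]))
        return tags[::-1]

    rng = np.random.default_rng(2)
    emissions = rng.normal(size=(6, 3))     # stand-in for BiLSTM output scores
    transitions = rng.normal(size=(3, 3))   # learned jointly with the LSTM in practice
    print(viterbi(emissions, transitions))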
