Lecture 22: Representation Learning
Kai-Wei Chang
CS @ University of Virginia
kw@kwchang.net
Course webpage: http://kwchang.net/teaching/NLP16
CS6501-NLP
Feature Representations
[Figure: Feature Representation → Learning Algorithm; example features: Color_red, Shape_round, Has_leaf, …]
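Feature Representation
• E.g., Conditional Random Field
• $P(\mathbf{u} \mid \mathbf{x}) \propto \prod_i \exp\Big( \sum_j \mu_j\, g_j(u_i, \mathbf{x}) + \sum_k \theta_k\, f_k(u_i, u_{i-1}, \mathbf{x}) \Big)$
[Figure: linear chain of labels u_1, u_2, u_3, u_4 over inputs x_1, x_2, x_3, x_4; node feature g(u_i, x); edge feature f(u_i, u_{i-1}, x)]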
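To make the factorization concrete, here is a minimal sketch of a linear-chain CRF on a two-word input, normalized by brute-force enumeration. The tag set, the hand-set tables `NODE_W` and `EDGE_W`, and all function names are illustrative assumptions, not from the lecture; each table entry stands in for an already-summed weighted feature term (the μ-weighted node features and the θ-weighted edge features).

```python
import itertools
import math

TAGS = ["N", "V"]  # hypothetical label set

# Hypothetical node scores: stand-in for the mu-weighted node-feature sum.
NODE_W = {
    ("dogs", "N"): 2.0, ("dogs", "V"): 0.0,
    ("bark", "N"): 0.5, ("bark", "V"): 1.5,
}
# Hypothetical edge scores keyed by (previous tag, current tag):
# stand-in for the theta-weighted edge-feature sum.
EDGE_W = {("N", "N"): 0.0, ("N", "V"): 1.0, ("V", "N"): 0.5, ("V", "V"): -1.0}


def score(tags, words):
    """Log of the unnormalized probability: node scores plus edge scores."""
    total = 0.0
    for i, (word, tag) in enumerate(zip(words, tags)):
        total += NODE_W[(word, tag)]
        if i > 0:
            total += EDGE_W[(tags[i - 1], tag)]
    return total


def crf_distribution(words):
    """Normalize exp(score) over every tag sequence (exponential in length)."""
    seqs = list(itertools.product(TAGS, repeat=len(words)))
    weights = [math.exp(score(s, words)) for s in seqs]
    z = sum(weights)  # partition function
    return {s: w / z for s, w in zip(seqs, weights)}


dist = crf_distribution(["dogs", "bark"])
best = max(dist, key=dist.get)
```

Enumeration is only feasible for toy inputs; practical implementations compute the partition function with the forward algorithm instead.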
Feature Representation
• High-order combinations – kernel trick
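As a small illustration of how a kernel captures high-order feature combinations, the sketch below checks that a degree-2 polynomial kernel equals the dot product of explicit pairwise-product features. The function names and the example vectors are assumptions chosen for illustration.

```python
def dot(a, b):
    """Plain inner product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def pairwise_features(x):
    """Explicit degree-2 feature map: every ordered product x_i * x_j."""
    return [xi * xj for xi in x for xj in x]


def poly2_kernel(x, z):
    """The same inner product computed implicitly, without building the features."""
    return dot(x, z) ** 2


x = [1.0, 2.0, 3.0]
z = [0.5, -1.0, 2.0]
explicit = dot(pairwise_features(x), pairwise_features(z))  # works in d^2 dimensions
implicit = poly2_kernel(x, z)                               # works in d dimensions
```

The explicit map grows quadratically with the input dimension, while the kernel costs one dot product; that gap is what the kernel trick exploits.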
Tree Kernel
• How to measure the similarity between two parse trees?
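The slide does not say which kernel is meant. One common choice is the subset-tree kernel of Collins and Duffy, which counts the tree fragments two parses share. The sketch below assumes trees are nested tuples `(label, child, ...)` with words as plain strings; that encoding, the function names, and the example trees are all illustrative.

```python
def nodes(tree):
    """All internal nodes of a tree; word leaves (plain strings) are skipped."""
    if isinstance(tree, str):
        return []
    found = [tree]
    for child in tree[1:]:
        found.extend(nodes(child))
    return found


def production(node):
    """Grammar rule at a node. Nonterminal children become 1-tuples so they
    can never be confused with a word that has the same spelling."""
    children = tuple(c if isinstance(c, str) else (c[0],) for c in node[1:])
    return (node[0], children)


def common_fragments(n1, n2, decay=1.0):
    """Weighted count of tree fragments rooted at both n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    result = decay
    for c1, c2 in zip(n1[1:], n2[1:]):
        if not isinstance(c1, str):  # word leaves contribute no further fragments
            result *= 1.0 + common_fragments(c1, c2, decay)
    return result


def tree_kernel(t1, t2, decay=1.0):
    """Sum shared-fragment counts over all pairs of nodes."""
    return sum(common_fragments(a, b, decay) for a in nodes(t1) for b in nodes(t2))


t1 = ("S", ("NP", ("N", "dogs")), ("VP", ("V", "bark")))
t2 = ("S", ("NP", ("N", "cats")), ("VP", ("V", "bark")))
similarity = tree_kernel(t1, t2)
```

The double loop over node pairs is quadratic in tree size; a `decay` below 1 down-weights large fragments so that a tree is not overwhelmingly similar only to itself.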
Learning representations via NN
• Identify high-order combinations
• NN architecture for encoding language structures
• Learn hierarchical representations
• Representations for tokens/phrases/sentences
How to represent words?
• Token, bi-gram, n-gram (one-hot features)
• Word embeddings
• Task-specific word embeddings
  • E.g., for sentiment analysis
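The contrast between one-hot features and embeddings can be seen in a few lines. This is a sketch with a made-up three-word vocabulary and made-up 2-dimensional embedding values (chosen to mimic a sentiment-oriented space); none of these numbers come from the lecture.

```python
import math

VOCAB = ["good", "great", "bad"]  # hypothetical vocabulary
INDEX = {word: i for i, word in enumerate(VOCAB)}

# Hypothetical embedding table, one row per word; real values would be learned.
EMBED = [[0.9, 0.1], [0.8, 0.2], [-0.9, 0.1]]


def one_hot(word):
    """Sparse indicator vector: a 1 at the word's index, 0 elsewhere."""
    vec = [0.0] * len(VOCAB)
    vec[INDEX[word]] = 1.0
    return vec


def embed(word):
    """Dense vector: a row lookup in the embedding table."""
    return EMBED[INDEX[word]]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


one_hot_sim = cosine(one_hot("good"), one_hot("great"))  # distinct words never overlap
embed_sim = cosine(embed("good"), embed("great"))        # similar words can be close
embed_opp = cosine(embed("good"), embed("bad"))          # opposite sentiment points away
```

Under one-hot features every pair of distinct words is equally dissimilar, so any notion of relatedness has to come from a learned dense representation.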
How to represent phrases/sentences?
• Recursive NN [Socher, Manning, Ng 11]
• Many follow-up approaches
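A recursive NN builds a phrase vector bottom-up by applying the same composition function at every node of a binary tree, typically p = tanh(W [c1; c2] + b). The sketch below uses hand-set 2-dimensional parameters and word vectors; `W`, `B`, `WORD_VECS`, and the tree encoding are illustrative assumptions, and in practice the parameters are learned.

```python
import math

DIM = 2
# Hypothetical composition parameters: W is DIM x (2 * DIM), B has DIM entries.
W = [[0.5, -0.2, 0.1, 0.3],
     [0.0, 0.4, -0.3, 0.2]]
B = [0.0, 0.1]
# Hypothetical word vectors.
WORD_VECS = {"very": [0.1, 0.3], "good": [0.8, 0.2], "movie": [0.2, -0.5]}


def compose(left, right):
    """Parent vector p = tanh(W [left; right] + b)."""
    children = left + right  # list concatenation gives [left; right]
    return [
        math.tanh(sum(w * c for w, c in zip(row, children)) + bias)
        for row, bias in zip(W, B)
    ]


def encode(tree):
    """Encode a binary tree given as a word (str) or a (left, right) pair."""
    if isinstance(tree, str):
        return WORD_VECS[tree]
    left, right = tree
    return compose(encode(left), encode(right))


left_branching = encode((("very", "good"), "movie"))
right_branching = encode(("very", ("good", "movie")))
```

Because the output has the same dimension as each input, the result of one composition can feed the next, and different bracketings of the same words yield different phrase vectors.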
Unsupervised Feature Learning & Deep Learning, Andrew Ng
Auto-encoder and auto-decoder
Sequence to sequence models [Sutskever, Vinyals & Le 14]
• Shown to be effective in machine translation, image captioning, and many structured tasks
Structured prediction ⋂ Representation Learning
• NLP problems are structural
  • Output variables are inter-correlated
  • Need joint predictions
• Traditional approaches
  • Graphical model approaches
    • E.g., probabilistic graphical models, structured perceptron
  • Sequence of decisions
    • E.g., incremental perceptron, L2S, transition-based methods
SPNLP
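The need for joint prediction can be shown with a toy tagging problem in which choosing each label independently disagrees with the best joint assignment. The sketch below uses Viterbi decoding as one standard exact method for chain-structured outputs; the scores, tag set, and function names are made up for illustration.

```python
def independent(node_scores, tags):
    """Pick the best tag at each position, ignoring neighbouring tags."""
    return [max(tags, key=lambda t: scores[t]) for scores in node_scores]


def viterbi(node_scores, edge_scores, tags):
    """Highest-scoring tag sequence under node plus edge scores."""
    # best[t] = (score, path) of the best partial sequence ending in tag t
    best = {t: (node_scores[0][t], [t]) for t in tags}
    for scores in node_scores[1:]:
        new_best = {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[p][0] + edge_scores[(p, cur)])
            total = best[prev][0] + edge_scores[(prev, cur)] + scores[cur]
            new_best[cur] = (total, best[prev][1] + [cur])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


TAGS = ["N", "V"]  # hypothetical label set
# Hypothetical per-position scores: "V" looks slightly better at both positions.
NODE_SCORES = [{"N": 1.0, "V": 1.2}, {"N": 1.0, "V": 1.1}]
# Hypothetical transition scores keyed by (previous tag, current tag).
EDGE_SCORES = {("N", "N"): 0.0, ("N", "V"): 1.0, ("V", "N"): 0.0, ("V", "V"): -2.0}

greedy_tags = independent(NODE_SCORES, TAGS)
joint_tags = viterbi(NODE_SCORES, EDGE_SCORES, TAGS)
```

Here the per-position choice picks "V" twice, but the strong penalty on the V-to-V transition makes "N" followed by "V" the best sequence overall, which only a method that scores the outputs jointly can find.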
Recent trends
• Landscape of methods in Deep ⋂ Structure
  • Deep learning/hidden representation, e.g., seq2seq, RNN
  • Deep features into factors, traditional factor graph inference, e.g., LSTM+CRF, graph transformer networks
  • Globally optimized transition-based approaches, e.g., beam-search seq2seq, SyntaxNet
  • …