Document Modeling with Gated Recurrent Neural Network for Sentiment Classification
Duyu Tang, Bing Qin, Ting Liu
Harbin Institute of Technology
Sentiment Classification
• Given a piece of text, sentiment classification focuses on inferring the sentiment polarity of the text.
  • Positive / Negative
  • 1-5 stars
• The task can be posed at several granularities:
  • Word/phrase level, sentence level, document level
• We target document-level sentiment classification in this work.
Standard Supervised Learning Pipeline
Training Data → Feature Representation → Learning Algorithm → Sentiment Classifier
Feature Learning Pipeline
Training Data → Feature Representation → Learning Algorithm → Sentiment Classifier
Learn the text representation/features from data!
Deep Learning Pipeline
Training Data → Word Representation → Semantic Composition → Feature Representation → Learning Algorithm → Sentiment Classifier
• Word representation: represent each word w_1, w_2, …, w_n−1, w_n as a low-dimensional, real-valued vector
• Solutions: Word2Vec, GloVe, SSWE
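The word-representation step above can be sketched as a simple lookup table. This is a toy illustration: the vocabulary, dimensionality, and random vectors are hypothetical stand-ins for pre-trained embeddings such as Word2Vec, GloVe, or SSWE.

```python
import numpy as np

# Hypothetical toy embedding table; in practice, pre-trained
# vectors (Word2Vec, GloVe, SSWE) would be loaded instead.
EMB_DIM = 4
rng = np.random.default_rng(0)
vocab = {"the": 0, "movie": 1, "was": 2, "great": 3}
embeddings = rng.standard_normal((len(vocab), EMB_DIM))

def embed(words):
    """Map each word to its low-dimensional, real-valued vector."""
    return np.stack([embeddings[vocab[w]] for w in words])

word_vecs = embed(["the", "movie", "was", "great"])
assert word_vecs.shape == (4, EMB_DIM)

# Averaging the word vectors gives the kind of fixed-length feature
# used by the "SVM + AverageWordVec" baseline later in the talk.
doc_feature = word_vecs.mean(axis=0)
assert doc_feature.shape == (EMB_DIM,)
```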
Deep Learning Pipeline
• Semantic composition: the meaning of a longer expression depends on the meaning of its constituents
• Solutions at the sentence level: Recurrent NN, Recursive NN, Convolutional NN, Tree-Structured LSTM
The idea of this work
• We want to build an end-to-end neural network approach for document-level sentiment classification.
• Human beings solve this problem hierarchically: represent a sentence from its words, then represent the document from its sentences.
• We want to use the semantic/discourse relatedness between sentences to obtain the document representation.
• We do not want to rely on an external discourse parser.
Model Architecture
[Architecture figure, built up bottom to top:]
• Word Representation: each word of each sentence is mapped to its word vector
• Sentence Composition: a CNN/LSTM composes each sentence's word vectors into a sentence representation
• Document Composition: forward and backward gated neural networks run over the sequence of sentence representations
• Document Representation: the gated networks' hidden states are combined into a document vector, which is fed to a softmax classifier
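The architecture above can be sketched end to end in numpy. This is a minimal illustration, not the paper's implementation: the sentence composition is replaced by simple averaging (standing in for the CNN/LSTM), the gating follows a common GRU-style parameterisation that may differ from the paper's exact formulation, and all dimensions and weights are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)
D = 8   # sentence-vector dimension (assumed)
H = 8   # hidden size of the gated document composition (assumed)
C = 5   # number of sentiment classes, e.g. 1-5 stars

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GatedRNN:
    """One direction of the gated document composition
    (GRU-style gates; the paper's parameterisation may differ)."""
    def __init__(self, d, h):
        s, self.h_dim = 0.1, h
        self.Wz, self.Uz = rng.normal(0, s, (h, d)), rng.normal(0, s, (h, h))
        self.Wr, self.Ur = rng.normal(0, s, (h, d)), rng.normal(0, s, (h, h))
        self.Wc, self.Uc = rng.normal(0, s, (h, d)), rng.normal(0, s, (h, h))

    def run(self, xs):
        h, states = np.zeros(self.h_dim), []
        for x in xs:
            z = sigmoid(self.Wz @ x + self.Uz @ h)        # update gate
            r = sigmoid(self.Wr @ x + self.Ur @ h)        # reset gate
            c = np.tanh(self.Wc @ x + self.Uc @ (r * h))  # candidate state
            h = (1 - z) * h + z * c
            states.append(h)
        return states

def sentence_vector(word_vecs):
    # Stand-in for the CNN/LSTM sentence composition: average the words.
    return word_vecs.mean(axis=0)

def document_representation(sentences, fwd, bwd):
    sent_vecs = [sentence_vector(s) for s in sentences]
    hf = fwd.run(sent_vecs)               # forward gated NN
    hb = bwd.run(sent_vecs[::-1])[::-1]   # backward gated NN
    # Average the concatenated hidden states over all sentences.
    return np.mean([np.concatenate([f, b]) for f, b in zip(hf, hb)], axis=0)

fwd, bwd = GatedRNN(D, H), GatedRNN(D, H)
W_out = rng.normal(0, 0.1, (C, 2 * H))
doc = [rng.standard_normal((5, D)), rng.standard_normal((7, D))]  # 2 sentences
rep = document_representation(doc, fwd, bwd)
scores = W_out @ rep
probs = np.exp(scores) / np.exp(scores).sum()  # softmax over classes
assert rep.shape == (2 * H,) and abs(probs.sum() - 1.0) < 1e-9
```

The bidirectional pass followed by averaging corresponds to the "Gated Avg" variants in the results; a trained model would learn all the weight matrices by backpropagation.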
Sentence Modeling
Document Modeling
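As a sketch of the document-composition step, one common GRU-style parameterisation of a gated recurrent unit (the paper's exact gating may differ) is:

```latex
\begin{aligned}
z_t &= \sigma(W_z s_t + U_z h_{t-1}) \\
r_t &= \sigma(W_r s_t + U_r h_{t-1}) \\
\tilde{h}_t &= \tanh\bigl(W_h s_t + U_h (r_t \odot h_{t-1})\bigr) \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t
\end{aligned}
```

where $s_t$ is the representation of sentence $t$ and $h_t$ the hidden state; the update gate $z_t$ and reset gate $r_t$ control how much of the preceding sentences' information is kept at each step.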
Results (accuracy)

Model                                    Yelp 2015 (5-class)   IMDB (10-class)
Majority                                 0.369                 0.179
SVM + Unigrams                           0.611                 0.399
SVM + Bigrams                            0.624                 0.409
SVM + TextFeatures                       0.624                 0.405
SVM + AverageWordVec                     0.568                 0.319
Conv-Gated NN (BiDirectional Gated Avg)  0.660                 0.425
LSTM-Gated NN                            0.676                 0.453

Document Modeling                        Yelp 2015 (5-class)   IMDB (10-class)
Average                                  0.614                 0.366
Recurrent                                0.383                 0.176
Recurrent Avg                            0.597                 0.344
Gated NN                                 0.651                 0.430
Gated NN Avg                             0.657                 0.416
In Summary
• We develop a neural network approach for document-level sentiment classification.
• We model documents with a gated recurrent neural network, and show that adding neural gates significantly boosts classification accuracy.
• The code and datasets are available at: http://ir.hit.edu.cn/~dytang
Thanks