Linguistically Regularized LSTM for Sentiment Classification

Qiao Qian¹, Minlie Huang¹∗, Jinhao Lei², Xiaoyan Zhu¹
¹ State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory for Information Science and Technology, Dept. of Computer Science and Technology, Tsinghua University, Beijing 100084, PR China
² Dept. of Thermal Engineering, Tsinghua University, Beijing 100084, PR China
qianqiaodecember29@126.com, aihuang@tsinghua.edu.cn, leijh14@gmail.com, zxy-dcs@tsinghua.edu.cn
∗ Corresponding Author: Minlie Huang

Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, pages 1679–1689, Vancouver, Canada, July 30 - August 4, 2017. © 2017 Association for Computational Linguistics. https://doi.org/10.18653/v1/P17-1154

Abstract

This paper deals with sentence-level sentiment classification. Although a variety of neural network models have been proposed recently, previous models either depend on expensive phrase-level annotation, and most of them perform markedly worse when trained with only sentence-level annotation, or do not fully employ linguistic resources (e.g., sentiment lexicons, negation words, intensity words). In this paper, we propose simple models that are trained with sentence-level annotation but also attempt to model the linguistic role of sentiment lexicons, negation words, and intensity words. Results show that our models are able to capture the linguistic role of sentiment words, negation words, and intensity words in sentiment expression.

1 Introduction

Sentiment classification aims to classify text into sentiment classes such as positive or negative, or into more fine-grained classes such as very positive, positive, neutral, etc. A variety of approaches have been proposed for this purpose, including lexicon-based classification (Turney, 2002; Taboada et al., 2011), early machine learning based methods (Pang et al., 2002; Pang and Lee, 2005), and more recently neural network models such as convolutional neural networks (CNN) (Kim, 2014; Kalchbrenner et al., 2014; Lei et al., 2015), recursive autoencoders (Socher et al., 2011, 2013), Long Short-Term Memory (LSTM) (Mikolov, 2012; Chung et al., 2014; Tai et al., 2015; Zhu et al., 2015), and many more.

In spite of the great success of these neural models, previous studies have some defects. First, tree-structured models such as recursive autoencoders and Tree-LSTM (Tai et al., 2015; Zhu et al., 2015) depend on parsing tree structures and expensive phrase-level annotation, and their performance drops substantially when they are trained with only sentence-level annotation. Second, linguistic knowledge such as sentiment lexicons, negation words or negators (e.g., not, never), and intensity words or intensifiers (e.g., very, absolutely) has not been fully employed in neural models.

The goal of this research is to develop simple sequence models that also fully employ linguistic resources to benefit sentiment classification. Firstly, we attempt to develop simple models that do not depend on parsing trees and do not require phrase-level annotation, which is too expensive in real-world applications. Secondly, in order to obtain competitive performance, simple models can benefit from linguistic resources. Three types of resources are addressed in this paper: sentiment lexicons, negation words, and intensity words. A sentiment lexicon offers the prior polarity of a word, which can be useful in determining the sentiment polarity of longer texts such as phrases and sentences. Negators are typical sentiment shifters (Zhu et al., 2014), which constantly change the polarity of sentiment expression. Intensifiers change the valence degree of the modified text, which is important for fine-grained sentiment classification.

In order to model the linguistic role of sentiment, negation, and intensity words, our central idea is to regularize the difference between the predicted sentiment distribution of the current position¹ and that of the previous or next positions in a sequence model. For instance, if the current position is the negator not, the negator should change the sentiment distribution of the next position accordingly. To summarize, our contributions are twofold:

• We discover that modeling the linguistic role of sentiment, negation, and intensity words can enhance sentence-level sentiment classification. We address the issue by imposing linguistically inspired regularizers on sequence LSTM models.

• Unlike previous models that depend on parsing structures and expensive phrase-level annotation, our models are simple and efficient, yet their performance is on a par with the state of the art.

The rest of the paper is organized as follows. In the following section, we survey related work. In Section 3, we briefly introduce the background of LSTM and bidirectional LSTM, and in Section 4 we describe in detail the linguistic regularizers for sentiment, negation, and intensity words. Experiments are presented in Section 5, and the conclusion follows in Section 6.

¹ Note that in sequence models, the hidden state of the current position also encodes forward or backward contexts.

2 Related Work

2.1 Neural Networks for Sentiment Classification

Many neural networks have been proposed for sentiment classification. The most noticeable models may be the recursive autoencoder neural networks, which build the representation of a sentence from subphrases recursively (Socher et al., 2011, 2013; Dong et al., 2014; Qian et al., 2015). Such recursive models usually depend on a tree structure of the input text and, in order to obtain competitive results, usually require annotation of all subphrases. Sequence models, for instance convolutional neural networks (CNN), do not require tree-structured data and are widely adopted for sentiment classification (Kim, 2014; Kalchbrenner et al., 2014; Lei et al., 2015). Long short-term memory models are also common for learning sentence-level representations due to their capability of modeling the prefix or suffix context (Hochreiter and Schmidhuber, 1997). LSTM can be applied not only to sequential data but also to tree-structured data (Zhu et al., 2015; Tai et al., 2015).

2.2 Applying Linguistic Knowledge for Sentiment Classification

Linguistic knowledge and sentiment resources, such as sentiment lexicons, negation words (not, never, neither, etc.) or negators, and intensity words (very, extremely, etc.) or intensifiers, are useful for sentiment analysis in general.

A sentiment lexicon (Hu and Liu, 2004; Wilson et al., 2005) usually defines the prior polarity of a lexical entry, and is valuable for lexicon-based models (Turney, 2002; Taboada et al., 2011) and machine learning approaches (Pang and Lee, 2008). Recent work addresses the automatic construction of sentiment lexicons from social data (Vo and Zhang, 2016) and for multiple languages (Chen and Skiena, 2014). A noticeable work that utilizes sentiment lexicons is (Teng et al., 2016), which treats the sentiment score of a sentence as a weighted sum of the prior sentiment scores of negation words and sentiment words, where the weights are learned by a neural network.

Negation words play a critical role in modifying the sentiment of textual expressions. Some early negation models adopt the reversing assumption that a negator reverses the sign of the sentiment value of the modified text (Polanyi and Zaenen, 2006; Kennedy and Inkpen, 2006). The shifting hypothesis assumes that negators change the sentiment values by a constant amount (Taboada et al., 2011; Liu and Seneff, 2009). Since each negator can affect the modified text in different ways, the constant amount can be extended to be negator-specific (Zhu et al., 2014), and further, the effect of negators could also depend on the syntax and semantics of the modified text (Zhu et al., 2014). Other approaches to negation modeling can be seen in (Jia et al., 2009; Wiegand et al., 2010; Benamara et al., 2012; Lapponi et al., 2012).

The sentiment intensity of a phrase indicates the strength of the associated sentiment, which is quite important for fine-grained sentiment classification or rating. Intensity words can change the valence degree (i.e., sentiment intensity) of the modified text. In (Wei et al., 2011) the authors propose a linear regression model to predict the valence value for content words. In (Malandrakis et al., 2013), a kernel-based model is proposed to combine semantic information for predicting sentiment scores. In the SemEval-2016 task 7 subtask A, a learning-to-rank model with a pair-wise strategy is proposed to predict sentiment intensity scores (Wang