A Joint Segmentation and Classification Framework for Sentiment - PDF document

A Joint Segmentation and Classification Framework for Sentiment Analysis

Duyu Tang♮∗, Furu Wei‡, Bing Qin♮, Li Dong♯∗, Ting Liu♮, Ming Zhou‡
♮ Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology, China
‡ Microsoft Research, Beijing, China
♯ Beihang University, Beijing, China
♮ {dytang, qinb, tliu}@ir.hit.edu.cn  ‡ {fuwei, mingzhou}@microsoft.com  ♯ donglixp@gmail.com

∗ This work was partly done when the first and fourth authors were visiting Microsoft Research.

Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 477–487, October 25-29, 2014, Doha, Qatar. © 2014 Association for Computational Linguistics

Abstract

In this paper, we propose a joint segmentation and classification framework for sentiment analysis. Existing sentiment classification algorithms typically split a sentence into a word sequence, which does not effectively handle the inconsistent sentiment polarity between a phrase and the words it contains, such as “not bad” and “a great deal of”. We address this issue by developing a joint segmentation and classification framework (JSC), which simultaneously conducts sentence segmentation and sentence-level sentiment classification. Specifically, we use a log-linear model to score each segmentation candidate, and exploit the phrasal information of top-ranked segmentations as features to build the sentiment classifier. A marginal log-likelihood objective function is devised for the segmentation model, which is optimized to enhance sentiment classification performance. The joint model is trained based only on the annotated sentiment polarity of sentences, without any segmentation annotations. Experiments on a benchmark Twitter sentiment classification dataset in SemEval 2013 show that our joint model performs comparably with the state-of-the-art methods.

1 Introduction

Sentiment classification, which classifies the sentiment polarity of a sentence (or document) as positive or negative, is a major research direction in the field of sentiment analysis (Pang and Lee, 2008; Liu, 2012; Feldman, 2013). The majority of existing approaches follow Pang et al. (2002) and treat sentiment classification as a special case of the text categorization task. Under this perspective, previous studies typically use pipelined methods with two steps. They first produce sentence segmentations with separate text analyzers (Choi and Cardie, 2008; Nakagawa et al., 2010; Socher et al., 2013b) or bag-of-words (Paltoglou and Thelwall, 2010; Maas et al., 2011). Then, feature learning and sentiment classification algorithms take the segmentation results as inputs to build the sentiment classifier (Socher et al., 2011; Kalchbrenner et al., 2014; Dong et al., 2014).

The major disadvantage of a pipelined method is error propagation, since sentence segmentation errors cannot be corrected by the sentiment classification model. A typical kind of error is caused by the polarity inconsistency between a phrase and the words it contains, such as ⟨not bad, bad⟩ and ⟨a great deal of, great⟩. Segmentations based on bag-of-words or syntactic chunkers are not effective enough to handle such polarity inconsistency. Bag-of-words segmentations regard each word as a separate unit, which loses the word order and does not capture phrasal information. Segmentations based on syntactic chunkers typically aim to identify noun groups, verb groups or named entities in a sentence. However, many sentiment indicators are phrases made up of adjectives, negations, adverbs or idioms (Liu, 2012; Mohammad et al., 2013a), which syntactic chunkers split apart. Moreover, a better approach would be to use the sentiment information to improve the segmentor. The sentiment-specific segmentor will, in turn, enhance the performance of sentiment classification.

In this paper, we propose a joint segmentation and classification framework (JSC) for sentiment analysis, which simultaneously conducts sentence segmentation and sentence-level sentiment classification. The framework is illustrated in Figure 1. We develop (1) a candidate generation model to generate the segmentation candidates of a sentence, (2) a segmentation ranking model to score each segmentation candidate of a given sentence, and (3) a classification model to predict the sentiment polarity of each segmentation. The phrasal information of top-ranked candidates from the segmentation model is used as features to build the sentiment classifier. In turn, the predicted sentiment polarity of segmentation candidates from the classification model is used to update the segmentor. We score each segmentation candidate with a log-linear model, and optimize the segmentor with a marginal log-likelihood objective. We train the joint model from sentences annotated only with sentiment polarity, without any segmentation annotations.

[Figure 1 diagram: the example input "that is not bad" (polarity +1) flows through the stages Input, Segmentations, Polarity, Update, Rank (Top K), Update. Each segmentation candidate carries a ranking score and a predicted polarity, and is marked YES or NO according to whether the predicted polarity matches the gold polarity.]

Figure 1: The joint segmentation and classification framework (JSC) for sentiment classification. CG represents the candidate generation model, SC means the sentiment classification model and SEG stands for the segmentation ranking model. Down Arrow means the use of a specified model, and Up Arrow indicates the update of a model.

We evaluate the effectiveness of our joint model on a benchmark Twitter sentiment classification dataset in SemEval 2013. Results show that the joint model performs comparably with state-of-the-art methods, and consistently outperforms pipeline methods in various experiment settings. The main contributions of the work presented in this paper are as follows.

• To our knowledge, this is the first work that automatically produces sentence segmentation for sentiment classification within a joint framework.

• We show that the joint model yields comparable performance with the state-of-the-art methods on the benchmark Twitter sentiment classification datasets in SemEval 2013.

2 Related Work

Existing approaches for sentiment classification are dominated by two mainstream directions. Lexicon-based approaches (Turney, 2002; Ding et al., 2008; Taboada et al., 2011; Thelwall et al., 2012) typically utilize a lexicon of sentiment words, each of which is annotated with its sentiment polarity or sentiment strength. Linguistic rules such as intensifications and negations are usually incorporated to aggregate the sentiment polarity of sentences (or documents). Corpus-based methods treat sentiment classification as a special case of the text categorization task (Pang et al., 2002). They mostly build the sentiment classifier from sentences (or documents) with manually annotated sentiment polarity, or from distantly-supervised corpora collected via sentiment signals like emoticons (Go et al., 2009; Pak and Paroubek, 2010; Kouloumpis et al., 2011; Zhao et al., 2012).

The majority of existing approaches follow Pang et al. (2002) and employ corpus-based methods for sentiment classification. Pang et al. (2002) pioneered treating the sentiment classification of reviews as a special case of the text categorization problem and were the first to investigate machine learning methods for it. They employ Naive Bayes, Maximum Entropy and Support Vector Machines (SVM) with a diverse set of features. In their experiments, the best performance is achieved by SVM with bag-of-words features. Under this perspective, many studies focus on designing or learning effective features to obtain better classification performance. On movie or product reviews, Wang and Manning (2012) present NBSVM, which trades-off
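The ranking step described in the Introduction (a log-linear model over segmentation candidates, trained with a marginal log-likelihood objective) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the two features (`num_units`, `num_multiword`), the initial weights, the learning rate, the plain gradient-ascent step, and the `hits` flags, which stand in for whether the sentiment classifier's prediction on each candidate matches the gold polarity. The paper's actual features and update rule are not given in this section.

```python
import math

def segment_features(segmentation):
    # Hypothetical hand-made features, for illustration only.
    return {
        "num_units": float(len(segmentation)),
        "num_multiword": float(sum(1 for unit in segmentation if " " in unit)),
    }

def candidate_probs(candidates, weights):
    # Log-linear model: p(candidate) is proportional to exp(w . f(candidate)).
    scores = [
        sum(weights[name] * value for name, value in segment_features(c).items())
        for c in candidates
    ]
    top = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def marginal_log_likelihood(probs, hits):
    # Log of the probability mass on candidates whose predicted polarity
    # matches the gold polarity of the sentence.
    return math.log(sum(p for p, hit in zip(probs, hits) if hit))

def gradient(candidates, weights, hits):
    # Gradient of the objective: E[f | hit candidates] - E[f].
    probs = candidate_probs(candidates, weights)
    hit_mass = sum(p for p, hit in zip(probs, hits) if hit)
    grad = {name: 0.0 for name in weights}
    for cand, p, hit in zip(candidates, probs, hits):
        posterior = p / hit_mass if hit else 0.0
        for name, value in segment_features(cand).items():
            grad[name] += (posterior - p) * value
    return grad

# Four candidate segmentations of "that is not bad" (gold polarity +1).
candidates = [
    ["that", "is", "not", "bad"],
    ["that is", "not", "bad"],
    ["that", "is", "not bad"],
    ["that is", "not bad"],
]
# Assumed classifier outcome: only candidates keeping "not bad" together are correct.
hits = [False, False, True, True]
weights = {"num_units": -0.1, "num_multiword": 0.5}  # assumed starting weights

probs = candidate_probs(candidates, weights)
before = marginal_log_likelihood(probs, hits)

# One gradient-ascent step on the marginal log-likelihood (assumed learning rate 0.5).
grad = gradient(candidates, weights, hits)
new_weights = {name: w + 0.5 * grad[name] for name, w in weights.items()}
after = marginal_log_likelihood(candidate_probs(candidates, new_weights), hits)

print("objective before:", before)
print("objective after: ", after)
```

The objective only needs to know which candidates lead to a correct polarity prediction, not which segmentation is correct, which is why sentence-level polarity labels are sufficient supervision in this kind of setup.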
