Transformation Networks for Target-Oriented Sentiment Classification

Xin Li¹, Lidong Bing², Wai Lam¹, Bei Shi¹
¹The Chinese University of Hong Kong  ²Tencent AI Lab

ACL 2018 (joint work with Tencent AI Lab)
Outline

1. Target-Oriented Sentiment Classification
   - Introduction
   - Problem Formulation
2. Transformation Networks for Target-Oriented Sentiment Classification
   - Motivation
   - The proposed model
3. Experiment
   - Settings
   - Comparative Study
Introduction

Target-Oriented Sentiment Classification (TOSC) aims to detect the overall opinion/sentiment of a user review towards a given opinion target.

TOSC is a supporting task of Target/Aspect-Based Sentiment Analysis [5].

TOSC has been investigated extensively under other names:
- Aspect-level Sentiment Classification [1, 7, 10, 11, 12].
- Targeted Sentiment Prediction [6, 14].
- Target-Dependent Sentiment Classification [2, 9].
Problem Formulation

TOSC is a typical classification task, but the input text comes from two sources:
1. Target: the explicitly mentioned opinion-target phrase, also called "aspect term" or "aspect".
2. Context: the original review sentence, or the sentence without the target phrase.

TOSC is to predict the overall sentiment of the context towards the target.

Example
- "[Boot time] is super fast, around anywhere from 35 seconds to 1 minute." This review conveys positive sentiment towards the input target "Boot time".
- "Great [food] but the [service] is dreadful." Given the target "food", the sentiment polarity is positive; if the input target is "service", it becomes negative.
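To make the two input sources concrete, a TOSC instance can be sketched as a tiny data structure; the class and field names below are illustrative, not from the paper:

```python
from dataclasses import dataclass

@dataclass
class ToscExample:
    """One TOSC instance: a context sentence, a target span, and a label."""
    context: list[str]             # tokenized review sentence
    target_span: tuple[int, int]   # [start, end) token indices of the target
    label: str                     # "positive" | "negative" | "neutral"

    def target_tokens(self) -> list[str]:
        start, end = self.target_span
        return self.context[start:end]

# The second example from the slide: the same context, target "service".
ex = ToscExample(
    context="great food but the service is dreadful".split(),
    target_span=(4, 5),
    label="negative",
)
print(ex.target_tokens())  # ['service']
```

Note that the same context with `target_span=(1, 2)` (target "food") would carry the label "positive", which is exactly why the model must condition on the target.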
Motivation 1

A Convolutional Neural Network (CNN) is more suitable for this task than attention-based models [1, 6, 7, 10, 11, 12, 13]:
- Sentiments towards targets are usually determined by key phrases.
  Example: "This [dish] is my favorite and I always get it and never get tired of it."
  A CNN, whose aim is to capture the most informative n-grams in the sentence (e.g., "is my favorite"), should be a suitable model.
- An attention-based weighted combination of the entire word-level features may introduce noise (e.g., "never" and "tired" in the above sentence).

We employ a proximity-based CNN rather than an attention-based RNN as the top-most feature extractor.
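The "most informative n-gram" intuition can be illustrated with a toy convolution-plus-max-pooling step (assumed setup with hand-built vectors, not the paper's trained model): one filter scores every 3-gram window, and max-pooling keeps only the single strongest window, so the rest of the sentence cannot add noise the way a soft attention average can.

```python
import numpy as np

def conv_max_pool(word_vecs, filt):
    """word_vecs: (n, d); filt: (k, d). Return (best score, window start)."""
    k = len(filt)
    # Score each k-gram window with one convolutional filter.
    scores = [float(np.sum(word_vecs[i:i + k] * filt))
              for i in range(len(word_vecs) - k + 1)]
    best = int(np.argmax(scores))      # max-pooling over positions
    return scores[best], best

sent = np.zeros((8, 3))
sent[3:6] = np.eye(3)   # pretend words 3..5 form the key trigram "is my favorite"
filt = np.eye(3)        # a filter tuned to exactly that trigram pattern

score, start = conv_max_pool(sent, filt)
print(score, start)  # 3.0 3
```

Only the window starting at position 3 fires; the zero vectors elsewhere contribute nothing, mirroring how max-pooling discards uninformative context.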
Motivation 2

A CNN likely fails in cases where a sentence expresses different sentiments over multiple targets.
- Example: "great [food] but the [service] was dreadful!"
- A CNN cannot fully exploit the target information via vector concatenation.
- Combining context information with word embeddings is an effective way to represent a word in a convolution-based architecture [4].

Our Solution:
(i) We propose a "Target-Specific Transformation" (TST) component to better consolidate the target information with the word representations.
(ii) We design two context-preserving mechanisms, "Adaptive Scaling" (AS) and "Lossless Forwarding" (LF), to combine the contextualized representations and the transformed representations.
Motivation 3

Most existing works do not discriminate between different words in the same target phrase.
- Different words in a target phrase do not contribute equally to the target representation.
- For example, in "amd turin processor", the phrase head "processor" is more important than "amd" and "turin".

Our TST solves this problem in two steps:
(i) Explicitly calculating importance scores for the target words.
(ii) Conducting word-level association between the target and its context.
Model Overview

Figure: Architecture of TNet (a Bi-directional LSTM at the bottom, a stack of CPT layers with TST and LF/AS in the middle, and a convolutional layer feeding a fully-connected classifier on top).
Model Overview

The proposed TNet consists of the following three components:
1. (BOTTOM) Bi-directional LSTM for memory building: generating contextualized word representations.
2. (MIDDLE) Deep transformation architecture for learning target-specific word representations: refining word-level representations with the input target and the contextual information.
3. (TOP) Proximity-based convolutional feature extractor: introducing position information to detect the most salient features more accurately.
Deep Transformation Architecture

The deep transformation architecture stacks multiple Context-Preserving Transformation (CPT) layers.
- A deeper network helps to learn more abstract features (He et al., CVPR 2016; LeCun et al., Nature 2015).
CPT Layer

The functions of the CPT layer are two-fold:

1. Incorporating opinion-target information into the word-level representations.
- Generating a context-aware target representation $r_i^{\tau}$ conditioned on the $i$-th word representation $h_i^{(l)}$ fed to the $l$-th layer:
$$r_i^{\tau} = \sum_{j=1}^{m} \mathcal{F}\big(h_i^{(l)}, h_j^{\tau}\big)\, h_j^{\tau},$$
$$\mathcal{F}\big(h_i^{(l)}, h_j^{\tau}\big) = \frac{\exp\big(h_i^{(l)\top} h_j^{\tau}\big)}{\sum_{k=1}^{m} \exp\big(h_i^{(l)\top} h_k^{\tau}\big)}.$$
- Obtaining the target-specific word representation $\tilde{h}_i^{(l)}$:
$$\tilde{h}_i^{(l)} = g\big(W^{\tau}\,[h_i^{(l)} : r_i^{\tau}] + b^{\tau}\big).$$

Figure: Target-Specific Transformation (TST) component.
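The TST equations above can be sketched in a few lines of NumPy (a minimal sketch with illustrative dimensions and random placeholder weights, not a trained model): each context word attends over the target-word vectors to build its own target summary, then the concatenation is passed through a fully-connected layer.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def tst(h_i, h_tau, W_tau, b_tau, g=np.tanh):
    """h_i: (d,) word vector; h_tau: (m, d) target-word vectors;
    W_tau: (d, 2d); b_tau: (d,). Returns the transformed word vector."""
    attn = softmax(h_tau @ h_i)    # F(h_i, h_j^tau), one weight per target word
    r_i = attn @ h_tau             # context-aware target representation r_i^tau
    return g(W_tau @ np.concatenate([h_i, r_i]) + b_tau)

rng = np.random.default_rng(1)
d, m = 4, 3                        # toy dimensions
out = tst(rng.normal(size=d), rng.normal(size=(m, d)),
          rng.normal(size=(d, 2 * d)), np.zeros(d))
print(out.shape)  # (4,)
```

The attention weights in `attn` are exactly the importance scores from Motivation 3: a head word like "processor" whose vector aligns with the context word receives the largest weight.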
CPT Layer

2. Preserving context information for the upper layers.
- We design two context-preserving mechanisms to add context information back to the transformed word features $\tilde{h}_i^{(l)}$:

(i) Adaptive Scaling (AS) (similar to a highway connection [8]):
$$t_i^{(l)} = \sigma\big(W_{\mathrm{trans}}\, h_i^{(l)} + b_{\mathrm{trans}}\big),$$
$$h_i^{(l+1)} = t_i^{(l)} \odot \tilde{h}_i^{(l)} + \big(1 - t_i^{(l)}\big) \odot h_i^{(l)}.$$

(ii) Lossless Forwarding (LF) (similar to a residual connection [3]):
$$h_i^{(l+1)} = h_i^{(l)} + \tilde{h}_i^{(l)}.$$
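The two mechanisms can be sketched side by side on a single word vector (random placeholder weights, toy dimensions; this is an illustration of the update rules, not the trained model):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def adaptive_scaling(h, h_tilde, W_trans, b_trans):
    """Highway-style gate: per-dimension convex mix of old and new features."""
    t = sigmoid(W_trans @ h + b_trans)   # gate t in (0, 1) per dimension
    return t * h_tilde + (1 - t) * h

def lossless_forwarding(h, h_tilde):
    """Residual-style update: context features pass through unchanged."""
    return h + h_tilde

rng = np.random.default_rng(2)
d = 4
h, h_tilde = rng.normal(size=d), rng.normal(size=d)
W, b = rng.normal(size=(d, d)), np.zeros(d)

out_as = adaptive_scaling(h, h_tilde, W, b)
out_lf = lossless_forwarding(h, h_tilde)
```

With the gate fully open (t near 1) AS returns the transformed features; fully closed (t near 0) it returns the untouched context vector, so in either regime the context information survives the layer.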