SEMI-SUPERVISED STANCE DETECTION IN TWEETS BASED ON SENTIMENT RULES
Marcelo Dias and Karin Becker
Instituto de Informática – UFRGS – Porto Alegre – Brazil
marcelo.dias@inf.ufrgs.br and karin.becker@inf.ufrgs.br
Introduction
- Opinion Analysis: detect sentiment polarity (negative or positive) towards a target (often mentioned in the text)
- Stance Detection: detect stance (against or favor) towards a given target (main target vs. indirect targets)
- A favor stance can be expressed through positive or negative sentiments (and vice versa)
Related Work
- Structured text or discussion threads (congress votes, online debates, ...): wider textual context to interpret content [Thomas et al. 2006] [Anand et al. 2011] [Somasundaran and Wiebe 2009]
- Tweets: short texts with poorly written content; rely more on inferences from static/dynamic properties of the platform [Rajadesingan and Liu 2014]
- Less focus on properties extracted from textual content only
- Most works adopt supervised methods
- Often address a binary problem (Favor/Against)
Goal
- Stance detection based only on the textual content of tweets
- Rule-based, semi-supervised method
- 3-class problem (Favor, Against and None)
- Improves on our earlier work: third place in SemEval 2016 Task 6-B (unsupervised, Trump target)
- Evaluate generality using several distinct domains: SemEval 2016 Task 6-A targets (supervised)
Process Overview
Process Overview: Automatic Labeling
Key and Target N-grams
- Key n-grams: terms/phrases that denote a stance by themselves
- Target n-grams: identify a target directly or indirectly related to the main target; combined with polarity to denote a stance
- Both may be Favor or Against
- Example (main target: Hillary Clinton):

N-GRAMS | FAVOR                        | AGAINST
KEY     | ReadyForHillary, Hillary2016 | StopHillary, MakeAmericaGreatAgain
TARGET  | Hillary, Democrats           | Trump, Republicans
Key and Target N-grams Identification
- Input: domain corpus
- Current selection: n-gram frequency ranking, followed by manual selection of the top frequent n-grams
- Output: selected Key and Target n-grams
- Currently evaluating automatic n-gram selection methods
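The frequency-ranking step can be sketched as follows; the whitespace tokenization and the toy corpus are illustrative assumptions, not the exact procedure used in the work.

```python
from collections import Counter
from itertools import chain

def ngram_ranking(tweets, n=1, top_k=10):
    """Rank word n-grams of a domain corpus by frequency, producing
    candidates for the manual Key/Target n-gram selection."""
    def ngrams(tokens, n):
        return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    counts = Counter(chain.from_iterable(
        ngrams(t.lower().split(), n) for t in tweets))
    return counts.most_common(top_k)

tweets = ["Hillary2016 I am ready", "stop Hillary now", "Hillary2016 yes"]
print(ngram_ranking(tweets, n=1, top_k=1))  # [('hillary2016', 2)]
```

A domain expert would then scan this ranking and keep only the n-grams that denote a stance (Key) or name a relevant target (Target).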
Rules x Stance
FEATURES:
- Presence of at least one Favor/Against Key n-gram
- Presence of at least one Favor/Against Target n-gram
- Presence of at least one hashtag
- Tweet polarity
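A minimal sketch of how these features could feed the labeling rules, reusing the Hillary Clinton lexicon from the earlier slide; the slides do not spell out the exact rule set, so the rule logic below is a simplified assumption.

```python
# Lexicons from the Key/Target n-grams example (main target: Hillary Clinton)
FAVOR_KEY = {"readyforhillary", "hillary2016"}
AGAINST_KEY = {"stophillary", "makeamericagreatagain"}
FAVOR_TARGET = {"hillary", "democrats"}
AGAINST_TARGET = {"trump", "republicans"}

def label_tweet(text, polarity):
    """polarity: 'positive', 'negative' or 'neutral', as produced by
    off-the-shelf sentiment APIs. Returns a stance label or None
    (None means the tweet is discarded: no rule fired)."""
    tokens = {t.lstrip("#").lower() for t in text.split()}
    # Key n-grams denote a stance by themselves
    if tokens & FAVOR_KEY:
        return "FAVOR"
    if tokens & AGAINST_KEY:
        return "AGAINST"
    # Target n-grams denote a stance only when combined with polarity
    if tokens & FAVOR_TARGET and polarity != "neutral":
        return "FAVOR" if polarity == "positive" else "AGAINST"
    if tokens & AGAINST_TARGET and polarity != "neutral":
        return "AGAINST" if polarity == "positive" else "FAVOR"
    return None

print(label_tweet("#ReadyForHillary lets go", "positive"))  # FAVOR
```

Note the inversion for indirect targets: a negative tweet about Trump is labeled FAVOR towards Hillary Clinton.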
Automatic Labeling
- Input: selected n-grams and a dataset
- Tweet pre-processing: feature extraction, tweet polarity detection (combination of off-the-shelf APIs)
- Rule application
- Output: filtered labeled tweets and discarded tweets
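The polarity-detection step combines several off-the-shelf sentiment APIs; the slides do not say how they are combined, so the majority vote below is only one plausible strategy, shown as an assumption.

```python
from collections import Counter

def combined_polarity(votes):
    """Combine polarity outputs ('positive'/'negative'/'neutral') from
    several off-the-shelf sentiment APIs by majority vote; falls back
    to 'neutral' when no label has a strict majority."""
    winner, n = Counter(votes).most_common(1)[0]
    if n <= len(votes) / 2:
        return "neutral"
    return winner

print(combined_polarity(["positive", "positive", "negative"]))  # positive
```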
Predictive Model Generation
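The rule-labeled tweets then train a supervised predictive model that generalizes beyond the selected n-grams. The slides do not fix a specific classifier, so the tiny bag-of-words Naive Bayes below (with Laplace smoothing) is an illustrative stand-in on a toy auto-labeled set, not the actual model used.

```python
import math
from collections import Counter, defaultdict

class TinyNB:
    """Minimal multinomial Naive Bayes over bag-of-words features."""
    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.class_counts = Counter(labels)
        for text, y in zip(texts, labels):
            self.word_counts[y].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        def score(y):  # log prior + smoothed log likelihoods
            c = self.word_counts[y]
            total = sum(c.values())
            s = math.log(self.class_counts[y])
            for w in text.lower().split():
                s += math.log((c[w] + 1) / (total + len(self.vocab)))
            return s
        return max(self.class_counts, key=score)

# Train on tweets labeled automatically by the rules, classify unseen ones
auto_labeled = [("great job hillary", "FAVOR"), ("love hillary", "FAVOR"),
                ("stop hillary now", "AGAINST"), ("hillary is awful", "AGAINST")]
model = TinyNB().fit([t for t, _ in auto_labeled], [y for _, y in auto_labeled])
print(model.predict("hillary is great"))  # FAVOR
```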
Method Overview: Stance Detection
Experiments
- Goal: generality of the method for stance detection
- 6 datasets on various domains
- Evaluated: rule coverage, rule precision, stance prediction
Datasets: SemEval 2016 – Task 6
- Stance: Against, Favor or None
- Subtask A (supervised): 5 targets with 2 datasets each (training and test): Atheism, Climate Change is a Real Concern, Feminism, Hillary Clinton, Legalization of Abortion
- Subtask B (semi-supervised/unsupervised): 1 target with 2 datasets (domain and test): Donald Trump
- Source: http://www.saifmohammad.com/WebPages/StanceDataset.htm
Rules Coverage
- Average corpus coverage: 75%
- In general, Rules 2, 3, 4 and 7 were the most representative (13% to 17%)
- Rules 5 and 6 are representative only for Atheism
- Rule 1 is representative only for Feminism
Rules Precision
[Bar chart: precision per rule (Rules 1–7), y-axis 0–100%]
Automatic Labeling x Predictive Model
[Bar chart: precision (weighted average) of automatic labeling vs. the predictive model per target — Abortion, Atheism, Climate, Feminism, Hillary, Trump; values range from 35 to 77]
Results x Baseline
[Bar chart: our result vs. the SemEval winner per target; scores range from 0.42 to 0.63]
- Except for Trump, all the baselines were developed using a supervised method
Strengths and Weaknesses
Strengths:
- Simplicity of the method
- May be applied to different domains/targets
- Simplifies the manual corpus annotation effort (restricted to n-grams)
Weaknesses:
- Dependent on the appropriate selection of n-grams, which requires domain knowledge
- Some rules do not perform well
- Performance depends on the prevalence of the class
Future Work
- Automatic identification of key and target n-grams
- Revised set of rules
- Improved identification of the neutral stance
- Improvement of the supervised-learning predictive models: predictive model features, automatic extraction of training instances from authority Twitter profiles, classification algorithms or committees