
An Empirical Study on Uncertainty Identification in Social Media Context
Zhongyu Wei (1), Junwen Chen (1), Wei Gao (2), Binyang Li (1), Lanjun Zhou (1), Yulan He (3), and Kam-Fai Wong (1)
(1) The Chinese University of Hong Kong
(2) Qatar Computing Research Institute, Qatar Foundation, Doha, Qatar
(3) School of Engineering and Applied Science, Aston University, Birmingham, UK
The 51st Annual Meeting of the Association for Computational Linguistics, August 5th, 2013, Sofia, Bulgaria

Background
- Earthquake warning
- ….
- Election prediction

Background
- Factuality

Uncertainty
- "Uncertainty" can be interpreted as lack of information: the receiver of the information (i.e., the hearer or the reader) cannot be certain about some pieces of information.

Uncertainty: related work
- Binary uncertainty classification on formal text
  - CoNLL shared task 2010
- Existing uncertainty corpora
  - FactBank (newswire)
  - BioScope (biology papers)
  - Wikipedia Weasels (Wikipedia articles)

Motivation
- 2011 London Riots dataset
  - 18.9% of 326,747 tweets contain an uncertainty keyword (probably, possibly, maybe, …)
- Little prior work on social media
  - Uncertainty identification is domain dependent.
  - No corpus is available in the social media context.

Contribution
- We propose a variant of the classification scheme for uncertainty identification in the social media context.
- We construct the first uncertainty dataset in the social media context.
- We perform uncertainty identification experiments and explore the effectiveness of different types of features.

Traditional Classification*
- Epistemic: on the basis of our world knowledge, we cannot decide at the moment whether the statement is true or false.
  - Possible: It may be raining.
  - Probable: It is probably raining.
- Hypothetical: this type of uncertainty includes four sub-classes.
  - Doxastic: I believe Tom can win the game.
  - Investigation: I examined the result and found … ….
  - Condition: If Tom can win, I will buy you lunch.
  - Dynamic: I hope Tom can win.
* Ferenc Kiefer. 2005. Lehetőség és szükségszerűség [Possibility and necessity]. Tinta Kiadó, Budapest.

Preliminary experiment
- 827 tweets annotated
  - Traditional scheme: 65 uncertain
  - Manual annotation: 246 uncertain
- More than 70% of the uncertain tweets are missed by the traditional scheme.
- Uncertainty is expressed differently on social media.
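The "more than 70%" figure can be checked with a few lines of arithmetic. This is only a sketch: it assumes the 65 tweets flagged under the traditional scheme are a subset of the 246 manually labeled ones, which the slide does not state, and the variable names are illustrative.

```python
# Counts taken from the slide.
manual_uncertain = 246   # tweets judged uncertain by manual annotation
scheme_uncertain = 65    # tweets judged uncertain under the traditional scheme

# Assumption: every tweet found by the scheme is also in the manual set.
missed = manual_uncertain - scheme_uncertain
missed_share = missed / manual_uncertain

print(f"{missed} of {manual_uncertain} uncertain tweets missed ({missed_share:.1%})")
```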

Uncertainty in social media: three observations
- No tweet falls under the category of investigation.
  - @dobibid I have tested the link, it is fake!
- Uncertainty is expressed by asking a question.
  - @ITVCentral Can you confirm that Birmingham children's hospital has/hasn't been attacked by rioters?
- Uncertainty is expressed by quoting external information.
  - Friend who works at the children's hospital in Birmingham says the riot police are protecting it.

Classification for social media

Category      Subtype    Cue           Example
Epistemic     Possible   may           It may be raining.
Epistemic     Probable   likely        It is probably raining.
Hypothetical  Condition  if            If it rains, we'll stay in.
Hypothetical  Doxastic   believe       He believes that the Earth is flat.
Hypothetical  Dynamic    hope          fake picture of the london eye on fire... i hope
Hypothetical  External   someone said  Someone said that London zoo was attacked.
Hypothetical  Question   seriously?    Birmingham riots are moving to the children hospital?! seriously?

The proposed scheme is based on Kiefer's work (2005), which was previously extended by Szarvas et al. (2012) to normalize uncertainty corpora in different genres.

References
- Ferenc Kiefer. 2005. Lehetőség és szükségszerűség [Possibility and necessity]. Tinta Kiadó, Budapest.
- György Szarvas, Veronika Vincze, Richárd Farkas, György Móra, and Iryna Gurevych. 2012. Cross-genre and cross-domain detection of semantic uncertainty. Computational Linguistics, 38(2):335–367.

Annotation
- London Riots dataset
  - August 6-13, 2011
  - 4,743 unique tweets related to seven riot events*
- Annotation scheme
  - Two trained annotators.
  - Binary judgment in terms of the author's intended meaning.
  - Sub-class label for tweets with an uncertainty label.
  - A third annotator for the final decision.
  - Cue-phrase identification to form an uncertainty cue-phrase list.
* Identified by the UK newspaper "The Guardian".

Annotation
- Tweet #: 4,743
- Uncertainty #: 926 (19.52%)
- Kappa agreement:
  - 0.9073 for binary classification
  - 0.8271 for fine-grained annotation

Category      Subtype    Count
Epistemic     Possible   16
Epistemic     Probable   129
Hypothetical  Condition  71
Hypothetical  Doxastic   48
Hypothetical  Dynamic    21
Hypothetical  External   208
Hypothetical  Question   488
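The slide reports kappa agreement without naming the variant. With two annotators, Cohen's kappa is a natural reading, so the sketch below assumes that statistic. The function name and the toy labels are invented for illustration and are not taken from the dataset.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1.0 - expected)

# Toy binary labels (invented): 1 = uncertain, 0 = certain.
annotator_1 = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
annotator_2 = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.4f}")
```

The same function works for the fine-grained labels, since it only compares label identity and frequencies.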

Experiment setup
- Task
  - Uncertain tweet identification
- Approaches
  - Cue-phrase matching (CP)
  - Supervised machine learning (SVM)
    - N-grams (unigram + bigram + trigram)
    - Content-based features
    - Twitter-specific features
    - User-based features
- Evaluation
  - 5-fold cross-validation
  - Precision, recall, F-1
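A minimal sketch of the cue-phrase matching baseline and the evaluation metrics follows. The cue list here is hypothetical (the real list was built during annotation and is not shown in the slides), the matching rule is an assumption, and the fifth tweet and all gold labels are invented for illustration.

```python
import re

# Hypothetical cue list, loosely based on the cues shown in the scheme table.
CUE_PHRASES = ["may", "probably", "if", "believe", "hope", "someone said", "seriously?"]

def contains_cue(tweet, cues=CUE_PHRASES):
    """Return True if any cue phrase occurs as a whole word or phrase."""
    text = tweet.lower()
    for cue in cues:
        # Lookarounds stop "if" from matching inside "life", for example.
        pattern = r"(?<!\w)" + re.escape(cue.lower()) + r"(?!\w)"
        if re.search(pattern, text):
            return True
    return False

def precision_recall_f1(gold, predicted):
    """Precision, recall and F-1 for the positive (uncertain) class."""
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

tweets = [
    "It may be raining.",
    "@dobibid I have tested the link, it is fake!",
    "Someone said that London zoo was attacked.",
    "If it rains, we'll stay in.",
    "I hope you had a great birthday!",  # invented: cue present, no uncertainty
    "Birmingham riots are moving to the children hospital?! seriously?",
]
gold = [True, False, True, True, False, True]  # invented gold labels
predicted = [contains_cue(t) for t in tweets]
precision, recall, f1 = precision_recall_f1(gold, predicted)
print(f"P={precision:.4f} R={recall:.4f} F1={f1:.4f}")
```

The invented fifth tweet shows how a cue word can appear without real uncertainty, which costs the baseline precision while leaving recall untouched.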

Experiment: features

Category          Name             Description
Content-based     Length           Length of the tweet
Content-based     Cue_Phrase       Whether the tweet contains an uncertainty cue
Content-based     OOV_Ratio        Ratio of out-of-vocabulary words
Twitter-specific  URL              Whether the tweet contains a URL
Twitter-specific  URL_Count        Frequency of URLs in corpus
Twitter-specific  Retweet_Count    How many times the tweet has been retweeted
Twitter-specific  Hashtag          Whether the tweet contains a hashtag
Twitter-specific  Hashtag_Count    Number of hashtags in the tweet
Twitter-specific  Reply            Whether the tweet is a reply
Twitter-specific  Retweet          Whether the tweet is a retweet
User-based        Follower_Count   Number of followers the user has
User-based        List_Count       Number of lists the user has
User-based        Friend_Count     Number of friends the user has
User-based        Favorites_Count  Number of favorites the user has
User-based        Tweet_Count      Number of tweets the user has published
User-based        Verified         Whether the user is verified
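Some of the features in the table can be derived from the tweet text alone, and the sketch below illustrates those together with the n-gram features. Several choices are assumptions rather than the authors' definitions: Length is measured in characters, Reply and Retweet are guessed from a leading "@" or "RT @", and the vocabulary, cue set and sample tweet are invented. User-based features need account metadata and are left out.

```python
import re

URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")
TOKEN_RE = re.compile(r"[a-z']+")

def ngrams(tokens, max_n=3):
    """All unigrams, bigrams and trigrams of a token list, as tuples."""
    return [tuple(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]

def extract_features(tweet, vocabulary, cues):
    """Text-derivable features; names follow the feature table."""
    text = tweet.lower()
    urls = URL_RE.findall(text)
    hashtags = HASHTAG_RE.findall(text)
    # Drop URLs, mentions and hashtags before counting ordinary words.
    plain = re.sub(r"https?://\S+|[@#]\w+", " ", text)
    tokens = TOKEN_RE.findall(plain)
    oov = [t for t in tokens if t not in vocabulary]
    return {
        "Length": len(tweet),                       # assumption: characters
        "Cue_Phrase": any(t in cues for t in tokens),
        "OOV_Ratio": len(oov) / len(tokens) if tokens else 0.0,
        "URL": bool(urls),
        "Hashtag": bool(hashtags),
        "Hashtag_Count": len(hashtags),
        "Reply": tweet.startswith("@"),             # heuristic assumption
        "Retweet": text.startswith("rt @"),         # heuristic assumption
    }

# Invented example inputs.
vocabulary = {"riots", "in", "may", "spread"}
cues = {"may", "probably", "hope"}
tweet = "@ITVCentral omg riots in #birmingham may spread http://t.co/x"
feats = extract_features(tweet, vocabulary, cues)
print(feats)
```

In a full pipeline these values would be appended to the n-gram vector before training the classifier. That step is omitted here because the slides do not describe the SVM configuration.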

Experiment: overall results

Approach          Precision  Recall  F-1
CP                0.3732     0.9589  0.5373
SVM n-gram        0.7278     0.8259  0.7737 (+43.9%*)
SVM n-gram+C      0.8010     0.8260  0.8133
SVM n-gram+U      0.7708     0.8271  0.7979
SVM n-gram+T      0.7578     0.8266  0.7907
SVM n-gram+ALL    0.8162     0.8269  0.8215

C: content-based features. U: user-based features. T: Twitter-specific features. ALL: the combination of C, U and T.
* Compared to CP.
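As a quick consistency check, F-1 can be recomputed from the reported precision and recall, and the starred relative gain from the two F-1 values. The helper names are illustrative, and the inputs are the rounded figures from the table, so small differences in the last digit are expected.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def relative_gain(new, baseline):
    """Improvement of `new` over `baseline`, as a fraction of the baseline."""
    return (new - baseline) / baseline

# (precision, recall, reported F-1) as given in the results table.
reported = {
    "CP": (0.3732, 0.9589, 0.5373),
    "SVM n-gram": (0.7278, 0.8259, 0.7737),
    "SVM n-gram+ALL": (0.8162, 0.8269, 0.8215),
}

for name, (p, r, f1) in reported.items():
    print(f"{name}: recomputed F-1 = {f1_score(p, r):.4f}, reported = {f1:.4f}")

gain = relative_gain(0.7737, 0.5373)
print(f"relative F-1 gain of SVM n-gram over CP: {gain:.1%}")
```

The recomputed values agree with the table to within rounding, and the relative gain comes out at roughly 44%, in line with the +43.9% shown.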

Experiment: performance of content-based features

Approach                 Precision  Recall  F-1
SVM n-gram+Cue-Phrase    0.7989     0.8266  0.8125
SVM n-gram+Length        0.7372     0.8216  0.7715
SVM n-gram+OOV_Ratio     0.7414     0.8233  0.7802

- The presence of an uncertainty cue phrase is the most indicative feature.

Experiment: classification errors of SVM n-gram+ALL

Type     Poss.  Prob.  D.&D.  Cond.  Que.  Ext.
Total#   16     129    69     71     488   208
Error#   11     20     18     11     84    40
Error%   0.69   0.16   0.26   0.15   0.17  0.23

- Dynamic and doxastic are combined (D.&D.) for error analysis.
- Performance is worst on the two categories with the fewest samples.

Conclusion
- We propose a variant of the classification scheme for uncertainty identification in social media.
- We perform uncertainty identification experiments and explore the effectiveness of different types of features.
- In the future, we will explore using uncertainty identification in social media applications.

Questions or Suggestions?

Zhongyu Wei (魏忠鈺)
http://www.se.cuhk.edu.hk/~zywei/
zywei@se.cuhk.edu.hk

Kam-Fai Wong (黃錦輝)
http://www.cintec.cuhk.edu.hk/kfwong/
kfwong@se.cuhk.edu.hk
