Predicting the Future with Deep Learning and Signals from Social - PowerPoint PPT Presentation

Predicting the Future with Deep Learning and Signals from Social Media
Svitlana Volkova, PhD
Senior Research Scientist, Data Sciences and Analytics Group, National Security Directorate, Pacific Northwest National Laboratory
ACL Workshop on Natural Language Processing and Computational Social Science, August 10, 2017

Social Media Analytics: Predictive Analytics and Forecasting Analytics
[Figure: overview of the prediction and forecasting tasks, each paired with a neural network diagram (embedding, LSTM/GRU or convolutional, dense, and sigmoid/softmax output layers). Labels recoverable from the figure: identify suspicious accounts; forecast perspective dynamics; Brussels Bombings, March 2016; Russia-Ukraine Conflict, 2014–2015; event types; predicted weekly ILI proportions from ILI and SM predictors; binary classification from English input (bytes).]

Outline
- Predicting Suspicious and Trusted News on Twitter (joint work with K. Shaffer, J. Yang, and N. Hodas)
- Analyzing and Forecasting Targeted Perspectives in Social Media (collaboration with H. Rashkin and Y. Choi)
- Forecasting Short-Term Change in Text Representations during Crisis Events from VK (joint work with I. Stewart, D. Arendt, and E. Bell)
[Figures: the suspicious-news network architecture, and a diagram relating writer, reader, agent, and theme. Annotations on the diagram: the writer portrays the agent as being unfairly opportunistic; the predicate doesn't directly imply what the writer thinks of the theme; the agent is unfairly taking advantage of the theme.]


Motivation and Background
- 62% of U.S. adults get news on social media (Pew Research, Oct 2016)
- 64% of U.S. adults said that “made-up news” has caused a “great deal of confusion” about the facts of current events (Pew Research, Dec 2016)
- Previous work on deception detection:
  - Deceptive Amazon reviews (Choi, Mihalcea)
  - Satirical news (Rubin et al., 2015)
  - Rumors (Qazvinian et al., 2011; Liu et al., 2015)
Separating Facts from Fiction: Linguistic Models to Classify Suspicious and Trusted News Posts on Twitter. S. Volkova, K. Shaffer, J. Yea Jang and N. Hodas. ACL 2017.

Deceptive News
- Google Fact Checking: https://www.blog.google/topics/journalism-news/expanding-fact-checking-google/
- Facebook 3rd Party Verification: http://newsroom.fb.com/news/2016/12/news-feed-fyi-addressing-hoaxes-and-fake-news/

Deceptive News Types
Scale from Intent to Deceive to No Intent to Deceive: Propaganda, Hoax, Clickbait, Satire
- Propaganda: deliberately spreads misinformation in order to appeal to certain groups
- Hoax: seeks to mislead, rather than entertain, readers for financial or political gain
- Clickbait: takes bits of true stories but insinuates and makes up other details to sow fear
- Satire: makes fun of the news, has a satirical bent, or parodies the news

Twitter News Data
- Propaganda, Hoax, Clickbait, Satire: 130K tweets total, 65K suspicious
- Disinfo, Propaganda, Conspiracy, Hoax, Clickbait: 2M suspicious tweets
Both sets are placed on a scale between Intent to Deceive and No Intent to Deceive.

News Categorization http://www.marketwatch.com/story/how-does-your-favorite-news-source-rate-on-the-truthiness-scale-consult-this-chart-2016-12-15

Alternative News Categorization http://www.marketwatch.com/story/how-does-your-favorite-news-source-rate-on-the-truthiness-scale-consult-this-chart-2016-12-15

Annotations
- Brussels bombing dataset: March 15 – March 29, 2016 (one week before and one week after March 22nd, 2016)
- Account-level vs. tweet-level annotations:
  - Fake news annotations: http://www.fakenewswatch.com/
  - PropOrNot: http://www.propornot.com/p/the-list.html (manually verified)
- Signs of propaganda:
  - Tries to persuade
  - Influences emotions, attitudes, opinions, and actions
  - Targets audiences for political, ideological, and religious purposes
  - Contains selectively omitted and one-sided messages

Task Definition
Build tweet-level neural network models to differentiate between:
- Verified vs. unverified news posts (130K)
- Types of unverified news posts: propaganda, hoax, clickbait, satire (65K)
- Disinformation, propaganda, conspiracy, clickbait, hoaxes (2M)

Model
- Baselines: logistic regression with TFIDF and Doc2Vec representations
- Our models: neural networks (RNN/CNN) with social network interaction and linguistic cues: hedging, assertive, factive, implicative verbs
[Figure: network architecture. Input word sequences feed an embedding layer (200 units) and an LSTM/convolutional layer (100 units); network/linguistic cues feed dense layers (100 units each); the branches are joined by tensor concatenation, followed by a dense layer (100 units) and a probability activation layer (sigmoid/softmax) that produces the final output probabilities.]
Keras: https://keras.io/, scikit-learn: http://scikit-learn.org/stable/, Doc2Vec: https://pypi.python.org/pypi/gensim
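The linguistic-cue input named on this slide can be pictured as a small per-tweet feature vector. The sketch below is only an illustration: the word lists in `CUE_LEXICONS` and the helper `cue_features` are hypothetical, since the slides do not give the actual lexicons or feature scaling.

```python
import re

# Hypothetical mini-lexicons, for illustration only; the talk does not list its word lists.
CUE_LEXICONS = {
    "hedging": {"may", "might", "possibly", "apparently", "perhaps"},
    "assertive": {"claim", "insist", "assert", "declare"},
    "factive": {"know", "realize", "reveal", "prove"},
    "implicative": {"manage", "fail", "forget", "bother"},
}


def cue_features(tweet):
    """Return one normalized count per cue category, in CUE_LEXICONS order."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    n_tokens = max(len(tokens), 1)  # avoid division by zero on empty tweets
    return [
        sum(token in lexicon for token in tokens) / n_tokens
        for lexicon in CUE_LEXICONS.values()
    ]


features = cue_features("They might possibly know")
```

A vector like this would play the role of the "Network/Linguistic Cues" input that is concatenated with the text branch; real lexicons would be far larger than the ones assumed here.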

Linguistic Analysis
- Moral Foundation Theory (Haidt and Graham, 2007; Graham et al., 2009): Harm, Care, Loyalty, Betrayal, Authority
- Biased Language (Recasens et al., 2013): Assertive, Factive, Hedging, Implicative, Report Verbs
- Subjective Language (Volkova et al., 2013; Liu et al., 2005; Riloff et al., 2003)
[Figure: three groups of cue differences; the group labels did not survive extraction.]
- Betrayal↑, Care↑, Loyalty↓, Hedging↓, Implicative↓
- Loyalty↑, Hedges↑, Subj↑, Betrayal↓
- Care↓, Subjective↓, Factive↓, Bias↓

Verified vs. Suspicious Prediction Results
Binary: linguistic and social graph features (130K tweets, 10-fold c.v.)
[Bar chart: accuracy (axis 0.6 to 1) for LR D2V, LR TFIDF, RNN, and CNN across the feature sets text, + graph, + ling. cues, and all. Values printed on the chart: 0.95, 0.93, 0.81, 0.76.]
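The 10-fold cross-validation mentioned on this slide can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' evaluation code: the helper `k_fold_indices`, the shuffling seed, and the round-robin fold assignment are all choices made here for the example.

```python
import random


def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train, test) index lists so each item is tested exactly once."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    folds = [indices[i::k] for i in range(k)]  # round-robin assignment to k folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(25, k=10))
```

Accuracy would then be computed on each held-out fold and averaged over the ten folds.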

Suspicious News Prediction Results (1)
Multi-class prediction: satire, hoaxes, clickbait, propaganda (65K)
[Bar chart: F1 macro (axis 0.2 to 0.8) for RNN, CNN, LR TFIDF, and LR D2V across the feature sets text, + network, + ling. markers, and all. Values printed on the chart: 0.71, 0.66, 0.63, 0.63.]
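For reference, macro F1 is the unweighted mean of the per-class F1 scores, so rare classes count as much as frequent ones. A minimal sketch of the standard definition follows; the function name `macro_f1` and the toy labels are illustrative and not taken from the talk.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with two of the four classes.
score = macro_f1(["satire", "satire", "hoax", "hoax"],
                 ["satire", "hoax", "hoax", "hoax"])
```

In the toy example the per-class scores are 2/3 for satire and 4/5 for hoax, so the macro average sits between them regardless of how many tweets each class has.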

Suspicious News Prediction Results (2)
Multi-class prediction: disinformation, propaganda, conspiracy, clickbait, hoaxes (2M)
[Bar chart: F1 macro (axis 0 to 1) comparing 4-way (no disinfo) and 5-way classification across the feature sets words, + network, and + deepwalk. Values printed on the chart: 0.85, 0.84, 0.78, 0.76, 0.67, 0.65. A second row of values also appears: 0.98, 0.92, 0.71, 0.64, 0.61.]
