DeClarE: Debunking Fake News and False Claims using Evidence-Aware Deep Learning
Kashyap Popat, Subhabrata Mukherjee, Andrew Yates, Gerhard Weikum
EMNLP 2018
MOTIVATION
- “Rapid spread of misinformation online”: one of the top 10 challenges according to the World Economic Forum
- Many truth-checking websites manually verify or falsify claims
[1] https://www.aljazeera.com/news/2018/10/bolsonaro-continues-lead-polls-fake-news-scandal-181019220347524.html
[2] https://www.nytimes.com/2018/10/17/opinion/deep-fake-technology-democracy.html
RELATED WORK & LIMITATIONS
Truth Finding
- Approaches: conflict resolution amongst multi-source structured data; joint inference of source reliability and truth
- Limitations: limited to structured data; no linguistic cues
Communities & Social Media
- Approaches: probabilistic graphical models; social network analysis
- Limitations: focused only on communities; community-specific features
Natural Language Claims
- Approaches: supervised approaches
- Limitations: no external evidence; substantial feature modeling
PROBLEM STATEMENT
- Assess the credibility (true/false) of textual claims
- Present interpretable evidence supporting the assessment
[Diagram: a textual claim and evidence from the World Wide Web go into DeClarE*, which outputs True or False]
*DeClarE: Debunking Claims with Interpretable Evidence
TEXTUAL CLAIMS
- Tucker Carlson: “Far more children died last year drowning in their bathtubs than were killed accidentally by guns.”
- conservativeflashnews.com: “President Obama ordered a life-sized bronze statue of himself to be permanently installed at the White House”
- Coca-Cola’s original diet cola drink, TaB, took its name from an acronym for “totally artificial beverage”.
EVIDENCE
Claim (conservativeflashnews.com): “President Obama ordered a life-sized bronze statue of himself to be permanently installed at the White House”
- ABC News: An article making the rounds on Facebook falsely says that a bronze statue of former President Barack Obama will soon be in the entryway of the White House. But you won't be seeing it any time soon -- or any time at all. The story is false.
- The Florida Times: The emails have made their way across the internet. But reports that Obama ordered a $200,000 life-size bronze statue of himself to be “permanently installed in the White House” are totally false.
OUTLINE
- Motivation
- Problem Statement
- Key Contributors
- Network Architecture & Approach
- Experiments & Results
- Conclusion
KEY CONTRIBUTORS
- Evidence: search engine
- Language style and semantics of evidence: biLSTM
- Interaction between claim and external evidence: attention mechanism
- Trustworthiness of underlying sources: claim and evidence source embeddings
INPUT REPRESENTATIONS
- Claim and article: sequences of word embeddings, with claim words $d_m \in S^{e_x}$ and article words $[b_l]$, $b_l \in S^{e_x}$
- Claim source and article source: source embeddings $d_t \in S^{e_t}$ and $b_t \in S^{e_t}$
ARTICLE REPRESENTATION
- Language-style-aware article representation
- A biLSTM produces a hidden state output for each word in the evidence: $[b_l] \rightarrow [h_l]$
CLAIM SPECIFIC ATTENTION (1/2)
- Importance of each word in the article text w.r.t. the claim
- Overall claim representation: $\bar{d} = \frac{1}{m} \sum_{m} d_m$
- Append with each article term: $\hat{b}_l = b_l \oplus \bar{d}$
[Diagram labels: $[\hat{b}_l]$, $d_m$, $[b_l]$, $[h_l]$]
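The two operations on this slide (averaging the claim word embeddings, then appending that average to every article word embedding) can be sketched in plain Python. This is a minimal illustration with toy 2-dimensional vectors; the function names `mean_vector` and `append_claim` are hypothetical and not taken from the authors' code.

```python
from typing import List

Vector = List[float]


def mean_vector(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors (the overall claim representation)."""
    if not vectors:
        raise ValueError("need at least one word vector")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def append_claim(article_words: List[Vector], claim_words: List[Vector]) -> List[Vector]:
    """Concatenate every article word vector with the averaged claim vector."""
    claim_avg = mean_vector(claim_words)
    return [word + claim_avg for word in article_words]


# Toy embeddings (made-up values, for illustration only).
claim_words = [[1.0, 3.0], [3.0, 5.0]]
article_words = [[0.5, 0.5], [0.1, 0.9], [0.0, 1.0]]

augmented = append_claim(article_words, claim_words)
print(augmented[0])  # [0.5, 0.5, 2.0, 4.0]
```

Each augmented vector has the word dimension plus the claim dimension, so the attention step sees the claim alongside every article term.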
CLAIM SPECIFIC ATTENTION (2/2)
- Claim specific attention weights: $[\beta_l]$
[Diagram labels: $[\hat{b}_l]$, $d_m$, $[b_l]$, $[h_l]$]
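The formula for the attention weights did not survive on this slide, so the following is only a sketch under stated assumptions: each claim-augmented article word is scored by a single tanh unit, and the scores are normalized with a softmax so that the weights are positive and sum to 1. The parameters `unit_weights` and `unit_bias` are hypothetical stand-ins for learned values.

```python
import math
from typing import List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn arbitrary scores into positive weights that sum to 1 (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(
    augmented_words: Sequence[Sequence[float]],
    unit_weights: Sequence[float],
    unit_bias: float,
) -> List[float]:
    """Assumed scoring: one tanh unit per claim-augmented word, then a softmax."""
    scores = [
        math.tanh(sum(w * x for w, x in zip(unit_weights, word)) + unit_bias)
        for word in augmented_words
    ]
    return softmax(scores)


# Toy claim-augmented article words (made-up values).
augmented_words = [
    [0.5, 0.5, 2.0, 4.0],
    [0.1, 0.9, 2.0, 4.0],
    [0.0, 1.0, 2.0, 4.0],
]
betas = attention_weights(augmented_words, unit_weights=[1.0, -1.0, 0.0, 0.0], unit_bias=0.0)
print(betas)
```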
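ATTENTION FOCUSED ARTICLE REPRESENTATION
- Attention focused article representation, built from the attention weights $[\beta_l]$ and the biLSTM hidden states $[h_l]$
[Diagram labels: $[\hat{b}_l]$, $[\beta_l]$, $d_m$, $[b_l]$, $[h_l]$]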
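One common way to combine attention weights with hidden states is a weighted sum, sketched below. The slide's own formula is not recoverable from the text, so treat the weighted sum (and the absence of any further normalization by article length) as an assumption; the function name is hypothetical.

```python
from typing import List, Sequence


def attention_focused_representation(
    hidden_states: Sequence[Sequence[float]],
    betas: Sequence[float],
) -> List[float]:
    """Weighted sum of hidden states: each h_l is scaled by its weight beta_l."""
    if len(hidden_states) != len(betas):
        raise ValueError("one attention weight is needed per hidden state")
    dim = len(hidden_states[0])
    return [sum(b * h[i] for b, h in zip(betas, hidden_states)) for i in range(dim)]


# Toy hidden states and weights (made-up values).
hidden_states = [[1.0, 0.0], [0.0, 1.0]]
betas = [0.75, 0.25]

representation = attention_focused_representation(hidden_states, betas)
print(representation)  # [0.75, 0.25]
```

Words with larger weights dominate the result, which is what lets the weights double as an interpretable highlight over the evidence text.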
CREDIBILITY SCORE
- Per-article credibility score
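The slide states only that a credibility score is produced per article. As a hedged sketch, the code below assumes that each article yields a raw output squashed to (0, 1) by a sigmoid, and that a claim-level score is the mean over its articles. Both the sigmoid and the averaging are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def claim_credibility(article_outputs: Sequence[float]) -> Tuple[List[float], float]:
    """Return the per-article scores and their mean (assumed aggregation)."""
    if not article_outputs:
        raise ValueError("a claim needs at least one evidence article")
    per_article = [sigmoid(z) for z in article_outputs]
    return per_article, sum(per_article) / len(per_article)


# Hypothetical raw outputs for three evidence articles about one claim.
per_article, overall = claim_credibility([2.0, -2.0, 0.0])
print(per_article, overall)
```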
EXPERIMENTS
Case Studies
- Snopes (SN): classification (~4300 claims)
- PolitiFact (PF): classification (~3500 claims)
- NewsTrust (NT): regression (~5344 news headlines)
- SemEval-2017 Task (SE): classification (~250 tweets)
Analysis
- Source embeddings
- Attention weights
EXPERIMENTAL SETUP
- Evaluation: 10% of the data for parameter tuning; 10-fold cross-validation on the remaining 90% of the data
- Keras with TensorFlow backend
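The evaluation protocol (hold out 10% for tuning, then run 10-fold cross-validation on the remaining 90%) can be sketched with the standard library alone. The shuffling, the fixed seed, and the round-robin fold assignment are assumptions; the slides do not say how the split was drawn.

```python
import random
from typing import Iterator, List, Sequence, Tuple


def tuning_and_folds(
    items: Sequence[int],
    tune_fraction: float = 0.1,
    n_folds: int = 10,
    seed: int = 0,
) -> Tuple[List[int], List[List[int]]]:
    """Shuffle, set aside a tuning portion, and split the rest into folds."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_tune = int(round(len(shuffled) * tune_fraction))
    tuning, rest = shuffled[:n_tune], shuffled[n_tune:]
    folds = [rest[i::n_folds] for i in range(n_folds)]  # round-robin assignment
    return tuning, folds


def cross_validation_splits(folds: List[List[int]]) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) pairs, using each fold once as the test set."""
    for i, test in enumerate(folds):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


claim_ids = list(range(100))  # hypothetical claim identifiers
tuning, folds = tuning_and_folds(claim_ids)
splits = list(cross_validation_splits(folds))
print(len(tuning), len(folds), len(splits[0][0]), len(splits[0][1]))  # 10 10 81 9
```

Keeping the tuning portion disjoint from every fold means hyperparameters are never chosen on data that is later used for testing.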
CASE STUDY: SNOPES & POLITIFACT
Snopes (~4300 claims)
- Verifies Internet rumors, hoaxes, and other claims
- Example: “The use of solar panels drains the sun of energy.”
- Example: “Entering your PIN in reverse at any ATM will automatically summon the police”
PolitiFact (~3500 claims)
- Verifies political claims made by politicians in the USA
- Example: Hillary Clinton: "The gun epidemic is the leading cause of death of young African-American men, more than the next nine causes put together."
Evidence
- Extracted ~30 top search results as evidence
EVALUATION
Baselines
- LSTM-Text (Rashkin et al., 2017): no usage of evidence
- CNN-Text (Wang, 2017): no usage of evidence
- DistantSup (Popat et al., 2017)
- DeClarE: our approach
Performance measures
- Per-class accuracies, macro F1, AUC
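For reference, per-class accuracy and macro F1 can be computed as in the sketch below, which uses toy labels and follows the standard definitions (macro F1 is the unweighted mean of the per-class F1 scores). AUC is left out to keep the example short. The helper names are hypothetical.

```python
from typing import Hashable, Sequence


def per_class_accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable], label: Hashable) -> float:
    """Fraction of examples with the given true label that were predicted correctly."""
    indices = [i for i, t in enumerate(y_true) if t == label]
    if not indices:
        raise ValueError(f"no examples with label {label!r}")
    return sum(1 for i in indices if y_pred[i] == label) / len(indices)


def f1_for_label(y_true: Sequence[Hashable], y_pred: Sequence[Hashable], label: Hashable) -> float:
    """Harmonic mean of precision and recall for one class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1, so both classes count equally."""
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


# Toy gold labels and predictions (True = credible claim, False = non-credible).
y_true = [True, True, True, False]
y_pred = [True, True, False, False]

print(per_class_accuracy(y_true, y_pred, True))   # accuracy on true claims
print(per_class_accuracy(y_true, y_pred, False))  # accuracy on false claims
print(macro_f1(y_true, y_pred, [True, False]))
```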
RESULTS: SNOPES & POLITIFACT
Dataset     Configuration  Macro-F1  AUC
Snopes      LSTM-Text      0.66      0.70
Snopes      CNN-Text       0.66      0.72
Snopes      DistantSup     0.82      0.88
Snopes      DeClarE        0.79      0.86
PolitiFact  LSTM-Text      0.63      0.66
PolitiFact  CNN-Text       0.64      0.67
PolitiFact  DistantSup     0.62      0.68
PolitiFact  DeClarE        0.68      0.75
CASE STUDY: NEWSTRUST
- News review community: members review news articles
- Each story: article, article source, user reviews and ratings (scale 1 to 5)
- Mapping to the model inputs:
  - Title of the article: claim
  - Article source: claim source
  - User reviews: evidence
  - User ids: evidence sources
- Regression task: predict the credibility score
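The mapping from a NewsTrust story to model inputs can be written down as a small data structure. The class and field names below are hypothetical, and using a single `rating` field as the regression target is an assumption; the slides do not say how the per-story credibility score is derived from the ratings.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Review:
    user_id: str
    text: str


@dataclass
class NewsTrustStory:
    title: str
    source: str
    reviews: List[Review]
    rating: float  # assumed overall story rating on the 1-to-5 scale


@dataclass
class ClaimRecord:
    claim: str
    claim_source: str
    evidence: List[str]
    evidence_sources: List[str]
    target: float


def to_claim_record(story: NewsTrustStory) -> ClaimRecord:
    """Title -> claim, article source -> claim source, reviews -> evidence, user ids -> evidence sources."""
    return ClaimRecord(
        claim=story.title,
        claim_source=story.source,
        evidence=[r.text for r in story.reviews],
        evidence_sources=[r.user_id for r in story.reviews],
        target=story.rating,
    )


# Placeholder story (invented values, for illustration only).
story = NewsTrustStory(
    title="Example headline",
    source="example-news.org",
    reviews=[Review("user_1", "Well sourced."), Review("user_2", "Lacks context.")],
    rating=3.5,
)
record = to_claim_record(story)
print(record.claim, record.evidence_sources, record.target)
```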
RESULTS: NEWSTRUST
- Additional baseline: CCRF+SVR (Mukherjee and Weikum, 2015)
- Performance measure: Mean Squared Error (MSE), lower is better
Configuration  MSE
CNN-Text       0.53
CCRF+SVR       0.36
LSTM-Text      0.35
DistantSup     0.35
DeClarE        0.29
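As a reminder of what the reported measure computes, here is a minimal mean squared error function applied to toy ratings. The values are invented and unrelated to the results above.

```python
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of squared differences between gold and predicted scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy gold ratings and predictions on the 1-to-5 scale.
gold = [2.0, 4.0]
predicted = [1.0, 5.0]
print(mean_squared_error(gold, predicted))  # 1.0
```

Because errors are squared, a single prediction that is off by 2 costs as much as four predictions that are each off by 1.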
ANALYZING ARTICLE SOURCE EMBEDDINGS
[Figure: article source embeddings, with groups labeled Fake Sources and Authentic Sources]
ANALYZING CLAIM SOURCE EMBEDDINGS
[Figure: claim source embeddings, with groups labeled Republicans and Democrats]
ANALYZING ATTENTION WEIGHTS
[Figure: attention weights; content not recoverable from the extracted text]
CONCLUSION
- Proposed an end-to-end neural network model
  - No feature modeling
  - Provides interpretable evidence
- Experiments on real-world claims demonstrate the effectiveness of our approach
- Considering external evidence helps!
- Datasets: https://www.mpi-inf.mpg.de/dl-cred-analysis/
Thank You!