
Twitter User Profiling: Bot and Gender Identification
7th Author Profiling Task, PAN 2019, CLEF Workshop
Dijana Kosmajac, Dr Vlado Keselj
Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, Canada

Overview
• Introduction
  • Bot Detection on Social Media
• Methodology
  • DNA-inspired User Behaviour Fingerprint
  • Diversity Measures
• Dataset of the 7th Author Profiling Task
• Experiments and Results
• Conclusion
Note: for the gender detection approach, please refer to the working notes.

Bot Detection on Social Media
• Social media platforms make it convenient for people to share, communicate, and collaborate.
• This openness is valuable, but it also enables malicious behaviour such as bullying, planning of terrorist attacks, and dissemination of fraudulent information.
• Important task: detect these abnormal activities as accurately and as early as possible to prevent disasters and attacks.
• This study addresses one subdomain of that task: bot detection.

Bot and Gender Detection on Social Media
• DeBot: Twitter Bot Detection via Warped Correlation, Chavoshi et al., 2016
• DNA-Inspired Online Behavioral Modeling and Its Application to Spambot Detection, Cresci et al., 2016

DNA-inspired User Behaviour Fingerprint
• First introduced by Cresci et al., 2016.
• Each tweet in the user timeline is mapped to one of 3 ∗ 2^3 = 24 different labels.
• Each label code becomes a character via ASCII(65 + code), so a timeline turns into a fingerprint string such as ACBCADDCCAF…
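The encoding can be sketched in a few lines of Python. The slide gives only the label count (3 ∗ 2^3 = 24) and the ASCII(65 + code) mapping. The three tweet types and the three binary flags below are assumptions made for illustration, and the `Tweet` class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumption: the factor 3 stands for three tweet types. The slide does not name them.
TWEET_TYPES = ("tweet", "retweet", "reply")


@dataclass
class Tweet:
    kind: str                       # one of TWEET_TYPES
    flags: Tuple[bool, bool, bool]  # assumption: three binary content features


def tweet_code(tweet: Tweet) -> int:
    """Map a tweet to an integer code in the range 0..23 (3 * 2^3 labels)."""
    type_index = TWEET_TYPES.index(tweet.kind)
    flag_bits = sum(int(flag) << i for i, flag in enumerate(tweet.flags))
    return type_index * 8 + flag_bits


def fingerprint(timeline: List[Tweet]) -> str:
    """Turn a timeline into a string, one character per tweet: ASCII(65 + code)."""
    return "".join(chr(65 + tweet_code(t)) for t in timeline)


timeline = [
    Tweet("tweet", (False, False, False)),
    Tweet("retweet", (True, False, False)),
    Tweet("reply", (True, True, True)),
]
print(fingerprint(timeline))  # AJX
```

With 24 codes the alphabet runs from A (code 0) to X (code 23), so every fingerprint is a plain uppercase string that ordinary text tools can process.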

DNA-inspired User Behaviour Fingerprint
• We used 1-, 2-, 3- and 4-grams.
• 3-gram example: [figure]
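As a rough illustration of character n-grams over a fingerprint, the sketch below slides a window of length n across the example string from the previous slide. The function name `ngram_counts` is hypothetical, and the trailing "…" of the slide example is dropped.

```python
from collections import Counter


def ngram_counts(fingerprint: str, n: int) -> Counter:
    """Count all overlapping character n-grams in a fingerprint string."""
    return Counter(fingerprint[i:i + n] for i in range(len(fingerprint) - n + 1))


example = "ACBCADDCCAF"
trigrams = ngram_counts(example, 3)
print(list(trigrams))
# ['ACB', 'CBC', 'BCA', 'CAD', 'ADD', 'DDC', 'DCC', 'CCA', 'CAF']
```

A string of length 11 yields 11 − 3 + 1 = 9 overlapping 3-grams. The same function with n = 1, 2 or 4 gives the other n-gram orders.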

Diversity Measures
• Yule’s $K = C\left[-\frac{1}{N} + \sum_{m=1}^{m_{\max}} V(m, N)\left(\frac{m}{N}\right)^{2}\right]$
• Shannon’s $H = -\sum_{i=1}^{V(N)} p_i \ln(p_i)$
• Simpson’s $D = \frac{1}{\sum_{i=1}^{V(N)} p_i^{2}}$
• Honore’s $R = 100\,\frac{\log(N)}{1 - \frac{V(1, N)}{V(N)}}$
• Sichel’s $S = \frac{V(2, N)}{N}$
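The five measures can be computed from a frequency table, as in the minimal sketch below. It assumes the usual lexical-richness reading of the symbols, which the slide does not spell out: N is the number of tokens, V(N) the number of distinct types, V(m, N) the number of types occurring exactly m times, p_i the relative frequency of type i, and C a scaling constant (10^4 is a common choice, assumed here). Sichel's S uses N in the denominator as on the slide, although it is often defined with V(N) instead. The function name is hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Iterable


def diversity_measures(tokens: Iterable[str], C: float = 1e4) -> Dict[str, float]:
    """Compute Yule's K, Shannon's H, Simpson's D, Honore's R and Sichel's S."""
    counts = Counter(tokens)
    N = sum(counts.values())            # total number of tokens
    if N == 0:
        raise ValueError("need at least one token")
    V = len(counts)                     # number of distinct types, V(N)
    spectrum = Counter(counts.values())  # m -> V(m, N)
    probs = [c / N for c in counts.values()]

    yule_k = C * (-1 / N + sum(v * (m / N) ** 2 for m, v in spectrum.items()))
    shannon_h = -sum(p * math.log(p) for p in probs)
    simpson_d = 1 / sum(p * p for p in probs)
    hapax = spectrum.get(1, 0)          # V(1, N): types seen exactly once
    # R is undefined when every type occurs once; report infinity in that case.
    honore_r = 100 * math.log(N) / (1 - hapax / V) if hapax < V else math.inf
    sichel_s = spectrum.get(2, 0) / N   # V(2, N) / N, following the slide
    return {
        "yule_k": yule_k,
        "shannon_h": shannon_h,
        "simpson_d": simpson_d,
        "honore_r": honore_r,
        "sichel_s": sichel_s,
    }


measures = diversity_measures("ACBCADDCCAF")  # the characters act as tokens
for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```

Here the tokens are single characters of a fingerprint. Passing a list of n-grams instead computes the same measures over n-gram types.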

Dataset
• Figure: t-SNE visualization of bots. (a) English, (b) Spanish
• English: 2,880 train and 1,240 dev
• Spanish: 2,080 train and 920 dev

Dataset
• Figure: diversity measures visualization for English. Panels: Honore’s R, Yule’s K, Shannon’s H, Simpson’s D, Sichel’s S.

Dataset
• Figure: diversity measures visualization for Spanish. Panels: Honore’s R, Yule’s K, Shannon’s H, Simpson’s D, Sichel’s S.

Experiments with language-specific training
• Experiment 1: character n-grams in the range 2-4, without diversity measures.
• Experiment 2: character n-grams in the range 1-3, with diversity measures.
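The two feature configurations can be sketched as one function with switches. This is an illustration only: the slides do not name the classifier or say exactly which sequence the diversity measures are computed over, so the sketch assumes they are computed over the extracted n-grams and includes just two of the five measures for brevity. All names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict


def char_ngrams(s: str, n_min: int, n_max: int) -> Counter:
    """Count overlapping character n-grams for every n in [n_min, n_max]."""
    return Counter(
        s[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(s) - n + 1)
    )


def build_features(fingerprint: str, n_min: int, n_max: int,
                   with_diversity: bool) -> Dict[str, float]:
    """Build a sparse feature dictionary for one user fingerprint."""
    grams = char_ngrams(fingerprint, n_min, n_max)
    features: Dict[str, float] = {f"ng:{g}": float(c) for g, c in grams.items()}
    if with_diversity and grams:
        total = sum(grams.values())
        probs = [c / total for c in grams.values()]
        # Assumption: diversity is measured over the n-gram distribution.
        features["div:shannon_h"] = -sum(p * math.log(p) for p in probs)
        features["div:simpson_d"] = 1 / sum(p * p for p in probs)
    return features


# Experiment 1 style: n-grams 2-4, no diversity measures.
e1 = build_features("ACBCADDCCAF", 2, 4, with_diversity=False)
# Experiment 2 style: n-grams 1-3, plus diversity measures.
e2 = build_features("ACBCADDCCAF", 1, 3, with_diversity=True)
print(len(e1), len(e2))
```

The resulting dictionaries could be fed to any classifier that accepts sparse features. Which one the authors used is not stated on the slides.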

Experiments with combined training
• Experiment 3: same as Experiment 1, but with the combined training set.
• Experiment 4: same as Experiment 2, but with the combined training set.

Official results
• 13th place overall, better than all baselines.

Conclusion and Future Work
• A novel yet simple method for bot detection on social media.
• Language independent, since it does not use language-specific features.
• Disadvantage: it does not consider language-specific features, which may be more fine-grained.
• Explore the effect of the length of the user fingerprint on the ability to differentiate bots from genuine users.
• Explore the effect of the timespan over which the fingerprint is collected.
• Explore the effect of using variable-length fingerprints.
• Explore the possibility of unsupervised bot detection using diversity measures and clustering.
