The rise of novel Twitter social spambots

SoBigData day @EUI, Florence, 11-10-2017. Marinella Petrocchi, IIT-CNR, Pisa, Italy.


  1. The rise of novel Twitter social spambots. SoBigData day @EUI, Florence, 11-10-2017. Marinella Petrocchi, IIT-CNR, Pisa, Italy.

  2. SPAMBOTS & SOCIAL NETWORKS: AN OPEN PROBLEM. Spambots: (semi-)automated accounts with (often) harmful intentions, such as spreading misinformation, stealing personal data, manipulating the stock market, and infiltrating political discourse.

  3. THE RISE OF THE SOCIAL BOTS. They escape detection techniques by evolving. On Twitter: fake followers (until 2012); 1st evolution (2012-2014); current (?) wave (2015-2017). New spambots are almost indistinguishable from genuine accounts. Reference: E. Ferrara, O. Varol, C. Davis, F. Menczer, and A. Flammini, “The rise of social bots,” Communications of the ACM, vol. 59, no. 7, pp. 96–104, 2016.

  4. FAKE FOLLOWERS

  5. NAIVE FAKE ACCOUNTS WERE EASY TO BUY

  6. The new wave: SOCIAL SPAMBOTS

  7. SOCIAL SPAMBOTS. Indistinguishable from genuine accounts if analyzed one by one. Analysis of the online behavior of large groups of users, with the goal of detecting possible spambots among them.

  8. The idea: MODELING THE ONLINE BEHAVIOR OF USERS. Behavior: the sequence of actions performed by an account. Digital DNA: each type of action is associated with a character (e.g., A, B, C). The online behavior of an account is modeled as a sequence of characters (i.e., a string, similar to biological DNA) according to the sequence of actions performed by that account.

  9. The idea: MODELING THE ONLINE BEHAVIOR OF USERS. Encoding: T = tweet, R = retweet, P = reply. Example: the timeline of a Twitter account is encoded as …RRTRPR.
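The encoding on this slide can be illustrated with a minimal Python sketch. The action labels (`"tweet"`, `"retweet"`, `"reply"`) and the function name `encode_timeline` are assumptions made for illustration; the slides only specify the three characters T, R, and P.

```python
# Hypothetical action labels; only the characters T, R, P come from the slides.
ACTION_TO_BASE = {"tweet": "T", "retweet": "R", "reply": "P"}


def encode_timeline(actions):
    """Map a chronological list of action labels to a digital DNA string."""
    return "".join(ACTION_TO_BASE[action] for action in actions)


# Illustrative timeline chosen to reproduce the fragment shown on the slide.
timeline = ["retweet", "retweet", "tweet", "retweet", "reply", "retweet"]
dna = encode_timeline(timeline)
print(dna)  # RRTRPR
```

An unknown action label raises a `KeyError` here, which is one possible design choice: it makes gaps in the alphabet visible instead of silently dropping actions.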

  10. DIGITAL DNA VS BIOLOGICAL DNA
      Digital DNA alphabet: T = tweet, R = retweet, P = reply.
      Biological DNA alphabet: A = adenine, G = guanine, T = thymine, C = cytosine.
      Digital DNA sequences: …RRTRPRTPRRPRTPRPTPRRTRPR, …RPRTPTTRPTRPTPRRRRTPPRPP, …TTTRRRPPTPRPTPRTRPTRRRTP, …PRTRPRTPPPPRTPRRPRTPPRRT, …TRTRPRTPRRPRTPRPTPTPPRTT, …TRPPRTPPTRPPTPRRTTTPPRPR
      Biological DNA sequences: …AGTCTCCATTTTCAGGTCGTA, …GTTTAAGATCGCCTCATCACC, …AGGCAATTCGCCTGAACTGG, …AGTCTCGATCCTTTCCTCGTT, …AAAATCGAACGCCTTGTCGG, …ATTCTCCATCGCCTAAACAAC

  11. Spambots characterization: SIMILARITY BETWEEN DIGITAL DNA SEQUENCES. Intuition: automated accounts (spambots) have similar DNA sequences. LCS (longest common substring): the longest substring shared by N sequences of digital DNA. Example: the sequences …TRRRPRRTRRPRTPRPTPRRTRPR, …RPRTPTTRRRPRRTPRRRRTPPRP, …TTTRRRPRRRPRRTRTRPTRRRTP, and …PRTRPRTPPPPRTPRRRRRPRRTR share the LCS RRRPRRT (length: 7 characters). Reference: M. Arnold and E. Ohlebusch, “Linear time algorithms for generalizations of the longest common substring problem,” Algorithmica, vol. 60, no. 4, pp. 806–818, 2011.
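The LCS example on this slide can be checked with a short Python sketch. It is not the linear-time algorithm of Arnold and Ohlebusch cited above: it swaps in a simpler approach, a binary search on the substring length combined with set intersection, which is slower but easy to verify. The function names are hypothetical, and the sequences are the visible fragments from the slide with the leading ellipsis dropped.

```python
def common_substrings(seqs, length):
    """Return the substrings of the given length that occur in every sequence."""
    common = None
    for s in seqs:
        grams = {s[i:i + length] for i in range(len(s) - length + 1)}
        common = grams if common is None else common & grams
        if not common:
            break
    return common


def longest_common_substring(seqs):
    """Longest substring shared by all sequences ('' if there is none).

    Binary search works because every prefix of a common substring is
    itself a common substring, so feasibility is monotone in the length.
    """
    lo, hi = 0, min(len(s) for s in seqs)
    best = ""
    while lo < hi:
        mid = (lo + hi + 1) // 2
        found = common_substrings(seqs, mid)
        if found:
            best = min(found)  # deterministic choice if several tie
            lo = mid
        else:
            hi = mid - 1
    return best


# Sequence fragments as shown on the slide.
sequences = [
    "TRRRPRRTRRPRTPRPTPRRTRPR",
    "RPRTPTTRRRPRRTPRRRRTPPRP",
    "TTTRRRPRRRPRRTRTRPTRRRTP",
    "PRTRPRTPPPPRTPRRRRRPRRTR",
]
lcs = longest_common_substring(sequences)
print(lcs, len(lcs))  # RRRPRRT 7
```

For large groups of accounts, a suffix-based linear-time method such as the one in the cited paper would be the more realistic choice; this sketch only illustrates what the measure computes.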

  12. Spambots characterization. LCS: SPAMBOTS VS HUMANS. LCS as a similarity measure.

  13. Spambots detection. LCS: SPAMBOTS + HUMANS (MIXED GROUP). (1) accounts with high similarity, (2) steep decrease in similarity, (3) accounts with low similarity.
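One way to picture the three regions is to locate the largest drop in a sequence of LCS lengths. The Python sketch below is an assumption-heavy illustration: the slides do not state how the steep decrease is located, the function name `steepest_drop` is hypothetical, and the curve values are invented purely for the example.

```python
def steepest_drop(lcs_lengths):
    """Return the index i where lcs_lengths[i] - lcs_lengths[i + 1] is largest."""
    drops = [a - b for a, b in zip(lcs_lengths, lcs_lengths[1:])]
    return max(range(len(drops)), key=drops.__getitem__)


# Invented values, not data from the talk: similarity stays high,
# falls sharply, then stays low.
curve = [40, 39, 39, 38, 12, 11, 10]
cut = steepest_drop(curve)
high_similarity = curve[:cut + 1]  # region (1)
low_similarity = curve[cut + 1:]   # region (3)
print(cut, high_similarity, low_similarity)
```

Under this assumption, accounts before the cut would be the candidates for the high-similarity (suspected spambot) group.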

  14. Spambots detection. DETECTION TECHNIQUES: 1. Unsupervised approach

  15. Spambots detection. DETECTION TECHNIQUES: 2. Supervised approach

  16. Spambots detection. DATASETS. Evaluation datasets: 1. Mixed1 (1982 accounts): 50% Bot1, 50% human. 2. Mixed2 (928 accounts): 50% Bot2, 50% human.

  17. Spambots detection. EVALUATION. References:
      C. Yang, R. Harkreader, and G. Gu, “Empirical evaluation and new design for fighting evolving Twitter spammers,” IEEE Transactions on Information Forensics and Security, vol. 8, no. 8, pp. 1280–1293, 2013.
      Z. Miller, B. Dickinson, W. Deitrick, W. Hu, and A. H. Wang, “Twitter spammer detection using data stream clustering,” Information Sciences, vol. 260, pp. 64–73, 2014.
      F. Ahmed and M. Abulaish, “A generic statistical approach for spam detection in online social networks,” Computer Communications, vol. 36, no. 10, pp. 1120–1129, 2013.

  18. TAKE-HOME MESSAGES
      • New evolutionary wave: social spambots
      • Current techniques fail to detect them
      • Detection via digital DNA analysis: effective and efficient (lightweight features, no graphs, linear complexity algorithms)
      References:
      Stefano Cresci, Roberto Di Pietro, Marinella Petrocchi, Angelo Spognardi, Maurizio Tesconi: “The Paradigm Shift of social spambots: Evidence, theories, and tools for the arms race”, WWW 2017.
      Stefano Cresci, Roberto Di Pietro, Marinella Petrocchi, Angelo Spognardi, Maurizio Tesconi: “Social Fingerprinting: Detection of spambots groups through DNA inspired behavioral modeling”, IEEE Transactions on Dependable and Secure Computing, 2017.
      Stefano Cresci, Roberto Di Pietro, Marinella Petrocchi, Angelo Spognardi, Maurizio Tesconi: “Exploiting digital DNA for the analysis of similarities in Twitter behaviours”, IEEE Data Science and Analytics, 2017.

  19. Questions? THANK YOU! Marinella Petrocchi marinella.petrocchi@iit.cnr.it http://mib.projects.iit.cnr.it/dataset.html
