WarningBird: Detecting Suspicious URLs in Twitter Stream, NDSS 2012

Sangho Lee and Jong Kim
POSTECH, Korea
NDSS 2012, February 8, 2012

Suspicious URLs in Twitter
• Twitter suffers from malicious tweets.
  – Containing URLs for spam, phishing, …
• Many detection schemes rely on
  – Features of Twitter accounts and messages
  – Features of URLs and content
• Many evasion techniques also exist.
  – Feature fabrication
  – Conditional redirection

Conditional Redirection
• Attackers distribute the initial URLs of conditional redirect chains via tweets.
• Conditional redirection servers lead
  – Normal browsers to malicious landing pages
  – Crawlers to benign landing pages
• Servers distinguish visitors by user agent, IP address, repeated visits, …
• Misclassifications can occur.

Motivation and Goal
• Attackers can evade previous detection schemes.
  – They selectively provide malicious content to normal browsers, not to investigators.
• We propose a novel suspicious URL detection system for Twitter.
  – Robust against evasion techniques
  – Detects suspicious URLs in real time

Outline
• Introduction
• Case Study
• Proposed Scheme
• Evaluation
• Discussion and Conclusion

Case Study
• blackraybansunglasses.com
• 6584 different accounts & short URLs (3% of daily sample)
• google.com for reused crawlers
• Random spam page for normal browsers

Outline
• Introduction
• Case Study
• Proposed Scheme
  – Basic Idea
  – System Overview
  – Derived Features
• Evaluation
• Discussion and Conclusion

Basic Idea
• Attackers need to reuse redirection servers.
  – They do not have an unlimited supply of redirection servers.
• We analyze a group of correlated URL chains.
  – To detect reused redirection servers
  – To derive features of the correlated URL chains

System Overview
• Data collection
  – Collect tweets with URLs from the public timeline
  – Visit each URL to obtain URL chains and IP addresses
• Feature extraction
  – Group domains with the same IP addresses
  – Find entry point URLs
  – Generate feature vectors for each entry point
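The feature extraction steps above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`group_domains_by_ip`, `canonicalize`, `find_entry_points`), the `min_chains` threshold, and the rule "an entry point is a canonical URL that appears in at least `min_chains` chains" are all hypothetical choices made here for the example.

```python
from collections import Counter
from urllib.parse import urlsplit, urlunsplit


def group_domains_by_ip(domain_ips):
    """Return {domain: representative}; domains sharing an IP share a representative.

    domain_ips maps each domain name to the set of IP addresses it resolved to.
    The representative is the lexicographically smallest domain in each group.
    """
    parent = {d: d for d in domain_ips}

    def find(d):
        while parent[d] != d:
            parent[d] = parent[parent[d]]  # path halving
            d = parent[d]
        return d

    first_seen = {}  # IP address -> first domain observed on that IP
    for domain in sorted(domain_ips):
        for ip in domain_ips[domain]:
            if ip in first_seen:
                ra, rb = find(domain), find(first_seen[ip])
                if ra != rb:
                    parent[max(ra, rb)] = min(ra, rb)
            else:
                first_seen[ip] = domain
    return {d: find(d) for d in domain_ips}


def canonicalize(url, representative):
    """Replace the URL's host with its group representative and drop the fragment."""
    parts = urlsplit(url)
    host = representative.get(parts.hostname, parts.hostname)
    return urlunsplit((parts.scheme, host, parts.path, parts.query, ""))


def find_entry_points(chains, representative, min_chains=2):
    """Count how many chains contain each canonical URL (assumed entry-point rule)."""
    counts = Counter()
    for chain in chains:
        counts.update({canonicalize(u, representative) for u in chain})
    return {u: n for u, n in counts.items() if n >= min_chains}


if __name__ == "__main__":
    # Hypothetical data: two redirectors on one IP, reached via different short URLs.
    ips = {"a.example": {"203.0.113.5"}, "b.example": {"203.0.113.5"}}
    chains = [
        ["http://short.example/x1", "http://a.example/r", "http://land1.example/"],
        ["http://short.example/x2", "http://b.example/r", "http://land2.example/"],
    ]
    print(find_entry_points(chains, group_domains_by_ip(ips)))
```

Grouping by IP first is what lets two differently named redirectors collapse into one URL, so that reuse of the same server becomes visible across chains.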

System Overview (continued)
• Training
  – Label feature vectors using account status info.
    • suspended ⇒ malicious, active ⇒ benign
  – Build classification models
• Classification
  – Classify suspicious URLs

Features
• Correlated URL chains
  – Length of URL redirect chain
  – Frequency of entry point URL
  – # of different initial and landing URLs
• Tweet context information
  – # of different Twitter sources
  – Standard deviation of account creation dates
  – Standard deviation of friends-followers ratios
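As a rough illustration of how the listed features could be computed for one entry point, here is a minimal sketch. The `TweetInfo` record, the field names, the use of the mean chain length, the population standard deviation, and the definition of the friends-followers ratio as friends divided by followers are all assumptions made for this example; the slides do not specify them, and no normalization is applied.

```python
from dataclasses import dataclass
from statistics import fmean, pstdev


@dataclass
class TweetInfo:
    """Hypothetical record for one tweet whose chain passes through the entry point."""
    chain: list             # redirect chain: initial URL first, landing URL last
    source: str             # client application that posted the tweet
    account_created: float  # account creation time (Unix timestamp, seconds)
    friends: int
    followers: int


def feature_vector(tweets):
    """Build the raw (unnormalized) feature values for one entry point URL."""
    # Assumed ratio definition; followers is clamped to 1 to avoid division by zero.
    ratios = [t.friends / max(t.followers, 1) for t in tweets]
    return {
        "chain_length": fmean([len(t.chain) for t in tweets]),
        "entry_point_frequency": len(tweets),
        "distinct_initial_urls": len({t.chain[0] for t in tweets}),
        "distinct_landing_urls": len({t.chain[-1] for t in tweets}),
        "distinct_sources": len({t.source for t in tweets}),
        "account_creation_std": pstdev([t.account_created for t in tweets]),
        "friends_followers_ratio_std": pstdev(ratios),
    }
```

Low standard deviations in the last two features would indicate that the accounts sharing an entry point look alike, which is the kind of signal the tweet context features are meant to capture.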

Outline
• Introduction
• Case Study
• Proposed Scheme
• Evaluation
  – System Setup and Data Collection
  – Training Classifiers
  – Data Analysis
  – Detection Efficiency
  – Running Time
• Discussion and Conclusion

System Setup and Data Collection
• System specification
  – Two Intel Quad Core Xeon 2.4 GHz CPUs
  – 24 GiB main memory
• Data collection
  – Twitter Streaming API
  – One percent samples from Twitter public timeline (Spritzer role)
  – 27,895,714 tweets with URLs between April 8 and August 8, 2011 (122 days)

Training Classifiers
• Training dataset
  – Tweets between May 10 and July 8
  – 183,113 benign and 41,721 malicious entry point URLs
• Classification algorithm
  – L2-regularized logistic regression
• 10-fold cross validation
  – FP: 1.64%, FN: 10.69%
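The evaluation procedure named above (L2-regularized logistic regression with 10-fold cross validation) can be sketched in plain Python. This is a toy version under several assumptions and not the authors' setup: it uses batch gradient descent with hypothetical hyperparameters (`l2`, `lr`, `epochs`), an unregularized bias, and it defines the FP rate as misclassified benign samples over all benign samples and the FN rate as misclassified malicious samples over all malicious samples, which the slide does not spell out.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(X, y, l2=0.01, lr=0.5, epochs=300):
    """Minimize mean log loss + (l2 / 2) * ||w||^2 by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def predict(model, x):
    """Return 1 (malicious) or 0 (benign) using a 0.5 probability threshold."""
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else 0


def cross_validate(X, y, k=10, seed=0):
    """Return (FP rate, FN rate) pooled over k folds; labels: 1 malicious, 0 benign."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    fp = fn = benign = malicious = 0
    for fold in range(k):
        held_out = set(idx[fold::k])
        model = train([X[i] for i in idx if i not in held_out],
                      [y[i] for i in idx if i not in held_out])
        for i in held_out:
            guess = predict(model, X[i])
            if y[i] == 1:
                malicious += 1
                fn += guess == 0
            else:
                benign += 1
                fp += guess == 1
    return fp / benign, fn / malicious
```

In practice the feature vectors would be scaled before training; the sketch skips that step and expects inputs that are already roughly centered.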

Data Analysis
• 3758 entry point URLs (on average, daily)
  – 283 suspicious URLs
  – 20 false positive URLs
  – 30 new suspicious URLs
• Relatively small number of new suspicious URLs
  – We detect suspicious URLs that are not detected or blocked by Twitter.

Data Analysis (continued)
• Reoccurrences of May 10’s URLs
• Up to 12% benign & 52% suspicious URLs

Detection Efficiency
• We measure the time difference between
  – When WarningBird detects suspicious accounts
  – When Twitter suspends the accounts
• [Figure annotations: "Avg. time difference: 13.5 min", "more than 20 hours"]

Running Time
• Processing time for each URL: 28.31 ms
  – Redirect chain crawling: 24.20 ms
    • 100 crawling threads
  – Domain grouping: 2.00 ms
  – Feature extraction: 1.62 ms
  – Classification: 0.48 ms
• Our system can classify about 127,000 URLs per hour.
  – About 12.7% of all public tweets with URLs per hour
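The throughput figure follows from the per-URL processing time. The short calculation below reproduces it from the numbers on the slide; the last line is an inference made here, not a value stated in the slides.

```python
# Per-stage processing times from the slide, in milliseconds per URL.
stage_ms = {
    "redirect chain crawling": 24.20,
    "domain grouping": 2.00,
    "feature extraction": 1.62,
    "classification": 0.48,
}
total_ms = 28.31  # per-URL total as stated on the slide

# The listed stages sum to 28.30 ms; the 0.01 ms gap is presumably rounding.
stage_sum = sum(stage_ms.values())

# One hour is 3,600,000 ms, so throughput is about 127,000 URLs per hour.
urls_per_hour = 3_600_000 / total_ms

# Inference (not stated on the slide): 12.7% coverage implies roughly
# one million public tweets with URLs per hour.
implied_tweets_per_hour = urls_per_hour / 0.127

print(f"{stage_sum:.2f} ms, {urls_per_hour:.0f} URLs/hour, "
      f"{implied_tweets_per_hour:.0f} tweets with URLs/hour implied")
```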

Outline
• Introduction
• Case Study
• Proposed Scheme
• Evaluation
• Discussion and Conclusion

Discussion
• Evasion is possible but restricted.
  – Do not reuse redirection servers
    • Need extra $ (to buy compromised hosts)
    • Need more effort to take down hosts
  – Reduce the rate of malicious tweets
    • Less effective

Conclusion
• We proposed a new suspicious URL detection system for Twitter.
• Our system is robust against feature fabrication and conditional redirection.
• Evaluation results show accuracy and efficiency.
