The role of phone numbers in understanding cyber-crime
M. Balduzzi (Trend Micro Research, EMEA); A. Costin, J. Isachenkova, A. Francillon, D. Balzarotti (Eurecom, Sophia Antipolis, France)
PST2013, Andrei Costin, July 11, 2013
Introduction: online/digital identifiers in cyber-crime
- E-mail, domain names/web sites, social network profiles/nicknames
  - Extensive studies: [LMK+10, TGM+11, KKL+08, Ede03, CHMS06]
- Phone numbers (our main focus)
  - Limited studies: [CYK10, STHB99, Pol05, Hyp]
  - Studied mainly in the context of premium short-number mobile fraud
Introduction: phone number usages
- E-mail signatures
- Extensively used in many businesses
- Offer less anonymity than other identifiers
- Link the cyber domain to the real world
- Commonly used in various online frauds, e.g.:
  - Premium number fraud
  - Scam fraud
Introduction: importance of cyber-crime and phone numbers, an example
- Banking Trojan.Shylock [symb]
  - Injects code into banking websites
  - Replaces the telephone details on the contact pages of online banking websites
Hypothesis
- Phone numbers are used in cyber-crime activities
  - Can we find a telecom operator preference?
  - Can we find a geographical preference?
- Phone numbers can be a stronger identification metric than other identifiers
Goals
- Check these hypotheses against real datasets
- Evaluate the reliability of automated phone number extraction and analysis
  - Identify challenges and limitations
- Automatically find patterns associated with recurrent criminal activities
- Automatically correlate the extracted information for:
  - Telecom operator preference
  - Geographic area preference
Methodology
Datasets I: data sources initially considered
- SPAM
  - Large and extremely noisy dataset
  - Extremely challenging to extract and clean phone numbers
- WHOIS
  - Focused on malicious domains
  - High-quality dataset (international format)
  - Phone numbers are dummies or replaced by CERTs' contact numbers
Datasets II
- ANDROID
  - Small and noisy dataset
  - Mainly contained short premium numbers (an open problem)
- SCAM
  - Large and high-quality dataset
  - Phone numbers are an important part of the business model
  - We focus on this dataset
Phone Number Extraction I
- Success and reliability of extraction depend on how well formatted the number is
  - Example: "Call: 0336 9505705 9 am - 5 pm"
  - Can be decoded as 2 valid numbers: +443369505705 or +33695057059
- We aim to obtain:
  - An unambiguous normalized number
  - A fully qualified number in international format
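The ambiguity in the example above can be reproduced with a minimal sketch. The `PLANS` table and the two reading rules are simplifying assumptions made for illustration only; they are not the extraction method used in the study and not real numbering-plan data.

```python
import re

# Hypothetical, heavily simplified numbering plans (assumption):
# country calling code -> length of the national significant number.
PLANS = {"44": 10, "33": 9}

def candidates(text):
    """Return every international-format reading of the first digit run in text."""
    match = re.search(r"\d[\d ]*\d", text)
    if match is None:
        return []
    digits = match.group().replace(" ", "")
    if not digits.startswith("0"):
        return []
    rest = digits[1:]
    found = set()
    for cc, length in PLANS.items():
        # Reading 1: the leading 0 is a national trunk prefix.
        if len(rest) >= length:
            found.add("+" + cc + rest[:length])
        # Reading 2: the country code follows directly after the leading 0.
        if rest.startswith(cc) and len(rest) - len(cc) >= length:
            found.add("+" + cc + rest[len(cc):len(cc) + length])
    return sorted(found)

readings = candidates("Call: 0336 9505705 9 am - 5 pm")
```

For the sample string, `readings` contains both numbers from the slide. The sketch deliberately over-generates (it also proposes other readings), which is the point: without context such as the country of the sender, a normalizer cannot pick a single answer.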
Phone Number Extraction II
- Success and reliability of extraction also depend on:
  - How structured and easy to parse the information is
    - WHOIS records (easy) vs. malicious mobile binaries (difficult)
  - How noisy the data source is
    - Spam messages are very noisy (to defeat anti-spam filters)
    - Scam messages have almost no noise
Phone Number Extraction Challenges: Example
- Number obfuscation used [syma]
Scam Message Sample
SCAM Dataset
- Source: the user-report aggregator 419scam
- Data timespan: January 2009 to August 2012
- Enriched and correlated with numbering plan (NNPC) databases
  - Free (libphonenumber)
  - Commercial (more detailed and up to date)
SCAM Email Categories
- Emails classified in 10 categories
- 3 categories cover over 90% of the data
- Scam categories: financial scam (62%), fake lottery (25%), next of kin (8%), other (5%)
SCAM Phone Categories
- ~67k unique normalized phone numbers
- Classified using numbering plan (NNPC) databases
- Number type breakdown: UK PRS (51%), mobile (44%), other (5%)
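A classification of this kind can be sketched as a longest-prefix lookup over normalized numbers. The `PREFIX_TYPES` table below is a tiny illustrative assumption standing in for a real numbering plan database; its prefixes and labels are not taken from the study's data.

```python
from collections import Counter

# Illustrative prefix table (assumption, not real numbering-plan data).
PREFIX_TYPES = {
    "+447": "mobile",
    "+4470": "UK PRS",
    "+2348": "mobile",
}

def classify(number):
    """Return the type of the longest matching prefix, or 'other'."""
    for end in range(len(number), 0, -1):
        kind = PREFIX_TYPES.get(number[:end])
        if kind is not None:
            return kind
    return "other"

def breakdown(numbers):
    """Return the percentage of numbers per type."""
    counts = Counter(classify(n) for n in numbers)
    total = sum(counts.values())
    return {kind: 100.0 * count / total for kind, count in counts.items()}
```

Longest-prefix matching matters here because a more specific range (such as the hypothetical "+4470" entry) sits inside a broader one ("+447") and must win over it.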
SCAM Communities/Identity Links
- Used clustering techniques and discovered identity links
- Identified 102 communities
- Supports the hypothesis that phone numbers are a good metric for studying scammers
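The slides do not name the clustering technique. One simple way to illustrate the idea of identity links, offered only as an assumption, is to treat e-mail addresses and phone numbers as nodes, link the ones that appear in the same message, and take connected components as communities. The function name and input shape below are hypothetical.

```python
def communities(messages):
    """Group identifiers into communities.

    messages: iterable of (email, phone) pairs seen together in one message.
    Returns a list of sets of ("email" | "phone", value) nodes.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    # Each message links one e-mail address to one phone number.
    for email, phone in messages:
        union(("email", email), ("phone", phone))

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())
```

With this sketch, two e-mail addresses that never appear together still end up in the same community when they share a phone number, which is the kind of link the slide refers to.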
ANALYSIS OF MOBILE PHONE NUMBERS
Questions and Hypotheses
- For how long are phone numbers used?
- Are phone numbers reused or discarded?
  - If discarded, after how long?
- Are phone numbers used in roaming?
  - If so, to what extent?
- We try to answer these questions with HLR queries
HLR Querying
- HLR = Home Location Register
- An important component of mobile network operators
Single HLR Queries
- In Aug 2012, each mobile number was queried once, for numbers encountered in:
  - Jan-Jun 2012
  - Jul 2012
[Figure: Mobile phones network status. x-axis: network status (ERR, OFF, ON, ROAM); y-axis: percentage; series: Jan-Jun 2012, Jul 2012.]
Repeated HLR Queries
- Performed HLR queries:
  - For 1400 numbers
  - Every 3 days
  - During Jul-Aug 2012
- Hypothesis 1: possibility of a link with the Nigerian groups
- Hypothesis 2: may be used to conceal location
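Once repeated lookups have been recorded, the per-number status history can be summarized as below. This is a minimal sketch: the status labels follow the chart on the single-query slide (ERR, OFF, ON, ROAM), while the function name, the input shape, and the choice to ignore failed lookups are assumptions. No real HLR lookup API is called.

```python
def roaming_share(history):
    """Fraction of successful lookups in which each number was roaming.

    history: dict mapping a phone number to the list of statuses recorded
    over repeated lookups, e.g. ["ON", "ROAM", "ERR"].
    Numbers with no successful lookup are left out of the result.
    """
    shares = {}
    for number, statuses in history.items():
        # Assumption: failed lookups carry no information about roaming.
        answered = [s for s in statuses if s != "ERR"]
        if answered:
            shares[number] = answered.count("ROAM") / len(answered)
    return shares
```

A number whose share stays close to 1.0 over the whole polling period is consistently abroad relative to its home network, which is the pattern the two hypotheses above try to explain.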
Phone Number Reuse
- Question: for how long is a scam number used?
[Figure: Phone number reuse. x-axis: age of reused phone number in years (3, 2, 1); y-axis: reuse percentage.]
ANALYSIS OF UK PRS PHONE NUMBERS
What are UK PRS numbers? I
- Definition: premium rate services (PRS) are a form of micro-payment for paid content, data services and value-added services that are subsequently charged to the user's phone bill
- UK PRS was a GBP 800 million business (2009)
What are UK PRS numbers? II
- Usages
  - Conceal the geographic location of the real phone, via call forwarding
  - Earn revenue from calls to these numbers
- Challenges
  - Hard to trace the "service provider"
  - Hard to trace the real phone number behind the forwarding
  - Hard to detect or prove that fraud is involved
Range of UK PRS numbers
- ~34k unique phone numbers in the UK 07x range of Premium Rate Services numbers
- 4 operators (out of 88) provide more than 90% of fraud-related UK PRS numbers
- ~5% of one operator's allocated range is fraud-related
[Figure: Top 4 UK PRS telecoms providing numbers found in fraud. x-axis: telecom name (Magrathea, Open Telecom, FlexTel, Invomo); y-axis: percentage; series: share of fraud in the operator's numbering range, operator share in total fraud.]
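The two series in the chart are different ratios over the same counts, which a short sketch makes explicit. The function name and input shape are hypothetical; any counts passed in are the caller's data and nothing here reproduces the study's figures.

```python
def operator_shares(fraud_counts, allocated):
    """Compute both per-operator percentages shown in the chart.

    fraud_counts: operator -> number of fraud-related numbers observed.
    allocated:    operator -> size of the operator's allocated range.
    """
    total_fraud = sum(fraud_counts.values())
    result = {}
    for operator, count in fraud_counts.items():
        result[operator] = {
            # Operator's part of all fraud-related numbers.
            "share_of_total_fraud": 100.0 * count / total_fraud,
            # Part of the operator's own range that is fraud-related.
            "share_of_own_range": 100.0 * count / allocated[operator],
        }
    return result
```

The first ratio shows which operators dominate the fraud-related numbers overall; the second normalizes by range size, so a small operator with a heavily abused range stands out even when its absolute count is low.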
Conclusion: Results
- Phone numbers are a strong digital identifier in some cyber-crime activities
- Phone numbers help in automated scammer community detection
- HLR lookups help:
  - To identify recurrent cyber-criminal business models
  - To study phone numbers' geographical use and activity patterns
Conclusion: Future Work
- Phone number extraction is an open, non-trivial problem
  - Improve matching algorithms and their context-awareness
- PRS phone numbers are opaque
  - Is a "traceroute" of PRS phone numbers possible?
  - Learn the business models behind them
- Short number extraction and evaluation
  - Open, challenging, non-trivial problem
  - A growing concern with mobile malware
Questions?
- Contacts: Software and System Security Group @ EURECOM, S3.eurecom.fr
- Thank you!
References I
[CHMS06] Duncan Cook, Jacky Hartnett, Kevin Manderson, and Joel Scanlan, Catching spam before it arrives: domain specific dynamic blacklists, Proceedings of the 2006 Australasian workshops on Grid computing and e-research, ACSW Frontiers '06, vol. 54, 2006.
[CYK10] Nicolas Christin, Sally S. Yanagihara, and Keisuke Kamataki, Dissecting one click frauds, CCS '10, ACM, 2010.
[Ede03] Eve Edelson, The 419 scam: information warfare on the spam front and a proposal for local filtering, Computers & Security 22 (2003), no. 5.
References II
[Hyp] Mikko Hypponen, Malware Goes Mobile, http://www.cs.virginia.edu/~robins/Malware_Goes_Mobile.pdf.
[KKL+08] Christian Kreibich, Chris Kanich, Kirill Levchenko, Brandon Enright, Geoffrey M. Voelker, Vern Paxson, and Stefan Savage, On the spam campaign trail, LEET '08, 2008.
[LMK+10] Olumide B. Longe, Victor Mbarika, M. Kourouma, F. Wada, and R. Isabalija, Seeing beyond the surface, understanding and tracking fraudulent cyber activities, CoRR abs/1001.1993 (2010).
References III
[Pol05] Craig Pollard, Telecom fraud: the cost of doing nothing just went up, Network Security 2005 (2005), no. 2.
[STHB99] J. Shawe-Taylor, K. Howker, and P. Burge, Detection of fraud in mobile telecommunications, Information Security Technical Report 4 (1999), no. 1.
[syma] Evolution of Russian Phone Number Spam, http://www.symantec.com/connect/blogs/revolution-russian-phone-number-spam.