Notos: Building a Dynamic Reputation System for DNS
Manos Antonakakis, Roberto Perdisci, David Dagon, Wenke Lee, and Nick Feamster
College of Computing, Georgia Institute of Technology, Atlanta, Georgia
ONR MURI Review Meeting, June 10, 2010
Report Documentation Page (Standard Form 298, Rev. 8-98; OMB No. 0704-0188)
1. Report date: 10 JUN 2010
3. Dates covered: 00-00-2010 to 00-00-2010
4. Title and subtitle: Notos: Building a Dynamic Reputation System for DNS
7. Performing organization: Georgia Institute of Technology, College of Computing, Atlanta, GA 30332
12. Distribution/availability statement: Approved for public release; distribution unlimited
13. Supplementary notes: MURI Review, June 2010. U.S. Government or Federal Rights License
16. Security classification: report, abstract, and this page unclassified
17. Limitation of abstract: Same as Report (SAR)
18. Number of pages: 17
Problems with Static Blacklisting
• Malware families use a large number of domains to discover the up-to-date C&C address
  – Examples are the Sinowal, Bobax, and Conficker bot families, which generate tens of thousands of new C&C domains every day
  – IP-based blocking technologies (dynamic or not) cannot keep up with the number of IP addresses that the C&C domains typically use
  – DNSBL-based technologies cannot keep up with the volume of new domain names the botnet uses every day
• Detecting and blocking this type of agile botnet cannot be achieved with the current state of the art
11/4/09 ONR MURI Review
Outline
• Notos
  – Notations, passive DNS trends, and anchor-zones
  – Network-based profile modeling
  – Network- and zone-based profile clustering
  – Reputation function
  – System implementation
  – Results
• Conclusions and Future Work
Notos
• Network- and zone-based features that capture the characteristics of resource provisioning, usage, and management by domains.
  – Learn the models of legitimate and malicious domains
• Classify new domains with a very low FP% (0.3846%) and a high TP% (96.8%).
  – Days or even weeks before they appear on static blacklists.
Notation & Terminology
• Resource Record (RR)
  – www.example.com 192.0.32.10
• 2nd level domain (2LD) and 3rd level domain (3LD)
  – For the domain name www.example.com, the 2LD is example.com and the 3LD is www.example.com
• Related Historic IPs (RHIPs)
  – All “routable” IPs that have historically been mapped to the domain name in the RR, or to any domain name under its 2LD and 3LD
• Related Historic Domains (RHDNs)
  – All fully qualified domain names (FQDNs) that have historically been linked with the IP in the RR, or with its corresponding CIDR and AS
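The notation above can be illustrated with a minimal Python sketch. It assumes passive DNS history is available as plain (domain name, IP) pairs, takes the 2LD and 3LD as the last two and three labels of the name (which is too naive for suffixes such as co.uk), and approximates "routable" with the standard library's is_global check. The function names are hypothetical and are not taken from Notos.

```python
import ipaddress


def second_level_domain(fqdn: str) -> str:
    """Return the last two labels, e.g. 'example.com' for 'www.example.com'."""
    labels = fqdn.rstrip(".").lower().split(".")
    return ".".join(labels[-2:])


def third_level_domain(fqdn: str) -> str:
    """Return the last three labels, e.g. 'www.example.com'."""
    labels = fqdn.rstrip(".").lower().split(".")
    return ".".join(labels[-3:])


def related_historic_ips(domain, pdns_records):
    """Collect routable IPs historically seen for any name under the 2LD.

    pdns_records is an iterable of (domain_name, ip_string) pairs.
    Names under the 3LD are also under the 2LD, so one check covers both.
    """
    zone = second_level_domain(domain)
    rhips = set()
    for name, ip in pdns_records:
        same_zone = second_level_domain(name) == zone
        # Assumption: "routable" is approximated by is_global.
        if same_zone and ipaddress.ip_address(ip).is_global:
            rhips.add(ip)
    return rhips
```

Given records for www.example.com and mail.example.com, the call related_historic_ips("www.example.com", records) returns the routable IPs of both names, because they share the 2LD example.com. The RHDNs could be collected the same way in the other direction, keyed on the IP, its CIDR, and its AS.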
Passive DNS data
• Successful DNS resolutions that can be observed in a given network
• The data set has traffic from two ISP sensors, one on the West Coast and one on the East Coast, plus data from SIE
• Different classes of zones demonstrate different passive DNS behaviors
• The number of new domain names and IPs we observe every day is in the range of 150,000 to 200,000
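As a rough illustration of how a daily count of new records could be derived, the sketch below keeps a running set of every RR seen so far and counts, per day, the RRs not seen on any earlier day. The input layout (one collection of (name, IP) pairs per day) and the function name are assumptions made for this example, not a description of the actual passive DNS database.

```python
def new_rrs_per_day(daily_observations):
    """Count RRs first seen on each day.

    daily_observations is a list with one entry per day; each entry is
    an iterable of (domain_name, ip) pairs observed on that day.
    """
    seen = set()
    counts = []
    for day in daily_observations:
        # Deduplicate within the day, then drop anything seen earlier.
        fresh = set(day) - seen
        counts.append(len(fresh))
        seen |= fresh
    return counts
```

A record that reappears on later days is counted only on its first day, so the series reflects growth of the database rather than raw traffic volume.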
Passive DNS trends
[Figure, eight panels. Legend in the growth panels: Unique DN, Unique IP, New RRs.
(a) Unique RRs in the Two ISP Sensors (per day)
(b) New RRs Growth in pDNS DB for All Zones
(c) Akamai Class Growth Over Time (Days)
(d) CDN Class Growth Over Time (Days)
(e) Pop Class Growth Over Time (Days)
(f) Dyn. DNS Class Growth Over Time (Days)
(g) Common Class Growth Over Time (Days)
(h) CDF of RR Growth for All Classes]
Anchor classes in pDNS: Akamai, CDN, Popular, DYNDNS and Common
Features
Notos computes three feature vectors for an RR, based on its RHIPs, RHDNs, and evidence data. The analysis of these feature vectors is forwarded to the reputation engine. The three vectors are the Network-Based Feature Vector (18 features), the Zone-Based Feature Vector (17 features), and the Evidence-Based Feature Vector (6 features).
Network Profile Modeling
• Train a meta-classifier based on the 5 anchor-classes
• The network feature vector of a domain name d is translated into the network modeling output NM(d)
• NM(d) is a feature vector composed of the confidence scores for each anchor-class
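To make the shape of NM(d) concrete, here is a small sketch that turns a network feature vector into one confidence score per anchor-class. It is only a stand-in: instead of the trained meta-classifier it uses distance to a per-class centroid followed by a softmax, and the class keys, the centroids argument, and the function name are all hypothetical.

```python
import math

# The five anchor-classes named on the slides; the keys are our own labels.
ANCHOR_CLASSES = ("akamai", "cdn", "popular", "dyndns", "common")


def network_modeling_output(features, centroids):
    """Return an NM(d)-like vector: one confidence per anchor-class.

    Assumption: confidence is a softmax over negative Euclidean distance
    to each class centroid. This replaces the real meta-classifier.
    """
    neg_dists = [-math.dist(features, centroids[c]) for c in ANCHOR_CLASSES]
    peak = max(neg_dists)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in neg_dists]
    total = sum(exps)
    return [e / total for e in exps]
```

The output always has five entries that sum to 1, in the order given by ANCHOR_CLASSES, so it can be used directly as a block of the reputation vector.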
Domain Clustering
The network- and zone-based feature vectors of a domain d are used to produce the domain clustering output DC(d).
In this step, unknown domains within a cluster are characterized based on already labeled domains in close proximity.
DC(d) is a 5-feature vector characterizing the position of d in the cluster.
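The idea of characterizing an unknown domain through labeled domains in close proximity can be sketched as follows. This is not the five features of DC(d), which the slides do not enumerate; it is a simplified neighbor summary under assumed inputs (points as numeric tuples, labels "malicious" or "legitimate"), and the function name and the choice of k are hypothetical.

```python
import math


def neighbor_summary(point, labeled_points, k=3):
    """Summarize the k nearest labeled domains around an unknown one.

    labeled_points is a list of (vector, label) pairs, where label is
    "malicious" or "legitimate". Returns the mean distance to the k
    nearest neighbors and the fraction of them labeled malicious.
    """
    if not labeled_points:
        raise ValueError("at least one labeled point is required")
    ranked = sorted(labeled_points, key=lambda item: math.dist(point, item[0]))
    nearest = ranked[:k]
    dists = [math.dist(point, vec) for vec, _ in nearest]
    malicious = sum(1 for _, label in nearest if label == "malicious")
    return {
        "mean_distance": sum(dists) / len(dists),
        "malicious_fraction": malicious / len(nearest),
    }
```

An unknown domain that lands among malicious neighbors gets a malicious_fraction near 1, while a large mean_distance signals that no labeled domain is really close and the summary should be trusted less.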
Reputation Function
• Notos transforms each domain d in our dataset into three feature vectors: NM(d), DC(d), and EV(d) (the evidence profile output); together these form the reputation vector v(d)
• The reputation function f(v(d)) assigns the domain name d a score in [0,1]
• The reputation function is a statistical classifier (Decision Tree with Logistic Boost, chosen after model selection)
• The reputation function is trained using labeled domain data
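The assembly of v(d) and the mapping to a score in [0,1] can be sketched as below. The concatenation follows the slide; the scoring function does not. A plain logistic model with hypothetical weights stands in for the boosted decision tree, and the sketch takes no position on whether a high score means a good or a bad reputation.

```python
import math


def reputation_vector(nm, dc, ev):
    """Concatenate NM(d), DC(d), and EV(d) into the reputation vector v(d)."""
    return list(nm) + list(dc) + list(ev)


def reputation_score(v, weights, bias=0.0):
    """Map v(d) to a score in [0, 1].

    Assumption: a logistic model with given weights replaces the trained
    classifier. The two branches avoid overflow in math.exp.
    """
    if len(v) != len(weights):
        raise ValueError("v and weights must have the same length")
    z = bias + sum(w * x for w, x in zip(weights, v))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)
```

With five confidence scores in NM(d), five features in DC(d), and six evidence features in EV(d), the concatenated vector has 16 entries. In practice the weights would come from training on labeled domain data, as the last bullet states.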
Operational Model of Notos
• In off-line mode, Notos trains the classifiers, builds the clusters, and trains the reputation function
• In in-line mode, Notos assigns reputation to new RRs observed at the monitoring point
[Figure: cluster plot with numbered clusters (0 to 9) and annotated regions. Labels recoverable from the slide include: Akamai; Dynamic; CDNs, Akamaitech; Akadns and Google; Facebook; myspace; ISP-Rev. Lookups; Dyn. DNS; Conf.C Sinkhole and NTP; few Popular; few CDN; Malicious; Malware and Few Spam.]
Results from the Reputation Function
[Figure: ROC curve, False Positive Rate (0 to 0.1) vs True Positive Rate (0.9 to 1); inset: TP over All Pos. vs Threshold (0 to 1)]
FP%=0.3849% and TP%=96.8%
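For readers who want to see how one point on such a ROC curve is obtained, the sketch below computes the false positive rate and true positive rate at a single threshold. It assumes that a domain is flagged when its score is at or above the threshold and that a True label marks the positive class; both conventions and the function name are choices made for this example, and the data in any call is illustrative only.

```python
def rates_at_threshold(scores, labels, threshold):
    """Return (FPR, TPR) when flagging every score >= threshold.

    scores and labels are parallel sequences; labels are booleans with
    True marking the positive class.
    """
    tp = fp = tn = fn = 0
    for score, positive in zip(scores, labels):
        flagged = score >= threshold
        if flagged and positive:
            tp += 1
        elif flagged and not positive:
            fp += 1
        elif positive:
            fn += 1
        else:
            tn += 1
    # Guard against a class being absent from the sample.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    return fpr, tpr
```

Sweeping the threshold from 0 to 1 and plotting the resulting pairs gives the ROC curve; the inset plot corresponds to the TPR component alone as a function of the threshold.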