Query Log Analysis
Detecting Anomalies in DNS Traffic at a TLD Resolver
Pieter Robberechts, Maarten Bosteels, Jesse Davis and Wannes Meert
Goal and Context
Outline: Goal and Context, The QLAD System, Results, Conclusion
DNS The Domain Name System
[Diagram: the Browser asks DNS for www.cs.kuleuven.be and gets back 14.154.78.252]
DNS The Domain Name System
[Diagram of a full lookup. Components: Browser, Cache, Recursive Resolver, Name Server (ISP), Root Servers, TLD, Authoritative Name Server. Query labels: ? .be, ? .kuleuven.be, ? .cs.kuleuven.be. Addresses shown: 174.34.28.193, 54.186.35.8, 14.154.78.252]
DNS Belgium The .be ccTLD resolver
Domain name registry for .be/.vlaanderen/.brussels
- Manage registration of domains
- Provide infrastructure to answer queries
1.5 million domains, 4 nameservers, 350 million queries / day
Highlights uit 2016. DNS Belgium. URL: https://www.dnsbelgium.be/sites/default/files/generated/files/documents/cijfers%20deel%201%20-%20980px_v04_NL.pdf
DNS Belgium Current Situation PCAP files - stored for 10 days - analysed post-mortem
DNS Belgium Current Situation We believe that proactive and real-time analysis of this data could contribute to the resilience and security of DNS Belgium’s service.
4 Challenges
1. Huge data volume
‣ efficiency and scalability!
‣ easy for an attack to stay under the radar
2. No labelled or clean training data
3. Wide range of attacks, under constant evolution
4. Specific nature of DNS traffic
‣ periodicity and trends
‣ few (typically two) packets per flow
The QLAD System
QLAD System Overview
DATA TRANSFORMATION: ENTRADA -- OR -- DSC
ANOMALY DETECTION: QLAD-global, QLAD-flow
PRESENTATION: QLAD-UI
Data Transformation ENTRADA vs DSC
ENTRADA: convert, archive, + SQL
DSC: aggregate, archive, MongoDB API
Data Transformation ENTRADA vs DSC
ENTRADA
• Stores all traffic
• Allows a detailed analysis
• SQL interface
• Storage cost and infrastructure
DSC
• Lightweight
• No additional infrastructure
• No detailed (log level) analysis
Sample of aggregated counts (truncated):
"ClientAddr": [
  { "val": "195.238.24.111", "count": 1014 },
  { "val": "195.238.25.53", "count": 70 },
  { "val": "195.238.25.99", "count": 63 },
  { "val": "195.238.24.117", "count": 61 },
  { "val": "194.78.30.189", "count": 59 },
  { "val": "42.236.23.92", "count": 55 },
  { "val": "195.238.25.108", "count": 55 },
  { "val": "42.236.23.91", "count": 54 },
  { "val": "193.58.1.131", "count": 52 },
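To give a feel for what can still be done with aggregated counts alone, here is a minimal Python sketch that reads the ClientAddr entries shown on the slide and computes the share of the busiest client. The closing brackets are added only to make the truncated fragment parseable, and the variable names are illustrative; the share is relative to the nine visible entries, not to all traffic.

```python
import json

# The nine entries visible on the slide; the real list is longer (truncated there).
# The closing "] }" is added here only so the fragment parses.
sample = """
{ "ClientAddr": [
  { "val": "195.238.24.111", "count": 1014 },
  { "val": "195.238.25.53", "count": 70 },
  { "val": "195.238.25.99", "count": 63 },
  { "val": "195.238.24.117", "count": 61 },
  { "val": "194.78.30.189", "count": 59 },
  { "val": "42.236.23.92", "count": 55 },
  { "val": "195.238.25.108", "count": 55 },
  { "val": "42.236.23.91", "count": 54 },
  { "val": "193.58.1.131", "count": 52 }
] }
"""

entries = json.loads(sample)["ClientAddr"]

# Total queries over the visible entries and the single busiest client.
total = sum(entry["count"] for entry in entries)
top = max(entries, key=lambda entry: entry["count"])
share = top["count"] / total

print(f"{top['val']} sent {share:.1%} of the queries in this sample")
```

Only per-value counts are available here, so questions such as "which names did this client ask for?" need the log-level data that ENTRADA keeps.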
QLAD-flow
Dewaele, G., Fukuda, K., Borgnat, P., Abry, P., & Cho, K. (2007). Extracting Hidden Anomalies using Sketch and Non Gaussian Multiresolution Statistical Detection Procedures. Proc. ACM SIGCOMM Workshop on Large-Scale Attack Defense (LSAD’07), 1–8.
Hash packets (h₁)
QLAD-flow Algorithm
Count packets at different aggregation levels
Level 1 (α₁, β₁): 2 1 3 0 2 1 3 1 0 0 2 0 1
Level 2 (α₂, β₂): 3 3 3 4 0 2
Level 3 (α₃, β₃): 6 7 2
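The aggregation step can be sketched in a few lines of Python. The example uses the first twelve Level 1 counts from the slide (the thirteenth bin has no partner and is left out) and sums adjacent bins to obtain the coarser levels. Reading α and β as Gamma shape and scale parameters follows the cited paper by Dewaele et al.; the method-of-moments estimator and the function names are assumptions made for this illustration, not necessarily what QLAD-flow implements.

```python
from statistics import mean, pvariance


def aggregate(counts):
    """Sum adjacent pairs of bins; a trailing bin without a partner is dropped."""
    return [counts[i] + counts[i + 1] for i in range(0, len(counts) - 1, 2)]


def gamma_moments(counts):
    """Method-of-moments estimate of Gamma shape (alpha) and scale (beta).

    Assumed estimator for illustration: mean = alpha * beta, variance = alpha * beta**2.
    """
    m = mean(counts)
    v = pvariance(counts)
    if m == 0 or v == 0:
        raise ValueError("need a non-constant series with a positive mean")
    return m * m / v, v / m


level1 = [2, 1, 3, 0, 2, 1, 3, 1, 0, 0, 2, 0]
level2 = aggregate(level1)
level3 = aggregate(level2)

for name, series in (("Level 1", level1), ("Level 2", level2), ("Level 3", level3)):
    alpha, beta = gamma_moments(series)
    print(name, series, f"alpha={alpha:.2f} beta={beta:.2f}")
```

Each group of flows ends up with one (α, β) pair per level, which is what the comparison across groups works on.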
QLAD-flow Algorithm
Compare groups at each aggregation level
[Figure: for Level 1, Level 2 and Level 3, the groups plotted by their α and β values, with the average (Avg) and the Distance from it to the Anomalous group]
Identify groups that differ from average
QLAD-flow Algorithm
Repeat with different hash functions h1, h2, h3 and intersect (∩) the results
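The hash, compare and intersect idea can be illustrated with a deliberately simplified Python sketch. It is not the QLAD-flow implementation: instead of multiresolution Gamma statistics it compares raw packet counts per group against the median group, and the salted SHA-256 hash, the number of groups, the threshold factor and all names are assumptions chosen for the example. Source IPs serve as the hash key.

```python
import hashlib
from collections import defaultdict
from statistics import median


def bucket_of(key, seed, n_buckets):
    """Salted hash: each seed plays the role of a different hash function h_i."""
    digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % n_buckets


def suspicious_keys(packets, seed, n_buckets=8, factor=3.0):
    """Return every key that falls in a group whose count is far above the median."""
    counts = [0] * n_buckets
    members = defaultdict(set)
    for key in packets:
        bucket = bucket_of(key, seed, n_buckets)
        counts[bucket] += 1
        members[bucket].add(key)
    typical = median(counts)
    flagged = set()
    for bucket, count in enumerate(counts):
        if count > factor * typical:
            flagged |= members[bucket]
    return flagged


def detect(packets, seeds=(1, 2, 3)):
    """Intersect the suspicious keys found under each hash function."""
    result = None
    for seed in seeds:
        keys = suspicious_keys(packets, seed)
        result = keys if result is None else result & keys
    return result or set()


# Synthetic traffic: 40 clients with 5 packets each, plus one heavy sender.
normal = [f"10.0.0.{i}" for i in range(40) for _ in range(5)]
heavy = ["192.0.2.1"] * 500
anomalous = detect(normal + heavy)
print(sorted(anomalous))
```

A single hash function flags the whole group, including innocent keys that happen to share it with the heavy sender. Repeating with other hash functions reshuffles the groups, so the intersection keeps the heavy sender while most innocent keys drop out.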
QLAD-flow Shortcomings
Some attacks span a lot of flows, e.g. a DoS with spoofed IP addresses
QLAD-flow is unable to detect these
QLAD-global Algorithm
Observation: each traffic anomaly causes changes in the distribution of one or more traffic features
Look at entropy!
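A small Python example shows why entropy is a convenient summary of a feature's distribution. It assumes Shannon entropy in bits (the slide only says "entropy"), and the query-type samples are made up for illustration.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Made-up samples of the qtype feature within one time window.
mixed = ["A", "AAAA", "MX", "NS"] * 25   # evenly spread over four types
flood = ["A"] * 97 + ["AAAA", "MX", "NS"]  # one type dominates

print(f"mixed window: {shannon_entropy(mixed):.3f} bits")
print(f"flood window: {shannon_entropy(flood):.3f} bits")
```

When one value starts to dominate a feature, its entropy drops; when traffic spreads over many more values than usual (for example random query names), it rises. Either shift is visible in a single number per feature and time window.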
QLAD-global Algorithm
1. GET NEW ENTROPIES from ENTRADA -- OR -- DSC, one per feature: TLD, SLD, qtype, rcode, client, ASN, country, response size
2. RUN DETECTOR
3. REPORT ANOMALIES
- timestamp
- features with anomaly
4. UPDATE MODELS
- EMA
- Kalman
- ...
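The detect-then-update loop can be sketched for a single feature with an exponential moving average (EMA) model. This is a minimal Python illustration under stated assumptions, not the QLAD-global code: the smoothing factor, the three-standard-deviation threshold, the warm-up length, the class name and the entropy values are all chosen for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class EmaDetector:
    alpha: float = 0.2       # smoothing factor (assumed)
    threshold: float = 3.0   # tolerated deviation, in standard deviations (assumed)
    warmup: int = 5          # observations to see before reporting (assumed)
    mean: float = 0.0
    var: float = 0.0
    seen: int = 0

    def step(self, x):
        """Run the detector on one entropy value, then update the model."""
        if self.seen == 0:
            self.mean = x
            self.seen = 1
            return False
        deviation = x - self.mean
        std = math.sqrt(self.var)
        anomalous = self.seen >= self.warmup and abs(deviation) > self.threshold * std
        # Exponentially weighted updates of mean and variance.
        self.mean += self.alpha * deviation
        self.var = (1 - self.alpha) * (self.var + self.alpha * deviation ** 2)
        self.seen += 1
        return anomalous


# Made-up entropy values for one feature, one value per time window.
entropies = [3.0, 3.1, 2.9, 3.05, 2.95, 3.0, 3.1, 1.2, 3.0]
detector = EmaDetector()
flags = [detector.step(value) for value in entropies]

for window, (value, flag) in enumerate(zip(entropies, flags)):
    if flag:
        print(f"window {window}: entropy {value} is anomalous")
```

In this sketch the model is also updated with anomalous values, which inflates the variance for a while after an alarm. Skipping or damping the update for flagged windows is an alternative design choice; one detector instance per feature would yield the "features with anomaly" list for each timestamp.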
Results
Data Description of the evaluation dataset
Sunday 12 to Monday 13 February 2017
1 server, 42 GB, 58,345,819 queries
Results Detected anomalies
Columns (empty cells omitted): QLAD-flow (source IP) | QLAD-flow (query name) | QLAD-global | Total (unique)
Benign
  Caching resolver: 12 2 12
  Email marketing: 8 2 9
  Other: 1 2 3
Malicious
  Spam sender: 3 3
  Domain enumeration: 5 2 5
  Reflection attack: 1 1 2
  Phishing: 1 1
  DoS attack: 3 2 1 4
Unknown: 1 1 1
TOTAL: 35 4 9 39
Results Detected anomalies • No ground truth → Impossible to use standard evaluation → Manual inspection of detected anomalies • Only tip of the iceberg?
Conclusion
Conclusion QLAD - ENTRADA / DSC - QLAD-flow - QLAD-global - QLAD-UI is a combination that works! However, Anomaly ≠ attack / abuse ➡ filtering needed Can this be automated?
Thanks! Any questions? Interested? All software is open source! QLAD: https://github.com/DNSBelgium/qlad ENTRADA: https://github.com/SIDN/entrada DSC: https://github.com/DNS-OARC/dsc