IPv6 Scanning
Smart address selection and comparison to legacy IP

Intermediate talk
Sebastian Gebhard
Supervisors: Oliver Gasser, Quirin Scheitle
September 30, 2015

Chair for Network Architectures and Services
Department of Informatics
Technische Universität München

Outline
◮ Motivation
◮ Approach
◮ Data sourcing
◮ Intermediate results
◮ Lessons learned so far
◮ Future work
◮ Time plan

Motivation
◮ Recent advances in scanning technology have made large-scale IPv4 scanning feasible
  ◮ zmap [4]: whole IPv4 address space in 4.5 minutes
◮ IPv6 address space is vastly larger
  ◮ Scanning the whole address space is not feasible

Proposed solution
◮ Smart address selection
  ◮ Choose addresses already seen in the network
  ◮ Close neighbours of known addresses
  ◮ Pattern recognition?

Approach
◮ Data harvesting → hitlist generation
  ◮ Parsing data sources
  ◮ DNS resolving (for some data sources)
  ◮ Filtering and evaluating results
◮ Scanning targets
  ◮ traceroute
  ◮ Port scanning on ports 80 and 443
◮ Evaluation

Data sources
◮ Alexa Top 1 Million
◮ Rapid7 rDNS
◮ Rapid7 DNS ANY
◮ Caida DNS names
◮ DNS zone files

Data sources
Alexa Top 1 Million [5]
CSV list of the top 1 million most visited websites, published by Alexa Internet, Inc.

  1,google.com
  2,facebook.com
  3,youtube.com
  Figure: Alexa Top 1M Websites [5], entries 1 to 3

◮ Raw size
  ◮ File size: 22 MB, lines: 1,000,000
◮ Processing
  ◮ DNS lookup: AAAA record of every domain
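
The processing step for this source can be sketched as follows. This is a minimal illustration, not the tooling used for the talk: the function names `read_domains`, `resolve_aaaa` and `build_hitlist` are hypothetical, and the lookup goes through the system resolver via `socket.getaddrinfo`, which is an assumption made to keep the example self-contained. A million-domain run would likely need a dedicated bulk resolver.

```python
import csv
import io
import socket


def read_domains(csv_text):
    """Yield the domain column from 'rank,domain' CSV rows."""
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) >= 2:
            yield row[1].strip()


def resolve_aaaa(domain):
    """Return the IPv6 addresses the system resolver reports for a domain.

    Assumption: the system resolver is good enough for a sketch.
    Names that fail to resolve yield an empty set.
    """
    try:
        infos = socket.getaddrinfo(domain, None, socket.AF_INET6,
                                   socket.SOCK_STREAM)
    except (socket.gaierror, UnicodeError):
        return set()
    # For AF_INET6 the sockaddr is (address, port, flowinfo, scope_id).
    return {info[4][0] for info in infos}


def build_hitlist(csv_text, resolver=resolve_aaaa):
    """Collect the unique IPv6 addresses for all domains in the CSV."""
    hits = set()
    for domain in read_domains(csv_text):
        hits |= resolver(domain)
    return hits
```

The resolver is passed in as a parameter so the parsing logic can be exercised without network access, for example with a stub that returns fixed addresses.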

Data sources
scans.io [1] - Rapid7 Reverse DNS
Reverse names for the whole IPv4 space (coverage ∼28%)

  131.159.14.1, erdbeerschnitzel.net.in.tum.de
  131.159.14.10, rack.net.in.tum.de
  131.159.14.100, asgard.net.in.tum.de
  131.159.14.101, ipmi.asgard.net.in.tum.de
  Figure: Rapid7 Reverse DNS dataset excerpt [1]

◮ Raw size
  ◮ File size: 56 GB, lines: 1,216,518,754
◮ Processing
  ◮ DNS lookup: AAAA record of every hostname

Data sources
scans.io [1] - Rapid7 DNS ANY
DNS ANY lookup on names gathered from scans by Rapid7

  internetscience.net.in.tum.de,aaaa,2001:4ca0:2001:13:216:3eff:fee1:6973
  internetwin.tumblr.com,a,66.6.41.21
  internetwin.tumblr.com,txt,|v=spf1 include:_spf.google.com include:sendgrid.net include:mail.zendesk.com -all
  internet-science.eu,ns,lucifer.net.in.tum.de
  internet-science.eu,ns,nimbus.net.in.tum.de
  intranet.in.tum.de,cname,intranet.informatik.tu-muenchen.de
  Figure: Rapid7 DNS ANY dataset excerpt [1]

◮ Raw size
  ◮ File size: 69 GB, lines: 1,435,173,425
◮ Processing
  ◮ Extract AAAA records
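
Extracting the AAAA records from this dataset could look like the sketch below. It assumes the `name,type,value` layout visible in the excerpt; the function name `extract_aaaa` is hypothetical. Values that do not parse as IPv6 addresses are dropped, which also normalises the textual form of the addresses that remain.

```python
import ipaddress


def extract_aaaa(lines):
    """Return the unique IPv6 addresses found in 'name,type,value' lines."""
    addrs = set()
    for line in lines:
        # Split at most twice: TXT values may themselves contain commas.
        parts = line.rstrip("\n").split(",", 2)
        if len(parts) != 3 or parts[1].strip().lower() != "aaaa":
            continue
        try:
            addrs.add(str(ipaddress.IPv6Address(parts[2].strip())))
        except ipaddress.AddressValueError:
            # Not a syntactically valid IPv6 address: skip the record.
            continue
    return addrs
```

Because the function accepts any iterable of lines, it can be fed an open file object, so a 69 GB file is streamed rather than loaded into memory.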

Data sources
Caida DNS names [2]
IP addresses with a PTR record, collected by Caida's large-scale traceroute measurements on Archipelago (Ark)

  1438578912 2001:4ca0:2001:10::1 st.gw.net.in.tum.de
  1438578913 2001:4ca0:2001:10:eea8:6bff:fef4:b705 priwen.net.in.tum.de
  1438578951 2001:4ca0:2001:10::2 st.scylla.net.in.tum.de
  1438578955 2001:4ca0:2001:10::3 st.charybdis.net.in.tum.de
  Figure: Caida DNS names dataset excerpt

◮ Raw size
  ◮ File size: 40 MB, lines: 618,480
◮ Processing
  ◮ cut -f2

Data sources
DNS zone files [6]
Every registered domain in the [biz, com, info, mobi, net, org, xxx] zones

  WARRIOR4US.COM
  GARYBETTUM.COM
  SDRTCJ.COM
  CAVALLOCREEKFARM.COM
  TIESCOMMUNITY.COM
  Figure: .com zone file excerpt [6]

◮ Raw size
  ◮ File size: 2.6 GB, lines: 151,881,502
◮ Processing
  ◮ DNS lookup: AAAA record of every hostname

Intermediate results
◮ Filtering
◮ Evaluation
◮ Results

Filtering

  raw
    1. sort -u        Remove duplicates             → duplicates
    2. bogon?         Remove bogons                 → bogons
    3. IANAspecial?   Remove special prefixes       → IANAspecial
    4. pfx2as?        Remove unannounced prefixes   → unannounced
    5. blacklist?     Remove blacklisted prefixes   → blacklisted
  final

Figure: Filter output on Rapid7 DNS ANY
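
The filter chain can be approximated in a few lines. This is a simplified sketch under stated assumptions, not the filter used for the talk: the bogon and IANA special-purpose stages are collapsed into a single `is_global` check from Python's `ipaddress` module, the announced prefixes and the blacklist are passed in as plain lists of networks (hypothetical parameters `announced` and `blacklist`), and an extra `invalid` bucket catches unparsable input.

```python
import ipaddress


def in_any(addr, networks):
    """True if addr falls inside at least one of the given networks."""
    return any(addr in net for net in networks)


def filter_hitlist(raw, announced, blacklist):
    """Return (final addresses, per-stage removal counts)."""
    removed = {"invalid": 0, "duplicates": 0, "special": 0,
               "unannounced": 0, "blacklisted": 0}
    seen = set()
    final = []
    for text in raw:
        try:
            addr = ipaddress.IPv6Address(text.strip())
        except ipaddress.AddressValueError:
            removed["invalid"] += 1
            continue
        if addr in seen:                      # stage 1: duplicates
            removed["duplicates"] += 1
            continue
        seen.add(addr)
        if not addr.is_global:                # stages 2 and 3, simplified
            removed["special"] += 1
        elif not in_any(addr, announced):     # stage 4: not in any announced prefix
            removed["unannounced"] += 1
        elif in_any(addr, blacklist):         # stage 5: blacklisted prefix
            removed["blacklisted"] += 1
        else:
            final.append(str(addr))
    return final, removed
```

The linear scan in `in_any` is fine for a handful of prefixes. Matching millions of addresses against a full routing table would call for a prefix tree or a similar longest-prefix-match structure.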

Evaluation
Automated generation of a PDF report for each data source using IPython notebook
◮ Basic statistics
  ◮ Amount of data removed by filters
◮ Mapping found IPs to announced prefixes
  ◮ IPs per prefix
  ◮ IPs per AS
  ◮ AS / prefix with most hits
◮ Efficiency of data source
  ◮ Unique, real IP addresses per size of source file
  ◮ Possible SLAAC addresses
  ◮ # and % of all prefixes / ASes found
◮ Hamming weight
  ◮ Calculate Hamming weight of the interface ID of each IP
  ◮ Compare Hamming weights
  ◮ Identify servers and clients?
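
Two of the per-address metrics above can be sketched directly. The interface ID is taken to be the lower 64 bits of the address. The slides do not say how possible SLAAC addresses are detected, so the `ff:fe` check below is an assumption: it flags interface IDs that look like modified EUI-64 identifiers derived from a MAC address and will miss privacy-extension addresses. All function names are hypothetical.

```python
import ipaddress

IID_MASK = (1 << 64) - 1  # lower 64 bits of an IPv6 address


def interface_id(address):
    """Return the interface ID (lower 64 bits) as an integer."""
    return int(ipaddress.IPv6Address(address)) & IID_MASK


def hamming_weight(address):
    """Number of bits set to 1 in the interface ID."""
    return bin(interface_id(address)).count("1")


def looks_like_eui64(address):
    """Heuristic: the two middle bytes of the interface ID are ff:fe."""
    return (interface_id(address) >> 24) & 0xFFFF == 0xFFFE
```

A manually configured address such as `2001:4ca0:2001:10::1` has an interface ID with very few bits set, while a MAC-derived or random identifier has a weight much closer to half of the 64 bits. That gap is what makes the weight a candidate signal for telling servers from clients.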

Results for hitlist generation

                 Alexa Top 1M   rDNS          DNS ANY       Caida DNS names   Zone files
  size           22 MB          56 GB         69 GB         40 MB             2.6 GB
  input lines    1M             1.2B          1.4B          618k              151M
  raw            90,761         1,023,950     9,769,653     618,480           4,762,297
                 (9.076 %) a    (0.084 %)     (0.681 %)     (100.000 %)       (3.136 %)
  final          43,822         462,185       1,440,984     102,580           430,689
                 (4.382 %)      (0.038 %)     (0.100 %)     (16.586 %)        (0.284 %)
  ASes           1,424          4,795         5,708         5,488             2,371
                 (1,424 ppm) b  (4 ppm)       (4 ppm)       (8,873 ppm)       (16 ppm)
  PFXes          1,695          6,749         8,506         9,269             2,995
                 (1,695 ppm)    (6 ppm)       (6 ppm)       (14,987 ppm)      (20 ppm)
  AS coverage    13.984 %       47.088 %      56.054 %      53.894 %          23.284 %
  PFX coverage   6.575 %        26.178 %      32.993 %      35.953 %          11.617 %

  Combined AS coverage:  7,331 (71.99 %)
  Combined PFX coverage: 12,854 (49.86 %)

  a Second line: efficiency = value / input lines
  b ppm: parts per million = ASes / prefixes per 1 million input lines
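
The two derived rows follow directly from the footnote definitions. A short sketch, with hypothetical function names, reproduces them from the exact line counts given on the data source slides:

```python
def efficiency_percent(value, input_lines):
    """Footnote a: share of input lines that yielded an address, in percent."""
    return 100.0 * value / input_lines


def per_million(count, input_lines):
    """Footnote b: ASes or prefixes found per 1 million input lines."""
    return 1_000_000 * count / input_lines


# Caida DNS names: 618,480 input lines, 102,580 final addresses, 5,488 ASes.
caida_final = efficiency_percent(102_580, 618_480)
caida_as_ppm = per_million(5_488, 618_480)
```

Normalising by input lines is what lets a 40 MB source be compared with a 69 GB one: the Caida list covers a similar number of ASes as the Rapid7 datasets while needing orders of magnitude fewer input lines.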

Lessons learned
◮ Always sanitize your input data
  ◮ PTR records don't have to contain a valid hostname.
    Example: 109.234.32.137 resolves to
    \236\232\240-\234\240\229\241\229\23586.\240\244.
◮ People are using IPv6 over ISDN
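
One way to act on the first lesson is to validate every name before it reaches the resolver. The check below is a minimal sketch that assumes strict letter-digit-hyphen hostname syntax. The function name `is_valid_hostname` is hypothetical, and the rule is deliberately conservative: it also rejects names with underscores, such as `_spf.google.com`.

```python
import re

# One DNS label: 1 to 63 letters, digits or hyphens, no leading or trailing hyphen.
LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_valid_hostname(name):
    """True if name is a syntactically valid letter-digit-hyphen hostname."""
    if name.endswith("."):
        name = name[:-1]          # allow one trailing root dot
    if not name or len(name) > 253:
        return False
    return all(LABEL.fullmatch(label) for label in name.split("."))
```

With this check the PTR value shown above is rejected because of its backslash escapes, while ordinary names such as `asgard.net.in.tum.de` pass.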