Internet measurements at complexnetworks.fr Guillaume Valadon - http://valadon.complexnetworks.fr LIP6 (CNRS - UPMC) Complex Networks team http://complexnetworks.fr
The team http://complexnetworks.fr: plots & videos – 4 permanent members: Jean-Loup Guillaume, Matthieu Latapy, Bénédicte Le Grand, Clémence Magnien – 2 postdocs, 9 Ph.D. students Focus & interests: – Internet topology, P2P networks, social networks – measurements – analysis
Outline 1. Internet topology measurements (Frédéric Ouedraogo, Clémence Magnien, Matthieu Latapy) 2. eDonkey measurements: server side 3. eDonkey measurements: honeypot
Context IP topology of the Internet: measured with traceroute-like tools, from few sources towards a high number of destinations. Measurements are: – long and costly – biased: fake links and missed links
Traceroute measurements
Traceroute: unbalanced load [Figure: number of times probed vs. distance from the monitor; many probes on links closest to the monitor] Traceroute limitations: unbalanced load, information redundancy, obtained view is not a tree
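The two limitations above can be illustrated with a small sketch. It assumes, hypothetically, that each traceroute run is available as a list of hop names from the monitor to a destination; the toy `paths` data and the function names are illustrative and do not come from the slides. It counts how often each link is probed at each distance, and lists the nodes reached through more than one predecessor, in which case the merged view is not a tree.

```python
from collections import Counter, defaultdict

def link_load(paths):
    """Count how many probes cross each link, keyed by (distance, link)."""
    load = Counter()
    for path in paths:
        # Distance 1 is the first link after the monitor.
        for distance, link in enumerate(zip(path, path[1:]), start=1):
            load[(distance, link)] += 1
    return load

def max_load_by_distance(load):
    """Highest per-link probe count observed at each distance."""
    result = defaultdict(int)
    for (distance, _link), count in load.items():
        result[distance] = max(result[distance], count)
    return dict(result)

def nodes_with_several_parents(paths):
    """Nodes seen after more than one predecessor: the view is not a tree."""
    parents = defaultdict(set)
    for path in paths:
        for parent, child in zip(path, path[1:]):
            parents[child].add(parent)
    return {node for node, seen in parents.items() if len(seen) > 1}

# Toy data (assumption): monitor "m", four destinations d1..d4.
paths = [
    ["m", "a", "b", "d1"],
    ["m", "a", "b", "d2"],
    ["m", "a", "c", "d3"],
    ["m", "e", "b", "d4"],
]

print(max_load_by_distance(link_load(paths)))
print(nodes_with_several_parents(paths))
```

On this toy input the link next to the monitor carries the most probes, and node `b` appears behind two different predecessors.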
Ego-centered view: tracetree http://data.complexnetworks.fr/Radar/ – one source – fixed set of destinations – the result is a tree – fast measurement (~100 rounds per day)
Tracetree measurements
Parameters Many parameters: – number of destinations – delay between rounds – maximum TTL? – ... We want: 1. high frequency 2. large ego-centered view 3. low network load
Parameters: frequency [Figure: number of IP addresses observed vs. hours, for the test monitor and the control monitor] frequency has no impact on discovered addresses
Parameters: destination number [Figure: number of IP addresses observed vs. hours, with 1000, 3000 and 10000 destinations, including simulated curves for 1000 and 3000 destinations, for the test monitor and the control monitor] too many destinations == loss of efficiency
Available data [ADN’08, ICIMP’09] Two parameter sets: – normal: 3000 destinations, max TTL 30, 10 minutes delay (~100 rounds / day) – fast: 1000 destinations, max TTL 15, 1 minute delay (~800 rounds / day) Available data at http://data.complexnetworks.fr/Radar/ – several sets of random destinations – 150 monitors – several months of uninterrupted measurements
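As a rough check of the two parameter sets, the sketch below derives the round duration implied by the quoted figures. It assumes, and the slides do not state this, that the delay is the idle time between the end of one round and the start of the next; the function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def implied_round_duration(rounds_per_day, delay_seconds):
    """Seconds spent probing per round, assuming delay = idle time between rounds."""
    cycle = SECONDS_PER_DAY / rounds_per_day  # one round plus one delay
    return cycle - delay_seconds

normal = implied_round_duration(rounds_per_day=100, delay_seconds=10 * 60)
fast = implied_round_duration(rounds_per_day=800, delay_seconds=60)
print(normal, fast)  # 264.0 48.0
```

Under that assumption a normal round would take about 4.4 minutes and a fast round under a minute. The rounds-per-day values are approximate on the slide, so these durations are only orders of magnitude.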
Outline 1. Internet topology measurements 2. eDonkey measurements: server side (Frédéric Aidouni, Matthieu Latapy, Clémence Magnien) 3. eDonkey measurements: honeypot
Context Study exchanges in P2P networks: – file diffusion – communities of interest – popularity Some motivations: – understand user behaviour – develop new P2P protocols – blind content detection – detect pedophile activities – protocol and exchange simulations
eDonkey exchanges 1. inter-clients: file downloads 2. inter-servers: statistical data 3. clients-servers: files & sources search [Figure: exchanges between client and server: keywords search and files list, sources search and sources list]
Capturing traffic on a real server [Figure: capture pipeline, with labels: eDonkey server, capture client, PCAP dump, PCAP decoding, UDP traffic, PCAP flow, eDonkey exchanges inspection, eDonkey traffic, anonymisation and formatting, XML encoding]
<opcode dir="received" TS="2786402.373146" IP=" 0045125351 " type="high" port="02029"><OP_GLOBSEARCHREQ>
<tags count="1"><anon-string> 3108886 </anon-string></tags>
</OP_GLOBSEARCHREQ></opcode>
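A record such as the one above can be read with a standard XML parser. The sketch below is a minimal example using Python's `xml.etree.ElementTree`. Only the sample message is taken from the slide; the function name and the output field names are hypothetical, and values are kept as stripped strings because the slides do not document their encoding.

```python
import xml.etree.ElementTree as ET

# Sample message copied from the slide.
SAMPLE = (
    '<opcode dir="received" TS="2786402.373146" IP=" 0045125351 " '
    'type="high" port="02029"><OP_GLOBSEARCHREQ>'
    '<tags count="1"><anon-string> 3108886 </anon-string></tags>'
    '</OP_GLOBSEARCHREQ></opcode>'
)

def parse_opcode(xml_text):
    """Extract the main fields of one <opcode> record (field names are ours)."""
    root = ET.fromstring(xml_text)
    message = root[0]  # first child element names the eDonkey message type
    return {
        "direction": root.get("dir"),
        "timestamp": float(root.get("TS")),
        "client": root.get("IP").strip(),
        "port": root.get("port").strip(),
        "message": message.tag,
        "strings": [e.text.strip() for e in message.iter("anon-string")],
    }

record = parse_opcode(SAMPLE)
print(record["message"], record["client"], record["strings"])
```

For a data set of the size reported later in the talk, a streaming parser such as `ET.iterparse` would probably be a better fit than loading whole files.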
Basic analysis: file sizes [Figure: number of files vs. size (in MB, logarithmic axis), with many small files and peaks at 175 MB, 230 MB, 350 MB, 700 MB, 1 GB and 1.4 GB] Sizes obtained from the server answers. Peaks at the CD-ROM size and its fractions (1/2, 1/3, and 1/4) ➡ related to classical sizes of storage media
Basic analysis: time between queries [Figure: number of peers vs. time between queries (in seconds)] regularities of queries
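One possible way to compute such a distribution is sketched below. It assumes, hypothetically, that the capture has been reduced to `(timestamp, peer)` pairs; the toy `events` list, the function names and the bin width are illustrative and not taken from the talk.

```python
from collections import Counter, defaultdict

def inter_query_times(events):
    """Gaps in seconds between consecutive queries of the same peer."""
    by_peer = defaultdict(list)
    for timestamp, peer in events:
        by_peer[peer].append(timestamp)
    gaps = []
    for times in by_peer.values():
        times.sort()
        gaps.extend(later - earlier for earlier, later in zip(times, times[1:]))
    return gaps

def histogram(gaps, bin_seconds=10):
    """Number of gaps falling in each bin, keyed by the bin's lower bound."""
    return Counter(int(gap // bin_seconds) * bin_seconds for gap in gaps)

# Toy data (assumption): two anonymised peers.
events = [(0, "p1"), (60, "p1"), (120, "p1"), (10, "p2"), (40, "p2")]
gaps = inter_query_times(events)
print(sorted(histogram(gaps).items()))
```

Regular client behaviour, such as periodic re-queries, would show up as sharp peaks in this histogram.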
Resulting data set in numbers [HotP2P’09] 10 weeks of measurements, ~500 GB of compressed XML, ~10 billion messages, ~90 million clients, ~280 million distinct files ➡ anonymized data available online at http://antipaedo.lip6.fr
Outline 1. Internet topology measurements 2. eDonkey measurements: server side 3. eDonkey measurements: honeypot (Oussama Allali, Matthieu Latapy, Clémence Magnien)
Honeypot-based measurements eDonkey honeypot: – customized eDonkey client – announces files to a server (filename, hash, size) – logs queries made by regular clients Manager: – controls distributed honeypots – sends commands to honeypots: server to connect to, files to exchange, ...
eDonkey exchanges [Figure: exchange with a honeypot, which either sends nothing or sends random content]
Methodology 24 PlanetLab nodes, running distributed honeypots: – 12 sending no content – 12 sending random content 1 greedy honeypot: – learn files during the first day – afterwards, announce these files
                    distributed     greedy
Honeypots                    24          1
Duration in days             32         15
Shared files                  4      3 175
Distinct peers          110 049    871 445
Distinct files           28 007    267 047
Parameters: distributed or greedy [Figure: two panels, Distributed and Greedy, each showing the total number of peers and the number of new peers vs. days] long measurements are relevant; effects of blacklisting and file popularity
Parameters: no-content & random-content [Figure: two panels comparing random content and no content over days: number of peers (HELLO messages) and number of REQUEST-PART queries] advantage of sending random content; global and local blacklisting
Parameters: number of honeypots [Figure: number of peers vs. number of honeypots; lower bound, average and upper bound of 100 samples] important benefit in using several honeypots
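The kind of curve described above could be produced along the following lines. This is only a sketch: it assumes each honeypot's log has been reduced to a set of peer identifiers and that samples are random subsets of honeypots, which the slide does not spell out. The toy data and function name are hypothetical.

```python
import random

def peers_seen(honeypot_peers, k, samples=100, seed=0):
    """Min, mean and max number of distinct peers seen by k random honeypots."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    sizes = []
    for _ in range(samples):
        chosen = rng.sample(honeypot_peers, k)
        sizes.append(len(set().union(*chosen)))
    return min(sizes), sum(sizes) / len(sizes), max(sizes)

# Toy data (assumption): three honeypots and the peers each one observed.
honeypot_peers = [{1, 2}, {2, 3}, {3, 4, 5}]

for k in range(1, len(honeypot_peers) + 1):
    low, mean, high = peers_seen(honeypot_peers, k)
    print(k, low, round(mean, 2), high)
```

The gap between the lower and upper bounds shrinks as `k` approaches the total number of honeypots, since fewer distinct subsets remain.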
Conclusion Several data sets available: – IP topology – eDonkey measurements: • server side • client side • honeypot Ongoing work: – understand topology dynamics – communities of interest in eDonkey – anomaly detection in the IP topology – ...
Questions?