Characterization of BitTorrent Swarms and their Distribution in the Internet
  Tobias Hoßfeld, Frank Lehrieder, David Hock, Simon Oechsner (University of Würzburg, Germany)
  Zoran Despotovic, Wolfgang Kellerer, Maximilian Michel (DoCoMo Communication Laboratories Europe GmbH, Germany)
  Institute of Computer Science, Chair of Communication Networks, Prof. Dr.-Ing. P. Tran-Gia
Agenda
  Introduction
    BitTorrent-like P2P networks
    Aim: characterization of real-life BitTorrent swarms
    Methodology and data sets
  Measurements
    Swarm sizes
    AS clustering of peers
    Traffic of BitTorrent swarms
  Characterization: distribution of peers over ASs
  Conclusion
  Characterization of BitTorrent Swarms and their Distribution in the Internet 2 Frank Lehrieder
BitTorrent-like P2P Networks
  In wide use for user-assisted content distribution, mostly file sharing
  Responsible for a large fraction of today’s Internet traffic
  Example network:
    Tracker: index server, knows the addresses of all peers in the swarm
    Seed: peer which has the complete file, uploads only
    Leecher: peer which does not have the complete file, uploads and downloads data
    Swarm: set of all peers exchanging the same file
    Transfer of data chunks: the file is divided into chunks of 512 KB
Aim: Characterization of Real-Life BitTorrent Swarms
  Major research topic: application layer traffic optimization (ALTO) for BitTorrent networks
  Performance evaluation is difficult
    Crucial impact of evaluation scenarios
    Slightly modified mechanisms lead to different results
  Figure labels: Peers, Autonomous Systems (AS), “the Internet”
  What is the nature of real-life BitTorrent swarms in the Internet?
    Distribution of peers over swarms
    Distribution of peers over ASs
    Exploitation potential for ALTO mechanisms
    Time dynamics, file sizes, content, …
Available Data Sets
Swarm Sizes
  Mean and max. number of peers per swarm, and fraction of swarms containing 80% of all peers (p80):
    ID      Mean     Max.    p80
    Mov.    25.46    20079   0.13
    TV.     15.53     7276   0.17
    Mus.     9.76     3813   0.25
    KPi.    11.12    72988   0.18
    KMi.     6.99      763   0.45
    KDe.     9.73     1883   0.31
    Pop.   691.14    30961   0.45
    24h.   146.68    19748   0.12
  Almost all swarms have fewer than 100 peers (exception: Pop.)
  The maximum swarm sizes are far larger than the mean values
  The fraction of swarms containing 80% of the peers (p80) is roughly 0.2 for most of the data sets
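The p80 column can be illustrated with a minimal sketch. It assumes that p80 is obtained by counting swarms from the largest downwards until they hold 80% of all peers; the slide does not spell out the procedure, so this ordering is an assumption, and the function name `p80` and the sample sizes are hypothetical.

```python
def p80(swarm_sizes, share=0.8):
    """Fraction of swarms (largest first) needed to hold `share` of all peers.

    Assumption: swarms are counted in order of decreasing size.
    """
    sizes = sorted(swarm_sizes, reverse=True)
    total = sum(sizes)
    if total == 0:
        raise ValueError("swarm_sizes must contain at least one peer")
    covered = 0
    for count, size in enumerate(sizes, start=1):
        covered += size
        if covered >= share * total:
            return count / len(sizes)
    return 1.0


# Hypothetical data set: one large swarm dominates four small ones.
print(p80([90, 4, 3, 2, 1]))
```

A small p80 means that a few large swarms hold most of the peers, which is the pattern reported for most data sets in the table.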
AS Clustering of Peers
  To which degree are the peers of a swarm clustered in their ASs?
  “x-clustered” peers have at least (x-1) other peers in the same AS
  AS clustering of swarm s for threshold x = #(x-clustered peers) / swarm size
  Example swarm: AS clustering is 3/5 for x = 3 and 0 for x = 4
  Most swarms have a very low fraction of peers clustered in their ASs, or even none at all
  Only 4% of the music swarms have an AS with 5 or more peers
  Only 12% of the movie swarms have an AS with 5 or more peers
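The clustering metric can be sketched in a few lines. The sketch assumes a swarm is given as a list with one AS identifier per peer, and it calls the threshold `x`: a peer counts as clustered if at least x-1 other peers of the swarm share its AS. The function name and the example swarm (three peers in one AS, two peers in ASs of their own) are hypothetical; the composition was chosen so that it reproduces the values 3/5 and 0 given on the slide.

```python
from collections import Counter


def as_clustering(peer_as_ids, x):
    """Fraction of peers that share their AS with at least x-1 other peers."""
    if not peer_as_ids:
        raise ValueError("swarm must contain at least one peer")
    peers_per_as = Counter(peer_as_ids)
    # Every peer in an AS with >= x peers of this swarm is x-clustered.
    clustered = sum(n for n in peers_per_as.values() if n >= x)
    return clustered / len(peer_as_ids)


# Hypothetical example swarm with 5 peers in 3 ASs.
swarm = ["AS-A", "AS-A", "AS-A", "AS-B", "AS-C"]
print(as_clustering(swarm, 3), as_clustering(swarm, 4))
```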
Traffic of BitTorrent Swarms
  Two simple approximations for the traffic of a swarm
    “with file sizes”: traffic is proportional to (swarm size * file size)
    “w/o file sizes”: traffic is proportional to the swarm size
  80-90% of the traffic is caused by 20% of the swarms (Pareto principle)
  “Potentially local traffic” = traffic of a swarm * AS clustering of the swarm for threshold 2 (fraction of peers with at least one other peer in the same AS)
  ALTO mechanisms are useful only in the top 20% of the swarms
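The two traffic approximations and the share of the top 20% of swarms can be sketched as follows. This is an illustration under stated assumptions, not the measurement code: swarms are given as hypothetical `(swarm_size, file_size)` pairs, traffic is only known up to a constant factor, and the number of "top" swarms is rounded to the nearest integer with a minimum of one.

```python
def traffic_weights(swarms, with_file_sizes=True):
    """Relative traffic per swarm under one of the two approximations."""
    return [size * file_size if with_file_sizes else size
            for size, file_size in swarms]


def top_share(weights, top_fraction=0.2):
    """Share of the total weight contributed by the heaviest swarms."""
    total = sum(weights)
    if total == 0:
        raise ValueError("total traffic weight must be positive")
    ranked = sorted(weights, reverse=True)
    top_n = max(1, round(top_fraction * len(ranked)))
    return sum(ranked[:top_n]) / total


# Hypothetical swarms: (number of peers, file size in MB).
swarms = [(100, 700), (10, 700), (5, 4), (3, 4), (2, 4)]
print(top_share(traffic_weights(swarms, with_file_sizes=False)))
print(top_share(traffic_weights(swarms, with_file_sizes=True)))
```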
Characterizing the Distribution of Peers over ASs
  Intention
    Input for performance evaluation
    Real-life distribution of peers over ASs within a swarm
  For every swarm s spread over n ASs
    Assign AS ids k ∈ {1,…,n} to the ASs in order of decreasing number of peers
    F_s(k): fraction of peers in s that belong to the AS with id k
  Average F_s(k) over all swarms s: F(k) (=> dark blue bars)
  Fit F(k) with a power-law function: P(k) = a/k^b + c (=> red curve)
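The ranking and averaging steps can be sketched as below. One point is an assumption, because the slide does not say how swarms with different numbers of ASs are combined: here F_s(k) is treated as 0 for ranks k beyond the number of ASs of swarm s. The function names and sample swarms are hypothetical, and the fitting step is left out.

```python
from collections import Counter


def rank_fractions(peer_as_ids):
    """F_s(k) for k = 1..n: peer fractions per AS, largest AS first."""
    if not peer_as_ids:
        raise ValueError("swarm must contain at least one peer")
    counts = sorted(Counter(peer_as_ids).values(), reverse=True)
    return [c / len(peer_as_ids) for c in counts]


def average_rank_fractions(swarms):
    """F(k): mean of F_s(k) over all swarms.

    Assumption: F_s(k) = 0 when swarm s spans fewer than k ASs.
    """
    per_swarm = [rank_fractions(s) for s in swarms]
    max_rank = max(len(f) for f in per_swarm)
    return [
        sum(f[k] if k < len(f) else 0.0 for f in per_swarm) / len(per_swarm)
        for k in range(max_rank)
    ]


# Two hypothetical swarms, each given as one AS label per peer.
swarms = [["A", "A", "B", "C"], ["X", "X"]]
print(average_rank_fractions(swarms))
```

The resulting list could then be passed to a nonlinear least-squares routine such as `scipy.optimize.curve_fit` to estimate a, b and c in P(k) = a/k^b + c.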
Conclusion
  The measurement study comprises swarms of
    Different index servers (piratebay, mininova, demonoid)
    Different types of content (music, movies, regional content)
  Measurement results
    Most swarms are small and cannot use ALTO mechanisms
    Most traffic (80-90%) is produced by a few large swarms
    ALTO mechanisms have a high potential in these swarms
    Further results: regional swarms, time dynamics, distribution of peers over countries, number of peers vs. AS degree
  Characterizations of BitTorrent swarms for performance evaluations
    Distribution of peers over ASs within a swarm
    Further characterizations: file sizes, number of peers, and top AS fraction
BACKUP
Distribution of Peers over ASs and Countries
  The average number of peers per AS is very small (<5) for most swarms
  The maximum number of peers per AS is still quite small
  AS affiliation is not the only metric: country codes (MaxMind GeoIP)
Peculiarities of “Regional” Swarms
  16 example swarms considered
  Swarms sharing regional content
    Spread over fewer ASs
    Higher top AS fraction
  Calculate the distribution of peers over ASs for every swarm
    Determine the kurtosis of this distribution
    Higher kurtosis for regional swarms (due to concentration in large ASs)
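The three quantities named on this slide can be computed per swarm as in the sketch below. The slide does not state which kurtosis definition was used; the sketch assumes the plain fourth standardized moment (not the excess kurtosis) of the per-AS peer counts, which is only one possible reading. The function name and the sample swarm are hypothetical.

```python
import math
from collections import Counter


def as_spread_metrics(peer_as_ids):
    """Return (number of ASs, top AS fraction, kurtosis of peers per AS)."""
    counts = list(Counter(peer_as_ids).values())
    if not counts:
        raise ValueError("swarm must contain at least one peer")
    n_as = len(counts)
    top_as_fraction = max(counts) / len(peer_as_ids)
    mean = sum(counts) / n_as
    m2 = sum((c - mean) ** 2 for c in counts) / n_as
    m4 = sum((c - mean) ** 4 for c in counts) / n_as
    # Assumed definition: m4 / m2^2; undefined if all ASs hold equally many peers.
    kurtosis = m4 / m2 ** 2 if m2 > 0 else math.nan
    return n_as, top_as_fraction, kurtosis


# Hypothetical swarm: 6 peers in one AS, 4 peers in ASs of their own.
swarm = ["A"] * 6 + ["B", "C", "D", "E"]
print(as_spread_metrics(swarm))
```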