Abilene Observatory Datasets
Matt Zekauskas, matt@internet2.edu
03-Jun-2004
Major Datasets, roughly in size order
• Flow data (last 11 bits of IP zeroed)
• Latency (one-way, 2*11^2 paths (v4, v6))
• Near future: IGP (IS-IS updates/node)
• SYSLOG from router (internal at least, future)
• Router snapshots
• 1 and 5 min SNMP usage, errors
• Throughput (iperf, 1Gb limited, 2*2*11^2 sets (v4, v6, TCP, UDP))
• [multicast]
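The path and test-set counts quoted above can be checked with a little arithmetic. The sketch below takes the slide's formulas at face value; the node count of 11 is an assumption read off the 11^2 term, and the function names are hypothetical.

```python
# Assumption: 11 measurement nodes, inferred from the 11^2 term on the slide.
NODES = 11


def latency_paths(nodes: int = NODES, ip_versions: int = 2) -> int:
    """One-way latency paths: nodes^2 ordered pairs, once per IP version."""
    return ip_versions * nodes ** 2


def throughput_sets(nodes: int = NODES, ip_versions: int = 2,
                    transports: int = 2) -> int:
    """Throughput test sets: nodes^2 pairs per IP version per transport."""
    return ip_versions * transports * nodes ** 2


if __name__ == "__main__":
    print("latency paths:", latency_paths())
    print("throughput sets:", throughput_sets())
```

Note that 11^2 counts every ordered pair, including a node paired with itself; that is how the slide states it, and the sketch does not second-guess it.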
Common Theme
• Summaries available via Web
  • Graphs, tables
  • Time series (of summaries) via “web services”
• Raw data in diverse formats, only by special request (often on HPSS / MSS)
  • (Probably) not enough metadata kept
  • Stored by date; some tarballs
  • Recovery is a manual process
  • Exception: router snapshots, for which all XML files are kept
• Flow data gone after 30 days
Other problems
• Stuff is archived in many places (with different administrative hurdles)
  • OSU, Columbus [RAID]
  • IU, Bloomington [HPSS]
  • Internet2 / Ann Arbor [RAID, tape]
• Although all “available” off of http://abilene.internet2.edu/observatory, it’s a twisty maze of Web links (and there is no access to archived data)
• Validation? Some of the flow data is known to be bad…
Some future plans
• New databases: IGP, BGP
• Looking to use a DHS grant to help clean up databases & improve access
• Hope to contribute to this effort
URLs
• http://abilene.internet2.edu/observatory
  • Pointers to all measurements/sites/projects
  • Particularly http://abilene.internet2.edu/observatory/data-views.html
• http://www.abilene.iu.edu/
  • NOC home page. Weathermap, Proxy, SNMP measurements
• http://netflow.internet2.edu/weekly/
  • Summarized flow data
• http://www.itec.oar.net/abilene-netflow/
  • “Raw” – matrices; (Anon) feeds available on request
Details on select datasets
Flow data
• Collected using Mark Fullmer’s flow-tools
  • 1/100 sampling; may be losing some flows, but losses are not noted
• Stored (last 11 bits zeroed) on a RAID for 30 days
• Summaries stored forever
  • http://www.itec.oar.net/abilene-netflow/
  • http://netflow.internet2.edu/weekly/
• Raw: access via rsync
  • One directory per day per router
  • Files with 5 min chunks (so 288/day)
  • Not necessarily available in real time; a delay of up to 24 hr is possible
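To make the "last 11 bits zeroed" anonymization and the 288 files/day figure concrete, here is a minimal Python sketch. It is not the collector's actual code: it assumes IPv4 addresses, and the function names are hypothetical. No file naming scheme is implied; only the arithmetic is shown.

```python
import ipaddress


def zero_low_bits(addr: str, bits: int = 11) -> str:
    """Return the IPv4 address with its lowest `bits` bits set to zero."""
    ip = ipaddress.IPv4Address(addr)
    mask = (0xFFFFFFFF >> bits) << bits  # keep the high (32 - bits) bits
    return str(ipaddress.IPv4Address(int(ip) & mask))


def chunks_per_day(chunk_minutes: int = 5) -> int:
    """Number of fixed-length chunks in a 24-hour day."""
    return (24 * 60) // chunk_minutes


def chunk_index(hour: int, minute: int, chunk_minutes: int = 5) -> int:
    """Zero-based index of the chunk containing the given time of day."""
    return (hour * 60 + minute) // chunk_minutes


if __name__ == "__main__":
    print(zero_low_bits("10.1.255.255"))  # low 11 bits cleared
    print(chunks_per_day())               # 5 min chunks per day
    print(chunk_index(23, 59))            # index of the last chunk of the day
```

Zeroing 11 bits clears the whole last octet plus the low three bits of the third octet, so each stored address identifies a block of 2048 addresses rather than a single host.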
Latency
• Our own OWAMP implementation
  • 1/sec Poisson; full mesh among router nodes
• Summaries
  • Stored in MySQL
  • Web displays, including graphical and “worst 10”
  • XML/SOAP access (http://abilene.internet2.edu/ami/webservices.html) using GGF Network Measurement WG schema…
• Raw
  • Directory per path per day
  • Tarred into one large file per day, then compressed
  • HPSS, no public access
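A "1/sec Poisson" probe stream means the gaps between consecutive probes are drawn from an exponential distribution with a mean of one second. The sketch below illustrates only that scheduling idea; it is not the OWAMP implementation, and the function name and parameters are hypothetical.

```python
import random


def poisson_send_times(rate_per_sec: float = 1.0,
                       duration_sec: float = 60.0,
                       seed=None) -> list:
    """Generate send times (seconds from start) for a Poisson probe stream.

    Gaps between probes are exponentially distributed with mean
    1 / rate_per_sec; generation stops once duration_sec is reached.
    """
    rng = random.Random(seed)
    t = 0.0
    times = []
    while True:
        t += rng.expovariate(rate_per_sec)  # exponential inter-send gap
        if t >= duration_sec:
            return times
        times.append(t)


if __name__ == "__main__":
    schedule = poisson_send_times(rate_per_sec=1.0, duration_sec=10.0, seed=42)
    for send_time in schedule:
        print(f"{send_time:.3f}")
```

Randomizing the gaps this way avoids synchronizing the probes with any periodic behavior in the network, which a fixed one-second interval could do.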
Router Snapshots
• 1/hr query of routers using XML & Junoscript; config, routing state, interface status…
• Raw
  • Stored as compressed XML files forever
  • Access via SOAP/XML or fetching files
  • http://loadrunner.uits.iu.edu/~gcbrowni/Abilene/raw-data.html
• Web page veneer
  • http://loadrunner.uits.iu.edu/~gcbrowni/Abilene/
  • Current status; older snapshots also available by date
  • Some processed more than others (including some into rrd files and graphs [rrd files available in addition to graphs])
5 min SNMP
• Polled using custom software, stored into RRD files
• Access depends on link type
  • Backbone: from weathermap
  • Access: by clicking on map of router nodes
  • Either way, end up with typical MRTG-style page
• Raw
  • RRD files for current day off MRTG-style page
  • RRD files stored per day (I believe) on HPSS
    – No public access, but should be available on request
Questions
• Brief description of data sets
• Tools used (?)
• Cataloging and archiving the data
• Problems with the data
• Desirable database support
• Future plans
Observatory
• Publishing measurement data
  • Stuff we collect for operations
  • Stuff we collect for research
• The ability for research projects to add their equipment, or run software on our platform
  • Peer reviewed
  • Why? Passive, collocation makes analysis easier
  • AMP, PMA, PlanetLab