
a framework for historical analysis and real-time monitoring of BGP data

a framework for historical analysis and real-time monitoring of BGP data. Chiara Orsini, Alistair King, Danilo Giordano, Vasileios Giotsas, Alberto Dainotti (alistair@caida.org), CAIDA, UC San Diego. BGPSTREAM: BGP data analysis for the masses.


  1. a framework for historical analysis and real-time monitoring of BGP data
     Chiara Orsini, Alistair King, Danilo Giordano, Vasileios Giotsas, Alberto Dainotti
     alistair@caida.org
     CAIDA, UC San Diego

  2. BGPSTREAM: BGP data analysis for the masses
     • Open source libraries, APIs and tools for live and historical BGP data analysis
     • Simple API
     • Versatile
     • Facilitates reproducibility and repeatability
     • Realtime monitoring
     • Stable: https://bgpstream.caida.org

  6. MOTIVATION: Why BGPStream?
     • BGP research and monitoring is important
     • Lots of existing BGP measurement data
     • Route Views and RIPE RIS have >15 years of data (16TB)
     • BUT, distinct lack of good tooling for processing/analyzing BGP data
     • State of the art?
         wget http://archive.org/xyz/abc/file.mrt
         bgpdump -m file.mrt | my_parser.py
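To make the "state of the art" pipeline concrete, the following is a minimal sketch of what a hand-written my_parser.py might contain. It assumes the usual pipe-delimited layout of bgpdump's one-line (-m) output (record type, timestamp, A/W/B flag, peer IP, peer ASN, prefix, AS path, ...). The function name and the dictionary keys are hypothetical and not part of any tool.

```python
def parse_bgpdump_line(line):
    """Parse one line of `bgpdump -m` output into a dict.

    The field positions are an assumption about bgpdump's one-line format.
    Returns None for lines that are not announcements, withdrawals or RIB entries.
    """
    fields = line.rstrip("\n").split("|")
    if len(fields) < 6 or fields[2] not in ("A", "W", "B"):
        return None
    parsed = {
        "timestamp": int(fields[1]),
        "kind": fields[2],  # A = announcement, W = withdrawal, B = RIB entry
        "peer_address": fields[3],
        "peer_asn": int(fields[4]),
        "prefix": fields[5],
        "as_path": [],
    }
    # Withdrawals carry no AS path; announcements and RIB entries do.
    if len(fields) > 6 and fields[6]:
        parsed["as_path"] = fields[6].split(" ")
    return parsed


# Example lines using documentation prefixes and ASNs (not real data).
announcement = parse_bgpdump_line(
    "BGP4MP|1438415400|A|192.0.2.1|64496|198.51.100.0/24|"
    "64496 64497 64498|IGP|192.0.2.1|0|0||NAG||"
)
withdrawal = parse_bgpdump_line(
    "BGP4MP|1438415401|W|192.0.2.1|64496|198.51.100.0/24"
)
```

Every user of this approach has to repeat this kind of parsing, plus file discovery, download and time ordering across collectors.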

  13. THE BGPSTREAM FRAMEWORK: An overview
      [Diagram components: User Code, Python API, User Libraries (libBGPStream), Metadata Broker, metadata crawler, Public HTTP … Data Archives. Labeled flows: metadata query, MRT data (via HTTP).]
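The roles in the overview diagram can be illustrated with a toy, in-memory stand-in for the Metadata Broker: a crawler registers dump files it finds in the archives, and a user library asks which files overlap a time interval before fetching them over HTTP. This is only a sketch; DumpFile, ToyBroker, the field names and the example URLs are all hypothetical and do not reflect the real broker interface.

```python
from collections import namedtuple

# One entry per dump file discovered by the metadata crawler (hypothetical schema).
DumpFile = namedtuple("DumpFile", "project collector dump_type start duration url")


class ToyBroker:
    """In-memory index of dump files, queried by time interval."""

    def __init__(self):
        self._index = []

    def register(self, dump_file):
        """Called by the crawler when it finds a new file in an archive."""
        self._index.append(dump_file)

    def query(self, start, end, dump_type=None):
        """Return files whose time span overlaps [start, end], oldest first."""
        hits = [
            f for f in self._index
            if f.start <= end
            and f.start + f.duration >= start
            and (dump_type is None or f.dump_type == dump_type)
        ]
        return sorted(hits, key=lambda f: f.start)


broker = ToyBroker()
broker.register(DumpFile("routeviews", "route-views2", "ribs", 1438416000, 120,
                         "http://archive.example.org/rv2/rib.1438416000.bz2"))
broker.register(DumpFile("ris", "rrc00", "updates", 1438415400, 300,
                         "http://archive.example.org/rrc00/updates.1438415400.gz"))
broker.register(DumpFile("ris", "rrc00", "updates", 1438420000, 300,
                         "http://archive.example.org/rrc00/updates.1438420000.gz"))

# A user library would download each returned URL (MRT data via HTTP).
urls = [f.url for f in broker.query(1438415400, 1438416600)]
rib_urls = [f.url for f in broker.query(1438415400, 1438416600, dump_type="ribs")]
```

Keeping only metadata in the broker means the bulk MRT data still comes straight from the archives.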

  14. THE BGPSTREAM FRAMEWORK: Stacked view
      [Diagram built up over slides 14-17, with layers numbered 1, 2 and 3.]

  18. BGPSTREAM USER LIBRARY: libBGPStream
      • Issues queries to metadata broker
      • Retrieves data directly from Data Providers
      • Currently supports MRT (RFC 6396)
      • De-multiplexes data from many sources into a single stream
      • Provides time-ordered sorting
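The idea of combining many sources into one time-ordered stream can be sketched with a k-way merge from the Python standard library. This illustrates the concept only and is not how libBGPStream is implemented; the tuple layout and the function name merge_sources are assumptions made for the example.

```python
import heapq


def merge_sources(*sources):
    """Merge iterables of (timestamp, collector, payload) tuples.

    Each input must already be sorted by timestamp; the result is a single
    lazily evaluated stream sorted by timestamp across all inputs.
    """
    return heapq.merge(*sources, key=lambda record: record[0])


# Toy per-collector streams (timestamps in seconds, payloads are placeholders).
rrc00 = [(100, "rrc00", "update-a"), (160, "rrc00", "update-b")]
route_views2 = [
    (90, "route-views2", "update-c"),
    (130, "route-views2", "update-d"),
    (200, "route-views2", "update-e"),
]

merged = list(merge_sources(rrc00, route_views2))
timestamps = [record[0] for record in merged]
```

Because the merge is lazy, only one pending record per source needs to be held in memory at a time.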

  19. RECORDS & ELEMS: Extracting information from MRT
      • BGPStream Record:
        • Encapsulates an MRT record
        • Adds metadata (e.g. collector)
      • MRT records (may) contain info for multiple peers/prefixes
        • E.g. routes to a single prefix from multiple peers in a RIB dump
      • Records are decomposed into BGPStream Elems:
        • E.g. prefix announcement from a single peer

      BGPStream Record fields:
        Field         Type    Function
        project       string  project name (e.g., Route Views)
        collector     string  collector name (e.g., rrc00)
        type          enum    RIB or Updates
        dump time     long    time the containing dump was begun
        position      enum    first, middle, or last record of a dump
        time          long    timestamp of the MRT record
        status        enum    record validity flag
        MRT record    struct  de-serialized MRT record

      Table 1: BGPStream elem fields.
        Field         Type    Function
        type          enum    route from a RIB dump, announcement, withdrawal, or state message
        time          long    timestamp of MRT record
        peer address  struct  IP address of the VP
        peer ASN      long    AS number of the VP
        prefix*       struct  IP prefix
        next hop*     struct  IP address of the next hop
        AS path*      struct  AS path
        community*    struct  community attribute
        old state*    enum    FSM state (before the change)
        new state*    enum    FSM state (after the change)
        * denotes a field conditionally populated based on …
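The record/elem relationship can be sketched with two small data classes. This is a simplified model under stated assumptions, not the library's own types: the class names, the string type labels and the mrt_entries stand-in for the de-serialized MRT record are hypothetical, and only a subset of the fields listed on the slide is modeled.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Elem:
    elem_type: str  # "rib", "announcement", "withdrawal" or "state"
    time: int
    peer_address: str
    peer_asn: int
    prefix: Optional[str] = None        # conditionally populated
    as_path: Optional[List[str]] = None  # conditionally populated


@dataclass
class Record:
    project: str
    collector: str
    record_type: str  # "ribs" or "updates"
    dump_time: int
    time: int
    # Simplified stand-in for the de-serialized MRT record:
    # a list of (peer_address, peer_asn, prefix, as_path) tuples.
    mrt_entries: list = field(default_factory=list)

    def elems(self):
        """Yield one Elem per (peer, prefix) entry contained in the record."""
        elem_type = "rib" if self.record_type == "ribs" else "announcement"
        for peer_address, peer_asn, prefix, as_path in self.mrt_entries:
            yield Elem(elem_type, self.time, peer_address, peer_asn, prefix, as_path)


# A RIB record holding routes to a single prefix from two peers.
record = Record(
    "ris", "rrc00", "ribs", 1438415400, 1438415400,
    [
        ("192.0.2.1", 64496, "198.51.100.0/24", ["64496", "64499"]),
        ("192.0.2.2", 64497, "198.51.100.0/24", ["64497", "64498", "64499"]),
    ],
)
elems = list(record.elems())
```

Here one record decomposes into two elems, one per peer, which is the granularity most analyses work at.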

  20. C API: Specifying a stream
      [Code shown as an image on slides 20-23; not captured in this transcript.]

  24. C API: Consuming the stream
      [Code shown as an image on slides 24-26; not captured in this transcript.]

  27. PYTHON BINDINGS - CASE STUDY: Studying AS path inflation using PyBGPStream
      How many AS paths are longer than the shortest path between two ASes?
      30 LINES OF PYTHON CODE

          from _pybgpstream import BGPStream, BGPRecord, BGPElem
          from collections import defaultdict
          from itertools import groupby
          import networkx as nx

          stream = BGPStream()
          as_graph = nx.Graph()
          rec = BGPRecord()
          bgp_lens = defaultdict(lambda: defaultdict(lambda: None))

          stream.add_filter('record-type','ribs')
          stream.add_interval_filter(1438415400,1438416600)
          stream.start()

          while(stream.get_next_record(rec)):
              elem = rec.get_next_elem()
              while(elem):
                  monitor = str(elem.peer_asn)
                  hops = [k for k, g in groupby(elem.fields['as-path'].split(" "))]
                  if len(hops) > 1 and hops[0] == monitor:
                      origin = hops[-1]
                      for i in range(0,len(hops)-1):
                          as_graph.add_edge(hops[i],hops[i+1])
                      bgp_lens[monitor][origin] = \
                          min(filter(bool,[bgp_lens[monitor][origin],len(hops)]))
                  elem = rec.get_next_elem()
          for monitor in bgp_lens:
              for origin in bgp_lens[monitor]:
                  nxlen = len(nx.shortest_path(as_graph, monitor, origin))
                  print monitor, origin, bgp_lens[monitor][origin], nxlen

      [Figure: AS path length discrepancy PMF, plotted on linear and log scales; x-axis: AS path length difference [d], 0 to 11.]
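The computation in this case study can be tried without any BGP data or external libraries. The sketch below applies the same steps to a handful of made-up AS paths and replaces networkx with a small breadth-first search: collapse prepending, build the AS graph, keep the shortest observed path per (monitor, origin) pair, then compare with the graph's shortest path. The function name path_inflation and the toy paths are assumptions for illustration only.

```python
from collections import defaultdict, deque
from itertools import groupby


def path_inflation(as_paths):
    """Map (monitor, origin) to (shortest BGP path length, shortest graph path length).

    Both lengths are counted in ASes, as in the slide's script.
    """
    graph = defaultdict(set)
    bgp_lens = {}
    for path in as_paths:
        hops = [asn for asn, _ in groupby(path.split(" "))]  # collapse prepending
        if len(hops) < 2:
            continue
        for left, right in zip(hops, hops[1:]):
            graph[left].add(right)
            graph[right].add(left)
        key = (hops[0], hops[-1])
        bgp_lens[key] = min(bgp_lens.get(key, len(hops)), len(hops))

    def shortest_len(src, dst):
        # Breadth-first search; returns the number of ASes on a shortest path.
        depth = {src: 1}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            if node == dst:
                return depth[node]
            for neighbor in graph[node]:
                if neighbor not in depth:
                    depth[neighbor] = depth[node] + 1
                    queue.append(neighbor)
        return None

    return {key: (length, shortest_len(*key)) for key, length in bgp_lens.items()}


# Toy AS paths using documentation ASNs; the first hop is the monitor.
paths = [
    "64496 64497 64498 64499 64499",  # origin prepends itself once
    "64501 64496 64499",              # reveals a direct 64496-64499 link
]
result = path_inflation(paths)
```

In this toy input, monitor 64496 reaches origin 64499 over four ASes although the graph contains a two-AS path, which is the kind of discrepancy the slide's PMF summarizes.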

  28. PYBGPSTREAM: Python bindings
      • Single script includes data specification and analysis logic:
        • Enhances reproducibility/repeatability
      • All of the power of the C API, available in Python

  29. PYTHON BINDINGS - CASE STUDY: Timely reactive measurements
      • We monitor community-based black-holing
        • Victim of DoS attack announces prefix with special community attribute to request neighbors drop traffic
      • We trigger traceroutes to characterize the black-holing event (using 50-100 probes per event)
        • Probed 253 victims (90-95% of black-holing events) while black-holing in effect
      • Combined passive control-plane and active data-plane measurements to capture and investigate transient routing policies
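The detection step of such a reactive measurement can be sketched as a filter over announcements. The slides do not say which community values were monitored, so the default below is an assumption: 65535:666 is the well-known BLACKHOLE community from RFC 7999, and many operators define their own values, which a caller would pass in. The dictionary layout of each elem and the function name are hypothetical.

```python
def blackholed_prefixes(elems, blackhole_communities=frozenset({"65535:666"})):
    """Yield (time, prefix) for announcements tagged with a black-holing community.

    `elems` is an iterable of dicts with keys "type", "time", "prefix" and
    "communities" (a list of "asn:value" strings); this layout is assumed.
    """
    for elem in elems:
        if elem["type"] != "A":
            continue  # only announcements can request black-holing
        if blackhole_communities & set(elem["communities"]):
            yield elem["time"], elem["prefix"]


# Toy elems with documentation prefixes and ASNs.
sample_elems = [
    {"type": "A", "time": 10, "prefix": "198.51.100.7/32",
     "communities": ["64496:100", "65535:666"]},
    {"type": "A", "time": 11, "prefix": "203.0.113.0/24",
     "communities": ["64496:100"]},
    {"type": "W", "time": 12, "prefix": "198.51.100.7/32", "communities": []},
]

# Each target would be handed to a traceroute scheduler while black-holing is in effect.
targets = list(blackholed_prefixes(sample_elems))
custom = list(blackholed_prefixes(sample_elems, frozenset({"64496:100"})))
```

Running the filter on a live stream is what makes it possible to probe while the black-holing is still in effect.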

  30. BGP CORSARO: Continuous realtime monitoring
      • Plugin-based tool for processing live BGP data
      • Continuously extracts derived data from BGPStream in regular time bins
      • Incl. "prefix-monitor" sample plugin
        • Monitor your own address space
        • How many prefixes/origin ASes?
      [Figure: Hijacking of AS137 (GARR) - Jan 2015*]
      *originally described by Dyn Research: http://research.dyn.com/2015/01/vast-world-of-fraudulent-routing/
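The time-bin idea behind a prefix monitor can be sketched in a few lines: group elems into fixed-width bins and, per bin, count the distinct prefixes and origin ASes that overlap the monitored address space. This is an illustration under stated assumptions, not the plugin's actual logic or output format; the function name, the tuple layout and the 300-second bin width are hypothetical.

```python
import ipaddress
from collections import defaultdict


def prefix_monitor(elems, monitored, bin_size=300):
    """Return {bin_start: (unique prefixes, unique origin ASNs)} per time bin.

    `elems` is an iterable of (time, prefix, origin_asn) tuples; only prefixes
    overlapping one of the `monitored` networks are counted.
    """
    monitored_nets = [ipaddress.ip_network(p) for p in monitored]
    bins = defaultdict(lambda: (set(), set()))
    for time, prefix, origin_asn in elems:
        net = ipaddress.ip_network(prefix)
        if any(net.version == m.version and net.overlaps(m) for m in monitored_nets):
            bin_start = time - (time % bin_size)
            prefixes, origins = bins[bin_start]
            prefixes.add(prefix)
            origins.add(origin_asn)
    return {start: (len(p), len(o)) for start, (p, o) in sorted(bins.items())}


# Toy elems: a second origin AS and a more-specific prefix appear in the later bin.
toy_elems = [
    (1000, "198.51.100.0/24", 64496),
    (1100, "198.51.100.0/24", 64496),
    (1210, "198.51.100.0/24", 64496),
    (1350, "198.51.100.0/25", 64511),
    (1360, "203.0.113.0/24", 64500),  # outside the monitored space
]
counts = prefix_monitor(toy_elems, ["198.51.100.0/24"])
```

A jump in the number of origin ASes between consecutive bins is the kind of signal a hijack of monitored address space would produce.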

  31. BIG DATA: BGP data analysis for the 1%
      • "Students can write scripts to analyze BGP data, but I need to do REAL analysis…"
      • We conducted a proof-of-concept study using PyBGPStream with Apache Spark:
        • Analyzed 15 years of data:
          • one RIB per month
          • all Route Views and RIPE RIS collectors
          • > 3000 RIBs, ~44 billion BGPStream Elems
      • See the paper for more details about lessons learned
      • PyBGPStream/Spark template script: https://github.com/CAIDA/bgpstream
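The "one RIB per month, all collectors" layout lends itself to a map/reduce decomposition in which each (collector, month) pair is an independent task. The sketch below builds such a task list and reduces per-task counts in plain Python. It is not the template script from the repository; the function names, the one-hour search window and the read_elems callback are assumptions. In a Spark job, the final sum would presumably become a parallelized map over the task list.

```python
import calendar
from itertools import product


def monthly_rib_tasks(collectors, first_year, last_year, window=3600):
    """Build one (collector, start, end) task per collector per month.

    `window` is the assumed width, in seconds, of the interval searched
    for a RIB dump at the start of each month (UTC).
    """
    tasks = []
    for year, month in product(range(first_year, last_year + 1), range(1, 13)):
        start = calendar.timegm((year, month, 1, 0, 0, 0))
        for collector in collectors:
            tasks.append((collector, start, start + window))
    return tasks


def count_elems(task, read_elems):
    """Map step: number of elems produced for one task.

    `read_elems(collector, start, end)` is supplied by the caller; in a real
    job it would wrap a PyBGPStream stream restricted to that collector and interval.
    """
    return sum(1 for _ in read_elems(*task))


tasks = monthly_rib_tasks(["rrc00", "route-views2"], 2001, 2015)

# Reduce step with a stand-in reader that yields three elems per task.
total = sum(count_elems(task, lambda collector, start, end: range(3)) for task in tasks)
```

Because tasks share no state, they can be distributed across workers in any order and the per-task results combined afterwards.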

  32. BIG DATA - CASE STUDIES: Routing table size over time
      [Figure: number of IPv4 prefixes (0 to 500k) per year, 2002 to 2016.]

  33. BIG DATA - CASE STUDIES: Transit ASes over time
      [Figure: number of ASNs (IPv4, IPv6; 0 to 60K) and percentage of transit ASNs (IPv4, IPv6; 0 to 60) per year, 2001 to 2016.]

