BGP Parallelization
A study into the BGP protocol and BGP implementations to improve route server scalability.
Jenda Brands, Patrick de Niet

The internet is growing

[Figure: active BGP entries in the FIB. Source: cidr-report.org]

NETWORKS: more prefixes announced
• De-aggregation of prefixes, thus more prefixes announced
• Currently 673,602 prefixes (03-07-17)

ROUTES: more routes to prefixes announced
• More interconnections are made

Introduction to internet exchanges

[Figure: content, enterprise, and provider networks connected through an Internet Exchange (IX)]

INTERNET EXCHANGE (IX)
• All nodes are in the same layer-2 domain

PEERING
Benefits of an IX
• Flat fee from the IX
• Negotiate peering terms with neighbours
Costs of peering
• Internet exchanges reduce peering costs and administration

Introduction to route servers: traditional BGP

[Figure: full-mesh peering between content, enterprise, and provider networks]

BGP without a route server
• Full-mesh peering requires N(N-1)/2 peerings: 21 for the 7 nodes shown
• 6 sessions per node
• All peers are on the same layer-2 network
• A lot of administration and configuration for all peers

Introduction to route servers: current BGP

[Figure: the same networks, each peering with two route servers]

BGP with route servers
• 14 peerings required (2N, for N = 7 peers and two route servers)
• 2 sessions per node; each route server has 7
• Less administration and configuration needed for peering
• Private peering is still possible
• Route servers reduce the load on clients

Problem
• Convergence time
• Maximum CPU usage on the route server
• Aged routes on the clients
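
The session counts on the two route server slides can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function names are made up for illustration, and the default of two route servers is taken from the slide.

```python
def full_mesh_sessions(n: int) -> int:
    """Total BGP sessions when each of n peers peers with every other peer."""
    return n * (n - 1) // 2


def route_server_sessions(n: int, route_servers: int = 2) -> int:
    """Total BGP sessions when each of n peers only peers with the route servers."""
    return n * route_servers


# Compare both designs for the 7 nodes on the slide and for larger exchanges.
for n in (7, 100, 800):
    print(n, full_mesh_sessions(n), route_server_sessions(n))
# 7 21 14
# 100 4950 200
# 800 319600 1600
```

The full mesh grows quadratically with the number of peers, while the route server design grows linearly, which is why the load concentrates on the route servers.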

Problem summary
• Route servers are doing the heavy lifting and pushing BGP to its limits
• As a result, convergence times are increasing
• The exact cause of this behaviour within BGP is unidentified

Research question
What improvements can be made to the Border Gateway Protocol (BGP) or its implementations to resolve current CPU bottlenecks when processing updates?
• Why are current BGP implementations (inherently) single-threaded?
• What past work has been done to solve this specific issue?
• What optimizations can be done to resolve this issue?

General BGP architecture

BGP specification (phases 1, 2, and 3)

[Figure, shown on three consecutive slides, one per phase: inside the route server, UPDATEs from Peer 1, Peer 2, … Peer n fill one Adj-RIB-In per peer (P1, P2, … Pn). Routes pass the IN-POLICY, go through the best path calculation into the Loc-RIB, and then pass the OUT-POLICY into one Adj-RIB-Out per peer.]
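
The pipeline in the diagram can be sketched in Python to make the three phases concrete. This is a simplified illustration, not how BIRD or any other daemon implements it: the `Route` fields, the function names, and the reduced tie-breaking (higher local preference, then shorter AS path) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    peer: str            # peer the route was learned from
    as_path_len: int     # stand-in for the full set of BGP path attributes
    local_pref: int = 100


def phase1(adj_rib_in, in_policy):
    """Phase 1: apply the inbound policy to every per-peer Adj-RIB-In."""
    return {peer: [r for r in routes if in_policy(r)]
            for peer, routes in adj_rib_in.items()}


def preference(route):
    """Simplified ranking: higher local_pref wins, then the shorter AS path."""
    return (route.local_pref, -route.as_path_len)


def phase2(accepted):
    """Phase 2: pick one best route per prefix and store it in the Loc-RIB."""
    loc_rib = {}
    for routes in accepted.values():
        for r in routes:
            best = loc_rib.get(r.prefix)
            if best is None or preference(r) > preference(best):
                loc_rib[r.prefix] = r
    return loc_rib


def phase3(loc_rib, peers, out_policy):
    """Phase 3: build one Adj-RIB-Out per peer, skipping the route's own sender."""
    return {p: [r for r in loc_rib.values() if r.peer != p and out_policy(p, r)]
            for p in peers}


adj_rib_in = {
    "P1": [Route("1.0.0.0/24", "P1", as_path_len=3)],
    "P2": [Route("1.0.0.0/24", "P2", as_path_len=2),
           Route("1.0.1.0/24", "P2", as_path_len=1)],
}
loc_rib = phase2(phase1(adj_rib_in, in_policy=lambda r: True))
adj_rib_out = phase3(loc_rib, ["P1", "P2"], out_policy=lambda p, r: True)
```

In this sketch phase 2 walks every accepted route in a single loop, which is the part that is hard to spread over several threads without splitting or locking the tables.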

Testing scenarios

SCENARIO 1: THREE to ONE
• Three peers
• One route server
• Simulate link-flap

SCENARIO 2: MANY to ONE
• Many peers
• One route server
• Simulate link-flap

SCENARIO 3: REAL WORLD
• Many peers
• One route server
• Overlapping prefixes
• Simulate link-flap

Testing scenarios: announced prefixes

          SAME                 UNIQUE                 REAL WORLD
          All peers SAME       All peers UNIQUE
          prefix               prefix

Peer 1    1.0.0.0/24           1.0.0.0/24             1.0.0.0/20
          1.0.1.0/24           1.0.1.0/24             1.0.16.0/20
          1.0.2.0/24           1.0.2.0/24             1.0.32.0/20

Peer 2    1.0.0.0/24           1.0.3.0/24             1.0.4.0/23
          1.0.1.0/24           1.0.4.0/24             1.0.6.0/23
          1.0.2.0/24           1.0.5.0/24             1.0.8.0/23

Peer n    1.0.0.0/24           1.0.6.0/24             1.0.5.0/24
          1.0.1.0/24           1.0.7.0/24             1.0.7.0/24
          1.0.2.0/24           1.0.8.0/24             1.0.8.0/24
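
The SAME and UNIQUE prefix sets on this slide follow a simple pattern that can be generated with Python's `ipaddress` module. This is a sketch of one way to produce such test data, not the generator used in the study; the function names and the `peer1`, `peer2`, … labels are invented for the example.

```python
import ipaddress


def nth_prefix(base: str, n: int) -> ipaddress.IPv4Network:
    """Return the n-th consecutive network of the same size, counting from base."""
    net = ipaddress.ip_network(base)
    start = int(net.network_address) + n * net.num_addresses
    return ipaddress.ip_network((start, net.prefixlen))


def same_scenario(n_peers: int, per_peer: int, base: str = "1.0.0.0/24"):
    """Every peer announces the same block of prefixes."""
    block = [str(nth_prefix(base, i)) for i in range(per_peer)]
    return {f"peer{p + 1}": list(block) for p in range(n_peers)}


def unique_scenario(n_peers: int, per_peer: int, base: str = "1.0.0.0/24"):
    """Every peer announces its own, non-overlapping block of prefixes."""
    return {f"peer{p + 1}": [str(nth_prefix(base, p * per_peer + i))
                             for i in range(per_peer)]
            for p in range(n_peers)}


print(unique_scenario(3, 3)["peer2"])
# ['1.0.3.0/24', '1.0.4.0/24', '1.0.5.0/24']
```

The REAL WORLD column mixes prefix lengths and overlaps, so it would need real routing data rather than a formula.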

Testbed

PEER SERVERS
• EIGHT servers for peers
• Intel(R) Xeon(R) CPU E3-1220L V2 @ 2.30GHz (4 cores)
• 7.7GB RAM

ROUTE SERVER
• ONE route server
• Intel(R) Xeon(R) CPU L3426 @ 1.87GHz (8 cores)
• 7.7GB RAM
• BIRD BGP daemon

PEERS
• 800 peers max
• ExaBGP daemons
• Docker used for containers

Definitions

CONVERGENCE: what defines the CONVERGED state
Either
• Got END-OF-RIB for the last peer
• Stops sending UPDATEs

LINK FLAP: all peers UNIQUE prefix
• Simulate flapping link
• Bring link to RS down

METRICS: what was MEASURED
• CPU utilization
• Memory utilization
• Bandwidth
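
The convergence definition above can be turned into a small check over a message log. The sketch below is only an illustration under several assumptions: the event format `(timestamp, kind, peer)`, the function name, and the `quiet_period` used to decide that UPDATEs have stopped are all hypothetical and do not come from the measurements.

```python
def convergence_time(events, peers, quiet_period=5.0):
    """Return seconds from the first event until the converged state, or None.

    events: list of (timestamp, kind, peer) sorted by timestamp,
            where kind is "UPDATE" or "END_OF_RIB".
    Converged when End-of-RIB was seen for the last remaining peer, or when
    no UPDATE followed the previous one for at least quiet_period seconds.
    """
    if not events:
        return None
    start = events[0][0]
    pending = set(peers)          # peers that have not sent End-of-RIB yet
    last_update = None
    for ts, kind, peer in events:
        # Second criterion: UPDATEs stopped for long enough before this event.
        if last_update is not None and ts - last_update >= quiet_period:
            return last_update - start
        if kind == "END_OF_RIB":
            pending.discard(peer)
            if not pending:       # first criterion: End-of-RIB for the last peer
                return ts - start
        elif kind == "UPDATE":
            last_update = ts
    return None


log = [(0.0, "UPDATE", "p1"), (1.0, "UPDATE", "p2"),
       (2.0, "END_OF_RIB", "p1"), (3.5, "END_OF_RIB", "p2")]
print(convergence_time(log, ["p1", "p2"]))  # 3.5
```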

Observations: convergence time

[Figure: convergence time vs number of peers. Y axis: time in seconds (0 to 600). X axis: number of peers (3, 10, 100, 200, 300, 400, 500, 600, 700, 800). Series: 100, 1,000, and 10,000 prefixes per peer.]

RESULTS: convergence times
• Lower numbers of peers show lower convergence times
• Higher numbers of peers show increasingly higher times
• 10,000 prefixes per peer with 800 peers is significantly higher

Observations: turning off export of routes

[Figure: phase 3 convergence time in seconds (0 to 500), export on vs export off.]

NO EXPORT
• Sending UPDATEs disabled with "export none"
• No significant difference
• Phase 3 (sending UPDATEs) cannot be the issue
• Unable to conclusively rule out the remaining phases

Solutions

PROTOCOL improvements
• Snapshot of Adj-RIB-In, sorted on prefix
• Calculate hash on the peer side
• Send the hash with the OPEN message
• RS compares the hash
• If the hash is the same, there is no need for a full UPDATE

IMPLEMENTATION improvements
• Load balance route servers
• Single endpoint for customers
• iBGP for internal convergence
• eBGP for peering

Protocol solution: protocol modifications

PREFIX BASED: create a prefix-based RIB-In
• Create a table per prefix
• Add all paths to that prefix
• When starting the Phase 2 calculation, lock only that specific RIB

[Figure: route server with per-peer Adj-RIB-In (P1, P2, … Pn), IN-POLICY, best path calculation, Loc-RIB, OUT-POLICY, and per-peer Adj-RIB-Out, extended with a per-prefix Adj-RIB-In next to the per-peer Adj-RIB-In tables.]
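
A per-prefix table with its own lock could look roughly like the Python sketch below. It is a hypothetical illustration of the locking granularity only: the class and method names are invented, best path selection is reduced to "shortest AS path", and Python threads would not give real CPU parallelism because of the interpreter lock.

```python
import threading


class PrefixRibIn:
    """Adj-RIB-In organised per prefix, with one lock per prefix table."""

    def __init__(self):
        self._tables = {}                 # prefix -> {peer: as_path}
        self._locks = {}                  # prefix -> lock for that table
        self._guard = threading.Lock()    # protects creation of new entries

    def _entry(self, prefix):
        """Return (lock, table) for a prefix, creating both on first use."""
        with self._guard:
            if prefix not in self._locks:
                self._locks[prefix] = threading.Lock()
                self._tables[prefix] = {}
            return self._locks[prefix], self._tables[prefix]

    def add_path(self, prefix, peer, as_path):
        """Store the path a peer announced; only this prefix is locked."""
        lock, table = self._entry(prefix)
        with lock:
            table[peer] = as_path

    def best_path(self, prefix):
        """Simplified phase 2 for one prefix: shortest AS path, peer name as tie-break."""
        lock, table = self._entry(prefix)
        with lock:
            if not table:
                return None
            return min(table.items(), key=lambda item: (len(item[1]), item[0]))


rib = PrefixRibIn()
rib.add_path("1.0.0.0/24", "P1", [65001, 65010])
rib.add_path("1.0.0.0/24", "P2", [65002])
print(rib.best_path("1.0.0.0/24"))  # ('P2', [65002])
```

Because each prefix has its own lock, a best path calculation for one prefix does not block updates to any other prefix.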

Protocol solution: protocol modifications

HASHING: compare hash before full UPDATE
• Calculate the hash of the RIB on the peer side
• After a link-flap, send the hash in the OPEN message
• RS compares the hashes; if they match, there is no need for a full UPDATE

[Flowchart. Peer x: end of Phase 3, calculate hash of RIB-Out, send OPEN message (including hash) to the route server. Route server: receive OPEN message, compare hash. Match? Yes: send NOTIFICATION (do not request RIB). No: send NOTIFICATION (request RIB).]
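
The hash comparison could be prototyped as below. This is a sketch under assumptions: routes are represented as plain `(prefix, attributes)` string pairs, SHA-256 is an arbitrary choice of hash, and the function names are invented. How the hash would actually be carried in the OPEN message is not covered here.

```python
import hashlib


def rib_hash(routes):
    """Hash a RIB given as (prefix, attributes) string pairs.

    The routes are sorted first, so both sides compute the same digest
    regardless of the order in which the routes were learned.
    """
    digest = hashlib.sha256()
    for prefix, attrs in sorted(routes):
        digest.update(f"{prefix}|{attrs}\n".encode())
    return digest.hexdigest()


def needs_full_update(peer_hash, rs_snapshot):
    """Route server side: request the full RIB only if the hashes differ."""
    return peer_hash != rib_hash(rs_snapshot)


peer_rib_out = [("1.0.1.0/24", "65001 i"), ("1.0.0.0/24", "65001 i")]
rs_snapshot = [("1.0.0.0/24", "65001 i"), ("1.0.1.0/24", "65001 i")]

# After a link-flap the peer sends its hash; the contents match, so no full UPDATE.
print(needs_full_update(rib_hash(peer_rib_out), rs_snapshot))  # False
```

The sort is a plain string sort, which is enough for the comparison as long as the peer and the route server use the same ordering and the same route encoding.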

Implementation solution: load balancing

BEFORE LB: customers peer with the load balancer
• Customers peer through the load balancer (eBGP)
• They peer with the route servers behind the load balancer

[Figure: content, enterprise, and provider networks connected via eBGP to a load balancer.]

Implementation solution: load balancing

BEHIND LB: the load balancer balances between route servers
• iBGP full mesh between the route servers
• eBGP to the load balancer

[Figure: inside the IX, a load balancer in front of several route servers; eBGP towards the load balancer, iBGP between the route servers.]

Future work

1. Rule out phase 1
• Go through the (open-source) code
• Good chances phase 1 is also not the issue
• Narrow down the bottleneck

1. Benchmarking of code
• Narrow down the problem as much as possible
• Put timestamps in the code and find the delaying pieces

2. PoC of hashing mechanism
• Set up a proof of concept of the proposed hashing mechanism
• Find any caveats not identified yet

2. PoC of load balancing
• Set up a proof of concept with load balancing
• Measure the convergence time gain

THANK YOU
Any further questions?
Let’s have a beer