Fast Control Plane Analysis Using an Abstract Representation
Aditya Akella
Aaron Gember-Jacobson, Raajay Viswanathan and Ratul Mahajan
UW-Madison and Microsoft
Control plane is…
• Essential → configuration errors may cause security/availability problems
• Complex → errors may not be immediately apparent
[Figure: control plane (routing processes, routing tables) and data plane (forwarding tables)]
Important functional invariants
• Always blocked
• Always isolated
• Always equivalent paths
• Always traverse middlebox
Challenge: Invariants violated under some (combinations of) failures
Proactive verification
• Analyze current data plane [HSA, Veriflow] → cannot verify invariants always hold
• Generate data planes [Batfish] → time consuming
Blocked, isolated, waypoints, equivalence …
• Properties of paths, not paths themselves
• Data centers, enterprises use a limited set of control plane constructs
Higher-level abstraction, fast analysis
Abstract Representation for Control planes (ARC)
Control plane configuration → Abstract representation
• Encodes the network’s forwarding behavior under all possible infrastructure faults
• Proactive verification boils down to checking simple graph-level properties fast
• Ignores which protocols are used and how
[Figure: weighted digraphs for traffic classes Src:U Dst:T and Src:T Dst:U]
Key requirements of ARC
1) Sound & Complete: each digraph contains every feasible path and no infeasible paths → verification of invariants
2) Precise: assign edge weights such that the min-cost path matches the real path → counter-examples, equivalence testing
[Figure: OSPF network and its weighted digraphs for Src:U Dst:T and Src:T Dst:U]
• Why weighted digraphs?
• How to ensure soundness, completeness, precision?
Routing protocols used today
[Figure: OSPF (Dijkstra's algorithm, AD=110) and BGP (path length & preference, AD=20) on Router 1 and Router 2 in AS 1 and AS 2]
• Commonality: cost-based path selection algorithm
• Differences: granularity & currency
• Also must account for:
– Traffic class specificity
– Route redistribution
– Route selection based on administrative distance
Challenge: determining the structure and edge weights of the graphs
Extended topology graphs (ETGs)
• One per traffic class
• Vertices: routing processes
• Edges: flow of data enabled by exchange of routing information
Sound and complete (for OSPF, BGP, redistr …)
Edge weights based on configured costs and administrative distances
[Figure: example ETG for Src:S, Dst:T with OSPF processes X.3, Y.3, Z.3 and BGP processes A.1, B.1, Z.1, each with I and O vertices]
ETG edge weights
• Inter-device: OSPF weights; unit cost per hop for BGP (each router is an AS)
• Intra-device: redistribution only; no cost within process; fixed-cost between processes
Precise (for DAG redistribution, + scaling AD graphs)
[Figure: example ETG with OSPF and BGP processes; labels: shortest path = 0.5, longest path = 5, gap = 1, shortest path = 1]
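To make the weighting rules concrete, here is a minimal sketch of a weighted digraph in the ETG style and a min-cost path query over it. The vertex names (`A.O`, `B.I`, and so on), the weights, and the helper `min_cost_path` are hypothetical and chosen only for illustration; this is not the ARC tool's data model.

```python
import heapq

def min_cost_path(graph, src, dst):
    """Dijkstra over a dict-of-dicts digraph; returns (cost, path) or (inf, [])."""
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == dst:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return float("inf"), []
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return dist[dst], path[::-1]

# Hypothetical toy ETG: inter-device edges carry OSPF-style costs,
# intra-device edges within one process cost 0.
etg = {
    "SRC": {"A.O": 0},
    "A.O": {"B.I": 2, "C.I": 7},  # inter-device edges
    "B.I": {"B.O": 0},            # intra-device, same process
    "B.O": {"C.I": 3},            # inter-device edge
    "C.I": {"DST": 0},
}

cost, path = min_cost_path(etg, "SRC", "DST")
```

If the weights are precise in the sense of the slides, the min-cost path returned here would match the path the real network uses for that traffic class.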
ARC properties
Constructs (sound & complete, precise), with the conditions listed for each:
• OSPF: single area
• RIP
• eBGP: select by AS path length, local pref.
• Static routes
• ACLs
• Route filters
• Route selection (based on administrative distance): no redistribution OR redistribution costs congruent with ADs
• Route redistribution: acyclic & costs congruent with ADs
Sound and complete for 100%; precise for 96%
Verification
• Always traverse middlebox:
1) remove all edges with middleboxes
2) Src and Dst in same connected component?
[Figure: example ETG (OSPF) for Src:U, Dst:S with routers B, C, D, E, F, G]
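The two steps above can be sketched as follows. This is an illustrative sketch, not the ARC implementation: the edge list, the set `middlebox_edges`, and the function names are hypothetical, and the connectivity test is done as directed reachability from Src to Dst, which is an assumption about how "same connected component" applies to a digraph.

```python
from collections import deque

def reachable(edges, src, dst):
    """Breadth-first search over directed edges given as (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
    seen = {src}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return True
        for v in adj.get(u, []):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False

def always_traverses_middlebox(edges, middlebox_edges, src, dst):
    # Step 1: remove all edges that pass through a middlebox.
    remaining = [e for e in edges if e not in middlebox_edges]
    # Step 2: the invariant holds only if Dst is now cut off from Src.
    return not reachable(remaining, src, dst)

# Hypothetical topology: the B->C edge carries the middlebox.
edges = [("U", "B"), ("B", "C"), ("C", "S"), ("B", "E"), ("E", "S")]
middlebox_edges = {("B", "C")}

bypass_exists = always_traverses_middlebox(edges, middlebox_edges, "U", "S")
no_bypass = always_traverses_middlebox(
    [e for e in edges if e != ("B", "E")], middlebox_edges, "U", "S"
)
```

In the first call the path U, B, E, S avoids the middlebox, so the invariant is violated; removing the B->E edge leaves no bypass.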
Verification
• Always reachable with < k link failures: max-flow on unit-weight ETG ≥ k?
[Figure: example ETG (OSPF) for Src:U, Dst:S; 3 edge-disjoint paths, max-flow = 3]
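A minimal sketch of this check, assuming every edge in the graph corresponds to one physical link with unit capacity. It uses a plain augmenting-path max-flow (BFS, one unit per augmentation); the function names and the toy topology are hypothetical, and a real ETG would also need to treat intra-device edges, which link failures do not remove, differently.

```python
from collections import deque, defaultdict

def unit_max_flow(edges, src, dst):
    """Max-flow with capacity 1 per directed edge (equals the number of edge-disjoint paths)."""
    if src == dst:
        raise ValueError("src and dst must differ")
    cap = defaultdict(int)
    adj = defaultdict(set)
    for u, v in edges:
        cap[(u, v)] += 1
        adj[u].add(v)
        adj[v].add(u)  # reverse direction for residual edges
    flow = 0
    while True:
        # BFS for an augmenting path in the residual graph.
        parent = {src: None}
        queue = deque([src])
        while queue and dst not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if dst not in parent:
            return flow
        # Push one unit of flow along the path found.
        v = dst
        while parent[v] is not None:
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1

def always_reachable(edges, src, dst, k):
    # Reachable under fewer than k link failures iff max-flow >= k.
    return unit_max_flow(edges, src, dst) >= k

# Hypothetical topology with three edge-disjoint paths from U to S.
edges = [("U", "B"), ("B", "S"), ("U", "C"), ("C", "S"), ("U", "D"), ("D", "S")]
flow = unit_max_flow(edges, "U", "S")
```

With unit capacities the max-flow value equals the number of edge-disjoint paths, which is why comparing it against k answers the question.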
Verification
Invariant | Graph property | Required ARC properties
Always blocked | Separate connected components | Sound & Complete
Always reachable with < k failures | Max flow ≥ k | Sound & Complete
Always traverse waypoint (chain) | Separate connected components | Sound & Complete
Always isolated | No common edges | Sound & Complete
Equivalence | Same structure & weights | Sound, Complete, & Precise
Precision required to produce counter-examples
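The last two rows of the table reduce to set comparisons. The sketch below assumes each ETG is stored as a dict-of-dicts mapping a vertex to its successors and edge weights; that layout, the function names, and the sample graphs are hypothetical choices for illustration.

```python
def edge_set(etg):
    """Unweighted edges of a dict-of-dicts digraph."""
    return {(u, v) for u, nbrs in etg.items() for v in nbrs}

def weighted_edge_set(etg):
    """Edges together with their weights."""
    return {(u, v, w) for u, nbrs in etg.items() for v, w in nbrs.items()}

def always_isolated(etg_a, etg_b):
    # Two traffic classes are isolated if their ETGs share no edge.
    return edge_set(etg_a).isdisjoint(edge_set(etg_b))

def equivalent(etg_a, etg_b):
    # Same structure and same weights.
    return weighted_edge_set(etg_a) == weighted_edge_set(etg_b)

# Hypothetical ETGs for three traffic classes.
class_1 = {"A": {"B": 1}, "B": {"C": 2}}
class_2 = {"A": {"D": 1}, "D": {"C": 2}}
class_3 = {"A": {"B": 1}, "B": {"C": 5}}
```

Here `class_1` and `class_2` use disjoint edges, while `class_1` and `class_3` share structure but differ in one weight, so they are neither isolated nor equivalent.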
Implementation and evaluation
• Implemented in Java using Batfish (parsing) and JGraphT (graph algorithms)
https://bitbucket.org/uw-madison-networking-research/arc
• Configurations from 314 data center networks operated by a large online service provider
• 4-core 2.8GHz CPU, 24GB RAM
Evaluation: time to generate ARC
• Most time is spent parsing
• Fast (< 10 sec) even for large networks
Evaluation: verification time
• Always blocked: < 500 ms (Batfish: 694 days!)
• Always reachable with < k failures: < 1 sec
• Always isolated: up to 16 min
Verification time is proportional to the number of traffic classes; easily parallelized
Next steps
• Precision under fewer assumptions
• Generality of ARCs
• Other uses…
Next steps: automated repair
1) Transform ETGs to have desired attributes (e.g., src and dst → always blocked)
2) Translate to config changes (e.g., remove edge → add ACL)
Challenge: finding a minimal repair (e.g., many ACLs vs. remove BGP neighbor) without side-effects
[Figure: Configurations → ARC → Repairs]
Next steps: Transition to SDN
• Controller uses ETGs to drive forwarding plane configurations
• Minimize controller involvement, churn?
• Different underlying network topology?
[Figure: Configurations → ARC → Controller]
Next steps: synthesis
• Operators require fine-grained control over routing: waypoints, isolation, traffic engineering
– Intents → configurations
• Distributed routing based on shortest path: very difficult to program!
• One approach: input data planes → resilient ARCs → configs
Synthesis
• Edge weights
– Input path to dst must be the shortest path
– Uniqueness of shortest path
• Route filtering
– Disable edges for a destination to ensure path is shortest
• Backup paths
– Weights such that backup path is chosen during link failures
[Figure: example topology with switches S1 to S6 and edge weights]
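The first two edge-weight constraints can be checked for a candidate weight assignment as sketched below. This only verifies a given assignment; it is not a synthesis procedure. The function names and the toy topology are hypothetical, and the path counting assumes strictly positive integer weights.

```python
import heapq

def shortest_path_counts(graph, src):
    """Dijkstra that also counts shortest paths; assumes strictly positive weights."""
    dist = {src: 0}
    count = {src: 1}
    heap = [(0, src)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                count[v] = count[u]
                heapq.heappush(heap, (nd, v))
            elif nd == dist[v]:
                count[v] += count[u]  # another equally short way to reach v
    return dist, count

def is_unique_shortest_path(graph, path):
    """True if `path` is the one and only min-cost path between its endpoints."""
    cost = sum(graph[u][v] for u, v in zip(path, path[1:]))
    dist, count = shortest_path_counts(graph, path[0])
    dst = path[-1]
    return dist.get(dst) == cost and count.get(dst) == 1

# Hypothetical weights: the intended path is S1 -> S2 -> S4.
weights = {"S1": {"S2": 1, "S3": 1}, "S2": {"S4": 1}, "S3": {"S4": 3}}
# Same topology, but the S3 -> S4 weight creates a tie.
tied = {"S1": {"S2": 1, "S3": 1}, "S2": {"S4": 1}, "S3": {"S4": 1}}
```

A tie such as the one in `tied` is what the uniqueness constraint rules out, since the network could then pick either path.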
Summary
• Presented an abstract representation for control planes
– Fast and simple verification under arbitrary failures
– Verification is based on graph-level properties
– Up to 5 orders of magnitude speed-up
• Useful for repair, transition, synthesis, …
Try it! https://bitbucket.org/uw-madison-networking-research/arc
Backup
Evaluation: verification time
• Always blocked using ARC: < 500 ms
• Always blocked using Batfish: > 694 days!
Verification with ARC is 3 to 5 orders of magnitude faster!
Verification
• Always blocked: Src and Dst in same connected component?
[Figure: example ETGs (OSPF) for Src:T, Dst:S and Src:T, Dst:U]
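A minimal sketch of the connected-component test, using union-find and ignoring edge direction. Treating components as undirected is an assumption made here for simplicity; it is a conservative reading, since vertices in separate undirected components cannot be joined by any directed path. The function names and sample edges are hypothetical.

```python
def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def always_blocked(edges, src, dst):
    """True if src and dst fall in separate connected components."""
    parent = {}
    for u, v in edges:
        parent.setdefault(u, u)
        parent.setdefault(v, v)
        root_u, root_v = find(parent, u), find(parent, v)
        if root_u != root_v:
            parent[root_u] = root_v  # merge the two components
    if src not in parent or dst not in parent:
        return True  # an endpoint with no edges cannot be connected
    return find(parent, src) != find(parent, dst)

# Hypothetical ETG edges: T's side and S's side are not joined.
blocked_edges = [("T", "C.O"), ("C.O", "D.I"), ("S", "X.I")]
# Adding an edge between the two sides connects them.
open_edges = blocked_edges + [("D.I", "S")]
```

Because the ETG is sound and complete, a result of "blocked" on the graph carries over to every failure scenario, not just the current one.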