
Fast Control Plane Analysis Using an Abstract Representation
Aditya Akella, Aaron Gember-Jacobson, Raajay Viswanathan, and Ratul Mahajan (UW-Madison and Microsoft)


  1. Fast Control Plane Analysis Using an Abstract Representation. Aditya Akella, Aaron Gember-Jacobson, Raajay Viswanathan, and Ratul Mahajan. UW-Madison and Microsoft.

  2. Control plane is…
     • Essential → configuration errors may cause security/availability problems
     • Complex → errors may not be immediately apparent
     [Figure: routing processes and routing tables (control plane) feeding forwarding tables (data plane)]

  3. Important functional invariants
     • Always blocked
     • Always isolated
     • Always equivalent paths
     • Always traverse middlebox
     Challenge: invariants are violated under some failures or combinations of failures

  4. Proactive verification
     • Analyze current data plane [HSA, Veriflow] → cannot verify that invariants always hold
     • Generate data planes [Batfish] → time consuming
     • Blocked, isolated, waypoints, equivalence, … are properties of paths, not paths themselves
     • Data centers and enterprises use a limited set of control plane constructs
     • Callouts on the slide: higher-level abstraction; fast analysis
     [Figure: many candidate forwarding tables (Table’, Table’’, Table’’’) generated for verification]

  5. Abstract Representation for Control planes (ARC)
     • Encodes the network’s forwarding behavior under all possible infrastructure faults
     • Proactive verification boils down to checking simple graph-level properties → fast
     • Ignores which protocols are used and how
     [Figure: control plane configuration → abstract representation, shown as weighted digraphs for Src:U/Dst:T and Src:T/Dst:U]

  6. Key requirements of ARC
     1) Sound & Complete: each digraph contains every feasible path and no infeasible paths → verification of invariants
     2) Precise: assign edge weights such that the min-cost path matches the real path → counter-examples, equivalence testing
     [Figure: OSPF network with routers B, C, D and the corresponding weighted digraphs for Src:U/Dst:T and Src:T/Dst:U]

  7. Open questions
     • Why weighted digraphs?
     • How to ensure soundness, completeness, precision?

  8. Routing protocols used today
     • OSPF: Dijkstra’s algorithm (AD=110)
     • BGP: path length & preference (AD=20)
     • Commonality: cost-based path selection algorithm
     • Differences: granularity & currency
     • Also must account for:
       – Traffic class specificity
       – Route redistribution
       – Route selection based on administrative distance
     Challenge: determining the structure and edge weights of the graphs
     [Figure: OSPF between Router 1 and Router 2; BGP between AS 1 and AS 2]

  9. Extended topology graphs (ETGs)
     • One per traffic class
     • Vertices: routing processes
     • Edges: flow of data enabled by exchange of routing information
     • Edge weights based on configured costs and administrative distances
     • Sound and complete (for OSPF, BGP, redistribution, …)
     [Figure: example ETG from SRC:S to DST:T, with OSPF processes X.3, Y.3, Z.3 and BGP processes A.1, B.1, Z.1, each split into I and O vertices]

  10. ETG edge weights
      • Inter-device: OSPF weights; unit cost per hop for BGP (each router is an AS)
      • Intra-device (redistribution only): no cost within a process; fixed cost between processes
      • Callout on the slide: Precise (for DAG redistribution, + scaling AD graphs)
      [Figure: example ETG from SRC:S to DST:T with OSPF processes X.3, Y.3, Z.3 and BGP processes A.1, B.1, Z.1; annotations: shortest path = 0.5, longest path = 5, gap = 1, shortest path = 1]
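
As a rough illustration of the precision requirement (the min-cost path in the weighted digraph should match the real path), the sketch below runs Dijkstra's algorithm over a small hand-built digraph. The vertex names (`SRC`, `A.O`, `B.I`, …), the weights, and the `min_cost_path` helper are all hypothetical and not taken from the slides or the ARC code base; intra-device edges are given weight 0 and inter-device edges carry OSPF-style costs.

```python
import heapq


def min_cost_path(graph, src, dst):
    """Dijkstra over a dict-of-dicts weighted digraph.

    Returns (cost, path) for the cheapest src -> dst path, or None if
    dst is unreachable.
    """
    frontier = [(0, src, [src])]
    settled = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == dst:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for nxt, weight in graph.get(node, {}).items():
            if nxt not in settled:
                heapq.heappush(frontier, (cost + weight, nxt, path + [nxt]))
    return None


# Hypothetical ETG: each router process is split into an in (I) and out (O)
# vertex. Inter-device edges carry OSPF-style costs; intra-device edges are 0.
etg = {
    "SRC": {"A.O": 0},
    "A.O": {"B.I": 1, "C.I": 3},   # A -> B costs 1, A -> C costs 3
    "B.I": {"B.O": 0},             # intra-device edge, no cost
    "B.O": {"C.I": 1},             # B -> C costs 1
    "C.I": {"DST": 0},
}

print(min_cost_path(etg, "SRC", "DST"))
```

In this toy graph the two-hop route through B (total cost 2) beats the direct A → C edge (cost 3), which is the kind of result precision is meant to guarantee matches the path the real network would choose.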

  11. ARC properties (construct: sound & complete; precise)
      • OSPF: ✓; precise for a single area
      • RIP: ✓; ✓
      • eBGP: ✓; precise when selecting by AS path length, local pref.
      • Static routes: ✓; ✓
      • ACLs: ✓; ✓
      • Route filters: ✓; ✓
      • Route selection (based on administrative distance): ✓; precise with no redistribution OR redistribution costs congruent with ADs
      • Route redistribution: ✓; precise when acyclic & costs congruent with ADs
      Sound and complete for 100%; precise for 96%

  12. Verification
      • Always traverse middlebox:
        1) Remove all edges with middleboxes
        2) Check whether Src and Dst are still in the same connected component; the invariant holds only if they are not
      [Figure: OSPF network with routers B, C, D, E, F, G between SRC:U and DST:S, and its ETG]
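
The two-step middlebox check can be sketched as a reachability test on an edge list. This is a minimal illustration, not the ARC implementation: the function names and the example edges are hypothetical, and it assumes that "same connected component" on a directed ETG amounts to asking whether Dst is still reachable from Src.

```python
from collections import deque


def reachable(edges, src, dst):
    """Breadth-first search: is dst reachable from src over directed edges?"""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
    seen = {src}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def always_traverses_middlebox(edges, middlebox_edges, src, dst):
    """Step 1: drop middlebox edges. Step 2: invariant holds if dst is cut off."""
    remaining = [e for e in edges if e not in middlebox_edges]
    return not reachable(remaining, src, dst)


# Hypothetical ETG edges; the B -> C hop is the one that crosses a middlebox.
middlebox_edges = {("B", "C")}
guarded = [("U", "B"), ("B", "C"), ("C", "S")]
bypassed = guarded + [("U", "E"), ("E", "S")]  # E offers a middlebox-free detour

print(always_traverses_middlebox(guarded, middlebox_edges, "U", "S"))
print(always_traverses_middlebox(bypassed, middlebox_edges, "U", "S"))
```

The first call reports that the invariant holds, because removing the middlebox edge disconnects U from S; the second reports a violation, because the detour through E survives.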

  13. Verification
      • Always reachable with < k link failures: is max-flow on the unit-weight ETG ≥ k?
      • Example: 3 edge-disjoint paths → max-flow = 3
      [Figure: the same OSPF network and ETG between SRC:U and DST:S, with edges labeled 1 and ∞]
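
The max-flow test can be sketched with a small augmenting-path (Edmonds-Karp style) routine. This is an illustrative simplification rather than the ARC code: every edge gets capacity 1 (the ∞ labels in the slide's figure are not modeled), and the function names and example edges are hypothetical.

```python
from collections import deque


def max_flow_unit(edges, src, dst):
    """Max flow from src to dst when every directed edge has capacity 1."""
    cap = {}
    adj = {}
    for u, v in edges:
        cap[(u, v)] = cap.get((u, v), 0) + 1
        cap.setdefault((v, u), 0)          # residual (reverse) edge
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {src: None}
        queue = deque([src])
        while queue and dst not in parent:
            node = queue.popleft()
            for nxt in adj.get(node, ()):
                if nxt not in parent and cap[(node, nxt)] > 0:
                    parent[nxt] = node
                    queue.append(nxt)
        if dst not in parent:
            return flow
        # Push one unit of flow back along the path found.
        node = dst
        while parent[node] is not None:
            prev = parent[node]
            cap[(prev, node)] -= 1
            cap[(node, prev)] += 1
            node = prev
        flow += 1


def always_reachable(edges, src, dst, k):
    """The slide's test: reachable with < k link failures if max-flow >= k."""
    return max_flow_unit(edges, src, dst) >= k


# Hypothetical ETG with three edge-disjoint paths from U to S.
three_paths = [("U", "B"), ("B", "S"), ("U", "C"), ("C", "S"), ("U", "D"), ("D", "S")]
# Hypothetical ETG where every path shares the single D -> S edge.
bottleneck = [("U", "B"), ("U", "C"), ("B", "D"), ("C", "D"), ("D", "S")]

print(max_flow_unit(three_paths, "U", "S"), always_reachable(three_paths, "U", "S", 3))
print(max_flow_unit(bottleneck, "U", "S"), always_reachable(bottleneck, "U", "S", 2))
```

With unit capacities, the max-flow value equals the number of edge-disjoint paths, which is why it serves as the failure-tolerance bound.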

  14. Verification (invariant: graph property; required ARC properties)
      • Always blocked: separate connected components; Sound & Complete
      • Always reachable with < k failures: max flow ≥ k; Sound & Complete
      • Always traverse waypoint (chain): separate connected components; Sound & Complete
      • Always isolated: no common edges; Sound & Complete
      • Equivalence: same structure & weights; Sound, Complete, & Precise
      Precision is required to produce counter-examples
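
The last two rows reduce to set and dictionary comparisons once each ETG is held as a mapping from directed edge to weight. The sketch below assumes that representation; the function names and example graphs are hypothetical, and comparing all edges (rather than only those on some source-to-destination path) is a simplification.

```python
def always_isolated(etg_a, etg_b):
    """Two traffic classes are isolated if their ETGs share no edge."""
    return not (set(etg_a) & set(etg_b))


def equivalent(etg_a, etg_b):
    """Equivalence: same structure (edge set) and same weight on every edge."""
    return etg_a == etg_b


# Hypothetical ETGs, each a dict mapping (from, to) -> weight.
class_1 = {("U", "B"): 1, ("B", "S"): 1}
class_2 = {("T", "C"): 1, ("C", "S"): 3}
class_3 = {("U", "B"): 1, ("B", "S"): 2}   # same structure as class_1, new weight

print(always_isolated(class_1, class_2))   # no shared edges
print(always_isolated(class_1, class_3))   # shares both edges
print(equivalent(class_1, class_3))        # weights differ
```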

  15. Implementation and evaluation
      • Implemented in Java using Batfish (parsing) and JGraphT (graph algorithms): https://bitbucket.org/uw-madison-networking-research/arc
      • Configurations from 314 data center networks operated by a large online service provider
      • Test machine: 4-core 2.8 GHz CPU, 24 GB RAM

  16. Evaluation: time to generate ARC
      • Most time is spent parsing
      • Fast (< 10 sec) even for large networks

  17. Evaluation: verification time
      • Always blocked: < 500 ms (Batfish: 694 days!)
      • Always reachable with < k failures: < 1 sec
      • Always isolated: up to 16 min
      Verification time is proportional to the number of traffic classes; easily parallelized

  18. Next steps
      • Precision under fewer assumptions
      • Generality of ARCs
      • Other uses…

  19. Next steps: automated repair
      1) Transform ETGs to have desired attributes (e.g., src and dst → always blocked)
      2) Translate to config changes (e.g., remove edge → add ACL)
      Challenge: finding a minimal repair (e.g., many ACLs vs. remove BGP neighbor) without side-effects
      [Figure: Configurations → ARC → Repairs]

  20. Next steps: transition to SDN
      • Controller uses ETGs to drive forwarding plane configurations
      • Minimize controller involvement, churn?
      • Different underlying network topology?
      [Figure: Configurations → ARC → Controller]

  21. Next steps: synthesis
      • Operators require fine-grained control over routing: waypoints, isolation, traffic engineering
        – Intents → configurations
      • Distributed routing based on shortest path is very difficult to program!
      • One approach: input data planes → resilient ARCs → configs

  22. Synthesis
      • Edge weights
        – Input path to dst must be the shortest path
        – Uniqueness of shortest path
      • Route filtering
        – Disable edges for a destination to ensure path is shortest
      • Backup paths
        – Weights such that backup path is chosen during link failures
      [Figure: topology of switches S1 to S6 with edge weights 1 and 3]

  23. Summary
      • Presented an abstract representation for control planes
        – Fast and simple verification under arbitrary failures
        – Verification is based on graph-level properties
        – Up to 5 orders of magnitude speed-up
      • Useful for repair, transition, synthesis, …
      Try it! https://bitbucket.org/uw-madison-networking-research/arc

  24. Backup

  25. Evaluation: verification time
      • Always blocked using ARC: < 500 ms
      • Always blocked using Batfish: > 694 days!
      Verification with ARC is 3 to 5 orders of magnitude faster!

  26. Verification
      • Always blocked: are Src and Dst in the same connected component? The invariant holds only if they are not.
      [Figure: OSPF network with routers B, C, D and the ETGs for SRC:T → DST:S and SRC:T → DST:U]
