Sidestepping BGP's Limitations. Objective: deliver traffic with the best performance possible. Challenge: BGP does not consider demand, capacity, or performance. Approach: shift control from BGP at routers to a software controller.
Design Priorities. Operational simplicity: minimize change and system complexity. Ease of deployment: interoperate with existing infrastructure and tooling.
Responsibility for Routing (design priorities: operational simplicity, ease of deployment). Traditional routers: route per destination, taken from BGP. Host-based routing: route per packet, dictated by hosts. Edge Fabric's approach: the controller overrides BGP's decisions at the router, and hosts provide hints on packet priority.
Edge Fabric's Approach to Control. (1) The router selects routes using its BGP sessions (in the example, Route A). (2) Edge Fabric selects ideal routes using the router's BGP routes plus additional inputs: advanced policy, prefix traffic rates (e.g. 1 Gbps), circuit capacities (e.g. 40 Gbps), and route performance measurements (in the example, it selects Route B). (3) If the router and Edge Fabric choose different routes, Edge Fabric overrides the router, which then uses Route B.
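The compare-and-override step can be sketched as follows. This is a minimal illustration, not Edge Fabric's implementation: the function name `compute_overrides` and the dictionary representation of route choices are assumptions made for the example.

```python
from typing import Dict


def compute_overrides(router_choice: Dict[str, str],
                      controller_choice: Dict[str, str]) -> Dict[str, str]:
    """Return {prefix: route} for every prefix where the controller's
    preferred route differs from the route the router selected via BGP."""
    overrides = {}
    for prefix, ideal_route in controller_choice.items():
        if router_choice.get(prefix) != ideal_route:
            overrides[prefix] = ideal_route
    return overrides


# Hypothetical state: the router picked Route A for both prefixes,
# while the controller prefers Route B for one of them.
router = {"203.0.113.0/24": "Route A", "198.51.100.0/24": "Route A"}
controller = {"203.0.113.0/24": "Route B", "198.51.100.0/24": "Route A"}

print(compute_overrides(router, controller))
# → {'203.0.113.0/24': 'Route B'}
```

Only the prefixes where the two choices disagree produce an override, so BGP's own decision stays in effect everywhere else.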
Types of Edge Fabric Overrides. Edge Fabric can override BGP's decision in order to: (a) move traffic for a set of end-users, with an override per <destination> (example: 203.0.113.0/24 uses peering before the override and transit after); (b) move a class of end-user traffic, with an override per <destination, traffic class> (example: low-priority traffic uses peering before the override and transit after). See the paper for details.
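The two override granularities can be pictured as one lookup table keyed by destination and, optionally, traffic class. The sketch below is illustrative only: `resolve_path`, the class label `"low_priority"`, and the rule that a class-specific override takes precedence over a per-destination one are assumptions, not details from the talk.

```python
from typing import Dict, Optional, Tuple

# Key: (destination prefix, traffic class); a class of None means "any class".
OverrideKey = Tuple[str, Optional[str]]


def resolve_path(overrides: Dict[OverrideKey, str],
                 bgp_default: Dict[str, str],
                 prefix: str,
                 traffic_class: str) -> str:
    """Pick the egress path for a packet: the most specific override wins,
    otherwise fall back to the route BGP selected for the prefix."""
    for key in ((prefix, traffic_class), (prefix, None)):
        if key in overrides:
            return overrides[key]
    return bgp_default[prefix]


bgp_default = {"203.0.113.0/24": "peering"}
overrides = {("203.0.113.0/24", "low_priority"): "transit"}

print(resolve_path(overrides, bgp_default, "203.0.113.0/24", "low_priority"))
# → transit
print(resolve_path(overrides, bgp_default, "203.0.113.0/24", "high_priority"))
# → peering
```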
Example Override: Preventing Congestion. Before: the router has 12 Gbps of demand toward an ISP, and BGP's decision places all of it on Route A (10 Gbps capacity, 12 Gbps load), while Route B (Tier 1, 100 Gbps capacity) carries 0 Gbps. The demand is composed of two prefixes: 198.51.100.0/24 at 9.5 Gbps and 203.0.113.0/24 at 2.5 Gbps. After: Edge Fabric shifts a prefix's traffic to an alternate link. A destination-based override moves 203.0.113.0/24 to Route B, leaving 9.5 Gbps of load on Route A and adding 2.5 Gbps of load to Route B.
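The arithmetic of this example can be reproduced with a small greedy sketch. The selection rule used here (shift the smallest prefixes first until the load fits) is an assumption chosen to match the slide's outcome; the talk does not state how Edge Fabric picks which prefix to move, and `shift_until_fits` is a hypothetical name.

```python
from typing import Dict, List, Tuple


def shift_until_fits(prefix_rates_gbps: Dict[str, float],
                     capacity_gbps: float) -> Tuple[List[str], float]:
    """Greedily move prefixes off a circuit, smallest rate first,
    until the remaining load is at or below the circuit's capacity.
    Returns the shifted prefixes and the load left on the circuit."""
    load = sum(prefix_rates_gbps.values())
    shifted = []
    for prefix, rate in sorted(prefix_rates_gbps.items(), key=lambda kv: kv[1]):
        if load <= capacity_gbps:
            break
        shifted.append(prefix)
        load -= rate
    return shifted, load


# Numbers from the slide: 12 Gbps of demand on a 10 Gbps circuit.
route_a_demand = {"198.51.100.0/24": 9.5, "203.0.113.0/24": 2.5}
shifted, remaining = shift_until_fits(route_a_demand, capacity_gbps=10.0)

print(shifted, remaining)
# → ['203.0.113.0/24'] 9.5
```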
Enacting Overrides at Routers. (1) Edge Fabric injects the override route via BGP: for 203.0.113.0/24, it selects the transit route and injects it into the edge router. (2) BGP at routers prefers routes from Edge Fabric, so the injected route becomes BGP's selected route for 203.0.113.0/24. Edge Fabric monitors BGP's decisions and overrides them as needed. We gain centralized control over the distributed BGP process without removing BGP from our routers.
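One way a router can be made to prefer an injected route is through BGP's LOCAL_PREF attribute, which the standard decision process compares before other attributes. The slides do not say which mechanism is used, so treat the following as a sketch under that assumption; the `Route` type, the preference values, and the `source` labels are hypothetical, and a real router applies further tie-breakers that are omitted here.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str
    local_pref: int
    source: str  # who announced the route to the router (label is illustrative)


def best_route(candidates: Iterable[Route]) -> Route:
    """Simplified selection: the route with the highest LOCAL_PREF wins.
    Later BGP tie-breakers (AS path length, MED, ...) are not modeled."""
    return max(candidates, key=lambda r: r.local_pref)


learned = Route("203.0.113.0/24", "peer-link", local_pref=100, source="external peer")
injected = Route("203.0.113.0/24", "transit-link", local_pref=1000, source="controller")

print(best_route([learned]).next_hop)
# → peer-link
print(best_route([learned, injected]).next_hop)
# → transit-link
```

Withdrawing the injected route leaves only the learned route, so the router falls back to its own BGP decision.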
Edge Fabric is Flexible. Inputs: BGP routes, policy, circuit capacity and traffic rates, route performance measurements. Override granularities: path per <destination>, path per <destination, traffic class>. Edge Fabric supports sophisticated traffic engineering policies.
Edge Fabric Meets Our Design Priorities. Operational simplicity: can fall back to BGP at routers; allows operators to continue to use existing tools; synchronization is only required between Edge Fabric and routers. Ease of deployment: BGP sessions with external peers remain at routers; uses BGP for injections; uses other industry standards for route and traffic information (BMP/IPFIX/sFlow).
Outline. (1) Overview. (2) Facebook's Connectivity and Challenges. (3) Sidestepping BGP's Limitations with Edge Fabric. (4) Results from Edge Fabric's Behavior in Production. (5) Evolution and Related Work.
Edge Fabric entered production in 2013. Objective: prevent circuit congestion.
Edge Fabric in Production. Edge routers export BGP routes to Edge Fabric via BMP and traffic rates via IPFIX/sFlow; Edge Fabric sends its overrides back to the routers via BGP. Edge Fabric runs per PoP and executes every 30 seconds. It controls 100% of Facebook's egress traffic. See the paper for implementation details.
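The periodic cycle described on this slide can be sketched as a simple loop. Everything here other than the 30-second period is an assumption: `collect`, `decide`, and `inject` are hypothetical callables standing in for the BMP/IPFIX/sFlow collection, the route selection, and the BGP injection, and the loop is bounded by `cycles` only so the example terminates.

```python
import time
from typing import Any, Callable, Tuple


def run_controller(collect: Callable[[], Tuple[Any, Any]],
                   decide: Callable[[Any, Any], Any],
                   inject: Callable[[Any], None],
                   cycles: int,
                   period_s: float = 30.0,
                   sleep: Callable[[float], None] = time.sleep) -> None:
    """Run a fixed number of control cycles, one per period."""
    for _ in range(cycles):
        started = time.monotonic()
        routes, rates = collect()        # snapshot of routes and traffic rates
        inject(decide(routes, rates))    # recompute overrides and push them
        elapsed = time.monotonic() - started
        sleep(max(0.0, period_s - elapsed))  # wait out the rest of the period


# Usage with stand-in callables and no real waiting:
injected_batches = []
run_controller(
    collect=lambda: ({"203.0.113.0/24": "Route A"}, {"203.0.113.0/24": 2.5}),
    decide=lambda routes, rates: {"203.0.113.0/24": "Route B"},
    inject=injected_batches.append,
    cycles=2,
    sleep=lambda seconds: None,
)
print(len(injected_batches))
# → 2
```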
Target Circuit Utilization To Avoid Congestion. How much traffic should Edge Fabric remove? Circuit utilization levels: 110% is the utilization if all traffic were placed onto its most preferred path; 100% leads to packet loss during bursts; ~95% gives high utilization with tolerance for bursts in traffic; 50% is poor utilization.
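The question "how much traffic should be removed" reduces to a subtraction once a target utilization is fixed. The sketch below uses the slide's ~95% target; the function name and the 10 Gbps circuit in the usage are assumptions for illustration.

```python
def traffic_to_remove(demand_gbps: float,
                      capacity_gbps: float,
                      target_utilization: float = 0.95) -> float:
    """Traffic (in Gbps) that must be shifted elsewhere so that the circuit's
    load does not exceed target_utilization * capacity. Never negative."""
    target_load = target_utilization * capacity_gbps
    return max(0.0, demand_gbps - target_load)


# A hypothetical 10 Gbps circuit whose preferred-path demand is 110% of capacity:
excess = traffic_to_remove(demand_gbps=11.0, capacity_gbps=10.0)
print(f"{excess:.2f} Gbps to shift")
```

With these numbers the target load is 9.5 Gbps, so about 1.5 Gbps (15% of capacity) would need to move to other circuits.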
Evaluating Congestion Avoidance. Key questions: Does Edge Fabric prevent circuit congestion and packet drops? Does Edge Fabric keep circuit utilization at the prescribed threshold?
Evaluating Congestion Avoidance: does Edge Fabric prevent circuit congestion and packet drops? During the measurement period: when Edge Fabric was shifting traffic away, there were no packet drops 99.9% of the time; when Edge Fabric was not active, there were no packet drops. Edge Fabric intervened when needed and prevented circuit congestion.
Evaluating Congestion Avoidance: can we keep utilization at the threshold? Figure: histogram of [circuit utilization - threshold], sampled every 30 seconds for circuits where demand > capacity. X-axis: circuit utilization - threshold, from -4% to 4% (negative means utilization lower than threshold, positive means higher; the ideal value is 0%). Y-axis: % of samples (0 to 30). Plot annotation: within 2% of the threshold.
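The metric plotted in this figure is straightforward to compute from utilization samples. The following sketch shows the calculation only; the sample values are made up for illustration and are not data from the talk, and the function names are hypothetical.

```python
from typing import List


def deviations(utilization_pct: List[float], threshold_pct: float) -> List[float]:
    """Per-sample difference between circuit utilization and the threshold,
    in percentage points (negative: below threshold, positive: above)."""
    return [u - threshold_pct for u in utilization_pct]


def fraction_within(devs: List[float], tolerance_pct: float) -> float:
    """Share of samples whose deviation is within +/- tolerance_pct."""
    if not devs:
        return 0.0
    return sum(1 for d in devs if abs(d) <= tolerance_pct) / len(devs)


# Hypothetical 30-second samples for one circuit, with a 95% threshold.
samples = [94.0, 95.5, 96.0, 92.5, 98.0]
devs = deviations(samples, threshold_pct=95.0)

print(devs)
# → [-1.0, 0.5, 1.0, -2.5, 3.0]
print(fraction_within(devs, tolerance_pct=2.0))
# → 0.6
```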
Does Edge Fabric prevent circuit congestion and packet drops? Yes. Does Edge Fabric keep circuit utilization at the prescribed threshold? Yes. Edge Fabric prevents packet loss while keeping circuit utilization high.
Evolution: Enacting Decisions. Initially, host-based routing: overrides are enacted by hosts, which signal the egress path per packet via MPLS/DSCP/GRE (e.g. "send via circuit X"). Today, edge-based routing: overrides are enacted by routers at the edge, and hosts signal priority per packet via DSCP (e.g. "video traffic"). Both provide the capabilities we want today: preventing congestion, incorporating advanced policy, and application-specific and performance-aware routing.
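Per-packet priority signaling via DSCP can be illustrated from the host side with the standard `IP_TOS` socket option. This is a sketch of the general technique, not of Facebook's host stack: the choice of AF41 (code point 34) for video traffic is an assumption, and whether the marking is honored depends on the operating system and network.

```python
import socket

AF41 = 34  # a standard DSCP code point; using it for "video traffic" is an assumption


def dscp_to_tos(dscp: int) -> int:
    """DSCP occupies the upper six bits of the IPv4 TOS byte,
    so the value is shifted left past the two ECN bits."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2


def mark_socket(sock: socket.socket, dscp: int) -> None:
    """Ask the OS to set the given DSCP on packets sent through this IPv4 socket."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        mark_socket(s, AF41)
        print(f"TOS byte requested: {dscp_to_tos(AF41)}")
```

In the edge-based design described above, routers would map such a marking to the traffic class used in a per <destination, traffic class> override.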