Networking Updates Roopa Prabhu Aug 14, 2020
Linux Kernel dataplane for an open-standards-based multihoming protocol
Traditional Multihoming
[Figure: switch1 and switch2 connected by a peerlink, with Host1 and Host2 attached]
Open Multihoming solution with VxLAN Overlay
● E-VPN multihoming: BGP-based E-VPN multihoming control plane [1]
  ○ Connect your servers to a redundant pair of switches running an open, BGP-based multihoming protocol
  ○ Peer switches are connected over a VxLAN overlay (peer switches are VxLAN tunnel endpoints, or VTEPs)
[Figure: switch1, switch2, and switch3 connected over VxLAN overlays, with host1 and host2 attached]
Linux Kernel Dataplane forwarding enhancements to support E-VPN multihoming
● VxLAN FDB ECMP nexthop groups support [2]
  ○ Ability to ECMP to multiple E-VPN peered VTEPs

VxLAN FDB entry:
    # bridge fdb show | grep vni1000
    02:02:00:00:00:13 dev vni1000 nhid 102 self permanent

Nexthop group entry:
    # ip nexthop ls
    id 12 via 172.16.1.2 scope link fdb
    id 13 via 172.16.1.3 scope link fdb
    id 102 group 12/13 fdb
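To make the lookup chain on this slide concrete, here is a minimal Python sketch that mirrors it: the FDB entry points at nexthop group 102, and the group resolves to one of the member VTEPs per flow. The table names, the `select_vtep` helper, and the use of CRC32 as a flow hash are assumptions made for illustration; the kernel uses its own flow hash and data structures.

```python
import zlib

# Nexthop objects, mirroring the `ip nexthop ls` output on the slide.
NEXTHOPS = {12: "172.16.1.2", 13: "172.16.1.3"}
# Nexthop group 102 has two members, 12 and 13.
GROUPS = {102: [12, 13]}
# VxLAN FDB: (device, destination MAC) -> nexthop group id.
FDB = {("vni1000", "02:02:00:00:00:13"): 102}


def select_vtep(dev, dst_mac, flow_key):
    """Return the remote VTEP IP chosen for one flow (illustrative only)."""
    nhid = FDB[(dev, dst_mac)]
    members = GROUPS[nhid]
    # Hypothetical hash: CRC32 of the flow key, not the kernel's flow hash.
    index = zlib.crc32(flow_key) % len(members)
    return NEXTHOPS[members[index]]


if __name__ == "__main__":
    for i in range(4):
        key = ("flow%d" % i).encode()
        print(key.decode(), "->", select_vtep("vni1000", "02:02:00:00:00:13", key))
```

Packets of one flow always hash to the same VTEP, while different flows spread across both peers of the multihomed pair.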
Encoding local vs peer ownership in neigh entries
● The kernel FDB and neighbor database are central to a multihoming protocol
● Keeping them in sync across multihoming peers for faster convergence is key
● Kernel API enhancements for accuracy and better convergence amidst MAC moves in these systems (with requests from the FRR team):
  ● Bridge notify: to indicate that a MAC has become active locally because the kernel dataplane saw a packet on a local host port [3]
  ● Neighbor entry enhancements to indicate local reachability and multihoming-peer reachability: new flag (pending upstream)
[Figure: E-VPN control plane (FRR) interacting with the kernel neighbour table and the Linux bridge FDB, steps labeled 1 and 2]
Miscellaneous updates
Protodown and protodown reason
● protodown is a per-netdevice flag today that enables control plane protocols to hold an interface carrier down
● Multiple users:
  ○ Multihoming protocols
  ○ VRRP
  ○ Port security violation
  ○ Flaky link: auto-detect and keep the link down
● New protodown-reason support upstream (iproute2 changes pending)

    $ cat /etc/iproute2/protodown_reasons.d/r.conf
    0 mlag
    1 evpn
    2 vrrp
    3 psecurity
    $ ip link set dev vxlan0 protodown on protodown_reason vrrp on
    $ ip link set dev vxlan0 protodown_reason mlag on
    $ ip link show
    14: vxlan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
        link/ether f6:06:be:17:91:e7 brd ff:ff:ff:ff:ff:ff protodown on <mlag,vrrp>
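The reason names in the example map to bit positions, so several protocols can hold the same link down at once. The Python sketch below models that bookkeeping under stated assumptions: the `Link` class and its method names are hypothetical, the reason list is printed in bit order, and refusing to clear protodown while any reason bit is still set is my reading of the kernel behaviour rather than something the slide states.

```python
CONF = """\
0 mlag
1 evpn
2 vrrp
3 psecurity
"""


def parse_reasons(conf_text):
    """Parse 'bit name' lines (as in protodown_reasons.d/r.conf) into name -> bit."""
    reasons = {}
    for line in conf_text.splitlines():
        if not line.strip():
            continue
        bit, name = line.split()
        reasons[name] = int(bit)
    return reasons


class Link:
    """Hypothetical model of one netdevice's protodown state."""

    def __init__(self, reasons):
        self.reasons = reasons
        self.protodown = False
        self.reason_mask = 0

    def set_reason(self, name, on):
        bit = 1 << self.reasons[name]
        if on:
            self.reason_mask |= bit
        else:
            self.reason_mask &= ~bit

    def set_protodown(self, on):
        # Assumption: protodown cannot be cleared while a reason is active.
        if not on and self.reason_mask:
            raise RuntimeError("cannot clear protodown, active reasons")
        self.protodown = on

    def show(self):
        active = [name for name, bit in sorted(self.reasons.items(), key=lambda kv: kv[1])
                  if self.reason_mask & (1 << bit)]
        state = "on" if self.protodown else "off"
        return "protodown %s <%s>" % (state, ",".join(active))


if __name__ == "__main__":
    link = Link(parse_reasons(CONF))
    link.set_protodown(True)
    link.set_reason("vrrp", True)
    link.set_reason("mlag", True)
    print(link.show())  # prints: protodown on <mlag,vrrp>
```

In this model each protocol clears only its own bit, and the link can come back up only once the last owner has released it.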
NAT offload to switch ASIC
• Move the NAT function from the host to the ASICs on the switch: NAT at higher speeds and scale
• Linux kernel NAT offload to switch ASIC: iptables/nftables/conntrack or TC conntrack

Dynamic NAT offload:
1 iptables dynamic NAT entry
2 Trap first packet matching the NAT rule to the CPU
3 On conntrack entry learn, offload the conntrack entry to HW (offload via netlink or in-kernel offload API)
[Figure: Linux kernel (iptables NAT, conntrack) above the switch ASIC, with steps 1 to 3 marked]
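The three steps above can be sketched as a small Python simulation. Everything here is a toy model under stated assumptions: the `SwitchAsic` and `Kernel` classes, the `forward` helper, and the source-IP-only rule match are hypothetical and stand in for iptables, conntrack, and the real offload path.

```python
class SwitchAsic:
    """Toy hardware NAT table: flow -> translated source address."""

    def __init__(self):
        self.nat_table = {}

    def lookup(self, flow):
        return self.nat_table.get(flow)

    def install(self, flow, translated_src):
        self.nat_table[flow] = translated_src


class Kernel:
    """Toy slow path: NAT rules plus a conntrack table."""

    def __init__(self, asic, nat_rules):
        self.asic = asic
        self.nat_rules = nat_rules  # step 1: configured dynamic NAT rules
        self.conntrack = {}

    def slow_path(self, flow):
        src, _dst = flow
        translated_src = self.nat_rules.get(src)
        if translated_src is None:
            return None  # no NAT rule matches this packet
        self.conntrack[flow] = translated_src      # conntrack entry learned
        self.asic.install(flow, translated_src)    # step 3: offload to HW
        return translated_src


def forward(asic, kernel, flow):
    """Return (path, translated_src) for one packet of `flow`."""
    hit = asic.lookup(flow)
    if hit is not None:
        return ("hw", hit)
    # Step 2: a miss in hardware traps the packet to the CPU.
    return ("cpu", kernel.slow_path(flow))


if __name__ == "__main__":
    asic = SwitchAsic()
    kernel = Kernel(asic, {"10.0.0.5": "203.0.113.1"})
    flow = ("10.0.0.5", "198.51.100.7")
    print(forward(asic, kernel, flow))  # first packet goes through the CPU
    print(forward(asic, kernel, flow))  # later packets hit the ASIC table
```

Only the first packet of a flow pays the CPU cost; once the entry is installed, translation happens in the ASIC.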
References
[1] E-VPN Multihoming: https://tools.ietf.org/html/rfc7432#section-8
[2] VxLAN FDB nexthop groups: https://patchwork.ozlabs.org/project/netdev/cover/1590125177-39176-1-git-send-email-roopa@cumulusnetworks.com/
[3] Bridge notify flag: https://patchwork.ozlabs.org/project/netdev/cover/20200623204718.1057508-1-nikolay@cumulusnetworks.com/
[4] Protodown reason: https://patchwork.ozlabs.org/project/netdev/patch/1596242041-14347-1-git-send-email-roopa@cumulusnetworks.com/
[5] NAT offload on Cumulus Linux: https://docs.cumulusnetworks.com/cumulus-linux-41/Layer-3/Network-Address-Translation-NAT/