Networking (Containers) in Ultra-Low-Latency Environments
Avi Deitcher avi@atomicinc.com
אכסניא
Akh-san-ya \?aksnaja?\ n. (ancient Aramaic, from Ancient Greek xénos) 1: Hospitality, lodging; 2: Host.
פותחים בכבוד אכסניא ("We open by honouring the host"): Ancient Jewish custom to begin public speaking by honouring or thanking the hosts.
Who Am I? (not 24601)
• Life in tech business:
– 10 yrs financial services IT
– 10+ yrs consulting & training
– Some startups on the way
• Avid (if not very good) ice hockey player
• Long-time lover of great engineering… when used to make a real difference
• Atomic Inc:
– Consulting
– Training
A Little History
Summer 2015
• Fintech X: “Help us containerize!”
– Hint: It is harder than you think… and worth it
– Culture/process > technology
• Question: Networking?
• Answer: Scientific method
Summer 2016
• Good practice demands:
1. Redo tests with new options and versions
2. Make tests available
3. Explain it all well
What Is “Ultra-Low” Latency?
“every 100ms of delay costs 1% of sales” [1]
“extra 0.5s in search page generation time dropped traffic by 20%” [2]
Not. Even. Close.
1. http://home.blarg.net/%7Eglinden/StanfordDataMining.2006-11-29.ppt
2. http://glinden.blogspot.com/2006/11/marissa-mayer-at-web-20.html
Ultra-Low Latency
38 messages in 7 milliseconds
1 message (avg) every 184 μ-sec!
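The figure above is simple division, shown here as a quick check. The function name `mean_interval_us` is made up for this illustration and is not part of the talk or its test suite.

```python
def mean_interval_us(messages: int, window_ms: float) -> float:
    """Average time between messages, in microseconds."""
    return window_ms * 1000.0 / messages


# 38 messages spread over a 7 ms window
interval = mean_interval_us(38, 7.0)
print(f"{interval:.1f} microseconds per message")  # prints "184.2 microseconds per message"
```

For scale, the 100 ms delay in the e-commerce quote is more than 500 of these intervals.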
Networking Workloads
• Networked Workloads: “things that do work and must talk”
• Same principles for all workloads:
– VMs
– Cloud
– Serverless
– Containers
Two Types of Networking…
• Direct
• Fabric+Overlay
… maybe four
• Workload Awareness
• Fabric Awareness
Networking Options
Direct: Metal, macvlan, Bridge/vSwitch (no NAT), net=host, SR-IOV
Overlay: Flannel, Weave, Docker Overlay, Calico (IPIP)
Workload Awareness: Docker bridge (NAT)
Fabric Awareness: Calico (Native)
Our Tests
What We Tested
• netperf ⇒ netserver
• UDP & TCP request/response (netperf RR tests)
• Sizes: 300, 500, 1024, 2048
• No orchestration = complete control
• 50000 iterations
– Law of large numbers
• Latency (Avg, %iles), CPU
How We Tested
• Packet.net
– Because it had to be metal
– Wicked smart team
• Complete test run
– Network changes
– Hardware variations, errors
• Differentials, not absolutes
https://github.com/deitch/network-tests
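To make the measurement concrete, here is a minimal sketch of a request/response latency test in Python. It is not netperf and not the code in the repository above: it only mimics the shape of the test (send one message, wait for the echo, repeat many times, report average and percentiles) over UDP on loopback. The names `echo_server`, `measure_rtt_us` and `summarize` are invented for this illustration, and the iteration count is reduced to keep the run short.

```python
import socket
import statistics
import threading
import time


def echo_server(sock: socket.socket, stop: threading.Event) -> None:
    """Echo every datagram back to its sender until asked to stop."""
    sock.settimeout(0.1)
    while not stop.is_set():
        try:
            data, addr = sock.recvfrom(65535)
        except socket.timeout:
            continue
        sock.sendto(data, addr)


def measure_rtt_us(size: int, iterations: int) -> list[float]:
    """Send `iterations` datagrams of `size` bytes; return each round trip in microseconds."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # loopback only; the OS picks a free port
    stop = threading.Event()
    worker = threading.Thread(target=echo_server, args=(server, stop), daemon=True)
    worker.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(2.0)  # fail instead of hanging if a datagram is lost
    client.connect(server.getsockname())
    payload = b"x" * size
    samples = []
    try:
        for _ in range(iterations):
            start = time.perf_counter_ns()
            client.send(payload)
            client.recv(65535)
            samples.append((time.perf_counter_ns() - start) / 1000.0)
    finally:
        stop.set()
        worker.join()
        client.close()
        server.close()
    return samples


def summarize(samples: list[float]) -> dict[str, float]:
    """Mean plus the 50th, 90th and 99th percentiles."""
    cuts = statistics.quantiles(samples, n=100)  # 99 cut points
    return {"avg": statistics.fmean(samples), "p50": cuts[49], "p90": cuts[89], "p99": cuts[98]}


if __name__ == "__main__":
    for size in (300, 500, 1024, 2048):
        print(size, summarize(measure_rtt_us(size, 1000)))
```

Interpreter overhead dominates the absolute numbers here, which is one more reason to compare differentials between configurations rather than read any single figure as meaningful.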
Local vs. Remote
Local Networking Summary
• SR-IOV horrible latency but great CPU
– Hold that thought…
• net=host on par with metal
• macvlan closest virtualized to metal
• Rest in same range:
– Latency: 5-10 μ-sec overhead
– CPU: negligible difference
• Calico (IPIP & native) & Docker overlay slightly more performant
• Watch out for very large TCP packets
Remote Networking Summary
• Weave (sleeve) adds latency and CPU
– Reason for “fast datapath”
• Again, macvlan best virtualized
• All the rest:
– Latency: within 50 μ-sec of each other, except SR-IOV with very large TCP packets
– CPU: similar, but keep an eye on Flannel (UDP)
About that SR-IOV
Type 1: Intel I350 1Gbps
Type 3: Mellanox MT27500 ConnectX-3 10Gbps
SR-IOV
SR-IOV does not automatically mean better
• Switch in network card
• Trades host CPU for card processor
• Quality varies dramatically
– Even Mellanox far worse locally
• My 2¢: SR-IOV falls further behind due to:
– Speed of iteration
– Open-source
– Software + CPU
Headaches (and Thanks)
• Headaches
– Weave SYN-(nothing)
– etcd is “touchy”
– Packet L3 network is powerful but… unique
  • Macvlan, weave, flannel: all required pings for MAC
  • Setting up bridge w/o NAT, Calico, macvlan was “different”
– SR-IOV is complicated and flaky, especially Mellanox
– netperf with UDP packets can get stuck (Calico-ipip)
– And a whole lot more (ask me offline)
• And thanks:
– Bryan Boreham, Adam Harrison at weave.works
– Zac Smith, Adam, Aaron, Andy, Lucas, everyone at Packet
What else could we do?
• Other hardware types
• Other network fabrics
• Docker macvlan network driver (experimental)
• Ipvlan
• Other packet sizes
• Kernel and network stack tuning
• Distant (and VPN) networks
• Other traffic patterns
• Other host-to-host encryption
• A whole lot more…
Conclusions
• SR-IOV: most of the time, just not worth it
• Performance:
– Metal (+ net=host): always performs best
– Direct network++: macvlan is your friend
– Others: Roughly similar, careful of Weave (sleeve)
• What’s your use case?
– ULL: Metal/net=host > macvlan > calico > overlay
– Everything else: Focus on your architecture and skills
Pick intelligently: easier, not simple
Questions and help: @avideitcher avi@atomicinc.com