3. It is easy to navigate large deployments by looking at neighborhoods.

Even small deployments can have complex connectivity.
Neighborhood: all the services up to N hops from the selection.
1. Search
2. Detected anomalies
3. Alerts (Slack/PagerDuty etc.)
4. Connection visibility can point to failure domains: version, instance, zone.

Got an alert / anomaly. Now what? Common causes:
• New version deploy
• Overloaded / borked instance
• Geo / zone failure
Also helpful to know if the failure is concentrated on a single:
• Container spec
• Process
• Port
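As a sketch of "concentrated on a single X": given per-connection error records, count errors per value of each candidate dimension and look for a dimension where one value dominates. The record fields and data below are hypothetical, for illustration only.

```python
from collections import Counter

# Hypothetical per-connection error records; field names are illustrative,
# not taken from any specific tool.
errors = [
    {"version": "v2.12a", "zone": "us-west-1a", "instance": "i-03"},
    {"version": "v2.12a", "zone": "us-west-1c", "instance": "i-07"},
    {"version": "v2.12a", "zone": "us-west-1a", "instance": "i-03"},
    {"version": "v1.16",  "zone": "us-west-1a", "instance": "i-11"},
]

def concentration(records, dimension):
    """Most common value of one dimension and the fraction of errors on it."""
    counts = Counter(r[dimension] for r in records)
    top_value, top_count = counts.most_common(1)[0]
    return top_value, top_count / len(records)

for dim in ("version", "zone", "instance"):
    value, frac = concentration(errors, dim)
    print(f"{dim:8s} {value:10s} {frac:.0%}")
```

A dimension where the top value carries most of the errors (here, version v2.12a) is a good first triage hypothesis.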
Agenda / Claims
1. Visibility into connections between services facilitates SRE/DevOps.
2. Effective triage requires visibility into how network infrastructure affects services.
3. It is easy to navigate large deployments by looking at neighborhoods.
4. Connection visibility can point to failure domains: version, instance, zone.
5. Logging, metrics, tracing, service meshes are not ideal for connection visibility.
6. Linux CLI provides great visibility without per-application changes.
7. BPF enables 1-second granularity, low-overhead, full coverage of connections.
8. BPF can handle encrypted connections (with uprobes).
5. Logging, metrics, tracing, service meshes are not ideal for connection visibility.

Logs, metrics, tracing, service meshes: each is great for its own use case!
• Logs: low barrier, app internals
• Metrics: dashboards on internals & business metrics
• Tracing: cross-service examples of bad cases
• Service mesh: aggregated connectivity, security, circuit breaking
But…
Cons:
• Engineering time: requires per-service work (and maintenance)
• Performance and cost
• No infra visibility (drops, RTT)
• Logs + metrics: service-centric, not connection-centric
• Tracing: sampling, cost
Service mesh caveats
● misconfigured mesh → broken telemetry
  ○ want telemetry from a different source to debug the mesh
● partial deployments & managed services
● no transport-layer data (packet drops, RTT)
● doesn't solve the analysis part; data is either
  ○ too aggregated: missing info on failure domains (version, zone, node)
  ○ too detailed (access logs): still need to process 100k+ events/sec
● eBPF user probes can efficiently get data from mesh and transport layer

[Diagram: Service ↔ Envoy sidecar (HTTP conn mgr, cluster) ↔ Envoy sidecar ↔ Service]
6. Linux CLI provides great visibility without per-application changes.

Socket:
Timestamp   Source         Destination   Ports      Bytes  Drops  RTT
1418530010  172.31.16.139  172.31.16.21  20641, 22  4249   2      4 ms

Protocol:
Method  Endpoint          Code
GET     checkout?q=hrz4N  200

K8s tags:
IP             Pod              Image                  Zone
172.31.16.139  frontend         frontend-image v1.16   us-west-1c
172.31.16.21   checkoutservice  checkout-image v2.12a  us-west-1a

Joined:
Timestamp   Source    Destination  Ports      Bytes  Drops  RTT   Method  Endpoint          Code
1418530010  frontend  checkout     20641, 22  4249   2      4 ms  GET     checkout?q=hrz4N  200
            (frontend-image v1.16, us-west-1c → checkout-image v2.12a, us-west-1a)
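The join step above can be sketched as a dictionary lookup keyed by IP. Field names mirror the tables on this slide; the structure is illustrative, not any particular collector's schema.

```python
# K8s tags keyed by pod IP, as in the table above.
k8s_tags = {
    "172.31.16.139": {"pod": "frontend", "image": "frontend-image v1.16",
                      "zone": "us-west-1c"},
    "172.31.16.21": {"pod": "checkoutservice", "image": "checkout-image v2.12a",
                     "zone": "us-west-1a"},
}

# One raw socket-level flow record.
flow = {"ts": 1418530010, "src": "172.31.16.139", "dst": "172.31.16.21",
        "bytes": 4249, "drops": 2, "rtt_ms": 4}

def join(flow, tags):
    """Replace raw IPs with pod names and attach zones where tags exist."""
    out = dict(flow)
    for side in ("src", "dst"):
        meta = tags.get(flow[side], {})
        out[side + "_pod"] = meta.get("pod", flow[side])  # fall back to the IP
        out[side + "_zone"] = meta.get("zone")
    return out

joined = join(flow, k8s_tags)
```

Flows whose endpoints have no tags (external traffic, unmanaged services) simply keep their raw IPs.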
Getting Flow Data

Pods talk to a virtual service IP: A connects to X, and iptables translates the (A, X) tuple to (A, B), so A's socket sees A → X while the actual flow is A → B.

Pod metadata:
$ kubectl describe pod $POD
Name:          A
Namespace:     staging
...
Status:        Running
IP:            100.101.198.137
Controlled By: ReplicaSet/A

Per-socket statistics from inside the container's network namespace:
# PID=`docker inspect -f '{{.State.Pid}}' $CONTAINER` \
  nsenter -t $PID -n ss -ti
ESTAB 0 0 100.101.198.137:34940 100.65.61.118:8000
  cubic wscale:9,9 rto:204 rtt:0.003/0 mss:1448 cwnd:19 ssthresh:19
  bytes_acked:2525112 segs_out:15664 segs_in:15578 data_segs_out:15662
  send 73365.3Mbps lastsnd:384 lastrcv:10265960 lastack:384
  rcv_space:29200 minrtt:0.002

NAT translations, resolving A → X to A → B:
# conntrack -L
tcp 6 86399 ESTABLISHED src=100.101.198.137 dst=100.65.61.118 sport=34940 dport=8000
  src=100.101.198.147 dst=100.101.198.137 sport=8000 dport=34940 [ASSURED] mark=0 use=1
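The (A, X) ~ (A, B) resolution can be sketched by parsing a conntrack entry: the first src/dst pair is the original (pre-NAT) direction, the second is the reply direction, whose source is the real backend. The parser below is a deliberate simplification; real `conntrack -L` output has more fields and variants.

```python
def parse_conntrack_line(line):
    """Collect src/dst/sport/dport pairs in order: first tuple is the
    original direction (A -> X), second is the reply direction (B -> A)."""
    vals = {"src": [], "dst": [], "sport": [], "dport": []}
    for tok in line.split():
        if "=" in tok:
            k, v = tok.split("=", 1)
            if k in vals:
                vals[k].append(v)
    orig = (vals["src"][0], vals["sport"][0], vals["dst"][0], vals["dport"][0])
    reply = (vals["src"][1], vals["sport"][1], vals["dst"][1], vals["dport"][1])
    return orig, reply

# The conntrack entry from the slide above.
line = ("tcp 6 86399 ESTABLISHED src=100.101.198.137 dst=100.65.61.118 "
        "sport=34940 dport=8000 src=100.101.198.147 dst=100.101.198.137 "
        "sport=8000 dport=34940 [ASSURED] mark=0 use=1")

orig, reply = parse_conntrack_line(line)
# The socket sees A -> X (orig); the reply tuple's source is the real backend B.
real_dst = (reply[0], reply[1])
```

Joining `ss` output against this table on the (src, sport, dst, dport) tuple rewrites virtual destinations to real pod endpoints.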
CLI tools have disadvantages
• Performance:
  ○ iterates over all sockets
  ○ built for CLI use (printfs)
• Coverage: Linux CLI tools are polling-based

[Diagram: timeline of periodic polls; a short-lived socket opens and closes entirely between two polls]

→ Misses events between polls
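The coverage gap in the timeline above can be modeled in a few lines: a socket is observed only if some poll lands inside its lifetime. All numbers here are made up for illustration.

```python
# Poll every 10 seconds; three sockets with (open, close) times in seconds.
poll_times = [0, 10, 20, 30]
sockets = [(1, 25), (12, 14), (21, 22)]

def observed(sock, polls):
    """A polling tool sees a socket only if a poll falls inside its lifetime."""
    open_t, close_t = sock
    return any(open_t <= p <= close_t for p in polls)

seen = [observed(s, poll_times) for s in sockets]
# The long-lived (1, 25) socket is seen; the two short-lived ones,
# (12, 14) and (21, 22), fall entirely between polls and are missed.
```

Short-lived connections (health checks, one-shot RPCs, failing retries) are exactly the ones that often matter during an incident, which motivates event-driven collection.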
Enter eBPF
• Linux bpf() system call since 3.18
• Run code on kernel events
• Report only changes, yet capture more data
• Safe: in-kernel verifier, read-only
• Fast: JIT-compiled
→ 100% coverage + no app changes + low overhead ftw!
(Unofficial BPF mascot by Deirdré Straughan)
7. BPF enables 1-second granularity, low-overhead, full coverage of connections.

Using eBPF: tcptop
● instruments tcp_sendmsg and tcp_cleanup_rbuf
● need to be careful of races:

    # IPv4: build dict of all seen keys
    ipv4_throughput = defaultdict(lambda: [0, 0])
    for k, v in ipv4_send_bytes.items():
        key = get_ipv4_session_key(k)
        ipv4_throughput[key][0] = v.value
    ipv4_send_bytes.clear()

While the for loop is running, the kernel continues applying updates; clear() throws those updates out.
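The race can be seen concretely in a plain-Python model, which also shows one mitigation: remove each key as it is read, so entries added mid-scan survive to the next interval. This is a sketch, not the real BCC API (recent BCC versions expose batched lookup-and-delete on newer kernels, which closes the window atomically).

```python
# Treat `m` as the shared BPF map that the kernel keeps updating while
# userspace reads it. The ("e", "f") insert simulates a concurrent update.

def drain_racy(m):
    """Snapshot-then-clear, as in the slide: loses concurrent updates."""
    snap = dict(m)         # read everything...
    m[("e", "f")] = 50     # update arrives while we are reading
    m.clear()              # ...then clear() discards the unseen entry
    return snap

def drain_safe(m):
    """Remove each key as it is read; mid-scan updates survive."""
    out = {}
    for i, key in enumerate(list(m)):
        out[key] = m.pop(key)   # per-key read-and-delete
        if i == 0:
            m[("e", "f")] = 50  # update arrives mid-scan
    return out

racy_map = {("a", "b"): 100, ("c", "d"): 200}
drain_racy(racy_map)       # racy_map ends up empty: the update vanished

safe_map = {("a", "b"): 100, ("c", "d"): 200}
drain_safe(safe_map)       # safe_map still holds the mid-scan update
```

With the safe pattern, a byte counted during the scan is reported one interval late rather than never.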
System architecture

[Diagram: per-host flow-collection agent joining container metadata (Kubernetes, ECS, Docker, Linux containers, processes) with kernel flow data (socket, NAT)]