TCP CONGESTION SIGNATURES
Srikanth Sundaresan (Facebook)
Amogh Dhamdhere (CAIDA/UCSD)
kc Claffy (CAIDA/UCSD)
Mark Allman (ICSI)
www.caida.org

Typical Speed Tests Don’t Tell Us Much
• Upload and download throughput measurements: no information beyond that
What type of congestion did the TCP flow experience?

Two Potential Sources of Congestion in the End-to-end Path
• Self-induced congestion
  - Clear path: the flow is able to saturate the bottleneck link
  - e.g., last-mile access link
• External congestion
  - Flow starts on an already congested path
  - e.g., congested interconnect
Distinguishing the two cases has implications for users / ISPs / regulators

Does Throughput Indicate Type of Congestion?
• Cannot distinguish using just throughput numbers
  - Access plan rates vary widely, and are typically not available to content / speed test providers
  - e.g., a speed test reports 5 Mbps: is that the access link rate (DSL), or a congested path?
We can use the dynamics of TCP’s startup phase, i.e., Congestion Signatures

TCP’s RTT Congestion Signatures
• Flows experiencing self-induced congestion fill up an empty buffer during slow start
  - Hence they increase the TCP flow RTT
• Externally congested flows encounter an already full buffer
  - Less potential for RTT increases
• Self-induced congestion therefore has higher RTT variance compared to external congestion
We can quantify this using Max-Min (maximum minus minimum) and CoV (coefficient of variation) of RTT
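
The two metrics named above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `rtt_signature`, the use of the population standard deviation for CoV, and the sample values are all assumptions made for the example.

```python
from statistics import mean, pstdev

def rtt_signature(rtt_samples_ms):
    """Return (max_min, cov) for RTT samples (in ms) taken during slow start.

    max_min is the largest sample minus the smallest sample.
    cov is the standard deviation divided by the mean (population
    standard deviation is an assumption of this sketch).
    """
    if len(rtt_samples_ms) < 2:
        raise ValueError("need at least two RTT samples")
    max_min = max(rtt_samples_ms) - min(rtt_samples_ms)
    cov = pstdev(rtt_samples_ms) / mean(rtt_samples_ms)
    return max_min, cov

# Hypothetical samples, invented for illustration (not measured data).
# Self-induced: RTT climbs as the flow fills an initially empty buffer.
self_induced = [20.0, 24.0, 35.0, 60.0, 95.0, 118.0]
# External: the buffer is already full, so RTT starts high and barely moves.
external = [68.0, 70.0, 69.0, 72.0, 71.0, 70.0]

self_max_min, self_cov = rtt_signature(self_induced)
ext_max_min, ext_cov = rtt_signature(external)
```

With these invented samples, both metrics come out larger for the self-induced flow, which is the pattern the slide describes.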

Example Controlled Experiment
• 20 Mbps “access” link with 100 ms buffer
• 1 Gbps “interconnect” link with 50 ms buffer
• Self-induced congestion flows have higher values for both metrics and are clearly distinguishable
[Figure: two CDF plots comparing “External” and “Self” flows. Top panel: Max-Min RTT (log-scale x-axis, 10^1 to 10^2). Bottom panel: CoV RTT (log-scale x-axis, 10^-2 to 10^0).]
The two types of congestion exhibit widely contrasting behaviors
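
As a rough worked example of the buffer sizes quoted for this experiment: if a buffer given in milliseconds is read as the time the link needs to drain it at line rate (an assumption of this sketch, not something the slides state), the sizes in bytes follow directly. The helper name `buffer_bytes` is hypothetical.

```python
def buffer_bytes(link_rate_bps, buffer_ms):
    """Bytes that a link running at link_rate_bps drains in buffer_ms.

    Assumes the buffer size in ms means drain time at line rate.
    """
    # bits = rate * (ms / 1000); bytes = bits / 8
    return link_rate_bps * buffer_ms // 8000

access_buffer = buffer_bytes(20_000_000, 100)            # 20 Mbps, 100 ms
interconnect_buffer = buffer_bytes(1_000_000_000, 50)    # 1 Gbps, 50 ms
```

Under the same reading, a flow that fills the empty 100 ms access buffer can add up to roughly 100 ms of queueing delay to its own RTT.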

Model
• Max-Min and CoV of RTT are derived from RTT samples during slow start
• We feed the two metrics into a simple Decision Tree
  - We limit the tree to a low depth to minimize complexity
• We build the decision tree classifier using controlled experiments and apply it to real-world data
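
To make the classifier step concrete, here is a small depth-limited decision tree over the two metrics, written by hand in plain Python. It is a sketch only: the slides do not say which implementation or split criterion was used, so the Gini-based splitting, all function names, and the training rows below are assumptions for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (impurity, feature, threshold) with the lowest weighted Gini, or None."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= thr]
            right = [y for r, y in zip(rows, labels) if r[f] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, f, thr)
    return best

def build_tree(rows, labels, max_depth):
    """Build a tree as nested (feature, threshold, left, right) tuples; leaves are labels."""
    majority = Counter(labels).most_common(1)[0][0]
    if max_depth == 0 or len(set(labels)) == 1:
        return majority
    split = best_split(rows, labels)
    if split is None:  # no feature has two distinct values
        return majority
    _, f, thr = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= thr]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > thr]
    return (f, thr,
            build_tree([r for r, _ in left], [y for _, y in left], max_depth - 1),
            build_tree([r for r, _ in right], [y for _, y in right], max_depth - 1))

def classify(tree, row):
    """Walk the tree until a leaf label is reached."""
    while isinstance(tree, tuple):
        f, thr, left, right = tree
        tree = left if row[f] <= thr else right
    return tree

# Hypothetical training rows: (Max-Min RTT in ms, CoV of RTT). Not real data.
train_rows = [(98.0, 0.62), (85.0, 0.55), (110.0, 0.70),
              (4.0, 0.02), (9.0, 0.05), (6.0, 0.03)]
train_labels = ["self", "self", "self", "external", "external", "external"]

tree = build_tree(train_rows, train_labels, max_depth=2)
prediction_a = classify(tree, (90.0, 0.60))
prediction_b = classify(tree, (5.0, 0.03))
```

The `max_depth` argument plays the role of the low depth limit mentioned above: a shallow tree yields a handful of threshold rules that are easy to inspect.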

Validating the Method: Step 1 - Controlled Experiments
[Figure: testbed topology with nodes Pi 1, Pi 2, routers R1 and R2, Servers 1 to 4, and the Internet; links labeled 100 Mbps, 1 Gbps, and shaped “access”; traffic labeled background cross-traffic, interconnect cross-traffic, and throughput tests.]
• Emulated “access” link + “core” link
  - Wide range of access link throughputs, buffer sizes, loss rates, cross-traffic (background and congestion-inducing)
  - Can accurately label flows in training data as “self” or “externally” congested
High accuracy: precision and recall > 90% in most settings
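
For readers less familiar with the two accuracy figures quoted here, the following sketch shows how precision and recall are computed for one class from labeled flows. The function name and the label lists are hypothetical and are not results from the experiments.

```python
def precision_recall(true_labels, predicted_labels, positive):
    """Precision and recall for the class named by `positive`."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical ground truth and classifier output (illustration only).
truth = ["self"] * 5 + ["external"] * 5
predicted = ["self", "self", "self", "self", "external",
             "external", "external", "external", "external", "self"]

precision, recall = precision_recall(truth, predicted, positive="self")
```

Precision is the share of flows predicted as “self” that really were self-induced; recall is the share of truly self-induced flows that the classifier found.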

Validating the Method: Step 2
• From an Ark VP in ISP A, we identified a congested link with ISP B using TSLP*
• Latency measurements to the “near” and “far” side of the interdomain link over time
• Diurnal latency elevation indicates congestion
[Figure: Ark VP in ISP A, with the “near” and “far” sides of the interdomain link to ISP B marked, alongside a time series of TSLP latency (far side), y-axis 10 to 70, x-axis 02/18 to 03/11.]
*Dhamdhere et al. “Inferring Persistent Interdomain Congestion”, SIGCOMM 2018
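
The idea of spotting elevated far-side latency can be illustrated with a very simple threshold rule. This is not the TSLP inference method from the cited paper; the function name, the use of the minimum as the baseline, the 10 ms threshold, and the latency values are all assumptions chosen for the sketch.

```python
def elevated_samples(latency_ms, threshold_ms=10.0):
    """Flag samples that exceed the minimum observed latency by more than threshold_ms."""
    baseline = min(latency_ms)
    return [lat - baseline > threshold_ms for lat in latency_ms]

# Hypothetical far-side latency samples in ms (not measured data).
far_side = [20, 21, 20, 22, 45, 60, 58, 24, 21]
flags = elevated_samples(far_side)
```

A real analysis would also need to check that the elevation recurs at the same hours each day and that the near side stays flat, which this sketch leaves out.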

Validating the Method: Step 2
[Figure: Ark VP in ISP A, the congested link, ISP B, and the M-lab NDT server.]
• Throughput measurements from Ark VP to M-lab NDT server traversing congested interdomain link

Validation of the Method: Step 2
[Figure: two time series over 02/18 to 03/11. Top panel: “d/l Mbps” (y-axis 0 to 30), with “externally” congested periods marked. Bottom panel: TSLP latency (far side), y-axis 10 to 70.]
Strong correlation between throughput and TSLP latency: flows during elevated TSLP latency labeled as “externally” congested
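
The labeling step described here can be sketched as follows: compute how strongly throughput and latency move together, then mark the flows that ran while latency was elevated. Everything in the block is an assumption for illustration, including the function names, the baseline and threshold values, the "unlabeled" placeholder for the remaining flows, and the paired measurements themselves.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def label_flows(latency_ms, baseline_ms, threshold_ms):
    """Label a flow "external" if latency at test time is elevated, else "unlabeled"."""
    return ["external" if lat - baseline_ms > threshold_ms else "unlabeled"
            for lat in latency_ms]

# Hypothetical paired measurements, one entry per throughput test (not real data).
throughput_mbps = [25, 24, 26, 8, 6, 7]
tslp_latency_ms = [20, 22, 21, 58, 62, 60]

r = pearson(throughput_mbps, tslp_latency_ms)
labels = label_flows(tslp_latency_ms, baseline_ms=20, threshold_ms=10)
```

In this invented data throughput drops when latency rises, so the coefficient is strongly negative; the slides state only that the correlation is strong.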