
Responding to Spurious Timeouts in TCP, by Andrei Gurtov (University of Helsinki) and Reiner Ludwig (Ericsson Research)



  1. Responding to Spurious Timeouts in TCP
     Andrei Gurtov, University of Helsinki
     Reiner Ludwig, Ericsson Research

  2. Outline
     • Motivation
     • Spurious Timeouts in TCP
     • Robustness to Packet Losses
     • Undoing Congestion Control
     • Adapting the Retransmit Timer
     • Performance Evaluation
     • Conclusions

  3. Motivation
     • Delay variation in wireless networks
       – Cell reselections in GPRS last 3-15 sec
       – Bandwidth oscillation in CDMA2000
       – Link-layer persistent error recovery
     • Deployed aggressive retransmission timers
       – 10 ms granularity and 200 ms minimum in Linux TCP
       – Solaris
     [Figure: A TCP trace shows spurious retransmissions caused by two cell reselections in a live GPRS network. Axes: Sequence Number (B) versus Time of Day (s); series: Snd_Data, Snd_Ack.]

  4. Spurious Timeouts in TCP
     • Spurious timeouts hurt TCP performance
       – Unnecessary retransmissions during go-back-N
       – Disrupted congestion control
         • In the short run, slow start causes congestion
         • In the long run, underutilization due to reduced ssthresh

  5. Spurious Timeouts in TCP
     • Detecting spurious timeouts
       – Eifel, F-RTO, etc.
     • Response after detecting a spurious timeout
       – Our focus
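The slides focus on the response rather than on detection, but the timestamp comparison behind Eifel-style detection can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function and parameter names are assumptions, and timestamp wraparound is ignored.

```python
def is_spurious_timeout(retransmit_ts: int, ack_ts_echo: int) -> bool:
    """Eifel-style check, simplified (names are illustrative).

    retransmit_ts: TCP timestamp value carried by the first retransmission.
    ack_ts_echo:   timestamp echoed by the first ACK that covers the
                   retransmitted segment.

    If the ACK echoes a timestamp older than the retransmission's, it was
    triggered by the original transmission, so the timeout was spurious.
    Wraparound of the 32-bit timestamp is deliberately ignored here.
    """
    return ack_ts_echo < retransmit_ts


# The ACK echoes the original segment's (older) timestamp: spurious timeout.
late_ack_is_spurious = is_spurious_timeout(retransmit_ts=1000, ack_ts_echo=900)
# The ACK echoes the retransmission's own timestamp: genuine loss.
ack_of_retransmit_is_spurious = is_spurious_timeout(retransmit_ts=1000, ack_ts_echo=1000)
```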

  6. Main Issues w.r.t. Response
     1. Robustness to packet losses
        – Danger of genuine timeouts
     2. How to restore the congestion control state
        – Does a full restore of cwnd and ssthresh cause a burst?
        – Do partial restore options perform well?
     3. How to adapt the retransmit timer
        – To avoid clogging the network with unnecessary retransmissions in the future

  7. 1. Robustness to Packet Losses
     • State-of-the-art TCP is often sufficient
       – Fast Retransmit + SACK + Limited Transmit
     • Heavy losses do trigger genuine timeouts
       – TCP gets low throughput
       – Cannot adapt RTO to a more conservative level
     • Solutions
       – FACK works well but not with reordering
       – NewReno + SACK works almost as well and appears safe for the Internet

  8. 2. Undoing Congestion Control
     • Full undo: too aggressive?
       – Appropriate, no bursts observed
     • Partial undo sets the sender idle for a while
       – The flight size is higher than the reduced cwnd
       – The ACK clock can be lost
     • A new proposal: use the ACK clock but in congestion avoidance
       – ssthresh = cwnd_old, cwnd = ssthresh
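The difference between the full undo and the new proposal can be sketched as follows. This is an illustrative model only: the class, the function names, the MSS value, and the timeout reduction rule (the conventional "halve the flight size, restart from one segment") are assumptions, and the partial undo variants are left out because the slides do not define them precisely.

```python
from dataclasses import dataclass

MSS = 1460  # assumed segment size in bytes


@dataclass(frozen=True)
class CongestionState:
    cwnd: int      # congestion window, bytes
    ssthresh: int  # slow-start threshold, bytes


def on_timeout(state: CongestionState, flight_size: int):
    """Conventional timeout response; returns (saved, reduced) states."""
    reduced = CongestionState(cwnd=MSS, ssthresh=max(flight_size // 2, 2 * MSS))
    return state, reduced


def full_undo(saved: CongestionState) -> CongestionState:
    """Restore cwnd and ssthresh exactly as they were before the timeout."""
    return saved


def ack_clocked_undo(saved: CongestionState) -> CongestionState:
    """The new proposal: ssthresh = cwnd_old, cwnd = ssthresh.

    The window returns to its old size, but because cwnd equals ssthresh
    the sender continues in congestion avoidance instead of slow start.
    """
    ssthresh = saved.cwnd
    return CongestionState(cwnd=ssthresh, ssthresh=ssthresh)


before = CongestionState(cwnd=20 * MSS, ssthresh=64 * MSS)
saved, reduced = on_timeout(before, flight_size=20 * MSS)
restored_full = full_undo(saved)
restored_new = ack_clocked_undo(saved)
```

In this model the full undo leaves the sender in slow start (cwnd below ssthresh), while the new proposal restores the same window but with ssthresh pulled down to it.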

  9. TCP RTO
     • TCP does not take samples from delayed segments (Karn's algorithm)
     • TCP with timestamps can do that
       – RTO is more conservative but decays quite fast
     • TCP with Eifel uses timestamps
       – Already more conservative than the standard TCP
       – Also maintains a larger window, which results in higher RTT and higher RTO
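For reference, the standard retransmit timer calculation with Karn's rule can be sketched as below. The smoothing constants and formula follow the widely used RFC 6298 estimator; the class name, the default parameters, and the `was_retransmitted` flag are illustrative assumptions rather than anything taken from the slides.

```python
class RtoEstimator:
    """Sketch of an RFC 6298 style RTO estimator with Karn's rule."""

    ALPHA = 1 / 8  # gain for the smoothed RTT
    BETA = 1 / 4   # gain for the RTT variation

    def __init__(self, min_rto: float = 1.0, granularity: float = 0.01):
        self.srtt = None      # smoothed RTT, seconds
        self.rttvar = None    # RTT variation, seconds
        self.min_rto = min_rto
        self.granularity = granularity

    def on_rtt_sample(self, rtt: float, was_retransmitted: bool = False) -> None:
        if was_retransmitted:
            # Karn's rule: a sample from a retransmitted segment is
            # ambiguous, so it is not fed into the estimator.
            return
        if self.srtt is None:
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt

    @property
    def rto(self) -> float:
        if self.srtt is None:
            # No sample yet: use a 1 s initial value, or the minimum if larger.
            return max(self.min_rto, 1.0)
        return max(self.min_rto, self.srtt + max(self.granularity, 4 * self.rttvar))


est = RtoEstimator(min_rto=0.2)   # 200 ms minimum, as in the Linux example
est.on_rtt_sample(0.4)            # first sample on a 400 ms path
rto_after_first = est.rto
est.on_rtt_sample(5.0, was_retransmitted=True)  # delay spike, ignored
rto_after_spike = est.rto
```

Because the spike sample is discarded, the timer learns nothing from the delay event, which is why the standard timer cannot adapt to a more conservative level.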

  10. 3. Adapting RTO
      • Upon a spurious timeout
        – Reseed: initialize SRTT and RTTVAR with new sample (history discard)
        – Back-off: keep the exponential back-off count
        – Min++: increase the minimum RTO (by 1 sec)
      • Reset to the standard timer upon a genuine timeout
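The three options listed above can be modelled roughly as follows. This is a sketch under stated assumptions: the `TimerState` fields, the option strings, the 200 ms default minimum, and the reading of "reset to the standard timer" as restoring the default minimum RTO and clearing any kept back-off are all illustrative choices, not details confirmed by the slides.

```python
from dataclasses import dataclass, replace

DEFAULT_MIN_RTO = 0.2  # assumed default minimum RTO, seconds


@dataclass(frozen=True)
class TimerState:
    srtt: float                       # smoothed RTT, seconds
    rttvar: float                     # RTT variation, seconds
    backoff: int = 0                  # how many times the RTO has been doubled
    min_rto: float = DEFAULT_MIN_RTO

    @property
    def rto(self) -> float:
        base = max(self.min_rto, self.srtt + 4 * self.rttvar)
        return base * 2 ** self.backoff


def respond_to_spurious_timeout(t: TimerState, option: str, new_rtt: float) -> TimerState:
    if option == "reseed":
        # Discard history: seed the estimator from the new (large) sample.
        return replace(t, srtt=new_rtt, rttvar=new_rtt / 2, backoff=0)
    if option == "backoff":
        # Keep the exponential back-off count instead of clearing it.
        return t
    if option == "min++":
        # Raise the minimum RTO by one second.
        return replace(t, min_rto=t.min_rto + 1.0, backoff=0)
    raise ValueError(f"unknown option: {option}")


def on_genuine_timeout(t: TimerState) -> TimerState:
    # Assumed interpretation of "reset to the standard timer".
    return replace(t, min_rto=DEFAULT_MIN_RTO, backoff=0)


timer = TimerState(srtt=0.4, rttvar=0.05, backoff=1)
reseeded = respond_to_spurious_timeout(timer, "reseed", new_rtt=3.0)
kept = respond_to_spurious_timeout(timer, "backoff", new_rtt=3.0)
raised = respond_to_spurious_timeout(timer, "min++", new_rtt=3.0)
```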

  11. Performance Evaluation
      • ns2, dumbbell with TCP and CBR sources
        – 3G or satellite link: 2 Mbps, 400 ms RTT
        – Periodic delay spikes
      • 250% gain in throughput when delay spikes occur on an uncongested path (without CBR)
      • TCP fairness does not suffer because response to packet losses is unchanged
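To put the simulated link in perspective, its bandwidth-delay product can be worked out directly from the two figures given. The helper name and the 1460-byte segment size are assumptions made for this illustration.

```python
def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product in bytes: bits in flight divided by 8."""
    return bandwidth_bps * rtt_s / 8


MSS = 1460  # assumed segment size in bytes

link_bdp = bdp_bytes(2_000_000, 0.4)   # 2 Mbps link, 400 ms RTT
segments_in_flight = link_bdp / MSS
```

The pipe holds roughly 100,000 bytes, or about 68 full-sized segments under the assumed MSS, so collapsing the window to one segment after a spurious timeout leaves the link idle for a long time.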

  12. Robustness to Packet Losses: TCPs with CBR
      [Bar chart, normalized scale 0 to 1: Download Time, Segments Sent, Spurious RTOs, and Genuine RTOs for the std and Eifel variants of Reno-SACK, NewReno-SACK, and FACK.]

  13. Undo of Congestion Control: TCP FACK with CBR
      [Bar chart, normalized scale 0 to 1: Download Time, Segments Sent, Spurious RTOs, and Genuine RTOs for the Full, Partial, None, and New partial undo options.]

  14. Adapting RTO: TCP FACK with CBR
      [Bar chart, normalized scale 0.5 to 1: Download Time, Segments Sent, Spurious RTOs, and Genuine RTOs for the Std, Reseed, Back-off, and Min++ timer options.]

  15. Summary
      • An update of the TCP sender improving performance over paths with variable delays
      • Up to 250% throughput gain on links with a high bandwidth-delay product
      • Adequate performance on congested paths
        – NewReno-SACK is robust to packet losses
      • Full restore of congestion control after a spurious timeout is OK
      • Using back-offs or increasing the min RTO can reduce the number of spurious RTOs by 40% with only slightly lower throughput
      • We have some real measurements for 2.5G links
