CS 557 Congestion Avoidance Congestion Avoidance and Control Jacobson and Karels, 1988 Spring 2013
The Story So Far… • Some Essential Apps: DNS (naming) and NTP (time) • Transport layer: End to End communication, Multiplexing, Reliability, Congestion control, Flow control • Network layer: Addressing, Fragmentation, Dynamic Routing, Best Effort Forwarding • Data Layer: richly connected network (many paths) with many types of unreliable links
Main Points • Objective: – Techniques for dealing with network congestion • Approach: – Slow Start – Adaptive Timers – Additive increase/multiplicative decrease • Contributions: – Essential points of TCP congestion control
Motivation and Context • Network Collapsing Due to Congestion – Throughput drops from 32 Kbps to 40 bps – One conclusion: packet switching failed… – This paper says we can fix the problem • Conservation of Packets – Can’t have collapse if packets entering network = packets leaving network – Can we achieve conservation of packets?
TCP Review (RFCs: 793, 1122, 1323, 2018, 2581) • point-to-point: – one sender, one receiver • reliable, in-order byte stream: – no “message boundaries” • pipelined: – TCP congestion and flow control set window size • send & receive buffers • full duplex data: – bi-directional data flow in same connection – MSS: maximum segment size • connection-oriented: – handshaking (exchange of control msgs) init’s sender, receiver state before data exchange • flow controlled: – sender will not overwhelm receiver • Figure: the sending application writes data through its socket door into the TCP send buffer; segments carry it to the TCP receive buffer, from which the receiving application reads data through its socket door
TCP Flow Control (1/2) • receive side of TCP connection has a receive buffer • app process may be slow at reading from buffer • flow control: sender won’t overflow receiver’s buffer by transmitting too much, too fast • speed-matching service: matching the send rate to the receiving app’s drain rate
TCP Flow control (2/2) (Suppose TCP receiver discards out-of-order segments) • spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead] • Rcvr advertises spare room by including value of RcvWindow in segments • Sender limits unACKed data to RcvWindow – guarantees receive buffer doesn’t overflow
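The RcvWindow arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names, and the sender-side counters `last_byte_sent` and `last_byte_acked`, are assumptions introduced here and do not appear on the slides.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Spare room (in bytes) that the receiver advertises to the sender."""
    # Bytes that have arrived but that the application has not read yet.
    buffered = last_byte_rcvd - last_byte_read
    if not 0 <= buffered <= rcv_buffer:
        raise ValueError("buffered data must fit inside the receive buffer")
    return rcv_buffer - buffered


def sender_may_send(last_byte_sent, last_byte_acked, window):
    """Bytes the sender may still put in flight (hypothetical sender-side helper)."""
    unacked = last_byte_sent - last_byte_acked
    # Keeping unACKed data at or below RcvWindow is what prevents overflow.
    return max(0, window - unacked)
```

For example, a 4096-byte buffer holding 2000 unread bytes advertises 2096 bytes, and a sender with 1000 bytes already in flight may send 1096 more.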
TCP seq. #’s and ACKs • Seq. #’s: – byte stream “number” of first byte in segment’s data • ACKs: – seq # of next byte expected from other side – cumulative ACK • Q: how receiver handles out-of-order segments – A: TCP spec doesn’t say, up to implementation • Figure (simple telnet scenario): user types ‘C’ and Host A sends Seq=42, ACK=79, data = ‘C’; Host B ACKs receipt of ‘C’ and echoes back ‘C’ with Seq=79, ACK=43, data = ‘C’; Host A ACKs receipt of echoed ‘C’ with Seq=43, ACK=80
Challenges to Conservation • Connection never reaches equilibrium – Too many initial packets drive the network into congestion, and then it is hard to recover • Sender adds packets before one leaves – Poor timer causes retransmission of packets that are still in flight on the network • Equilibrium can’t be reached due to resource limits on path – Assume packet loss is due to congestion and back off by multiplicative factor
Slow Start • TCP is Self-Clocking – Receipt of ack triggers new packet • Good if Network is in Stable State: – How to ramp up at the start? – Start slow - 1 packet – Each ack triggers two packets • Quickly Ramp Up Window to Correct Size
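The ramp-up can be sketched as a small simulation. It assumes an idealised path where every packet sent in a round trip is acknowledged within that round trip; `target_window` stands in for the "correct size" and is a hypothetical parameter, as is the function name.

```python
def slow_start_rounds(target_window, initial_window=1):
    """Window size (in packets) at the start of each round trip during slow start."""
    if initial_window < 1 or target_window < initial_window:
        raise ValueError("need 1 <= initial_window <= target_window")
    windows = []
    cwnd = initial_window
    while cwnd < target_window:
        windows.append(cwnd)
        acks = cwnd   # one ack returns for each packet sent this round
        cwnd += acks  # each ack opens the window by one, so it releases two packets
    # Stop growing once the target is reached (never overshoot it).
    windows.append(min(cwnd, target_window))
    return windows
```

Because every ack releases two packets, the window doubles each round trip: reaching 16 packets takes the sequence 1, 2, 4, 8, 16.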
TCP: retransmission scenarios • Figure, lost ACK scenario: Host A sends Seq=92, 8 bytes data; Host B’s ACK=100 is lost (X); the Seq=92 timeout fires and Host A resends Seq=92, 8 bytes data; Host B ACKs again with ACK=100; SendBase = 100 • Figure, premature timeout: Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data; the Seq=92 timeout fires before the ACKs arrive, so Host A resends Seq=92, 8 bytes data; SendBase moves to 100 and then to 120 as the ACKs arrive
TCP retransmission scenarios • Figure, cumulative ACK scenario: Host A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data; ACK=100 is lost (X), but ACK=120 arrives before the timeout, so SendBase = 120
TCP Timeout Values • Q: how to set TCP timeout value? – longer than RTT, but RTT varies – too short: premature timeout, unnecessary retransmissions – too long: slow reaction to segment loss • Q: how to estimate RTT? – SampleRTT: measured time from segment transmission until ACK receipt; ignore retransmissions – SampleRTT will vary, want estimated RTT “smoother”: average several recent measurements, not just current SampleRTT
TCP Round Trip Time (RTT) EstimatedRTT = (1 - α) * EstimatedRTT + α * SampleRTT • Exponential weighted moving average • influence of past sample decreases exponentially fast • typical value: α = 0.125
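To see why the influence of a past sample decays exponentially, the recursion can be unrolled. The subscripts below (sample index n, initial estimate EstimatedRTT_0) are notation introduced here for illustration, not taken from the slides.

```latex
\begin{aligned}
\mathrm{EstimatedRTT}_n
  &= (1-\alpha)\,\mathrm{EstimatedRTT}_{n-1} + \alpha\,\mathrm{SampleRTT}_n \\
  &= \alpha \sum_{k=0}^{n-1} (1-\alpha)^{k}\,\mathrm{SampleRTT}_{n-k}
     \;+\; (1-\alpha)^{n}\,\mathrm{EstimatedRTT}_0
\end{aligned}
```

With α = 0.125, a sample that is k measurements old carries weight 0.125 · 0.875^k: the newest sample has weight 0.125, while one five measurements old has about 0.064.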
Example RTT estimation • Figure: SampleRTT and EstimatedRTT (milliseconds, axis from 100 to 350) versus time (seconds), measured from gaia.cs.umass.edu to fantasia.eurecom.fr
TCP Round Trip Time and Timeout • Setting the timeout: EstimatedRTT plus “safety margin” – large variation in EstimatedRTT -> larger safety margin • first estimate how much SampleRTT deviates from EstimatedRTT: DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT| (typically, β = 0.25) • then set timeout interval: TimeoutInterval = EstimatedRTT + 4 * DevRTT
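The two moving averages and the timeout formula fit naturally into one small estimator. The sketch below is illustrative only: the class name is hypothetical, the initial DevRTT of 0 is an assumption (the slides do not give one), and so is the choice to measure the deviation against the estimate from before the new sample is folded in.

```python
class RttEstimator:
    """Tracks EstimatedRTT and DevRTT and derives TimeoutInterval."""

    def __init__(self, first_sample, alpha=0.125, beta=0.25):
        self.alpha = alpha
        self.beta = beta
        self.estimated_rtt = first_sample
        self.dev_rtt = 0.0  # assumption: the slides do not specify a starting value

    def update(self, sample_rtt):
        """Fold in one SampleRTT; callers should skip samples from retransmitted segments."""
        # Assumption: deviation is taken against the estimate prior to this sample.
        deviation = abs(sample_rtt - self.estimated_rtt)
        self.dev_rtt = (1 - self.beta) * self.dev_rtt + self.beta * deviation
        self.estimated_rtt = (1 - self.alpha) * self.estimated_rtt + self.alpha * sample_rtt
        return self.timeout_interval()

    def timeout_interval(self):
        # EstimatedRTT plus a safety margin of four deviations.
        return self.estimated_rtt + 4 * self.dev_rtt
```

Starting from a 100 ms estimate, a single 200 ms sample moves EstimatedRTT to 112.5 ms and DevRTT to 25 ms, giving a 212.5 ms timeout: one outlier widens the safety margin far more than it shifts the mean.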
TCP Window Size Over Time • Figure: congestion window (axis marks at 8, 16 and 24 Kbytes) versus time for a long-lived TCP connection
TCP Congestion Control Review • When CongWin is below Threshold, sender is in slow-start phase, window grows exponentially. • When CongWin is above Threshold, sender is in congestion-avoidance phase, window grows linearly. • When a triple duplicate ACK occurs, Threshold is set to CongWin/2 and CongWin is set to Threshold. • When timeout occurs, Threshold is set to CongWin/2 and CongWin is set to 1 MSS.
TCP ACK generation [RFC 1122, RFC 2581]
• Event at receiver: arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed – TCP receiver action: delayed ACK. Wait up to 500 ms for next segment. If no next segment, send ACK
• Event at receiver: arrival of in-order segment with expected seq #; one other segment has ACK pending – TCP receiver action: immediately send single cumulative ACK, ACKing both in-order segments
• Event at receiver: arrival of out-of-order segment with higher-than-expected seq #; gap detected – TCP receiver action: immediately send duplicate ACK, indicating seq # of next expected byte
• Event at receiver: arrival of segment that partially or completely fills gap – TCP receiver action: immediately send ACK, provided that segment starts at lower end of gap
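The four receiver rules can be sketched as a small decision procedure. Several things here are assumptions made for illustration rather than anything the slides specify: the class and method names, buffering out-of-order segments instead of discarding them, treating segments as non-overlapping, and answering old (already ACKed) data with a duplicate ACK.

```python
class AckPolicy:
    """Decides which kind of ACK a receiver sends for each arriving segment."""

    def __init__(self, expected_seq):
        self.expected = expected_seq  # next in-order byte the receiver is waiting for
        self.ack_pending = False      # True while a delayed ACK is being held back
        self.buffered = {}            # out-of-order segments: start seq -> length

    def on_segment(self, seq, length):
        if seq > self.expected:
            # Gap detected: keep the segment and repeat which byte is still needed.
            self.buffered[seq] = length
            return ("duplicate_ack", self.expected)
        if seq < self.expected:
            # Assumption: old data is answered with the current cumulative ACK.
            return ("duplicate_ack", self.expected)
        filled_gap = bool(self.buffered)  # segment starts at the lower end of a gap
        self.expected += length
        # Absorb any buffered segments that are now contiguous.
        while self.expected in self.buffered:
            self.expected += self.buffered.pop(self.expected)
        if filled_gap or self.ack_pending:
            self.ack_pending = False
            return ("immediate_ack", self.expected)
        self.ack_pending = True  # hold the ACK, hoping to cover the next segment too
        return ("delayed_ack", self.expected)

    def on_delayed_ack_timer(self):
        """Called when the (up to 500 ms) delayed-ACK wait expires."""
        if not self.ack_pending:
            return None
        self.ack_pending = False
        return ("immediate_ack", self.expected)
```

The returned tuple pairs the action with the cumulative ACK number; a real stack would arm and cancel an actual timer where this sketch only flips `ack_pending`.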
TCP sender congestion control: Event, State, TCP Sender Action, Commentary
• Event: ACK receipt for previously unacked data; State: Slow Start (SS) – TCP Sender Action: CongWin = CongWin + MSS; if (CongWin > Threshold) set state to “Congestion Avoidance” – Commentary: resulting in a doubling of CongWin every RTT
• Event: ACK receipt for previously unacked data; State: Congestion Avoidance (CA) – TCP Sender Action: CongWin = CongWin + MSS * (MSS/CongWin) – Commentary: additive increase, resulting in increase of CongWin by 1 MSS every RTT
• Event: loss event detected by triple duplicate ACK; State: SS or CA – TCP Sender Action: Threshold = CongWin/2, CongWin = Threshold, set state to “Congestion Avoidance” – Commentary: fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS.
• Event: timeout; State: SS or CA – TCP Sender Action: Threshold = CongWin/2, CongWin = 1 MSS, set state to “Slow Start” – Commentary: enter slow start
• Event: duplicate ACK; State: SS or CA – TCP Sender Action: increment duplicate ACK count for segment being acked – Commentary: CongWin and Threshold not changed
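The sender table maps almost directly onto a small state machine. The sketch below is one possible reading, not a reference implementation: the class name, the 1460-byte MSS and the default initial Threshold are assumptions chosen for illustration.

```python
MSS = 1460  # bytes; assumed segment size, not given on the slides


class CongestionControl:
    """Event-driven sketch of the slow start / congestion avoidance table."""

    def __init__(self, threshold=64 * 1024):  # assumed initial Threshold
        self.cong_win = MSS
        self.threshold = threshold
        self.state = "SS"
        self.dup_acks = 0

    def on_new_ack(self):
        """ACK receipt for previously unacked data."""
        self.dup_acks = 0
        if self.state == "SS":
            self.cong_win += MSS  # +1 MSS per ACK doubles CongWin every RTT
            if self.cong_win > self.threshold:
                self.state = "CA"
        else:
            # Additive increase: roughly +1 MSS per RTT in total.
            self.cong_win += MSS * MSS / self.cong_win

    def on_duplicate_ack(self):
        """Count duplicates; the third one is treated as a loss event."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.threshold = self.cong_win / 2
            self.cong_win = max(self.threshold, MSS)  # never below 1 MSS
            self.state = "CA"

    def on_timeout(self):
        self.threshold = self.cong_win / 2
        self.cong_win = MSS
        self.state = "SS"
        self.dup_acks = 0
```

Feeding it four new ACKs with Threshold at 4 MSS grows CongWin to 5 MSS and switches to congestion avoidance; three duplicate ACKs then halve it to 2.5 MSS, and a timeout drops it back to 1 MSS in slow start.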
Fast Retransmit • Time-out period often relatively long: – long delay before resending lost packet • Detect lost segments via duplicate ACKs. – Sender often sends many segments back-to-back – If segment is lost, there will likely be many duplicate ACKs. • If sender receives 3 ACKs for the same data, it supposes that segment after ACKed data was lost: – fast retransmit: resend segment before timer expires
Fast retransmit algorithm:
event: ACK received, with ACK field value of y
    if (y > SendBase) {
        SendBase = y
        if (there are currently not-yet-acknowledged segments)
            start timer
    }
    else {    /* a duplicate ACK for already ACKed segment */
        increment count of dup ACKs received for y
        if (count of dup ACKs received for y == 3) {
            resend segment with sequence number y    /* fast retransmit */
        }
    }
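The duplicate-ACK counting in the pseudocode can be made runnable with a little scaffolding. In this sketch the class name, the `retransmitted` log (standing in for actually resending a segment) and the `unacked_remaining` flag are hypothetical additions; a single counter is reset whenever SendBase advances, which is equivalent to keeping a count per value of y.

```python
class FastRetransmitSender:
    """Runnable sketch of the ACK-handling event from the pseudocode."""

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = 0
        self.timer_running = False
        self.retransmitted = []  # sequence numbers resent, kept for inspection

    def on_ack(self, y, unacked_remaining=True):
        if y > self.send_base:
            # New data acknowledged: advance SendBase and start counting afresh.
            self.send_base = y
            self.dup_acks = 0
            # Restart the timer only if segments are still outstanding.
            self.timer_running = unacked_remaining
        else:
            # A duplicate ACK for an already ACKed segment.
            self.dup_acks += 1
            if self.dup_acks == 3:
                # Fast retransmit: resend before the timer expires.
                self.retransmitted.append(y)
```

Only the third duplicate triggers a resend; a fourth or fifth duplicate for the same y does not resend again.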
TCP Fairness • Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K • Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R
Why is TCP fair? Two competing sessions: • Additive increase gives slope of 1, as throughput increases • Multiplicative decrease drops throughput proportionally • Figure: Connection 2 throughput plotted against Connection 1 throughput, both axes running up to R, with an “equal bandwidth share” line; the trajectory alternates between “congestion avoidance: additive increase” and “loss: decrease window by factor of 2”
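The convergence argument can be checked numerically. The sketch below is a deliberately simplified model, and its assumptions are not from the slides: both sessions have the same RTT, both see a loss in the same round whenever their combined rate exceeds the capacity, and rates change once per round. The function and parameter names are hypothetical.

```python
def aimd_two_flows(x1, x2, capacity, rounds, increase=1.0, decrease=0.5):
    """Throughputs of two sessions after `rounds` steps of synchronised AIMD."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both sessions cut proportionally, which halves their difference.
            x1 *= decrease
            x2 *= decrease
        else:
            # Congestion avoidance: both add the same amount (slope of 1),
            # which leaves their difference unchanged.
            x1 += increase
            x2 += increase
    return x1, x2
```

Since the gap between the two sessions is untouched by additive increase and halved by every loss, it shrinks toward zero; starting from 80 and 10 on a link of capacity 100, the two rates are practically identical after a thousand rounds.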
TCP Throughput • What’s the average throughput of TCP as a function of window size and RTT? – Ignore slow start • Let W be the window size when loss occurs. • When window is W, throughput is W/RTT • Just after loss, window drops to W/2, throughput to W/(2 RTT). • Window then grows linearly back to W, so average throughput is 0.75 W/RTT
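The 0.75 factor is just the mean of a window that climbs linearly from W/2 to W. A quick check, assuming the window is counted in whole segments and grows by one segment per RTT (the function name is hypothetical):

```python
from fractions import Fraction


def average_window(w):
    """Mean window over one cycle as it grows from W/2 to W, one segment per RTT."""
    if w < 2 or w % 2:
        raise ValueError("use an even window size of at least 2 segments")
    sizes = range(w // 2, w + 1)  # W/2, W/2 + 1, ..., W
    return Fraction(sum(sizes), len(sizes))
```

For W = 8 the window takes the values 4, 5, 6, 7, 8, whose mean is 6 = 0.75 · 8; dividing by RTT gives the average throughput of 0.75 W/RTT.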