Transport Protocols
Kameswari Chebrolu
Dept. of Electrical Engineering, IIT Kanpur

End-to-End Protocols
• Convert the host-to-host packet delivery service into a process-to-process communication channel
• Demultiplexing: multiple applications can share the network
• End points are identified by ports
  • Ports are not interpreted globally
  • Servers have well-defined ports (look at /etc/services)

Application Layer Expectations
• Guaranteed message delivery
• Ordered delivery
• No duplication
• Support arbitrarily large messages
• Synchronization between the sender and receiver
• Support flow control
• Support demultiplexing

User Datagram Protocol (UDP)
[Figure: Demultiplexing. Packets arrive at UDP and are demultiplexed by port into queues, each read by an application process.]

UDP Header
 0               16             31
 +---------------+---------------+
 |    SrcPort    |    DstPort    |
 +---------------+---------------+
 |    Length     |   Checksum    |
 +---------------+---------------+
 |             Data              |
 +-------------------------------+
• The checksum is computed over the UDP header, the message body, and a pseudo-header
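The checksum rule above can be sketched in a few lines. This is a minimal illustration, assuming IPv4 and the usual pseudo-header layout (source address, destination address, a zero byte, protocol number 17, UDP length); the function names, addresses, and ports are hypothetical and chosen only for the example.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def udp_checksum(src_ip: str, dst_ip: str, src_port: int, dst_port: int,
                 payload: bytes) -> int:
    """Checksum over pseudo-header + UDP header + message body (IPv4 assumed)."""
    length = 8 + len(payload)  # UDP header is 8 bytes
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, 17, length))  # 17 = UDP protocol number
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)  # checksum field zeroed
    csum = internet_checksum(pseudo + header + payload)
    return csum or 0xFFFF  # a computed 0 is sent as 0xFFFF; 0 means "no checksum"


# Hypothetical endpoints, for illustration only.
example = udp_checksum("10.0.0.1", "10.0.0.2", 5000, 53, b"hello")
```

A receiver can verify a datagram by running `internet_checksum` over the same bytes with the received checksum in place; a result of 0 indicates no detected error.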
Limitations of Networks
• Packet losses
• Re-ordering
• Duplicate copies
• Limit on maximum message size
• Long delays

Transmission Control Protocol (TCP)
• Connection oriented
  • Maintains state to provide reliable service
• Byte-stream oriented
  • Handles byte streams instead of messages
• Full duplex
  • Supports flow of data in each direction
• Flow control
  • Prevents the sender from overrunning the receiver
• Congestion control
  • Prevents the sender from overloading the network

TCP Cont...
[Figure: the sending application process writes bytes into the TCP send buffer; TCP transmits segments; the receiving TCP holds them in its receive buffer, from which the receiving application process reads bytes.]

Sliding Window: Data Link vs Transport
P2P: Dedicated link. A physical link connects the same two computers
TCP: Connects two processes on any two machines in the Internet
  • Needs an explicit connection establishment phase to exchange state
P2P: Fixed round trip time (RTT)
TCP: Potentially different and widely varying RTTs
  • Timeout mechanism has to be adaptive
P2P: No reordering
TCP: Scope for reordering due to arbitrarily long delays
  • Needs to be robust against old packets showing up suddenly
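Sliding Window: Data Link vs Transport
P2P: End points can be engineered to support the link
TCP: Any kind of computer can be connected to the Internet
  • Need a mechanism for each side to learn the other side's resources (e.g. buffer space): flow control
P2P: Not possible to unknowingly congest the link
TCP: No idea what links will be traversed; network capacity can vary dynamically due to competing traffic
  • Need a mechanism to alter the sending rate in response to network congestion: congestion control

TCP Header Format
 0       4        10       16                     31
 +-------+--------+--------+------------------------+
 |         SrcPort         |        DstPort         |
 +-------------------------+------------------------+
 |                   SequenceNum                    |
 +--------------------------------------------------+
 |                  Acknowledgment                  |
 +-------+--------+--------+------------------------+
 | HdrLen|   0    | Flags  |    AdvertisedWindow    |
 +-------+--------+--------+------------------------+
 |        Checksum         |         UrgPtr         |
 +-------------------------+------------------------+
 |                Options (variable)                |
 +--------------------------------------------------+
 |                       Data                       |
 +--------------------------------------------------+

[Figure: the sender sends Data (SequenceNum) to the receiver; the receiver returns Acknowledgment + AdvertisedWindow.]

State Transition Diagram
[Figure: TCP state transition diagram.]

Connection Establishment
Active participant (client) and passive participant (server):
  1. Client to server: SYN, SequenceNum = x
  2. Server to client: SYN + ACK, SequenceNum = y, Acknowledgment = x + 1
  3. Client to server: ACK, Acknowledgment = y + 1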
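The three-segment exchange used for connection establishment can be traced with a small model. This is a sketch only: the `Segment` class and `three_way_handshake` function are hypothetical names, no real sockets are involved, and the initial sequence numbers x and y are passed in rather than chosen by a real TCP stack.

```python
from dataclasses import dataclass
from typing import Optional

SEQ_SPACE = 2 ** 32  # sequence numbers are 32 bits wide, so arithmetic wraps


@dataclass(frozen=True)
class Segment:
    flags: frozenset
    seq: Optional[int] = None
    ack: Optional[int] = None


def three_way_handshake(x: int, y: int) -> list:
    """Return the three segments exchanged, given initial sequence numbers x and y."""
    # 1. Active participant (client) sends SYN with SequenceNum = x.
    syn = Segment(frozenset({"SYN"}), seq=x)
    # 2. Passive participant (server) replies SYN + ACK: SequenceNum = y, Acknowledgment = x + 1.
    syn_ack = Segment(frozenset({"SYN", "ACK"}), seq=y, ack=(syn.seq + 1) % SEQ_SPACE)
    # 3. Client acknowledges the server's SYN: Acknowledgment = y + 1.
    ack = Segment(frozenset({"ACK"}), ack=(syn_ack.seq + 1) % SEQ_SPACE)
    return [syn, syn_ack, ack]


segments = three_way_handshake(x=100, y=300)
```

Each acknowledgment names the next sequence number the acknowledging side expects, which is why it is one more than the sequence number being acknowledged.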
Sliding Window Recap
[Figure: the sending application writes into the sending TCP's buffer, marked by LastByteAcked, LastByteSent, and LastByteWritten; the receiving application reads from the receiving TCP's buffer, marked by LastByteRead, NextByteExpected, and LastByteRcvd.]
Sending side:
• LastByteAcked <= LastByteSent
• LastByteSent <= LastByteWritten
• Buffer bytes between LastByteAcked and LastByteWritten
Receiving side:
• LastByteRead <= NextByteExpected
• NextByteExpected <= LastByteRcvd + 1
• Buffer bytes between LastByteRead and LastByteRcvd

Protection Against Wraparound
• Wraparound occurs because the sequence number field is finite
• 32-bit sequence number space
• Maximum Segment Lifetime (MSL) is 120 sec

  Bandwidth            Time until wraparound
  T1 (1.5 Mbps)        6.6 hrs
  Ethernet (10 Mbps)   57 minutes
  T3 (45 Mbps)         13 minutes
  FDDI (100 Mbps)      6 minutes
  STS-3 (155 Mbps)     4 minutes
  STS-12 (622 Mbps)    55 seconds
  STS-24 (1.2 Gbps)    28 seconds

Flow Control
• Buffers are of finite size
  • MaxSendBuffer and MaxRcvBuffer
• Receiving side:
  • LastByteRcvd - LastByteRead <= MaxRcvBuffer
  • AdvertisedWindow = MaxRcvBuffer - ((NextByteExpected - 1) - LastByteRead)
• Sending side:
  • LastByteSent - LastByteAcked <= AdvertisedWindow
  • EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked)
  • LastByteWritten - LastByteAcked <= MaxSendBuffer
• Persist when AdvertisedWindow is zero

Congestion Control
• At steady state use self-clocking
  • Acks pace transmission of packets
• Challenges:
  • How to determine available capacity?
  • How to adjust sending rate to varying capacity?
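Returning to the Flow Control relations, the window arithmetic can be checked with concrete numbers. The sketch below uses hypothetical `Receiver` and `Sender` classes whose fields mirror the slide variables; the byte counts are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Receiver:
    max_rcv_buffer: int
    last_byte_read: int = 0
    next_byte_expected: int = 1
    last_byte_rcvd: int = 0

    def advertised_window(self) -> int:
        # AdvertisedWindow = MaxRcvBuffer - ((NextByteExpected - 1) - LastByteRead)
        return self.max_rcv_buffer - ((self.next_byte_expected - 1) - self.last_byte_read)


@dataclass
class Sender:
    last_byte_acked: int = 0
    last_byte_sent: int = 0

    def effective_window(self, advertised_window: int) -> int:
        # EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked)
        return advertised_window - (self.last_byte_sent - self.last_byte_acked)


# Illustrative values: the application has read 1000 bytes, 3000 arrived in order.
rx = Receiver(max_rcv_buffer=4096, last_byte_read=1000,
              next_byte_expected=3001, last_byte_rcvd=3000)
tx = Sender(last_byte_acked=3000, last_byte_sent=4500)

adv = rx.advertised_window()    # 4096 - 2000 = 2096 bytes of free buffer
eff = tx.effective_window(adv)  # 2096 - 1500 = 596 bytes may still be sent
```

When the receiving application stops reading, `advertised_window()` shrinks toward zero, which is the situation the persist rule addresses.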
Congestion Avoidance: Additive Increase/Multiplicative Decrease
• Introduce a new variable: CongestionWindow
  • Limits the amount of data in transit
  • MaxWindow = Minimum of (CongestionWindow, AdvertisedWindow)
  • EffectiveWindow = MaxWindow - (LastByteSent - LastByteAcked)
• Adjust CongestionWindow to changes in capacity
  • Decrease CongestionWindow when congestion goes up
  • Increase CongestionWindow when congestion goes down

AIMD Cont...
• Problem: How do we detect congestion?
• Answer: Timeouts
  • TCP interprets a timeout as a result of congestion
• Multiplicative decrease: cut CongestionWindow by half on each timeout
• Additive increase: increase CongestionWindow by one Maximum Segment Size (MSS) per RTT
  • In practice, increment a little on each ack
  • CongestionWindow += Increment
  • Increment = MSS * (MSS/CongestionWindow)

Saw Tooth Pattern
[Figure: sawtooth plot against Time (seconds), 1.0 to 10.0 on the horizontal axis and 10 to 70 on the vertical axis.]

Slow Start
• AIMD approach is used at steady state
• But how to get to steady state?
• Increase CongestionWindow exponentially
  • Begin with CongestionWindow = 1
  • Double CongestionWindow every RTT
• “Slow” compared to sending the entire advertised window all at once
• Used at the beginning of a connection
• Used when the connection goes dead due to a timeout
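The window update rules for slow start and AIMD can be traced with a short simulation. This is a sketch under stated assumptions: windows are measured in units of one MSS, the function names are hypothetical, the window is never cut below one MSS, and a timeout is assumed to occur whenever the window exceeds a fixed `capacity` (a stand-in loss model, not something the slides specify).

```python
def slow_start(rtts: int, mss: float = 1.0) -> list:
    """Begin with CongestionWindow = 1 MSS and double it every RTT."""
    cwnd = mss
    trace = [cwnd]
    for _ in range(rtts):
        cwnd *= 2
        trace.append(cwnd)
    return trace


def congestion_avoidance_rtt(cwnd: float, mss: float = 1.0) -> float:
    """One RTT of additive increase: a small increment for each ack received."""
    acks = int(cwnd // mss)  # roughly one ack per segment in the window
    for _ in range(acks):
        cwnd += mss * (mss / cwnd)  # Increment = MSS * (MSS / CongestionWindow)
    return cwnd


def multiplicative_decrease(cwnd: float, mss: float = 1.0) -> float:
    """Cut CongestionWindow in half on a timeout (assumed floor of one MSS)."""
    return max(cwnd / 2, mss)


def aimd_sawtooth(rtts: int, capacity: float, mss: float = 1.0) -> list:
    """Trace CongestionWindow per RTT under the assumed loss model."""
    cwnd = mss
    trace = []
    for _ in range(rtts):
        trace.append(cwnd)
        if cwnd > capacity:  # assumption: overshooting capacity causes a timeout
            cwnd = multiplicative_decrease(cwnd, mss)
        else:
            cwnd = congestion_avoidance_rtt(cwnd, mss)
    return trace


doubling = slow_start(4)          # window doubles each RTT
sawtooth = aimd_sawtooth(50, 8.0)  # climbs slowly, halves, climbs again
```

Because the per-ack increment shrinks as the window grows, one RTT of acks adds slightly less than one MSS in total, which is the "in practice" approximation of additive increase.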
Congestion Window vs Time
[Figure: congestion window against time, with levels Cwnd and Cwnd/2 marked and phases labeled Slow Start, Congestion Avoidance, Waiting for Timeout, and Timeout.]

Fast Retransmit/Fast Recovery
• Fast Retransmit: use duplicate acks to trigger retransmission
• Fast Recovery: perform congestion avoidance instead of slow start
[Figure: the sender transmits Packets 1 to 6. The receiver returns ACK 1 and ACK 2, then repeats ACK 2 three more times. The sender retransmits packet 3 and the receiver answers with ACK 6.]

RTT Estimation: Original Algorithm
• Measure SampleRTT for each sequence/ack combination
• EstimatedRTT = a*EstimatedRTT + (1-a)*SampleRTT
  • a is between 0.8 and 0.9
  • A small a is heavily influenced by temporary fluctuations
  • A large a is not quick to adapt to real changes
• Timeout = 2 * EstimatedRTT
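The original estimator is an exponentially weighted moving average, which a few lines make concrete. This is a minimal sketch: the function names are hypothetical, the default a = 0.875 is just one value inside the 0.8 to 0.9 range quoted above, and the RTT samples (in milliseconds) are invented.

```python
def update_estimated_rtt(estimated_rtt: float, sample_rtt: float, a: float = 0.875) -> float:
    """EstimatedRTT = a * EstimatedRTT + (1 - a) * SampleRTT."""
    return a * estimated_rtt + (1 - a) * sample_rtt


def timeout(estimated_rtt: float) -> float:
    """Timeout = 2 * EstimatedRTT."""
    return 2 * estimated_rtt


# Illustrative samples in milliseconds, including one temporary spike.
estimate = 100.0
for sample in [100.0, 100.0, 200.0, 100.0]:
    estimate = update_estimated_rtt(estimate, sample)

# Effect of a on a single spike from 100 ms to 200 ms:
jumpy = update_estimated_rtt(100.0, 200.0, a=0.1)     # small a follows the spike closely
sluggish = update_estimated_rtt(100.0, 200.0, a=0.9)  # large a barely moves
```

Comparing `jumpy` and `sluggish` shows the trade-off in choosing a: responsiveness to real changes versus stability against momentary fluctuations.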
Jacobson/Karels Algorithm
• Incorrect estimation of RTT worsens congestion
• The algorithm takes the variance of RTTs into account
  • If the variance is small, EstimatedRTT can be trusted
  • If the variance is large, the timeout should not depend heavily on EstimatedRTT

Jacobson/Karels Algorithm Cont..
• Difference = SampleRTT - EstimatedRTT
• EstimatedRTT = EstimatedRTT + (d * Difference)
• Deviation = Deviation + d * (|Difference| - Deviation), where d ~ 0.125
• Timeout = u * EstimatedRTT + q * Deviation, where u = 1 and q = 4
• Exponential RTO backoff

Summary
• Transport protocols essentially provide demultiplexing functionality
• Examples: UDP, TCP, RTP
• TCP is a reliable, connection-oriented, byte-stream protocol
  • Sliding window based
  • Provides flow and congestion control
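As a closing illustration, the Jacobson/Karels update rules listed earlier can be sketched as a small class. The class and function names are hypothetical, the sample values are invented, the initial Deviation of 0 is an assumption, and the backoff helper assumes the timeout doubles on each retransmission.

```python
from dataclasses import dataclass


@dataclass
class RttEstimator:
    estimated_rtt: float
    deviation: float = 0.0  # assumed starting value
    d: float = 0.125
    u: float = 1.0
    q: float = 4.0

    def update(self, sample_rtt: float) -> float:
        """Fold one SampleRTT into the estimate and return the new Timeout."""
        difference = sample_rtt - self.estimated_rtt
        self.estimated_rtt += self.d * difference
        self.deviation += self.d * (abs(difference) - self.deviation)
        return self.timeout()

    def timeout(self) -> float:
        # Timeout = u * EstimatedRTT + q * Deviation
        return self.u * self.estimated_rtt + self.q * self.deviation


def backed_off_timeout(base_timeout: float, retransmissions: int) -> float:
    """Exponential RTO backoff, assuming a doubling per retransmission."""
    return base_timeout * (2 ** retransmissions)


est = RttEstimator(estimated_rtt=100.0)
rto = est.update(180.0)  # Difference = 80, EstimatedRTT = 110, Deviation = 10
```

With a steady stream of similar samples, Deviation decays and the timeout approaches EstimatedRTT; a burst of widely varying samples inflates Deviation and pushes the timeout up.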