Reliable Byte-Stream (TCP)

Outline
  Connection Establishment/Termination
  Sliding Window Revisited
  Flow Control
  Adaptive Timeout

Spring 2005 CS 461

End-to-End Protocols
• Underlying best-effort network
  – drops messages
  – re-orders messages
  – delivers duplicate copies of a given message
  – limits messages to some finite size
  – delivers messages after an arbitrarily long delay
• Common end-to-end services
  – guarantee message delivery
  – deliver messages in the same order they are sent
  – deliver at most one copy of each message
  – support arbitrarily large messages
  – support synchronization
  – allow the receiver to flow control the sender
  – support multiple application processes on each host

Simple Demultiplexor (UDP)
• Unreliable and unordered datagram service
• Adds multiplexing
• No flow control
• Endpoints identified by ports
  – servers have well-known ports
  – see /etc/services on Unix
• Header format
    0               16              31
    SrcPort         DstPort
    Length          Checksum
    Data
• Optional checksum
  – pseudo header + UDP header + data

TCP Overview
• Connection-oriented
• Byte-stream
  – app writes bytes
  – TCP sends segments
  – app reads bytes
• Full duplex
• Flow control: keep sender from overrunning receiver
• Congestion control: keep sender from overrunning network
[Figure: the sending application process writes bytes into the TCP send buffer; TCP transmits segments; the receiving TCP places them in its receive buffer, from which the application process reads bytes.]
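The checksum coverage listed on the UDP slide (pseudo header + UDP header + data) can be sketched as follows. This is a minimal illustration, assuming IPv4, the standard 16-bit one's-complement Internet checksum, and the RFC 768 field order (SrcPort, DstPort, Length, Checksum). The function names are hypothetical and not taken from the slides.

```python
import struct

UDP_PROTOCOL = 17  # IPv4 protocol number for UDP


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum of data, zero-padded to an even length."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return total


def pseudo_header(src_ip: bytes, dst_ip: bytes, udp_length: int) -> bytes:
    """IPv4 pseudo header: addresses, zero byte, protocol, UDP length."""
    return src_ip + dst_ip + struct.pack("!BBH", 0, UDP_PROTOCOL, udp_length)


def build_udp(src_ip: bytes, dst_ip: bytes,
              src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Build a UDP segment whose checksum covers pseudo header + header + data."""
    length = 8 + len(payload)  # 8-byte UDP header plus data
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    covered = pseudo_header(src_ip, dst_ip, length) + header + payload
    checksum = ~ones_complement_sum(covered) & 0xFFFF
    checksum = checksum or 0xFFFF  # 0 is reserved to mean "no checksum"
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload


def udp_checksum_ok(src_ip: bytes, dst_ip: bytes, segment: bytes) -> bool:
    """A valid segment sums to 0xFFFF when the checksum field is included."""
    covered = pseudo_header(src_ip, dst_ip, len(segment)) + segment
    return ones_complement_sum(covered) == 0xFFFF
```

A receiver would call `udp_checksum_ok` with the addresses from the IP header. Because the pseudo header is part of the sum, a segment delivered to the wrong host fails the check. TCP uses the same coverage with its own header.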
Data Link Versus Transport
• Potentially connects many different hosts
  – need explicit connection establishment and termination
• Potentially different RTT
  – need adaptive timeout mechanism
• Potentially long delay in network
  – need to be prepared for arrival of very old packets
• Potentially different capacity at destination
  – need to accommodate different node capacity
• Potentially different network capacity
  – need to be prepared for network congestion

Segment Format
    0     4     10        16              31
    SrcPort               DstPort
    SequenceNum
    Acknowledgment
    HdrLen  0   Flags     AdvertisedWindow
    Checksum              UrgPtr
    Options (variable)
    Data

Segment Format (cont)
• Each connection identified with 4-tuple:
  – (SrcPort, SrcIPAddr, DstPort, DstIPAddr)
• Sliding window + flow control
  – Acknowledgment, SequenceNum, AdvertisedWindow
  [Figure: Sender sends Data (SequenceNum) to Receiver; Receiver returns Acknowledgment + AdvertisedWindow.]
• Flags
  – SYN, FIN, RESET, PUSH, URG, ACK
• Checksum
  – pseudo header + TCP header + data

Connection Establishment and Termination
[Figure: three-way handshake between the active participant (client) and the passive participant (server):]
  1. client to server: SYN, SequenceNum = x
  2. server to client: SYN + ACK, SequenceNum = y, Acknowledgment = x + 1
  3. client to server: ACK, Acknowledgment = y + 1
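The three-way handshake on the Connection Establishment slide can be traced with a short sketch. It models only the Flags, SequenceNum, and Acknowledgment fields. The `Segment` class and `three_way_handshake` function are hypothetical names for illustration, and the SequenceNum of the final ACK (x + 1) is filled in from standard TCP behavior since the slide does not show it.

```python
from dataclasses import dataclass

SEQ_MOD = 2 ** 32  # SequenceNum and Acknowledgment are 32-bit fields


@dataclass(frozen=True)
class Segment:
    flags: frozenset  # subset of {"SYN", "ACK"}
    seq: int = 0      # SequenceNum
    ack: int = 0      # Acknowledgment (meaningful only when ACK is set)


def three_way_handshake(x: int, y: int) -> list:
    """Return the three segments exchanged for client ISN x and server ISN y."""
    # 1. Active participant (client) sends SYN with its initial sequence number.
    syn = Segment(frozenset({"SYN"}), seq=x)
    # 2. Passive participant (server) answers with its own SYN and acknowledges x.
    syn_ack = Segment(frozenset({"SYN", "ACK"}),
                      seq=y, ack=(syn.seq + 1) % SEQ_MOD)
    # 3. Client acknowledges the server's SYN; its own SYN consumed x, so it
    #    continues at x + 1 (an assumption not shown on the slide).
    ack = Segment(frozenset({"ACK"}),
                  seq=syn_ack.ack, ack=(syn_ack.seq + 1) % SEQ_MOD)
    return [syn, syn_ack, ack]
```

Each Acknowledgment names the next sequence number the sender of the ACK expects, which is why the SYN itself consumes one sequence number in each direction.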
State Transition Diagram
[Figure: TCP state transition diagram. Each transition is labeled event/action.]
  CLOSED to LISTEN: Passive open
  CLOSED to SYN_SENT: Active open/SYN
  LISTEN to CLOSED: Close
  LISTEN to SYN_RCVD: SYN/SYN + ACK
  LISTEN to SYN_SENT: Send/SYN
  SYN_SENT to CLOSED: Close
  SYN_SENT to SYN_RCVD: SYN/SYN + ACK
  SYN_SENT to ESTABLISHED: SYN + ACK/ACK
  SYN_RCVD to ESTABLISHED: ACK
  SYN_RCVD to FIN_WAIT_1: Close/FIN
  ESTABLISHED to FIN_WAIT_1: Close/FIN
  ESTABLISHED to CLOSE_WAIT: FIN/ACK
  FIN_WAIT_1 to FIN_WAIT_2: ACK
  FIN_WAIT_1 to CLOSING: FIN/ACK
  FIN_WAIT_1 to TIME_WAIT: ACK + FIN/ACK
  FIN_WAIT_2 to TIME_WAIT: FIN/ACK
  CLOSING to TIME_WAIT: ACK
  CLOSE_WAIT to LAST_ACK: Close/FIN
  LAST_ACK to CLOSED: ACK
  TIME_WAIT to CLOSED: Timeout after two segment lifetimes

Sliding Window Revisited
[Figure: the sending application writes into the sending TCP's buffer, marked by LastByteAcked, LastByteSent, and LastByteWritten; the receiving application reads from the receiving TCP's buffer, marked by LastByteRead, NextByteExpected, and LastByteRcvd.]
• Sending side
  – LastByteAcked <= LastByteSent
  – LastByteSent <= LastByteWritten
  – buffer bytes between LastByteAcked and LastByteWritten
• Receiving side
  – LastByteRead < NextByteExpected
  – NextByteExpected <= LastByteRcvd + 1
  – buffer bytes between NextByteRead and LastByteRcvd

Flow Control
• Send buffer size: MaxSendBuffer
• Receive buffer size: MaxRcvBuffer
• Receiving side
  – LastByteRcvd - LastByteRead <= MaxRcvBuffer
  – AdvertisedWindow = MaxRcvBuffer - (NextByteExpected - NextByteRead)
• Sending side
  – LastByteSent - LastByteAcked <= AdvertisedWindow
  – EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked)
  – LastByteWritten - LastByteAcked <= MaxSendBuffer
  – block sender if (LastByteWritten - LastByteAcked) + y > MaxSendBuffer, where y is the number of bytes the application is trying to write
• Always send ACK in response to arriving data segment
• Persist when AdvertisedWindow = 0

Silly Window Syndrome
• How aggressively does sender exploit open window?
[Figure: Sender/Receiver timeline.]
• Receiver-side solutions
  – after advertising zero window, wait for space equal to a maximum segment size (MSS)
  – delayed acknowledgements
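The window arithmetic from the Flow Control slide can be written out directly. The sketch below is illustrative only: the class and method names are hypothetical, NextByteRead is assumed to equal LastByteRead + 1, and clamping the effective window at zero is an added guard that the slide does not state.

```python
from dataclasses import dataclass


@dataclass
class Receiver:
    max_rcv_buffer: int      # MaxRcvBuffer
    next_byte_read: int      # NextByteRead (assumed to be LastByteRead + 1)
    next_byte_expected: int  # NextByteExpected

    def advertised_window(self) -> int:
        # AdvertisedWindow = MaxRcvBuffer - (NextByteExpected - NextByteRead)
        buffered = self.next_byte_expected - self.next_byte_read
        return self.max_rcv_buffer - buffered


@dataclass
class Sender:
    max_send_buffer: int     # MaxSendBuffer
    last_byte_acked: int     # LastByteAcked
    last_byte_sent: int      # LastByteSent
    last_byte_written: int   # LastByteWritten

    def effective_window(self, advertised_window: int) -> int:
        # EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked)
        in_flight = self.last_byte_sent - self.last_byte_acked
        # Clamp at zero (an added guard) in case the receiver shrank the window.
        return max(0, advertised_window - in_flight)

    def must_block(self, y: int) -> bool:
        # Block the writer if y more bytes would overflow the send buffer.
        unacked = self.last_byte_written - self.last_byte_acked
        return unacked + y > self.max_send_buffer
```

For example, a receiver with a 4096-byte buffer holding 2000 unread in-order bytes advertises 2096 bytes. A sender with 1500 bytes in flight may then transmit at most 596 more before it must wait for an ACK.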
Nagle’s Algorithm
• How long does sender delay sending data?
  – too long: hurts interactive applications
  – too short: poor network utilization
  – strategies: timer-based vs self-clocking
• When application generates additional data
  – if fills a max segment (and window open): send it
  – else
    • if there is unack’ed data in transit: buffer it until ACK arrives
    • else: send it

Protection Against Wrap Around
• 32-bit SequenceNum

    Bandwidth             Time Until Wrap Around
    T1 (1.5 Mbps)         6.4 hours
    Ethernet (10 Mbps)    57 minutes
    T3 (45 Mbps)          13 minutes
    FDDI (100 Mbps)       6 minutes
    STS-3 (155 Mbps)      4 minutes
    STS-12 (622 Mbps)     55 seconds
    STS-24 (1.2 Gbps)     28 seconds

Keeping the Pipe Full
• 16-bit AdvertisedWindow

    Bandwidth             Delay x Bandwidth Product
    T1 (1.5 Mbps)         18KB
    Ethernet (10 Mbps)    122KB
    T3 (45 Mbps)          549KB
    FDDI (100 Mbps)       1.2MB
    STS-3 (155 Mbps)      1.8MB
    STS-12 (622 Mbps)     7.4MB
    STS-24 (1.2 Gbps)     14.8MB
    (assuming 100ms RTT)

TCP Extensions
• Implemented as header options
• Store timestamp in outgoing segments
• Extend sequence space with 32-bit timestamp (PAWS)
• Shift (scale) advertised window
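The two tables above (time until wrap around, and delay x bandwidth product) follow from short arithmetic, sketched here. The sketch assumes the sequence number counts bytes, that KB means 1024 bytes, and that bandwidths are the nominal figures in the tables. The function names are hypothetical, and `window_scale_shift` is only an illustration of why the window-scaling extension is needed, not a description of how any implementation chooses its shift.

```python
SEQ_SPACE_BYTES = 2 ** 32            # 32-bit SequenceNum, one number per byte
MAX_ADVERTISED_WINDOW = 2 ** 16 - 1  # largest value of a 16-bit field


def wrap_time_seconds(bandwidth_bps: float) -> float:
    """Time to send 2^32 bytes, after which sequence numbers repeat."""
    return SEQ_SPACE_BYTES * 8 / bandwidth_bps


def delay_bandwidth_bytes(bandwidth_bps: float, rtt_seconds: float = 0.1) -> float:
    """Bytes that must be in flight to keep the pipe full for one RTT."""
    return bandwidth_bps * rtt_seconds / 8


def window_scale_shift(bandwidth_bps: float, rtt_seconds: float = 0.1) -> int:
    """Smallest left shift of a 16-bit window that covers the pipe."""
    needed = delay_bandwidth_bytes(bandwidth_bps, rtt_seconds)
    shift = 0
    while (MAX_ADVERTISED_WINDOW << shift) < needed:
        shift += 1
    return shift
```

With these definitions, `wrap_time_seconds(10e6) / 60` comes to about 57 minutes and `delay_bandwidth_bytes(10e6) / 1024` to about 122 KB, matching the Ethernet rows. A 16-bit window holds at most 65,535 bytes, so every link from Ethernet upward needs a nonzero shift at a 100 ms RTT.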
Adaptive Retransmission (Original Algorithm)
• Measure SampleRTT for each segment / ACK pair
• Compute weighted average of RTT
  – EstRTT = α x EstRTT + β x SampleRTT
  – where α + β = 1
  – α between 0.8 and 0.9
  – β between 0.1 and 0.2
• Set timeout based on EstRTT
  – TimeOut = 2 x EstRTT

Karn/Partridge Algorithm
[Figure: two Sender/Receiver timelines, each showing an original transmission, a retransmission, and an ACK. In one, SampleRTT is measured from the original transmission; in the other, from the retransmission.]
• Do not sample RTT when retransmitting
• Double timeout after each retransmission

Jacobson/Karels Algorithm
• New calculations for average RTT
• Diff = SampleRTT - EstRTT
• EstRTT = EstRTT + (δ x Diff)
• Dev = Dev + δ (|Diff| - Dev)
  – where δ is a factor between 0 and 1
• Consider variance when setting timeout value
• TimeOut = µ x EstRTT + φ x Dev
  – where µ = 1 and φ = 4
• Notes
  – algorithm only as good as granularity of clock (500ms on Unix)
  – accurate timeout mechanism important to congestion control (later)
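The Jacobson/Karels update, combined with the two Karn/Partridge rules, can be sketched as a small estimator. This is an illustration under stated assumptions rather than any stack's actual code: the class name is hypothetical, δ = 0.125 is an assumed value (the slide only says between 0 and 1), Dev starts at 0, and resetting the backoff after a clean sample is an added choice.

```python
class RttEstimator:
    """Jacobson/Karels timeout estimate with Karn/Partridge sampling rules."""

    def __init__(self, initial_rtt: float, delta: float = 0.125,
                 mu: float = 1.0, phi: float = 4.0):
        self.est_rtt = initial_rtt  # EstRTT
        self.dev = 0.0              # Dev (assumed to start at 0)
        self.delta = delta          # assumed value; slide says 0 < delta < 1
        self.mu = mu                # slide: mu = 1
        self.phi = phi              # slide: phi = 4
        self.backoff = 1            # multiplier doubled on each retransmission

    def on_ack(self, sample_rtt: float, retransmitted: bool = False) -> None:
        if retransmitted:
            return  # Karn/Partridge: do not sample RTT when retransmitting
        diff = sample_rtt - self.est_rtt
        self.est_rtt += self.delta * diff
        self.dev += self.delta * (abs(diff) - self.dev)
        self.backoff = 1  # added choice: a clean sample ends the backoff

    def on_timeout(self) -> None:
        self.backoff *= 2  # Karn/Partridge: double timeout per retransmission

    def timeout(self) -> float:
        # TimeOut = mu x EstRTT + phi x Dev, scaled by the current backoff
        return self.backoff * (self.mu * self.est_rtt + self.phi * self.dev)
```

Starting from EstRTT = 100 ms, a single 200 ms sample moves EstRTT to 112.5 ms and Dev to 12.5 ms, giving a timeout of 162.5 ms. The deviation term is what lets the timeout widen quickly when samples become erratic, which a fixed 2 x EstRTT cannot do.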