Introduction and Transport-Layer Services
A transport-layer protocol provides for logical communication between application processes running on different hosts. The transport layer converts the network layer’s host-to-host delivery into process-to-process delivery.
Relationship Between Transport and Network Layers:
- Network layer: logical communication between hosts (IP, best-effort)
- Transport layer: logical communication between processes (extending the network service)
- Transport protocols are implemented only in the end systems, not in network routers (end-to-end principle)
Internet Transport Protocols:
- TCP — reliable, connection-oriented, congestion control, flow control
- UDP — unreliable, connectionless, no frills
Multiplexing and Demultiplexing
Demultiplexing — delivering the data in a transport-layer segment to the correct socket at the receiving host.
Multiplexing — gathering data from multiple sockets at the source host, encapsulating with header info (for demultiplexing), and passing to the network layer.
How demultiplexing works:
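- UDP: socket identified by the 2-tuple (destination IP, destination port); segments from different sources with the same destination reach the same socket (connectionless)
- TCP: socket identified by the 4-tuple (source IP, source port, destination IP, destination port); each connection gets its own socket (connection-oriented)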
Multiplexing/demultiplexing is the mechanism by which the transport layer extends host-to-host delivery to process-to-process delivery.
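The difference between the two lookup keys can be sketched in Python. This is a minimal illustration, not how an operating system stores sockets: the table names, socket labels, and addresses are hypothetical, and TCP listening sockets (matched on destination only) are left out.

```python
from typing import Dict, Optional, Tuple

# Hypothetical lookup tables; a real OS keeps equivalent state in the kernel.
udp_sockets: Dict[Tuple[str, int], str] = {}
tcp_sockets: Dict[Tuple[str, int, str, int], str] = {}


def demux_udp(dst_ip: str, dst_port: int) -> Optional[str]:
    """UDP looks only at the destination (IP, port) pair."""
    return udp_sockets.get((dst_ip, dst_port))


def demux_tcp(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> Optional[str]:
    """TCP looks at the full 4-tuple, so each connection has its own socket."""
    return tcp_sockets.get((src_ip, src_port, dst_ip, dst_port))


# Example addresses are made up for illustration.
udp_sockets[("10.0.0.1", 53)] = "dns_socket"
tcp_sockets[("192.0.2.7", 50000, "10.0.0.1", 80)] = "conn_A"
tcp_sockets[("192.0.2.8", 50000, "10.0.0.1", 80)] = "conn_B"
```

Two TCP clients that use the same source port still map to different sockets because their source IPs differ, while every UDP datagram sent to 10.0.0.1:53 reaches the same socket.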
Connectionless Transport: UDP
UDP (User Datagram Protocol):
- No handshake before sending (connectionless)
- Unreliable — segment may be lost or delivered out of order
- No congestion control — sender can pump data at any rate
- Lightweight — minimal header overhead (8 bytes)
UDP Segment Structure:
| Field | Size |
|---|---|
| Source port | 16 bits |
| Destination port | 16 bits |
| Length | 16 bits |
| Checksum | 16 bits |
| Application data (payload) | variable |
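The four 16-bit fields above can be packed and unpacked with Python's `struct` module. This is a sketch for illustration: the function names are hypothetical, and the checksum is left at 0 rather than computed.

```python
import struct

UDP_HEADER_FORMAT = "!HHHH"  # network byte order, four unsigned 16-bit fields
UDP_HEADER_SIZE = 8          # bytes


def build_udp_header(src_port: int, dst_port: int, payload: bytes, checksum: int = 0) -> bytes:
    """Pack the 8-byte header; Length counts header plus payload, in bytes."""
    length = UDP_HEADER_SIZE + len(payload)
    return struct.pack(UDP_HEADER_FORMAT, src_port, dst_port, length, checksum)


def parse_udp_header(segment: bytes) -> dict:
    """Unpack the first 8 bytes of a UDP segment into named fields."""
    src_port, dst_port, length, checksum = struct.unpack(
        UDP_HEADER_FORMAT, segment[:UDP_HEADER_SIZE]
    )
    return {"src_port": src_port, "dst_port": dst_port,
            "length": length, "checksum": checksum}
```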
UDP Checksum:
- Detects errors in the segment (header + data)
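- Sender computes the one’s complement of the one’s complement sum of all 16-bit words in the segment plus the IP pseudo-header
- Receiver adds all 16-bit words, including the checksum; a result of all 1 bits means no error was detected (some errors can still go undetected)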
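The checksum arithmetic can be sketched as follows. To keep the example short, it covers only the bytes passed in and omits the IP pseudo-header; the function names are illustrative.

```python
def ones_complement_sum(data: bytes) -> int:
    """Add 16-bit big-endian words, wrapping any carry back into the low bits."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total


def internet_checksum(data: bytes) -> int:
    """Sender side: one's complement of the one's complement sum."""
    return ~ones_complement_sum(data) & 0xFFFF


def checksum_ok(data_with_checksum: bytes) -> bool:
    """Receiver side: the sum over data plus checksum must be all 1 bits."""
    return ones_complement_sum(data_with_checksum) == 0xFFFF
```

For the three words 0x6660, 0x5555, and 0x8F0C, the wrapped sum is 0x4AC2 and the checksum is 0xB53D; adding the checksum back gives 0xFFFF.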
Why use UDP? No connection establishment (no delay), no connection state (more clients), small header, no congestion control (can send at application-determined rate).
Principles of Reliable Data Transfer
rdt 1.0 — Reliable Transfer over a Perfectly Reliable Channel
The underlying channel is perfectly reliable — no bit errors, no packet loss.
rdt_send(data):
pkt = make_pkt(data)
udt_send(pkt)
rdt_rcv(pkt):
extract(pkt, data)
deliver_data(data)
rdt 2.0 — Reliable Transfer over a Channel with Bit Errors
Use ACK (positive acknowledgment) and NAK (negative acknowledgment), plus checksum for error detection. This is an Automatic Repeat reQuest (ARQ) protocol.
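Sender:
rdt_send(data):
sndpkt = make_pkt(data, checksum)
udt_send(sndpkt)
rdt_rcv(rcvpkt) and isNAK(rcvpkt):
udt_send(sndpkt)
rdt_rcv(rcvpkt) and isACK(rcvpkt): no action, wait for the next call from above
Receiver:
rdt_rcv(rcvpkt) and corrupt(rcvpkt):
udt_send(NAK)
rdt_rcv(rcvpkt) and notcorrupt(rcvpkt):
extract(rcvpkt, data)
deliver_data(data)
udt_send(ACK)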
Problem: What if ACK/NAK itself is corrupted? Solution: Add sequence numbers.
rdt 2.1 and rdt 2.2 — Handling Corrupted ACK/NAK
- Add a sequence number (0 or 1) to each packet
- Sender includes sequence number in packet
- Receiver includes ACK-ed sequence number in acknowledgment
- rdt 2.2: NAK-free protocol — receiver sends ACK with sequence number of last correctly received packet; sender retransmits if duplicate ACK received
rdt 3.0 — Reliable Transfer over a Lossy Channel with Bit Errors
Add a timer mechanism. Sender starts a timer after sending a packet. If timer expires before receiving ACK, retransmit the packet.
rdt 3.0 (Alternating-Bit Protocol) handles both bit errors and packet loss. Performance problem: stop-and-wait limits throughput to 1 packet per RTT.
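The alternating-bit behaviour can be traced with a small deterministic simulation. This sketch makes several simplifying assumptions: loss is scripted by transmission index instead of being random, a timeout is modelled as an immediate retransmission, and bit errors are not simulated. The function and parameter names are hypothetical.

```python
def run_alternating_bit(messages, lost_data=frozenset(), lost_acks=frozenset()):
    """Simulate stop-and-wait with 1-bit sequence numbers.

    lost_data / lost_acks hold the indices (0-based, counted over every data
    transmission) at which the data packet or its ACK is dropped.
    Returns (delivered messages, total data transmissions).
    """
    delivered = []
    expected = 0  # receiver: sequence number it is waiting for
    seq = 0       # sender: sequence number of the current packet
    tx = 0        # total data transmissions so far
    for msg in messages:
        acked = False
        while not acked:
            attempt = tx
            tx += 1
            if attempt in lost_data:
                continue  # packet lost: timer expires, sender retransmits
            if seq == expected:
                delivered.append(msg)  # new packet: deliver it upward
                expected ^= 1
            # otherwise it is a duplicate: discard it, but still re-ACK
            if attempt in lost_acks:
                continue  # ACK lost: timer expires, sender retransmits
            acked = True
        seq ^= 1  # flip the bit for the next message
    return delivered, tx
```

With `lost_data={1}` and `lost_acks={3}`, three messages need five transmissions, and the receiver still delivers each message exactly once because the sequence bit exposes the duplicate.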
Pipelined Reliable Data Transfer
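Stop-and-wait utilization: U_sender = (L/R) / (RTT + L/R), where L is the packet size in bits and R is the link rate in bits per second. When RTT is large relative to L/R, utilization is very low.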
Pipelining allows multiple packets to be in flight simultaneously, increasing throughput.
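A quick calculation shows how low stop-and-wait utilization can get and how pipelining helps. The numbers (8000-bit packets, a 1 Gbps link, a 30 ms RTT) are assumed example values, and the pipelined formula assumes the sender keeps exactly n packets in flight per round trip.

```python
def stop_and_wait_utilization(packet_bits: float, rate_bps: float, rtt_s: float) -> float:
    """Fraction of time the sender is busy transmitting: (L/R) / (RTT + L/R)."""
    t_trans = packet_bits / rate_bps
    return t_trans / (rtt_s + t_trans)


def pipelined_utilization(n: int, packet_bits: float, rate_bps: float, rtt_s: float) -> float:
    """With n packets in flight per round trip, utilization scales by n (capped at 1)."""
    t_trans = packet_bits / rate_bps
    return min(1.0, n * t_trans / (rtt_s + t_trans))


u = stop_and_wait_utilization(8000, 1e9, 0.030)
print(f"stop-and-wait: {u:.6f}")
print(f"3 in flight:   {pipelined_utilization(3, 8000, 1e9, 0.030):.6f}")
```

Under these assumptions the sender is busy for roughly 0.027% of the time, and three packets in flight triple that figure.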
Go-Back-N (GBN)
- Sender can have up to N unACK-ed packets
- Sender maintains a base (oldest unACK-ed) and nextseqnum (next to send)
- A single timer for the oldest unACK-ed packet
- If timeout occurs, retransmit all packets from base to nextseqnum-1
- Receiver only accepts packets in order; discards out-of-order packets (no buffering)
- Cumulative ACK: ACK(n) acknowledges all packets up through n
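The sender-side bookkeeping listed above can be sketched as a small class. It is a simplified model: sequence numbers grow without wrapping, the timer is represented only by an explicit `on_timeout()` call, and "sending" just appends to a log. The class and method names are hypothetical.

```python
class GBNSender:
    """Go-Back-N sender state: base, nextseqnum, and a window of size N."""

    def __init__(self, window_size: int):
        self.N = window_size
        self.base = 0          # oldest unACKed sequence number
        self.nextseqnum = 0    # next sequence number to use
        self.buffer = {}       # seq -> data kept for possible retransmission
        self.sent_log = []     # every transmission, in order

    def send(self, data) -> bool:
        """Send if the window has room; return False if the window is full."""
        if self.nextseqnum >= self.base + self.N:
            return False
        self.buffer[self.nextseqnum] = data
        self.sent_log.append(self.nextseqnum)
        self.nextseqnum += 1
        return True

    def on_ack(self, n: int) -> None:
        """Cumulative ACK: everything up through n is acknowledged."""
        if self.base <= n < self.nextseqnum:
            for seq in range(self.base, n + 1):
                self.buffer.pop(seq, None)
            self.base = n + 1

    def on_timeout(self) -> list:
        """Retransmit every packet from base to nextseqnum - 1."""
        resent = list(range(self.base, self.nextseqnum))
        self.sent_log.extend(resent)
        return resent
```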
Selective Repeat (SR)
- Sender: timer for each in-flight packet
- Only retransmits the specific packet that timed out
- Receiver: acknowledges each correctly received packet, buffers out-of-order packets
- Sender window size must be <= half the sequence number space to avoid ambiguity
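The window-size limit can be checked with a worst-case scenario: the receiver gets the entire first window and advances, but every ACK is lost, so the sender retransmits the old packets. If an old sequence number also falls inside the receiver's new window, the receiver cannot tell a retransmission from new data. The helper below is an illustrative check, not part of any protocol.

```python
def sr_is_ambiguous(seq_space: int, window: int) -> bool:
    """Return True if old and new windows share a sequence number.

    Old window: the first `window` sequence numbers (possibly retransmitted).
    New window: the next `window` sequence numbers, modulo the sequence space.
    """
    old_window = {i % seq_space for i in range(window)}
    new_window = {i % seq_space for i in range(window, 2 * window)}
    return bool(old_window & new_window)
```

With a sequence space of 4, a window of 3 is ambiguous (the new window 3, 0, 1 overlaps the old window 0, 1, 2), while a window of 2 is safe.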
Connection-Oriented Transport: TCP
TCP Connection
- Full-duplex service
- Point-to-point (single sender, single receiver)
- Connection-oriented — handshake before data transfer
- MSS (Maximum Segment Size) — maximum application data in a segment (typically 1460 bytes)
TCP Segment Structure
| Field | Size | Description |
|---|---|---|
| Source port | 16 bits | Sending process port |
| Dest port | 16 bits | Receiving process port |
| Sequence number | 32 bits | Byte stream offset |
| ACK number | 32 bits | Next expected byte |
| Header length | 4 bits | In 32-bit words |
| Reserved | 6 bits | Unused |
| Flags (URG, ACK, PSH, RST, SYN, FIN) | 6 bits | Control flags |
| Receive window | 16 bits | Flow control (bytes available) |
| Internet checksum | 16 bits | Error detection |
| Urgent pointer | 16 bits | Offset to urgent data |
| Options | variable | e.g., MSS, timestamp |
| Data | variable | Application payload |
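The fixed 20-byte part of the header in the table can be decoded with `struct`. This is a sketch that follows the layout above (4-bit header length, 6 reserved bits, 6 flag bits); it ignores options, and the function name and returned dictionary keys are illustrative.

```python
import struct

# Flag bit positions within the low 6 bits of the 16-bit offset/flags word.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte TCP header (options are not parsed)."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes")
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_len": (offset_flags >> 12) * 4,  # field counts 32-bit words
        "flags": {name for name, bit in FLAG_BITS.items() if offset_flags & bit},
        "window": window,
        "checksum": checksum,
        "urgent": urgent,
    }
```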
Round-Trip Time Estimation and Timeout
EstimatedRTT = (1 - alpha) * EstimatedRTT + alpha * SampleRTT
(alpha is typically 1/8 = 0.125)
DevRTT = (1 - beta) * DevRTT + beta * |SampleRTT - EstimatedRTT|
(beta is typically 1/4 = 0.25)
TimeoutInterval = EstimatedRTT + 4 * DevRTT
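The three formulas can be combined into a small estimator. Two details here are assumptions taken from RFC 6298 and are not stated in these notes: the first sample initializes EstimatedRTT to the sample and DevRTT to half of it, and DevRTT is updated before EstimatedRTT. The class name is hypothetical.

```python
class RttEstimator:
    """Exponentially weighted RTT estimate and timeout, times in seconds."""

    def __init__(self, first_sample: float, alpha: float = 0.125, beta: float = 0.25):
        self.alpha = alpha
        self.beta = beta
        # Initialization follows RFC 6298 (an assumption, see lead-in).
        self.estimated_rtt = first_sample
        self.dev_rtt = first_sample / 2

    def update(self, sample_rtt: float) -> None:
        # DevRTT uses the EstimatedRTT from before this sample (RFC 6298 order).
        self.dev_rtt = ((1 - self.beta) * self.dev_rtt
                        + self.beta * abs(sample_rtt - self.estimated_rtt))
        self.estimated_rtt = ((1 - self.alpha) * self.estimated_rtt
                              + self.alpha * sample_rtt)

    def timeout_interval(self) -> float:
        return self.estimated_rtt + 4 * self.dev_rtt
```

Starting from a 100 ms sample, a single 200 ms sample moves EstimatedRTT to 112.5 ms and DevRTT to 62.5 ms, giving a timeout of 362.5 ms.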
TCP Reliable Data Transfer
- Retransmission on timeout
- Fast retransmit: if sender receives 3 duplicate ACKs for the same sequence number, retransmit the missing segment before the timer expires
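Counting duplicate ACKs can be sketched as below. The class is a hypothetical helper that only signals when the third duplicate arrives; the retransmission itself and the interaction with the timer are left out.

```python
class DupAckDetector:
    """Signal fast retransmit on the third duplicate ACK for the same number."""

    def __init__(self):
        self.last_ack = None
        self.dup_count = 0

    def on_ack(self, ack_num: int) -> bool:
        """Return True exactly when this ACK is the third duplicate."""
        if ack_num == self.last_ack:
            self.dup_count += 1
            return self.dup_count == 3
        # A new ACK number resets the duplicate counter.
        self.last_ack = ack_num
        self.dup_count = 0
        return False
```

The first ACK carrying a given number is the original; only the three identical ACKs that follow it count as duplicates.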
Flow Control
- Receiver advertises its available buffer space via the Receive Window (rwnd) field
- Sender must ensure:
LastByteSent - LastByteAcked <= rwnd
TCP Flow Control prevents the sender from overwhelming the receiver’s buffer. This is distinct from congestion control (which prevents overwhelming the network).
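The sender-side constraint translates into a one-line check of how many more bytes may be sent. The function name is illustrative and the values in the usage note are made-up examples.

```python
def usable_window(last_byte_sent: int, last_byte_acked: int, rwnd: int) -> int:
    """Bytes the sender may still transmit without exceeding rwnd."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, rwnd - in_flight)
```

With 2000 bytes in flight and rwnd = 4096, the sender may send 2096 more bytes; if the receiver shrinks rwnd to 1000, the sender must wait.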
TCP Connection Management
Three-way handshake:
- Client sends SYN segment (SYN=1, seq=client_isn)
- Server sends SYNACK segment (SYN=1, ACK=1, seq=server_isn, ack=client_isn+1)
- Client sends ACK segment (ACK=1, seq=client_isn+1, ack=server_isn+1); may include data
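The sequence and acknowledgment numbers exchanged in the handshake can be laid out as plain dictionaries. This is only a model of the header fields, with hypothetical key names (`seq` for the sequence number, `ack` for the acknowledgment number); it opens no real connection.

```python
def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three handshake segments as dictionaries of header fields."""
    syn = {"SYN": 1, "seq": client_isn}
    # The server acknowledges the client's ISN + 1 (the SYN consumes one number).
    synack = {"SYN": 1, "ACK": 1, "seq": server_isn, "ack": syn["seq"] + 1}
    # The client acknowledges the server's ISN + 1 in the same way.
    ack = {"ACK": 1, "seq": client_isn + 1, "ack": synack["seq"] + 1}
    return [syn, synack, ack]
```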
Closing connection:
- Client sends FIN (FIN=1)
- Server ACKs the FIN
- Server sends its own FIN
- Client ACKs the server’s FIN (enters TIME_WAIT)
Principles of Congestion Control
Causes and costs of congestion:
- When packet arrival rate exceeds link capacity, queues build up
- Cost 1: queuing delay increases
- Cost 2: retransmissions due to dropped packets waste bandwidth
- Cost 3: premature retransmissions (unnecessary duplicates) waste bandwidth
- Cost 4: when a packet is dropped, the work done to transport it is wasted
Approaches to congestion control:
- End-to-end: no explicit network feedback; TCP deduces congestion from packet loss (implicit)
- Network-assisted: routers provide explicit feedback (e.g., ECN, ATM ABR)
TCP Congestion Control
AIMD (Additive Increase Multiplicative Decrease):
- Slow start: cwnd starts at 1 MSS, doubles every RTT (exponential growth), until ssthresh is hit
- Congestion avoidance: cwnd increases by 1 MSS per RTT (linear growth)
- On triple duplicate ACK: ssthresh = cwnd/2, then cwnd = ssthresh (multiplicative decrease; TCP Reno also adds 3 MSS while in fast recovery)
- On timeout: cwnd = 1 MSS, ssthresh = half of cwnd before timeout, re-enter slow start
TCP congestion control follows AIMD — sawtooth pattern: linear increase until loss, then halving the congestion window.
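The window rules can be sketched as a per-RTT state update. This is a coarse model with several assumptions: cwnd and ssthresh are measured in MSS, one call represents one full RTT or one loss event, the 3-MSS inflation of fast recovery is omitted, and ssthresh is floored at 2 MSS as in RFC 5681. The function name and event labels are hypothetical.

```python
def tcp_reno_step(cwnd: float, ssthresh: float, event: str):
    """Return (cwnd, ssthresh) after one event: 'rtt', '3dupack', or 'timeout'."""
    if event == "timeout":
        # Restart slow start from 1 MSS; remember half the old window.
        return 1.0, max(cwnd / 2, 2.0)
    if event == "3dupack":
        # Multiplicative decrease: halve and continue in congestion avoidance.
        ssthresh = max(cwnd / 2, 2.0)
        return ssthresh, ssthresh
    if event != "rtt":
        raise ValueError(f"unknown event: {event}")
    if cwnd < ssthresh:
        return min(cwnd * 2, ssthresh), ssthresh  # slow start: double per RTT
    return cwnd + 1, ssthresh                     # congestion avoidance: +1 MSS


cwnd, ssthresh = 1.0, 8.0
trace = []
for ev in ["rtt"] * 5 + ["3dupack", "rtt", "timeout"]:
    cwnd, ssthresh = tcp_reno_step(cwnd, ssthresh, ev)
    trace.append(cwnd)
print(trace)
```

The trace shows the exponential phase up to ssthresh, the linear phase after it, the halving on a triple duplicate ACK, and the reset to 1 MSS on timeout.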
TCP Fairness:
- AIMD converges to fair sharing of bottleneck bandwidth among competing TCP flows
- If K flows share a bottleneck of rate R, each gets roughly R/K
- Problem: UDP and other non-TCP flows can starve TCP flows
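The convergence claim can be illustrated with a two-flow simulation. It is an idealized sketch: both flows have the same RTT, both detect loss in the same round whenever their combined rate exceeds the bottleneck, and rates are in arbitrary units. The function name and starting values are assumptions.

```python
def aimd_two_flows(x1: float, x2: float, capacity: float, rounds: int):
    """Run synchronized AIMD for two flows sharing one bottleneck."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2  # both flows see loss: multiplicative decrease
        else:
            x1, x2 = x1 + 1, x2 + 1  # no loss: additive increase
    return x1, x2


a, b = aimd_two_flows(1.0, 50.0, capacity=60.0, rounds=500)
print(f"flow 1: {a:.3f}, flow 2: {b:.3f}")
```

Additive increase keeps the gap between the flows constant and each halving cuts it in half, so the two rates approach each other even from a very unequal start.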
ECN (Explicit Congestion Notification):
- Network-assisted congestion indication
- Router marks packets (CE = Congestion Experienced) when queue is near-full
- Receiver echoes the mark to the sender by setting the ECE (ECN-Echo) flag in its ACKs
- Sender reacts as if a packet was dropped (reduces window)
Key Formulas
| Concept | Formula | Notes |
|---|---|---|
| RTT estimation | EstimatedRTT = (1 - alpha) * EstimatedRTT + alpha * SampleRTT | alpha = 1/8 |
| RTT deviation | DevRTT = (1 - beta) * DevRTT + beta * \|SampleRTT - EstimatedRTT\| | beta = 1/4 |
| Timeout | TimeoutInterval = EstimatedRTT + 4 * DevRTT | Initial value: 1 second |
| Stop-and-wait utilization | U = (L/R) / (RTT + L/R) | L = packet size, R = link rate |
| TCP average throughput | Avg throughput = (3/4) * W_max * MSS / RTT | W_max = max cwnd in segments |
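As a worked example of the throughput formula, the sketch below assumes W_max = 20 segments, MSS = 1460 bytes, and RTT = 100 ms. These values are chosen only for illustration.

```python
def tcp_avg_throughput_bps(w_max_segments: float, mss_bytes: int, rtt_s: float) -> float:
    """Average throughput in bits per second: (3/4) * W_max * MSS / RTT."""
    return 0.75 * w_max_segments * mss_bytes * 8 / rtt_s


rate = tcp_avg_throughput_bps(20, 1460, 0.100)
print(f"{rate / 1e6:.3f} Mbps")
```

The factor 3/4 reflects the sawtooth: the window oscillates between W_max/2 and W_max, so its average is three quarters of the peak.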
References
- Computer Networking: A Top-Down Approach, 7th Edition — Kurose & Ross, Pearson, 2017
- RFC 793 — Transmission Control Protocol
- RFC 768 — User Datagram Protocol
- RFC 5681 — TCP Congestion Control
- RFC 2581 — TCP Congestion Control (obsolete)