Transport Layer

The transport layer is a frequent root cause of backend latency and reliability problems.

Why It Matters

Connection setup and teardown directly affect tail latency.
Poor timeout/retry strategy amplifies outages.
Buffer and window tuning controls throughput on high-latency links.

TCP vs UDP

Topic	TCP	UDP
Reliability	Ordered and retransmitted	Best-effort
Connection	Stateful	Connectionless
Typical Usage	HTTP, databases, RPC	DNS, streaming, QUIC transport
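The "Connectionless" row can be illustrated with a small loopback sketch. It assumes nothing beyond the Python standard library; the function name `udp_send_once` is made up for this example. The sender transmits a datagram with no handshake, and neither socket holds per-peer connection state.

```python
import socket

def udp_send_once(payload: bytes) -> bytes:
    """Send one datagram over loopback and return what the receiver got."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)         # UDP is best-effort, so never block forever
        # No connect()/accept(): the first packet on the wire already carries data.
        sender.sendto(payload, receiver.getsockname())
        data, _addr = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()
```

With TCP the same exchange would need listen(), connect(), and accept() before the first byte of payload could be sent.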

TCP Three-Way Handshake

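The handshake (SYN, SYN-ACK, ACK) costs one round trip before the client can send any data, which adds startup latency to every new connection. Connection reuse is essential for high-QPS systems.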
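One rough way to see the handshake cost is to time the connect call on its own. The sketch below assumes a reachable host and port supplied by the caller; `connect_time_seconds` is a hypothetical helper name, not part of any library.

```python
import socket
import time

def connect_time_seconds(host: str, port: int, timeout: float = 2.0) -> float:
    """Return how long the TCP connect (SYN, SYN-ACK, ACK) took, in seconds."""
    start = time.perf_counter()
    # create_connection() returns once the handshake has completed on the client side.
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed
```

Paying this cost once per pooled connection instead of once per request is the saving that connection reuse provides.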

Flow Control and Congestion Control

Flow control protects receiver buffers.
Congestion control protects the network path.

Monitor with:

ss -ti
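The receive buffer bounds the window a receiver can advertise, so it is worth knowing what a process actually gets by default. A minimal sketch using standard socket options (the helper name `socket_buffer_sizes` is an assumption for illustration):

```python
import socket

def socket_buffer_sizes() -> dict:
    """Return the default send/receive buffer sizes the kernel gives a new TCP socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        return {
            "rcvbuf": s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF),
            "sndbuf": s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF),
        }
```

These are the starting values only; kernels that autotune buffers may grow them during a transfer.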

Connection Lifecycle

High TIME_WAIT counts are normal in short-connection workloads, but can still exhaust ephemeral ports.
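To judge whether TIME_WAIT is approaching the ephemeral port limit, it helps to count sockets per state. The sketch below parses text in the layout printed by `ss -tan`; the function name and the sample rows are hand-written for illustration, not captured from a real host.

```python
from collections import Counter

def count_states(ss_output: str) -> Counter:
    """Count sockets per state from the text printed by `ss -tan`."""
    counts = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        if not fields or fields[0] == "State":  # skip blank lines and the header row
            continue
        counts[fields[0]] += 1
    return counts

# Hand-written sample in the `ss -tan` column layout (illustrative addresses).
SAMPLE = """State      Recv-Q Send-Q Local Address:Port  Peer Address:Port
ESTAB      0      0      10.0.0.5:44321      10.0.0.9:443
TIME-WAIT  0      0      10.0.0.5:44322      10.0.0.9:443
TIME-WAIT  0      0      10.0.0.5:44323      10.0.0.9:443
"""
```

On Linux, the count can be compared against the ephemeral range in /proc/sys/net/ipv4/ip_local_port_range.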

Practical Tuning Areas

Keep-alive and connection pool limits.
Connect/read/write timeout budget.
Retry policy with idempotency and backoff.
Kernel socket settings only after measurement.
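The timeout budget and retry items above can be sketched together. This is one possible shape, not a prescribed policy: the helper name `call_with_retries`, the default values, and the choice of full jitter are all assumptions made for illustration.

```python
import random
import time

def call_with_retries(op, *, attempts=3, base_delay=0.1, max_delay=2.0,
                      budget=5.0, retry_on=(ConnectionError, TimeoutError),
                      sleep=time.sleep):
    """Run op() with exponential backoff and full jitter, within a total time budget.

    Only use this for idempotent operations: a retried call may execute twice.
    """
    deadline = time.monotonic() + budget
    for attempt in range(attempts):
        try:
            return op()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Full jitter: sleep a random time up to the capped exponential delay.
            delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
            if time.monotonic() + delay > deadline:
                raise  # sleeping would exceed the overall budget
            sleep(delay)
```

Capping both the attempt count and the total budget keeps retries from multiplying load on a dependency that is already failing.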

Debugging Playbook

# Socket states and queue sizes (repeat "state" to match more than one)
ss -tan state established state time-wait

# Packet-level view (options before the filter expression)
tcpdump -i any -nn tcp port 443

Common Incidents

Connection timeout

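Check, in order: the route to the host, firewall rules, and whether the service is listening on the expected port.
Check for a timeout mismatch between caller and callee, such as a caller timeout shorter than the callee's normal processing time.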
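A quick probe can separate the usual causes before reaching for packet captures. The sketch below uses a hypothetical helper, `probe_tcp`, and its result labels are rough hints rather than a diagnosis.

```python
import socket

def probe_tcp(host: str, port: int, timeout: float = 2.0) -> str:
    """Classify a connect attempt as 'open', 'refused', 'timeout', or 'error'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # A RST came back: usually nothing is listening, or a firewall rejects.
        return "refused"
    except (socket.timeout, TimeoutError):
        # No answer at all: packets are likely dropped by a route or firewall.
        return "timeout"
    except OSError:
        # Anything else, e.g. no route to host or a name resolution failure.
        return "error"
```

A "refused" result points at the listening port, while "timeout" points at the route or firewall steps.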

Connection reset

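Capture the RST packets to see which side sends them, and check the upstream idle timeout.
Verify that the keep-alive or heartbeat interval is shorter than the idle timeout of every proxy in the path.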
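If idle connections are being dropped by a middlebox, enabling TCP keep-alive on the client socket is one option. A minimal sketch: the helper name `enable_keepalive` and the default timings are illustrative assumptions, and the `TCP_KEEP*` constants are platform-dependent, so the code guards for their presence.

```python
import socket

def enable_keepalive(sock: socket.socket, idle: int = 60,
                     interval: int = 10, probes: int = 3) -> None:
    """Turn on TCP keep-alive; tune probe timing where the platform exposes it."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These constants exist on Linux but not on every platform, so look them up safely.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", probes)):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
```

The idle value should sit below the shortest idle timeout on the path; 60 seconds here is only a placeholder.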

Throughput collapse on long RTT

Validate window scaling and receive buffers.
Compare congestion algorithm behavior by workload.
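The arithmetic behind this incident is the bandwidth-delay product (BDP): a single connection cannot move more than one window of data per round trip. A small sketch with assumed function names:

```python
def bdp_bytes(bandwidth_bits_per_s: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the path."""
    return bandwidth_bits_per_s * rtt_s / 8

def max_throughput_bits_per_s(window_bytes: float, rtt_s: float) -> float:
    """Upper bound on one connection's throughput for a given window and RTT."""
    return window_bytes * 8 / rtt_s
```

For example, a 1 Gbit/s path with a 100 ms RTT needs about 12.5 MB in flight, while a 65,535-byte window (the maximum without window scaling) caps a connection at roughly 5.2 Mbit/s on that path.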
