本文主要内容是TCP协议的规格描述,包括具体的协议格式、相关术语解释、协议实现指导等。

协议规格

本节介绍TCP协议的具体格式,详细描述了TCP协议数据包的构成。

tcp header format

Source Port: 16 bits
The source port number.

Destination Port: 16 bits
The destination port number.

Sequence Number: 32 bits
The sequence number of the first data octet in this segment (except when SYN is present). If SYN is present the sequence number is the initial sequence number (ISN) and the first data octet is ISN+1.

Acknowledgment Number: 32 bits
If the ACK control bit is set this field contains the value of the next sequence number the sender of the segment is expecting to receive. Once a connection is established this is always sent.

Data Offset: 4 bits
The number of 32 bit words in the TCP Header. This indicates where the data begins. The TCP header (even one including options) is an integral number of 32 bits long.

Reserved: 6 bits
Reserved for future use. Must be zero.

Control Bits: 6 bits (from left to right):
URG: Urgent Pointer field significant
ACK: Acknowledgment field significant
PSH: Push Function
RST: Reset the connection
SYN: Synchronize sequence numbers
FIN: No more data from sender

Window: 16 bits
The number of data octets beginning with the one indicated in the acknowledgment field which the sender of this segment is willing to accept.

Checksum: 16 bits
The checksum field is the 16 bit one’s complement of the one’s complement sum of all 16 bit words in the header and text. If a segment contains an odd number of header and text octets to be checksummed, the last octet is padded on the right with zeros to form a 16 bit word for checksum purposes. The pad is not transmitted as part of the segment. While computing the checksum, the checksum field itself is replaced with zeros.

The checksum also covers a 96 bit pseudo header conceptually

Urgent Pointer: 16 bits
This field communicates the current value of the urgent pointer as a positive offset from the sequence number in this segment. The urgent pointer points to the sequence number of the octet following the urgent data. This field is only be interpreted in segments with the URG control bit set.

Options: variable
Options may occupy space at the end of the TCP header and are a multiple of 8 bits in length. All options are included in the checksum.

Padding: variable
The TCP header padding is used to ensure that the TCP header ends and data begins on a 32 bit boundary. The padding is composed of zeros.

相关术语

The maintenance of a TCP connection requires the remembering of several variables. We conceive of these variables being stored in a connection record called a Transmission Control Block or TCB. Among the variables stored in the TCB are the local and remote socket numbers, the security and precedence of the connection, pointers to the user’s send and receive buffers, pointers to the retransmit queue and to the current segment. In addition several variables relating to the send and receive sequence numbers are stored in the TCB.

Send Sequence Variables
SND.UNA - send unacknowledged
SND.NXT - send next
SND.WND - send window
SND.UP - send urgent pointer
SND.WL1 - segment sequence number used for last window update
SND.WL2 - segment acknowledgment number used for last window update
ISS - initial send sequence number

Receive Sequence Variables
RCV.NXT - receive next
RCV.WND - receive window
RCV.UP - receive urgent pointer
IRS - initial receive sequence number

Current Segment Variables
SEG.SEQ - segment sequence number
SEG.ACK - segment acknowledgment number
SEG.LEN - segment length
SEG.WND - segment window
SEG.UP - segment urgent pointer
SEG.PRC - segment precedence value

A connection progresses through a series of states during its lifetime.
LISTEN - represents waiting for a connection request from any remote TCP and port.
SYN-SENT - represents waiting for a matching connection request after having sent a connection request.
SYN-RECEIVED - represents waiting for a confirming connection request acknowledgment after having both received and sent a connection request.
ESTABLISHED - represents an open connection, data received can be delivered to the user. The normal state for the data transfer phase of the connection.
FIN-WAIT-1 - represents waiting for a connection termination request from the remote TCP, or an acknowledgment of the connection termination request previously sent.
FIN-WAIT-2 - represents waiting for a connection termination request from the remote TCP.
CLOSE-WAIT - represents waiting for a connection termination request from the local user.
CLOSING - represents waiting for a connection termination request acknowledgment from the remote TCP.
LAST-ACK - represents waiting for an acknowledgment of the connection termination request previously sent to the remote TCP (which includes an acknowledgment of its connection termination request).
TIME-WAIT - represents waiting for enough time to pass to be sure the remote TCP received the acknowledgment of its connection termination request.
CLOSED - represents no connection state at all.

序列号

在启动连接时,实际上是对序列号进行同步。

应答机制具有累加性,这样能够体现指定序列号的数据收到了多少。
The acknowledgment mechanism employed is cumulative so that an acknowledgment of sequence number X indicates that all octets up to but not including X have been received.

The typical kinds of sequence number comparisons which the TCP must perform include:
(a) Determining that an acknowledgment refers to some sequence number sent but not yet acknowledged.
(b) Determining that all sequence numbers occupied by a segment have been acknowledged (e.g., to remove the segment from a retransmission queue).
(c) Determining that an incoming segment contains sequence numbers which are expected (i.e., that the segment "overlaps" the receive window).

关于ISN(initial sequence number)选择的问题
how does the TCP identify duplicate segments from previous incarnations of the connection? This problem becomes apparent if the connection is being opened and closed in quick succession, or if the connection breaks with loss of memory and is then reestablished.
To avoid confusion we must prevent segments from one incarnation of a connection from being used while the same sequence numbers may still be present in the network from an earlier incarnation. We want to assure this, even if a TCP crashes and loses all knowledge of the sequence numbers it has been using. When new connections are created, an initial sequence number (ISN) generator is employed which selects a new 32 bit ISN. The generator is bound to a (possibly fictitious) 32 bit clock whose low order bit is incremented roughly every 4 microseconds. Thus, the ISN cycles approximately every 4.55 hours. Since we assume that segments will stay in the network no more than the Maximum Segment Lifetime (MSL) and that the MSL is less than 4.55 hours we can reasonably assume that ISN’s will be unique.

TCP安静时间(quite time)的概念
every segment emitted occupies one or more sequence numbers in the sequence space, the numbers occupied by a segment are "busy" or "in use" until MSL seconds have passed, upon crashing a block of space-time is occupied by the octets of the last emitted segment, if a new connection is started too soon and uses any of the sequence numbers in the space-time footprint of the last segment of the previous connection incarnation, there is a potential sequence number overlap area which could cause confusion at the receiver.

连接的建立

在RFC793的原文中,给出了许多种连接建立的方式,包括一些异常情况下建立连接的流程,不过最常用的还是三次握手方式。

TCP三次握手建立连接方式的过程示意图。

   TCP A                                                 TCP B
1. CLOSED                                                LISTEN
2. SYN-SENT    --> <SEQ=100><CTL=SYN>                --> SYN-RECEIVED
3. ESTABLISHED <-- <SEQ=300><ACK=101><CTL=SYN,ACK>   <-- SYN-RECEIVED
4. ESTABLISHED --> <SEQ=101><ACK=301><CTL=ACK>       --> ESTABLISHED
5. ESTABLISHED --> <SEQ=101><ACK=301><CTL=ACK><DATA> --> ESTABLISHED

关于RST(reset)
As a general rule, reset (RST) must be sent whenever a segment arrives which apparently is not intended for the current connection. A reset must not be sent if it is not clear that this is the case.

连接中的状态以及各种异常情况
There are three groups of states:

  1. If the connection does not exist (CLOSED) then a reset is sent in response to any incoming segment except another reset.
  2. If the connection is in any non-synchronized state (LISTEN, SYN-SENT, SYN-RECEIVED), and the incoming segment acknowledges something not yet sent (the segment carries an unacceptable ACK), or if an incoming segment has a security level or compartment which does not exactly match the level and compartment requested for the connection, a reset is sent.
  3. If the connection is in a synchronized state (ESTABLISHED, FIN-WAIT-1, FIN-WAIT-2, CLOSE-WAIT, CLOSING, LAST-ACK, TIME-WAIT), any unacceptable segment (out of window sequence number or unacceptible acknowledgment number) must elicit only an empty acknowledgment segment containing the current send-sequence number and an acknowledgment indicating the next sequence number expected to be received, and the connection remains in the same state.

连接的断开

关闭的含义是没有更多的数据需要发送。
CLOSE is an operation meaning "I have no more data to send."

由于连接是双工的,所以在关闭时需要考虑的情况更复杂。
The notion of closing a full-duplex connection is subject to ambiguous interpretation, of course, since it may not be obvious how to treat the receiving side of the connection.

There are essentially three cases:

  1. The user initiates by telling the TCP to CLOSE the connection
  2. The remote TCP initiates by sending a FIN control signal
  3. Both users CLOSE simultaneously

一般情况下关闭(四次挥手)

   TCP A                                                TCP B
1. ESTABLISHED                                          ESTABLISHED
2. (Close)
   FIN-WAIT-1 --> <SEQ=100><ACK=300><CTL=FIN,ACK>   --> CLOSE-WAIT
3. FIN-WAIT-2 <-- <SEQ=300><ACK=101><CTL=ACK>       <-- CLOSE-WAIT
4.                                                      (Close)
   TIME-WAIT  <-- <SEQ=300><ACK=101><CTL=FIN,ACK>   <-- LAST-ACK
5. TIME-WAIT  --> <SEQ=101><ACK=301><CTL=ACK>       --> CLOSED
6. (2 MSL)
   CLOSED

两端同时关闭的情况

   TCP A                                               TCP B
1. ESTABLISHED                                         ESTABLISHED
2. (Close)                                             (Close)
   FIN-WAIT-1 --> <SEQ=100><ACK=300><CTL=FIN,ACK> ...  FIN-WAIT-1
              <-- <SEQ=300><ACK=100><CTL=FIN,ACK> <--
              ... <SEQ=100><ACK=300><CTL=FIN,ACK> -->
3. CLOSING    --> <SEQ=101><ACK=301><CTL=ACK>     ...  CLOSING
              <-- <SEQ=301><ACK=101><CTL=ACK>     <--
              ... <SEQ=101><ACK=301><CTL=ACK>     -->
4. TIME-WAIT                                           TIME-WAIT
   (2 MSL)                                             (2 MSL)
   CLOSED                                              CLOSED

优先权和安全性

安全性要求必须完全相同,优先权取最高的。
The intent is that connection be allowed only between ports operating with exactly the same security and compartment values and at the higher of the precedence level requested by the two ports.

优先权和安全性使用的是在IP协议中定义的字段。
The precedence and security parameters used in TCP are exactly those defined in the Internet Protocol (IP).

A connection attempt with mismatched security/compartment values or a lower precedence value must be rejected by sending a reset. Rejecting a connection due to too low a precedence only occurs after an acknowledgment of the SYN has been received.

数据传输

Once the connection is established data is communicated by the exchange of segments. Because segments may be lost due to errors (checksum test failure), or network congestion, TCP uses retransmission (after a timeout) to ensure delivery of every segment. Duplicate segments may arrive due to network or TCP retransmission.

重发的时间间隔是动态变化的
Because of the variability of the networks that compose an internetwork system and the wide range of uses of TCP connections the retransmission timeout must be dynamically determined.

紧急数据的发送
The objective of the TCP urgent mechanism is to allow the sending user to stimulate the receiving user to accept some urgent data and to permit the receiving TCP to indicate to the receiving user when all the currently known urgent data has been received by the user.

发送窗口的管理
The window sent in each segment indicates the range of sequence numbers the sender of the window (the data receiver) is currently prepared to accept.

窗口过大或窗口过小的效果
Indicating a large window encourages transmissions. If more data arrives than can be accepted, it will be discarded. This will result in excessive retransmissions, adding unnecessarily to the load on the network and the TCPs. Indicating a small window may restrict the transmission of data to the point of introducing a round trip delay between each new segment transmitted.

接口

讨论到接口,就有两种TCP接口可谈,它们分别是向上的和向下的两种。向上的接口负责与用户进行对接的接口,向下的接口是与更底层进行交互的接口。
There are of course two interfaces of concern: the user/TCP interface and the TCP/lower-level interface. We have a fairly elaborate model of the user/TCP interface, but the interface to the lower level protocol module is left unspecified here, since it will be specified in detail by the specification of the lowel level protocol.

协议中给出了很详细的关于接口实现的指导性描述,在本文中就暂时略过了,日后有需要的再去查。

事件的处理

本节主要给出了一种协议的实现方式。
The processing depicted in this section is an example of one possible implementation. Other implementations may have slightly different processing sequences, but they should differ from those in this section only in detail, not in substance.

TCP的活动可以描述为对事件的响应。
The activity of the TCP can be characterized as responding to events. The events that occur can be cast into three categories: user calls, arriving segments, and timeouts. This section describes the processing the TCP does in response to each of the events. In many cases the processing required depends on the state of the connection.

后续的内容主要是各个事件的具体响应,在这里不做记录了。

参考资料

(全文完)