How Networks Are Connected, Chapter 2: The life cycle of a TCP connection

The book is built around a simple scenario: the user enters a URL into the browser and the browser displays the response. That is the life cycle of one network request.

The book is divided into six parts:

  1. The application-layer client generates an HTTP message and delegates sending it to the protocol stack of the operating system
  2. The protocol stack (TCP/IP module) calls the network card driver, and the network card turns the packet into an electrical signal
  3. How the packet travels from the network card to the router used to access the Internet
  4. Relay transmission within the Internet
  5. After arriving at the web server side, the packet first passes the firewall check
  6. How the web server receives the data

The second chapter mainly introduces how the protocol stack and network card in the operating system send application messages to the server:

  1. Create socket
  2. Connect server
  3. Send and receive data
  4. Disconnect from the server and delete the socket
  5. Packet sending and receiving operation of IP and Ethernet
  6. Operation of sending and receiving data with UDP

This article covers items 1 to 4, that is, the whole life cycle of the TCP module.

The main highlights are as follows:

  1. The internal structure of the protocol stack
  2. What is a socket in concrete terms? Is there a tool to observe it directly?
  3. What happens during "connect"?
  4. The specific workflow of sending and receiving data
  5. What happens during "disconnect"?

0. General

Before starting the exploration, I sorted out several concepts:

  1. Internal structure of the protocol stack
  2. The socket entity
  3. The TCP life cycle

0.1 Internal structure of the protocol stack

The protocol stack is layered from top to bottom, and each part delegates work to the one below it:

  1. The TCP and UDP modules, which send and receive data on behalf of the application.
  2. The IP module, which controls the sending and receiving of network packets. The IP module also includes the ICMP and ARP protocols.
  3. The network card driver, which controls the network card hardware. The network card sends and receives the electrical or optical signals on the network cable.

Browsers, mail clients and other general applications usually use TCP.
DNS queries and other short control messages usually use UDP.

0.2 The socket entity

First, use netstat to get an intuitive feel for it.

Socket: the memory space in the protocol stack that stores control information.
Control information: protocol type, IP address, port number, and status.

The socket records information about both sides of the communication and the state of the communication.
The protocol stack works according to the control information in the socket.

0.3 TCP life cycle

Original text: https://www.rfc-editor.org/rfc/rfc793.html#section-3.2

 A connection goes through a series of states in its life cycle. 
LISTEN, SYN-SENT, SYN-RECEIVED, ESTABLISHED, 
FIN-WAIT-1, FIN-WAIT-2, CLOSE-WAIT, CLOSING, LAST-ACK, TIME-WAIT, 
CLOSED.  

CLOSED is a fictional state: it represents the situation where no TCB, and therefore no connection, exists.
TCB: Transmission Control Block, the control block for one connection, that is, the communication information stored in the socket.

The state machine below explains the three-way handshake and the four-way close (the "four waves").

TCP state machine:

                              +---------+ ---------\      active OPEN
                              |  CLOSED |            \    -----------
                              +---------+<---------\   \   create TCB
                                |     ^              \   \  snd SYN
                   passive OPEN |     |   CLOSE        \   \
                   ------------ |     | ----------       \   \
                    create TCB  |     | delete TCB         \   \
                                V     |                      \   \
                              +---------+            CLOSE    |    \
                              |  LISTEN |          ---------- |     |
                              +---------+          delete TCB |     |
                   rcv SYN      |     |     SEND              |     |
                  -----------   |     |    -------            |     V
 +---------+      snd SYN,ACK  /       \   snd SYN          +---------+
 |         |<-----------------           ------------------>|         |
 |   SYN   |                    rcv SYN                     |   SYN   |
 |   RCVD  |<-----------------------------------------------|   SENT  |
 |         |                    snd ACK                     |         |
 |         |------------------           -------------------|         |
 +---------+   rcv ACK of SYN  \       /  rcv SYN,ACK       +---------+
   |           --------------   |     |   -----------
   |                  x         |     |     snd ACK
   |                            V     V
   |  CLOSE                   +---------+
   | -------                  |  ESTAB  |
   | snd FIN                  +---------+
   |                   CLOSE    |     |    rcv FIN
   V                  -------   |     |    -------
 +---------+          snd FIN  /       \   snd ACK          +---------+
 |  FIN    |<-----------------           ------------------>|  CLOSE  |
 | WAIT-1  |------------------                              |   WAIT  |
 +---------+          rcv FIN  \                            +---------+
   | rcv ACK of FIN   -------   |                            CLOSE  |
   | --------------   snd ACK   |                           ------- |
   V        x                   V                           snd FIN V
 +---------+                  +---------+                   +---------+
 |FINWAIT-2|                  | CLOSING |                   | LAST-ACK|
 +---------+                  +---------+                   +---------+
   |                rcv ACK of FIN |                 rcv ACK of FIN |
   |  rcv FIN       -------------- |    Timeout=2MSL -------------- |
   |  -------              x       V    ------------        x       V
    \ snd ACK                 +---------+delete TCB         +---------+
     ------------------------>|TIME WAIT|------------------>| CLOSED  |
                              +---------+                   +---------+

With the TCP life cycle in mind, the following sections analyze how a socket is created, connected, used to send and receive data, and disconnected.
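
The path a client takes through the diagram above can be sketched as a small transition function. This is only an illustration: the names tcp_state, tcp_event and tcp_next_state are hypothetical and do not come from any real protocol stack, and only the transitions of a side that opens actively and closes first are modelled.

```c
/* Hypothetical names for illustration; the states follow RFC 793 section 3.2. */
typedef enum {
    ST_CLOSED, ST_SYN_SENT, ST_ESTABLISHED,
    ST_FIN_WAIT_1, ST_FIN_WAIT_2, ST_TIME_WAIT
} tcp_state;

typedef enum {
    EV_ACTIVE_OPEN,    /* application calls connect(): create TCB, send SYN */
    EV_RCV_SYN_ACK,    /* SYN,ACK arrives: send ACK                         */
    EV_CLOSE,          /* application closes: send FIN                      */
    EV_RCV_ACK_OF_FIN, /* the peer acknowledges our FIN                     */
    EV_RCV_FIN,        /* the peer's FIN arrives: send ACK                  */
    EV_TIMEOUT_2MSL    /* the 2MSL timer expires: delete TCB                */
} tcp_event;

/* Return the state that follows state s when event e occurs. */
tcp_state tcp_next_state(tcp_state s, tcp_event e)
{
    switch (s) {
    case ST_CLOSED:
        if (e == EV_ACTIVE_OPEN) return ST_SYN_SENT;
        break;
    case ST_SYN_SENT:
        if (e == EV_RCV_SYN_ACK) return ST_ESTABLISHED;
        break;
    case ST_ESTABLISHED:
        if (e == EV_CLOSE) return ST_FIN_WAIT_1;
        break;
    case ST_FIN_WAIT_1:
        if (e == EV_RCV_ACK_OF_FIN) return ST_FIN_WAIT_2;
        break;
    case ST_FIN_WAIT_2:
        if (e == EV_RCV_FIN) return ST_TIME_WAIT;
        break;
    case ST_TIME_WAIT:
        if (e == EV_TIMEOUT_2MSL) return ST_CLOSED;
        break;
    }
    return s; /* transitions not modelled here leave the state unchanged */
}
```

Feeding the events in the order listed in the enum walks CLOSED, SYN-SENT, ESTABLISHED, FIN-WAIT-1, FIN-WAIT-2, TIME-WAIT and back to CLOSED.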

1. Create socket

int socket(int af, int type, int protocol);

af: address family, i.e. the IP address type; commonly AF_INET or AF_INET6
  AF_INET means IPv4, for example 127.0.0.1
  AF_INET6 means IPv6, for example 1030::C9B4:FF12:48AA:1A2B
type: data transmission mode (socket type); commonly SOCK_STREAM or SOCK_DGRAM
  SOCK_STREAM: stream socket (connection-oriented)
  SOCK_DGRAM: datagram socket (connectionless)
protocol: transport protocol; commonly IPPROTO_TCP or IPPROTO_UDP
  IPPROTO_TCP: TCP transport protocol
  IPPROTO_UDP: UDP transport protocol
Return value: the socket descriptor

For details, see "Detailed explanation of socket() function", which covers both the Linux and the Windows versions.

Workflow for creating sockets

  1. Request a block of memory from the memory manager, much like calling malloc()
  2. Write the initial control information (protocol type, IP address, port) into it
  3. Return the descriptor to the application. The descriptor uniquely identifies the socket (its control information), and the application passes it to the protocol stack in every subsequent call.
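
From the application's side, all of this is one call. The following is a minimal sketch assuming a POSIX system such as Linux; the wrapper name create_tcp_socket is hypothetical.

```c
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create an IPv4 TCP socket. Returns the descriptor, or -1 on failure.
 * create_tcp_socket is a made-up helper name for this sketch. */
int create_tcp_socket(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (fd < 0) {
        perror("socket"); /* report why the protocol stack refused */
        return -1;
    }
    return fd; /* the application uses this descriptor in later calls */
}
```

The descriptor should eventually be released with close(fd) so the protocol stack can free the socket.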

2. Connect to the server

What is a connection?

int connect(int sock, struct sockaddr *serv_addr, socklen_t addrlen);
sock		socket file descriptor
serv_addr	address structure holding the address family, IP address and port of the server
addrlen		size of serv_addr

"Connection" here does not refer to the network cable: the cable is always plugged in and can carry signals at any time. It refers to the protocol stack.
In the connection stage the socket has just been created, and the protocol stack does not yet know who the communication peer is. In this stage the client conveys its request to start communicating to the server, and the two sides exchange control information.
It may therefore be more accurate to call this the preparation stage.

What happens during connect:

  1. The application passes the server's IP address and port to the protocol stack
  2. The protocol stack sends a request to start communication
  3. The two sides exchange control information
  4. Each side allocates a buffer for sending and receiving data
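
Step 1 is the only part the application sees; the rest happens inside the protocol stack during connect(). A minimal sketch, assuming a POSIX system and an IPv4 address given as a dotted string; the helper name connect_to is hypothetical.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Connect to ip:port over IPv4 TCP.
 * Returns a connected descriptor, or -1 on any failure.
 * connect_to is a made-up helper name for this sketch. */
int connect_to(const char *ip, uint16_t port)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);            /* port in network byte order */
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;                          /* not a valid IPv4 address string */

    int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (fd < 0)
        return -1;

    /* The protocol stack performs the handshake before connect() returns. */
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```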

Headers that carry control information

Two kinds of control information are used in communication:

  1. The control information the client and server exchange with each other, that is, the headers of the various protocols: TCP, IP and MAC. Headers are used to record and exchange control information.
  2. The information stored in the socket, which controls the operation of the protocol stack. It includes information passed in by the application and information received from the communication peer.

Different protocol stacks implement the control information in the socket differently. That is fine as long as the protocol headers generated during communication follow the specification.

TCP header information:

Original text: https://www.rfc-editor.org/rfc/rfc793.html#section-3.1

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Source Port          |       Destination Port        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Acknowledgment Number                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Data |           |U|A|P|R|S|F|                               |
   | Offset| Reserved  |R|C|S|S|Y|I|            Window             |
   |       |           |G|K|H|T|N|N|                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Checksum            |         Urgent Pointer        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Options                    |    Padding    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             data                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
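
To make the layout above concrete, here is a sketch that reads the fixed 20-byte part of a TCP header from a raw byte buffer. The struct and function names are hypothetical; the byte offsets follow the RFC 793 diagram, and all multi-byte fields are in network byte order (big-endian).

```c
#include <stddef.h>
#include <stdint.h>

/* Flag bits in the order shown in the diagram: U A P R S F. */
#define TCP_FLAG_FIN 0x01
#define TCP_FLAG_SYN 0x02
#define TCP_FLAG_RST 0x04
#define TCP_FLAG_PSH 0x08
#define TCP_FLAG_ACK 0x10
#define TCP_FLAG_URG 0x20

/* Hypothetical container for the fields used in this article. */
struct tcp_header_fields {
    uint16_t src_port, dst_port;
    uint32_t seq, ack;
    uint8_t  data_offset; /* header length in 32-bit words */
    uint8_t  flags;       /* combination of TCP_FLAG_* bits */
    uint16_t window;
};

static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t rd32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 on success, -1 if buf is shorter than the fixed header. */
int parse_tcp_header(const uint8_t *buf, size_t len,
                     struct tcp_header_fields *out)
{
    if (len < 20)
        return -1;
    out->src_port    = rd16(buf);
    out->dst_port    = rd16(buf + 2);
    out->seq         = rd32(buf + 4);
    out->ack         = rd32(buf + 8);
    out->data_offset = buf[12] >> 4;   /* upper 4 bits of byte 12 */
    out->flags       = buf[13] & 0x3F; /* lower 6 bits of byte 13 */
    out->window      = rd16(buf + 14);
    return 0;
}
```

Checksum, urgent pointer and options are left out of the sketch because the rest of the article does not use them.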

Control information, in summary:

The two kinds of control information used in communication are:
1. Information recorded in the headers
2. Information recorded in the socket (the memory space in the protocol stack)

netstat can be used to view sockets.
Wireshark (for example on Windows) or tcpdump (for example on Linux) can be used to view the protocol headers.

3. Sending and receiving data

ssize_t write(int fd, const void *buf, size_t nbytes);
fd		file descriptor
buf		buffer holding the data to write
nbytes		number of bytes to write

The write() function writes nbytes bytes from the buffer buf to the buffer associated with the file descriptor fd. It returns the number of bytes written on success and -1 on failure.
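
Because write() reports how many bytes it actually accepted, and that can be fewer than requested, callers commonly loop until everything has been handed over. A minimal sketch, assuming a POSIX system; write_all is a hypothetical helper name.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hand all n bytes of buf to descriptor fd.
 * Returns 0 once everything was accepted, -1 on error.
 * write_all is a made-up helper name for this sketch. */
int write_all(int fd, const void *buf, size_t n)
{
    const char *p = buf;
    while (n > 0) {
        ssize_t w = write(fd, p, n);
        if (w < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal: try again */
            return -1;
        }
        p += w;             /* skip the part already accepted */
        n -= (size_t)w;
    }
    return 0;
}
```

Returning 0 only means the data reached the socket's buffer, not that the peer has received it.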

When the application hands data to the protocol stack, the data is not sent immediately. It is first stored in the socket's send buffer and sent once one of the following conditions is met:

  1. Length: the data in the buffer has reached or exceeded the MSS
  2. Time: the protocol stack has an internal timer, and the data is sent once a certain time has passed
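
The two conditions can be written as one predicate. This is a simplified sketch, not how any particular protocol stack decides: the function name and the max_wait_ms parameter are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Decide whether the buffered data should be sent now.
 * buffered    - bytes currently waiting in the send buffer
 * mss         - maximum segment size
 * waited_ms   - how long the oldest byte has been waiting
 * max_wait_ms - hypothetical timer limit chosen by the stack */
bool should_send_now(size_t buffered, size_t mss,
                     unsigned waited_ms, unsigned max_wait_ms)
{
    if (buffered == 0)
        return false;                 /* nothing to send */
    if (buffered >= mss)
        return true;                  /* condition 1: a full segment is ready */
    return waited_ms >= max_wait_ms;  /* condition 2: the timer expired */
}
```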

MTU = header size (TCP and IP headers, generally 40 bytes in total) + MSS
MTU: maximum transmission unit, the maximum length of one network packet
MSS: maximum segment size, the maximum length of data in one packet once the headers are removed

If length takes priority, throughput is high but so is latency.
If time takes priority, latency is low but so is throughput.

<------------ MTU ------------>
| IP head | TCP head |  data  |
                     <- MSS ->

Write buffering

The same idea appears in other systems:
Database persistence mechanisms, such as the asynchronous flush to disk in Redis and MySQL.
Asynchronous sending by Kafka producers, and so on.

Splitting large data

If the request message is too large to fit in one packet, it is split into multiple network packets.
The TCP header is added by the TCP module; the IP header and the MAC header are added by the IP module.
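
The arithmetic behind the split follows from MTU = headers + MSS. The sketch below assumes a 1500-byte Ethernet MTU and 20-byte IP and TCP headers without options when used as in the note underneath; the function names are hypothetical.

```c
#include <stddef.h>

/* MSS = MTU minus the IP header and the TCP header. */
size_t mss_from_mtu(size_t mtu, size_t ip_header_len, size_t tcp_header_len)
{
    size_t headers = ip_header_len + tcp_header_len;
    return mtu > headers ? mtu - headers : 0;
}

/* Number of packets needed to carry total_len bytes of application data. */
size_t segment_count(size_t total_len, size_t mss)
{
    if (mss == 0)
        return 0;
    return (total_len + mss - 1) / mss; /* round up */
}

/* Data length of the packet with the given 0-based index, 0 if out of range. */
size_t segment_length(size_t total_len, size_t mss, size_t index)
{
    size_t start = index * mss;
    if (mss == 0 || start >= total_len)
        return 0;
    size_t remaining = total_len - start;
    return remaining < mss ? remaining : mss;
}
```

With those assumptions the MSS is 1460 bytes, so a 3000-byte message is carried in three packets holding 1460, 1460 and 80 bytes of data.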

ACK and retransmission

Three fields in the TCP header support the ACK-based retransmission mechanism:

Sequence number
ACK number
Data offset

Sequence number: the position, within the whole message, of the first byte of data carried by the current packet.
Data offset: the position in the packet where the data starts, that is, the length of the TCP header.
The length of the data is the total length of the TCP segment minus the data offset.
ACK number: sequence number + data length, that is, the sequence number of the next byte the receiver expects.
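
These two calculations are small enough to show directly. The function names are hypothetical. The data offset field counts 32-bit words, so it is multiplied by 4 to get bytes, and sequence numbers are 32-bit values that wrap around.

```c
#include <stdint.h>

/* Data length = total TCP segment length - header length (data offset * 4). */
uint32_t tcp_payload_len(uint32_t segment_len, uint8_t data_offset_words)
{
    uint32_t header_len = (uint32_t)data_offset_words * 4u;
    return segment_len > header_len ? segment_len - header_len : 0;
}

/* ACK number = sequence number + data length.
 * Unsigned 32-bit addition wraps modulo 2^32, as TCP sequence numbers do. */
uint32_t tcp_ack_number(uint32_t seq, uint32_t payload_len)
{
    return seq + payload_len;
}
```

For example, a 1480-byte segment with a data offset of 5 carries 1460 bytes of data, and if its sequence number is 1 the receiver answers with ACK number 1461.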

After the client sends data, the server returns an ACK number. If the ACK does not arrive within a certain time, the client sends the data again.
If no ACK arrives even after several retransmissions, the protocol stack gives up and returns an error to the application.

Because the TCP module takes care of retransmission, devices such as network cards and routers do not try to recover errors themselves: when they receive a corrupted packet, they simply discard it.

Adjusting the ACK timeout

When the network is busy, congestion occurs and ACKs come back slowly. If the timeout is too short, the sender retransmits too often, which makes the congestion worse.
TCP therefore adjusts the timeout dynamically: if ACKs come back slowly the timeout is lengthened, and if they come back quickly it is shortened.

Because computer timers are coarse, very short timeouts cannot be measured accurately, so the minimum timeout is generally set to between 0.5 and 1 second.
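
The article does not say which formula is used for the adjustment. One published way to do it is the smoothed round-trip estimator of RFC 6298, sketched below with its clock-granularity term left out. The struct and function names are hypothetical, times are in milliseconds, and the lower bound is passed in by the caller.

```c
/* Hypothetical estimator state for this sketch; times in milliseconds. */
struct rto_estimator {
    double srtt;       /* smoothed round-trip time    */
    double rttvar;     /* round-trip time variation   */
    double min_rto;    /* lower bound for the timeout */
    int    has_sample; /* 0 until the first measurement */
};

void rto_init(struct rto_estimator *e, double min_rto_ms)
{
    e->srtt = 0.0;
    e->rttvar = 0.0;
    e->min_rto = min_rto_ms;
    e->has_sample = 0;
}

/* Feed one measured round-trip time and return the new timeout. */
double rto_update(struct rto_estimator *e, double rtt_ms)
{
    if (!e->has_sample) {
        e->srtt = rtt_ms;
        e->rttvar = rtt_ms / 2.0;
        e->has_sample = 1;
    } else {
        double diff = e->srtt - rtt_ms;
        if (diff < 0)
            diff = -diff;
        e->rttvar = 0.75 * e->rttvar + 0.25 * diff;  /* beta  = 1/4 */
        e->srtt   = 0.875 * e->srtt + 0.125 * rtt_ms; /* alpha = 1/8 */
    }
    double rto = e->srtt + 4.0 * e->rttvar;
    return rto < e->min_rto ? e->min_rto : rto;
}
```

Slow ACKs raise both the average and the variation, so the timeout grows; a run of fast ACKs shrinks it again, but never below min_rto.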

Using a window to manage ACK numbers

If the sender waits for each ACK number to arrive before sending the next packet, it does nothing while waiting, which wastes time.
To improve efficiency, TCP uses a sliding window to manage data transmission and ACK numbers.

The sliding window corresponds to the receive buffer in the protocol stack. Its size is carried in the Window field of the TCP header, and the two sides tell each other their window sizes (buffer sizes) when the connection is set up.

How it works:

The receiver tells the sender how much window space remains.
The sender sends packets continuously, without waiting for each ACK, up to the window size.
After the receiver has processed data in its buffer, it tells the sender the new remaining window size.
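
From the sender's point of view the rule comes down to one calculation: how much more may be sent right now. A minimal sketch with a hypothetical function name:

```c
#include <stddef.h>

/* How many more bytes the sender may transmit without waiting for an ACK.
 * window    - window size last advertised by the receiver
 * in_flight - bytes already sent but not yet acknowledged
 * pending   - bytes the application still wants to send */
size_t sendable_bytes(size_t window, size_t in_flight, size_t pending)
{
    size_t room = window > in_flight ? window - in_flight : 0;
    return pending < room ? pending : room;
}
```

When an ACK arrives, in_flight shrinks, and when the receiver's application reads data the advertised window grows; either change lets the sender continue.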

Combining ACK and window updates

When does the receiver send the ACK number and the window size to the sender?

1. When is the window size updated?

  1. When the receiver takes data out of its buffer and passes it to the application
  2. When the two sides exchange window sizes during TCP connection setup

2. When is the ACK number returned?
Once the data has reached the receiver and been stored in its buffer, the ACK number can be returned.

Sending an ACK number and a window update for every received packet would lower network efficiency.
The receiver therefore waits a short while before sending them. When several ACK numbers or window updates are queued, only the last ACK number or the latest window size needs to be sent.

4. Disconnect from the server and delete the socket

After the four-way close completes, the side that closed first enters the TIME_WAIT state, waits for 2MSL, and then deletes the socket.

What is MSL?

MSL: Maximum Segment Lifetime, the maximum time a segment can exist in the network (see RFC 793).
RFC 793 sets it to 2 minutes, which is an engineering rule of thumb. Linux usually uses 30 seconds by default. As a rough estimate for short connections, if 60000 ports are available and each port is held for a 30-second wait, the maximum rate is 60000 / 30 = 2000 connections per second.
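
The estimate above is a single division: the number of usable ports divided by the time each port stays occupied after a connection closes. A sketch with a hypothetical function name, treating both values as inputs rather than facts about any particular system:

```c
/* Rough upper bound on new short connections per second when every
 * closed connection keeps its port occupied for wait_seconds. */
unsigned max_short_conn_rate(unsigned ports, unsigned wait_seconds)
{
    return wait_seconds ? ports / wait_seconds : 0;
}
```

With 60000 ports and a 30-second wait this gives 2000; a longer wait lowers the bound in proportion.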

Why wait for 2MSL?

A packet can take up to one MSL to reach the peer, and the reply can take up to one more MSL to come back, so 2MSL covers one full exchange.

Why wait for 2MSL before deleting the socket?

Waiting prevents a mistaken operation. Take a simple example:

  1. The client sends FIN
  2. The server returns ACK
  3. The server sends FIN
  4. The client returns ACK

If the ACK returned by the client in step 4 is lost, the server resends its FIN. Suppose the client had deleted its socket immediately and a new socket had been created with the same port number: the retransmitted FIN would arrive at the new socket, which would wrongly begin disconnecting.

Keywords: network Back-end Network Protocol TCP/IP

Added by countrydj on Thu, 27 Jan 2022 07:33:02 +0200