The book follows one simple scenario: a user enters a URL in the browser and the browser displays the response. In other words, it traces the full life cycle of a network request.
The book is divided into six parts:
- The application-layer client generates an HTTP request and hands it to the operating system's protocol stack
- The protocol stack (TCP/IP module) calls the network card driver, and the network card turns the packet into an electrical signal
- How the packet travels from the network card to the router used to access the Internet
- Relay transmission within the Internet
- After arriving at the web server side, the packet first passes the firewall check
- How the web server receives the data
The second chapter mainly introduces how the protocol stack and network card in the operating system send application messages to the server:
- Create socket
- Connect to the server
- Send and receive data
- Disconnect from the server and delete the socket
- Packet sending and receiving operation of IP and Ethernet
- Operation of sending and receiving data with UDP
This article covers the first four items and the whole life cycle of the TCP module.
The main highlights are as follows:
- The internal structure of the protocol stack
- What is a socket, concretely? Is there a tool for observing it directly?
- What happens during "connect"?
- The specific workflow when sending and receiving data
- What happens during "disconnect"?
0. General
Before starting the exploration, I sorted out several concepts:
- Internal structure of the protocol stack
- Socket entity
- TCP life cycle
0.1 Internal structure of the protocol stack
The protocol stack is divided into an upper part and a lower part, with the network card driver underneath:
- The TCP and UDP modules send and receive data on behalf of the application.
- The IP module controls the sending of network packets. It includes the ICMP and ARP protocols.
- The network card driver controls the network card hardware, which sends and receives the electrical or optical signals on the network cable.
General applications such as browsers and mail clients usually use TCP. Short control messages such as DNS queries usually use UDP.
0.2 Socket entity
First, use netstat to get an intuitive feel for it.
Socket: the memory space in the protocol stack that stores control information.
Control information: protocol type, IP address, port number, and status.
The socket records information about both sides of the communication and the state of the communication. The protocol stack works according to the control information in the socket.
0.3 TCP lifecycle
A connection goes through a series of states in its life cycle: LISTEN, SYN-SENT, SYN-RECEIVED, ESTABLISHED, FIN-WAIT-1, FIN-WAIT-2, CLOSE-WAIT, CLOSING, LAST-ACK, TIME-WAIT, and CLOSED.
CLOSED is a fictional state: it means that no TCB exists, so the connection does not exist.
TCB: Transmission Control Block, that is, the communication information stored in the socket.
The state machine below explains the three-way handshake and the four-way close ("four waves").
TCP state machine:
                              +---------+ ---------\      active OPEN
                              |  CLOSED |            \    -----------
                              +---------+<---------\   \   create TCB
                                |     ^              \   \  snd SYN
                   passive OPEN |     |   CLOSE        \   \
                   ------------ |     | ----------       \   \
                    create TCB  |     | delete TCB         \   \
                                V     |                      \   \
                              +---------+            CLOSE    |    \
                              |  LISTEN |          ---------- |     |
                              +---------+          delete TCB |     |
                   rcv SYN      |     |     SEND              |     |
                  -----------   |     |    -------            |     V
 +---------+      snd SYN,ACK  /       \   snd SYN          +---------+
 |         |<-----------------           ------------------>|         |
 |   SYN   |                    rcv SYN                     |   SYN   |
 |   RCVD  |<-----------------------------------------------|   SENT  |
 |         |                    snd ACK                     |         |
 |         |------------------           -------------------|         |
 +---------+   rcv ACK of SYN  \       /  rcv SYN,ACK       +---------+
   |           --------------   |     |   -----------
   |                  x         |     |     snd ACK
   |                            V     V
   |  CLOSE                   +---------+
   | -------                  |  ESTAB  |
   | snd FIN                  +---------+
   |                   CLOSE    |     |    rcv FIN
   V                  -------   |     |    -------
 +---------+          snd FIN  /       \   snd ACK          +---------+
 |  FIN    |<-----------------           ------------------>|  CLOSE  |
 | WAIT-1  |------------------                              |   WAIT  |
 +---------+          rcv FIN  \                            +---------+
   | rcv ACK of FIN   -------   |                            CLOSE  |
   | --------------   snd ACK   |                           ------- |
   V        x                   V                           snd FIN V
 +---------+                  +---------+                   +---------+
 |FINWAIT-2|                  | CLOSING |                   | LAST-ACK|
 +---------+                  +---------+                   +---------+
   |                rcv ACK of FIN |                 rcv ACK of FIN |
   |  rcv FIN       -------------- |    Timeout=2MSL -------------- |
   |  -------              x       V    ------------        x       V
    \ snd ACK                 +---------+delete TCB         +---------+
     ------------------------>|TIME WAIT|------------------>| CLOSED  |
                              +---------+                   +---------+
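To make the diagram easier to follow, here is a minimal sketch of the same transitions as a C function. The type and event names (tcp_state, tcp_event, tcp_next_state) are my own, not from RFC 793 or any real protocol stack, and the sketch leaves out the actions (sending SYN, ACK, FIN), the SEND transition out of LISTEN, and all error handling.

```c
#include <assert.h>

/* The eleven states named in RFC 793 (identifiers are hypothetical). */
typedef enum {
    ST_CLOSED, ST_LISTEN, ST_SYN_SENT, ST_SYN_RCVD, ST_ESTABLISHED,
    ST_FIN_WAIT_1, ST_FIN_WAIT_2, ST_CLOSE_WAIT, ST_CLOSING,
    ST_LAST_ACK, ST_TIME_WAIT
} tcp_state;

/* Events that move a connection between states (hypothetical names). */
typedef enum {
    EV_ACTIVE_OPEN, EV_PASSIVE_OPEN, EV_CLOSE, EV_RCV_SYN,
    EV_RCV_SYN_ACK, EV_RCV_ACK, EV_RCV_FIN, EV_TIMEOUT_2MSL
} tcp_event;

/* Return the state that follows `s` when event `e` occurs.
   Events that the diagram does not define for `s` leave it unchanged. */
tcp_state tcp_next_state(tcp_state s, tcp_event e) {
    assert(s <= ST_TIME_WAIT);
    switch (s) {
    case ST_CLOSED:
        if (e == EV_ACTIVE_OPEN)  return ST_SYN_SENT;    /* create TCB, snd SYN */
        if (e == EV_PASSIVE_OPEN) return ST_LISTEN;      /* create TCB */
        break;
    case ST_LISTEN:
        if (e == EV_RCV_SYN) return ST_SYN_RCVD;         /* snd SYN,ACK */
        if (e == EV_CLOSE)   return ST_CLOSED;           /* delete TCB */
        break;
    case ST_SYN_SENT:
        if (e == EV_RCV_SYN_ACK) return ST_ESTABLISHED;  /* snd ACK */
        if (e == EV_RCV_SYN)     return ST_SYN_RCVD;     /* simultaneous open */
        if (e == EV_CLOSE)       return ST_CLOSED;
        break;
    case ST_SYN_RCVD:
        if (e == EV_RCV_ACK) return ST_ESTABLISHED;
        if (e == EV_CLOSE)   return ST_FIN_WAIT_1;       /* snd FIN */
        break;
    case ST_ESTABLISHED:
        if (e == EV_CLOSE)   return ST_FIN_WAIT_1;       /* snd FIN */
        if (e == EV_RCV_FIN) return ST_CLOSE_WAIT;       /* snd ACK */
        break;
    case ST_FIN_WAIT_1:
        if (e == EV_RCV_ACK) return ST_FIN_WAIT_2;
        if (e == EV_RCV_FIN) return ST_CLOSING;          /* simultaneous close */
        break;
    case ST_FIN_WAIT_2:
        if (e == EV_RCV_FIN) return ST_TIME_WAIT;        /* snd ACK */
        break;
    case ST_CLOSE_WAIT:
        if (e == EV_CLOSE) return ST_LAST_ACK;           /* snd FIN */
        break;
    case ST_CLOSING:
        if (e == EV_RCV_ACK) return ST_TIME_WAIT;
        break;
    case ST_LAST_ACK:
        if (e == EV_RCV_ACK) return ST_CLOSED;
        break;
    case ST_TIME_WAIT:
        if (e == EV_TIMEOUT_2MSL) return ST_CLOSED;      /* delete TCB */
        break;
    }
    return s;
}
```

Feeding it the client-side events in order (active open, SYN+ACK received, close, ACK received, FIN received, 2MSL timeout) walks through the handshake, the four waves, and back to CLOSED.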
With the TCP life cycle in mind, the following sections analyze socket creation, connection, sending and receiving, and disconnection.
1. Create socket
int socket(int af, int type, int protocol);
af: address family, i.e. the IP address type. Commonly used values are AF_INET and AF_INET6.
  AF_INET: IPv4, for example 127.0.0.1
  AF_INET6: IPv6, for example 1030::C9B4:FF12:48AA:1A2B
type: data transmission mode (socket type). Commonly used values are SOCK_STREAM and SOCK_DGRAM.
  SOCK_STREAM: stream socket (connection-oriented)
  SOCK_DGRAM: datagram socket (connectionless)
protocol: transport protocol. Commonly used values are IPPROTO_TCP and IPPROTO_UDP.
  IPPROTO_TCP: TCP transport protocol
  IPPROTO_UDP: UDP transport protocol
Return value: the socket descriptor
For details, see "Detailed explanation of socket() function", which covers both the Linux and Windows versions.
Workflow for creating sockets
- Request a piece of memory from the memory manager (comparable to malloc())
- Initialize the control information (protocol type, IP address, port)
- Return the descriptor to the application. It uniquely identifies the socket (the control information), and the application passes it in every later interaction with the protocol stack.
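Seen from the application side, the workflow above is a single call. The sketch below assumes a POSIX system. create_tcp_socket and socket_type_of are helper names I made up for illustration. The second helper reads the recorded socket type back with getsockopt(), just to show that the descriptor really refers to control information held by the protocol stack.

```c
#include <assert.h>
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <sys/socket.h>   /* socket(), getsockopt(), AF_INET, SOCK_STREAM */
#include <unistd.h>       /* close(), used by the caller to release the socket */

/* Ask the protocol stack for a TCP/IPv4 socket.
   Returns the descriptor, or -1 on failure. */
int create_tcp_socket(void) {
    return socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
}

/* Read back the socket type stored for descriptor `fd`.
   Returns SOCK_STREAM, SOCK_DGRAM, ... or -1 on failure. */
int socket_type_of(int fd) {
    int type = -1;
    socklen_t len = sizeof type;
    assert(fd >= 0);
    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) != 0) {
        return -1;
    }
    return type;
}
```

The caller is expected to close() the descriptor when it is done, which lets the protocol stack release the memory it reserved.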
2. Connect to the server
What is a connection?
int connect(int sock, struct sockaddr *serv_addr, socklen_t addrlen);
sock: socket file descriptor
serv_addr: server address (address family, IP address, port)
addrlen: size of serv_addr
"Connection" here is a protocol stack concept rather than a physical one: the network cable is always plugged in and can carry signals at any time.
In the connection stage, the socket has just been created and the protocol stack does not yet know who the communication peer is. In this stage, the client conveys its request to start communicating to the server, and the two sides exchange control information.
Therefore, it may be more appropriate to call this stage the preparation stage.
What happens when connecting:
- The application passes the server's IP address and port to the protocol stack
- The protocol stack sends a request to start communication
- The two sides exchange control information
- Buffers are allocated for sending and receiving data
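The steps above can be tried on a single machine. This is a minimal sketch under a few assumptions: a POSIX system, the loopback address 127.0.0.1 standing in for the server, and two helper names of my own (listen_on_loopback, connect_to). The listening side is only there so that connect() has something to talk to.

```c
#include <arpa/inet.h>    /* htonl() */
#include <assert.h>
#include <netinet/in.h>   /* struct sockaddr_in, INADDR_LOOPBACK */
#include <string.h>       /* memset() */
#include <sys/socket.h>
#include <unistd.h>       /* close() */

/* Open a listening socket on 127.0.0.1 and let the OS choose the port.
   On success, *addr holds the chosen address and the descriptor is returned. */
int listen_on_loopback(struct sockaddr_in *addr) {
    assert(addr != NULL);
    int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (fd < 0) return -1;
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = 0;                        /* 0 = pick any free port */
    socklen_t len = sizeof *addr;
    if (bind(fd, (struct sockaddr *)addr, len) != 0 ||
        listen(fd, 1) != 0 ||
        getsockname(fd, (struct sockaddr *)addr, &len) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Client side: create a socket, then hand the server's IP address and port
   to the protocol stack. connect() returns after the handshake completes. */
int connect_to(const struct sockaddr_in *server) {
    assert(server != NULL);
    int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (fd < 0) return -1;
    if (connect(fd, (const struct sockaddr *)server, sizeof *server) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

After connect_to() succeeds, getpeername() on the client descriptor returns the server's address, which shows that the socket now records both ends of the communication.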
Headers carry the control information
There are two types of control information in a communication operation:
- The control information exchanged between the client and the server while they communicate. These are the headers of the various protocols: TCP, IP, and MAC. A header records control information so that it can be exchanged.
- The information stored in the socket, which controls the operation of the protocol stack. It includes information passed in by the application and information received from the communication peer.
Different protocol stacks implement the control information in the socket differently. That is fine as long as the protocol headers they generate during communication follow the specification.
TCP header information:
Original text: https://www.rfc-editor.org/rfc/rfc793.html#section-3.1
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Source Port          |       Destination Port        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Acknowledgment Number                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Data |           |U|A|P|R|S|F|                               |
   | Offset| Reserved  |R|C|S|S|Y|I|            Window             |
   |       |           |G|K|H|T|N|N|                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Checksum            |         Urgent Pointer        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Options                    |    Padding    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             data                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
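To connect the layout to actual bytes, here is a small sketch that decodes the fixed 20-byte part of a TCP header from a raw buffer. The struct and function names (tcp_header, tcp_parse_header) are hypothetical. Options and the checksum are ignored, and the fields are read byte by byte because they are stored in network byte order (big-endian).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded fields of the fixed TCP header (hypothetical struct). */
typedef struct {
    uint16_t src_port;
    uint16_t dst_port;
    uint32_t seq;          /* sequence number */
    uint32_t ack;          /* acknowledgment number */
    uint8_t  data_offset;  /* header length in 32-bit words */
    uint8_t  flags;        /* low 6 bits: URG ACK PSH RST SYN FIN */
    uint16_t window;
} tcp_header;

/* Bit masks for the control flags, matching the layout above. */
enum { TCP_FIN = 0x01, TCP_SYN = 0x02, TCP_RST = 0x04,
       TCP_PSH = 0x08, TCP_ACK = 0x10, TCP_URG = 0x20 };

/* Read big-endian 16-bit and 32-bit values. */
static uint16_t be16(const uint8_t *p) {
    return (uint16_t)((p[0] << 8) | p[1]);
}
static uint32_t be32(const uint8_t *p) {
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Decode the first 20 bytes of `buf` into `out`.
   Returns 0 on success, -1 if the buffer is too short or the
   data offset is smaller than the minimum header size (5 words). */
int tcp_parse_header(const uint8_t *buf, size_t len, tcp_header *out) {
    assert(buf != NULL && out != NULL);
    if (len < 20) return -1;
    out->src_port    = be16(buf);
    out->dst_port    = be16(buf + 2);
    out->seq         = be32(buf + 4);
    out->ack         = be32(buf + 8);
    out->data_offset = (uint8_t)(buf[12] >> 4);
    out->flags       = (uint8_t)(buf[13] & 0x3F);
    out->window      = be16(buf + 14);
    return out->data_offset >= 5 ? 0 : -1;
}
```

A SYN segment, for example, decodes to a header whose flags field equals TCP_SYN and whose data offset is 5 when no options are present.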
Control information:
There are two types of control information in a communication operation:
1. Information recorded in the headers
2. Information recorded in the socket (memory space in the protocol stack)
netstat can be used to view sockets.
Wireshark (Windows) or tcpdump (Linux) can be used to view the protocol headers.
3. Sending and receiving data
ssize_t write(int fd, const void *buf, size_t nbytes);
fd: file descriptor
write() writes nbytes bytes from the buffer buf into the buffer associated with the file descriptor fd. On success it returns the number of bytes written. On failure it returns -1.
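Because write() returns the number of bytes it actually accepted, which can be smaller than nbytes, callers usually loop until everything has been handed over. The helper below is a minimal sketch of that loop. write_all is a name I chose, not a standard function, and it assumes a blocking descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>   /* write(), ssize_t */

/* Call write() repeatedly until all `n` bytes of `buf` have been accepted.
   Returns 0 on success, -1 on error (errno is set by write()). */
int write_all(int fd, const void *buf, size_t n) {
    const char *p = buf;
    assert(buf != NULL || n == 0);
    while (n > 0) {
        ssize_t written = write(fd, p, n);
        if (written < 0) {
            if (errno == EINTR) continue;   /* interrupted by a signal: retry */
            return -1;
        }
        p += written;                       /* skip what was already accepted */
        n -= (size_t)written;
    }
    return 0;
}
```

It works on any descriptor, so it can be tried on a pipe before using it on a socket.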
After the application hands data to the protocol stack, the data is not sent immediately. It is first stored in the socket's send buffer and sent once one of the following conditions is met:
- Length: the data in the buffer reaches or exceeds the MSS
- Time: the protocol stack has an internal timer, and the data is sent after a certain time has passed
MTU = header size (TCP and IP headers, generally 40 bytes) + MSS
MTU: maximum transmission unit
MSS: maximum segment size
If length takes priority, throughput is high but so is delay.
If time takes priority, delay is low but so is throughput.
<------------- MTU ------------->
| IP header | TCP header | data |
                         <-MSS-->
This is a write buffer. Similar application scenarios include database persistence mechanisms (asynchronous disk flushing in Redis and MySQL), asynchronous sending by Kafka producers, and so on.
Split large data
If the request message is longer than the MSS, it is split into multiple network packets.
The TCP header is added by the TCP module, and the IP header and MAC header are added by the IP module.
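The arithmetic behind MSS and splitting can be written down directly. This is a sketch with helper names of my own (mss_from_mtu, segment_count, segment_len). It assumes headers without options, which is where the usual 20 + 20 = 40 bytes come from.

```c
#include <assert.h>
#include <stddef.h>

/* MSS = MTU minus the IP and TCP header lengths (all in bytes). */
size_t mss_from_mtu(size_t mtu, size_t ip_header_len, size_t tcp_header_len) {
    assert(mtu > ip_header_len + tcp_header_len);
    return mtu - ip_header_len - tcp_header_len;
}

/* Number of packets needed to carry `message_len` bytes (rounds up). */
size_t segment_count(size_t message_len, size_t mss) {
    assert(mss > 0);
    return (message_len + mss - 1) / mss;
}

/* Payload length of the i-th packet (0-based): every packet is full
   except possibly the last one. */
size_t segment_len(size_t message_len, size_t mss, size_t i) {
    assert(mss > 0 && i * mss < message_len);
    size_t remaining = message_len - i * mss;
    return remaining < mss ? remaining : mss;
}
```

With an Ethernet MTU of 1500 bytes the MSS comes out as 1460, so a 4000-byte message needs three packets carrying 1460, 1460, and 1080 bytes.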
ACK retransmission mechanism
Three fields in the TCP header support the ACK retransmission mechanism:
sequence number, ACK number, and data offset
Sequence number: the position of the first byte of this packet's data within the whole message.
Data offset: the position where the data starts, that is, the length of the TCP header.
The length of the data can be calculated from the total length of the TCP packet and the data offset.
ACK number: sequence number + data length, that is, the sequence number of the next byte the receiver expects.
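The two calculations above fit in a few lines. In this sketch the function names (tcp_payload_len, expected_ack) are hypothetical, and the total length of the TCP packet (header plus data) is assumed to be known already. Sequence numbers are 32 bits wide, so the addition wraps around at 2^32, which unsigned arithmetic in C does automatically.

```c
#include <assert.h>
#include <stdint.h>

/* Data length = total TCP packet length minus the header length.
   The data offset field counts 32-bit words, so it is multiplied by 4. */
uint32_t tcp_payload_len(uint32_t segment_len, uint8_t data_offset_words) {
    uint32_t header_len = (uint32_t)data_offset_words * 4u;
    assert(segment_len >= header_len);
    return segment_len - header_len;
}

/* ACK number returned by the receiver: the sequence number of the next
   byte it expects. Wraps around modulo 2^32. */
uint32_t expected_ack(uint32_t seq, uint32_t payload_len) {
    return seq + payload_len;
}
```

For a 1480-byte TCP packet with a data offset of 5 and sequence number 1, the data length is 1460 and the receiver answers with ACK number 1461.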
After the client sends data, the server returns an ACK number. If the client does not receive it within a certain time, it resends the data.
If no ACK arrives after several retransmissions, an error is returned to the application.
Because the TCP module takes care of retransmission, network cards and routers simply discard a corrupted network packet when they receive one.
Adjusting the ACK timeout
When the network is busy, congestion occurs and ACKs come back slowly. If the timeout is too short, it leads to frequent retransmissions, which makes the congestion worse.
TCP therefore adjusts the timeout dynamically. If ACK numbers come back slowly, the timeout is lengthened accordingly. If they come back quickly, the timeout is shortened.
Because the computer's clock has limited precision, a very short timeout cannot be measured accurately, so the timeout has a lower bound, generally 0.5 to 1 second.
Use window to manage ACK number
If the sender waited for each ACK number to arrive before sending the next network packet, it could do nothing while waiting. That is a waste of time.
To improve efficiency, TCP uses a sliding window to manage data transmission and ACK numbers.
The sliding window corresponds to the receive buffer of the protocol stack. Its size is carried in the window field of the TCP header, and the two sides exchange their window sizes (buffer sizes) when they connect.
How it works:
The receiver tells the sender how much window is left. The sender sends network packets continuously, up to that window size. After the receiver has processed data in its buffer, it tells the sender the current remaining size of the buffer.
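From the sender's point of view, the window is bookkeeping over three numbers. The sketch below uses names of my own (send_window, sendable, on_send, on_ack) and models only the byte counting, with no timers, retransmission, or congestion control.

```c
#include <assert.h>
#include <stdint.h>

/* Sender-side view of the window (hypothetical struct). */
typedef struct {
    uint32_t next_seq;  /* sequence number of the next byte to send */
    uint32_t acked;     /* latest ACK number: bytes before it are confirmed */
    uint32_t window;    /* window size last advertised by the receiver */
} send_window;

/* Bytes sent but not yet acknowledged. */
uint32_t bytes_in_flight(const send_window *w) {
    return w->next_seq - w->acked;
}

/* How many more bytes may be sent without waiting for an ACK. */
uint32_t sendable(const send_window *w) {
    uint32_t in_flight = bytes_in_flight(w);
    return in_flight >= w->window ? 0 : w->window - in_flight;
}

/* Try to send `len` bytes; returns how many the window allowed. */
uint32_t on_send(send_window *w, uint32_t len) {
    uint32_t room = sendable(w);
    if (len > room) len = room;
    w->next_seq += len;
    return len;
}

/* Process an ACK carrying a new ACK number and window size. */
void on_ack(send_window *w, uint32_t ack, uint32_t window) {
    /* The ACK must lie within the range of bytes already sent. */
    assert(ack - w->acked <= w->next_seq - w->acked);
    w->acked = ack;
    w->window = window;
}
```

With a 4000-byte window the sender can push out 1460, 1460, and then only 1080 bytes before it has to stop. An ACK for the first packet frees 1460 bytes of window again.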
Combination of ACK and window
When does the receiver send the ACK and the window to the sender?
1. When is the window size updated?
- When the receiver takes data out of the buffer and passes it to the application
- During the TCP connection phase, when the two sides exchange window sizes
2. When is the ACK returned?
The ACK number can be returned once the data has reached the receiver and been stored in the buffer.
Sending an ACK and a window update for every network packet received would reduce network efficiency.
Therefore, the receiver waits for a short time before sending an ACK or a window update. When several ACK numbers or window updates are queued to be sent in a row, only the last ACK number or window size needs to be sent.
4. Disconnect from the server and delete the socket
After the four-way close ("four waves"), the side that initiated the close enters the TIME_WAIT state, waits for 2MSL, and then deletes the socket.
What is MSL?
MSL (Maximum Segment Lifetime): the maximum time a network packet can survive in the network (see RFC 793).
RFC 793 defines the lifetime of a TCP segment in the network as 2 minutes, which is an engineering rule of thumb. Linux usually defaults to 30 seconds (if 60000 ports are available and each waits 30 seconds, short connections to one listening port top out at 60000 / 30 = 2000 QPS).
Why 2MSL?
Sending a network packet and returning its ACK can each take up to one MSL.
Why wait before deleting the socket?
Waiting prevents misoperation. Take a simple example:
- The client sends a FIN
- The server returns an ACK
- The server sends a FIN
- The client returns an ACK
In the fourth step, if the ACK returned by the client is lost, the server resends its FIN. Suppose the client had deleted its socket immediately and then created a new socket with the same port number. The new socket would receive the FIN resent by the server and would wrongly enter the disconnection stage.