socket module and packet sticking

I socket module

socket concept

socket layer

Understanding socket

A socket is a software abstraction layer that sits between the application layer and the TCP/IP protocol family. It is a set of interfaces. In design-pattern terms, socket is a facade: it hides the complex TCP/IP protocol family behind the socket interface. The user only sees a small set of simple interfaces and lets the socket organize the data so that it complies with the specified protocol.

In fact, from our point of view, socket is just a module. We establish the connection and the communication between two processes by calling the methods the module already implements.
Some people describe a socket as ip+port, because the ip identifies the location of a host on the Internet and the port identifies an application on that machine.
So once we know the ip and port, we can find an application and use the socket module to communicate with it.
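
As a small illustration of the ip+port idea, the sketch below binds a TCP socket to the loopback address and prints the pair that identifies it. Port 0 (let the operating system pick a free port) is an assumption made only for this demo and does not appear in the examples that follow.

```python
import socket

# Port 0 asks the OS to choose any free port (demo assumption).
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sk:
    sk.bind(('127.0.0.1', 0))
    ip, port = sk.getsockname()  # the (ip, port) pair that identifies this endpoint
    print(ip, port)
```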

tcp protocol and udp protocol

TCP (Transmission Control Protocol) is a reliable, connection-oriented protocol (like making a phone call). It has lower transmission efficiency, is full duplex (each end has a send buffer and a receive buffer) and is byte-stream oriented. Applications using TCP: web browsers, e-mail, file transfer programs.

UDP (User Datagram Protocol) is an unreliable, connectionless service with high transmission efficiency (little delay before sending). It supports one-to-one, one-to-many, many-to-one and many-to-many communication, is message oriented, offers best-effort delivery and has no congestion control. Applications using UDP: the Domain Name System (DNS), video streaming, Voice over IP (VoIP).
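
In Python the choice between the two protocols is made when the socket is created. The following is a minimal sketch, assuming the loopback address and an OS-chosen port; the variable names are illustrative only. It shows that a UDP socket can send without any connect/accept step.

```python
import socket

# SOCK_STREAM selects TCP, SOCK_DGRAM selects UDP.
tcp_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
udp_recv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_send = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

udp_recv.bind(('127.0.0.1', 0))   # port 0: let the OS choose a free port
udp_recv.settimeout(2)            # do not block forever if the datagram is lost

# Connectionless: no listen(), accept() or connect() is needed.
udp_send.sendto(b'hello', udp_recv.getsockname())
data, sender_addr = udp_recv.recvfrom(1024)
print(data, sender_addr)

for s in (tcp_sock, udp_recv, udp_send):
    s.close()
```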

If this is hard to follow, go straight to the figure above.

II socket initial use

socket based on TCP protocol

TCP is connection based. You must start the server first, and then start the client to connect to the server.

server side

import socket
sk = socket.socket()
sk.bind(('127.0.0.1',8898))  # Bind the address to the socket
sk.listen(5)          # Start listening for connections
conn,addr = sk.accept() # Accept a client connection
ret = conn.recv(1024)  # Receive data from the client
print(ret)       # Print the client's data
conn.send(b'hi')        # Send data to the client
conn.close()       # Close the client connection socket
sk.close()        # Close the server socket (optional)

client side

import socket
sk = socket.socket()           # Create the client socket
sk.connect(('127.0.0.1',8898))    # Try to connect to the server
sk.send(b'hello!')          # Conversation: send
ret = sk.recv(1024)         # Conversation: receive
print(ret)
sk.close()            # Close the client socket

Problem: when restarting the server, you may get an "Address already in use" error.

# Add the following code to the server
from socket import SOL_SOCKET, SO_REUSEADDR
sk.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)  # Add before bind

III Sticky packets

Sticking phenomenon

Let's build a remote command execution program based on TCP (try commands such as ls -l; lll; pwd)

Server

import socket
from socket import SOL_SOCKET, SO_REUSEADDR
import subprocess

server = socket.socket()  # Buy a phone: the default is a TCP socket over the network
server.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)  # Allows quick restarts; add before bind
server.bind(('127.0.0.1', 8080))  # Bind ip and port (the phone's SIM card)
server.listen(5)  # Start listening; 5 is the size of the half-connection pool (backlog)

while True:
    sock, address = server.accept()  # Block until a client completes the three-way handshake
    print(address)  # Client address
    while True:
        try:
            data = sock.recv(1024)  # Listen: receive the command sent by the client
            # On mac and linux, recv returns b'' once the client disconnects
            if len(data) == 0: break
            command_cmd = data.decode('utf8')
            sub = subprocess.Popen(command_cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
            res = sub.stdout.read() + sub.stderr.read()
            sock.send(res)
        except ConnectionResetError as e:
            print(e)
            break
    sock.close()  # Hang up, then wait for the next client

client

import socket

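client = socket.socket()  # Buy a phone
client.connect(('127.0.0.1', 8080))  # Dial

while True:
    msg = input('Please enter cmd command>>>:').strip()
    if len(msg) == 0:
        continue
    # Speak: encode with utf8 to match the decode on the server
    client.send(msg.encode('utf8'))
    # Listen: at most 1024 bytes are read, so longer output stays in the buffer
    data = client.recv(1024)
    print(data.decode('gbk'))  # gbk is the Windows console encoding; use utf8 on mac and linux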

Cause of sticky packets

TCP (Transmission Control Protocol) is connection oriented and stream oriented, and provides a highly reliable service.
Each connection has exactly one socket at each end (client and server). To send multiple packets to the other side more efficiently, the sender uses an optimization (the Nagle algorithm) that merges data sent at short intervals and in small amounts into one larger block before packing it into a segment.
As a result, the receiver cannot tell the messages apart unless a sound unpacking mechanism is provided. In other words, stream-oriented communication has no message boundaries.
About empty messages: TCP is based on a data stream, so the messages sent and received cannot be empty. Both the client and the server therefore need to handle empty input, otherwise the program gets stuck. UDP is based on datagrams, so even empty content (just pressing Enter) can be sent; the UDP protocol still wraps it in a datagram and sends it.
Reliable but sticky TCP: TCP does not lose data. If a read does not take everything, the next read continues from where the last one stopped, and the sender clears the contents of its buffer only after it receives the ack. The data is reliable, but packets can stick together.

Only TCP sticks packets together; UDP does not.
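
The UDP side of this claim can be checked with a short sketch. It assumes the loopback interface, where datagrams are normally delivered in order and without loss; on a real network UDP gives no such guarantee. The names receiver and sender are illustrative.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(('127.0.0.1', 0))   # port 0: let the OS choose a free port
receiver.settimeout(2)
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
target = receiver.getsockname()

sender.sendto(b'hello', target)
sender.sendto(b'', target)        # an empty datagram is still a datagram
sender.sendto(b'world', target)

# Each recvfrom() returns exactly one datagram, never a merged one.
datagrams = [receiver.recvfrom(1024)[0] for _ in range(3)]
print(datagrams)

receiver.close()
sender.close()
```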

Two cases of sticky packets

Sender's caching mechanism

The sender waits until its buffer has collected enough data before sending, which produces sticky packets (the interval between sends is very short and the data is very small, so the pieces are merged and sent together).
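
The effect can be reproduced on one machine. This is a sketch only, assuming a loopback TCP connection on an OS-chosen port; whether the bytes are merged in the sender's buffer or simply pile up in the receiver's buffer depends on timing, but the application sees the same result either way.

```python
import socket
import time

server = socket.socket()
server.bind(('127.0.0.1', 0))     # port 0: OS picks a free port (demo assumption)
server.listen(1)

client = socket.socket()
client.connect(server.getsockname())
conn, _ = server.accept()
conn.settimeout(2)

# Two separate send() calls, issued back to back.
client.send(b'hello')
client.send(b'world')
time.sleep(0.2)                   # give both pieces time to arrive

received = conn.recv(1024)        # typically both "messages" come back in one read
# Keep reading in case the bytes were delivered in more than one piece.
while len(received) < 10:
    received += conn.recv(1024)
print(received)

for s in (client, conn, server):
    s.close()
```

Nothing in the received bytes marks where the first send ended and the second began.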

Caching mechanism of receiver

The receiver does not read the data in its buffer in time, so several packets are read together (for example, the client sends a piece of data, the server reads only a small part of it, and the next read returns the data left over in the buffer, which produces sticky packets).
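
The receiver-side case can be sketched in the same way, again assuming a loopback TCP connection on an OS-chosen port. One logical message is sent, the first read takes only part of it, and the remainder is still waiting in the buffer for the next read.

```python
import socket

server = socket.socket()
server.bind(('127.0.0.1', 0))     # port 0: OS picks a free port (demo assumption)
server.listen(1)

client = socket.socket()
client.connect(server.getsockname())
conn, _ = server.accept()
conn.settimeout(2)

client.sendall(b'hello world')    # one logical message of 11 bytes

part1 = conn.recv(5)              # reads at most 5 bytes
rest = b''
while len(part1) + len(rest) < 11:
    rest += conn.recv(1024)       # the leftover bytes come out on later reads
print(part1, rest)

for s in (client, conn, server):
    s.close()
```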

summary

Sticky packets only occur with the TCP protocol:

1. On the surface, the sticky packet problem is caused by the buffering of the sender and the receiver and by the stream-oriented nature of TCP.

2. In fact, the main reason is that the receiver does not know where one message ends and the next begins, or how many bytes it should extract at a time.

IV Sticky packet solution

The root of the problem is that the receiver does not know the length of the byte stream the sender is about to transmit. The solution is therefore to let the receiver know the total size of the byte stream before the data arrives, and then have the receiver read in a loop until it has received all of it.

Problem with the naive approach:
The program runs much faster than the network transmits, so sending the length of the byte stream as a separate message before each piece of data amplifies the performance loss caused by network latency.

We can use a module that converts the length of the data to be sent into a fixed number of bytes. Then, for each message, the receiver first reads exactly that fixed number of bytes to learn the size of the data that follows, and keeps receiving until it has that many bytes, which is exactly one complete message.

struct module

import struct
obj = struct.pack('i',123456)
print(len(obj))  # 4
obj = struct.pack('i',898898789)
print(len(obj))  # 4
# Any number that fits in a 32-bit signed int packs to exactly 4 bytes
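
A complementary sketch shows the round trip and the size limit. It uses the '!I' format (network byte order, unsigned 32-bit), which is a choice made for this illustration; the examples in this article use the native 'i' format, which also works as long as both ends agree on it.

```python
import struct

# '!I' = network (big-endian) byte order, unsigned 32-bit integer.
header = struct.pack('!I', 123456)
print(len(header))

# unpack always returns a tuple, even for a single value.
(length,) = struct.unpack('!I', header)
print(length)

# A value that does not fit in 32 bits cannot be packed.
try:
    struct.pack('!I', 2 ** 32)
    overflow_rejected = False
except struct.error:
    overflow_rejected = True
print(overflow_rejected)
```

Fixing the byte order explicitly matters when client and server run on machines with different native byte orders.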

Using struct to solve sticky packets

With the struct module, a length can be converted into a fixed-size 4-byte value. This feature can be used to send the data length ahead of the data itself.

When sending:
1. First send the data length converted by struct (4 bytes)
2. Then send the data

When receiving:
1. First receive 4 bytes and use struct to convert them into a number, which is the length of the data to be received
2. Then receive the data according to that length
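
The scheme can be wrapped in two small functions. The names recv_exact, send_msg and recv_msg are hypothetical helpers introduced for this sketch, and socket.socketpair() is used only as a stand-in for an already connected client and server so that the example runs in a single process.

```python
import socket
import struct

def recv_exact(sock, n):
    """Read exactly n bytes; recv() alone may return fewer."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError('socket closed before all bytes arrived')
        chunks.append(chunk)
        remaining -= len(chunk)
    return b''.join(chunks)

def send_msg(sock, payload):
    """Send a 4-byte length prefix followed by the payload."""
    sock.sendall(struct.pack('i', len(payload)) + payload)

def recv_msg(sock):
    """Read the 4-byte length, then exactly that many payload bytes."""
    (length,) = struct.unpack('i', recv_exact(sock, 4))
    return recv_exact(sock, length)

# Stand-in for a connected client/server pair.
a, b = socket.socketpair()
send_msg(a, b'first')
send_msg(a, b'second message')     # sent back to back, likely to "stick"
msgs = [recv_msg(b), recv_msg(b)]  # still split at the right boundaries
print(msgs)
a.close()
b.close()
```

Sending the prefix and the payload in one sendall call also avoids paying for a separate small packet per message.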

Server

import socket
import subprocess
import struct

# Get a phone
server = socket.socket()
# Bind a fixed address
server.bind(('127.0.0.1', 9999))
# Set up the half-connection pool (backlog)
server.listen(5)
while True:
    # Wait for a client
    sock, address = server.accept()
    try:
        while True:
            cmd_command = sock.recv(1024)
            if not cmd_command:  # the client closed the connection
                break
            sub = subprocess.Popen(cmd_command.decode('utf8'),
                                   shell=True,
                                   stdout=subprocess.PIPE,
                                   stderr=subprocess.PIPE)
            msg = sub.stdout.read() + sub.stderr.read()
            # Build the message header (the length packed into 4 bytes)
            msg_head = struct.pack('i', len(msg))
            # Send the header
            sock.sendall(msg_head)
            # Send the real data
            sock.sendall(msg)
    except ConnectionResetError as e:
        print(e)
    finally:
        sock.close()  # hang up, then go back to accept the next client

client

import socket
import struct

# Get phone
client = socket.socket()
# dial
client.connect(('127.0.0.1', 9999))
while True:
    cmd_command = input('Please enter the CMD command to execute (press q to quit)>>>:').strip()
    if not cmd_command: continue
    if cmd_command.lower() == 'q':
        break
    client.send(cmd_command.encode('utf8'))
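    # Receive the 4-byte header
    msg_head = client.recv(4)
    msg_len = struct.unpack('i', msg_head)[0]
    # Receive the real data; recv may return fewer bytes than requested, so loop
    msg = b''
    while len(msg) < msg_len:
        chunk = client.recv(msg_len - len(msg))
        if not chunk:  # the server closed the connection
            break
        msg += chunk
    print(msg.decode('gbk'))  # gbk is the Windows console encoding; use utf8 on mac and linux
client.close()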

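The header can also be a dictionary, which carries more information than a bare length (for example the file name, md5 and total size):

When sending:
1. Send the length of the dictionary header (4 bytes packed with struct)
2. Send the dictionary header
3. Send the real data

When receiving:
1. First receive the header length and unpack it with struct
2. Receive the header content according to that length, then decode and deserialize it
3. Get the details of the data from the deserialized result, then receive the real data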
# server side (fragment: conn is the accepted socket, stdout and stderr are the command output in bytes)
import json
import struct

# 1. Make the header first
header_dic = {
    'filename': 'a.txt',          # sample value
    'md5': 'asdfasdf123123x1',    # sample value
    'total_size': len(stdout) + len(stderr)
}
header_json = json.dumps(header_dic)
header_bytes = header_json.encode('utf-8')
# 2. Send 4 bytes first (they contain the length of the header)
conn.send(struct.pack('i', len(header_bytes)))
# 3. Then send the header
conn.send(header_bytes)
# 4. Finally, send the real data
conn.send(stdout)
conn.send(stderr)
# client side (fragment: client is the connected socket)
import json
import struct

# 1. Receive 4 bytes first and unpack the length of the header
header_size = struct.unpack('i', client.recv(4))[0]
# 2. Receive the header and get header_dic
header_bytes = client.recv(header_size)
header_json = header_bytes.decode('utf-8')
header_dic = json.loads(header_json)
print(header_dic)
total_size = header_dic['total_size']
# 3. Receive the real data
cmd_res = b''
recv_size = 0
while recv_size < total_size:
    # Never ask for more than what is left of this message
    data = client.recv(min(1024, total_size - recv_size))
    recv_size += len(data)
    cmd_res += data
print(cmd_res.decode('gbk'))

Summary: send the 4-byte length of the dictionary header first, then the dictionary header itself, and finally the real data
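
Put together, the three steps might look like the sketch below. The functions recv_exact, send_with_header and recv_with_header are hypothetical helpers written for this illustration, the md5 is computed with hashlib rather than hard-coded, and socket.socketpair() again stands in for a connected client and server.

```python
import hashlib
import json
import socket
import struct

def recv_exact(sock, n):
    """Read exactly n bytes; recv() alone may return fewer."""
    buf = b''
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError('socket closed before all bytes arrived')
        buf += chunk
    return buf

def send_with_header(sock, filename, payload):
    header = {
        'filename': filename,
        'md5': hashlib.md5(payload).hexdigest(),
        'total_size': len(payload),
    }
    header_bytes = json.dumps(header).encode('utf-8')
    sock.sendall(struct.pack('i', len(header_bytes)))  # 1. header length
    sock.sendall(header_bytes)                         # 2. dictionary header
    sock.sendall(payload)                              # 3. real data

def recv_with_header(sock):
    header_size = struct.unpack('i', recv_exact(sock, 4))[0]
    header = json.loads(recv_exact(sock, header_size).decode('utf-8'))
    payload = recv_exact(sock, header['total_size'])
    return header, payload

# Stand-in for a connected client/server pair.
left, right = socket.socketpair()
send_with_header(left, 'a.txt', b'x' * 3000)
send_with_header(left, 'b.txt', b'second')
first = recv_with_header(right)
second = recv_with_header(right)
print(first[0]['filename'], len(first[1]))
print(second[0]['filename'], second[1])
left.close()
right.close()
```

Because the header is JSON, new fields can be added later without changing the 4-byte prefix that the receiver reads first.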

Keywords: socket

Added by newb110508 on Thu, 13 Jan 2022 10:53:48 +0200