Simple summary of select, poll and epoll

1. socket

A socket is a special file type used for inter-process communication. In essence it is a pseudo-file that the kernel backs with send and receive buffers.
socket = IP address (uniquely identifies a host) + protocol + port number (identifies a process on that host)
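
As a minimal sketch of that triple in C, the following fills a struct sockaddr_in with an IPv4 address and a port. The protocol is chosen separately when the socket is created (for example SOCK_STREAM for TCP). The helper names make_endpoint and format_endpoint and the sample address are illustrative assumptions, not part of any library.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: builds an IPv4 endpoint from a dotted-quad string and a port.
 * Returns 0 on success, -1 if ip is not a valid IPv4 address. */
int make_endpoint(const char *ip, unsigned short port, struct sockaddr_in *out)
{
    assert(ip != NULL && out != NULL);
    memset(out, 0, sizeof(*out));
    out->sin_family = AF_INET;
    out->sin_port = htons(port);  /* ports are stored in network byte order */
    return inet_pton(AF_INET, ip, &out->sin_addr) == 1 ? 0 : -1;
}

/* Hypothetical helper: writes "ip:port" into buf. Returns 0 on success, -1 on failure. */
int format_endpoint(const struct sockaddr_in *addr, char *buf, size_t len)
{
    char ip[INET_ADDRSTRLEN];
    if (inet_ntop(AF_INET, &addr->sin_addr, ip, sizeof(ip)) == NULL)
        return -1;
    int n = snprintf(buf, len, "%s:%u", ip, (unsigned)ntohs(addr->sin_port));
    return (n > 0 && (size_t)n < len) ? 0 : -1;
}
```

A structure filled this way is what bind() or connect() would receive, cast to struct sockaddr *.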

2. Creating a TCP connection with sockets

3. select()/poll()/epoll()

3.1 select()

  1. select() function prototype
    int select(int nfds,                 // Highest-numbered descriptor in any of the three sets, plus one
               fd_set *readfds,          // Descriptors to monitor for readability
               fd_set *writefds,         // Descriptors to monitor for writability
               fd_set *exceptfds,        // Descriptors to monitor for exceptional conditions
               struct timeval *timeout   // NULL: block indefinitely; zero-valued: return immediately; otherwise: wait at most this long
               );
    
  2. select() time complexity
    When select returns, it reports only how many descriptors are ready, not which ones. The caller has to test every monitored descriptor to find them, and the kernel likewise checks every descriptor on each call, so select has O(n) complexity in the number of monitored descriptors.
  3. select() summary
    select works by setting and checking flag bits in a data structure (fd_set) that holds the monitored descriptors. Its disadvantages are:
    1. The number of descriptors a single process can monitor is limited by FD_SETSIZE, which is 1024 on Linux with glibc on both 32-bit and 64-bit systems. (cat /proc/sys/fs/file-max shows a different limit: the system-wide maximum number of open files.)
    2. Descriptors are checked by linear scan, that is, by polling each one in turn, which is inefficient when many are monitored.
    3. The fd_set structures must be copied from user space to kernel space every time select is called, which becomes a significant overhead when they hold many descriptors.
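
The points above can be sketched with a pipe standing in for a socket. This is a minimal illustration, assuming a Linux or other POSIX system; wait_readable_select and select_pipe_demo are hypothetical helper names.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: waits up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable_select(int fd, int timeout_ms)
{
    assert(timeout_ms >= 0);

    /* An fd_set cannot represent descriptors >= FD_SETSIZE. */
    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;

    fd_set readfds;
    FD_ZERO(&readfds);      /* select() overwrites the set, so rebuild it before every call */
    FD_SET(fd, &readfds);

    struct timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* First argument: highest descriptor in any set, plus one. */
    int ready = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (ready < 0)
        return -1;

    /* The return value is only a count; each descriptor must be tested. */
    return FD_ISSET(fd, &readfds) ? 1 : 0;
}

/* Hypothetical demo: returns 1 if an empty pipe is reported as not ready
 * and the same pipe is reported as ready after one byte is written. */
int select_pipe_demo(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    int before = wait_readable_select(fds[0], 0);
    ssize_t written = write(fds[1], "x", 1);
    int after = wait_readable_select(fds[0], 100);

    close(fds[0]);
    close(fds[1]);
    return (written == 1 && before == 0 && after == 1) ? 1 : 0;
}
```

With many descriptors, the FD_ZERO/FD_SET setup and the FD_ISSET test would each become a loop over all of them, which is where the O(n) cost comes from.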

3.2 poll()

  1. Prototype of poll() function

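    int poll(struct pollfd *fds, // Array of descriptors and the events to monitor on each
             nfds_t nfds,        // Number of entries in fds
             int timeout         // Timeout in milliseconds: -1 blocks indefinitely, 0 returns immediately
            );

    struct pollfd {
        int fd;        // File descriptor
        short events;  // Events to monitor (set by the caller)
        short revents; // Events that actually occurred (filled in by the kernel)
    };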
    
  2. poll() time complexity
    Time complexity O(n)

  3. poll() summary
    1. poll is essentially no different from select. It copies the array passed in by the user into kernel space and then queries the device status of each fd. However, it has no FD_SETSIZE-style limit on the number of descriptors, because the caller passes an array of struct pollfd of any length instead of a fixed-size bit set.

    2. poll is level-triggered: if an fd is not handled after it is reported, it is reported again by the next poll call.
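
The level-triggered behaviour can be sketched as follows, again with a pipe in place of a socket. The names poll_readable_now and poll_level_trigger_demo are hypothetical helpers written for this illustration.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: checks one descriptor for readability without blocking.
 * Returns 1 if POLLIN is reported, 0 if not, -1 on error. */
int poll_readable_now(int fd)
{
    assert(fd >= 0);  /* poll() silently ignores negative descriptors */

    struct pollfd pfd;
    pfd.fd = fd;
    pfd.events = POLLIN;  /* what the caller asks for */
    pfd.revents = 0;      /* what the kernel reports back */

    int ready = poll(&pfd, 1, 0);  /* timeout 0: return immediately */
    if (ready < 0)
        return -1;
    return (pfd.revents & POLLIN) ? 1 : 0;
}

/* Hypothetical demo: returns 1 if unread data is reported by two consecutive
 * polls and is no longer reported once it has been read. */
int poll_level_trigger_demo(void)
{
    int fds[2];
    char byte;
    if (pipe(fds) != 0)
        return -1;

    ssize_t written = write(fds[1], "x", 1);
    int first = poll_readable_now(fds[0]);   /* data pending */
    int second = poll_readable_now(fds[0]);  /* still pending, so reported again */
    ssize_t got = read(fds[0], &byte, 1);    /* drain the pipe */
    int third = poll_readable_now(fds[0]);   /* nothing left to report */

    close(fds[0]);
    close(fds[1]);
    return (written == 1 && got == 1 &&
            first == 1 && second == 1 && third == 0) ? 1 : 0;
}
```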

3.3 epoll()

  1. Prototypes of the epoll functions
    // Tree building: creates an epoll object and returns a file descriptor that refers to it.
    // Since Linux 2.6.8 size is ignored, but it must be greater than zero.
    int epoll_create(int size);

    // Adds, removes, or modifies a node: registers a socket and its events of interest
    // with the epoll object so that the socket can be monitored through it.
    int epoll_ctl(int epfd,                  // epoll descriptor returned by epoll_create()
                  int op,                    // Action: EPOLL_CTL_ADD (1), EPOLL_CTL_DEL (2), or EPOLL_CTL_MOD (3)
                  int sockid,                // Descriptor to act on, for example a client connection
                  struct epoll_event *event  // Event information, used by EPOLL_CTL_ADD and EPOLL_CTL_MOD
                  );

    typedef union epoll_data {
        void *ptr;
        int fd;
        __uint32_t u32;
        __uint64_t u64;
    } epoll_data_t; // User data tied to a monitored descriptor, handed back when its event fires

    struct epoll_event {
        __uint32_t events;      /* epoll events */
        epoll_data_t data;      /* User data variable */
    };

    // Blocks until at least one event is ready or the timeout expires; returns the number of ready events.
    int epoll_wait(int epfd,                    // epoll descriptor returned by epoll_create()
                   struct epoll_event *events,  // Output array that receives up to maxevents ready events
                   int maxevents,               // Length of the events array
                   int timeout);                // Timeout in milliseconds: -1 blocks indefinitely, 0 returns immediately
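  2. epoll() time complexity
    Time complexity O(1) with respect to the number of monitored descriptors: epoll_wait returns only the descriptors that are ready, so its cost depends on the number of ready events rather than on how many descriptors are registered.
    epoll tells us exactly which streams have I/O events, which is why epoll is described as event-driven.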
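
A minimal sketch of this, assuming Linux: several pipes are registered, only one is made readable, and epoll_wait hands back just that one together with the user data stored for it. The function find_ready_pipe and the constant NUM_PIPES are hypothetical names chosen for the illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

#define NUM_PIPES 8  /* arbitrary count chosen for the illustration */

/* Hypothetical demo: registers NUM_PIPES pipe read ends with epoll, writes to
 * pipe number `active` only, and returns the index that epoll_wait reports.
 * Returns -1 on error or if anything other than exactly one event comes back. */
int find_ready_pipe(int active)
{
    assert(active >= 0 && active < NUM_PIPES);

    int pipes[NUM_PIPES][2];
    int opened = 0;
    int ok = 1;
    int result = -1;

    int epfd = epoll_create(1);  /* size is ignored but must be > 0 */
    if (epfd < 0)
        return -1;

    for (int i = 0; i < NUM_PIPES && ok; i++) {
        if (pipe(pipes[i]) != 0) {
            ok = 0;
            break;
        }
        opened++;

        struct epoll_event ev = {0};
        ev.events = EPOLLIN;
        ev.data.u32 = (uint32_t)i;  /* user data: our own index, returned with the event */
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipes[i][0], &ev) != 0)
            ok = 0;
    }

    if (ok && write(pipes[active][1], "x", 1) == 1) {
        struct epoll_event ready[NUM_PIPES];
        int n = epoll_wait(epfd, ready, NUM_PIPES, 100);
        if (n == 1)  /* only the readable pipe is returned, no scan needed */
            result = (int)ready[0].data.u32;
    }

    for (int i = 0; i < opened; i++) {
        close(pipes[i][0]);
        close(pipes[i][1]);
    }
    close(epfd);
    return result;
}
```

The caller loops over the n returned events only, never over all registered descriptors.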

  3. epoll() summary
    1. There is no fixed cap on concurrent connections. The number of descriptors that can be monitored is bounded only by system limits and is far greater than 1024 (roughly 100,000 on a machine with 1 GB of memory).

    2. Efficiency: epoll does not poll every descriptor, so its efficiency does not fall as the number of descriptors grows. In real network environments, where only a fraction of connections are active at any moment, epoll is much more efficient than select and poll.

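    3. Memory copy: the set of monitored descriptors is registered once with epoll_ctl and kept in the kernel, so it is not copied from user space on every wait as it is with select and poll; epoll_wait copies out only the ready events. (The frequent claim that epoll shares memory with user space through mmap() does not match the Linux implementation.)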
    4. epoll has two trigger modes: level-triggered (LT) and edge-triggered (ET, selected with the EPOLLET flag). LT is the default mode.

    In LT mode, epoll_wait keeps returning the event for as long as the fd has data to read, reminding the program to handle it.
    In ET mode, epoll_wait reports the event only once, when new data arrives, and does not report it again until more data arrives, whether or not unread data remains in the fd. ET is therefore normally used with non-blocking descriptors that are read until EAGAIN.
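
The difference between the two modes can be sketched by leaving data unread and calling epoll_wait twice. This is a minimal illustration assuming Linux; count_reports is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: registers a pipe's read end with EPOLLIN plus
 * extra_flags (0 for LT, EPOLLET for ET), writes one byte, and calls
 * epoll_wait twice without reading. Returns how many of the two calls
 * reported the descriptor, or -1 on error. */
int count_reports(uint32_t extra_flags)
{
    assert(extra_flags == 0 || extra_flags == EPOLLET);

    int fds[2];
    int reports = 0;
    if (pipe(fds) != 0)
        return -1;

    int epfd = epoll_create(1);  /* size is ignored but must be > 0 */
    if (epfd < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    struct epoll_event ev = {0};
    ev.events = EPOLLIN | extra_flags;
    ev.data.fd = fds[0];

    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev) != 0 ||
        write(fds[1], "x", 1) != 1) {
        reports = -1;
    } else {
        for (int i = 0; i < 2; i++) {
            struct epoll_event out;
            int n = epoll_wait(epfd, &out, 1, 0);  /* timeout 0: do not block */
            if (n < 0) {
                reports = -1;
                break;
            }
            reports += n;  /* the byte is deliberately left unread */
        }
    }

    close(epfd);
    close(fds[0]);
    close(fds[1]);
    return reports;
}
```

Under these assumptions count_reports(0) returns 2, because LT repeats the event while data remains, and count_reports(EPOLLET) returns 1, because ET reports the arrival only once.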

Keywords: C++ network

Added by miksel on Tue, 08 Mar 2022 14:44:41 +0200