[I/O multiplexing] select system call

[1] Function prototype

#include<sys/select.h>
int select(int maxfdp,fd_set *readfds,fd_set *writefds,fd_set *errorfds,
			struct timeval *timeout);

[2] Meaning of each parameter

int maxfdp is the highest-numbered file descriptor in any of the three sets, plus 1. The plus 1 is needed because file descriptors start at 0, so select checks descriptors 0 through maxfdp-1.

fd_set can be understood as a set in which file descriptors (file handles) are stored.

  • readfds: the set of file descriptors to check for readability;
  • writefds: the set of file descriptors to check for writability;
  • errorfds: the set of file descriptors to check for exceptional conditions.

The fd_set collection is manipulated through the following macros:

#include<sys/select.h>

FD_ZERO(fd_set *fdset)	//Remove all file descriptors from fdset.

FD_SET(int fd, fd_set *fdset)	//Add the file descriptor fd to fdset.

FD_CLR(int fd, fd_set *fdset)	//Remove the file descriptor fd from fdset.

FD_ISSET(int fd, fd_set *fdset)	//Check whether fd is a member of fdset; a nonzero result means it is. After select returns, this tells you whether fd is ready.

The struct timeval structure describes a length of time. It is used as the timeout of select: if no event occurs on any monitored descriptor within that time, select returns 0. The structure is as follows:

struct timeval
{
	time_t       tv_sec;   /* seconds */
	suseconds_t  tv_usec;  /* microseconds (10^-6 seconds) */
};

The timeout argument gives three ways of waiting:

  • If timeout is NULL, select blocks until an event occurs on one of the file descriptors.
  • If timeout points to a structure holding a nonzero time, select returns as soon as an event occurs or when that time has elapsed, whichever comes first.
  • If timeout points to a structure whose fields are both 0, select does not block: it checks the state of the descriptor sets and returns immediately.

Function return value: the total number of descriptors whose bit is still 1 in the three sets after the call.
Note: Only the bits of descriptors that are readable, writable, or have an exceptional condition pending remain 1; all other bits are cleared to 0.

The select function modifies the fd_set arguments passed in: on return, only the bits of the descriptors that are readable, writable, or in an exceptional state are still 1.

Summary: The three fd_sets monitor file descriptors for read, write, and exceptional conditions respectively. select returns a value greater than 0 if any of them is ready, 0 if the timeout expires first, and -1 if an error occurs. Any of the fd_set arguments can be NULL, meaning that the caller is not interested in that kind of event.

[3] Simple understanding of fd_set structure

The key to understanding the select model is to understand fd_set. For illustration, assume fd_set is 1 byte long and that each bit corresponds to one file descriptor: the rightmost bit is fd 0 and the leftmost bit is fd 7. A 1-byte fd_set can therefore hold at most eight fds.

(1) Execute fd_set set; FD_ZERO(&set); the bit representation of set is then 0000,0000.

(2) If fd=5, executing FD_SET(fd, &set); changes set to 0010,0000 (bit 5 is 1).

(3) If fd=2 and fd=1 are then added, set becomes 0010,0110.

(4) Execute select(6, &set, NULL, NULL, NULL); which blocks and waits.

(5) If readable events occur on fd=1 and fd=2, select returns and set becomes 0000,0110.

Note: fd=5, on which no event occurred, has been cleared.

[4] Characteristics of select model

From the above we can see the characteristics of the select model:

(1) The number of file descriptors that can be monitored is limited. select uses a fixed-size bit field to pass in the file descriptors of interest, so there is a maximum (FD_SETSIZE). select also uses bit fields to return the ready file descriptors, and the caller has to test each bit to find out which ones are ready. When many file descriptors are monitored but far more of them are idle than ready, this is inefficient.

(2) Besides adding each fd to the select monitoring set, the program also keeps the monitored fds in an array, for two reasons.
First, after select returns, the array serves as the source of fds to test with FD_ISSET to find out which ones are ready. Second, select clears the bits of every fd on which nothing happened, so before each call to select the fds must be added to the fd_set again, one by one, from the array. While the array is scanned, the maximum fd (maxfd) is also determined, and maxfd plus one is passed as the first parameter of select.

(3) The select model therefore loops over the array before each select call (adding each fd to the fd_set and computing maxfd) and loops over it again after select returns (using FD_ISSET to determine whether an event occurred).

[5] Procedure for using the select function

First call the macro FD_ZERO to clear the fd_set, then call the macro FD_SET to add each fd you want to test, then call select to test all the fds in the fd_set, and finally use the macro FD_ISSET to check whether each fd's bit is still 1 after select returns.

[6] Program example

The following is a TCP-based client-server program. The server uses select to monitor the listening socket and every connected client socket in the fdset collection. After a client has connected, nothing happens until it sends data; at that point select reports a ready event and the server handles it. The timeout is set to 5 seconds: if no connection request or data arrives within five seconds, "time out" is printed and the server calls select again.

The program is as follows:
server.c

#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>
#include<string.h>
#include<assert.h>
#include<arpa/inet.h>
#include<netinet/in.h>
#include<sys/socket.h>
#include<sys/time.h>
#include<sys/select.h>

#define MAXFD 10	//Size of fds array

void fds_add(int fds[],int fd)	//Store fd in the first free slot (-1) of the fds array; if the array is full, fd is not stored
{
	int i=0;
	for(;i<MAXFD;++i)
	{
		if(fds[i]==-1)
		{
			fds[i]=fd;
			break;
		}
	}
}

int main()
{
	int sockfd=socket(AF_INET,SOCK_STREAM,0);
	assert(sockfd!=-1);
	
	printf("sockfd=%d\n",sockfd);

	struct sockaddr_in saddr;
	memset(&saddr,0,sizeof(saddr));
	saddr.sin_family=AF_INET;
	saddr.sin_port=htons(6000);
	saddr.sin_addr.s_addr=inet_addr("127.0.0.1");

	int res=bind(sockfd,(struct sockaddr*)&saddr,sizeof(saddr));
	assert(res!=-1);
	
	//Start listening; 5 is the length of the queue of pending connections
	res=listen(sockfd,5);
	assert(res!=-1);
	
	//Define fdset collection
	fd_set fdset;

	//Define fds array; -1 marks an unused slot
	int fds[MAXFD];
	int i=0;
	for(;i<MAXFD;++i)
	{
		fds[i]=-1;
	}

	//Add the listening socket to the fds array
	fds_add(fds,sockfd);

	while(1)
	{
		FD_ZERO(&fdset);//select modifies fdset, so clear and rebuild it on every iteration

		int maxfd=-1;

		int i=0;

		//Add every valid descriptor in the fds array to fdset and find the largest descriptor
		for(;i<MAXFD;i++)
		{
			if(fds[i]==-1)
			{
				continue;
			}

			FD_SET(fds[i],&fdset);

			if(fds[i]>maxfd)
			{
				maxfd=fds[i];
			}
		}

		struct timeval tv={5,0};	//Timeout of 5 seconds; set again each time because select may modify it

		int n=select(maxfd+1,&fdset,NULL,NULL,&tv);//select system call; here we only monitor read events
		if(n==-1)	//select failed
		{
			perror("select error");
		}
		else if(n==0)//Timeout: no descriptor became ready
		{
			printf("time out\n");
		}
		else//At least one descriptor is ready
		{
			//The return value of select only gives the number of ready descriptors, not which ones are ready.
			//Therefore, each file descriptor has to be checked in turn
			for(i=0;i<MAXFD;++i)
			{
				if(fds[i]==-1)	//Unused slot, skip it
				{
					continue;
				}
				if(FD_ISSET(fds[i],&fdset))	//Is this file descriptor ready for reading?
				{
					//A ready file descriptor is one of two cases

					if(fds[i]==sockfd)	//Listening socket: a new client is requesting a connection
					{
						//accept
						struct sockaddr_in caddr;
						socklen_t len=sizeof(caddr);

						int c=accept(sockfd,(struct sockaddr *)&caddr,&len);	//Accept the new client connection
						if(c<0)
						{
							continue;
						}
					
						printf("accept c=%d\n",c);
						fds_add(fds,c);//Add the connection socket to the array where the file descriptor is stored
					}
					else	//Connected socket: an existing client sent data or closed the connection
					{
						char buff[128]={0};
						int res=recv(fds[i],buff,127,0);
						if(res<=0)
						{
							close(fds[i]);
							fds[i]=-1;
							printf("one client over\n");
						}
						else
						{
							printf("recv(%d)=%s\n",fds[i],buff);	//Output Client Sent Information
							send(fds[i],"OK",2,0);	//Reply message to client
						}
					}
				}
			}
		}
	}
}

client.c

#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>
#include<string.h>
#include<assert.h>
#include<sys/socket.h>
#include<netinet/in.h>
#include<arpa/inet.h>

int main()
{
	int sockfd = socket(AF_INET,SOCK_STREAM,0);	
	assert(sockfd != -1 );

	//Set Address Information
	struct sockaddr_in saddr;
	memset(&saddr,0,sizeof(saddr));
	saddr.sin_family = AF_INET;
	saddr.sin_port = htons(6000);
	saddr.sin_addr.s_addr = inet_addr("127.0.0.1");

	//Connect to server
	int res = connect(sockfd,(struct sockaddr*)&saddr,sizeof(saddr));
	assert(res != -1);

	while(1)
	{
		char buff[128] = {0};
		printf("Please Input:");
		fgets(buff,128,stdin);
		if(strncmp(buff,"end",3) ==0 )
		{
			break;
		}
		send(sockfd,buff,strlen(buff),0);
		memset(buff,0,128);
		if(recv(sockfd,buff,127,0)<=0)	//Server closed the connection or an error occurred
		{
			break;
		}
		printf("RecvBuff:%s\n",buff);
		printf("\n");
	}
	close(sockfd);
}

[7] Disadvantages of select

  • The number of fds a single process can monitor with select is limited: every descriptor must be numbered below FD_SETSIZE, a compile-time constant that is 1024 with glibc on Linux, on both 32-bit and 64-bit machines. This limit is separate from the system-wide maximum number of open files, which depends on system memory and can be viewed with cat /proc/sys/fs/file-max.

  • Sockets are scanned linearly, i.e. by polling, which is inefficient: on every call, select() checks every descriptor below its first argument, regardless of which sockets are active, and the caller then has to scan the sets again. With many sockets this wastes a lot of CPU time.

  • You need to maintain a data structure that holds a large number of fds, and the fd_sets have to be copied between user space and kernel space on every call, which is expensive.

Keywords: socket

Added by WM_Programmer_noob on Sat, 20 Jul 2019 04:02:43 +0300