Linux network programming - VII

1. Preparation for web server development

Before writing a web server, we need to learn how to write HTML pages and understand the basics of the HTTP protocol; both are introduced below. After that, we need to understand the communication flow of a web server and think about how to support concurrent access from multiple browsers.

1.1 HTML language basics (HTML is similar in nature to Markdown and shares some syntax with it, so I added spaces inside < > when editing to avoid problems in the editor)

Introduction to HTML
HTML (HyperText Markup Language) is a hypertext markup language. Files with the extension .html or .htm are recognized by the browser and rendered as the web pages we see every day.

HTML syntax is concise and forgiving. Tags are built from English keywords and are not case sensitive. Most tags come in pairs, with an opening and a closing tag, such as < html > and < / html >, but pairing is not always required. There are also self-closing short tags, such as < br / > and < hr / >.

Learning HTML is largely a matter of learning the various tags. Tags can also carry attributes, such as < font color = "red" > hello,world < / font >. In this example, color is the color attribute of the tag, red sets the font color to red, and hello,world is the content actually displayed. To try it, create a new text document, change its extension to .html, open it in a code editor (such as Notepad++), save the content above into it, and double-click the file to see the following effect:

An HTML document consists of the following parts:

  1. < ! DOCTYPE html > declares the document type; it can be omitted
  2. < html > and < / html > are the root tags
  3. < head > < / head > are the header tags; the header usually contains < title > < / title >
  4. < body > < / body > are the body tags, which generally hold the content to display

For example:

<html> 
	<head>
		<title>This is a title</title>
	</head>
 
	<body>
		<font color="red" size="5">hello, world</font>
	</body>
</html>

To add a comment, use < !-- this is a comment -- >
You can also specify the page type and character encoding. The following sets the page type to html and the character encoding to UTF-8:
< meta http-equiv="Content-Type" content="text/html; charset=utf-8" >
HTML attribute values can be enclosed in double quotes or single quotes, or left unquoted.

1.2 Introduction to HTML tags

1.2.1 Heading tags

There are six heading tags, < h1 >, < h2 >, ... < h6 >, of which < h1 > is the largest and < h6 > is the smallest.

1.2.2 Text tags

The < font > tag can set the color and font size attributes.
Ways to specify a color (reference: http://tool.oschina.net/commons?type=3):

  • An English color name: red, green, blue
  • A hexadecimal color: #ffffff
  • An RGB triple: rgb(255,255,0)

The font size is set with the size attribute. The range is 1-7, where 7 is the largest and 1 is the smallest.

Sometimes you need a line break, which is the short tag < br / >.

Similarly, a horizontal rule is also a short tag, < hr / >; its color and size can be set as well.

1.2.3 List tags

Lists are divided into unordered lists and ordered lists, using < ul > and < ol > respectively.
The format of an unordered list is as follows:

<ul>
	<li>List content 1</li>
	<li>List content 2</li>
	...
</ul>

The type attribute can be set for unordered lists:

  • Filled circle: type=disc
  • Hollow circle: type=circle
  • Small square: type=square

The format of the ordered list is as follows:

<ol>
	<li>List content 1</li>
	<li>List content 2</li>
	...
</ol>

The ordered list can also set the type attribute:

  • Number: type=1, which is also the default
  • Letters: type=a or type=A
  • Roman numerals: type=i or type=I

1.2.4 Image tags

The image tag is < img >. It takes several attributes and does not need a closing tag:

  1. src = "3.gif": the image source, required
  2. alt = "Xiaoyue": the text shown when the image cannot be displayed
  3. title = "my God": the text shown when the mouse hovers over the image
  4. width = "600": the display width of the image
  5. height = "400": the display height of the image

For example:

<img src="3.gif" alt="Xiao Yueyue" title="My god!" width="300" height="200" />  

Note: when neither width nor height is set, the image is displayed at its original size. If only one of width or height is set, the image is scaled proportionally.

1.2.5 Hyperlink tags

The hyperlink tag is < a >. It also needs attributes to indicate where to link.
Attributes:

  1. href = "http://www.itcast.cn": the destination address, required; include the http:// prefix
  2. title = "go to spread wisdom": the text shown when the mouse hovers over the link
  3. target = "_self" or "_blank": _self is the default and opens the link in the current page; _blank opens the link in a new page
    Example:
<a href="http://www.itcast.cn" title="go to spread wisdom" target="_self">to spread wisdom</a>

When we visit a website and the requested resource does not exist, the server usually reports a 404 error and returns an error page to the user. You can try writing your own error page.

1.3 HTTP: Hypertext Transfer Protocol

The "ht" at the start of both HTTP and HTML stands for hypertext, so the two are closely related: HTTP can be thought of as the protocol that transfers files such as HTML. HTTP sits at the application layer and is concerned with how messages are interpreted.

HTTP messages are divided into request messages and response messages.

1.3.1 HTTP request message

When we use a browser to access a resource over HTTP (for example http://127.0.0.1:8000), the browser sends a request message to the server as soon as we type the address and press Enter.

We can first start a socket server with a test tool:
Then request the address in the browser, and the tool shows the request message sent by the browser:
The output looks messy and complex; this is the request message.
A request message has four parts:

  1. The request line gives the request type, the resource to access, and the HTTP version used
  2. The request headers carry additional information for the server as key-value pairs, such as the browser type
  3. The blank line, which cannot be omitted, is \r\n; the request line and every request header also end with \r\n
  4. The request data is the actual content of the request and can be omitted; for example, when logging in, the user name and password are sent as the request data
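
The four parts can be located in a raw message by searching for the \r\n terminators. The following is a minimal sketch, not part of the server listing later in this article; the helper name split_request is hypothetical, and it assumes the whole request is already in one NUL-terminated buffer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: locate the parts of a raw HTTP request in msg.
 * On success returns 0 and sets:
 *   *line_len - length of the request line, without its \r\n
 *   *headers  - start of the request headers
 *   *body     - start of the request data (an empty string if there is none)
 * Returns -1 if the request line or the blank line is missing. */
int split_request(const char *msg, size_t *line_len,
                  const char **headers, const char **body)
{
    /* The request line ends at the first \r\n. */
    const char *eol = strstr(msg, "\r\n");
    if (eol == NULL)
        return -1;

    /* The last header's \r\n followed by the blank line's \r\n. */
    const char *blank = strstr(eol, "\r\n\r\n");
    if (blank == NULL)
        return -1;

    *line_len = (size_t)(eol - msg);
    *headers  = eol + 2;
    *body     = blank + 4;
    return 0;
}
```

For a login request such as "POST /login HTTP/1.1\r\nHost: 127.0.0.1:8000\r\nContent-Length: 8\r\n\r\nuser=tom", the request line is the first 20 characters, the headers start at Host:, and the request data is user=tom.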

Request types:
HTTP defines many request types. The most common for us are GET and POST. The common request types are as follows:

  1. GET requests the specified page and returns the entity body
  2. POST submits data to the specified resource for processing (such as submitting a form or uploading a file). The data is carried in the request body. A POST request may create new resources and/or modify existing ones.
  3. HEAD is like GET, but the response has no body and only the headers are returned
  4. PUT replaces the content of the specified document with data sent from the client to the server
  5. DELETE asks the server to delete the specified page
  6. CONNECT is reserved in HTTP/1.1 for proxy servers that can switch the connection to a tunnel
  7. OPTIONS lets the client query the capabilities of the server
  8. TRACE echoes back the request received by the server; it is mainly used for testing and diagnosis

Both GET and POST request a resource, and both can submit data. Data submitted with GET is placed in the URL, so a password sent this way is displayed in clear text in the address bar; POST carries the data in the request body, so it is not shown there.

1.3.2 HTTP response message

The response message is the server's reply to the browser after it receives a request message, so it is sent from the server to the browser. It also has four parts:

  1. The status line contains the HTTP version, the status code, and the status text
  2. The message headers carry additional information for the client, also as key-value pairs
  3. The blank line \r\n, which likewise cannot be omitted
  4. The response body is the content the server returns to the client

Example:
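
A response message with all four parts can be assembled as in the following sketch. It is only an illustration: build_response is a hypothetical helper, and it assumes a small text body that fits in memory, which is not how the server later in this article sends files.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write a complete response message into out.
 * Returns the message length, or -1 if out (of size cap) is too small. */
int build_response(char *out, size_t cap, int code, const char *reason,
                   const char *type, const char *body)
{
    int n = snprintf(out, cap,
                     "HTTP/1.1 %d %s\r\n"       /* 1. status line      */
                     "Content-Type: %s\r\n"     /* 2. message headers  */
                     "Content-Length: %zu\r\n"
                     "\r\n"                     /* 3. blank line       */
                     "%s",                      /* 4. response body    */
                     code, reason, type, strlen(body), body);
    if (n < 0 || (size_t)n >= cap)
        return -1;
    return n;
}
```

Calling build_response(buf, sizeof(buf), 200, "OK", "text/html; charset=utf-8", "<h1>hi</h1>") produces the status line HTTP/1.1 200 OK, a Content-Type header, Content-Length: 11, the blank line, and then the body.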

Common HTTP status codes:

An HTTP status code consists of three digits. The first digit gives the class of the response. There are five classes:

  1. 1xx informational - the request has been received and processing continues
  2. 2xx success - the request has been successfully received, understood, and accepted
  3. 3xx redirection - further action is needed to complete the request
  4. 4xx client error - the request has a syntax error or cannot be fulfilled
  5. 5xx server error - the server failed to fulfill a valid request

Common status codes are as follows:

  • 200 OK: the client request succeeded
  • 301 Moved Permanently: the resource has moved permanently and the client is redirected
  • 400 Bad Request: the client request has a syntax error and cannot be understood by the server
  • 401 Unauthorized: the request is not authorized; this status code must be used together with the WWW-Authenticate header field
  • 403 Forbidden: the server received the request but refuses to serve it
  • 404 Not Found: the requested resource does not exist, e.g. a wrong URL was entered
  • 500 Internal Server Error: an unexpected error occurred on the server
  • 503 Service Unavailable: the server cannot process the client's request at present and may recover after a period of time
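
A server needs both pieces of information when it writes the status line. The following sketch maps a code to its class and to the status text for the codes listed above; the function names status_class and reason_phrase are hypothetical and are not taken from pub.h or wrap.h.

```c
#include <stddef.h>

/* Hypothetical helper: class of a status code, taken from its first digit. */
const char *status_class(int code)
{
    switch (code / 100) {
    case 1:  return "informational";
    case 2:  return "success";
    case 3:  return "redirection";
    case 4:  return "client error";
    case 5:  return "server error";
    default: return "unknown";
    }
}

/* Hypothetical helper: status text for the common codes listed above.
 * Returns NULL for codes this small table does not cover. */
const char *reason_phrase(int code)
{
    switch (code) {
    case 200: return "OK";
    case 301: return "Moved Permanently";
    case 400: return "Bad Request";
    case 401: return "Unauthorized";
    case 403: return "Forbidden";
    case 404: return "Not Found";
    case 500: return "Internal Server Error";
    case 503: return "Service Unavailable";
    default:  return NULL;
    }
}
```

With these helpers, status_class(404) gives "client error" and reason_phrase(404) gives "Not Found".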

Common HTTP file types:

When the server talks to the browser over HTTP, it has to tell the browser the type of the file so that the browser can interpret it. The type is sent in the Content-Type header and should be included in the response message. The common types are as follows:

  • Ordinary file: text/plain; charset=utf-8
  • *.html: text/html; charset=utf-8
  • *.jpg: image/jpeg
  • *.gif: image/gif
  • *.png: image/png
  • *.wav: audio/wav
  • *.avi: video/x-msvideo
  • *.mov: video/quicktime
  • *.mp3: audio/mpeg
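
The server listings later in this article call get_mime_type from pub.h, which is not shown. A minimal sketch of such a lookup, assuming it simply matches the file extension against the table above, might look like this; guess_mime_type is a hypothetical name, not the pub.h function.

```c
#include <string.h>
#include <strings.h>   /* strcasecmp (POSIX) */

struct mime_entry {
    const char *ext;    /* file extension, including the dot */
    const char *type;   /* value for the Content-Type header */
};

/* The types listed above; anything else falls back to plain text. */
static const struct mime_entry mime_table[] = {
    { ".html", "text/html; charset=utf-8" },
    { ".jpg",  "image/jpeg" },
    { ".gif",  "image/gif" },
    { ".png",  "image/png" },
    { ".wav",  "audio/wav" },
    { ".avi",  "video/x-msvideo" },
    { ".mov",  "video/quicktime" },
    { ".mp3",  "audio/mpeg" },
};

/* Hypothetical helper: pick a Content-Type from the extension of name. */
const char *guess_mime_type(const char *name)
{
    const char *dot = strrchr(name, '.');   /* last dot in the name */
    if (dot != NULL) {
        size_t count = sizeof(mime_table) / sizeof(mime_table[0]);
        for (size_t i = 0; i < count; i++) {
            /* Compare case-insensitively so PHOTO.JPG also matches. */
            if (strcasecmp(dot, mime_table[i].ext) == 0)
                return mime_table[i].type;
        }
    }
    return "text/plain; charset=utf-8";
}
```

A table keeps the mapping in one place, so supporting another file type only means adding one row.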

Notes on charset:

  • charset=iso-8859-1: Western European encoding, typically used by English-language sites
  • charset=gb2312: the site uses Simplified Chinese
  • charset=utf-8: a universal encoding that covers Chinese, Korean, Japanese, and the other languages of the world
  • charset=euc-kr: the site uses Korean
  • charset=big5: the site uses Traditional Chinese

2 Web server development

How do we use an HTTP server to deliver HTML files? HTTP is only an application-layer protocol, so we still need a transport-layer protocol to carry the data. The stack we develop on is therefore TCP + HTTP: the server and browser connect over TCP, and the data is parsed and answered according to HTTP.

The plan is now clear: write a concurrent TCP server whose messages are sent and received in HTTP format, as shown in the following figure:
To handle concurrency there are several options, such as a multi-process server, a multi-threaded server, or I/O multiplexing with select, poll, or epoll. Readers who are comfortable with libevent can also use libevent.
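
Whichever concurrency model is chosen, the TCP side starts with a listening socket. The sketch below shows the usual sequence of socket, port reuse, bind, and listen for IPv4. The helper name open_listener is hypothetical; the listings in this article use tcp4bind and Listen from wrap.h instead, whose implementation is not shown here.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create an IPv4 TCP socket listening on port
 * (all local addresses). Returns the listening fd, or -1 on error. */
int open_listener(unsigned short port, int backlog)
{
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd < 0)
        return -1;

    /* Allow the port to be rebound right after a restart. */
    int on = 1;
    setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_ANY);

    if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(lfd, backlog) < 0) {
        close(lfd);
        return -1;
    }
    return lfd;
}
```

For example, open_listener(9999, 128) would listen on the same port and with the same backlog as the listings in this article.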

2.1 epoll-based web server

epoll is highly efficient when there are many concurrent connections but only a few are active at a time, so this article uses epoll as the example. The main flow of the epoll-based server is:
The flow for handling a client request is:
We are already familiar with the general epoll development flow. Most of the remaining issues are details of handling the HTTP protocol and client requests:

2.2 epoll-based server program

//web server program -- using epoll model
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <unistd.h>
#include <sys/epoll.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <string.h>
#include <signal.h>
#include <dirent.h>

#include "pub.h"
#include "wrap.h" //encapsulated error-handling wrappers

int http_request(int cfd, int epfd);

int main()
{
	//If the browser has already closed the connection when the web server sends data to it,
	//the web server receives a SIGPIPE signal, which would terminate the process, so ignore it
	struct sigaction act;
	act.sa_handler = SIG_IGN;
	sigemptyset(&act.sa_mask);
	act.sa_flags = 0;
	sigaction(SIGPIPE, &act, NULL);
	
	//Change the working directory of the current process
	char path[255] = {0};
	const char *home = getenv("HOME");
	snprintf(path, sizeof(path), "%s/%s", home ? home : ".", "webpath");
	if(chdir(path)<0)
	{
		perror("chdir error");
		return -1;
	}
	
	//Create socket -- set port reuse -- bind
	int lfd = tcp4bind(9999, NULL);
	
	//Set listening
	Listen(lfd, 128);

	//Create epoll tree
	int epfd = epoll_create(1024);
	if(epfd<0)
	{
		perror("epoll_create error");
		close(lfd);
		return -1;
	}
	
	//Put the listening file descriptor lfd on the tree
	struct epoll_event ev;
	ev.data.fd = lfd;
	ev.events = EPOLLIN;
	epoll_ctl(epfd, EPOLL_CTL_ADD, lfd, &ev);
	
	int i;
	int cfd;
	int nready;
	int sockfd;
	struct epoll_event events[1024];
	while(1)
	{
		//Wait for the event to occur
		nready = epoll_wait(epfd, events, 1024, -1);
		if(nready<0)
		{
			if(errno==EINTR)
			{
				continue;
			}
			break;
		}
		
		for(i=0; i<nready; i++)
		{
			sockfd = events[i].data.fd;
			//There are client connection requests
			if(sockfd==lfd)
			{
				//Accept new client connections
				cfd = Accept(lfd, NULL, NULL);
				
				//Set cfd to non blocking
				int flag = fcntl(cfd, F_GETFL);
				flag |= O_NONBLOCK;
				fcntl(cfd, F_SETFL, flag);
				
				//Put the new cfd on the tree
				ev.data.fd = cfd;
				ev.events = EPOLLIN;
				epoll_ctl(epfd, EPOLL_CTL_ADD, cfd, &ev);
			}
			else 
			{
				//Client data sent
				http_request(sockfd, epfd);
			}			
		}		
	}
}

int send_header(int cfd, char *code, char *msg, char *fileType, int len)
{
	char buf[1024] = {0};
	sprintf(buf, "HTTP/1.1 %s %s\r\n", code, msg);
	sprintf(buf+strlen(buf), "Content-Type:%s\r\n", fileType);
	if(len>0)
	{
		sprintf(buf+strlen(buf), "Content-Length:%d\r\n", len);
	}
	strcat(buf, "\r\n");
	Write(cfd, buf, strlen(buf));
	return 0;
}

int send_file(int cfd, char *fileName)
{
	//Open file
	int fd = open(fileName, O_RDONLY);
	if(fd<0)
	{
		perror("open error");
		return -1;
	}
	
	//Cycle through the file and send it
	int n;
	char buf[1024];
	while(1)
	{
		memset(buf, 0x00, sizeof(buf));
		n = read(fd, buf, sizeof(buf));
		if(n<=0)
		{
			break;
		}
		else 
		{
			Write(cfd, buf, n);
		}
	}
	
	//Close the file and report success
	close(fd);
	return 0;
}

int http_request(int cfd, int epfd)
{
	int n;
	char buf[1024];
	//Read the request line data and analyze the file name of the resource to be requested
	memset(buf, 0x00, sizeof(buf));
	n = Readline(cfd, buf, sizeof(buf));
	if(n<=0)
	{
		//printf("read error or client closed, n==[%d]\n", n);
		//Delete the file descriptor from the epoll tree first
		epoll_ctl(epfd, EPOLL_CTL_DEL, cfd, NULL);
		
		//Then close the connection
		close(cfd);
		return -1;	
	}
	printf("buf==[%s]\n", buf);
	//GET /hanzi.c HTTP/1.1
	char reqType[16] = {0};
	char fileName[255] = {0};
	char protocol[16] = {0};
	//Field widths keep sscanf from writing past the end of each buffer
	sscanf(buf, "%15[^ ] %254[^ ] %15[^ \r\n]", reqType, fileName, protocol);
	//printf("[%s]\n", reqType);
	printf("--[%s]--\n", fileName);
	//printf("[%s]\n", protocol);
	
	char *pFile = fileName;
	if(strlen(fileName)<=1)
	{
		strcpy(pFile, "./");
	}
	else 
	{
		pFile = fileName+1;
	}
	
	//Decode percent-encoded (URL-encoded) characters, such as Chinese file names
	strdecode(pFile, pFile);
	printf("[%s]\n", pFile);
	
	//Read and discard the rest of the request so it is not mixed into the next request
	while((n=Readline(cfd, buf, sizeof(buf)))>0);
	
	//Determine whether the file exists
	struct stat st;
	if(stat(pFile, &st)<0)
	{
		printf("file not exist\n");
		
		//Send header information
		send_header(cfd, "404", "NOT FOUND", get_mime_type(".html"), 0);
		
		//Send file content
		send_file(cfd, "error.html");	
	}
	else //If file exists
	{
		//Judge file type
		//Ordinary file
		if(S_ISREG(st.st_mode))
		{
			printf("file exist\n");
			//Send header information
			send_header(cfd, "200", "OK", get_mime_type(pFile), st.st_size);
			
			//Send file content
			send_file(cfd, pFile);
		}
		//Directory
		else if(S_ISDIR(st.st_mode))
		{
			printf("directory\n");
			
			char buffer[1024];
			//Send header information
			send_header(cfd, "200", "OK", get_mime_type(".html"), 0);	
			
			//Send html file header
			send_file(cfd, "html/dir_header.html");	
			
			//File list information
			struct dirent **namelist;
			int num;

			num = scandir(pFile, &namelist, NULL, alphasort);
			if (num < 0)
			{
			   perror("scandir");
			   //Remove cfd from the epoll tree before closing it
			   epoll_ctl(epfd, EPOLL_CTL_DEL, cfd, NULL);
			   close(cfd);
			   return -1;
			   
			}
			else 
			{
			   while (num--) 
			   {
			       printf("%s\n", namelist[num]->d_name);
			       memset(buffer, 0x00, sizeof(buffer));
			       if(namelist[num]->d_type==DT_DIR)
			       {
			       		//Quote the href value so names containing spaces still form a valid link
			       		sprintf(buffer, "<li><a href=\"%s/\">%s</a></li>", namelist[num]->d_name, namelist[num]->d_name);
			       }
			       else
			       {
			       		sprintf(buffer, "<li><a href=\"%s\">%s</a></li>", namelist[num]->d_name, namelist[num]->d_name);
			       }
			       free(namelist[num]);
			       Write(cfd, buffer, strlen(buffer));
			   }
			   free(namelist);
			}
			//Send html tail
			send_file(cfd, "html/dir_tail.html");		
		}
	}
	
	return 0;
}
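
The listing above calls strdecode from pub.h, which is not shown. Browsers send non-ASCII file names (for example Chinese names) as %XX escapes, so the server has to turn them back into bytes before calling stat. The following is a minimal sketch of that step, assuming it only needs to decode %XX sequences; url_decode is a hypothetical name and is not the pub.h implementation.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper: decode %XX escapes from src into dst.
 * The output is never longer than the input, so dst may be the
 * same buffer as src (in-place decoding). */
void url_decode(char *dst, const char *src)
{
    while (*src != '\0') {
        if (src[0] == '%' && isxdigit((unsigned char)src[1])
                          && isxdigit((unsigned char)src[2])) {
            /* Copy the two hex digits out before writing to dst. */
            char hex[3] = { src[1], src[2], '\0' };
            *dst++ = (char)strtol(hex, NULL, 16);
            src += 3;
        } else {
            /* Ordinary character, or a stray '%': copy it unchanged. */
            *dst++ = *src++;
        }
    }
    *dst = '\0';
}
```

For example, decoding "/%E4%B8%AD.txt" in place yields the three UTF-8 bytes E4 B8 AD (the character 中) followed by .txt.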

2.3 Log-related functions

To be learned.

2.4 Web server based on libevent

No more analysis here, just the code

//web server written through libevent
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <fcntl.h>
#include <string.h>
#include <strings.h>    //strcasecmp
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <netinet/in.h> //struct sockaddr_in, htons, htonl
#include "pub.h"
#include <event.h>
#include <event2/listener.h>
#include <dirent.h>

#define _WORK_DIR_ "%s/webpath"
#define _DIR_PREFIX_FILE_ "html/dir_header.html"
#define _DIR_TAIL_FILE_ "html/dir_tail.html"

int copy_header(struct bufferevent *bev,int op,char *msg,char *filetype,long filesize)
{
    char buf[4096]={0};
    //Append each header at the current end of buf; passing buf as both the
    //destination and a %s argument of sprintf is undefined behavior
    sprintf(buf,"HTTP/1.1 %d %s\r\n",op,msg);
    sprintf(buf+strlen(buf),"Content-Type: %s\r\n",filetype);
    if(filesize >= 0){
        sprintf(buf+strlen(buf),"Content-Length: %ld\r\n",filesize);
    }
    strcat(buf,"\r\n");
    bufferevent_write(bev,buf,strlen(buf));
    return 0;
}
int copy_file(struct bufferevent *bev,const char *strFile)
{
    int fd = open(strFile,O_RDONLY);
    if(fd < 0){
        perror("open err");
        return -1;
    }
    char buf[1024]={0};
    int ret;
    while( (ret = read(fd,buf,sizeof(buf))) > 0 ){
        bufferevent_write(bev,buf,ret);
    }
    close(fd);
    return 0;
}
//Sending a directory actually organizes an html page to send to the client, and the contents of the directory are displayed as a list
int send_dir(struct bufferevent *bev,const char *strPath)
{
    //You need to spell out an html page and send it to the client
    copy_file(bev,_DIR_PREFIX_FILE_);
    //send dir info 
    DIR *dir = opendir(strPath);
    if(dir == NULL){
        perror("opendir err");
        return -1;
    }
    char bufline[1024]={0};
    char fullpath[1024]={0};
    struct dirent *dent = NULL;
    while( (dent= readdir(dir) ) ){
        //stat needs the path relative to the working directory, not just the entry name
        struct stat sb;
        snprintf(fullpath,sizeof(fullpath),"%s/%s",strPath,dent->d_name);
        if(stat(fullpath,&sb) < 0){
            continue;
        }
        if(dent->d_type == DT_DIR){
            //Special handling of directory files
            //Format: <li><a href='dirname/'>dirname</a> size</li>
            memset(bufline,0x00,sizeof(bufline));
            sprintf(bufline,"<li><a href='%s/'>%32s</a>   %8ld</li>",dent->d_name,dent->d_name,(long)sb.st_size);
            bufferevent_write(bev,bufline,strlen(bufline));
        }
        else if(dent->d_type == DT_REG){
            //Ordinary files can be directly displayed in the list
            memset(bufline,0x00,sizeof(bufline));
            sprintf(bufline,"<li><a href='%s'>%32s</a>     %8ld</li>",dent->d_name,dent->d_name,(long)sb.st_size);
            bufferevent_write(bev,bufline,strlen(bufline));
        }
    }
    closedir(dir);
    copy_file(bev,_DIR_TAIL_FILE_);
    //bufferevent_free(bev);
    return 0;
}
int http_request(struct bufferevent *bev,char *path)
{
    
    strdecode(path, path);//Decode percent-encoded characters (e.g. Chinese file names) into a UTF-8 string
    char *strPath = path;
    if(strcmp(strPath,"/") == 0 || strcmp(strPath,"/.") == 0){
        strPath = "./";
    }
    else{
        strPath = path+1;
    }
    struct stat sb;
    
    if(stat(strPath,&sb) < 0){
        //Does not exist, give 404 pages
        copy_header(bev,404,"NOT FOUND",get_mime_type("error.html"),-1);
        copy_file(bev,"error.html");
        return -1;
    }
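    if(S_ISDIR(sb.st_mode)){
        //Processing directory: sb.st_size is the size of the directory entry, not of the
        //generated page, so pass -1 to leave out Content-Length
        copy_header(bev,200,"OK",get_mime_type("ww.html"),-1);
        send_dir(bev,strPath);
        
    }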
    if(S_ISREG(sb.st_mode)){
        //process the file
        //Write header
        copy_header(bev,200,"OK",get_mime_type(strPath),sb.st_size);
        //Write file content
        copy_file(bev,strPath);
    }

    return 0;
}

void read_cb(struct bufferevent *bev, void *ctx)
{
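    char buf[256]={0};
    char method[10]={0},path[256]={0},protocol[10]={0};
    //Read one byte less than the buffer size so buf stays NUL-terminated for sscanf
    int ret = bufferevent_read(bev, buf, sizeof(buf)-1);
    if(ret > 0){

        //Field widths keep sscanf from writing past the end of each buffer
        sscanf(buf,"%9[^ ] %255[^ ] %9[^ \r\n]",method,path,protocol);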
        if(strcasecmp(method,"get") == 0){
            //Processing client requests
            char bufline[256];
            write(STDOUT_FILENO,buf,ret);
            //Make sure the data is read
            while( (ret = bufferevent_read(bev, bufline, sizeof(bufline)) ) > 0){
                write(STDOUT_FILENO,bufline,ret);
            }
			http_request(bev,path);//Processing requests

        }
    }
}
void bevent_cb(struct bufferevent *bev, short what, void *ctx)
{
    if(what & BEV_EVENT_EOF){//Client shutdown
        printf("client closed\n");
        bufferevent_free(bev);
    }
    else if(what & BEV_EVENT_ERROR){
        printf("err to client closed\n");
        bufferevent_free(bev);
    }
    else if(what & BEV_EVENT_CONNECTED){//Connection successful
        printf("client connect ok\n");
    }
}
void listen_cb(struct evconnlistener *listener, evutil_socket_t fd, struct sockaddr *addr, int socklen, void *arg)
{
    //Defines the bufferevent that communicates with the client
    struct event_base *base = (struct event_base *)arg;
    struct bufferevent *bev = bufferevent_socket_new(base, fd, BEV_OPT_CLOSE_ON_FREE);
    bufferevent_setcb(bev,read_cb,NULL,bevent_cb,base);//Set callback
    bufferevent_enable(bev,EV_READ|EV_WRITE);//Enable read and write
}

int main(int argc,char *argv[])
{
	char workdir[256] = {0};
	sprintf(workdir,_WORK_DIR_,getenv("HOME"));//HOME=/home/itheima 
	chdir(workdir);
    struct event_base *base = event_base_new();//Create root node
    struct sockaddr_in serv;
    memset(&serv,0x00,sizeof(serv));//Zero the address structure before filling it in
    serv.sin_family = AF_INET;
    serv.sin_port = htons(9999);
    serv.sin_addr.s_addr = htonl(INADDR_ANY);
    struct evconnlistener * listener =evconnlistener_new_bind(base,
                                     listen_cb, base, LEV_OPT_CLOSE_ON_FREE|LEV_OPT_REUSEABLE, -1,
                                                        (struct sockaddr *)&serv, sizeof(serv));//Connect listener
    

    event_base_dispatch(base);//loop

    evconnlistener_free(listener);//Release the connection listener before its event base
    event_base_free(base); //Release root node
    return 0;
}

Keywords: C Web Development Linux server http

Added by 758 on Sun, 20 Feb 2022 00:39:40 +0200