Linux server: the HTTP protocol and a sample implementation

http protocol

1. Introduction to HTTP

HTTP stands for Hypertext Transfer Protocol.

  • Hypertext: not only text, but also pictures, audio, video, etc.
  • Transfer: it runs on top of TCP/IP and uses a request-response model.
  • Protocol: a connectionless, stateless application-layer protocol.

2. Working principle

  • HTTP works in the B/S (browser/server) architecture. The browser acts as the HTTP client and sends a request to an HTTP server (web server); the server closes the connection after responding.
  • Web server: the responder to HTTP requests. The three mainstream web servers are Apache, Nginx and IIS.
  • The default port is 80. It can be changed to 8080 or another port.

3. HTTP request-response process

Example: URL: http://www.example.com/filepath/index.html

  1. The DNS server resolves the domain name entered in the browser, www.example.com, to the server's IP address. The HTTP client then opens a TCP connection to that server on port 80 and sends a request message. The message content includes the request for filepath/index.html.
  2. The HTTP server accepts the connection and the request, parses the message, and looks up the object filepath/index.html. It reads the data from disk or memory, wraps it in an HTTP response message, sends it back, and closes the connection once sending is complete.
  3. After receiving the data, the client also closes the connection and extracts the resource file from the message.
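The three steps above can be sketched from the client's side. This is a minimal illustration under stated assumptions, not the code of any particular browser: the names build_get_request and http_fetch are hypothetical, and the `Connection: close` header is added so that the server closes the connection after responding, as the steps describe.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Build the request message for one resource (hypothetical helper).
   Returns the message length, or -1 if it does not fit in out. */
int build_get_request(char *out, size_t cap, const char *host, const char *path)
{
    int n = snprintf(out, cap,
                     "GET %s HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "Connection: close\r\n"
                     "\r\n",
                     path, host);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Resolve host, connect to port, send the request, and read until the
   server closes the connection. Returns the bytes received, or -1. */
long http_fetch(const char *host, const char *port, const char *path,
                char *resp, size_t cap)
{
    struct addrinfo hints, *res, *p;
    int fd = -1;

    if (cap == 0) return -1;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, port, &hints, &res) != 0) return -1; /* step 1: DNS lookup */

    for (p = res; p != NULL; p = p->ai_next) {                 /* step 1: TCP connect */
        fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd == -1) continue;
        if (connect(fd, p->ai_addr, p->ai_addrlen) == 0) break;
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    if (fd == -1) return -1;

    char req[1024];
    int len = build_get_request(req, sizeof req, host, path);
    if (len < 0 || send(fd, req, (size_t)len, 0) != len) { close(fd); return -1; }

    size_t total = 0;                                          /* steps 2-3: read the response */
    ssize_t n;
    while (total < cap - 1 &&
           (n = recv(fd, resp + total, cap - 1 - total, 0)) > 0)
        total += (size_t)n;
    resp[total] = '\0';
    close(fd);                                                 /* step 3: client disconnects */
    return (long)total;
}
```

For the example URL, a call would look like http_fetch("www.example.com", "80", "/filepath/index.html", buf, sizeof buf); the response headers and body then sit in buf as one string.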

4. HTTP protocol features

  • Connectionless: each connection handles only one request. The server closes the connection after it has processed the client's request and the client has received the response. This saves transmission time. (This is the HTTP/1.0 behavior; HTTP/1.1 keeps connections open by default.)
  • Stateless: the protocol keeps no memory of earlier transactions, so any information a later request needs must be sent again. This can increase the amount of data transferred per connection, but the server responds faster when it does not need previous information.
  • Flexible: any type of data object can be transferred, and the type is indicated by the Content-Type header.
  • Simple and fast: the protocol is simple, so the server program can be small and communication is fast.
  • It also supports the C/S (client/server) mode.
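To make the Content-Type point concrete, here is a small sketch of how a server might choose the header value from a file name. The function name content_type_for and the short extension table are assumptions made for illustration; a real server would use a much longer table.

```c
#include <string.h>

/* Map a file name to a Content-Type value by its extension.
   The table is deliberately tiny and only illustrative. */
const char *content_type_for(const char *path)
{
    static const struct { const char *ext; const char *type; } table[] = {
        { ".html", "text/html" },
        { ".css",  "text/css" },
        { ".js",   "text/javascript" },
        { ".png",  "image/png" },
        { ".jpg",  "image/jpeg" },
        { ".mp4",  "video/mp4" },
    };
    const char *dot = strrchr(path, '.');   /* last '.' in the path, if any */

    if (dot != NULL) {
        for (size_t i = 0; i < sizeof table / sizeof table[0]; i++)
            if (strcmp(dot, table[i].ext) == 0)
                return table[i].type;
    }
    return "application/octet-stream";      /* generic fallback for unknown types */
}
```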

5. Client request message

A request message consists of a request line (which carries the request method), request headers, a blank line, and the request data.

  • Request method: GET, POST, and so on; it is the first token of the request line.
  • Request header: headers fall into four types: general headers, request headers, response headers and entity headers.
  • In a typical request, the first line carries the request method, and the lines that follow are header fields with their corresponding values.
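As an illustration of that layout, the sketch below holds a sample request as a C string and splits its first line into method, URI and version. The sample header values and the name parse_request_line are made up for this example.

```c
#include <stdio.h>
#include <string.h>

/* A sample request: request line, three header fields, then the blank line. */
static const char sample_request[] =
    "GET /filepath/index.html HTTP/1.1\r\n"   /* request line            */
    "Host: www.example.com\r\n"               /* header field: value     */
    "User-Agent: demo-client\r\n"
    "Accept: text/html\r\n"
    "\r\n";                                   /* blank line ends headers */

/* Split "METHOD URI VERSION" into its three fields.
   Returns 0 on success, -1 if the line is malformed. */
int parse_request_line(const char *line, char method[16], char uri[256], char version[16])
{
    /* The field widths keep sscanf from overflowing the destination arrays. */
    if (sscanf(line, "%15s %255s %15s", method, uri, version) != 3)
        return -1;
    return 0;
}
```

Because %s stops at whitespace, the trailing "\r\n" of the request line is not copied into the version field.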

6. Server response message

A response message consists of a status line, message headers, a blank line, and the response body.
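The four parts can be seen in a small formatting helper. This is a sketch only: build_response is a hypothetical name, and it emits just two headers to keep the layout visible.

```c
#include <stdio.h>
#include <string.h>

/* Assemble a response into out. Returns its length, or -1 if it does not fit. */
int build_response(char *out, size_t cap, int code, const char *reason,
                   const char *content_type, const char *body)
{
    int n = snprintf(out, cap,
                     "HTTP/1.1 %d %s\r\n"        /* status line      */
                     "Content-Type: %s\r\n"      /* message headers  */
                     "Content-Length: %zu\r\n"
                     "\r\n"                      /* blank line       */
                     "%s",                       /* response body    */
                     code, reason, content_type, strlen(body), body);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}
```

Computing Content-Length from the body with strlen, rather than typing the number by hand, keeps the header and the body from drifting apart.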

7. Advantages and disadvantages of HTTP

1. Advantages

  • Simple, flexible and easy to extend: the protocol is small and imposes few constraints.
  • Widely used: it can be developed for across platforms and languages.
  • Connectionless and stateless: no additional resources are needed to record state information, so the implementation is relatively simple and the load on the server is lower.

2. Disadvantages

  • Connectionless and stateless: the protocol has no memory, so by itself it cannot support operations that span several transactions. Protocol information has to be sent and checked on every request, which adds otherwise unnecessary data transfer. Cookies were introduced to address this.
  • Plaintext transmission: messages are readable text, so tools such as Wireshark or tcpdump can capture the packets directly, and an attacker can modify them. This makes HTTP easy to attack and unsafe. It also cannot verify the identity of either party or detect whether a message has been altered. HTTPS was created to address this.

3. Performance

  • The Internet today is largely mobile and highly concurrent, and connection quality cannot be guaranteed, so HTTP sometimes performs poorly at the TCP level.
  • The request-response model leads to head-of-line blocking: when one request is blocked, the requests queued behind it are blocked too, and the client gets no response for them.

Program implementation

The program here is built on the epoll reactor model; the reactor itself is covered in another post on my blog. If you only want to understand the HTTP protocol, see the post on my home page about the tinyhttp open source project.

The following introduces an HTTP server implemented on the epoll reactor (C1000K) model.

HTTP server based on reactor model

Client structure

struct qsevent{
    int fd;             //Client fd
    int events;         //Event type: read, write, or exception
    int status;         //Whether this fd is currently on the epfd red-black tree being monitored
    void *arg;          //Callback argument
    long last_active;   //Time of the last send / receive activity

    int (*callback)(int fd, int event, void *arg);  //Callback function; a single callback for now, to be split into several later
    unsigned char buffer[MAX_BUFLEN];               //Data buffer
    int length;                                     //Length of the data in the buffer

    /*http param*/
    int method;                     //HTTP request method
    char resource[MAX_BUFLEN];      //Path of the requested resource
    int ret_code;                   //Response status code
};

int http_request(struct qsevent *ev)

When a client opens a TCP connection, the server's listening fd triggers an input event and ev->callback is called. At that point the callback is accept_cb, which accepts the connection and obtains the clientfd. Once connected, the client sends its HTTP request message. The clientfd then triggers an input event on the server and ev->callback is called again. This time the callback is recv_cb, which receives the data and parses the HTTP message.
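The dispatch pattern described above can be sketched with a stripped-down event structure. This is not the author's reactor: demo_event, reactor_add, reactor_run_once and count_bytes_cb are hypothetical names, and only input events are handled. The idea it shows is that epoll hands back a pointer to the per-fd structure, and the loop simply calls whatever callback is stored there.

```c
#include <sys/epoll.h>
#include <sys/types.h>
#include <unistd.h>

/* A minimal stand-in for the per-fd event structure. */
struct demo_event {
    int fd;
    int (*callback)(int fd, int events, void *arg);
    void *arg;
};

/* Register ev->fd for input events; epoll returns ev itself via data.ptr. */
int reactor_add(int epfd, struct demo_event *ev)
{
    struct epoll_event ep = {0};
    ep.events = EPOLLIN;
    ep.data.ptr = ev;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, ev->fd, &ep);
}

/* Wait once and call the callback of every ready fd.
   Returns the number of ready fds, or -1 on error. */
int reactor_run_once(int epfd, int timeout_ms)
{
    struct epoll_event ready[64];
    int n = epoll_wait(epfd, ready, 64, timeout_ms);

    for (int i = 0; i < n; i++) {
        struct demo_event *ev = ready[i].data.ptr;
        ev->callback(ev->fd, (int)ready[i].events, ev->arg);
    }
    return n;
}

/* A callback in the role of recv_cb: read what arrived and count the bytes. */
int count_bytes_cb(int fd, int events, void *arg)
{
    char tmp[256];
    ssize_t n = read(fd, tmp, sizeof tmp);

    (void)events;
    if (n > 0) *(int *)arg += (int)n;
    return (int)n;
}
```

In a server, the listening fd would be registered with an accept-style callback and each accepted clientfd with a receive-style callback; the loop itself does not need to know which is which.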

int http_request(struct qsevent *ev)
{
    char linebuf[1024] = {0};//Holds one line of the request message taken from the buffer
    int idx = readline(ev->buffer, 0, linebuf);//Read the first line (the request line); readline is the author's helper
    if(strncmp(linebuf, "GET ", 4) == 0)//The request line starts with the GET method
    {
        ev->method = HTTP_METHOD_GET;//GET means the client wants to obtain a resource

        //sizeof("GET ") is 5 because it counts the terminating '\0', so this offset also skips the leading '/' of the URI
        char *path = linebuf + sizeof("GET ");
        int i = 0;
        while(path[i] != ' ' && path[i] != '\0')i++;//Scan to the end of the resource name without running past the string
        path[i] = '\0';
        snprintf(ev->resource, sizeof(ev->resource), "./%s/%s", HTTP_METHOD_ROOT, path);//Store the resource in ev->resource as a file path
        printf("resource:%s\n", ev->resource);//Echo
    }
    else if(strncmp(linebuf, "POST ", 5) == 0)//The POST method has not been written yet; it would be handled similarly
    {}
    return 0;
}
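The readline helper used above is not shown in this post. The following is only a guess at what such a helper might look like, inferred from the call site: it copies one "\r\n"-terminated line out of a NUL-terminated buffer and returns the index where the next line starts. The name readline_sketch and the extra cap parameter (added here for bounds checking) are assumptions; the version in the repository may differ.

```c
#include <string.h>

/* Copy the line that starts at buf[idx], without its "\r\n", into linebuf.
   Returns the index of the first byte after the "\r\n", or -1 if there is
   no complete line left or the line does not fit in linebuf. */
int readline_sketch(const char *buf, int idx, char *linebuf, size_t cap)
{
    const char *start = buf + idx;
    const char *end = strstr(start, "\r\n");   /* HTTP lines end with CRLF */

    if (end == NULL) return -1;
    size_t len = (size_t)(end - start);
    if (len >= cap) return -1;
    memcpy(linebuf, start, len);
    linebuf[len] = '\0';
    return idx + (int)len + 2;                 /* skip the "\r\n" itself */
}
```

Calling it repeatedly with the returned index walks through the headers one line at a time; an empty linebuf marks the blank line that ends the header section.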

int http_response(struct qsevent *ev)

The server builds the HTTP response message for the client and stores it in the buffer. When the write event is triggered, the send_cb callback function sends it to the client. See the code comments for a detailed explanation.

//Needs <stdio.h>, <string.h>, <fcntl.h>, <unistd.h> and <sys/stat.h>
int http_response(struct qsevent *ev)
{
    if(ev == NULL)return -1;
    memset(ev->buffer, 0, MAX_BUFLEN);//Clear the buffer that will hold the response message

    /*Body of the 404 page; sizeof - 1 is its length without the terminating '\0'*/
    static const char not_found[] =
        "<html><head><title>404 Not Found</title></head><body><H1>404</H1></body></html>\r\n\r\n";

    printf("resource:%s\n", ev->resource);//resource: the file path that http_request built from the client's request
    int filefd = open(ev->resource, O_RDONLY);//Open read-only to get a file descriptor
    struct stat stat_buf;//File information

    //fstat fills stat_buf through the file descriptor; only a regular file is served
    if(filefd != -1 && fstat(filefd, &stat_buf) == 0 && S_ISREG(stat_buf.st_mode))
    {
        ev->ret_code = 200;//200 status code

        //length records the message length; only the status line and headers are written here,
        //sending the file contents is not shown in this function
        ev->length = snprintf((char *)ev->buffer, MAX_BUFLEN,
                              "HTTP/1.1 200 OK\r\n"
                              "Date: Thu, 11 Nov 2021 12:28:52 GMT\r\n"
                              "Content-Type: text/html;charset=ISO-8859-1\r\n"
                              "Content-Length: %ld\r\n\r\n",
                              (long)stat_buf.st_size);//The file length is stored in stat_buf.st_size
    }
    else//open failed, or the path is a directory or another non-regular file: send 404 Not Found
    {
        ev->ret_code = 404;//404 status code
        ev->length = snprintf((char *)ev->buffer, MAX_BUFLEN,
                              /***Status line***/
                              /*Version, status code, reason phrase*/
                              "HTTP/1.1 404 Not Found\r\n"
                              /***Message headers***/
                              /*Date (hard-coded in this demo, not the current time)*/
                              "Date: Thu, 11 Nov 2021 12:28:52 GMT\r\n"
                              /*Response body type; character encoding*/
                              "Content-Type: text/html;charset=ISO-8859-1\r\n"
                              /*Response body length, computed from the body, then the blank line*/
                              "Content-Length: %zu\r\n\r\n"
                              /***Response body***/
                              "%s",
                              sizeof(not_found) - 1, not_found);
    }
    if(filefd != -1)close(filefd);//Only the file size was needed here, so release the descriptor
    return ev->length;//Return the message length
}
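The Date header in the listing is a fixed string. If the current time is wanted instead, a helper along the following lines could produce it. This is a sketch with a hypothetical name, http_date; it relies on the default "C" locale for the English day and month names, and gmtime is not thread-safe, so a multi-threaded server would want gmtime_r instead.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Format t in the HTTP date layout, e.g. "Thu, 11 Nov 2021 12:28:52 GMT".
   Returns the number of characters written, or 0 on failure. */
size_t http_date(char *out, size_t cap, time_t t)
{
    struct tm *utc = gmtime(&t);   /* HTTP dates are expressed in GMT */

    if (utc == NULL) return 0;
    return strftime(out, cap, "%a, %d %b %Y %H:%M:%S GMT", utc);
}
```

A response builder could call http_date(buf, sizeof buf, time(NULL)) and print the result after "Date: " with %s.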

Clone the code

git clone https://github.com/qiushii/reactor.git

Keywords: Linux server http

Added by Loafin on Thu, 09 Dec 2021 13:20:48 +0200