Reading process of nginx request header data

The previous article explained how nginx reads and parses the request line. This article explains how nginx reads the request headers sent by the client and parses them. The two processes are essentially the same, because both must solve the same problem: the data arrives in pieces over a non-contiguous stream, so nginx has to read what is available, parse as much as it can, and resume when more data arrives.

1. Main flow of reading the request headers

Before walking through how the request headers are read, here is an example of an HTTP request message:

POST /web/book/read HTTP/1.1
Host: localhost
Connection: keep-alive
Content-Length: 365
Accept: application/json, text/plain, */*

The first line of the example is the request line, and the following lines are the request headers. Each request header has the form name: value and occupies one line. As described in the previous article, once the request line has been read, nginx changes the callback of the current read event to ngx_http_process_request_headers() and calls that function directly to try to read the request headers. This function is the main flow of reading the request header data. Its source code is as follows:

/**
 * Parsing header data sent by client
 */
static void ngx_http_process_request_headers(ngx_event_t *rev) {
  u_char *p;
  size_t len;
  ssize_t n;
  ngx_int_t rc, rv;
  ngx_table_elt_t *h;
  ngx_connection_t *c;
  ngx_http_header_t *hh;
  ngx_http_request_t *r;
  ngx_http_core_srv_conf_t *cscf;
  ngx_http_core_main_conf_t *cmcf;

  c = rev->data;
  r = c->data;

  if (rev->timedout) {
    ngx_log_error(NGX_LOG_INFO, c->log, NGX_ETIMEDOUT, "client timed out");
    c->timedout = 1;
    ngx_http_close_request(r, NGX_HTTP_REQUEST_TIME_OUT);
    return;
  }

  cmcf = ngx_http_get_module_main_conf(r, ngx_http_core_module);
  rc = NGX_AGAIN;

  for (;;) {
    if (rc == NGX_AGAIN) {
      // If the current header buffer has no room left, a larger buffer is needed
      if (r->header_in->pos == r->header_in->end) {
        // Allocate a new, larger buffer
        rv = ngx_http_alloc_large_header_buffer(r, 0);
        if (rv == NGX_ERROR) {
          ngx_http_close_request(r, NGX_HTTP_INTERNAL_SERVER_ERROR);
          return;
        }

        // The header sent by the client is too long: it exceeds the maximum size set by large_client_header_buffers
        if (rv == NGX_DECLINED) {
          p = r->header_name_start;
          r->lingering_close = 1;
          if (p == NULL) {
            ngx_log_error(NGX_LOG_INFO, c->log, 0, "client sent too large request");
            ngx_http_finalize_request(r, NGX_HTTP_REQUEST_HEADER_TOO_LARGE);
            return;
          }

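          // len caps how much of the over-long header line is quoted in the error log;
          // the logging call itself is not shown in this listing
          len = r->header_in->end - p;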
          if (len > NGX_MAX_ERROR_STR - 300) {
            len = NGX_MAX_ERROR_STR - 300;
          }

          ngx_http_finalize_request(r, NGX_HTTP_REQUEST_HEADER_TOO_LARGE);
          return;
        }
      }

      // Try to read the newly sent data from the client on the connection
      n = ngx_http_read_request_header(r);
      if (n == NGX_AGAIN || n == NGX_ERROR) {
        return;
      }
    }

    cscf = ngx_http_get_module_srv_conf(r, ngx_http_core_module);
    // Parse one header line from the data read so far
    rc = ngx_http_parse_header_line(r, r->header_in, cscf->underscores_in_headers);

    // NGX_OK means one header line was parsed successfully
    if (rc == NGX_OK) {
      r->request_length += r->header_in->pos - r->header_name_start;
      // Skip invalid headers when ignore_invalid_headers is enabled
      if (r->invalid_header && cscf->ignore_invalid_headers) {
        continue;
      }

      // Create a structure to store the header
      h = ngx_list_push(&r->headers_in.headers);
      if (h == NULL) {
        ngx_http_close_request(r, NGX_HTTP_INTERNAL_SERVER_ERROR);
        return;
      }

      h->hash = r->header_hash;
      // The header name becomes the key of the list element
      h->key.len = r->header_name_end - r->header_name_start;
      h->key.data = r->header_name_start;
      h->key.data[h->key.len] = '\0';

      // The header value becomes the value of the list element
      h->value.len = r->header_end - r->header_start;
      h->value.data = r->header_start;
      h->value.data[h->value.len] = '\0';

      h->lowcase_key = ngx_pnalloc(r->pool, h->key.len);
      if (h->lowcase_key == NULL) {
        ngx_http_close_request(r, NGX_HTTP_INTERNAL_SERVER_ERROR);
        return;
      }

      if (h->key.len == r->lowcase_index) {
        ngx_memcpy(h->lowcase_key, r->lowcase_header, h->key.len);
      } else {
        ngx_strlow(h->lowcase_key, h->key.data, h->key.len);
      }

      // headers_in_hash holds every header nginx handles specially. Look up whether the header sent by the client is one of them
      hh = ngx_hash_find(&cmcf->headers_in_hash, h->hash, h->lowcase_key, h->key.len);
      // The handler is the function defined for this header in ngx_http_headers_in. After the handler runs,
      // the header sent by the client is available as a field of the r->headers_in structure
      if (hh && hh->handler(r, h, hh->offset) != NGX_OK) {
        return;
      }

      continue;
    }

    // NGX_HTTP_PARSE_HEADER_DONE means all headers have been parsed
    if (rc == NGX_HTTP_PARSE_HEADER_DONE) {
      r->request_length += r->header_in->pos - r->header_name_start;
      r->http_state = NGX_HTTP_PROCESS_REQUEST_STATE;
      // Check the validity of the header data sent by the client
      rc = ngx_http_process_request_header(r);
      if (rc != NGX_OK) {
        return;
      }

      ngx_http_process_request(r);
      return;
    }

    // NGX_AGAIN means the header line read so far is incomplete and more data must be read
    if (rc == NGX_AGAIN) {
      continue;
    }
    
    // Any other return value means the header line is invalid: reject the request with 400
    ngx_log_error(NGX_LOG_INFO, c->log, 0, "client sent invalid header line");
    ngx_http_finalize_request(r, NGX_HTTP_BAD_REQUEST);
    return;
  }
}
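
The lookup and handler call near the end of the NGX_OK branch can be illustrated in isolation. The following is a minimal sketch, not nginx code: every demo_* name is hypothetical, a linear search stands in for the ngx_hash_find() lookup, and the single handler shown rejects duplicates, which is only one of several policies a real handler could apply. It shows how an offset stored in a table lets one generic handler fill the right field of a headers structure.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for ngx_table_elt_t: one parsed header. */
typedef struct {
    const char *key;
    const char *value;
} demo_header_t;

/* Hypothetical stand-in for r->headers_in: one slot per well-known header. */
typedef struct {
    demo_header_t *host;
    demo_header_t *connection;
    demo_header_t *content_length;
} demo_headers_in_t;

typedef int (*demo_handler_pt)(demo_headers_in_t *in, demo_header_t *h,
                               size_t offset);

typedef struct {
    const char      *name;     /* lowercase header name */
    size_t           offset;   /* offset of the slot in demo_headers_in_t */
    demo_handler_pt  handler;
} demo_known_header_t;

/* Store h in the slot found at `offset`; a second occurrence is an error. */
static int demo_store_unique(demo_headers_in_t *in, demo_header_t *h,
                             size_t offset)
{
    demo_header_t **slot;

    assert(offset <= sizeof(*in) - sizeof(*slot));
    slot = (demo_header_t **) ((char *) in + offset);
    if (*slot != NULL) {
        return -1;
    }
    *slot = h;
    return 0;
}

static const demo_known_header_t demo_known[] = {
    { "host",           offsetof(demo_headers_in_t, host),           demo_store_unique },
    { "connection",     offsetof(demo_headers_in_t, connection),     demo_store_unique },
    { "content-length", offsetof(demo_headers_in_t, content_length), demo_store_unique },
};

/* Lowercase the key, look it up, and run the handler if the header is known.
 * Returns 0 for unknown headers and for successful handlers, -1 otherwise. */
static int demo_dispatch(demo_headers_in_t *in, demo_header_t *h)
{
    char   low[64];
    size_t i, len = strlen(h->key);

    if (len >= sizeof(low)) {
        return 0;   /* too long to match any known header */
    }
    for (i = 0; i <= len; i++) {   /* <= also copies the terminating NUL */
        low[i] = (char) tolower((unsigned char) h->key[i]);
    }
    for (i = 0; i < sizeof(demo_known) / sizeof(demo_known[0]); i++) {
        if (strcmp(low, demo_known[i].name) == 0) {
            return demo_known[i].handler(in, h, demo_known[i].offset);
        }
    }
    return 0;
}
```

With this table, passing a header whose key is "Host" to demo_dispatch() makes in.host point at it, while an unknown header such as "X-Trace" leaves the structure untouched. This mirrors the listing, where every header is pushed onto r->headers_in.headers but only headers found in the hash have a handler run.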

ngx_http_process_request_headers() reads the request headers in the following steps:

  • First, check whether the current read event has timed out. If it has, close the request with NGX_HTTP_REQUEST_TIME_OUT;
  • Check whether r->header_in->pos == r->header_in->end holds, that is, whether the current read buffer has no room left for newly read data. If so, ngx_http_alloc_large_header_buffer() requests a new buffer. If the headers already exceed the limit set by large_client_header_buffers, it returns NGX_DECLINED and the request is finalized with NGX_HTTP_REQUEST_HEADER_TOO_LARGE;
  • Call ngx_http_read_request_header() to read data from the current connection. A return value greater than 0 is the number of bytes available; NGX_ERROR means the read failed or the client closed the connection; NGX_AGAIN means no data was available this time and more must be awaited. In the NGX_AGAIN case the function simply returns without further processing. This is safe for two reasons. The callback of the read event is still ngx_http_process_request_headers(), so the next read event calls ngx_http_read_request_header() again. And before returning NGX_AGAIN, ngx_http_read_request_header() sets the timeout timer if needed and re-registers the read event for the connection (on the epoll handle, when epoll is in use);
  • Call ngx_http_parse_header_line() to parse the data that has been read. Each call parses only one request header, but the infinite for loop together with repeated read events ensures that all request headers are eventually read and parsed;
  • If ngx_http_parse_header_line() returns NGX_OK, store the newly parsed header in the r->headers_in.headers list and, if it is a header nginx handles specially, run its handler;
  • If ngx_http_parse_header_line() returns NGX_HTTP_PARSE_HEADER_DONE, all headers have been read. nginx first calls ngx_http_process_request_header() to validate the headers, and then calls ngx_http_process_request() to start the 11 phases of HTTP request processing in nginx. How that function works will be explained in a later article.
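
The steps above can be sketched as a small, self-contained loop. This is a simplified illustration under stated assumptions rather than nginx code: the demo_* names and return codes are hypothetical, the input is a string handed over in fixed-size chunks (each chunk standing in for one read event), and the parser rescans the current line on every call, whereas nginx's parser keeps its position between calls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical return codes, named after the nginx ones. */
enum { DEMO_OK, DEMO_AGAIN, DEMO_DONE, DEMO_INVALID };

/* Simplified stand-in for r->header_in: [pos, last) is unparsed data. */
typedef struct {
    char   data[256];
    size_t pos;
    size_t last;
} demo_buf_t;

/* Parse at most ONE "name: value\r\n" line starting at b->pos. */
static int demo_parse_header_line(demo_buf_t *b, char *name, char *value,
                                  size_t cap)
{
    char   *start, *eol = NULL, *colon, *v;
    size_t  i, avail, nlen, vlen;

    assert(b->pos <= b->last);
    start = b->data + b->pos;
    avail = b->last - b->pos;

    for (i = 0; i + 1 < avail; i++) {
        if (start[i] == '\r' && start[i + 1] == '\n') {
            eol = start + i;
            break;
        }
    }
    if (eol == NULL) {
        return DEMO_AGAIN;              /* the line is not complete yet */
    }
    if (eol == start) {
        b->pos += 2;
        return DEMO_DONE;               /* empty line: end of the headers */
    }
    colon = memchr(start, ':', (size_t) (eol - start));
    if (colon == NULL || colon == start) {
        return DEMO_INVALID;
    }
    nlen = (size_t) (colon - start);
    for (v = colon + 1; v < eol && *v == ' '; v++) { /* skip spaces */ }
    vlen = (size_t) (eol - v);
    if (nlen >= cap || vlen >= cap) {
        return DEMO_INVALID;
    }
    memcpy(name, start, nlen);
    name[nlen] = '\0';
    memcpy(value, v, vlen);
    value[vlen] = '\0';
    b->pos = (size_t) (eol - b->data) + 2;
    return DEMO_OK;
}

/* Feed `stream` in pieces of `chunk` bytes and count the parsed headers.
 * Returns the count, or -1 if the input is invalid or ends too early
 * (where nginx would instead return and wait for the next read event). */
static int demo_process_headers(const char *stream, size_t chunk)
{
    demo_buf_t b = { .pos = 0, .last = 0 };
    char       name[64], value[64];
    size_t     n, total = strlen(stream), sent = 0;
    int        count = 0, rc = DEMO_AGAIN;

    for ( ;; ) {
        if (rc == DEMO_AGAIN) {         /* need more data before parsing */
            n = total - sent < chunk ? total - sent : chunk;
            if (n == 0 || n > sizeof(b.data) - b.last) {
                return -1;
            }
            memcpy(b.data + b.last, stream + sent, n);
            b.last += n;
            sent += n;
        }
        rc = demo_parse_header_line(&b, name, value, sizeof(name));
        if (rc == DEMO_OK) {
            count++;                    /* one header per call */
            continue;
        }
        if (rc == DEMO_DONE) {
            return count;
        }
        if (rc == DEMO_AGAIN) {
            continue;
        }
        return -1;                      /* DEMO_INVALID */
    }
}
```

Calling demo_process_headers() on the three headers of the example request followed by an empty line yields the same count whether the data arrives 7 bytes at a time or all at once, which is the point of the loop: the parser does not care where the chunk boundaries fall.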

2. Reading the request header data

As shown above, reading the request headers relies on two main functions: ngx_http_read_request_header() and ngx_http_parse_header_line(). The second is relatively long, but its logic is simple: it checks whether the data read so far forms a complete request header (of the form name: value, on one line). If it does, the function returns NGX_OK; if the line is still incomplete, it returns NGX_AGAIN so that more data is read. We do not cover that function here; readers can study its source code themselves. Below we focus on how ngx_http_read_request_header() reads the request header data sent by the client:

static ssize_t ngx_http_read_request_header(ngx_http_request_t *r) {
  ssize_t n;
  ngx_event_t *rev;
  ngx_connection_t *c;
  ngx_http_core_srv_conf_t *cscf;

  c = r->connection;
  rev = c->read;

  // Calculate how much data remains unprocessed
  n = r->header_in->last - r->header_in->pos;

  // If n is greater than 0, the buffer still holds unparsed data, so return n directly
  if (n > 0) {
    return n;
  }

  // Reaching this point means all data read so far has been parsed. If the ready flag of the
  // read event is 1, unread data is waiting on the connection, so call c->recv() to read it.
  // Otherwise treat the result as NGX_AGAIN and keep listening for read events on the connection
  if (rev->ready) {
    // Read data on connection file descriptor
    n = c->recv(c, r->header_in->last, r->header_in->end - r->header_in->last);
  } else {
    n = NGX_AGAIN;
  }

  // If n is NGX_AGAIN, set the header timeout timer if needed and keep listening for read events on the connection
  if (n == NGX_AGAIN) {
    if (!rev->timer_set) {
      cscf = ngx_http_get_module_srv_conf(r, ngx_http_core_module);
      ngx_add_timer(rev, cscf->client_header_timeout);
    }

    if (ngx_handle_read_event(rev, 0) != NGX_OK) {
      ngx_http_close_request(r, NGX_HTTP_INTERNAL_SERVER_ERROR);
      return NGX_ERROR;
    }

    return NGX_AGAIN;
  }

  // If n is 0, the client has closed the connection
  if (n == 0) {
    ngx_log_error(NGX_LOG_INFO, c->log, 0, "client prematurely closed connection");
  }

  // If the client closed the connection or the read failed, finalize the request with NGX_HTTP_BAD_REQUEST
  if (n == 0 || n == NGX_ERROR) {
    c->error = 1;
    c->log->action = "reading client request headers";
    ngx_http_finalize_request(r, NGX_HTTP_BAD_REQUEST);
    return NGX_ERROR;
  }

  // Advance last past the newly read data
  r->header_in->last += n;
  return n;
}

ngx_http_read_request_header() reads the request header data in the following steps:

  • Check whether the buffer still holds unprocessed data. If it does, return its length directly. Such data can exist because the earlier read of the request line may already have pulled in part or all of the request headers;
  • Check whether the read event is ready. If it is, call c->recv() to read data from the connection;
  • If the read event is not ready, or c->recv() returns NGX_AGAIN, set the client_header_timeout timer if it is not already set, re-register the read event for the connection, and return NGX_AGAIN;
  • Check the return value of c->recv(). If it is 0, the client has closed the connection; if it is NGX_ERROR, the read failed. In both cases the request is finalized with NGX_HTTP_BAD_REQUEST (400) and NGX_ERROR is returned. A value greater than 0 means the read succeeded, and the value is the number of bytes read;
  • Advance the buffer's last pointer past the newly read data.
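
The steps above can be sketched with a plain buffer and a simulated connection. This is a rough illustration, not nginx code: all demo_* names are hypothetical, long is used in place of ssize_t to stay within standard C, demo_recv() pretends to be a non-blocking socket, and the armed flag stands in for setting the timer and re-registering the read event.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_AGAIN  (-2)   /* hypothetical codes, named after the nginx ones */
#define DEMO_ERROR  (-1)

/* Simplified stand-in for r->header_in: [pos, last) is unparsed data,
 * [last, end) is free space. */
typedef struct {
    char *pos;
    char *last;
    char *end;
} demo_buf_t;

/* Simulated connection. */
typedef struct {
    const char *pending;   /* bytes the peer has sent but we have not read */
    int         ready;     /* stand-in for rev->ready */
    int         closed;    /* peer closed the connection */
    int         armed;     /* set when we ask to be woken on the next read */
} demo_conn_t;

/* Simulated non-blocking recv(): copies up to cap pending bytes. */
static long demo_recv(demo_conn_t *c, char *dst, size_t cap)
{
    size_t n = strlen(c->pending);

    if (n == 0) {
        return c->closed ? 0 : DEMO_AGAIN;
    }
    if (n > cap) {
        n = cap;
    }
    memcpy(dst, c->pending, n);
    c->pending += n;
    return (long) n;
}

static long demo_read_request_header(demo_conn_t *c, demo_buf_t *b)
{
    long n = (long) (b->last - b->pos);

    if (n > 0) {
        return n;                       /* unparsed data is already buffered */
    }

    if (c->ready) {
        /* The caller must guarantee free space, as the main loop does by
         * allocating a larger buffer first. */
        assert(b->last < b->end);
        n = demo_recv(c, b->last, (size_t) (b->end - b->last));
    } else {
        n = DEMO_AGAIN;
    }

    if (n == DEMO_AGAIN) {
        c->armed = 1;                   /* wait for the next read event */
        return DEMO_AGAIN;
    }
    if (n == 0 || n == DEMO_ERROR) {
        return DEMO_ERROR;              /* peer closed or read failed */
    }

    b->last += n;                       /* account for the new data */
    return n;
}
```

Note that a second call made before the parser advances pos returns the same byte count without touching the connection, which is why the main loop can call the read function on every NGX_AGAIN without risking a redundant recv().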

3. Summary

This article explained how nginx reads and parses the request headers, focusing on the main code path that reads the data and on the detailed steps of that read.

Keywords: Programming Nginx JSON

Added by moallam on Tue, 24 Mar 2020 05:06:03 +0200