Write a simple web server

A minimalist web server based on Python 3, written to learn the HTTP protocol and how a web server works. The author has only a superficial understanding of how web servers work, and this article reflects that personal understanding, so it has many shortcomings and loopholes. Its purpose is to show one way to write a web server. Project GitHub address: https://github.com/hanrenguang/simple-webserver.

Web server principle

Anyone who has studied networking knows that HTTP is implemented on top of TCP. To communicate, the browser and the server first establish a TCP connection and then exchange request and response messages. The server is the passive party: it waits until a browser sends a request, and only then does it respond.
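The cycle described above (wait, connect, request, response) can be sketched from both sides on one machine. This is a minimal sketch under stated assumptions, not the project's code: the names tiny_server, fetch and demo are hypothetical, the server answers a single connection with a fixed reply, and port 0 asks the operating system for any free port.

```python
import socket
import threading


def tiny_server(listener):
    """Passive side: wait for one connection, then answer it."""
    conn, _ = listener.accept()          # blocks until a client connects
    with conn:
        conn.recv(1024)                  # read the request message
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nhi")


def fetch(host, port, path="/"):
    """Active side (the browser's role): connect, send a request, read the reply."""
    request = "GET %s HTTP/1.1\r\nHost: %s\r\nConnection: close\r\n\r\n" % (path, host)
    with socket.create_connection((host, port), timeout=5) as client:
        client.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = client.recv(1024)
            if not data:                 # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5)               # do not wait forever for a client
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=tiny_server, args=(listener,))
    worker.start()
    try:
        return fetch("127.0.0.1", port)
    finally:
        worker.join()
        listener.close()
```

Calling demo() returns the raw bytes of the reply. The server thread does nothing until fetch connects, which is the "waiting state" described above.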

Socket connection

The first step in implementing the server is to create a socket and listen for connections. A socket is an abstraction over the TCP/UDP protocols, and Python ships with its own socket module, so it is very convenient to use.

import socket

sk = socket.socket(
    socket.AF_INET, 
    socket.SOCK_STREAM
)

# Listen on local port 8888
host = '127.0.0.1'
port = 8888

sk.bind((host, port))
sk.listen(5)

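while True:
    clientSk = None
    try:
        clientSk, addr = sk.accept()
        print("address is: %s" % str(addr))

        # Read up to 1024 bytes of the request
        req = clientSk.recv(1024)

        # sendall() requires bytes in Python 3, not str
        clientSk.sendall(b'...')

    except Exception as err:
        print(err)

    finally:
        # Close the connection whether or not an error occurred;
        # clientSk is still None if accept() itself failed
        if clientSk is not None:
            clientSk.close()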

This is a minimalist socket server. Note that so far we have only implemented the TCP part: the server accepts connections and exchanges raw bytes, but it does not yet understand HTTP.
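One shortcoming of the loop above is that it calls recv(1024) once, and TCP does not guarantee that a single call returns the whole request. A possible refinement, shown here only as a sketch, is to keep reading until the blank line that ends the HTTP headers. The names read_request_head and demo, and the max_bytes limit, are assumptions made for illustration. socket.socketpair() stands in for a real browser connection.

```python
import socket

HEADER_END = b"\r\n\r\n"


def read_request_head(conn, max_bytes=65536):
    """Read from conn until the blank line that ends the HTTP headers."""
    buffer = b""
    while HEADER_END not in buffer:
        if len(buffer) >= max_bytes:
            raise ValueError("request headers too large")
        chunk = conn.recv(1024)
        if not chunk:            # peer closed before finishing the headers
            break
        buffer += chunk
    # Keep only the part before the blank line (request line and headers)
    head, _, _rest = buffer.partition(HEADER_END)
    return head


def demo():
    # Two connected sockets: one plays the browser, the other the server
    browser, server = socket.socketpair()
    with browser, server:
        # Send the request in two pieces, as the network might deliver it
        browser.sendall(b"GET / HTTP/1.1\r\nHo")
        browser.sendall(b"st: localhost\r\n\r\n")
        return read_request_head(server)
```

The size limit is a design choice: without it, a client that never sends the blank line could make the server buffer data indefinitely.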

Parsing HTTP requests

Getting the browser's request is simple: clientSk.recv() returns the request message. The data cannot be used directly, though, because it is encapsulated according to the HTTP protocol, so before going further we need to parse the request message. To do that, we first need to understand its format. The quickest way is to open a browser (Chrome, for example), visit any site such as Baidu, press F12 to open the developer tools, and look at the Network tab. The request looks something like this:

GET / HTTP/1.1
Host: xxx
Connection: xxx
Cache-Control: xxx
Upgrade-Insecure-Requests: xxx
User-Agent: xxx
Accept: xxx
Accept-Encoding: xxx
Accept-Language: xxx
Cookie: xxx

We focus on the first line: the method is GET, the requested resource path is /, and the protocol is HTTP/1.1. Every line ends with a carriage return and line feed, \r\n. Our parsing of the message is therefore as follows (it has many shortcomings):

# Parse the request message
def parseReq(reqList):
    # Holds the parsed results
    parseRet = {}

    # The request line looks like "GET / HTTP/1.1"
    requestLine = reqList[0].split(' ')
    # Request method, such as GET
    method = requestLine[0]
    # Requested resource path, such as '/'
    sourcePath = requestLine[1]

    parseRet['method'] = method
    parseRet['sourcePath'] = sourcePath

    # Save each header in the form key: value
    for line in reqList[1:]:
        idx = line.find(':')
        # Skip lines without a colon, such as the blank line after the headers
        if idx == -1:
            continue

        key, value = line[0:idx], line[idx+1:]
        parseRet[key] = value.strip()

    return parseRet

# The first step is to decode the data,
# then split it into lines
requestList = clientSk.recv(1024).decode().split("\r\n")

# Call the function defined above to parse it
parseRet = parseReq(requestList)
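To see what such a parser produces, here is a self-contained worked example on a sample request. It is a sketch under stated assumptions rather than the project's code: the function name parse_request and the shape of the returned dict are hypothetical. It differs from the parser above in two ways: it checks that the request line has exactly three parts, and it stores header names in lower case, because HTTP header names are case-insensitive.

```python
def parse_request(raw):
    """Parse the head of an HTTP request (bytes) into a dict."""
    # Everything before the first blank line is the request line plus headers.
    # iso-8859-1 maps every byte to a character, so decoding cannot fail.
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")

    parts = lines[0].split(" ")
    if len(parts) != 3:
        raise ValueError("malformed request line: %r" % lines[0])
    method, path, version = parts

    headers = {}
    for line in lines[1:]:
        # Split on the first colon only, so values may contain colons
        name, sep, value = line.partition(":")
        if not sep:
            continue
        headers[name.strip().lower()] = value.strip()

    return {"method": method, "path": path, "version": version, "headers": headers}


sample = (
    b"GET /index.html HTTP/1.1\r\n"
    b"Host: localhost:8888\r\n"
    b"Accept: text/html\r\n"
    b"\r\n"
)
parsed = parse_request(sample)
```

Here parsed["method"] is "GET", parsed["path"] is "/index.html", and parsed["headers"]["host"] is "localhost:8888". The port survives because only the first colon separates the header name from its value.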

Constructing the response message

After receiving and parsing the request message, we can start constructing the response message. Take a static resource request as an example, and assume the first line of the request message is GET /index.html HTTP/1.1. The first thing to do is read the content of the file at the path /index.html:

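# Read the resource content
# path is the local file path that corresponds to the requested resource
content = ''
try:
    with open(path, 'r', encoding='utf-8') as f:
        while True:
            chunk = f.read(1024)
            if not chunk:
                break
            content += chunk
except OSError:
    # If the file cannot be read, content stays empty
    pass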
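The snippet above takes path as given. One way to derive it from the request path, sketched here under assumptions, is to join the request path onto a document root and refuse anything that resolves outside that root, so that a request such as /../secret.txt cannot read arbitrary files. The function name resolve_path and the root directory name "site" are hypothetical, and percent-encoded characters in the URL are not handled.

```python
import os


def resolve_path(root, request_path):
    """Map a request path such as '/index.html' to a file under root.

    Returns None when the path would escape root.
    """
    # Drop any query string, then the leading slash
    relative = request_path.split("?", 1)[0].lstrip("/")
    root = os.path.realpath(root)
    # realpath collapses '..' segments and resolves symbolic links
    candidate = os.path.realpath(os.path.join(root, relative))
    # The resolved file must still be located inside root
    if os.path.commonpath([root, candidate]) != root:
        return None
    return candidate


inside = resolve_path("site", "/index.html")
outside = resolve_path("site", "/../secret.txt")
```

With these inputs, inside is the absolute path of site/index.html and outside is None. A server could answer the second case with an error status instead of opening the file.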

Next comes the response message itself. As with the request, you can inspect the format of an HTTP response message in the browser's developer tools. We will not show an example here and go straight to the code:

# Read the resource content, as in the previous step
content = ''
try:
    with open(path, 'r', encoding='utf-8') as f:
        content = f.read()
except OSError:
    # If the file cannot be read, content stays empty
    pass

# Encode the body first: Content-Length counts bytes, not characters
body = content.encode(encoding='UTF-8')

# Status line; most header fields are omitted
headers = 'HTTP/1.1 200 OK\r\n'
contentType = 'Content-Type: text/html; charset=utf-8\r\n'
contentLen = 'Content-Length: ' + str(len(body)) + '\r\n'

# Combine the status line, header fields, blank line, and body into the response res
res = (headers + contentType + contentLen + '\r\n').encode(encoding='UTF-8') + body

# Send the response to the browser.
# At this point, the communication is over.
clientSk.sendall(res)
clientSk.close()
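The same construction can be wrapped in a small helper that also covers a missing file. This is a sketch, not the project's implementation: the name build_response and its parameters are hypothetical. It looks up the reason phrase with HTTPStatus from the standard library and computes Content-Length from the encoded body, because that header counts bytes rather than characters.

```python
from http import HTTPStatus


def build_response(status, body, content_type="text/html; charset=utf-8"):
    """Assemble a minimal HTTP/1.1 response as bytes."""
    if isinstance(body, str):
        body = body.encode("utf-8")
    head = (
        # Status line, for example "HTTP/1.1 200 OK"
        "HTTP/1.1 %d %s\r\n" % (status, HTTPStatus(status).phrase)
        + "Content-Type: %s\r\n" % content_type
        + "Content-Length: %d\r\n" % len(body)
        + "Connection: close\r\n"
        # The blank line separates the headers from the body
        + "\r\n"
    )
    return head.encode("ascii") + body


ok = build_response(200, "<h1>héllo</h1>")
missing = build_response(404, "<h1>Not Found</h1>")
```

In the first response the body has 14 characters but 15 bytes, since "é" takes two bytes in UTF-8, so the header reads Content-Length: 15.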

Example

Go to the project on GitHub: https://github.com/hanrenguang/simple-webserver, download the project to your machine, run server.py (for example by double-clicking it), and visit http://localhost:8888/index.html. You should see a friendly "Hello world!" message.

Keywords: Python socket Web Server github

Added by t31os on Fri, 21 Jun 2019 04:49:44 +0300