WebSocket background
1. The WebSocket protocol came after HTTP. Before it existed, web applications that needed two-way communication between client and server (for example, instant messaging and games) had to abuse HTTP by polling the server for updates, which led to the following problems:
- The server is forced to use several underlying TCP connections for each client: one for sending information to the client and a new one for each incoming message.
- The wire protocol has high overhead, because every client-to-server message carries an HTTP header.
- The client-side script is forced to maintain a mapping from the outgoing connections to the incoming connection in order to track replies.
2. HTTP's limitations also show up in how data is refreshed. Before WebSocket, there were three common approaches:
- Regular client polling: the client queries the server at a fixed interval, for example once every 10 s. This inevitably produces many useless requests, and when the server data has not changed it wastes a lot of bandwidth.
- Long polling: the client still sends a request, but the server holds it open. When the data is updated, the server responds and the client immediately issues a new request. When there is no update, the server does not respond and the request ends only when it times out. If the data is updated frequently, long polling has no advantage over regular polling.
- HTTP streaming: the client sends a single update request, and the server keeps the response stream of that request open indefinitely, pushing each data update to the client in real time.
HTTP streaming sounds ideal, but it brings new problems:
1. It goes against the semantics of HTTP: the client and server no longer follow the request/response model but instead hold open a one-way channel from server to client.
2. The server sends data whenever it is updated, so both sides have to agree on where each update begins and ends within the stream, which is error-prone.
3. Network intermediaries between the client and server may buffer or cache the response, so the client may not receive the actual updates in time.
WebSocket was designed to address these problems.
WebSocket concept
- The WebSocket protocol enables two-way communication between a client running untrusted code in a controlled environment and a remote host that has opted in to communication from that code. The security model used for this is the origin-based security model commonly used by web browsers. The protocol consists of an opening handshake followed by basic message framing, layered over TCP. The goal of this technology is to provide a mechanism for browser-based applications that need two-way communication with a server without relying on opening multiple HTTP connections.
WebSocket features
1. Advantages
- Maintains connection state: WebSocket establishes a connection first and keeps it open, which makes it a stateful protocol.
- Better binary support: the protocol defines binary frames, so binary data can be sent directly without being encoded as text.
- Supports extensions: the protocol defines an extension mechanism, so parts of its behavior can be customized.
- Good compression: when a compression extension is negotiated, the context of earlier messages can be reused to achieve a better compression ratio.
2. Disadvantages
- Higher development requirements: both the front end and the back end become somewhat harder to implement.
- Pushing messages is relatively complex.
- HTTP is very mature, while WebSocket is still relatively new.
WebSocket protocol communication process
The protocol has two parts: handshake and data transfer.
handshake
client
The client's handshake message is an HTTP Upgrade request:

    GET /chat HTTP/1.1
    Host: server.example.com
    Upgrade: websocket
    Connection: Upgrade
    Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
    Origin: http://example.com
    Sec-WebSocket-Protocol: chat, superchat
    Sec-WebSocket-Version: 13

Sec-WebSocket-Key is randomly generated by the browser (a base64-encoded 16-byte value) and provides basic protection against malicious or accidental connections.
Sec-WebSocket-Version is the version of the WebSocket protocol. Early on there were many draft versions, and different vendors implemented different ones, but the version has since been fixed at 13. If the server does not support the requested version, it must return a Sec-WebSocket-Version header listing the version numbers it supports.
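The header checks described above can be sketched as a small validation routine. This is only an illustration under stated assumptions: `find_header` and `ws_check_upgrade` are hypothetical names that do not come from the article's code, the request is assumed to be a complete NUL-terminated string, and the `Connection` header check is left out for brevity.

```c
#include <assert.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* Hypothetical helper: locate header `name` in a raw HTTP request.
 * Returns a pointer to the value and stores its length in *len,
 * or returns NULL when the header is absent. */
static const char *find_header(const char *req, const char *name, size_t *len)
{
    size_t nlen = strlen(name);
    const char *line = strstr(req, "\r\n");      /* end of the request line */

    while (line != NULL) {
        line += 2;                               /* start of the next line */
        if (line[0] == '\r' || line[0] == '\0')
            break;                               /* blank line: headers are over */
        const char *eol = strstr(line, "\r\n");
        if (eol == NULL)
            break;
        /* Header names are case-insensitive and end with a colon. */
        if (strncasecmp(line, name, nlen) == 0 && line[nlen] == ':') {
            const char *v = line + nlen + 1;
            while (v < eol && (*v == ' ' || *v == '\t'))
                v++;                             /* skip optional whitespace */
            *len = (size_t)(eol - v);
            return v;
        }
        line = eol;
    }
    return NULL;
}

/* Returns the HTTP status the server would answer with:
 * 101 = accept the upgrade,
 * 400 = not a valid WebSocket handshake,
 * 426 = unsupported version (answer with "Sec-WebSocket-Version: 13"). */
int ws_check_upgrade(const char *req)
{
    size_t n = 0;
    const char *v;

    assert(req != NULL);

    v = find_header(req, "Upgrade", &n);
    if (v == NULL || n != 9 || strncasecmp(v, "websocket", 9) != 0)
        return 400;

    /* The key is 16 random bytes in base64, which is always 24 characters. */
    if (find_header(req, "Sec-WebSocket-Key", &n) == NULL || n != 24)
        return 400;

    v = find_header(req, "Sec-WebSocket-Version", &n);
    if (v == NULL)
        return 400;
    if (n != 2 || strncmp(v, "13", 2) != 0)
        return 426;

    return 101;
}
```

For the sample request shown earlier, `ws_check_upgrade` returns 101. A request with `Sec-WebSocket-Version: 8` yields 426, which tells the caller to reply with the supported version.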
server
The server's handshake response is an HTTP 101 Switching Protocols response:

    HTTP/1.1 101 Switching Protocols
    Upgrade: websocket
    Connection: Upgrade
    Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
    Sec-WebSocket-Protocol: chat

The corresponding implementation on Linux is shown below, with comments in the code:

    int websocket_handshake(struct qsevent *ev)
    {
        char linebuf[128];
        int index = 0;
        char sec_data[128] = {0};
        char sec_accept[32] = {0};

        do {
            memset(linebuf, 0, sizeof(linebuf));           // Clear the buffer that temporarily holds one line
            index = readline(ev->buffer, index, linebuf);  // Read one line of the request
            if (strstr(linebuf, "Sec-WebSocket-Key")) {    // This line carries the Sec-WebSocket-Key header
                // WEBSOCK_KEY_LENGTH is assumed to be the length of the "Sec-WebSocket-Key: " prefix
                char *key = linebuf + WEBSOCK_KEY_LENGTH;
                strcat(linebuf, GUID);                     // Append the GUID to the key
                SHA1((unsigned char *)key, strlen(key), (unsigned char *)sec_data); // SHA-1 of key + GUID
                // A SHA-1 digest is always 20 bytes and may contain zero bytes, so strlen() must not be used here
                base64_encode(sec_data, 20, sec_accept);   // Base64 encoding
                memset(ev->buffer, 0, MAX_BUFLEN);         // Clear the server data buffer
                // Assemble the handshake response in the data buffer; it is sent in the next step
                ev->length = snprintf(ev->buffer, MAX_BUFLEN,
                                      "HTTP/1.1 101 Switching Protocols\r\n"
                                      "Upgrade: websocket\r\n"
                                      "Connection: Upgrade\r\n"
                                      "Sec-WebSocket-Accept: %s\r\n\r\n",
                                      sec_accept);
                break;
            }
        // Keep reading until the blank line that ends the headers
        } while (index != -1 && ev->buffer[index] != '\r' && ev->buffer[index] != '\n');
        return 0;
    }

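The Sec-WebSocket-Accept value is the base64 encoding of the SHA-1 digest of the client key concatenated with the fixed GUID. The handshake code relies on `SHA1` and `base64_encode` helpers that are not shown, so here is a self-contained sketch of the same derivation that can be checked against the sample handshake. The names `sha1`, `base64` and `ws_accept_key` are hypothetical and written only for this illustration. They are not the author's helpers, and a production server would normally use a vetted crypto library instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t rol(uint32_t v, int n) { return (v << n) | (v >> (32 - n)); }

/* Plain SHA-1 over a complete in-memory message (illustrative only). */
static void sha1(const uint8_t *msg, size_t len, uint8_t out[20])
{
    uint32_t h[5] = {0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0};
    uint64_t bits = (uint64_t)len * 8;
    size_t total = ((len + 8) / 64 + 1) * 64;   /* length after padding */

    for (size_t off = 0; off < total; off += 64) {
        uint8_t block[64];
        uint32_t w[80];
        for (size_t i = 0; i < 64; i++) {       /* build the padded block */
            size_t p = off + i;
            if (p < len)             block[i] = msg[p];
            else if (p == len)       block[i] = 0x80;
            else if (p >= total - 8) block[i] = (uint8_t)(bits >> (8 * (total - 1 - p)));
            else                     block[i] = 0;
        }
        for (int i = 0; i < 16; i++)
            w[i] = (uint32_t)block[4 * i] << 24 | (uint32_t)block[4 * i + 1] << 16 |
                   (uint32_t)block[4 * i + 2] << 8 | block[4 * i + 3];
        for (int i = 16; i < 80; i++)
            w[i] = rol(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);

        uint32_t a = h[0], b = h[1], c = h[2], d = h[3], e = h[4];
        for (int i = 0; i < 80; i++) {
            uint32_t f, k;
            if (i < 20)      { f = (b & c) | (~b & d);          k = 0x5A827999; }
            else if (i < 40) { f = b ^ c ^ d;                   k = 0x6ED9EBA1; }
            else if (i < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8F1BBCDC; }
            else             { f = b ^ c ^ d;                   k = 0xCA62C1D6; }
            uint32_t t = rol(a, 5) + f + e + k + w[i];
            e = d; d = c; c = rol(b, 30); b = a; a = t;
        }
        h[0] += a; h[1] += b; h[2] += c; h[3] += d; h[4] += e;
    }
    for (int i = 0; i < 5; i++) {
        out[4 * i]     = (uint8_t)(h[i] >> 24);
        out[4 * i + 1] = (uint8_t)(h[i] >> 16);
        out[4 * i + 2] = (uint8_t)(h[i] >> 8);
        out[4 * i + 3] = (uint8_t)h[i];
    }
}

/* Standard base64 with '=' padding; out needs 4 * ceil(len / 3) + 1 bytes. */
static void base64(const uint8_t *in, size_t len, char *out)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t o = 0;
    for (size_t i = 0; i < len; i += 3) {
        uint32_t v = (uint32_t)in[i] << 16;
        if (i + 1 < len) v |= (uint32_t)in[i + 1] << 8;
        if (i + 2 < len) v |= in[i + 2];
        out[o++] = tbl[(v >> 18) & 63];
        out[o++] = tbl[(v >> 12) & 63];
        out[o++] = (i + 1 < len) ? tbl[(v >> 6) & 63] : '=';
        out[o++] = (i + 2 < len) ? tbl[v & 63] : '=';
    }
    out[o] = '\0';
}

/* Derives Sec-WebSocket-Accept from a 24-character Sec-WebSocket-Key. */
void ws_accept_key(const char *client_key, char out[29])
{
    static const char guid[] = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";
    uint8_t buf[24 + sizeof(guid)];
    uint8_t digest[20];

    assert(strlen(client_key) == 24);           /* base64 of 16 bytes */
    memcpy(buf, client_key, 24);
    memcpy(buf + 24, guid, sizeof(guid) - 1);   /* key followed by GUID, no NUL */
    sha1(buf, 24 + sizeof(guid) - 1, digest);
    base64(digest, sizeof(digest), out);        /* 20 bytes -> 28 chars + NUL */
}
```

Passing the key from the sample request, `dGhlIHNhbXBsZSBub25jZQ==`, should reproduce the `s3pPLMBiTxaQ9kYGzzhZRbK+xOo=` value in the sample response. Note that the digest is passed to `base64` with its fixed length of 20 bytes rather than a `strlen` result, because the raw digest is binary data.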
data transfer
First, look at the frame format. Each frame consists of the following fields:
- FIN: indicates that this is the final fragment of a message. The first fragment may also be the final fragment.
- RSV1, RSV2, RSV3: normally all 0. When the client and server negotiate a WebSocket extension, these three flag bits may be non-zero, and their meaning is defined by the extension. If a non-zero value is received and no negotiated extension defines it, the receiver must fail the connection.
- opcode: the operation code, which determines how the payload data is interpreted.
  - %x0: a continuation frame. An opcode of 0 means that the message is fragmented and the frame just received is one of its fragments.
  - %x1: a text frame.
  - %x2: a binary frame.
  - %x3-7: reserved for non-control frames to be defined later.
  - %x8: connection close.
  - %x9: ping.
  - %xA: pong.
  - %xB-F: reserved for control frames to be defined later.
- Mask: whether the payload data is masked. Frames sent from the client to the server must be masked; frames sent from the server to the client must not be.
- Payload length: 7 bits, 7 + 16 bits, or 7 + 64 bits. It gives the length of the payload data. Let x be the value of the 7-bit field:
  - x is 0~125: the payload is x bytes long.
  - x is 126: the next 2 bytes are a 16-bit unsigned integer in network byte order, and its value is the payload length.
  - x is 127: the next 8 bytes are a 64-bit unsigned integer in network byte order (the most significant bit must be 0), and its value is the payload length.
- Masking-key: 0 or 4 bytes. When Mask is 1, the frame carries a 4-byte masking key; when Mask is 0, there is no masking key. PS: the purpose of masking is not to prevent data disclosure, but to prevent attacks such as the proxy cache poisoning attacks that affected earlier versions of the protocol.
- Payload data: the message body.
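The field layout above can be decoded with plain shifts and masks, which avoids any dependence on compiler-specific bit-field ordering. The following is a minimal sketch: `ws_frame_info`, `ws_parse_header` and `ws_unmask` are hypothetical names introduced for this example, and the function only decodes the header without enforcing every protocol rule (for instance, it does not reject non-zero RSV bits).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int      fin;          /* 1 if this is the final fragment */
    int      opcode;       /* 4-bit operation code */
    int      masked;       /* 1 if a masking key is present */
    uint64_t payload_len;  /* decoded payload length in bytes */
    uint8_t  mask_key[4];  /* valid only when masked == 1 */
    size_t   header_len;   /* offset of the payload inside the buffer */
} ws_frame_info;

/* Decodes the frame header found in buf[0..len).
 * Returns 0 on success, or -1 when the buffer does not yet
 * contain the complete header. */
int ws_parse_header(const uint8_t *buf, size_t len, ws_frame_info *out)
{
    size_t pos = 2;
    uint64_t n;

    assert(buf != NULL && out != NULL);
    if (len < 2)
        return -1;

    out->fin    = (buf[0] >> 7) & 1;
    out->opcode = buf[0] & 0x0F;
    out->masked = (buf[1] >> 7) & 1;
    n           = buf[1] & 0x7F;

    if (n == 126) {                      /* 16-bit length, network byte order */
        if (len < pos + 2)
            return -1;
        n = ((uint64_t)buf[2] << 8) | buf[3];
        pos += 2;
    } else if (n == 127) {               /* 64-bit length, network byte order */
        if (len < pos + 8)
            return -1;
        n = 0;
        for (int i = 0; i < 8; i++)
            n = (n << 8) | buf[2 + i];
        pos += 8;
    }
    out->payload_len = n;

    if (out->masked) {                   /* 4-byte masking key follows the length */
        if (len < pos + 4)
            return -1;
        for (int i = 0; i < 4; i++)
            out->mask_key[i] = buf[pos + i];
        pos += 4;
    }
    out->header_len = pos;
    return 0;
}

/* Unmasks the payload in place: byte i is XORed with key[i % 4]. */
void ws_unmask(uint8_t *payload, uint64_t n, const uint8_t key[4])
{
    for (uint64_t i = 0; i < n; i++)
        payload[i] ^= key[i % 4];
}
```

As a usage example, the masked text frame `81 85 37 fa 21 3d 7f 9f 4d 51 58` parses to FIN = 1, opcode = 1, a payload length of 5 and a header length of 6. Unmasking the five payload bytes yields `Hello`.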
The following is the server-side implementation:

    #define GUID "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

    enum {
        WS_HANDSHAKE    = 0, // handshake
        WS_TRANSMISSION = 1, // data transmission
        WS_END          = 2, // end
    };

    // First two bytes of a frame. The bit-field order assumes an ABI such as
    // GCC on x86, where the first field occupies the lowest bits.
    typedef struct _ws_ophdr {
        unsigned char opcode:4,
                      rsv3:1,
                      rsv2:1,
                      rsv1:1,
                      fin:1;
        unsigned char pl_len:7,
                      mask:1;
    } ws_ophdr;

    // 16-bit extended payload length followed by the masking key
    typedef struct _ws_head_126 {
        unsigned short payload_length;   // network byte order
        char mask_key[4];
    } ws_head_126;

    // 64-bit extended payload length followed by the masking key
    typedef struct _ws_head_127 {
        unsigned char payload_length[8]; // network byte order
        char mask_key[4];
    } ws_head_127;

    /* decode */
    void websocket_umask(char *payload, int length, char *mask_key)
    {
        int i = 0;
        for ( ; i < length; i++)
            payload[i] ^= mask_key[i % 4]; // XOR each byte with the masking key
    }

    int websocket_transmission(struct qsevent *ev)
    {
        ws_ophdr *ophdr = (ws_ophdr *)ev->buffer; // First two bytes of the frame
        printf("ws_recv_data length=%d\n", ophdr->pl_len);
        if (ophdr->pl_len < 126) { // The payload length fits in the 7-bit field
            char *payload = ev->buffer + sizeof(ws_ophdr); // Payload of an unmasked frame
            if (ophdr->mask) { // A masked frame carries a 4-byte masking key before the payload
                payload += 4;
                websocket_umask(payload, ophdr->pl_len, ev->buffer + sizeof(ws_ophdr)); // Decode with XOR
            }
            // The payload is not NUL-terminated, so print exactly pl_len bytes
            printf("payload : %.*s\n", ophdr->pl_len, payload);
        } else if (ophdr->pl_len == 126) {
            ws_head_126 *hdr126 = (ws_head_126 *)(ev->buffer + sizeof(ws_ophdr));
            (void)hdr126; // 16-bit extended length: not handled yet
        } else {
            ws_head_127 *hdr127 = (ws_head_127 *)(ev->buffer + sizeof(ws_ophdr));
            (void)hdr127; // 64-bit extended length: not handled yet
        }
        return 0;
    }

    int websocket_request(struct qsevent *ev)
    {
        if (ev->status_machine == WS_HANDSHAKE) {
            websocket_handshake(ev);               // handshake
            ev->status_machine = WS_TRANSMISSION;  // Switch to the transmission state
        } else if (ev->status_machine == WS_TRANSMISSION) {
            websocket_transmission(ev);            // data transmission
        }
        return 0;
    }

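The server code above only receives frames. For the opposite direction, a server-to-client frame uses the same layout but is sent unmasked. The sketch below builds a single unfragmented frame under those rules. `ws_build_frame` and the `WS_OP_*` constants are hypothetical names introduced for this example and are not part of the reactor code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    WS_OP_TEXT   = 0x1,
    WS_OP_BINARY = 0x2,
    WS_OP_CLOSE  = 0x8,
    WS_OP_PING   = 0x9,
    WS_OP_PONG   = 0xA
};

/* Writes one unfragmented, unmasked frame into out[0..cap).
 * Returns the total frame size, or 0 when out is too small or a
 * control frame is asked to carry more than 125 bytes. */
size_t ws_build_frame(uint8_t *out, size_t cap, int opcode,
                      const void *payload, size_t n)
{
    size_t pos = 0;
    size_t hdr = 2 + (n > 65535 ? 8 : n > 125 ? 2 : 0);

    assert(out != NULL && (payload != NULL || n == 0));

    if ((opcode & 0x8) && n > 125)
        return 0;                        /* control frames: at most 125 bytes */
    if (cap < hdr || cap - hdr < n)
        return 0;                        /* output buffer too small */

    out[pos++] = (uint8_t)(0x80 | (opcode & 0x0F));   /* FIN = 1, RSV1-3 = 0 */
    if (n <= 125) {
        out[pos++] = (uint8_t)n;                      /* MASK = 0, 7-bit length */
    } else if (n <= 65535) {
        out[pos++] = 126;                             /* 16-bit length follows */
        out[pos++] = (uint8_t)(n >> 8);
        out[pos++] = (uint8_t)n;
    } else {
        out[pos++] = 127;                             /* 64-bit length follows */
        for (int shift = 56; shift >= 0; shift -= 8)
            out[pos++] = (uint8_t)((uint64_t)n >> shift);
    }
    if (n > 0)
        memcpy(out + pos, payload, n);
    return pos + n;
}
```

For example, building a text frame for `Hello` produces the 7 bytes `81 05 48 65 6c 6c 6f`. The same function can answer a ping by echoing the ping's payload back with `WS_OP_PONG`.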
The handshake and transmission functions in this article are built on my reactor-based server framework, which targets million-level concurrent connections. The full code is on my GitHub. For more information about WebSocket, see RFC 6455.
Clone the code:
git clone https://github.com/qiushii/reactor.git