HTTP protocol
URL
What we commonly call a "web address" is actually a URL.
urlencode and urldecode
urlencode converts any parameter we supply into a string that is safe to place in a URL. The rules are:
- Letters, digits, hyphens, underscores and periods remain unchanged.
- Spaces are converted to plus signs.
- Every other character is converted, byte by byte, to a percent sign followed by the two-digit hexadecimal value of that byte.
urldecode does the opposite.
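The encoding rules can be sketched in C++ roughly as follows. This is a minimal illustration, not a library implementation: the function names urlEncode and urlDecode are made up for this example, and the set of characters left unchanged (letters, digits, '-', '_' and '.') is an assumption modelled on the common form-style urlencode.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Encode: safe characters pass through, a space becomes '+',
// every other byte becomes %XX (uppercase hex).
std::string urlEncode(const std::string& in) {
    static const char* hex = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : in) {
        if (std::isalnum(c) || c == '-' || c == '_' || c == '.') {
            out += static_cast<char>(c);
        } else if (c == ' ') {
            out += '+';
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// Value of one hex digit, or -1 if the character is not a hex digit.
static int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Decode: '+' becomes a space, %XX becomes the byte it names,
// anything else (including a malformed escape) is copied as-is.
std::string urlDecode(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '+') {
            out += ' ';
        } else if (in[i] == '%' && i + 2 < in.size() &&
                   hexValue(in[i + 1]) >= 0 && hexValue(in[i + 2]) >= 0) {
            out += static_cast<char>(hexValue(in[i + 1]) * 16 + hexValue(in[i + 2]));
            i += 2;
        } else {
            out += in[i];
        }
    }
    return out;
}
```

For example, urlEncode("a b&c=d") yields "a+b%26c%3Dd", and urlDecode turns it back into the original string.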
HTTP protocol format
HTTP stands for Hypertext Transfer Protocol.
Basic features of the HTTP protocol:
- Connectionless: 1. Establishing the connection is the job of TCP, not of HTTP. 2. HTTP simply sends its request directly to the other party.
- Stateless: HTTP itself records no user information; it only handles requests and responses. User state is kept with cookie + session.
- Simple and fast: resources (HTML, images, CSS, ...) are transferred over short connections. HTTP/1.0 uses short connections by default; HTTP/1.1 uses long (persistent) connections by default.
- HTTP is stateless, but it serves users. Without any state, the user experience would be poor.
- A cookie is essentially data stored by the browser, typically in a file. The first response carries a Set-Cookie header, whose value the browser stores. Later requests from the client carry that value in a Cookie header, so the user does not need to log in again.
Whether an HTTP connection is opened or closed is decided by the application layer, not by TCP.
TCP only provides the mechanism for establishing and tearing down connections; when to establish or release one is decided by the application layer.
- HTTP can therefore control when connections are opened and closed.
- With a long connection, a single connection is established and released once, and the many resources that make up a complete web page are all transferred over it. This reduces the cost of establishing and releasing connections and improves efficiency.
- Connection: keep-alive requests a long connection.
- Data is delivered to the application layer one layer at a time, and arriving in order is part of reliability. HTTP does not concern itself with whether the data is in order; TCP guarantees it.
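How a server might decide whether to keep a connection open can be sketched as below. This is only an illustration under stated assumptions: shouldKeepAlive is a hypothetical helper, the version string is assumed to be exactly "HTTP/1.0" or "HTTP/1.1", and the Connection header is assumed to hold a single token.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Lowercase copy of a string, so header values compare case-insensitively.
static std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// version:          "HTTP/1.0" or "HTTP/1.1" taken from the request line
// connectionHeader: value of the Connection header, empty if it was absent
bool shouldKeepAlive(const std::string& version, const std::string& connectionHeader) {
    const std::string value = toLower(connectionHeader);
    if (value == "close") return false;      // explicit request for a short connection
    if (value == "keep-alive") return true;  // explicit request for a long connection
    return version == "HTTP/1.1";            // default: long in 1.1, short in 1.0
}
```

An explicit header wins over the protocol default, which is why an HTTP/1.0 client can still ask for a long connection.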
HTTP method
HTTP status code
Differences between the 302, 303 and 307 status codes
301 Moved Permanently
Permanent redirection. This status code indicates that the requested resource has been assigned a new URI, and that the new URI should be used from now on.
302 Found
Temporary redirection. This status code indicates that the requested resource has been assigned a new URI, and that the client should use the new URI for this request only. It is similar to 301, but the move is temporary rather than permanent, so the URI of the moved resource may change again in the future. For example, if the user has saved the URI as a bookmark, the bookmark is not updated as it would be for a 301; it keeps the URI of the page that returned the 302.
303 See Other
Temporary redirection, introduced in HTTP/1.1. When a POST request receives a 303, the client follows the redirect with a GET request directly, without asking the user for confirmation.
307 Temporary Redirect
Temporary redirection. This status code has the same meaning as 302. The standard does not allow a 302 redirect to change POST into GET, but in practice this is not observed.
307 follows the standard and does not change POST into GET. However, different browsers may still behave differently when processing the response.
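The difference between these codes comes down to which method the client uses for the follow-up request. A rough sketch, assuming typical browser behaviour; redirectMethod is a hypothetical helper, and treating 301 the same way as 302 is an assumption based on common practice rather than something stated above:

```cpp
#include <string>

// Returns the method a typical client uses when following a redirect.
std::string redirectMethod(int status, const std::string& method) {
    switch (status) {
        case 303:
            // Always fetch the new URI with GET (HEAD stays HEAD).
            return method == "HEAD" ? "HEAD" : "GET";
        case 301:
        case 302:
            // In practice clients rewrite POST to GET here.
            return method == "POST" ? "GET" : method;
        case 307:
        default:
            // 307 preserves the original method.
            return method;
    }
}
```

So a POST answered with 302 or 303 is usually repeated as a GET, while a POST answered with 307 is repeated as a POST.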
Common HTTP headers
- Content-Type: the data type of the body (text/html, etc.)
- Content-Length: the length of the body in bytes
- Host: the client tells the server which host and port the requested resource is on;
- User-Agent: declares the user's operating system and browser version information;
- Referer: the page from which the current page was reached;
- Location: used with 3xx status codes to tell the client where to go next;
- Cookie: stores a small amount of information on the client. It is usually used to implement sessions
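To show where these headers sit in a request, here is a small sketch that pulls them out of raw request text. parseHeaders is a hypothetical helper written for this example; it assumes lines end in \r\n, skips the request line, and stores names exactly as written (real header names are case-insensitive, which this sketch ignores).

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Parse "Name: value" lines that follow the request line,
// stopping at the blank line that separates headers from the body.
std::map<std::string, std::string> parseHeaders(const std::string& request) {
    std::map<std::string, std::string> headers;
    std::istringstream in(request);
    std::string line;
    std::getline(in, line);  // skip the request line, e.g. "GET / HTTP/1.1"
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line.empty()) break;  // blank line: end of the header section
        const std::size_t colon = line.find(':');
        if (colon == std::string::npos) continue;  // not a header line, ignore it
        const std::string name = line.substr(0, colon);
        const std::size_t start = line.find_first_not_of(" \t", colon + 1);
        headers[name] = (start == std::string::npos) ? "" : line.substr(start);
    }
    return headers;
}
```

Only the first colon separates name from value, so a Host value such as example.com:8080 keeps its port.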
The simplest HTTP server
HttpServer.hpp
#pragma once

#include <iostream>
#include <string>
#include <unistd.h>
#include <strings.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdlib.h>
#include <signal.h>
#include <sys/stat.h>
#include <fcntl.h>
#define BACKLOG 5
using namespace std;

class HttpServer{
    private:
        int port;
        int lsock; // listening socket

    public:
        HttpServer(int _port):port(_port),lsock(-1)
        {
        }
        void InitHttpServer()
        {
            signal(SIGCHLD,SIG_IGN); // exited children are reaped automatically
            lsock = socket(AF_INET,SOCK_STREAM,0);
            if(lsock < 0)
            {
                cerr << "socket error " << endl;
                exit(2);
            }
            struct sockaddr_in local;
            bzero(&local,sizeof(local));
            local.sin_family = AF_INET;
            local.sin_port = htons(port); // host to network byte order
            local.sin_addr.s_addr = INADDR_ANY;

            if(bind(lsock,(struct sockaddr*)&local,sizeof(local)) < 0)
            {
                cerr << "bind error" << endl;
                exit(3);
            }
            if(listen(lsock,BACKLOG) < 0)
            {
                cerr << "listen error" << endl;
                exit(4);
            }
        }

        void EchoHttp(int sock)
        {
            char buffer[1024];
            // Leave one byte free for the terminating 0 written below
            ssize_t s = recv(sock,buffer,sizeof(buffer)-1,0);
            if(s > 0)
            {
                buffer[s] = 0;
                cout << buffer << endl; // print the raw request

                // Status line: version, status code, reason phrase
                string response = "HTTP/1.0 200 OK\r\n";
                response += "Content-Type: text/html\r\n";
                // response += "location:https://www.bilibili.com\r\n";
                response += "\r\n"; // blank line ends the headers

                // int fd = open("/home/mzp/Internt/4_course/web/index.html",O_RDONLY);
                // char buffer[4096];
                // if(fd > 2)
                // {
                //     cout << "fd :" << fd << endl;
                // }
                response +=
                "\
                <!DOCTYPE html>\
                <html>\
                <head>\
                <title>Cc&Fxx.html</title>\
                </head>\
                <body>\
                <h1>People who read this page</h1>\
                <h1>is a beautiful goddess</h1>\
                <h1>AND I am your father!</h1>\
                </body>\
                </html>\
                ";

                send(sock,response.c_str(),response.size(),0);
            }
            close(sock);
        }

        void Strat()
        {
            struct sockaddr_in peer;
            for(;;)
            {
                socklen_t len = sizeof(peer);
                int sock = accept(lsock,(struct sockaddr*)&peer,&len);
                if(sock < 0)
                {
                    cerr << "accept error" << endl;
                    continue;
                }

                cout << "get a link...." << endl;
                if(fork() == 0)
                {// child carries out the network service
                    close(lsock);
                    EchoHttp(sock);
                    exit(0);
                }
                close(sock); // parent no longer needs the connected socket
            }
        }

        ~HttpServer()
        {
            if(lsock != -1)
            {
                close(lsock);
            }
        }
};
HttpServer.cc
#include "HttpServer.hpp"

#include <iostream>

// Print the expected command line
void Usage(std::string proc)
{
    cout << "Usage:\n\t";
    cout << proc << " port" << endl;
}
int main(int argc,char* argv[])
{
    if(argc != 2)
    {
        Usage(argv[0]);
        exit(1);
    }
    HttpServer *hs = new HttpServer(atoi(argv[1]));
    hs->InitHttpServer();
    hs->Strat(); // accepts connections forever, never returns
}
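The response that EchoHttp assembles by hand could be factored into a small helper that also fills in Content-Length. The following is only a sketch: buildResponse and its parameters are made up for this example and are not part of the server above.

```cpp
#include <string>

// Build a complete HTTP/1.0 response: status line, two headers,
// the blank line that ends the headers, then the body.
std::string buildResponse(int status, const std::string& reason,
                          const std::string& contentType, const std::string& body) {
    std::string response = "HTTP/1.0 " + std::to_string(status) + " " + reason + "\r\n";
    response += "Content-Type: " + contentType + "\r\n";
    response += "Content-Length: " + std::to_string(body.size()) + "\r\n";
    response += "\r\n";
    response += body;
    return response;
}
```

With such a helper, EchoHttp would reduce to something like send(sock, r.c_str(), r.size(), 0) where r = buildResponse(200, "OK", "text/html", html). Sending Content-Length lets the client know where the body ends without waiting for the connection to close.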
HTTP vs HTTPS
- HTTP server port number: 80; HTTPS server port number: 443
- Has the information been tampered with in transit? Defense: message digest + digital signature
- How can the identity of the remote server be authenticated?
Port number
A port number identifies different applications that communicate on a host.
In the TCP/IP protocol suite, a communication is identified by a five-tuple of "source IP", "source port number", "destination IP", "destination port number" and "protocol number" (these can be viewed with netstat -n).
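The five-tuple can be modelled as a small struct. This is just an illustrative sketch: the type name FiveTuple and its field names are made up here, and addresses are kept as strings for readability.

```cpp
#include <cstdint>
#include <string>
#include <tuple>

// The five values that together identify one communication.
struct FiveTuple {
    std::string srcIp;
    std::uint16_t srcPort;
    std::string dstIp;
    std::uint16_t dstPort;
    std::uint8_t protocol;  // IP protocol number: 6 = TCP, 17 = UDP
};

// Two communications are the same only if all five fields match.
bool operator==(const FiveTuple& a, const FiveTuple& b) {
    return std::tie(a.srcIp, a.srcPort, a.dstIp, a.dstPort, a.protocol) ==
           std::tie(b.srcIp, b.srcPort, b.dstIp, b.dstPort, b.protocol);
}
```

This is why one browser can open several connections to the same server and port: each connection uses a different source port, so the tuples differ.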
Port number ranges
- 0 - 1023: well-known port numbers. HTTP, FTP, SSH and other widely used application-layer protocols have fixed port numbers in this range
- 1024 - 65535: port numbers dynamically assigned by the operating system. The port number of a client program is assigned by the operating system from this range
Well-known port numbers
- ssh server, using port 22
- ftp server, using port 21
- telnet server, using port 23
- http server, using port 80
- https server, using port 443
Command to view the well-known port numbers: cat /etc/services
When choosing a port number for your own program, avoid the well-known port numbers.
A process can bind multiple port numbers.
A port number cannot be bound by multiple processes.
netstat
netstat is an important tool for viewing network status
Syntax: netstat [options]
Function: View network status
Common options:
- n: do not resolve names; show addresses and ports as numbers wherever possible
- l: list only sockets that are in the listening state
- p: show the name of the program that owns each connection
- t (tcp): show only TCP-related entries
- u (udp): show only UDP-related entries
- a (all): show all entries; listening sockets are not shown by default
pidof
pidof is a convenient way to look up the process id of a server
Syntax: pidof [process name]
Function: view the process id through the process name
UDP protocol
UDP: User Datagram Protocol.
TCP: Transmission Control Protocol.
UDP segment format
- 16-bit UDP length: the total length of the whole datagram (UDP header + UDP data);
- If the checksum does not verify, the datagram is discarded directly;
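The UDP header is 8 bytes long: four 16-bit fields (source port, destination port, length, checksum) in network byte order. A minimal parsing sketch follows; the names UdpHeader and parseUdpHeader are made up for this example, and checksum verification is deliberately left out.

```cpp
#include <cstddef>
#include <cstdint>

struct UdpHeader {
    std::uint16_t srcPort;
    std::uint16_t dstPort;
    std::uint16_t length;    // UDP header + UDP data, in bytes
    std::uint16_t checksum;  // read but not verified in this sketch
};

// Read a 16-bit big-endian (network byte order) value.
static std::uint16_t readBe16(const unsigned char* p) {
    return static_cast<std::uint16_t>((p[0] << 8) | p[1]);
}

// Fill 'out' from the first 8 bytes of 'data'. Returns false if the
// buffer is too short or the length field is inconsistent with it.
bool parseUdpHeader(const unsigned char* data, std::size_t size, UdpHeader& out) {
    if (size < 8) return false;
    out.srcPort = readBe16(data);
    out.dstPort = readBe16(data + 2);
    out.length = readBe16(data + 4);
    out.checksum = readBe16(data + 6);
    return out.length >= 8 && out.length <= size;
}
```

Because the length field counts the header too, the smallest valid value is 8 (a datagram with no data).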
UDP features
The process of UDP transmission is similar to sending a letter
- Connectionless: once the IP and port number of the opposite end are known, data is sent directly without establishing a connection;
- Unreliable: there is no acknowledgment mechanism and no retransmission mechanism. If a datagram cannot reach the other party because of a network failure, the UDP protocol layer does not return any error information to the application layer;
- Datagram-oriented: the number and size of reads and writes cannot be controlled flexibly. UDP sends exactly what the application layer gives it, no more and no less, without splitting or merging.
- UDP buffer: UDP does not have a real send buffer. It adds the UDP protocol header and hands the data straight down to the next layer without doing anything else
- UDP has a receive buffer. However, this receive buffer cannot ensure that UDP messages are received in the same order in which they were sent. If the buffer is full, incoming UDP data is discarded.
- A UDP socket can both read and write, that is, it is full duplex.
Precautions for use:
The length field in the UDP header is 16 bits wide, so the maximum size of a single UDP datagram is 64K (including the UDP header).
To transmit more than 64K of data, we need to split it into packets manually at the application layer, send them one by one, and reassemble them manually at the receiving end.
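Splitting and reassembling at the application layer might look like the sketch below. Everything here is an assumption for illustration: the Chunk type, the function names, and the idea of tagging each piece with a sequence number. A real protocol would also need to detect missing pieces, which UDP will not report.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// One piece of a larger message, small enough for a single datagram.
struct Chunk {
    std::size_t seq;      // position of this piece in the original message
    std::string payload;
};

// Cut the message into pieces of at most maxPayload bytes.
std::vector<Chunk> splitMessage(const std::string& message, std::size_t maxPayload) {
    std::vector<Chunk> chunks;
    if (maxPayload == 0) return chunks;
    for (std::size_t off = 0, seq = 0; off < message.size(); off += maxPayload, ++seq) {
        chunks.push_back({seq, message.substr(off, maxPayload)});
    }
    return chunks;
}

// Rebuild the message; pieces may arrive in any order,
// so they are sorted by sequence number first.
std::string reassemble(const std::vector<Chunk>& received) {
    std::map<std::size_t, std::string> ordered;
    for (const Chunk& c : received) ordered[c.seq] = c.payload;
    std::string message;
    for (const auto& entry : ordered) message += entry.second;
    return message;
}
```

The sequence number is what lets the receiver cope with UDP delivering datagrams out of order.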
UDP-based application-layer protocols
- NFS: Network File System
- TFTP: Trivial File Transfer Protocol
- DHCP: Dynamic Host Configuration Protocol
- BOOTP: Bootstrap Protocol (used to boot diskless devices)
- DNS: Domain Name System (domain name resolution)