Suspect you are running short of PHP-FPM processes?
As PHP developers, the architecture we use most is LNMP (Linux, nginx, MySQL, PHP). Whenever traffic spikes, our response times go from a few hundred milliseconds to several seconds. At that point we usually guess at the cause: MySQL has slow SQL, Redis has large keys, or there are not enough PHP-FPM processes. Some of these can be checked through business logs. What this article sets out to demonstrate is how to confirm that the number of PHP-FPM processes is insufficient.
Reproducing the problem
Set the number of local PHP-FPM processes to 2
# vim /etc/php-fpm.d/www.conf
pm = static
pm.max_children = 2
Load-test the endpoint with ab
$ ab -c 40 -n 3000 http://127.0.0.1/group/check_groups
Server Software:        nginx/1.16.0
Server Hostname:        miner_platform.cn
Server Port:            80
Document Path:          /group/check_groups
Document Length:        44 bytes
Concurrency Level:      40
Time taken for tests:   29.384 seconds
Complete requests:      3000
Failed requests:        0
Write errors:           0
Total transferred:      699000 bytes
HTML transferred:       132000 bytes
Requests per second:    102.10 [#/sec] (mean)
Time per request:       391.788 [ms] (mean)
Time per request:       9.795 [ms] (mean, across all concurrent requests)
Transfer rate:          23.23 [Kbytes/sec] received
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.2      0       3
Processing:   306  344  80.6    318    3558
Waiting:      306  343  80.5    318    3555
Total:        307  344  80.6    318    3558
Percentage of the requests served within a certain time (ms)
  50%    318
  66%    322
  75%    333
  80%    369
  90%    428
  95%    461
  98%    508
  99%    553
 100%   3558 (longest request)
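As a rough sanity check, the ab figures above can be read through Little's law. The sketch below assumes that both configured workers were busy for the whole run (an assumption; ab does not report this) and estimates how much of the mean latency is actual service time and how much is waiting. The function name and the split itself are illustrative, not something ab or PHP-FPM provides.

```python
def estimate_service_and_wait(workers, requests_per_sec, mean_latency_ms):
    """Rough split of mean latency into service time and queue wait.

    Assumes every worker is busy for the whole test, so each worker
    completes requests_per_sec / workers requests per second.
    """
    service_ms = workers / requests_per_sec * 1000.0
    wait_ms = max(mean_latency_ms - service_ms, 0.0)
    return service_ms, wait_ms


# Numbers taken from the ab run above: pm.max_children = 2,
# 102.10 requests/s, mean total connection time 344 ms.
service_ms, wait_ms = estimate_service_and_wait(2, 102.10, 344.0)
print(f"service ~{service_ms:.1f} ms, waiting ~{wait_ms:.1f} ms")
```

Under that assumption, each request needs only about 20 ms of worker time, so most of the observed latency would be time spent waiting for a free worker.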
Diagnosing the problem
1. PHP-FPM STATUS
The median response time was 318 ms, but the slowest request took 3.558 s. How can we tell that a shortage of PHP-FPM processes is the cause? In other words, is there a way to see that PHP-FPM cannot keep up internally? For this we need to enable the built-in PHP-FPM status page. For detailed steps, see: www.cnblogs.com/tinywan/p/6848269....
$ curl http://127.0.0.1/status.php
pool:                 www
process manager:      static
start time:           29/Nov/2021:18:27:38 +0800
start since:          6493
accepted conn:        3136
listen queue:         38
max listen queue:     39
listen queue len:     128
idle processes:       0
active processes:     2
total processes:      2
max active processes: 2
max children reached: 0
slow requests:        0
See the link above for the full list of fields. Here we focus on the following parameters:
- listen queue: the number of connections currently waiting in the accept queue, that is, connections that no PHP-FPM worker has accepted yet.
- max listen queue: the highest value the listen queue has reached since PHP-FPM started (put bluntly, the peak of the listen queue described above).
- listen queue len: the maximum size of the accept queue. Readers with socket programming experience will recognize it as the backlog argument of int listen(int sockfd, int backlog);. It can be configured (listen.backlog in the pool configuration), but the effective value is also capped by system settings such as net.core.somaxconn on Linux.
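The three fields above can be checked automatically. The following is a minimal Python sketch that parses the plain-text status output and applies a simple heuristic: no idle workers while connections are waiting. The function names and the heuristic are illustrative assumptions, not part of PHP-FPM, and the sample text is abbreviated from the output shown earlier.

```python
def parse_fpm_status(text):
    """Parse the plain-text PHP-FPM status page into a dict."""
    status = {}
    for line in text.splitlines():
        # Split at the first colon only, so timestamps in values survive.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        value = value.strip()
        status[key.strip()] = int(value) if value.isdigit() else value
    return status


def looks_saturated(status):
    """Heuristic: no idle worker while connections wait in the accept queue."""
    return status.get("idle processes") == 0 and status.get("listen queue", 0) > 0


SAMPLE = """\
pool: www
process manager: static
start time: 29/Nov/2021:18:27:38 +0800
listen queue: 38
max listen queue: 39
listen queue len: 128
idle processes: 0
active processes: 2
total processes: 2
"""

status = parse_fpm_status(SAMPLE)
print(status["listen queue"], looks_saturated(status))
```

In practice the text would come from fetching the status URL rather than from a constant; a threshold relative to listen queue len may be more useful than a plain greater-than-zero check.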
2. Checking connection state with netstat
Our conclusion is that when the PHP-FPM workers cannot keep up, incoming requests wait in the accept queue. Knowing this, we do not even need the status page; netstat shows the same thing.
- The first line is the listening socket, and its Recv-Q column shows the current length of the accept queue.
$ netstat -antp | grep php-fpm
tcp  38  0  127.0.0.1:9000  0.0.0.0:*        LISTEN       97/php-fpm: master
tcp   8  0  127.0.0.1:9000  127.0.0.1:55540  ESTABLISHED  964/php-fpm: pool w
tcp   8  0  127.0.0.1:9000  127.0.0.1:55536  ESTABLISHED  965/php-fpm: pool w
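To watch this value over time, the Recv-Q of the listening socket can be extracted from the netstat output. The sketch below is one hedged way to do it in Python; it assumes the column order shown above (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State), and the function name is hypothetical.

```python
def listen_backlog(netstat_output, port):
    """Return Recv-Q of the LISTEN socket bound to `port`, or None if absent."""
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: Proto Recv-Q Send-Q Local Foreign State PID/Program
        if len(fields) < 6 or fields[5] != "LISTEN":
            continue
        if fields[3].rsplit(":", 1)[-1] == str(port):
            return int(fields[1])
    return None


SAMPLE = """\
tcp 38 0 127.0.0.1:9000 0.0.0.0:* LISTEN 97/php-fpm: master
tcp 8 0 127.0.0.1:9000 127.0.0.1:55540 ESTABLISHED 964/php-fpm: pool w
tcp 8 0 127.0.0.1:9000 127.0.0.1:55536 ESTABLISHED 965/php-fpm: pool w
"""

print(listen_backlog(SAMPLE, 9000))
```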
To sum up: when there are too few PHP-FPM processes, the accept queue holding connections from the nginx client grows longer. Is that all? No. We still need to understand why this happens.
Principle analysis
Briefly describe the working process of PHP-FPM
First, a brief look at how PHP-FPM works. The simplified PHP model below is only meant to show the overall socket workflow.
<?php
// 1. Create the listening socket
$socket = socket_create(AF_INET, SOCK_STREAM, SOL_TCP);
// 2. Bind the socket
socket_bind($socket, "0.0.0.0", 9000);
// 3. Listen; the second argument is the backlog
socket_listen($socket, 5);
for ($i = 0; $i < 2; $i++) {
    // 4. Create 2 child processes
    $pid = pcntl_fork();
    if ($pid == 0) {
        // 5. Each child blocks in accept() and handles one connection at a time
        while ($fd = socket_accept($socket)) {
            socket_getpeername($fd, $addr, $port);
            echo "client {$addr}:{$port} connected" . PHP_EOL;
            $tmp = socket_read($fd, 1024);
            echo "client data:" . $tmp . PHP_EOL;
            $data = "HTTP/1.1 200 ok\r\nContent-Length:2\r\n\r\nhi";
            socket_write($fd, $data, strlen($data));
            socket_close($fd);
        }
        exit;
    }
}
// 6. The parent waits for the child processes to exit
while (pcntl_wait($status) > 0) {
}
// Other TODO
- The master process creates the listening socket but does not handle business requests.
- Each worker process blocks synchronously in accept(), takes a connection, and then handles the business logic.
Capturing the nginx -> PHP-FPM socket traffic
Now that we know how PHP-FPM works, let's follow a single request to see how nginx and PHP-FPM interact.
$ curl http://miner_platform.cn/group/check_groups
{"code":10006,"message":"sign\u65e0\u6548."}
nginx system calls
The points worth noting are annotated inline. The trace below was captured from an nginx worker process.
$ strace -f -s 64400 -p 958
strace: Process 958 attached
epoll_wait(8, [{EPOLLIN, {u32=1226150064, u64=94773974503600}}], 512, -1) = 1
accept4(6, {sa_family=AF_INET, sin_port=htons(46616), sin_addr=inet_addr("127.0.0.1")}, [112->16], SOCK_NONBLOCK) = 3
epoll_ctl(8, EPOLL_CTL_ADD, 3, {EPOLLIN|EPOLLRDHUP|EPOLLET, {u32=1226159737, u64=94773974513273}}) = 0
epoll_wait(8, [{EPOLLIN, {u32=1226159737, u64=94773974513273}}], 512, 60000) = 1
recvfrom(3, "GET /group/check_groups HTTP/1.1\r\nUser-Agent: curl/7.29.0\r\nHost: miner_platform.cn\r\nAccept: */*\r\n\r\n", 1024, 0, NULL, NULL) = 99
stat("/data/miner_platform/src/public/group/check_groups", 0x7ffcb593d1b0) = -1 ENOENT (No such file or directory)
stat("/data/miner_platform/src/public/group/check_groups", 0x7ffcb593d1b0) = -1 ENOENT (No such file or directory)
epoll_ctl(8, EPOLL_CTL_MOD, 3, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=1226159737, u64=94773974513273}}) = 0
lstat("/data", {st_mode=S_IFDIR|0777, st_size=4096, ...}) = 0
lstat("/data/miner_platform", {st_mode=S_IFDIR|0777, st_size=4096, ...}) = 0
lstat("/data/miner_platform/src", {st_mode=S_IFDIR|0777, st_size=4096, ...}) = 0
lstat("/data/miner_platform/src/public", {st_mode=S_IFDIR|0777, st_size=4096, ...}) = 0
getsockname(3, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("127.0.0.1")}, [112->16]) = 0
// 1. Create a socket
socket(AF_INET, SOCK_STREAM, IPPROTO_IP) = 11
ioctl(11, FIONBIO, [1]) = 0
epoll_ctl(8, EPOLL_CTL_ADD, 11, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=1226163953, u64=94773974517489}}) = 0
// 2. Connect to 127.0.0.1:9000
connect(11, {sa_family=AF_INET, sin_port=htons(9000), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
epoll_wait(8, [{EPOLLOUT, {u32=1226159737, u64=94773974513273}}, {EPOLLOUT, {u32=1226163953, u64=94773974517489}}], 512, 60000) = 2
getsockopt(11, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
// 3. Write the request using the FastCGI protocol
writev(11, [{iov_base="\1\1\0\1\0\10\0\0\0\1\0\0\0\0\0\0\1\4\0\1\2!\7\0\17)SCRIPT_FILENAME/data/miner_platform/src/public/index.php\f\0QUERY_STRING\16\3REQUEST_METHODGET\f\0CONTENT_TYPE\16\0CONTENT_LENGTH\v\nSCRIPT_NAME/index.php\v\23REQUEST_URI/group/check_groups\f\nDOCUMENT_URI/index.php\r\37DOCUMENT_ROOT/data/miner_platform/src/public\17\10SERVER_PROTOCOLHTTP/1.1\16\4REQUEST_SCHEMEhttp\21\7GATEWAY_INTERFACECGI/1.1\17\fSERVER_SOFTWAREnginx/1.16.0\v\tREMOTE_ADDR127.0.0.1\v\5REMOTE_PORT46616\v\tSERVER_ADDR127.0.0.1\v\2SERVER_PORT80\v\21SERVER_NAMEminer_platform.cn\17\3REDIRECT_STATUS200\17\vHTTP_USER_AGENTcurl/7.29.0\t\21HTTP_HOSTminer_platform.cn\v\3HTTP_ACCEPT*/*\0\0\0\0\0\0\0\1\4\0\1\0\0\0\0\1\5\0\1\0\0\0\0", iov_len=592}], 1) = 592
epoll_wait(8, [{EPOLLIN|EPOLLOUT, {u32=1226163953, u64=94773974517489}}], 512, 60000) = 1
// 4. Receive the PHP-FPM response
recvfrom(11, "\1\6\0\1\0\257\1\0X-Powered-By: PHP/7.2.16\r\nCache-Control: no-cache, private\r\nDate: Wed, 01 Dec 2021 12:24:52 GMT\r\nContent-Type: application/json\r\n\r\n{\"code\":10006,\"message\":\"sign\\u65e0\\u6548.\"}\0\1\3\0\1\0\10\0\0\0\0\0\0\0\"}\0", 4096, 0, NULL, NULL) = 200
epoll_wait(8, [{EPOLLIN|EPOLLOUT|EPOLLRDHUP, {u32=1226163953, u64=94773974517489}}], 512, 60000) = 1
readv(11, [{iov_base="", iov_len=3896}], 1) = 0
// 5. Close the socket connection
close(11) = 0
// 6. Respond to the browser
writev(3, [{iov_base="HTTP/1.1 200 OK\r\nServer: nginx/1.16.0\r\nContent-Type: application/json\r\nTransfer-Encoding: chunked\r\nConnection: keep-alive\r\nX-Powered-By: PHP/7.2.16\r\nCache-Control: no-cache, private\r\nDate: Wed, 01 Dec 2021 12:24:52 GMT\r\n\r\n", iov_len=222}, {iov_base="2c\r\n", iov_len=4}, {iov_base="{\"code\":10006,\"message\":\"sign\\u65e0\\u6548.\"}", iov_len=44}, {iov_base="\r\n", iov_len=2}, {iov_base="0\r\n\r\n", iov_len=5}], 5) = 277
write(5, "127.0.0.1 - - [01/Dec/2021:20:24:52 +0800] \"GET /group/check_groups HTTP/1.1\" 200 55 \"-\" \"curl/7.29.0\" \"-\" 1.029 127.0.0.1:9000 200 1.030\n", 138) = 138
setsockopt(3, SOL_TCP, TCP_NODELAY, [1], 4) = 0
epoll_wait(8, [{EPOLLIN|EPOLLOUT|EPOLLRDHUP, {u32=1226159737, u64=94773974513273}}], 512, 65000) = 1
recvfrom(3, "", 1024, 0, NULL, NULL) = 0
close(3) = 0
epoll_wait(8,
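The writev and recvfrom payloads exchanged with port 9000 in the trace above are FastCGI records, each starting with an 8-byte header (version, type, request ID, content length, padding length, one reserved byte). The following Python sketch decodes that framing; the function names are illustrative, and it handles only the record headers, not the name-value pairs inside PARAMS.

```python
import struct

# Record types from the FastCGI specification.
FCGI_TYPES = {1: "BEGIN_REQUEST", 2: "ABORT_REQUEST", 3: "END_REQUEST",
              4: "PARAMS", 5: "STDIN", 6: "STDOUT", 7: "STDERR"}
HEADER_LEN = 8


def parse_header(buf, offset=0):
    """Decode one 8-byte FastCGI record header starting at `offset`."""
    version, rtype, request_id, content_len, padding_len = struct.unpack_from(
        "!BBHHBx", buf, offset)
    return version, FCGI_TYPES.get(rtype, str(rtype)), request_id, content_len, padding_len


def iter_records(buf):
    """Yield (type_name, request_id, content) for each record header found in buf."""
    offset = 0
    while offset + HEADER_LEN <= len(buf):
        _ver, type_name, request_id, content_len, padding_len = parse_header(buf, offset)
        start = offset + HEADER_LEN
        yield type_name, request_id, buf[start:start + content_len]
        offset = start + content_len + padding_len


# First 8 bytes of the PHP-FPM response in the trace ("\1\6\0\1\0\257\1\0").
print(parse_header(b"\x01\x06\x00\x01\x00\xaf\x01\x00"))

# Tail of the nginx request: an empty PARAMS record, then an empty STDIN record.
tail = b"\x01\x04\x00\x01\x00\x00\x00\x00" + b"\x01\x05\x00\x01\x00\x00\x00\x00"
print([(name, rid, len(body)) for name, rid, body in iter_records(tail)])
```

Decoded this way, the response header is a STDOUT record with 175 bytes of content and 1 byte of padding. Together with the 16-byte END_REQUEST record that follows it, this adds up to 8 + 175 + 1 + 16 = 200 bytes, which matches the return value of recvfrom in the trace.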
PHP-FPM system calls
The trace below was captured from a PHP-FPM worker process.
// 1. accept() returns the connection from the nginx client (127.0.0.1:45512)
965 accept(9, {sa_family=AF_INET, sin_port=htons(45512), sin_addr=inet_addr("127.0.0.1")}, [112->16]) = 4
// ... many calls omitted ...
// 2. Respond to the client
965 write(4, "\1\6\0\1\0\257\1\0X-Powered-By: PHP/7.2.16\r\nCache-Control: no-cache, private\r\nDate: Wed, 01 Dec 2021 12:37:18 GMT\r\nContent-Type: application/json\r\n\r\n{\"code\":10006,\"message\":\"sign\\u65e0\\u6548.\"}\0\1\3\0\1\0\10\0\0\0\0\0\0\0p\0\0", 200) = 200
// 3. Shut down the write side; no more data will be written to this socket
965 shutdown(4, SHUT_WR) = 0
// 4. Read the remaining data from the nginx client (127.0.0.1:45512): an empty FastCGI record of type 5 (FCGI_STDIN)
965 recvfrom(4, "\1\5\0\1\0\0\0\0", 8, 0, NULL, NULL) = 8
// 5. Read again; 0 bytes means the nginx client (127.0.0.1:45512) has closed its side
965 recvfrom(4, "", 8, 0, NULL, NULL) = 0
// 6. Close this connection
965 close(4) = 0
965 lstat("/data/miner_platform/src/vendor/composer/../../app/Http/Middleware/BusinessHeaderCheck.php", {st_mode=S_IFREG|0777, st_size=989, ...}) = 0
965 stat("/data/miner_platform/src/app/Http/Middleware/BusinessHeaderCheck.php", {st_mode=S_IFREG|0777, st_size=989, ...}) = 0
965 chdir("/") = 0
965 times({tms_utime=3583, tms_stime=1977, tms_cutime=0, tms_cstime=0}) = 4315309933
965 setitimer(ITIMER_PROF, {it_interval={tv_sec=0, tv_usec=0}, it_value={tv_sec=0, tv_usec=0}}, NULL) = 0
965 fcntl(3, F_SETLK, {l_type=F_UNLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = 0
965 setitimer(ITIMER_PROF, {it_interval={tv_sec=0, tv_usec=0}, it_value={tv_sec=0, tv_usec=0}}, NULL) = 0
965 accept(9,
TCP three-way handshake
We now understand the request flow, and it is the same under high concurrency. The following figure shows the same flow but details the three-way handshake. This is where the SYN queue and the accept queue come in.
- When listen is called (here by the PHP-FPM master process), the kernel creates two queues: the SYN queue and the accept queue.
- Step 2: when the server (the kernel, acting for the listening socket owned by the PHP-FPM master process) sends the SYN+ACK segment, the half-open connection is placed in the SYN queue.
- When the three-way handshake completes, the connection moves to the accept queue, where it waits in the ESTABLISHED state until the application (here a PHP-FPM worker process) calls accept(). Each accept() call removes the connection at the head of the queue; if the queue is empty, accept() normally blocks. The accept queue is also called the fully connected queue.
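The behaviour described above can be observed without PHP-FPM. The following Python sketch opens a listening socket on the loopback interface, connects a few clients before the server ever calls accept(), and only then drains the queue. The function name, the port choice and the client count are assumptions made for the demonstration.

```python
import socket


def demo_accept_queue(n_clients=3, backlog=5):
    """Connect clients before accept() is called, then drain the accept queue."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))   # port 0: let the kernel pick a free port
    server.listen(backlog)          # backlog bounds the accept queue
    server.settimeout(2)            # avoid blocking forever if something goes wrong
    port = server.getsockname()[1]

    # connect() succeeds once the kernel finishes the handshake,
    # even though the application has not called accept() yet.
    clients = [socket.create_connection(("127.0.0.1", port), timeout=2)
               for _ in range(n_clients)]
    client_ports = [c.getsockname()[1] for c in clients]

    # Each accept() takes one established connection out of the queue.
    accepted = [server.accept() for _ in range(n_clients)]
    accepted_ports = [addr[1] for _conn, addr in accepted]

    for conn, _addr in accepted:
        conn.close()
    for c in clients:
        c.close()
    server.close()
    return client_ports, accepted_ports


client_ports, accepted_ports = demo_accept_queue()
print(len(accepted_ports), sorted(client_ports) == sorted(accepted_ports))
```

While the clients are connected but not yet accepted, running netstat against the chosen port should show a non-zero Recv-Q on the listening socket, just as in the PHP-FPM case.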
Conclusion
After the analysis above, we know what the SYN queue and the accept queue are. The kernel, the accept queue, and the application form a producer-consumer model: the kernel is the producer, the accept queue holds the pending connections, and the application is the consumer. Anyone who has worked with queues knows that under high concurrency, or when consumers are slow, items pile up in the queue and connections are handled more and more slowly. The usual remedy is therefore to add consumers and increase the consumption rate. This matches the behaviour we observed above.
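To put rough numbers on "add consumers", the sketch below projects mean latency for different worker counts with a simple closed-loop model. It assumes the per-request service time derived from the ab run (2 workers at 102.10 requests/s) stays constant as workers are added, which holds only if CPU, MySQL and Redis are not the bottleneck; the function name and the model are illustrative.

```python
def projected_latency_ms(concurrency, workers, service_ms):
    """Closed-loop estimate: clients beyond the worker count wait in the accept queue."""
    requests_per_worker = max(concurrency, workers) / workers
    return requests_per_worker * service_ms


# Assumed service time: both workers saturated at 102.10 requests/s in total.
service_ms = 2 / 102.10 * 1000

for workers in (2, 4, 8, 40):
    latency = projected_latency_ms(40, workers, service_ms)
    print(f"{workers:>2} workers -> ~{latency:.0f} ms")
```

With 2 workers the model reproduces the roughly 392 ms mean time per request that ab reported at concurrency 40, and the estimate falls in proportion as workers are added, until the worker count reaches the concurrency level.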