MySQL abort alarm case analysis

Yesterday, I frequently received the alarm email about Aborted from MySQL instances. I saw the instance information of the alarm email and the test instance. The priority is not so high. I'll take time to see it later. Maybe it'll be better by then. I continue to paddle with a fluke mentality. However, I received more than 50 alarm emails within an hour. I really can't stand it, Prepare to put down the matter at hand and deal with the alarm problem first; The following is a screenshot of the alarm message:

From the alarm, it is due to the MySQL instance state variable aborted_ It is caused by the continuous increase of connections. Under normal circumstances, the variable value should remain 0, but now it is accumulating. Under what circumstances will the state value continue to increase? This is the first thing that needs to be clarified to solve the problem;

Let's first look at the specific phenomena:

Aborted_ The value of connections is increasing, and there are a lot of time at the system level_ TCP connections in wait state are from the same IP address, as shown below:

# netstat -tuplan | grep 4306 | grep TIME_WAIT  | wc -l
1159
 
The following is the part tcp connect
......
tcp        0      0 192.168.7.xxx:4306      10.21.1.28:7586      TIME_WAIT   -                   
tcp        0      0 192.168.7.xxx:4306      10.21.1.28:5198      TIME_WAIT   -                   
tcp        0      0 192.168.7.xxx:4306      10.21.1.28:8314      TIME_WAIT   -                   
tcp        0      0 192.168.7.xxx:4306      10.21.1.28:5076      TIME_WAIT   -                   
tcp        0      0 192.168.7.xxx:4306      10.21.1.28:7610      TIME_WAIT   -                   
tcp        0      0 192.168.7.xxx:4306      10.21.1.28:7962      TIME_WAIT   -                   
tcp        0      0 192.168.7.xxx:4306      10.21.1.28:6546      TIME_WAIT   -                   
tcp        0      0 192.168.7.xxx:4306      10.21.1.28:5354      TIME_WAIT   -                   
tcp        0      0 192.168.7.xxx:4306      10.21.1.28:6596      TIME_WAIT   -                   
tcp        0      0 192.168.7.xxx:4306      10.21.1.28:8342      TIME_WAIT   -                   
tcp        0      0 192.168.7.xxx:4306      10.21.1.28:7184      TIME_WAIT   -                   
tcp        0      0 192.168.7.xxx:4306      10.21.1.28:7476      TIME_WAIT   -                   
tcp        0      0 192.168.7.xxx:4306      10.21.1.28:5994      TIME_WAIT   -                   
tcp        0      0 192.168.7.xxx:4306      10.21.1.28:7322      TIME_WAIT   -                   
tcp        0      0 192.168.7.xxx:4306      10.21.1.28:5340      TIME_WAIT   -                   
tcp        0      0 192.168.7.xxx:4306      10.21.1.28:6798      TIME_WAIT   -

At the instance level, you can see that aborted_ The value of connections keeps increasing

mysql> show status like '%aborted%';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| Aborted_clients  | 4356  |
| Aborted_connects | 112524|
+------------------+-------+
2 rows in set (0.00 sec)

Then, first look at the error log to see if there are relevant records in the error log, but no error information is found when looking at the error log; I feel very confused;

Next, let's take a look at the situation_ Will the values of connections continue to increase?

The following is the explanation of the state quantity in the official document and the possible reasons for the continuous increase of the variable:

Aborted_connects

The number of failed attempts to connect to the MySQL server.

If a client is unable even to connect, the server increments the Aborted_connects status variable. Unsuccessful connection attempts can occur for the following reasons:

  • A client attempts to access a database but has no privileges for it.
  • A client uses an incorrect password.
  • A connection packet does not contain the right information.
  • It takes more than connect_timeout seconds to obtain a connect packet.

As can be seen from the above, the above four situations will lead to the constant increase of the variable value:

  • The client tried to connect to the database but did not have permission
  • The client used the wrong password
  • The connected packet contains incorrect information
  • This parameter controls connect when the connection time limit is exceeded_ The default timeout for mysql is 10s. Generally, it will not timeout unless the network environment is extremely bad

However, no useful information is found in the error log. Next, open general_log to see the operation information related to the database. After opening, a new continent, general, is found_ The following logs are recorded in the log:

2021-11-18T09:16:19.234261Z      8328 Connect   dev_test_rwh@10.21.1.28 on game_report_sg using SSL/TLS
2021-11-18T09:16:19.255753Z      8329 Connect   dev_test_rwh@10.21.1.28 on game_report_sg using SSL/TLS
2021-11-18T09:16:19.352826Z      8330 Connect   dev_test_rwh@10.21.1.28 on game_report_sg using SSL/TLS
2021-11-18T09:16:19.452269Z      8331 Connect   dev_test_rwh@10.21.1.28 on game_report_sg using SSL/TLS

From the log information, it can be found that the connection is initiated from the client 10.21.1.28 through dev_test_rwh account connection database game_report_sg, but it can be found that there is no database game for this account_ report_ SG's access permission causes an error. Under normal circumstances, the client without access permission should have an error message, so the connection business personnel check the relevant connection and finally solve the problem;

Added by aniket_dj on Tue, 23 Nov 2021 10:31:03 +0200