How to locate the problem of DDL blocking in MySQL

I often encounter the development and testing of children's shoes and ask if a DDL has been executed in the offline development and testing environment. I find that it has not been executed for a long time. Is it blocked? How to solve it?

Including in the group, we often encounter similar problems: DDL is blocked. How to find the SQL blocking it?

In fact, how to solve the problem of DDL blocking is a common and high-frequency problem in MySQL.

Here is a clear and ready to use solution to this problem:

  1. How to judge whether a DDL is blocked?
  2. When DDL is blocked, how to find the session blocking it?

 

How to judge whether a DDL is blocked?

First, look at a simple Demo

session1> create table sbtest.t1(id int primary key,name varchar(10));
Query OK, 0 rows affected (0.02 sec)

session1> insert into sbtest.t1 values(1,'a');
Query OK, 1 row affected (0.01 sec)

session1> begin;
Query OK, 0 rows affected (0.00 sec)

session1> select * from sbtest.t1;
+----+------+
| id | name |
+----+------+
|  1 | a    |
+----+------+
1 row in set (0.00 sec)

session2> alter table sbtest.t1 add c1 datetime;
Blocking...

session3> show processlist;
+----+-----------------+-----------+------+---------+-------+---------------------------------+---------------------------------------+
| Id | User            | Host      | db   | Command | Time  | State                           | Info                                  |
+----+-----------------+-----------+------+---------+-------+---------------------------------+---------------------------------------+
|  5 | event_scheduler | localhost | NULL | Daemon  | 47628 | Waiting on empty queue          | NULL                                  |
| 24 | root            | localhost | NULL | Sleep   |    11 |                                 | NULL                                  |
| 25 | root            | localhost | NULL | Query   |     5 | Waiting for table metadata lock | alter table sbtest.t1 add c1 datetime |
| 26 | root            | localhost | NULL | Query   |     0 | init                            | show processlist                      |
+----+-----------------+-----------+------+---------+-------+---------------------------------+---------------------------------------+
4 rows in set (0.00 sec)

To judge whether a DDL is blocked, simply execute show processlist to view the corresponding status of the DDL operation.

If Waiting for table metadata lock is displayed, it means that the DDL is blocked.

Once the DDL is blocked, all subsequent operations on the table will be blocked and Waiting for table metadata lock will be displayed. This is also the reason why DDL turns pale.

In a similar scenario, either Kill the DDL operation or Kill blocks the DDL session.

Kill DDL operation is a method to treat symptoms rather than root causes. After all, DDL operations always have to be executed.

In addition, for DDL operations, there are two stages to obtain metabase locks: at the beginning of DDL and before the end of DDL. If it is the latter, it means that previous operations have to be rolled back, and the cost is relatively high.

Therefore, in similar scenarios, we usually Kill to block DDL sessions.

So, how do you know which sessions are blocking DDL?

Let's take a look at the specific positioning method.

 

Positioning method

Method 1: sys schema_ table_ lock_ waits

sys. schema_ table_ lock_ Wait is introduced in MySQL 5.7 to locate the problem of DDL blocking.

For the above Demo.

Let's look at sys schema_ table_ lock_ Output of waits.

mysql> select * from sys.schema_table_lock_waits\G
*************************** 1. row ***************************
               object_schema: sbtest
                 object_name: t1
           waiting_thread_id: 62
                 waiting_pid: 25
             waiting_account: root@localhost
           waiting_lock_type: EXCLUSIVE
       waiting_lock_duration: TRANSACTION
               waiting_query: alter table sbtest.t1 add c1 datetime
          waiting_query_secs: 17
 waiting_query_rows_affected: 0
 waiting_query_rows_examined: 0
          blocking_thread_id: 61
                blocking_pid: 24
            blocking_account: root@localhost
          blocking_lock_type: SHARED_READ
      blocking_lock_duration: TRANSACTION
     sql_kill_blocking_query: KILL QUERY 24
sql_kill_blocking_connection: KILL 24
*************************** 2. row ***************************
               object_schema: sbtest
                 object_name: t1
           waiting_thread_id: 62
                 waiting_pid: 25
             waiting_account: root@localhost
           waiting_lock_type: EXCLUSIVE
       waiting_lock_duration: TRANSACTION
               waiting_query: alter table sbtest.t1 add c1 datetime
          waiting_query_secs: 17
 waiting_query_rows_affected: 0
 waiting_query_rows_examined: 0
          blocking_thread_id: 62
                blocking_pid: 25
            blocking_account: root@localhost
          blocking_lock_type: SHARED_UPGRADABLE
      blocking_lock_duration: TRANSACTION
     sql_kill_blocking_query: KILL QUERY 25
sql_kill_blocking_connection: KILL 25
2 rows in set (0.00 sec)

There is only one alter operation, but two records are generated, and the Kill objects of the two records are different. One of the Kill objects is the alter operation itself.

If you are not familiar with the table structure or do not read the record carefully, you will inevitably Kill the wrong object.

Moreover, after the DDL operation is blocked, if N subsequent queries are blocked by the DDL operation, N*2 records will be generated.

When locating the problem, the N*2 records are completely noise.

At this time, we need to filter the above records.

The key to filtering is blocking_lock_type is not equal to SHARED_UPGRADABLE.

SHARED_ Upgradeable is an upgradeable shared metadata lock. During locking, concurrent queries and updates are allowed. It is commonly used in the first stage of DDL operation.

Therefore, shared will not block DDL_ UPGRADABLE.

Therefore, for the above case, we can accurately locate the session that needs Kill through the following query.

SELECT sql_kill_blocking_connection
FROM sys.schema_table_lock_waits
WHERE blocking_lock_type <> 'SHARED_UPGRADABLE'
 AND waiting_query = 'alter table sbtest.t1 add c1 datetime';

 

Method 2: kill the session before DDL

sys. schema_ table_ lock_ Wait was introduced in MySQL 5.7.

However, in the actual production environment, MySQL 5.6 still occupies a considerable share.

How to solve this pain point of MySQL 5.6?

After careful investigation, there are no more than two types of operations that lead to DDL blocking:

  1. There are slow queries on the table that have not ended.

  2. There are uncommitted transactions on the table.

Among them, the first category is easy to locate, which can be found through show processlist.

The second type is difficult to locate only based on the show processlist, because the status of uncommitted connections in the show processlist is the same as that of idle connections, which are Sleep.

Therefore, it is reasonable to say that there is a Kill idle connection on the Internet, but it is too simple and rough to do so, which will inevitably be killed by mistake.

In fact, since it is a transaction, it is in information_ schema. innodb_ There must be records in TRX, such as the transactions in session1. The records in the table are as follows,

mysql> select * from information_schema.innodb_trx\G
*************************** 1. row ***************************
                    trx_id: 421568246406360
                 trx_state: RUNNING
               trx_started: 2022-01-02 08:53:50
     trx_requested_lock_id: NULL
          trx_wait_started: NULL
                trx_weight: 0
       trx_mysql_thread_id: 24
                 trx_query: NULL
       trx_operation_state: NULL
         trx_tables_in_use: 0
         trx_tables_locked: 0
          trx_lock_structs: 0
     trx_lock_memory_bytes: 1128
           trx_rows_locked: 0
         trx_rows_modified: 0
   trx_concurrency_tickets: 0
       trx_isolation_level: REPEATABLE READ
         trx_unique_checks: 1
    trx_foreign_key_checks: 1
trx_last_foreign_key_error: NULL
 trx_adaptive_hash_latched: 0
 trx_adaptive_hash_timeout: 0
          trx_is_read_only: 0
trx_autocommit_non_locking: 0
       trx_schedule_weight: NULL
1 row in set (0.00 sec)

Where trx_mysql_thread_id is the thread id, combined with information_schema.processlist to further narrow the scope.

Therefore, we can locate transactions whose execution time is earlier than DDL through the following SQL.

SELECT concat('kill ', i.trx_mysql_thread_id, ';')
FROM information_schema.innodb_trx i, (
    SELECT MAX(time) AS max_time
    FROM information_schema.processlist
    WHERE state = 'Waiting for table metadata lock'
      AND (info LIKE 'alter%'
      OR info LIKE 'create%'
      OR info LIKE 'drop%'
      OR info LIKE 'truncate%'
      OR info LIKE 'rename%'
  )) p
WHERE timestampdiff(second, i.trx_started, now()) > p.max_time;

Fortunately, the currently executing query will also be displayed in information_ schema. innodb_ In TRX.

Therefore, the above SQL is also applicable to the scenario where the slow query is not finished.

 

Using sys. In MySQL 5.7 schema_ table_ lock_ Precautions for waits

sys. schema_ table_ lock_ The waits view relies on an MDL related table - performance_schema.metadata_locks.

This table is introduced in MySQL 5.7 and displays MDL related information, including action object, lock type and lock status.

However, in MySQL 5.7, the table is empty by default, because the related instrument is not enabled by default. MySQL 8.0 is enabled by default.

mysql> select * from performance_schema.setup_instruments where name='wait/lock/metadata/sql/mdl';
+----------------------------+---------+-------+
| NAME                       | ENABLED | TIMED |
+----------------------------+---------+-------+
| wait/lock/metadata/sql/mdl | NO      | NO    |
+----------------------------+---------+-------+
1 row in set (0.00 sec)

Therefore, in MySQL 5.7, if we want to use sys schema_ table_ lock_ Wait, you must first open the MDL related instrument.

The opening method is very simple. You can directly modify performance_schema.setup_instruments table.

The details are as follows.

UPDATE performance_schema.setup_instruments SET ENABLED = 'YES', TIMED = 'YES'
WHERE NAME = 'wait/lock/metadata/sql/mdl';

However, this method takes effect temporarily. After the instance is restarted, it will be restored to the default value.

It is recommended to modify the configuration file synchronously.

[mysqld]
performance-schema-instrument='wait/lock/metadata/sql/mdl=ON'

 

summary

1. Execute show processlist. If the DDL status is Waiting for table metadata lock, it means that the DDL is blocked.

2. There are two common methods to locate the session that causes DDL blocking:

2.1 sys.schema_table_lock_waits

SELECT sql_kill_blocking_connection
FROM sys.schema_table_lock_waits
WHERE blocking_lock_type <> 'SHARED_UPGRADABLE'
  AND (waiting_query LIKE 'alter%'
  OR waiting_query LIKE 'create%'
  OR waiting_query LIKE 'drop%'
  OR waiting_query LIKE 'truncate%'
  OR waiting_query LIKE 'rename%');

This method is applicable to MySQL 5.7 and 8.0.

Note that in MySQL 5.7, MDL related instrument s are not opened by default.

2.2 session before kill DDL

SELECT concat('kill ', i.trx_mysql_thread_id, ';')
FROM information_schema.innodb_trx i, (
    SELECT MAX(time) AS max_time
    FROM information_schema.processlist
    WHERE state = 'Waiting for table metadata lock'
      AND (info LIKE 'alter%'
      OR info LIKE 'create%'
      OR info LIKE 'drop%'
      OR info LIKE 'truncate%'
      OR info LIKE 'rename%'
  )) p
WHERE timestampdiff(second, i.trx_started, now()) > p.max_time;

If the MDL related instrument in MySQL 5.7 is not opened or in MySQL 5.6, you can use this method.

Keywords: MySQL

Added by gewthen on Tue, 11 Jan 2022 08:02:40 +0200