Experience from optimizing a slow UPDATE SQL

Recently, an online ETL data-archiving job ran into trouble: one UPDATE statement had been running for two days without finishing:

 update t_order_record set archive_id = '420a7fe7-4767-45e8-a5f5-72280c192faa', update_time = update_time  where order_id in (select order_id from t_retailer_order_record force index (idx_archive_id) where archive_id = '420a7fe7-4767-45e8-a5f5-72280c192faa')

What this SQL does is take the order_id of every record in t_retailer_order_record whose archive_id is 420a7fe7-4767-45e8-a5f5-72280c192faa, and set archive_id to the same value on the corresponding records in the order table, while keeping update_time unchanged (the table has an on-update trigger that would otherwise set update_time to the current time).
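The `update_time = update_time` assignment only makes sense given how the column is declared. The article does not show the DDL; a hypothetical definition that would produce this behavior (MySQL often implements such "triggers" with ON UPDATE CURRENT_TIMESTAMP rather than an actual trigger) might look like:

```sql
-- Hypothetical column definition, not taken from the article.
-- With ON UPDATE CURRENT_TIMESTAMP, any UPDATE that omits update_time
-- silently bumps it to now(); assigning the column to itself preserves
-- the old value.
ALTER TABLE t_order_record
  MODIFY update_time TIMESTAMP NOT NULL
  DEFAULT CURRENT_TIMESTAMP
  ON UPDATE CURRENT_TIMESTAMP;
```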

To analyze a slow SQL, we have three main tools:

  1. EXPLAIN: a lightweight analysis that does not actually execute the SQL. The result may not be fully accurate or detailed, but it can surface key problems.
  2. PROFILING: enabled with set profiling = 1, it samples execution to show which stages the SQL goes through and how long each stage takes. The SQL must execute, and execute successfully; the stages are coarse, so in practice you optimize by checking whether certain stages appear and how to avoid them (for example, avoiding in-memory sorting).
  3. OPTIMIZER TRACE: shows every step of the optimizer in detail. The SQL must execute, and execute successfully. The MySQL optimizer weighs many factors across many iterations and its configuration is complex; the defaults work in most cases, but some corner cases go wrong and require human intervention.
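As a quick reference, the three tools above are invoked roughly like this (session-level commands; the statement being analyzed is elided):

```sql
-- 1. EXPLAIN: prefix the statement; it is not actually executed.
-- EXPLAIN <your UPDATE/SELECT here>;

-- 2. PROFILING: the statement must run to completion first.
SET profiling = 1;
-- <run the SQL>
SHOW PROFILES;                 -- recent statements with total duration
SHOW PROFILE FOR QUERY 1;      -- per-stage breakdown for statement #1

-- 3. OPTIMIZER TRACE: the statement must also run to completion.
SET SESSION optimizer_trace = 'enabled=on';
-- <run the SQL>
SELECT trace FROM information_schema.OPTIMIZER_TRACE;
```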

First, we EXPLAIN this SQL:

+----+--------------------+-------------------------+------------+-------+----------------+----------------+---------+-------+-----------+----------+-------------+
| id | select_type        | table                   | partitions | type  | possible_keys  | key            | key_len | ref   | rows      | filtered | Extra       |
+----+--------------------+-------------------------+------------+-------+----------------+----------------+---------+-------+-----------+----------+-------------+
|  1 | UPDATE             | t_order_record          | NULL       | index | NULL           | PRIMARY        | 8       | NULL  | 668618156 |   100.00 | Using where |
|  2 | DEPENDENT SUBQUERY | t_retailer_order_record | NULL       | ref   | idx_archive_id | idx_archive_id | 195     | const |         1 |    10.00 | Using where |
+----+--------------------+-------------------------+------------+-------+----------------+----------------+---------+-------+-----------+----------+-------------+

We find that the index usage on t_order_record is problematic, which is strange:

  1. t_order_record has an index on order_id, yet the plan does a full scan over the primary key (the primary key is id, not order_id).
  2. The subquery actually matches only a bit over 30,000 rows.

A plan like this is almost always the optimizer's doing.

The SQL optimizer is not entirely to blame

In day-to-day development and table design it is hard to avoid some unreasonable usage: too many indexes, overly wide rows, and so on. In such odd situations it is genuinely difficult for the SQL optimizer to find the best plan. A simple example: suppose a table has primary key id, with a record id = 1, and a year later a record id = 1000000. If we update the records id = 1 and id = 1000000 at the same time, a query that could reach these two rows through a secondary index may instead look them up through the primary key, because both rows were updated recently, so they sit in the same page and are already in memory.

The SQL optimizer accounts for many such complex situations and in most cases adapts the plan well. But because of this complexity, it sometimes does poorly on simple cases, and we then have to optimize manually based on the OPTIMIZER TRACE output.

Use OPTIMIZER TRACE on a test database: first check whether anything goes wrong in the steps before index analysis

OPTIMIZER TRACE requires the SQL to actually execute, and ours cannot finish online, but it traces every step of the optimizer, so we can run it in a test environment with a small amount of data. The statistics-driven steps (for example, analyzing index cardinality to decide which index to use) cannot be reproduced there: the test data inevitably differs from production, and even copying the production data would not help, because which pages the rows sit on, how the indexes have been updated, and the file structure all differ, so the statistics will never match exactly. But the SQL rewrite and transformation steps that happen before statistical analysis can be checked in the test environment.

Execution:

mysql> set session optimizer_trace="enabled=on";
Query OK, 0 rows affected (0.20 sec)

mysql>  update t_order_record set archive_id = '420a7fe7-4767-45e8-a5f5-72280c192faa', update_time = update_time  where order_id in (select order_id from t_retailer_order_record force index (idx_archive_id) where archive_id = '420a7fe7-4767-45e8-a5f5-72280c192faa');
Query OK, 0 rows affected (2.95 sec)
Rows matched: 0  Changed: 0  Warnings: 0

mysql> SELECT trace FROM information_schema.OPTIMIZER_TRACE;

"steps": [
    {
      "join_preparation": {
        "select#": 2,
        "steps": [
          {
            "expanded_query": "/* select#2 */ select `main`.`t_retailer_order_record`.`order_id` from `main`.`t_retailer_order_record` FORCE INDEX (`idx_archive_id`) where (`main`.`t_retailer_order_record`.`archive_id` = '420a7fe7-4767-45e8-a5f5-72280c192faa')"
          },
          {
            "transformation": {
              "select#": 2,
              "from": "IN (SELECT)",
              "to": "semijoin",
              "chosen": false
            }
          },
          {
            "transformation": {
              "select#": 2,
              "from": "IN (SELECT)",
              "to": "EXISTS (CORRELATED SELECT)",
              "chosen": true,
              "evaluating_constant_where_conditions": [
              ]
            }
          }
        ]
      }
    },
    {
      "substitute_generated_columns": {
      }
    },
    {
      "condition_processing": {
        "condition": "WHERE", 
        ## Omitted below

Through OPTIMIZER TRACE, we found the problem: the optimizer rewrote the IN into EXISTS. With IN, what we wanted was for each record of the subquery to drive a lookup into the outer order table; instead, MySQL now traverses every record of the outer order table and checks whether it EXISTS in the subquery. This explains why EXPLAIN shows the order table being scanned record by record through the primary key.
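In effect, the optimizer treats the statement as if it had been written with a correlated EXISTS, roughly like this (a paraphrase for illustration, not the optimizer's literal output):

```sql
UPDATE t_order_record
SET archive_id  = '420a7fe7-4767-45e8-a5f5-72280c192faa',
    update_time = update_time
WHERE EXISTS (
    SELECT 1
    FROM t_retailer_order_record FORCE INDEX (idx_archive_id)
    WHERE archive_id = '420a7fe7-4767-45e8-a5f5-72280c192faa'
      AND order_id = t_order_record.order_id  -- correlated: re-evaluated for every outer row
);
```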

If you want to change this, the only option is to rewrite the SQL to suit the optimizer; this transformation cannot be turned off with an optimizer option

So we rewrote the SQL using JOIN, the form closest to what the optimizer understands most readily. We also added a time condition (we only want to touch data older than 179 days, and the data for this archive_id is all older than that). Since the order ID itself encodes the time (it starts with a timestamp; for example, 211211094621ord123421 is an order placed at 09:46:21 on 2021-12-11), we bound the time through the order ID:

UPDATE t_order_record
JOIN t_retailer_order_record ON t_order_record.order_id = t_retailer_order_record.order_id
SET t_order_record.archive_id = '420a7fe7-4767-45e8-a5f5-72280c192faa',
    t_order_record.update_time = t_order_record.update_time
WHERE
    t_order_record.order_id < DATE_FORMAT( now() - INTERVAL 179 DAY, '%y%m%d' )
    AND t_retailer_order_record.order_id < DATE_FORMAT( now() - INTERVAL 179 DAY, '%y%m%d' )
    AND t_retailer_order_record.archive_id = '420a7fe7-4767-45e8-a5f5-72280c192faa';
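The comparison between order_id and a date works because the order_id prefix and the DATE_FORMAT output share the same yymmdd layout, so an ordinary string comparison acts as a time filter:

```sql
-- '%y%m%d' yields a 6-character string such as '210625';
-- an order_id like '211211094621ord123421' compares against it
-- lexicographically, so ids starting with an earlier date sort lower.
SELECT DATE_FORMAT( now() - INTERVAL 179 DAY, '%y%m%d' );
```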

Lessons for subsequent optimizations

If you hit a case where execution is slow even though the update touches very little data and all the needed indexes exist, run the SQL with OPTIMIZER TRACE in a test environment with a small amount of data, and check whether anything goes wrong in the SQL rewrite and transformation steps that happen before statistical analysis. (The statistics-driven steps, such as analyzing index cardinality to choose an index, cannot be reproduced in a test environment: the data, page layout, index state, and file structure inevitably differ from production, so the statistics will never match exactly.)

If there is a problem there, consider human intervention and manual optimization. Manual optimization methods include:

  1. force index to force a specific index
  2. Turning off certain MySQL optimizer switches for the current session
  3. Rewriting the SQL into a form the optimizer understands more easily (JOIN is what the optimizer understands best)
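For item 2, a session-level sketch (which switches to flip depends on what the trace shows; semijoin and materialization are examples of real optimizer_switch flags):

```sql
-- Inspect the current flags, then disable a transformation
-- for this session only, leaving other connections untouched.
SELECT @@optimizer_switch;
SET SESSION optimizer_switch = 'semijoin=off,materialization=off';
```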

Keywords: MySQL

Added by Jim from Oakland on Tue, 21 Dec 2021 18:10:28 +0200