MySQL Case -- Optimizing OR Statement Trample Recording

Problem Description

R&D colleagues feedback that an application is slow to execute, corresponding to SQL:

UPDATE bs_serial_trac 
SET SERIAL_IS_LOCK = 0,
LOCK_VALUE = '',
UPDATE_USER = 'transSys' 
WHERE GOODS_NO = '4418095740626' 
AND (
PARENT_CODE = 'F9G7S19722001835' 
OR SERIAL = 'F9G7S19722001835'
);

The index on table bs_serial_trac is as follows:

PRIMARY KEY (`ID`),
UNIQUE KEY `idx_complex_serial_goodsNo` (`SERIAL`,`GOODS_NO`),
KEY `idx_update_time` (`UPDATE_TIME`),
KEY `idx_serial_goodsNo` (`GOODS_NO`),
KEY `idx_parent_code` (`PARENT_CODE`),
KEY `idx_lock_value` (`LOCK_VALUE`)

Because of the OR condition, the query can only search data based on the condition GOODS_NO='4418095740626'. Its execution plan is as follows:

*************************** 1. row ***************************
           id: 1
  select_type: UPDATE
        table: bs_serial_trac
   partitions: NULL
         type: range
possible_keys: idx_serial_goodsNo
          key: idx_serial_goodsNo
      key_len: 93
          ref: const
         rows: 404920
     filtered: 100.00
        Extra: Using where
1 row in set (0.00 sec)

Due to the poor selectivity of GOODS_NO columns and the large number of records satisfying the conditions, the query performance is poor.

SELECT COUNT(1)
FROM bs_serial_trac
WHERE GOODS_NO = '4418095740626';

+----------+
| COUNT(1) |
+----------+
|   215447 |
+----------+

 

Solution steps

Obviously, the problem is that the index cannot be used properly because of OR statement. The OR is adjusted to UNION ALL and the adjusted SQL statement is as follows:

SELECT *
FROM bs_serial_trac 
WHERE GOODS_NO = '4418095740626' 
AND PARENT_CODE = 'F9G7S19722007485' 
UNION ALL
SELECT *
FROM bs_serial_trac 
WHERE GOODS_NO = '4418095740626' 
AND SERIAL = 'F9G7S19722007485'
AND PARENT_CODE <> 'F9G7S19722007485'

Query speed is very fast, from the previous 800ms optimization to less than 10ms, it looks perfect, but the query results have no data...

Under simple positioning, it is found that the PARENT_CODE column type is varchar(50) DEFAULT'', and the PARENT_CODE column value is NULL. There are some problems in judging whether the PARENT_CODE column is not equal to NULL.

SELECT 
NOT(PARENT_CODE <=> 'F9G7S19722007485') AS C1,
NOT(PARENT_CODE = 'F9G7S19722007485') AS C2,
PARENT_CODE <> 'F9G7S19722007485' AS C3
FROM bs_serial_trac 
WHERE GOODS_NO = '4418095740626' 
AND SERIAL = 'F9G7S19722007485' ;

+----+------+------+
| C1 | C2   | C3   |
+----+------+------+
|  1 | NULL | NULL |
+----+------+------+

Therefore, when you rewrite a UNION ALL statement, you need to rewrite it as follows:

SELECT *
FROM bs_serial_trac 
WHERE GOODS_NO = '4418095740626' 
AND PARENT_CODE = 'F9G7S19722007485' 
UNION ALL
SELECT *
FROM bs_serial_trac 
WHERE GOODS_NO = '4418095740626' 
AND SERIAL = 'F9G7S19722007485'
AND NOT(PARENT_CODE <=> 'F9G7S19722007485')

Or rewrite it as:

SELECT *
FROM bs_serial_trac 
WHERE GOODS_NO = '4418095740626' 
AND PARENT_CODE = 'F9G7S19722007485' 
UNION ALL
SELECT *
FROM bs_serial_trac 
WHERE GOODS_NO = '4418095740626' 
AND SERIAL = 'F9G7S19722007485'
AND (PARENT_CODE <> 'F9G7S19722007485' OR PARENT_CODE IS NULL)

Since PARENT_CODE is not used for index lookup but for data filtering in the second part of UNION ALL query, neither of the two methods will affect query performance.

 

After confirming that the UNION ALL performance meets the requirements, the UPDATE operation is rewritten as follows:

UPDATE bs_serial_trac 
SET SERIAL_IS_LOCK = 0,
LOCK_VALUE = '',
UPDATE_USER = 'transSys' 
WHERE ID IN(
    SELECT ID FROM(
    SELECT ID
    FROM bs_serial_trac 
    WHERE GOODS_NO = '4418095740626' 
    AND PARENT_CODE = 'F9G7S19722007485' 
    UNION ALL
    SELECT ID
    FROM bs_serial_trac 
    WHERE GOODS_NO = '4418095740626' 
    AND SERIAL = 'F9G7S19722007485'
    AND (PARENT_CODE <> 'F9G7S19722007485' OR PARENT_CODE IS NULL)
    ) AS T1
)

For the implementation plan:

*************************** 1. row ***************************
           id: 1
  select_type: UPDATE
        table: bs_serial_trac
   partitions: NULL
         type: index
possible_keys: NULL
          key: PRIMARY
      key_len: 8
          ref: NULL
         rows: 13270473
     filtered: 100.00
        Extra: Using where
*************************** 2. row ***************************
           id: 2
  select_type: DEPENDENT SUBQUERY
        table: <derived3>
   partitions: NULL
         type: index_subquery
possible_keys: <auto_key0>
          key: <auto_key0>
      key_len: 8
          ref: func
         rows: 2
     filtered: 100.00
        Extra: Using index
*************************** 3. row ***************************
           id: 3
  select_type: DERIVED
        table: bs_serial_trac
   partitions: NULL
         type: ref
possible_keys: idx_serial_goodsNo,idx_parent_code
          key: idx_parent_code
      key_len: 153
          ref: const
         rows: 1
     filtered: 5.00
        Extra: Using where
*************************** 4. row ***************************
           id: 4
  select_type: UNION
        table: bs_serial_trac
   partitions: NULL
         type: const
possible_keys: idx_complex_serial_goodsNo,idx_serial_goodsNo,idx_parent_code
          key: idx_complex_serial_goodsNo
      key_len: 695
          ref: const,const
         rows: 1
     filtered: 100.00
        Extra: NULL
4 rows in set (0.00 sec)

 

Although it is updated according to the primary key ID, the use of IN statement still results in the query being scanned in a full table with poor performance. It is necessary to convert the IN query into INNER JOIN:

UPDATE
bs_serial_trac AS T2
INNER JOIN (
    SELECT ID
    FROM bs_serial_trac 
    WHERE GOODS_NO = '4418095740626' 
    AND PARENT_CODE = 'F9G7S19722007485' 
    UNION ALL
    SELECT ID
    FROM bs_serial_trac 
    WHERE GOODS_NO = '4418095740626' 
    AND SERIAL = 'F9G7S19722007485'
    AND (PARENT_CODE <> 'F9G7S19722007485' OR PARENT_CODE IS NULL)
) AS T1 
ON T1.ID=T2.ID
SET
T2.SERIAL_IS_LOCK = 0,
T2.LOCK_VALUE = '',
T2.UPDATE_USER = 'transSys'

The revised implementation plan is as follows:

*************************** 1. row ***************************
           id: 1
  select_type: PRIMARY
        table: <derived2>
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 3
     filtered: 100.00
        Extra: NULL
*************************** 2. row ***************************
           id: 1
  select_type: UPDATE
        table: T2
   partitions: NULL
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 8
          ref: T1.ID
         rows: 1
     filtered: 100.00
        Extra: NULL
*************************** 3. row ***************************
           id: 2
  select_type: DERIVED
        table: bs_serial_trac
   partitions: NULL
         type: ref
possible_keys: idx_serial_goodsNo,idx_parent_code
          key: idx_parent_code
      key_len: 153
          ref: const
         rows: 1
     filtered: 5.00
        Extra: Using where
*************************** 4. row ***************************
           id: 3
  select_type: UNION
        table: bs_serial_trac
   partitions: NULL
         type: const
possible_keys: idx_complex_serial_goodsNo,idx_serial_goodsNo,idx_parent_code
          key: idx_complex_serial_goodsNo
      key_len: 695
          ref: const,const
         rows: 1
     filtered: 100.00
        Extra: NULL
4 rows in set (0.00 sec)

Adjustment can be operated according to the main key normally, and the performance is guaranteed.

 

<=> operator

<=> : NULL-safe equal. This operator performs an equality comparison like the = operator, but returns 1 rather than NULL if both operands are NULL, and 0 rather than NULL if one operand is NULL.

Reference: https://dev.mysql.com/doc/refman/8.0/en/comparison-operators.html#operator_equal-to

Keywords: PHP SQL less MySQL

Added by KC_Geek on Mon, 05 Aug 2019 11:48:33 +0300