Implicit conversion of MySQL index invalidation

Common index failures:

1. Condition index field "unclean": Function operation,Arithmetic operation

2. Implicit type conversion: String to valueOther type conversion

3. Implicit character encoding conversion : convert according to the direction with large length of character coded data to avoid data interception

1, Common index failure scenarios

root@test 10:50 > show create table t_num\G
*************************** 1. row ***************************
       Table: t_num
Create Table: CREATE TABLE `t_num` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `c1` int(11) NOT NULL,
  `c2` varchar(11) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `ix_c1` (`c1`)
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=utf8mb4
​
root@test 10:51 > select * from t_num;
+----+----+----+
| id | c1 | c2 |
+----+----+----+
|  1 | -2 | -2 |
|  2 | -1 | -1 |
|  3 |  0 |  0 |
|  4 |  1 |  1 |
|  5 |  2 |  2 |
+----+----+----+
​
# Add an index to the c1 field
root@test 10:52 > alter table t_num add index ix_c1(c1);
​
# In the case of standard use, the index is valid
root@test 10:55 > explain select * from t_num where c1 = -1;
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key   | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | t_num | NULL       | ref  | ix_c1         | ix_c1 | 4       | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-------+

1. Conditional field function operation

# Add the abs() absolute value function to c1 in where to see the type=ALL,Full table scan, in Server Layers are compared after absolute value processing
root@test 10:58 > explain select * from t_num where abs(c1) = 1;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | t_num | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    5 |   100.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+

As mentioned above, when the index field is operated as a function, that is, when the where condition column is not clean, the order of the index value may be destroyed (the index tree is organized orderly according to the value of c1). Therefore, the optimizer decides to give up the search function of the index tree.

However, under the operation of conditional field function, the full table scan is not complete, and the optimizer does not completely abandon the field index.

# Select the data to query. Only the id and c1 fields are available. You can see the type=index,Used ix_c1 Indexes
root@test 10:59 > explain select id,c1 from t_num where abs(c1) = 1;
+----+-------------+-------+------------+-------+---------------+-------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type  | possible_keys | key   | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+-------+------------+-------+---------------+-------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | t_num | NULL       | index | NULL          | ix_c1 | 4       | NULL |    5 |   100.00 | Using where; Using index |
+----+-------------+-------+------------+-------+---------------+-------+---------+------+------+----------+--------------------------+

As above, due to IX_ The c1 index tree is constructed by the id of the root node c1 and the leaf node. Although the index positioning is abandoned due to the function operation on c1, the optimizer can choose to traverse the index tree and use the Using index without returning to the table. After returning the required id and c1 data to the Server layer, the subsequent abs() and where filtering can be performed.

2. Conditional field operation

# In the where condition, the operation is performed on c1
root@test 11:03 > explain select * from t_num where c1 + 1 = 2;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | t_num | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    5 |   100.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+

As mentioned above, although the operation of "+ 1" did not destroy the order of c1 index, the optimizer still did not use the index to locate quickly. Therefore, to the left of the equal sign, pay attention to optimizing the operation on the index field.

3. Implicit type conversion

# Add an index to the c2 field
root@test 12:30 > alter table t_num add index ix_c2(c2);
​
# In the case of standard use (Note: c2 is of varchar type), the index is valid
root@test 12:30 > explain select * from t_num where c2 = "2";
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key   | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | t_num | NULL       | ref  | ix_c2         | ix_c2 | 42      | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-------+
​
# Remove the quotation marks of the value to the right of the equal sign, that is, compare the string with the value, and the index becomes invalid
root@test 12:30 > explain select * from t_num where c2 = 2;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | t_num | NULL       | ALL  | ix_c2         | NULL | NULL    | NULL |    5 |    20.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+

As mentioned above, the c2 field is of varchar type, which is the comparison between strings and values. At this time, MySQL converts strings into numbers, that is, c2 here is CAST(c2 AS signed int), which is equivalent to a function operation on the condition field, and the optimizer gives up tree index positioning.

4. Implicit character encoding conversion

# Create a t_cou table, the table structure is basically the same as the previous t_num is the same, the only difference is the table character set CHARSET=utf8
root@test 14:02 > show create table t_cou\G
*************************** 1. row ***************************
       Table: t_cou
Create Table: CREATE TABLE `t_cou` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `c1` int(11) NOT NULL,
  `c2` varchar(10) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `ix_c1` (`c1`),
  KEY `ix_c2` (`c2`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8
​
root@test 14:02 > insert into t_cou select * from t_num;
​
# join table, t_num and t_cou performs Association query through c2 field
root@test 14:03 > select n.* from t_num n
    -> join t_cou c
    -> on n.c2 = c.c2
    -> where n.c1 = 1;
+----+----+----+
| id | c1 | c2 |
+----+----+----+
|  4 |  1 | 1  |
+----+----+----+
​
root@test 14:23 > explain select n.* from t_num n join t_cou c  on n.c2 = c.c2 where c.c1 = 1;
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key   | key_len | ref   | rows | filtered | Extra                 |
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-----------------------+
|  1 | SIMPLE      | c     | NULL       | ref  | ix_c1         | ix_c1 | 4       | const |    1 |   100.00 | NULL                  |
|  1 | SIMPLE      | n     | NULL       | ref  | ix_c2         | ix_c2 | 42      | func  |    1 |   100.00 | Using index condition |
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+-----------------------+
# Execution plan analysis:
# 1.Operational c Table, using ix_c1 Navigate to a row of data
# 2.from c Locate the row data of the table and get it c2 Field to operate n Watch, t_cou Called the drive table, t_num Called driven table
# 3.ref=func Description function operation is used, which refers to n.c2=CONVERT(c.c2 USING utf8mb4)
# 4.meanwhile Using index condition,ix_c2 When reading a query, the pushed condition is used for filtering, and only those that meet the condition are returned to the table
​
root@test 14:23 > explain select n.* from t_num n join t_cou c  on n.c2 = c.c2 where n.c1 = 1;
+----+-------------+-------+------------+-------+---------------+-------+---------+-------+------+----------+-----------------------------------------------------------------+
| id | select_type | table | partitions | type  | possible_keys | key   | key_len | ref   | rows | filtered | Extra                                                           |
+----+-------------+-------+------------+-------+---------------+-------+---------+-------+------+----------+-----------------------------------------------------------------+
|  1 | SIMPLE      | n     | NULL       | ref   | ix_c1,ix_c2   | ix_c1 | 4       | const |    1 |   100.00 | NULL                                                            |
|  1 | SIMPLE      | c     | NULL       | index | NULL          | ix_c2 | 32      | NULL  |    5 |   100.00 | Using where; Using index; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+-------+---------------+-------+---------+-------+------+----------+-----------------------------------------------------------------+
# Execution plan analysis:
# 1.Operational n Table, using ix_c1 Navigate to a row of data
# 2.from n Locate the row data of the table and get it c2 Field to operate c Watch, t_num Called the drive table, t_cou Called driven table
# 3.alike n.c2=c.c2,Will c.c2 That is, add a function to the index field of the driven table, and the index becomes invalid
# 4.BNL,surface join Drive table data read in join buffer,If the connected field of the driven table has no index, the whole table is scanned, and each row and join buffer Data comparison judgment is returned as a result set

As above, t_num, t_cou is used as the execution plan analysis of the driven table and the driven table, and the summary is as follows:

  1. When comparing strings with utf8mb4 and UTF8 character sets (encoding), MySQL will first convert the UTF8 string into utf8mb4 character set, and then compare. Why? The character set utf8mb4 is a superset of utf8. During implicit automatic type conversion, in order to avoid data errors caused by truncation in the conversion process, it will be converted "in the direction of increasing data length".

  2. In the process of table connection, adding a function to the index field of the driven table will lead to a full table scan of the driven table.

Optimization method:

  1. Modify the character set of the unified join field

  2. Start with the driver table and convert the character set of the connection field into the character set of the connection field of the driven table

root@test 18:09 > explain select n.* from t_num n join t_cou c  on convert(n.c2 using utf8) = c.c2 where n.c1 = 1;
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key   | key_len | ref   | rows | filtered | Extra                    |
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+--------------------------+
|  1 | SIMPLE      | n     | NULL       | ref  | ix_c1         | ix_c1 | 4       | const |    1 |   100.00 | NULL                     |
|  1 | SIMPLE      | c     | NULL       | ref  | ix_c2         | ix_c2 | 32      | func  |    1 |   100.00 | Using where; Using index |
+----+-------------+-------+------------+------+---------------+-------+---------+-------+------+----------+--------------------------+

 

2, Type conversion

1. String to integer

# The beginning of the character is always 0
root@test 18:44 > select convert("abc", unsigned integer);
+----------------------------------+
| convert("abc", unsigned integer) |
+----------------------------------+
|                                0 |
+----------------------------------+
# 'abc' = 0 Yes, so it is important to use the corresponding type on the right side of the equal sign when querying. 0 matches the data at the beginning of the field character,'0'Match only 0
root@test 18:44 > select 'abc' = 0;
+-----------+
| 'abc' = 0 |
+-----------+
|         1 |
+-----------+
​
# If it starts with a number, it is directly intercepted to the first position that is not a character
root@test 18:45 > select convert("123abc", unsigned integer);
+-------------------------------------+
| convert("123abc", unsigned integer) |
+-------------------------------------+
|                                 123 |
+-------------------------------------+

2. Time type conversion

root@test 19:11 > show create table time_demo\G
*************************** 1. row ***************************
       Table: time_demo
Create Table: CREATE TABLE `time_demo` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `c1` datetime DEFAULT NULL,
  `c2` date DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `ix_c1` (`c1`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8mb4
​
root@test 19:15 > select count(*) from time_demo;
+----------+
| count(*) |
+----------+
|       11 |
+----------+
​
root@test 19:16 > select * from time_demo limit 4;
+----+---------------------+------------+
| id | c1                  | c2         |
+----+---------------------+------------+
|  1 | 2022-01-08 00:01:01 | 2022-01-08 |
|  2 | 2022-01-06 23:01:01 | 2022-01-06 |
|  3 | 2022-01-06 00:00:00 | 2022-01-06 |
|  4 | 2022-01-08 00:00:00 | 2022-01-08 |
+----+---------------------+------------+
​
# 1.date turn datetime: Append at the end 00:00:00
root@test 19:11 > select * from time_demo where c1 between "2022-01-06" and "2022-01-08";
+----+---------------------+------------+
| id | c1                  | c2         |
+----+---------------------+------------+
|  2 | 2022-01-06 23:01:01 | 2022-01-06 |
|  3 | 2022-01-06 00:00:00 | 2022-01-06 |
|  4 | 2022-01-08 00:00:00 | 2022-01-08 |
+----+---------------------+------------+
# Result analysis: c1 is datetime type. When comparing,between and Medium date The type is converted to datetime
# Namely where c1 between "2022-01-06 00:00:00" and "2022-01-08 00:00:00";
# with where c1 >= "2022-01-06 00:00:00" and c1 <= "2022-01-08 00:00:00";
root@test 19:42 > explain select * from time_demo where c1 between "2022-01-06" and "2022-01-08";
+----+-------------+-----------+------------+-------+---------------+-------+---------+------+------+----------+-----------------------+
| id | select_type | table     | partitions | type  | possible_keys | key   | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+-----------+------------+-------+---------------+-------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | time_demo | NULL       | range | ix_c1         | ix_c1 | 6       | NULL |    3 |   100.00 | Using index condition |
+----+-------------+-----------+------------+-------+---------------+-------+---------+------+------+----------+-----------------------+
# Format date to datetime
root@test 19:23 > select date_format("2022-01-08","%Y-%m-%d %H:%i:%s");
+-----------------------------------------------+
| date_format("2022-01-08","%Y-%m-%d %H:%i:%s") |
+-----------------------------------------------+
| 2022-01-06 00:00:00                           |
+-----------------------------------------------+
​
# 2.datetime turn date: direct interception  date part
root@test 19:47 > select date(c1) from time_demo limit 1;
+------------+
| date(c1)   |
+------------+
| 2022-01-06 |
+------------+
​
# 3.date turn time,Meaningless, directly become 00:00:00

 

Keywords: MySQL

Added by TeamTJ on Sun, 09 Jan 2022 08:38:36 +0200