Wan a #20, how to filter data in index push down

  • The original content of GreatSQL community cannot be used without authorization. Please contact Xiaobian and indicate the source for reprint.

Experimental environment

GreatSQL 8.0.25 InnoDB

1. Introduction to index push down

  • 1. Index Condition Pushdown is short for ICP.
  • 2.MySQL5. Version 6 introduces the function for optimizing queries.
  • 3. Under some specific index conditions, ICP can reduce the number of queries back to the table by the storage engine.

2. Applicable conditions

  • 1. ICP is used for range, ref and EQ when full table records need to be accessed_ Ref and ref_or_null access method.
  • 2.ICP can be used for InnoDB and MyISAM tables, including partition InnoDB and MyISAM tables.
  • 3. For InnoDB table, ICP is only used for secondary index. The goal of the ICP is to reduce the number of reads of the entire line of records, thereby reducing I/O operations. For the InnoDB clustered index, the complete records have been read into the InnoDB buffer. In this case, using ICP will not reduce I/O.
  • 4. The secondary index created on the virtual column does not support ICP.
  • 5. The SQL using subquery does not support ICP.
  • 6. The SQL calling the stored procedure does not support ICP because the storage engine cannot call the stored procedure located in MySQL Server.
  • 7. Trigger does not support ICP.

3. How to enable

  • ICP is on by default, and can be closed, enabled and viewed through the following commands
# Close ICP
SET optimizer_switch = 'index_condition_pushdown=off'; 
# Turn on ICP
SET optimizer_switch = 'index_condition_pushdown=on';
# View ICP current status
show VARIABLES like '%optimizer_switch%'

4. How ICP works

Query steps without ICP optimization

  • 1. Get the next row, first read the index information, and then read the whole row of data according to the index.
  • 2. Then judge whether the current data meets the conditions through the where condition and return the data.

Query steps when using ICP optimization

  • 1. Get the index information of the next row.
  • 2. Check whether the column information stored in the index meets the index conditions. If yes, read the whole row of data. If not, skip reading the next row.
  • 3. Use the remaining judgment conditions to judge whether the data in this line meets the requirements, and return the data if it meets the requirements

5. Experimental test

The table structure is as follows

CREATE TABLE `student` (
  `id` int NOT NULL AUTO_INCREMENT COMMENT 'Self increasing id',
  `uid` int NOT NULL COMMENT 'Student number',
  `age` int NOT NULL COMMENT 'Age',
  `name` char(32) NOT NULL COMMENT 'full name',
  `sex` char(4) NOT NULL COMMENT 'Gender',
  `grade` int NOT NULL COMMENT 'grade',
  `class` varchar(32) NOT NULL COMMENT 'class',
  `major` varchar(64) NOT NULL COMMENT 'major',
  PRIMARY KEY (`id`),
  KEY `idx_anm` (`age`,`name`,`major`)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci 

There is a need to query the information of students aged 16, surnamed Chen and studying software engineering

# Enable ICP
[root@GreatSQL][test]>explain select * from student where age=16 and name like 'Chen%' and major='software engineering';
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+-----------------------+
| id | select_type | table   | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | student | NULL       | range | idx_anm       | idx_anm | 390     | NULL |    1 |    33.33 | Using index condition |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+-----------------------+
1 row in set, 1 warning (0.00 sec)

# Do not enable ICP
[root@GreatSQL][test]>explain select /*+ no_icp (student) */ * from student where age=16 and name like 'Chen%' and major='software engineering';
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table   | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra       |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | student | NULL       | range | idx_anm       | idx_anm | 390     | NULL |    1 |    33.33 | Using where |
+----+-------------+---------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

Extra parsed with ICP enabled is Using index condition, and extra parsed without ICP enabled is Using where

Other query results are basically the same. There is no difference in efficiency. You can view them by turning on profiling

[root@GreatSQL][test]>set profiling=1;
Query OK, 0 rows affected, 1 warning (0.00 sec)

[root@GreatSQL][test]>select * from student where age=16 and name like 'Chen%' and major='software engineering';
+----+--------+-----+--------+-----+-------+-------+--------------+
| id | uid    | age | name   | sex | grade | class | major        |
+----+--------+-----+--------+-----+-------+-------+--------------+
|  1 | 100001 |  16 | Chen Hong   | male  |     4 | 3     | software engineering     |
+----+--------+-----+--------+-----+-------+-------+--------------+
1 row in set (0.00 sec)

(Tue Jan  4 15:51:50 2022)[root@GreatSQL][test]>select /*+ no_icp (student) */ * from student where age=16 and name like 'Chen%' and major='software engineering';
+----+--------+-----+--------+-----+-------+-------+--------------+
| id | uid    | age | name   | sex | grade | class | major        |
+----+--------+-----+--------+-----+-------+-------+--------------+
|  1 | 100001 |  16 | Chen Hong   | male  |     4 | 3     | software engineering     |
+----+--------+-----+--------+-----+-------+-------+--------------+
1 row in set (0.00 sec)

[root@GreatSQL][test]>show profiles\G;
*************************** 1. row ***************************
Query_ID: 1
Duration: 0.00043725
   Query: select * from student where age=16 and name like 'Chen%' and major='software engineering'
*************************** 2. row ***************************
Query_ID: 2
Duration: 0.00048500
   Query: select /*+ no_icp (student) */ * from student where age=16 and name like 'Chen%' and major='software engineering'
2 rows in set, 1 warning (0.00 sec)

ERROR:
No query specified

The Duration with ICP is slightly shorter than that without ICP. The efficiency comparison results of multiple tests are the same. From the test, the query efficiency optimized with ICP will be better.

6. Query process

ICP is not turned on

1. First find the record with age=16 according to the leftmost principle, then return to the table, and find the row that meets the record according to the primary key.

2. Find out all the row records that match like 'Chen%', and then filter the qualified records according to the primary key according to the data found in step 1

3. Then find out all records that meet major = 'software engineering', and then find out all records that meet the conditions according to step 2

4. In step 1 of the query process, each record conforming to age=16 must be returned to the table first.

Turn on ICP

1. Find the record with age=16 according to the leftmost principle.

2. View the index and filter out the data that does not conform to like 'Chen%'

3. Check the index and filter out the data that does not conform to major = 'software engineering'

4. In step 1, the query process does not perform the table return operation. First, find out the conditions that meet the conditions 2 and 3 through the index. If they do not meet the conditions, the query in the next step will be performed directly, so the table return times will be less.

7.ICP diagram

  • Illustration source: MariaDB COM, note sharing only, non-commercial use.

Figure 1: ICP query process is not enabled

Figure 2: enabling ICP query process

Note: Several X's in Figure 2 are filtered in the index layer, so there is no need to go back to the table.

8. See the official website for more information

Enjoy GreatSQL :)

Article recommendation:

GreatSQL MGR FAQ
https://mp.weixin.qq.com/s/J6...

Wan Da #12, how can the MGR cluster automatically select the master without manual intervention
https://mp.weixin.qq.com/s/07...

"2021 data technology Carnival · ON LINE": evolution and practice of MySQL high availability architecture
https://mp.weixin.qq.com/s/u7...

Packet capture analysis of where an sql statement is slow
https://mp.weixin.qq.com/s/AY...

Wan Da #15, what are the conditions that may cause the MGR service to fail to start
https://mp.weixin.qq.com/s/in...

Technology sharing | why MGR consistency mode does not recommend AFTER
https://mp.weixin.qq.com/s/rN...

About GreatSQL

GreatSQL is a MySQL branch maintained by Wanli database. It focuses on improving the reliability and performance of MGR and supports the parallel query feature of InnoDB. It is a branch version of MySQL suitable for financial applications.

Gitee:
https://gitee.com/GreatSQL/Gr...

GitHub:
https://github.com/GreatSQL/G...

Bilibili:
https://space.bilibili.com/13...

Wechat & QQ group:
You can search and add GreatSQL community assistant wechat friends, send verification information "add group" to join GreatSQL/MGR communication wechat group

QQ group: 533341697
Wechat assistant: wanlidbc

This article is composed of blog one article multi posting platform OpenWrite release!

Keywords: Database MySQL SQL

Added by marc2000 on Wed, 19 Jan 2022 22:37:52 +0200