4. Performance analysis
4.1 common performance bottlenecks of MySQL
① CPU: when the CPU runs at full load, it usually occurs when the data is loaded into the memory or read from the disk;
② IO: the disk IO bottleneck occurs when the loaded data is much larger than the memory capacity;
③ Server hardware bottleneck: check the performance and status of the system through top, free, iostat and vmstat commands;
④ Database server configuration problem;
4.2 MySQL performance analysis
4.2.1 EXPLAIN introduction
Use the EXPLAIN keyword to simulate the MySQL optimizer to execute SQL query statements, so as to see how MySQL understands your SQL statements and analyze the performance bottlenecks of SQL query statements or table structures.
A laboratory report similar to a hospital examination.
effect:
① Obtain the reading order of the table;
② Obtain the operation type of data reading operation;
③ View which indexes can be used;
④ Check which indexes are actually used;
⑤ View the reference relationship between tables;
⑥ Check how many rows of each table are queried by the optimizer;
4.2.2 application method:
EXPLAIN + SQL statement
Example:
mysql> explain select * from tbl_emp where id = 2; +----+-------------+---------+-------+---------------+---------+---------+-------+------+-------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+---------+-------+---------------+---------+---------+-------+------+-------+ | 1 | SIMPLE | tbl_emp | const | PRIMARY | PRIMARY | 4 | const | 1 | | +----+-------------+---------+-------+---------------+---------+---------+-------+------+-------+ 1 row in set (0.00 sec)
Information column returned after EXPLAIN execution:
- id: the serial number of the query, including a group of numbers, indicating the order in which the SELECT clause or operation table is executed in the query;
- select_type: indicates the type of query;
- Table: indicates the table where the data is located;
- Type: the access type of the query;
- possible_keys: displays one or more possible indexes in this table;
- key: the index actually used;
- key_len: indicates the number of bytes used in the index. This column can be used to calculate the index length used in the query
- ref: the column displaying the index is used;
- Rows: displays the number of rows that MySQL thinks must be checked when executing a query;
- Extra: other additional important information;
4.2.2 explanation of each analysis field
Create a data table using the following SQL statement
CREATE TABLE t1(id INT(10) AUTO_INCREMENT,content VARCHAR(100) NULL , PRIMARY KEY (id)); CREATE TABLE t2(id INT(10) AUTO_INCREMENT,content VARCHAR(100) NULL , PRIMARY KEY (id)); CREATE TABLE t3(id INT(10) AUTO_INCREMENT,content VARCHAR(100) NULL , PRIMARY KEY (id)); CREATE TABLE t4(id INT(10) AUTO_INCREMENT,content VARCHAR(100) NULL , PRIMARY KEY (id)); INSERT INTO t1(content) VALUES(CONCAT('t1_',FLOOR(1+RAND()*1000))); INSERT INTO t2(content) VALUES(CONCAT('t2_',FLOOR(1+RAND()*1000))); INSERT INTO t3(content) VALUES(CONCAT('t3_',FLOOR(1+RAND()*1000))); INSERT INTO t4(content) VALUES(CONCAT('t4_',FLOOR(1+RAND()*1000)));
4.2.2.1 id
The serial number of the query, including a group of numbers, indicating the order in which the SELECT clause or operation table is executed in the query;
There are three main situations:
① The id is the same, and the execution order is from top to bottom;
mysql> explain select * from t1,t2,t3 where t1.id = t2.id and t2.id = t3.id; +----+-------------+-------+--------+---------------+---------+---------+-----------------+------+-------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+--------+---------------+---------+---------+-----------------+------+-------+ | 1 | SIMPLE | t1 | ALL | PRIMARY | NULL | NULL | NULL | 1 | | | 1 | SIMPLE | t2 | eq_ref | PRIMARY | PRIMARY | 4 | base_crud.t1.id | 1 | | | 1 | SIMPLE | t3 | eq_ref | PRIMARY | PRIMARY | 4 | base_crud.t1.id | 1 | | +----+-------------+-------+--------+---------------+---------+---------+-----------------+------+-------+ 3 rows in set (0.00 sec)
② The id is different. If it is a sub query, the id sequence number will increase. The larger the id value, the higher the priority, and the earlier it will be executed;
mysql> explain select t1.id from t1 where t1.id in (select t2.id from t2 where t2.id in (select t3.id from t3 where t3.content = '')); +----+--------------------+-------+-----------------+---------------+---------+---------+------+------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+--------------------+-------+-----------------+---------------+---------+---------+------+------+--------------------------+ | 1 | PRIMARY | t1 | index | NULL | PRIMARY | 4 | NULL | 1 | Using where; Using index | | 2 | DEPENDENT SUBQUERY | t2 | unique_subquery | PRIMARY | PRIMARY | 4 | func | 1 | Using index; Using where | | 3 | DEPENDENT SUBQUERY | t3 | unique_subquery | PRIMARY | PRIMARY | 4 | func | 1 | Using where | +----+--------------------+-------+-----------------+---------------+---------+---------+------+------+--------------------------+ 3 rows in set (0.00 sec)
③ IDs are the same and different. If the IDs are the same, they can be considered as a group and executed from top to bottom; The higher the id value in the group, the higher the priority
mysql> explain select t2.* from t2, (select t3.* from t3) s3 where s3.id = t2.id; +----+-------------+------------+--------+---------------+---------+---------+-------+------+-------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+------------+--------+---------------+---------+---------+-------+------+-------+ | 1 | PRIMARY | <derived2> | system | NULL | NULL | NULL | NULL | 1 | | | 1 | PRIMARY | t2 | const | PRIMARY | PRIMARY | 4 | const | 1 | | | 2 | DERIVED | t3 | ALL | NULL | NULL | NULL | NULL | 1 | | +----+-------------+------------+--------+---------------+---------+---------+-------+------+-------+ 3 rows in set (0.00 sec)
Summary: each id number. Represents an independent query. The fewer rows of SQL query, the better
4.2.2.2 select_type
Represents the type of query, which is mainly used to distinguish between common query, joint query, sub query and other complex queries.
select_type attribute | meaning |
---|---|
SIMPLE | A simple SELECT query that does not contain subqueries or unions |
PRIMARY | If a query contains sub parts of a complex query, the outermost query is marked as PAIMARY |
DERIVED | The sub queries contained in the FROM list are marked as DERIVED. MySQL will recursively execute these sub queries and put the results in the temporary table. |
SUBQUERY | Subqueries are included in the SELECT or WHERE list. |
DEPEDENT SUBQUERY | The sub query is included in the SELECT or WHERE list. The sub query is based on the outer layer. |
UNCACHEABLE SUBQUERY | Cannot use cached subqueries. |
UNION | If it appears after the second UNION, it is marked as SELECT; If UNION is included in subquery of the FROM clause, outer SELECT will be marked as DERIVED |
UNION RESULT | SELECT to get the result from the UNION table |
① SIMPLE: indicates a single table query
mysql> explain select * from t1; +----+-------------+-------+------+---------------+------+---------+------+------+-------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------+---------------+------+---------+------+------+-------+ | 1 | SIMPLE | t1 | ALL | NULL | NULL | NULL | NULL | 1 | | +----+-------------+-------+------+---------------+------+---------+------+------+-------+ 1 row in set (0.00 sec)
② PRIMARY: if the query contains any complex sub parts, the outermost query will be marked as PRIMARY
mysql> explain select * from (select * from t2) a; +----+-------------+------------+--------+---------------+------+---------+------+------+-------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+------------+--------+---------------+------+---------+------+------+-------+ | 1 | PRIMARY | <derived2> | system | NULL | NULL | NULL | NULL | 1 | | | 2 | DERIVED | t2 | ALL | NULL | NULL | NULL | NULL | 1 | | +----+-------------+------------+--------+---------------+------+---------+------+------+-------+ 2 rows in set (0.00 sec)
③ DERIVED: the subqueries contained in the FROM list are marked as DERIVED. MySQL will recursively execute these subqueries and enlarge the results into the temporary table.
mysql> explain select * from (select * from t2) a; +----+-------------+------------+--------+---------------+------+---------+------+------+-------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+------------+--------+---------------+------+---------+------+------+-------+ | 1 | PRIMARY | <derived2> | system | NULL | NULL | NULL | NULL | 1 | | | 2 | DERIVED | t2 | ALL | NULL | NULL | NULL | NULL | 1 | | +----+-------------+------------+--------+---------------+------+---------+------+------+-------+ 2 rows in set (0.00 sec)
④ SUBQUERY: the SELECT or WHERE list contains subqueries
mysql> EXPLAIN SELECT t2.id FROM t2 WHERE t2.id = (SELECT t3.id FROM t3 WHERE t3.id = 1); +----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+ | 1 | PRIMARY | t2 | const | PRIMARY | PRIMARY | 4 | const | 1 | Using index | | 2 | SUBQUERY | t3 | const | PRIMARY | PRIMARY | 4 | | 1 | Using index | +----+-------------+-------+-------+---------------+---------+---------+-------+------+-------------+ 2 rows in set (0.00 sec)
⑤ Dependent subquery: contains subqueries in the SELECT or WHERE list. Subqueries are based on the outer layer.
mysql> explain select t2.id from t2 where t2.id in (select t3.id from t3 where t3.content = 't3_993'); +----+--------------------+-------+-----------------+---------------+---------+---------+------+------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+--------------------+-------+-----------------+---------------+---------+---------+------+------+--------------------------+ | 1 | PRIMARY | t2 | index | NULL | PRIMARY | 4 | NULL | 1 | Using where; Using index | | 2 | DEPENDENT SUBQUERY | t3 | unique_subquery | PRIMARY | PRIMARY | 4 | func | 1 | Using where | +----+--------------------+-------+-----------------+---------------+---------+---------+------+------+--------------------------+ 2 rows in set (0.00 sec)
Subqueries are added after WHERE. SUPERQUERY is a single value and DEPENDENT SUBQUERY is a group of values.
⑥ UNCACHEABLE SUBQUERY: cannot use cached subquery.
mysql> explain select * from t3 where id = (select id from t2 where t2.id = @@sort_buffer_size); +----+----------------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+----------------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------+ | 1 | PRIMARY | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Impossible WHERE noticed after reading const tables | | 2 | UNCACHEABLE SUBQUERY | NULL | NULL | NULL | NULL | NULL | NULL | NULL | no matching row in const table | +----+----------------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------+ 2 rows in set (0.01 sec)
When you use @ @ to reference a system variable, the cache is not used.
⑦ UNION: if the second SELECT appears after UNION, it is marked as UNION; If UNION is included in subquery of the FROM clause, outer SELECT is marked as DERIVED
mysql> explain select * from t1 union all select * from t2; +----+--------------+------------+------+---------------+------+---------+------+------+-------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+--------------+------------+------+---------------+------+---------+------+------+-------+ | 1 | PRIMARY | t1 | ALL | NULL | NULL | NULL | NULL | 1 | | | 2 | UNION | t2 | ALL | NULL | NULL | NULL | NULL | 1 | | | NULL | UNION RESULT | <union1,2> | ALL | NULL | NULL | NULL | NULL | NULL | | +----+--------------+------------+------+---------------+------+---------+------+------+-------+ 3 rows in set (0.00 sec)
⑧ UNION RESULT: SELECT the result from the UNION table
4.2.2.3 table
Represents the table where the data is located
4.2.2.4 type
Access type of query;
attribute | meaning |
---|---|
system | A table with only one row of records (equal to the system table) is a special case of const type, which generally does not appear. |
const | Indicates that the index can be found at one time. const is used to compare the primary key or unique index. |
eq_ref | Unique index scan. For each index key, only one record in the table matches it. Common in primary key or unique index scanning. |
ref | Non unique index scan that returns all rows that match a single value. In essence, it is also an index access. It returns all rows matching a single value. However, it may find multiple qualified rows, so it belongs to a mixture of search and scan. |
range | Retrieve only rows in the specified range, using an index to select rows. The key column shows which index is used. Generally, queries such as between, <, >, in appear in the WHERE statement. This range scanning is better than the full table scanning, because it only needs to start at one point of the index and end at another point, and there is no need to scan all the indexes. |
index | The index appears because the SQL statement uses the index but does not use the index for filtering. Generally, it uses the overlay index or uses the index for sorting and grouping. |
ALL | Full table scan to find matching rows |
index_merge | Multiple indexes need to be combined in query, which usually appears in SQL with OR keyword. |
ref_or_null | When a field requires both association conditions and null values, the query optimizer will choose to use ref_or_null connection query. |
index_subquery | The index is used to associate sub queries, and the whole table is no longer scanned. |
unique_subquery | Similar to index_subquery is the unique index in the subquery. |
The access type of query is an important indicator. The result values from the best to the worst are:
system > const > eq_ref > ref > fulltext > ref_or_null > index_merege > unique_subquery > index_subquery > range > index > ALL
Generally speaking, to ensure that the query reaches at least range level, it is best to reach ref
Difference between index and ALL:
Although both index and ALL read the whole table, index is read from the index and only traverses the index tree, while ALL is read from the hard disk. In comparison, the index file is usually smaller than the data file, so index is faster than ALL.
4.2.2.5 possible_keys
Displays one or more indexes that may be applied to this table. If there is an index on the field involved in the query, the index is listed, but it may not be actually used by the query.
4.2.2.6 key
The index actually used. If it is NULL, it means that the index is not used;
If an overlay index is used in the query, the index will only appear in the key list;
4.2.2.7 key_len
Indicates the number of bytes used in the index. This column can be used to calculate the index length used in the query. key_ The len field can help check whether the index is fully utilized.
key_ The longer len indicates that the index is used more fully, that is, the more indexes are used, the more accurate the matching is;
Without loss of accuracy, the shorter the length, the better;
key_ Calculation method of len:
① First, check the type and length of the field on the index, for example: int = 4; varchar(20) = 20; char(20) = 20;
② For string fields such as varchar and char, different character sets should be multiplied by different values, such as utf8 multiplied by 3 and GBK multiplied by 2;
③ Dynamic characters such as varchar need to add 2 bytes;
④ The field allowed to be empty shall be added with 1 byte;
Column type | key_len | remarks |
---|---|---|
int | 4 + 1 | Allow NUll, plus 1 byte |
int not null | 4 | NULL is not allowed |
char(30) utf8 | 30 * 3 + 1 | NULL allowed |
varchar(30) not null utf8 | 30 * 3 + 2 | Dynamic column type, plus 2 bytes |
varchar(30) utf8 | 30 * 3 + 2 + 1 | Dynamic column type, plus two bytes; NULL plus one byte is allowed |
text(10) utf8 | 30 * 3 + 3 + 1 | The text intercepted part is regarded as a dynamic column type and is allowed to be NULL |
4.2.2.8 ref
The column that displays the index is used, if possible, as a const (constant). Those columns or constants are used to find values on index columns.
4.2.2.9 rows
According to table statistics and index selection, roughly estimate the number of rows to read to find the required records. The fewer the rows, the better.
4.2.2.10 Extra
Contains additional information that is not suitable for display in other columns but is important.
① Using filesort
MySQL will use an external index to sort the data instead of reading according to the index order in the table. The sort operation that MySQL cannot complete by using the index is called * * "file sort"**
For the sorted fields in the query, if the sorted fields are accessed through the index, the sorting speed will be greatly improved.
② Using temporary
Temporary tables are used to save intermediate results. MySQL uses temporary tables when sorting query results. It is commonly used in sorting ORDER BY and grouping query GROUP BY
Sorting ORDER BY and grouping query GROUP BY are the culprits that slow down the execution of SQL, and temporary tables will increase the burden of SQL.
③ Using index
It means that the Covering Index is used in the corresponding SELECT operation to avoid accessing the data rows of the table, and the efficiency is good. If Using where appears at the same time, it indicates that the index is used to search the index key value; If Using where does not appear at the same time, it indicates that the index is only used to read data rather than use the index to perform lookup. Sorted or grouped by index.
Covering Index:
- The selected data column can only be obtained from the index without reading the data row. MySQL can use the index to return the fields in the SELECT list without reading the data file again according to the index, that is, the query column should be overwritten by the established index.
- Index is an efficient way to find rows, but general databases can also use index to find the data of a column, so it does not have to read the whole row. After all, index leaf nodes store the data they index; When the required data can be obtained through the index, there is no need to read rows. If an index contains (or covers) data that meets the query results, it is called an overlay index;
- To put it simply, overlay index means that the order and order of the query fields in SELECT are exactly the same as that of the composite index;
be careful:
If you want to use overlay index, you must pay attention to taking out the required columns from the SELECT list. You cannot use SELECT *, because if you index all fields together, the index file will be too large and the query performance will be degraded.
Using filesort: using file sorting affects the efficiency of SQL execution;
Using temporary: temporary tables are used, which seriously affects the efficiency of SQL execution;
Using index: coverage index is used, with good efficiency;
The above three parameters are the basic methods to judge the efficiency of SQL execution.
④ Using where
Indicates WHERE filtering is used.
⑤ Using join buffer
Indicates that connection caching is used.
⑥ impossible where
The value of the WHERE clause is always false and cannot be used to get any tuples.
⑦ select tables optimized away
If there is no GROUP BY clause, the MIN/MAX operation based on index optimization or the COUNT(*) operation for MyISAM storage engine optimization does not need to wait until the execution stage for calculation, and the optimization is completed in the query execution plan generation stage.
⑧ distinct
Optimize distinct and stop searching for the same value after finding the first matching tuple.
5. Index optimization
5.1 index analysis
5.1.1 single table optimization analysis
① Create data table
CREATE TABLE IF NOT EXISTS `article`( `id` INT(10) UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT, `author_id` INT (10) UNSIGNED NOT NULL, `category_id` INT(10) UNSIGNED NOT NULL , `views` INT(10) UNSIGNED NOT NULL , `comments` INT(10) UNSIGNED NOT NULL, `title` VARBINARY(255) NOT NULL, `content` TEXT NOT NULL ); INSERT INTO `article`(`author_id`,`category_id` ,`views` ,`comments` ,`title` ,`content` )VALUES (1,1,1,1,'1','1'), (2,2,2,2,'2','2'), (3,3,3,3,'3','3');
Database after creation
mysql> select * from article; +----+-----------+-------------+-------+----------+-------+---------+ | id | author_id | category_id | views | comments | title | content | +----+-----------+-------------+-------+----------+-------+---------+ | 1 | 1 | 1 | 1 | 1 | 1 | 1 | | 2 | 2 | 2 | 2 | 2 | 2 | 2 | | 3 | 3 | 3 | 3 | 3 | 3 | 3 | +----+-----------+-------------+-------+----------+-------+---------+ 3 rows in set (0.00 sec)
② Query requirements
Query category_ When ID is 1 and comments is greater than or equal to 1, the article with the most views_ id
mysql> select id,author_id from article where category_id = 1 AND comments >= 1 order by views desc limit 1; +----+-----------+ | id | author_id | +----+-----------+ | 1 | 1 | +----+-----------+ 1 row in set (0.00 sec)
③ Analyze and optimize SQL
Problems in SQL query:
- Full table scanning of SQL;
- No and no index is used;
- The use of file sorting is inefficient;
④ Solution
Create index
create index idx_article_ccv on article(category_id, comments, views); alter table `article` add index idx_article_ccv(`category_id`,`comments`,`views`)
View index
mysql> show index from article; +---------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | +---------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | article | 0 | PRIMARY | 1 | id | A | 3 | NULL | NULL | | BTREE | | | | article | 1 | idx_article_ccv | 1 | category_id | A | 3 | NULL | NULL | | BTREE | | | | article | 1 | idx_article_ccv | 2 | comments | A | 3 | NULL | NULL | | BTREE | | | | article | 1 | idx_article_ccv | 3 | views | A | 3 | NULL | NULL | | BTREE | | | +---------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ 4 rows in set (0.00 sec)
Performance view
After optimization, you can see that the query has used the index, but there is still the problem of file sorting.
Reasons for indexing but not fully using:
According to the working principle of B-Tree index
① Sort category first_ id;
② If the same category is encountered_ ID, and then sort comments;
③ If you encounter the same comments, then sort the views;
④ When comments is in the middle of the joint index, because comments > = 1, the condition is the range value (i.e. range)
⑤ MySQL can no longer use the views part behind the index for retrieval;
⑥ Finally, the index type used by MySQL is range, and the subsequent indexes are invalid;
Optimize index
# Delete inappropriate indexes mysql> drop index idx_article_ccv on article; Query OK, 0 rows affected (0.02 sec) Records: 0 Duplicates: 0 Warnings: 0 # Create a new index mysql> create index idx_article_cv on article(category_id, views); Query OK, 0 rows affected (0.02 sec) Records: 0 Duplicates: 0 Warnings: 0 # View new index mysql> show index from article; +---------+------------+----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | +---------+------------+----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | article | 0 | PRIMARY | 1 | id | A | 3 | NULL | NULL | | BTREE | | | | article | 1 | idx_article_cv | 1 | category_id | A | 3 | NULL | NULL | | BTREE | | | | article | 1 | idx_article_cv | 2 | views | A | 3 | NULL | NULL | | BTREE | | | +---------+------------+----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ 3 rows in set (0.00 sec)
View performance
After using the new index, the type type becomes ref and the index idx is used_ article_ CV and both index columns are used, and there will be no file sorting problem. SQL performance is very good.
5.1.2 optimization analysis of two tables
Where is the associated index of the two tables added?
① Create data table
CREATE TABLE IF NOT EXISTS `class`( `id` INT(10) UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT, `card` INT (10) UNSIGNED NOT NULL ); CREATE TABLE IF NOT EXISTS `book`( `bookid` INT(10) UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT, `card` INT (10) UNSIGNED NOT NULL ); INSERT INTO class(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO class(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO book(card)VALUES(FLOOR(1+(RAND()*20)));
Data table after creation
mysql> select * from book; +--------+------+ | bookid | card | +--------+------+ | 1 | 1 | | 2 | 16 | | 3 | 18 | | 4 | 2 | | 5 | 13 | | 6 | 19 | | 7 | 18 | | 8 | 10 | | 9 | 17 | | 10 | 12 | | 11 | 10 | | 12 | 13 | | 13 | 15 | | 14 | 14 | | 15 | 6 | | 16 | 6 | | 17 | 11 | | 18 | 17 | | 19 | 10 | | 20 | 20 | +--------+------+ 20 rows in set (0.00 sec) mysql> select * from class; +----+------+ | id | card | +----+------+ | 1 | 10 | | 2 | 3 | | 3 | 4 | | 4 | 9 | | 5 | 14 | | 6 | 4 | | 7 | 16 | | 8 | 10 | | 9 | 19 | | 10 | 6 | | 11 | 11 | | 12 | 17 | | 13 | 12 | | 14 | 6 | | 15 | 16 | | 16 | 1 | | 17 | 15 | | 18 | 10 | | 19 | 8 | | 20 | 9 | +----+------+ 20 rows in set (0.00 sec)
② Practical problems
Where to create an index when two tables are associated:
- Do you want to index the left table or the right table when connecting to the left?
- Do you want to index the left table or the right table when connecting to the right?
② Analysis and optimization
When left join, class is the left table and book is the right table
The analysis shows that the query is scanned for the whole table, and the index is not used.
③ Solution
Index the class table
# Index in class table mysql> create index idx_class_card on class(card); Query OK, 0 rows affected (0.02 sec) Records: 0 Duplicates: 0 Warnings: 0 # Query index mysql> show index from class; +-------+------------+----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | +-------+------------+----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | class | 0 | PRIMARY | 1 | id | A | 20 | NULL | NULL | | BTREE | | | | class | 1 | idx_class_card | 1 | card | A | 20 | NULL | NULL | | BTREE | | | +-------+------------+----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ 2 rows in set (0.00 sec)
After indexing the left table of the left join, the type reaches the index level, but the number of rows scanned does not decrease.
Index book table
mysql> drop index idx_class_card on class; Query OK, 0 rows affected (0.03 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> create index idx_book_card on book(card); Query OK, 0 rows affected (0.01 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> show index from book; +-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | +-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | book | 0 | PRIMARY | 1 | bookid | A | 20 | NULL | NULL | | BTREE | | | | book | 1 | idx_book_card | 1 | card | A | 20 | NULL | NULL | | BTREE | | | +-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ 2 rows in set (0.00 sec)
The index is added to the right table of the left connection. The type reaches the ref level, and the index is used. The number of rows scanned by the right table is reduced.
The above situation is caused by the characteristics of the left connection:
- The LEFT JOIN condition is used to determine how to make the search rows of the right table, and the left table must contain all of them.
- Therefore, the right table is the key table for our search and needs to be indexed.
Similarly, it can be deduced that the right connection has the same problem. Therefore, it is concluded that when the left connection is used, the index is established in the right table, and the right connection is used to establish the index in the left table, that is * * "the connection index is established in the opposite direction"**
5.1.3 optimization analysis of three tables
① Create data table
CREATE TABLE IF NOT EXISTS `phone`( `phoneid` INT(10) UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT, `card` INT (10) UNSIGNED NOT NULL )ENGINE = INNODB; INSERT INTO phone(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card)VALUES(FLOOR(1+(RAND()*20))); INSERT INTO phone(card)VALUES(FLOOR(1+(RAND()*20)));
mysql> select * from phone; +---------+------+ | phoneid | card | +---------+------+ | 1 | 18 | | 2 | 13 | | 3 | 10 | | 4 | 12 | | 5 | 8 | | 6 | 5 | | 7 | 2 | | 8 | 11 | | 9 | 12 | | 10 | 4 | | 11 | 3 | | 12 | 5 | | 13 | 13 | | 14 | 11 | | 15 | 13 | | 16 | 15 | | 17 | 15 | | 18 | 7 | | 19 | 10 | | 20 | 7 | +---------+------+ 20 rows in set (0.00 sec)
② Query requirements
Where is the three table Association index established?
③ Analysis and optimization
The three table association does not use an index and has a full table scan.
④ Solution
Through the association between the two tables, it can be seen that the index in the opposite direction of the connection needs to be established. Therefore, the index needs to be established on the card field of the book table and the phone table.
# Index the book table mysql> create index idx_book_card on book(card); Query OK, 0 rows affected (0.02 sec) Records: 0 Duplicates: 0 Warnings: 0 # Index the phone table mysql> alter table phone add index idx_phone_card(card); Query OK, 0 rows affected (0.02 sec) Records: 0 Duplicates: 0 Warnings: 0 # Query index mysql> show index from book; +-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | +-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | book | 0 | PRIMARY | 1 | bookid | A | 20 | NULL | NULL | | BTREE | | | | book | 1 | idx_book_card | 1 | card | A | 20 | NULL | NULL | | BTREE | | | +-------+------------+---------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ 2 rows in set (0.00 sec) mysql> show index from pnone; ERROR 1146 (42S02): Table 'base_crud.pnone' doesn't exist mysql> show index from phone; +-------+------------+----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | +-------+------------+----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | phone | 0 | PRIMARY | 1 | phoneid | A | 20 | NULL | NULL | | BTREE | | | | phone | 1 | idx_phone_card | 1 | card | A | 20 | NULL | NULL | | BTREE | | | +-------+------------+----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ 2 rows in set (0.00 sec)
Query analysis
After adding the index, you can see that the query uses the index and the number of rows scanned by the table decreases.
5.1.4 summary of association query optimization
① Reduce the number of NestedLoop loops in the Join statement as much as possible; "Drive large result sets with small result sets"
② Give priority to optimizing the inner field of NestedLoop;
③ When the JOIN field of the drive table cannot be indexed and the memory is sufficient, the JoinBuffer setting can be adjusted appropriately;
5.2 index failure
Index invalidation in MySQL:
① Operations on the index (calculation, function, manual / automatic type conversion) will lead to index failure and full table scanning;
② Using (! = or < >) in SQL will lead to index invalidation and full table scanning;
③ Using IS NOT NULL in SQL will lead to index invalidation and full table scanning;
④ If you scan the SQL table with the wildcard character LIKE (& LIKE), the full index will become invalid;
⑤ Not adding single quotation marks to strings in SQL will lead to index invalidation and full table scanning instead;
⑥ Using OR to connect in SQL will lead to index invalidation and full table scanning;
5.2.1 try to use full value matching
The fields of WHERE query in SQL statement can all match in the index in order.
The order of query fields in SQL has nothing to do with the order of using fields in the index. The optimizer will automatically optimize without affecting the SQL execution results.
5.2.2 optimal left prefix rule
The different order of query fields and index establishment in SQL will make the index unusable or even invalid. Therefore, when performing conditional filtering in SQL, you need to match and skip a certain field in order according to the order when establishing the index. The fields behind the index cannot be used. If you use a composite index, follow the leftmost prefix rule.
Leftmost prefix rule: the query starts from the leftmost column of the index and does not skip the columns in the index.
① When querying, the order of index creation is completely followed. You can see that the index is used and the query is constant
② If you do not query according to the index order and do not fully use the index column, you can see that the index is not fully used
③ If you do not use the first row of the index column to query, you can see that the full table scan is not performed using the index at all
5.2.3 do nothing on the index column
Calculation, function, (automatic / manual) type conversion will lead to index failure and turn to full table scanning.
If the WHERE query string in SQL is not quoted, SQL will convert the string, resulting in index invalidation.
5.2.4 no range query on index column
The storage engine cannot use the column on the right of the index range condition, so it puts the index order of the fields that may be queried in the range last.
5.2.5 try to use overlay index
Queries that only access indexes (index columns and query columns are consistent) reduce SELECT * operations
5.2.6 do not judge null on index column
When MySQL is used (! = or < >), the index will become invalid, resulting in full table scanning
Although blank judgment will lead to index invalidation, specific conditions need to be analyzed. SQL query cannot only consider whether the index is invalid.
5.2.7 non empty queries are not allowed on index columns
In SQL, if the table field is allowed to be NULL, the use of IS NULL in the WHERE condition will not invalidate the index, but IS NOT NULL will invalidate the index.
Do not judge that the field is not empty. It is best to set the default value for the field.
5.2.8 correct use of fuzzy query
In SQL, LIKE starts with a wildcard ('% abc...') MySQL will cause the index to fail and turn to full table scanning. That is, when using LIKE for fuzzy matching, left blur and full blur will lead to index failure, and right blur can use index.
Interview question: how to solve the problem that the LIKE '%' character% 'index is not used? Establish overlay index to solve the problem of index failure caused by total fuzziness.
As shown in the figure below, the overlay index can be used when establishing an index in the name and age fields and then performing a full fuzzy query.
5.2.9 pay attention to the use of connection
Using OR connection will lead to index failure. Try to use UNION OR UNION ALL instead.
5.2.10 index optimization summary
Full value matching is my favorite and the leftmost principle should be observed.
The leading brother cannot die, and the middle brother cannot break.
The index column is less calculated, and the range is invalid.
LIKE 100% is written to the right, and the overlay index is not written *.
Unequal null values also have OR. You should pay attention to the impact of index.
VAR quotation marks should not be lost. There are tricks for SQL optimization.
5.2.11 index optimization interview questions
① Create database
create table test( id int primary key not null auto_increment, c1 char(10), c2 char(10), c3 char(10), c4 char(10), c5 char(10)); insert into test(c1,c2,c3,c4,c5) values ('a1','a2','a3','a4','a5'); insert into test(c1,c2,c3,c4,c5) values ('b1','b2','b3','b4','b5'); insert into test(c1,c2,c3,c4,c5) values ('c1','c2','c3','c4','c5'); insert into test(c1,c2,c3,c4,c5) values ('d1','d2','d3','d4','d5'); insert into test(c1,c2,c3,c4,c5) values ('e1','e2','e3','e4','e5'); create index idx_test03_c1234 on test03(c1,c2,c3,c4);
mysql> select * from test; +----+------+------+------+------+------+ | id | c1 | c2 | c3 | c4 | c5 | +----+------+------+------+------+------+ | 1 | a1 | a2 | a3 | a4 | a5 | | 2 | b1 | b2 | b3 | b4 | b5 | | 3 | c1 | c2 | c3 | c4 | c5 | | 4 | d1 | d2 | d3 | d4 | d5 | | 5 | e1 | e2 | e3 | e4 | e5 | +----+------+------+------+------+------+ 5 rows in set (0.00 sec)
② Create index
# Create index mysql> create index idx_test_c1234 on test(c1,c2,c3,c4); Query OK, 0 rows affected (0.03 sec) Records: 0 Duplicates: 0 Warnings: 0 # Query index mysql> show index from test; +-------+------------+----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | +-------+------------+----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ | test | 0 | PRIMARY | 1 | id | A | 5 | NULL | NULL | | BTREE | | | | test | 1 | idx_test_c1234 | 1 | c1 | A | 5 | NULL | NULL | YES | BTREE | | | | test | 1 | idx_test_c1234 | 2 | c2 | A | 5 | NULL | NULL | YES | BTREE | | | | test | 1 | idx_test_c1234 | 3 | c3 | A | 5 | NULL | NULL | YES | BTREE | | | | test | 1 | idx_test_c1234 | 4 | c4 | A | 5 | NULL | NULL | YES | BTREE | | | +-------+------------+----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+ 5 rows in set (0.00 se
③ Analyze the following SQL execution
1) Basic query
explain select * from test where c1 = 'a1'; explain select * from test where c1 = 'a1' and c2 = 'a2'; explain select * from test where c1 = 'a1' and c2 = 'a2' and c3 = 'a3'; explain select * from test where c1 = 'a1' and c2 = 'a2' and c3 = 'a3' and c4 = 'a4';
Indexes can be used for the above SQL, and the index columns used gradually increase.
2) Basic query
explain select * from test where c1 = 'a1' and c2 = 'a2' and c3 = 'a3' and c4 = 'a4'; explain select * from test where c1 = 'a1' and c3 = 'a3' and c2 = 'a2' and c4 = 'a4'; explain select * from test where c4 = 'a4' and c3 = 'a3' and c2 = 'a2' and c1 = 'a1';
The above SQL queries use four index columns because MySQL internal optimizer performs SQL optimization, but it is recommended that the index order be consistent with the query order.
3) Range query
explain select * from test where c1 = 'a1' and c2 = 'a2' and c3 > 'a3' and c4 = 'a4';
The above SQL can only apply index columns c1, c2 and c3, because the range query at c3 leads to index invalidation.
explain select * from test where c1 = 'a1' and c2 = 'a2' and c4 > 'a4' and c3 = 'a3';
The above SQL application index columns c1, c2, c3 and c4 are invalid because of the range query at c4, but c4 is the last index column. Although it is a range search, all index columns are still used.
4) Single valued sort query
explain select * from test where c1 = 'a1' and c2 = 'a2' and c4 = 'a4' order by c3;
The above SQL actually uses three index columns c1, c2 and c3, but only shows that c1 and c2 are used. The actual situation is that c1 and c2 participate in query operation and a3 participate in sorting operation. This leads to the two functions of index, query and sorting.
explain select * from test where c1 = 'a1' and c2 = 'a2' order by c3;
It is consistent with the SQL execution above.
explain select * from test where c1 = 'a1' and c2 = 'a2' order by c4;
The reason why filesort appears in the above SQL is that the composite index breaks at a3, resulting in the subsequent a4 can only sort files
5) Multi valued sort query
explain select * from test where c1 = 'a1' and c5 = 'a5' order by c2,c3;
The above SQL only uses the c1 index column. In fact, c2 and c3 are used for sorting.
explain select * from test where c1 = 'a1' and c5 = 'a5' order by c3,c2;
filesort occurs in the above SQL because the index breaks at c2, resulting in index failure and file sorting instead.
explain select * from test where c1 = 'a1' and c2 = 'a2' order by c2,c3;
Apply two index columns c1 and c2, and c2 and c3 participate in sorting at the same time
explain select * from test where c1 = 'a1' and c2 = 'a2' and c5 = 'a5' order by c2,c3;
Consistent with the previous SQL execution.
explain select * from test where c1 = 'a1' and c2 = 'a2' and c5 = 'a5' order by c3,c2;
Combined with the characteristics of MySQL, c2 does not participate in sorting, and only c3 participates in sorting. In fact, the index columns c1 and c2 are used for searching, and c3 is used for sorting, and the index is not invalid.
Reason: in MySQL, when the sorted value is a constant, this constant will not participate in sorting.
explain select * from test where c1 = 'a1' and c5 = 'a5' order by c3,c2;
filesort will appear in the above SQL, and only the index column c1 is used for query. When c3 is used for sorting, the index fails because the index breaks at c2, so the file sorting is carried out instead.
6) Grouping query
explain select * from test where c1 = 'a1' and c4 = 'a4' group by c2,c3;
The index column c1 is used for query, and the index columns c2 and c3 are used for grouping. The index is not invalid.
explain select * from test where c1 = 'a1' and c4 = 'a4' group by c3,c2;
filesort and temporary will appear in the above SQL, and only the index column c1 is used for query. When c3 is used for grouping, the index fails because the index breaks at c2, so the file sorting is carried out instead. Meanwhile, due to the grouping operation of c3, MySQL will sort before grouping, resulting in temporary tables.
④ Summary of interview questions
- The fixed value and range are actually sorted. Generally, order by gives a range;
- group by generally needs to be sorted, and a temporary table will be generated;
5.3 optimization suggestions
① For single key indexes, try to select fields with better filtering performance for the current query;
② When selecting a combined index, the fields with the best filtering performance in the current query are the better in the order of index fields;
③ When selecting a composite index, try to select an index that can contain more fields in the WHERE clause of the current query;
④ Try to select the appropriate index by analyzing the statistical information and adjusting the writing method of query;