MySQL 7: MySQL indexing practice

1, Prepare test environment

1.1. Create data table

/* Employee table */
CREATE TABLE `emp` (
  `id` int(8) NOT NULL AUTO_INCREMENT,
  `name` varchar(20) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL DEFAULT '',
  `job` varchar(9) NOT NULL DEFAULT '',
  `hiredate` date NOT NULL, /* Entry date */
  `sal_grade` int(8) NOT NULL,
  `age` int(4) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=4500001 DEFAULT CHARSET=utf8;

/* Wage scale */
CREATE TABLE `sal_grade` (
  `grade` int(8) unsigned NOT NULL DEFAULT '0',
  `losal` decimal(17,2) NOT NULL, /* minimum wage */
  `hisal` decimal(17,2) NOT NULL  /* Maximum wage */
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

1.2. Create a function to generate data

DELIMITER $$
/* Create a function that generates random numbers */
CREATE PROCEDURE `rand_num`() RETURNS int(11)
BEGIN
 DECLARE i INT DEFAULT 0;
 SET i = FLOOR(10+RAND()*100);
 RETURN i;
END $$
DELIMITER ;

DELIMITER $$
/* Create a function that generates a random string */
CREATE PROCEDURE `rand_string`(n INT) RETURNS varchar(255) CHARSET utf8
BEGIN
 DECLARE chars_str VARCHAR(100) DEFAULT 'abcdefghijklmnopqrstuvwxyzABCDEFJHIJKLMNOPQRSTUVWXYZ'; 
 DECLARE return_str VARCHAR(255) DEFAULT '';
 DECLARE i INT DEFAULT 0; 
 WHILE i < n DO
   SET return_str =CONCAT(return_str,SUBSTRING(chars_str,FLOOR(1+RAND()*52),1));
   SET i = i + 1;
   END WHILE;
 RETURN return_str;
END $$
DELIMITER ;

/* Function to insert data into emp table */
DELIMITER $$
CREATE PROCEDURE insert_emp(IN START INT(10),IN max_num INT(10))
BEGIN
 DECLARE i INT DEFAULT 0;
 SET autocommit = 0;
 REPEAT
  SET i = i + 1;
  INSERT INTO emp(`name`, `job`, `hiredate`, `sal_grade` ) VALUES (rand_string(6),'sale', DATE_ADD(CURDATE(), interval -rand_num() day), i%5 + 1);
	INSERT INTO emp(`name`, `job`, `hiredate`, `sal_grade` ) VALUES (rand_string(6),'development', DATE_ADD(CURDATE(), interval -rand_num() day), i%5 + 1);
	INSERT INTO emp(`name`, `job`, `hiredate`, `sal_grade` ) VALUES (rand_string(6),'product', DATE_ADD(CURDATE(), interval -rand_num() day), i%5 + 1);
  UNTIL i = max_num END REPEAT;
 COMMIT;
END $$
DELIMITER ;

If there is an error in execution, you can try to create one by one!

1.3 test data generated during execution

/* Insert 1500000 * 3 = 4500000 pieces of data into the emp table */
DELIMITER $$
CALL insert_emp(1,1500000)$$
DELIMITER ;

2, Index actual test

1. The first field of the joint index uses the range. MySQL generally does not use the index, but adopts the method of full table scanning.
Explain select * from EMP where name > 'Hu' and age = 70 and job = 'development';
Why possible_ The keys field has a joint index, that is, the SQL statement can use the joint index, but MySQL chooses full table scanning?
My guess is that the MySQL optimizer thinks that the first field of the joint index is used in a wide range, the result set is relatively large, and the use of SELECT * does not meet the conditions for covering the index, so it needs to return to the table, which will lead to low efficiency, so it turns to full table scanning.
Since MySQL chose full table scanning because of the low efficiency of returning to the table, let's test the situation that full table scanning is not required, that is, make the queried fields meet the conditions of overwriting the index.
Explain select name, age, job from EMP where name > "Hu" and age = 70 and job = 'development';
I didn't know. I was startled by the test. As expected, it was the same as I thought. Although range lookup is used for the first field of the joint index, MySQL chooses to use the joint index when the queried fields meet the conditions of overwriting the index and no back table operation is required.
Note: key_len = 62, and the name field is varchar(20). According to the formula of 3n+2, the SQL only uses the first field name of the joint index, while the age and job fields are not used.

2. The first field of the joint index uses the right fuzzy query, and MySQL generally uses the index.
explain select * from emp where name like "hu%" and age = 70 and job = 'development';

explain select name, age, job from emp where name like "hu%" and age = 70 and job = 'development';
As can be seen from the above two figures, when the first field of the joint index uses the like right fuzzy query, the index will be used regardless of whether the conditions for overwriting the index are met, and the key_len is 96, indicating that the three fields of the joint index are used.
Why does MySQL use full table scanning when the first field of the joint index uses range search and does not meet the coverage index, and use the index when the first field of the joint index uses like right fuzzy query?
My guess is that compared with the query condition name > 'Hu%' and name like 'hu%', the result set of the former is relatively large and needs to be returned more times.
Why only the name of the first field of the joint index is used when the first field of the joint index uses the range search and meets the coverage index, but when the first field of the joint index uses the like right fuzzy query, all the fields of the joint index name, age and job are used?
Haven't figured out o(╥﹏╥) o

3. In a federated index, when a field uses a range condition, the index field behind the field cannot use the index.

  1. Use range condition of the first field of the union index:
    Select name, age, job from EMP where name > "Hu" and age = 20 and job = "product";

    key_ The value of len is 62, indicating that only the first field name of the joint index is used.
  2. Use range condition of the second field of the union index:
    Explain select name, age, job from EMP where name = "Hu" and age > 20 and job = "product";

    key_ The value of len is 67, indicating that the first two fields name and age of the joint index are used.
  3. Only the last field of the union index uses the range condition:
    Explain select name, age, job from EMP where name = "Hu" and age = 20 and job > "product";

    key_ The value of len is 96, indicating that all the fields name, age and job of the joint index are used.

4. To be continued

Keywords: Database MySQL

Added by DirtySnipe on Fri, 11 Feb 2022 11:32:23 +0200