MySQL learning notes - 9-order by

select city,name,age from t where city='Hangzhou' order by name limit 1000;

I-1 full field sorting execution process

1. Initialize sort_buffer: confirm to put the name, city and age fields;
2. Find the first primary key id that meets the condition of city = 'Hangzhou' from the index City, that is, the id in the figure_ X;
3. Go to the primary key id index, take out the whole row, take the values of the three fields name, city and age, and store them in sort_ In buffer;
4. Get the primary key id of a record from the index city;
5. Repeat steps 3 and 4 until the value of city does not meet the query criteria. The corresponding primary key id is the id in the figure_ Y;
6. Yes, sort_ The data in the buffer is quickly sorted according to the field name;
7. Take the first 1000 rows according to the sorting results and return them to the client.

Disadvantages of I-2 full field sorting:

1. Cause sort_ There is not much data in the buffer, because there are other fields besides sorting fields. Sort_ The utilization efficiency of buffer is not high
2. When a large amount of sorting data is required, there will be a lot of temporary files and the sorting performance will be very poor

I-3 advantages of full field sorting:

MySQL thinks that when the memory is large enough, it will give priority to full field sorting, because this method avoids a back to table operation than rowid sorting

Relevant knowledge points of full field sorting:
MySQL One memory is allocated for each thread( sort_buffer)The memory size for sorting is sort_buffer_size: 
If the amount of data sorted is less than sort_buffer_size,Sorting will be done in memory;
If the amount of sorting data is large and so much data cannot be stored in memory, temporary disk files will be used to assist sorting, also known as external sorting;
When using external sorting, MySQL It will be divided into several separate temporary files to store the sorted data, and then these files will be merged into a large file.

II-1rowid sorting execution process

1. Initialize sort_buffer: confirm to put in two fields, namely name and id;
2. Find the first primary key id that meets the condition of city = 'Hangzhou' from the index City, that is, the id in the figure_ X;
3. Go to the primary key id index, take out the whole row, take the name and id fields, and store them in sort_ In buffer;
4. Get the primary key id of a record from the index city; Repeat steps 3 and 4 until the condition of city = 'Hangzhou' is not met, that is, the id in the figure_ Y;
5. Yes, sort_ The data in the buffer is sorted by the field name; Traverse the sorting result, take the first 1000 rows, return to the original table according to the id value, and take out the three fields of city, name and age to return to the client.

Disadvantages of II-2rowid sorting:

The operation of returning to the table is random IO, which will cause a large number of random reads. It does not necessarily reduce the access to the disk than full field sorting

II-3rowid sorting advantages:

Better use of memory sort_ The buffer performs sorting operations to minimize access to the disk

rowid Sort relevant knowledge points:
mysql The data that meets the conditions will be read to by traversing the index sort_buffer,And quickly sort according to the sorting field
 If the queried fields are not included in the secondary index, you need to return the clustered index according to the primary key of the secondary index record to get the required fields
 This method will cause random IO,stay MySQL5.6 Provided MRR The mechanism will take out the primary key of the auxiliary index matching record, sort it in memory, and then return it to the table
 Establish a joint index according to the situation to avoid the performance loss caused by sorting. If allowed, you can also establish an overlay index to avoid returning to the table

III-1 joint index sorting: create a joint index of city and name

alter table t add index city_user(city, name);

1. Find the first primary key id that meets the condition of city = 'Hangzhou' from the index (city,name);
2. Get the whole row from the primary key id index, get the values of name, city and age, and return them directly as part of the result set;
3. Remove the primary key id of a record from the index (city,name);
4. Repeat steps 2 and 3 until the 1000th record is found or the condition of city = 'Hangzhou' is not met.

III-2 disadvantages of joint index:

Cost of maintaining index

III-3 advantages of joint index:

The query process does not need temporary tables or sorting, and the performance is optimized a lot.

IV-1 overlay index sorting: joint index of city, name and age

alter table t add index city_user_age(city, name, age);

1. Find the first record that meets the condition of city = 'Hangzhou' from the index (city,name,age), take out the values of the three fields of city, name and age, and return them directly as part of the result set;
2. Take a record from the index (city,name,age), also take out the values of these three fields, and return them directly as part of the result set;
3. Repeat step 2 until the 1000th record is found or the condition of city = 'Hangzhou' is not met.

IV-2 disadvantages of joint index:

Cost of maintaining index

IV-3 joint index advantages:

The query process does not need temporary tables or sorting, and the performance is optimized a lot.

How to determine whether a sort statement uses temporary files:

/* Open optimizer_trace, valid only for this thread */
SET optimizer_trace='enabled=on'; 

/* @a Save InnoDB_ rows_ Initial value of read */
select VARIABLE_VALUE into @a from  performance_schema.session_status where variable_name = 'Innodb_rows_read';

/* Execute statement */
select city, name,age from t where city='Hangzhou' order by name limit 1000; 

/* View OPTIMIZER_TRACE output */
SELECT * FROM `information_schema`.`OPTIMIZER_TRACE`\G

/* @b Save InnoDB_ rows_ Current value of read */
select VARIABLE_VALUE into @b from performance_schema.session_status where variable_name = 'Innodb_rows_read';

/* Calculate Innodb_rows_read difference */
select @b-@a;

Keywords: MySQL

Added by millwardt on Tue, 01 Feb 2022 19:28:55 +0200