The difference of NULL between oracle and mysql indexes

1, Question

oracle's btree index does not store NULL value, so index range scanning will not be used with is null or is not null, but is this also true in mysql?

2, Experiment

Let's take a look at NULL in oracle (11g)
Prepare test data

SQL> create table t1 as select * from dba_objects;
SQL> update t1 set object_id = null where object_id > 17840;
SQL> update t1 set data_object_id = null where data_object_id > 60;
SQL> commit;
SQL> create index idx1_id on t1(object_id);
SQL> create index idx2_data on t1(data_object_id);

Collect statistical information

SQL> begin
dbms_stats.gather_table_stats(ownname => 'SCOTT',
tabname => 'T1',
estimate_percent => 100,
cascade => true,  
method_opt => 'for all indexed columns size auto', 
no_invalidate => false,
degree => 4);
end;
/

View data distribution

SQL> select count(*) "Total number of rows", 
  2         count(distinct object_id) "object_id Non null different values", 
  3         count(decode(object_id,null,1,null)) "object_id Total number of null values",
  4         count(distinct data_object_id) "data_object_id Non null different values",
  5         count(decode(data_object_id,null,1,null)) "data_object_id Total number of null values"
  6    from t1;

    Total number of rows object_id Non null different values object_id Total number of null values data_object_id Non null different values data_object_id Total number of null values
---------- ------------------- ----------------- ------------------------ ----------------------
     13582               13578                 4                       47                  13510

Execute sql and view the execution plan
Article 1 sql: is null returns fewer rows

SQL> select * from t1 where object_id is null;  

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |    50 (100)|          |
|*  1 |  TABLE ACCESS FULL| T1   |     4 |   352 |    50   (0)| 00:00:01 |
--------------------------------------------------------------------------

Article 2 sql: is not null returns a large number of rows

SQL> select * from t1 where object_id is not null;  

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |    50 (100)|          |
|*  1 |  TABLE ACCESS FULL| T1   | 13578 |  1166K|    50   (0)| 00:00:01 |
--------------------------------------------------------------------------

Article 3 sql: is null returns more rows

SQL> select * from t1 where data_object_id is null;  

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |    50 (100)|          |
|*  1 |  TABLE ACCESS FULL| T1   | 13510 |  1161K|    50   (0)| 00:00:01 |
--------------------------------------------------------------------------

Article 4 sql: is not null returns fewer rows

SQL> select * from t1 where data_object_id is not null;  

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |       |       |     7 (100)|          |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1        |    72 |  6336 |     7   (0)| 00:00:01 |
|*  2 |   INDEX FULL SCAN           | IDX2_DATA |    72 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

It can be seen that the indexes will not be used for the sql in Articles 1 and 3. This is because the oracle btree index does not store NULL, so no results can be found in the index using is null as the condition, and the whole table can only be scanned.
The index is not used in the second sql because there are many rows returned. Article 4 sql uses the index, but uses the index full scan. In fact, the principle is that the index does not store NULL. is not null is exactly the same as the index characteristics.

Next, let's look at what happens in mysql (8.0). Import the above table into mysql through tools
Update statistics for t1 table

analyze table t1;

View execution plan
Article 5 sql: is null returns fewer rows

(scott@localhost)[hello]> explain select * from t1 where object_id is null; 
+----+-------------+-------+------------+------+---------------+---------+---------+-------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key     | key_len | ref   | rows | filtered | Extra                 |
+----+-------------+-------+------------+------+---------------+---------+---------+-------+------+----------+-----------------------+
|  1 | SIMPLE      | t1    | NULL       | ref  | IDX1_ID       | IDX1_ID | 5       | const |    4 |   100.00 | Using index condition |
+----+-------------+-------+------------+------+---------------+---------+---------+-------+------+----------+-----------------------+

Article 6 sql: is not null returns a large number of rows

(scott@localhost)[hello]> explain select * from t1 where object_id is not null; 
+----+-------------+-------+------------+------+---------------+------+---------+------+-------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows  | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+-------+----------+-------------+
|  1 | SIMPLE      | t1    | NULL       | ALL  | IDX1_ID       | NULL | NULL    | NULL | 13541 |    50.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+-------+----------+-------------+

Article 7 sql: is null returns more rows

(scott@localhost)[hello]> explain select * from t1 where data_object_id is null;
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key       | key_len | ref   | rows | filtered | Extra                 |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+-----------------------+
|  1 | SIMPLE      | t1    | NULL       | ref  | IDX2_DATA     | IDX2_DATA | 5       | const | 6770 |   100.00 | Using index condition |
+----+-------------+-------+------------+------+---------------+-----------+---------+-------+------+----------+-----------------------+

Article 8 sql: is not null returns fewer rows

(scott@localhost)[hello]> explain select * from t1 where data_object_id is not null;
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-----------------------+
| id | select_type | table | partitions | type  | possible_keys | key       | key_len | ref  | rows | filtered | Extra                 |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-----------------------+
|  1 | SIMPLE      | t1    | NULL       | range | IDX2_DATA     | IDX2_DATA | 5       | NULL |   72 |   100.00 | Using index condition |
+----+-------------+-------+------------+-------+---------------+-----------+---------+------+------+----------+-----------------------+

It can be seen that in mysql, is not null determines whether to use the index according to the number of returned rows. If the number of returned rows is more, the index is not used, and the number of returned rows is less, which is the same as oracle. But is null will use the index. No matter how many rows you return, this is really beyond my expectation. Since is null uses the index, does MySQL's btree index contain NULL values?

View index information

(scott@localhost)[hello]> select * from mysql.innodb_index_stats where database_name='hello' and table_name='t1' and index_name in ('IDX1_ID', 'IDX2_DATA');


You can see from the statistics of the index that mysql considers the object of the t1 table_ ID,DATA_ OBJECT_ The different values of ID are 13579 and 48 respectively. And we know that object_ ID non NULL different values and data_ object_ The different values of ID are 13578 and 47 respectively. The difference between the two is 1, which means that the index does contain NULL values.

3, Summary

  1. The btree index in mysql contains NULL, which is different from oracle.
  2. Indexes are used when using is null in mysql. No matter how many rows are returned, I think this is a bug.
    The level is limited. If there is any error, please correct it!

Keywords: MySQL

Added by fsumba on Sun, 26 Dec 2021 03:21:16 +0200