[dry goods, what happened to the senior Java Engineer dismissed by Tencent

END;;

– generate 100w random data
– it is expected to take half an hour or more. In fact, 1w messages can also be generated. The main reason is that more data can better reflect the importance of the index
call create_test_data(100*10000);

# Explain

Explain Is to identify a SQL Whether to take the easiest way to index, we can use this method to SQL For tuning, this article only needs to focus on the following items, about Exolain The specific description of is available[Specify]( )

*   type(  From the best to the worst const  > eq_ref > ref > range > index > all )
    *   const It means that it can be found once through the index, const For comparison primary key perhaps unique Indexes
    *   eq_ref Used in multi table joins primary key perhaps unique key As association condition
    *   ref  Non unique index scanning returns all rows matching an individual value. In essence, it is also an index access. It returns all rows matching an individual value. However, it may find multiple qualified rows, so it should belong to a mixture of search and scanning
    *   range Retrieve only rows in a given range, generally where An error occurred in the statement between,in Equal range query. This range scan index scan is better than full table scan
    *   index  Traverse the entire table, ALL The difference is index Type traverses only the index tree ( select Index column or order by There are two cases, but where No index is used )
    *   all  Traverse the entire table to find matching rows
    *   Generally, the query is guaranteed to reach at least range Level, preferably ref. 
*   key Which index will be used in this query
*   key_len The prefix length or the entire length used by the index
*   row Number of record lines scanned

– test that field b has an index and field c has no index

SELECT * from users where b = 'whatever, just test'; – Cost 0.001s

SELECT * from users where c = 'whatever, just test'; – Cost 0.306s

![](https://upload-images.jianshu.io/upload_images/24195226-47cfab1c1c5bd232.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

![](https://upload-images.jianshu.io/upload_images/24195226-2f65c04179ce5618.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)

# SQL optimization recommendations

## Use less select*

It's a cliche. Everyone knows it.

## Reasonable use limit 1

If you know that there is only one query result or only one record, it is recommended to use limit 1，Of course, there is no need to use a unique index if it already exists.

## Rational use of join

> Inner join Inner join: when two tables are queried, only the exact matching result sets in the two tables are retained
> 
> left join When a join query is performed on two tables, all rows in the left table will be returned, even if there is no matching record in the right table
> 
> right join When two tables are queried, all rows in the right table will be returned, even if there is no matching record in the left table

All satisfied SQL On the premise of demand, it is recommended to give priority to use Inner join(Internal connection), if you want to use left join，The data results in the left table shall be as small as possible. If conditions permit, they shall be processed on the left as far as possible.

## Batch insert data

When the number is small, it is not a big problem to insert one by one. If the amount of data is two, it is more efficient to use batch pull in statements

for(){

INSERT INTOtest.users(a,b,c) VALUES ('hLQK51GcL6','1DXIzvIS3t','4LsQGKva6U')

}

Better:

INSERT INTO test.users (a, b, c)
VALUES
– you can splice statements here, such as mybatis
(
'hLQK51GcL6',
'1DXIzvIS3t',
'4LsQGKva6U'
),
(
'hLQK51GcL6',
'1DXIzvIS3t',
'4LsQGKva6U'
)

## Try to replace union with union all

If used union，No matter whether the search results are duplicate or not, it will try to merge and sort before outputting the final results. If there are no duplicate records of known search results, use union all replace union，This will improve efficiency.

– execution time 0.06s
SELECT
*
FROM
users
LIMIT 0,
10000
UNION ALL
SELECT
*
FROM
users
LIMIT 10000,
20000

– execution time 0.2s
SELECT
*
FROM
users
LIMIT 0,
10000
UNION
SELECT
*
FROM
users
LIMIT 10000,
20000

## Several situations that invalidate an index

*   where There are no matching field types in the criteria
*   where Used in NOT,!=,IN ("IN" Mysql5.6 And above support index)
*   where Used in OR Connect fields without indexes
*   where Used in in (mysql5.6 And above support index)
*   like '%keyword%'
*   where Operate on fields or use functions in
*   Composite index used but not used"Boot column"

We know that in the test table b Fields are indexed, c There is no index. Let's test it one by one

**where There are no matching field types in the criteria**

– b is a string type, where and an integer is written. Although the sql can be executed normally, the index will not be used
EXPLAIN SELECT * from users where b=1;

** NOT,!=**

– will invalidate the index
EXPLAIN SELECT * from users where b not in('a');
EXPLAIN SELECT * from users where b is not null;
EXPLAIN SELECT * from users where b !='a'

**OR**

– use or to connect fields without indexes. Suppose it goes through the index of b, but when it comes to the query condition of c, it has to scan the whole table
– that is, a three-step process is required: full table scan + index scan + merge. Therefore, OR will cause index invalidation
– note that c in the test table does not have an index. If c also has an index, it is OK to use or
EXPLAIN SELECT * from users where b='a' or c='a'

– optimization method

1. Use in instead
SELECT * from users where b in ('b','bbb')

2.UNION
– for or, we can optimize our sql in this way. Although the second sql does not go through the index, the first sql goes through the index
SELECT * from users where b = 'b'
UNION
SELECT * from users where c = 'c'

**LIKE**

-% keyword% invalidates the index
SELECT * from users where a like '%abc%'

– for a positive example, "keyword%" can be used to improve query efficiency, similar to prefix index
SELECT * from users where a like 'abc%'

**where Operate on fields or use functions in**

– will invalidate the index
EXPLAIN SELECT * from users where YEAR(ctime) = '2020';
EXPLAIN SELECT * from users where d+1=2;

**Greater than sign and less than sign**

– in mysql, the greater than sign and the less than sign are magical things. Sometimes the index is used and sometimes the index is not used. It is said that it is related to the number of results. When the number is small (there is a proportion found on the Internet), the index is used
– it is recommended not to use ><

– you can filter out different number of tests by time
SELECT id from users where ctime>'2020-03-30 19:45:30'

**Composite index used but not used"Boot column"**

– it can be seen that there is a composite index idx in the table_ ABC (a, B, c), and an idx_b index, let's put IDX first_ B delete
– the following sql does not use the "boot column", so IDX will not be used_ ABC index, "guide column" only refers to the first field of the composite index
EXPLAIN SELECT * from users where c='c' and b='b' ;

– for the positive example, only a appears
EXPLAIN SELECT * from users where a='a' and b='b' ;
EXPLAIN SELECT * from users where a='a' and c='c' ;

## limit paging optimization

In our daily paging requirements, we usually use limit realization

– common practices
SELECT * from users LIMIT 1000000,10

When the offset is the largest, the query efficiency will be lower because Mysql Instead of skipping the offset to get the subsequent data directly, the offset is set first+The number of entries to be retrieved, and then discard the data of the previous offset and return.

Optimizing paging needs to be combined with business. Here are several solutions. There is no best but the most appropriate

**where Plus time filtering**

For example, only the latest year's data and only this year's data are obtained where createtime>'2020-01-01' 

**Discard the selected page, that is, there is only the previous page and the next page**

1.  Page 1 direct query
2.  Get first page max(id)，For example, 123 is usually the last data,
3.  Then query with index, so that only 10 pieces of data can be scanned at a time where id>123 limit 10

**Limit pages**

If only the first 100 pages are allowed to be obtained

# Index optimization

## Indexing

mysql Indexes in are divided into primary key index, unique index, general index and full-text index. The first three are commonly used. The first follows the primary key without manual creation, and the fourth full-text index is used for full-text search. only InnoDB and MyISAM Storage engine support FULLTEXT Index and are only available for CHAR， VARCHAR and TEXT Columns are rarely used, because some full-text retrieval frameworks such as elasticsearch，Instead of searching in the database.

– single column index
CREATE INDEX index_name ON users (name);
– multi column index

Last words

CodeChina open source project: [analysis of Java interview questions of front-line large manufacturers + core summary learning notes + latest explanation Video]

No matter which company, it attaches great importance to Spring framework technology and foundation, so don't underestimate any knowledge. Interview is a two-way choice process. Don't interview with fear, which is not conducive to your own play.
At the same time, you should focus not only on salary, but also on whether you really like the company. Well, I hope this article will be helpful to you!
In addition, I have collected interview knowledge points from many companies and shared various Java core knowledge points for free,
Below is only a partial screenshot
Factory Java interview question analysis + core summary learning notes + latest explanation Video]]( https://codechina.csdn.net/m0_60958482/java-p7)**

Keywords: Java Database SQL Back-end Programmer

Added by sujithtomy on Tue, 07 Sep 2021 04:21:14 +0300

Programming VIP

[dry goods, what happened to the senior Java Engineer dismissed by Tencent

Last words

Popular Keywords