Why does Alibaba prohibit MyBatis from batch inserting thousands of data and using foreach

1. Method foreach batch insert

Because the code of the project cannot be leaked. The following is the simulated code.

<insert id="batchInsert" parameterType="java.util.List">
    insert into USER (id, name) values
    <foreach collection="list" item="model" index="index" separator=","> 
        (#{model.id}, #{model.name})
    </foreach>
</insert>

The principle of this method to improve the batch insertion speed is to combine the traditional:

INSERT INTO `table1` (`field1`, `field2`) VALUES ("data1", "data2");
INSERT INTO `table1` (`field1`, `field2`) VALUES ("data1", "data2");
INSERT INTO `table1` (`field1`, `field2`) VALUES ("data1", "data2");
INSERT INTO `table1` (`field1`, `field2`) VALUES ("data1", "data2");
INSERT INTO `table1` (`field1`, `field2`) VALUES ("data1", "data2");

Convert to:

INSERT INTO `table1` (`field1`, `field2`) 
VALUES ("data1", "data2"),
("data1", "data2"),
("data1", "data2"),
("data1", "data2"),
("data1", "data2");

At first glance, there seems to be no problem with this foreach, but through the project practice, it is found that when the number of columns in the table is large (20 +) and the number of rows inserted at one time is large (5000 +), the whole insertion takes a long time, up to 14 minutes

SqlSession session = sessionFactory.openSession(ExecutorType.BATCH);
for (Model model : list) {
    session.insert("insertStatement", model);
}
session.flushStatements();

The default executor type is Simple. A new preprocessing statement will be created for each statement, that is, a PreparedStatement object will be created. In our project, we will keep using the method of batch insertion. Because MyBatis cannot cache the contained statements, the sql statements will be re parsed every time the method is called.

Time consuming: since I have 5000 + values after foreach, this PreparedStatement is very long and contains many placeholders. The mapping of placeholders and parameters is particularly time-consuming. Moreover, referring to relevant data, we can see that the growth of values and the required analysis time increase exponentially.

Therefore, if you have to use foreach for batch insertion, you can consider reducing the number of values in an insert statement. It is best to reach the bottom value of the above curve to make the speed the fastest. Generally speaking, according to experience, it is more appropriate to insert 20 ~ 50 rows at one time, and the time consumption is acceptable.

As mentioned above, if you have to insert in a way, you can improve the performance. In fact, another method is recommended when batch insertion is written in MyBatis document.

2. Method executortype Batch batch insert

SqlSession session = sqlSessionFactory.openSession(ExecutorType.BATCH);
try {
    SimpleTableMapper mapper = session.getMapper(SimpleTableMapper.class);
    List<SimpleTableRecord> records = getRecordsToInsert(); // not shown
 
    BatchInsert<SimpleTableRecord> batchInsert = insert(records)
            .into(simpleTable)
            .map(id).toProperty("id")
            .map(firstName).toProperty("firstName")
            .map(lastName).toProperty("lastName")
            .map(birthDate).toProperty("birthDate")
            .map(employed).toProperty("employed")
            .map(occupation).toProperty("occupation")
            .build()
            .render(RenderingStrategy.MYBATIS3);
 
    batchInsert.insertStatements().stream().forEach(mapper::insert);
 
    session.commit();
} finally {
    session.close();
}

That is, the basic idea is to set the executor type of MyBatis session to Batch, and then execute the insert statement multiple times. It is similar to the following statement of JDBC.

Connection connection = DriverManager.getConnection("jdbc:mysql://127.0.0.1:3306/mydb?useUnicode=true&characterEncoding=UTF-8&useServerPrepStmts=false&rewriteBatchedStatements=true","root","root");
connection.setAutoCommit(false);
PreparedStatement ps = connection.prepareStatement(
        "insert into tb_user (name) values(?)");
for (int i = 0; i < stuNum; i++) {
    ps.setString(1,name);
    ps.addBatch();
}
ps.executeBatch();
connection.commit();
connection.close();

After testing, executortype The performance of batch insertion method is significantly improved, and all insertion can be completed in less than 2s.

3. Summary

If MyBatis requires batch insertion, executortype is recommended For the insertion method of batch, if it is not necessary to use the insertion, the record of each insertion needs to be controlled at about 20 ~ 50.

Keywords: Interview batch

Added by PTS on Wed, 02 Feb 2022 08:50:16 +0200