Still using POI? Try EasyExcel, easily export 100W data, do not get stuck, easy to use!

I brief introduction

Export is a common function of the background management system. When the amount of data is particularly large, there will be memory overflow and stuck pages. Once I encapsulated an export. Batch query data is used to avoid memory overflow and SXSSFWorkbook is used to cache data to files to solve the problem of blocking pages when downloading large files EXCEL.

However, first, the encapsulation is not friendly and inconvenient to use. Second, the operation mode of these POIs still occupies too much memory. Third, there are data defects during empty loop and division, as well as the hidden danger of memory overflow.

Inadvertently queried the EasyExcel framework open source by Alibaba and found that the memory occupation of parsed EXCEL can be controlled at KB level, and there will be no memory overflow (internal implementation to be studied). In addition, it is extremely fast, with about 100W records and more than a dozen fields, which can be downloaded in only 70 seconds.

So he abandoned his own package and moved to study Alibaba's open source EasyExcel However, to be honest, the one you encapsulated at that time still had some technical content, such as appearance mode, template method mode, delegation idea and combination idea. You can have a look. In addition, wechat search focuses on the Java technology stack and sends: design patterns. You can get the practical tutorial of Java design patterns I sorted out.

The github address of EasyExcel is: https://github.com/alibaba/ea...

II case

2.1 POM dependency

<!-- Ali Kaiyuan EXCEL -->
<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>easyexcel</artifactId>
    <version>1.1.1</version>
</dependency>

2.2 POJO object

package com.authorization.privilege.excel;
 
import java.util.Date;
 
/**
 * @author qjwyss
 * @description
 */
public class User {
 
    private String uid;
    private String name;
    private Integer age;
    private Date birthday;
 
    public User() {
    }
 
    public User(String uid, String name, Integer age, Date birthday) {
        this.uid = uid;
        this.name = name;
        this.age = age;
        this.birthday = birthday;
    }
 
    public String getUid() {
        return uid;
    }
 
    public void setUid(String uid) {
        this.uid = uid;
    }
 
    public String getName() {
        return name;
    }
 
    public void setName(String name) {
        this.name = name;
    }
 
    public Integer getAge() {
        return age;
    }
 
    public void setAge(Integer age) {
        this.age = age;
    }
 
    public Date getBirthday() {
        return birthday;
    }
 
    public void setBirthday(Date birthday) {
        this.birthday = birthday;
    }
}

2.3 test environment

2.3. 1. Less data (within 20W): one SHEET can be queried and exported at a time
/**
 * For a small number of records (about within 20W), you can call this method to find out at one time and then write it to a SHEET in EXCEL
 * Note: the number of records queried at one time should not be too large to prevent memory overflow.
 *
 * @throws IOException
 */
@Test
public void writeExcelOneSheetOnceWrite() throws IOException {

    //Generate EXCEL and specify the output path
    OutputStream out = new FileOutputStream("E:\\temp\\withoutHead1.xlsx");
    ExcelWriter writer = new ExcelWriter(out, ExcelTypeEnum.XLSX);

    //Set SHEET
    Sheet sheet = new Sheet(1, 0);
    sheet.setSheetName("sheet1");

    //Set title
    Table table = new Table(1);
    List<List<String>> titles = new ArrayList<List<String>>();
    titles.add(Arrays.asList("user ID"));
    titles.add(Arrays.asList("name"));
    titles.add(Arrays.asList("Age"));
    titles.add(Arrays.asList("birthday"));
    table.setHead(titles);

    //The query data can be exported. For example, a total of 100 pieces of data can be queried at one time
    List<List<String>> userList = new ArrayList<>();
    for (int i = 0; i < 100; i++) {
        userList.add(Arrays.asList("ID_" + i, "Xiao Ming" + i, String.valueOf(i), new Date().toString()));
    }

    writer.write0(userList, sheet, table);
    writer.finish();
}
2.3. 2. Moderate data volume (within 100W): one SHEET is queried and exported in batches
/**
 * For the number of records within 105W, you can call this method to find out in multiple batches, and then write it to a SHEET in EXCEL
 * be careful:
 * The number of records queried each time should not be too large. Set a reasonable number of records queried each time according to the memory size without memory overflow.
 * The data volume cannot exceed the maximum data volume of 105W stored in a SHEET
 *
 * @throws IOException
 */
@Test
public void writeExcelOneSheetMoreWrite() throws IOException {

    //Generate EXCEL and specify the output path
    OutputStream out = new FileOutputStream("E:\\temp\\withoutHead2.xlsx");
    ExcelWriter writer = new ExcelWriter(out, ExcelTypeEnum.XLSX);

    //Set SHEET
    Sheet sheet = new Sheet(1, 0);
    sheet.setSheetName("sheet1");

    //Set title
    Table table = new Table(1);
    List<List<String>> titles = new ArrayList<List<String>>();
    titles.add(Arrays.asList("user ID"));
    titles.add(Arrays.asList("name"));
    titles.add(Arrays.asList("Age"));
    titles.add(Arrays.asList("birthday"));
    table.setHead(titles);

    //Simulated batch query: the total number of records is 50, 20 records are queried each time, and the number of records queried in three times is 10
    Integer totalRowCount = 50;
    Integer pageSize = 20;
    Integer writeCount = totalRowCount % pageSize == 0 ? (totalRowCount / pageSize) : (totalRowCount / pageSize + 1);

    //Note: This is only for the purpose of simulating data. The practical environment does not need to separate the last time and synthesize one. The parameter is: currentPage = i+1;   pageSize = pageSize
    for (int i = 0; i < writeCount; i++) {

        //The first two queries {each time 20 pieces of data were queried
        if (i < writeCount - 1) {

            List<List<String>> userList = new ArrayList<>();
            for (int j = 0; j < pageSize; j++) {
                userList.add(Arrays.asList("ID_" + Math.random(), "Xiao Ming", String.valueOf(Math.random()), new Date().toString()));
            }
            writer.write0(userList, sheet, table);

        } else if (i == writeCount - 1) {

            //The last query} found 10 extra records
            List<List<String>> userList = new ArrayList<>();
            Integer lastWriteRowCount = totalRowCount - (writeCount - 1) * pageSize;
            for (int j = 0; j < lastWriteRowCount; j++) {
                userList.add(Arrays.asList("ID_" + Math.random(), "Xiao Ming", String.valueOf(Math.random()), new Date().toString()));
            }
            writer.write0(userList, sheet, table);
        }
    }

    writer.finish();
}
2.3. 3. Large amount of data (millions are OK): multiple sheets are queried and exported in batches
/**
 * For millions of records, you can call this method to find them in multiple batches, and then write them to multiple sheets in EXCEL
 * be careful:
 * perSheetRowCount % pageSize It should be divisible. For simplicity, the non divisible part is not handled
 * The number of records queried each time should not be too large. Set a reasonable number of records queried each time according to the memory size without memory overflow.
 *
 * @throws IOException
 */
@Test
public void writeExcelMoreSheetMoreWrite() throws IOException {

    //Generate EXCEL and specify the output path
    OutputStream out = new FileOutputStream("E:\\temp\\withoutHead3.xlsx");
    ExcelWriter writer = new ExcelWriter(out, ExcelTypeEnum.XLSX);

    //Set SHEET name
    String sheetName = "test SHEET";

    //Set title
    Table table = new Table(1);
    List<List<String>> titles = new ArrayList<List<String>>();
    titles.add(Arrays.asList("user ID"));
    titles.add(Arrays.asList("name"));
    titles.add(Arrays.asList("Age"));
    titles.add(Arrays.asList("birthday"));
    table.setHead(titles);

    //Simulated batch query: the total number of records is 250, 100 are stored in each SHEET, and 20 are queried each time. 3 sheets are generated. The first two sheets are queried 5 times, the last SHEET is queried 3 times, and the number of records written last is 10
    //Note: in this version, in order to reduce the complexity of data judgment, the perSheetRowCount should be able to divide the pageSize without too much processing, and can reasonably allocate the query data size without memory overflow.
    Integer totalRowCount = 250;
    Integer perSheetRowCount = 100;
    Integer pageSize = 20;
    Integer sheetCount = totalRowCount % perSheetRowCount == 0 ? (totalRowCount / perSheetRowCount) : (totalRowCount / perSheetRowCount + 1);
    Integer previousSheetWriteCount = perSheetRowCount / pageSize;
    Integer lastSheetWriteCount = totalRowCount % perSheetRowCount == 0 ?
            previousSheetWriteCount :
            (totalRowCount % perSheetRowCount % pageSize == 0 ? totalRowCount % perSheetRowCount / pageSize : (totalRowCount % perSheetRowCount / pageSize + 1));

    for (int i = 0; i < sheetCount; i++) {

        //Create SHEET
        Sheet sheet = new Sheet(i, 0);
        sheet.setSheetName(sheetName + i);

        if (i < sheetCount - 1) {

            //For the first two sheets, check each SHEET five times, check 20 items each time, write 100 lines for each SHEET, and 200 lines for the two sheets in total. Practical environment: parameters: currentPage: j+1 + previousSheetWriteCount*i, pageSize: pageSize
            for (int j = 0; j < previousSheetWriteCount; j++) {
                List<List<String>> userList = new ArrayList<>();
                for (int k = 0; k < 20; k++) {
                    userList.add(Arrays.asList("ID_" + Math.random(), "Xiao Ming", String.valueOf(Math.random()), new Date().toString()));
                }
                writer.write0(userList, sheet, table);
            }

        } else if (i == sheetCount - 1) {

            //The last SHEET utility environment does not need to separate the last one, but can be synthesized into one. The parameters are: currentPage = i+1;   pageSize = pageSize
            for (int j = 0; j < lastSheetWriteCount; j++) {

                //The first two queries # 20 at a time
                if (j < lastSheetWriteCount - 1) {

                    List<List<String>> userList = new ArrayList<>();
                    for (int k = 0; k < 20; k++) {
                        userList.add(Arrays.asList("ID_" + Math.random(), "Xiao Ming", String.valueOf(Math.random()), new Date().toString()));
                    }
                    writer.write0(userList, sheet, table);

                } else if (j == lastSheetWriteCount - 1) {

                    //The last query} will find the remaining 10 items
                    List<List<String>> userList = new ArrayList<>();
                    Integer lastWriteRowCount = totalRowCount - (sheetCount - 1) * perSheetRowCount - (lastSheetWriteCount - 1) * pageSize;
                    for (int k = 0; k < lastWriteRowCount; k++) {
                        userList.add(Arrays.asList("ID_" + Math.random(), "Xiaoming 1", String.valueOf(Math.random()), new Date().toString()));
                    }
                    writer.write0(userList, sheet, table);

                }
            }
        }
    }

    writer.finish();
}

2.4 production environment

2.4.0.Excel constant class
package com.authorization.privilege.constant;
 
/**
 * @author qjwyss
 * @description EXCEL Constant class
 */
public class ExcelConstant {
 
    /**
     * The number of records stored in each sheet is 100W
     */
    public static final Integer PER_SHEET_ROW_COUNT = 1000000;
 
    /**
     * The number of records written to EXCEL each time (query the data size of each page) is 20W
     */
    public static final Integer PER_WRITE_ROW_COUNT = 200000;
 
}

Note: for the convenience of writing, the two here must be divided, which can save a lot of unnecessary judgment. In addition, if you test by yourself, it can be changed to 100,20.

2.4. 1. Less data (within 20W): one SHEET can be queried and exported at a time
@Override
public ResultVO<Void> exportSysSystemExcel(SysSystemVO sysSystemVO, HttpServletResponse response) throws Exception {

    ServletOutputStream out = null;
    try {
        out = response.getOutputStream();
        ExcelWriter writer = new ExcelWriter(out, ExcelTypeEnum.XLSX);

        //Set EXCEL name
        String fileName = new String(("SystemExcel").getBytes(), "UTF-8");

        //Set SHEET name
        Sheet sheet = new Sheet(1, 0);
        sheet.setSheetName("System list sheet1");

        //Set title
        Table table = new Table(1);
        List<List<String>> titles = new ArrayList<List<String>>();
        titles.add(Arrays.asList("System name"));
        titles.add(Arrays.asList("System identification"));
        titles.add(Arrays.asList("describe"));
        titles.add(Arrays.asList("state"));
        titles.add(Arrays.asList("Creator"));
        titles.add(Arrays.asList("Creation time"));
        table.setHead(titles);

        //Query data and write EXCEL
        List<List<String>> dataList = new ArrayList<>();
        List<SysSystemVO> sysSystemVOList = this.sysSystemReadMapper.selectSysSystemVOList(sysSystemVO);
        if (!CollectionUtils.isEmpty(sysSystemVOList)) {
            sysSystemVOList.forEach(eachSysSystemVO -> {
                dataList.add(Arrays.asList(
                        eachSysSystemVO.getSystemName(),
                        eachSysSystemVO.getSystemKey(),
                        eachSysSystemVO.getDescription(),
                        eachSysSystemVO.getState().toString(),
                        eachSysSystemVO.getCreateUid(),
                        eachSysSystemVO.getCreateTime().toString()
                ));
            });
        }
        writer.write0(dataList, sheet, table);

        //Download EXCEL
        response.setHeader("Content-Disposition", "attachment;filename=" + new String((fileName).getBytes("gb2312"), "ISO-8859-1") + ".xls");
        response.setContentType("multipart/form-data");
        response.setCharacterEncoding("utf-8");
        writer.finish();
        out.flush();

    } finally {
        if (out != null) {
            try {
                out.close();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

    return ResultVO.getSuccess("Export system list EXCEL success");
}
2.4. 2. Moderate data volume (within 100W): one SHEET is queried and exported in batches
@Override
public ResultVO<Void> exportSysSystemExcel(SysSystemVO sysSystemVO, HttpServletResponse response) throws Exception {

    ServletOutputStream out = null;
    try {
        out = response.getOutputStream();
        ExcelWriter writer = new ExcelWriter(out, ExcelTypeEnum.XLSX);

        //Set EXCEL name
        String fileName = new String(("SystemExcel").getBytes(), "UTF-8");

        //Set SHEET name
        Sheet sheet = new Sheet(1, 0);
        sheet.setSheetName("System list sheet1");

        //Set title
        Table table = new Table(1);
        List<List<String>> titles = new ArrayList<List<String>>();
        titles.add(Arrays.asList("System name"));
        titles.add(Arrays.asList("System identification"));
        titles.add(Arrays.asList("describe"));
        titles.add(Arrays.asList("state"));
        titles.add(Arrays.asList("Creator"));
        titles.add(Arrays.asList("Creation time"));
        table.setHead(titles);

        //Query the total number and [encapsulate relevant variables] just copy this directly [do not change]
        Integer totalRowCount = this.sysSystemReadMapper.selectCountSysSystemVOList(sysSystemVO);
        Integer pageSize = ExcelConstant.PER_WRITE_ROW_COUNT;
        Integer writeCount = totalRowCount % pageSize == 0 ? (totalRowCount / pageSize) : (totalRowCount / pageSize + 1);

        //Write data. Just copy the maximum value of i directly. Don't change it
        for (int i = 0; i < writeCount; i++) {
            List<List<String>> dataList = new ArrayList<>();

            //You can query and encapsulate the data here. The encapsulated variables , currentPage and pageSize , do not change
            PageHelper.startPage(i + 1, pageSize);
            List<SysSystemVO> sysSystemVOList = this.sysSystemReadMapper.selectSysSystemVOList(sysSystemVO);
            if (!CollectionUtils.isEmpty(sysSystemVOList)) {
                sysSystemVOList.forEach(eachSysSystemVO -> {
                    dataList.add(Arrays.asList(
                            eachSysSystemVO.getSystemName(),
                            eachSysSystemVO.getSystemKey(),
                            eachSysSystemVO.getDescription(),
                            eachSysSystemVO.getState().toString(),
                            eachSysSystemVO.getCreateUid(),
                            eachSysSystemVO.getCreateTime().toString()
                    ));
                });
            }
            writer.write0(dataList, sheet, table);
        }

        //Download EXCEL
        response.setHeader("Content-Disposition", "attachment;filename=" + new String((fileName).getBytes("gb2312"), "ISO-8859-1") + ".xls");
        response.setContentType("multipart/form-data");
        response.setCharacterEncoding("utf-8");
        writer.finish();
        out.flush();

    } finally {
        if (out != null) {
            try {
                out.close();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

    return ResultVO.getSuccess("Export system list EXCEL success");
}
2.4. 3. The data is large (millions are OK): multiple sheets are queried and exported in batches
@Override
public ResultVO<Void> exportSysSystemExcel(SysSystemVO sysSystemVO, HttpServletResponse response) throws Exception {

    ServletOutputStream out = null;
    try {
        out = response.getOutputStream();
        ExcelWriter writer = new ExcelWriter(out, ExcelTypeEnum.XLSX);

        //Set EXCEL name
        String fileName = new String(("SystemExcel").getBytes(), "UTF-8");

        //Set SHEET name
        String sheetName = "System list sheet";

        //Set title
        Table table = new Table(1);
        List<List<String>> titles = new ArrayList<List<String>>();
        titles.add(Arrays.asList("System name"));
        titles.add(Arrays.asList("System identification"));
        titles.add(Arrays.asList("describe"));
        titles.add(Arrays.asList("state"));
        titles.add(Arrays.asList("Creator"));
        titles.add(Arrays.asList("Creation time"));
        table.setHead(titles);

        //Query the total number and encapsulate related variables (just copy this directly, don't change it)
        Integer totalRowCount = this.sysSystemReadMapper.selectCountSysSystemVOList(sysSystemVO);
        Integer perSheetRowCount = ExcelConstant.PER_SHEET_ROW_COUNT;
        Integer pageSize = ExcelConstant.PER_WRITE_ROW_COUNT;
        Integer sheetCount = totalRowCount % perSheetRowCount == 0 ? (totalRowCount / perSheetRowCount) : (totalRowCount / perSheetRowCount + 1);
        Integer previousSheetWriteCount = perSheetRowCount / pageSize;
        Integer lastSheetWriteCount = totalRowCount % perSheetRowCount == 0 ?
                previousSheetWriteCount :
                (totalRowCount % perSheetRowCount % pageSize == 0 ? totalRowCount % perSheetRowCount / pageSize : (totalRowCount % perSheetRowCount / pageSize + 1));


        for (int i = 0; i < sheetCount; i++) {

            //Create SHEET
            Sheet sheet = new Sheet(i, 0);
            sheet.setSheetName(sheetName + i);

            //Write data. Judge the maximum value of j. just copy it directly. Don't change it
            for (int j = 0; j < (i != sheetCount - 1 ? previousSheetWriteCount : lastSheetWriteCount); j++) {
                List<List<String>> dataList = new ArrayList<>();

                //You can query and encapsulate the data here. The encapsulated variables currentPage and pageSize do not change
                PageHelper.startPage(j + 1 + previousSheetWriteCount * i, pageSize);
                List<SysSystemVO> sysSystemVOList = this.sysSystemReadMapper.selectSysSystemVOList(sysSystemVO);
                if (!CollectionUtils.isEmpty(sysSystemVOList)) {
                    sysSystemVOList.forEach(eachSysSystemVO -> {
                        dataList.add(Arrays.asList(
                                eachSysSystemVO.getSystemName(),
                                eachSysSystemVO.getSystemKey(),
                                eachSysSystemVO.getDescription(),
                                eachSysSystemVO.getState().toString(),
                                eachSysSystemVO.getCreateUid(),
                                eachSysSystemVO.getCreateTime().toString()
                        ));
                    });
                }
                writer.write0(dataList, sheet, table);
            }
        }

        //Download EXCEL
        response.setHeader("Content-Disposition", "attachment;filename=" + new String((fileName).getBytes("gb2312"), "ISO-8859-1") + ".xls");
        response.setContentType("multipart/form-data");
        response.setCharacterEncoding("utf-8");
        writer.finish();
        out.flush();

    } finally {
        if (out != null) {
            try {
                out.close();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

    return ResultVO.getSuccess("Export system list EXCEL success");
}

3, Summary

Fake data, 100W records, 18 fields, test export is 70s. When actually using the production environment, it depends on the performance of the sql you write. If sql performance is fast, it will be fast.

One thing to recommend: when paging, use a single table query. For the redundant fields corresponding to the foreign keys to be processed, check them outside and put them in the map at one time (it is recommended to use @ MapKey annotation), and then get the corresponding names from the map according to the foreign keys when traversing the list.

One purpose: less query sql can be exported faster.

Aside: if the amount of data is too large, it will be very slow when using count(1) to query the total number. You can speed up the query by adjusting the buffer pool parameters of mysql.

There is also a problem. When using pagehelper, when there is a large amount of data, limit 0,20W, limit 20W,40W, limit 40W,60W, limit 60W,80W} query will be fast sometimes and slow sometimes, which needs to be studied.

Keywords: Java

Added by scuzzo84 on Fri, 10 Dec 2021 03:46:58 +0200