How to apply POI

The following links are from: POI official how to use poi.

preface

This version of the operations guide outlines the functionality of the current svn backbone. If you want to find information about previous versions, you should view the documents distributed with that version.

HSSF allows number, string, date, or formula cell values to be written to or read from XLS files. This release also includes row and column sizing, cell styles (bold, italic, border, etc.), and support for built-in and user-defined data formats. You can also use the event based API to read XLS files. It is very different from the read / write API and is suitable for intermediate developers who need less memory.

Different APIs

There are several different ways to access the HSSF API. They have different characteristics, so you should read everything carefully to choose the one that suits you best.

User API (HSSF and XSSF)
Event API (HSSF Only)
Event API with extensions to be Record Aware (HSSF Only)
XSSF and SAX (Event API)
SXSSF (Streaming User API)
Low Level API

common method

User API (HSSF and XSSF)

Write new file
The advanced API (package: org.apache.poi.ss.usermodel) is what most people should use. The usage is very simple.

The workbook is created by creating org apache. poi. ss. usermodel. Created by an instance of workbook. Either directly create a specific class (org.apache.poi.hssf.usermodel.HSSFWorkbook or org.apache.poi.xssf.usermodel.XSSFWorkbook), or use the convenient factory class org apache. poi. ss. usermodel. WorkbookFactory.

Worksheets are created by calling createSheet() from an existing instance of the Workbook, and the created worksheets are automatically added to the Workbook in order. The worksheet itself has no worksheet name (label at the bottom); You can call Workbook Setsheetname (sheetindex, "SheetName", encoding) to set the name associated with the worksheet. For HSSF, the name can be in 8-bit format (HSSFWorkbook.ENCODING_COMPRESSED_UNICODE) or Unicode (HSSFWorkbook.ENCODING_UTF_16). The default encoding for HSSF is 8 bits per character. For XSSF, the name is automatically processed to Unicode.

Create a row by calling createRow(rowNumber) from an existing Sheet instance. Only rows with cell values should be added to the worksheet. To set the height of a row, you simply call setRowHeight(height) on the row object. The height must be in twips or 1 / 20 of the point. There is also a setRowHeightInPoints method if you prefer.

Cells are created by calling createCell(column, type) from an existing row. Only cells with values should be added to the row. The cell type of the cell should be set to cell CELL_ TYPE_ Numeric or cell CELL_ TYPE_ Strings, depending on whether they contain numeric or text values. The cell must also have a value set. Set the value by calling setCellValue with String or double as a parameter. A single cell has no width; You must call setcolumnwidth (columnindex, width) (in 1 / 256 units of characters) on the Sheet object. (you cannot do this alone in the GUI).

Cells are styled using CellStyle objects, which in turn contain references to Font objects. These are created through the Workbook object by calling createCellStyle() and createFont(). After you create an object, you must set its parameters (color, border, etc.). To set the Font for CellStyle, call setFont(fontobj).

After the Workbook is generated, you can write it out by calling write(outputStream) from the Workbook instance and passing it to OutputStream (such as FileOutputStream or ServletOutputStream). You must close OutputStream yourself. HSSF will not close it for you.

Here's an official example

short rownum;
// create a new file
FileOutputStream out = new FileOutputStream("workbook.xls");
// create a new workbook
Workbook wb = new HSSFWorkbook();
// create a new sheet
Sheet s = wb.createSheet();
// declare a row object reference
Row r = null;
// declare a cell object reference
Cell c = null;
// create 3 cell styles
CellStyle cs = wb.createCellStyle();
CellStyle cs2 = wb.createCellStyle();
CellStyle cs3 = wb.createCellStyle();
DataFormat df = wb.createDataFormat();
// create 2 fonts objects
Font f = wb.createFont();
Font f2 = wb.createFont();
//set font 1 to 12 point type
f.setFontHeightInPoints((short) 12);
//make it blue
f.setColor( (short)0xc );
// make it bold
//arial is the default font
f.setBoldweight(Font.BOLDWEIGHT_BOLD);
//set font 2 to 10 point type
f2.setFontHeightInPoints((short) 10);
//make it red
f2.setColor( (short)Font.COLOR_RED );
//make it bold
f2.setBoldweight(Font.BOLDWEIGHT_BOLD);
f2.setStrikeout( true );
//set cell stlye
cs.setFont(f);
//set the cell format
cs.setDataFormat(df.getFormat("#,##0.0"));
//set a thin border
cs2.setBorderBottom(cs2.BORDER_THIN);
//fill w fg fill color
cs2.setFillPattern((short) CellStyle.SOLID_FOREGROUND);
//set the cell format to text see DataFormat for a full list
cs2.setDataFormat(HSSFDataFormat.getBuiltinFormat("text"));
// set the font
cs2.setFont(f2);
// set the sheet name in Unicode
wb.setSheetName(0, "\u0422\u0435\u0441\u0442\u043E\u0432\u0430\u044F " +
                   "\u0421\u0442\u0440\u0430\u043D\u0438\u0447\u043A\u0430" );
// in case of plain ascii
// wb.setSheetName(0, "HSSF Test");
// create a sheet with 30 rows (0-29)
int rownum;
for (rownum = (short) 0; rownum < 30; rownum++)
{
    // create a row
    r = s.createRow(rownum);
    // on every other row
    if ((rownum % 2) == 0)
    {
        // make the row height bigger  (in twips - 1/20 of a point)
        r.setHeight((short) 0x249);
    }
    //r.setRowNum(( short ) rownum);
    // create 10 cells (0-9) (the += 2 becomes apparent later
    for (short cellnum = (short) 0; cellnum < 10; cellnum += 2)
    {
        // create a numeric cell
        c = r.createCell(cellnum);
        // do some goofy math to demonstrate decimals
        c.setCellValue(rownum * 10000 + cellnum
                + (((double) rownum / 1000)
                + ((double) cellnum / 10000)));
        String cellValue;
        // create a string cell (see why += 2 in the
        c = r.createCell((short) (cellnum + 1));
        // on every other row
        if ((rownum % 2) == 0)
        {
            // set this cell to the first cell style we defined
            c.setCellStyle(cs);
            // set the cell's string value to "Test"
            c.setCellValue( "Test" );
        }
        else
        {
            c.setCellStyle(cs2);
            // set the cell's string value to "\u0422\u0435\u0441\u0442"
            c.setCellValue( "\u0422\u0435\u0441\u0442" );
        }
        // make this column a bit wider
        s.setColumnWidth((short) (cellnum + 1), (short) ((50 * 8) / ((double) 1 / 20)));
    }
}
//draw a thick black border on the row at the bottom using BLANKS
// advance 2 rows
rownum++;
rownum++;
r = s.createRow(rownum);
// define the third style to be the default
// except with a thick black border at the bottom
cs3.setBorderBottom(cs3.BORDER_THICK);
//create 50 cells
for (short cellnum = (short) 0; cellnum < 50; cellnum++)
{
    //create a blank type cell (no value)
    c = r.createCell(cellnum);
    // set it to the thick black border style
    c.setCellStyle(cs3);
}
//end draw thick black border
// demonstrate adding/naming and deleting a sheet
// create a sheet, set its title then delete it
s = wb.createSheet();
wb.setSheetName(1, "DeletedSheet");
wb.removeSheetAt(1);
//end deleted sheet
// write the workbook to the output stream
// close our file (don't blow out our file handles
wb.write(out);
out.close();

Read or modify an existing file
Reading in files is just as simple. To read in a file, create org. Org apache. poi. poifs. A new instance of Filesystem, passing the open InputStream (for example, FileInputStream of XLS) to the constructor. Construct org apache. poi. hssf. usermodel. A new instance of hssfworkbook, passing the Filesystem instance to the constructor. From there, you can access all advanced model objects through their evaluator methods (workbook.getSheet(sheetNum), sheet.getRow(rownum), and so on).

It's easy to modify the file you read in. You can retrieve the object through the evaluator method, delete it through the parent object's delete method (sheet. Removerrow (hssrow)), and then create the object, just like creating a new xls. When you are finished modifying the cells, simply call workbook as above Write (OutputStream).

An example can be found in the link: org.apache.poi.hssf.usermodel.examples.HSSFReadWrite . See in.

Event API (HSSF only)

The event API is newer than the user API. It is suitable for intermediate developers who are willing to learn some low-level API structures. It is relatively simple to use, but it requires a basic understanding (or willing to learn) of all parts of Excel files. The advantage is that you can read XLS that takes up relatively small memory.

An important thing to note about using the basic event API is that it triggers events only for what is actually stored in the file. For the XLS file format, it is common that what has not been edited does not exist in the file at all. This means that there may be obvious "gaps" in the record flow, and you need to solve these problems or use the record aware extension of the event API.

To use this API, you need to build an org apache. poi. hssf. eventmodel. An instance of hssfrequest. Use hssfrequest AddListener (your listener, recordid) registers the support org apache. poi. hssf. eventmodel. Class of the hssflistener interface. Recordid should be included in org apache. poi. hssf. The static reference number in the class in record (for example, BOFRecord.sid). The trick is that you have to know what these records are. Alternatively, you can call hssfrequest addListenerForAllRecords(mylistener). To understand these records, you can read org apache. poi. hssf. All JavaDocs in the record package, or you can crack an org. Doc apache. poi. hssf. Dev.efhssf and modify it to meet your needs. TODO: better documentation.

Once you have registered your listener in the HSSFRequest object, you can build an org apache. poi. poifs. filesystem. File system instance (see POIFS howto) and pass your XLS file input stream to it. You can use HSSFEventFactory The processworkbookevents (request, filesystem) method passes this along with the request you construct to the instance of HSSFEventFactory, or you can get it from filesystem Createdocumentinputstream ("Workbook") gets an instance of DocumentInputStream and passes it to HSSFEventFactory processEvents(request, inputStream). After making this call, the listeners you construct will receive calls to their processRecord(Record) methods and listen with each Record they register until the file is completely read.

From org apache. poi. hssf. The code excerpt of dev.efhssf (in CVS or source code distribution) is reproduced as follows:

/**
 * This example shows how to use the event API for reading a file.
 */
public class EventExample
        implements HSSFListener
{
    private SSTRecord sstrec;
    /**
     * This method listens for incoming records and handles them as required.
     * @param record    The record that was found while reading.
     */
    public void processRecord(Record record)
    {
        switch (record.getSid())
        {
            // the BOFRecord can represent either the beginning of a sheet or the workbook
            case BOFRecord.sid:
                BOFRecord bof = (BOFRecord) record;
                if (bof.getType() == bof.TYPE_WORKBOOK)
                {
                    System.out.println("Encountered workbook");
                    // assigned to the class level member
                } else if (bof.getType() == bof.TYPE_WORKSHEET)
                {
                    System.out.println("Encountered sheet reference");
                }
                break;
            case BoundSheetRecord.sid:
                BoundSheetRecord bsr = (BoundSheetRecord) record;
                System.out.println("New sheet named: " + bsr.getSheetname());
                break;
            case RowRecord.sid:
                RowRecord rowrec = (RowRecord) record;
                System.out.println("Row found, first column at "
                        + rowrec.getFirstCol() + " last column at " + rowrec.getLastCol());
                break;
            case NumberRecord.sid:
                NumberRecord numrec = (NumberRecord) record;
                System.out.println("Cell found with value " + numrec.getValue()
                        + " at row " + numrec.getRow() + " and column " + numrec.getColumn());
                break;
                // SSTRecords store a array of unique strings used in Excel.
            case SSTRecord.sid:
                sstrec = (SSTRecord) record;
                for (int k = 0; k < sstrec.getNumUniqueStrings(); k++)
                {
                    System.out.println("String table value " + k + " = " + sstrec.getString(k));
                }
                break;
            case LabelSSTRecord.sid:
                LabelSSTRecord lrec = (LabelSSTRecord) record;
                System.out.println("String cell found with value "
                        + sstrec.getString(lrec.getSSTIndex()));
                break;
        }
    }
    /**
     * Read an excel file and spit out what we find.
     *
     * @param args      Expect one argument that is the file to read.
     * @throws IOException  When there is an error processing the file.
     */
    public static void main(String[] args) throws IOException
    {
        // create a new file input stream with the input file specified
        // at the command line
        FileInputStream fin = new FileInputStream(args[0]);
        // create a new org.apache.poi.poifs.filesystem.Filesystem
        POIFSFileSystem poifs = new POIFSFileSystem(fin);
        // get the Workbook (excel part) stream in a InputStream
        InputStream din = poifs.createDocumentInputStream("Workbook");
        // construct out HSSFRequest object
        HSSFRequest req = new HSSFRequest();
        // lazy listen for ALL records with the listener shown above
        req.addListenerForAllRecords(new EventExample());
        // create our event factory
        HSSFEventFactory factory = new HSSFEventFactory();
        // process our events based on the document input stream
        factory.processEvents(req, din);
        // once all the events are processed close our file input stream
        fin.close();
        // and our document input stream (don't want to leak these!)
        din.close();
        System.out.println("done.");
    }
}

Record Aware Event API (HSSF only)

Not for the time being, the hssf file is still a little old. We'll talk about it later

XSSF and SAX (Event API)

If memory consumption is a problem, you can get the underlying XML data and process it yourself for XSSF. This applies to those who are willing to learn The low-level structure of XLSX files and intermediate developers who are willing to process XML in Java. It is relatively simple to use, but it requires a basic understanding of the file structure. The advantage is that you can read XLSX files that occupy relatively small memory.

An important thing to note about using the basic event API is that it triggers events only for what is actually stored in the file. For the XLSX file format, it is common for unedited content to not exist in the file at all. This means that there may be obvious "gaps" in the record stream, which you need to solve.

To use this API, you need to build an org apache. poi. xssf. eventmodel. An instance of xssfreader. This will selectively provide a good interface on shared string tables and styles. It provides a way to get raw xml data from the rest of the file and then pass it to SAX.

This example shows how to get all worksheets in a single known worksheet or file. It is based on poi-examples/src/main/java/org/apache/poi/examples/xssf/eventusermodel/FromHowTo.java . Examples of

import java.io.InputStream;
import java.util.Iterator;
import org.apache.poi.util.XMLHelper;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xssf.eventusermodel.XSSFReader;
import org.apache.poi.xssf.model.SharedStringsTable;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import javax.xml.parsers.ParserConfigurationException;
public class ExampleEventUserModel {
    public void processOneSheet(String filename) throws Exception {
        OPCPackage pkg = OPCPackage.open(filename);
        XSSFReader r = new XSSFReader( pkg );
        SharedStringsTable sst = r.getSharedStringsTable();
        XMLReader parser = fetchSheetParser(sst);
        // To look up the Sheet Name / Sheet Order / rID,
        //  you need to process the core Workbook stream.
        // Normally it's of the form rId# or rSheet#
        InputStream sheet2 = r.getSheet("rId2");
        InputSource sheetSource = new InputSource(sheet2);
        parser.parse(sheetSource);
        sheet2.close();
    }
    public void processAllSheets(String filename) throws Exception {
        OPCPackage pkg = OPCPackage.open(filename);
        XSSFReader r = new XSSFReader( pkg );
        SharedStringsTable sst = r.getSharedStringsTable();
        XMLReader parser = fetchSheetParser(sst);
        Iterator<InputStream> sheets = r.getSheetsData();
        while(sheets.hasNext()) {
            System.out.println("Processing new sheet:\n");
            InputStream sheet = sheets.next();
            InputSource sheetSource = new InputSource(sheet);
            parser.parse(sheetSource);
            sheet.close();
            System.out.println("");
        }
    }
    public XMLReader fetchSheetParser(SharedStringsTable sst) throws SAXException, ParserConfigurationException {
        XMLReader parser = XMLHelper.newXMLReader();
        ContentHandler handler = new SheetHandler(sst);
        parser.setContentHandler(handler);
        return parser;
    }
    /**
     * See org.xml.sax.helpers.DefaultHandler javadocs
     */
    private static class SheetHandler extends DefaultHandler {
        private SharedStringsTable sst;
        private String lastContents;
        private boolean nextIsString;
        private SheetHandler(SharedStringsTable sst) {
            this.sst = sst;
        }
        public void startElement(String uri, String localName, String name,
                                 Attributes attributes) throws SAXException {
            // c => cell
            if(name.equals("c")) {
                // Print the cell reference
                System.out.print(attributes.getValue("r") + " - ");
                // Figure out if the value is an index in the SST
                String cellType = attributes.getValue("t");
                if(cellType != null && cellType.equals("s")) {
                    nextIsString = true;
                } else {
                    nextIsString = false;
                }
            }
            // Clear contents cache
            lastContents = "";
        }
        public void endElement(String uri, String localName, String name)
                throws SAXException {
            // Process the last contents as required.
            // Do now, as characters() may be called more than once
            if(nextIsString) {
                int idx = Integer.parseInt(lastContents);
                lastContents = sst.getItemAt(idx).getString();
                nextIsString = false;
            }
            // v => contents of a cell
            // Output after we've seen the string contents
            if(name.equals("v")) {
                System.out.println(lastContents);
            }
        }
        public void characters(char[] ch, int start, int length) {
            lastContents += new String(ch, start, length);
        }
    }
    public static void main(String[] args) throws Exception {
        ExampleEventUserModel example = new ExampleEventUserModel();
        example.processOneSheet(args[0]);
        example.processAllSheets(args[0]);
    }
}

For more complete examples, including support for getting numeric format information and applying it to numeric cells, such as formatting dates or percentages, see in svn
XLSX2CSV example.

An example is also provided to show how to combine the user API and SAX API through streaming parsing of a large worksheet and traditional user model parsing of the rest of the workbook.

SXSSF (Streaming Usermodel API)

SXSSF (package: org.apache.poi.xssf.streaming) is an API compatible stream extension of XSSF. It is used when a very large spreadsheet must be generated, and the heap space is limited. SXSSF achieves its low memory footprint by restricting access to rows in the sliding window, while XSSF allows access to all rows in the document. Old lines that no longer appear in the window cannot be accessed because they are written to disk.

You can specify the window size through new SXSSFWorkbook(int windowSize) or set it for each page through SXSSFSheet#setRandomAccessWindowSize(int windowSize) when the workbook is built

When a new row is created through createRow() and the total number of records that are not refreshed will exceed the specified window size, the row with the lowest index value is refreshed and can no longer be accessed through getRow().

The default window size is 100 and is controlled by sxssfworkbook DEFAULT_ WINDOW_ Size definition.

A windowSize of - 1 indicates unrestricted access. In this case, all records that have not been refreshed by the call to flushRows() can be used for random access.

Note that SXSSF allocates temporary files, which you must always explicitly clean up by calling the dispose method.

SXSSFWorkbook uses inline strings instead of shared string tables by default. This is very effective because there is no need to save the document content in memory, but it is also known that documents incompatible with some clients will be generated. When shared strings are enabled, all unique strings in the document must be saved in memory. Depending on your document content, this may use more resources than disabling shared strings.

Please note that some things may consume a lot of memory according to the functions you use, such as merged areas, hyperlinks, comments... Are still only stored in memory, so a lot of memory may be required.

Before deciding whether to enable shared strings, carefully check your memory budget and compatibility requirements.

The following example writes a worksheet with a 100 row window. When the number of rows reaches 101, the row with rownum=0 is refreshed to disk and deleted from memory. When rownum reaches 102, the row with rownum=1 is refreshed, and so on.

import junit.framework.Assert;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.util.CellReference;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;
   public static void main(String[] args) throws Throwable {
       SXSSFWorkbook wb = new SXSSFWorkbook(100); // keep 100 rows in memory, exceeding rows will be flushed to disk
       Sheet sh = wb.createSheet();
       for(int rownum = 0; rownum < 1000; rownum++){
           Row row = sh.createRow(rownum);
           for(int cellnum = 0; cellnum < 10; cellnum++){
               Cell cell = row.createCell(cellnum);
               String address = new CellReference(cell).formatAsString();
               cell.setCellValue(address);
           }
       }
       // Rows with rownum < 900 are flushed and not accessible
       for(int rownum = 0; rownum < 900; rownum++){
         Assert.assertNull(sh.getRow(rownum));
       }
       // ther last 100 rows are still in memory
       for(int rownum = 900; rownum < 1000; rownum++){
           Assert.assertNotNull(sh.getRow(rownum));
       }
       FileOutputStream out = new FileOutputStream("/temp/sxssf.xlsx");
       wb.write(out);
       out.close();
       // dispose of temporary files backing this workbook on disk
       wb.dispose();
   }

The next example turns off automatic refresh (windowSize=-1), and the code manually controls how to write some data to disk

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.util.CellReference;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;
   public static void main(String[] args) throws Throwable {
       SXSSFWorkbook wb = new SXSSFWorkbook(-1); // turn off auto-flushing and accumulate all rows in memory
       Sheet sh = wb.createSheet();
       for(int rownum = 0; rownum < 1000; rownum++){
           Row row = sh.createRow(rownum);
           for(int cellnum = 0; cellnum < 10; cellnum++){
               Cell cell = row.createCell(cellnum);
               String address = new CellReference(cell).formatAsString();
               cell.setCellValue(address);
           }
          // manually control how rows are flushed to disk
          if(rownum % 100 == 0) {
               ((SXSSFSheet)sh).flushRows(100); // retain 100 last rows and flush all others
               // ((SXSSFSheet)sh).flushRows() is a shortcut for ((SXSSFSheet)sh).flushRows(0),
               // this method flushes all rows
          }
       }
       FileOutputStream out = new FileOutputStream("/temp/sxssf.xlsx");
       wb.write(out);
       out.close();
       // dispose of temporary files backing this workbook on disk
       wb.dispose();
  }

SXSSF refreshes the worksheet data in the temporary file (one temporary file per worksheet). The size of these temporary files may grow to very large values. For example, for 20 MB of csv data, the size of temporary xml becomes more than 1 GB. If there is a problem with the size of the temporary file, you can tell SXSSF to use gzip compression:

SXSSFWorkbook wb = new SXSSFWorkbook();
wb.setCompressTempFiles(true); // temp files will be gzipped

Low Level APIs

Low level APIs have nothing to see. It is organized by org apache. poi. hssf. record.* Many "records" in the package and org apache. poi. hssf. model.* Consists of a set of help classes in. The record class is consistent with the low-level binary structure in the BIFF8 file (embedded in the POIFS file system). You may need this book: Microsoft Excel 97 Developer's Kit from Microsoft Press to see how they fit together (out of print, but easy to get from Amazon's old book). To better understand how to use low-level APIs, you should check out org. Org apache. poi. hssf. usermodel.* Source code and org apache. poi. hssf. model.* Class in. You should also read the documentation for the POIFS library.

The above is my version of poi5 1.0.

Keywords: Java poi

Added by wellmoon on Mon, 13 Dec 2021 04:13:35 +0200