Sample code for Python manipulating Excel workbooks (.xlsx)

Preface

As a popular personal computer data processing software, Excel is mixed up in all fields, and is often processed by programmers here. It can handle Python of Excel format files.
There are many libraries, such as xlrd, xlwt, xlutils, openpyxl, xlwings, and so on, but each library handles Excel
Different ways, some libraries have limitations when dealing with them.

Next, I'll compare the different libraries, and then I'll focus on documenting xlwings
This library is currently a convenient one for you to use. Several other libraries always have problems like this or that during use, but it is also good to use under certain circumstances.

EXCEL file

Excel is called a spreadsheet and can actually be saved in many formats, but "Excel Workbook (.xlsx)" and "Excel 97-2003"
Workbook (.xls)"is one of the two most commonly used and can be considered. Tables in XLS format were commonly used before Excel version 03, and. xlsx is after version 03.
Formats commonly used after Excel version 07.

General Excel
The program can open edits for both formats, or convert storage to each other. However, it is recommended to use the new format without special requirements. On the one hand, the new stable version may repair some previous BUG s, but it will also bring some optimization.

I also found it empty before writing this summary. The file size in xlsx format is 7KB and an empty one. File sizes in xls format are
24KB, when I write the same Chinese character, the two files become 10KB and 30KB in size. The difference is not small. Another problem is to write the same characters in the same size. Save files in xlsx format as
There will also be compatibility tips in.xls format to remind users that some settings may be lost, so choose a new version or try using the new one.

testing environment

Because many applications are iterating and the corresponding Python libraries are iterating, as many versions as possible are given here, different versions may have different problems:

  • Operating System: Windows 10 Random Edition
  • Python: 3.75
  • xlrd: 1.2.0
  • xlwt: 1.3.0
  • xlutils: 2.0.0
  • openpyxl: 3.0.3
  • xlwings: 0.18.0

Install each of these libraries before you use them. Installation methods are not overridden, but you can provide a quick installation source using pip install -i https://pypi.doubanio.com/simple The library name can solve the problem of slow download and installation as much as possible.

Excel specific operations

There are three groups of methods for using Python to manipulate Excel, using xlrd, xlwt, xlutils as the first group, and openpyxl as the library
As the second group, and xlwings as the third group, this summary focuses on the use of xlwings, the other two groups of simple understanding.

xlrd,xlwt,xlutils **
**

This group of libraries that operate on Excel have very good names, one read, one write, one widget. Together, you can wanton about Excel. Here's a little exercise to open an Excel
The file then modifies the value of the first cell and saves it as a new file with the following code:

    import xlrd
    import xlwt
    import xlutils.copy
    
    def save_as_new_file(file_name, new_file_name):
     # Open Excel File
     rb = xlrd.open_workbook(file_name)
     # Create a writable copy
     wb = xlutils.copy.copy(rb)
     # Get the first sheet tab
     ws = wb.get_sheet(0)
     # The first cell writes the test value
     ws.write(0, 0, 'test value')
     # Save as a new file
     wb.save(new_file_name)

The above code is for both operations. Xlsx file or operation. The xls file will not fail, but will be saved as. Files in xlsx format will not open and you will find that they are stored normally. xls
After opening the file, the format is all gone. What can I do? Try another parameter and change the code of opening the file as follows:

    rb = xlrd.open_workbook(file_name, formatting_info=True)

Where the parameter formatting_info=True means to keep the original format when opening Excel, but this is relative to. Files in xls format, for. xlsx
Format file directly runs out of exception raise NotImplementedError("formatting_info=True not yet yet yet"
implemented), because it can't be handled. For files in xlsx format, I am not using these libraries to operate on Excel at this time.

Also, when these libraries operate on cells, the index for rows and columns starts at 0.

openpyxl

First of all, this library is mainly used for operations. Files in xlsx format, for. xls format file cannot be opened, will report openpyxl does not support the
Old. Errors such as XLS file format, but can be stored in such a format that warns of format mismatch when reopened, but the underlying data is still there, so the operation is preferred
Files in.xls format.

Common uses for writing a new file:

    from openpyxl import Workbook
    from openpyxl import load_workbook
    from openpyxl.styles import Font, Fill, Alignment, PatternFill
    
    def write_new_excel(file_name):
     # Create an excel document
     wb = Workbook()
     # Get the currently active sheet object
     ws = wb.active
     # Assign values to A2 cells
     ws['A2'] = 'This is A2 cell'
     # Add multiple columns of data in one row
     ws.append([1, 2, 'hello'])
     # Add a new sheet
     ws = wb.create_sheet(title='NewInfo',index=0)
     # Set Cell Value
     ws['A1'] = 'This is new sheet'
    
     # Save excel
     wb.save(file_name)

Common uses for reading and rewriting an existing file:

    def read_update_excel(file_name):
     # Loading Excel tables
     wb = load_workbook(file_name)
     # Print sheet quantity
     print('sheet count:', len(wb.sheetnames))
     # Print all sheet names
     print('sheet name list:', wb.sheetnames)
     # Get the first sheet object
     ws = wb[wb.sheetnames[0]]
     # Print sheet table rows and columns
     print('rows count:', ws.max_row, 'cols count:', ws.max_column)
     # Update the contents of cell A1
     ws['A1'] = 'this is A1'
     # Insert a row at the second line position
     ws.insert_rows(2)
     # Delete the fifth line
     ws.delete_rows(5)
     # Gets the cell object corresponding to the B2 cell
     cell = ws.cell(2,2)
     # Set cell contents
     cell.value = 'this is B2'
     # Modify font format to bold
     cell.font = Font(bold=True)
     # Modify Cell Formatting
     cell.fill = PatternFill("solid", fgColor="F0CDCD")
    
     # Save the original file or save a file
     wb.save(file_name)

When you use this library, there are no problems with storing styled data, but when you add a calculation formula and save it as a file, it's clear that the file size is smaller, but the data and formulas do not find any problems.

There is data saying that the processing speed is really slow because I am working with small files, but I did not find any problems with this. Another problem is that all macros in Excel are lost. This test was really lost, but it seems to be related to the file format. To save macros, you need to store them as
The.Xlsm format, but openpyxl is used to manipulate it. xlsx file, storage will result in macro loss, forced storage as. The xlsm format will result in the final file not opening.

Also, when this library operates on cells, the index for rows and columns starts at 1.

xlwings

This library first creates an App to manipulate Excel through this created App object, much like encapsulating the various operations of Excel together, and then through this
App object to call, if you create App without setting hidden parameters, will normally open the Excel program.

Basic ways to use xlwings:

    import xlwings as xw
    
    # Set Excel program invisible
    app = xw.App(visible=False, add_book=False)
    
    # Operating Excel files through app
    # app.bala bala bala .....
    # app.bala bala bala .....
    
    # Elegant Exit
    app.quit()

Create a new Excel file and write the data:

    def write_new_excel(app, file_name):
     # Create a new Excel table
     wb = app.books.add()
     # Get the currently active sheet
     ws = wb.sheets.active
     # Initialize values for two-dimensional zones
     arr_data = [[1, 2, 3], [4, 5, 6], [7, 8, 'end']]
     # Set to New Excel
     ws.range('A1:B3').value=arr_data
     # Set the value of a single cell
     ws.range('A4').value='this is A4'
     # Set the value of a single cell
     ws[3,1].value='this is B4'
     # Save Excel File
     wb.save(file_name)
     wb.close()

It is important to note that when cells are modified through row and column indexes, the starting index is 0.

Read in an existing Excel table and modify it

    def read_update_excel(app, file_name):
     # Load an existing table
     load_wb = app.books.open(file_name)
     # Get the first sheet in the Excel table
     load_ws = load_wb.sheets[0]
     # Print the name of the sheet
     print(load_ws.name)
     # Get sheet object from sheet name
     load_ws = load_wb.sheets[load_ws.name]
     # Get the currently active sheet
     load_ws = load_wb.sheets.active
    
     # Get the number of rows and columns with data
     rows = load_ws.api.UsedRange.Rows.count
     cols = load_ws.api.UsedRange.Columns.count
     print('rows count:', rows, 'cols count:', cols)
    
     # Modify specified cell data (A1 cell)
     load_ws[0,0].value='this is A1'
    
     # Get the exact number of rows and columns when there are empty rows or columns
     print(load_ws.used_range.shape)
    
     # Expand from cell A1 to non-empty rows and empty columns, the last number of rows and columns
     print((load_ws.range('A1').expand().last_cell.row,
      load_ws.range('A1').expand().last_cell.column))
    
     # Expand from cell A1 to non-empty rows and empty columns, the last number of rows and columns
     print((load_ws.range('A1').expand().last_cell.row,
      load_ws.range('A1').expand().last_cell.column))
    
     # Expand from cell A1 to non-empty rows, empty columns, and finally shape
     print(load_ws.range(1,1).expand().shape)
    
     # Expand from cell A1 to non-empty rows and empty columns, the last number of rows and columns
     print((load_ws.range('A1').expand('table').rows.count,
      load_ws.range('A1').expand('table').columns.count))
    
     # Save modified Excel
     load_wb.save(file_name)
     load_wb.close()

Excel Add Delete Rows and Columns

    def insert_delete_rowscols(app, file_name):
     # Load an existing table
     load_wb = app.books.open(file_name)
     # Get the currently active sheet
     load_ws = load_wb.sheets.active
    
     # Insert 4 rows from line 2, that is, 2-5 rows become newly inserted blank rows
     load_ws.api.rows('2:5').insert
     # Delete lines 6 and 7
     load_ws.api.rows('6:7').delete
     # Insert a cell, the actual test effect is that column B moves down from B2, B2 is the newly added cell
     load_ws.range('B2').api.insert
     # Insert a new column
     load_ws.api.columns('B').insert
     # Delete a column
     load_ws.api.columns('C').delete
    
     # Save modified Excel
     load_wb.save(file_name)
     load_wb.close()

Cell Width and Height Query Setting and Merging

    def cell_operation(app, file_name):
     # Load an existing table
     load_wb = app.books.open(FILE_PATH_ROOT + file_name)
     # Get the currently active sheet
     load_ws = load_wb.sheets.active
    
     # merge cell
     load_ws.range('A2:A3').api.merge
    
     #Get Cells
     cell = xw.Range('B2')
     # Print rows and columns of cells
     print("row is:", cell.row, "col is:", cell.column)
    
     # Print the height and width of the current grid
     print("cell.width:", cell.width, "cell.height:", cell.height)
    
     # Set the height and width of the current grid
     cell.row_height = 32
     cell.column_width = 64
    
     # Specify cell height and width adaptive
     cell.columns.autofit()
     cell.rows.autofit()
    
     # Print the height and width of the current grid again
     print("cell.width:", cell.width, "cell.height:", cell.height)
    
     # Save modified Excel
     load_wb.save(file_name)
     load_wb.close()

Several library support comparisons

Although I have written so many ways before, I still hesitate when I encounter a real problem. Which way should I use it? Here is a simple comparison, just a simple comparison based on my experiments. If there are inaccuracies or even errors, you are welcome to point out that I will correct them as soon as possible.

Scenarios/Librariesxlrd,xlwt,xlutilsopenpyxlxlwings
Read. xlsReadable with stylesI won't support itReadable
Preservation. xlsCan be saved with stylesCan save, but prompt file extension does not match, you can see the original data

Can save, but prompt file extension does not match, you can see the original data
Read. xlsx | is readable but has no style | can be read with style | can be read with style
Preservation. xlsx | cannot be opened after saving | can be saved with a style | can be saved with a style
Read. xlsm | is readable, but there are no styles and macros | to read, but there are no macros | to read tables containing macros
Preservation. xlsm | cannot be opened after saving, save as. Macros in xls format are missing|cannot be opened after saving, save as. xls wants to format macros missing | macros are still there after storage
Add or Remove Rows and Columns|No Direct Method|Support|Support
Save Size|. xls file unchanged |. The xlsx file will be smaller |. xls,. Xlsx file unchanged
Use Suggestions|Operations Only. xls file can be considered | operation only. Consider that xlsx files cannot have macros | a better choice and will feel slightly slower when used

**Summary
**

  1. Excel spreadsheet program has changed a lot from version to version when the same content appears. xls ratio. A lot of files in xlsx format
  2. Based on the previous point, if you can use a new version of the table, discard the old format
  3. There's also a magical case, one with a small amount of data. The table in xlsx format is smaller than an empty one. What's the situation? I don't know what's going on for a while. The God who wants to know tells me one or two

This is the point in this article about sample code (*.xlsx) for Python operation Excel workbook. For more information on Python operation Excel Workbook content, please search for previous articles in Script House or continue to browse the related articles below. I hope you will support Script House more in the future!

Keywords: Python

Added by telvitajoel on Wed, 26 Jan 2022 01:25:16 +0200