Preface
As a popular personal computer data processing software, Excel is mixed up in all fields, and is often processed by programmers here. It can handle Python of Excel format files.
There are many libraries, such as xlrd, xlwt, xlutils, openpyxl, xlwings, and so on, but each library handles Excel
Different ways, some libraries have limitations when dealing with them.
Next, I'll compare the different libraries, and then I'll focus on documenting xlwings
This library is currently a convenient one for you to use. Several other libraries always have problems like this or that during use, but it is also good to use under certain circumstances.
EXCEL file
Excel is called a spreadsheet and can actually be saved in many formats, but "Excel Workbook (.xlsx)" and "Excel 97-2003"
Workbook (.xls)"is one of the two most commonly used and can be considered. Tables in XLS format were commonly used before Excel version 03, and. xlsx is after version 03.
Formats commonly used after Excel version 07.
General Excel
The program can open edits for both formats, or convert storage to each other. However, it is recommended to use the new format without special requirements. On the one hand, the new stable version may repair some previous BUG s, but it will also bring some optimization.
I also found it empty before writing this summary. The file size in xlsx format is 7KB and an empty one. File sizes in xls format are
24KB, when I write the same Chinese character, the two files become 10KB and 30KB in size. The difference is not small. Another problem is to write the same characters in the same size. Save files in xlsx format as
There will also be compatibility tips in.xls format to remind users that some settings may be lost, so choose a new version or try using the new one.
testing environment
Because many applications are iterating and the corresponding Python libraries are iterating, as many versions as possible are given here, different versions may have different problems:
- Operating System: Windows 10 Random Edition
- Python: 3.75
- xlrd: 1.2.0
- xlwt: 1.3.0
- xlutils: 2.0.0
- openpyxl: 3.0.3
- xlwings: 0.18.0
Install each of these libraries before you use them. Installation methods are not overridden, but you can provide a quick installation source using pip install -i https://pypi.doubanio.com/simple The library name can solve the problem of slow download and installation as much as possible.
Excel specific operations
There are three groups of methods for using Python to manipulate Excel, using xlrd, xlwt, xlutils as the first group, and openpyxl as the library
As the second group, and xlwings as the third group, this summary focuses on the use of xlwings, the other two groups of simple understanding.
xlrd,xlwt,xlutils **
**
This group of libraries that operate on Excel have very good names, one read, one write, one widget. Together, you can wanton about Excel. Here's a little exercise to open an Excel
The file then modifies the value of the first cell and saves it as a new file with the following code:
import xlrd import xlwt import xlutils.copy def save_as_new_file(file_name, new_file_name): # Open Excel File rb = xlrd.open_workbook(file_name) # Create a writable copy wb = xlutils.copy.copy(rb) # Get the first sheet tab ws = wb.get_sheet(0) # The first cell writes the test value ws.write(0, 0, 'test value') # Save as a new file wb.save(new_file_name)
The above code is for both operations. Xlsx file or operation. The xls file will not fail, but will be saved as. Files in xlsx format will not open and you will find that they are stored normally. xls
After opening the file, the format is all gone. What can I do? Try another parameter and change the code of opening the file as follows:
rb = xlrd.open_workbook(file_name, formatting_info=True)
Where the parameter formatting_info=True means to keep the original format when opening Excel, but this is relative to. Files in xls format, for. xlsx
Format file directly runs out of exception raise NotImplementedError("formatting_info=True not yet yet yet"
implemented), because it can't be handled. For files in xlsx format, I am not using these libraries to operate on Excel at this time.
Also, when these libraries operate on cells, the index for rows and columns starts at 0.
openpyxl
First of all, this library is mainly used for operations. Files in xlsx format, for. xls format file cannot be opened, will report openpyxl does not support the
Old. Errors such as XLS file format, but can be stored in such a format that warns of format mismatch when reopened, but the underlying data is still there, so the operation is preferred
Files in.xls format.
Common uses for writing a new file:
from openpyxl import Workbook from openpyxl import load_workbook from openpyxl.styles import Font, Fill, Alignment, PatternFill def write_new_excel(file_name): # Create an excel document wb = Workbook() # Get the currently active sheet object ws = wb.active # Assign values to A2 cells ws['A2'] = 'This is A2 cell' # Add multiple columns of data in one row ws.append([1, 2, 'hello']) # Add a new sheet ws = wb.create_sheet(title='NewInfo',index=0) # Set Cell Value ws['A1'] = 'This is new sheet' # Save excel wb.save(file_name)
Common uses for reading and rewriting an existing file:
def read_update_excel(file_name): # Loading Excel tables wb = load_workbook(file_name) # Print sheet quantity print('sheet count:', len(wb.sheetnames)) # Print all sheet names print('sheet name list:', wb.sheetnames) # Get the first sheet object ws = wb[wb.sheetnames[0]] # Print sheet table rows and columns print('rows count:', ws.max_row, 'cols count:', ws.max_column) # Update the contents of cell A1 ws['A1'] = 'this is A1' # Insert a row at the second line position ws.insert_rows(2) # Delete the fifth line ws.delete_rows(5) # Gets the cell object corresponding to the B2 cell cell = ws.cell(2,2) # Set cell contents cell.value = 'this is B2' # Modify font format to bold cell.font = Font(bold=True) # Modify Cell Formatting cell.fill = PatternFill("solid", fgColor="F0CDCD") # Save the original file or save a file wb.save(file_name)
When you use this library, there are no problems with storing styled data, but when you add a calculation formula and save it as a file, it's clear that the file size is smaller, but the data and formulas do not find any problems.
There is data saying that the processing speed is really slow because I am working with small files, but I did not find any problems with this. Another problem is that all macros in Excel are lost. This test was really lost, but it seems to be related to the file format. To save macros, you need to store them as
The.Xlsm format, but openpyxl is used to manipulate it. xlsx file, storage will result in macro loss, forced storage as. The xlsm format will result in the final file not opening.
Also, when this library operates on cells, the index for rows and columns starts at 1.
xlwings
This library first creates an App to manipulate Excel through this created App object, much like encapsulating the various operations of Excel together, and then through this
App object to call, if you create App without setting hidden parameters, will normally open the Excel program.
Basic ways to use xlwings:
import xlwings as xw # Set Excel program invisible app = xw.App(visible=False, add_book=False) # Operating Excel files through app # app.bala bala bala ..... # app.bala bala bala ..... # Elegant Exit app.quit()
Create a new Excel file and write the data:
def write_new_excel(app, file_name): # Create a new Excel table wb = app.books.add() # Get the currently active sheet ws = wb.sheets.active # Initialize values for two-dimensional zones arr_data = [[1, 2, 3], [4, 5, 6], [7, 8, 'end']] # Set to New Excel ws.range('A1:B3').value=arr_data # Set the value of a single cell ws.range('A4').value='this is A4' # Set the value of a single cell ws[3,1].value='this is B4' # Save Excel File wb.save(file_name) wb.close()
It is important to note that when cells are modified through row and column indexes, the starting index is 0.
Read in an existing Excel table and modify it
def read_update_excel(app, file_name): # Load an existing table load_wb = app.books.open(file_name) # Get the first sheet in the Excel table load_ws = load_wb.sheets[0] # Print the name of the sheet print(load_ws.name) # Get sheet object from sheet name load_ws = load_wb.sheets[load_ws.name] # Get the currently active sheet load_ws = load_wb.sheets.active # Get the number of rows and columns with data rows = load_ws.api.UsedRange.Rows.count cols = load_ws.api.UsedRange.Columns.count print('rows count:', rows, 'cols count:', cols) # Modify specified cell data (A1 cell) load_ws[0,0].value='this is A1' # Get the exact number of rows and columns when there are empty rows or columns print(load_ws.used_range.shape) # Expand from cell A1 to non-empty rows and empty columns, the last number of rows and columns print((load_ws.range('A1').expand().last_cell.row, load_ws.range('A1').expand().last_cell.column)) # Expand from cell A1 to non-empty rows and empty columns, the last number of rows and columns print((load_ws.range('A1').expand().last_cell.row, load_ws.range('A1').expand().last_cell.column)) # Expand from cell A1 to non-empty rows, empty columns, and finally shape print(load_ws.range(1,1).expand().shape) # Expand from cell A1 to non-empty rows and empty columns, the last number of rows and columns print((load_ws.range('A1').expand('table').rows.count, load_ws.range('A1').expand('table').columns.count)) # Save modified Excel load_wb.save(file_name) load_wb.close()
Excel Add Delete Rows and Columns
def insert_delete_rowscols(app, file_name): # Load an existing table load_wb = app.books.open(file_name) # Get the currently active sheet load_ws = load_wb.sheets.active # Insert 4 rows from line 2, that is, 2-5 rows become newly inserted blank rows load_ws.api.rows('2:5').insert # Delete lines 6 and 7 load_ws.api.rows('6:7').delete # Insert a cell, the actual test effect is that column B moves down from B2, B2 is the newly added cell load_ws.range('B2').api.insert # Insert a new column load_ws.api.columns('B').insert # Delete a column load_ws.api.columns('C').delete # Save modified Excel load_wb.save(file_name) load_wb.close()
Cell Width and Height Query Setting and Merging
def cell_operation(app, file_name): # Load an existing table load_wb = app.books.open(FILE_PATH_ROOT + file_name) # Get the currently active sheet load_ws = load_wb.sheets.active # merge cell load_ws.range('A2:A3').api.merge #Get Cells cell = xw.Range('B2') # Print rows and columns of cells print("row is:", cell.row, "col is:", cell.column) # Print the height and width of the current grid print("cell.width:", cell.width, "cell.height:", cell.height) # Set the height and width of the current grid cell.row_height = 32 cell.column_width = 64 # Specify cell height and width adaptive cell.columns.autofit() cell.rows.autofit() # Print the height and width of the current grid again print("cell.width:", cell.width, "cell.height:", cell.height) # Save modified Excel load_wb.save(file_name) load_wb.close()
Several library support comparisons
Although I have written so many ways before, I still hesitate when I encounter a real problem. Which way should I use it? Here is a simple comparison, just a simple comparison based on my experiments. If there are inaccuracies or even errors, you are welcome to point out that I will correct them as soon as possible.
Scenarios/Libraries | xlrd,xlwt,xlutils | openpyxl | xlwings |
---|---|---|---|
Read. xls | Readable with styles | I won't support it | Readable |
Preservation. xls | Can be saved with styles | Can save, but prompt file extension does not match, you can see the original data |
Can save, but prompt file extension does not match, you can see the original data
Read. xlsx | is readable but has no style | can be read with style | can be read with style
Preservation. xlsx | cannot be opened after saving | can be saved with a style | can be saved with a style
Read. xlsm | is readable, but there are no styles and macros | to read, but there are no macros | to read tables containing macros
Preservation. xlsm | cannot be opened after saving, save as. Macros in xls format are missing|cannot be opened after saving, save as. xls wants to format macros missing | macros are still there after storage
Add or Remove Rows and Columns|No Direct Method|Support|Support
Save Size|. xls file unchanged |. The xlsx file will be smaller |. xls,. Xlsx file unchanged
Use Suggestions|Operations Only. xls file can be considered | operation only. Consider that xlsx files cannot have macros | a better choice and will feel slightly slower when used
**Summary
**
- Excel spreadsheet program has changed a lot from version to version when the same content appears. xls ratio. A lot of files in xlsx format
- Based on the previous point, if you can use a new version of the table, discard the old format
- There's also a magical case, one with a small amount of data. The table in xlsx format is smaller than an empty one. What's the situation? I don't know what's going on for a while. The God who wants to know tells me one or two
This is the point in this article about sample code (*.xlsx) for Python operation Excel workbook. For more information on Python operation Excel Workbook content, please search for previous articles in Script House or continue to browse the related articles below. I hope you will support Script House more in the future!