Process Excel files - delete records that exist in other files

In work and study, it is often necessary to delete records in an Excel file. Today, I will learn how to delete records in other files in one file.

Question: the first Excel file stores student information, as shown in "Table 1", including name, student number, grade, gender, age and class. The second Excel file stores the information of the first group of students, as shown in "Table 2", and the fields are the same as table 1.

Please program and save all the student information of the non first group into the new file result.xls.

Table 1

Table 2

Solution:
Analysis: since the student number is unique, you can first read the student number from the student information table of the first group (Table 2) and store it in the dictionary, and then judge whether the student number appears in the dictionary for each record in the student information table (Table 1) to be processed. If not, the student is not the first group student and writes the record to the result file.

Step 1: write the function read_student_id, which reads the student number from the student information table file of the first group, stores it in the dictionary structure, and its key is stored as the student number.

import xlrd  # Import xlrd library for reading excel files
import xlwt # Import xlwt library for writing excel files

def read_student_id(group1_student_file):
    data = xlrd.open_workbook(group1_student_file) # Open the first group student information sheet excel file
    table = data.sheets()[0] # Get the first sheet of excel file
    res_dict = {} # The student number Dictionary of the first group. key is the student number and value is True (the dictionary mainly stores the student number. Value can be assigned to other field values, empty strings, Boolean values, True, etc.)
    id_col= 1 # Column corresponding to student number field 
    for row in range(table.nrows): # Traverse each row of data
        student_id=table.cell_value(row, id_col)  #Get student number
        res_dict[student_id] = True # The key of the dictionary is set to the student number, and the value is set to the Boolean value True
    return res_dict # Return to the student number Dictionary of the first group

Step 2: write the delete function_ Record: for each record in the student information table to be processed, judge whether the student number appears in the dictionary in turn. If not, write the record to the result file result.xls.

def delete_record(student_info_file, group1_student_file, res_file):
    group_one_student_id_dict = read_student_id(group1_student_file) # Read the student number from the student information table of the first group and store it in the dictionary
    print(group_one_student_id_dict) # Print the dictionary storing the student number of the first group

    data = xlrd.open_workbook(student_info_file) # Open the student information sheet excel file
    table = data.sheets()[0] # Get the first sheet of excel file

    xls = xlwt.Workbook()  # Create an excel Workbook
    sheet = xls.add_sheet("Sheet1")  # Add a worksheet to the workbook
    num_cols = 6 # Number of original file columns
    id_col= 1 # Column corresponding to student number field
    res_row = 0 # What line is the next element written to the result file
    for row in range(table.nrows): # Traverse each line
        student_id = table.cell_value(row, id_col) # Get student number
        if student_id not in group_one_student_id_dict: # If the student number does not exist in the student information table of the first group, write the student information to the result file
            print(student_id) # Print student ID
            for col in range(num_cols):  # Traverse each field
                sheet.write(res_row , col,  table.cell_value(row, col)) # Write field to result file
            res_row += 1 # Update the row index of the next element in the result file
    xls.save(res_file) # Store workbook contents in res_file file

Step 3: initialize the student information table path, the first group student information table path and the result file path to be processed, and call the function delete_record, execute function logic.

if __name__ == '__main__':
    student_info_file = "D:\\program\\data\\Student information sheet 2.xls"  # Student information table file path
    group_one_student_info_file = "D:\\program\\data\\Group 1 student information.xls"  # First group student information table file path
    res_file = "D:\\program\\data\\result.xls"  # Result file path
    delete_record(student_info_file, group_one_student_info_file, res_file)  # Call delete_ The record function saves the student information of the non first group to the result file

Practice 1: understand the above three steps and execute the code and observe the results on your computer
Tip: after executing the program, open the result file "result.xls" to check whether the result is consistent with table 3

Table 3

Keywords: Python AI

Added by public-image on Mon, 06 Dec 2021 06:26:12 +0200