BS-1 Python basic knowledge-1.19 reading and writing of data files

Reading and writing data files

Text file read

Illustrate with examples
Put the operations of different files in different directories,
Right click in project to add a directory
If it's just a pure file and there's no package concept, select directory directly

Create a txt file directory

This way, no "init" files are created in the directory

Put the operation files in the txt directory, and different files operate in different directories
Right click on the txt directory

Add a run.py file. It has nothing to do with the original main.py file. It can be operated directly in txt

# Define a function in which you can write a string description of a document. In this way, if you want to view the help of this function, you can write help TXT? Writer () to see the information about this function
def txt_writer()
    """Writing file"""
    with open('data.txt','w',encoding='utf8') as f # Write to the context to release the handle / resource automatically. f is the file handle. What to write next
    f.write('Superior class\n') #Write is to write one line. Write multiple lines to put them in a list. Use the writelines method
    lines = ['Address: Beijing\n', 
             'QQ:95001678\n',
             'Website:http://uke.cc']
    f.writelines(lines) #The writelines method writes the list lines to a collection
# Write a function on it. The running script will not execute. You need to write a main function
if __name__ == '__main__' # This code snippet is commonly used, which has been integrated in pycharm, and can be used to help generate quickly. Write main and press tab.
    txt_writer()                      

Run the above code and create the data.txt file in the txt directory. The result is as follows

Read:

def txt_writer()
    """Writing file"""
    with open('data.txt','w',encoding='utf8') as f 
    f.write('Superior class\n') 
    lines = ['Address: Beijing\n', 
             'QQ:95001678\n',
             'Website:http://uke.cc']
    f.writelines(lines)

def txt_read()
    """read file"""
    with open('data.txt','r',encoding='utf8') as f #The next read can traverse it or use readlines to read it into a collection
         for line in f #f is an iterator. Read is to read all text at once as a whole, and read lines is to read each line into a list
             print(lineļ¼Œend=' ') #Print will automatically add / N after each line, and the content of the previous file will also add / N, so the printed result may be two line breaks in the middle. In this way, the / n that comes with the print is replaced with null
                                                                      
if __name__ == '__main__' # This code snippet is commonly used, which has been integrated in pycharm, and can be used to help generate quickly. Write main and press tab.
    txt_writer()                


The actual development involves the data exchange of different data sources, with a large amount of data, such as exporting information from the database, or downloading some xsml or Json formats from the Internet. If these files are replaced with. txt files, later parsing and reading are troublesome, because there is no semantic label or meaning in them.
Consider other formats
Data exchange between different language framework platforms is common, including csv,

csv file read

Definition

csv:
c. comma, s, separate, v, value
Comma separated values

tsv
tab separated values

This kind of file is essentially a text file, with the suffix of csv/tsv, but with semantic labels. Commas distinguish different fields, different lines and different columns

  1. Where do csv files come from
    The data comes from the input and output of the program, maybe a database export

For example, export the data of a table from the database, turn it into csv, put it under my project, and operate on it

Create a directory first

Import the files in the database into the csv directory in csv format

Create a new run.py file in the csv directory to process the csv file

Basic read

reader method basic read

  1. csv file read
    csv file operations do not use third-party packages / libraries, but are available in built-in modules

The two classes under the CSV package, reader and write, and the product.csv file are iterative

import csv

def csv_read():
    """csv Basic read"""
    with open('product.csv',encoding='utf8') as f
# The two classes under the CSV package, reader and write, and the product.csv file are iterative
    reader = csv.reader(f) #Next, print out the first line. Because it is an iterator, the iterator can use for traversal
    headers = next(reader) #Next global function, traversing the first line. After the first line is traversed, the pointer will not go back. The first line can no longer be traversed. The next traversal starts directly from the second line
    print(heardes)
    for row in reader
        print(row)

if __name__ == '__main__'
    csv_read()           

Operation error

Coding problem:
Open product.csv to see what its essential encoding is
Pycharm will prompt what the code is, the lower right corner of pycharm

Change it here,
Click utf16 in the lower right corner to convert it to utf8,

Convert, convert to utf8

Rerun code

It is to make a list of each line and print it out. In this way, the list information can be retrieved at will
A row is a list

For example, print a line of information, or format it,

Read by list index

The reader read result itself is a list

  1. Print the following string result:
import csv

def csv_read():
    """csv Basic read"""
    with open('product.csv',encoding='utf8') as f
    reader = csv.reader(f) #
    headers = next(reader) 
    print(heardes)
    for row in reader
# A row is a list. Print a string result
        print('number:{}\t product:{}\t Unit Price:{}'.format(row[0],row[1],row[3]))

if __name__ == '__main__'
    csv_read()           


At the database level, the top row is from the database column header, but when it comes to the csv file, it becomes text, which has nothing to do with the following. In the actual operation, you want the id to correspond with the following information one by one. You can use name.tuple to find the tuple named by name. The above words are index, easy to make mistakes.

Give the contents read by reader to namedtuple

  1. namedtuple to map a column name

In actual operation, you want the id to correspond with the following information one by one. You can use the named tuple to find the tuple by name.

namedtuple is equivalent to defining a simple class, that is, defining an object,

import csv
from collections import namedtuple

def csv_read():
    """csv Basic read"""
    with open('product.csv',encoding='utf8') as f
    reader = csv.reader(f) #
    headers = next(reader) 
    print(heardes)
    for row in reader

# Give this object a name, and don't worry about forgetting it.
def csv_read_by_namedtuple():
     """read csv And use namedtuple Mapping column names"""
     with open('product.csv',encoding='utf8') as f
         reader = csv.reader(f)
         headers = next(reader) #Separate the first line, or print the first line as data. List the headers and read the first row. Next, define a namedtuple object
         #Row = namedtuple('Row',['productID','Product'] ) # name object, defined by namedtuple, is called Row. Give him a list later. What the list later gives him is the final object, the assumed member of this class. As I have just read it, there is no need to write it one by one in the headers. The heardes is a list, and the headers list is assigned to the
         Row = namedtuple('Row',headers) # The columns read in the headers will be used as the name of Row. Next, put each column of data in its ID field. For example, put apple juice on the product name field. The declared row tuple is an object, just a framework structure with a label. When a Row of data is obtained, it has a value. Every Row traversed creates an instance, which uses the declared Row structure, and then one-to-one correspondence between the traversed Row and the structure label. This is what we do in the following cycle.
         for r in Reader  #Above is the declaration of a tuple, which is equivalent to a class. Now it is equivalent to creating an instance of a class. Create instance to pass value   
              row = Row(*r) #Give him the list information r read just now. Method parameter passing, tuple or list can be unpacked with asterisk * and each column in the list will be automatically mapped to each column corresponding to the header headers in Row. roe is equal to Row creating his instance and unpacking R's information to it.                          
              print(row)              
                                                                          
         print('number:{}\t product:{}\t Unit Price:{}'.format(row[0],row[1],row[3]))

if __name__ == '__main__'
    csv_read_by_namedtuple() 

Explain and declare Row, Row = namedtuple('Row ', headers), traverse the content in Row, for r in Row:
As an object, the declared row tuple is just a framework structure with labels, and then when a Row of data is obtained, there is value. Every Row traversed creates an instance, which uses the declared Row structure, and then one by one corresponds the traversed Row with the structure label. Do it in the following cycle
This is it.

Run result, print out is the type of Row object

  1. namedtuple, you can click its result just like using the instance of the class. When printing, you write a row to save trouble. You can do whatever you want.

For example: with formatting,
It doesn't take into account the column number when using it, just like using an instance of a class,

import csv
from collections import namedtuple

def csv_read():
    """csv Basic read"""
    with open('product.csv',encoding='utf8') as f
    reader = csv.reader(f) #
    headers = next(reader) 
    print(heardes)
    for row in reader

def csv_read_by_namedtuple():
     """read csv And use namedtuple Mapping column names"""
     with open('product.csv',encoding='utf8') as f
         reader = csv.reader(f)
         headers = next(reader)       
         Row = namedtuple('Row',headers) 
         for r in Reader   
              row = Row(*r)              
              print('{} -> {} -> {}'.format(row.ProductID,row.ProductName,row.UnitPrice)              
                                                                          
         print('number:{}\t product:{}\t Unit Price:{}'.format(row[0],row[1],row[3]))

if __name__ == '__main__'
    csv_read_by_namedtuple() 

Give the dictionary what you read

Dictionary table's own reading method, csv.DictReader()
7. dictionary table

mport csv
from collections import namedtuple

def csv_read_by_dict():
     """read csv To the dictionary table"""
     with open('product.csv',encoding='utf8') as f
         reader = csv.DictReader(f) # Use csv.reader(f), always as a list to wrap, as a dictionary table with dictreader method
         for row in reader #Each row is a dictionary table, which is traversed
            print(row) 
            print('{} -> {} ->{}'.format(row['ProductID'],row.get('ProductName','Not filled'},row.get{'UnitPrice',0.0}                


This is to treat the comma separated text in the text file as a python object, and then to get each column, you can operate.
Operation result

Different data structures call different methods

Writing csv file

Read file through reader
Write file through writer

The data processed in python is saved as a csv file.


Do not take the file name as the same as the existing module name of python, otherwise when importing a module, python cannot figure out which one to import.
For example, if the file name here is csv_script.py, named csv.py, because the. Py file itself is both a script and a module, an error will occur when writing code import csv in csv.py.
Do not define the names of python built-in modules or third-party modules that already exist.

Write the contents of the list

import csv

def csv_write()
    """Write in csv file""" #Similar to text files, the operation is to find a context object named f through the open method. Before writing, prepare the data, and organize the data from outside into python data structure. If there is a header, put the header in the headers list
    headers = ['number'.'curriculum','lecturer'] #Head, head. Next, there are several pieces of data, similar to rows / columns in the database. The most typical one is tuple
    rows = [(1,'Python','Eason'), 
            (2,'C#','Eason'),
            (3,'Django','Eason'),
            (4,'.NET','Eason')]
#The header is placed in a list, multiple tuples are placed in a row, one tuple has three columns, corresponding to the above, and multiple tuples are placed in one list. 
#Now there are two data structures: one is a single row and multiple columns, one is a list, and two is multiple rows. Tuple is suitable for describing the rows in the data table. The rows returned by the database read-write operation python are also tuple
#Data can be assigned, ctrl+d in pycharm
#Next, write them to a file. Instead of writing to an existing file, create a
    with open('my_course.csv','w',encoding='utf8') as f
# with is not necessary, but it can complete the creation, reading and writing, and closing. After the code is written, it will automatically close
#The contents of the header and the bottom are written on the top. Next, the writer should be constructed
        writer = csv.writer(f) #Next, write the file to the writer after the csv package to help
        writer.writerow(headers) #Write method, write a line, write the header first,
        writer.writerows(rows) #Write multiple lines in the list,
                        
if __name__ == '__main__'
    csv.write()

Operation result:

One less r,
Run results after changes
There is no change in the running result, but there is an extra my course.csv in the project file on the left, as shown in the right figure. A blank line is added between each line of text in the result.
When creating a file just now, add newline = ', an empty character, and the default value after newline is / n. replace it.

 with open('my_course.csv','w',encoding='utf8',newline=' ') as f

Operation result

As a result, the comma in the text is not the comma in the tuple, but created by the wrapped csv.write.

Put reading and writing together

import csv

def csv_reader()
    """read csv"""
    with open('my_course.csv',encoding='utf8') as f
    reader=csv.reader(f)
    headers = next(reader)
    print(headers)
    for row in reader
        print(row)

def csv_write()
    """Write in csv"""
    headers = ['number'.'curriculum','lecturer'] 
    rows = [(1,'Python','Eason'), 
            (2,'C#','Eason'),
            (3,'Django','Eason'),
            (4,'.NET','Eason')]

    with open('my_course.csv','w',encoding='utf8') as f
        writer = csv.writer(f)
        writer.writerow(headers) 
        writer.writerows(rows) 
                        
if __name__ == '__main__'
    csv.write()

Operation result

Dictionary table content writing

import csv

def csv_reader()
    """read csv"""
    with open('my_course.csv',encoding='utf8') as f
    reader=csv.reader(f)
    headers = next(reader)
    print(headers)
    for row in reader
        print(row)

def csv_write_dict()
    """with dict Form write csv""" #Prepare data as a dictionary    
    headers = ['ID'.'Title','Org','Url']#In the list, the content in the bottom rows is still a list outside, and the content in the bottom rows is changed to dict.
    rows = [{'ID':1,'title':'Python','Org':'Youpinketang','Url':'http://uke.cc'}
            {'ID':1,'title':'Python','Org':'Youpinketang','Url':'http://uke.cc'}
            {'ID':1,'title':'Python','Org':'Youpinketang','Url':'http://uke.cc'}
            {'ID':1,'title':'Python','Org':'Youpinketang','Url':'http://uke.cc'}
            dict(ID:1,title:'Python',Org:'Youpinketang',Url:'http://uke.cc')
            dict(ID:1,title:'Python',Org:'Youpinketang',Url:'http://uke.cc')]
#Dictionary table declaration method, directly write curly bracket / write dict function
    with open('my_course2.csv','w',encoding='utf8',newline=' ') as f
    writer = csv.DictWriter(f,headers) #Take headers as the second parameter of the function, and then construct the file with DictWriter to know that headers are headers. Just to know, not to write
    writer = csv.writer(f)
    writer.writeheader() #Writer method writeheader
    writer.writerows(rows) 
                        
if __name__ == '__main__'
    csv.writer_dict()
#The dictionary table only has this concept in the code. After writing to csv, the format is unified, separated by commas. There is no dictionary table


The existing one is being written once, and the original one will be overwritten.

JSON file processing

json introduction

JSON is also a standard format for exchanging data between different programs or framework languages. In essence, it is a text file, which is actually a document of comments described by javaScript.

JSON format, no matter what language of web development, will be used. Because their open API s return JSON,
XML is relatively complex in structure, a little trouble in parsing, and JSON is relatively lightweight.
json is an API interface for web development, which is to put it on the Internet and publish a public URL address. If someone requests it, they need to submit data / transmission parameters. After passing it to the URL address, a json document will be returned in the browser. This document is a string. The format looks very similar, but it is not the same.

If you want to convert the dictionary table in python to the object described in JSON format, you can: declare a string in JSON format, which comes from the data just now. Using the json.dump s method, dump (dump), adding an S is not a complex number, it is to change the content passed in brackets into a string form.

The format is similar to the dictionary table in python, but there are also some differences,
Give an example:
Open json.org for details

Dictionary table converted to json data

To convert the dictionary table in python to the object described in JSON format, you can: declare a string in JSON format, which comes from the data just now. Using the json.dump s method, dump (dump), adding an S is not a complex number, it is to change the content passed in brackets into a string form.

New working directory and file

import json

def json_basic()
    data = {'ID':1,
            'curriculum':'Python explain incisively',
            'mechanism':'Superior class',
            'Unit Price':98.00,
            'Website':'http://codr.cn'}
    print('Raw data')
    print(data)
    print('-'*20)#Dividing line
# To convert the dictionary table in python to the object described in JSON format, you can: declare a string in JSON format, which comes from the data just now. Using the json.dumps method,
    json_str = json.dumps(data) #The data in the data dictionary table has been changed to a string in json format, which has not been written to a file
    print(json_str) #If you want to make a comparison, you can print out the previous data 

Operation result

The upper line is the original data, the lower line is the string of json format data, and utf8 encoding is also done.
It can be seen from the figure that the structure of the dictionary table is the same as that of json, which is different in semantics. json belongs to the string of json, and the dictionary is the dictionary table of python.

Return json string as python's dictionary table

JSON data is converted to dict, json_data = json.loads, s is not the third person singular, but the meaning of string.

import json

def json_basic()
    data = {'ID':1,
            'curriculum':'Python explain incisively',
            'mechanism':'Superior class',
            'Unit Price':98.00,
            'Website':'http://codr.cn'}
    print('Raw data')
    print(data)
    print('-'*20)#Dividing line
# To convert the dictionary table in python to the object described in JSON format, you can: declare a string in JSON format, which comes from the data just now. Using the json.dumps method,
    json_str = json.dumps(data) #The data in the data dictionary table has been changed to a string in json format, which has not been written to a file
    print(json_str) #If you want to make a comparison, you can print out the previous data 
    print('--'*20)
# json data to dict
    json_data = json.loads(json_str) #Restore the result from the JSON STR string to the python dictionary table
    print(json_data)

Operation result

json document operation

Written by json document
json.dump(data,f)
Read from json document
json.load()

In the actual development, it is more to write it into a file. When doing web programming later, there is a request api. Here, a json document is returned to your browser for you to display, so it can be saved as a json document directly, or written to the web page this time.
Next, treat it as a file

import json

def json_writer_file()
    """Write in json File"""
    data = {'ID':1,
            'curriculum':'Python explain incisively',
            'mechanism':'Superior class',
            'Unit Price':98.00,
            'Website':'http://codr.cn'}
    with open('data.json','w',encoding='utf8') as f
    json.dump(data,f)  #dump does not add s because it is not written to a string, but directly to a document. Write the data in f.

if __name__ == '__main__'
    json_writer_file()

Operation result

This result can be taken out and put on the Internet as an interface data for external publishing. To be a web api, you need to output it to html,

import json

def json_writer_file()
    """Write in json File"""
    data = {'ID':1,
            'curriculum':'Python explain incisively',
            'mechanism':'Superior class',
            'Unit Price':98.00,
            'Website':'http://codr.cn'}
    with open('data.json','w',encoding='utf8') as f
    json.dump(data,f)  #dump does not add s because it is not written to a string, but directly to a document. Write the data in f.

def json_read_file()
    """read json file"""
    with open('data.json','r',encoding='utf8') as f
    data = json.load(f)
    print(data)
        
if __name__ == '__main__'
    json_writer_file()

Operation result

The dict of restoring data to python

Difference between dict and json types: true/false, different writing methods of null

import json

def json_type_diff()
    """ Type difference"""
   print(json.dumps(False))
   print(json.dumps(None))
   data = {'Discontinued':False,
           'Title':'iphone7s',
           'Category':None,
           'Price':5999.00}
    print(json.dumps(data))
        
if __name__ == '__main__'
    json_type_diff()

False in python, false in json
None----null

Other operations. For example, put the key value pair of json into the class and load it as a member of the class

Reading excel file

Install xlrd

excel files can also be exported through the database

python does not have a built-in module to deal with excel. It installs the third-party package. It uses pip to install under the command line. In pycharm, it installs directly on the interface. It searches for xlrd (read package), xlwr (written package), and more xlutils (tool kit)

Currently, only read package xlrd is used

read

  1. Form information

    If there are many methods written in a script, different methods handle different documents, some handle json, some handle csv, and the import package is not put on the top of the import line, write it into the function
def xl_read()
    """excel read"""
    import xlrd #Import in function, valid only in current function, not outside
    book =xlrd.open_workbook('product.xls')#xlrd.openworkbook Method, open a table #Next, find the workbook sheet to operate on
    for sheet in book.sheets() #book.sheets method. It can be seen from the name that sheets are a collection. This method function can find all sheet labels and traverse sheets in the sheet,
        print(sheet.name) #Workbook, sheet, write properties to him in object-oriented form, print out the property name of sheet

if __name__ == '__main__'
    xl_read()                  

As a result, the name of each workbook in the worksheet appears

  1. print data
def xl_read()
    """excel read"""
    import xlrd #Import in function, valid only in current function, not outside
    book =xlrd.open_workbook('product.xls')
    for sheet in book.sheets()
        print(sheet.name) 

def xl_read_data()
    """data fetch"""
    book = xlrd.open_workbook('product.xls')
    sheet = book.sheet_by_name('Product') #Filter out which sheet I want to operate, the. Sheet by index method, the. Sheet by name method, and the name of the sheet
    print('workbook:{}'.format(sheet.name))
    print('Data rows:{}'.format(sheet.nrows))
# Traversal, you can use for traversal, but the data in excel may be large, for one-time reading, low efficiency. After knowing the number of rows, you can cycle the specified number of times and only find the specified rows
    print('Product data')
    print('=' * 50)
    for i in range((sheet.nrows))
        print(sheet.row_values(i)) #Gets the data row specified by the index. . row? Values method
        
if __name__ == '__main__'
    xl_read()              

Function

int cannot traverse. Use range, for i in range((sheet.nrows))
Print results:

  1. As long as a specific row, you can further traverse, loop nesting, loop to this row, and recycle each column of this row. If a row is looped out, it is a list. Every column in this list can also be looped out
Published 21 original articles, won praise 3, visited 2310
Private letter follow

Keywords: JSON Python encoding Database

Added by Cesar on Fri, 21 Feb 2020 06:38:58 +0200