File operation
1. Text file
A text file stores ordinary character text. Python 3 handles text as Unicode strings, which are encoded to bytes (for example as UTF-8) when written to disk, so one character may occupy one or more bytes. Text files can be opened with Notepad. Documents edited with Microsoft Word, however, are not text files.
2. Binary files
Binary files store their content as raw bytes and cannot be read meaningfully in Notepad; software that understands the format is needed to decode them. Common examples are MP4 video files, MP3 audio files, JPG pictures, and .doc documents.
Overview of modules related to file operation
Create file object (open)
The open() function is used to create file objects. The basic syntax format is as follows:
open(filename[, mode])
f = open(r"d:\b.jpg","bw")
Opening method:
Creation of text file objects and binary file objects:
If the mode does not include "b", open() creates a text file object whose basic unit of processing is the character. If the mode includes "b", it creates a binary file object whose basic unit of processing is the byte.
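The difference between the two kinds of file object can be seen by reading the same file in both modes. This is a minimal sketch; the scratch file demo.txt and the temporary directory are assumptions made only for illustration.

```python
import os
import tempfile

# Hypothetical scratch file inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# Text mode: write() takes a str, which is encoded to bytes on disk.
with open(path, "w", encoding="utf-8") as f:
    f.write("h\u00e9llo")  # 5 characters, one of them non-ASCII

# Text mode: read() returns characters (str).
with open(path, "r", encoding="utf-8") as f:
    text = f.read()

# Binary mode: read() returns raw bytes.
with open(path, "rb") as f:
    raw = f.read()

print(type(text).__name__, len(text))  # str 5
print(type(raw).__name__, len(raw))    # bytes 6
```

The accented character takes two bytes in UTF-8, which is why the byte count is larger than the character count.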
write()/writelines() writes data
write(a): writes the string a to a file
writelines(b): writes a list of strings to a file without adding line breaks
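A short sketch of both methods, assuming a hypothetical scratch file notes.txt in a temporary directory. It shows that writelines() does not insert any separator between the strings.

```python
import os
import tempfile

# Hypothetical scratch file inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "notes.txt")

with open(path, "w", encoding="utf-8") as f:
    f.write("first line\n")              # write a single string
    f.writelines(["second", "third\n"])  # strings are written back to back

with open(path, "r", encoding="utf-8") as f:
    content = f.read()

print(repr(content))  # 'first line\nsecondthird\n'
```

If each string should end up on its own line, the line breaks have to be part of the strings themselves.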
close() closes the file stream
Because the underlying file is managed by the operating system, a file object we open should be closed explicitly by calling close(). When close() is called, any buffered data is first written to the file (flush() can also be called directly to do this), and then the file is closed and its resources are released.
To make sure an open file object is closed in every case, close() is usually combined with the finally clause of the exception mechanism, or the file is opened with the with statement.
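Both approaches can be sketched as follows, assuming a hypothetical scratch file log.txt in a temporary directory. In each case the file ends up closed even if the write raises an exception.

```python
import os
import tempfile

# Hypothetical scratch file inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "log.txt")

# Approach 1: try/finally guarantees that close() runs.
f = open(path, "w", encoding="utf-8")
try:
    f.write("written inside try\n")
finally:
    f.close()
print(f.closed)  # True

# Approach 2: the with statement closes the file when the block ends.
with open(path, "a", encoding="utf-8") as g:
    g.write("written inside with\n")
print(g.closed)  # True
```

The with form is shorter and is the one used in the rest of these notes.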
Reading of text files
Use the following three methods:
1.read([size])
Read size characters from the file and return them as results. If there is no size parameter, the entire file is read.
Reading to the end of the file returns an empty string.
2.readline()
Read a line and return it as a result. Reading to the end of the file returns an empty string.
3.readlines()
Reads all remaining lines of the text file, stores each line as a string in a list, and returns the list.
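The three methods can be combined on one file object, since each call continues from the current position. A minimal sketch, assuming a hypothetical three-line scratch file:

```python
import os
import tempfile

# Hypothetical scratch file with three lines.
path = os.path.join(tempfile.mkdtemp(), "lines.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("alpha\nbeta\ngamma\n")

with open(path, "r", encoding="utf-8") as f:
    first_three = f.read(3)       # 'alp'
    rest_of_line = f.readline()   # 'ha\n' (up to and including the line break)
    remaining = f.readlines()     # ['beta\n', 'gamma\n']
    at_end = f.read()             # '' because the end of the file was reached

print(repr(first_three), repr(rest_of_line), remaining, repr(at_end))
```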
enumerate() function
a = ["1111", "2222\n", "3333\n"]
b = enumerate(a)  # yields (index, item) pairs
c = [temp.rstrip() + "#" + str(index) for index, temp in enumerate(a)]
Binary file reading and writing
For example:
f = open(r"d:\a.txt", "wb")  # Writable binary file object, overwrite mode
f = open(r"d:\a.txt", "ab")  # Writable binary file object, append mode
f = open(r"d:\a.txt", "rb")  # Readable binary file object

# Copy a picture
with open("aa.gif", "rb") as f:
    with open("aa_copy.gif", "wb") as w:
        for line in f.readlines():
            w.write(line)
Using pickle serialization
In Python, an object is essentially a "memory block for storing data". Sometimes, it is necessary to save the "data of memory block" to the hard disk or transmit it to other computers through the network. At this time, you need to "serialize and deserialize objects". Object serialization mechanism is widely used in distributed and parallel systems.
Serialization refers to converting objects into "serialized" data form, storing them on hard disk or transmitting them to other places through network. Deserialization refers to the reverse process of converting the read "serialized data" into objects.
We can use the functions in the pickle module to realize serialization and deserialization.
The functions we use:
pickle.dump(obj, file): obj is the object to be serialized, and file is the file object it is stored in
pickle.load(file): reads data from file and deserializes it into an object
import pickle

with open(r"d:\data.dat", "wb") as f:
    a1 = "full name"
    a2 = 234
    a3 = [20, 30, 40]
    pickle.dump(a1, f)
    pickle.dump(a2, f)
    pickle.dump(a3, f)

# Objects are loaded back in the same order they were dumped
with open(r"d:\data.dat", "rb") as f:
    a1 = pickle.load(f)
    a2 = pickle.load(f)
    a3 = pickle.load(f)
    print(a1)
    print(a2)
    print(a3)
Reading and writing CSV files
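import csv

with open("a.csv", "r", newline="") as f:
    a_csv = csv.reader(f)
    rows = list(a_csv)  # the reader can be consumed only once, so keep the rows in a list
    print(rows)
    for row in rows:
        print(row)

with open("b.csv", "w", newline="") as f:
    b_csv = csv.writer(f)
    b_csv.writerow(["ID", "full name", "Age"])  # header row
    c = [["111", "DJ", "23"]]  # data rows only, so the header is not written twice
    b_csv.writerows(c)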
os and os.path modules
The os module lets us work with the operating system directly: we can run its executables and commands and operate on files and directories. It is a core foundation of system operation and maintenance.
import os

os.system("notepad.exe")            # Launch the Windows Notepad program
os.system("ping www.baidu.com")     # Run the ping command
os.startfile(r"Where files exist")  # Open a file with its associated program (Windows only)
import os

print(os.name)               # Windows -> nt; Linux and UNIX -> posix
print(os.sep)                # Windows -> \; Linux and UNIX -> /
print(repr(os.linesep))      # Windows -> \r\n; Linux -> \n
print(os.stat("file name"))  # Get file information
print(os.getcwd())           # Get the current working directory
os.mkdir("directory name")   # Create a directory; relative paths are relative to the current working directory
os.chdir("position")         # Change the working directory
os.rmdir("directory name")   # Remove an empty directory
os.makedirs("Multilevel directory")    # Create nested directories; in a path, ../ refers to the parent directory
os.removedirs("Multilevel directory")  # Only empty directories can be removed
dirs = os.listdir("Directory name")
print(dirs)
os.path module
The os.path module provides path-related operations (path tests, path splitting, path joining, and folder traversal).
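A few of the commonly used os.path functions are sketched here. The file report.txt and its temporary directory are assumptions made for illustration.

```python
import os
import tempfile

# Hypothetical directory and file used only for this sketch.
base = tempfile.mkdtemp()
path = os.path.join(base, "report.txt")  # path joining
with open(path, "w", encoding="utf-8") as f:
    f.write("x")

# Path tests
is_abs = os.path.isabs(path)    # True, mkdtemp returns an absolute path
is_dir = os.path.isdir(base)    # True
is_file = os.path.isfile(path)  # True
missing = os.path.exists(os.path.join(base, "missing.txt"))  # False

# Path splitting
directory, name = os.path.split(path)  # (base, 'report.txt')
stem, ext = os.path.splitext(name)     # ('report', '.txt')

size = os.path.getsize(path)           # file size in bytes: 1

print(is_abs, is_dir, is_file, missing, name, stem, ext, size)
```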
import os

path = os.getcwd()
file_list = os.listdir(path)  # List subdirectories and files
for filename in file_list:
    if filename.endswith("py"):
        print(filename)

# The same filter written as a list comprehension
file_list2 = [filename for filename in os.listdir(path) if filename.endswith("py")]
for f in file_list2:
    print(f, end="\t")
walk() recursively traverses all files and directories
os.walk() method:
For each directory it visits, it yields a tuple of 3 elements: (dirpath, dirnames, filenames)
dirpath: the path of the directory currently being visited
dirnames: all folders in the directory
filenames: all files in the directory
import os

all_files = []
path = os.getcwd()
list_files = os.walk(path)

for dirpath, dirnames, filenames in list_files:
    for dir in dirnames:
        all_files.append(os.path.join(dirpath, dir))
    for file in filenames:
        all_files.append(os.path.join(dirpath, file))

for file in all_files:
    print(file)
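To see exactly what os.walk() yields, it can help to run it on a small known tree. The sketch below builds a hypothetical tree (top.txt, a/mid.txt, a/b/leaf.txt) in a temporary directory and collects the files relative to the root.

```python
import os
import tempfile

# Build a small hypothetical directory tree.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "a", "b"))
for rel in ("top.txt", os.path.join("a", "mid.txt"), os.path.join("a", "b", "leaf.txt")):
    with open(os.path.join(root, rel), "w", encoding="utf-8") as f:
        f.write("")

found = []
for dirpath, dirnames, filenames in os.walk(root):
    dirnames.sort()  # sorting in place fixes the order in which subdirectories are visited
    for name in sorted(filenames):
        full = os.path.join(dirpath, name)
        found.append(os.path.relpath(full, root))

print(found)
```

By default the walk is top-down, so the files of a directory are reported before the files of its subdirectories.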
shutil module (copy and compression)
The shutil module is part of the Python standard library. It is mainly used to copy, move and delete files and folders. It can also compress and decompress files and folders.
The os module provides general operations on directories and files. As a supplement, the shutil module provides operations such as moving, copying, compressing and decompressing, which the os module does not provide.
Copy:
import shutil

shutil.copyfile("1.txt", "1_copy.txt")

# copytree only succeeds when the destination directory ("film") does not already exist
shutil.copytree("movie/steel", "film")

# Skip files that do not need to be copied; the destination must again not exist,
# so run this call instead of the one above
shutil.copytree("movie/steel", "film", ignore=shutil.ignore_patterns("*.txt", "*.html"))
Compression and decompression:
import shutil
import zipfile

# Compress a directory: make_archive(base_name, format, root_dir)
shutil.make_archive("Compressed to destination address/Compressed file name", "zip", "Destination address/Target file")

# Compress individual files with zipfile
z1 = zipfile.ZipFile("a.zip", "w")
z1.write("file name")
z1.close()

# Decompress
z2 = zipfile.ZipFile("d:/a.zip", "r")
z2.extractall("File directory")
z2.close()
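The copy and compression calls above use placeholder paths. The following sketch runs the same steps end to end on a hypothetical directory (src, containing keep.py and skip.txt) created under a temporary directory, so it can be executed as is.

```python
import os
import shutil
import tempfile
import zipfile

# Hypothetical source directory with two small files.
root = tempfile.mkdtemp()
src = os.path.join(root, "src")
os.mkdir(src)
for name in ("keep.py", "skip.txt"):
    with open(os.path.join(src, name), "w", encoding="utf-8") as f:
        f.write(name)

# Copy the tree, leaving out *.txt files; dst must not exist beforehand.
dst = os.path.join(root, "dst")
shutil.copytree(src, dst, ignore=shutil.ignore_patterns("*.txt"))
copied = sorted(os.listdir(dst))

# Compress the copied directory; make_archive returns the archive path.
archive = shutil.make_archive(os.path.join(root, "backup"), "zip", dst)

# Decompress it into another directory.
out = os.path.join(root, "unpacked")
with zipfile.ZipFile(archive, "r") as z:
    z.extractall(out)
unpacked = sorted(os.listdir(out))

print(copied, os.path.basename(archive), unpacked)
```

Opening the ZipFile in a with block closes it automatically, which avoids forgetting the close() call.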
Principle of recursive algorithm: directory tree structure display
import os

allfiles = []

def getAllFiles(path, level):
    childFiles = os.listdir(path)
    for file in childFiles:
        filepath = os.path.join(path, file)
        if os.path.isdir(filepath):
            getAllFiles(filepath, level + 1)  # recurse into the subdirectory
        allfiles.append("\t" * level + filepath)

getAllFiles(os.getcwd(), 0)  # assumed starting point: the current working directory

for f in reversed(allfiles):
    print(f)