1. File Pointer
The file pointer marks the position from which data will be read next.
When a file is first opened, the file pointer usually points to the beginning of the file. When the read method executes, the file pointer moves to the end of the content that was read, so a read() call with no argument leaves it at the end of the file.
Example [1]
Source code to execute:
Note: Create a test file of your own rather than working on files or directories under the root directory, so that nothing important is damaged if an operation goes wrong.
# Open the file
file = open('passwd')
# The read method reads the contents of the file (returns all of it at once)
text = file.read()
print text
print type(text)
print len(text)
print '********'
# Move the file pointer back to the beginning of the file
file.seek(0)
text = file.read()
print len(text)
# The close method closes the file
file.close()
The passwd file used in this example has the following contents:
Execution result:
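The movement of the pointer can be made visible with tell(), which returns the current position. The following is a minimal sketch, assuming a throwaway file created with tempfile (the file name pointer_demo.txt and its contents are made up for illustration); it is written to run on both Python 2.7 and Python 3:

```python
import os
import tempfile

# Hypothetical sample file, created only for this demonstration
path = os.path.join(tempfile.mkdtemp(), 'pointer_demo.txt')
f = open(path, 'w')
f.write('first line\nsecond line\n')
f.close()

f = open(path)
start = f.tell()        # 0: the pointer starts at the beginning
first = f.read()        # reads everything up to the end of the file
after_read = f.tell()   # the pointer now sits at the end of the file
second = f.read()       # nothing is left, so this is an empty string
f.seek(0)               # move the pointer back to the beginning
third = f.read()        # the full content again
f.close()

print(start)            # 0
print(repr(second))     # ''
print(third == first)   # True
```

The second read() returns an empty string because the pointer is already at the end; only after seek(0) does read() return the content again.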
2. How to Read Large Files
read method: by default it loads the entire contents of the file into memory at once, so a very large file can consume a great deal of memory
readline method: reads one line at a time; after each call the file pointer moves to the start of the next line, ready for the next read
How to read large files correctly:
# Read large files correctly
file = open('passwd')
# Why an infinite loop?
# Because we do not know in advance how many lines the file contains
while True:
    text = file.readline()
    # Check whether anything was read:
    # once the file pointer reaches the end of the file, readline returns an empty string
    if not text:
        break
    # Each line read already ends with '\n'
    print text
file.close()
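A file object can also be iterated directly, which yields one line per step and likewise avoids loading the whole file. This is a sketch rather than the article's own method; the helper name count_lines and the temporary file big.txt are assumptions made for the example, and the code runs on both Python 2.7 and Python 3:

```python
import os
import tempfile

def count_lines(path):
    """Count lines while holding only one line in memory at a time."""
    total = 0
    f = open(path)
    for line in f:      # the file object yields one line per iteration
        total += 1
    f.close()
    return total

# Hypothetical sample file with 1000 lines
path = os.path.join(tempfile.mkdtemp(), 'big.txt')
out = open(path, 'w')
for i in range(1000):
    out.write('line %d\n' % i)
out.close()

line_count = count_lines(path)
print(line_count)       # 1000
```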
3. Copying Files
The task is to open an existing file, read its entire content, and write that content to another file
# Open the source file read-only
file_read = open('passwd')
# Open the target file in write-only mode
file_write = open('passwd_copy', 'w')
# Read the content of the source file
text = file_read.read()
# Write the content to the target file
file_write.write(text)
# Close both files
file_read.close()
file_write.close()
After the code has run, a file called passwd_copy appears in the current directory
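Because read() with no argument pulls the whole source file into memory, a large file is better copied in pieces. The sketch below passes a size to read() and loops until an empty string comes back. The function name copy_in_chunks, the chunk size, and the sample file names are all hypothetical; the standard library's shutil.copyfileobj works in a similar way. The code runs on both Python 2.7 and Python 3:

```python
import os
import tempfile

def copy_in_chunks(src_path, dst_path, chunk_size=4096):
    """Copy a text file, reading at most chunk_size characters per step."""
    src = open(src_path)
    dst = open(dst_path, 'w')
    while True:
        chunk = src.read(chunk_size)
        if not chunk:       # empty string: end of file reached
            break
        dst.write(chunk)
    src.close()
    dst.close()

# Hypothetical source and target files for the demonstration
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, 'source.txt')
target = os.path.join(workdir, 'source_copy.txt')

f = open(source, 'w')
f.write('root:x:0:0\n' * 2000)
f.close()

copy_in_chunks(source, target, chunk_size=1024)

f = open(source)
original = f.read()
f.close()
f = open(target)
copied = f.read()
f.close()

print(original == copied)   # True
```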
4. Processing Files
A file can be opened for reading, writing, or appending, using the modes r, r+, w, w+, a, and a+
What each mode does:
r:
- Read only, cannot write
- If the file does not exist, an error is raised
r+:
- Read and write
- If the file does not exist, an error is raised
w:
- Write only, cannot read
- If the file exists, its content is emptied and overwritten
- If the file does not exist, a new one is created
w+:
- Read and write
- If the file exists, its content is emptied and overwritten
- If the file does not exist, a new one is created
a:
- Write only, cannot read
- If the file does not exist, a new one is created without an error
- If the file exists, its content is not emptied; new data is added at the end
a+:
- Read and write
- If the file does not exist, a new one is created
- If the file exists, its content is not overwritten
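The differences between r, w, and a can be checked with a short experiment. This is a sketch under the assumption of a temporary file; the helpers write_text and read_text and the file name modes.txt are made up for illustration. It runs on both Python 2.7 and Python 3:

```python
import os
import tempfile

def write_text(path, mode, text):
    """Open path in the given mode, write text, and close the file."""
    f = open(path, mode)
    f.write(text)
    f.close()

def read_text(path):
    """Return the full content of path."""
    f = open(path, 'r')
    text = f.read()
    f.close()
    return text

path = os.path.join(tempfile.mkdtemp(), 'modes.txt')

# 'r' on a file that does not exist raises an error
try:
    open(path, 'r')
    missing_raises = False
except IOError:
    missing_raises = True

# 'w' creates the file, then empties it on the second open
write_text(path, 'w', 'one\n')
write_text(path, 'w', 'two\n')
after_w = read_text(path)

# 'a' keeps the existing content and adds to the end
write_text(path, 'a', 'three\n')
after_a = read_text(path)

print(missing_raises)   # True
print(repr(after_w))    # 'two\n'
print(repr(after_a))    # 'two\nthree\n'
```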
The three steps of file handling: open, operate (r, w, a), close
f = open('passwd', 'r')
content = f.read()
# f.write('redhat')  # Write to the file
print content
f.close()
Here the file is opened read-only
If you uncomment the line that writes to the file, an error is raised
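The error can be reproduced safely on a scratch file. In this sketch the file name readonly_demo.txt is an assumption, and the exception is caught as IOError, which covers the error raised on both Python 2.7 and Python 3:

```python
import os
import tempfile

# Hypothetical scratch file
path = os.path.join(tempfile.mkdtemp(), 'readonly_demo.txt')
f = open(path, 'w')
f.write('hello\n')
f.close()

# Open read-only and attempt a write
f = open(path, 'r')
try:
    f.write('redhat')
    write_failed = False
except IOError:
    write_failed = True     # writing is not allowed in 'r' mode
finally:
    f.close()

# The file content is unchanged
f = open(path, 'r')
content = f.read()
f.close()

print(write_failed)         # True
print(repr(content))        # 'hello\n'
```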
5. Read the contents of binary files
f1 = open('han.jpeg', mode='rb')    # The binary file han.jpeg already exists
content = f1.read()
f1.close()
f2 = open('happy.png', mode='wb')   # happy.png does not exist yet, so it is created
f2.write(content)
f2.close()
Contents of the binary file han.jpeg
The resulting happy.png file
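Binary mode copies the bytes exactly as they are, without any newline translation or decoding. A small sketch, assuming made-up file names and an arbitrary byte string in place of a real image; it runs on both Python 2.7 and Python 3:

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, 'sample.bin')        # hypothetical source file
dst = os.path.join(workdir, 'sample_copy.bin')   # hypothetical target file

# Arbitrary bytes, including '\r\n', which text mode might alter
payload = b'\x89PNG\r\n\x1a\n' + b'\x00\xff' * 4

f = open(src, 'wb')
f.write(payload)
f.close()

# Copy in binary mode: read bytes, write the same bytes
f1 = open(src, 'rb')
content = f1.read()
f1.close()
f2 = open(dst, 'wb')
f2.write(content)
f2.close()

f = open(dst, 'rb')
copied = f.read()
f.close()

print(copied == payload)    # True
print(len(copied))          # 16
```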
6. Processing files with with open() as
With the with statement you only open the file and use it as needed; Python closes it automatically once the with block is left
In this style we call open() but never call close(). You can still call open() and close() yourself, but if a bug in the program prevents the close() statement from executing, the file stays open, and a file that is not closed properly may lose or corrupt data.
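The closed attribute of a file object shows this behavior directly. The sketch below uses a temporary file (the name with_demo.txt is an assumption) and a deliberately raised ValueError that stands in for a bug; it runs on both Python 2.7 and Python 3:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'with_demo.txt')

# Normal case: the file is closed as soon as the block ends
with open(path, 'w') as f:
    f.write('data\n')
closed_after_block = f.closed

# Error case: the file is still closed, even though the block was left early
handle = None
try:
    with open(path) as f:
        handle = f
        raise ValueError('simulated bug')
except ValueError:
    pass
closed_after_error = handle.closed

print(closed_after_block)   # True
print(closed_after_error)   # True
```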
1. Read the contents of the file
with open('passwd') as f:
    lines = f.readline()
    print lines
The readline method can only read one line at a time
2. Read file contents line by line
with open('passwd') as f:
    # readlines returns a list with one string per line
    lines = f.readlines()
    print lines
    for line in lines:
        print line
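The difference between readline and readlines is easiest to see from their return values: the first returns one string, the second a list of strings. A short sketch on a made-up three-line file (the file name lines_demo.txt is an assumption); it runs on both Python 2.7 and Python 3:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'lines_demo.txt')
with open(path, 'w') as f:
    f.write('a\nb\nc\n')

with open(path) as f:
    one = f.readline()      # a single string: the first line
    rest = f.readlines()    # a list with the remaining lines

with open(path) as f:
    # Iterating the file gives each line; rstrip removes the trailing '\n'
    stripped = [line.rstrip('\n') for line in f]

print(repr(one))            # 'a\n'
print(rest)                 # ['b\n', 'c\n']
print(stripped)             # ['a', 'b', 'c']
```

Note that readlines continues from wherever the file pointer currently is, which is why it returns only the lines after the first one here.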
7. File Exercise
Task:
- Generate a large file ips.txt with 1200 lines, each containing a random IP address from the 172.25.254.0/24 segment;
- Read the ips.txt file and find the 10 IP addresses that occur most often in it.
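import random

def create_ips_file(filename):
    # Candidate addresses 172.25.254.1 through 172.25.254.254
    ips = ['172.25.254.' + str(i) for i in range(1, 255)]
    # 'w' rather than 'a+', so that rerunning the script still leaves exactly 1200 lines
    with open(filename, 'w') as f:
        for count in range(1200):
            f.write(random.choice(ips) + '\n')

def sorted_by_ip(filename, count=10):
    ips_dict = dict()
    with open(filename) as f:
        for ip in f:
            ip = ip.strip()  # drop the trailing newline
            if ip in ips_dict:
                ips_dict[ip] += 1
            else:
                ips_dict[ip] = 1
    # Sort by number of occurrences, largest first, and keep the top entries
    sorted_ip = sorted(ips_dict.items(), key=lambda x: x[1], reverse=True)[:count]
    return sorted_ip

# Generate the file first, then count
create_ips_file('ips.txt')
print sorted_by_ip('ips.txt')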
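The counting step can also be handed to collections.Counter from the standard library (available from Python 2.7), whose most_common method returns the most frequent items in descending order. This is an alternative sketch, not the exercise's original solution; the function name top_ips and the small sample file are assumptions chosen so that the result is predictable:

```python
import collections
import os
import tempfile

def top_ips(filename, count=10):
    """Return the `count` most frequent lines of the file as (ip, occurrences) pairs."""
    with open(filename) as f:
        # Strip the newline from each line and skip blank lines
        counter = collections.Counter(
            line.strip() for line in f if line.strip()
        )
    return counter.most_common(count)

# Hypothetical sample: .1 appears three times, .2 twice, .3 once
path = os.path.join(tempfile.mkdtemp(), 'ips_sample.txt')
with open(path, 'w') as f:
    for last in [1, 2, 1, 3, 1, 2]:
        f.write('172.25.254.%d\n' % last)

result = top_ips(path, count=2)
print(result)   # [('172.25.254.1', 3), ('172.25.254.2', 2)]
```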