File handling in Python

1. File Pointer

The file pointer marks the position from which data will be read

When a file is first opened, the file pointer normally points to the beginning of the file. When the read method executes, the file pointer moves to the end of the content that was read
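
A small sketch of this behaviour, assuming Python 3 and a throwaway temporary file (the name demo_path and the sample text are made up for illustration). It uses tell() to report the current pointer position and seek(0) to move the pointer back to the start:

```python
import os
import tempfile

# Hypothetical sample file, created only for this demonstration
fd, demo_path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as f:
    f.write('first line\nsecond line\n')

f = open(demo_path)
start = f.tell()          # pointer position right after opening
text = f.read()           # reading moves the pointer past what was read
after_read = f.tell()
second_read = f.read()    # nothing is left, so this is an empty string
f.seek(0)                 # move the pointer back to the beginning
after_seek = f.read()     # the whole content is available again
f.close()
os.remove(demo_path)

print(start, after_read, repr(second_read), len(after_seek))
```

The second read returns an empty string because the pointer already sits at the end of the file; only after seek(0) does read return the content again.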

Example [1]

Source code to execute:

Note: Create a test file of your own rather than working on a directory or file under the root directory, in case the contents of the file are damaged during the operation

# Open the file
file = open('passwd')
# The read method reads the contents of the file (returns the entire contents at once)
text = file.read()
print(text)
print(type(text))
print(len(text))
print('********')
# Move the file pointer back to the beginning of the file
file.seek(0)
text = file.read()
print(len(text))
# The close method closes the file
file.close()

2. How to Read Large Files

read method: by default, reads the entire contents of the file into memory at once; if the file is very large, this can use a great deal of memory

readline method: reads one line at a time; after each call the file pointer moves to the start of the next line, ready for the next read

How to read large files correctly:

# Read a large file correctly
file = open('passwd')
# Why an infinite loop?
# Because we do not know in advance how many lines the file has, we cannot state the loop condition up front
while True:
    text = file.readline()
    # Check whether any content was read
    # Once the file pointer reaches the end of the file, readline returns an empty string
    if not text:
        break
    # Each line read already ends with '\n'
    print(text)
file.close()
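
The same loop is often written by iterating over the file object itself, which also hands back one line at a time instead of loading the whole file. The sketch below assumes Python 3; the temporary file and the helper name count_lines are made up for the example.

```python
import os
import tempfile

def count_lines(path):
    """Count the lines of a file while processing one line at a time."""
    total = 0
    with open(path) as f:
        for line in f:      # the file object yields one line per iteration
            total += 1
    return total

# Hypothetical test file with 1000 short lines
fd, demo_path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as f:
    for i in range(1000):
        f.write('line %d\n' % i)

line_count = count_lines(demo_path)
os.remove(demo_path)
print(line_count)
```

The for loop stops by itself at the end of the file, so no explicit check for an empty string is needed.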

3. Copying Files

The task is to open an existing file, read its entire content, and write that content to another file

# Open the source file read-only
file_read = open('passwd')
# Open the target file in write-only mode
file_write = open('passwd_copy', 'w')

# Read the content from the source file
text = file_read.read()
# Write the content to the target file
file_write.write(text)

# Close both files
file_read.close()
file_write.close()

After the code runs, a file called passwd_copy appears next to the source file
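
For a large source file, reading everything with a single read call runs into the memory problem from section 2. One possible way around it, sketched here under the assumption of Python 3, is to copy in fixed-size chunks. The helper name copy_in_chunks, the 64 KB chunk size, and the file names are illustrative choices, not part of the original example.

```python
import os
import tempfile

def copy_in_chunks(src_path, dst_path, chunk_size=64 * 1024):
    """Copy src_path to dst_path, reading at most chunk_size bytes at a time."""
    with open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:       # an empty bytes object means end of file
                break
            dst.write(chunk)

# Hypothetical source file standing in for 'passwd'
work_dir = tempfile.mkdtemp()
src_path = os.path.join(work_dir, 'source.txt')
dst_path = os.path.join(work_dir, 'source_copy.txt')
with open(src_path, 'w') as f:
    f.write('root:x:0:0\n' * 10000)

copy_in_chunks(src_path, dst_path)

with open(src_path, 'rb') as a, open(dst_path, 'rb') as b:
    same = a.read() == b.read()
print(same)
```

The standard library's shutil.copyfile does a similar job when no custom processing is needed.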

4. Processing Files

The modes a file can be opened in: read (r, r+), write (w, w+), and append (a, a+)

What each mode does:

r (the default):
    -Read only, cannot write
    -If the file does not exist, an error occurs
r+:
    -Can read and write
    -If the file does not exist, an error occurs
w:
    -Write only, cannot read
    -Existing content is emptied before writing
    -If the file does not exist, a new one is created
w+:
    -Can read and write
    -If the file exists, its original content is overwritten
    -If the file does not exist, a new one is created
a:
    -Write only, cannot read
    -If the file does not exist, a new one is created without an error
    -If the file exists, its content is not emptied
a+:
    -Can read and write
    -If the file does not exist, a new one is created
    -If the file exists, its original content is not overwritten
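
The differences between the modes can be checked with a short experiment. This is a sketch assuming Python 3 (where a missing file raises FileNotFoundError); the file name modes_demo.txt is made up for the demonstration.

```python
import os
import tempfile

work_dir = tempfile.mkdtemp()
path = os.path.join(work_dir, 'modes_demo.txt')   # hypothetical file name

# 'r' on a missing file raises an error
try:
    open(path, 'r')
    missing_raised = False
except FileNotFoundError:
    missing_raised = True

# 'w' creates the file; opening with 'w' again empties it first
with open(path, 'w') as f:
    f.write('first\n')
with open(path, 'w') as f:
    f.write('second\n')
with open(path) as f:
    after_w = f.read()

# 'a' keeps the existing content and writes at the end
with open(path, 'a') as f:
    f.write('third\n')
with open(path) as f:
    after_a = f.read()

# 'r+' can read and write without emptying the file
with open(path, 'r+') as f:
    before = f.read()
    f.write('fourth\n')
with open(path) as f:
    after_r_plus = f.read()

print(missing_raised, repr(after_w), repr(after_a), repr(after_r_plus))
```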

The three steps of a file operation: open, operate (read, write, or append), close

f = open('passwd', 'r')
content = f.read()
# f.write('redhat')    # Writing fails here because the file is open read-only
print(content)
f.close()

The file is opened read-only, so if you uncomment the f.write line, you will get an error
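
To see the error without touching a real file, the following sketch writes to a temporary file opened with mode 'r'. It assumes Python 3, where the exception is io.UnsupportedOperation (Python 2 reports an IOError instead); the file and variable names are illustrative.

```python
import io
import os
import tempfile

# Hypothetical stand-in for the passwd file
fd, demo_path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as f:
    f.write('root:x:0:0\n')

f = open(demo_path, 'r')
try:
    f.write('redhat')               # not allowed on a read-only file
    error_name = None
except io.UnsupportedOperation as exc:
    error_name = type(exc).__name__
finally:
    f.close()                       # close the file whether or not the write failed
os.remove(demo_path)
print(error_name)
```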

5. Read the contents of binary files

f1 = open('han.jpeg', mode='rb')    # The binary file han.jpeg already exists
content = f1.read()
f1.close()

f2 = open('happy.png', mode='wb')    # happy.png does not exist yet, so it is created
f2.write(content)
f2.close()

After the code runs, happy.png contains the same bytes as han.jpeg
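
The sketch below repeats the idea with a few made-up bytes instead of a real image, so it can run anywhere. It assumes Python 3, where a file opened in 'rb' mode returns a bytes object rather than a str; the file names and the payload are placeholders.

```python
import os
import tempfile

work_dir = tempfile.mkdtemp()
src_path = os.path.join(work_dir, 'demo.bin')       # stands in for han.jpeg
dst_path = os.path.join(work_dir, 'demo_copy.bin')  # stands in for happy.png

# Arbitrary sample bytes, including newline-like values
payload = bytes([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10, 0x0A, 0x0D])
with open(src_path, 'wb') as f:
    f.write(payload)

with open(src_path, 'rb') as f1:
    content = f1.read()          # bytes, not str
with open(dst_path, 'wb') as f2:
    f2.write(content)

with open(dst_path, 'rb') as f:
    copied = f.read()
print(type(content).__name__, copied == payload)
```

Binary mode matters because it passes the bytes through unchanged, while text mode decodes them and may translate line endings.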

6. Processing files with the with open() as statement

With the with statement, you only open the file and use it as needed; Python closes it automatically when the block ends

Here we call open() but never call close(). You can also call open() and close() yourself, but if a bug in the program prevents the close() statement from executing, the file is never closed, and a file that is not closed properly may lose or corrupt data.
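
A quick way to check this is the closed attribute of the file object. The sketch assumes Python 3 and a temporary file; the RuntimeError is raised on purpose to simulate a bug inside the block.

```python
import os
import tempfile

fd, demo_path = tempfile.mkstemp(suffix='.txt')
os.close(fd)

# Normal case: the file is closed as soon as the block ends
with open(demo_path, 'w') as f:
    f.write('hello\n')
closed_after_block = f.closed

# Error case: the file is still closed even though the block raised
try:
    with open(demo_path) as g:
        raise RuntimeError('simulated bug')
except RuntimeError:
    pass
closed_after_error = g.closed

os.remove(demo_path)
print(closed_after_block, closed_after_error)
```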

1. Read the contents of the file

with open('passwd') as f:
    # readline returns a single line
    lines = f.readline()
    print(lines)

The readline method reads only one line per call, so only the first line is printed

2. Read file contents line by line

with open('passwd') as f:
    # readlines returns a list with one string per line
    lines = f.readlines()
    print(lines)
for line in lines:
    print(line)
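
Because every line still ends with '\n', printing the lines directly leaves a blank line between them. A possible variation, sketched for Python 3 with a made-up temporary file, strips the trailing newline while looping:

```python
import os
import tempfile

# Hypothetical three-line sample file
fd, demo_path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as f:
    f.write('root\nbin\ndaemon\n')

names = []
with open(demo_path) as f:
    for line in f:
        names.append(line.rstrip('\n'))   # drop the trailing newline
os.remove(demo_path)
print(names)
```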

7. File Exercise

Task:

  1. Generate a large file ips.txt with 1200 lines, each line holding a random IP address from the 172.25.254.0/24 segment;
  2. Read the ips.txt file and find the 10 IP addresses that occur most often in it.
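
import random

def create_ips_file(filename):
    # Candidate addresses 172.25.254.1 to 172.25.254.254
    ips = ['172.25.254.' + str(i) for i in range(1, 255)]
    # 'a+' appends, so running this twice leaves 2400 lines in the file
    with open(filename, 'a+') as f:
        for count in range(1200):
            f.write(random.sample(ips, 1)[0] + '\n')

def sorted_by_ip(filename, count=10):
    # Map each IP (one per line) to its number of occurrences
    ips_dict = dict()
    with open(filename) as f:
        for ip in f:
            if ip in ips_dict:
                ips_dict[ip] += 1
            else:
                ips_dict[ip] = 1
    # Sort by occurrence count, highest first, and keep the top entries
    sorted_ip = sorted(ips_dict.items(),
                       key=lambda x: x[1], reverse=True)[:count]
    return sorted_ip

create_ips_file('ips.txt')
print(sorted_by_ip('ips.txt'))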
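
The counting step can also be written with collections.Counter from the standard library. This is an alternative sketch, not the solution above; it assumes Python 3, and the helper name top_ips and the temporary test file are made up for the example.

```python
import os
import random
import tempfile
from collections import Counter

def top_ips(filename, count=10):
    """Return the most frequent lines of filename as (ip, hits) pairs."""
    with open(filename) as f:
        counter = Counter(line.strip() for line in f)
    return counter.most_common(count)

# Hypothetical test data in the same shape as ips.txt
fd, demo_path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as f:
    for _ in range(1200):
        f.write('172.25.254.%d\n' % random.randint(1, 254))

top = top_ips(demo_path)
os.remove(demo_path)
print(top)
```

most_common returns the pairs already sorted from most to least frequent, and strip() removes the trailing newline so the keys are clean IP strings.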



Keywords: Python Lambda

Added by lakshmiyb on Wed, 15 May 2019 16:32:54 +0300