Illustrated Python | File and directory operations

Author: Han Xinzi@ShowMeAI
Tutorial address: http://www.showmeai.tech/tuto...
Article address: http://www.showmeai.tech/article-detail/86
Notice: All rights reserved. To reprint, please contact the platform and the author, and cite the source.

1. Python file and directory operations and the os module

In real-world development we often need to read, traverse and modify files. Python's built-in os module handles these operations concisely. The common operations fall into three groups:

  • Folder operations: creating, modifying (renaming / moving), querying (listing, traversing) and deleting folders.
  • File operations: creating, modifying, reading and deleting files.
  • Path operations: working with folder or file paths, such as getting the absolute path, splitting the directory from the file name, and splitting off the extension.

To work with files and directories, first import the os module:

import os

2. Folder operation

The demonstrations use a local pythontest directory, which currently contains the following:

test
 │ test.txt
 └─test-1
     test-1.txt

test and test-1 are folders; test.txt and test-1.txt are files.

(1) Query operation

On Linux, we use ls / pwd / cd to list files, show the current path and change directory. The corresponding Python functions are:

  • listdir: list of files and directories
  • getcwd: get the current directory
  • chdir: change directory
  • stat: basic information of files and directories
  • walk: recursively traverse the directory
>>> os.chdir("./pythontest")  # change directory
>>> os.getcwd()                 # Get current directory
'/Users/ShowMeAI/pythontest'
>>> os.listdir("test")          # File and directory list, relative path
['test-1', 'test.txt']          
>>> os.listdir("/Users/ShowMeAI/pythontest/test")  # File and directory list, absolute path
['test-1', 'test.txt']
>>> os.stat("test")             # Get directory information
os.stat_result(st_mode=16877, st_ino=45805684, st_dev=16777221, st_nlink=11, st_uid=501, st_gid=20, st_size=352, st_atime=1634735551, st_mtime=1634735551, st_ctime=1634735551)
>>> os.stat("test/test.txt")    # Get file information
os.stat_result(st_mode=33188, st_ino=45812567, st_dev=16777221, st_nlink=1, st_uid=501, st_gid=20, st_size=179311, st_atime=1634699986, st_mtime=1634699966, st_ctime=1634699984)

The stat function returns basic information about a file or directory:

  • st_mode: file type and permission bits.
  • st_ino: inode number.
  • st_dev: device on which the inode resides.
  • st_nlink: number of hard links to the inode.
  • st_uid: user ID of the owner.
  • st_gid: group ID of the owner.
  • st_size: size of a regular file in bytes.
  • st_atime: time of last access.
  • st_mtime: time of last content modification.
  • st_ctime: time of last metadata change on Unix; creation time on Windows.

In daily use we mostly read st_size, st_ctime and st_mtime to get the file size, the creation (or metadata change) time and the modification time. Note that the times are expressed in seconds since the epoch, so they usually need to be converted to a readable date.
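The conversion from seconds to a readable date can be sketched as follows. This is a minimal illustration rather than part of the original tutorial: the helper name describe_file, the temporary demo file and the date format are all assumptions chosen for the example.

```python
import os
import tempfile
from datetime import datetime


def describe_file(path):
    """Return (size in bytes, modification time as 'YYYY-MM-DD HH:MM:SS')."""
    info = os.stat(path)
    # st_mtime is a number of seconds since the epoch; convert it to local time.
    modified = datetime.fromtimestamp(info.st_mtime)
    return info.st_size, modified.strftime("%Y-%m-%d %H:%M:%S")


# Demonstrate on a throwaway file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.txt")
    with open(demo_path, "wb") as fh:
        fh.write(b"hello")
    size, modified_text = describe_file(demo_path)
    print(size, modified_text)
```

The same conversion applies to st_atime and st_ctime.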

(2) Traversal operation

The walk function recursively traverses a directory. For each directory it visits, it yields root, dirs and files: the path of the directory currently being visited, the names of its subdirectories, and the names of the files in it.

data = os.walk("test")                   # Traverse the test directory
for root, dirs, files in data:           # One iteration per visited directory
    print("root:%s" % root)
    for dir_name in dirs:                # Subdirectories of root
        print(os.path.join(root, dir_name))
    for file_name in files:              # Files directly inside root
        print(os.path.join(root, file_name))
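As a further illustration of walk, the sketch below adds up the sizes of all files under a directory. The function name total_size and the temporary directory layout are assumptions made for this example; the layout only mimics the test / test-1 structure used above.

```python
import os
import tempfile


def total_size(dir_path):
    """Sum the sizes in bytes of all files below dir_path."""
    total = 0
    for root, dirs, files in os.walk(dir_path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


# Build a small throwaway tree: test/test.txt and test/test-1/test-1.txt
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "test", "test-1"))
    with open(os.path.join(tmp, "test", "test.txt"), "wb") as fh:
        fh.write(b"12345")
    with open(os.path.join(tmp, "test", "test-1", "test-1.txt"), "wb") as fh:
        fh.write(b"123")
    size_in_bytes = total_size(os.path.join(tmp, "test"))
    print(size_in_bytes)
```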

(3) Create operation

  • mkdir: create a single directory. If a parent directory in the path does not exist, creation fails
  • makedirs: create a directory together with any missing parent directories in the path
>>> os.mkdir("new")
>>> os.mkdir("new1/new1-1")          # The parent directory does not exist. An error is reported
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileNotFoundError: The system cannot find the specified path.: 'new1/new1-1'
>>> os.makedirs("new1/new1-1")       # The parent directory does not exist. It will be created automatically
>>> os.listdir("new1")
['new1-1']
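Calling makedirs on a directory that already exists raises FileExistsError unless exist_ok=True is passed. The sketch below shows both cases inside a temporary directory; the variable names are illustrative only.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "new1", "new1-1")
    os.makedirs(target)                  # Creates new1 and new1/new1-1
    try:
        os.makedirs(target)              # Second call: the directory already exists
        second_call_failed = False
    except FileExistsError:
        second_call_failed = True
    os.makedirs(target, exist_ok=True)   # No error: an existing directory is accepted
    still_there = os.path.isdir(target)
    print(second_call_failed, still_there)
```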

(4) Delete operation

  • rmdir: delete a single empty directory. If the directory is not empty, an error is reported
  • removedirs: delete the empty directory at the end of the path, then each parent directory in the path that has become empty. If the last directory is not empty, an error is reported
>>> os.rmdir("new1")                         # If the directory is not empty, an error will be reported
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: The directory is not empty.: 'new1'
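>>> os.rmdir("new1/new1-1")                  # An empty directory is deleted
>>> os.makedirs("new1/new1-1")               # Recreate it for the next example
>>> os.removedirs("new1/new1-1")             # Deletes new1-1, then the now-empty parent new1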
>>> os.listdir(".")
['new']

Because rmdir and removedirs only delete empty directories, the rmtree function in the shutil module is more commonly used to delete a non-empty directory together with its contents.
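A minimal sketch of the difference, run in a temporary directory so nothing real is deleted (the file name data.txt is made up for the example). Note that rmtree deletes everything below the given path without asking, so it should be pointed only at paths you are sure about.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    top = os.path.join(tmp, "new1")
    os.makedirs(os.path.join(top, "new1-1"))
    with open(os.path.join(top, "new1-1", "data.txt"), "w") as fh:
        fh.write("some content")
    try:
        os.rmdir(top)                 # Fails: new1 is not empty
        rmdir_failed = False
    except OSError:
        rmdir_failed = True
    shutil.rmtree(top)                # Deletes new1 and everything below it
    exists_after = os.path.exists(top)
    print(rmdir_failed, exists_after)
```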

(5) Modify operation

  • rename: rename a directory or file. The new name may be in a different directory, which moves the item. If the parent directory of the destination does not exist, an error is reported.
  • renames: rename a directory or file. Any missing directories in the destination path are created automatically
>>> os.makedirs("new1/new1-1")
>>> os.rename("new1/new1-1","new1/new1-2")     # Rename new1-1 to new1-2
>>> os.listdir("new1")
['new1-2']
>>> os.rename("new1/new1-2","new2/new2-2")     # Because the new2 directory does not exist, an error is reported
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileNotFoundError: The system cannot find the specified path.: 'new1/new1-2' -> 'new2/new2-2'
>>> os.renames("new1/new1-2","new2/new2-2")    # renames automatically creates directories that do not exist
>>> os.listdir("new2")
['new2-2']

If the destination already exists, the result depends on the platform: on Windows, os.rename() and os.renames() raise FileExistsError, while on Unix an existing destination file is replaced silently.
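Where overwriting the destination is actually intended, the standard library also provides os.replace, which overwrites an existing destination file on both Windows and Unix. The sketch below is an addition to the tutorial; the file names draft.txt and final.txt are made up for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "draft.txt")
    dst = os.path.join(tmp, "final.txt")
    with open(src, "w") as fh:
        fh.write("new")
    with open(dst, "w") as fh:
        fh.write("old")
    os.replace(src, dst)      # Overwrites final.txt with the contents of draft.txt
    with open(dst) as fh:
        content = fh.read()
    src_exists = os.path.exists(src)
    print(content, src_exists)
```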

3. File operation

(1) Query operation

  • open/read/close: read a file
  • stat: file information. See the description of stat in the folder section above for details
>>> fd = os.open("test/test.txt", os.O_RDWR|os.O_CREAT)  # Open the file; returns a file descriptor
>>> str_bytes = os.read(fd, 100)                         # Read up to 100 bytes
>>> text = str_bytes.decode()                            # Bytes to string (UTF-8 by default)
>>> print(text)
test write data
>>> os.close(fd)                                         # Close the file

Note that open, read and close are used together. os.open requires flags that set the mode; the call above opens the file for reading and writing and creates it if it does not exist. The available flags are as follows:

flags -- one or more of the following options, combined with "|". Some of them are only available on certain platforms:

  • os.O_RDONLY: open for reading only
  • os.O_WRONLY: open for writing only
  • os.O_RDWR: open for reading and writing
  • os.O_NONBLOCK: open in non-blocking mode
  • os.O_APPEND: append on each write
  • os.O_CREAT: create the file if it does not exist
  • os.O_TRUNC: truncate the file to zero length on open (requires write access)
  • os.O_EXCL: together with O_CREAT, fail if the file already exists
  • os.O_SHLOCK: atomically obtain a shared lock
  • os.O_EXLOCK: atomically obtain an exclusive lock
  • os.O_DIRECT: eliminate or reduce cache effects
  • os.O_FSYNC: synchronous writes
  • os.O_NOFOLLOW: do not follow symbolic links
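As an illustration of combining flags with "|", the sketch below uses O_CREAT together with O_EXCL so that the file is created only if it does not exist yet. It runs in a temporary directory; the variable names are illustrative.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ShowMeAI.txt")
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    fd = os.open(path, flags)            # File is absent, so it is created
    os.write(fd, b"first")
    os.close(fd)
    try:
        fd = os.open(path, flags)        # File now exists, so O_EXCL makes this fail
        os.close(fd)
        second_open_failed = False
    except FileExistsError:
        second_open_failed = True
    print(second_open_failed)
```

This combination is a common way to avoid overwriting a file that another process has already created.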

(2) Create operation

Use open with the O_CREAT flag to create a file: if the file does not exist, it is created. This is similar to touch on Linux.

>>> f = os.open("test/ShowMeAI.txt", os.O_RDWR|os.O_CREAT)   # If the file does not exist, it is created
>>> os.close(f)

(3) Modify operation

  • open/write/close: write content to a file
  • rename, renames: the same renaming and moving operations described for folders above
>>> f = os.open("test/ShowMeAI.txt", os.O_RDWR|os.O_CREAT)     # Open file
>>> os.write(f,b"ShowMeAI test write data")                    # Write content; returns the number of bytes written
24
>>> os.close(f)                                                # Close file
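Opening with O_CREAT alone does not clear an existing file: writing starts at the beginning and overwrites only as many bytes as are written. The sketch below contrasts this with O_TRUNC and O_APPEND. The helpers write_with_flags and read_all are hypothetical names introduced for the example.

```python
import os
import tempfile


def write_with_flags(path, data, extra_flags=0):
    """Open path write-only (creating it if needed) and write data."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | extra_flags)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)


def read_all(path):
    """Return the whole file as bytes."""
    with open(path, "rb") as fh:
        return fh.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ShowMeAI.txt")
    write_with_flags(path, b"ShowMeAI test write data")
    write_with_flags(path, b"new")                  # Overwrites only the first 3 bytes
    after_plain = read_all(path)
    write_with_flags(path, b"new", os.O_TRUNC)      # Empties the file before writing
    after_trunc = read_all(path)
    write_with_flags(path, b" data", os.O_APPEND)   # Adds to the end of the file
    after_append = read_all(path)
    print(after_plain, after_trunc, after_append)
```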

(4) Delete

  • remove: delete a file. Note that it cannot delete a directory (use rmdir / removedirs instead)
>>> os.remove("test/test-1")       # Error: test-1 is a directory (the exact exception varies by platform)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
PermissionError: Access is denied.: 'test/test-1'
>>> os.remove("test/ShowMeAI.txt")     # Delete file
>>> os.listdir("test")
['test-1']

4. Path operation

When working with files or directories, their paths often need to be processed. For this, os has a submodule, os.path, dedicated to path operations. The main operations are as follows:

  • abspath: get the absolute path
>>> os.path.abspath("test")
'/Users/ShowMeAI/pythontest/test'
  • exists: determine whether a file or directory exists
>>> os.path.exists("test")
True
>>> os.path.exists("test/test.txt")
False
>>> os.path.exists("test/test-1/test-1.txt")
True
  • isfile/isdir: determine whether a path is a file / a directory
>>> os.path.isdir("test")
True
>>> os.path.isfile("test/test-1/test-1.txt")
True
  • basename/dirname: get the tail and the head of a path. The last / in the path is the delimiter that divides it into a head and a tail: basename returns the tail and dirname returns the head. They are commonly used to get a file name or a directory name
>>> os.path.basename("test/test-1/test-1.txt")   # file name
'test-1.txt'
>>> os.path.basename("test/test-1/")     # Trailing /: the tail is empty
''
>>> os.path.basename("test/test-1")      # Directory name
'test-1'
>>> os.path.dirname("test/test-1/test-1.txt")   # Directory path of the file
'test/test-1'
>>> os.path.dirname("test/test-1/")   # Directory path
'test/test-1'
>>> os.path.dirname("test/test-1")   # Parent directory path
'test'
  • join: combine path components, connecting them with the system path separator to form a complete path.
>>> os.path.join("test","test-1")   # Connect two directories
'test/test-1'
>>> os.path.join("test/test-1","test-1.txt")   # Connection directory and file name
'test/test-1/test-1.txt'
  • split: split a path into directory and file name. The last slash "/" is the separator: the path is cut into head and tail and returned as a (head, tail) tuple.
>>> os.path.split("test/test-1")     # Split directory
('test', 'test-1')
>>> os.path.split("test/test-1/")    # Directory segmentation ending in /
('test/test-1', '')
>>> os.path.split("test/test-1/test-1.txt")  # Split file
('test/test-1', 'test-1.txt')
  • splitext: split a path into root and file extension. The last extension separator "." is the split point, and the result is returned as a (root, ext) tuple. The difference from split is the separator used.
>>> os.path.splitext("test/test-1")  
('test/test-1', '')
>>> os.path.splitext("test/test-1/") 
('test/test-1/', '')
>>> os.path.splitext("test/test-1/test-1.txt")  # Distinguish between file name and extension
('test/test-1/test-1', '.txt')
>>> os.path.splitext("test/test-1/test-1.txt.mp4") # The last "." is the split point
('test/test-1/test-1.txt', '.mp4')
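These functions are often combined. The sketch below inserts a tag between a file name and its extension while keeping the directory part unchanged. The function name with_suffix_tag and the "-backup" tag are assumptions made for this example; the function only computes a new path and does not touch the file system.

```python
import os


def with_suffix_tag(path, tag):
    """Insert tag between the file name and its extension, keeping the directory."""
    head, tail = os.path.split(path)        # ('test/test-1', 'test-1.txt')
    name, ext = os.path.splitext(tail)      # ('test-1', '.txt')
    return os.path.join(head, name + tag + ext)


tagged = with_suffix_tag("test/test-1/test-1.txt", "-backup")
print(tagged)
```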

5. Typical applications

(1) Batch rename files

import os

def batch_rename(dir_path):
    # Get the list of entries in the directory
    itemlist = os.listdir(dir_path)
    for item in itemlist:
        # Join into a full path
        item_path = os.path.join(dir_path, item)
        print(item_path)
        # Rename files only: insert "-ShowMeAI" before the extension
        if os.path.isfile(item_path):
            splitext = os.path.splitext(item_path)
            os.rename(item_path, splitext[0] + "-ShowMeAI" + splitext[1])

(2) Find all files with the specified extensions under a directory and its subdirectories

import os

def walk_ext_file(dir_path, ext_list):
    # dir_path: directory to traverse
    # ext_list: list of extensions, for example ['.mp4', '.mkv', '.flv']
    for root, dirs, files in os.walk(dir_path):
        for file in files:
            # Build the full path and split off the extension
            file_path = os.path.join(root, file)
            file_item = os.path.splitext(file_path)
            # Output the file path if it has one of the specified extensions
            if file_item[1] in ext_list:
                print(file_path)

(3) Sort the files in the specified directory by modification time

import os

def sort_file_accord_to_time(dir_path):
    # Before sorting
    itemlist = os.listdir(dir_path)
    print(itemlist)
    # Ascending: oldest modification time first
    itemlist.sort(key=lambda filename: os.path.getmtime(os.path.join(dir_path, filename)))
    print(itemlist)
    # Descending: most recently modified first
    itemlist.sort(key=lambda filename: os.path.getmtime(os.path.join(dir_path, filename)), reverse=True)
    print(itemlist)
    # After the descending sort, the first item is the most recently modified
    print(itemlist[0])

6. Video tutorial

A version with bilingual subtitles is available on Bilibili:

https://www.bilibili.com/vide...

Data and code download

The code for this tutorial series can be downloaded from ShowMeAI's GitHub repository and run in a local Python environment. Readers who can access Google Colab can also run it there and learn interactively.

Keywords: Python Programming Machine Learning AI

Added by michalurban on Wed, 23 Feb 2022 10:48:33 +0200