Python 3 standard library: os.path, platform-independent file name management

1. os.path: platform-independent file name management

The functions in the os.path module make it easy to write code that works with files on multiple platforms. Even programs that are not intended to be ported between platforms should use os.path for reliable file name parsing.

1.1 Parsing paths

The first set of functions in os.path can be used to parse a string representing a filename into its component parts. These functions do not require the paths to actually exist; they operate only on the strings.

Path parsing depends on a few variables defined in os:

os.sep: separator between parts of the path (for example, "/" or "\").

os.extsep: the separator between the filename and the "extension" of the file (for example ".").

os.pardir: the path component that means "go up one level in the directory tree" (for example: '..').

os.curdir: the part of the path that indicates the current directory (for example: ".").
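
As a quick check, the values of these variables can be printed on the platform running the script. This is only a minimal sketch; the file name built at the end is a made-up example, and the comments assume a POSIX-style system.

```python
import os

# Separator values for the platform running the script.
# On a POSIX-style system os.sep is '/'; on Windows it is '\\'.
for name in ('sep', 'extsep', 'pardir', 'curdir'):
    print('os.{:<7}: {!r}'.format(name, getattr(os, name)))

# Build a relative path from the constants instead of hard-coding '/'.
# The name 'notes.txt' is hypothetical and does not need to exist.
parent_file = os.pardir + os.sep + 'notes' + os.extsep + 'txt'
print(parent_file)
```

In everyday code join() is the more readable way to assemble such a path; the constants are mostly useful when parsing.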

The split() function breaks the path into two separate parts and returns a tuple with the results. The second element of the tuple is the last component of the path, and the first element is everything that comes before it.

import os.path

PATHS = [
    '/one/two/three',
    '/one/two/three/',
    '/',
    '.',
    '',
]

for path in PATHS:
    print('{!r:>17} : {}'.format(path, os.path.split(path)))

When the input parameter ends with os.sep, the last element of the path is an empty string.

The value returned by the basename() function is equivalent to the second part of the split() value.  

import os.path

PATHS = [
    '/one/two/three',
    '/one/two/three/',
    '/',
    '.',
    '',
]

for path in PATHS:
    print('{!r:>17} : {!r}'.format(path, os.path.basename(path)))

The entire path is stripped down to the last element, whether that refers to a file or a directory. If the path ends with a directory separator (os.sep), the base portion is considered to be empty.

The dirname() function returns the first part of the split path.
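
When the last component is wanted even if the path ends with a separator, one possible approach is to normalize the path first. The sketch below uses a hypothetical helper, last_component(), which is not part of os.path.

```python
import os.path


def last_component(path):
    """Return the final path component, ignoring a trailing separator.

    Hypothetical helper for illustration, not part of os.path.
    """
    # normpath() drops the trailing separator before basename() runs.
    return os.path.basename(os.path.normpath(path))


for path in ['/one/two/three', '/one/two/three/']:
    print('{!r:>17} : {!r}'.format(path, last_component(path)))
```

Note that normpath() also collapses '.' and '..' segments, so the result reflects the normalized path rather than the literal string.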

import os.path

PATHS = [
    '/one/two/three',
    '/one/two/three/',
    '/',
    '.',
    '',
]

for path in PATHS:
    print('{!r:>17} : {!r}'.format(path, os.path.dirname(path)))

Joining the results of dirname() and basename() with join() gives back a path to the same location as the original.
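
The recombination can be checked with join(). In this minimal sketch, rebuild() is a hypothetical helper and the comments assume a POSIX-style system. The rebuilt string refers to the same location but is not always character-for-character identical, because split() strips redundant separators from the first part.

```python
import os.path


def rebuild(path):
    # Recombine the two halves that split() would return.
    return os.path.join(os.path.dirname(path), os.path.basename(path))


for path in ['/one/two/three', '/one/two/three/', 'one//two']:
    rebuilt = rebuild(path)
    print('{!r:>17} : {!r} (identical: {})'.format(
        path, rebuilt, rebuilt == path))
```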

splitext() works like split(), but it splits the path based on the extension separator rather than the directory separator.  

import os.path

PATHS = [
    'filename.txt',
    'filename',
    '/path/to/filename.txt',
    '/',
    '',
    'my-archive.tar.gz',
    'no-extension.',
]

for path in PATHS:
    print('{!r:>21} : {!r}'.format(path, os.path.splitext(path)))

When looking for an extension, only the last occurrence of os.extsep is used, so if a filename has multiple extensions, some of them remain on the prefix when the filename is split.

commonprefix() takes a list of paths as an argument and returns a single string that represents a common prefix present in all of the paths. The value may represent a path that does not actually exist, and the path separator is not taken into account, so the prefix might not stop on a separator boundary.
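
If every trailing extension needs to be separated, splitext() can be applied repeatedly. The helper below, split_all_extensions(), is a hypothetical name used only for this sketch. It naively treats every dot-separated suffix as an extension, which would be wrong for a name such as 'report-1.2.txt'.

```python
import os.path


def split_all_extensions(path):
    """Split off every trailing extension, not only the last one.

    Hypothetical helper for illustration, not part of os.path.
    """
    extensions = ''
    while True:
        path, ext = os.path.splitext(path)
        if not ext:
            # Nothing left to strip: return the stem and all extensions.
            return path, extensions
        extensions = ext + extensions


for name in ['my-archive.tar.gz', 'filename.txt', 'filename']:
    print('{!r:>21} : {!r}'.format(name, split_all_extensions(name)))
```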

import os.path

paths = ['/one/two/three/four',
         '/one/two/threefold',
         '/one/two/three/',
         ]
for path in paths:
    print('PATH:', path)

print()
print('PREFIX:', os.path.commonprefix(paths))

In this example, the common prefix string is /one/two/three, even though one of the paths does not include a directory named three.

commonpath() does take the path separator into account. It returns a prefix that does not include partial path components.

import os.path

paths = ['/one/two/three/four',
         '/one/two/threefold',
         '/one/two/three/',
         ]
for path in paths:
    print('PATH:', path)

print()
print('PREFIX:', os.path.commonpath(paths))

Because "threefold" does not have a path separator after "three", the common prefix is /one/two.
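
The two functions can be compared side by side on the same input. The sketch below assumes a POSIX-style system and also shows one behavior worth knowing about: commonpath() raises ValueError when absolute and relative paths are mixed.

```python
import os.path

paths = ['/one/two/three/four', '/one/two/threefold', '/one/two/three/']

# Character-by-character comparison versus component-by-component.
by_character = os.path.commonprefix(paths)
by_component = os.path.commonpath(paths)
print('commonprefix:', by_character)
print('commonpath  :', by_component)

# commonpath() refuses to mix absolute and relative path names.
mixed_failed = False
try:
    os.path.commonpath(['/one/two', 'one/two'])
except ValueError as err:
    mixed_failed = True
    print('ValueError  :', err)
```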

1.2 Building paths

In addition to decomposing existing paths, you often need to build paths from other strings. To combine multiple path components into a single value, use join().

import os.path

PATHS = [
    ('one', 'two', 'three'),
    ('/', 'one', 'two', 'three'),
    ('/one', '/two', '/three'),
]

for parts in PATHS:
    print('{} : {!r}'.format(parts, os.path.join(*parts)))

If any argument to join() begins with os.sep, all of the previous arguments are discarded and the new one becomes the beginning of the return value.

It is also possible to work with paths that include "variable" components that can be expanded automatically. For example, expanduser() converts the tilde (~) character to the name of a user's home directory.
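
When components come from user input or configuration, a leading separator can silently discard the base directory. One way to guard against that is to strip leading separators before joining. The helper join_under() below is hypothetical, and the sketch assumes a POSIX-style system; it does not protect against '..' components.

```python
import os
import os.path


def join_under(base, *parts):
    """Join parts beneath base even if a part begins with a separator.

    Hypothetical helper for illustration, not part of os.path.
    """
    # Strip leading separators so join() does not restart at the root.
    cleaned = [part.lstrip(os.sep) for part in parts]
    return os.path.join(base, *cleaned)


print(os.path.join('/one', '/two', '/three'))
print(join_under('/one', '/two', '/three'))
```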

import os.path

for user in ['', 'dhellmann', 'nosuchuser']:
    lookup = '~' + user
    print('{!r:>15} : {!r}'.format(
        lookup, os.path.expanduser(lookup)))

If the user's home directory cannot be found, the string is returned unchanged, as with ~nosuchuser in the example above.

expandvars() is more general, expanding any shell environment variables present in the path.
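
Because a failed lookup returns the input unchanged, a caller can detect it by comparing the result with the original string. The sketch below wraps that check in a hypothetical helper, expand_home(); the user name in it is assumed not to exist on the system running the code.

```python
import os.path


def expand_home(path):
    """Return the expanded path, or None if a leading ~ was not resolved.

    Hypothetical helper for illustration, not part of os.path.
    """
    expanded = os.path.expanduser(path)
    if path.startswith('~') and expanded == path:
        return None
    return expanded


# 'no_such_user_for_demo' is assumed not to be a real account.
print(expand_home('~no_such_user_for_demo'))
print(expand_home('plain/relative/path'))
```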

import os.path
import os

os.environ['MYVAR'] = 'VALUE'

print(os.path.expandvars('/path/to/$MYVAR'))

No validation is performed to ensure that the variable's value results in the name of a file that actually exists.
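
Two further details are worth a short sketch: the ${name} form is also expanded, and a reference to an undefined variable is left in the string as-is. The variable name UNDEFINED_DEMO_VAR is made up for this example and is removed from the environment first so the result is predictable.

```python
import os
import os.path

os.environ['MYVAR'] = 'VALUE'
# Make sure the second variable really is undefined for this demo.
os.environ.pop('UNDEFINED_DEMO_VAR', None)

# Braces mark where the variable name ends.
braced = os.path.expandvars('/path/to/${MYVAR}_backup')
# References to undefined variables are left unchanged.
missing = os.path.expandvars('/path/to/$UNDEFINED_DEMO_VAR')

print(braced)
print(missing)
```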

1.3 Normalizing paths

Paths assembled from separate strings using join() or with embedded variables might end up with extra separators or relative path components. Use normpath() to clean them up.

import os.path

PATHS = [
    'one//two//three',
    'one/./two/./three',
    'one/../alt/two/three',
]

for path in PATHS:
    print('{!r:>22} : {!r}'.format(path, os.path.normpath(path)))

Path segments made up of os.curdir and os.pardir are evaluated and collapsed.
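
A common pattern is to join components first and normalize the result afterwards. The sketch below assumes a POSIX-style system, and the directory names are hypothetical; nothing has to exist on disk because normpath() works on the string alone. For the same reason, collapsing '..' can change the meaning of a path that passes through a symbolic link.

```python
import os.path

# The components are made up; none of them has to exist on disk.
raw = os.path.join('/base/dir', '../other', './file.txt')
clean = os.path.normpath(raw)

print('joined    :', raw)
print('normalized:', clean)
```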

To convert a relative path to an absolute filename, use abspath().

import os
import os.path

os.chdir('/')

PATHS = [
    '.',
    '..',
    './one/two/three',
    '../one/two/three',
]

for path in PATHS:
    print('{!r:>21} : {!r}'.format(path, os.path.abspath(path)))

The result is a complete path, starting at the top of the file system tree.
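
On most platforms abspath() behaves like joining the path to the current working directory and normalizing the result. The following sketch compares the two approaches for a few relative paths; it reads the current directory but does not touch the file system otherwise.

```python
import os
import os.path

PATHS = ['.', '..', './one/two/three', '../one/two/three']

matches = []
for path in PATHS:
    # Manual equivalent: anchor at the working directory, then normalize.
    manual = os.path.normpath(os.path.join(os.getcwd(), path))
    matches.append(manual == os.path.abspath(path))
    print('{!r:>21} : {}'.format(path, matches[-1]))
```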

1.4 File times

In addition to working with paths, os.path includes functions for retrieving file properties, similar to the ones returned by os.stat().

import os.path
import time

print('File         :', __file__)
print('Access time  :', time.ctime(os.path.getatime(__file__)))
print('Modified time:', time.ctime(os.path.getmtime(__file__)))
print('Change time  :', time.ctime(os.path.getctime(__file__)))
print('Size         :', os.path.getsize(__file__))

os.path.getatime() returns the access time, os.path.getmtime() returns the modification time, and os.path.getctime() returns the time of the last metadata change on Unix-like systems (on Windows it returns the creation time). os.path.getsize() returns the amount of data in the file, in bytes.
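
The relationship with os.stat() can be checked directly. This minimal sketch writes a small scratch file with tempfile so that it does not depend on __file__, then compares the os.path results with the matching fields of the stat result.

```python
import os
import os.path
import tempfile

# Write a small scratch file so the example does not depend on __file__.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b'hello')
    scratch = handle.name

try:
    info = os.stat(scratch)
    same_mtime = os.path.getmtime(scratch) == info.st_mtime
    same_ctime = os.path.getctime(scratch) == info.st_ctime
    size = os.path.getsize(scratch)
    print('mtime matches os.stat():', same_mtime)
    print('ctime matches os.stat():', same_ctime)
    print('size in bytes          :', size)
finally:
    # Clean up the scratch file even if a check fails.
    os.remove(scratch)
```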

1.5 Testing files

When a program encounters a pathname, it usually needs to know whether the path indicates a file, directory, or symlink, and whether it really exists. os.path contains functions for testing all of these conditions.

import os.path

FILENAMES = [
    __file__,
    os.path.dirname(__file__),
    '/',
    './broken_link',
]

for file in FILENAMES:
    print('File        : {!r}'.format(file))
    print('Absolute    :', os.path.isabs(file))
    print('Is File?    :', os.path.isfile(file))
    print('Is Dir?     :', os.path.isdir(file))
    print('Is Link?    :', os.path.islink(file))
    print('Mountpoint? :', os.path.ismount(file))
    print('Exists?     :', os.path.exists(file))
    print('Link Exists?:', os.path.lexists(file))
    print()

All of these test functions return Booleans.
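
The difference between exists() and lexists() only shows up for a broken symbolic link, so the listing above needs a ./broken_link entry to exist beforehand. The sketch below creates one in a temporary directory instead. It assumes a platform where os.symlink() is permitted for the current user, and the target name is made up.

```python
import os
import os.path
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    link = os.path.join(workdir, 'broken_link')
    # Point the link at a target that is never created.
    os.symlink(os.path.join(workdir, 'missing_target'), link)

    results = {
        'islink': os.path.islink(link),
        'exists': os.path.exists(link),    # follows the link, so False
        'lexists': os.path.lexists(link),  # checks the link itself
        'isfile': os.path.isfile(link),
    }

for name, value in results.items():
    print('{:<8}: {}'.format(name, value))
```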

Added by Divine Winds on Wed, 11 Mar 2020 04:13:50 +0200