The setuptools library: build your own Python package

A clear introduction to the basic usage of setuptools
The official explanation of the MANIFEST.in file
setup.py, MANIFEST.in and README.md in the root directory are added to the package automatically, so you don't need to list them.
python setup.py clean --all clears the intermediate results of the previous build (that is, the build directory). If you skip it, changes to setup.py or MANIFEST.in may not take effect when you rebuild, because the files are still packaged from the previous intermediate results. (Postscript: in practice, clean alone is not enough. You also need rm -rf *.egg-info/, otherwise the .py files specified earlier are still packaged.)
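The cleanup described above can also be scripted. The following is a minimal sketch, not part of setuptools: the helper name clean_build_artifacts is made up for illustration, and it assumes the leftovers are a build/ directory and *.egg-info directories directly under the project root. It runs against a temporary directory so nothing real is deleted.

```python
import shutil
import tempfile
from pathlib import Path


def clean_build_artifacts(project_root):
    """Delete build/ and every *.egg-info directory directly under project_root.

    Returns the names of the directories that were removed, sorted.
    """
    root = Path(project_root)
    candidates = [root / "build", *root.glob("*.egg-info")]
    removed = []
    for path in candidates:
        if path.is_dir():
            shutil.rmtree(path)
            removed.append(path.name)
    return sorted(removed)


# Demonstration on a throwaway directory (hypothetical project layout).
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "build" / "lib").mkdir(parents=True)
    (Path(tmp) / "demo.egg-info").mkdir()
    (Path(tmp) / "mod1").mkdir()
    removed = clean_build_artifacts(tmp)
    survivors = sorted(p.name for p in Path(tmp).iterdir())

print(removed, survivors)
```

In a real project you would call clean_build_artifacts(".") from the root directory before rebuilding.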

The opening line, from setuptools import setup, goes at the top of setup.py.
Let's go through the parameters of the setup function one by one (when one parameter is discussed, the others are omitted):
setup(name="demo", version="1.0"): the package contains only setup.py, MANIFEST.in, README.md and similar files from the root directory.
setup(packages=["mod1", "mod2"]): suppose there are many packages under the root directory and we only need mod1 and mod2. Add them to the packages parameter, and all .py files under those directories are packaged.
setup(include_package_data=True): MANIFEST.in decides which files are and are not packaged.
setup(exclude_package_data={"mod1": [".gitignore"]}): excludes the .gitignore file in the mod1 directory. If "mod1" is replaced with "", the .gitignore files of all packages are excluded.
setup(package_data={"": ["*.txt", "*.pth"]}): the usage is basically the same as exclude_package_data. Note that it only looks in the directories listed in packages and does not add data files from other directories, even though the "" here looks as if it meant every directory under the root. I find package_data less convenient than MANIFEST.in.
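To make the point about package_data concrete, here is a simplified model of the selection rule, assuming patterns are globbed inside each listed package directory. It is an illustration only, not setuptools' actual implementation; select_package_data and the file names are hypothetical.

```python
import tempfile
from pathlib import Path


def select_package_data(root, packages, package_data):
    """Return the data files picked up by package_data, relative to root.

    Simplified model: the "" key applies to every package listed in
    `packages`, and patterns are matched inside each package directory only.
    """
    root = Path(root)
    selected = set()
    for package in packages:
        package_dir = root / package.replace(".", "/")
        patterns = package_data.get("", []) + package_data.get(package, [])
        for pattern in patterns:
            for match in package_dir.glob(pattern):
                if match.is_file():
                    selected.add(match.relative_to(root).as_posix())
    return sorted(selected)


with tempfile.TemporaryDirectory() as tmp:
    # mod1 is a listed package; docs is a directory outside `packages`.
    for name in ("mod1/notes.txt", "mod1/code.py", "docs/readme.txt"):
        path = Path(tmp) / name
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("")
    picked = select_package_data(tmp, ["mod1"], {"": ["*.txt", "*.pth"]})

print(picked)
```

In this model docs/readme.txt is never considered, even though it matches *.txt, because docs is not listed in packages.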

python setup.py install installs the contents of the current directory as a package that then appears in pip list, while python setup.py develop is better suited to code that is still being debugged. Instead of copying the package into the site-packages directory, it creates a link back to the current directory, so changes made there take effect immediately and you don't have to rerun python setup.py install every time.

python setup.py sdist creates a source distribution, and python setup.py bdist_wheel creates a wheel distribution.
More commands can be listed with python setup.py --help-commands.
python setup.py install/develop creates a module_name.egg-info/ directory in the current directory. If you then run python setup.py sdist && pip install dist/module_name-version.tar.gz, it reports some strange errors. Delete the module_name.egg-info/ directory and the build and install go through normally.

Further: adding C++ extensions when packaging
The first step is to build a C++ shared library that Python can call, that is, a .so file. There are several common approaches, such as ctypes, Boost and pybind11 (reference links 1, 2 and 3). pybind11 is used here.

To review the relationship between compiling and linking, look here. Compiling converts our own code into binary object files (.o or .obj); linking combines all the object files and system components into an executable (.exe).
For compiling pure C++ code with cmake, see another article of mine.

First, install pybind11. Installing from source and installing with pip give different results. With the pip install you can import pybind11 in Python; with the source install you can't, and adding /usr/local/include/pybind11 to PYTHONPATH doesn't help. With the source install cmake can find the pybind11Config.cmake file; with the pip install it can't, because the wheel I installed did not contain this file (newer pybind11 wheels may; python -m pybind11 --cmakedir reports the location when they do). The two installs are therefore covered separately. First the pip install: pip install pybind11
Then write a simple summation function in C++, in a file named example.cpp (copied from here):

#include <pybind11/pybind11.h>
namespace py = pybind11;

int add(int i, int j)
{
    return i + j;
}

PYBIND11_MODULE(test, m)
{
    // optional module docstring
    m.doc() = "pybind11 example plugin";
    // expose add function, and add keyword arguments and default arguments
    m.def("add", &add, "A function which adds two numbers", py::arg("i")=1, py::arg("j")=2);

    // exporting variables
    m.attr("the_answer") = 42;
    py::object world = py::cast("World");
    m.attr("what") = world;
}

Compile it with c++ -O3 -Wall -shared -std=c++11 -fPIC $(python3 -m pybind11 --includes) example.cpp -o test$(python3-config --extension-suffix). It is a super long command, and even the example command on the official website is this long, which is outrageous.
The meaning of each flag (reference link):
-O3: set the optimization level to 3;
-Wall: enable most compiler warnings;
-shared: generate a shared library file;
-fPIC: generate position-independent code, which is suitable for dynamic linking;
-L path: add path to the library search path list;
-I path: add path to the header file search path list;
-o file: specify the output file name. The part before the suffix is the name used in Python's import, and it must be the same as the first argument of PYBIND11_MODULE();
Other flags can be viewed with c++ -v --help.
python3 -m pybind11 --includes prints the header search paths that pybind11 needs, such as "-I/usr/include/python3.9 -I/usr/local/lib/python3.9/dist-packages/pybind11/include".
python3-config --extension-suffix prints the suffix of .so files in the current environment, such as ".cpython-39-x86_64-linux-gnu.so".
So with the two $() substitutions expanded, the whole command is: c++ -O3 -Wall -shared -std=c++11 -fPIC -I/usr/include/python3.9 -I/usr/local/lib/python3.9/dist-packages/pybind11/include example.cpp -o test.cpython-39-x86_64-linux-gnu.so
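The same command line can be assembled from Python, which avoids copying paths by hand. This is only a sketch: build_command is a made-up helper, and it assumes that sysconfig's EXT_SUFFIX is the value python3-config --extension-suffix prints and that the pybind11 include directory is passed in by the caller. It only builds the argument list and does not run the compiler.

```python
import sysconfig


def build_command(source, module_name, extra_includes=()):
    """Assemble the c++ command line for a pybind11 extension as a list."""
    python_include = sysconfig.get_paths()["include"]
    # EXT_SUFFIX is assumed to equal `python3-config --extension-suffix`.
    suffix = sysconfig.get_config_var("EXT_SUFFIX")
    command = ["c++", "-O3", "-Wall", "-shared", "-std=c++11", "-fPIC"]
    command.append(f"-I{python_include}")
    command.extend(f"-I{path}" for path in extra_includes)
    command.extend([source, "-o", f"{module_name}{suffix}"])
    return command


command = build_command(
    "example.cpp",
    "test",
    # Path taken from the example output above; with pybind11 installed,
    # pybind11.get_include() gives the right path for your environment.
    extra_includes=["/usr/local/lib/python3.9/dist-packages/pybind11/include"],
)
print(" ".join(command))
```

To actually compile, the list could be handed to subprocess.check_call(command) in the directory that contains example.cpp.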

If the .so file has been generated but Python says on import that it cannot be found, you can use the following to see which file suffixes the current Python interpreter supports (reference link):

import importlib.machinery
print(importlib.machinery.all_suffixes())

If the .so file name turns out to be wrong, for example the Python version tag in the name should be 39 but came out as 38, check whether the python3 and python3-config commands are symlinked to the targets you expect.
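The check can be automated. The sketch below assumes the import system looks for the module name followed by one of importlib.machinery.EXTENSION_SUFFIXES; matching_suffix is a hypothetical helper written for this post, not a standard library function.

```python
import importlib.machinery


def matching_suffix(filename, module_name):
    """Return the extension suffix that makes filename importable as module_name.

    Returns None when no suffix accepted by this interpreter fits.
    """
    for suffix in importlib.machinery.EXTENSION_SUFFIXES:
        if filename == module_name + suffix:
            return suffix
    return None


# A file named with the interpreter's own first suffix is accepted.
current = importlib.machinery.EXTENSION_SUFFIXES[0]
print(matching_suffix("test" + current, "test"))
# A file tagged for a made-up interpreter version is rejected.
print(matching_suffix("test.cpython-00-x86_64-linux-gnu.so", "test"))
```

If the second form of result (None) shows up for your freshly built file, the suffix was produced by a different interpreter than the one doing the import.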
At this point there should be a .so file whose name starts with test in the directory of the .cpp file, and you can run the following in Python:

import test
test.add(1, 2)

So far, a mixed Python and C++ example based on pybind11 and the c++ compile command is complete.

Next, compile with cmake. CMakeLists.txt, in the same directory as the .cpp file, reads as follows:

# Minimum requirements for a given cmake version
cmake_minimum_required(VERSION 3.10) 

# Set project name
project(example)

# Add search path for header file
include_directories("/usr/include/python3.9")
include_directories("/usr/local/lib/python3.9/dist-packages/pybind11/include")

# Set compiler flags, consistent with the ones used when compiling with the 'c++' command
SET(CMAKE_CXX_FLAGS "-std=c++11 -O3")

# Compile one or more source files into library files
add_library(test SHARED example.cpp)

For a more detailed description of the cmake commands, see here.
Then run cmake -B build/ && cd build/ && make, and you get a libtest.so file. I haven't found a way to drop the lib prefix automatically here, but the module name that the .so file corresponds to is actually test, so it has to be renamed by hand to test.so. No other name imports successfully.
Run mv libtest.so test.so, and finally check whether the import succeeds:

import test
test.add(1, 2)

So far, a mixed Python and C++ example based on pybind11 and cmake is complete!

Next, the source installation. Most instructions online install pybind11 this way:

# Some mixed Python and C++ projects are compiled against Eigen, so install it as well
git clone git@github.com:pybind/pybind11.git \
 && mkdir pybind11/build && cd pybind11/build \
 && cmake .. \
 && make -j12 \
 && make install \
 && apt-get install libeigen3-dev \
 && ln -s /usr/include/eigen3/Eigen /usr/include/Eigen \
 && ln -s /usr/include/eigen3/unsupported /usr/include/unsupported \
 && export PYTHONPATH=$PYTHONPATH:/usr/local/include/pybind11

At this point, CMakeLists.txt should read:

# Minimum requirements for a given cmake version
cmake_minimum_required(VERSION 3.10)

# Set the project name, which can then be used through ${PROJECT_NAME}
project(example)

# Look for pybind11. The /usr/local/include/pybind11 directory should exist at this point, otherwise this step fails
find_package(pybind11 REQUIRED)

# Compile example.cpp into the output target test. A .so file whose name starts with test is generated for Python to import
# The first argument of PYBIND11_MODULE in example.cpp must match the first argument passed to pybind11_add_module here,
# otherwise import fails later with ImportError: dynamic module does not define module export function
pybind11_add_module(test example.cpp)

Then run cmake -B build/ && cd build/ && make, and you get a test.xxx.so file. Finally, check whether the import succeeds:

import test
test.add(1, 2)

Done!

The following shows a setup.py that brings in C++ files:

import os
import subprocess
from setuptools import setup, Extension
from setuptools.command.build_ext import build_ext


project_name = 'projection_render'

class CMakeExtension(Extension):

    def __init__(self, name, sourcedir=''):
        Extension.__init__(self, name, sources=[])
        self.sourcedir = os.path.abspath(sourcedir)
        if not os.path.exists(self.sourcedir):
            os.makedirs(self.sourcedir)


class CMakeBuild(build_ext):
    r"""
    During the process of `python setup.py install`, it will automatically call `python setup.py build_ext`
    to build C/C++ extensions, so we need to rewrite the event of build_ext command
    """

    def run(self):
        for ext in self.extensions:
            self.build_extension(ext)

    def build_extension(self, ext):
        if not os.path.exists(self.build_temp):
            os.makedirs(self.build_temp)

        extdir = self.get_ext_fullpath(ext.name)
        if not os.path.exists(extdir):
            os.makedirs(extdir)

        # This is the temp directory where your build output should go
        install_prefix = os.path.abspath(os.path.dirname(extdir))
        if not os.path.exists(install_prefix):
            os.makedirs(install_prefix)
        cmake_list_dir = os.path.join(install_prefix, project_name)
        # run cmake to build a .so file according to CMakeLists.txt
        subprocess.check_call(['cmake', cmake_list_dir], cwd=self.build_temp)
        subprocess.check_call(['cmake', '--build', '.'], cwd=self.build_temp)


setup(
    name=project_name,
    # With include_package_data=True, MANIFEST.in packages the non-.py files needed for compilation (such as .cpp)
    include_package_data=True,
    # Directories that have no __init__.py but contain .py files to be packaged are listed in packages
    packages=[
        'projection_render', 'projection_render.include',
        'projection_render.src', 'projection_render.cuda_renderer'
    ],
    version="0.1",
    ext_modules=[CMakeExtension('projection_render')],
    # Packages containing C++ need to override the python setup.py build_ext command and compile the .cpp files there
    cmdclass={
        "build_ext": CMakeBuild,
    },
    description='python version of projection_render',
)

The contents of the MANIFEST.in file are:

include projection_render/README.md
include projection_render/cuda_renderer/*
include projection_render/include/*
graft projection_render/pybind/
include projection_render/src/*
include projection_render/CMakeLists.txt

If a top-level directory contains several C++ modules, each with its own CMakeLists.txt, the situation is different.
The simplest way to bring in a C++ module is the ext_modules parameter: the source files to compile are given through sources, and the name of the compiled file through name (reference link). However, when many source files are involved in the build, this parameter is a little weak. We then need a more powerful tool: overriding the build_ext command through the cmdclass parameter.
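As a rough sketch of this simplest form, the pybind11 example from earlier could be described with an Extension object like the one below. The include path is just the one from the compile command earlier in this post and is an assumption about your environment; the setup() call is left as a comment so the snippet can be run without building anything.

```python
from setuptools import Extension

# Hypothetical include path, copied from the compile command earlier in this
# post; replace it with the pybind11 include directory of your environment.
PYBIND11_INCLUDE = "/usr/local/lib/python3.9/dist-packages/pybind11/include"

example_extension = Extension(
    name="test",                # must match the first argument of PYBIND11_MODULE
    sources=["example.cpp"],    # source files handed to the compiler
    include_dirs=[PYBIND11_INCLUDE],
    language="c++",
    extra_compile_args=["-std=c++11", "-O3"],
)

# In a real setup.py this object would be passed on like so:
# setup(name="demo", version="1.0", ext_modules=[example_extension])
print(example_extension.name, example_extension.sources)
```

With only a handful of source files this is usually enough; the CMake-based approach becomes attractive once the build logic already lives in CMakeLists.txt.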
setup(cmdclass={"build_ext": MyCommand}): MyCommand must inherit from the distutils.core.Command class. setuptools wraps this in another layer, and normally you inherit from the setuptools class. For example, the MyCommand for the build_ext command inherits from setuptools.command.build_ext.build_ext, which is the class that python setup.py build_ext runs if we don't override it.
Overriding mainly means overriding the run() function. When run() is entered, self.extensions has already been set to the ext_modules passed to setup() and can be used by the MyCommand instance. I don't know at which stage the assignment is made, but either way I just use it.
Whichever command is run, be it install, develop, build_ext or something else, the parameters passed to setup(), ext_modules included, are already parsed on the command instance returned by get_finalized_command().
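On the question of when self.extensions gets assigned: as far as I can tell from the distutils source, it happens in build_ext.finalize_options(), which copies distribution.ext_modules onto the command. The sketch below checks this by building a Distribution object by hand instead of calling setup(); the names demo and demo_ext are made up, and the exact behaviour may differ between setuptools versions.

```python
from setuptools import Extension
from setuptools.command.build_ext import build_ext
from setuptools.dist import Distribution


class MyCommand(build_ext):
    """Custom build_ext that only records what it was asked to build."""

    def run(self):
        # self.extensions is already populated when run() is entered.
        self.seen = [ext.name for ext in self.extensions]


# Create the kind of object setup() would create, without running a command.
dist = Distribution({
    "name": "demo",
    "version": "1.0",
    "ext_modules": [Extension("demo_ext", sources=[])],
    "cmdclass": {"build_ext": MyCommand},
})

command = dist.get_command_obj("build_ext")
before = command.extensions      # not assigned yet: options are not finalized
command.ensure_finalized()       # triggers finalize_options()
after = [ext.name for ext in command.extensions]
command.run()                    # our override, no compiler involved

print(type(command).__name__, before, after, command.seen)
```

This also shows that cmdclass decides which class is instantiated for the build_ext command: the object returned by get_command_obj is a MyCommand.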
Here is another setup.py example, this time with multiple modules:

import os
import subprocess
from setuptools import setup, Extension
from setuptools.command.build_ext import build_ext


project_name = "algo_utils"

# The files needed for compilation are managed by MANIFEST.in and the build steps are specified by CMakeLists.txt, so the sources parameter of Extension is unused and is always set to an empty list here
# Without the makedirs call, cmake reports CMake Error: The source directory "xxx" does not exist. The reason is unknown
class CMakeExtension(Extension):

    def __init__(self, name, sourcedir=""):
        Extension.__init__(self, name, sources=[])
        self.sourcedir = os.path.abspath(sourcedir)
        if not os.path.exists(self.sourcedir):
            os.makedirs(self.sourcedir)


class CMakeBuild(build_ext):
    r"""
    During the process of `python setup.py install`, it will automatically call `python setup.py build_ext`
    to build C/C++ extensions, so we need to rewrite the event of build_ext command
    """

    def run(self):
        for ext in self.extensions:
            self.build_extension(ext)

    def build_extension(self, ext):
        if not os.path.exists(self.build_temp):
            os.makedirs(self.build_temp)

        extdir = self.get_ext_fullpath(ext.name)
        if not os.path.exists(extdir):
            os.makedirs(extdir)

        # This is the temp directory where your build output should go
        install_prefix = os.path.abspath(os.path.dirname(extdir))
        if not os.path.exists(install_prefix):
            os.makedirs(install_prefix)
        cmake_list_dir = os.path.join(install_prefix, project_name, ext.name)
        # run cmake to build a .so file according to CMakeLists.txt
        subprocess.check_call(["cmake", cmake_list_dir], cwd=self.build_temp)
        subprocess.check_call(["cmake", "--build", "."], cwd=self.build_temp)

setup(
    name=project_name,
    include_package_data=True,
    version="0.1",
    # The secondary module name is written here
    ext_modules=[
        CMakeExtension("projection_render"),
    ],
    cmdclass={
        "build_ext": CMakeBuild,
    },
    description="algorithm utility package containing .cpp file",
)

After installation, check:

from algo_utils import projection_render

It's done without a problem.

Advanced: multiple packages share a namespace
As a code repository grows, we sometimes need to split off some functional modules for separate version control. The separated modules exist as Python packages of their own and are installed with their own version numbers. But if a module is simply split off into an independent library, it is imported differently from before, and many import statements have to be changed. setuptools offers a solution: the namespace_packages parameter declares a namespace. For details, see the reference link and the sample project (there is also a Google project you can refer to). In short, add the parameter namespace_packages=[top-level module name] to the setup() function in setup.py, and then, following the sample project, write __import__("pkg_resources").declare_namespace(__name__) in the corresponding __init__.py. This line imports the pkg_resources package on the fly and calls its declare_namespace() method to declare the top-level module as a namespace (reference link). Note that both the original main repository and the newly split-off sub-repository need this treatment in setup() and __init__.py. If only the sub-repository is changed and the main repository is not, then after python setup.py develop the namespace of the main repository is overwritten by the sub-repository: importing the top-level module name finds only the sub-repository, and the main repository can no longer be found. If everything is written correctly, the import statements do not need to change after the split.
Another scheme is based on the pkgutil library. Like the pkg_resources method above, it writes a line in the corresponding __init__.py, here __path__ = __import__('pkgutil').extend_path(__path__, __name__), and the setup() function then does not need the namespace_packages parameter. According to the documentation, this scheme is the more compatible one across Python 2 and Python 3. Personally, I find the pkg_resources method more convenient, because the namespace_packages parameter shows the top-level package name directly. When compatibility with Python 2 is not required, the pkg_resources scheme should be the more appropriate choice.
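The pkgutil scheme can be tried out without installing anything. The sketch below fakes a main repository and a sub-repository as two temporary directories that both contain the same top-level package; nsdemo_pkg, core and extras are made-up names, and putting the directories on sys.path stands in for installing the two packages.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# The line each repository puts in its top-level __init__.py.
INIT_LINE = "__path__ = __import__('pkgutil').extend_path(__path__, __name__)\n"


def make_repo(root, namespace, module, value):
    """Create root/namespace/{__init__.py, module.py}, mimicking one repository."""
    package_dir = Path(root) / namespace
    package_dir.mkdir(parents=True)
    (package_dir / "__init__.py").write_text(INIT_LINE)
    (package_dir / f"{module}.py").write_text(f"ORIGIN = {value!r}\n")


with tempfile.TemporaryDirectory() as main_repo, \
        tempfile.TemporaryDirectory() as sub_repo:
    make_repo(main_repo, "nsdemo_pkg", "core", "main repository")
    make_repo(sub_repo, "nsdemo_pkg", "extras", "sub repository")
    sys.path[:0] = [main_repo, sub_repo]
    try:
        importlib.invalidate_caches()
        # Both submodules resolve under the same top-level name.
        core = importlib.import_module("nsdemo_pkg.core")
        extras = importlib.import_module("nsdemo_pkg.extras")
        origins = (core.ORIGIN, extras.ORIGIN)
    finally:
        sys.path.remove(main_repo)
        sys.path.remove(sub_repo)

print(origins)
```

If the __init__.py of one of the two directories lacked the extend_path line and that directory came first on sys.path, only its own submodules would be found, which matches the overwriting problem described above.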

Other
Extending the commands supported by setup.py requires the cmdclass parameter.
setup(cmdclass={"build_ext": CMakeBuild}) registers a build_ext command that is invoked through python setup.py build_ext. CMakeBuild is a class whose parent is setuptools.command.build_ext.build_ext, and it needs to override the run method: running python setup.py build_ext then actually calls the run method of the CMakeBuild class.

Keywords: Python

Added by l_evans on Fri, 04 Mar 2022 14:36:29 +0200