Packaging Python Modules

From PCLinuxOSHelp Knowledge Base

A large number of packages make up "Python" on PCLinuxOS. Besides the main python3 package, which contains the interpreter and the standard library, there are many additional Python modules that extend the capabilities of the language and provide ready-made code for applications to use. Each of these modules is contained in its own package.

Keeping a standard format for these module packages will greatly help us maintain Python on PCLinuxOS. Some of the standards will be defined here using the requests module as an example.

Each SPEC file should start by defining a tag identifying the module e.g.

  %define module requests

This defines the name of the module and should be used throughout the SPEC wherever the module name is required. Here we have used module as the tag, but you may see others such as pypi_name or srcname. It doesn't matter much which you use as long as it is consistent and obvious.

Our standard requires that the name of the SRPM conforms to the form python-module_name so the SPEC will have

  Name:    python-%{module}

This will generate an SRPM called python-requests.

Since the same SRPM can be used to generate packages for different versions of Python, create a sub-package for each Python version:

  %package -n python3-%{module}
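
The naming convention can be summarised in a few lines of Python. This is only an illustrative sketch: the helper names srpm_name and subpackage_name are hypothetical and are not part of any PCLinuxOS tooling.

```python
def srpm_name(module):
    """Return the source package name, mirroring 'Name: python-%{module}'."""
    return f"python-{module}"


def subpackage_name(module, python_major=3):
    """Return the sub-package name, mirroring '%package -n python3-%{module}'."""
    return f"python{python_major}-{module}"


print(srpm_name("requests"))        # python-requests
print(subpackage_name("requests"))  # python3-requests
```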

Now on to building the package. There are two main interfaces for building Python modules: setup.py based and pyproject.toml based. The next two sections deal with the details of each.


setup.py

This is a legacy interface that the Python developers are trying to migrate users away from because it has no good mechanism for declaring build dependencies. The majority of Python modules are distributed as a .tar.gz file (sometimes called an sdist or source distribution). Usually this will have been generated using distutils or its successor setuptools, which have been the standard for packaging Python modules for a long time. Modules packaged this way come with a setup.py which will build and install the module.

Packaging such a module for PCLinuxOS should be as simple as:

  %build
  %py3_build
  %install
  %py3_install

These macros run setup.py with the appropriate parameters to build the module and install it under the BUILDROOT.

The files produced then need to be packaged in the %files section. The general format will be:

  %files -n python3-%{module}
  %doc LICENSE README.md HISTORY.md
  %{python3_sitelib}/%{module}-%{version}-py%{python3_version}.egg-info
  %{python3_sitelib}/%{module}/

Note the important use of the macro %{python3_sitelib}. Python modules are stored in a standard directory on the system where the Python interpreter searches for them, and %{python3_sitelib} expands to this base directory. For example, with Python 3.10 this directory is /usr/lib/python3.10/site-packages.

Some packages build extension module libraries, which are architecture specific. In this case use the macro %{python3_sitearch} instead of %{python3_sitelib} to specify the module base directory. The difference is that it is based under /usr/lib64 rather than /usr/lib.
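
To see which directories the interpreter itself reports, you can query the standard sysconfig module. This sketch assumes that %{python3_sitelib} and %{python3_sitearch} correspond to Python's "purelib" and "platlib" install paths, as is usual for RPM Python macros. The actual macro definitions on PCLinuxOS are not shown here, and the output depends on the interpreter you run it with.

```python
import sysconfig

# Directory for pure-Python modules
# (assumed counterpart of %{python3_sitelib}).
purelib = sysconfig.get_path("purelib")

# Directory for architecture-specific extension modules
# (assumed counterpart of %{python3_sitearch}).
platlib = sysconfig.get_path("platlib")

print("pure-Python modules:", purelib)
print("extension modules:  ", platlib)
```

On some systems both paths are identical. That does not change the rule: use the macro that matches the kind of module being packaged.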

Another important macro is %{python3_version}, which expands to the version of Python being packaged for (e.g. 3.10). You can see its use above in the path to the egg-info.
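
The egg-info path in the %files example is assembled from the module name, the package version and the Python version. The following sketch reproduces that pattern. The function name egg_info_name and the version number 2.28.1 are made up for illustration, and the real file name is generated by the build tools, which may normalise the project name (for example, turning hyphens into underscores).

```python
import sysconfig


def egg_info_name(module, version, python_version=None):
    """Build a name of the form <module>-<version>-py<X.Y>.egg-info."""
    if python_version is None:
        # MAJOR.MINOR of the running interpreter, e.g. "3.10".
        python_version = sysconfig.get_python_version()
    return f"{module}-{version}-py{python_version}.egg-info"


print(egg_info_name("requests", "2.28.1", "3.10"))
# requests-2.28.1-py3.10.egg-info
```

Because the Python version is part of the name, a hard-coded path would stop matching after a Python upgrade, whereas the macro-based path keeps working.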

You should always use these macros. Do not hard-code paths or version numbers, as this would mean editing the SPEC file rather than simply rebuilding when we move to a new Python version.


pyproject.toml

The problem with the legacy setup.py approach is that there is no way to know which packages that file depends on. Python code cannot be reliably introspected without executing it, and executing it triggers all global-level imports. This presents a nasty chicken-and-egg problem for projects which want to use something other than setuptools (e.g. flit) to build the module. To solve this, new Python standards (PEP 517 and PEP 518) introduced a file, pyproject.toml, which is used to control the build and install. Trying to build such a module with the legacy macros will usually result in an error:

  /usr/bin/python3: can't open file '/home/terry/src/rpm/BUILD/keyring-23.8.2/setup.py': [Errno 2] No such file or directory
  error: Bad exit status from /home/terry/src/tmp/rpm-tmp.gKg4ao (%build) 
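
Before choosing macros, it can help to check what the unpacked source tree actually ships. The following is a minimal sketch; the function name build_interface is hypothetical. It reports setup.py first only because that is the file the legacy macros need. A tree that ships both files can often be built either way.

```python
from pathlib import Path


def build_interface(source_dir):
    """Report which build file is present in an unpacked source tree."""
    src = Path(source_dir)
    if (src / "setup.py").is_file():
        return "setup.py"          # legacy macros can find their entry point
    if (src / "pyproject.toml").is_file():
        return "pyproject.toml"    # legacy macros would fail as shown above
    return "unknown"


# Example: inspect the current directory.
print(build_interface("."))
```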

For these modules, the pyproject-rpm-macros package provides a set of macros which can be used as follows:

  %build
  %pyproject_wheel
  %install
  %pyproject_install

There is also a macro which can generate a %files list for the files which make up the module. It is run in the %install section, after %pyproject_install:

  %pyproject_save_files %{module}

The file list can then be used in the %files section, but note that you still have to add any doc and licence files manually, plus any files which are outside the module tree.

  %files -n python3-%{module} -f %{pyproject_files}
  %doc README LICENSE
  %{_bindir}/keyring

HINT: If you see this error when building:

  FileExistsError: %pyproject_install has found more than one *.dist-info/RECORD file. Currently, %pyproject_save_files supports only one wheel → one file list mapping. Feel free to open a bugzilla for pyproject-rpm-macros and describe your usecase.

then delete the file pyproject-record under the BUILD directory and retry the build.
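
If this happens often, the clean-up can be scripted. The sketch below is an assumption-laden example rather than an official tool: the function name remove_stale_records is hypothetical, and it simply searches the directory you give it for files named pyproject-record and deletes them. Point it only at your own rpm BUILD directory.

```python
from pathlib import Path


def remove_stale_records(build_dir, name="pyproject-record"):
    """Delete leftover record files under build_dir and return their paths."""
    removed = []
    # Collect the matches first so the tree is not modified while walking it.
    for path in list(Path(build_dir).rglob(name)):
        if path.is_file():
            path.unlink()
            removed.append(path)
    return removed


# Example call (the path is an assumption; use your own BUILD directory):
# remove_stale_records(Path.home() / "src" / "rpm" / "BUILD")
```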