3

Context: one can compile C code such that it can be used as a Python module. The compiled object is a shared library with a specific naming convention, so Python can find and load it as a module.
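
The set of filename suffixes the interpreter accepts for extension modules is platform-specific and can be inspected at runtime; a quick check using the standard importlib machinery:

# Show which extension-module suffixes this interpreter will import.
# Typical results: ['.cpython-312-x86_64-linux-gnu.so', '.abi3.so', '.so']
# on Linux, ['.cpython-312-darwin.so', '.abi3.so', '.so'] on macOS, and
# '.pyd' variants on Windows.
import importlib.machinery

print(importlib.machinery.EXTENSION_SUFFIXES)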

Great. I have successfully compiled and tested code such that file "foo.c" becomes a shared library "foo.so", and Python code import foo works.

The goal is to distribute a set of shared libraries for Mac, Linux, and Windows, where import foo loads the appropriate shared library.

Conceptually, I want my distribution to contain a directory with three files:

mypkg/
  ┠─ __init__.py
  ┠─ foo.so    (linux)
  ┠─ foo.dylib (mac)
  ┖─ foo.dll   (windows)

so that from mypkg import foo picks the appropriate library. I do not want to distribute the source code foo.c.

The problem is, Mac will pick the .so file and complain:

ImportError: dlopen(/.../mypkg/foo.so, 0x0002): tried: '/.../mypkg/foo.so' (not a mach-o file)

Is there a pattern / naming scheme which would permit this (short of writing a custom module loader)?

Edit: an explanation of why PyPI / pip / wheel-type distribution is not desired... these plugins aren't running in a standard python.exe process.

The main executable is a C program, which supports C-language plugins via an SDK. I've written a C plugin that embeds Python and exposes a Python interface to the original C API (calling Py_Initialize(), etc.). This C plugin looks for, loads, and executes Python plugins. The result is that one can now write Python plugins instead of C plugins. Users place Python plugins in a specific directory, and each is read and executed. (Plugins cannot execute standalone.) That all works fine.

Now, I'm looking at how one of these plugins can define and use a shared library Python module.

main.c -> 1) InitPython
          2) PyImport_Import("plugins/a.py")
          3) PyImport_Import("plugins/b.py")
               -> import mypkg.foo
             ...

If mypkg/foo.py is pure Python, this works great. If foo is a shared library, then it must be named foo.so on Linux and macOS, so I cannot simply ship my plugin as b.py + mypkg/*. I might be able to use pip install --target=plugins foo.whl.

Alternatively, I'm testing a different loading mechanism, similar to @shadowtalker's non-recommendation, for mypkg/foo.py:

import os
import sys
import platform

from importlib.machinery import ExtensionFileLoader
from importlib.util import spec_from_file_location

_system = platform.system()

# Load the platform-specific shared library sitting next to this file,
# e.g. foo.linux.so or foo.darwin.so.
filename = os.path.join(os.path.dirname(__file__), f'foo.{_system.lower()}.so')
_loader = ExtensionFileLoader('foo', filename)
_spec = spec_from_file_location('foo', filename, loader=_loader)
_mod = _loader.create_module(_spec)
_loader.exec_module(_mod)

# Register the module so the star-import below can resolve it.
sys.modules['foo'] = _mod
from foo import *

It looks like it's working, but I'll continue to test.
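
For reference, the documented importlib recipe for loading a module directly from a file path goes through importlib.util.module_from_spec rather than calling the loader by hand; a minimal sketch, assuming the same per-platform filename scheme as above:

import os
import sys
import platform
import importlib.util

# Per-platform filename, matching the scheme above.
filename = os.path.join(os.path.dirname(__file__), f'foo.{platform.system().lower()}.so')

spec = importlib.util.spec_from_file_location('foo', filename)
mod = importlib.util.module_from_spec(spec)
sys.modules['foo'] = mod   # register before executing, per the importlib docs
spec.loader.exec_module(mod)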

4 Answers

3

This is handled by the binary package distribution format called Wheel: https://pythonwheels.com/

You will end up building a separate wheel for each platform, and each wheel will contain only the .so/.dylib/.dll files that are needed for that particular platform.

Just about every major library in the Python ecosystem today distributes their code in the wheel format, even those that do not use compiled extensions.

If you are using a PEP-517-compatible build backend (Setuptools, Flit, Hatch, Poetry), building a wheel is as simple as pip install build && python -m build. This will produce a wheel file with a standardized filename, which can be uploaded directly to PyPI for distribution. When end users want to install your package, pip will download the wheel that is compatible with the user's system.

However, this will not handle cross-compiling for multiple platforms. For that, you will need something like a container or VM, or a tool like cibuildwheel.

Note that the details of how to set up your particular build backend to compile an extension, or to include existing compiled artifacts in the wheel, will vary considerably depending on the build backend you are using.
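
As one concrete illustration (a sketch only, assuming Setuptools as the backend and foo.c at the project root), the extension can be declared in setup.py:

# setup.py -- minimal sketch for building foo.c as an extension module.
# With a pyproject.toml that names setuptools as the build backend,
# `python -m build` produces a platform-tagged wheel from this.
from setuptools import setup, Extension

setup(
    name='mypkg',
    version='0.1',
    packages=['mypkg'],
    ext_modules=[Extension('mypkg.foo', sources=['foo.c'])],
)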

Finally, if you're unfamiliar with the concept of a "PEP 517 build backend", see here for a brief explanation and a tutorial for using Setuptools: https://setuptools.pypa.io/en/latest/build_meta.html


2 Comments

I'll investigate wheels, but I think it adds distribution complexity. In this case, I'd have three wheels (one per architecture) and have to upload them to PyPI (or another hosted site). It's true the user could easily install with pip... but that's still an additional step. Yes, it works, which I suppose beats the alternative.
@pbuck You already need to cross-compile your binary extension, so presumably you already have some kind of VM setup or cloud CI setup for doing that. I'd say that's probably the hardest part. Once that's done, it's just a matter of uploading 3 files to PyPI instead of 1, and you can upload them all at once with a single command using the Twine tool. See packaging.python.org/en/latest/guides/…. On the user side of things, there's arguably less complexity, because pip install should "just work".
2

Summarizing and providing "my" solution.

  1. Wheel is a great approach if distribution can be via pip. Wheels can include binaries without source code, and installation selects the correct architecture. Unfortunately, that doesn't fit my deployment environment, so I can't use it.

  2. Using importlib.machinery, as described in the original question, works, but it seems too "clever" to be a good idea: there's a simpler solution.

  3. I'm sure something fancier could be done with a combination of importlib.machinery and a custom PathFinder. (See also https://github.com/soroco/pyce/tree/master for a custom-loader example.) But, like #2, it's overkill.

  4. Using the pattern described by @shadowtalker is simple and workable, with a couple of notes:

    a. Instead of from _foo_linux import *, use the relative import from ._foo_linux import *, to make sure the local shared library is loaded.

    b. The exact same C source code can be used to compile all three architectures. Note that the shared library filename MUST match the module being loaded...

Three different shared library filenames means three different modules. But (what I hadn't thought of) you can include all three PyMODINIT_FUNC entry points in the same file, all creating the module from the same PyModuleDef:

/* One entry point per platform-specific module name; all three
   create the same module from the shared PyModuleDef. */
PyMODINIT_FUNC
PyInit_foo_linux(void) {
    return PyModule_Create(&foo_module);
}
PyMODINIT_FUNC
PyInit_foo_windows(void) {
    return PyModule_Create(&foo_module);
}
PyMODINIT_FUNC
PyInit_foo_darwin(void) {
    return PyModule_Create(&foo_module);
}

The build code simply compiles the same source three times, naming the Darwin compilation foo_darwin.so, the Windows compilation foo_windows.pyd (CPython on Windows only imports extension modules with the .pyd suffix), and the Linux compilation foo_linux.so.

This way, from .foo_darwin import * loads the shared object and invokes the PyInit_foo_darwin() function; from .foo_linux import * loads and invokes PyInit_foo_linux(); and so on.
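
Putting notes 4a and 4b together, mypkg/foo.py becomes a small dispatcher (a sketch, using the module names above):

# mypkg/foo.py -- select the platform-specific build of the same C source.
import platform

_system = platform.system()

if _system == "Linux":
    from .foo_linux import *
elif _system == "Darwin":
    from .foo_darwin import *
elif _system == "Windows":
    from .foo_windows import *
else:
    raise RuntimeError(f"Unsupported system: {_system!r}")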

1 Comment

Thanks for spotting the error in my import syntax; I updated it in my answer.
2

If for whatever reason the officially-recommended tools and techniques don't work for you (or if you just prefer to do things the hard way), your other option is to give each compiled extension a different module name, and then dynamically import the correct one based on platform.system() or similar:

mypkg/
  ┠─ __init__.py
  ┠─ foo.py
  ┠─ _foo_linux.so
  ┠─ _foo_mac.so
  ┖─ _foo_win.pyd

(Note that CPython only imports extension modules with the .so suffix on Linux and macOS, and the .pyd suffix on Windows, regardless of each platform's usual shared-library suffix.)

And in foo.py:

import platform

_system = platform.system()

if _system == "Linux":
    from ._foo_linux import *
elif _system == "Darwin":
    from ._foo_mac import *
elif _system == "Windows":
    from ._foo_win import *
else:
    raise RuntimeError(f"Unsupported system: {_system!r}")

However, you would still want to distribute this as a wheel, since you are shipping binary artifacts. At that point you're better off building separate wheels for each platform, which takes the burden of getting this right off you and generally works better with the rest of the ecosystem.

In case it wasn't clear from my tone and comments in the other answer: I do not recommend doing this.

There might be other niche use cases for this kind of dynamic dispatching behavior (e.g. checking for the presence of a command or shared library on the user's system). So hopefully this answer still serves as a useful example for someone out there, even though I think it's the wrong answer to the particular question that was asked here.
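
For example, such a check could hinge on standard-library probes like shutil.which() or ctypes.util.find_library(); a sketch (the backend module names are hypothetical):

# Sketch: pick a backend based on what is available on the user's system.
import shutil
import ctypes.util

if shutil.which("ffmpeg"):                 # an ffmpeg executable on PATH?
    from ._backend_ffmpeg import *         # hypothetical module
elif ctypes.util.find_library("avcodec"):  # libavcodec installed?
    from ._backend_avcodec import *        # hypothetical module
else:
    raise RuntimeError("No supported media backend found")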


-4
import platform

_system = platform.system()

if _system == "Linux":
    from _foo_linux import *
elif _system == "Darwin":
    from _foo_mac import *
elif _system == "Windows":
    from _foo_win import *
else:
    raise RuntimeError(f"Unsupported system: {_system!r}")

1 Comment

This appears to just be a copy-pasted fragment of shadowtalker's answer. What is this answer supposed to add?
