Skip to content

gh-122931 Allow stable API extensions to include a multiarch tuple in the filename#122917

Open
stefanor wants to merge 5 commits into
python:mainfrom
stefanor:stable-abi-multiarch
Open

gh-122931 Allow stable API extensions to include a multiarch tuple in the filename#122917
stefanor wants to merge 5 commits into
python:mainfrom
stefanor:stable-abi-multiarch

Conversation

@stefanor
Copy link
Copy Markdown
Contributor

@stefanor stefanor commented Aug 11, 2024

This permits stable ABI extensions for multiple architectures to be co-installed into the same directory, without clashing with each other, the same way (non-stable ABI) regular extensions can.

It is listed below the current .abi3 suffix because setuptools will select the first suffix containing .abi3, as the target filename. We do this to protect older Python versions predating this patch.

@stefanor stefanor force-pushed the stable-abi-multiarch branch 2 times, most recently from 5e380a1 to 7185a58 Compare August 12, 2024 07:35
stefanor added a commit to stefanor/cpython that referenced this pull request Aug 12, 2024
…ple in the filename

This permits stable ABI extensions for multiple architectures to be
co-installed into the same directory, without clashing with each other,
the same way (non-stable ABI) regular extensions can.

It is listed below the current .abi3 suffix because setuptools will
select the first suffix containing .abi3, as the target filename.
We do this to protect older Python versions predating this patch.
@stefanor stefanor force-pushed the stable-abi-multiarch branch from 7185a58 to 2955027 Compare August 12, 2024 07:51
@stefanor stefanor changed the title Allow stable API extensions to include a multiarch tuple in the filename gh-122931 Allow stable API extensions to include a multiarch tuple in the filename Aug 12, 2024
stefanor added a commit to stefanor/cpython that referenced this pull request Aug 12, 2024
…ple in the filename

This permits stable ABI extensions for multiple architectures to be
co-installed into the same directory, without clashing with each other,
the same way (non-stable ABI) regular extensions can.

It is listed below the current .abi3 suffix because setuptools will
select the first suffix containing .abi3, as the target filename.
We do this to protect older Python versions predating this patch.
@stefanor stefanor force-pushed the stable-abi-multiarch branch from 2955027 to c5b58a4 Compare August 12, 2024 09:21
Copy link
Copy Markdown
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should document this change in https://docs.python.org/dev/whatsnew/3.14.html (Doc/whatsnew/3.14.rst). I expect that you would elaborate a bit on the usage, how to use the feature, why it's needed, etc.

@stefanor stefanor requested a review from encukou as a code owner August 28, 2024 12:03
@stefanor
Copy link
Copy Markdown
Contributor Author

You should document this change in https://docs.python.org/dev/whatsnew/3.14.html

Done.

FWIW, in Debian we plan to backport this to 3.13 too.

@vstinner
Copy link
Copy Markdown
Member

I'm only aware of Debian who uses "multiarch". Do other operating systems also use it? Maybe Debian variants, Ubuntu, and Ubuntu variants?

This change will slow down any "import module". I don't recall if there is a cache for that or not.

@stefanor
Copy link
Copy Markdown
Contributor Author

Maybe Debian variants, Ubuntu, and Ubuntu variants?

All Debian derivatives, yes. They typically don't deviate very much, when it comes to plumbing.

@stefanor
Copy link
Copy Markdown
Contributor Author

This change will slow down any "import module". I don't recall if there is a cache for that or not.

What do you want to do about that? Is there a benchmark you'd like to see results for?

I see a table of 5 entries (including NULL) increasing to 6. That is one extra item to search, when:

  1. Importing C extensions from multiarch filenames.
  2. Importing C extensions with no tag of any kind (foo.so).
  3. Failing to import something because there isn't a C extension with this name.

@vstinner
Copy link
Copy Markdown
Member

vstinner commented Sep 2, 2024

What do you want to do about that? Is there a benchmark you'd like to see results for?

Example of command:

strace -o trace python3 -c pass

strace output:

newfstatat(AT_FDCWD, "/usr/lib64/python3.12/encodings/__init__.cpython-312-x86_64-linux-gnu.so", 0x7ffe20002e50, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/usr/lib64/python3.12/encodings/__init__.abi3.so", 0x7ffe20002e50, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/usr/lib64/python3.12/encodings/__init__.so", 0x7ffe20002e50, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/usr/lib64/python3.12/encodings/__init__.py", {st_mode=S_IFREG|0644, st_size=5884, ...}, 0) = 0

newfstatat(AT_FDCWD, "/usr/lib64/python3.12/encodings/__init__.py", {st_mode=S_IFREG|0644, st_size=5884, ...}, 0) = 0
openat(AT_FDCWD, "/usr/lib64/python3.12/encodings/__pycache__/__init__.cpython-312.pyc", O_RDONLY|O_CLOEXEC) = 3

Internally, when Python imports the encoding package, it has to do 4 fstat() calls. You can see a check on the .abi3.so suffix.

If we add a new entry to the import suffix, it will add one fstat() syscall per import (at least, to import a package). We should consider the impact on performance if we decide to add this feature.

@freakboy3742
Copy link
Copy Markdown
Contributor

I'm only aware of Debian who uses "multiarch". Do other operating systems also use it? Maybe Debian variants, Ubuntu, and Ubuntu variants?

FWIW: macOS, iOS and Android all use multiarch as a configuration value - it's used to differentiate ARM from x86_64 simulators, and devices from simulators/emulators.

However, on iOS and Android, there's limited need to keep binary artefacts in the same folder, as any given executable can only have a single architecture's executables. In addition, in the case of iOS, the binaries need to be migrated to the Frameworks folder and named as Frameworks, so any "side-by-side" benefits would be lost. Any "other platform" executables need to be stripped out as part of the build process, at which point there's no naming conflict.

macOS uses the multiarch config value, but the value is always "darwin", and the universal binary format exists to support multiple architectures in a single binary file. I guess it might be useful to be able to have x86_64 and ARM64 binaries side-by-side... but there's probably only a year or two left in the official supported life of x86_64, so I don't think adding this feature will ultimately be that helpful for the macOS use case.

@vstinner
Copy link
Copy Markdown
Member

vstinner commented Sep 2, 2024

@stefanor: Did you consider to maintain this change as a downstream-only patch in Debian?

If you would like to make it upstream, I would suggest making it optional, disabled by default, and add a configure option to enable it. That's how I added some Fedora specific changes, such as:

@stefanor
Copy link
Copy Markdown
Contributor Author

stefanor commented Sep 3, 2024

Here are some benchmarks:

Google Sheet with Data

There is no discernable performance difference in minimal interpreter startup python -c ''.

import time
import sys
from subprocess import check_call

t1 = time.perf_counter()
for i in range(1000):
    check_call([sys.executable, "-c", ""])
t2 = time.perf_counter()
print((t2-t1) / 1000)

image

Looking at strace, I see only a single extra syscall.


We can manufacture an import-intensive benchmark:

import pkgutil
import time

t1 = time.perf_counter()
for module in pkgutil.walk_packages(onerror=lambda pkg: None):
    if module.name.startswith("test."):
        continue
    if module.name.endswith(".__main__"):
        continue
    if module.name in {"antigravity", "idlelib.idle", "this", "zen"}:
        continue
    try:
        __import__(module)
    except Exception:
        pass
t2 = time.perf_counter()
print((t2 - t1))

image

Analysing this, I see 116 stat syscalls on .so filenames jumping to 232. But in a 100ms total execution time, that's negligible, and I can't see any statistically significant results.

@stefanor
Copy link
Copy Markdown
Contributor Author

stefanor commented Sep 3, 2024

I'm only aware of Debian who uses "multiarch".

Also, note that this is used on all linux platforms. The default for (non-stable ABI) extensions is to include the multiarch tuple in the extension filename. Pick a random binary wheel built in manylinux, and you'll see them.

@stefanor: Did you consider to maintain this change as a downstream-only patch in Debian?

I would be happy to do that. Although I prefer not carrying downstream-only patches long-term if possible. Look at the mess around dist-packages for example.

Without this MR, there's no real point in supporting the multiarch tags in the non-stable ABI extensions. Either we get the benefit from doing it everywhere, or you say you don't need to support the feature, and we rip it out everywhere. We've got half a feature at the moment. Would you prefer it to all be behind a config argument?

If you would like to make it upstream, I would suggest making it optional, disabled by default, and add a configure option to enable it. That's how I added some Fedora specific changes, such as:

These are not entirely Fedora-specific. And that's a good argument for always trying to upstream.

https://docs.python.org/dev/using/configure.html#cmdoption-with-platlibdir

In Debian we use those paths for our cross-compilers. I imagine there is a scenario where it's useful to have Python headers in there.

https://docs.python.org/dev/using/configure.html#cmdoption-with-wheel-pkg-dir

We're a happy customer of this too, It meant one less patch to carry.

@stefanor
Copy link
Copy Markdown
Contributor Author

stefanor commented Sep 4, 2024

There is no discernable performance difference in minimal interpreter startup python -c ''.

Maybe 0.5% slowdown with PGO. For 1 extra syscall, I'm surprised it's visible. This may still not be significant.

image

@vstinner
Copy link
Copy Markdown
Member

@erlend-aasland @encukou: Would you mind to have a look at this issue? What do you think? Should it become the default behavior, or should it be a configure option?

@encukou
Copy link
Copy Markdown
Member

encukou commented Sep 23, 2024

IMO, the most important thing here is to keep this in mind for abi4 (or whatever we name the free-threading-compatible stable ABI). There it should be the only option.
If we get abi4 in 3.14, it's probably fine to keep this as a downstream-only patch for one more release.

@stefanor
Copy link
Copy Markdown
Contributor Author

IMO, the most important thing here is to keep this in mind for abi4 (or whatever we name the free-threading-compatible stable ABI). There it should be the only option.

Yeah, that sounds sensible.

If we get abi4 in 3.14, it's probably fine to keep this as a downstream-only patch for one more release.

Is that expected? We've been on abi3 for quite a while.

@stefanor
Copy link
Copy Markdown
Contributor Author

There it should be the only option.

If we want that for the stable ABI, do we want to do the same thing for the regular ABI? It would probably affect a lot of build systems that assume they can name their output extension.so

@encukou
Copy link
Copy Markdown
Member

encukou commented Sep 24, 2024

Is that expected? We've been on abi3 for quite a while.

Yes, over a decade. abi3 showing its age, and since it's incompatible free-threading we'll need a new alternative soon.
abi3 is not going away, but changing it this late in the cycle might not be worth it.

If we want that for the stable ABI, do we want to do the same thing for the regular ABI?

I don't see much reason to deprecate and remove that, so that we don't break the build systems you mention. (Note that even abi3/abi4 extensions will work with a bare .so extension, since the stable ABI is a subset of the full one.)

@stefanor
Copy link
Copy Markdown
Contributor Author

stefanor commented Oct 7, 2024

FWIW, in Debian we plan to backport this to 3.13 too.

I'm going to include it in our 3.13.0 upload.

@stefanor
Copy link
Copy Markdown
Contributor Author

stefanor commented Dec 19, 2024

🔔 (I'd like to remind people to please consider this)

@github-actions
Copy link
Copy Markdown

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions Bot added the stale Stale PR or inactive for long period of time. label Apr 18, 2026
…ple in the filename

This permits stable ABI extensions for multiple architectures to be
co-installed into the same directory, without clashing with each other,
the same way (non-stable ABI) regular extensions can.

It is listed below the current .abi3 suffix because setuptools will
select the first suffix containing .abi3, as the target filename.
We do this to protect older Python versions predating this patch.
@stefanor stefanor force-pushed the stable-abi-multiarch branch from eadad30 to 735f8af Compare May 16, 2026 17:42
@stefanor stefanor requested a review from FFY00 as a code owner May 16, 2026 17:42
@read-the-docs-community
Copy link
Copy Markdown

read-the-docs-community Bot commented May 16, 2026

Copy link
Copy Markdown
Member

@encukou encukou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vstinner, what would it take for you to reconsider?

This is rather important for Debian/Ubuntu, doesn't hurt others (the stat call is insignificant as shown above), and makes things more consistent with the non-stable ABI suffix.

@FFY00
Copy link
Copy Markdown
Member

FFY00 commented May 19, 2026

But, if this is not a default, then it is not safe for us to ever name C extensions with a multiarch suffix, the way we do for non-stable-ABI ones, in public distributions. Personally, I'd prefer to be consistent about this. At least for abi4.

AFAIK, you are the only ones shipping such extensions, no?

@stefanor
Copy link
Copy Markdown
Contributor Author

AFAIK, you are the only ones shipping such extensions, no?

At the moment, we're the only ones with this patch. So, the only ones producing such extensions.

But, adding it and making it default (which this branch now does), means all Linux have a multiarch tag in linux abi3 and abi3t extensions, the same way they do for non-stable-ABI extensions.

@FFY00
Copy link
Copy Markdown
Member

FFY00 commented May 19, 2026

But that's something IMO we want to discourage...

If you need to share platlib between installations due to the Debian guidelines, fine, but that's not something we really want people doing. Separating platlib is the solution people should be using, if at all possible, with your approach only as a last resort.

I am okay with supporting your use-case in a best-effort capacity, but I do not think it should be the default behavior.

@FFY00
Copy link
Copy Markdown
Member

FFY00 commented May 19, 2026

That said, I don't want to block this PR, so if the rest of the maintainers agree, please go ahead.

The only thing I ask is to add tests.

@stefanor
Copy link
Copy Markdown
Contributor Author

If you need to share platlib between installations due to the Debian guidelines, fine, but that's not something we really want people doing

If that's the case, PEP3149 should be overturned. We implemented PEP3149, to allow us to do this. But stable-abi was forgotten, and when people started to use them, we ran into problems and filed this MR. From my position, PEP3149 works well, and I don't see any need to discourage its use.

I am okay with supporting your use-case in a best-effort capacity, but I do not think it should be the default behavior.

It's currently the default behaviour for non-stable ABI. Why should stable-abi be different? Surely we want to align?

@stefanor
Copy link
Copy Markdown
Contributor Author

The only thing I ask is to add tests.

Added a basic test for the presence of the tags.

Also, updated the NEWS text for the re-ordering of extensions.

Comment thread Doc/whatsnew/3.16.rst Outdated
@rgommers
Copy link
Copy Markdown
Contributor

From my position, PEP3149 works well, and I don't see any need to discourage its use.

It is a continuous source of friction and latent bugs. From a packaging tool author as well as package-with-c-exts author perspective, it's definitely not ideal. So @FFY00 is right I think to recommend that if one starts fresh (e.g., a new distro from scratch), please don't do this.

If that's the case, PEP3149 should be overturned. We implemented PEP3149, to allow us to do this.

That doesn't follow. Your use case is supported by that PEP, and is rooted in a long history - which is okay. Doesn't mean it's the recommended or optimal choice for new designs. Non-interleaved packages are clearly more robust, and the disk space / robustness tradeoffs are quite different today than 15-20 years ago.

It's currently the default behaviour for non-stable ABI. Why should stable-abi be different? Surely we want to align?

That I can agree with. For design consistency, it's quite reasonable to treat default/unstable and stable ABI extension names the same. Beyond just some work and having to maintain a "it's different from 3.15 onwards" code in packaging tools, I don't see a significant downside to accepting this PR.

@stefanor
Copy link
Copy Markdown
Contributor Author

It is a continuous source of friction and latent bugs. From a packaging tool author as well as package-with-c-exts author perspective, it's definitely not ideal.

Noted that you wouldn't recommand anybody else implement this interface. Being closer to it, I have a much more positive view. If you look at the problems it's solving, I really don't see any other obvious way to go about it that cause less friction. It massively reduces the amount of duplication and churn we have to do in the distro. It should also be a useful tool for anybody who wants to ship something for multiple Python releases and/or architectures.

How would you like to update the NEWS messaging to represent your view here?

If this is all something that Python wants to deprecate, we should design a replacement for it. We just had a packaging summit at PyCon. If that was something we should have discussed, I wish I'd known that.

@stefanor stefanor force-pushed the stable-abi-multiarch branch from 0117088 to 7b071c2 Compare May 20, 2026 16:31
@rgommers
Copy link
Copy Markdown
Contributor

How would you like to update the NEWS messaging to represent your view here?

I think that entry is fine, it's just a factual description of the technical change in this PR. I don't think it needs to touch on anything bigger-picture.

If this is all something that Python wants to deprecate

I certainly don't speak for Python (I'm not even a core dev), but: I don't think anyone wants to push for that. It's fine to exist.

We just had a packaging summit at PyCon. If that was something we should have discussed, I wish I'd known that.

The key thing that seems needed is integration testing for interleaved installs CPython, so that the next time any requirements get captured in time, rather than afterwards. That came up on Discourse before, can't find the link right now but it was recent so I guess you remember?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

awaiting merge stale Stale PR or inactive for long period of time.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants