
feat: allow manylinux 2.28 and 2.34 on python 3.12+ when compiled on a different architecture #762

Open
wants to merge 2 commits into
base: develop
from

Conversation

alesanfra

Issue: #700

Description of changes:
As described by @valerena here, when trying to build cross-architecture, the platform tags added to the pip command are not correct. With this PR, I've fixed the issue while preserving most of the original logic. The approach is to add all compatible platform tags for each python version and architecture pair to the pip command, as per the official pip documentation.

The code may look a bit hacky, but I tried to change as few lines as possible. Feel free to propose improvements - I'm ready to accept them.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@alesanfra alesanfra requested a review from a team as a code owner July 14, 2025 07:21
@github-actions github-actions bot added pr/external stage/needs-triage Automatically applied to new issues and PRs, indicating they haven't been looked at. area/workflow/python_pip labels Jul 14, 2025
@valerena
Contributor

Hi @alesanfra, thanks a lot for the contribution. Some basic formatting checks failed in your PR; hopefully you can review and address them. You can run those checks locally with make pr.

I'm not able to review this right now, but I'll review it in the next couple of days and check with others on the team as well.

Comment on lines +175 to +182
_COMPATIBLE_PLATFORM_ARM64 = [
    "any",
    "linux_aarch64",
    "manylinux2014_aarch64",
    "manylinux_2_17_aarch64",
    "manylinux_2_28_aarch64",
    "manylinux_2_34_aarch64",
]

I'm not a codeowner here, just an interested party, but is there a reason to make this a list instead of a set? afaict order doesn't matter and set operations are very useful here

Author

TL;DR: I need the platforms to be sorted by release time.

Good question! The problem with the --platform option in pip download is that it tries to match exactly the platform you pass as an argument, excluding potentially compatible wheels compiled with an earlier version of glibc.

According to pip's documentation, it is up to the caller to pass the full list of compatible platforms to the target, because pip makes no a priori assumptions.

Let's take an example: suppose we want to install numpy for Amazon Linux 2023, which runs glibc version 2.34, but we are building on a different operating system or architecture (e.g., a Mac). Let's try downloading numpy by passing only the matching platform tag:

$ pip download numpy --platform manylinux_2_34_x86_64 --only-binary=:all:
ERROR: Could not find a version that satisfies the requirement numpy (from versions: none)
ERROR: No matching distribution found for numpy

This is strange but expected, because there is no version of numpy on pypi.org with that platform tag. You need to pass all possible compatible tags to the command, so a working command would be:

$ pip download numpy --platform manylinux_2_17_x86_64 --platform manylinux_2_28_x86_64 --platform manylinux_2_34_x86_64 --only-binary=:all:

With this command you will get numpy version 2.3.0, which comes with the manylinux_2_28_x86_64 platform tag.
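
As a rough illustration of the command above, here is a minimal sketch of how a caller could assemble the repeated --platform flags in Python. The helper name build_pip_download_args and the dest default are hypothetical and not part of this PR; the only pip behaviour assumed is that --platform may be given more than once.

```python
import sys
from typing import List, Sequence


def build_pip_download_args(
    package: str, platforms: Sequence[str], dest: str = "wheels"
) -> List[str]:
    """Build a `pip download` argument list with one --platform flag per tag.

    Hypothetical helper for illustration only; it builds the command and
    does not execute it.
    """
    args = [
        sys.executable, "-m", "pip", "download", package,
        "--only-binary=:all:",  # needed when --platform is used
        "--dest", dest,
    ]
    for tag in platforms:
        # Each --platform occurrence adds one more accepted platform tag.
        args.extend(["--platform", tag])
    return args


if __name__ == "__main__":
    tags = [
        "manylinux_2_17_x86_64",
        "manylinux_2_28_x86_64",
        "manylinux_2_34_x86_64",
    ]
    print(" ".join(build_pip_download_args("numpy", tags)))
```

The resulting list can be handed to subprocess.run as-is, which avoids shell quoting problems.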

This means that each glibc version and architecture pair has a different set of compatible platform tags. Unfortunately, I could not find a clever way to implement this in Python, so I used this simple logic:

  1. For each architecture, list all possible platform tags ordered from oldest to newest;
  2. At runtime, determine the most recent platform tag compatible with our target (in the example it is manylinux_2_34_x86_64);
  3. Pass the platform determined in step 2 and all previously released platforms to the command.

This is why I turned the set into a list: I need the platforms to be sorted by release time.
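
The three steps above can be sketched roughly as follows. This is not the code in the PR: the names ORDERED_TAG_TEMPLATES and platforms_up_to are hypothetical, and step 2 (picking the newest tag for the target) is assumed to be done by the caller.

```python
from typing import List

# Step 1: platform tag templates ordered from oldest to newest.
# The tags mirror the list in this PR; the constant name is hypothetical.
ORDERED_TAG_TEMPLATES = [
    "any",
    "linux_{arch}",
    "manylinux2014_{arch}",
    "manylinux_2_17_{arch}",
    "manylinux_2_28_{arch}",
    "manylinux_2_34_{arch}",
]


def platforms_up_to(newest_template: str, arch: str) -> List[str]:
    """Return the newest compatible tag plus every tag released before it."""
    # Step 2 is assumed to have produced `newest_template` already.
    # list.index raises ValueError if the template is not in the list.
    cutoff = ORDERED_TAG_TEMPLATES.index(newest_template)
    # Step 3: keep everything up to and including the newest compatible tag.
    return [t.format(arch=arch) for t in ORDERED_TAG_TEMPLATES[: cutoff + 1]]


if __name__ == "__main__":
    print(platforms_up_to("manylinux_2_28_{arch}", "aarch64"))
```

The slice only works because the list is ordered, which is the property a set would lose.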

If you have any ideas to improve the code, feel free to share them here; this is the simplest approach I could find.

@bnusunny

bnusunny commented Jul 16, 2025

@alesanfra Thanks for this PR. It looks great! I would suggest a few additional improvements to make it more maintainable.

  1. Add ManylinuxSpec dataclass
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class ManylinuxSpec:
    """Specification for a manylinux platform compatibility standard."""
    
    platform_template: str
    min_glibc_major: Optional[int] = None
    min_glibc_minor: Optional[int] = None
    introduced_year: Optional[int] = None
    
    def is_compatible_with_glibc(self, glibc_version: Tuple[int, int]) -> bool:
        """Check if this manylinux spec is compatible with the given glibc version."""
        if self.min_glibc_major is None:
            return True
        return (self.min_glibc_major, self.min_glibc_minor) <= glibc_version
    
    def format_for_arch(self, arch: str) -> str:
        """Format the platform template for the given architecture."""
        return self.platform_template.format(arch=arch)
  2. Replace static platform lists with ordered evolution
class DependencyBuilder(object):
    # Ordered list of manylinux platform specifications from oldest to newest
    _MANYLINUX_EVOLUTION = [
        ManylinuxSpec("any"),
        ManylinuxSpec("linux_{arch}"),
        ManylinuxSpec("manylinux1_{arch}", 2, 5, 2016),
        ManylinuxSpec("manylinux2010_{arch}", 2, 12, 2019),
        ManylinuxSpec("manylinux2014_{arch}", 2, 17, 2019),
        ManylinuxSpec("manylinux_2_17_{arch}", 2, 17, 2019),
        ManylinuxSpec("manylinux_2_28_{arch}", 2, 28, 2022),
        ManylinuxSpec("manylinux_2_34_{arch}", 2, 34, 2023),
    ]
  3. Implement dynamic platform selection
from typing import List

class DependencyBuilder(object):
    @property
    def compatible_platforms(self) -> List[str]:
        """Get the list of all compatible platforms for the current architecture."""

        lambda_abi = get_lambda_abi(self.runtime)
        runtime_glibc = self._RUNTIME_GLIBC.get(lambda_abi, self._DEFAULT_GLIBC)
        arch = "aarch64" if self.architecture == ARM64 else X86_64
        
        platforms = []
        for spec in self._MANYLINUX_EVOLUTION:
            if spec.is_compatible_with_glibc(runtime_glibc):
                platforms.append(spec.format_for_arch(arch))
            else:
                break  # Stop at first incompatible version
                
        return platforms
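
To check the selection logic outside the builder, the suggestion can be condensed into a standalone sketch like the one below. It is a simplification, not the proposed patch: the glibc floor is stored as a single tuple, SPECS and select_platforms are hypothetical names, and the target glibc version is passed in directly instead of being looked up from the runtime.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ManylinuxSpec:
    """Simplified variant of the spec above, with the glibc floor as a tuple."""

    platform_template: str
    min_glibc: Optional[Tuple[int, int]] = None  # None means "always compatible"

    def is_compatible_with_glibc(self, glibc_version: Tuple[int, int]) -> bool:
        return self.min_glibc is None or self.min_glibc <= glibc_version

    def format_for_arch(self, arch: str) -> str:
        return self.platform_template.format(arch=arch)


# Ordered oldest to newest, same entries as the list in the comment above.
SPECS = [
    ManylinuxSpec("any"),
    ManylinuxSpec("linux_{arch}"),
    ManylinuxSpec("manylinux1_{arch}", (2, 5)),
    ManylinuxSpec("manylinux2010_{arch}", (2, 12)),
    ManylinuxSpec("manylinux2014_{arch}", (2, 17)),
    ManylinuxSpec("manylinux_2_17_{arch}", (2, 17)),
    ManylinuxSpec("manylinux_2_28_{arch}", (2, 28)),
    ManylinuxSpec("manylinux_2_34_{arch}", (2, 34)),
]


def select_platforms(glibc_version: Tuple[int, int], arch: str) -> List[str]:
    """Collect tags in order, stopping at the first one that needs a newer glibc."""
    platforms = []
    for spec in SPECS:
        if not spec.is_compatible_with_glibc(glibc_version):
            break
        platforms.append(spec.format_for_arch(arch))
    return platforms


if __name__ == "__main__":
    print(select_platforms((2, 26), "x86_64"))   # stops before manylinux_2_28
    print(select_platforms((2, 34), "aarch64"))  # includes every entry
```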

Contributor

@valerena valerena left a comment

We can leave the potential improvements for later, since the PR looks good already considering the existing code base.

Labels
area/workflow/python_pip pr/external stage/needs-triage Automatically applied to new issues and PRs, indicating they haven't been looked at.

4 participants