Enable multi-arch package download and filtering in download_prerelease_packages.py #5028

Open
araravik-psd wants to merge 8 commits into main from users/arravikum/download_multi-arch
Contributor

araravik-psd commented May 4, 2026

Adds support for downloading and listing multi-arch packages from flat S3 layouts (e.g. v4/whl/) and introduces consistent filtering using both existing and multi-arch package allow lists.

Key Changes

Multi-arch support

  1. Added --multi-arch mode for flat S3 layouts
  2. Supports both listing (--list-packages-multi-arch) and download flows
  3. Removes reliance on the architecture-based directory structure
  4. Added rocm_profiler to the promotion list
  5. Updated how_to_do_release.md with the script naming change and script usage

Package filtering

  1. Introduced PACKAGES_TO_PROMOTE_MULTI_ARCH for packages newly introduced for multi-arch
  2. Added is_allowed_multi_arch_package() to unify filtering across:
     • PACKAGES_TO_PROMOTE
     • PACKAGES_TO_PROMOTE_MULTI_ARCH
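
A minimal sketch of how a unified allow-list check might look. The set contents, the sample filenames, and the prefix-matching rule for device families are assumptions for illustration; the real lists and logic live in the script under review.

```python
# Illustrative allow lists only; these are NOT the script's actual contents.
PACKAGES_TO_PROMOTE = {"rocm", "torch"}
PACKAGES_TO_PROMOTE_MULTI_ARCH = {"amd_torch_device", "rocm_sdk_device"}


def distribution_name(filename: str) -> str:
    """Return the distribution name of a wheel filename.

    Wheel filenames put the (normalized) distribution name before the
    first "-", so splitting once is enough for this sketch.
    """
    return filename.split("-", 1)[0].lower()


def is_allowed_multi_arch_package(filename: str) -> bool:
    """Accept a file if its package is on either allow list.

    Assumption: multi-arch entries also act as family prefixes, so a
    hypothetical amd_torch_device_<target> package matches the
    amd_torch_device entry.
    """
    name = distribution_name(filename)
    if name in PACKAGES_TO_PROMOTE or name in PACKAGES_TO_PROMOTE_MULTI_ARCH:
        return True
    return any(
        name.startswith(family + "_") for family in PACKAGES_TO_PROMOTE_MULTI_ARCH
    )
```

Keeping both lists behind one predicate means the listing and download flows can share the same filter without either one knowing which list a package came from.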

Multi-arch arch filtering

  • Extended the existing --arch filtering support to multi-arch flows
  • Added filtering for both:
    • multi-arch package listing
    • multi-arch package downloads
  • Generic packages continue to be included
  • Device-specific packages are filtered based on the requested gfx architectures
  • Ensures only relevant packages (including device-specific ones) are processed
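
The arch filtering rule above can be sketched as follows. This is only an illustration under stated assumptions: the DEVICE_FAMILIES tuple, the `<family>_<gfx target>` naming scheme, the exact-match comparison, and the helper names are hypothetical and not taken from the script.

```python
from typing import Iterable, List, Optional

# Hypothetical device families; the package name is assumed to be
# "<family>_<gfx target>", for example amd_torch_device_gfx942.
DEVICE_FAMILIES = ("amd_torch_device", "rocm_sdk_device")


def device_target(filename: str) -> Optional[str]:
    """Return the gfx target encoded in a device-specific package name,
    or None when the package is generic."""
    name = filename.split("-", 1)[0].lower()
    for family in DEVICE_FAMILIES:
        prefix = family + "_"
        if name.startswith(prefix):
            return name[len(prefix):]
    return None


def filter_by_arch(filenames: Iterable[str], archs: Iterable[str]) -> List[str]:
    """Keep generic packages, plus device packages for the requested archs.

    An empty arch list disables filtering.
    """
    wanted = {arch.lower() for arch in archs}
    if not wanted:
        return list(filenames)
    kept = []
    for filename in filenames:
        target = device_target(filename)
        # Generic packages have no target and always pass.
        if target is None or target in wanted:
            kept.append(filename)
    return kept
```

Applying the same function to both the listing and the download path is one way to keep the two flows consistent.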

New functionality

  1. list_packages_multi_arch_verbose() for listing flat-layout packages
  2. download_multi_arch_packages() for downloading filtered packages
  3. has_version_in_directory() to validate package availability
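
The signature of has_version_in_directory() is not shown in this thread. As a rough sketch, assuming it receives the filenames found under a prefix and a version string, the check could rely on the standard wheel filename layout (name-version-[build-]python-abi-platform.whl). The wheel_version() helper is hypothetical.

```python
from typing import Iterable, Optional


def wheel_version(filename: str) -> Optional[str]:
    """Extract the version field from a wheel filename, or None if the
    name does not look like a wheel."""
    if not filename.endswith(".whl"):
        return None
    fields = filename[: -len(".whl")].split("-")
    # name, version, optional build tag, python tag, abi tag, platform tag
    if len(fields) not in (5, 6):
        return None
    return fields[1]


def has_version_in_directory(filenames: Iterable[str], version: str) -> bool:
    """Return True if any wheel in the listing carries exactly this version."""
    return any(wheel_version(name) == version for name in filenames)
```

Comparing the parsed version field, rather than searching for the version as a substring, avoids treating a release candidate such as 7.0.0rc1 as a match for 7.0.0.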

CLI updates and new variables added:

Added flags:
  • --multi-arch
  • --list-packages-multi-arch

Notes

  • Multi-arch flow assumes a flat prefix layout (no per-arch directories)
  • Single-arch flow remains unchanged and continues to use existing logic
  • Filtering is now consistent across listing and download flows

Testing

Listing output for both the multi-arch and single-arch flows is recorded here:
https://gist.github.com/araravik-psd/58a570bccd944fc8a5df7a96d6c6abe3

Downloading multi-arch packages:
https://gist.github.com/araravik-psd/38682418ed118f8aa06d023b35d44911

"amd_torch_device",
"amd_torchvision_device",
"rocm_sdk_device",
}
Member

drive-by: we may want a placeholder here for amd_torchaudio_device (and in a few other scripts/workflows) so we remember if pytorch/audio#4180 gets further along

Contributor Author

Updated with a placeholder.

)

parser.add_argument(
"--list-packages-multi-arch",
Member

Suggested change
- "--list-packages-multi-arch",
+ "--list-multi-arch-packages",

?

Contributor Author

Done, updated.

Contributor Author

Also updated the tests to reflect the argument change:

https://gist.github.com/araravik-psd/58a570bccd944fc8a5df7a96d6c6abe3

"--multi-arch",
action=argparse.BooleanOptionalAction,
default=False,
help="--multi-arch requires prefix like v4/whl/",
Member

This help is confusing since the option is boolean.

Contributor Author

Updated the help text here.
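
For illustration, one possible shape for the two flags with help text that describes the boolean behavior. The help wording and the store_true action on the listing flag are assumptions, not the text that landed in the PR.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Sketch of the multi-arch flags discussed in this thread."
    )
    parser.add_argument(
        "--multi-arch",
        # BooleanOptionalAction (Python 3.9+) also generates --no-multi-arch.
        action=argparse.BooleanOptionalAction,
        default=False,
        help=(
            "Treat the S3 prefix as a flat multi-arch layout "
            "(for example v4/whl/) instead of per-arch directories."
        ),
    )
    parser.add_argument(
        "--list-multi-arch-packages",
        action="store_true",  # assumed; the actual action is not shown here
        help="List packages found under the flat multi-arch prefix.",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.multi_arch, args.list_multi_arch_packages)
```

Describing what the flag does, and mentioning the prefix only as an example, keeps the help readable for an option that takes no value.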

Member

We may want to rename this script?

Contributor Author

I suggest renaming it to download_release_artifacts.py, as this is not only for prereleases.

Member

Well, these are not artifacts; artifacts are the tarballs we push to the artifact buckets. It is rather download_python_packages, isn't it?

Contributor Author

Yes, that is more accurate. Changing the script name to download_python_packages.py.

@araravik-psd araravik-psd requested review from ScottTodd and marbre May 6, 2026 02:31
@@ -167,6 +189,64 @@
}


def is_allowed_multi_arch_package(filename: str) -> bool:
Member

Why is this distinction important here? Multi-arch will always be in a different index / suffix.

Contributor Author

The distinction is not about separating storage layouts; multi-arch already uses a dedicated prefix.

The filtering here restricts downloads and listings to promotable package families only, excluding unrelated artifacts that may exist in the same bucket/prefix.

In particular, multi-arch introduces additional dynamic package families (for example amd_torch_device_* and rocm_sdk_device_*) that are not part of the legacy single-arch allow-list, so the helper centralizes that filtering logic for the multi-arch flow.

I am mainly trying to keep multi-arch package selection independent from the legacy single-arch categorization logic, and to avoid overloading categorize_package() with a second set of semantics. I hope this works; if you would like me to consolidate everything, I can do that too.

Member

I think this could actually reuse quite a bit of code from other scripts; see my comments in the upload script PR.

Contributor Author

OK, I will create an issue to reuse the functionality in build_tools/github_actions/publish_rocm_to_release_buckets.py in the next version of the script, and update this PR with the issue.

@araravik-psd araravik-psd requested a review from marbre May 7, 2026 18:05