@wufann wufann commented Sep 1, 2025

PR Category

AMD MI300x/MI308x GPU backend

Type of Change

Feature

Description

  • Adds an AMD GPU backend.
  • Covers the AMD fp8 e4m3/e5m2 data formats in tests/test_quant.py.

Issue

Progress

  • Change is properly reviewed (1 reviewer required, 2 recommended).
  • Change responds to an issue.
  • Change is fully covered by a UT.

Performance

@Galaxy1458
Collaborator

Galaxy1458 commented Sep 2, 2025

Hi @wufann, thank you for contributing the AMD backend adaptation to FlagGems; this is great work! We will schedule time to review the code you submitted. If you are willing, could you also share your motivation for submitting this PR and your future plans for it?

@wufann
Author

wufann commented Sep 2, 2025

Hi @Galaxy1458, thanks for your reply. I am a solution engineer at AMD.

Motivation:
FlagGems is an excellent Triton operator library, and Triton officially supports an AMD compiler backend. As more users adopt AMD GPUs, we can recommend FlagGems to our customers and developers. FlagGems can also help improve the quality of Triton on AMD.

Plan:

  1. Fix the accuracy (acc) tests. The blas, quant, libentry, shape_utils, tensor_constructor, and tensor_wrapper tests pass now; some test cases in reduce, norm, and attention still need fixing.
  2. Tune the kernels and provide the best tune_configs.
  3. Add other advanced kernels, such as intra-/inter-node communication or communication/GEMM overlap.

@Galaxy1458
Collaborator

@wufann We are glad to see such a complete plan, and it seems that this PR is still being refined. When it reaches a reviewable state, you can mention @Galaxy1458 and @StrongSpoon to review it.

@wufann wufann marked this pull request as draft September 3, 2025 02:30
2 participants