@ani300 (Contributor) commented Aug 6, 2025

Description of the change

  1. Adds a missing import for FP8 attention (subsumes #175, "fix: fixed fp8 import so as to not require user to import")
  2. Fixes the sharding of the weight scales for FP8 tensor parallelism (TP)

This change depends on a PR in FMS (foundation-model-stack/foundation-model-stack#457) and will not work without it.
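For readers unfamiliar with the second item, here is a minimal sketch of why the sharding dimension of the weight scales matters under TP. It is not the code from this PR or from FMS: the helper names (`shard_rows`, `shard_colwise_linear`, `dequantize`) are hypothetical, plain Python lists stand in for tensors, and it assumes a column-parallel linear layer whose weight has shape `[out_features][in_features]` with either one scale per output channel or a single per-tensor scale.

```python
from typing import List, Tuple


def shard_rows(rows: List, rank: int, world_size: int) -> List:
    """Return the contiguous slice of `rows` owned by `rank`."""
    if len(rows) % world_size != 0:
        raise ValueError("length must be divisible by world_size")
    per_rank = len(rows) // world_size
    return rows[rank * per_rank:(rank + 1) * per_rank]


def shard_colwise_linear(
    weight: List[List[float]],
    scale: List[float],
    rank: int,
    world_size: int,
) -> Tuple[List[List[float]], List[float]]:
    """Hypothetical helper: shard a column-parallel layer's weight and scale.

    The weight is split along its output dimension (rows). A per-channel
    scale has one entry per output row, so it is split the same way. A
    per-tensor scale (length 1) applies to every row and is replicated.
    """
    w_shard = shard_rows(weight, rank, world_size)
    if len(scale) == 1:
        s_shard = list(scale)
    else:
        s_shard = shard_rows(scale, rank, world_size)
    return w_shard, s_shard


def dequantize(weight: List[List[float]], scale: List[float]) -> List[List[float]]:
    """Multiply each row by its scale (or by the single per-tensor scale)."""
    out = []
    for i, row in enumerate(weight):
        s = scale[0] if len(scale) == 1 else scale[i]
        out.append([v * s for v in row])
    return out
```

In this sketch, concatenating the dequantized shards from every rank reproduces the dequantized full weight only when the scale is split along the same dimension as the weight rows it belongs to. Splitting it along a different dimension would pair rows with the wrong scales.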

Related issues or PRs

How to verify the PR

Was the PR tested

  • I have added >=1 unit test(s) for every new method I have added (if that coverage is difficult, please briefly explain the reason)
  • I have ensured all unit tests pass

Checklist for passing CI/CD:

  • All commits are signed showing "Signed-off-by: Name <[email protected]>" with git commit --signoff or equivalent
  • PR title and commit messages adhere to Conventional Commits
  • Contribution is formatted with tox -e fix
  • Contribution passes linting with tox -e lint
  • Contribution passes spellcheck with tox -e spellcheck
  • Contribution passes all unit tests with tox -e unit

Note: CI/CD performs unit tests on multiple versions of Python from a fresh install. There may be differences with your local environment and the test environment.

@ani300 ani300 changed the title FP8 TP fixes fix: FP8 TP fixes Aug 6, 2025
@github-actions github-actions bot added the fix label Aug 6, 2025
@ani300 (Contributor, Author) commented Aug 6, 2025

We can close #174 as this one is a better fix

@joerunde left a comment

Works on vllm!

@andrea-fasoli (Collaborator) left a comment

Fix sharding dimension. Looks good

@ani300 ani300 merged commit 207eb06 into main Aug 6, 2025
14 of 15 checks passed
joerunde added a commit to vllm-project/vllm-spyre that referenced this pull request Aug 8, 2025
# Description

The general idea is that the FP8 model can be used in all static batching (SB)
and continuous batching (CB) scenarios.

This requires unreleased changes from fms and fms-mo in order to work:
- foundation-model-stack/foundation-model-stack#457
- foundation-model-stack/fms-model-optimizer#176

While the tests may pass on CPU, they will continue to fail on Spyre.

## Related Issues

Also addresses: #356

---------

Signed-off-by: Prashant Gupta <[email protected]>
Signed-off-by: Joe Runde <[email protected]>
Co-authored-by: Joe Runde <[email protected]>
4 participants