TE: update for TE release_v2.0 support #1721

Merged: 8 commits merged into Lightning-AI:main on Feb 10, 2025

kshitij12345 (Collaborator) commented Jan 30, 2025

Fixes #1737

Update to support https://github.com/NVIDIA/TransformerEngine/tree/release_v2.0

Tested with existing tests on RTX6000, H100 and B200.

NOTE - This update drops support for TE 1.x releases.
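
Since TE 1.x is no longer supported after this change, a caller may want to check the installed major version before relying on the executor. A minimal sketch, assuming the distribution is installed under the name `transformer_engine` and that its version string begins with the major number; the helper names below are hypothetical and are not part of Thunder or TransformerEngine:

```python
from importlib.metadata import PackageNotFoundError, version


def te_major_version(version_string: str) -> int:
    """Return the leading (major) component of a version string such as '2.0.0+abc'."""
    return int(version_string.split(".")[0])


def is_supported_te(version_string: str, minimum_major: int = 2) -> bool:
    """True if the major version is at least `minimum_major` (2, per the note above)."""
    return te_major_version(version_string) >= minimum_major


def installed_te_supported(dist_name: str = "transformer_engine") -> bool:
    """Check the installed distribution; `dist_name` is an assumption, adjust as needed."""
    try:
        return is_supported_te(version(dist_name))
    except PackageNotFoundError:
        # Not installed at all, so it cannot be a supported version.
        return False
```

This only inspects package metadata, so it does not import TE or touch the GPU.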

kshitij12345 changed the title from "TE: update for TE release_v2.0 support" to "[WIP] TE: update for TE release_v2.0 support" on Jan 30, 2025
kshitij12345 changed the title from "[WIP] TE: update for TE release_v2.0 support" to "TE: update for TE release_v2.0 support" on Feb 6, 2025

IvanYashchuk (Collaborator) left a comment

Great! I only have a few clarifying questions.

Note to other reviewers: this PR is not tested in CI (#196).

thunder/executors/transformer_engineex.py (resolved)
thunder/executors/transformer_engineex.py (outdated, resolved)
thunder/executors/transformer_engineex.py (resolved)
thunder/tests/distributed/test_ddp.py (resolved)
thunder/tests/distributed/test_ddp.py (resolved)
thunder/tests/distributed/test_fsdp.py (resolved)
kshitij12345 marked this pull request as ready for review on February 7, 2025, 11:39

t-vi (Collaborator) left a comment

Awesome, thank you @kshitij12345 @IvanYashchuk

t-vi merged commit 621dce7 into Lightning-AI:main on Feb 10, 2025
52 checks passed
Successfully merging this pull request may close these issues.

AttributeError: 'TELinear' object has no attribute 'get_fp8_workspace'. Did you mean: '_fp8_workspaces'?
3 participants