Add batch dim idx to support latest deepspeed DistributedAttention #1725
Conversation
…Attention (#37)
* Temp patch for batch_dim
* Adding batch_dim_idx as per latest deepspeed
* Update modeling_llama.py
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@regisss Please review it and merge it along with the other 1.20-dependent PRs.
I left one comment to address, plus @mlapinskix's comments.
@regisss Addressed the review comments. Can you review and merge?
LGTM!
https://github.com/microsoft/DeepSpeed/commits/master/deepspeed/sequence/layer.py
With the latest changes in DeepSpeed's DistributedAttention, we need to pass batch_dim_idx and rotary_pos_emb to DistributedAttention. This PR adds batch_dim_idx in line with those upstream changes.
deepspeedai/DeepSpeed@ffe0af2
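Below is a minimal sketch, assuming the DistributedAttention interface from the linked DeepSpeed commit (deepspeed.sequence.layer). It is not the exact patch in this PR: `local_attn`, the tensor shapes, and the single-rank process group are hypothetical placeholders, and real sequence parallelism uses a group spanning several ranks that DeepSpeed sets up at initialization.

```python
# Sketch of calling the updated DistributedAttention with batch_dim_idx.
import torch
import torch.distributed as dist
import torch.nn.functional as F
from deepspeed.sequence.layer import DistributedAttention

def local_attn(q, k, v, *args, **kwargs):
    # Stand-in for the model's core attention (e.g. the Llama SDPA path).
    # SDPA expects [batch, heads, seq, head_dim], so permute in and back out.
    q, k, v = (t.permute(0, 2, 1, 3) for t in (q, k, v))
    out = F.scaled_dot_product_attention(q, k, v)
    return out.permute(0, 2, 1, 3)

# Single-rank group for illustration only; in practice this is the
# sequence-parallel process group created by DeepSpeed.
dist.init_process_group("gloo", init_method="tcp://127.0.0.1:29500",
                        rank=0, world_size=1)
seq_group = dist.new_group(ranks=[0])

dist_attn = DistributedAttention(local_attn, seq_group)

# Tensors laid out as [batch, seq, heads, head_dim], so the batch dimension
# is index 0; batch_dim_idx tells DistributedAttention which dimension is the
# batch so its all-to-all exchanges can reshape the tensors correctly.
q = k = v = torch.randn(2, 1024, 8, 64)
out = dist_attn(q, k, v, batch_dim_idx=0)
print(out.shape)  # torch.Size([2, 1024, 8, 64])
```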
Fixes # (issue)
Before submitting