@drisspg drisspg closed this Jan 5, 2026
@drisspg drisspg reopened this Jan 5, 2026
drisspg added a commit to drisspg/flash-attention that referenced this pull request Jan 5, 2026
Adds block-sparse support to SM90 backward pass:
- Block-sparse iteration with process_tile, get_block_sparse_iteration_info_bwd
- m_block_safe clamping for loads when subtile_factor>1
- Zero-fill for KV tiles with no Q blocks
- dQaccum_store with blocksparse_tensors parameter
- bwd_subtile_factor=2 for SM90 block sparsity (matches BlockMask 128 granularity)
- Tile size m_block_size=64 when using block sparsity

stack-info: PR: Dao-AILab#2136, branch: drisspg/stack/7
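
The subtiling and clamping described in the commit message can be sketched in plain Python. This is only an illustration under stated assumptions, not the kernel code: `plan_kv_tile` and `subtile_m_blocks` are hypothetical helpers, and the mapping `m_block = sparse_q_block * subtile_factor + s` is an assumed indexing scheme. Only the name `m_block_safe`, the zero-fill behavior, and the values 64 and 2 come from the PR.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def subtile_m_blocks(sparse_q_block, subtile_factor, num_m_blocks):
    """Expand one sparse Q block into its kernel-sized subtiles.

    Returns (m_block, m_block_safe, in_bounds) per subtile. m_block_safe is
    clamped so a load never indexes past the last real tile; in_bounds says
    whether the subtile holds real rows (assumed indexing, see lead-in).
    """
    out = []
    for s in range(subtile_factor):
        m_block = sparse_q_block * subtile_factor + s
        m_block_safe = min(m_block, num_m_blocks - 1)
        out.append((m_block, m_block_safe, m_block < num_m_blocks))
    return out


def plan_kv_tile(sparse_q_blocks, subtile_factor, num_m_blocks):
    """Decide what to do for one KV tile given its list of sparse Q blocks."""
    if not sparse_q_blocks:
        # No Q block attends to this KV tile: its dK/dV are just zeros.
        return "zero_fill", []
    tiles = []
    for q in sparse_q_blocks:
        tiles.extend(subtile_m_blocks(q, subtile_factor, num_m_blocks))
    return "process", tiles


# seqlen_q = 320 with m_block_size = 64 gives 5 kernel tiles, but 3 sparse
# blocks of 128 rows, so the last sparse block has one out-of-range subtile.
num_m_blocks = ceil_div(320, 64)
action, tiles = plan_kv_tile([2], subtile_factor=2, num_m_blocks=num_m_blocks)
empty_action, empty_tiles = plan_kv_tile([], subtile_factor=2, num_m_blocks=num_m_blocks)
```

In this example the second subtile of sparse block 2 would be tile 5, which does not exist, so its load index is clamped to tile 4 and it is flagged as out of bounds.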

use_block_sparsity = block_sparse_tensors is not None

# For SM90 with block sparsity, use tile_m=64 with subtile_factor=2 to match
Collaborator Author

This was mostly to find the GCD of an m_block_size that would fit and the base block_m of 128 (used by fwd and as the block-sparse block size), so the sparse blocks can be subtiled.
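
A rough sketch of that GCD reasoning, assuming the sparse block size along M is 128 and the candidate tile size is whatever fits the kernel's resource budget. `subtile_plan` is a hypothetical helper, not a function from the repository.

```python
import math


def subtile_plan(sparse_block_m, candidate_m_block_size):
    """Return (m_block_size, subtile_factor) for a candidate tile size.

    The usable tile size is the GCD of the sparse block size and the
    candidate, so a whole number of tiles always covers one sparse block.
    """
    m_block_size = math.gcd(sparse_block_m, candidate_m_block_size)
    subtile_factor = sparse_block_m // m_block_size
    return m_block_size, subtile_factor


# The configuration from this PR: 64-row tiles under 128-row sparse blocks.
sm90_plan = subtile_plan(128, 64)

# A candidate that does not divide 128 (illustrative only) falls back to a
# smaller common divisor and therefore needs more subtiles per sparse block.
odd_plan = subtile_plan(128, 80)
```

With a candidate of 64 the GCD is 64 itself, which gives the subtile factor of 2 used here.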

expected_count_shape, expected_index_shape = get_block_sparse_expected_shapes_bwd(
batch_size, num_head, seqlen_q, seqlen_k,
- m_block_size, n_block_size, subtile_factor,
+ m_block_size, n_block_size, bwd_subtile_factor,
Collaborator Author

NB: bwd_subtile_factor is always 2 for now, but a follow-up could make it larger to allow smaller tile sizes.
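
To illustrate why a larger factor would permit smaller tiles, here is a sketch of how the expected shapes might depend on `m_block_size * subtile_factor`. This is not the real `get_block_sparse_expected_shapes_bwd`: `expected_shapes_sketch` is a hypothetical stand-in, and the layout (one count per KV block, one index slot per sparse Q block) is an assumption.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def expected_shapes_sketch(batch_size, num_head, seqlen_q, seqlen_k,
                           m_block_size, n_block_size, subtile_factor):
    """Hypothetical shape check for backward block-sparse tensors.

    Assumes the mask granularity along Q is m_block_size * subtile_factor
    and that counts/indices are stored per KV block.
    """
    sparse_block_m = m_block_size * subtile_factor
    num_q_sparse_blocks = ceil_div(seqlen_q, sparse_block_m)
    num_n_blocks = ceil_div(seqlen_k, n_block_size)
    count_shape = (batch_size, num_head, num_n_blocks)
    index_shape = (batch_size, num_head, num_n_blocks, num_q_sparse_blocks)
    return count_shape, index_shape


# Current setting: 64-row tiles with a factor of 2 (128-row sparse blocks).
shapes_64x2 = expected_shapes_sketch(2, 4, 1024, 2048, 64, 128, 2)

# A possible follow-up: 32-row tiles with a factor of 4 keep the same
# 128-row granularity, so the expected shapes would not change.
shapes_32x4 = expected_shapes_sketch(2, 4, 1024, 2048, 32, 128, 4)
```

Under these assumptions only the product of tile size and factor matters to the mask tensors, so the tile size can shrink as long as the factor grows to compensate.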

@drisspg drisspg marked this pull request as draft January 5, 2026 05:42
@drisspg drisspg changed the base branch from drisspg/stack/1 to main January 5, 2026 05:42
@drisspg drisspg changed the base branch from main to drisspg/stack/1 January 5, 2026 05:42
@drisspg drisspg marked this pull request as ready for review January 5, 2026 05:43
@drisspg drisspg marked this pull request as draft January 5, 2026 17:04
@drisspg drisspg changed the base branch from drisspg/stack/1 to main January 5, 2026 17:04
@drisspg drisspg changed the base branch from main to drisspg/stack/1 January 5, 2026 17:04
@drisspg drisspg marked this pull request as ready for review January 5, 2026 17:04
@drisspg drisspg marked this pull request as draft January 5, 2026 19:08
@drisspg drisspg changed the base branch from drisspg/stack/1 to main January 5, 2026 19:08
@drisspg drisspg changed the base branch from main to drisspg/stack/1 January 5, 2026 19:08
@drisspg drisspg marked this pull request as ready for review January 5, 2026 19:08
@drisspg drisspg marked this pull request as draft January 7, 2026 01:37
@drisspg drisspg marked this pull request as ready for review January 7, 2026 01:38
@drisspg drisspg marked this pull request as draft January 9, 2026 03:02
@drisspg drisspg marked this pull request as ready for review January 9, 2026 03:03
@drisspg drisspg marked this pull request as draft January 9, 2026 03:13
@drisspg drisspg marked this pull request as ready for review January 9, 2026 03:14
drisspg added a commit to drisspg/flash-attention that referenced this pull request Jan 9, 2026
stack-info: PR: Dao-AILab#2136, branch: drisspg/stack/7
@drisspg drisspg marked this pull request as draft January 9, 2026 23:19
@drisspg drisspg marked this pull request as ready for review January 9, 2026 23:20
@drisspg drisspg marked this pull request as draft January 9, 2026 23:24
@drisspg drisspg marked this pull request as ready for review January 9, 2026 23:24
drisspg added a commit to drisspg/flash-attention that referenced this pull request Jan 9, 2026
stack-info: PR: Dao-AILab#2136, branch: drisspg/stack/7
@drisspg drisspg marked this pull request as draft January 9, 2026 23:39
@drisspg drisspg marked this pull request as ready for review January 9, 2026 23:39
drisspg added a commit to drisspg/flash-attention that referenced this pull request Jan 10, 2026
stack-info: PR: Dao-AILab#2136, branch: drisspg/stack/7
@drisspg drisspg marked this pull request as draft January 10, 2026 00:29
@drisspg drisspg marked this pull request as ready for review January 10, 2026 00:29
stack-info: PR: #2136, branch: drisspg/stack/7
@drisspg drisspg marked this pull request as draft January 10, 2026 00:50
@drisspg drisspg marked this pull request as ready for review January 10, 2026 00:50
@drisspg drisspg merged commit 27a3b54 into main Jan 10, 2026
1 check passed
@drisspg drisspg deleted the drisspg/stack/7 branch January 12, 2026 04:46