Conversation

@YizhouZ (Collaborator) commented Aug 11, 2025

First PR for cutlass chunk_prefill

jikunshang and others added 8 commits August 1, 2025 00:59

* add cutlass
* fix import

Signed-off-by: Kunshang Ji <[email protected]>
@YizhouZ force-pushed the kunshang/flash_attn_interface branch from fb1b3ac to 0c93a3b (Aug 29, 2025 03:16)
int head_size;
int max_blocks_per_seq;
int block_size;
bool is_causal;
Please add a placeholder for sink support: s_aux: Optional[torch.Tensor] = None.

if (cuType == CutlassType::half) {
  FMHAKernel<typename chunk_policy::ShapeQK, typename chunk_policy::ShapePV,
             typename chunk_policy::ShapeOutPut, typename chunk_policy::SubgroupLayout,
             PipelineStages, cutlass::half_t, XE_8x16x16_F32F16F16F32_TT>::dispatch(queue, args);
Collaborator:

qq: do we support bf16?
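For reference, a bf16 branch could mirror the half path above. A minimal stand-in sketch of the dtype-dispatch pattern follows; the enum value, tag types, and the XE_8x16x16_F32BF16BF16F32_TT atom named in the comment are assumptions for illustration, not the PR's actual code (the real branch would instantiate FMHAKernel with cutlass::bfloat16_t):

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the dtype enum used in the real dispatch.
enum class CutlassType { half, bfloat16 };

// Toy kernel template standing in for FMHAKernel<..., ElementT, MmaOp>.
template <typename ElementT>
struct FMHAKernelSketch {
  static std::string dispatch() { return ElementT::name(); }
};

struct HalfTag { static std::string name() { return "half"; } };
struct Bf16Tag { static std::string name() { return "bf16"; } };

// Sketch of extending the if/else chain with a bf16 branch.
std::string run(CutlassType cuType) {
  if (cuType == CutlassType::half) {
    return FMHAKernelSketch<HalfTag>::dispatch();
  } else if (cuType == CutlassType::bfloat16) {
    // Assumed: the real code would swap in cutlass::bfloat16_t and the
    // matching XE_8x16x16_F32BF16BF16F32_TT MMA atom here.
    return FMHAKernelSketch<Bf16Tag>::dispatch();
  } else {
    throw std::runtime_error("unsupported cutlass dtype");
  }
}
```

The point of the sketch is only that the unsupported-dtype fallthrough stays in one place while each supported element type gets its own instantiation.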

} else {
  TORCH_INTERNAL_ASSERT(
      false,
      "");
Collaborator:
Please add an error message here describing the unsupported dtype.
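Something along these lines would do; a self-contained sketch (the helper name and message wording are hypothetical, and the real code would keep TORCH_INTERNAL_ASSERT rather than throwing directly):

```cpp
#include <sstream>
#include <stdexcept>

// Hypothetical helper: mimics TORCH_INTERNAL_ASSERT(false, msg) by throwing
// with a message that names the offending dtype instead of an empty string.
[[noreturn]] void fail_unsupported_dtype(int dtype_code) {
  std::ostringstream msg;
  msg << "chunk_prefill: unsupported cutlass dtype (code " << dtype_code
      << "); only half is currently handled";
  throw std::runtime_error(msg.str());
}
```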

@@ -0,0 +1,127 @@
/***************************************************************************************************
Collaborator:
Maybe we can remove this file, since the example was also removed.

    is_causal);

if (return_softmax) {
  auto softmax_lse = torch::empty_like(out);
Collaborator:
It seems this will always return an empty (uninitialized) tensor; please add a FIXME if returning the softmax LSE is not supported for now.
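One way to make the limitation explicit; a stand-alone sketch (function name and optional-return shape are hypothetical, and the real code would work on torch::Tensor rather than a plain vector):

```cpp
#include <optional>
#include <vector>

// Hypothetical sketch: until the softmax LSE is actually computed, returning
// std::nullopt is clearer to callers than handing back an empty tensor.
std::optional<std::vector<float>> maybe_softmax_lse(bool return_softmax) {
  if (!return_softmax) return std::nullopt;
  // FIXME: softmax LSE is not computed by the kernel yet; surface that
  // explicitly here instead of returning an uninitialized buffer.
  return std::nullopt;
}
```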

4 participants