This repository was archived by the owner on Jan 1, 2025. It is now read-only.

Division by zero caused by mask operation #243

@Chenyang-1024


If no pixel in the input image belongs to the q-th class, then the mask generated for masked attention has attn_mask[b, q, :] = True at every position. nn.MultiheadAttention converts these True entries to float('-inf'), so the whole row of attention scores for query q becomes -inf. When Softmax(..., dim=-1) is then applied to compute the attention map, every term in that row is effectively 0, the normalizing sum is 0, and the resulting division by zero produces NaN. : (
I ran into this problem when applying masked attention to my semantic segmentation task. : (
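
The failure can be reproduced without the full model. The sketch below is only an illustration under stated assumptions: it uses a plain `torch.softmax` over toy scores instead of `nn.MultiheadAttention`, and the helper names `masked_softmax` and `unmask_fully_masked_rows` are hypothetical, not part of this repository or of PyTorch. It shows that a fully masked row yields NaN, and one possible workaround: un-masking any row in which every key is blocked, so that query attends everywhere instead.

```python
import torch


def masked_softmax(scores: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Softmax over the last dim, with mask == True meaning 'blocked'."""
    return torch.softmax(scores.masked_fill(mask, float("-inf")), dim=-1)


def unmask_fully_masked_rows(attn_mask: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: clear rows whose entries are all True.

    attn_mask is a bool tensor of shape [..., Q, K]. A row that blocks
    every key would produce NaN after softmax, so it is reset to all False.
    """
    fully_masked = attn_mask.all(dim=-1, keepdim=True)  # [..., Q, 1]
    return attn_mask & ~fully_masked


# Toy example: 1 batch element, 2 queries, 3 keys (pixels).
scores = torch.zeros(1, 2, 3)
attn_mask = torch.tensor([[[True, True, True],      # query 0: no pixel in its class
                           [False, True, False]]])  # query 1: partially masked

before = masked_softmax(scores, attn_mask)
after = masked_softmax(scores, unmask_fully_masked_rows(attn_mask))

print(before[0, 0])  # all NaN: the row is entirely -inf
print(after[0, 0])   # uniform attention over the 3 keys
```

Partially masked rows are left unchanged, so only the degenerate queries are affected. Whether letting such a query attend to the whole image is acceptable depends on the task, so treat this as one option rather than the definitive fix.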
