Question regarding Equation 5 Random Mixing in the Paper #13

@f-fuchs

Should the matrix multiplication not be swapped?

$$ \mathrm{RandomMixing}(X) = X W_R \quad \Rightarrow \quad \mathrm{RandomMixing}(X) = W_R X $$

I think the dimensions don't work for the original equation, because it multiplies an $N \times C$ matrix ($X$) by an $N \times N$ matrix ($W_R$), and that product is only defined when $C = N$. Secondly, I think the source code actually multiplies the input matrix by the weight matrix from the left:

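```python
import torch
import torch.nn as nn


class RandomMixing(nn.Module):
    def __init__(self, num_tokens=196, **kwargs):
        super().__init__()
        # Fixed (non-trainable) N x N mixing matrix; each row sums to 1.
        self.random_matrix = nn.parameter.Parameter(
            data=torch.softmax(torch.rand(num_tokens, num_tokens), dim=-1),
            requires_grad=False)

    def forward(self, x):
        B, H, W, C = x.shape
        x = x.reshape(B, H*W, C)  # (B, N, C) with N = H*W
        # out[b, m, c] = sum_n random_matrix[m, n] * x[b, n, c], i.e. W_R X
        x = torch.einsum('mn, bnc -> bmc', self.random_matrix, x)
        x = x.reshape(B, H, W, C)
        return x
```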
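
Both points can be checked numerically. The following is only a minimal sketch, not code from the repository: the sizes `B`, `H`, `W`, `C` and the random input `X` are hypothetical values chosen for illustration, and `W_R` is built the same way as `random_matrix` in the snippet. It compares the einsum against the left multiplication $W_R X$ and then tries the right multiplication $X W_R$.

```python
import torch

# Hypothetical sizes for illustration; N = H * W = 196 matches the default num_tokens.
B, H, W, C = 2, 14, 14, 64
N = H * W

torch.manual_seed(0)
W_R = torch.softmax(torch.rand(N, N), dim=-1)  # built like random_matrix in the snippet
X = torch.rand(B, N, C)                        # stand-in for the reshaped input

# The contraction used in forward(): out[b, m, c] = sum_n W_R[m, n] * X[b, n, c]
out_einsum = torch.einsum('mn, bnc -> bmc', W_R, X)

# The same contraction as a left multiplication W_R X, broadcast over the batch
out_left = torch.matmul(W_R, X)  # (N, N) @ (B, N, C) -> (B, N, C)
same = torch.allclose(out_einsum, out_left, atol=1e-5)

# The right multiplication X W_R needs the last dim of X (C) to equal N
try:
    torch.matmul(X, W_R)
    right_ok = True
except RuntimeError:
    right_ok = False

print(same, right_ok, tuple(out_einsum.shape))
```

With these sizes the einsum and `W_R @ X` should agree, while `X @ W_R` should fail because $C \neq N$. The form $X W_R$ would only type-check if $X$ were stored as $C \times N$, in which case it corresponds to $(W_R^\top X^\top)^\top$ in the $N \times C$ convention.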
