Skip to content

Conversation

kuacakuaca
Copy link
Collaborator

pretty much the same as in torchaudio.models.conformer.

  1. Relative positional encoding is missing, i'm not sure if it should be added and if yes, do we implement it by ourselves or just import it from somewhere else.
  2. Should more parameters be added, e.g. key_dim, value_dim?

Copy link
Member

@albertz albertz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this at all?

Again, as in the other PRs (#4, #6), I would keep consistent to our existing implementation (see rwth-i6/returnn_common#58 for the initial code, but then also check the current code, also in the RETURNN frontend, also check with @mmz33). We discussed many things about the naming of modules and also their behavior.

In this specific case, you would use a standard generic RelPosSelfAttention/RelPositionMultiHeadedAttention. Or the standard PyTorch MultiheadAttention.

Also, this module here would not add layer-norm, or the residual connection, or dropout. That would all be handled by the outer module (ConformerEncoderLayer).

So, rather than reimplementing the already existing MultiheadAttention, what is missing is sth like RelPositionMultiHeadedAttention. We have that in RETURNN-common / RETURNN-frontend as well already, so I would suggest to just follow that implementation, unless there are good reasons to do sth differently. But I would keep at least the naming, arguments and variable names consistent.

@SimBe195
Copy link
Contributor

SimBe195 commented May 9, 2023

Some tests should also be added, similar to #4.

Copy link
Contributor

@Atticus1806 Atticus1806 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see comments. Some minor things, but also some more fundamental.
Regarding the failing test I think you need to add torch to the requirements.txt since its not done in the main yet.

Copy link
Contributor

@michelwi michelwi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only one comment left from my side

@curufinwe curufinwe merged commit 83af04d into main May 18, 2023
@christophmluscher christophmluscher deleted the add_conformer_mhsa_part branch June 16, 2023 15:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants