
Tests for self attention using CumConcatLayer #590


Merged: 1 commit into master, Sep 22, 2021

Conversation

Zettelkasten
Member

Some tests for #391.
I haven't run any of this yet (so it is most likely buggy), because there is no implementation yet.
It still needs CumConcatLayer from #589, and also the common arguments for DotLayer from #569.
Also, some way to set dim tags (e.g. set_dim_tags: Dict[str, DimensionTag] for ReinterpretDataLayer; I'm not sure, didn't we add this already somewhere?).

We might also want more tests than this: e.g. some with positional encoding.
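For context, a rough sketch of what such a generalized self-attention net dict could look like inside a RecLayer. This is only illustrative: the exact CumConcatLayer and DotLayer options from #589/#569 might end up different, and the dim tag and layer names here are made up.

```python
from returnn.tf.util.data import DimensionTag

# New spatial dim created by CumConcatLayer for the accumulated keys/values
# (name and construction here are illustrative only).
kv_accum_dim = DimensionTag(kind=DimensionTag.Types.Spatial, description="kv-accum-dim")

self_att_rec_unit = {
  # Per-frame projections; "data:source" is the current frame inside the RecLayer.
  "q": {"class": "linear", "from": "data:source", "activation": None, "with_bias": False, "n_out": 64},
  "k": {"class": "linear", "from": "data:source", "activation": None, "with_bias": False, "n_out": 64},
  "v": {"class": "linear", "from": "data:source", "activation": None, "with_bias": False, "n_out": 64},
  # Accumulate keys/values over all frames so far (CumConcatLayer, #589).
  "k_accum": {"class": "cum_concat", "from": "k", "new_dim": kv_accum_dim},
  "v_accum": {"class": "cum_concat", "from": "v", "new_dim": kv_accum_dim},
  # Attention energies and weights over the accumulated axis
  # (DotLayer with the common dim-tag arguments, #569).
  "energy": {"class": "dot", "from": ["q", "k_accum"], "reduce": "static:-1"},
  "att_weights": {"class": "softmax_over_spatial", "from": "energy", "axis": kv_accum_dim},
  # Weighted sum over the accumulated values.
  "att": {"class": "dot", "from": ["att_weights", "v_accum"], "reduce": kv_accum_dim},
  "output": {"class": "copy", "from": "att"},
}

network = {
  "self_att": {"class": "rec", "from": "data", "unit": self_att_rec_unit},
  "output": {"class": "copy", "from": "self_att"},
}
```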

@Zettelkasten Zettelkasten requested review from albertz and a team as code owners August 26, 2021 14:44
@Zettelkasten Zettelkasten force-pushed the frithjof-generalized-self-att-tests branch from 66cb35e to 2337f50 on August 26, 2021 15:02
@Zettelkasten Zettelkasten marked this pull request as draft August 26, 2021 15:03
@albertz albertz force-pushed the albert-generalized-self-att branch 10 times, most recently from 2384c1b to cf604e1 on September 1, 2021 13:30
@albertz albertz force-pushed the albert-generalized-self-att branch 4 times, most recently from 626c35b to 75b72db on September 7, 2021 22:19
@albertz albertz force-pushed the albert-generalized-self-att branch 8 times, most recently from 6deb705 to a406d3f on September 12, 2021 00:40
Base automatically changed from albert-generalized-self-att to master September 12, 2021 00:59
@Zettelkasten Zettelkasten force-pushed the frithjof-generalized-self-att-tests branch 2 times, most recently from 405218b to dc58c5a on September 18, 2021 16:20
@Zettelkasten
Member Author

This should be ready now.

I left in test_self_attention_optimize_out, which essentially does the same as test_reclayer_optimize_out_cum_concat_gen_self_att. Should we remove that?

@Zettelkasten Zettelkasten marked this pull request as ready for review September 18, 2021 16:26
@albertz
Member

albertz commented Sep 18, 2021

There is also test_generalized_non_rec_self_attention already now.

I did not check in detail: is this really exactly the same, or just similar? What is the difference?
If there is some difference which makes sense for another test, then leave it. Otherwise remove it.

@Zettelkasten Zettelkasten force-pushed the frithjof-generalized-self-att-tests branch from dc58c5a to 90e780c on September 22, 2021 15:09
@Zettelkasten
Member Author

Zettelkasten commented Sep 22, 2021

> There is also test_generalized_non_rec_self_attention already now.

Oh okay, I missed that.

The test case in this PR also checks that SelfAttentionLayer does the same as the generalized self attention for the recurrent case (not just the non-recurrent one). Maybe we want to check that too.
But maybe we can just close this PR: it's a lot of new code that was originally meant to be much more general. I think the test cases we already have are nicer because they use an explicit net dict and seem easier to debug if something breaks in the future.
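If we keep that check for the recurrent case, it conceptually boils down to something like the sketch below. The helper `eval_net` and the net dict names are hypothetical; the actual tests in this PR are structured differently.

```python
import numpy

# eval_net stands in for: build the TF network from a net dict, load the
# shared parameters, and run a session on the same input batch.
out_ref = eval_net(net_dict_with_self_attention_layer, params=params, inputs=x)
out_gen = eval_net(net_dict_with_cum_concat_self_att, params=params, inputs=x)
numpy.testing.assert_allclose(out_ref, out_gen, rtol=1e-5, atol=1e-6)
```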

@albertz
Member

albertz commented Sep 22, 2021

Hm, I don't know. It might be nice as an example. I would say leave it.

@albertz albertz merged commit 9e088cd into master Sep 22, 2021
@albertz albertz deleted the frithjof-generalized-self-att-tests branch September 22, 2021 17:34