Added Synonym insertion #160

Merged
kaustubhdhole merged 11 commits into GEM-benchmark:main from dsfsi:synonym_insertion
Oct 28, 2021

Conversation

@vukosim
Contributor

@vukosim vukosim commented Jul 25, 2021

No description provided.

@kaustubhdhole
Collaborator

This is already implemented and is on the verge of merging: #51

@vukosim
Contributor Author

vukosim commented Jul 26, 2021

@JosephSefara please see

@JosephSefara
Contributor

@kaustubhdhole This implementation is different from #51 .

@kaustubhdhole
Collaborator

Thank you for the clarification @JosephSefara. I think these changes look good to me.

@kaustubhdhole kaustubhdhole self-requested a review August 26, 2021 00:02
@kaustubhdhole
Collaborator

Please pull main into your branch once.

@kaustubhdhole
Collaborator

Okay, I just have one comment: do you think it might be better to write synonyms beside the original words so that the sentence is still well-formed?

https://github.com/dsfsi/textaugment
"""
tasks = [TaskType.TEXT_CLASSIFICATION, TaskType.TEXT_TO_TEXT_GENERATION]
languages = ["en"]
Collaborator

Also, please add keywords here.

Collaborator

I would also recommend adding a robustness evaluation to your PR so that it can be added to the leaderboard.

Copy link
Copy Markdown
Contributor

  • keywords added
  • README and test.json contain the results of the robustness evaluation for
    • Text Classification
    • Text Generation

@kaustubhdhole
Collaborator

@vukosim @JosephSefara ping!

@kaustubhdhole
Collaborator

Also, one more thing: do you think it would be better to include the synonym within a bracket?

@JosephSefara
Contributor

Okay, I just have one comment: do you think it might be better to write synonyms beside the original words so that the sentence is still well-formed?

@kaustubhdhole, I don't understand your question, but we are already inserting each synonym next to its original word. Augmentation is sometimes about adding noise to a sentence, so the result might not be well-formed but it still retains the original context. For example, stopword removal (#268) removes stop words, so the resulting sentence might not be well-formed either.
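
To make "inserting a synonym next to its original word" concrete, here is a minimal sketch. It is not the code from this PR: the `TOY_SYNONYMS` dictionary, the `insert_synonyms` function, and its `prob` and `seed` parameters are all hypothetical, and the toy dictionary stands in for whatever synonym source the transformation actually uses.

```python
import random

# Hypothetical toy lexicon, used only for illustration.
TOY_SYNONYMS = {
    "happy": ["glad", "cheerful"],
    "quick": ["fast", "rapid"],
    "car": ["automobile"],
}


def insert_synonyms(sentence, synonyms=TOY_SYNONYMS, prob=1.0, seed=0):
    """Keep every original word and, with probability `prob`,
    append one of its synonyms directly after it."""
    rng = random.Random(seed)  # seeded so repeated runs give the same output
    out = []
    for token in sentence.split():
        out.append(token)  # the original word is never dropped
        candidates = synonyms.get(token.lower())
        if candidates and rng.random() < prob:
            out.append(rng.choice(candidates))
    return " ".join(out)


print(insert_synonyms("the car stopped"))
```

The output keeps every original token in order, so the original context survives even though the sentence is no longer strictly well-formed.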

@JosephSefara
Contributor

Also, one more thing: do you think it would be better to include the synonym within a bracket?

@kaustubhdhole
That can be done, but it is not a good idea, since most people clean their text before augmentation.

  • Text cleaning includes removal of special characters including brackets.

Why remove special characters?

  • They do not carry any meaning and they create noise depending on the task being done.
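
The point about brackets can be illustrated with a small sketch. The `clean_text` function below is a hypothetical cleaning step, not part of this PR or of any particular library; it assumes that cleaning means keeping only letters, digits, and whitespace.

```python
import re


def clean_text(text):
    """Hypothetical cleaner: drop every character that is not a letter,
    digit, or whitespace, then collapse runs of whitespace."""
    stripped = re.sub(r"[^A-Za-z0-9\s]", "", text)
    return re.sub(r"\s+", " ", stripped).strip()


bracketed = "the quick (fast) car (automobile)"
plain = "the quick fast car automobile"

# After cleaning, the bracketed and unbracketed variants are identical.
print(clean_text(bracketed) == clean_text(plain))
```

Under that assumption, putting the synonym in brackets makes no difference once such a cleaner has run over the text.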

@kaustubhdhole kaustubhdhole merged commit 6fecd62 into GEM-benchmark:main Oct 28, 2021

3 participants