Convert "word numbers" to their decimal representations #273
Hi, this PR is still a work in progress; please do not review. (UPDATE: ready now)
Is this still in the do-not-review stage? I believe the deadline for me to submit reviews is the 18th.
Thanks @rteehas, still a WIP, will update shortly!
Tagging @kaustubhdhole to see if I can hold off on reviewing this until it is in its final state.
Thanks -- almost done!
Hey @rteehas, done! Apologies for the delay. Please see the updated code, PR title, and PR comments. Let me know if you have any questions! Perhaps surprisingly, this transformation was rather nontrivial. I hope it will add value as an augmentation!
class WordsToNumbers(SentenceOperation):
    tasks = [TaskType.TEXT_CLASSIFICATION, TaskType.TEXT_TO_TEXT_GENERATION, TaskType.TEXT_TAGGING]
Task TEXT_TAGGING is not applicable here because the transformation changes the number of words (e.g., "I have two hundred fifty books" --> "I have 250 books").
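To make the concern concrete, here is a small sketch of the mismatch. The tag labels and the `tags_align` helper are hypothetical and only illustrate the idea; they are not part of NL-Augmenter.

```python
# Hypothetical per-token tags for the original sentence (labels are illustrative only).
original_tokens = "I have two hundred fifty books".split()
original_tags = ["O", "O", "NUM", "NUM", "NUM", "O"]

# After the transformation, three number words collapse into a single token.
transformed_tokens = "I have 250 books".split()


def tags_align(tokens, tags):
    """Return True when there is exactly one tag per token."""
    return len(tokens) == len(tags)


print(tags_align(original_tokens, original_tags))     # one tag per token
print(tags_align(transformed_tokens, original_tags))  # tags no longer line up
```

Since the original tag sequence cannot be reused for the transformed sentence, a tagging task would need the tags to be re-derived, which this transformation does not do.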
You can add the tasks PARAPHRASE_DETECTION and TEXTUAL_ENTAILMENT.
from text2nums import *
Please add a docstring to the class WordsToNumbers.
## Previous Work
Several webpages exist to do this (as the basic conversion is fairly simple), but each has shortcomings:
- https://www.browserling.com/tools/words-to-numbers cannot handle capital letters
- https://www.dcode.fr/writing-words-numbers does not provide source code
Our code is very loosely adapted from
https://stackoverflow.com/questions/493174/is-there-a-way-to-convert-number-words-to-integers, though our implementation
is more general and handles sentences where only part of the sentence refers to a number.
## What are the limitations of this transformation?
Please add a robustness evaluation section here, as in PR #218.
Our code is very loosely adapted from
https://stackoverflow.com/questions/493174/is-there-a-way-to-convert-number-words-to-integers, though our implementation
is more general and handles sentences where only part of the sentence refers to a number.
Also, you might want to add this to the README and mention in a line or two how your transformation is different: https://github.com/GEM-benchmark/NL-Augmenter/blob/main/transformations/number-to-word/transformation.py
@motiwari would you like to address the above comments?
Thanks for the ping -- I'll address the comments this week.
Hi @kaustubhdhole and @ashish3586, thanks for the comments above -- I've addressed all your comments except adding the robustness evaluation. I just rebased this branch on
@kaustubhdhole @ashish3586 @rteehas @sebastianGehrmann can you take a look at this?
This transformation converts "word numbers" to their decimal representations in sentences, e.g.:

There are three hundred twelve million, five hundred thirty four thousand, six hundred seventy two people in the United States and one in every two is female.
->
There are 312,534,672 people in the United States and 1 in every 2 is female.

This is a rather nontrivial transformation (see code). It is something of a reverse transformation to PR #71 and PR #39.
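For readers who want a feel for the logic, the following is a minimal sketch of this kind of conversion. It is not the code in this PR: the names `words_to_int` and `convert_sentence` are hypothetical, and the comma rule (a comma after a scale word such as "million," is assumed to continue the number) is a simplifying assumption chosen to reproduce the example above.

```python
UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}
SCALES = {"thousand": 10**3, "million": 10**6, "billion": 10**9}
NUMBER_WORDS = set(UNITS) | set(TENS) | set(SCALES) | {"hundred"}


def words_to_int(words):
    """Convert a list of lowercase number words to an integer (hypothetical helper)."""
    total, current = 0, 0
    for word in words:
        if word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "hundred":
            current = (current or 1) * 100
        else:  # scale word: close the current group of up to three digits
            total += (current or 1) * SCALES[word]
            current = 0
    return total + current


def convert_sentence(sentence):
    """Replace runs of number words in a sentence with digits (hypothetical helper)."""
    out, run, pending = [], [], ""

    def flush():
        # Emit the collected number (with thousands separators) plus any punctuation.
        nonlocal pending
        if run:
            out.append(f"{words_to_int(run):,}{pending}")
            run.clear()
        pending = ""

    for token in sentence.split():
        core = token.rstrip(",.;:!?")
        trailing = token[len(core):]
        word = core.lower()  # lowercasing handles capitalized number words
        if word not in NUMBER_WORDS:
            flush()
            out.append(token)
            continue
        run.append(word)
        pending = trailing
        # Assumption: a comma after a scale word ("million,") may continue the
        # number, so the run stays open; any other punctuation ends it.
        if trailing and not (trailing == "," and word in SCALES):
            flush()
    flush()
    return " ".join(out)


print(convert_sentence("I have two hundred fifty books"))
```

Even this sketch leaves out cases a full implementation has to consider: hyphenated forms such as "twenty-one", the "and" in "one hundred and five", and adjacent but separate numbers such as "two three", which this version would merge into one.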