Description
Or ABC. What is the best way to do this? Could we choose the 3 experiments with the highest BLEU scores (or the ones we think are the best candidates) and auto-create an AB test to determine which of the 3 model configurations is best? This could include:
- An investigation has completed runs with the pretranslations saved
- A subset of the pretranslations is used for the AB testing (book selection?)
- Either a Google Sheet or potentially a Google Form is auto-created with the 3 top pretranslation sets (randomly sorted - 6 verses?) and sent to one or more translators
- Potentially both - we could create the Google Sheet and then generate the form from the sheet. The sheet could then be re-populated with the results.
- After the form is filled out, the results appear, de-randomized, in a Google Sheet or somewhere else
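
As a starting point for discussion, the select / randomize / de-randomize steps above could look roughly like the sketch below. It is only an illustration under assumptions: the function names (`select_top`, `build_ab_items`, `derandomize`), the input shapes (a dict of BLEU scores per experiment, a dict of pretranslated verse text per experiment), and the fixed seed are all hypothetical and not part of any existing code. Creating the Google Sheet or Form is left out entirely.

```python
import random
import string
from collections import Counter


def select_top(bleu_scores, k=3):
    """Return the k experiment names with the highest BLEU score.

    bleu_scores is an assumed shape: {experiment_name: bleu}.
    """
    return sorted(bleu_scores, key=bleu_scores.get, reverse=True)[:k]


def build_ab_items(pretranslations, verse_refs, seed=0):
    """Build one blind comparison item per verse.

    pretranslations is an assumed shape: {experiment_name: {verse_ref: text}}.
    Returns (items, key): items are what a translator would see, key maps
    (verse_ref, label) back to the experiment so results can be de-randomized.
    """
    rng = random.Random(seed)  # fixed seed so the ordering is reproducible
    items, key = [], {}
    for ref in verse_refs:
        experiments = list(pretranslations)
        rng.shuffle(experiments)  # different order for every verse
        options = {}
        for label, experiment in zip(string.ascii_uppercase, experiments):
            options[label] = pretranslations[experiment][ref]
            key[(ref, label)] = experiment
        items.append({"verse": ref, "options": options})
    return items, key


def derandomize(responses, key):
    """Count votes per experiment.

    responses is an assumed shape: iterable of (verse_ref, chosen_label).
    """
    return Counter(key[(ref, label)] for ref, label in responses)
```

The `key` mapping would need to be stored somewhere the translators cannot see (a hidden sheet tab, for example), since it is the only thing that links the shuffled labels back to the experiments.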