An idea occurred to me while writing the Discussion section of the recent paper draft: a way to grow the dataset and eventually reach full caption coverage. The idea is this:
- Train the model until its caption score on the validation set passes some threshold. A value like 0.8 seems reasonable, but this could be a setting we experiment with.
- At this point, most model outputs look like levels, though they may not match their input captions. We already caption the samples produced during training automatically (we may need to generate samples more often than every 20 epochs).
- Inspect the automatic captions on these samples; whenever a caption does not already exist in the training data, add both the generated scene and its caption to the training set.
- Repeat this for some amount of time and watch the training set grow to cover a wider range of possible captions, which should yield better text control of the model.
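The loop above could be sketched roughly as follows. This is only a toy illustration of the proposed procedure, not code from this repo: `train_one_round`, `validation_caption_score`, and `generate_and_caption` are hypothetical placeholders standing in for the real training, evaluation, and automatic-captioning steps.

```python
import random

CAPTION_SCORE_THRESHOLD = 0.8  # tunable, as proposed above

def train_one_round(model, train_set):
    # Placeholder for real training: just nudge the validation score up.
    model["val_score"] = min(1.0, model["val_score"] + 0.1)

def validation_caption_score(model):
    # Placeholder for scoring generated samples against their captions.
    return model["val_score"]

def generate_and_caption(model, n=3):
    # Placeholder for sampling scenes and captioning them automatically.
    for i in range(n):
        caption = f"a level with {random.randint(1, 4)} pipes"
        yield f"scene-{i}", caption

def grow_dataset(train_set, model, rounds=8):
    """Self-training loop: once the model passes the caption-score
    threshold, auto-caption its samples and keep any (scene, caption)
    pair whose caption is new to the training set."""
    for _ in range(rounds):
        train_one_round(model, train_set)
        if validation_caption_score(model) < CAPTION_SCORE_THRESHOLD:
            continue  # not yet producing level-like output; keep training
        for scene, caption in generate_and_caption(model):
            if caption not in train_set:  # novel caption -> add the pair
                train_set[caption] = scene
    return train_set

model = {"val_score": 0.5}
data = grow_dataset({"a level with 1 pipes": "seed-scene"}, model)
print(len(data))
```

One design question this surfaces: whether to check only for exact caption matches, as above, or to also skip captions that are near-duplicates of existing ones.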