An idea occurred to me while writing the Discussion section of the recent paper draft: a way to grow the dataset and eventually reach full caption coverage. The idea is this:
- Train the model until its caption score on the validation set passes some threshold. A value like 0.8 seems reasonable, but this could be a setting we experiment with.
- At this point, most model outputs look like levels, though they may not match their input captions. We already caption the samples produced during training automatically (we may need to generate samples more often than every 20 epochs).
- Inspect the automatic captions on these samples; whenever a caption does not already exist in the training data, add both the generated scene and its caption to the training set.
- Repeat this for some amount of time and watch the training set grow to cover a wider range of possible captions, which should yield better text control of the model.
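The loop above could be sketched roughly as follows. This is only a toy illustration of the proposed procedure, not code from this repo: `train_one_round`, `validation_caption_score`, and `generate_and_caption` are hypothetical placeholders standing in for the real training, evaluation, and automatic-captioning steps.

```python
import random

CAPTION_SCORE_THRESHOLD = 0.8  # tunable, as proposed above

def train_one_round(model, train_set):
    # Placeholder for real training: just nudge the validation score up.
    model["val_score"] = min(1.0, model["val_score"] + 0.1)

def validation_caption_score(model):
    # Placeholder for scoring generated samples against their captions.
    return model["val_score"]

def generate_and_caption(model, n=3):
    # Placeholder for sampling scenes and captioning them automatically.
    for i in range(n):
        caption = f"a level with {random.randint(1, 4)} pipes"
        yield f"scene-{i}", caption

def grow_dataset(train_set, model, rounds=8):
    """Self-training loop: once the model passes the caption-score
    threshold, auto-caption its samples and keep any (scene, caption)
    pair whose caption is new to the training set."""
    for _ in range(rounds):
        train_one_round(model, train_set)
        if validation_caption_score(model) < CAPTION_SCORE_THRESHOLD:
            continue  # not yet producing level-like output; keep training
        for scene, caption in generate_and_caption(model):
            if caption not in train_set:  # novel caption -> add the pair
                train_set[caption] = scene
    return train_set

model = {"val_score": 0.5}
data = grow_dataset({"a level with 1 pipes": "seed-scene"}, model)
print(len(data))
```

One design question this surfaces: whether to check only for exact caption matches, as above, or to also skip captions that are near-duplicates of existing ones.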