Add dry-run data preflight mode in train_model #511
AR10129 wants to merge 2 commits into mllam:main
Thanks for this! A few observations from reading through:
This is just to take some load off the reviewers! I hope you find it constructive. @AR10129 Pardon me if I missed something; I would be grateful to learn!
Thanks for the thorough review, these are fair points. Let me address each one:
I’ll push a follow-up cleanup commit with these changes.
Describe your changes
This PR adds an optional dry-run data preflight path in the training CLI to fail fast on dataset/configuration errors before model and trainer initialization.
The new flag validates one batch from the relevant dataloader(s) and checks batch structure, expected tensor dimensions, finite values, forcing-window consistency, and strictly increasing target times.
This reduces late pipeline failures and debugging time for invalid data/window settings.
No new runtime dependencies are introduced.
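The checks listed above can be illustrated with a minimal sketch. This is not the code from the PR: the function name `preflight_check_batch`, the `PreflightError` class, the 4-item batch layout `(init_states, target_states, forcing, target_times)`, and the rule that the forcing feature dimension equals `num_forcing_vars * (num_past_forcing_steps + num_future_forcing_steps + 1)` are all assumptions made for illustration. NumPy arrays stand in for the tensors a dataloader would yield, and target times are assumed to be numeric timestamps.

```python
import numpy as np


class PreflightError(ValueError):
    """Raised when a batch fails a preflight check (hypothetical name)."""


def preflight_check_batch(batch, num_past_forcing_steps,
                          num_future_forcing_steps, num_forcing_vars):
    """Validate one batch before any model or trainer is built."""
    # 1. Batch structure: assumed 4-item tuple or list.
    if not isinstance(batch, (tuple, list)) or len(batch) != 4:
        raise PreflightError(
            "expected (init_states, target_states, forcing, target_times)"
        )
    init_states, target_states, forcing, target_times = (
        np.asarray(x) for x in batch
    )

    # 2. Expected dimensions: (B, T, N_grid, F) for states and forcing,
    #    (B, T) for target times (assumed layout).
    expected_ndim = {
        "init_states": (init_states, 4),
        "target_states": (target_states, 4),
        "forcing": (forcing, 4),
        "target_times": (target_times, 2),
    }
    for name, (arr, ndim) in expected_ndim.items():
        if arr.ndim != ndim:
            raise PreflightError(f"{name}: expected {ndim} dims, got {arr.ndim}")

    # 3. Finite values in all floating-point inputs.
    for name in ("init_states", "target_states", "forcing"):
        if not np.isfinite(expected_ndim[name][0]).all():
            raise PreflightError(f"{name} contains NaN or Inf")

    # 4. Forcing-window consistency (assumed rule, see lead-in).
    window = num_past_forcing_steps + num_future_forcing_steps + 1
    if forcing.shape[-1] != num_forcing_vars * window:
        raise PreflightError(
            f"forcing feature dim {forcing.shape[-1]} != "
            f"{num_forcing_vars} vars * window {window}"
        )
    if forcing.shape[1] != target_states.shape[1]:
        raise PreflightError("forcing and target_states differ in time steps")

    # 5. Strictly increasing target times along the time axis.
    if target_times.shape[1] > 1 and not (np.diff(target_times, axis=1) > 0).all():
        raise PreflightError("target_times are not strictly increasing")
```

In a training CLI, such a function would be called on the first batch of each relevant dataloader when the dry-run flag is set, and the process would exit before model construction if it raises.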
Issue Link
closes #510
Type of change
Checklist before requesting a review
pull with --rebase option if possible).
Checklist for reviewers
Each PR comes with its own improvements and flaws. The reviewer should check the following:
Author checklist after completed review
reflecting type of change (add section where missing):
Checklist for assignee