Skip to content

Adapt batch fix #379

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 14, 2025
Merged

Adapt batch fix #379

merged 3 commits into from
May 14, 2025

Conversation

dfulu
Copy link
Member

@dfulu dfulu commented May 13, 2025

Pull Request

Description

There was an error in how we use the _adapt_batch(). Previously we used to edit the batches in-place but this was flagged to change in this issue and fixed in this PR.

When this the _adapt_batch() was updated this introduced an error where the batch was updated for the forward pass of the model, but not for calculating the MAE and for backprop. This meant that when training for an intraday model using batches which go out to day ahead - the model was actually being trained the predict the last 8 hours of the 36 hour window.

Checklist:

  • My code follows OCF's coding style guidelines
  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • I have checked my code and corrected any misspellings

@dfulu dfulu requested a review from peterdudfield May 13, 2025 15:04
@dfulu dfulu added the bug Something isn't working label May 13, 2025
Copy link
Contributor

@peterdudfield peterdudfield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aproved, could you either add a test for this? Or add a github issue to test this?

@dfulu
Copy link
Member Author

dfulu commented May 14, 2025

Aproved, could you either add a test for this? Or add a github issue to test this?

There isn't really an obvious test that could have caught this. The mistake was in the slicing inside the training and validation steps. Those train/val steps were passing but just weren't correctly indexed internally.

We'd need quite a big refactor to unit test this bit, or a really convoluted test to make sure the MAE values returned by the train/val steps were as expected. So I don't think this is worth testing

@dfulu dfulu merged commit 69ba024 into main May 14, 2025
3 checks passed
@dfulu dfulu deleted the adapt_batch_fix branch May 14, 2025 10:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants