Add site training torchdataset #301
Comments
It would be good to also add datetime features here, or should these be added to the batch?
I'm not sure if we need this/if it makes sense to go here. We proposed to make batch (sample) classes in openclimatefix/ocf-data-sampler#71. These classes should implement Then in PVNet all we need is something like
I guess the idea was to instead of having it in I agree openclimatefix/ocf-data-sampler#71 will be a nice tidy up for all of these.
Yeh I see what you mean, but I'm not sure where the clean divide is. In PVNet we have the I have also been thinking about cross-validation and how we can split the samples after we've saved them. Currently in our backtest we have a model which was trained on 2019-2022 and validated on 2023. We make predictions on 2019-2023 with it for the backtest, so the backtest results are overfit. If we want to run cross-validation for the backtest, we would need to filter the pre-saved training samples by time. That means we'd need to string together
I agree something should be done with cross-validation. I think we should move the Dataset for loading presaved samples either all here, or all into PVNet. I think the current one for PVNet UK regional is in PVNet, and this is much simpler as all it does is load the If we decide to keep it in PVNet, then we should at least put an example in here, just to show how we can use it. Maybe something we can chat about in one of the first ML meetings. Good to think about the positives and negatives, etc.
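For the cross-validation point above, here is a minimal sketch of filtering pre-saved samples by time. It assumes each saved sample has a known forecast init time available from some index; the `sample_times` mapping and the `split_by_year` helper are hypothetical names used only for illustration, and holding out one calendar year per fold is just one possible scheme, not something agreed in this thread.

```python
from datetime import datetime
from typing import Dict, List, Tuple


def split_by_year(
    sample_times: Dict[str, datetime], val_year: int
) -> Tuple[List[str], List[str]]:
    """Split sample paths into (train, val), holding out one calendar year.

    sample_times maps a sample file path to its init time (hypothetical index).
    """
    train = sorted(p for p, t in sample_times.items() if t.year != val_year)
    val = sorted(p for p, t in sample_times.items() if t.year == val_year)
    return train, val


# Toy index: one sample per year, 2019-2023
index = {f"sample_{y}.nc": datetime(y, 6, 1) for y in range(2019, 2024)}

# One fold per held-out year, so every year is predicted by a model
# that did not see it during training
folds = {year: split_by_year(index, year) for year in range(2019, 2024)}
```

With something like this, the backtest for each year could use the model trained on the other years, which would avoid the overfit backtest results described above.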
I have a feeling that once we have created the sample class which has a
We need a torch dataset that loads premade batches.
This could live in PVNet, but I think it makes sense for it to live in this repo.
The idea is to create a torch dataset that loads netCDF samples and converts them to torch tensors, ready to be used by PVNet.
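As an illustration of that idea, here is a rough sketch of what such a dataset might look like. It is not the code originally proposed in this issue: the class name `PresavedSamplesDataset`, the helper `load_netcdf_as_numpy`, the `load_fn` parameter and the `*.nc` file layout are all assumptions, and it assumes every data variable in a sample is numeric.

```python
from pathlib import Path
from typing import Callable, Dict

import numpy as np
import torch
from torch.utils.data import Dataset


def load_netcdf_as_numpy(path: Path) -> Dict[str, np.ndarray]:
    """Read one saved sample and return its data variables as numpy arrays."""
    import xarray as xr  # imported lazily so the class itself only needs torch

    ds = xr.load_dataset(path)  # loads into memory and closes the file
    return {name: ds[name].values for name in ds.data_vars}


class PresavedSamplesDataset(Dataset):
    """Hypothetical dataset that serves pre-saved netCDF samples as tensors."""

    def __init__(
        self,
        sample_dir: str,
        load_fn: Callable[[Path], Dict[str, np.ndarray]] = load_netcdf_as_numpy,
    ):
        # Assumed layout: one sample per .nc file in a single directory
        self.sample_paths = sorted(Path(sample_dir).glob("*.nc"))
        self.load_fn = load_fn

    def __len__(self) -> int:
        return len(self.sample_paths)

    def __getitem__(self, idx: int) -> Dict[str, torch.Tensor]:
        arrays = self.load_fn(self.sample_paths[idx])
        # Assumes numeric dtypes; datetime variables would need converting first
        return {key: torch.as_tensor(value) for key, value in arrays.items()}
```

A dataset like this can be wrapped in a standard `torch.utils.data.DataLoader`, whose default collate function stacks dictionaries of equally shaped tensors into batches.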
Credit to @Sukh-P for this suggestion
And then in PVnet, we would have to update it to something like this