Restructure learning config #692
Conversation
Codecov Report ❌ Patch coverage is

@@           Coverage Diff            @@
##             main     #692    +/-  ##
========================================
- Coverage   45.72%   45.61%   -0.11%
========================================
  Files          54       54
  Lines        8031     8030       -1
========================================
- Hits         3672     3663       -9
- Misses       4359     4367       +8
maurerle left a comment:
Good decision to create a new PR for it.
I think that the learning_config should be renamed if it is only a dict.
Just an idea:
Just an idea: if we always have a learning_role anyway, and this learning_role always has the learning_config, we could also just set the properties on the learning_role directly.
That way we could ditch LearningConfig and have `world.learning_role.trained_policies_save_path` instead of `world.learning_role.learning_config.trained_policies_save_path`?
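To make the trade-off concrete, here is a minimal sketch of the two layouts being discussed. Class names and the default path are illustrative placeholders mirroring the discussion, not the project's actual code:

```python
from dataclasses import dataclass


@dataclass
class LearningConfig:
    # Placeholder default, not the project's real value
    trained_policies_save_path: str = "outputs/policies"


class NestedLearningRole:
    """Current layout: the role holds a config object."""
    def __init__(self):
        self.learning_config = LearningConfig()


class FlatLearningRole:
    """Suggested layout: config values become attributes of the role."""
    def __init__(self, trained_policies_save_path: str = "outputs/policies"):
        self.trained_policies_save_path = trained_policies_save_path


nested = NestedLearningRole()
flat = FlatLearningRole()
# Same value, but the flat variant drops one level of nesting:
assert (nested.learning_config.trained_policies_save_path
        == flat.trained_policies_save_path)
```

The flat variant shortens every access site, at the cost of losing a single object that can be validated and passed around as a unit.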
Training a simulation (02a) on my GPU does not work on this branch; it runs fine on main:
RuntimeError: Expected all tensors to be on the same device, but got mat1 is on cpu, different from other tensors on cuda:0 (when checking argument in method wrapper_CUDA_addmm)
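For context, this error usually means a model's parameters and its input tensors ended up on different devices. A minimal, illustrative sketch of the standard fix (not the actual ASSUME code path) is to move both to the same device before the forward pass:

```python
import torch

# Pick the GPU if one is available, otherwise fall back to CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(4, 2).to(device)  # move parameters to the device
x = torch.randn(1, 4)                     # tensors are created on CPU by default
y = model(x.to(device))                   # move the input too, or addmm raises
assert y.device.type == device.type
```

In the bug above, the condition being discussed apparently skipped the code that moves tensors to the configured device, leaving some of them on CPU.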
@jrasko Will look into the CUDA problem tomorrow! Could you please try removing the `and not self.learning_config.evaluation_mode` in the learning role?
Do you mean renaming it in the yaml files?
Yeah, I don't like the very long nesting either. But I felt the config will be extended in the future, and I wanted to keep everything that is user-settable via the config in one place. It is still unpacked in the learning strategies, so I at least kept it centralized in the learning role. Before, all the settings were passed to the LearningStrategy as kwargs.
@mthede yep, that's why. Removing this condition fixes the error.
…ation options
1. No RL -> no learning_config -> no learning role
2. Single run with loaded RL strategies -> learning_mode: false and trained_policies_load_path is provided -> learning role, but no RL algorithm etc. necessary
3. Training run -> learning_mode: true
   3.1. Training episodes
   3.2. Evaluation episodes (config item evaluation_mode exists, but learning loop overwrites it anyway)
4. Continue learning -> continue_learning: true and trained_policies_load_path is provided
…g_config exists - run pre-commit
de192fb to f2eeab2 (force-push)
- add absolute change to early stopping criterion, have had that on my personal branch for a while now
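An early-stopping criterion with both a relative and an absolute improvement threshold, as this commit message mentions, could look roughly like the following sketch. Function name and threshold values are illustrative assumptions, not the project's actual implementation:

```python
def should_stop(prev_metric: float, curr_metric: float,
                rel_tol: float = 0.01, abs_tol: float = 1e-4) -> bool:
    """Stop when the improvement is negligible both relatively and absolutely.

    The absolute check guards against metrics near zero, where a purely
    relative criterion can trigger (or never trigger) spuriously.
    """
    improvement = curr_metric - prev_metric
    rel_improvement = improvement / max(abs(prev_metric), 1e-12)
    return improvement < abs_tol and rel_improvement < rel_tol


assert should_stop(100.0, 100.00001)    # negligible change -> stop
assert not should_stop(100.0, 110.0)    # clear improvement -> keep training
```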
kim-mskw left a comment:
All the changes I wanted were discussed bilaterally.
  with patch.object(Storage, "calculate_marginal_cost", return_value=10.0):
      # Calculate bids using the strategy
-     bids = strategy.calculate_bids(
+     bids = strategy.calculate_bids(  # TODO
What did this TODO say?
At least add a comment here?
502b429 to c372568 (force-push)
Description
With the new learning architecture on main, we can streamline how default parameters are handled. Previously, defaults were defined redundantly in multiple places, making it difficult to determine which values applied in practice.
The updated structure centralizes all defaults within the learning role or in the dedicated learning_config data class, improving clarity and maintainability.
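A minimal sketch of what such a centralized config data class could look like. Field names follow the ones mentioned in this PR discussion; the defaults shown are placeholders, not the project's actual values:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LearningConfig:
    # All defaults live in one place instead of being repeated as
    # kwargs across the individual learning strategies.
    learning_mode: bool = False
    continue_learning: bool = False
    evaluation_mode: bool = False
    trained_policies_save_path: Optional[str] = None
    trained_policies_load_path: Optional[str] = None


cfg = LearningConfig(learning_mode=True)
assert cfg.continue_learning is False  # default applies unless overridden
```

With this layout, determining which value applies in practice reduces to reading one class definition plus the user's config overrides.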
It also improves the learning configuration logic, based on four different simulation options (with and without DRL):
1. No RL -> no learning_config -> no learning role
2. Single run with loaded RL strategies -> learning_mode: false and trained_policies_load_path is provided -> learning role, but no RL algorithm etc. necessary
3. Training run -> learning_mode: true
   3.1. Training episodes
   3.2. Evaluation episodes (config item evaluation_mode exists, but learning loop overwrites it anyway)
4. Continue learning -> continue_learning: true and trained_policies_load_path is provided
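The four options above roughly correspond to config fragments like these. This is a hedged sketch using the key names from the discussion, not verbatim project configs; the paths are placeholders:

```yaml
# 1. No RL: simply omit the learning_config section.
---
# 2. Single run with loaded RL strategies:
learning_config:
  learning_mode: false
  trained_policies_load_path: path/to/policies
---
# 3. Training run:
learning_config:
  learning_mode: true
---
# 4. Continue learning:
learning_config:
  learning_mode: true
  continue_learning: true
  trained_policies_load_path: path/to/policies
```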
Checklist
- doc folder updates etc.

Additional Notes (optional)