Skip to content

Conversation

@ATATC
Copy link
Contributor

@ATATC ATATC commented Dec 6, 2025

See #106.

@ATATC ATATC added this to the 1.1.0 milestone Dec 6, 2025
@ATATC ATATC requested a review from perctrix December 6, 2025 03:44
@ATATC ATATC self-assigned this Dec 6, 2025
@ATATC ATATC added enhancement New feature or request todo New task or assignment labels Dec 6, 2025
@ATATC ATATC linked an issue Dec 6, 2025 that may be closed by this pull request
@ATATC ATATC requested a review from perctrix December 6, 2025 04:34
@perctrix
Copy link
Collaborator

perctrix commented Dec 6, 2025

@ATATC
One critical issue I've noticed is that the code saves the entire optimizer/scheduler objects:

torch.save(toolbox.optimizer, ...)

But saves only the model's state_dict. When loading, load_toolbox() creates a new model instance with new parameter tensors, while the loaded optimizer still references old parameter tensors from pickle. This causes optimizer.step() to update the wrong tensors - the new model's weights won't change. Does this issue exist?

@ATATC
Copy link
Contributor Author

ATATC commented Dec 6, 2025

That does exist.

@ATATC
Copy link
Contributor Author

ATATC commented Dec 6, 2025

Do the criterion and the scheduler have the same problem?

@perctrix
Copy link
Collaborator

perctrix commented Dec 6, 2025

I think so.

…mizer, scheduler, and criterion in `Trainer`. Updated `load_toolbox()` logic to align with this change. (#106)
@ATATC
Copy link
Contributor Author

ATATC commented Dec 6, 2025

Try this fix

@ATATC
Copy link
Contributor Author

ATATC commented Dec 6, 2025

Actually I ran a quick test the progress chart seems to be off. I continued training from epoch 3.
progress

@ATATC
Copy link
Contributor Author

ATATC commented Dec 6, 2025

@perctrix Can you do a debugging? I needa study for my physics lol

Copy link
Collaborator

@perctrix perctrix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried this code. I think it may be able to solve your problem.

…`_build_toolbox()` helper, improving reusability and modularity in `Trainer`. (#106)
@ATATC ATATC requested a review from perctrix December 6, 2025 19:24
@perctrix
Copy link
Collaborator

perctrix commented Dec 6, 2025

LGTM

@ATATC ATATC merged commit b1ce2fc into main Dec 6, 2025
3 checks passed
@ATATC ATATC deleted the 106 branch December 6, 2025 21:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request todo New task or assignment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support optionally saving and loading toolbox

3 participants