tomMoral
reviewed
Jan 7, 2026
Member
tomMoral
reviewed
Jan 15, 2026
Member
tomMoral
left a comment
I did a few extra cleanups.
Just one comment on the parameters: it feels like there are too many.
tomMoral
reviewed
Jan 15, 2026
Member
tomMoral
left a comment
A few tweaks that might be helpful.
Author
@tomMoral Shall we merge it as is, or run additional hyperparameter experiments?

This PR intends to implement the SOAP algorithm from https://arxiv.org/abs/2409.11321
It is drafted from the original soap.py implementation available at https://github.com/nikhilvyas/SOAP/blob/main/soap.py
OpenAI Codex was used to start this feature using the following prompt on the repository:
The default SOAP hyperparameters (lr=3e-3, weight_decay=1e-2) did not work on the Simulated dataset, and neither did the Adam defaults (lr=1e-3, weight_decay=1e-4).
The values that currently work are lr=1e-4 and weight_decay=1e-3.
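If further hyperparameter experiments are wanted before merging, a small grid around the values quoted above could be enumerated as in the sketch below. This is only an illustration: `LR_GRID`, `WD_GRID`, and `candidate_configs` are hypothetical names that are not part of this PR, the grid ranges are an assumption, and each pair is read as (lr, weight_decay) in the order given in the description.

```python
from itertools import product

# Pairs quoted in the PR description, read as (lr, weight_decay).
SOAP_DEFAULTS = (3e-3, 1e-2)  # reported as not working on Simulated
ADAM_DEFAULTS = (1e-3, 1e-4)  # reported as not working on Simulated
CURRENT = (1e-4, 1e-3)        # reported as working

# Hypothetical search ranges (assumption, chosen to cover the pairs above).
LR_GRID = [1e-4, 3e-4, 1e-3, 3e-3]
WD_GRID = [1e-4, 1e-3, 1e-2]


def candidate_configs():
    """Return every (lr, weight_decay) combination as a keyword dict."""
    return [
        {"lr": lr, "weight_decay": wd}
        for lr, wd in product(LR_GRID, WD_GRID)
    ]


if __name__ == "__main__":
    for config in candidate_configs():
        print(config)
```

Each dict can be passed as keyword arguments to whichever optimizer constructor is under test; only `lr` and `weight_decay` are varied, so all other parameters keep their defaults.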
TODO