Open
Scale pred_std by diff_std to prevent probabilistic loss from exploding when output_std=True.
Closes mllam#347
Co-authored-by: GitHub Copilot <noreply@github.com>
Hi @sadamov @joeloskarsson , I noticed this issue regarding |
Describe your changes
Immediately after the softplus, scale the predicted standard deviation (`pred_std`) by the one-step difference statistics (`self.diff_std`). Instead of starting at around 0.69 (softplus(0) = ln 2), `pred_std` now starts at the scale of the empirical one-step differences. This keeps losses such as NLL and CRPS from exploding in the early stages of training.

This gives a better initialization scale, as discussed with @joeloskarsson. We consider this the main initialization step; the training curves need to be evaluated to decide whether the suggested `/ softplus(0.)` scaling multiplier is needed as a follow-up.

None.
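A minimal sketch of the scaling described above, written in plain Python rather than with the project's tensor operations so that it runs on its own. The function names, the `normalize` flag, and the example values are hypothetical and are not taken from the PR diff; only `pred_std`, `diff_std`, and the `/ softplus(0.)` idea come from the description.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable softplus: log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def scale_pred_std(raw_std, diff_std, normalize=False):
    """Apply softplus to each raw output, then scale by the matching diff_std.

    raw_std:   unconstrained network outputs, one per variable (hypothetical)
    diff_std:  per-variable std of one-step differences (hypothetical values)
    normalize: if True, also divide by softplus(0) so that a raw output of 0
               maps exactly to diff_std (the possible follow-up in the PR text)
    """
    denom = softplus(0.0) if normalize else 1.0
    return [(softplus(r) / denom) * d for r, d in zip(raw_std, diff_std)]


# Assumed example: two variables, raw outputs at 0 as at initialization.
raw_std = [0.0, 0.0]
diff_std = [0.05, 0.2]

pred_std = scale_pred_std(raw_std, diff_std)
pred_std_normalized = scale_pred_std(raw_std, diff_std, normalize=True)

print(pred_std)             # each entry is ln(2) times the matching diff_std
print(pred_std_normalized)  # each entry matches diff_std
```

Without the scaling, every variable would start with a standard deviation of about 0.69 regardless of how large its one-step changes are. With it, the starting value follows each variable's own `diff_std`.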
Issue Link
linked to #347
Type of change
Checklist before requesting a review
`pull` with `--rebase` option if possible).
Checklist for reviewers
Each PR comes with its own improvements and flaws. The reviewer should check the following:
Author checklist after completed review