Add overall predictive probability utility and validate model-comparison math #18
shawnrhoads wants to merge 1 commit into main from
Conversation
Pull request overview
This PR enhances likelihood-based statistical utilities in the pyem package by fixing the integrated BIC calculation to properly aggregate trials across all subjects, improving numerical stability, and adding a new utility to compute overall predictive probability from per-subject NLL values.
Changes:
- Fixed calc_BICint to count total trials across all subjects (previously only the first subject was counted) and improved numerical stability using logsumexp
- Added an overall_predictive_probability_from_nll utility function to compute the geometric-mean predictive probability, with proper NaN handling
- Added input validation to pseudo_r2_from_nll to prevent silent misuse of invalid metric values
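To make the geometric-mean predictive probability concrete, here is a minimal sketch of what such a utility could look like. The function name, signature, and NaN policy below are illustrative assumptions, not the actual pyem API:

```python
import numpy as np

def overall_predictive_prob(nll_per_subject, nchoices_total, return_log=False):
    """Sketch: geometric-mean predictive probability from per-subject NLLs.

    Computes exp(-sum(nll) / nchoices_total), i.e. the per-choice
    probability whose product over all choices reproduces the total
    likelihood. Names and NaN handling here are assumptions.
    """
    nll = np.asarray(nll_per_subject, dtype=float)
    nll = nll[~np.isnan(nll)]  # drop NaN subjects before summing (assumed policy)
    log_p = -np.sum(nll) / nchoices_total
    return log_p if return_log else np.exp(log_p)
```

For example, two subjects with 5 binary choices each, all predicted at probability 0.5, give a per-subject NLL of 5*log(2) and an overall predictive probability of exactly 0.5.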
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| pyem/utils/stats.py | Fixed BIC trial counting bug, added logsumexp for numerical stability, added new predictive probability utility with validation, and hardened pseudo_r2_from_nll with metric validation |
| tests/test_stats.py | Added comprehensive unit tests covering trial aggregation, numerical stability, input validation, and formula correctness for all modified and new functions |
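The integrated-BIC fix described above (log-mean-exp over posterior samples per subject, trials summed across all subjects) can be sketched roughly as follows. This is a hypothetical helper for illustration only; the real calc_BICint in pyem/utils/stats.py almost certainly has a different signature:

```python
import numpy as np
from scipy.special import logsumexp

def integrated_bic(subject_loglik_samples, npar, trials_per_subject):
    """Sketch of an integrated BIC (illustrative, not pyem's actual API).

    subject_loglik_samples: list of 1-D arrays, one per subject, holding
    the log-likelihood of that subject's data under each posterior sample.
    """
    total_trials = sum(trials_per_subject)  # sum across ALL subjects, not just the first
    integrated_ll = 0.0
    for ll in subject_loglik_samples:
        ll = np.asarray(ll, dtype=float)
        # log-mean-exp over posterior samples, computed stably via logsumexp
        integrated_ll += logsumexp(ll) - np.log(ll.size)
    return -2.0 * integrated_ll + npar * np.log(total_trials)
```

Using logsumexp here avoids the underflow that direct exponentiation of large negative log-likelihoods would cause, and summing trials_per_subject fixes the penalty term that previously used only the first subject's trial count.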
Motivation
Description
- Added overall_predictive_probability_from_nll in pyem.utils.stats, which returns the geometric-mean predictive probability (or its log), computed as exp(-sum(nll) / nchoices_total).
- calc_BICint now implements the integrated approach: a per-subject log-mean-exp over posterior samples using logsumexp, summed across subjects, plus the complexity penalty npar * log(total_trials); total-trial counting now sums across all subjects.
- Updated pseudo_r2_from_nll to raise ValueError for unsupported metric values.
- Added tests in tests/test_stats.py covering overall_predictive_probability_from_nll (formula match, log return, and input validation), calc_BICint trial-count aggregation and numerical stability, and pseudo_r2_from_nll metric validation.
Testing
- Ran pytest -q tests/test_stats.py and the test suite passed.
- Ran pytest -q tests/test_compare.py::test_model_compare_basic tests/test_utils.py::test_model_identifiability and both tests passed.
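The metric guard added to pseudo_r2_from_nll can be sketched as below. This is a simplified stand-in (the supported metric names and the exact formula in pyem are assumptions); it shows McFadden's pseudo-R², 1 - NLL_model / NLL_null, with validation that fails loudly instead of silently:

```python
def pseudo_r2_from_nll(nll_model, nll_null, metric="mcfadden"):
    """Illustrative sketch of a validated pseudo-R² (not pyem's actual code)."""
    supported = {"mcfadden"}  # assumed set of supported metrics
    if metric not in supported:
        # raise instead of silently returning a meaningless number
        raise ValueError(f"unsupported metric: {metric!r}; expected one of {sorted(supported)}")
    return 1.0 - nll_model / nll_null
```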