
Add overall predictive probability utility and validate model-comparison math (#18)

Open
shawnrhoads wants to merge 1 commit into main from review-functions-for-model-comparison-metrics

Conversation

@shawnrhoads
Owner

Motivation

  • Align the integrated BIC implementation with the model-comparison math that aggregates per-subject log-mean-exp of likelihoods and then applies the parameter-count penalty.
  • Provide a simple, numerically-robust utility to report the overall predictive probability (geometric-mean predictive probability across all modeled choices) from per-subject NLL outputs.
  • Harden existing likelihood-based utilities to prevent silent misuse of aggregation metrics.
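The integrated-BIC aggregation described above can be sketched as follows. This is a minimal illustration, not the pyem API: the function name, signature, and the way trial counts are passed in are assumptions; only the math (per-subject log-mean-exp over posterior samples, summed across subjects, plus an `npar * log(total_trials)` penalty) comes from the PR description.

```python
import numpy as np
from scipy.special import logsumexp

def bic_int_sketch(loglik_samples_per_subject, npar, n_trials_total):
    """Integrated-BIC sketch (illustrative names, not pyem's calc_BICint).

    loglik_samples_per_subject: list of 1-D arrays, each holding one subject's
    data log-likelihood evaluated at posterior parameter samples.
    """
    total = 0.0
    for ll in loglik_samples_per_subject:
        ll = np.asarray(ll, dtype=float)
        # per-subject log-mean-exp over posterior samples, computed in log
        # space via logsumexp for numerical stability
        total += logsumexp(ll) - np.log(ll.size)
    # complexity penalty uses the trial count summed across *all* subjects
    return -2.0 * total + npar * np.log(n_trials_total)
```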

Description

  • Added overall_predictive_probability_from_nll in pyem.utils.stats which returns the geometric-mean predictive probability (or its log) computed as exp(-sum(nll) / nchoices_total).
  • Kept calc_BICint behavior as the integrated approach: per-subject log-mean-exp over posterior samples using logsumexp, summing across subjects, and adding the complexity penalty npar * log(total_trials), and ensured total-trial counting sums across all subjects.
  • Retained/ensured validation in pseudo_r2_from_nll to raise ValueError for unsupported metric values.
  • Added/updated unit tests in tests/test_stats.py covering overall_predictive_probability_from_nll (formula match, log-return, and input validation), calc_BICint trial-count aggregation and numerical stability, and pseudo_r2_from_nll metric validation.
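The geometric-mean predictive probability formula given above, `exp(-sum(nll) / nchoices_total)`, can be sketched like this. The function name, signature, and NaN policy here are assumptions for illustration; the PR's actual function is `overall_predictive_probability_from_nll` in `pyem.utils.stats`.

```python
import numpy as np

def overall_pp_sketch(nll_per_subject, n_choices_total, return_log=False):
    """Geometric-mean predictive probability sketch (illustrative names)."""
    nll = np.asarray(nll_per_subject, dtype=float)
    if n_choices_total <= 0:
        raise ValueError("n_choices_total must be a positive integer")
    if np.isnan(nll).any():
        # NaN policy is a guess; the PR only says NaNs are handled
        raise ValueError("nll_per_subject contains NaN")
    # log of exp(-sum(nll) / N); staying in log space avoids underflow
    log_pp = -np.sum(nll) / n_choices_total
    return log_pp if return_log else np.exp(log_pp)
```

For example, two subjects whose NLLs sum to `5 * log(2)` over 5 total choices give a per-choice predictive probability of 0.5, i.e. chance for binary choices.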

Testing

  • Ran pytest -q tests/test_stats.py and the test suite passed.
  • Ran pytest -q tests/test_compare.py::test_model_compare_basic tests/test_utils.py::test_model_identifiability and both tests passed.
  • All added tests and the targeted existing checks completed successfully (no failures).

Codex Task

Contributor

Copilot AI left a comment


Pull request overview

This PR enhances likelihood-based statistical utilities in the pyem package by fixing the integrated BIC calculation to properly aggregate trials across all subjects, improving numerical stability, and adding a new utility to compute overall predictive probability from per-subject NLL values.

Changes:

  • Fixed calc_BICint to count total trials across all subjects (previously it counted only the first subject's trials) and improved numerical stability by using logsumexp
  • Added overall_predictive_probability_from_nll utility function to compute geometric-mean predictive probability with proper NaN handling
  • Added input validation to pseudo_r2_from_nll to prevent silent misuse of invalid metric values

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

  • pyem/utils/stats.py — Fixed the BIC trial-counting bug, added logsumexp for numerical stability, added the new predictive probability utility with validation, and hardened pseudo_r2_from_nll with metric validation
  • tests/test_stats.py — Added comprehensive unit tests covering trial aggregation, numerical stability, input validation, and formula correctness for all modified and new functions


@shawnrhoads shawnrhoads marked this pull request as ready for review February 18, 2026 20:40
