
added regression tests for official models - fix for #164 #166

Open
nkundiushuti wants to merge 2 commits into main from marius/regression-tests-checkpoints-load


nkundiushuti (Contributor) commented Mar 24, 2026

Copilot AI left a comment


Pull request overview

Adds a slow integration regression test to ensure “official” ESP Hugging Face checkpoints produce stable numerical outputs on a deterministic mini-batch, plus a helper script to regenerate the expected output fingerprints when checkpoints intentionally change (addresses #164).

Changes:

  • Introduces a slow integration test that loads each official HF-backed ESP model and asserts a SHA-256 fingerprint of pooled outputs matches a hardcoded reference table.
  • Adds a regeneration script to recompute and print updated fingerprints (with optional JSON output).
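The fingerprinting idea described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the function name `fingerprint_pooled_output`, the rounding to `decimals` places, and the float64 little-endian serialization are all assumptions, and the pooled output is represented as a plain sequence of floats rather than a tensor.

```python
import hashlib
import struct
from typing import Sequence


def fingerprint_pooled_output(values: Sequence[float], decimals: int = 4) -> str:
    """Return a SHA-256 hex digest of a flat sequence of floats.

    Hypothetical helper: values are rounded first so that tiny numerical
    noise is less likely to change the digest. `decimals` is an assumed
    tolerance, not a setting taken from the PR.
    """
    # Adding 0.0 normalizes -0.0 to 0.0 so both hash identically.
    rounded = [round(float(v), decimals) + 0.0 for v in values]
    # Serialize as little-endian float64 so the bytes do not depend on the host.
    payload = struct.pack(f"<{len(rounded)}d", *rounded)
    return hashlib.sha256(payload).hexdigest()
```

A test in this style would compare the digest for each model against a stored reference string. Note that rounding only reduces sensitivity to noise: a value sitting near a rounding boundary can still flip the digest, so a hash-based check is stricter than a tolerance-based comparison.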

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| tests/integration/test_official_models_output_regression.py | New slow integration test that fingerprints pooled model outputs for all official HF-backed esp_ models and enforces reference coverage. |
| scripts/regenerate_official_model_output_fingerprints.py | Utility to regenerate the fingerprint mapping used by the integration regression test. |
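As a rough sketch of what "enforces reference coverage" and the optional JSON output could look like, the snippet below compares a list of official model names against the keys of a reference table and serializes the table deterministically. The helper names `coverage_gaps` and `render_reference_json`, and the model names in the usage note, are hypothetical and not taken from the PR.

```python
import json
from typing import Dict, Iterable, List


def coverage_gaps(official_models: Iterable[str], reference: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare the official model list against the reference table keys."""
    official = set(official_models)
    referenced = set(reference)
    return {
        "missing": sorted(official - referenced),  # official models with no fingerprint
        "stale": sorted(referenced - official),    # fingerprints for unknown models
    }


def render_reference_json(reference: Dict[str, str]) -> str:
    """Serialize the table with sorted keys so regenerated output diffs cleanly."""
    return json.dumps(reference, indent=2, sort_keys=True)
```

For example, `coverage_gaps(["esp_example_a", "esp_example_c"], {"esp_example_a": "aa", "esp_example_b": "bb"})` reports `esp_example_c` as missing and `esp_example_b` as stale; a coverage test could assert that both lists are empty.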


GaganNarula (Contributor) left a comment


I think I agree with all the Copilot comments. Other than that, it looks good.


3 participants