added regression tests for official models - fix for #164 #166
Open
nkundiushuti wants to merge 2 commits into main
Pull request overview
Adds a slow integration regression test to ensure “official” ESP Hugging Face checkpoints produce stable numerical outputs on a deterministic mini-batch, plus a helper script to regenerate the expected output fingerprints when checkpoints intentionally change (addresses #164).
Changes:
- Introduces a slow integration test that loads each official HF-backed ESP model and asserts a SHA-256 fingerprint of pooled outputs matches a hardcoded reference table.
- Adds a regeneration script to recompute and print updated fingerprints (with optional JSON output).
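The fingerprinting approach described above can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the code from this PR: `toy_model`, `make_batch`, `mean_pool`, `fingerprint_model`, `check_against_reference`, the mean pooling, and the rounding to 4 decimals are all assumptions made for the example. The real test loads the official HF-backed checkpoints and may pool, round, or serialize differently.

```python
import hashlib
import random
import struct


def make_batch(seed=0, batch_size=2, num_samples=8):
    """Build a deterministic mini-batch from a fixed seed (hypothetical input)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(num_samples)]
            for _ in range(batch_size)]


def toy_model(batch):
    """Hypothetical stand-in for a real checkpoint: 2-d features per sample."""
    return [[[x, x * x] for x in item] for item in batch]


def mean_pool(frames):
    """Average per-frame feature vectors into one vector per batch item."""
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]


def fingerprint(pooled, decimals=4):
    """SHA-256 over rounded pooled values, so tiny float noise is ignored."""
    digest = hashlib.sha256()
    for vec in pooled:
        for value in vec:
            # Adding 0.0 turns -0.0 into 0.0 so both hash identically.
            digest.update(struct.pack("<d", round(value, decimals) + 0.0))
    return digest.hexdigest()


def fingerprint_model(model, seed=0):
    """Run the model on the deterministic batch and fingerprint pooled outputs."""
    outputs = model(make_batch(seed=seed))
    return fingerprint([mean_pool(frames) for frames in outputs])


def check_against_reference(actual, reference):
    """Return a list of problems: models without a reference, or mismatches."""
    problems = []
    for name, value in sorted(actual.items()):
        if name not in reference:
            problems.append(f"{name}: no reference fingerprint")
        elif reference[name] != value:
            problems.append(f"{name}: fingerprint mismatch")
    return problems
```

In a layout like this, the regeneration script would call `fingerprint_model` for each official model and print (or dump as JSON) the resulting mapping, while the slow test would assert that `check_against_reference` returns an empty list. Rounding before hashing is a design choice: it trades sensitivity to small numerical drift for stability across platforms, and the right tolerance depends on the models involved.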
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tests/integration/test_official_models_output_regression.py | New slow integration test that fingerprints pooled model outputs for all official HF-backed esp_ models and enforces reference coverage. |
| scripts/regenerate_official_model_output_fingerprints.py | Utility to regenerate the fingerprint mapping used by the integration regression test. |
GaganNarula (Contributor) requested changes on Mar 28, 2026
I think I agree with all the Copilot comments. Other than that, it looks good.
Fix for "Comprehensive integration test on labelled data required to confirm model checkpoints load correctly" #164