Pull request overview
Adds an optional CLI flag to models/spectf/report_spectf.py to export OOD prediction probabilities (y_hat_ood) into an HDF5 file alongside the generated report artifacts.
Changes:
- Add `--export` flag to the report generator CLI.
- When enabled, write `y_hat_ood` to `outdir/y_hat_ood.h5` using `h5py`.
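
The flag described above could be wired up roughly as follows. This is only a sketch that assumes an `argparse`-based CLI; the actual script may use a different CLI library, and `build_parser`, `export_path_for`, and the `outdir` positional argument are hypothetical names used for illustration.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; the real report_spectf.py CLI may look different.
    parser = argparse.ArgumentParser(description="Generate a report")
    parser.add_argument("outdir", help="directory for report artifacts")
    parser.add_argument(
        "--export",
        action="store_true",  # off by default, so existing invocations are unchanged
        help="also write y_hat_ood to an HDF5 file in outdir",
    )
    return parser


def export_path_for(outdir: str) -> str:
    # Fixed filename, matching the behaviour described in the overview.
    return os.path.join(outdir, "y_hat_ood.h5")


# Example invocation with an explicit argument list instead of sys.argv.
args = build_parser().parse_args(["reports", "--export"])
```

Because the flag uses `store_true`, omitting `--export` leaves `args.export` as `False` and no HDF5 file would be written.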
    y_hat_ood[i*bs:i*bs+batch_len] = batch_y_hat

    if export:
        export_path = os.path.join(outdir, "y_hat_ood.h5")
The export always writes to a fixed filename ("y_hat_ood.h5") in outdir, so repeated runs will silently overwrite previous exports. Consider incorporating the run_name/timestamp into the filename and/or failing when the destination already exists unless an explicit overwrite flag is provided.
Suggested change:

    - export_path = os.path.join(outdir, "y_hat_ood.h5")
    + export_timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    + export_filename = f"y_hat_ood_{export_timestamp}.h5"
    + export_path = os.path.join(outdir, export_filename)
This is a good suggestion.
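
The other option raised in this thread, failing when the destination already exists unless overwriting is explicitly requested, might look like the sketch below. `resolve_export_path` and its `overwrite`, `timestamp`, and `now` parameters are hypothetical and not part of the PR; the timestamp format is the one from the suggestion.

```python
import os
from datetime import datetime


def resolve_export_path(outdir, overwrite=False, timestamp=False, now=None):
    """Return the export path, refusing to clobber an existing file.

    Hypothetical helper: `overwrite` and `timestamp` are illustrative
    options, not flags that exist in report_spectf.py.
    """
    if timestamp:
        # Same format as the suggested change: YYYYmmdd_HHMMSS.
        stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
        filename = f"y_hat_ood_{stamp}.h5"
    else:
        filename = "y_hat_ood.h5"

    path = os.path.join(outdir, filename)
    if os.path.exists(path) and not overwrite:
        raise FileExistsError(f"{path} exists; pass overwrite=True to replace it")
    return path
```

A timestamped name avoids most collisions on its own, while the existence check guards the fixed-name case; the two can be combined as shown.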
    with h5py.File(export_path, "w") as f:
        f.create_dataset("y_hat_ood", data=y_hat_ood)
y_hat_ood is being exported as float64 because the array is built via .astype(float) and np.zeros_like(..., dtype=float). For large OOD sets this can double disk usage; consider exporting as float32 and (optionally) using HDF5 compression (e.g., gzip) to keep file sizes manageable.
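
A minimal sketch of that change, assuming `y_hat_ood` is a non-empty NumPy array of probabilities and that float32 precision is acceptable for downstream comparisons. `export_predictions` is a hypothetical helper, and the gzip level of 4 is an arbitrary choice.

```python
import h5py
import numpy as np


def export_predictions(path, y_hat_ood):
    """Write predictions as float32 with gzip compression (illustrative only)."""
    # Downcast from float64 to float32 to halve the uncompressed size.
    data = np.asarray(y_hat_ood, dtype=np.float32)
    with h5py.File(path, "w") as f:
        # h5py enables chunked storage automatically when compression is set.
        f.create_dataset(
            "y_hat_ood",
            data=data,
            compression="gzip",
            compression_opts=4,  # gzip level 0-9; 4 is h5py's default
        )
```

How much gzip helps depends on the data; probabilities with little repetition may compress only modestly, so the dtype change is the more predictable saving.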
This PR adds to `report_spectf.py` the ability to export OOD model predictions as an HDF5 file. We intentionally avoid exporting sim test results because they are generated at runtime. The HDF5 file produced here is aligned with the input OOD HDF5 and can be compared against the labels as-is.
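
Because the export is row-aligned with the input OOD file, a comparison could be as simple as the sketch below. Several details here are assumptions rather than facts from the PR: the label dataset name `"labels"`, a one-dimensional array of positive-class probabilities, and a 0.5 decision threshold. `accuracy_against_labels` is a hypothetical helper.

```python
import h5py
import numpy as np


def accuracy_against_labels(pred_path, ood_path, labels_key="labels", threshold=0.5):
    """Compare exported predictions with labels from the input OOD file.

    Assumes both datasets are 1-D and share the same row order;
    `labels_key` is a guess at the dataset name, not taken from the PR.
    """
    with h5py.File(pred_path, "r") as f:
        y_hat = np.ravel(f["y_hat_ood"][:])
    with h5py.File(ood_path, "r") as f:
        labels = np.ravel(f[labels_key][:])

    if y_hat.shape != labels.shape:
        raise ValueError(f"shape mismatch: {y_hat.shape} vs {labels.shape}")

    # Threshold the probabilities and score them element-wise against the labels.
    preds = (y_hat >= threshold).astype(int)
    return float(np.mean(preds == labels.astype(int)))
```

The shape check is the only guard on alignment here; it would not catch a file whose rows were reordered.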