Problem
Checkpoint resume for Monte Carlo GCS currently does not preserve nadir-equivalent error values, because they are only computed in the subsequent error-stats step. As a result, the NetCDF files lack the error data needed to aggregate error statistics accurately across multiple runs.
Proposed Improvements
- Calculate and Write Nadir-Equivalent Error:
  - Implement or port the nadir-equivalent error calculation logic from geolocation_error_stats.py into image_match.py.
  - Ensure this error statistic is computed for each checkpoint and written to the NetCDF output during simulation.
- Loop Integration:
  - Integrate the calculation step directly into the main simulation loop so that checkpoints have up-to-date nadir-equivalent error values.
  - At the end of the simulation batch, run the full error statistics aggregation over all completed and resumed runs.
- Support Checkpoint Resume:
  - When resuming from a checkpoint, read previously saved nadir-equivalent errors from NetCDF and incorporate them into the aggregated statistics.
  - Ensure resume behavior is robust for both partial and full run aggregation.
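The control flow proposed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual image_match.py code: the function names, the `ne_errors` key, and the use of a JSON file as a stand-in for the NetCDF checkpoint are all hypothetical, and the nadir-equivalent calculation itself is passed in as a callable rather than reproduced here.

```python
import json
import math
from pathlib import Path
from typing import Callable, Sequence


def aggregate(ne_errors: Sequence[float]) -> dict:
    """Aggregate statistics over all available nadir-equivalent errors."""
    n = len(ne_errors)
    if n == 0:
        return {"count": 0, "mean": math.nan, "rms": math.nan}
    mean = sum(ne_errors) / n
    rms = math.sqrt(sum(e * e for e in ne_errors) / n)
    return {"count": n, "mean": mean, "rms": rms}


def run_with_checkpoints(
    samples: Sequence[dict],
    compute_ne_error: Callable[[dict], float],  # stand-in for the ported calculation
    checkpoint_path: Path,                      # JSON here; NetCDF in the real workflow
) -> dict:
    """Run the loop, saving the NE error after every iteration."""
    # Resume: reload NE errors saved by an earlier, interrupted run.
    if checkpoint_path.exists():
        ne_errors = json.loads(checkpoint_path.read_text())["ne_errors"]
    else:
        ne_errors = []

    # Skip the iterations that the checkpoint already covers.
    for sample in samples[len(ne_errors):]:
        ne_errors.append(compute_ne_error(sample))
        # Persist immediately so an interruption loses at most one iteration.
        checkpoint_path.write_text(json.dumps({"ne_errors": ne_errors}))

    # Aggregate once, over resumed and newly computed values together.
    return aggregate(ne_errors)
```

Because the per-iteration value is saved before the next iteration starts, a resumed run only needs to read the stored values and continue; the aggregation step at the end never has to recompute errors for earlier iterations.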
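Implementation Plan
- Review geolocation_error_stats.py and image_match.py to identify what needs porting or integration.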
Background:
The nadir-equivalent error metric is essential to post-processing: once it is stored with each run, previous runs' error values can be included directly in aggregate statistics. Computing it alongside periodic checkpointing will improve both robustness and downstream error analysis.
Additional Notes:
- Review whether any non-obvious dependencies between error calculation and NetCDF writing need explicit handling.
- Confirm compatibility with the current checkpoint format and resume workflow.
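One way to approach the compatibility check is to separate checkpoints that already carry nadir-equivalent errors from older ones that do not, so the latter can be recomputed instead of silently dropped from the aggregate. The sketch below is illustrative only: the `ne_errors` field name and the dict-based checkpoint representation are assumptions, not the current checkpoint format.

```python
from typing import Mapping

# Hypothetical name of the stored nadir-equivalent error variable.
NE_FIELD = "ne_errors"


def split_by_ne_availability(
    checkpoints: Mapping[str, Mapping],
) -> tuple[list[str], list[str]]:
    """Return (usable, needs_recompute) checkpoint names.

    A checkpoint is treated as usable only if the NE field is present
    and non-empty; anything else is flagged for recomputation.
    """
    usable: list[str] = []
    needs_recompute: list[str] = []
    for name, ckpt in checkpoints.items():
        if ckpt.get(NE_FIELD):
            usable.append(name)
        else:
            needs_recompute.append(name)
    return usable, needs_recompute
```

For example, a checkpoint written before this change would lack the field and land in the second list, which makes the fallback path explicit during resume.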
Original issue below for context.
We need nadir-equivalent error to be calculated and written to NetCDF so that we can resume and include previous runs in error-stats calculations.
We should explore porting the nadir-equivalent error calculation step from geolocation_error_stats.py to image_match.py, or at least calling that calculation in the loop, and then running the full error stats at the end of the loop (for aggregate stats).