Fix checkpoint resume for Monte Carlo GCS

## Problem
Checkpoint resume for Monte Carlo GCS does not currently preserve nadir-equivalent errors, because they are computed only in the subsequent error-stats step (`geolocation_error_stats.py`) and are never written to the NetCDF files. As a result, error statistics cannot be aggregated accurately across multiple runs.

## Proposed Improvements
1. **Calculate and Write Nadir-Equivalent Error:**
   - Implement or port the nadir-equivalent error calculation logic from `geolocation_error_stats.py` into `image_match.py`.
   - Ensure this error statistic is computed for each checkpoint and written to the NetCDF output during simulation.
2. **Loop Integration:**
   - Integrate the calculation step directly into the main simulation loop so that every checkpoint holds up-to-date nadir-equivalent error values.
   - At the end of the simulation batch, run the full error statistics aggregation for all completed and resumed runs.
3. **Support Checkpoint Resume:**
   - When resuming from a checkpoint, read previously saved nadir-equivalent errors from NetCDF to incorporate them into aggregated statistics.
   - Ensure resume behavior is robust for both partial and full run aggregation. 
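
A minimal sketch of the loop-and-resume flow proposed above, under stated assumptions. A JSON file stands in for the NetCDF checkpoint so the example runs with the standard library only. The names `run_batch`, `simulate`, `to_nadir_equivalent`, and the `ne_error` key are hypothetical, and the actual nadir-equivalent formula is deliberately left as a callable because it lives in `geolocation_error_stats.py`.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import Callable


def run_batch(
    n_runs: int,
    simulate: Callable[[int], float],
    to_nadir_equivalent: Callable[[float], float],
    checkpoint: Path,
) -> list[float]:
    """Run (or resume) a batch; return one nadir-equivalent error per run.

    `checkpoint` is a JSON stand-in for the NetCDF file (assumption).
    """
    # Resume: reload errors saved by a previous, possibly interrupted, batch.
    ne_errors: list[float] = []
    if checkpoint.exists():
        ne_errors = json.loads(checkpoint.read_text())["ne_error"]

    # Continue from the first run that has no saved value.
    for run_index in range(len(ne_errors), n_runs):
        raw_error = simulate(run_index)
        ne_errors.append(to_nadir_equivalent(raw_error))
        # Write inside the loop so each checkpoint already holds the value.
        checkpoint.write_text(json.dumps({"ne_error": ne_errors}))

    # The caller runs the full aggregate statistics on this list afterwards.
    return ne_errors
```

Calling `run_batch` a second time with the same checkpoint path and a larger `n_runs` only simulates the missing runs, and the returned list contains both the reloaded and the new values, which is what the end-of-batch aggregation needs.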

## Implementation Plan
- [ ] Audit existing error metrics code in `geolocation_error_stats.py` and `image_match.py` to identify what needs porting or integration.
- [ ] Refactor the code so nadir-equivalent error calculation is available and called within the simulation loop where checkpoints are written.
- [ ] Update NetCDF output handling to include nadir-equivalent error for each checkpoint.
- [ ] Develop logic to load and aggregate previous checkpoint errors when resuming simulations. 
- [ ] Add/modify unit tests to cover resume and aggregation of error statistics.
- [ ] Document the new error calculation workflow and update relevant usage guides.
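
One way the resume-and-aggregation unit test could look, as a sketch only. `aggregate_ne_errors` is a hypothetical helper, and the statistics it returns (count, mean, RMS, max) are placeholders rather than the set computed by `geolocation_error_stats.py`. The property under test is that pooling errors from a resumed run gives the same result as an uninterrupted run.

```python
from __future__ import annotations

import math
from typing import Iterable, Sequence


def aggregate_ne_errors(batches: Iterable[Sequence[float]]) -> dict[str, float]:
    """Pool nadir-equivalent errors from several runs and summarize them."""
    pooled = [error for batch in batches for error in batch]
    if not pooled:
        raise ValueError("no nadir-equivalent errors to aggregate")
    n = len(pooled)
    return {
        "count": n,
        "mean": sum(pooled) / n,
        "rms": math.sqrt(sum(e * e for e in pooled) / n),
        "max": max(pooled),
    }


def test_resume_matches_uninterrupted_run() -> None:
    # Illustrative values, not real simulation output.
    errors = [120.0, 85.5, 240.25, 60.0, 199.0]
    uninterrupted = aggregate_ne_errors([errors])
    # Same errors, but split as if the first two came from a checkpoint.
    resumed = aggregate_ne_errors([errors[:2], errors[2:]])
    assert resumed == uninterrupted
```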

---
**Background:**
Aggregate statistics are built from the nadir-equivalent error of each run, so saving that value with every checkpoint allows previous runs to be included directly in post-processing. Computing it as part of periodic checkpointing should improve both the robustness of resume and downstream error analysis.

---
**Additional Notes:**
- Review whether any non-obvious dependencies between error calculation and NetCDF writing need explicit handling.
- Confirm compatibility with the current checkpoint format and resume workflow.
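
For the compatibility check, a rough sketch, assuming that checkpoints written before this change simply lack the new variable. Each checkpoint is modeled as a plain mapping of variable names to values; the `ne_error` variable name and the `split_by_ne_availability` helper are hypothetical.

```python
from __future__ import annotations

from typing import Mapping, Sequence

NE_ERROR_KEY = "ne_error"  # hypothetical variable name in the checkpoint


def split_by_ne_availability(
    checkpoints: Mapping[str, Mapping[str, Sequence[float]]],
) -> tuple[list[str], list[str]]:
    """Separate checkpoints that carry NE errors from legacy ones.

    Returns (ready, legacy): `ready` can feed aggregation directly,
    `legacy` must be recomputed or excluded explicitly.
    """
    ready: list[str] = []
    legacy: list[str] = []
    for name, variables in checkpoints.items():
        if NE_ERROR_KEY in variables:
            ready.append(name)
        else:
            legacy.append(name)
    return ready, legacy
```

Reporting the legacy checkpoints instead of silently skipping them keeps the aggregate statistics from being computed over fewer runs than the user expects.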

---
_Original issue below for context._

---
We need the nadir-equivalent error to be calculated and written to NetCDF so that we can resume and include previous runs in the error-stats calculations.

We should explore porting the nadir-equivalent error calculation step from `geolocation_error_stats.py` to `image_match.py`, or at least calling that calculation inside the loop and then running the full error stats at the end of the loop (for aggregate stats).

Fix checkpoint resume for Monte Carlo GCS #100
