#122 Correction Package Refactor — Complete Redesign #133
Draft
Codecov Report
@@            Coverage Diff             @@
##             main     #133      +/-   ##
==========================================
+ Coverage   73.62%   77.87%   +4.25%
==========================================
  Files          67       94      +27
  Lines       10655    13145    +2490
  Branches     1204     1370     +166
==========================================
+ Hits         7845    10237    +2392
- Misses       2343     2405      +62
- Partials      467      503      +36
…py (#132)
* Initial plan
* Rename GeolocationConfig classes and geolocation_error_stats.py
* Convert Google-style docstrings to numpy style in error_stats.py
* ruff
* breaking up modules
* correction shim for backwards compatibility
* configuration refactor
* descope pipeline to shorten
* show exports in init
* type fix
…135)
* add pydantic
* Introduce Pydantic models for configuration with typed sub-models
* Add unit tests for Pydantic-based correction configuration models
* Refactor: Remove legacy alias handling and update parameter access in configuration
* Refactor dataio module: Update documentation and rename data-loader protocol types
* Refactor CorrectionConfig: Simplify data loading process and enhance validation
* Refactor clarreo_data_loaders: Remove loader protocols and transition to config-driven data loading
* Refactor config: Introduce DataConfig for config-driven data loading and remove legacy loader protocols
* Refactor correction.py: Remove legacy loader functions and integrate DataConfig for improved data handling
* Refactor dataio.py: Remove loader protocols and update documentation for validation helpers
* Refactor image_match.py: Remove ImageMatchingFunc protocol and update output validation
* Refactor image_match.py: Remove ImageMatchingFunc protocol and update output validation
* Refactor pipeline.py: Remove mission-specific loader functions and implement internal file loading for telemetry and science data
* Refactor test_config.py: Add tests for DataConfig and remove legacy loader checks
* Refactor test_correction.py: Replace loader functions with DataConfig for file-based loading
* Refactor test_pairing.py: Remove redundant validation tests for pairing output
* Add CLARREO preprocessing script for telemetry and science data
* Refactor clarreo_data_loaders.py: Remove GCPLoader protocol and implement telemetry and science data loading functions
* load instead of open
* Refactor config.py: Remove GCP-related fields and clarify time field documentation
* Refactor correction.py: Add _resolve_gcp_pairs function to enhance data processing
* Refactor pipeline.py: Add _resolve_gcp_pairs function for GCP key validation
* Refactor test_config.py: Remove GCP-related assertions and simplify DataConfig tests
* Refactor test_correction.py: Remove 'corrected_timestamp' field from DataConfig instances
…137)
* Add SearchStrategy enum for deterministic parameter sweeps
* Add SearchStrategy enum and validation for correction analysis
* Add support for multiple search strategies in parameter set generation
* Add unit tests for parameter-set generation strategies
* Refactor search strategy validation to improve error messaging consistency
* Remove unused conversion functions for sigma to radians and seconds
* Refactor parameter set comparison in tests for consistency and clarity
* check parameter datatype
* Fix formatting in parameters.py for improved readability
* Update curryer/correction/parameters.py
* Update curryer/correction/parameters.py
* Update tests/test_correction/test_parameters.py
* Update tests/test_correction/test_parameters.py
* Add max_grid_sets parameter to limit GRID_SEARCH materialization
* Add search strategy enum for deterministic parameter sweeps and enforce max_grid_sets limit
* Add tests for max_grid_sets enforcement in GRID_SEARCH strategy
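For readers new to the idea, the search-strategy commits above can be pictured with a minimal sketch. Only `SearchStrategy`, `GRID_SEARCH`, and `max_grid_sets` are names taken from the commit messages. The `ONE_AT_A_TIME` member, the `generate_parameter_sets` function, its signature, and the default limit are assumptions made for illustration and are not meant to describe the actual code in curryer/correction/parameters.py.

```python
from enum import Enum
from itertools import product


class SearchStrategy(Enum):
    """Deterministic sweep strategies (ONE_AT_A_TIME is a hypothetical member)."""

    GRID_SEARCH = "grid_search"
    ONE_AT_A_TIME = "one_at_a_time"


def generate_parameter_sets(axes, strategy, max_grid_sets=1000):
    """Build parameter sets from a mapping of name -> candidate values.

    Hypothetical helper: the first value of each axis is treated as the baseline.
    """
    if not isinstance(strategy, SearchStrategy):
        raise TypeError(f"strategy must be a SearchStrategy, got {type(strategy).__name__}")

    names = list(axes)
    if strategy is SearchStrategy.GRID_SEARCH:
        # Count the grid before materializing it, so an oversized sweep fails early.
        total = 1
        for values in axes.values():
            total *= len(values)
        if total > max_grid_sets:
            raise ValueError(f"grid has {total} sets, limit is {max_grid_sets}")
        return [dict(zip(names, combo)) for combo in product(*axes.values())]

    # ONE_AT_A_TIME: vary a single parameter while the others stay at baseline.
    baseline = {name: values[0] for name, values in axes.items()}
    sets = [dict(baseline)]
    for name, values in axes.items():
        for value in values[1:]:
            candidate = dict(baseline)
            candidate[name] = value
            sets.append(candidate)
    return sets
```

Counting the grid size before calling `product` is the point of the limit in this sketch: a full Cartesian sweep grows multiplicatively with each axis, so the check rejects an oversized request without building it.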
* cherry-pick regridding files from MM-104
* add regrid module and update imports in __init__.py
* add RegridConfig model with validation for GCP chip regridding parameters
* deprecate MATLAB file loading utilities in image_match; redirect to curryer.correction.image_io
* update import for load_image_grid_from_mat to use image_io module
* update import for integrated_image_match to use image_io module
* apply rng seed
  Co-authored-by: Copilot <[email protected]>
* remove cubic
  Co-authored-by: Copilot <[email protected]>
* check interpolation method
  Co-authored-by: Copilot <[email protected]>
* user helper for check point in cell
  Co-authored-by: Copilot <[email protected]>
* refactor test cases to use default_rng for random data generation
* fix: add missing newline for code readability
* test gcp NetCDF and HDF5 support
* refactor: remove unused tolerance variable in cell check
* Update curryer/correction/data_structures.py
  Co-authored-by: Copilot <[email protected]>
* vectorize regard
  Co-authored-by: Copilot <[email protected]>
* relax tolerance on grid boundaries
  Co-authored-by: Copilot <[email protected]>
* valid pixels only
  Co-authored-by: Copilot <[email protected]>
* refactor: improve documentation and streamline code in GCP regridding
* fix: update history attribute formatting in image_io.py
* docs: add gcp_regridding.md to contents
* fix: improve boundary handling in regrid.py to prevent NaN fill values
* feat: add example scripts for regridding GCP chips to NetCDF format
* fix: update variable names in image_io.py for consistency with GCP standards
* fix: enhance variable loading in image_io.py for compatibility with multiple naming conventions
---------
Co-authored-by: Copilot <[email protected]>
…rection loop (#138)
* add verification module for geolocation compliance checks
* add verification module for geolocation compliance checks
* add unit tests for verification module and its components
* handle HDF5 file loading in image_io.py when HDF4 library is unavailable
* improve HDF file loading in image_io.py to handle errors and fallback between HDF4 and HDF5
* add validation function and update verify parameters to include optional work_dir
* update verification tests to include work_dir parameter in verify function calls
* add minimal example for production verification workflow on geolocated observations
* add example script for weekly verification of geolocated observations
* refactor verification module to enhance key attribute handling and improve dataset aggregation logic
* refactor verification module to enhance key attribute handling and improve dataset aggregation logic
* enhance verification module to improve JSON serialization and summary output formatting
…for CLARREO (#146)
* remove tests
* add integration tests package
* refactor pytest configuration for test_correction to improve maintainability
* refactor test_dataio.py to use pytest and improve test structure for maintainability
* refactor test_image_match.py to use pytest and improve test structure for maintainability
* refactor test_pairing.py to use pytest and improve test structure for maintainability
* add __init__.py for CLARREO-specific correction tests
* add image-matching and pipeline runner helpers for CLARREO integration tests
* add synthetic data generation helpers for testing pipeline and e2e scenarios
* add unit tests for kernel operations in test_kernel_ops.py
* add tests for pipeline functions in test_pipeline.py
* add tests for results_io functions in test_results_io.py
* copy config
  Co-authored-by: Copilot <[email protected]>
* copy config
  Co-authored-by: Copilot <[email protected]>
* Update tests/test_correction/clarreo/_image_match_helpers.py
  Co-authored-by: Copilot <[email protected]>
* Fix root_dir duplication and clarreo_cfg mutation in test files
  Agent-Logs-Url: https://github.com/lasp/curryer/sessions/6ba9aa43-35f1-476a-a4dc-ad8eae24511b
  Co-authored-by: mmaclay <[email protected]>
* Strengthen test_generate_clarreo_config_json and fix JSON round-trip bugs in load_config_from_json
  Agent-Logs-Url: https://github.com/lasp/curryer/sessions/bbf5eb75-f1b0-46f8-af24-f2758fd30858
  Co-authored-by: mmaclay <[email protected]>
* Refactor assertions in test_clarreo_config.py for clarity and maintainability; streamline field retrieval in config.py
---------
Co-authored-by: Copilot <[email protected]>
Co-authored-by: copilot-swe-agent[bot] <[email protected]>
0a09d1c to d6bb926
* phase 1 api correction
* phase 2 api correction
* phase 3 api correction
* phase 4 api correction
* phase 5 api correction
* add test_error_stats
* Update curryer/correction/config.py
* Update curryer/correction/pipeline.py
* Update curryer/correction/pipeline.py
* Fix Windows temp file locking in _download_from_s3 and remove private _log_pairing_summary from package exports
* Update tests/test_correction/test_verification.py
* Fix inputs type hint in run_correction to use Sequence union type
* ruff
* Enhance documentation for dataio and io modules; clarify S3 URI support in pipeline
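As background on the Windows temp file locking item above: on Windows, a temporary file that is still held open usually cannot be reopened by path. A common way around this is to create the file, release the handle, and only then let other code write to the path. The sketch below shows that general pattern only. The `read_through_temp_file` helper and its `fetch` callable (a stand-in for a download step) are hypothetical and are not the actual `_download_from_s3` change.

```python
import os
import tempfile


def read_through_temp_file(fetch, suffix=".tmp"):
    """Let `fetch(path)` write to a temp file, then return the file's bytes.

    Hypothetical helper: `fetch` stands in for any download routine that
    takes a destination path.
    """
    fd, path = tempfile.mkstemp(suffix=suffix)
    # Close our handle first so the path can be reopened, including on Windows.
    os.close(fd)
    try:
        fetch(path)
        with open(path, "rb") as fh:
            return fh.read()
    finally:
        # Explicit cleanup, since mkstemp never deletes the file on its own.
        os.remove(path)
```

The trade-off in this sketch is that cleanup becomes the caller's job, which is why the removal sits in a `finally` block.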
This is the PR for epic #122; it will track changes from the sub-issues.
Correction Package Refactor — Complete Redesign
Top-level tracking issue for a complete refactor and architectural redesign of curryer.correction.
Goals
CurryerConfig unifying all of curryer
PR Groupings & Sub-Issues
PR 1: Foundation — Naming & Module Split
PR 2: Config Redesign — Pydantic, Internalize Loading, Search Strategies
PR 3: New Features — Verification & GCP Regridding
PR 4: Test Cleanup & Future Planning
Dependency Graph
Context