Skip to content

Remove cdo/python-cdo from fre-cli, add xarray time averager#830

Draft
Copilot wants to merge 8 commits intomainfrom
copilot/deprecate-cdo-python-cdo-use
Draft

Remove cdo/python-cdo from fre-cli, add xarray time averager#830
Copilot wants to merge 8 commits intomainfrom
copilot/deprecate-cdo-python-cdo-use

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 1, 2026

Describe your changes

CDO/python-cdo has always been an awkward fit — it doesn't handle calendar='julian' well and adds a heavy external dependency. This PR removes CDO entirely and adds an xarray-based time averager as a replacement:

  • Removed CDO completely: cdoTimeAverager.py is now a thin stub that emits a loud FutureWarning ("CDO/python-cdo has been REMOVED") and delegates all work to the new xarrayTimeAverager
  • New xarrayTimeAverager class (xarrayTimeAverager.py): xarray-based time averaging and climatology generation supporting avg_type='all', 'seas', and 'month' (both weighted and unweighted)
  • Renamed frepytoolsTimeAveragerNumpyTimeAverager: the class in frepytoolsTimeAverager.py is now NumpyTimeAverager with the old name kept as a backward-compatible alias
  • pkg='cdo' entry point preserved but warns loudly via FutureWarning and routes to xarray under the hood — users are directed to switch to pkg='xarray' or pkg='fre-python-tools'
  • Added pkg='xarray' as a new CLI choice in both gen-time-averages and gen-time-averages-wrapper
  • Replaced CDO operations with xarray in shared code paths:
    • generate_time_averages.py: Cdo().mergetime()xr.open_mfdataset() (this was used by all packages, not just CDO)
    • frenctoolsTimeAverager.py: Cdo().select()xr.open_dataset().sel(time=ds['time'].dt.month == idx)
  • Removed cdo/python-cdo from required dependencies in environment.yml, meta.yaml, pyproject.toml
  • Updated pylintrc: configured class-rgx regex to allow mixedCase class names (e.g. xarrayTimeAverager, cdoTimeAverager) alongside PascalCase
  • Fixed weighted time-average dtype failures in xarrayTimeAverager:
    • _compute_time_weights now robustly handles all three cases xarray produces for time_bnds differences: timedelta64 (converted via astype('timedelta64[s]') / 86400), cftime object timedeltas (using .days + .seconds/86400), and numeric float/int (direct cast) — using numpy .values instead of the unreliable xarray .dt accessor
    • Added _is_numeric() guard in all weighted helper functions to skip non-numeric time-dependent variables (e.g. average_T1/average_T2 decoded as datetime64 by xarray for JULIAN-calendar files) that cannot be multiplied by float weights
  • Comprehensive unit tests added in test_xarrayTimeAverager.py (53 tests across 6 test classes):
    • TestIsNumeric — dtype classification helper for float, int, uint, datetime64, timedelta64, object
    • TestComputeTimeWeights — all three time_bnds encoding paths (float, timedelta64, no-bnds fallback) plus known-value correctness checks
    • TestWeightedTimeMean — numerical correctness, dim elimination, attribute preservation, non-numeric variable handling
    • TestWeightedSeasonalMean — season grouping, MAM/JJA/SON weighted values verified against known formulas
    • TestWeightedMonthlyMean — month dim, 12-month output, 2-year weighted mean correctness
    • TestXarrayTimeAveragerGenerateTimavg — full NetCDF round-trip tests for all/seas/month × weighted/unweighted, per-month file creation, ValueError path, input file not clobbered
  • Parametrized test cases expanded: added missing weighted seas and month cases (single-file and multi-file) to test_generate_time_averages.py
  • Other test updates:
    • New xarray wrapper tests (annual from monthly, annual from annual, monthly from monthly)
    • CDO test skip marks removed — CDO entry point now redirects to xarray, so tests run without python-cdo installed
    • test_cdoTimeAverager.py updated to verify FutureWarning is emitted

Issue ticket number and link (if applicable)

Checklist before requesting a review

  • I ran my code
  • I tried to make my code readable
  • I tried to comment my code
  • I wrote a new test, if applicable
  • I wrote new instructions/documentation, if applicable
  • I ran pytest and inspected it's output
  • I ran pylint and attempted to implement some of it's feedback
  • No print statements; all user-facing info uses logging module

Copilot AI linked an issue Apr 1, 2026 that may be closed by this pull request
Copilot AI and others added 3 commits April 1, 2026 23:51
…ion warnings, remove from required deps

Agent-Logs-Url: https://github.com/NOAA-GFDL/fre-cli/sessions/83cd5af3-5071-4657-ba27-a32d87e1ded5

Co-authored-by: ilaflott <6273252+ilaflott@users.noreply.github.com>
Agent-Logs-Url: https://github.com/NOAA-GFDL/fre-cli/sessions/83cd5af3-5071-4657-ba27-a32d87e1ded5

Co-authored-by: ilaflott <6273252+ilaflott@users.noreply.github.com>
Copilot AI changed the title [WIP] Deprecate cdo/python-cdo use in fre-cli Deprecate cdo/python-cdo use in fre-cli Apr 1, 2026
Copilot AI requested a review from ilaflott April 2, 2026 00:00
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 2, 2026

Codecov Report

❌ Patch coverage is 65.33333% with 52 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.69%. Comparing base (7e80492) to head (a77657e).

Files with missing lines Patch % Lines
...e/app/generate_time_averages/xarrayTimeAverager.py 68.36% 31 Missing ⚠️
...p/generate_time_averages/frenctoolsTimeAverager.py 4.54% 21 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #830      +/-   ##
==========================================
- Coverage   84.12%   83.69%   -0.43%     
==========================================
  Files          71       72       +1     
  Lines        4944     5016      +72     
==========================================
+ Hits         4159     4198      +39     
- Misses        785      818      +33     
Flag Coverage Δ
unittests 83.69% <65.33%> (-0.43%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
fre/app/freapp.py 97.14% <ø> (ø)
fre/app/generate_time_averages/__init__.py 100.00% <ø> (ø)
fre/app/generate_time_averages/cdoTimeAverager.py 100.00% <100.00%> (ø)
...p/generate_time_averages/frepytoolsTimeAverager.py 80.43% <100.00%> (+0.14%) ⬆️
...p/generate_time_averages/generate_time_averages.py 92.75% <100.00%> (+0.95%) ⬆️
...p/generate_time_averages/frenctoolsTimeAverager.py 25.97% <4.54%> (-0.70%) ⬇️
...e/app/generate_time_averages/xarrayTimeAverager.py 68.36% <68.36%> (ø)

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7e80492...a77657e. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@ilaflott
Copy link
Copy Markdown
Member

ilaflott commented Apr 3, 2026

@copilot let's change this from "deprecate" to remove, i slightly misunderstood the meaning in the context of software. also add another mode of time averaging and climatology generation using xarray. tests please. and keep the pkg=cdo entry point to functionality around please, but give a loud warning and say you're going to use xarray instead. and also rename the frepythontools timeaverager the numpytimeaverager. you can follow pylint's recommendation on naming cases for classes etc., but please understand i miss mixed case from C++ and i wish i could configure pylint to allow it (example mixedCase) so if you could figure that out that would be super cool but it's ok if you can't. we are going to need a test for this in the test_cloud_runner CI workflow in NOAA-GFDL/fre-workflows so study those reqs too. hop to it, yee haw!

…er, update tests and pylintrc

Agent-Logs-Url: https://github.com/NOAA-GFDL/fre-cli/sessions/1c5f6605-f239-4dd0-94b7-53b5458889fd

Co-authored-by: ilaflott <6273252+ilaflott@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 3, 2026

@copilot let's change this from "deprecate" to remove, i slightly misunderstood the meaning in the context of software. also add another mode of time averaging and climatology generation using xarray....

Pushed in facdfba:

  • CDO fully removed — cdoTimeAverager is now a stub that warns loudly and redirects to xarrayTimeAverager
  • New xarrayTimeAverager class added with all/seas/month support (weighted + unweighted)
  • frepytoolsTimeAverager renamed to NumpyTimeAverager (old name kept as alias)
  • pkg='cdo' entry point preserved with loud FutureWarning, routes to xarray
  • pkg='xarray' added as CLI choice
  • pylintrc updated with class-rgx to allow mixedCase class names

Known issue: the weighted time-average path (unwgt=False) in xarrayTimeAverager has a dtype conflict when time_bnds are decoded as datetime64 — needs further work. The unweighted path and all other avg_types work correctly. I did not get to studying the test_cloud_runner CI workflow in fre-workflows due to time.

Copilot AI changed the title Deprecate cdo/python-cdo use in fre-cli Remove cdo/python-cdo from fre-cli, add xarray time averager Apr 3, 2026
@ilaflott
Copy link
Copy Markdown
Member

ilaflott commented Apr 3, 2026

@copilot fix unit test failure RE dtype

…er weighted average

Agent-Logs-Url: https://github.com/NOAA-GFDL/fre-cli/sessions/1fb856bc-30a4-4be7-bb77-b2f42b051266

Co-authored-by: ilaflott <6273252+ilaflott@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 3, 2026

@copilot fix unit test failure RE dtype

Fixed in a77657e. There were two root causes:

  1. _compute_time_weights: The test data uses calendar_type = "JULIAN", so xarray uses cftime objects for time_bnds (dtype object). The previous code used hasattr(weights.dt, 'days') which didn't handle cftime timedeltas. Fixed by calling .values to get a plain numpy array and dispatching on dtype.kind == 'm' (timedelta64), dtype == object (cftime), or numeric fallback.

  2. Non-numeric data variables: The atmos file contains average_T1 and average_T2 variables with units = "days since ..." that xarray decodes as datetime64. When the weighted loop tried ds['average_T1'] * float_weights, numpy raised ufunc 'multiply' cannot use operands with types dtype('<M8[ns]') and dtype('float64'). Fixed by adding an _is_numeric() guard in all three weighted helper functions — non-numeric time-dependent variables (datetime64, timedelta64) are retained as their first timestep value rather than being averaged.

@ilaflott
Copy link
Copy Markdown
Member

ilaflott commented Apr 3, 2026

@copilot we need coverage on the xarray time averager. more unit tests!

Copilot AI and others added 2 commits April 3, 2026 06:19
…ghted seas/month parametrized cases

Agent-Logs-Url: https://github.com/NOAA-GFDL/fre-cli/sessions/ac9c29b8-e26b-4463-804e-8f22154798d3

Co-authored-by: ilaflott <6273252+ilaflott@users.noreply.github.com>
Copy link
Copy Markdown
Contributor Author

Copilot AI commented Apr 3, 2026

@copilot we need coverage on the xarray time averager. more unit tests!

Added in 8250249. New test_xarrayTimeAverager.py with 53 tests organized into 6 classes:

  • TestIsNumeric (8 tests) — float32, float64, int32, int64, uint8 → numeric; datetime64, timedelta64, object → not numeric
  • TestComputeTimeWeights (5 tests) — float bounds, timedelta64 bounds, no-bnds uniform fallback, dim name check, 2-timestep known values (31/28 days)
  • TestWeightedTimeMean (7 tests) — known weighted mean (1×31 + 2×28)/(31+28), uniform weights == arithmetic mean, time dim eliminated, static vars preserved, time_bnds excluded, non-numeric (datetime64) vars get first value, attrs preserved
  • TestWeightedSeasonalMean (7 tests) — season dim present, 4 seasons, time_bnds excluded, MAM/JJA/SON weighted values verified against known formulas, non-numeric vars excluded
  • TestWeightedMonthlyMean (6 tests) — month dim present, 12 months, single-year weighted == unweighted, 2-year Jan weighted mean, time_bnds excluded, non-numeric vars excluded
  • TestXarrayTimeAveragerGenerateTimavg (20 tests) — ValueError path, all/seas/month × wgt/unwgt output existence/shape/value checks, per-month file creation (.01.nc.12.nc), MAM seasonal value from file, infile not modified

Also added the missing weighted seas and month parametrized cases to test_generate_time_averages.py.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

deprecate cdo/python-cdo use in fre-cli

2 participants