Skip to content

Use different suffixes for the two ERI comparisons#4920

Merged
billsacks merged 3 commits intoESMCI:masterfrom
billsacks:eri_do_not_clobber_cprnc_out
Jan 24, 2026
Merged

Use different suffixes for the two ERI comparisons#4920
billsacks merged 3 commits intoESMCI:masterfrom
billsacks:eri_do_not_clobber_cprnc_out

Conversation

@billsacks
Copy link
Member

@billsacks billsacks commented Jan 16, 2026

Description

Prior to this change, ERI cprnc.out files from the base-hybrid comparison were clobbered by the cprnc.out files from the base-rest comparison. This PR fixes this issue. It also renames the comparison steps and the cprnc.out files to try to be more intuitive - labeling these cprnc.out files with the name of the case that's being tested, either "branch" (for the branch-vs-hybrid test) or "rest" (for the rest-vs-branch test). The phases are now named COMPARE_branch_hybrid and COMPARE_rest_branch.

Checklist

  • My code follows the style guidelines of this project (black formatting)
  • I have performed a self-review of my own code
  • My changes generate no new warnings
  • I have added tests that exercise my feature/fix and existing tests continue to pass
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding additions and changes to the documentation

Prior to this change, ERI cprnc.out files from the base-hybrid
comparison are clobbered by the cprnc.out files from the base-rest
comparison. This commit fixes this issue.

Resolves ESMCI#4912
@billsacks billsacks requested review from ekluzek and jgfouca January 16, 2026 21:31
@billsacks
Copy link
Member Author

@ekluzek I'd welcome your review since you ran into this problem

@jgfouca I initially requested a review from you because of your familiarity with the test system, but then remembered you said you don't have any ERI tests in E3SM, so actually a review from you probably isn't needed.

@billsacks billsacks removed the request for review from jgfouca January 16, 2026 21:33
@codecov
Copy link

codecov bot commented Jan 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 31.38%. Comparing base (e347176) to head (800e2d7).
⚠️ Report is 38 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4920      +/-   ##
==========================================
- Coverage   36.72%   31.38%   -5.35%     
==========================================
  Files         133      264     +131     
  Lines       19761    38869   +19108     
  Branches     4210     8260    +4050     
==========================================
+ Hits         7258    12199    +4941     
- Misses      11872    25430   +13558     
- Partials      631     1240     +609     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@billsacks
Copy link
Member Author

I have confirmed that, with my changes, ERI_Ld9_P8x1.f10_f10_mg37.I2000Clm50BgcCropQianRs.green_gnu.clm-default passes (as it did before - i.e., I didn't break the ERI test). And, in contrast to before, I now get separate cprnc.out files for the two sets of comparisons.

I also confirmed that the ERI test still reports failures when it should:

With this diff:

diff --git a/src/biogeophys/TemperatureType.F90 b/src/biogeophys/TemperatureType.F90
index 58e4c93e7..ce6362067 100644
--- a/src/biogeophys/TemperatureType.F90
+++ b/src/biogeophys/TemperatureType.F90
@@ -933,6 +933,7 @@ subroutine Restart(this, bounds, ncid, flag, is_simple_buildtemp, is_prog_buildt
     use abortutils      , only : endrun
     use ncdio_pio       , only : file_desc_t, ncd_double, ncd_int
     use restUtilMod
+    use clm_varctl            , only : nsrest, nsrStartup, nsrContinue, nsrBranch
     !
     ! !ARGUMENTS:
     class(temperature_type) :: this
@@ -959,6 +960,10 @@ subroutine Restart(this, bounds, ncid, flag, is_simple_buildtemp, is_prog_buildt
          long_name='vegetation temperature', units='K', &
          interpinic_flag='interp', readvar=readvar, data=this%t_veg_patch)
 
+    if (nsrest == nsrContinue) then
+       this%t_veg_patch = this%t_veg_patch * 1.01_r8
+    end if
+
     call restartvar(ncid=ncid, flag=flag, varname='T_STEM', xtype=ncd_double,  &
          dim1name='pft', &
          long_name='stem temperature', units='K', &

I get a failure in COMPARE_base_rest as expected

and with this diff:

diff --git a/src/biogeophys/TemperatureType.F90 b/src/biogeophys/TemperatureType.F90
index 58e4c93e7..ae468bd8d 100644
--- a/src/biogeophys/TemperatureType.F90
+++ b/src/biogeophys/TemperatureType.F90
@@ -933,6 +933,7 @@ subroutine Restart(this, bounds, ncid, flag, is_simple_buildtemp, is_prog_buildt
     use abortutils      , only : endrun
     use ncdio_pio       , only : file_desc_t, ncd_double, ncd_int
     use restUtilMod
+    use clm_varctl            , only : nsrest, nsrStartup, nsrContinue, nsrBranch
     !
     ! !ARGUMENTS:
     class(temperature_type) :: this
@@ -959,6 +960,10 @@ subroutine Restart(this, bounds, ncid, flag, is_simple_buildtemp, is_prog_buildt
          long_name='vegetation temperature', units='K', &
          interpinic_flag='interp', readvar=readvar, data=this%t_veg_patch)
 
+    if (nsrest == nsrBranch .and. flag == 'read') then
+       this%t_veg_patch = this%t_veg_patch * 1.01_r8
+    end if
+
     call restartvar(ncid=ncid, flag=flag, varname='T_STEM', xtype=ncd_double,  &
          dim1name='pft', &
          long_name='stem temperature', units='K', &

I get a failure in COMPARE_hybrid_base, as expected

@billsacks
Copy link
Member Author

I'll also note that I did a pretty detailed dive into the code to see if I would break anything by reordering the suffixes. Specifically, I looked for any asymmetries between the two suffixes. For the most part it looks like the two are treated symmetrically. But:

  • The cprnc output file name is based on suffix1… this is the asymmetry that created this issue in the first place… what I'm really interested in is whether that asymmetry shows up elsewhere
  • I didn't dive into the cprnc Fortran code to verify that it is symmetrical, though my recollection is that it is, or at least that it's nearly symmetrical (see also CIME calls cprnc in a non-symmetric way #4303)
  • One thing I worried about was the asymmetry of COMPARISON_FAILURE_COMMENT_OPTIONS. But this only appears in a function that says that it is meant to be called from baseline comparison, not in-test comparison.

Based on this, I feel fine about swapping the suffixes in the ERI test.

@billsacks
Copy link
Member Author

Actually, @ekluzek - please hold off on reviewing: I'm going to make another small change to address #4896 .

Previously it had a "base" suffix, which was vague for this test.
This way the cprnc output iles will be named with the test case rather
than the control case, which is more intuitive.
@billsacks billsacks requested a review from mnlevy1981 January 16, 2026 22:46
@billsacks
Copy link
Member Author

@ekluzek and @mnlevy1981 - this is now ready for review. See the edited description in the top-level comment in this PR. I have redone the testing that I noted above with the latest version of this code to ensure that an ERI passes when it should and fails in the correct comparison step when it should.

Copy link
Contributor

@ekluzek ekluzek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I think this is good. I looked at it close enough that I could understand what's going on and it makes sense. I have two comments that you can close, that are just my thinking out loud.

In the first one, I suggest possibly labelling the suffix for the first case run (currently using suffix==None). It isn't compared to anything so doesn't matter, and I think it's pretty common for the first case to use None here. But, since there are so many cases run here, it seemed like it could be helpful to label it?

Your reasoning and testing all makes sense to me.

So approving.

Copy link
Contributor

@mnlevy1981 mnlevy1981 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks much clearer than what we had before!

@billsacks billsacks merged commit 81cda06 into ESMCI:master Jan 24, 2026
12 of 13 checks passed
@billsacks billsacks deleted the eri_do_not_clobber_cprnc_out branch January 24, 2026 04:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

3 participants