Homme(xx)/SL: Enhanced trajectory method. #6874

ambrad · 2025-01-07T00:42:54Z

This PR brings in a new feature that (1) increases accuracy of semi-Lagrangian tracer transport's trajectory calculations and (2) permits flexible trade-off between the trajectory accuracy and speed. This PR has the following parts:

- F90 dycore support, with unit tests (principally in sl_advection.F90)
- C++ dycore support, with unit tests (principally in ComposeTransportImplEnhancedTrajectory.cpp)
- unit test driver updates: compose_ut.cpp
- two new standalone-Homme tests, one each for the F90 and C++ dycores
- new standalone-Homme transport test module for convergence testing: fully 3D, space-and-time-dependent surface pressure (dcmip2012_test1_conv_mod.F90)
- CIME-based ERS tests, one each, for EAM and EAMxx
- cleanup of a timer issue orthogonal to this PR: see commit "Hommexx: Rework skipping timers in first step."
- updates to Homme machine files for Perlmutter
- fix C++ dycore's handling of prescribed winds: had to move down in the call stack to match the F90 dycore

[non-BFB] due to two new CIME tests, two new standalone-Homme tests, and two modified standalone-Homme tests; otherwise BFB

Make this a cmake define triggered by 'SYCL_BUILD'. Move it outside of the threaded region for consistency with t_dis/enablef usage. Switch based on nstep==0. Use a local boolean to avoid repeated calls to t_enablef. (The define prevents any of this code from mattering in all cases except the intended application.)

Also add parameters to namelist definition file.

ambrad · 2025-01-07T00:45:05Z

This comment will be updated with testing results.

Confluence page
homme_integration failures look like this:

HOMMEBFB_P24.f19_g16_rx1.A.chrysalis_intel.C.20250106_175602_oau21c/TestStatus.log:
The following tests FAILED:
    1 - verifyBaselineResults (Failed)
   41 - thetah-sl-test11conv-r1t2-cdr20 (Failed)
   42 - thetah-sl-test11conv-r0t1-cdr30-rrm (Failed)
   44 - thetah-sl-testconv-3e (Failed)
   47 - thetah-sl-test11conv-r0t1-cdr30-rrm-kokkos (Failed)
   48 - thetah-sl-testconv-3e-kokkos (Failed)
HOMME_P24.f19_g16_rx1.A.chrysalis_intel.C.20250106_175602_oau21c/TestStatus.log:
The following tests FAILED:
    1 - verifyBaselineResults (Failed)
   30 - thetah-sl-test11conv-r1t2-cdr20 (Failed)
   31 - thetah-sl-test11conv-r0t1-cdr30-rrm (Failed)
   33 - thetah-sl-testconv-3e (Failed)

The CI checks that fail all show just this one known test failure: mam4_aero_microphys_standalone_baseline_cmp
e3sm_atm_integration on Chrysalis has all PASS except the new test, which doesn't have a baseline
e3sm_developer on Chrysalis passes
Perlmutter e3sm_scream_v1: e3sm_scream_v1 passes against baselines except for (of course) the new sl_nsubstep2 test
Perlmutter performance check: Good. (See below for description.)
Frontier e3sm_scream_v1_medres: Passes. Note: we don't have baselines on Frontier, and _D builds outside of e3sm_scream_v1_medres but inside e3sm_scream_v1 fail to build due to an ICE (known to us).
Frontier performance check: Good.

The performance check compares master and branch on the case SMS_Ln300.ne30pg2_ne30pg2.F2010-SCREAMv1.scream-perf_test--scream-output-preset-1--scream-L128. This does not exercise the new trajectory method; rather, it's checking for any accidental performance regressions. I use the line

grep "RUN_LOOP\"\|compose_transport\|trajectory\|cedr\|vertical\|dirk\|caar" timing/model_timing_stats

to look at representative timers.

oksanaguba · 2025-01-08T16:28:03Z

This needs much more extensive explanation on what's new and how to use the new SL (and advantages).
Could you please create a confluence page with these details? i assume you have a lot of plots, too, add those there as well.

ambrad · 2025-01-08T16:31:04Z

> Could you please create a confluence page with these details? i assume you have a lot of plots, too, add those there as well.

Yes, I'm working on that. I'll post a link when I have a sufficiently useful draft.

(To turn it on, set semi_lagrange_trajectory_nsubstep = N for N > 0.)
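As a rough illustration of why a substep count acts as an accuracy/speed knob, the following C++ sketch integrates a backward trajectory with a configurable number of substeps. It is not the PR's algorithm: the flow field (steady solid-body rotation in a plane), the forward Euler integrator, and every name in it (`velocity`, `departure_point`, `departure_error`) are assumptions chosen for brevity. It only shows that more substeps cost more velocity evaluations and reduce the departure-point error.

```cpp
#include <cassert>
#include <cmath>

struct Point { double x, y; };

// Hypothetical flow field: steady solid-body rotation with angular speed omega.
static Point velocity (const Point& p, double omega) {
  return {-omega*p.y, omega*p.x};
}

// Integrate backward in time over dt from the arrival point, using nsubstep
// forward Euler substeps. Each substep costs one velocity evaluation.
Point departure_point (Point arr, double dt, double omega, int nsubstep) {
  const double h = dt/nsubstep;
  Point p = arr;
  for (int i = 0; i < nsubstep; ++i) {
    const Point v = velocity(p, omega);
    p.x -= h*v.x;
    p.y -= h*v.y;
  }
  return p;
}

// Distance between the computed departure point and the exact one, which for
// solid-body rotation is the arrival point rotated by the angle -omega*dt.
double departure_error (int nsubstep) {
  const double dt = 0.5, omega = 1.0;  // assumed values, for illustration only
  const Point arr{1.0, 0.0};
  const Point p = departure_point(arr, dt, omega, nsubstep);
  const double ex = std::cos(-omega*dt), ey = std::sin(-omega*dt);
  return std::hypot(p.x - ex, p.y - ey);
}
```

With these assumed parameters, `departure_error(4)` is smaller than `departure_error(1)`, at four times the number of velocity evaluations.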

oksanaguba · 2025-01-08T17:09:45Z

re fix C++ dycore's handling of prescribed winds: had to move down in the call stack to match the F90 dycore -- was it a bug and/or did you move the call for new functionality?

please list all new/modified tests explicitly with more explanation behind them. homme modified tests -- are they for your new functionality, or are they modified because moved prescribed winds call is nonbfb?

oksanaguba · 2025-01-08T17:10:58Z

oh sounds like there are also new unit tests?

oksanaguba

requires extensive documentation on confluence before being approved

ambrad · 2025-01-08T17:15:11Z

I'll describe the tests on Confluence.

> homme modified tests -- are they for your new functionality, or are they modified because moved prescribed winds call is nonbfb?

This refers to a pre-existing H/Hxx SL BFB test pair that I've switched to one of the new test flow fields. The answers change since the flow field is different. And then there are two new standalone-Homme tests that revealed the difference in the prescribed-wind call stack in the C++ vs F90 dycores.

ambrad · 2025-01-08T19:02:45Z

@oksanaguba I put a link to a Confluence page above. I'll add draft material today.

ambrad · 2025-01-09T06:59:37Z

> oh sounds like there are also new unit tests?

Yes, quite a number of them. They are called from the compose_ut.cpp unit test driver (as usual) and are in sl_advection.F90 and ComposeTransportImplEnhancedTrajectoryTests.cpp.

ambrad · 2025-01-09T07:04:43Z

> please list all new/modified tests explicitly with more explanation behind them.

   41 - thetah-sl-test11conv-r1t2-cdr20     old test but with a slightly modified flow field
   42 - thetah-sl-test11conv-r0t1-cdr30-rrm     old test but with a slightly modified flow field
   47 - thetah-sl-test11conv-r0t1-cdr30-rrm-kokkos     Hommexx equivalent of above
   44 - thetah-sl-testconv-3e     new test with new flow field; uses the new trajectory method
   48 - thetah-sl-testconv-3e-kokkos     Hommexx equivalent of above

bartgol

My knowledge on the SL transport and the compose lib is very limited, so it was hard to do a good review. I would rely on folks from the compose proj for a better review.

That said, I think the code could use the injection of a fair amount of comments. The code is very "dense", which makes it hard to read. I know you are owning this code (and I'm glad you are), but for the sake of who may come next to maintain/modify the code (should you move to a different project), it would help a lot to have more inline guidance.

bartgol · 2025-01-09T16:49:00Z

components/homme/src/share/compose/compose_slmm_islmpi.hpp

@@ -251,6 +259,40 @@ void deep_copy (FixedCapList<T, DTD>& d, const FixedCapList<T, DTS>& s) {
 #endif
 }

+template <typename T>
+struct FixedCapListHostOnly {


What's the difference between this class and using a plain std::vector<T>? The capacity feature is analogous to the vector "reserve" feature, no?

It's true, but I have a general FixedCapList for most use cases on device and host. I made this HostOnly version to avoid host-device warnings having to do with mpi::Request. I could use std::vector, but (1) I like having slmm_kernel_assert_high in the impl and (2) I can use the same user code instead of having to switch it to std::vector syntax.
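To make the distinction from `std::vector` concrete, here is a minimal sketch of a host-only fixed-capacity list. It is not the `FixedCapListHostOnly` from the PR: the class name `FixedCapListSketch`, its method names, and the use of plain `assert` in place of `slmm_kernel_assert_high` are all assumptions. The point it illustrates is that exceeding the capacity is treated as a checked error, whereas a `std::vector` with `reserve` would silently reallocate.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical fixed-capacity list for host code. Storage is allocated once
// in the constructor and never grows.
template <typename T>
class FixedCapListSketch {
  std::unique_ptr<T[]> data_;
  std::size_t cap_, n_;

public:
  explicit FixedCapListSketch (std::size_t cap)
    : data_(new T[cap]), cap_(cap), n_(0) {}

  std::size_t size () const { return n_; }
  std::size_t capacity () const { return cap_; }

  // Forget the contents but keep the storage.
  void clear () { n_ = 0; }

  // Append an element. Running past the capacity is a programming error that
  // the assertion catches; std::vector would reallocate instead.
  void push_back (const T& v) {
    assert(n_ < cap_);
    data_[n_++] = v;
  }

  // Bounds-checked element access.
  T& operator[] (std::size_t i) { assert(i < n_); return data_[i]; }
  const T& operator[] (std::size_t i) const { assert(i < n_); return data_[i]; }
};
```

A caller would construct it with the known maximum size, for example `FixedCapListSketch<int> reqs(3);`, and then use `push_back`, `size`, and `clear` as with a vector.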

bartgol · 2025-01-09T17:43:46Z

components/homme/src/share/cxx/ComposeTransportImplEnhancedTrajectory.cpp

+  GPTLstop("compose_calc_enhanced_trajectory");
+}
+
+// Testing.


Any chance we can put the testing stuff in a separate file? Not so much so that we can skip compilation outside of testing (it's small enough), but mostly for maintenance. Seeing a 2k+ lines file is always discouraging :)

I like to keep unit tests collocated with the routines they are testing. However, that is just my own opinion. Where would you like me to move them? test_execs/thetal_kokkos_ut?

Actually, I would at least like to keep the tests in the main directory. I really like being able to call a unit-test routine for a class anywhere I want.

Mods pushed.

bartgol · 2025-01-09T17:52:31Z

components/homme/src/share/cxx/ComposeTransportImplEnhancedTrajectory.cpp

+}
+
+KOKKOS_FUNCTION void
+eta_interp_eta (const KernelVariables& kv, const int nlev,


Quite a few fcns in this file are without a comment. Would you mind adding a quick 2-liner at the top of each of them, explaining what they do (maybe throw in a formula, if that's easier)?

Mods pushed.

bartgol · 2025-01-09T17:58:40Z

components/homme/src/share/cxx/ComposeTransportImplEnhancedTrajectory.cpp

+    const auto wrk4 = Homme::subview(buf1d, kv.team_idx);
+    const auto vwrk = Homme::subview(buf2a, kv.team_idx);
+    // Reconstruct Lagrangian levels at t1 on arrival column:
+    //     eta_arr_int = I[eta_ref_mid([0,eta_dep_mid,1])](eta_ref_int)


Comments are super helpful in a complex, code-rich folder like compose. Here, I am not clear on the notation, and was wondering if you could make the comment a bit clearer. In particular:

what does the notation eta([0,eta_dep_mid,1]) mean? What's the inner vector meaning?

what is the notation I[x](y)?

I'll add comments. Re: these specific questions:

I[y(x)](xi) means an interpolant constructed from y(x) and evaluated at xi.

[0,eta_dep_mid,1] is the concatenation of the eta departure midpoint values with 0 and 1.

Mods pushed.
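The notation explained in this thread can be sketched in C++ as follows. This is only an illustration under stated assumptions: the interpolant is taken to be piecewise linear (the actual code may use a different one), the node values are padded with 0 and 1 in the same way as the nodes, and the names `interp`, `pad01`, and `reconstruct_interfaces` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// I[y(x)](xi): evaluate at xi the piecewise-linear interpolant through the
// nodes (x[i], y[i]). Assumes x is strictly increasing with at least two
// entries; xi outside [x.front(), x.back()] is clamped to the end values.
double interp (const Vec& x, const Vec& y, double xi) {
  if (xi <= x.front()) return y.front();
  if (xi >= x.back()) return y.back();
  std::size_t i = 1;
  while (x[i] < xi) ++i;  // find the first node at or beyond xi
  const double a = (xi - x[i-1])/(x[i] - x[i-1]);
  return (1 - a)*y[i-1] + a*y[i];
}

// Return the concatenation [0, v, 1].
Vec pad01 (const Vec& v) {
  Vec out;
  out.reserve(v.size() + 2);
  out.push_back(0.0);
  out.insert(out.end(), v.begin(), v.end());
  out.push_back(1.0);
  return out;
}

// eta_arr_int = I[eta_ref_mid([0,eta_dep_mid,1])](eta_ref_int), with linear I.
Vec reconstruct_interfaces (const Vec& eta_dep_mid, const Vec& eta_ref_mid,
                            const Vec& eta_ref_int) {
  const Vec x = pad01(eta_dep_mid);  // interpolation nodes
  const Vec y = pad01(eta_ref_mid);  // node values; this padding is an assumption
  Vec out;
  for (const double e : eta_ref_int) out.push_back(interp(x, y, e));
  return out;
}
```

When the departure midpoints coincide with the reference midpoints, this sketch returns the reference interfaces unchanged, which is a convenient sanity check.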

bartgol · 2025-01-09T18:23:45Z

components/homme/src/share/compose/compose_cedr_sl_run_global.cpp

@@ -5,6 +5,76 @@
 namespace homme {
 namespace sl {

+template <int np_, typename MT>
+void run_relaxed_local (CDR<MT>& cdr, const Data& d, Real* q_min_r,


Can you add some comment (inline or at the top of the fcn) regarding what's happening in here?

Perhaps also for the run_global fcn. I am not familiar with the algo, which doesn't help, but the code is a bit dense, which makes it hard to read anyways. Some comments (even just a big picture, not necessarily fine-grained) may help those who one day may have to maintain this.

I'll remove these CEDR additions. They correspond to Sect. 2.2.4 of https://gmd.copernicus.org/articles/15/6285/2022/gmd-15-6285-2022.html. I had these lying around and unwisely added them to this already lengthy PR.

Mods pushed.

bartgol · 2025-01-09T18:32:13Z

components/homme/src/share/compose/compose_homme.cpp

+template <typename MT>
+void sl_h2d (TracerArrays<MT>& ta, bool transfer, Real* dep_points, Int ndim) {
+#if defined COMPOSE_PORT
+# if defined COMPOSE_HORIZ_OPENMP


If COMPOSE_PORT is defined, then we're building for the kokkos dycore, right? In that case we should not have horiz openmp, right? I thought horiz openmp was used in the f90 dycore only...

Same in a few other places.

The Kokkos code path is supported in the F90 dycore; it's just not used in practice because of array ordering (requiring transposes and extra memory) and needing to support HORIZ_OPENMP on CPUs.

but the cmake logic seems to be such that COMPOSE_PORT is only true for the compose_f90 lib, which is the only one linked against the f90 targets. Maybe I'm misreading the cmake logic though...

(For the composec++ lib, not the composef90 one, but I get your point.) Yes, that's right, because in practice we don't want to use COMPOSE_PORT for the F90 dycore. That is, nobody would purposely build a dycore that way.

But it does actually work. For such a niche thing, I haven't wanted to waste testing resources to keep it working, but I use the capability when doing dev work on occasion. In short, I do want to retain lines of the sort you're highlighting even though our automated builds never activate them. Note the lines you highlighted correspond to a bridge routine to do the copies/transposes. (Detail: The bridge routine is also used in compose_ut unit tests for F90-vs-C++ comparisons, but of course in this case HORIZ_OPENMP is, as you pointed out, always false.)
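The copies/transposes mentioned in this thread stem from array ordering. A minimal sketch of a layout-converting copy is given below, assuming a 2D array and a hypothetical function name `fortran_to_c_layout`. The actual bridge routine's arrays, dimensions, and preprocessor guards (`COMPOSE_PORT`, `COMPOSE_HORIZ_OPENMP`) are not reproduced here.

```cpp
#include <cassert>
#include <vector>

// Copy a logically (n0 x n1) array from Fortran layout (first index fastest)
// to C layout (last index fastest). The destination is a second buffer, which
// is the extra memory such a conversion requires.
std::vector<double> fortran_to_c_layout (const std::vector<double>& src,
                                         int n0, int n1) {
  std::vector<double> dst(src.size());
  for (int i0 = 0; i0 < n0; ++i0)
    for (int i1 = 0; i1 < n1; ++i1)
      dst[i0*n1 + i1] = src[i0 + n0*i1];
  return dst;
}
```

For a 2 x 3 array stored in Fortran order as 1, 2, 3, 4, 5, 6, the C-ordered result is 1, 3, 5, 2, 4, 6.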

I'll add this in a different PR. However, keep the generalized indexing changes and extra argument in the Homme interface because they are useful.

ambrad · 2025-01-09T22:10:11Z

@bartgol I've responded to all your comments. In most cases I've pushed associated code mods. In some cases I've explained why something is the way it is.

ambrad · 2025-01-10T03:08:42Z

@oksanaguba

> requires extensive documentation on confluence before being approved

I've added a Confluence page and more comments to the code. See in particular the comments starting here.

I've responded to all your points above.

Is there anything more you would like before approving? I'd like to start merging this PR on Monday or Tuesday. Thanks.

This PR brings in a new feature that (1) increases accuracy of semi-Lagrangian tracer transport's trajectory calculations and (2) permits flexible trade-off between the trajectory accuracy and speed. This PR has the following parts:

- F90 dycore support, with unit tests (principally in sl_advection.F90)
- C++ dycore support, with unit tests (principally in ComposeTransportImplEnhancedTrajectory.cpp)
- unit test driver updates: compose_ut.cpp
- two new standalone-Homme tests, one each for the F90 and C++ dycores
- new standalone-Homme transport test module for convergence testing: fully 3D, space-and-time-dependent surface pressure (dcmip2012_test1_conv_mod.F90)
- CIME-based ERS tests, one each, for EAM and EAMxx
- cleanup of a timer issue orthogonal to this PR: see commit 'Hommexx: Rework skipping timers in first step.'
- updates to Homme machine files for Perlmutter
- fix C++ dycore's handling of prescribed winds: had to move down in the call stack to match the F90 dycore

e3sm_developer and e3sm_atm_integration pass on Chrysalis. EAMxx test suites pass on Perlmutter and Frontier. There are no performance effects when the stealth feature is off based on tests on Chrysalis (v3.LR 11-year control run), Frontier, and Perlmutter.

[non-BFB] due to two new CIME tests, two new standalone-Homme tests, and two modified standalone-Homme tests; otherwise BFB.

ambrad and others added 5 commits January 6, 2025 17:54

Homme: Update Perlmutter machine file.

98a20f0

Homme: Update perlmutter machine file with PR 6423 fix.

7346ece

Homme(xx)/SL: Enhanced trajectory method.

7948e5b

EAM(xx): Add CIME EAM and EAMxx tests for SL enhanced trajectory method.

4d1bddb

Also add parameters to namelist definition file.

ambrad self-assigned this Jan 7, 2025

ambrad added labels Jan 7, 2025: Stealth (PR has feature which, if turned on, could change climate; fka FCC), HOMME, HOMME standalone (issues with the standalone HOMME code that don't impact E3SM), EAMxx (PRs focused on capabilities for EAMxx), non-BFB (PR makes roundoff changes to answers)

ambrad requested review from bartgol, mt5555 and oksanaguba January 7, 2025 00:45

ambrad added 2 commits January 7, 2025 15:52

Hommexx: Add a timer for vertical_remap of dynamics variables.

2da1a3b

Homme/SL: Adjust auto-setting of halo.

be41c72

oksanaguba requested changes Jan 8, 2025

bartgol reviewed Jan 9, 2025

ambrad added 3 commits January 9, 2025 14:02

Homme/SL: Remove CAAS-point impl for now.

04d1673

I'll add this in a different PR. However, keep the generalized indexing changes and extra argument in the Homme interface because they are useful.

Hommexx/SL: Break main impl file into three.

f7958a8

Homme/SL: Rearrange some code; add more comments.

d8f088f

bartgol approved these changes Jan 9, 2025

ambrad requested a review from oksanaguba January 13, 2025 19:20

oksanaguba approved these changes Jan 13, 2025

ambrad merged commit 3dfcac8 into E3SM-Project:master Jan 14, 2025
19 of 22 checks passed
