Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EAMXX: add a DataInterpolation class #6812

Merged
merged 15 commits into from
Jan 14, 2025
Merged

Conversation

bartgol
Copy link
Contributor

@bartgol bartgol commented Dec 6, 2024

The class is in charge of reading in time-dependent datasets from input file(s), interpolate to current model time, and possibly do horizontal/vertical interpolation


The implementation is not yet complete, but some early thoughts on whether we're on the right track or not may help. Besides, getting some early testing on ghci machines can also help.

Things left to do:

  • testing for time-only interpolation for Linear timeline
  • add creation of vert/horiz remappers: this needs some extra input from parameter list (a map file, and for vert remap some choices for top/bot extrapolation)
  • testing for time+horiz, time+vert, and time+horiz+vert
  • add possibility of using an "IOP remapper" (a closest-column kind for horiz interpolation). This may allow the upcoming IOP forcing atm proc to use the DataIntepolation routine directly.
  • start using DataInterpolation in eamxx. E.g., use in SPA, and fix the alloc issue (see linked issue)

I am NOT implementing a data interpolation for variables without a time dependence. It simply reduces to constructing 2 remappers, and since non-time dep data is most likely load-able at init, there's no real need for a complex data structure. Time interpolation is the most tricky of the 3 anyways, due to the different needs (nudging uses Linear timeline, while SPA uses a YearlyPeriodic one)

For reviewers: i recommend reviewing one commit at a time. That would help isolate concepts, so you can hopefully better follow. The hardest thing to review is the data interpolation unit tests, since there is quite a bit of code needed to generate cases that we can manually check without redoing the same operation as the interpolation class (copy+paste of src code in tests folder doesn't help).

Fixes #6810
Fixes #6820

@bartgol bartgol self-assigned this Dec 6, 2024
@bartgol bartgol added EAMxx PRs focused on capabilities for EAMxx code usability code cleanup labels Dec 6, 2024
Copy link

github-actions bot commented Dec 6, 2024

PR Preview Action v1.4.8
🚀 Deployed preview to https://E3SM-Project.github.io/E3SM/pr-preview/pr-6812/
on branch gh-pages at 2024-12-24 00:49 UTC

@bartgol bartgol force-pushed the bartgol/eamxx/data-interpolation branch 4 times, most recently from eb23fd7 to 6bfbb0f Compare December 6, 2024 21:52
@bartgol bartgol marked this pull request as draft December 7, 2024 00:33
Copy link
Contributor

@AaronDonahue AaronDonahue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to break this plan into three PRs. This one which introduces the DataInterpolation Class and applies it to time interpolation and then subsequent ones with horiz and vertical?

components/eamxx/src/share/util/scream_time_stamp.hpp Outdated Show resolved Hide resolved
@bartgol
Copy link
Contributor Author

bartgol commented Dec 9, 2024

Would it make sense to break this plan into three PRs. This one which introduces the DataInterpolation Class and applies it to time interpolation and then subsequent ones with horiz and vertical?

I don't think so. The three interpolations are tightly coupled via remappers. Sure, the remapper can be IdentityRemapper if no remap is requested, but the structure is there. I think it may be confusing to integrate a feature that seems to support something, but it doesn't.

For reviewers: I try to isolate concepts into their own commits. So one way to make reviews easier is to review one commit at a time.

Copy link
Contributor

@tcclevenger tcclevenger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Working through the PR. Here are some comments from "EAMxx: add a TimeInterval lightweight struct"

components/eamxx/src/share/util/scream_time_stamp.hpp Outdated Show resolved Hide resolved
components/eamxx/src/share/util/scream_time_stamp.hpp Outdated Show resolved Hide resolved
@mahf708
Copy link
Contributor

mahf708 commented Dec 12, 2024

Two notes:

  • naming convention: in eamland, what you call yearly_periodic is called CYCLICAL in the namelist. I think your naming convention/choice is much better, but just letting you know in case you want to stick with ancient eam stuff. The linear one is called "INTERP_MISSING_MONTHS" in the namelist, but I'd call it transient (the TR in the compset name). To reiterate, I would vote for your new naming convention, this is jfyi
  • as a matter of design, I would have thought the generic utility envisioned here would accept as input args (maybe in the setup call) the following. I think you're mostly doing this right now, albeit through parameterlist. I am not sure which is better, but I think doing it directly may be better? Idk ...
    1. map of fieldnames in (names of fields in files) to fields out (fields known to FM)
    2. for each field in 1, timeline type (periodic, transient, etc.)
    3. for each field in 1, integer period if periodic, else null
    4. for each field in 1, filename (possibly allowing globbing?)
    5. (I'd skip this one, but someone might complain:) for each field in 1, interpolation methods

The long-term goal is twofold:

  1. Unify the impls in a central location
  2. Allow for some flexibility (i.e., a process may want to read Field X as cyclical but Field Y as transient in the same runtime)

Edit: edited above, but now I guess goal number 2 can be automatically addressed if the DataInterpolation class is supposed to be taking care of only one field at a time. Is that what you intend? (I wasn't sure based on filenames and variables in the code --- note that files can have different variables for different purposes, not all read in the same processes, and sometimes a process will need to access different files for different fields.) If you intend an instant of the class for each field, that would certainly make life much easier, but maybe with perf hit?

@bartgol
Copy link
Contributor Author

bartgol commented Dec 16, 2024

@mahf708 Yeah, I decided to handle different inputs in the parameterlist, so that customer atm procs can simply fwd the sublist in their param list, rather than having to read it, just to pass it in. That said, it is true that the param list hides a bit the requirements. That is, it's not clear from the header what the expectations are on the input PL. Perhaps a constructor listing explicitly what should be inside is more meaningful, and may be more self-documenting. I will think about an alternative impl, and if I see it doesn't get too complex, I'll switch.

@bartgol bartgol force-pushed the bartgol/eamxx/data-interpolation branch 4 times, most recently from 415f59a to 83b7135 Compare December 24, 2024 00:44
@bartgol bartgol marked this pull request as ready for review December 24, 2024 00:45
* Allow to set only one src or tgt pressure profile. This is because
  we may not have any mid or any int field. No need to force us to
  create a valid field just for passing checks.
* Allow to query the remapper for the src/tgt mid/int pressure profile
* The class handles time, horizontal, and vertical interpolation
* The vertical/horizontal are optional, but the time dimension is REQUIRED
We were still assuming tgt layout could only have LEV
and not ILEV.
@bartgol bartgol force-pushed the bartgol/eamxx/data-interpolation branch from 83b7135 to adc3957 Compare December 24, 2024 00:46
@bartgol
Copy link
Contributor Author

bartgol commented Dec 24, 2024

@AaronDonahue @mahf708 @tcclevenger This PR is finally ready for review, just in time for xmas!

A few comments on what you'll find in it:

  • time stamp: added some QOL methods, and added a TimeInterval lightweight class. The latter has some logic to handle differently linear history and yearly periodic history.
  • IO mods: I wanted to be able to swap fields pointers in an input class without rebuilding it all. So I added a small capability there. Also, use some of the QOL methods above in IO control
  • vertical remap: allow more fine-grained handling of the vertical pressure coordinate. In particular:
    • the tgt fields CAN have interfaces: before, this was not the case cause we only used it in the model->file direction, but DataInterpolation uses it in the other direction
    • allow src fields to NEVER be at interfaces (that is, mid tgt field can remap to both mid/int tgt fields): again, this is b/c of the new "backward" use w.r.t. what's done in output
    • allow to set and query pressure profiles
  • DataInterpolation: this is the big class, but it came out relatively skinny. Highlights:
    • hremap and vremap are optional (we create identity remappers if not needed), but time interpolation is MANDATORY. Customers that have constant data can do the work themselves at init time
    • vertical remap, if not identity, can accept as a src pressure (i.e., in the input files) either a constant 1d profile or a time-dependent 3d profile. If other use cases will be needed, we'll adjust.
    • hremap follows the same restrictions as usual: it can only be identity or a refining remap (i.e. data on a COARSER grid). At some point, I want to get rid of this limitation, but it would require some serious work on the horiz remapper classes.
    • we store beg/end copies of fields, corresponding to the time interval [beg,end] in the data that contains the model time
    • setup in 3 phases: create time database, create remappers, init the end state to the last t_beg such that t_beg<t0. This will trigger an update of the end state during the 1st timestep. I opted for this (rather than load t_beg/t_end at init) since it makes the runtime code more uniform.
    • runtime is almost trivial, except for the data interval update. There, we swap the content of beg/end structures, and reload a new slice in end
  • data interp testing: this is not so trivial. It wasn't easy to come up with a framework that made the interpolation doable "by hand" without copy+pasting the code in the class itself. It all boils down to how the data is generated, which is roughly like this:
  dh = 1.0/(ngcols-1);
  dv = 1.0 / nlevs;
  h_value = col_gid * dh* + 1.0; // +1 ensures we don't have a col full of zeros at gid=0.
  v_value = lev * dv*;
  data(icol,ilev) = h_value*v_value; // for interfaces
  data(icol,ilev) = h_value*(v_value + dv/2); // for midpoints
  • We generate data for scalar/vector fields, that are either 2d or 3d. If 3d, they can be either midpoints or interface fields. So the list is s2d,v2d,s3d_m,v3d_m,s3d_i,v3d_i with obvious meaning of names.
  • The generated pressure profiles are taken as p3d = s3d_m, and p1d is a vertical slice of p3d (ensuring all ranks agree on it).
  • For vert remap testing:
    • in case of a 1d static pressure profile, I make the model pressure (the tgt pressure) have the SAME profile at all colums. This allows me to do away with vertical interpolation when computing the expected answer. That's because the model pressure in each col always matches what the input files claim is the pressure that it was generated at. Well, that would be true if nlevs was the same, but it's not. What we can say is that since the model pressure is computed in the same way at each col as it was computed when generating the data (as a fcn of the level), the way I manually compute the expected fields is consistent with what the vertical remapper produces. So there IS remapping in the data interp class, but I can directly compute the answer in the tests based on col/lev indices, without having to consider what the model pressure is
    • a similar argument as in the prev case, except that I don't set model_p = p1d at all columns. But the fact that I compute the pressure in a way consistent with how the data pressure profiles were generated allows me to compute the expected values for the fields directly from the col/lev index, without having to consider the value of pressure at that index.
  • misc warnings/cmake fixes

@mahf708 I opted for your version of the class, where inputs are specified via input parameters to ctor/setup methods, rather than relying on parameter lists. The different customers (spa, nudging, mam, iop (the latter not yet supported)) are therefore free to decide the input file syntax to trigger different behaviors. It is pointless to use a param list ctor for a single class that only has one constructor. Param list ctors are more apt for stuff that is constructed via factories (and hence has multiple concrete instanes, which may have different needs), so that we can have one interface to rule them all.

I recommend reviewsing one commit at a time, since it may help isolating concepts. Otherwise, review everything but the data interp class/tests first, and then focus on the data interp stuff.

Happy holidays!

Edit: I noticed a few extra baseline cmp failures on cuda. Since I don't see those tests failing on master, I suspect they are related to some mods in either the timestamp class or the io layer. I don't foresee them being very hard to find (famous last words), but I also think I may be fixing that next year. You can start to review at your earlies convenience (which may very well be next year). If you do before the fix comes in, you may get pinged again once that goes in.

@bartgol bartgol force-pushed the bartgol/eamxx/data-interpolation branch from adc3957 to f4bc654 Compare December 24, 2024 01:26
* The curr_month_beg method was using the wront time of day
* The next_write_ts method was not considering a corner case for
month/year frequence. Resetting to what was previously in master.
@bartgol bartgol force-pushed the bartgol/eamxx/data-interpolation branch from f913600 to b5ea775 Compare January 8, 2025 21:25
@bartgol
Copy link
Contributor Author

bartgol commented Jan 9, 2025

Expected fails on cuda. The v1 fail was during cleanup, when something went amiss for some reason. All is good.

@tcclevenger @AaronDonahue @mahf708 would you mind taking a look?

Copy link
Contributor

@mahf708 mahf708 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is good! I looked over all edits quickly, and I focused most of my review on the data_interpolation_ test files. I have two broad comments, but we don't need to deal with them right away.

  • I think it would help if we add a test reading an existing file used for production (say an IOP file or some sort of another) with the caveat that we might want the field in the file and the field in the FM/code to have different names (I assume currently, we should do this with aliasing, right?)
  • If there's any way to reduce the length of the main eamxx_data_interpolation cpp file, that would be great. I remember we complained that our IO code was getting out of hand in terms of length. I notice the current file is ~450 lines. If we can bring that down to half that length (by extracting some stuff away to a util file or something) that may be nice

Again, nothing urgent to address, just thinking out loud :)

Thanks a lot for this, I think it will be a significant improvement, and I am looking forward to rewiring the process interfaces such that we use the new shiny tools!!!

Copy link
Contributor

@tcclevenger tcclevenger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I'll think about what IOP would need to use this

@bartgol
Copy link
Contributor Author

bartgol commented Jan 9, 2025

  • I think it would help if we add a test reading an existing file used for production (say an IOP file or some sort of another) with the caveat that we might want the field in the file and the field in the FM/code to have different names (I assume currently, we should do this with aliasing, right?)

Yes, the customer class can pass a vector of fields with aliased name. That is, the names of the fields passed to this class MUST match what's in the file, but the customer app can pass aliased fields to achieve the desired result.

  • If there's any way to reduce the length of the main eamxx_data_interpolation cpp file, that would be great. I remember we complained that our IO code was getting out of hand in terms of length. I notice the current file is ~450 lines. If we can bring that down to half that length (by extracting some stuff away to a util file or something) that may be nice

I don't think you can get below 450 for a class that has to handle a few different use cases. Almost 100 lines come just from the safety checks, so that leaves only 350 lines of code (including comments and empty lines for ease of readability). I highly doubt you could trim the file much (short of rm comments and all empty lines).

@bartgol bartgol marked this pull request as draft January 13, 2025 18:28
@bartgol bartgol marked this pull request as ready for review January 13, 2025 18:28
@bartgol bartgol merged commit 5691af8 into master Jan 14, 2025
28 of 32 checks passed
@bartgol bartgol deleted the bartgol/eamxx/data-interpolation branch January 14, 2025 01:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
code cleanup code usability EAMxx PRs focused on capabilities for EAMxx
Projects
None yet
Development

Successfully merging this pull request may close these issues.

EAMxx: Kokkos View allocations in spa EAMXX: unify time interpolation implementations
4 participants