EAMXX: add a DataInterpolation class #6812
Would it make sense to break this plan into three PRs: this one, which introduces the DataInterpolation class and applies it to time interpolation, and then subsequent ones for horizontal and vertical interpolation?
I don't think so. The three interpolations are tightly coupled via remappers. Sure, the remapper can be …
For reviewers: I try to isolate concepts into their own commits, so one way to make the review easier is to go through one commit at a time.
Working through the PR. Here are some comments from "EAMxx: add a TimeInterval lightweight struct"
Two notes:
The long-term goal is twofold:
Edit: edited above, but now I guess goal number 2 is addressed automatically if the DataInterpolation class is supposed to take care of only one field at a time. Is that what you intend? (I wasn't sure based on …
@mahf708 Yeah, I decided to handle the different inputs in the parameter list, so that customer atm procs can simply forward the sublist in their param list, rather than having to read it just to pass it in. That said, it is true that the param list hides the requirements a bit. That is, it's not clear from the header what the expectations on the input PL are. Perhaps a constructor listing explicitly what should be inside is more meaningful, and may be more self-documenting. I will think about an alternative impl, and if I see it doesn't get too complex, I'll switch.
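To make the trade-off above concrete, here is a minimal sketch of the two styles. Every name in it (`ParamList`, `InterpFromParams`, `InterpExplicit`, the `data_file` and `timeline` keys) is hypothetical and chosen for illustration only; none of it is the actual EAMxx or ekat interface.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a parameter list (hypothetical; not the ekat ParameterList).
using ParamList = std::map<std::string, std::string>;

// Style A: the caller just forwards a sublist. The required keys are only
// visible by reading the constructor body, and a missing key is a run-time error.
class InterpFromParams {
public:
  explicit InterpFromParams(const ParamList& pl)
    : m_file(pl.at("data_file"))     // std::map::at throws std::out_of_range if missing
    , m_timeline(pl.at("timeline"))
  {}
  const std::string& file() const { return m_file; }
  const std::string& timeline() const { return m_timeline; }
private:
  std::string m_file;
  std::string m_timeline;
};

// Style B: the signature itself documents what the class needs, and a missing
// argument is a compile-time error.
class InterpExplicit {
public:
  InterpExplicit(std::string file, std::string timeline,
                 std::vector<std::string> field_names)
    : m_file(std::move(file))
    , m_timeline(std::move(timeline))
    , m_field_names(std::move(field_names))
  {
    if (m_field_names.empty()) {
      throw std::invalid_argument("at least one field name is required");
    }
  }
  const std::string& file() const { return m_file; }
  const std::string& timeline() const { return m_timeline; }
  const std::vector<std::string>& field_names() const { return m_field_names; }
private:
  std::string m_file;
  std::string m_timeline;
  std::vector<std::string> m_field_names;
};
```

With style A the customer process does not need to parse anything, but the header says nothing about the expected keys. With style B each customer decides its own input-file syntax and translates it into constructor arguments.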
Simplifies some operations that looked clunky on the client end
* Allow setting only one src or tgt pressure profile. This is because we may not have any mid or any int field. No need to force us to create a valid field just for passing checks.
* Allow querying the remapper for the src/tgt mid/int pressure profile

* The class handles time, horizontal, and vertical interpolation
* The vertical/horizontal are optional, but the time dimension is REQUIRED
We were still assuming tgt layout could only have LEV and not ILEV.
@AaronDonahue @mahf708 @tcclevenger This PR is finally ready for review, just in time for xmas! A few comments on what you'll find in it:
@mahf708 I opted for your version of the class, where inputs are specified via input parameters to ctor/setup methods, rather than relying on parameter lists. The different customers (spa, nudging, mam, iop (the latter not yet supported)) are therefore free to decide the input file syntax to trigger different behaviors. It is pointless to use a param list ctor for a class that only has one constructor. Param list ctors are more apt for stuff that is constructed via factories (and hence has multiple concrete instances, which may have different needs), so that we can have one interface to rule them all.
I recommend reviewing one commit at a time, since it may help isolate concepts. Otherwise, review everything but the data interp class/tests first, and then focus on the data interp stuff. Happy holidays!
Edit: I noticed a few extra baseline cmp failures on cuda. Since I don't see those tests failing on master, I suspect they are related to some mods in either the timestamp class or the io layer. I don't foresee them being very hard to find (famous last words), but I also think I may be fixing that next year. You can start reviewing at your earliest convenience (which may very well be next year). If you do so before the fix comes in, you may get pinged again once it goes in.
* The curr_month_beg method was using the wrong time of day
* The next_write_ts method was not considering a corner case for month/year frequency. Resetting to what was previously in master.
Expected fails on cuda. The v1 fail was during cleanup, when something went amiss for some reason. All is good. @tcclevenger @AaronDonahue @mahf708 would you mind taking a look?
I think this is good! I looked over all edits quickly, and I focused most of my review on the data_interpolation_ test files. I have two broad comments, but we don't need to deal with them right away.
- I think it would help if we added a test that reads an existing file used in production (say, an IOP file or something of that sort), with the caveat that we might want the field in the file and the field in the FM/code to have different names (I assume that currently we should do this with aliasing, right?)
- If there's any way to reduce the length of the main eamxx_data_interpolation cpp file, that would be great. I remember we complained that our IO code was getting out of hand in terms of length. I notice the current file is ~450 lines. If we can bring that down to half that length (by extracting some stuff away to a util file or something) that may be nice
Again, nothing urgent to address, just thinking out loud :)
Thanks a lot for this, I think it will be a significant improvement, and I am looking forward to rewiring the process interfaces such that we use the new shiny tools!!!
LGTM. I'll think about what IOP would need to use this
Yes, the customer class can pass a vector of fields with aliased names. That is, the names of the fields passed to this class MUST match what's in the file, but the customer app can pass aliased fields to achieve the desired result.
I don't think you can get below 450 for a class that has to handle a few different use cases. Almost 100 lines come just from the safety checks, so that leaves only 350 lines of code (including comments and empty lines for ease of readability). I highly doubt you could trim the file much (short of removing comments and all empty lines).
The class is in charge of reading time-dependent datasets from input file(s), interpolating them to the current model time, and possibly doing horizontal/vertical interpolation.
The implementation is not yet complete, but some early thoughts on whether we're on the right track or not may help. Besides, getting some early testing on ghci machines can also help.
Things left to do:
I am NOT implementing data interpolation for variables without a time dependence. That case simply reduces to constructing 2 remappers, and since non-time-dependent data is most likely loadable at init, there's no real need for a complex data structure. Time interpolation is the trickiest of the 3 anyway, due to the different needs (nudging uses a Linear timeline, while SPA uses a YearlyPeriodic one).
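As a rough illustration of why the two timelines need different handling, here is a minimal sketch of the time-interpolation weight. It is not the code in this PR: the `Timeline` enum, the function names, the use of fractional days as the time unit, and the 365-day no-leap year are all assumptions made for the example.

```cpp
#include <cassert>

// Hypothetical names for illustration; not the EAMxx API.
enum class Timeline { Linear, YearlyPeriodic };

// Assumption: times are fractional days. For YearlyPeriodic they are days
// since Jan 1 of a 365-day (no-leap) year.
constexpr double days_per_year = 365.0;

// Elapsed time from 'from' to 'to'. On a periodic timeline a negative
// difference means we wrapped past the end of the year.
inline double elapsed(double from, double to, Timeline tl) {
  double dt = to - from;
  if (tl == Timeline::YearlyPeriodic && dt < 0) {
    dt += days_per_year;
  }
  return dt;
}

// Weight of the data slice at t_end when interpolating to time t, with t
// inside the interval that starts at t_beg and ends at t_end.
inline double end_weight(double t_beg, double t_end, double t, Timeline tl) {
  const double len = elapsed(t_beg, t_end, tl);
  assert(len > 0);
  return elapsed(t_beg, t, tl) / len;
}

// Linear blend of the two slices bracketing t.
inline double interpolate(double f_beg, double f_end, double w_end) {
  return (1.0 - w_end) * f_beg + w_end * f_end;
}
```

For example, with monthly slices at mid-December (day 349.5) and mid-January (day 15.5), a model time of day 5.0 gets an end weight of 20.5/31 on the periodic timeline. On a linear timeline the same inputs give a negative interval length and trip the assert, which is why the timeline type has to be part of the interpolation setup.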
For reviewers: I recommend reviewing one commit at a time. That helps isolate concepts, so the changes are hopefully easier to follow. The hardest thing to review is the data interpolation unit tests, since quite a bit of code is needed to generate cases that we can check manually without redoing the same operations as the interpolation class (copy+paste of src code into the tests folder doesn't help).
Fixes #6810
Fixes #6820