Updating section names to more useful categories, alphabetizing names within each section #140
mkavulich wants to merge 26 commits into ESCOMP:main
…, and coordinates to appropriate sections
- New sections "timing" and "stochastic physics"
- Continue populating dimensions, coordinates, system variables
- Rename "state_variables" --> "atmospheric properties"
- Delete and reallocate "diagnostics" section
- Rearrange atmospheric_composition into subsections
- New "radiation" section
- Continuing to depopulate bad "GFS_typedef" sections
- Fix some indentation
- Rename "precipitation and hydrometeors" to "precipitation, cloud, and hydrometeor variables"
- New section "control variables"; move all "do_" prefix variables here
…adding more parameterization-specific sections
- New sections "Convective physics parameters", "Gravity wave drag parameters"
- Merged the two different Aerosol sections; those that are model-specific added to description
- Merged "Land and water surface properties" into "Land surface, subsurface, and vegetation properties"
- Added an "Other" section for now, I hope to clean this up into more discrete categories going forward
…do this automatically
…; can be used for comparisons after reorganization
- Update CI tests to consistently call python scripts with python3
- Add execution permissions to python scripts
run: |
  tools/check_xml_unique.py standard_names.xml
  tools/check_xml_unique.py standard_names.xml --field="description"
  python3 tools/check_xml_unique.py standard_names.xml
It doesn't hurt, but why do we need this?
All of the scripts have #!/usr/bin/env python3 in the shebang.
I just wanted to go for consistency, but you're right, it's unnecessary, so I've removed it from all script calls.
tools/list_names.py
from pathlib import Path


try:
    from lxml import etree
Is there a reason we can't use functionality in lib/xml_tools.py, or at least the same Python XML libraries? Why install another, potentially redundant lxml library?
ESMStandardNames/tools/lib/xml_tools.py
Line 15 in 0a13a57
I considered this, but for some reason I thought the built-in XML library couldn't output these nicely formatted and indented XML files. It turns out the built-in library actually does better, fixing a few indentation problems I had noticed.
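For readers following this thread, here is a minimal sketch of re-indenting XML with only the Python standard library. It assumes xml.etree.ElementTree and its indent() helper (Python 3.9+); the thread does not say which standard-library calls the updated script actually uses, and the element names in the sample input are illustrative only.

```python
import xml.etree.ElementTree as ET

# Hypothetical, inconsistently indented input; element names are illustrative.
RAW = """<standard_names><section name="dimensions">
<standard_name name="horizontal_dimension"/>
      <standard_name name="vertical_layer_dimension"/>
</section></standard_names>"""


def pretty_xml(text: str) -> str:
    """Re-indent an XML string using only the standard library."""
    root = ET.fromstring(text)
    # ET.indent rewrites whitespace-only text between elements in place.
    ET.indent(root, space="  ")
    return ET.tostring(root, encoding="unicode")


print(pretty_xml(RAW))
```

Because indent() overwrites existing whitespace-only text between elements, uneven indentation in the input is normalized on output, which would be consistent with the indent fixes mentioned above.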
- name: Marine
  comment: null
  standard_names:
  - name: derivative_of_diurnal_thermocline_layer_thickness_wrt_surface_skin_temperature
I noticed that for the atmospheric variables, the surface and boundary levels (often used for coupling?) are in a separate category from the atmospheric properties. Should we have a similar structure for Marine variables?
I'm not opposed to further categorization, but I personally don't have a good sense of how ocean variables might be binned in this way. It seems to me that in all our current contexts (being atmosphere-centric), ocean modeling deals with just the surface and boundary layers, with nothing really done below that, so it might be redundant? I'm not sure, to be honest.
…output these well-formatted XMLs with just the standard libraries. It even fixes some indent problems with the original script
Wording change from Dom
Co-authored-by: Dom Heinzeller <dom.heinzeller@icloud.com>
climbfuji left a comment
Thanks very much for addressing my comments. This is a lot nicer now.
I am happy with the proposed sections.
Description
This PR reorganizes the existing standard names (with no changes, except some updated descriptions) into a new section hierarchy that removes references to specific modeling systems (specifically GFS typedefs). The new sections are mostly descriptive of the way the variable is used, and I attempted to make the sections as generic as possible. With this type of natural-language organization, I believe it's impossible to unambiguously assign certain variables to certain sections, but I have attempted to keep things as organized as possible. The new sections are in bold below.
I am very open to feedback about changing these specific section names, so please review away. I tried to keep the sections as generic as possible, avoiding references to specific models or types of parameterization, but it wasn't always possible from my point of view.
Within each section, standard names are now alphabetized to give a more logical and unambiguous sorting. This was accomplished with a new tool, tools/sort_standard_names.py, written by Claude Code running locally with gpt-oss:20b. I also added another Claude-Code-written tool, tools/list_names.py, that gives a monolithic alphabetized list of all standard names; I used this to ensure that no names were accidentally lost in the reorganization. I have thoroughly reviewed the Claude-generated scripts, and attest that I understand and approve of their functionality.
I have integrated the alphabetization check into the GitHub CI, and added a rule about this alphabetization to the Rules document.
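The "no names lost" comparison can be sketched roughly as follows. This is not the code in tools/list_names.py; it assumes a simplified, hypothetical layout in which each name is a standard_name element with a name attribute, nested at any depth under section elements, and the names in the sample data are illustrative.

```python
import xml.etree.ElementTree as ET

# Hypothetical "before" and "after" files: same names, different sections.
BEFORE = """<standard_names>
  <section name="state_variables">
    <standard_name name="surface_air_pressure"/>
    <standard_name name="air_temperature"/>
  </section>
</standard_names>"""

AFTER = """<standard_names>
  <section name="atmospheric properties">
    <standard_name name="air_temperature"/>
  </section>
  <section name="surface and boundary properties">
    <standard_name name="surface_air_pressure"/>
  </section>
</standard_names>"""


def all_names(xml_text: str) -> list[str]:
    """Return one alphabetized list of every standard name, ignoring sections."""
    root = ET.fromstring(xml_text)
    return sorted(el.get("name") for el in root.iter("standard_name"))


def missing_names(before: str, after: str) -> list[str]:
    """Names present before the reorganization but absent afterwards."""
    return sorted(set(all_names(before)) - set(all_names(after)))


print(missing_names(BEFORE, AFTER))
```

Comparing flat lists rather than the files themselves keeps the check independent of which section a name ends up in, which is the point of a reorganization-only change.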
Because the alphabetization is maintained by a tool, it does constrain the formatting and indentation of the XML. I believe this is a fine tradeoff, since the Metadata files are designed to be human-readable and it's a lesser concern for the XML. But I'm open to feedback on this.
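To illustrate why a tool-maintained ordering also fixes the layout, here is a minimal sketch of a sorter that rewrites the whole tree. It is not the code in tools/sort_standard_names.py; the element names, the plain Python string ordering, and the two-space indent are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical unsorted input; element and attribute names are illustrative.
UNSORTED = """<standard_names>
  <section name="radiation">
      <standard_name name="surface_albedo"/>
    <standard_name name="cosine_of_solar_zenith_angle"/>
  </section>
</standard_names>"""


def sort_sections(xml_text: str) -> str:
    """Alphabetize the names directly inside each section, then re-serialize."""
    root = ET.fromstring(xml_text)
    # Materialize the section list first so the tree is not edited mid-traversal.
    for section in list(root.iter("section")):
        entries = section.findall("standard_name")
        for entry in entries:
            section.remove(entry)
        # Re-append in sorted order; any nested sections keep their position
        # ahead of the re-appended names.
        for entry in sorted(entries, key=lambda e: e.get("name")):
            section.append(entry)
    # The serializer, not the author, now decides indentation.
    ET.indent(root, space="  ")
    return ET.tostring(root, encoding="unicode")


def is_sorted(xml_text: str) -> bool:
    """A check that a CI job could run: True if every section is alphabetized."""
    root = ET.fromstring(xml_text)
    for section in root.iter("section"):
        names = [e.get("name") for e in section.findall("standard_name")]
        if names != sorted(names):
            return False
    return True


print(sort_sections(UNSORTED))
```

Since the sorted output is produced by re-serializing the tree, any hand-formatted indentation in the input is replaced by whatever the serializer emits.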
Finally, there was a lot of text in the comment of the Dimensions section that was specific to CCPP; I have removed this text and added it to the CCPP technical documentation (NCAR/ccpp-doc#80).
Issues
Resolves #135