BCS processing to support HWSDv2 soil preprocessing and new Peat GPA22#1196
BCS processing to support HWSDv2 soil preprocessing and new Peat GPA22#1196biljanaorescanin wants to merge 15 commits intodevelopfrom
Conversation
|
@gmao-rreichle I've tested with last commit BCS for v12 EASE M36 vs what is on develop now is zero diff and new ones are doing what they suppose to. So backward compatibility is preserved. |
…ed comments; white-space changes (mod_process_hres_data.F90, rmTinyCatchParaMod.F90)
gmao-rreichle
left a comment
There was a problem hiding this comment.
@biljanaorescanin : Many thanks for making the processing changes. I looked through the code in some detail but haven't yet had a chance to inspect the new bcs datasets. I'll do that later, but I'm confident that's just a formality since you already checked them.
See below for a few more comments and open questions.
Also, I added one commit that mostly edits the in-code comments but also makes one (probably non-zero-diff) change: 3d03f10?w=1
(The link hides white-space changes.)
We can discuss at the next tag-up.
...omp/GEOSphysics_GridComp/GEOSsurface_GridComp/Utils/Raster/makebcs/mod_process_hres_data.F90
Outdated
Show resolved
Hide resolved
| ! In all PEATMAP variants, subsurface orgC must *not* fall into peat class. | ||
| ! Reassign HWSD subsurface peat to peat-rich mineral Group 3. |
There was a problem hiding this comment.
I am still wondering if this legacy behavior is what we wanted originally, and if it makes sense for the new processing approach.
There was a problem hiding this comment.
I think that’s a fair question. In this PR I kept that behavior because the intended rule in the strict GPA22 path was that tile peat should be controlled by mapped top-layer peat support, not created independently by HWSD subsurface OC. So carrying forward the oc_sub cap was a way to prevent HWSD sub-layer peat from reintroducing peat outside the mapped top-layer signal.
In that sense it is internally consistent with the current strict approach, but I agree the implementation is not very transparent, especially since profile peat is then effectively made to follow top peat later in the PEATMAP path. If we want to revisit this more cleanly, I think the right rule would be: top-layer peat support determines whether the tile is peat, and profile follows consistently, rather than relying on capped oc_sub plus later write-time forcing.
For this PR I left the legacy oc_sub handling unchanged to minimize scope, but I’m happy to change it if you want the new processing path to express that rule more explicitly. Just let me know should I change this?
...omp/GEOSphysics_GridComp/GEOSsurface_GridComp/Utils/Raster/makebcs/mod_process_hres_data.F90
Show resolved
Hide resolved
...omp/GEOSphysics_GridComp/GEOSsurface_GridComp/Utils/Raster/makebcs/mod_process_hres_data.F90
Show resolved
Hide resolved
| ! threshold relative to the full weighted tile support. Otherwise | ||
| ! choose the dominant mineral orgC class (1–3). | ||
| ! | ||
| if (cFamily(4)/(real(i) + 2.33*real(i)) > PEATMAP_THRESHOLD_2) then |
There was a problem hiding this comment.
Two comments here:
-
Like for the top layer, for the tile to have o_clp=4 (i.e., profile is peat), the if condition requires that an absolute majority of all raster grid cells that make up the tile is peat, regardless of how many raster grid cells are no-data.
-
I'm still wondering how we want to handle the peat class for the sub-layer. Early on in the processing, we limit oc_sub to 8/sf (peat-rich mineral class). Then here we compute a weighted average of orgC in the top and sub layers with the goal to determine the orgC profile class. This makes little sense. I guess our choice of oc_top=33/sf for peat raster grid cells ensure that the profile class becomes peat regardless of the value of oc_sub: even when oc_sub =0, we have oc_profile = (0.333+0.70)=9.9 > cF_Lim(4). This works except for the case when the sub-layer orgC is good data but the top layer orgC is missing (e.g., outside the coverage of PEATMAP or GPA22). Then the result is case-dependent.
Would it make more sense to simply assign the peat class of the tile based on oc_top alone? That is, if the tile has oc_top>cF_Lim(4), then o_cl = 4 and o_clp = 4 ?
There was a problem hiding this comment.
I think this is a fair point. In the strict GPA22 path, we already remove sub-layer peat support earlier by capping oc_sub below the peat threshold. That means the weighted profile peat test here is no longer really the mechanism that assigns peat profile class. In practice, peat profile output is currently coming from the later PEATMAP rule that forces written profile peat when the top class is peat. So the intended science is really that top-layer GPA22 peat support determines tile peat, and profile follows consistently.
A cleaner strict-GPA22 implementation would therefore be to set o_clp = 4 whenever the strict top peat condition is met, and otherwise use the weighted profile support only to distinguish mineral classes 1–3.
Is that the behavior we want in the strict GPA22 path? I can change this if it is.
There was a problem hiding this comment.
I'm wondering if this if condition:
if (cFamily(4)/(real(i) + 2.33*real(i)) > PEATMAP_THRESHOLD_2)
can ever be true. In the loop above where cFamily is computed, we should never get cFamily>real(i) because oc_sub was earlier capped at cF_lim(4), so the sub layer should never contribute to the sum. After normalization with (real(i) + 2.33*real(i)), we should then always be below PEATMAP_THRESHOLD_2. In other words, we would never get o_clp=4 from this section of the code.
This would then always put us into the "mineral" block of the code below (~Line 4524), and poc_vec would be computed from the organic carbon content values of the contributing mineral raster grid cells.
What I don't understand is how still we end up with soil_class_com=253 even when o_clp<=3.
There was a problem hiding this comment.
Yes, I think that comment is right.
In the strict GPA22 path, after oc_sub is capped below the peat threshold, the weighted profile peat test can no longer make o_clp = 4.
The reason we still see profile peat in the output is that later in the PEATMAP path the written profile class fac is forced to 253 whenever the top class is peat.
In addition, soil_class_com(n) itself can still become 253 through the strict fallback (soil_class_com < 1 -> soil_class_top) or neighbor fill. So the output is peat/peat, but the current implementation is still mixed.
I think I found fix in my notes and will try that and see.
There was a problem hiding this comment.
In the strict GPA22 path, after oc_sub is capped below the peat threshold, the weighted profile peat test can no longer make o_clp = 4.
Then the if block makes no sense at all.
The reason we still see profile peat in the output is that later in the PEATMAP path the written profile class fac is forced to 253 whenever the top class is peat.
Where is that exactly?
In addition, soil_class_com(n) itself can still become 253 through the strict fallback (soil_class_com < 1 -> soil_class_top) or neighbor fill.
This should not happen very often. But we get %OCProf<8.72 for nearly all peat tiles.
There was a problem hiding this comment.
The place where written profile peat is still being forced is the later PEATMAP output block:
So the final written prof can become peat even when the upstream profile classification path remained on the mineral side.
There was a problem hiding this comment.
Ok, thanks. But at this point in the code, only the profile soil class is changed, and the associated parameters (e.g., poc_vec [%OCProf]) remain unchanged, even though for peat tiles they were calculated from the mineral raster grid cells.
It will NOT work to add modifications of poc_vec etc here. For backward compatibility, we need to keep the legacy path as is. For "STRICT GPA22", we need more invasive changes further up. It makes no sense to reset oc_sub to mineral, and then diagnose o_clp from that. After the reset, o_clp can never be peat, at least not if it's based on the logic in what's currently in L4347.
My hope is that with the appropriate in-line "if (STRICT GPA)" conditions, we can salvage the overall structure of the code and fix the new logic while maintaining legacy calcs in one subroutine. If we had more time and resources, we'd want to rewrite all of this stuff from scratch in a new subroutine soil_para_hwsd_peat() or some such, but that would almost certainly require more effort than we can afford at this time.
There was a problem hiding this comment.
Last commit should resolve this. I've shared BCs set produced with this commit with you on chat to confirm.
| endif | ||
|
|
||
| ! If the profile class is missing, search "neighboring" tiles for a fill value, | ||
| ! where "neighbors" are defined in terms of proximity of tile index (not necessarily geographical proximity) |
There was a problem hiding this comment.
Note that "neighbor" here is simply a neighbor in indexing space, and not necessarily in geographic space. This approach is used in the legacy approach and can't be changed for the legacy approach, but it may be worth thinking about something more geographically meaningful for "STRICT_GPA22" processing. Not a high priority though, and probably doesn't impact many tiles anyway. I'm adding the comment mostly so we can revisit the question later.
There was a problem hiding this comment.
Agree. In this section, “neighbor” is still defined in index space, not geographic space. For the legacy path that needs to stay unchanged for backward compatibility. In the strict GPA22 path we only tightened the consistency of what gets copied once a donor is chosen, we did not redesign the donor search itself. A geographically based search could be cleaner for strict GPA22, but it would also be more computationally expensive. Since this fill is rare and the current behavior is already much improved, I’d treat that as a follow-up rather than part of this PR. But we can change this if you insist.
There was a problem hiding this comment.
I need to revise my earlier response to this question after the follow-up investigation.
The concern about index-space “neighbors” is more relevant to STRICT_GPA22 than I first implied. The raw GPA22 → conservative PEATMAP rasterization checks out, and the first discrepancy appears later, in the PEATMAP → M09 stage. In the strict path, the early donor fill still uses index-space neighbor selection, and once a donor is chosen it can copy peat-defining state (soil_class_top, soil_class_com, pmap) from that donor. So this is now a plausible source of the most suspicious anomaly subset we found.
In particular, the strongest anomaly set is small: 20 tiles where the raw GPA22 tile box is fill-only but M09 still ends up peat; 18 of those 20 are in Maritime SE Asia. That makes the donor-fill behavior worth tightening in STRICT_GPA22.
If we want to resolve this in the PR, I see two options:
Option 1 (minimal fix): keep the legacy donor search, but in STRICT_GPA22 do not allow the fill step to create peat identity from the donor. In practice, that means tightening the strict-path fill so donor-selected values cannot newly assign peat-defining state (soil_class_top, soil_class_com, pmap) to a target tile without direct strict GPA22 / PEATMAP support.
Option 2 (cleaner fix): switch donor selection in STRICT_GPA22 from index-space to geographic-space.
My preference for this PR is Option 1, because it enforces the intended strict GPA22 logic with much less algorithmic risk. Option 2 is conceptually cleaner, but it is a broader redesign and computationally more expensive.
The main location that needs attention is here:
I have an idea for the minimal fix and will test it to see whether it resolves this anomaly set.
There was a problem hiding this comment.
Update: Unfortunately, after testing, this block does not appear to be responsible for the 20-tile anomaly subset. The strict donor-fill path never triggered in this case, and replacing it did not change the results.
| ! NGDC : Soil parameters from Reynolds et al. 2000, doi:10.1029/2000WR900130 (MERRA-2, Fortuna, Ganymed, Icarus) | ||
| ! HWSD : Merged HWSDv1.21-STATSGO2 soil properties on 43200x21600 with Woesten et al. (1999) parameters | ||
| ! HWSD_b : As in HWSD but with surgical fix of Argentina peatland issue (38S,60W) | ||
| ! HWSD_c : HWSD updated from v1.2 to v2. And sub and top are D2 layer |
There was a problem hiding this comment.
Instead of HWSD_c we should call this HWSDv2 for clarity.
|
|
||
| logical, public, save :: use_PEATMAP = .false. | ||
| logical, public, save :: jpl_height = .false. | ||
| logical, public, save :: PEATMAP_STRICT_GPA22 = .false. |
There was a problem hiding this comment.
The config variables and the variable names in the code will need some cleanup.
- We won't need to run the legacy PEATMAP inputs with the "STRICT_GPA22" processing tweaks. It would be cleaner to identify the name of the peat map data product, and have the if conditions based on the product name.
- The name "PEATMAP" is specific to the legacy peat map, and the name of the "GPA22" map in the published literature is "Global Peat Map 2.0". (I think GPA22 was just a working name during the earlier phase of product development.)
- So instead of the logical "use_PEATMAP", we may want to have an integer config variables that something like:
PEAT_INFO = 0 (HWSD),
1 (HWSD and PEATMAP)
2 (Global Peat Map 2.0)
E.g., the if condition "if (use_PEATMAP)" would then translate to something like:
select case (PEAT_INFO)
case (0)
! use_PEATMAP=.false.
case (1,2)
! use_PEATMAP=.true.
etc
There was a problem hiding this comment.
Yes, that makes sense. The current use_PEATMAP + PEATMAP_STRICT_GPA22 split was mainly introduced to isolate and validate the GPA22 logic while preserving legacy behavior. I agree that the long term configuration would be cleaner if it keyed off the peat information product explicitly rather than overloading the legacy PEATMAP name. I can work on cleaning that up and push a follow-up commit once it is done and retested.
gmao-rreichle
left a comment
There was a problem hiding this comment.
@biljanaorescanin : I still don't understand how this code works. See comment below
| ! threshold relative to the full weighted tile support. Otherwise | ||
| ! choose the dominant mineral orgC class (1–3). | ||
| ! | ||
| if (cFamily(4)/(real(i) + 2.33*real(i)) > PEATMAP_THRESHOLD_2) then |
There was a problem hiding this comment.
I'm wondering if this if condition:
if (cFamily(4)/(real(i) + 2.33*real(i)) > PEATMAP_THRESHOLD_2)
can ever be true. In the loop above where cFamily is computed, we should never get cFamily>real(i) because oc_sub was earlier capped at cF_lim(4), so the sub layer should never contribute to the sum. After normalization with (real(i) + 2.33*real(i)), we should then always be below PEATMAP_THRESHOLD_2. In other words, we would never get o_clp=4 from this section of the code.
This would then always put us into the "mineral" block of the code below (~Line 4524), and poc_vec would be computed from the organic carbon content values of the contributing mineral raster grid cells.
What I don't understand is how still we end up with soil_class_com=253 even when o_clp<=3.
This PR has two parts.
First, it adds the full HWSDv2 soil preprocessing workflow (STEP1–STEP5), including supporting QA utilities and documentation. This workflow is included in the PR for continued development and testing, but it is not intended to be enabled for production BCS generation at this time, since current testing shows it produces substantially different land/soil outcomes than the existing baseline.
Second, it updates raster BCS soil processing to improve PEATMAP/GPA22-based peat handling in SoilProperties generation. The main production-targeted change in this PR is the strict GPA22 peat logic, which removes problematic HWSD-driven peat assignments and makes peat classification/property assignment internally consistent in the generated tiles.
Production intent in this PR:retain legacy non-GPA22 behavior unchanged - tested for zero diff to v12 BCs
enable strict GPA22 peat handling through flags - tested with v15 BCs
prevent HWSD from generating peat outside the GPA22 peat support
preserve consistent peat/mineral tile identity during downstream soil parameter repair
Included in this PR:HWSDv2 STEP1–STEP4 preprocessing (R / Python / Fortran)
STEP5 IDL tile generation (noMASK mode)
PEATMAP/GPA22 updates in SoilProperties generation
QA utilities (STATSGO comparison, plotting, tile checks)
README documenting code sequence, flags, and usage
Current status:HWSDv2 full preprocessing path:
included for testing/development, not yet intended for production BCSs
PEATMAP/GPA22 updates:
validated for the targeted peat-classification fix and intended for use in the current BCS workflow
More details on the HWSDv2 workflow, inputs, and processing sequence are documented in the README.
NOTEs:This PR does not switch production BCS generation over to the full HWSDv2 path by default. That path remains available behind flags while the GPA22 peat updates are the primary functional change being advanced here.
We enable use of GPA22 Peat with
BCSv15.This PR is zero diff trivial for GCM since all is preprocessing for BCS. It is also zero diff tested for v12 BCs.