Add compression to SBN atmospheric product generation #4649
ChristopherHill-NOAA wants to merge 7 commits into NOAA-EMC:dev/gfs.v17
Conversation
|
@DavidHuber-NOAA @TravisElless-NOAA - I see that the shfmt scan failed. I've looked into this, but it doesn't appear to be related to this PR. Can you confirm whether that's true? I have previously run a case with all output on. To avoid having to re-run a forecast, I am copying the COM products folder to original.products. After this completes, I will use the code here to re-run the meta tasks gfs_awips_20km_1p0deg to see the effect of these changes on a single-cycle, high-resolution run. Hopefully we can then finally have the answer we need about the SBN. Do any files need to be deleted from the COM directory for the re-run to succeed? Do others see other flaws in this test process? @ChristopherHill-NOAA - You said you ran CI on this for technical testing; do you have the output of that to share? |
|
@JessicaMeixner-NOAA You are correct. It's not related to this PR. I'll open a separate PR to resolve this. |
|
Original products can be found in: Running with the two line change from Chris, the output can be found at: All gfs_awips_20km_1p0deg were rerun. |
|
9.1G original.products/atmos/wmo This at least has trended in the correct direction. We do need to confirm that the packing was applied and that quality of the output is acceptable. |
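A minimal sketch of how the two checks mentioned above might be run from the shell: the total size of a products directory in MB, and the packing type that wgrib2 reports for each message in a file. The function names and arguments are illustrative assumptions, not helpers or paths from this PR. 1 MB is taken as 1048576 bytes, and the `-packing` inventory option is assumed to be available in the installed wgrib2.

```shell
# Hypothetical helpers for spot-checking product volume and packing.

# Print the total size of all regular files under a directory, in MB
# (assumption: 1 MB = 1048576 bytes).
dir_volume_mb() {
    local dir=$1
    # wc -c prints "<bytes> <name>" once per file; awk sums the byte counts.
    find "${dir}" -type f -exec wc -c {} \; |
        awk '{s += $1} END {printf "%.2f\n", s / 1048576}'
}

# Print the packing type of every message in a GRIB2 file.
# Requires wgrib2 on PATH; the exact output format depends on the wgrib2 version.
show_packing() {
    local file=$1
    command -v wgrib2 >/dev/null 2>&1 || { echo "wgrib2 not found" >&2; return 1; }
    wgrib2 "${file}" -packing
}
```

For example, `dir_volume_mb original.products/atmos/wmo` could be compared against the same call on the newly generated directory, and `show_packing` could be run on one file from each to see whether the reported packing changed.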
|
A spot check of output from the Sample degrib2 output for Values of the total volume of operational products for 2026030506 from v16.3, original products from v17, and newly generated products from v17 are shown below, and entered in this spreadsheet:
The greatest value of v16.3 SBN product volume observed over the past week was 9336.94 MB, from cycle 2026031012. |
|
@ChristopherHill-NOAA - does your decrease in volume size also account for the files that are being removed? I believe we were going to have about 70 MB from removed files? |
|
The total volume values include xtrn.awpgfs* files for v16.3, which are scheduled to be removed and are not being generated with v17. Assuming we have to reduce SBN product volume for v17 relative to the same cycle for v16, a further reduction of 70 MB would be needed in the case of 2026030506. If there is instead a hard volume limit (e.g. 9400 MB) to be met, that would be easier to reconcile. As noted with #4614, I will try different versions of g2lib for potentially reducing product volume. |
|
Our goal for SBN is to be neutral or decreased from GFSv16. |
|
As discussed in #4614, the initially committed code change to From a spot test of the relevant segment of code with F024 GRIB2 files generated by retrov17_01_realtime run of cycle 2026030506, the original, first commit, and second commit product file sizes (in bytes) are as follows: When expanding these results to a hypothetical full forecast range for each product, which is 40 forecast hours for the '003' grid and 54 forecast hours for the other grids, the file size values (in MB) become: A partial workflow test, similar to one conducted on March 12, may help to confirm the volume reduction of a full cycle of SBN products that are stored to |
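The extrapolation described above is a multiplication: the size of one F024 file in bytes times the number of forecast hours for that grid (40 for the '003' grid, 54 for the others), converted to MB. A minimal sketch, assuming 1 MB = 1048576 bytes (which may not match the convention used in the spreadsheet); the function name is hypothetical and the byte count in the usage note is a made-up input, not a measured value:

```shell
# Hypothetical helper: estimate the full-forecast-range volume (MB) of one
# product from a single sample file.
#   $1 = sample file size in bytes
#   $2 = grid identifier ("003" has 40 forecast hours; any other value has 54)
estimate_full_range_mb() {
    local bytes=$1 grid=$2 nhours
    case "${grid}" in
        003) nhours=40 ;;
        *)   nhours=54 ;;
    esac
    awk -v b="${bytes}" -v n="${nhours}" 'BEGIN {printf "%.2f\n", b * n / 1048576}'
}
```

With a sample file of exactly 1048576 bytes, `estimate_full_range_mb 1048576 003` prints `40.00`, and the same size for any other grid prints `54.00`. This assumes every forecast hour produces a file of about the same size as the F024 sample, which is only an approximation.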
|
According to degrib2 output, the addition of 'undefined' values for PV-coordinate variables (and cloud-layer PRES) corresponds to an increase in the listed number of data points for the variable to the maximum number available for any variable, but with no change to the calculated minimum, maximum, or average values. In the case of the F024 CONUS grid product ( The degrib2 output for the F024 sample of original products and new products is available here: grib2_awpgfs_003_f024_new_degrib2.txt |
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
|
@JessicaMeixner-NOAA Apologies for the multiple commits, as I should have made and pushed the changes from WCOSS. The most recent commit of code should be ready for your testing. |
|
Updated output is here: /lfs/h2/emc/ptmp/jessica.meixner/comroot/prod01/gfs.20260305/06/products The original v17 products are in original.products -- I forgot to copy over the first test of this PR, so I do not have that. |
|
The SBN product size spreadsheet is updated to reflect the latest workflow test with the code changes newly committed to this PR. A summary of volume values (MB) for cycle 2026030506 is provided here: In addition to the ... the volume for v16 includes ... the volumes for v17 include
|
|
{reposting revised comment} With the latest changes to |
|
Following Jessica's successful test of the committed code changes, and following Jason's email from this morning, I am reiterating the request for review of this PR. |
|
I am running CI tests on WCOSS2. It would be good to have a reviewer from the products team as well. @ChristopherHill-NOAA do you have a suggestion for that? |
|
@JessicaMeixner-NOAA I agree with adding a reviewer from the products team. I am adding @WenMeng-NOAA and @BenjaminBlake-NOAA . |
| export opt25=":(APCP|ACPCP|PRATE|CPRAT):" | ||
| export opt26=' -set_grib_max_bits 25 -fi -if ' | ||
| export opt27=":(APCP|ACPCP|PRATE|CPRAT|DZDT):" | ||
| export opt28=' -new_grid_interpolation budget -fi ' |
@ChristopherHill-NOAA You might add a new option (opt29) for switching compression here
| export opt28=' -new_grid_interpolation budget -fi ' | |
| export opt29='-set_grib_type complex2' |
| gridconus="lambert:265.0:25.0:25.0 226.541:369:20318.0 12.19:257:20318.0" | ||
| # shellcheck disable=SC2086,SC2248 | ||
| ${WGRIB2} tmp_masterfile ${opt1uv} ${opt21} ${opt22} ${opt23} ${opt24} ${opt25} ${opt26} \ | ||
| ${opt27} ${opt28} -new_grid ${gridconus} "awps_file_f${fcsthr}_${GRID}" |
@ChristopherHill-NOAA Switching compression can be completed here.
| ${opt27} ${opt28} ${opt29} -new_grid ${gridconus} "awps_file_f${fcsthr}_${GRID}" |
| gridak="nps:210.0:60.0 170.0:277:22500 35.0:225:22500" | ||
| # shellcheck disable=SC2086,SC2248 | ||
| ${WGRIB2} tmp_masterfile ${opt1uv} ${opt21} ${opt22} ${opt23} ${opt24} ${opt25} ${opt26} \ | ||
| ${opt27} ${opt28} -new_grid ${gridak} "awps_file_f${fcsthr}_${GRID}" |
| ${opt27} ${opt28} ${opt29} -new_grid ${gridak} "awps_file_f${fcsthr}_${GRID}" |
| gridprico="latlon 271.75:275:0.25 50.75:205:-0.25" | ||
| # shellcheck disable=SC2086,SC2248 | ||
| ${WGRIB2} tmp_masterfile ${opt1} ${opt21} ${opt22} ${opt23} ${opt24} ${opt25} ${opt26} \ | ||
| ${opt27} ${opt28} -new_grid ${gridprico} "awps_file_f${fcsthr}_${GRID}" |
| ${opt27} ${opt28} ${opt29} -new_grid ${gridprico} "awps_file_f${fcsthr}_${GRID}" |
| gridpac="mercator:20.0 110.0:837:20000:270.0 -45.0:725:20000:65.7345" | ||
| # shellcheck disable=SC2086,SC2248 | ||
| ${WGRIB2} tmp_masterfile ${opt1} ${opt21} ${opt22} ${opt23} ${opt24} ${opt25} ${opt26} \ | ||
| ${opt27} ${opt28} -new_grid ${gridpac} "awps_file_f${fcsthr}_${GRID}" |
| ${opt27} ${opt28} ${opt29} -new_grid ${gridpac} "awps_file_f${fcsthr}_${GRID}" |
| grid003="latlon 0:360:1.0 90:181:-1.0" | ||
| # shellcheck disable=SC2086,SC2248 | ||
| ${WGRIB2} tmp_masterfile ${opt1} ${opt21} ${opt22} ${opt23} ${opt24} ${opt25} ${opt26} \ | ||
| ${opt27} ${opt28} -new_grid ${grid003} "awps_file_f${fcsthr}_${GRID}" |
| ${opt27} ${opt28} ${opt29} -new_grid ${grid003} "awps_file_f${fcsthr}_${GRID}" |
|
@ChristopherHill-NOAA I suggest combining the compression switch with the interpolation process. You might also test the overall runtime, as the generation of these SBN data files could affect data dissemination latency via SBN in GFS operations. |
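One way to measure the runtime impact raised in the comment above is to time the same command with and without the compression option. A minimal sketch using `date +%s` for whole-second wall-clock timing; the function name is hypothetical, and the commands being timed are left to the caller rather than copied from the script:

```shell
# Hypothetical helper: run a command, report its wall-clock time (whole
# seconds) on stderr, and return the command's exit status.
time_cmd() {
    local start end rc=0
    start=$(date +%s)
    "$@" || rc=$?
    end=$(date +%s)
    echo "elapsed=$((end - start))s rc=${rc} cmd=$*" >&2
    return "${rc}"
}
```

A comparison could then wrap the existing wgrib2 call and the modified one, for example `time_cmd ${WGRIB2} ...` for each variant on the same input file. Whole-second resolution is coarse, so this is only meaningful for inputs large enough that each call takes several seconds.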
|
@ChristopherHill-NOAA Thanks for your work on this. I agree with @WenMeng-NOAA 's suggestion to combine the -set_grib_type complex2 command with the wgrib2 interpolation step - it should have the same effect as doing the compression in a separate command. |
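Combining the two steps amounts to passing `-set_grib_type complex2` ahead of `-new_grid` in the same wgrib2 call. The sketch below only assembles and prints such a command, so the word splitting can be inspected without running wgrib2. The function name, the file names, and the stand-in for the other options are hypothetical and are not the option strings from the script. A space is needed between the expanded `opt29` and `-new_grid`; without it the two run together into a single word.

```shell
# Hypothetical dry-run helper: print the wgrib2 command that would be run.
#   $1 = input file, $2 = other wgrib2 options (stand-in for opt1..opt28),
#   $3 = grid specification, $4 = output file
build_awips_cmd() {
    local infile=$1 other_opts=$2 grid=$3 outfile=$4
    local opt29='-set_grib_type complex2'
    # Unquoted expansions are intentional so that each option string is
    # split into separate words, as in the original script.
    # shellcheck disable=SC2086
    echo wgrib2 "${infile}" ${other_opts} ${opt29} -new_grid ${grid} "${outfile}"
}
```

For example, `build_awips_cmd tmp_masterfile '-new_grid_winds earth' 'latlon 0:360:1.0 90:181:-1.0' out.grib2` prints a single command line in which `complex2` and `-new_grid` appear as separate words.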
|
@ChristopherHill-NOAA - I will re-run after any code changes requested by @WenMeng-NOAA and @BenjaminBlake-NOAA. In the meantime, I wanted to let you know that one regression test failed due to large gempak log files, which was also reported here: #3630 (comment). I will not finish running the regression tests until the code changes have been confirmed. I'll also re-run the high-resolution tests to confirm that any code changes are not unintentionally changing output sizes. |
Description
NCO requires that the total volume of post-processing products to be transmitted through the SBN not exceed current operational levels. The volume of these products generated in retrospective runs of GFSv17 has marginally exceeded operational levels, and the files must be reduced in size (in lieu of the outright removal of some).
This PR simply adds a compression/packing attribute to the generation of WMO-headed atmospheric products within the script exgfs_atmos_awips_20km_1p0deg.sh, accomplished by revising the -set_grib_type option of WGRIB2 from "same" to "complex2".
This PR is intended to resolve #4614.
Type of change
Change characteristics
How has this been tested?
The GFS workflow was cloned and built on WCOSS. A CI test only confirmed nominal functionality with the code change, as the products resulting from the low resolution run were too small for the code change to produce the necessary effect.
Offline tests executing a segment of exgfs_atmos_awips_20km_1p0deg.sh with files from products/atmos/grib2/0p25 as input resulted in a reduction to the size of the products.
It is recommended that the changes from this PR be tested in a single-cycle, high-resolution test of GFSv17, to confirm their intended effect of reducing SBN product volume to the extent required by NCO.
Checklist