Switch ``allocatable`` to ``pointer`` in ecRAD derived types by wertysas · Pull Request #81 · ecmwf-ifs/ecrad

wertysas · 2025-09-29T14:12:30Z

This PR switches all allocatable members of the ecRAD derived types (single_level_type, thermodynamics_type, gas_type, ylcloud_type, aerosol_type, flux_type) to pointers. The three main changes to the code are:

A switch of the allocatable attribute to pointer of the non-scalar members in the derived types and changing corresponding allocated calls to associated throughout the source code.
New methods in the easy_netcdf utility module that can be used with pointers instead of allocatables
Deallocation of the derived types at the end of the standalone drivers to prevent memory leaks

Furthermore, this PR also contains some minor code clean and bug fixes:

Updated change_namelist.sh script that is BSD/GNU agnostic and now works on Mac OS
A rewrite of the method set_units_gas that makes it non-recursive.
New throughput metric prints (Cols/s) to the ecrad standalone drivers.

In general the performance of the pointer version is better than before, see below for more detailed performance results. Several of the test configurations have also been tested with Valgrind to ensure that the switch from allocatables to pointers does not introduce any memory leaks.

--------------------------------------------

Performance results

The performance of this change has been evaluated on a CPU node on ECMWF's HPC using data from the maelstrom weather data set. The following configurations have been included in the performance tests:

input configurations: configCY49R1/configCY49R1_ecckd
single/double precision
thread count: 1, 2, 4, 8, 16, 32, 64, 128
nproma: 1, 2, 4, 8, 16, 32, 64

Generally, the pointer versions have better performance than the allocatable versions as can be seen in the table below that summarise the data of all performance runs (448 configs) with the intel fort compiler version 2021.4.0 20210910

Relative Difference in Runtime After Switching from Allocatables to Pointers (ifort 2021.4.0 20210910)

Statistic	Value
Mean	-13.82
Standard Deviation	7.14
Minimum	-50.94
1st Quartile (25%)	-16.93
Median (50%)	-14.29
3rd Quartile (75%)	-9.82
Maximum	9.37

Note: Negative values indicate performance improvement (faster runtime) when using pointers instead of allocatables.

Figures for selected nproma and thread counts

The figures below show the average runtime of different configurations in single and double precision with nproma = 8,32 threads=16,32 and the INTEL and GNU compiler. Compiler versions:

INTEL ifort version 2021.4.0 20210910
GNU Fortran (ECMWF) 13.2.0

…nter versions of put and copy routines

reuterbal · 2025-10-02T11:35:36Z

@drieg @m-burba @huppd: For some reason I cannot add you as reviewers directly, but this is the pull request discussed at today's meeting

drieg · 2025-10-13T15:27:19Z

radiation/radiation_single_level.F90

-         &   cos_sza, &       ! (ncol) Cosine of solar zenith angle
-         &   skin_temperature ! (ncol) Skin temperature (K)
+    real(jprb), pointer, dimension(:) :: &
+         &   cos_sza=>null(), &          ! (ncol) Cosine of solar zenith angle


I seem to have stumbled across an issue with the GCC compiler. When creating an object of type single_level_type in ICON, these pointers are associated which leads to an error at the first call of single_level%allocate below, since there is nothing to deallocate:

if (associated(this%cos_sza)) then deallocate(this%cos_sza) this%cos_sza => null() end if

As far as I can see, the code is correct and these pointers should not be associated, but I cannot figure out the reason for this strange behavior of the GCC compiler. I tried to create a small reproducer, but the reproducer works as intended. After nullifying the pointers in the object of single_level_type explicitely, I got the same errors in gas_type. With a long list of nullifies for all objects of ecrad input and output types, everything seems to work. At least it is running, I did not check the results in detail.

I am bit puzzled, did anyone experience something similar?

Short update: It seems that we found the cause of this issue. The associated pointers show only up inside an OMP parallel region. The objects were private, but that seems not to initialize them correctly. Setting the objects to firstprivate seems to fix it.

That's great, thanks for the update. (First)privatization of derived types can be a problem and not all compilers are always able to do that correctly.

That's why I avoided them in our current implementation. But if we do not want to nullify every variable in the derived types inside the parallel region, this seems to be the best solution. The OpenMP standard actually specifies this (Thanks to my colleague Roland Wirth for pointing this all out):

OpenMP 4.5, 2.15.3.3:
"If any statement of the construct references a list item, a new list
item of the same type and type parameters is allocated. This allocation
occurs once for each task generated by the construct and once for each
SIMD lane used by the construct. The initial value of the new list item
is undefined. The initial status of a private pointer is undefined."

OpenMP Version 5.1 2020 (2.21.3):
"If any statement of the construct references a list item, a new list
item of the same type and type parameters is allocated. This allocation
occurs once for each task generated by the construct and once for each
SIMD lane used by the construct. If the type of the list item has
default initialization, the new list item has default initialization.
Otherwise, the initial value of the new list item is undefined. The
initial status of a private pointer is undefined."

drieg · 2025-10-13T15:34:12Z

radiation/radiation_gas.F90

    ! Mixing ratios of variable gases, dimensioned (ncol, nlev,
    ! NMaxGases)
-    real(jprb), allocatable, dimension(:,:,:) :: mixing_ratio
+    real(jprb), pointer, dimension(:,:,:) :: mixing_ratio=>null()


One big advantage of the pointer implementation is that we could get rid of a lot of array copies. Since we do not have the mixing ratios in a contiguous array, this would still require additional allocations or copies for this example. Same issue for the aerosol type. An array of pointers (or a linked list?) could allow us to use pointers in this example as well. Would this be an option for you?

Maybe there are some more issues for this approach to use pointers instead of allocations. I have no implementation up to now, but this would be really interesting for us.

The same is true in IFS - but they are not always in the right units to allow associating them directly. Thus, we have to copy them anyway to store the correctly scaled representation. Just double-checking that this isn't the case in ICON?

Apart from that, a list of pointers may make use on GPU a little more tricky (requires attaching the device pointers in addition to the host pointers) but that should be solvable.

The same is true in IFS - but they are not always in the right units to allow associating them directly. Thus, we have to copy them anyway to store the correctly scaled representation. Just double-checking that this isn't the case in ICON?

This is completely dependent on namelist choices and differs for the different species. At least for some, e.g. water vapor, we could set pointers right away.

drieg · 2025-10-20T12:57:18Z

Thanks for this development! With a naive implementation (i.e. still allocating the pointers plus copy instead of pointing to ICON memory) we see a slightly worse performance on NEC. This will surely look better when we get rid of the copies. From our side, please continue, we are looking forward to this change.

wertysas · 2025-10-21T09:20:47Z

Thanks for this development! With a naive implementation (i.e. still allocating the pointers plus copy instead of pointing to ICON memory) we see a slightly worse performance on NEC. This will surely look better when we get rid of the copies. From our side, please continue, we are looking forward to this change.

Thank you for the comments and for testing this out so quickly! We will go ahead and test this with the full IFS.

wertysas · 2025-11-11T08:36:53Z

As I mentioned in the last ecRad technical development meeting, this has been tested with the IFS and the pointer version has the same performance as the old allocatable version for us in the IFS. Thus, I am marking this as ready for review now. Please feel free to review and test these changes with your codes if you would like to and haven't had the chance to do so already!

wertysas and others added 8 commits September 29, 2025 12:35

make set_units_gas non-recursive

d5a1d9d

change allocatables to pointers in ecrad derived types

b88b64d

add pointer version of get methods in easy_netcdf util

39a8d7d

make change_namelist.sh BSD/GNU agnostic

42029bd

make pointer args intent(inout) in easy_netcdf + remove redundant poi…

94fdf13

…nter versions of put and copy routines

clean up ecrad derived types at end of driver

e2f46ce

add throughput metric prints to ecrad drivers

5c889fe

easy_netcd add get_ptr generic overload, work around for intel comiler

e143bd6

github-actions bot requested review from reuterbal and rjhogan September 29, 2025 14:12

wertysas removed the request for review from rjhogan September 29, 2025 14:12

wertysas requested a review from rjhogan October 2, 2025 12:51

drieg reviewed Oct 13, 2025

View reviewed changes

wertysas marked this pull request as ready for review November 11, 2025 08:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Switch `allocatable` to `pointer` in ecRAD derived types#81

Switch `allocatable` to `pointer` in ecRAD derived types#81
wertysas wants to merge 8 commits intoecmwf-ifs:masterfrom
wertysas:je-develop

wertysas commented Sep 29, 2025 •

edited

Loading

Uh oh!

reuterbal commented Oct 2, 2025

Uh oh!

drieg Oct 13, 2025 •

edited

Loading

Uh oh!

drieg Oct 14, 2025

Uh oh!

reuterbal Oct 14, 2025

Uh oh!

drieg Oct 14, 2025

Uh oh!

drieg Oct 13, 2025

Uh oh!

reuterbal Oct 14, 2025

Uh oh!

drieg Oct 14, 2025

Uh oh!

drieg commented Oct 20, 2025

Uh oh!

wertysas commented Oct 21, 2025

Uh oh!

wertysas commented Nov 11, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

wertysas commented Sep 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

--------------------------------------------

Performance results

Relative Difference in Runtime After Switching from Allocatables to Pointers (ifort 2021.4.0 20210910)

Figures for selected nproma and thread counts

Uh oh!

reuterbal commented Oct 2, 2025

Uh oh!

drieg Oct 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

drieg Oct 14, 2025

Choose a reason for hiding this comment

Uh oh!

reuterbal Oct 14, 2025

Choose a reason for hiding this comment

Uh oh!

drieg Oct 14, 2025

Choose a reason for hiding this comment

Uh oh!

drieg Oct 13, 2025

Choose a reason for hiding this comment

Uh oh!

reuterbal Oct 14, 2025

Choose a reason for hiding this comment

Uh oh!

drieg Oct 14, 2025

Choose a reason for hiding this comment

Uh oh!

drieg commented Oct 20, 2025

Uh oh!

wertysas commented Oct 21, 2025

Uh oh!

wertysas commented Nov 11, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

wertysas commented Sep 29, 2025 •

edited

Loading

drieg Oct 13, 2025 •

edited

Loading