Branch of fesom2.6 including recom and tracer parallelisation #681

Open
a270105 wants to merge 85 commits into main from fesom2.6_recom_tp

@a270105
Collaborator

a270105 commented Jan 29, 2025

Tracer parallelisation can be switched on with __usetp in fesom2/src/CMakeLists.txt and is only used when recom is on.

ogurses and others added 8 commits October 15, 2024 15:08
Define new variables to track tracer changes due to advection and diffusion.

We want to save for now diffusion and advection contribution to the tracer changes. Horizontal and vertical diffusion includes Redi parametrization (if it is set .true.).

Fill __ciso directive to ensure that carbon isotope code works. Medusa interface is added.
@a270105
Collaborator Author

a270105 commented Jan 29, 2025

While compiling I found an issue with recompilation: I need to manually delete the executable fesom.x in fesom2.build/bin before compiling the model. Otherwise, after a successful compilation, I still find the old fesom.x in fesom2/bin.
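
The manual workaround described above could be scripted roughly as follows. This is only a sketch assuming a POSIX shell: remove_stale_exe is a hypothetical helper name, and the path in the usage comment is the one quoted in the comment, so it may need adjusting to the actual build layout.

```shell
#!/bin/sh
# Hypothetical helper: delete stale executables before a rebuild so that an
# old binary cannot be mistaken for a freshly built one.
# Arguments: one or more paths to executables that should be removed.
remove_stale_exe() {
    for exe in "$@"; do
        if [ -e "$exe" ]; then
            rm -f -- "$exe" && echo "removed $exe"
        fi
    done
}

# Example (path taken from the comment above, run from its parent directory):
#   remove_stale_exe fesom2.build/bin/fesom.x
# ...then rerun the usual build command.
```

Paths that do not exist are skipped silently, so the helper can be called unconditionally before every build.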

Collaborator

JanStreffing left a comment

Summary of review:
There are a few places that need changing, and some that may not need to change but require a closer look to confirm that everything is indeed OK. There are also a number of style improvements to be made. All comments that start with ! kh date should drop the name and date. The same goes for !YY, !O:G, !OG and !.OG. Git blame can provide this information.

In addition to the standalone FESOM2 CI tests, this branch needs to be tested by me as part of AWI-CM3, by @ackerlar as part of AWI-ESM2, by @sebastianbeyer or @suvarchal as part of IFS-FESOM, and by @mbutzin with active tracers. I'm assuming here that you have already tested this branch with recom and without __usetp.
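
To locate the author- and date-tagged comments mentioned in the summary, a search along these lines might help. It is a sketch and not part of the PR: find_tagged_comments is a hypothetical helper, and the pattern only covers the tags named in the summary (! kh followed by a date, !YY, !O:G, !OG, !.OG), so it may miss variants or flag unrelated comments.

```shell
#!/bin/sh
# Hypothetical helper: list comments that carry an author tag or a date.
# Arguments: files or directories to search (directories are searched recursively).
find_tagged_comments() {
    # "! kh <digit>" covers comments such as "! kh 02.12.21"; the remaining
    # alternatives match the literal tags !YY, !O:G, !OG and !.OG.
    grep -rnE '!([[:space:]]*kh[[:space:]]+[0-9]|YY|O:G|OG|\.OG)' "$@"
}

# Example (the path is an assumption): find_tagged_comments src/
```

Each hit is printed with its file name and line number, which can then be cross-checked with git blame before the tag is removed.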

CMakeLists.txt Outdated
set(USE_ICEPACK OFF CACHE BOOL "compile fesom with the Iceapck modules for sea ice column physics.")
set(OPENMP_REPRODUCIBLE OFF CACHE BOOL "serialize OpenMP loops that are critical for reproducible results")
-set(RECOM_COUPLED OFF CACHE BOOL "compile fesom including biogeochemistry, REcoM3")
+set(RECOM_COUPLED ON CACHE BOOL "compile fesom including biogeochemistry, REcoM3")
Collaborator

This should be reverted to OFF for the main branch.

integer :: AB_order=2
integer :: ID
!___________________________________________________________________________
! TODO: Make it as a part of namelist.tra
Collaborator

Should this be done before the merge?

Collaborator Author

I need to clarify this with Özgür.

@JanStreffing
Collaborator

In addition to the changes requested in the review, the merge conflicts should be resolved. Most of them look like simple parallel additions of new features that both have to be kept, e.g. parallel tracers and icebergs.

JanStreffing added the enhancement label Feb 3, 2025
@ackerlar
Collaborator

ackerlar commented Feb 5, 2025

Is there any template fesom-recom yaml that I can use for testing?

@ackerlar
Collaborator

ackerlar commented Feb 5, 2025

I get this error during compilation:

/home/a/a270124/model_codes/testing_and_debugging/awiesm-2.6/fesom-2.6/src/associate_mesh_ass.h(57): warning #5117: Bad # preprocessor line
#if defined(__async_icebergs)
-^
/home/a/a270124/model_codes/testing_and_debugging/awiesm-2.6/fesom-2.6/src/associate_mesh_ass.h(59): warning #5117: Bad # preprocessor line
#endif
-^
/home/a/a270124/model_codes/testing_and_debugging/awiesm-2.6/fesom-2.6/src/cpl_driver.F90(363): error #6784: The number of actual arguments cannot be greater than the number of dummy arguments.   [OASIS_INIT_COMP]
    CALL oasis_init_comp(comp_id, comp_name, ierror, num_program_groups = num_fesom_groups)
---------^
/home/a/a270124/model_codes/testing_and_debugging/awiesm-2.6/fesom-2.6/src/cpl_driver.F90(961): error #6404: This name does not have a type, and must have an explicit type.   [MPI_COMM_FESOM_SAME_RANK_IN_GROUPS]
      call MPI_Bcast(action, 1, MPI_LOGICAL, 0, MPI_COMM_FESOM_SAME_RANK_IN_GROUPS, MPIerr)
------------------------------------------------^
/home/a/a270124/model_codes/testing_and_debugging/awiesm-2.6/fesom-2.6/src/cpl_driver.F90(961): error #6404: This name does not have a type, and must have an explicit type.   [MPIERR]
      call MPI_Bcast(action, 1, MPI_LOGICAL, 0, MPI_COMM_FESOM_SAME_RANK_IN_GROUPS, MPIerr)
------------------------------------------------------------------------------------^
/home/a/a270124/model_codes/testing_and_debugging/awiesm-2.6/fesom-2.6/src/cpl_driver.F90(976): error #6404: This name does not have a type, and must have an explicit type.   [MYDIM_NOD2D]
          call MPI_Bcast(data_array, myDim_nod2d, MPI_DOUBLE_PRECISION, 0, MPI_COMM_FESOM_SAME_RANK_IN_GROUPS, MPIerr)
-------------------------------------^
compilation aborted for /home/a/a270124/model_codes/testing_and_debugging/awiesm-2.6/fesom-2.6/src/cpl_driver.F90 (code 1)
make[2]: *** [src/CMakeFiles/fesom.dir/build.make:205: src/CMakeFiles/fesom.dir/cpl_driver.F90.o] Error 1

@a270105
Collaborator Author

a270105 commented Feb 5, 2025

I am still testing the code. So far I can run ocean-only without __usetp in CMakeLists and compile the model with __usetp. But I have not yet tested the coupled setup.

@a270105
Collaborator Author

a270105 commented Feb 5, 2025

> I get this error during compilation:
> [compiler output as in the previous comment]

I just compiled it with FESOM_COUPLED ON and didn't get any error. Did you compile the model with esm-tools?

@ackerlar
Collaborator

ackerlar commented Feb 5, 2025

@a270105 yes, I used esm_tools: esm_master get-awiesm-2.6, changed the FESOM branch to fesom2.6_recom_tp and then esm_master comp-awiesm-2.6

Which oasis version are you using?

@ackerlar
Collaborator

ackerlar commented Feb 5, 2025

At least I got rid of this error

/home/a/a270124/model_codes/testing_and_debugging/awiesm-2.6/fesom-2.6/src/cpl_driver.F90(363): error #6784: The number of actual arguments cannot be greater than the number of dummy arguments.   [OASIS_INIT_COMP]
    CALL oasis_init_comp(comp_id, comp_name, ierror, num_program_groups = num_fesom_groups)

when switching to the oasis branch feat/multi-group-support. Before, I used 2.8mct-awiesm-2.1. However, the other errors remain.

call cpl_oasis3mct_init(f%partit,f%partit%MPI_COMM_FESOM)
! call cpl_oasis3mct_init(f%partit,f%partit%MPI_COMM_FESOM)
! kh 02.12.21
#if defined(__recom) && defined(__usetp)
Collaborator

"localCommunicator" is neither defined in fesom_module nor in MOD_PARTIT. Should this really be "localCommunicator" or "f%partit%MPI_COMM_FESOM"?

Collaborator Author

I can't really answer this. These lines already existed in fesom2.5-recom and in another earlier version where Özgür and I merged their complex recom version and my simpler paleo version. But I can't find them in my paleo version, where Kai first implemented tracer parallelisation. There this part of the code is still in fvom_main.f90 and only contains:
#if defined (__oasis)

! kh 21.03.22 pass num_fesom_groups to coupler
call cpl_oasis3mct_init(MPI_COMM_FESOM, num_fesom_groups)
#endif
t1 = MPI_Wtime()

call par_init

......

So I think that, for the case with tracer parallelisation, it should be:
call cpl_oasis3mct_init(f%partit, f%partit%MPI_COMM_FESOM, num_fesom_groups)

@ackerlar
Collaborator

ackerlar commented Feb 5, 2025

FESOM compiled for me with some minor changes in MPI calls. I also changed

-        call cpl_oasis3mct_init(f%partit, f%partit%localCommunicator, num_fesom_groups)
+        call cpl_oasis3mct_init(f%partit, f%partit%MPI_COMM_FESOM, num_fesom_groups)

in fesom_module as localCommunicator is not defined outside of cpl_driver. Please check whether this makes sense.

However, the model crashes because FESOM is missing some entries in namelist.config. Is there a recom-specific namelist.config?

@a270105
Collaborator Author

a270105 commented Feb 5, 2025

> @a270105 yes, I used esm_tools: esm_master get-awiesm-2.6, changed the FESOM branch to fesom2.6_recom_tp and then esm_master comp-awiesm-2.6
>
> which oasis version are you using?

a270105 closed this Feb 5, 2025
a270105 reopened this Feb 5, 2025
@a270105
Collaborator Author

a270105 commented Feb 5, 2025

> FESOM compiled for me with some minor changes in MPI calls. I also changed
>
> -        call cpl_oasis3mct_init(f%partit, f%partit%localCommunicator, num_fesom_groups)
> +        call cpl_oasis3mct_init(f%partit, f%partit%MPI_COMM_FESOM, num_fesom_groups)
>
> in fesom_module as localCommunicator is not defined outside of cpl_driver. Please check whether this makes sense.
>
> However, the model crashes as FESOM is missing some entries in namelist.config. Is there a recom specific namelist.config?

Do you compile with __usetp? If not, there is no additional entry needed in namelist.config.

@ackerlar
Collaborator

ackerlar commented Feb 6, 2025

> Do you compile with __usetp? If not, there is no additional entry needed in namelist.config.

Yes, I set RECOM_COUPLED=ON which gives

if(${RECOM_COUPLED})
#   target_compile_definitions(${PROJECT_NAME} PRIVATE __recom USE_PRECISION=2 __usetp)
#   target_compile_definitions(${PROJECT_NAME} PRIVATE __recom USE_PRECISION=2)
   target_compile_definitions(${PROJECT_NAME} PRIVATE __recom USE_PRECISION=2 __3Zoo2Det __coccos __usetp)
endif()

in src/CMakeLists.txt, so __usetp should be in use if I understand correctly.

JanStreffing modified the milestones: FESOM 2.7.1, FESOM 2.8 Feb 18, 2026
@JanStreffing
Collaborator

@suvarchal, can you give an estimate of when you can provide an alternative RECOM parallelization method? If not, I suggest we go ahead with this merge.

@JanStreffing
Collaborator

@suvarchal agrees to merge this. However, the branch is now in a state where some of the tests are failing.
Can you have a look @a270105? In particular:

 Error termination. Backtrace:
At line 430 of file /__w/fesom2/fesom2/src/gen_model_setup.F90
Fortran runtime error: End of file

As soon as the tests pass, we can merge. Afterwards we can work on getting this to run inside AWI-ESM3.
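
With gfortran, an "End of file" runtime error on a namelist read commonly means that the requested group was not found in the file. Whether that is the cause at line 430 here is not confirmed in this thread. Under that assumption, a quick check is to test whether a namelist file contains a given group; has_namelist_group is a hypothetical helper, and the file and group names in the usage comment are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a Fortran namelist file contains a group.
# $1 = namelist file, $2 = group name (matched case-insensitively,
#      since Fortran namelist group names are not case-sensitive).
has_namelist_group() {
    # A group starts with "&name" at the beginning of a line (optionally
    # indented), followed by whitespace or the end of the line.
    grep -qiE "^[[:space:]]*&$2([[:space:]]|\$)" "$1"
}

# Example (file and group names are placeholders):
#   has_namelist_group namelist.config run_config || echo "group missing"
```

If the group is missing from the namelist used by the CI tests, adding it there (or guarding the read) would be the next thing to try.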

@a270105
Collaborator Author

a270105 commented Mar 3, 2026

I think that we do not need the subroutine read_namelist_run_config any more, since the only parameter needed for tracer parallelisation is num_fesom_groups, which is read in gen_model_setup.F90 (L165):
#if defined(__recom) && defined(__usetp)
! number of groups for multi FESOM group loop parallelization
integer :: num_fesom_groups=1
namelist /run_config/ num_fesom_groups
#endif

@JanStreffing
Collaborator

> I think that we do not need the subroutine read_namelist_run_config any more, since the only parameter that is needed for tracer parallelisation is num_fesom_groups and is read in gen_model_setup.F90 (L165):
> #if defined(__recom) && defined(__usetp)
> ! number of groups for multi FESOM group loop parallelization
> integer :: num_fesom_groups=1
> namelist /run_config/ num_fesom_groups
> #endif

Feel free to remove. I think what I tried broke it even more.

@a270105
Collaborator Author

a270105 commented Mar 3, 2026

I just commented out the line in fesom_module.F90 that calls this subroutine and compiled the code.
Do all tests have to pass for this branch? At the least, I need different namelist.config, namelist.tra and namelist.recom files to run my tests.

@JanStreffing
Collaborator

JanStreffing commented Mar 3, 2026

These tests all need to work, yes. @suvarchal, can you have a look with @a270105 ?

@suvarchal
Collaborator

> These tests all need to work, yes. @suvarchal, can you have a look with @a270105 ?

Give me a day; I will fix these tests, and then we can hopefully merge.

Labels

enhancement, RECOM

Development

Successfully merging this pull request may close these issues.

Porting fast FESOM/RECOM to main branch

8 participants