Adding cycling to 3dvar_cf suite #752

Draft
mer-a-o wants to merge 15 commits into develop from feature/mer-a-o/add-3dvar-cf-cycling


mer-a-o commented Mar 26, 2026

Description

This PR adds a new cycling 3DVar suite for GEOS-CF. The experiment runs a 12h forecast starting from the beginning of the assimilation window and applies the increment from the middle of the window using IAU. The forecast length and output frequency can be set with forecast_length and forecast_output_frequency, but changing the forecast length has not been tested.

Summary of changes:

  • RC files that get templated with SWELL are in src/swell/configuration/jedi/interfaces/geos_cf/namelists. The remaining RC files (static) are copied into the scratch directory from geos_cf_run_dir.

  • get_background task fetches backgrounds from R2D2 using background_experiment in the config as the experiment ID. In cycling experiments (those with "cycle" in their name), the experiment ID is set to background_experiment for the first cycle. For subsequent cycles, it is set to the r2d2_experiment_id of the current experiment, so that backgrounds from the previous cycle are fetched for the variational task in the current one.

  • save_forecast task stores forecasts in R2D2 using r2d2_experiment_id as the experiment ID. These files are then fetched from R2D2 during the get_background task.

  • Restarts are saved (save_restart) and fetched (get_restart) from R2D2. To save space, restart files are not stored on R2D2 at every cycle. The rst_store_interval key controls how many cycles pass before restart files are stored as real files rather than symlinks. In intermediate cycles, restarts are saved as symlinks.

  • prep_forecast prepares the scratch directory within the current cycle's run directory:

    • Copies static RC files from geos_cf_run_dir and copies/edits RC files from src/swell/configuration/jedi/interfaces/geos_cf/namelists.
    • Copies GEOS-FP files for replay into the scratch directory.
    • Changes the format of the JEDI increment file using inc_template.

  • run_forecast submits gcm_run_geoscf.j (a lighter version of gcm_run.j used for running GEOS-CF) to the queue and waits until the job finishes. One limitation: the Cylc interface shows this job as "running" whether it is waiting in the queue or actually executing. There may be better approaches for handling gcm_run.j execution.

  • clean_cycle.py now cleans the scratch directory of the previous cycle after the current cycle completes. This way, the restart files from the previous cycle that are saved as symlinks are kept until the current cycle uses them.
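The experiment-ID selection in get_background and the restart storage rule described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the function names and arguments are hypothetical, the cycle index is assumed to be zero-based, and the modulo rule for rst_store_interval is an assumption about how "how many cycles pass" is counted.

```python
def select_background_experiment(
    name: str,
    is_first_cycle: bool,
    background_experiment: str,
    r2d2_experiment_id: str,
) -> str:
    """Return the R2D2 experiment ID that get_background would fetch from.

    Hypothetical helper: cycling experiments are identified by "cycle"
    appearing in their name, as stated in the description.
    """
    is_cycling = "cycle" in name
    if is_cycling and not is_first_cycle:
        # Later cycles read the backgrounds stored by save_forecast
        # under this experiment's own ID.
        return r2d2_experiment_id
    # Non-cycling experiments, and the first cycle of a cycling one,
    # read from the configured background experiment.
    return background_experiment


def store_restarts_as_real_files(cycle_index: int, rst_store_interval: int) -> bool:
    """Return True if this cycle stores real restart files, False for symlinks.

    Assumption: real files are written every rst_store_interval cycles,
    counting from cycle 0.
    """
    if rst_store_interval <= 0:
        raise ValueError("rst_store_interval must be a positive integer")
    return cycle_index % rst_store_interval == 0
```

With rst_store_interval set to 4, for example, this sketch would store real files at cycles 0, 4, 8, and so on, and symlinks in between.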

mer-a-o requested review from jeromebarre and viral211 March 26, 2026 22:40
mer-a-o added the compo (Atmospheric composition related issues) label Mar 26, 2026
mer-a-o requested a review from rtodling March 30, 2026 14:05

This file should not exist in the repo; it should exist only in the experiment. One way to keep it out of the repo is to have swell create it from the initial date/time of the experiment. From then on, the experiment (GCM) will create this file on its own, and the file can be recycled from one cycle to the next.

END_DATE: 29990302 210000

# length of a single segment/job 1day
JOB_SGMT: >>SWELL_FC_LENGTH<<

For consistency with existing conventions in GEOS, can we change '>>SWELL_FC_LENGTH<<' to '>>>SWELL_FC_LENGTH<<<'?

I imagine there are a number of other places with similar templating notation. Can the same comment apply to all?
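If the change is applied across all templated files, a small script could do the conversion. The following is only a sketch: the function name is hypothetical, and it assumes every placeholder has the form `>>NAME<<` with NAME made of letters, digits, and underscores. Placeholders that already use triple brackets are left untouched, so running it twice is harmless.

```python
import re

# Match >>NAME<< only when it is not already part of >>>NAME<<<.
_DOUBLE_BRACKET = re.compile(r"(?<!>)>>([A-Za-z0-9_]+)<<(?!<)")


def to_triple_brackets(text: str) -> str:
    """Rewrite >>NAME<< placeholders as >>>NAME<<< (hypothetical helper)."""
    return _DOUBLE_BRACKET.sub(r">>>\1<<<", text)


if __name__ == "__main__":
    print(to_triple_brackets("JOB_SGMT: >>SWELL_FC_LENGTH<<"))
```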

# -------------------------
RECORD_FINAL: NO
RECORD_FREQUENCY: 000000 000000
RECORD_REF_DATE: >>SWELL_FC6H_DATE<< 21220215

Here too: please convert the double gt/lt into triple gt/lt.

PC360x181-DC.LM: 72


geoscf_jedi.template: '%y4%m2%d2T%h2%n200Z.nc4',

This looks odd to me. If I understand this correctly, the cycle you run for CF is no different from what we do for ATMOS: a 12-hour cycle in which the model is forced by the analysis (and updated with Compo-Ana) for the first 6 hours, followed by 6 hours of free-running model. The prognostic files (backgrounds for the next Compo-Ana) should come out of the last 6-hour part of the integration, but it looks to me like what you have here writes the backgrounds at some frequency throughout the 12-hour integration. I don't think that's what you want.
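To make the timing concrete, here is a minimal sketch of which valid times would fall in the free-running part of the cycle. It assumes a 12-hour integration whose first 6 hours are the forced segment, and it assumes both endpoints of the last 6 hours are included; the function name and defaults are hypothetical and are not taken from swell or GEOS.

```python
from datetime import datetime, timedelta
from typing import List


def background_valid_times(
    cycle_start: datetime,
    output_frequency: timedelta,
    forced_length: timedelta = timedelta(hours=6),
    forecast_length: timedelta = timedelta(hours=12),
) -> List[datetime]:
    """List output times restricted to the free-running segment of the cycle."""
    if output_frequency <= timedelta(0):
        raise ValueError("output_frequency must be positive")
    first = cycle_start + forced_length   # end of the forced segment
    last = cycle_start + forecast_length  # end of the integration
    times = []
    current = first
    while current <= last:
        times.append(current)
        current += output_frequency
    return times


if __name__ == "__main__":
    for t in background_valid_times(datetime(2026, 1, 1, 21), timedelta(hours=3)):
        print(t.strftime("%Y%m%dT%H%M00Z"))
```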

geoscf_jedi.format: 'CFIO',
geoscf_jedi.mode: 'instantaneous' ,
#geoscf_jedi.frequency: 010000,
geoscf_jedi.frequency: >>SWELL_FC_OUTPUT_FREQ<<,

We should talk about the naming conventions for these templated frequencies and such.

bkg_steps = []

# Parse config
background_experiment = self.config.background_experiment()

I am trying to understand the reason for the changes here. Is this because the backgrounds for various cycles for the swell experiment can be located outside of r2d2?

# --------------------------------------------------------------------------------------------------

r2d2_model_dict = {
'geos_cf': 'geos_cf',

Given the name and location of this task, I assume this is a general task to handle restarts, but it seems wired to geos_cf. Am I misreading?

It seems that restarts are being handled by r2d2. Is there a way to extend this so that restarts can also be stored in a tarball? I'm not asking for it to be implemented now, just wondering. Loose restarts are something the ADAS moved away from quite some time ago, given the sheer number of restarts and what they do to the inode count.

@@ -0,0 +1,19 @@
SpeciesName: CO

I am puzzled as to why this RC file, the other RC files, and the GEOS yaml need to be here. These files should be in the tag of GEOS that supports CF, no? What am I missing?

@@ -0,0 +1,20 @@
SpeciesName: NO2

Same comment as above for RC files ...

@@ -0,0 +1,2769 @@
Samplings:

Same comment as for the RC files - this should be in the GEOS tag and in the settings of your experiment.



2 participants