Support of ERA5T added. #89

Open
wants to merge 12 commits into
base: main
dabhicusp
Collaborator

In the current ARCO-ERA5, files are updated from ECMWF on a monthly cadence (roughly the 9th of each month) with a 3-month delay. This PR adds support for ERA5T, in which files are updated daily with a 6-day delay.

With this change, ARCO-ERA5 runs as three cron jobs:

  1. A daily job that downloads ERA5T data for AR and CO (model level). It fetches the day 6 days behind the current day.
  2. A monthly job (on the 6th of every month) that downloads ERA5T data for CO (single level). It fetches the previous month.
  3. A monthly job (on the 9th of every month) that downloads ERA5 data for all datasets. It fetches the third previous month.
    • Download the raw data to temp_location.
    • Compare the ERA5 data with the ERA5T data and, if they differ, update it.
    • Update the raw ERA5T data downloaded in steps 1 and 2 with this ERA5 data.

Note: ERA5T data is actually available 5 days behind the current day, but we add a 1-day buffer for safety.
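As a rough illustration of the three schedules above, the target date or month for each job could be computed as in the sketch below. The function names are hypothetical and not taken from this PR, and "third previous month" is assumed to mean three calendar months before the current month.

```python
from datetime import date, timedelta

# Assumption from the note above: 5 days of ERA5T latency plus 1 day of buffer.
DAILY_LAG_DAYS = 6


def daily_target(today: date) -> date:
    """Day fetched by the daily ERA5T job (AR and CO model level)."""
    return today - timedelta(days=DAILY_LAG_DAYS)


def months_back(today: date, n: int) -> tuple[int, int]:
    """Return (year, month) of the calendar month n months before today's month."""
    index = today.year * 12 + (today.month - 1) - n
    return index // 12, index % 12 + 1


def monthly_era5t_target(today: date) -> tuple[int, int]:
    """Month fetched by the job on the 6th (CO single level): the previous month."""
    return months_back(today, 1)


def monthly_era5_target(today: date) -> tuple[int, int]:
    """Month fetched by the job on the 9th (all datasets): the third previous month."""
    return months_back(today, 3)
```

For example, a daily run on 2025-01-20 would target 2025-01-14, and a run on 2025-01-09 of the ERA5 job would target October 2024 under these assumptions.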

@fzeiser

fzeiser commented Jan 20, 2025

Great addition! Could you describe whether this adds support for era5T or _replaces_ era5 with era5T?

From a brief look at the code it seems to me like the result would be a single file with:

  • era5, up to real time minus ~3 months
  • era5T, [real time minus ~3 months, real time minus ~6 days]

Is that correct? I think it is a great addition to be able to also retrieve era5T from a fast source and in the same format. It might be interesting to keep these as two datasets though, so that you can reproduce a previous state, e.g. retrieve the exact same data as one had 1 year ago. Of course, I understand that storage space needs to be addressed, too.

@DarshanSP19
Collaborator

DarshanSP19 commented Jan 22, 2025

@fzeiser Just one correction to the second bullet:

  • era5T, [real time minus ~1 month, real time minus ~6 days]

And yes, there is no plan to maintain two different datasets (era5 and era5T). Also, era5 will only replace the data if there is a discrepancy between era5 and era5T, which happens very rarely. We will also post here if any such case occurs.
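The replace-only-on-discrepancy behaviour described here could be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: the function names and the local-file layout are hypothetical, and it assumes a byte-level comparison of raw files, whereas the actual pipeline may compare decoded values or work against cloud storage.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in chunks so large raw files are not loaded whole."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def replace_if_different(era5_tmp: Path, era5t_stored: Path) -> bool:
    """Overwrite the stored ERA5T file with the ERA5 file only when they differ.

    Returns True if the stored file was written, False if it was left untouched.
    """
    if era5t_stored.exists() and file_digest(era5_tmp) == file_digest(era5t_stored):
        return False  # identical content: keep the existing file
    shutil.copyfile(era5_tmp, era5t_stored)
    return True
```

The boolean return value makes it easy to log each replacement, which would also give a simple record of when ERA5 and ERA5T actually differed.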

@fzeiser

fzeiser commented Jan 23, 2025

Would it make sense to include the "expver" coordinate, from which one can directly read whether the data for a given day comes from era5 or era5T?
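To make the idea concrete, here is a minimal sketch of what a per-day "expver" label could look like. In CDS downloads, expver 0001 marks final ERA5 and 0005 marks ERA5T; everything else here (the function names and the `last_final_day` cutoff) is a hypothetical illustration and not part of this PR.

```python
from datetime import date, timedelta

EXPVER_ERA5 = "0001"   # final, validated ERA5
EXPVER_ERA5T = "0005"  # preliminary ERA5T


def expver_for(day: date, last_final_day: date) -> str:
    """Label a day as ERA5 if it is on or before the last finalized day, else ERA5T."""
    return EXPVER_ERA5 if day <= last_final_day else EXPVER_ERA5T


def expver_coordinate(start: date, end: date, last_final_day: date) -> list[tuple[date, str]]:
    """Build (day, expver) pairs for every day from start to end, inclusive."""
    n_days = (end - start).days + 1
    days = (start + timedelta(days=i) for i in range(n_days))
    return [(d, expver_for(d, last_final_day)) for d in days]
```

A reader of the dataset could then filter on this label to select only validated ERA5 days, or only the ERA5T tail.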

I searched the internet for some more information, and it seems that CDS also simply overrides the era5T values once era5 is available. (I understand that this might be technically slightly different from what you propose, as you want to override only if values have changed. The effect when downloading the data is of course the same, though.):
https://forum.ecmwf.int/t/final-validated-era5-product-to-differ-from-era5t-in-july-2024/6685/6

Could we link to an officially maintained page that lists the differences between era5 and era5T? Otherwise, could we have an automatic log file somewhere? I know of only ~4-5 incidents since 2021, see e.g.

But it would be nice to have a single combined overview of occurrences.
