Separate GetObs into individual tasks per observation #724

Draft
mranst wants to merge 20 commits into develop from feature/mranst/individual_obs


mranst (Collaborator) commented Mar 5, 2026

This is an alternative to #721; we can merge one or the other, depending on which makes the most sense. Instead of looping through observations, each observation is now fetched as its own task.

Pros:
- Much simpler to see which observations are failing to fetch and why
- Can re-trigger failed tasks easily

Cons:
- May complicate flow.cylc somewhat
- Implementation may be a bit hacky

To limit the number of concurrent tasks, I introduced a key for this cylc parameter that is limited to 10 by default.
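For readers less familiar with Cylc, here is a minimal sketch of how one task per observation with a concurrency cap could look in a `flow.cylc`. It is not taken from this PR: the observation names, the `GET_OBS` family, the `get_obs_queue` queue, and the `NextTask` task are all hypothetical, and it assumes the cap is enforced with a Cylc 8 internal queue, which may differ from the key introduced here.

```cylc
[task parameters]
    # Hypothetical observation names; the real list is not shown in this thread
    obs = aircraft, sondes, amsua_n19

[scheduling]
    [[queues]]
        # Hypothetical queue: at most 10 member tasks are active at once
        [[[get_obs_queue]]]
            limit = 10
            members = GET_OBS
    [[graph]]
        # GetObs<obs> expands to one task per value of the obs parameter
        R1 = "GetObs<obs> => NextTask"

[runtime]
    # Empty family used only to collect the per-observation tasks
    [[GET_OBS]]
    [[GetObs<obs>]]
        inherit = GET_OBS
        # Placeholder command; Cylc exports the parameter value to the job
        script = echo "fetching ${CYLC_TASK_PARAM_obs}"
    [[NextTask]]
        script = true
```

With a layout like this, each observation gets its own task, so a single failed fetch can be inspected and re-triggered without rerunning the others.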

mer-a-o (Contributor) left a comment

I added two lines to the hofx_cf suite to nest the GetObservations tasks, as in the image below:

Image

It'll make the tui a little cleaner. Is this something that we can use for other suites too?

mranst (Collaborator, Author) commented Mar 6, 2026

> It'll make the tui a little cleaner. Is this something that we can use for other suites too?

Cool, I didn't know inherit would group tasks like that in the tui, thanks!
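The two lines added to hofx_cf are not shown in this thread. As a rough sketch of the general idea only, grouping parameterized tasks under a family with `inherit` might look like the fragment below. The family name, task name, and observation names are assumptions made for illustration.

```cylc
[task parameters]
    # Hypothetical observation names
    obs = aircraft, sondes

[runtime]
    # Hypothetical empty family; it exists only so the tasks share a parent
    [[GetObservations]]
    [[GetObservation<obs>]]
        # Each per-observation task becomes a member of the family,
        # so tree views can collapse them under a single entry
        inherit = GetObservations
```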

Dooruk (Collaborator) left a comment

I really like the ability to see individual observations, it would tremendously help in a monitoring setting. It's better than #721 in that regard.

One downside is now we would have 30 log files instead of 1 for each GetObs task, which is an IOnode issue over multiple timesteps. But we could look into some other solution for that. I agree it complicates flow.cylc but not too bad IMO.

Does it help with the execution time? I can test it when I'm not abusing R2D2 API calls but maybe you already made comparisons?

mranst (Collaborator, Author) commented Mar 12, 2026

> I really like the ability to see individual observations, it would tremendously help in a monitoring setting. It's better than #721 in that regard.
>
> One downside is now we would have 30 log files instead of 1 for each GetObs task, which is an IOnode issue over multiple timesteps. But we could look into some other solution for that. I agree it complicates flow.cylc but not too bad IMO.
>
> Does it help with the execution time? I can test it when I'm not abusing R2D2 API calls but maybe you already made comparisons?

IOnodes, always a concern... I'm not sure there's a good solution to this, since each observation having its own log file is part of the appeal of this approach.

It's a little difficult to gauge because there's no clean timing readout at the end, but from what I can tell from the scheduling log, it takes over twice as long as #721 🙃
