add a stream attribute output_done_marker to allow generating a 'done' file after stream writing is completed #237

@guoqing-noaa

In our realtime/retro ensemble runs, we encountered random getkf crashes that were troublesome and difficult to diagnose. This issue was reported in NOAA-EMC/rrfs-workflow#1395.

The root cause was traced to partially written mpasout files, left behind when a forecast terminated unexpectedly during the file-writing process. This can occur for various reasons, such as system glitches or slow disk performance preventing complete output within the allocated walltime.

A key contributing factor is that it is currently difficult to reliably determine whether an mpasout file has been fully written.

We added a new stream attribute output_done_marker to allow generating a 'done' file once stream writing is completed.
RRFSx#20

This greatly streamlines our workflow, eliminating crashes and reducing the need for complex validation logic.
NOAA-EMC/rrfs-workflow#1401
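
As a rough illustration of how a downstream task could use such a marker, here is a minimal Python sketch. It is not taken from the linked PRs: the `.done` suffix, the marker location next to the output file, and the helper names `done_marker_for` and `wait_for_output` are all assumptions made for this example.

```python
import time
from pathlib import Path


def done_marker_for(output_file: Path, suffix: str = ".done") -> Path:
    """Return the marker path for an output file.

    ASSUMPTION: the marker sits next to the output file and is named
    '<output file name>.done'. The real naming is set by the stream
    attribute and may differ.
    """
    return output_file.with_name(output_file.name + suffix)


def wait_for_output(output_file: Path,
                    timeout_s: float = 600.0,
                    poll_s: float = 5.0) -> bool:
    """Poll until both the output file and its marker exist.

    Returns True as soon as both are present, or False if the timeout
    expires first. The check always runs at least once.
    """
    marker = done_marker_for(output_file)
    deadline = time.monotonic() + timeout_s
    while True:
        if marker.exists() and output_file.exists():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

With a check like this, a partially written file is never picked up, because the marker only appears after stream writing has finished.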

Additionally, we no longer need to wait several minutes (typically about 5) for a new file to become “old” (to make sure the writing is completed) before triggering downstream tasks. Given the large number of ensembles and cycles we run, this results in substantial cumulative time savings.
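
For contrast, the age-based heuristic that the marker replaces might look roughly like the sketch below. The issue does not say how file age was measured, so the use of the modification time, the 300-second default, and the function name `is_old_enough` are assumptions for illustration only.

```python
import time
from pathlib import Path
from typing import Optional


def is_old_enough(path: Path,
                  min_age_s: float = 300.0,
                  now: Optional[float] = None) -> bool:
    """Treat a file as complete if it has not been modified for min_age_s seconds.

    ASSUMPTION: age is measured from the file's modification time.
    This is only a heuristic: a file from a forecast that died mid-write
    also stops changing and will eventually pass this test.
    """
    if not path.exists():
        return False
    current = time.time() if now is None else now
    return (current - path.stat().st_mtime) >= min_age_s
```

Besides forcing every downstream task to wait out the full delay, this kind of check cannot tell a finished file from a truncated one, which is the failure mode described above.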

This change depends on the merge of PR #217, where the newly added stream_mgr_set_property_c(...) function provides a convenient way to set stream attributes.
