Skip to content

Fix globbing CWD bug and temp file race in Archive task#7

Closed
Copilot wants to merge 2 commits intofeature/arch_cycfrom
copilot/fix-relative-path-globbing-bug
Closed

Fix globbing CWD bug and temp file race in Archive task#7
Copilot wants to merge 2 commits intofeature/arch_cycfrom
copilot/fix-relative-path-globbing-bug

Conversation

Copy link

Copilot AI commented Feb 24, 2026

Description

Two bugs in ush/python/pygfs/task/archive.py:

  1. Relative-path glob resolution_process_additional_datasets() converts dataset paths to ROTDIR-relative paths, but _create_fileset() expanded globs against the process CWD, not ROTDIR, causing required files to be falsely reported missing.

  2. Temp file concurrency racereplace_string_from_to_file() (inside _rename_cyclone_expt()) wrote to a hardcoded /tmp/track_file before moving it to the destination. Multiple concurrent jobs on the same node would clobber each other's temp file.

Fix 1 — _create_fileset(): glob relative to ROTDIR

Added optional rotdir parameter; wraps glob expansion in with chdir(rotdir) if rotdir is not None else contextlib.nullcontext(). Call site in _process_additional_datasets() now passes arch_dict['ROTDIR'].

# before
dataset["fileset"] = Archive._create_fileset(dataset)

# after
dataset["fileset"] = Archive._create_fileset(dataset, arch_dict['ROTDIR'])

Fix 2 — replace_string_from_to_file(): unique temp file per invocation

Replaced fixed /tmp/track_file with tempfile.NamedTemporaryFile(delete=False, dir=os.path.dirname(filename_out)) so each concurrent job gets a unique temp file in the destination directory (same filesystem → atomic rename). Added except block to unlink the temp file on failure.

New imports: contextlib, tempfile.

Type of change

  • Bug fix (fixes something broken)
  • New feature (adds functionality)
  • Maintenance (code refactor, clean-up, new CI test, etc.)

Change characteristics

  • Is this change expected to change outputs (e.g. value changes to existing outputs, new files stored in COM, files removed from COM, filename changes, additions/subtractions to archives)? NO
    • GFS
    • GEFS
    • SFS
    • GCAFS
  • Is this a breaking change (a change in existing functionality)? NO
  • Does this change require a documentation update? NO
  • Does this change require an update to any of the following submodules? NO
    • EMC verif-global
    • GDAS
    • GFS-utils
    • GSI
    • GSI-monitor
    • GSI-utils
    • UFS-utils
    • UFS-weather-model
    • wxflow

How has this been tested?

  • Code review and static analysis (CodeQL: 0 alerts)

Checklist

  • Any dependent changes have been merged and published
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have documented my code, including function, input, and output descriptions
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • This change is covered by an existing CI test or a new one has been added
  • Any new scripts have been added to the .github/CODEOWNERS file with owners
  • I have made corresponding changes to the system documentation if necessary
Original prompt

Create a minimal, low-risk change set to address two issues in the Archive task.

Repository: RussTreadon-NOAA/global-workflow
Base branch: feature/arch_cyc

Changes required

  1. Fix relative-path globbing bug in ush/python/pygfs/task/archive.py.

    • Current behavior: _process_additional_datasets() converts rendered dataset required/optional entries from absolute paths to relative paths (relative to ROTDIR) via _convert_dataset_paths_to_relative(). Then _create_fileset() expands globs using glob.glob(item), which operates relative to the current working directory, not necessarily ROTDIR. This can incorrectly report required files missing.
    • Minimal fix: Ensure glob expansion for these relative dataset entries occurs with ROTDIR as the working directory.
    • Implementation guidance (minimal scope):
      • Update Archive._create_fileset() to accept an optional rotdir argument (or infer from atardir_set if present) and, when provided, run the globbing logic inside with chdir(rotdir): ... (the module already imports chdir from wxflow).
      • Update the call site in _process_additional_datasets() to pass arch_dict['ROTDIR'].
      • Keep behavior otherwise unchanged: required files still error if missing; optional still warns.
  2. Fix concurrency race in _rename_cyclone_expt() within ush/python/pygfs/task/archive.py.

    • Current behavior: nested helper replace_string_from_to_file() writes a fixed temp file /tmp/track_file and then moves it to the final output path. This is unsafe when multiple jobs run concurrently on the same node.
    • Minimal fix: Use a unique temporary file via tempfile.NamedTemporaryFile(delete=False, ...), preferably created in the destination directory (or system temp dir) and then atomically moved into place with shutil.move().
    • Ensure any temporary file is cleaned up on exception.

Constraints

  • Keep the patch minimal and low-risk.
  • No other functional changes beyond the two fixes above.
  • Update imports as needed.
  • Run/adjust any lint/formatting only as necessary for the changes.

Files in scope

  • ush/python/pygfs/task/archive.py

Notes

  • The user also referenced ush/python/pygfs/utils/archive_tar_vars.py but requested changes are in archive.py.

This pull request was created from Copilot chat.


💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.

Co-authored-by: RussTreadon-NOAA <26926959+RussTreadon-NOAA@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix relative-path globbing bug in archive task Fix globbing CWD bug and temp file race in Archive task Feb 24, 2026
@RussTreadon-NOAA
Copy link
Owner

Good changes but outside the scope of this branch.

Will close now and possibly revisit later in a different branch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants