Skip to content

Refactor archive methods for robustness and consistency#5

Merged
RussTreadon-NOAA merged 2 commits intofeature/arch_cycfrom
copilot/refactor-arch-methods
Feb 18, 2026
Merged

Refactor archive methods for robustness and consistency#5
RussTreadon-NOAA merged 2 commits intofeature/arch_cycfrom
copilot/refactor-arch-methods

Conversation

Copy link

Copilot AI commented Feb 18, 2026

Description

Refactors _arch_warm_start_increments and _arch_warm_restart_ics in ush/python/pygfs/task/archive.py to eliminate code duplication, add input validation, and fix fragile string matching that could cause incorrect archiving behavior.

Changes

New helper method _normalize_arch_cyc:

  • Centralizes ARCH_CYC normalization (int → list[int], tuple → list[int])
  • Validates inputs with descriptive errors
  • Eliminates duplicate normalization logic in both methods

Improved _arch_warm_start_increments:

  • Validates ARCH_FCSTICFREQ with explicit error handling
  • Early returns reduce nesting
  • Complete parameter documentation

Improved _arch_warm_restart_ics:

  • Replaces magic number 24 with HOURS_PER_DAY constant
  • Case-insensitive RUN matching (RUN.lower()) prevents case-sensitivity bugs
  • Uses .startswith("gdas") instead of "gdas" in RUN to avoid false substring matches (e.g., "mygdas_test")
  • Explicitly handles both "gdas" and "enkfgdas" runs
  • Validates ARCH_WARMICFREQ with explicit error handling

Example of improved string matching:

# Before: fragile substring match
if "gdas" in RUN:  # Matches "gdas", "GDAS", "mygdas", "gdas_test"

# After: robust prefix match
is_gdas_run = RUN.startswith("gdas") or RUN.startswith("enkfgdas")

All changes maintain backward compatibility with existing configurations.

Type of change

  • Maintenance (code refactor, clean-up, new CI test, etc.)

Change characteristics

  • Is this change expected to change outputs (e.g. value changes to existing outputs, new files stored in COM, files removed from COM, filename changes, additions/subtractions to archives)? NO
  • Is this a breaking change (a change in existing functionality)? NO
  • Does this change require a documentation update? NO
  • Does this change require an update to any of the following submodules? NO

How has this been tested?

Unit tests created covering:

  • ARCH_CYC normalization (int, list, tuple, string conversion, error cases)
  • Cycle hour matching for both methods
  • GDAS cycle hour adjustment logic
  • Case-insensitive RUN matching
  • Frequency validation (zero, negative, positive)

Code passes pycodestyle validation.

Checklist

  • Any dependent changes have been merged and published
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have documented my code, including function, input, and output descriptions
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • This change is covered by an existing CI test or a new one has been added
  • Any new scripts have been added to the .github/CODEOWNERS file with owners
  • I have made corresponding changes to the system documentation if necessary
Original prompt

Problem Statement

Refactor the _arch_warm_start_increments and _arch_warm_restart_ics methods in ush/python/pygfs/task/archive.py to improve code quality, consistency, and robustness.

Current Issues

  1. Inconsistent ARCH_CYC handling: _arch_warm_start_increments only handles integers, while both methods should support integers or lists of integers
  2. Missing input validation: No checks for required fields or valid types
  3. Fragile string matching: In _arch_warm_restart_ics, using "gdas" in RUN is case-sensitive and could match unintended substrings
  4. Magic numbers: The value 24 (hours per day) appears without explanation
  5. Poor variable naming: Variables like arch_cyc_val and arch_cyc could be clearer
  6. Incomplete docstrings: Missing parameter and return documentation
  7. Code duplication: Both methods have similar logic for normalizing ARCH_CYC

Required Changes

1. Add Helper Method _normalize_arch_cyc

Add a new private method after line 751 (before _arch_warm_start_increments) to handle normalization of ARCH_CYC to a list of integers:

def _normalize_arch_cyc(self, arch_cyc: Union[int, List[int], Tuple[int, ...]]) -> List[int]:
    """
    Normalizes ARCH_CYC configuration to a list of integers.

    Parameters
    ----------
    arch_cyc : int, list of int, or tuple of int
        Cycle hour(s) for archiving

    Returns
    -------
    List[int]
        List of cycle hours as integers

    Raises
    ------
    ValueError
        If arch_cyc is not an int, list, or tuple, or contains non-integer values
    """
    if isinstance(arch_cyc, int):
        return [arch_cyc]
    elif isinstance(arch_cyc, (list, tuple)):
        try:
            return [int(cyc) for cyc in arch_cyc]
        except (ValueError, TypeError) as e:
            raise ValueError(f"ARCH_CYC list must contain only integers: {e}")
    else:
        raise ValueError("ARCH_CYC must be an int or list/tuple of ints.")

Note: Add the required import at the top of the file:

  • Add Union and Tuple to the imports from typing on line 8

2. Refactor _arch_warm_start_increments (lines 752-778)

Replace the existing method with:

def _arch_warm_start_increments(self, arch_dict: AttrDict) -> bool:
    """
    Determines whether warm restart increments should be archived for the current cycle.

    Parameters
    ----------
    arch_dict : AttrDict
        Dictionary containing configuration options, including:
        - ARCH_CYC (int or list of int): Valid cycle hours for archiving
        - ARCH_FCSTICFREQ (int): Frequency in days for archiving forecast ICs
        - current_cycle (datetime): The current cycle datetime
        - SDATE (datetime): Reference start date
        - assim_freq (int or str): Assimilation frequency in hours

    Returns
    -------
    bool
        True if warm restart increments should be archived, False otherwise.
    """

    # Normalize ARCH_CYC to a list of integers
    cycle_hours = self._normalize_arch_cyc(arch_dict.ARCH_CYC)

    # Check if current cycle hour matches any configured cycle hour
    cycle_HH = int(strftime(arch_dict.current_cycle, "%H"))
    if cycle_HH not in cycle_hours:
        return False

    # Validate archiving frequency
    try:
        fcsticfreq = int(arch_dict.ARCH_FCSTICFREQ)
    except (ValueError, TypeError) as e:
        raise ValueError(f"ARCH_FCSTICFREQ must be an integer: {e}")

    if fcsticfreq <= 0:
        return False

    # Calculate offset cycle and check day frequency
    try:
        SDATE = arch_dict.SDATE
        assim_freq = int(arch_dict.assim_freq)
        ics_offset_cycle = add_to_datetime(
            arch_dict.current_cycle, 
            to_timedelta(f"+{assim_freq}H")
        )
    except (AttributeError, KeyError, ValueError, TypeError) as e:
        raise ValueError(f"Invalid configuration for date calculations: {e}")

    days_since_sdate = (ics_offset_cycle - SDATE).days
    if days_since_sdate % fcsticfreq == 0:
        return True

    return False

3. Refactor _arch_warm_restart_ics (lines 780-811)

Replace the existing method with:

def _arch_warm_restart_ics(self, arch_dict: AttrDict) -> bool:
    """
    Determines whether warm initial conditions (ICs) should be archived for the current cycle.

    For GDAS and EnKFGDAS runs, the archive cycle hours are adjusted by subtracting the
    assimilation frequency, as ICs lag forecast increments by that amount.

    Parameters
    ----------
    arch_dict : AttrDict
        Dictionary containing configuration options, including:
        - ARCH_CYC (int or list of int): Target cycle hour(s) for archiving
        - ARCH_WARMICFREQ (int): Frequency in days for archiving warm ICs
        - current_cycle (datetime): The current cycle datetime
        - SDATE (datetime): Reference start date
        - assim_freq (int or str): Assimilation frequency in hours
        - RUN (str): Run typ...

</details>



<!-- START COPILOT CODING AGENT SUFFIX -->

*This pull request was created from Copilot chat.*
>

<!-- START COPILOT CODING AGENT TIPS -->
---Let Copilot coding agent [set things up for you](https://github.com/RussTreadon-NOAA/global-workflow/issues/new?title=+Set+up+Copilot+instructions&body=Configure%20instructions%20for%20this%20repository%20as%20documented%20in%20%5BBest%20practices%20for%20Copilot%20coding%20agent%20in%20your%20repository%5D%28https://gh.io/copilot-coding-agent-tips%29%2E%0A%0A%3COnboard%20this%20repo%3E&assignees=copilot) — coding agent works faster and does higher quality work when set up for your repo.

- Added Union and Tuple to typing imports
- Added _normalize_arch_cyc helper method for DRY ARCH_CYC normalization
- Refactored _arch_warm_start_increments with better validation and docstrings
- Refactored _arch_warm_restart_ics with improved error handling and case-insensitive RUN matching
- Replaced magic number 24 with HOURS_PER_DAY constant
- Enhanced docstrings with complete parameter and return documentation
- Improved variable names for clarity (cycle_hours, adjusted_cycle_hours)
- Added comprehensive error handling with descriptive messages
- Used .startswith() for more robust GDAS detection
- All changes validated with tests

Co-authored-by: RussTreadon-NOAA <26926959+RussTreadon-NOAA@users.noreply.github.com>
Copilot AI changed the title [WIP] Refactor methods for arch warm start and restart Refactor archive methods for robustness and consistency Feb 18, 2026
Copy link
Owner

@RussTreadon-NOAA RussTreadon-NOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

Approve.

@RussTreadon-NOAA RussTreadon-NOAA marked this pull request as ready for review February 18, 2026 14:14
@RussTreadon-NOAA RussTreadon-NOAA merged commit 89ee509 into feature/arch_cyc Feb 18, 2026
6 checks passed
@RussTreadon-NOAA RussTreadon-NOAA deleted the copilot/refactor-arch-methods branch February 18, 2026 14:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants