acquire ftp: add overwrite mode and metadata-aware sync for filled-frame use cases #217

@Hackshaven

What’s missing

Zyra’s acquire ftp --sync-dir correctly avoids re-downloading files that already exist locally. However, when the upstream FTP server later fills in data that was originally missing, this skip-if-present behavior leaves stale fallback data (e.g., nearest-fill frames) in place even after the true data becomes available.

Why it matters

  • Users often pre-fill missing data locally using interpolation or replication.
  • Later, the upstream FTP server may backfill those frames with real data — but the filenames don't change.
  • Zyra skips downloading these improved frames unless the user manually deletes the local copies.
  • In progressive datasets (e.g., reanalysis, delayed QC), this behavior breaks assumptions about data accuracy and freshness.

Proposed solution

Enhance acquire ftp with smarter sync behavior:

📦 Overwrite Logic

  • Default: Overwrite local files if the remote version is newer (based on modification timestamps, if available).
  • --overwrite-existing: Always overwrite local files, regardless of timestamp.
  • --recheck-existing: Compare remote vs. local file size or other metadata when timestamps are missing.
  • --min-remote-size: Only overwrite if the remote file is significantly larger than the local copy.
  • --prefer-remote: Always trust the remote version.
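
To make the intended precedence concrete, here is a minimal sketch of how the overwrite decision could be ordered. It is not Zyra's implementation: `FileInfo` and `should_download` are hypothetical names, and the ordering (force flag first, then timestamps, then a size comparison as the fallback) is an assumption. `--min-remote-size` and `--prefer-remote` are left out because the proposal does not yet define a size threshold or how `--prefer-remote` would differ from `--overwrite-existing`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileInfo:
    """Size and modification time of one file (hypothetical helper type)."""
    size: int
    mtime: Optional[float] = None  # seconds since the epoch; None if unknown


def should_download(
    remote: FileInfo,
    local: Optional[FileInfo],
    overwrite_existing: bool = False,
    recheck_existing: bool = False,
) -> bool:
    """Decide whether the remote file should replace the local copy."""
    # No local copy yet: always download.
    if local is None:
        return True
    # --overwrite-existing: ignore timestamps entirely.
    if overwrite_existing:
        return True
    # Proposed default: overwrite only when the remote copy is newer.
    if remote.mtime is not None and local.mtime is not None:
        return remote.mtime > local.mtime
    # --recheck-existing: timestamps are missing, so fall back to size.
    if recheck_existing:
        return remote.size != local.size
    # Nothing indicates a change: keep the local file.
    return False
```

With this ordering, a backfilled frame whose remote timestamp is newer than the local nearest-fill copy is re-downloaded by default, while a server that reports no timestamps only triggers a re-download when `--recheck-existing` is set and the sizes differ.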

🧠 Metadata-Aware Sync

Optional new flags for file quality tracking:

  • --prefer-remote-if-meta-newer: Use frames-meta.json timestamps (if present) to detect upstream updates.
  • --skip-if-local-done: Skip download if a .done file exists next to the local copy.
  • --recheck-missing-meta: Re-download any file missing a .meta or .json companion.

These would allow intelligent re-syncing of filled frames using Zyra's own metadata or common FTP workflow conventions.
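
The two companion-file checks could look roughly like the sketch below. The function names are hypothetical, and the naming convention (the marker or companion is the data file's full name plus a suffix, e.g. `frame_0001.png.done`) is an assumption, since the proposal does not specify it. `--prefer-remote-if-meta-newer` is not sketched because it depends on the layout of frames-meta.json.

```python
from pathlib import Path


def has_done_marker(local_file: Path) -> bool:
    """True if '<name>.done' sits next to the file (assumed naming)."""
    return local_file.with_name(local_file.name + ".done").exists()


def missing_companion(local_file: Path) -> bool:
    """True if neither '<name>.meta' nor '<name>.json' exists (assumed naming)."""
    candidates = [
        local_file.with_name(local_file.name + suffix)
        for suffix in (".meta", ".json")
    ]
    return not any(path.exists() for path in candidates)


def needs_redownload(
    local_file: Path,
    skip_if_local_done: bool = False,
    recheck_missing_meta: bool = False,
) -> bool:
    """Apply the two companion-file rules to one local path."""
    # A file that is not present locally is always fetched.
    if not local_file.exists():
        return True
    # --skip-if-local-done: a .done marker means the file is final.
    if skip_if_local_done and has_done_marker(local_file):
        return False
    # --recheck-missing-meta: no companion file, so fetch it again.
    if recheck_missing_meta and missing_companion(local_file):
        return True
    # Otherwise keep the existing local copy.
    return False
```

In this sketch the `.done` marker takes priority over the missing-companion check, so a frame explicitly marked as final is never re-fetched. Whether that is the right precedence is an open design question for the proposal.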

Benefits

  • Maintains accurate local mirrors as upstream datasets evolve.
  • Prevents fallback data from silently persisting.
  • Avoids manual file deletion or full re-syncs.
  • Supports scientific integrity in progressive/backfilled datasets.
  • Aligns with Zyra’s frames-meta.json system and promotes metadata-driven workflows.

Labels: workflow-gap, enhancement
