added argo-workflows scraper#3392

Open
kinjalh wants to merge 1 commit into master from compat-scraper-argo-wfs

Conversation

@kinjalh (Member) commented Apr 6, 2026

Test Plan

tested via CLI commands

Checklist

  • If required, I have updated the Plural documentation accordingly.
  • I have added tests to cover my changes.
  • I have added a meaningful title and summary to convey the impact of this PR to a user.

Plural Flow: console

@kinjalh added the enhancement label Apr 6, 2026
greptile-apps bot (Contributor) commented Apr 6, 2026

Greptile Summary

This PR adds a new compatibility scraper for argo-workflows, following the same conventions as sibling scrapers in utils/compatibility/scrapers/. It introduces a Python scraper, a static compatibility metadata YAML, and a manifest entry so the main runner picks it up automatically.

  • The scraper primarily determines supported K8s versions by fetching hack/k8s-versions.sh from each GitHub tag; it falls back to the timestamp-based inference (used by argo-cd.py) when that file is absent.
  • argo-workflows.yaml correctly mirrors the structure of other entries (icon, git/release URLs, Helm repository URL, chart name).
  • manifest.yaml correctly inserts argo-workflows in the Argo-tools group between argo-cd and vector.
  • Three minor style concerns: (1) fetch_k8s_versions_from_tag is called for every tag with no throttling, generating many sequential HTTP requests; (2) fetch_github_tags fetches up to 4 pages (up to 400 tags) while get_github_releases_timestamps caps at 2 pages (up to 200 releases), so older tags lacking hack/k8s-versions.sh may be silently skipped; (3) argo-workflows.yaml omits the eolApiSlug field that other addon YAMLs include.
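
The "script first, timestamps second" lookup described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the sample file contents, `parse_k8s_range`, and `resolve_k8s_versions` are hypothetical names and formats, and the real hack/k8s-versions.sh may look different.

```python
import re

# Hypothetical sample of what hack/k8s-versions.sh might contain; the real
# file format is not shown in this PR and may differ.
SAMPLE_SCRIPT = """
declare -A K8S_VERSIONS=(
  [min]=v1.29.10
  [max]=v1.31.0
)
"""

# Matches versions such as v1.29.10 or 1.31 and captures major and minor.
_VERSION_RE = re.compile(r"v?(\d+)\.(\d+)(?:\.\d+)?")


def parse_k8s_range(script_text):
    """Return every minor version between the lowest and highest one found.

    Assumes all versions in the text share the same major version.
    """
    minors = sorted({(int(a), int(b)) for a, b in _VERSION_RE.findall(script_text)})
    if not minors:
        return []
    (major, low), (_, high) = minors[0], minors[-1]
    return [f"{major}.{minor}" for minor in range(low, high + 1)]


def resolve_k8s_versions(script_text, fallback):
    """Prefer versions parsed from the script; otherwise call the fallback."""
    if script_text:
        parsed = parse_k8s_range(script_text)
        if parsed:
            return parsed
    # Stand-in for the timestamp-based inference used by other scrapers.
    return fallback()
```

Passing the fallback as a callable keeps the timestamp-based inference lazy, so it only runs for tags where the file is absent or unparseable.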

Confidence Score: 4/5

Safe to merge; all changes are additive and follow established codebase conventions.

The scraper closely mirrors existing scrapers (argo-rollouts, argo-cd) in structure and delegation to shared utilities. No critical logic errors were found. The three flagged items are all style/best-practice suggestions: per-tag HTTP request volume, a page-count asymmetry that may silently skip old versions, and a missing eolApiSlug that other addons include. None of these prevent the scraper from producing correct output for recent versions.

utils/compatibility/scrapers/argo-workflows.py — review the per-tag HTTP call pattern and the page-count mismatch with get_github_releases_timestamps; static/compatibilities/argo-workflows.yaml — consider adding eolApiSlug

Important Files Changed

| Filename | Overview |
| --- | --- |
| utils/compatibility/scrapers/argo-workflows.py | New scraper for argo-workflows; fetches K8s versions from hack/k8s-versions.sh per tag with a timestamp-based fallback; makes one unthrottled HTTP request per tag and has a page-count asymmetry with get_github_releases_timestamps |
| static/compatibilities/argo-workflows.yaml | New compatibility metadata file with correct icon, URLs, and helm chart info; missing eolApiSlug field present in other addon YAMLs |
| static/compatibilities/manifest.yaml | Correctly adds argo-workflows to the scraper manifest in the Argo-tools grouping |

Reviews (1): Last reviewed commit: "added argo-workflows scraper"

Comment on lines +42 to +55
def fetch_github_tags():
    """Fetch release tags from GitHub API."""
    tags = []
    for page in range(1, 5):
        url = f"https://api.github.com/repos/{github_repo_owner}/{github_repo_name}/tags"
        response = requests.get(url, params={"page": page, "per_page": 100})
        if response.status_code != 200:
            print_error(f"Failed to fetch GitHub tags. Status code: {response.status_code}")
            break
        page_tags = [tag["name"] for tag in response.json()]
        if not page_tags:
            break
        tags.extend(page_tags)
    return tags

Contributor

P2 Per-tag HTTP requests with no throttling

fetch_k8s_versions_from_tag is called for every tag that passes the "-" in tag filter and has a matching chart version. With up to 400 tags fetched across 4 pages, this can generate hundreds of sequential unauthenticated HTTP requests to raw.githubusercontent.com with no rate limiting or delay between calls.

The argo-rollouts.py scraper follows the same pattern, so this is consistent with the codebase — but given the larger number of argo-workflows releases, it may be worth adding a short delay or pre-checking whether the version already exists in the YAML before fetching.
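
A rough sketch of both mitigations (skipping versions already recorded and pausing between requests) might look like this. The helper name `fetch_missing_versions`, its parameters, and the 0.5 second default are assumptions for illustration; `fetch` stands in for the PR's fetch_k8s_versions_from_tag.

```python
import time


def fetch_missing_versions(tags, known_versions, fetch, delay_seconds=0.5, sleep=time.sleep):
    """Call fetch(tag) only for stable tags whose version is not yet recorded.

    tags: tag names such as "v3.5.0"
    known_versions: versions already present in the compatibility YAML
    fetch: callable performing the per-tag HTTP request (hypothetical stand-in)
    """
    results = {}
    first_request = True
    for tag in tags:
        version = tag.lstrip("v")
        if "-" in tag or version in known_versions:
            continue  # skip pre-releases and versions we already have
        if not first_request:
            sleep(delay_seconds)  # pause between consecutive requests
        first_request = False
        results[version] = fetch(tag)
    return results
```

Injecting `sleep` keeps the helper testable without real delays; on a repeat run where most versions are already in the YAML, only a handful of requests would remain.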

Comment on lines +63 to +67

chart_versions = get_chart_versions(app_name)
kube_releases = get_kube_release_info()
argo_releases = list(reversed(list(get_github_releases_timestamps(github_repo_owner, github_repo_name))))
pruned_argo_releases = {r.lstrip("v"): ts for r, ts in argo_releases if "-" not in r}

Contributor

P2 Page-count asymmetry between tag fetching and release fetching

fetch_github_tags fetches up to 4 pages (≤ 400 tags), but get_github_releases_timestamps in utils.py only fetches 2 pages (≤ 200 releases). For any tag that falls outside the 200-release window and does not have a hack/k8s-versions.sh file, the fallback at lines 85–91 will find no matching entry in pruned_argo_releases and silently skip that version:

print_warning(f"No K8s version info found for {tag}, skipping.")
continue

In practice the most recent releases are always within the first 200 entries, so current data should be complete. For completeness of historical data, consider aligning the two page counts or documenting the intentional cap.
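
If the cap is kept, one way to make the gap visible rather than silent is to report which stable tags have no release timestamp before the main loop runs. This is a sketch under assumptions: `split_by_release_coverage` is a hypothetical helper, and `release_timestamps` is assumed to be keyed by version without the leading "v", as pruned_argo_releases is in the snippet above.

```python
def split_by_release_coverage(tags, release_timestamps):
    """Partition stable tags by whether a release timestamp is available.

    Returns (covered, uncovered) lists of versions without the "v" prefix.
    Tags in `uncovered` can only be resolved through hack/k8s-versions.sh.
    """
    covered, uncovered = [], []
    for tag in tags:
        if "-" in tag:
            continue  # pre-release tags are ignored, as in the scraper
        version = tag.lstrip("v")
        if version in release_timestamps:
            covered.append(version)
        else:
            uncovered.append(version)
    return covered, uncovered
```

Logging the size of `uncovered` once per run would document the cap in the scraper output without changing the shared helper in utils.py.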

Comment on lines +1 to +6
icon: https://avatars.githubusercontent.com/u/30269780?s=200&v=4
git_url: https://github.com/argoproj/argo-workflows
release_url: https://github.com/argoproj/argo-workflows/releases/tag/v{vsn}
helm_repository_url: https://argoproj.github.io/argo-helm
chart_name: argo-workflows
versions: []

Contributor

P2 Missing eolApiSlug field

Several other addon YAML files (e.g. cert-manager.yaml, cilium.yaml) include an eolApiSlug field, which main.py uses to enrich version entries with EOL dates from endoflife.date. Argo Workflows has an entry there under the slug argo-workflows.

Omitting this field means the EOL enrichment step will be skipped entirely for this addon. Consider adding:

Suggested change
- icon: https://avatars.githubusercontent.com/u/30269780?s=200&v=4
- git_url: https://github.com/argoproj/argo-workflows
- release_url: https://github.com/argoproj/argo-workflows/releases/tag/v{vsn}
- helm_repository_url: https://argoproj.github.io/argo-helm
- chart_name: argo-workflows
- versions: []
+ icon: https://avatars.githubusercontent.com/u/30269780?s=200&v=4
+ git_url: https://github.com/argoproj/argo-workflows
+ release_url: https://github.com/argoproj/argo-workflows/releases/tag/v{vsn}
+ helm_repository_url: https://argoproj.github.io/argo-helm
+ chart_name: argo-workflows
+ eolApiSlug: argo-workflows
+ versions: []
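
For context, EOL enrichment of this kind could work roughly as sketched below. How main.py actually does it is not shown in this PR; `enrich_with_eol` is a hypothetical helper, and the `cycle` and `eol` field names are assumed from the per-product JSON that endoflife.date publishes. The dates in the usage example are made up.

```python
def enrich_with_eol(versions, cycles):
    """Attach an "eol" value to each version entry by matching major.minor.

    versions: list of dicts with a "version" key such as "3.6.2"
    cycles: list of dicts with "cycle" and "eol" keys (assumed API shape)
    """
    eol_by_cycle = {str(cycle["cycle"]): cycle.get("eol") for cycle in cycles}
    enriched = []
    for entry in versions:
        major_minor = ".".join(entry["version"].split(".")[:2])
        # Entries with no matching cycle get eol=None rather than being dropped.
        enriched.append({**entry, "eol": eol_by_cycle.get(major_minor)})
    return enriched


# Hypothetical data for illustration only.
example = enrich_with_eol(
    [{"version": "3.6.2"}, {"version": "3.4.0"}],
    [{"cycle": "3.6", "eol": "2025-10-01"}],
)
```

Without an eolApiSlug there is no product to look up, which is why the enrichment step is skipped for addons that omit the field.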

release_url: https://github.com/argoproj/argo-workflows/releases/tag/v{vsn}
helm_repository_url: https://argoproj.github.io/argo-helm
chart_name: argo-workflows
versions: []

Member

This should have a list of the versions discovered. Did the scraper not run, or is it just not yet functional?